Homework 2: Solution

Clyde Hopkins
5 years ago
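10-704: Information Processing and Learning, Spring 2012
Lecturer: Aarti Singh
Homework 2: Solution

Acknowledgement: The TA graciously thanks Rafael Stern for providing most of these solutions.

2.1 Problem 1

The objective is $D(q \| p) = \int q(x) \log \frac{q(x)}{p(x)} \, dx$. Similarly, each constraint function is $h_i(q) = E_q[r_i(X)] = \int r_i(x) q(x) \, dx$. Thus:

$$\frac{\partial D(q \| p)}{\partial q(x)} = 1 + \log \frac{q(x)}{p(x)}, \qquad \frac{\partial h_i(q)}{\partial q(x)} = r_i(x)$$

Finally, $h_0(q) = \int q(x) \, dx$ and hence $\frac{\partial h_0(q)}{\partial q(x)} = 1$. Since $D(q \| p)$ is convex in $q$ and the equality restrictions are linear, we wish to solve a convex optimization problem. The Lagrangian of this problem is:

$$L(q, \lambda) = D(q \| p) + \sum_{i=0}^{m} \lambda_i h_i(q)$$

Solving $\nabla_q L(q, \lambda) = 0$, obtain:

$$1 + \log q(x) - \log p(x) + \lambda_0 + \sum_{i=1}^{m} \lambda_i r_i(x) = 0$$

Calling $\lambda_0' = 1 + \lambda_0$, obtain:

$$q(x) = p(x) \, e^{-\left(\lambda_0' + \sum_{i=1}^{m} \lambda_i r_i(x)\right)}$$

Taking $\lambda_0'$ such that $\int q(x) \, dx = 1$, obtain:

$$q^*(x) = \frac{p(x) \, e^{-\sum_{i=1}^{m} \lambda_i r_i(x)}}{\int p(x') \, e^{-\sum_{i=1}^{m} \lambda_i r_i(x')} \, dx'}$$

Assume there exist unique values for each $\lambda_i$ such that the equality constraints are satisfied. In this case, $(q^*, \lambda)$ clearly satisfy stationarity and primal feasibility. Since there are no inequality constraints, dual feasibility and complementary slackness are also satisfied. Hence, the KKT conditions are satisfied and $q^*$ minimizes $D(q \| p)$.

2.2 Problem 2

By results from class, we need only find constants $\lambda_0, \lambda_1, \lambda_2$ such that the distribution $p^*(x) = \exp(\lambda_0 + \lambda_1 x + \lambda_2 x^2)$ satisfies the moment constraints. We inspect the Gaussian pdf with first moment $\mu_1$ and second moment $\mu_2$, writing $\sigma^2 = \mu_2 - \mu_1^2$ for its variance:

$$\phi(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu_1)^2}{2\sigma^2}\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2} + \frac{x \mu_1}{\sigma^2} - \frac{\mu_1^2}{2\sigma^2}\right)$$

We conclude immediately that $\lambda_1 = \frac{\mu_1}{\mu_2 - \mu_1^2}$ and $\lambda_2 = -\frac{1}{2(\mu_2 - \mu_1^2)}$, and $\lambda_0$ is whatever constant is required to normalize the distribution.

2.3 Problem 3

Recall that, by the chain rule from an earlier homework (part b), and because conditioning cannot increase entropy:

$$H(P_1, \dots, P_n) = \sum_{i=1}^{n} H(P_i \mid P_{i-1}, \dots, P_1) \le \sum_{i=1}^{n} H(P_i)$$

The right side is completely determined by the marginals and is exactly the entropy of the joint distribution of independent variables with those marginals. Hence, the result is proven.

2.4 Problem 4

Let $r(X)$ be the entropy rate of a stochastic process $X$. Recall that:

$$r(X) = \lim_{n \to \infty} \frac{H(X_1, \dots, X_n)}{n}$$

By the chain rule from an earlier homework (part b):

$$H(X_1, \dots, X_n) = H(X_1) + \sum_{i=2}^{n} H(X_i \mid X_{i-1}, \dots, X_1)$$

By the Markovian property, $X_i$ is conditionally independent of $X_{i-2}, \dots, X_1$ given $X_{i-1}$. Hence: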

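$$\sum_{i=2}^{n} H(X_i \mid X_{i-1}, \dots, X_1) = \sum_{i=2}^{n} H(X_i \mid X_{i-1})$$

Since the Markov chain is homogeneous and stationary, for all $i$, $H(X_i \mid X_{i-1}) = H(X_2 \mid X_1)$. Thus:

$$r(X) = \lim_{n \to \infty} \frac{H(X_1) + (n-1) H(X_2 \mid X_1)}{n} = H(X_2 \mid X_1)$$

Finally,

$$H(X_2 \mid X_1) = -\sum_i P(X_1 = i) \sum_j P(X_2 = j \mid X_1 = i) \log P(X_2 = j \mid X_1 = i)$$

Call $P_i$ the $i$-th row of $P$. Observe that:

$$-\sum_j P(X_2 = j \mid X_1 = i) \log P(X_2 = j \mid X_1 = i) = H(P_i)$$

Hence, by stationarity:

$$H(X_2 \mid X_1) = \sum_i P(X_1 = i) H(P_i) = \sum_i \mu_i H(P_i)$$

Observe that $r(X) = H(X_2 \mid X_1) \le H(X_2)$. If we take the variables to be i.i.d., then $H(X_2 \mid X_1) = H(X_2)$. Finally, $H(X_2)$ is maximized by the uniform distribution on the support of the Markov chain. Hence, $r(X)$ is maximized by taking $P$ with every row equal to the uniform distribution $(1/|S|, \dots, 1/|S|)$, where $S$ is the support of the Markov chain.

The invariant measure is obtained by solving $\mu_1 = p \mu_0$ and $\mu_0 + \mu_1 = 1$, which lead to $\mu_0 = \frac{1}{1+p}$ and $\mu_1 = \frac{p}{1+p}$. From the last item, the entropy rate of the Markov chain is $\sum_i \mu_i H(P_i)$. Observe that $P_1$ is degenerate and, therefore, $H(P_1) = 0$. Hence:

$$r(X) = -\frac{1}{1+p} \left( p \log p + (1-p) \log(1-p) \right)$$

Setting $\frac{dr}{dp} = 0$, obtain:

$$\frac{dr}{dp} = \frac{(1+p)\left(\log(1-p) - \log p\right) + p \log p + (1-p) \log(1-p)}{(1+p)^2} = \frac{2 \log(1-p) - \log p}{(1+p)^2} = 0$$

The numerator vanishes when $2 \log(1-p) = \log p$, that is, when $(1-p)^2 = p$, or equivalently: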

$$p^2 - 3p + 1 = 0$$

Obtain $p = \frac{3 \pm \sqrt{5}}{2}$. Since $0 \le p \le 1$ and $r(X) = 0$ for $p = 0$ and $p = 1$, by Weierstrass's theorem $p^* = \frac{3 - \sqrt{5}}{2}$ maximizes the entropy rate of this Markov chain. On one hand, reducing $p$ increases the weight $\mu_0$ with which $H(X_2 \mid X_1 = 0)$ contributes to the entropy rate, which helps increase it. On the other hand, for $p < 1/2$, reducing $p$ decreases the value of $H(X_2 \mid X_1 = 0)$. The optimum value is the sweet spot between these tendencies.

2.5 Problem 5

$I(X; Y) = H(X) - H(X \mid Y)$. In class we proved that $H(X) = 0.5 \log(2\pi e)$. Hence, it suffices to find $H(X \mid Y)$. Recall that $X \mid Y$ is a normal random variable with variance $1 - \rho \cdot \rho = 1 - \rho^2$, which does not depend on $Y$. Hence $H(X \mid Y) = 0.5 \log(2\pi e (1 - \rho^2))$ if $|\rho| < 1$. Thus:

$$I(X; Y) = H(X) - H(X \mid Y) = -0.5 \log(1 - \rho^2)$$

This value is minimized when $\rho = 0$. In this case, the variables are independent and, therefore, there is no mutual information. When $\rho = 1$ or $\rho = -1$, $X$ is completely determined by $Y$, and $I(X; Y) = \infty$.

2.6 Problem 6

$$H(Y \mid X) = -\sum_x p(x) \sum_y p(y \mid x) \log p(y \mid x)$$

Hence, $\frac{\partial H(Y \mid X)}{\partial p(y \mid x)} = -p(x) \left( \log p(y \mid x) + 1 \right)$. Similarly, $h_i(p) = E[r_i(X) Y] = \sum_x r_i(x) p(x) \sum_y y \, p(y \mid x)$. Thus:

$$\frac{\partial h_i(p)}{\partial p(y \mid x)} = r_i(x) p(x) y$$

Finally, $h_{0,x'}(p) = \sum_y p(y \mid x')$ and hence $\frac{\partial h_{0,x'}(p)}{\partial p(y \mid x)} = I(x' = x)$. Since $H(Y \mid X)$ is concave in $p(y \mid x)$ and the equality restrictions are linear, maximizing it is a convex optimization problem. The Lagrangian of this problem is:

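$$L(p, \lambda) = H(Y \mid X) + \sum_i \lambda_i h_i(p) + \sum_{x'} \lambda_{0,x'} h_{0,x'}(p)$$

Its derivative with respect to $p(y \mid x)$ is:

$$\nabla L(p, \lambda) = -p(x) \left( \log p(y \mid x) + 1 \right) + \sum_i \lambda_i r_i(x) p(x) y + \sum_{x'} \lambda_{0,x'} I(x' = x)$$

Call $\sum_{x'} \lambda_{0,x'} I(x' = x) = f(x)$ and obtain:

$$\nabla L(p, \lambda) = -p(x) \left( \log p(y \mid x) + 1 \right) + \sum_i \lambda_i r_i(x) p(x) y + f(x)$$

Solving $\nabla L(p, \lambda) = 0$:

$$p(y \mid x) = \exp\left( \frac{\sum_i \lambda_i r_i(x) p(x) y + f(x) - p(x)}{p(x)} \right)$$

Call $g(x) = \frac{f(x) - p(x)}{p(x)}$:

$$p(y \mid x) = \exp\left( \sum_i y \lambda_i r_i(x) + g(x) \right)$$

Since $p(0 \mid x) + p(1 \mid x) = 1$:

$$p(y \mid x) = \frac{\exp\left( \sum_i y \lambda_i r_i(x) + g(x) \right)}{\exp\left( g(x) \right) + \exp\left( \sum_i \lambda_i r_i(x) + g(x) \right)} = \frac{\exp\left( \sum_i y \lambda_i r_i(x) \right)}{1 + \exp\left( \sum_i \lambda_i r_i(x) \right)}$$

Note that we can cancel out the $g(x)$ from the numerator and the denominator. Observe that $p$ clearly satisfies stationarity. Hence, if there exist $\lambda_i$'s such that $p$ satisfies the constraints, it also satisfies primal feasibility. Finally, since the solution satisfies the inequalities even though they were not imposed as constraints, dual feasibility and complementary slackness are also satisfied. Hence, since the KKT conditions are satisfied, $p$ maximizes $H(Y \mid X)$.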
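Two of the closed-form results above lend themselves to a quick numerical sanity check. The sketch below is illustrative only and is not part of the original solution. It assumes natural logarithms, and it assumes the two-state chain of Problem 4 has rows $P_0 = (1-p, p)$ and $P_1 = (1, 0)$, which is inferred from the stationarity equations rather than stated in the text. The function names (`binary_entropy`, `entropy_rate`, `argmax_on_grid`, `max_entropy_conditional`) are hypothetical helpers introduced for this example.

```python
import math


def binary_entropy(p):
    """Entropy in nats of a Bernoulli(p) variable, taking 0 * log(0) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def entropy_rate(p):
    """Entropy rate mu_0 * H(P_0) + mu_1 * H(P_1) of the assumed chain.

    Assumption: rows P_0 = (1 - p, p) and P_1 = (1, 0), so H(P_1) = 0
    and the stationary weight of state 0 is mu_0 = 1 / (1 + p).
    """
    mu0 = 1.0 / (1.0 + p)
    return mu0 * binary_entropy(p)


def argmax_on_grid(f, n=100001):
    """Return the point of an evenly spaced grid on [0, 1] that maximizes f."""
    grid = [k / (n - 1) for k in range(n)]
    return max(grid, key=f)


def max_entropy_conditional(y, score, g=0.0):
    """p(y | x) for y in {0, 1}, before cancelling the common factor exp(g).

    `score` stands for sum_i lambda_i * r_i(x) and `g` for g(x).
    """
    numerator = math.exp(y * score + g)
    denominator = math.exp(g) + math.exp(score + g)
    return numerator / denominator


# Problem 4: closed-form maximizer versus a brute-force grid search.
p_star = (3.0 - math.sqrt(5.0)) / 2.0
p_grid = argmax_on_grid(entropy_rate)
print("closed form:", p_star, "grid search:", p_grid)

# Problem 6: the two conditional probabilities sum to one, and g(x) cancels.
print(max_entropy_conditional(0, 0.8, 1.7) + max_entropy_conditional(1, 0.8, 1.7))
print(max_entropy_conditional(1, 0.8, 1.7), max_entropy_conditional(1, 0.8, 0.0))
```

The grid search is deliberately crude: it only confirms that the stationary point $p^* = \frac{3 - \sqrt{5}}{2}$ is the maximizer on $[0, 1]$, and the last two lines confirm that the value chosen for $g(x)$ has no effect on $p(y \mid x)$.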

Exercises with solutions (Set D)

A fair die is rolled at the same time as a fair coin is tossed. Let A be the number on the upper surface of the die and let B describe the outcome of the coin toss, where