UCSD ECE 153                                        Handout #7
Prof. Young-Han Kim                                 Tuesday, May 6, 2014

Solutions to Homework Set #5
(Prepared by TA Fatemeh Arbabjolfaei)

1. Neural net. Let $Y = X + Z$, where the signal $X \sim \mathrm{Unif}[-1, 1]$ and noise $Z \sim \mathcal{N}(0, 1)$ are independent.

(a) Find the function $g(y)$ that minimizes
\[ \mathrm{MSE} = \mathrm{E}\bigl[(\operatorname{sgn}(X) - g(Y))^2\bigr], \quad \text{where} \quad \operatorname{sgn}(x) = \begin{cases} -1, & x \le 0, \\ +1, & x > 0. \end{cases} \]
(b) Plot $g(y)$ vs. $y$.

Solution: The minimum MSE is achieved when $g(y) = \mathrm{E}(\operatorname{sgn}(X) \mid Y)$. We have
\[ g(y) = \mathrm{E}(\operatorname{sgn}(X) \mid Y = y) = \int_{-\infty}^{\infty} \operatorname{sgn}(x) f_{X|Y}(x \mid y)\, dx. \]
To find the conditional pdf of $X$ given $Y$, we use
\[ f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x) f_X(x)}{f_Y(y)}, \quad \text{where} \quad f_X(x) = \begin{cases} \tfrac{1}{2}, & -1 \le x \le 1, \\ 0, & \text{otherwise.} \end{cases} \]
Since $X$ and $Z$ are independent,
\[ f_{Y|X}(y \mid x) = f_Z(y - x), \quad \text{i.e.,} \quad Y \mid \{X = x\} \sim \mathcal{N}(x, 1). \]
To find $f_Y(y)$ we integrate $f_{Y|X}(y \mid x) f_X(x)$ over $x$:
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{Y|X}(y \mid x) f_X(x)\, dx = \frac{1}{2} \int_{-1}^{1} \frac{1}{\sqrt{2\pi}}\, e^{-(y-x)^2/2}\, dx = \frac{1}{2}\bigl(Q(y-1) - Q(y+1)\bigr). \]
Combining the above results, we get
\begin{align*}
g(y) &= \int_{-\infty}^{\infty} \operatorname{sgn}(x) f_{X|Y}(x \mid y)\, dx
 = \frac{1}{2 f_Y(y)} \left( \int_0^1 \frac{1}{\sqrt{2\pi}}\, e^{-(y-x)^2/2}\, dx - \int_{-1}^0 \frac{1}{\sqrt{2\pi}}\, e^{-(y-x)^2/2}\, dx \right) \\
&= \frac{\bigl(Q(y-1) - Q(y)\bigr) - \bigl(Q(y) - Q(y+1)\bigr)}{Q(y-1) - Q(y+1)}
 = \frac{Q(y-1) - 2Q(y) + Q(y+1)}{Q(y-1) - Q(y+1)}.
\end{align*}
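As a quick numerical sanity check (an addition, not part of the graded solution), the closed form for $g(y)$ can be compared against a Monte Carlo estimate of $\mathrm{E}(\operatorname{sgn}(X) \mid Y \approx y_0)$; the sketch below assumes NumPy and SciPy are available, and the bin half-width $0.05$ and sample size are arbitrary choices.

\begin{verbatim}
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.uniform(-1, 1, n)             # X ~ Unif[-1, 1]
y = x + rng.standard_normal(n)        # Y = X + Z, Z ~ N(0, 1)

Q = norm.sf                           # Q(t) = P{N(0, 1) > t}

def g(t):
    return (Q(t - 1) - 2 * Q(t) + Q(t + 1)) / (Q(t - 1) - Q(t + 1))

for y0 in (-2.0, -0.5, 0.0, 0.5, 2.0):
    near = np.abs(y - y0) < 0.05      # samples with Y in a small bin around y0
    print(y0, np.sign(x[near]).mean(), g(y0))
\end{verbatim}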
The plot of $g(y)$ is shown below. Note the sigmoidal shape corresponding to the common neural network activation function.

[Figure: $g(y)$ vs. $y$ for $-8 \le y \le 8$; $g(y)$ increases sigmoidally from $-1$ to $+1$.]

2. Additive shot noise channel. Consider an additive noise channel $Y = X + Z$, where the signal $X \sim \mathcal{N}(0, 1)$ and the noise $Z \mid \{X = x\} \sim \mathcal{N}(0, x^2)$, i.e., the noise power increases linearly with the square of the signal.

(a) Find $\mathrm{E}(Z^2)$.
(b) Find the best linear MSE estimate of $X$ given $Y$.

Solution:

(a) Since $Z \mid \{X = x\} \sim \mathcal{N}(0, x^2)$, we have $\mathrm{E}(Z \mid X) = 0$ and
\[ \mathrm{E}(Z^2 \mid X = x) = \mathrm{Var}(Z \mid X = x) + (\mathrm{E}(Z \mid X = x))^2 = x^2. \]
Therefore,
\[ \mathrm{E}(Z^2) = \mathrm{E}(\mathrm{E}(Z^2 \mid X)) = \mathrm{E}(X^2) = 1. \]

(b) From the best linear estimate formula,
\[ \hat{X} = \frac{\mathrm{Cov}(X, Y)}{\sigma_Y^2} (Y - \mathrm{E}(Y)) + \mathrm{E}(X). \]
Here we have $\mathrm{E}(X) = 0$ and
\begin{align*}
\mathrm{E}(Y) &= \mathrm{E}(X + Z) = \mathrm{E}(X) + \mathrm{E}(\mathrm{E}(Z \mid X)) = 0 + 0 = 0, \\
\mathrm{E}(XZ) &= \mathrm{E}(\mathrm{E}(XZ \mid X)) = \mathrm{E}(X\, \mathrm{E}(Z \mid X)) = \mathrm{E}(X \cdot 0) = 0, \\
\sigma_Y^2 &= \mathrm{E}(Y^2) - (\mathrm{E}(Y))^2 = \mathrm{E}((X + Z)^2) = \mathrm{E}(X^2) + \mathrm{E}(Z^2) + 2\,\mathrm{E}(XZ) = 1 + 1 + 0 = 2, \\
\mathrm{Cov}(X, Y) &= \mathrm{E}((X - \mathrm{E}(X))(Y - \mathrm{E}(Y))) = \mathrm{E}(X(X + Z)) = \mathrm{E}(X^2) + \mathrm{E}(XZ) = 1 + 0 = 1.
\end{align*}
Using all of the above, we get $\hat{X} = \frac{1}{2} Y$.
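Both answers are easy to confirm by simulation; the following is a minimal sketch (an addition, assuming NumPy), checking $\mathrm{E}(Z^2) = 1$ and the linear MMSE coefficient $\mathrm{Cov}(X, Y)/\sigma_Y^2 = 1/2$.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.standard_normal(n)               # X ~ N(0, 1)
z = np.abs(x) * rng.standard_normal(n)   # Z | {X = x} ~ N(0, x^2)
y = x + z

print(np.mean(z ** 2))                   # ~ 1 = E(Z^2)
print(np.cov(x, y)[0, 1] / np.var(y))    # ~ 1/2 = Cov(X, Y) / Var(Y)
\end{verbatim}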
3. Estimation vs. detection. Let the signal
\[ X = \begin{cases} +1, & \text{with probability } \tfrac{1}{2}, \\ -1, & \text{with probability } \tfrac{1}{2}, \end{cases} \]
and the noise $Z \sim \mathrm{Unif}[-2, 2]$ be independent random variables. Their sum $Y = X + Z$ is observed.

(a) Find the best MSE estimate of $X$ given $Y$ and its MSE.
(b) Now suppose we use a decoder to decide whether $X = +1$ or $X = -1$ so that the probability of error is minimized. Find the optimal decoder and its probability of error. Compare the optimal decoder's MSE to the minimum MSE.

Solution:

(a) We can easily find the piecewise constant density of $Y$:
\[ f_Y(y) = \begin{cases} \tfrac{1}{4}, & |y| \le 1, \\ \tfrac{1}{8}, & 1 < |y| \le 3, \\ 0, & \text{otherwise.} \end{cases} \]
The conditional probabilities of $X$ given $Y$ are
\[ P\{X = +1 \mid Y = y\} = \begin{cases} 0, & -3 \le y < -1, \\ \tfrac{1}{2}, & -1 \le y \le +1, \\ 1, & +1 < y \le +3, \end{cases} \qquad P\{X = -1 \mid Y = y\} = \begin{cases} 1, & -3 \le y < -1, \\ \tfrac{1}{2}, & -1 \le y \le +1, \\ 0, & +1 < y \le +3. \end{cases} \]
Thus the best MSE estimate is
\[ g(Y) = \mathrm{E}(X \mid Y) = \begin{cases} -1, & -3 \le Y < -1, \\ 0, & -1 \le Y \le +1, \\ +1, & +1 < Y \le +3. \end{cases} \]
The minimum mean square error is
\begin{align*}
\mathrm{E}(\mathrm{Var}(X \mid Y)) &= \mathrm{E}\bigl(\mathrm{E}(X^2 \mid Y) - (\mathrm{E}(X \mid Y))^2\bigr) = \mathrm{E}\bigl(1 - g(Y)^2\bigr) = 1 - \mathrm{E}(g(Y)^2) \\
&= 1 - \int_{-3}^{3} g(y)^2 f_Y(y)\, dy = 1 - \int_{-3}^{-1} \frac{1}{8}\, dy - \int_{1}^{3} \frac{1}{8}\, dy = 1 - \frac{1}{4} - \frac{1}{4} = \frac{1}{2}.
\end{align*}

(b) The optimal decoder is given by the MAP rule. The a posteriori pmf of $X$ was found in part (a). Thus the MAP rule reduces to
\[ D(y) = \begin{cases} -1, & -3 \le y < -1, \\ \pm 1, & -1 \le y \le +1, \\ +1, & +1 < y \le +3. \end{cases} \]
Since either value can be chosen for $D(y)$ in the center range of $Y$, a symmetrical decoder is sufficient, i.e.,
\[ D(y) = \begin{cases} -1, & y < 0, \\ +1, & y \ge 0. \end{cases} \]
The probability of decoding error is
\begin{align*}
P\{D(Y) \ne X\} &= P\{X = -1, Y \ge 0\} + P\{X = +1, Y < 0\} \\
&= P\{X = -1 \mid Y \ge 0\} P\{Y \ge 0\} + P\{X = +1 \mid Y < 0\} P\{Y < 0\} = \frac{1}{8} + \frac{1}{8} = \frac{1}{4}.
\end{align*}
If we use the decoder (detector) as an estimator, its MSE is
\[ \mathrm{E}\bigl((D(Y) - X)^2\bigr) = 0 \cdot \frac{3}{4} + 4 \cdot \frac{1}{4} = 1, \]
since $(D(Y) - X)^2$ equals $0$ when the decision is correct and $4$ when it is in error. This MSE is twice that of the minimum mean square error estimator.
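All three quantities can be verified by simulation; here is a minimal sketch (an addition to the handout, assuming NumPy) estimating the minimum MSE, the decoding error probability, and the detector's MSE.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.choice([-1.0, 1.0], size=n)   # X = +/-1 with probability 1/2 each
y = x + rng.uniform(-2, 2, n)         # Z ~ Unif[-2, 2]

g = np.where(y < -1, -1.0, np.where(y > 1, 1.0, 0.0))  # E(X | Y)
d = np.where(y < 0, -1.0, 1.0)                         # symmetric MAP detector

print(np.mean((x - g) ** 2))   # ~ 1/2, the minimum MSE
print(np.mean(d != x))         # ~ 1/4, the probability of decoding error
print(np.mean((x - d) ** 2))   # ~ 1, the detector's MSE
\end{verbatim}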
Solutions to Additional Exercises

1. Orthogonality. Let $\hat{X}$ be the minimum MSE estimate of $X$ given $Y$.

(a) Show that for any function $g(y)$, $\mathrm{E}((X - \hat{X}) g(Y)) = 0$, i.e., the error $(X - \hat{X})$ and $g(Y)$ are orthogonal.
(b) Show that $\mathrm{Var}(X) = \mathrm{E}(\mathrm{Var}(X \mid Y)) + \mathrm{Var}(\hat{X})$. Provide a geometric interpretation for this result.

Solution:

(a) We use iterated expectation and the fact that $\mathrm{E}(g(Y) \mid Y) = g(Y)$:
\begin{align*}
\mathrm{E}\bigl((X - \hat{X}) g(Y)\bigr) &= \mathrm{E}\bigl[\mathrm{E}\bigl((X - \hat{X}) g(Y) \mid Y\bigr)\bigr] \\
&= \mathrm{E}\bigl[\mathrm{E}\bigl((X - \mathrm{E}(X \mid Y)) g(Y) \mid Y\bigr)\bigr] \\
&= \mathrm{E}\bigl(g(Y)\, \mathrm{E}\bigl(X - \mathrm{E}(X \mid Y) \mid Y\bigr)\bigr) \\
&= \mathrm{E}\bigl(g(Y) (\mathrm{E}(X \mid Y) - \mathrm{E}(X \mid Y))\bigr) = 0.
\end{align*}

(b) First we write
\[ \mathrm{E}(\mathrm{Var}(X \mid Y)) = \mathrm{E}(X^2) - \mathrm{E}\bigl((\mathrm{E}(X \mid Y))^2\bigr) \]
and
\[ \mathrm{Var}(\mathrm{E}(X \mid Y)) = \mathrm{E}\bigl((\mathrm{E}(X \mid Y))^2\bigr) - (\mathrm{E}(\mathrm{E}(X \mid Y)))^2 = \mathrm{E}\bigl((\mathrm{E}(X \mid Y))^2\bigr) - (\mathrm{E}(X))^2. \]
Adding the two terms completes the proof.

Interpretation: If we view $X$, $\mathrm{E}(X \mid Y)$, and $X - \mathrm{E}(X \mid Y)$ as vectors with norms $\sqrt{\mathrm{Var}(X)}$, $\sqrt{\mathrm{Var}(\mathrm{E}(X \mid Y))}$, and $\sqrt{\mathrm{E}(\mathrm{Var}(X \mid Y))}$, respectively, then this result provides a Pythagoras theorem, where the signal, the error, and the estimate are the sides of a right triangle (estimate and error being orthogonal).
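Both facts can be illustrated numerically. The pair below, with $X = Y^2 + W$ for independent standard normal $Y$ and $W$, is a hypothetical example chosen only because $\mathrm{E}(X \mid Y) = Y^2$ and $\mathrm{Var}(X \mid Y) = 1$ are then known in closed form (a sketch assuming NumPy).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
y = rng.standard_normal(n)
x = y ** 2 + rng.standard_normal(n)   # E(X | Y) = Y^2, Var(X | Y) = 1

xhat = y ** 2                         # minimum MSE estimate of X given Y
err = x - xhat
print(np.mean(err * np.sin(y)))       # ~ 0: error orthogonal to g(Y) = sin(Y)
print(np.var(x))                      # Var(X) ~ 3
print(1.0 + np.var(xhat))             # E(Var(X|Y)) + Var(X-hat) ~ 1 + 2 = 3
\end{verbatim}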
2. Let $X$ and $Y$ be two random variables. Let $Z = X + Y$ and let $W = X - Y$. Find the best linear estimate of $W$ given $Z$ as a function of $\mathrm{E}(X)$, $\mathrm{E}(Y)$, $\sigma_X$, $\sigma_Y$, $\rho_{XY}$, and $Z$.

Solution: By the theorem on the minimum MSE linear estimate, we have
\[ \hat{W} = \frac{\mathrm{Cov}(W, Z)}{\sigma_Z^2} (Z - \mathrm{E}(Z)) + \mathrm{E}(W). \]
Here we have
\begin{align*}
\mathrm{E}(W) &= \mathrm{E}(X) - \mathrm{E}(Y), \\
\mathrm{E}(Z) &= \mathrm{E}(X) + \mathrm{E}(Y), \\
\sigma_Z^2 &= \sigma_X^2 + \sigma_Y^2 + 2 \rho_{XY} \sigma_X \sigma_Y, \\
\mathrm{Cov}(W, Z) &= \mathrm{E}(WZ) - \mathrm{E}(W)\mathrm{E}(Z) \\
&= \mathrm{E}((X - Y)(X + Y)) - (\mathrm{E}(X) - \mathrm{E}(Y))(\mathrm{E}(X) + \mathrm{E}(Y)) \\
&= \mathrm{E}(X^2) - \mathrm{E}(Y^2) - (\mathrm{E}(X))^2 + (\mathrm{E}(Y))^2 = \sigma_X^2 - \sigma_Y^2.
\end{align*}
So the best linear estimate of $W$ given $Z$ is
\[ \hat{W} = \frac{\sigma_X^2 - \sigma_Y^2}{\sigma_X^2 + \sigma_Y^2 + 2\rho_{XY}\sigma_X\sigma_Y} \bigl(Z - \mathrm{E}(X) - \mathrm{E}(Y)\bigr) + \mathrm{E}(X) - \mathrm{E}(Y). \]
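To see the formula in action, one can simulate a correlated pair with made-up moments (the numbers below are illustrative only, not from the exercise) and compare the closed-form coefficient of $Z$ with the empirical one (a sketch assuming NumPy).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
mx, my, sx, sy, rho = 1.0, -2.0, 2.0, 3.0, 0.5   # illustrative moments only
u, v = rng.standard_normal(n), rng.standard_normal(n)
x = mx + sx * u
y = my + sy * (rho * u + np.sqrt(1 - rho ** 2) * v)  # Corr(X, Y) = rho

z, w = x + y, x - y
slope = (sx ** 2 - sy ** 2) / (sx ** 2 + sy ** 2 + 2 * rho * sx * sy)
print(slope)                             # closed-form coefficient of Z
print(np.cov(w, z)[0, 1] / np.var(z))    # empirical Cov(W, Z) / Var(Z)
\end{verbatim}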
3. Let $X$ and $Y$ be two random variables with joint pdf
\[ f(x, y) = \begin{cases} x + y, & 0 \le x \le 1,\ 0 \le y \le 1, \\ 0, & \text{otherwise.} \end{cases} \]

(a) Find the MMSE estimator of $X$ given $Y$.
(b) Find the corresponding MSE.
(c) Find the pdf of $Z = \mathrm{E}(X \mid Y)$.
(d) Find the linear MMSE estimator of $X$ given $Y$.
(e) Find the corresponding MSE.

Solution:

(a) We first calculate the marginal pdf of $Y$ by a direct integration. For $y < 0$ or $y > 1$ the density is zero; for $0 \le y \le 1$,
\[ f_Y(y) = \int_0^1 (x + y)\, dx = y + \frac{1}{2}. \]
Note that the limits of the integration are derived from the definition of the joint pdf. Thus,
\[ f_Y(y) = \begin{cases} y + \tfrac{1}{2}, & 0 \le y \le 1, \\ 0, & \text{otherwise.} \end{cases} \]
Now we can calculate the conditional pdf
\[ f_{X|Y}(x \mid y) = \frac{f_{XY}(x, y)}{f_Y(y)} = \frac{x + y}{y + \tfrac{1}{2}} \]
for $0 \le x, y \le 1$. Therefore, for $0 \le y \le 1$,
\[ \mathrm{E}[X \mid Y = y] = \int_0^1 x f_{X|Y}(x \mid y)\, dx = \int_0^1 \frac{x^2 + yx}{y + \tfrac{1}{2}}\, dx = \frac{\tfrac{1}{3} + \tfrac{y}{2}}{\tfrac{1}{2} + y}, \]
hence
\[ \mathrm{E}[X \mid Y] = \frac{\tfrac{1}{3} + \tfrac{Y}{2}}{\tfrac{1}{2} + Y}. \]

(b) The MSE is given by
\begin{align*}
\mathrm{E}(\mathrm{Var}(X \mid Y)) &= \mathrm{E}(X^2) - \mathrm{E}\bigl((\mathrm{E}(X \mid Y))^2\bigr) \\
&= \int_0^1 x^2 \Bigl(x + \frac{1}{2}\Bigr)\, dx - \int_0^1 \left( \frac{\tfrac{1}{3} + \tfrac{y}{2}}{\tfrac{1}{2} + y} \right)^2 \Bigl(y + \frac{1}{2}\Bigr)\, dy \\
&= \frac{5}{12} - \Bigl(\frac{1}{3} + \frac{\ln 3}{144}\Bigr) = \frac{1}{12} - \frac{\ln 3}{144} \approx 0.0757.
\end{align*}

(c) Since
\[ \mathrm{E}[X \mid Y = y] = \frac{\tfrac{1}{3} + \tfrac{y}{2}}{\tfrac{1}{2} + y} = \frac{1}{2} + \frac{1}{6(1 + 2y)}, \]
we have $\frac{5}{9} \le \mathrm{E}[X \mid Y = y] \le \frac{2}{3}$ for $0 \le y \le 1$. From part (a), we know $Z = \mathrm{E}[X \mid Y] = \frac{1/3 + Y/2}{1/2 + Y}$, so $\frac{5}{9} \le Z \le \frac{2}{3}$. We first find the cdf $F_Z(z)$ of $Z$ and then differentiate it to get the pdf. Consider
\[ F_Z(z) = P\{Z \le z\} = P\left\{ \frac{\tfrac{1}{3} + \tfrac{Y}{2}}{\tfrac{1}{2} + Y} \le z \right\} = P\{2 + 3Y \le 3z + 6zY\} = P\left\{ Y \ge \frac{2 - 3z}{6z - 3} \right\}. \]
For $0 \le \frac{2-3z}{6z-3} \le 1$, i.e., $\frac{5}{9} \le z \le \frac{2}{3}$, we have
\[ F_Z(z) = \int_{\frac{2-3z}{6z-3}}^{1} f_Y(y)\, dy = \int_{\frac{2-3z}{6z-3}}^{1} \Bigl(y + \frac{1}{2}\Bigr)\, dy. \]
By differentiating with respect to $z$, we get
\[ f_Z(z) = -\left( \frac{2 - 3z}{6z - 3} + \frac{1}{2} \right) \frac{d}{dz}\left( \frac{2 - 3z}{6z - 3} \right) = \frac{1}{6(2z - 1)} \cdot \frac{1}{3(2z - 1)^2} = \frac{1}{18(2z - 1)^3} \]
for $\frac{5}{9} \le z \le \frac{2}{3}$. Otherwise $f_Z(z) = 0$.

(d) The best linear MMSE estimate is
\[ \hat{X} = \frac{\mathrm{Cov}(X, Y)}{\sigma_Y^2} (Y - \mathrm{E}(Y)) + \mathrm{E}(X). \]
Here we have
\begin{align*}
\mathrm{E}(X) = \mathrm{E}(Y) &= \int_0^1 y \Bigl(y + \frac{1}{2}\Bigr)\, dy = \frac{7}{12}, \\
\mathrm{Cov}(X, Y) &= \mathrm{E}(XY) - \mathrm{E}(X)\mathrm{E}(Y) = \int_0^1 \int_0^1 xy(x + y)\, dx\, dy - \frac{7}{12} \cdot \frac{7}{12} = \frac{1}{3} - \frac{49}{144} = -\frac{1}{144}, \\
\sigma_Y^2 &= \mathrm{E}(Y^2) - (\mathrm{E}(Y))^2 = \int_0^1 y^2 \Bigl(y + \frac{1}{2}\Bigr)\, dy - \Bigl(\frac{7}{12}\Bigr)^2 = \frac{5}{12} - \frac{49}{144} = \frac{11}{144}.
\end{align*}
So the best linear MMSE estimate is
\[ \hat{X} = -\frac{1}{11}\Bigl(Y - \frac{7}{12}\Bigr) + \frac{7}{12}. \]
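The values in parts (a), (b), and (d) can be cross-checked by sampling from $f(x, y)$; rejection sampling is one simple way, since $f(x, y) \le 2$ on the unit square (a sketch assuming NumPy, not part of the exercise).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 4_000_000
x, y, u = rng.uniform(size=(3, n))
keep = u < (x + y) / 2        # accept with probability f(x, y) / 2
x, y = x[keep], y[keep]       # (X, Y) ~ f(x, y) = x + y on [0, 1]^2

z = (1/3 + y/2) / (1/2 + y)           # E(X | Y), part (a)
print(np.mean((x - z) ** 2))          # ~ 1/12 - ln(3)/144 ~ 0.0757, part (b)
xhat = -(y - 7/12) / 11 + 7/12        # linear MMSE estimate, part (d)
print(np.mean((x - xhat) ** 2))       # MSE of the linear estimate, cf. part (e)
\end{verbatim}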
(e) The MSE of the linear estimate is
\[ \mathrm{MSE} = \sigma_X^2 - \frac{\mathrm{Cov}^2(X, Y)}{\sigma_Y^2}. \]
Here, by symmetry, $\sigma_X^2 = \sigma_Y^2 = \frac{11}{144}$. Thus,
\[ \mathrm{MSE} = \frac{11}{144} - \frac{(1/144)^2}{11/144} = \frac{11}{144} - \frac{1}{1584} = \frac{5}{66} \approx 0.0758. \]
We can check that $\mathrm{LMSE} > \mathrm{MSE}$: $\frac{5}{66} \approx 0.0758$ is (slightly) larger than $\frac{1}{12} - \frac{\ln 3}{144} \approx 0.0757$ from part (b).

4. Additive-noise channel with signal dependent noise. Consider the channel with correlated signal $X$ and noise $Z$ and observation $Y = 2X + Z$, where
\[ \mu_X = 1, \quad \mu_Z = 0, \quad \sigma_X^2 = 4, \quad \sigma_Z^2 = 9, \quad \rho_{X,Z} = -\frac{3}{8}. \]
Find the best MSE linear estimate of $X$ given $Y$.

Solution: The best linear MMSE estimate is given by the formula
\[ \hat{X} = \frac{\mathrm{Cov}(X, Y)}{\sigma_Y^2} (Y - \mathrm{E}(Y)) + \mathrm{E}(X). \]
Here we have
\begin{align*}
\mathrm{E}(Y) &= 2\,\mathrm{E}(X) + \mathrm{E}(Z) = 2, \\
\sigma_Y^2 &= 4\sigma_X^2 + \sigma_Z^2 + 4\rho_{X,Z}\sigma_X\sigma_Z = 16 + 9 - 4 \cdot \frac{3}{8} \cdot 2 \cdot 3 = 16, \\
\mathrm{Cov}(X, Y) &= \mathrm{E}(XY) - \mathrm{E}(X)\mathrm{E}(Y) = \mathrm{E}(2X^2 + XZ) - 2 \\
&= 2(\sigma_X^2 + \mu_X^2) + (\rho_{X,Z}\sigma_X\sigma_Z + \mu_X\mu_Z) - 2 = 10 - \frac{9}{4} - 2 = \frac{23}{4}.
\end{align*}
So the best linear MMSE estimate is
\[ \hat{X} = \frac{23/4}{16}(Y - 2) + 1 = \frac{23}{64}\, Y + \frac{9}{32}. \]
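As a final sanity check (again an addition, assuming NumPy), simulating $X$ and $Z$ with the stated moments recovers the coefficients $23/64$ and $9/32$.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
rho = -3 / 8
u, v = rng.standard_normal(n), rng.standard_normal(n)
x = 1 + 2 * u                                    # mu_X = 1, sigma_X^2 = 4
z = 3 * (rho * u + np.sqrt(1 - rho ** 2) * v)    # mu_Z = 0, sigma_Z^2 = 9
y = 2 * x + z                                    # Corr(X, Z) = rho by design

b = np.cov(x, y)[0, 1] / np.var(y)   # ~ 23/64
a = np.mean(x) - b * np.mean(y)      # ~ 9/32
print(b, a)
\end{verbatim}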