Lecture Notes Special: Using Neural Networks for Mean-Squared Estimation. So: How can we evaluate E(X Y )? What is a neural network? How is it useful?

Size: px

Start display at page:

Download "Lecture Notes Special: Using Neural Networks for Mean-Squared Estimation. So: How can we evaluate E(X Y )? What is a neural network? How is it useful?"

Vincent Myles Gibbs
5 years ago
Views:

1 Lecture Notes Special: Using Neural Networks for Mean-Squared Estimation The Story So Far... So: How can we evaluate E(X Y )? What is a neural network? How is it useful? EE 278: Using Neural Networks for Mean-Squared Estimation age 1

2 The Story So Far... Consider the AWGN Channel: Y = X + Z X N (0, ), Z N (0, N), X and Z are independent. MMSE roblem: Find ˆX = g(y ) such that E(X ˆX) 2 is minimized. Solution: ˆX = g(y ) = E(X Y ). The best linear estimator ˆX lin = ay + b = + N Y also turns out to be the best MMSE estimator for the AWGN channel. That is, E(X Y ) = + N Y. EE 278: Using Neural Networks for Mean-Squared Estimation age 2

3 In general, estimation problems are formulated as follows: Given Y = h(x, Z), where X is signal and Z is noise, find ˆX = g(y ) such that E(X ˆX) 2 is minimized. We have seen that the best MMSE estimate is ˆX = g(y ) = E(X Y ). However, E(X Y ) can be very hard to compute. For example, consider some example h(, ) and X, Z distributions Linear channel: Y = X + Z, but X, Z are not Gaussian already very hard. Nonlinear channels: Y = XZ X 2 Z X 2 + Z X, Z arbitrarily distributed (including Gaussian) also very hard! Vector channels (linear or nonlinear) X, Z arbitrarily distributed. Y = h(x, Z) EE 278: Using Neural Networks for Mean-Squared Estimation age 3

4 These channels arise in many practical applications, including Communications (mostly additive or multiplicative channels with Gaussian variables) Biological systems (arbitrary channels and X, Z distributions) E-commerce/retail data (arbitrary channels and X, Z distributions) Networking (arbitrary channels and X, Z distributions) etc. EE 278: Using Neural Networks for Mean-Squared Estimation age 4

5 So: How can we evaluate E(X Y )? Two ways to proceed: Approximate E(X Y ) = g(y ) with polynomials. Approximate g( ) using neural networks and DATA. Let s look at polynomial approximations first. ( ( 1 Take a simple case Y = X + Z, X Laplace ), Z Laplace X and Z are independent. Additive Laplacian Channel: Y = X + Z f X (x) = 1 e x 2, f Z (z) = 1 e z 2 N, X and Z are independent. N 1 N ), EE 278: Using Neural Networks for Mean-Squared Estimation age 5

6 Note: Similarly E(X k ) = E(X) = 0 = E(Z) E(X 2 ) = 2 ; E(Z 2 ) = 2N 0 k odd xk e x E(Z k ) = dx = 1 k! { 0 k odd k! N k 2 k even ( 1 ) k+1 = k! k 2 k even First, let s see if we can determine E(X Y ): E(X Y ) = where we use Bayes rule to get f X Y (x y). x f X Y (x y)dx, EE 278: Using Neural Networks for Mean-Squared Estimation age 6

7 f X Y (x y) = f Y X(y x)f X (x) f Y (y) = f Z (y x)f X (x) f X(v)f Z (y v)dv Highly complicated! = 1 e y x 2 N 1 e x N 2 1 e v 2 1 e y v 2 N N dv So let s look for polynomial approximations to g( ) (where E(X Y ) = g(y )). Linear: ˆX = ay + b Using the orthogonality principle, we get a = ˆX lin = ay + b = +N Y. Quadratic: ˆX = ay 2 + by + c By the orthogonality principle, (X ˆX) 1, Y, Y 2. +N, b = 0 so (X ˆX) 1 E [ (X (ay 2 + by + c)) 1 ] = 0 E(X) a E(Y 2 ) b E(Y ) c = 0 c = a E(Y 2 ) EE 278: Using Neural Networks for Mean-Squared Estimation age 7

8 (X ˆX) Y E [ (X (ay 2 + by + c)) Y ] = 0 E(XY ) a E(Y 3 ) b E(Y 2 ) c E(Y ) = 0 E(X 2 + XN) 0 b E(X 2 + Z 2 + 2XZ) 0 = 0 E(X 2 ) = b((e(x 2 ) + E(Z 2 ) + 0) b = E(X 2 ) E(X 2 ) + E(Z 2 ) = N = + N (X ˆX) Y 2 E [ (X (ay 2 + by + c)) Y 2] = 0 E(X(X 2 + Z 2 + 2XZ)) a E(Y 4 ) b E(Y 3 ) c E(Y 2 ) = 0 E(X 3 + XZ 2 + 2X 2 Z) a E(Y 4 ) 0 c E(Y 2 ) = 0 Substituting c = a E(Y 2 ), 0 a E(Y 4 ) + a[e(y 2 )] 2 = 0 a(e(y 4 ) [E(Y 2 )] 2 ) = avar(y 2 ) = 0 Since Var(Y 2 ) > 0, ( Y 2 > 0 is a random variable and (Y 2 > 0) = 1) It must be that a = 0; Hence c = 0. ˆX quadratic = by = +N Y = ˆX lin!!! EE 278: Using Neural Networks for Mean-Squared Estimation age 8

9 Cubic: ˆX = ay 3 + by 2 + cy + d Again, by orthogonality (X ˆX) 1, Y, Y 2, Y 3. Solving the equations (and after a lot of algebra), we get where b = d = 0 a = 2( + N)[E(X4 ) + 24 N] 2 E(Y 6 ) 2( + N) E(Y 6 ) [E(Y 4 )] 2 c = [2 a E(Y 4 )], 2( + N) E(X 4 ) = 24 2 E(Y 4 ) = 24( 2 + N + N 2 ) E(Y 6 ) = 6!( N + N 2 + N 3 ) Highly messy! Now let s consider using neural networks and data EE 278: Using Neural Networks for Mean-Squared Estimation age 9

10 Take Y = h(x, Z); X = g(y) = E(X Y) EE 278: Using Neural Networks for Mean-Squared Estimation age 10

11 What is a neural network? EE 278: Using Neural Networks for Mean-Squared Estimation age 11

12 h 1,before = W Y + b = W 1 W 2. W k Y 1 Y 2. Y n b 1 + b 2. = W 1, Y + b 1. W k, Y + b 2 where W R k n is the weight matrix and b R k 1 is the bias vector. Apply ReLU (Rectified Linear Unit) type of nonlinearity h 1,after = max{ W 1, Y + b 1, 0}. max{ W k, Y + b 2, 0} h 2,after is an input to the 2 nd hidden layer and so on... Geometrically, the h 2,after vector takes Y and projects it onto the affine places defined by (W i, b i ) and takes the projection only if Y is on the positive side of the affine hyperplane. And so on.. Thus layer 1 classifies Y into regions defined by the projections onto (W i, b i ) hyperplanes. It then passes the positive projections to layer 2, setting negative projections to zero. b k EE 278: Using Neural Networks for Mean-Squared Estimation age 12

13 And so on for subsequent layers. Eventually, ˆX is a piecewise linear combination of the affine transformations in the neural network. We train the neural network with data (Y i, X i ) (Recall that X i ˆX i ) to get the weights W ij and the biases b i,k. Then, we use the trained neural network to estimate a new ˆX given a new Y. EE 278: Using Neural Networks for Mean-Squared Estimation age 13

UCSD ECE153 Handout #27 Prof. Young-Han Kim Tuesday, May 6, Solutions to Homework Set #5 (Prepared by TA Fatemeh Arbabjolfaei)

UCSD ECE153 Handout #27 Prof. Young-Han Kim Tuesday, May 6, Solutions to Homework Set #5 (Prepared by TA Fatemeh Arbabjolfaei) UCSD ECE53 Handout #7 Prof. Young-Han Kim Tuesday, May 6, 4 Solutions to Homework Set #5 (Prepared by TA Fatemeh Arbabjolfaei). Neural net. Let Y = X + Z, where the signal X U[,] and noise Z N(,) are independent.