ECE531 Screencast 5.5: Bayesian Estimation for the Linear Gaussian Model

1 ECE531 Screencast 5.5: Bayesian Estimation for the Linear Gaussian Model

D. Richard Brown III
Worcester Polytechnic Institute

2 Bayesian Estimation for the Linear Gaussian Model

Recall the linear Gaussian model

$$Y = H\Theta + W$$

where the observation $Y \in \mathbb{R}^n$, the mixing matrix $H \in \mathbb{R}^{n \times m}$ is known, the unknown parameter vector $\Theta \in \mathbb{R}^m$ is distributed as $\mathcal{N}(\mu_\Theta, \Sigma_\Theta)$, and the unknown noise vector $W \in \mathbb{R}^n$ is distributed as $\mathcal{N}(0, \Sigma_W)$.

Unless otherwise specified, we always assume the noise and the unknown parameters are independent of each other.

What are the Bayesian MMSE/MMAE/MAP estimators in this case?
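To make the model concrete, here is a minimal simulation sketch. The function name `sample_linear_gaussian` and the numeric values of `H`, `mu_theta`, `Sigma_theta`, and `Sigma_w` are illustrative assumptions, not values from the slides.

```python
import numpy as np

def sample_linear_gaussian(H, mu_theta, Sigma_theta, Sigma_w, num_samples, rng):
    """Draw (theta, y) pairs from Y = H Theta + W with independent Gaussian Theta and W."""
    n, _ = H.shape
    # Theta ~ N(mu_theta, Sigma_theta), one sample per row
    theta = rng.multivariate_normal(mu_theta, Sigma_theta, size=num_samples)
    # W ~ N(0, Sigma_w), drawn independently of Theta
    w = rng.multivariate_normal(np.zeros(n), Sigma_w, size=num_samples)
    # Row-wise version of y = H theta + w
    y = theta @ H.T + w
    return theta, y

# Hypothetical example values (n = 3 observations, m = 2 parameters)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
mu_theta = np.array([1.0, -1.0])
Sigma_theta = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_w = 0.5 * np.eye(3)

rng = np.random.default_rng(0)
theta, y = sample_linear_gaussian(H, mu_theta, Sigma_theta, Sigma_w, 5, rng)
```

Each row of `theta` is one draw of the parameter vector and the matching row of `y` is the corresponding observation.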

3 Linear Gaussian Model: Posterior Distribution Analysis

To develop an expression for the posterior distribution $\pi_y(\theta)$, we first note that

$$\pi_y(\theta) = \frac{p_{Y,\Theta}(y,\theta)}{p_Y(y)}.$$

To find the joint distribution $p_{Y,\Theta}(y,\theta)$, let

$$Z = \begin{bmatrix} Y \\ \Theta \end{bmatrix} = \begin{bmatrix} H & I \\ I & 0 \end{bmatrix} \begin{bmatrix} \Theta \\ W \end{bmatrix}.$$

Since $\Theta$ and $W$ are independent of each other and each is Gaussian, they are jointly Gaussian. Furthermore, since $Z$ is a linear transformation of a jointly Gaussian random vector, it too is jointly Gaussian. To fully characterize $Z \sim \mathcal{N}(\mu_Z, \Sigma_Z)$, we just need its mean and covariance:

$$\mu_Z := \mathrm{E}[Z] = \begin{bmatrix} H\mu_\Theta \\ \mu_\Theta \end{bmatrix}$$

$$\Sigma_Z := \operatorname{cov}[Z] = \begin{bmatrix} H\Sigma_\Theta H^\top + \Sigma_W & H\Sigma_\Theta \\ \Sigma_\Theta H^\top & \Sigma_\Theta \end{bmatrix}$$
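The mean and covariance of $Z$ can be checked numerically by pushing the moments of the stacked vector $[\Theta; W]$ through the linear map. This is only a sanity-check sketch; the numeric values are illustrative assumptions.

```python
import numpy as np

# Hypothetical example values (n = 3, m = 2)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
mu_theta = np.array([1.0, -1.0])
Sigma_theta = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_w = 0.5 * np.eye(3)
n, m = H.shape

# Z = A [Theta; W] with A = [[H, I_n], [I_m, 0]]
A = np.block([[H, np.eye(n)], [np.eye(m), np.zeros((m, n))]])

# [Theta; W] has mean [mu_theta; 0] and, by independence, block-diagonal covariance
mu_joint = np.concatenate([mu_theta, np.zeros(n)])
Sigma_joint = np.block([
    [Sigma_theta, np.zeros((m, n))],
    [np.zeros((n, m)), Sigma_w],
])

# Moments of a linear transformation: E[AX] = A E[X], cov[AX] = A cov[X] A^T
mu_Z = A @ mu_joint
Sigma_Z = A @ Sigma_joint @ A.T

# Closed-form expressions from the slide
mu_Z_closed = np.concatenate([H @ mu_theta, mu_theta])
Sigma_Z_closed = np.block([
    [H @ Sigma_theta @ H.T + Sigma_w, H @ Sigma_theta],
    [Sigma_theta @ H.T, Sigma_theta],
])

print(np.allclose(mu_Z, mu_Z_closed), np.allclose(Sigma_Z, Sigma_Z_closed))
```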

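4 Linear Gaussian Model: Posterior Distribution Analysis

To compute the posterior, we can write

$$\pi_y(\theta) = \frac{p_Z(z)}{p_Y(y)} = \frac{(2\pi)^{-(m+n)/2}\,|\Sigma_Z|^{-1/2} \exp\left\{ -\frac{(z-\mu_Z)^\top \Sigma_Z^{-1} (z-\mu_Z)}{2} \right\}}{(2\pi)^{-n/2}\,|\Sigma_Y|^{-1/2} \exp\left\{ -\frac{(y-\mu_Y)^\top \Sigma_Y^{-1} (y-\mu_Y)}{2} \right\}}$$

To simplify the terms outside of the exponentials, note that

$$\Sigma_Z := \operatorname{cov}[Z] = \begin{bmatrix} H\Sigma_\Theta H^\top + \Sigma_W & H\Sigma_\Theta \\ \Sigma_\Theta H^\top & \Sigma_\Theta \end{bmatrix} = \begin{bmatrix} \Sigma_Y & \Sigma_{Y,\Theta} \\ \Sigma_{\Theta,Y} & \Sigma_\Theta \end{bmatrix}$$

The determinant of a partitioned matrix $P = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ can be written as

$$|P| = |A|\,|D - CA^{-1}B|$$

if $A$ is invertible. Assuming the covariance matrices here are positive definite (and hence invertible), the terms outside the exponentials can be simplified to

$$\frac{(2\pi)^{-(m+n)/2}\,|\Sigma_Z|^{-1/2}}{(2\pi)^{-n/2}\,|\Sigma_Y|^{-1/2}} = (2\pi)^{-m/2}\,\left|\Sigma_\Theta - \Sigma_{\Theta,Y}\Sigma_Y^{-1}\Sigma_{Y,\Theta}\right|^{-1/2}$$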
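The partitioned-determinant identity can be spot-checked numerically with $A = \Sigma_Y$ and $D = \Sigma_\Theta$. The sketch below uses illustrative values that are assumptions, chosen so that every covariance matrix is positive definite.

```python
import numpy as np

# Hypothetical example values (n = 3, m = 2)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
Sigma_theta = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_w = 0.5 * np.eye(3)

Sigma_Y = H @ Sigma_theta @ H.T + Sigma_w
Sigma_YT = H @ Sigma_theta      # Sigma_{Y,Theta}
Sigma_TY = Sigma_YT.T           # Sigma_{Theta,Y} = Sigma_theta H^T

Sigma_Z = np.block([[Sigma_Y, Sigma_YT], [Sigma_TY, Sigma_theta]])

# Schur complement of Sigma_Y in Sigma_Z (D - C A^{-1} B in the slide's notation)
schur = Sigma_theta - Sigma_TY @ np.linalg.solve(Sigma_Y, Sigma_YT)

det_joint = np.linalg.det(Sigma_Z)
det_product = np.linalg.det(Sigma_Y) * np.linalg.det(schur)
print(np.isclose(det_joint, det_product))
```

Using `np.linalg.solve` instead of forming an explicit inverse is a common numerical preference, not something the slides prescribe.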

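5 Linear Gaussian Model: Posterior Distribution Analysis

To simplify the terms inside the exponentials, we can use a matrix inversion formula for partitioned matrices ($A$ must be invertible)

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} (A - BD^{-1}C)^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{bmatrix}$$

and the matrix inversion lemma

$$(A - BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1}.$$

Skipping all the algebraic details, we can write

$$\exp\left\{ -\frac{(z-\mu_Z)^\top \Sigma_Z^{-1} (z-\mu_Z)}{2} \right\} = \exp\left\{ -\frac{(\theta-\alpha(y))^\top \Sigma^{-1} (\theta-\alpha(y))}{2} \right\} \exp\left\{ -\frac{(y-\mu_Y)^\top \Sigma_Y^{-1} (y-\mu_Y)}{2} \right\}$$

where $\alpha(y) = \mu_\Theta + \Sigma_{\Theta,Y}\Sigma_Y^{-1}(y - \mu_Y)$ and $\Sigma = \Sigma_\Theta - \Sigma_{\Theta,Y}\Sigma_Y^{-1}\Sigma_{Y,\Theta}$.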
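Since the algebra is skipped, a numerical spot check of the factorization can be reassuring. The sketch below tests the equivalent statement on the exponents: the joint quadratic form equals the sum of the two quadratic forms on the right. The helper `quad_form`, the test point, and all numeric values are illustrative assumptions.

```python
import numpy as np

# Hypothetical example values (n = 3, m = 2)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
mu_theta = np.array([1.0, -1.0])
Sigma_theta = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_w = 0.5 * np.eye(3)
n, m = H.shape

mu_Y = H @ mu_theta
Sigma_Y = H @ Sigma_theta @ H.T + Sigma_w
Sigma_YT = H @ Sigma_theta      # Sigma_{Y,Theta}
Sigma_TY = Sigma_YT.T           # Sigma_{Theta,Y}
mu_Z = np.concatenate([mu_Y, mu_theta])
Sigma_Z = np.block([[Sigma_Y, Sigma_YT], [Sigma_TY, Sigma_theta]])

def quad_form(v, S):
    """Return v^T S^{-1} v without forming the inverse explicitly."""
    return v @ np.linalg.solve(S, v)

# Arbitrary test point z = [y; theta]
rng = np.random.default_rng(1)
y = rng.normal(size=n)
theta = rng.normal(size=m)
z = np.concatenate([y, theta])

alpha = mu_theta + Sigma_TY @ np.linalg.solve(Sigma_Y, y - mu_Y)
Sigma = Sigma_theta - Sigma_TY @ np.linalg.solve(Sigma_Y, Sigma_YT)

joint_quad = quad_form(z - mu_Z, Sigma_Z)
split_quad = quad_form(theta - alpha, Sigma) + quad_form(y - mu_Y, Sigma_Y)
print(np.isclose(joint_quad, split_quad))
```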

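6 Linear Gaussian Model: Posterior Distribution Analysis

Putting it all together, we have the posterior distribution

$$\pi_y(\theta) = (2\pi)^{-m/2}\,|\Sigma|^{-1/2} \exp\left\{ -\frac{(\theta-\alpha(y))^\top \Sigma^{-1} (\theta-\alpha(y))}{2} \right\}$$

where $\alpha(y) = \mu_\Theta + \Sigma_{\Theta,Y}\Sigma_Y^{-1}(y - \mu_Y)$ and $\Sigma = \Sigma_\Theta - \Sigma_{\Theta,Y}\Sigma_Y^{-1}\Sigma_{Y,\Theta}$ with

$$\Sigma_{\Theta,Y} = \operatorname{cov}(\Theta, Y) = \mathrm{E}\left[ (\Theta - \mu_\Theta)(H\Theta + W - H\mu_\Theta)^\top \right] = \Sigma_\Theta H^\top$$

$$\Sigma_{Y,\Theta} = \Sigma_{\Theta,Y}^\top = H\Sigma_\Theta$$

$$\Sigma_Y = \operatorname{cov}(Y, Y) = H\Sigma_\Theta H^\top + \Sigma_W$$

$$\mu_Y = \mathrm{E}[H\Theta + W] = H\mu_\Theta$$

What can we say about the posterior distribution of the random parameter $\Theta$ conditioned on the observation $Y = y$?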

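7 Linear Gaussian Model: Bayesian Estimators

Lemma. In the linear Gaussian model, the parameter vector $\Theta$ conditioned on the observation $Y = y$ is jointly Gaussian distributed with

$$\mathrm{E}[\Theta \mid Y = y] = \mu_\Theta + \Sigma_\Theta H^\top \left( H\Sigma_\Theta H^\top + \Sigma_W \right)^{-1} (y - H\mu_\Theta)$$

$$\operatorname{cov}[\Theta \mid Y = y] = \Sigma_\Theta - \Sigma_\Theta H^\top \left( H\Sigma_\Theta H^\top + \Sigma_W \right)^{-1} H\Sigma_\Theta$$

Corollary. In the linear Gaussian model

$$\hat{\theta}_{\mathrm{mmse}}(y) = \hat{\theta}_{\mathrm{mmae}}(y) = \hat{\theta}_{\mathrm{map}}(y)$$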
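The lemma translates directly into a few lines of NumPy. The function name `posterior_linear_gaussian` and the numeric values are illustrative assumptions. As a cross-check that is not part of the slides, the sketch also compares the result with the equivalent "information form" $\left(\Sigma_\Theta^{-1} + H^\top \Sigma_W^{-1} H\right)^{-1}$, which follows from the matrix inversion lemma when $\Sigma_\Theta$ and $\Sigma_W$ are invertible.

```python
import numpy as np

def posterior_linear_gaussian(y, H, mu_theta, Sigma_theta, Sigma_w):
    """Posterior mean and covariance of Theta given Y = y, following the lemma."""
    Sigma_Y = H @ Sigma_theta @ H.T + Sigma_w
    # K = Sigma_theta H^T Sigma_Y^{-1}; the transpose trick relies on Sigma_Y being symmetric
    K = np.linalg.solve(Sigma_Y, H @ Sigma_theta).T
    mean = mu_theta + K @ (y - H @ mu_theta)
    cov = Sigma_theta - K @ H @ Sigma_theta
    return mean, cov

# Hypothetical example values (n = 3, m = 2)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
mu_theta = np.array([1.0, -1.0])
Sigma_theta = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_w = 0.5 * np.eye(3)
y = np.array([0.8, 0.3, -1.9])

post_mean, post_cov = posterior_linear_gaussian(y, H, mu_theta, Sigma_theta, Sigma_w)

# Cross-check against the information form (an assumption-labeled alternative, not from the slides)
prec_theta = np.linalg.inv(Sigma_theta)
prec_w = np.linalg.inv(Sigma_w)
info_cov = np.linalg.inv(prec_theta + H.T @ prec_w @ H)
info_mean = info_cov @ (prec_theta @ mu_theta + H.T @ prec_w @ y)

print(np.allclose(post_mean, info_mean), np.allclose(post_cov, info_cov))
```

Because the posterior is Gaussian, `post_mean` serves as the MMSE, MMAE, and MAP estimate at once, which is the content of the corollary.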

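8 Linear Gaussian Model: Bayesian Estimator Remarks

All of the estimators are linear (actually affine) in the observation $y$.

Recall that the performance of the Bayesian MMSE estimator is

$$\mathrm{MMSE} = \mathrm{E}\left[ \left\| \Theta - \hat{\theta}_{\mathrm{mmse}}(Y) \right\|_2^2 \right] = \int \operatorname{trace}\{\operatorname{cov}(\Theta \mid Y = y)\}\, p(y)\, dy.$$

In the linear Gaussian model, we see that $\operatorname{cov}[\Theta \mid Y = y]$ does not depend on $y$. Hence, we can move the trace outside of the integral and write the MMSE as

$$\mathrm{MMSE} = \operatorname{trace}\{\operatorname{cov}[\Theta \mid Y = y]\} \int p(y)\, dy = \operatorname{trace}\{\Sigma_\Theta\} - \operatorname{trace}\left\{ \Sigma_\Theta H^\top \left( H\Sigma_\Theta H^\top + \Sigma_W \right)^{-1} H\Sigma_\Theta \right\}.$$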
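The closed-form MMSE can be compared against a Monte Carlo estimate of the mean squared error of the affine estimator. This is a rough sketch under assumed values: the function name `mmse_linear_gaussian`, the sample count, and the model parameters are all illustrative.

```python
import numpy as np

def mmse_linear_gaussian(H, Sigma_theta, Sigma_w):
    """Closed-form MMSE: trace of the posterior covariance, which does not depend on y."""
    Sigma_Y = H @ Sigma_theta @ H.T + Sigma_w
    reduction = Sigma_theta @ H.T @ np.linalg.solve(Sigma_Y, H @ Sigma_theta)
    return np.trace(Sigma_theta) - np.trace(reduction)

# Hypothetical example values (n = 3, m = 2)
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
mu_theta = np.array([1.0, -1.0])
Sigma_theta = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_w = 0.5 * np.eye(3)
n, m = H.shape

mmse_closed = mmse_linear_gaussian(H, Sigma_theta, Sigma_w)

# Monte Carlo: simulate the model, apply the affine estimator, average the squared error
rng = np.random.default_rng(0)
num_samples = 200_000
theta = rng.multivariate_normal(mu_theta, Sigma_theta, size=num_samples)
w = rng.multivariate_normal(np.zeros(n), Sigma_w, size=num_samples)
y = theta @ H.T + w

Sigma_Y = H @ Sigma_theta @ H.T + Sigma_w
K = np.linalg.solve(Sigma_Y, H @ Sigma_theta).T   # Sigma_theta H^T Sigma_Y^{-1}
theta_hat = mu_theta + (y - H @ mu_theta) @ K.T   # row-wise posterior means

mmse_mc = np.mean(np.sum((theta - theta_hat) ** 2, axis=1))
print(mmse_closed, mmse_mc)
```

The two numbers should agree up to Monte Carlo error, which shrinks as `num_samples` grows.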
