ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)


Jingxian Wu
Department of Electrical Engineering, University of Arkansas

Outline
- Minimum Variance Unbiased Estimators (MVUE)
- Cramer-Rao Lower Bound (CRLB)
- Best Linear Unbiased Estimators (BLUE)

Minimum Variance Unbiased Estimators (MVUE)
Recall $\mathrm{MSE}(\hat\theta) = \|\mathrm{bias}(\hat\theta)\|_2^2 + \mathrm{var}(\hat\theta)$.
It is usually impossible to design $\hat\theta$ to minimize the MSE directly, because the bias depends on the true value $\theta$, which is unknown.
Restrict attention to unbiased estimators, $E(\hat\theta) = \theta$. Then $\mathrm{MSE}(\hat\theta) = \mathrm{var}(\hat\theta)$.
Note that $\mathrm{var}(\hat\theta)$ may still depend on $\theta$, so the best estimator must minimize the variance for every value of $\theta$.
A realizable approach: minimize the MSE over all unbiased estimators. The Minimum Variance UnBiased (MVUB) estimator is defined as
$$\hat\theta = \arg\min_{\hat\theta:\, E(\hat\theta)=\theta} E\big[\|\hat\theta - E(\hat\theta)\|_2^2\big].$$

Example
$X_1, X_2, \ldots, X_n$ i.i.d. $\mathcal{N}(\theta, \sigma^2)$. Let $\hat\theta = \frac{1}{n}\sum_{i=1}^n X_i$. We have
$$E\hat\theta = \theta, \qquad \mathrm{MSE}(\hat\theta) = \frac{1}{n^2}\sum_{i=1}^n \mathrm{var}(X_i) = \frac{\sigma^2}{n}.$$
Is this the MVUB estimator? This question can be answered using the Cramer-Rao Lower Bound (CRLB).
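A quick simulation can illustrate the two facts above (unbiasedness and variance $\sigma^2/n$). This is a minimal sketch added for illustration, not part of the original slides; the values of $\theta$, $\sigma$, and $n$ are arbitrary assumptions.

```python
import numpy as np

# Sketch: empirically check E[theta_hat] = theta and var(theta_hat) = sigma^2/n
# for the sample mean of i.i.d. N(theta, sigma^2) data.
rng = np.random.default_rng(0)
theta, sigma, n, trials = 2.0, 3.0, 50, 100_000

x = rng.normal(theta, sigma, size=(trials, n))   # independent data sets
theta_hat = x.mean(axis=1)                       # sample-mean estimator per data set

print("empirical bias     :", theta_hat.mean() - theta)  # approximately 0 (unbiased)
print("empirical variance :", theta_hat.var())           # approximately sigma^2/n
print("sigma^2 / n        :", sigma**2 / n)
```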

MVUE
- Does an MVUE always exist?
- If it does, can we always find it?
- Can we say anything about the MVUE?

Cramer-Rao Lower Bound (CRLB)
- The CRLB gives a lower bound on the variance of ANY unbiased estimator.
- It does NOT guarantee that the bound can be achieved.
- It can be used to verify that a particular estimator is MVUB.
- Otherwise, other tools can be used to construct a better estimator from any unbiased one, possibly the MVUE if certain conditions are met.

CRLB for Scalar Parameters
Theorem (Cramer-Rao Lower Bound (CRLB))
Let $p(x;\theta)$ satisfy the regularity condition
$$E\left[\frac{\partial \ln p(x;\theta)}{\partial \theta}\right] = 0 \quad \text{for all } \theta.$$
Then the variance of any unbiased estimator $\hat\theta$ must satisfy
$$\mathrm{var}(\hat\theta) \;\ge\; \frac{1}{-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\right]} \;=\; \frac{1}{E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial \theta}\right)^2\right]},$$
where the derivatives are evaluated at the true value of $\theta$.
Furthermore, an unbiased estimator attaining the bound for all $\theta$ may be found if and only if
$$\frac{\partial \ln p(x;\theta)}{\partial \theta} = I(\theta)\,[g(x) - \theta]$$
for some functions $g(\cdot)$ and $I(\cdot)$. That estimator, which is the MVUE, is $\hat\theta = g(x)$, and the minimum variance is $1/I(\theta)$.

Example
$X_1, X_2, \ldots, X_n$ i.i.d. $\mathcal{N}(\theta, \sigma^2)$. Let $\hat\theta = \frac{1}{n}\sum_{i=1}^n X_i$. Is it MVUB?
Solution:
$$\log p(x;\theta) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \theta)^2$$
$$\frac{\partial}{\partial\theta}\log p(x;\theta) = \frac{1}{\sigma^2}\sum_{i=1}^n (X_i - \theta), \qquad \frac{\partial^2}{\partial\theta^2}\log p(x;\theta) = -\frac{n}{\sigma^2}$$
$$I(\theta) = E\left[\left(\frac{\partial \log p(x;\theta)}{\partial\theta}\right)^2\right] = \frac{1}{\sigma^4}\sum_{i=1}^n E\big[(X_i - \theta)^2\big] = \frac{n}{\sigma^2} = -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right]$$
$$\mathrm{var}(\hat\theta) \ge \frac{1}{I(\theta)} = \frac{\sigma^2}{n}$$
Since $\mathrm{var}(\hat\theta) = \sigma^2/n$ attains this bound, the sample mean is the MVUB estimator.

More about I(θ): Fisher Information
$$I(\theta) = -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right] = E\left[\left(\frac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]$$
- It is always non-negative.
- It is additive for independent observations.
- Consequently, the CRLB for N i.i.d. observations is 1/N times that for a single observation.
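As a quick numerical illustration of the additivity property (a sketch added here, not from the slides), one can estimate $I(\theta)$ for $n$ i.i.d. $\mathcal{N}(\theta, \sigma^2)$ samples by averaging the squared score over simulated data and compare it with $n/\sigma^2$, i.e. $n$ times the single-observation information $1/\sigma^2$.

```python
import numpy as np

# Sketch: Monte Carlo estimate of the Fisher information
# I(theta) = E[(d/dtheta log p(x; theta))^2] for n i.i.d. N(theta, sigma^2) samples.
# For this model the score is (1/sigma^2) * sum(x_i - theta), and I(theta) = n / sigma^2.
rng = np.random.default_rng(1)
theta, sigma, n, trials = 1.0, 2.0, 10, 100_000

x = rng.normal(theta, sigma, size=(trials, n))
score = (x - theta).sum(axis=1) / sigma**2        # d/dtheta log p(x; theta)

print("Monte Carlo I(theta):", np.mean(score**2))  # approximately n / sigma^2
print("n / sigma^2         :", n / sigma**2)       # additivity: n times 1/sigma^2
```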

Theorem (Vector Form of the Cramer-Rao Lower Bound (CRLB))
Assume $p(x;\theta)$ satisfies the regularity condition $E\left[\frac{\partial \ln p(x;\theta)}{\partial\theta}\right] = 0$ for all $\theta$. Let $\hat\theta = \hat\theta(x)$ be an unbiased estimator of $\theta$. Then the error covariance satisfies
$$E\big[(\hat\theta - E\hat\theta)(\hat\theta - E\hat\theta)^T\big] - I^{-1}(\theta) \succeq 0,$$
where $\succeq 0$ means the matrix is positive semi-definite. $I(\theta)$ is the Fisher information matrix with $(i,j)$th element
$$I_{ij}(\theta) = -E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_i\,\partial\theta_j}\right].$$
Furthermore, an unbiased estimator attaining the bound may be found if and only if
$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)\,[g(x) - \theta].$$
In that case, $\hat\theta = g(x)$ is the MVUE with covariance matrix $I^{-1}(\theta)$.

Example
Consider $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where $w[n]$ is WGN with variance $\sigma^2$. What is the CRLB for the vector parameter $\theta = [A, \sigma^2]^T$?
Solution: Let $\theta_1 = A$ and $\theta_2 = \sigma^2$.
$$\log p(x;\theta) = -\frac{N}{2}\log 2\pi - \frac{N}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)^2$$
$$\frac{\partial \log p(x;\theta)}{\partial\theta_1} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_1^2}\right] = -E\left[-\frac{N}{\sigma^2}\right] = \frac{N}{\sigma^2}$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_1\,\partial\theta_2}\right] = -E\left[-\frac{1}{\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)\right] = 0$$

Solution (cont'd):
$$\frac{\partial \log p(x;\theta)}{\partial\theta_2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)^2$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_2^2}\right] = -E\left[\frac{N}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{n=0}^{N-1}(x[n] - A)^2\right] = \frac{N}{2\sigma^4}$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_2\,\partial\theta_1}\right] = -E\left[-\frac{1}{\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)\right] = 0$$
Fisher information matrix and its inverse:
$$I(\theta) = \begin{bmatrix} \dfrac{N}{\sigma^2} & 0 \\ 0 & \dfrac{N}{2\sigma^4} \end{bmatrix}, \qquad I^{-1}(\theta) = \begin{bmatrix} \dfrac{\sigma^2}{N} & 0 \\ 0 & \dfrac{2\sigma^4}{N} \end{bmatrix}$$

Solution (cont'd):
If an estimator achieves the CRLB, then it must satisfy
$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)\,(g(x) - \theta),$$
i.e.,
$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)\left(\begin{bmatrix} \dfrac{1}{N}\sum_{n=0}^{N-1} x[n] \\[2mm] \dfrac{1}{N}\sum_{n=0}^{N-1}(x[n] - A)^2 \end{bmatrix} - \begin{bmatrix} A \\ \sigma^2 \end{bmatrix}\right).$$
Thus
$$\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n], \qquad \hat\sigma^2 = \frac{1}{N}\sum_{n=0}^{N-1}(x[n] - A)^2.$$
Therefore the CRLB cannot be achieved, because $\hat\sigma^2$ depends on the unknown parameter $A$.
Recall the MLE: $\hat\sigma^2_{ML} = \frac{1}{N}\sum_{n=0}^{N-1}(x[n] - \hat{A})^2$, with $E(\hat\sigma^2_{ML}) = \frac{N-1}{N}\sigma^2$. It is a biased estimator.
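The derived bounds $\sigma^2/N$ for $A$ and $2\sigma^4/N$ for $\sigma^2$ can be checked numerically. The sketch below is an illustration added here (not from the slides), with arbitrary values of $A$, $\sigma^2$, and $N$; it shows the sample mean attaining its CRLB and the MLE of $\sigma^2$ having mean $(N-1)\sigma^2/N$, consistent with the bias noted above.

```python
import numpy as np

# Sketch: Monte Carlo check of the CRLBs derived above for x[n] = A + w[n], WGN w[n].
rng = np.random.default_rng(2)
A, sigma2, N, trials = 3.0, 4.0, 100, 100_000

x = rng.normal(A, np.sqrt(sigma2), size=(trials, N))
A_hat = x.mean(axis=1)                                   # estimator of A (sample mean)
sigma2_mle = ((x - A_hat[:, None]) ** 2).mean(axis=1)    # MLE of sigma^2 (biased)

# The sample mean is unbiased and attains its CRLB sigma^2/N.
print("var(A_hat)    :", A_hat.var(), "  CRLB:", sigma2 / N)
# The MLE of sigma^2 is biased, E = (N-1)/N * sigma^2, so the CRLB for unbiased
# estimators (2*sigma^4/N) does not directly bound its variance.
print("E[sigma2_mle] :", sigma2_mle.mean(), "  (N-1)/N*sigma^2:", (N - 1) / N * sigma2)
print("CRLB for an unbiased estimator of sigma^2:", 2 * sigma2**2 / N)
```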

Efficiency
- An unbiased estimator that achieves the CRLB is said to be efficient.
- Efficient estimators are MVUB, but not all MVUB estimators are efficient: an MVUB estimator minimizes the variance among unbiased estimators, but the minimum achievable variance may still be larger than the CRLB.
- An estimator $\hat\theta_n$ is said to be asymptotically efficient if it achieves the CRLB as $n \to \infty$.
- Recall that, under mild regularity conditions, the MLE has the asymptotic distribution
$$\hat\theta_n \sim \mathcal{N}\!\left(\theta, \frac{1}{n} I^{-1}(\theta)\right),$$
where $I(\theta)$ is the single-observation Fisher information, so $\hat\theta_n$ is asymptotically unbiased and $\mathrm{var}(\hat\theta_n) = \frac{1}{n} I^{-1}(\theta)$; hence the MLE is asymptotically efficient.

Best Linear Unbiased Estimators (BLUE)
So far we have learned the CRLB, which may give us the MVUE, but the MVUE may still be difficult to find.
Best Linear Unbiased Estimators:
- Find the minimum-variance unbiased estimator by constraining the estimator to be linear, i.e., $\hat\theta = A^T x$.
- Only the first and second moments of $p(x;\theta)$ are needed, which is fairly practical.
- Trading optimality for practicality: there is no reason to believe that a linear estimator is efficient, an MVUE, or optimal in any other sense.

BLUE Assumptions
- In order to employ the BLUE, the relationship between $x$ and $\theta$ must be linear, i.e., $x = H\theta + w$. This ensures that a linear unbiased estimator exists.
- $H$ is known.
- $C = E\big[(x - E[x])(x - E[x])^T\big]$ is known.
We wish to find the linear unbiased estimator with the minimum variance for each $\theta_i$:
$$\hat\theta = A^T x, \qquad A^T H = I,$$
and we choose $A^T$ to minimize
$$\sum_{i=1}^{p} \mathrm{var}(\hat\theta_i) = \mathrm{tr}(A^T C A).$$

BLUE Formulation
$$\min_A \; \mathrm{tr}(A^T C A) \quad \text{s.t.} \quad A^T H = I$$
Solution: This is a convex optimization problem. Form the Lagrangian with a matrix multiplier $\Lambda$:
$$J(A) = \mathrm{tr}(A^T C A) - \mathrm{tr}\big[(A^T H - I)\Lambda\big]$$
$$\frac{\partial J(A)}{\partial A} = 2CA - H\Lambda = 0 \;\Rightarrow\; A = \frac{1}{2} C^{-1} H \Lambda$$
Determine $\Lambda$ by using the constraint $A^T H = I$:
$$\frac{1}{2}\Lambda^T H^T C^{-1} H = I \;\Rightarrow\; \frac{1}{2}\Lambda^T = (H^T C^{-1} H)^{-1}$$
Optimum solution:
$$\hat\theta = A^T x = (H^T C^{-1} H)^{-1} H^T C^{-1} x$$
Error covariance matrix:
$$C_{\hat\theta} = E\big[(A^T x - \theta)(A^T x - \theta)^T\big] = E[A^T w w^T A] = A^T C A = (H^T C^{-1} H)^{-1}$$

Theorem (Gauss-Markov)
If the data are of the general linear model form
$$x = H\theta + w,$$
where $H$ is a known $N \times p$ matrix, $\theta$ is a $p \times 1$ vector of parameters to be estimated, and $w$ is an $N \times 1$ noise vector with zero mean and covariance $C$, then the BLUE of $\theta$ is
$$\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} x.$$
In addition, the covariance matrix of $\hat\theta$ is
$$C_{\hat\theta} = (H^T C^{-1} H)^{-1}.$$
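A direct implementation of the Gauss-Markov estimator is straightforward. The sketch below is an illustration added here (not from the slides); it computes $\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} x$ using linear solves rather than explicit matrix inversion, and the usage values ($N$, $A$, $\sigma^2$) are arbitrary assumptions.

```python
import numpy as np

def blue(H: np.ndarray, C: np.ndarray, x: np.ndarray) -> np.ndarray:
    """BLUE for the linear model x = H @ theta + w with known noise covariance C.

    Implements theta_hat = (H^T C^{-1} H)^{-1} H^T C^{-1} x via linear solves
    instead of explicit inverses, for numerical stability.
    """
    Ci_H = np.linalg.solve(C, H)          # C^{-1} H
    Ci_x = np.linalg.solve(C, x)          # C^{-1} x
    return np.linalg.solve(H.T @ Ci_H, H.T @ Ci_x)

# Illustrative usage: estimate a DC level A from N noisy samples,
# i.e., H = ones, C = sigma^2 * I; the BLUE is then the sample mean.
rng = np.random.default_rng(3)
N, A, sigma2 = 20, 1.5, 0.5
H = np.ones((N, 1))
C = sigma2 * np.eye(N)
x = H @ np.array([A]) + rng.normal(0.0, np.sqrt(sigma2), N)
print(blue(H, C, x), x.mean())            # the two should agree
```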

Example
$x_n = A + w_n$, $n = 1, \ldots, N$, where $w_n$ is not Gaussian, but independent, identically distributed with zero mean and variance $\sigma^2$.
Solution: $H = \mathbf{1}_N$ (the $N \times 1$ all-ones vector) and $C = \sigma^2 I$, so
$$\hat{A}_{BLUE} = \frac{1}{N}\sum_{i=1}^N x_i, \qquad \mathrm{var}(\hat{A}_{BLUE}) = \frac{\sigma^2}{N}.$$
The sample mean is the BLUE regardless of the PDF of the data. It is the MVUE for Gaussian noise.
Recall the MMSE estimator (with prior $A \sim \mathcal{N}(0, \sigma_A^2)$):
$$\hat{A}_{MMSE} = \frac{\sigma_A^2}{\sigma_A^2 + \frac{\sigma^2}{N}}\,\bar{x}, \qquad \mathrm{var}(\hat{A}_{MMSE}) = \frac{\sigma_A^2\,\frac{\sigma^2}{N}}{\sigma_A^2 + \frac{\sigma^2}{N}} = \frac{\sigma^2}{N + \frac{\sigma^2}{\sigma_A^2}}.$$
Hence $\mathrm{var}(\hat{A}_{MMSE}) \le \mathrm{var}(\hat{A}_{BLUE})$, and
$$\lim_{\sigma_A^2 \to \infty} \mathrm{var}(\hat{A}_{MMSE}) = \mathrm{var}(\hat{A}_{BLUE}).$$
In the MMSE estimator we have prior information about $A$ ($\mu_A = 0$ and $\sigma_A^2$). In the BLUE and MVUE, no prior information about $A$ is available ($\sigma_A^2 \to \infty$).

Example
$x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where $w[n]$ is not Gaussian, but independent with zero mean and variance $\sigma_n^2$. Find the BLUE of $A$.
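Substituting $H = \mathbf{1}_N$ and $C = \mathrm{diag}(\sigma_0^2, \ldots, \sigma_{N-1}^2)$ into the Gauss-Markov formula gives a variance-weighted average, $\hat{A}_{BLUE} = \big(\sum_n x[n]/\sigma_n^2\big) / \big(\sum_n 1/\sigma_n^2\big)$. The sketch below is an illustration added here (not from the slides), with arbitrary example variances; it checks the weighted-average form against the general matrix formula.

```python
import numpy as np

# Sketch: BLUE of a DC level A in independent zero-mean noise with per-sample
# variances sigma_n^2. With H = ones and C = diag(sigma_n^2), the Gauss-Markov
# formula reduces to a variance-weighted average of the samples.
rng = np.random.default_rng(4)
A = 2.0
sigma2_n = np.array([0.1, 0.5, 1.0, 2.0, 4.0])          # example variances (assumed)
x = A + rng.normal(0.0, np.sqrt(sigma2_n))              # one sample per variance

# Weighted-average form of the BLUE.
A_blue = np.sum(x / sigma2_n) / np.sum(1.0 / sigma2_n)

# General matrix form (H^T C^{-1} H)^{-1} H^T C^{-1} x, for comparison.
H = np.ones((len(x), 1))
C = np.diag(sigma2_n)
A_matrix = np.linalg.solve(H.T @ np.linalg.solve(C, H), H.T @ np.linalg.solve(C, x))

print(A_blue, A_matrix[0])   # the two agree; var(A_blue) = 1 / sum(1/sigma_n^2)
```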

Example
For the general linear model $x = H\theta + s + w$, where $s$ is a known $N \times 1$ vector and $E[w] = 0$, $E[ww^T] = C$, find the BLUE.
Solution: Let $y = x - s$. Then $y = H\theta + w$ with the same noise statistics, so by the Gauss-Markov theorem the BLUE is $\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} (x - s)$.

Example
Consider the following curve fitting problem, where we wish to find $\theta_0, \ldots, \theta_{p-1}$ so as to best fit the experimental data points $(t_n, x(t_n))$, $n = 0, \ldots, N-1$, by the polynomial curve
$$x(t_n) = \theta_0 + \theta_1 t_n + \theta_2 t_n^2 + \cdots + \theta_{p-1} t_n^{p-1} + w(t_n), \tag{1}$$
where the $w(t_n)$ are i.i.d. with zero mean and variance $\sigma^2$. Find the BLUE of $\theta = [\theta_0, \theta_1, \ldots, \theta_{p-1}]^T$.
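This fits the general linear model with a Vandermonde observation matrix $H$ (rows $[1, t_n, t_n^2, \ldots, t_n^{p-1}]$) and $C = \sigma^2 I$, so the BLUE reduces to $\hat\theta = (H^T H)^{-1} H^T x$. A minimal sketch follows, added for illustration (not from the slides), with assumed example values for $p$, $N$, $\sigma$, and the true coefficients.

```python
import numpy as np

# Sketch: BLUE for polynomial curve fitting. With C = sigma^2 * I,
# theta_hat = (H^T H)^{-1} H^T x, i.e., ordinary least squares.
rng = np.random.default_rng(5)
N, p, sigma = 50, 3, 0.2
theta_true = np.array([1.0, -2.0, 0.5])          # assumed true [theta_0, theta_1, theta_2]

t = np.linspace(0.0, 1.0, N)
H = np.vander(t, p, increasing=True)             # columns [1, t, t^2, ..., t^{p-1}]
x = H @ theta_true + rng.normal(0.0, sigma, N)

theta_blue = np.linalg.solve(H.T @ H, H.T @ x)   # (H^T H)^{-1} H^T x
print(theta_blue)                                 # close to theta_true
```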