ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)


Jingxian Wu
Department of Electrical Engineering, University of Arkansas

Outline
- Minimum Variance Unbiased Estimators (MVUE)
- Cramer-Rao Lower Bound (CRLB)
- Best Linear Unbiased Estimators (BLUE)

Minimum Variance Unbiased Estimators (MVUE)
Recall $\mathrm{MSE}(\hat\theta) = \|\mathrm{bias}(\hat\theta)\|_2^2 + \mathrm{var}(\hat\theta)$.
It is usually impossible to design $\hat\theta$ to minimize the MSE directly, because the bias depends on the true value $\theta$, which is unknown.
Restrict attention to unbiased estimators, $E(\hat\theta) = \theta$. Then $\mathrm{MSE}(\hat\theta) = \mathrm{var}(\hat\theta)$.
Note that $\mathrm{var}(\hat\theta)$ may still depend on $\theta$, so the best estimator must minimize the variance for every value of $\theta$.
A realizable approach: minimize the MSE over all unbiased estimators. The Minimum Variance UnBiased (MVUB) estimator is defined as
$$\hat\theta = \arg\min_{\hat\theta:\, E(\hat\theta)=\theta} E\big[\|\hat\theta - E(\hat\theta)\|_2^2\big].$$

Example
$X_1, X_2, \ldots, X_n$ i.i.d. $\mathcal{N}(\theta, \sigma^2)$. Let $\hat\theta = \frac{1}{n}\sum_{i=1}^n X_i$. We have
$$E\hat\theta = \theta, \qquad \mathrm{MSE}(\hat\theta) = \frac{1}{n^2}\sum_{i=1}^n \mathrm{var}(X_i) = \frac{\sigma^2}{n}.$$
Is this the MVUB estimator? This question can be answered using the Cramer-Rao Lower Bound (CRLB).
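A quick simulation can illustrate the two facts above (unbiasedness and variance $\sigma^2/n$). This is a minimal sketch added for illustration, not part of the original slides; the values of $\theta$, $\sigma$, and $n$ are arbitrary assumptions.

```python
import numpy as np

# Sketch: empirically check E[theta_hat] = theta and var(theta_hat) = sigma^2/n
# for the sample mean of i.i.d. N(theta, sigma^2) data.
rng = np.random.default_rng(0)
theta, sigma, n, trials = 2.0, 3.0, 50, 100_000

x = rng.normal(theta, sigma, size=(trials, n))   # independent data sets
theta_hat = x.mean(axis=1)                       # sample-mean estimator per data set

print("empirical bias     :", theta_hat.mean() - theta)  # approximately 0 (unbiased)
print("empirical variance :", theta_hat.var())           # approximately sigma^2/n
print("sigma^2 / n        :", sigma**2 / n)
```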

MVUE
- Does an MVUE always exist?
- If it does, can we always find it?
- Can we say anything about the MVUE?

Cramer-Rao Lower Bound (CRLB)
- The CRLB gives a lower bound on the variance of ANY unbiased estimator.
- It does NOT guarantee that the bound can be achieved.
- It can be used to verify that a particular estimator is MVUB.
- Otherwise, other tools can be used to construct a better estimator from any unbiased one, possibly the MVUE if certain conditions are met.

CRLB for Scalar Parameters
Theorem (Cramer-Rao Lower Bound (CRLB))
Let $p(x;\theta)$ satisfy the regularity condition
$$E\left[\frac{\partial \ln p(x;\theta)}{\partial \theta}\right] = 0 \quad \text{for all } \theta.$$
Then the variance of any unbiased estimator $\hat\theta$ must satisfy
$$\mathrm{var}(\hat\theta) \;\ge\; \frac{1}{-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial \theta^2}\right]} \;=\; \frac{1}{E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial \theta}\right)^2\right]},$$
where the derivatives are evaluated at the true value of $\theta$.
Furthermore, an unbiased estimator attaining the bound for all $\theta$ may be found if and only if
$$\frac{\partial \ln p(x;\theta)}{\partial \theta} = I(\theta)\,[g(x) - \theta]$$
for some functions $g(\cdot)$ and $I(\cdot)$. That estimator, which is the MVUE, is $\hat\theta = g(x)$, and the minimum variance is $1/I(\theta)$.

Example
$X_1, X_2, \ldots, X_n$ i.i.d. $\mathcal{N}(\theta, \sigma^2)$. Let $\hat\theta = \frac{1}{n}\sum_{i=1}^n X_i$. Is it MVUB?
Solution:
$$\log p(x;\theta) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (X_i - \theta)^2$$
$$\frac{\partial}{\partial\theta}\log p(x;\theta) = \frac{1}{\sigma^2}\sum_{i=1}^n (X_i - \theta), \qquad \frac{\partial^2}{\partial\theta^2}\log p(x;\theta) = -\frac{n}{\sigma^2}$$
$$I(\theta) = E\left[\left(\frac{\partial \log p(x;\theta)}{\partial\theta}\right)^2\right] = \frac{1}{\sigma^4}\sum_{i=1}^n E\big[(X_i - \theta)^2\big] = \frac{n}{\sigma^2} = -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right]$$
$$\mathrm{var}(\hat\theta) \ge \frac{1}{I(\theta)} = \frac{\sigma^2}{n}$$
Since $\mathrm{var}(\hat\theta) = \sigma^2/n$ attains this bound, the sample mean is the MVUB estimator.

More about I(θ): Fisher Information
$$I(\theta) = -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right] = E\left[\left(\frac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]$$
- It is always non-negative.
- It is additive for independent observations.
- Consequently, the CRLB for N i.i.d. observations is 1/N times that for a single observation.
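As a quick numerical illustration of the additivity property (a sketch added here, not from the slides), one can estimate $I(\theta)$ for $n$ i.i.d. $\mathcal{N}(\theta, \sigma^2)$ samples by averaging the squared score over simulated data and compare it with $n/\sigma^2$, i.e. $n$ times the single-observation information $1/\sigma^2$.

```python
import numpy as np

# Sketch: Monte Carlo estimate of the Fisher information
# I(theta) = E[(d/dtheta log p(x; theta))^2] for n i.i.d. N(theta, sigma^2) samples.
# For this model the score is (1/sigma^2) * sum(x_i - theta), and I(theta) = n / sigma^2.
rng = np.random.default_rng(1)
theta, sigma, n, trials = 1.0, 2.0, 10, 100_000

x = rng.normal(theta, sigma, size=(trials, n))
score = (x - theta).sum(axis=1) / sigma**2        # d/dtheta log p(x; theta)

print("Monte Carlo I(theta):", np.mean(score**2))  # approximately n / sigma^2
print("n / sigma^2         :", n / sigma**2)       # additivity: n times 1/sigma^2
```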

Theorem (Vector Form of the Cramer-Rao Lower Bound (CRLB))
Assume $p(x;\theta)$ satisfies the regularity condition $E\left[\frac{\partial \ln p(x;\theta)}{\partial\theta}\right] = 0$ for all $\theta$. Let $\hat\theta = \hat\theta(x)$ be an unbiased estimator of $\theta$. Then the error covariance satisfies
$$E\big[(\hat\theta - E\hat\theta)(\hat\theta - E\hat\theta)^T\big] - I^{-1}(\theta) \succeq 0,$$
where $\succeq 0$ means the matrix is positive semi-definite. $I(\theta)$ is the Fisher information matrix with $(i,j)$th element
$$I_{ij}(\theta) = -E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_i\,\partial\theta_j}\right].$$
Furthermore, an unbiased estimator attaining the bound may be found if and only if
$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)\,[g(x) - \theta].$$
In that case, $\hat\theta = g(x)$ is the MVUE with covariance matrix $I^{-1}(\theta)$.

Example
Consider $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where $w[n]$ is WGN with variance $\sigma^2$. What is the CRLB for the vector parameter $\theta = [A, \sigma^2]^T$?
Solution: Let $\theta_1 = A$ and $\theta_2 = \sigma^2$.
$$\log p(x;\theta) = -\frac{N}{2}\log 2\pi - \frac{N}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)^2$$
$$\frac{\partial \log p(x;\theta)}{\partial\theta_1} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n] - A)$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_1^2}\right] = -E\left[-\frac{N}{\sigma^2}\right] = \frac{N}{\sigma^2}$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_1\,\partial\theta_2}\right] = -E\left[-\frac{1}{\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)\right] = 0$$

Solution (cont'd):
$$\frac{\partial \log p(x;\theta)}{\partial\theta_2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)^2$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_2^2}\right] = -E\left[\frac{N}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{n=0}^{N-1}(x[n] - A)^2\right] = \frac{N}{2\sigma^4}$$
$$-E\left[\frac{\partial^2 \log p(x;\theta)}{\partial\theta_2\,\partial\theta_1}\right] = -E\left[-\frac{1}{\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)\right] = 0$$
Fisher information matrix and its inverse:
$$I(\theta) = \begin{bmatrix} \dfrac{N}{\sigma^2} & 0 \\ 0 & \dfrac{N}{2\sigma^4} \end{bmatrix}, \qquad I^{-1}(\theta) = \begin{bmatrix} \dfrac{\sigma^2}{N} & 0 \\ 0 & \dfrac{2\sigma^4}{N} \end{bmatrix}$$

Solution (cont'd):
If an estimator achieves the CRLB, then it must satisfy
$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)\,(g(x) - \theta),$$
i.e.,
$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)\left(\begin{bmatrix} \dfrac{1}{N}\sum_{n=0}^{N-1} x[n] \\[2mm] \dfrac{1}{N}\sum_{n=0}^{N-1}(x[n] - A)^2 \end{bmatrix} - \begin{bmatrix} A \\ \sigma^2 \end{bmatrix}\right).$$
Thus
$$\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n], \qquad \hat\sigma^2 = \frac{1}{N}\sum_{n=0}^{N-1}(x[n] - A)^2.$$
Therefore the CRLB cannot be achieved, because $\hat\sigma^2$ depends on the unknown parameter $A$.
Recall the MLE: $\hat\sigma^2_{ML} = \frac{1}{N}\sum_{n=0}^{N-1}(x[n] - \hat{A})^2$, with $E(\hat\sigma^2_{ML}) = \frac{N-1}{N}\sigma^2$. It is a biased estimator.
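The derived bounds $\sigma^2/N$ for $A$ and $2\sigma^4/N$ for $\sigma^2$ can be checked numerically. The sketch below is an illustration added here (not from the slides), with arbitrary values of $A$, $\sigma^2$, and $N$; it shows the sample mean attaining its CRLB and the MLE of $\sigma^2$ having mean $(N-1)\sigma^2/N$, consistent with the bias noted above.

```python
import numpy as np

# Sketch: Monte Carlo check of the CRLBs derived above for x[n] = A + w[n], WGN w[n].
rng = np.random.default_rng(2)
A, sigma2, N, trials = 3.0, 4.0, 100, 100_000

x = rng.normal(A, np.sqrt(sigma2), size=(trials, N))
A_hat = x.mean(axis=1)                                   # estimator of A (sample mean)
sigma2_mle = ((x - A_hat[:, None]) ** 2).mean(axis=1)    # MLE of sigma^2 (biased)

# The sample mean is unbiased and attains its CRLB sigma^2/N.
print("var(A_hat)    :", A_hat.var(), "  CRLB:", sigma2 / N)
# The MLE of sigma^2 is biased, E = (N-1)/N * sigma^2, so the CRLB for unbiased
# estimators (2*sigma^4/N) does not directly bound its variance.
print("E[sigma2_mle] :", sigma2_mle.mean(), "  (N-1)/N*sigma^2:", (N - 1) / N * sigma2)
print("CRLB for an unbiased estimator of sigma^2:", 2 * sigma2**2 / N)
```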

Efficiency
- An unbiased estimator that achieves the CRLB is said to be efficient.
- Efficient estimators are MVUB, but not all MVUB estimators are efficient: an MVUB estimator minimizes the variance among unbiased estimators, but the minimum achievable variance may still be larger than the CRLB.
- An estimator $\hat\theta_n$ is said to be asymptotically efficient if it achieves the CRLB as $n \to \infty$.
- Recall that, under mild regularity conditions, the MLE has the asymptotic distribution
$$\hat\theta_n \sim \mathcal{N}\!\left(\theta, \frac{1}{n} I^{-1}(\theta)\right),$$
where $I(\theta)$ is the single-observation Fisher information, so $\hat\theta_n$ is asymptotically unbiased and $\mathrm{var}(\hat\theta_n) = \frac{1}{n} I^{-1}(\theta)$; hence the MLE is asymptotically efficient.

Best Linear Unbiased Estimators (BLUE)
So far we have learned the CRLB, which may give us the MVUE, but the MVUE may still be difficult to find.
Best Linear Unbiased Estimators:
- Find the minimum-variance unbiased estimator by constraining the estimator to be linear, i.e., $\hat\theta = A^T x$.
- Only the first and second moments of $p(x;\theta)$ are needed, which is fairly practical.
- Trading optimality for practicality: there is no reason to believe that a linear estimator is efficient, an MVUE, or optimal in any other sense.

BLUE Assumptions
- In order to employ the BLUE, the relationship between $x$ and $\theta$ must be linear, i.e., $x = H\theta + w$. This ensures that a linear unbiased estimator exists.
- $H$ is known.
- $C = E\big[(x - E[x])(x - E[x])^T\big]$ is known.
We wish to find the linear unbiased estimator with the minimum variance for each $\theta_i$:
$$\hat\theta = A^T x, \qquad A^T H = I,$$
and we choose $A^T$ to minimize
$$\sum_{i=1}^{p} \mathrm{var}(\hat\theta_i) = \mathrm{tr}(A^T C A).$$

BLUE Formulation
$$\min_A \; \mathrm{tr}(A^T C A) \quad \text{s.t.} \quad A^T H = I$$
Solution: This is a convex optimization problem. Form the Lagrangian with a matrix multiplier $\Lambda$:
$$J(A) = \mathrm{tr}(A^T C A) - \mathrm{tr}\big[(A^T H - I)\Lambda\big]$$
$$\frac{\partial J(A)}{\partial A} = 2CA - H\Lambda = 0 \;\Rightarrow\; A = \frac{1}{2} C^{-1} H \Lambda$$
Determine $\Lambda$ by using the constraint $A^T H = I$:
$$\frac{1}{2}\Lambda^T H^T C^{-1} H = I \;\Rightarrow\; \frac{1}{2}\Lambda^T = (H^T C^{-1} H)^{-1}$$
Optimum solution:
$$\hat\theta = A^T x = (H^T C^{-1} H)^{-1} H^T C^{-1} x$$
Error covariance matrix:
$$C_{\hat\theta} = E\big[(A^T x - \theta)(A^T x - \theta)^T\big] = E[A^T w w^T A] = A^T C A = (H^T C^{-1} H)^{-1}$$

Theorem (Gauss-Markov)
If the data are of the general linear model form
$$x = H\theta + w,$$
where $H$ is a known $N \times p$ matrix, $\theta$ is a $p \times 1$ vector of parameters to be estimated, and $w$ is an $N \times 1$ noise vector with zero mean and covariance $C$, then the BLUE of $\theta$ is
$$\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} x.$$
In addition, the covariance matrix of $\hat\theta$ is
$$C_{\hat\theta} = (H^T C^{-1} H)^{-1}.$$
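A direct implementation of the Gauss-Markov estimator is straightforward. The sketch below is an illustration added here (not from the slides); it computes $\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} x$ using linear solves rather than explicit matrix inversion, and the usage values ($N$, $A$, $\sigma^2$) are arbitrary assumptions.

```python
import numpy as np

def blue(H: np.ndarray, C: np.ndarray, x: np.ndarray) -> np.ndarray:
    """BLUE for the linear model x = H @ theta + w with known noise covariance C.

    Implements theta_hat = (H^T C^{-1} H)^{-1} H^T C^{-1} x via linear solves
    instead of explicit inverses, for numerical stability.
    """
    Ci_H = np.linalg.solve(C, H)          # C^{-1} H
    Ci_x = np.linalg.solve(C, x)          # C^{-1} x
    return np.linalg.solve(H.T @ Ci_H, H.T @ Ci_x)

# Illustrative usage: estimate a DC level A from N noisy samples,
# i.e., H = ones, C = sigma^2 * I; the BLUE is then the sample mean.
rng = np.random.default_rng(3)
N, A, sigma2 = 20, 1.5, 0.5
H = np.ones((N, 1))
C = sigma2 * np.eye(N)
x = H @ np.array([A]) + rng.normal(0.0, np.sqrt(sigma2), N)
print(blue(H, C, x), x.mean())            # the two should agree
```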

Example
$x_n = A + w_n$, $n = 1, \ldots, N$, where $w_n$ is not Gaussian, but independent, identically distributed with zero mean and variance $\sigma^2$.
Solution: $H = \mathbf{1}_N$ (the $N \times 1$ all-ones vector) and $C = \sigma^2 I$, so
$$\hat{A}_{BLUE} = \frac{1}{N}\sum_{i=1}^N x_i, \qquad \mathrm{var}(\hat{A}_{BLUE}) = \frac{\sigma^2}{N}.$$
The sample mean is the BLUE regardless of the PDF of the data. It is the MVUE for Gaussian noise.
Recall the MMSE estimator (with prior $A \sim \mathcal{N}(0, \sigma_A^2)$):
$$\hat{A}_{MMSE} = \frac{\sigma_A^2}{\sigma_A^2 + \frac{\sigma^2}{N}}\,\bar{x}, \qquad \mathrm{var}(\hat{A}_{MMSE}) = \frac{\sigma_A^2\,\frac{\sigma^2}{N}}{\sigma_A^2 + \frac{\sigma^2}{N}} = \frac{\sigma^2}{N + \frac{\sigma^2}{\sigma_A^2}}.$$
Hence $\mathrm{var}(\hat{A}_{MMSE}) \le \mathrm{var}(\hat{A}_{BLUE})$, and
$$\lim_{\sigma_A^2 \to \infty} \mathrm{var}(\hat{A}_{MMSE}) = \mathrm{var}(\hat{A}_{BLUE}).$$
In the MMSE estimator we have prior information about $A$ ($\mu_A = 0$ and $\sigma_A^2$). In the BLUE and MVUE, no prior information about $A$ is available ($\sigma_A^2 \to \infty$).

Example
$x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where $w[n]$ is not Gaussian, but independent with zero mean and variance $\sigma_n^2$. Find the BLUE of $A$.
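Substituting $H = \mathbf{1}_N$ and $C = \mathrm{diag}(\sigma_0^2, \ldots, \sigma_{N-1}^2)$ into the Gauss-Markov formula gives a variance-weighted average, $\hat{A}_{BLUE} = \big(\sum_n x[n]/\sigma_n^2\big) / \big(\sum_n 1/\sigma_n^2\big)$. The sketch below is an illustration added here (not from the slides), with arbitrary example variances; it checks the weighted-average form against the general matrix formula.

```python
import numpy as np

# Sketch: BLUE of a DC level A in independent zero-mean noise with per-sample
# variances sigma_n^2. With H = ones and C = diag(sigma_n^2), the Gauss-Markov
# formula reduces to a variance-weighted average of the samples.
rng = np.random.default_rng(4)
A = 2.0
sigma2_n = np.array([0.1, 0.5, 1.0, 2.0, 4.0])          # example variances (assumed)
x = A + rng.normal(0.0, np.sqrt(sigma2_n))              # one sample per variance

# Weighted-average form of the BLUE.
A_blue = np.sum(x / sigma2_n) / np.sum(1.0 / sigma2_n)

# General matrix form (H^T C^{-1} H)^{-1} H^T C^{-1} x, for comparison.
H = np.ones((len(x), 1))
C = np.diag(sigma2_n)
A_matrix = np.linalg.solve(H.T @ np.linalg.solve(C, H), H.T @ np.linalg.solve(C, x))

print(A_blue, A_matrix[0])   # the two agree; var(A_blue) = 1 / sum(1/sigma_n^2)
```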

Example
For the general linear model $x = H\theta + s + w$, where $s$ is a known $N \times 1$ vector and $E[w] = 0$, $E[ww^T] = C$, find the BLUE.
Solution: Let $y = x - s$. Then $y = H\theta + w$ with the same noise statistics, so by the Gauss-Markov theorem the BLUE is $\hat\theta = (H^T C^{-1} H)^{-1} H^T C^{-1} (x - s)$.

Example
Consider the following curve fitting problem, where we wish to find $\theta_0, \ldots, \theta_{p-1}$ so as to best fit the experimental data points $(t_n, x(t_n))$, $n = 0, \ldots, N-1$, by the polynomial curve
$$x(t_n) = \theta_0 + \theta_1 t_n + \theta_2 t_n^2 + \cdots + \theta_{p-1} t_n^{p-1} + w(t_n), \tag{1}$$
where the $w(t_n)$ are i.i.d. with zero mean and variance $\sigma^2$. Find the BLUE of $\theta = [\theta_0, \theta_1, \ldots, \theta_{p-1}]^T$.
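This fits the general linear model with a Vandermonde observation matrix $H$ (rows $[1, t_n, t_n^2, \ldots, t_n^{p-1}]$) and $C = \sigma^2 I$, so the BLUE reduces to $\hat\theta = (H^T H)^{-1} H^T x$. A minimal sketch follows, added for illustration (not from the slides), with assumed example values for $p$, $N$, $\sigma$, and the true coefficients.

```python
import numpy as np

# Sketch: BLUE for polynomial curve fitting. With C = sigma^2 * I,
# theta_hat = (H^T H)^{-1} H^T x, i.e., ordinary least squares.
rng = np.random.default_rng(5)
N, p, sigma = 50, 3, 0.2
theta_true = np.array([1.0, -2.0, 0.5])          # assumed true [theta_0, theta_1, theta_2]

t = np.linspace(0.0, 1.0, N)
H = np.vander(t, p, increasing=True)             # columns [1, t, t^2, ..., t^{p-1}]
x = H @ theta_true + rng.normal(0.0, sigma, N)

theta_blue = np.linalg.solve(H.T @ H, H.T @ x)   # (H^T H)^{-1} H^T x
print(theta_blue)                                 # close to theta_true
```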