Basic concepts in estimation

Size: px

Start display at page:

Download "Basic concepts in estimation"

Kristian Chapman
5 years ago
Views:

1 Basic concepts in estimation Random and nonrandom parameters Definitions of estimates ML Maimum Lielihood MAP Maimum A Posteriori LS Least Squares MMS Minimum Mean square rror Measures of quality of estimates Unbiased, variance, consistency,the Cramer-Rao lower bound, Fisher information, fficiency

2 The problem of parameter estimation Parameter is usually assumed to be time-invariant quantity, vs. time varying parameter Given measurements (),.. to estimate parameter, in presence of disturbances w() ( ) h[,, w( ) ],, L, find a function of observation { ( ) } [ ] or ˆ Z ˆ Z ˆ, Δ Z This function is called estimator, and the value is called estimate The estimation error ~ Δ ˆ

3 The problem of parameter estimation Models for estimation of a parameter. Nonrandom, unnown constant, unnown true value, non-bayesian or Fisher approach. Random, the parameter is a random variable with a prior (a priori) pdf p(), the Bayesian approach The Bayesian approach from the prior pdf, one can obtain posterior (a posteriori) pdf using Bayes formula ( Z ) p p p Z p( Z ) p p( Z ) c c normaliation constant the posterior pdf can be used in several ways to estimate

4 The problem of parameter estimation The Non-Bayesian approach no the prior pdf associated with parameter, one cannot define posterior pdf for it One has the pdf of the measurements conditioned on the parameter, Lielihood Function LF of the parameter Λ Z Δ ( p Z or Λ p Z ) Δ as a measure of how liely the parameter value is for the observations, LF serves as a measure of evidence from the data.

5 ML and MAP estimators Maimiation of the LF relative to produces Maimum Lielihood stimator (ML) ( Z ) arg ma Λ arg ma p( Z ) ML ˆ Z ML, being a function of random observations Z, is a random variable ML is the solution of lielihood equation dλ Z d d p( Z ) d

6 ML and MAP estimators Maimum A Posteriori stimator (MAP), for a random parameter, follows from the maimiation of the posterior pdf ( Z ) arg ma p( Z ) arg ma[ p( Z ) p ] MAP ˆ MAP estimate is a random variable Normaliation constant c (from Bayes formula) is irrelevant for the maimiation

7 ML versus MAP with Gaussian prior Consider the single measurement w where p ( w N, ) Assume that is unnown constant, no a prior information Λ ( p N, ) e π ; then ˆ ML ( Z ) arg ma Λ since the pea or mode occurs at

8 ML versus MAP with Gaussian prior Assume now that a prior information about is Gaussian, N ;, p which is independent of w, then the posterior pdf of, on the observation, is p ( ) p ( ) p p p c π where c is the normaliation constant independent of c e ( )

9 ML versus MAP with Gaussian prior rearrangement of the eponent

10 ML versus MAP with Gaussian prior the posterior pdf of, on the observation, becomes where [ ], ; ξ π ξ e N p p p p ξ Δ Δ p MAP ξ ma arg ˆ

11 ML versus MAP with Gaussian prior note that MAP estimator is a weighted combination of. the ML estimator, pea of the lielihood function. the pea of the a prior pdf of the parameter ˆ MAP arg ma p( ) ξ the weighting is inversely proportional to their variance directly proportional to information, inverse variance

12 The sufficient statistics and the lielihood equation If the LF can be decomposed Λ p( Z ) f g( Z ) then the ML depends only on function g(z), called sufficient statistics, [ ] f ( Z ) Δ, summaries the information contained in the data set about ample: scalar measurement, mutually independent ( ) w w( ),,..., ( ) ( N, N, )

13 The sufficient statistics and the lielihood equation LF of in terms of is then sufficient statistics () [ ] [ ] [ ] Z g e Z g f ce Z f Z f Z g f e ce ce ce N p Z p Λ Δ Δ Δ,,,,, ;,..., { } Z

14 The sufficient statistics and the lielihood equation Lielihood equation dλ df g Z, d d since logarithm is monotonic transformation, equivalent loglielihood function can be used d ln Λ d d ln f [ g( Z ), ] d [ ] ( ) which yields ˆ ML

15 Least Squares and Minimum Mean Square rror estimation Another common estimation procedure for for nonrandom parameters is Least Squares method (LS) the LS of is ( ) h(, ) w( ),,..., ˆ LS arg min [ ( ) h(, ) ] Nonlinear LS-problem, linear LS-problem treated in detail no assumptions about the measurement errors

16 Least Squares and Minimum Mean Square rror estimation if the measurement errors are independent Gaussian random variables p w N, then LS coincides with the ML, which is shown: ( ( ) ) N h(, ) Lielihood Function of is Λ Δ ( p Z ) p[ (),..., ] N ( ) ; h(, ) (, ) ( ) (, ), p,..., ce ( ( ) h(, ) )

17 Least Squares and Minimum Mean Square rror estimation ˆ ML arg ma ln Λ arg ma arg min ˆ LS [ ( ) h(, ) ] [ ( ) h(, ) ] under Gaussian assumption LS method is a disguised ML approach

18 Least Squares and Minimum Mean Square rror estimation For random parameter the counterpart of LS is the Minimum Mean Square rror (MMS) stimator d ˆ ( Z) arg min ˆ [( ˆ ) Z ] MMS [ ( ˆ ) Z ] ( ˆ ) dˆ ˆ MMS ( Z) [ Z ] ( ˆ [ Z ]) Δ [ Z ] p( Z ) d the solution is the conditional mean of

19 Least Squares and Minimum Mean Square rror estimation Some LS-estimators Single measurement of the unnown parameter if the noise is Gaussian Several measurement of the unnown parameter, Gaussian independent noise terms sample mean sample average [ ] ML LS w ˆ arg min ˆ, [ ] [ ] ML LS w ˆ arg ma ln arg ma arg min ˆ,..., ), ( ) ( Λ

20 Least Squares and Minimum Mean Square rror estimation MMS versus MAP stimator Single measurement of the unnown random parameter with a Gaussian priori pdf, noise is also Gaussian ( ξ ( )) w, p e π the mean of this Gaussian pdf is ξ(), which is also the mode of this pdf. Thus [ ] MAP ˆ MMS ˆ ξ MMS estimator coincides with MAP estimator due to the the fact that mean and mode of Gaussian coincide MMS MAP N ; ˆ, N ; ˆ p,

21 Unbiased estimators For a nonrandom parameter, an estimator is unbiased if [ ] ˆ,, ( Z p Z ) pdf where is the true value of the parameter. Bayesian case, is random with a prior pdf p() [ ] [ ] ˆ, Z General definition with estimation error [ ~ ] ~ Δ ˆ;

22 Unbiasedness of ML and MAP estimator Single measurement, ML estimator unbiased Single measurement, MAP estimator, prior pdf p() unbiased [ ] [] [ ] [ ] [ ] ˆ ˆ, w w w ML ML [ ] [ ] [] [ ] w w MAP MAP Δ ) ( ˆ ) ( ˆ, ξ ξ

23 ln Λ Bias in the ML estimation of two parameters stimating (ML) of unnown mean and variance of a set of measurements mean ( ( ) ) Λ [ ], p,...,, e ( π ) ML ˆ ( ) sample variance (, ) 3 ( ( ) ) ( ) ( ML ˆ ) 3 [ ] ML ˆ ( ) ML ˆ ( ) ( )

24 mean Bias in the ML estimation of two parameters [ ] ML ˆ ( ) sample mean estimator unbiased sample variance {[ ] } ML ˆ ( ) ( ) sample variance estimator biased, (asymptotically unbiased) In order to be unbiased [ ] ML ˆ ( ) ( )

25 The variance of an estimator Non-Bayesian case, ML or LS var[ ˆ ( Z )] Δ { ˆ ( Z ) [ ˆ ( Z )]} if estimator unbiased var[ ˆ ( Z )] { ˆ ( Z ) } if estimator is biased then Mean Square rror (MS) Bayesian case, MAP Unconditional (MS) MS MS [ ˆ ( Z )] { ˆ ( Z ) } MS [ ˆ ( Z )] [ ˆ ( Z ) ] [ ˆ ( Z )] { ˆ ( Z ) } [ { }] Z MS ˆ ( Z ) [ [ Z ] inside bracets is conditional MS

26 The variance of an estimator For the MMS estimator, the conditional MS is [ ] ] ˆ Z Z ( Z ) [ ] ] Z ( Z ) MMS var that is, the conditional variance of given Z. Averaging over Z yields [ ] [ var( Z )] [ ( Z )] which is the unconditional variance of the estimate average squared error over all possible observations ˆ MMS

27 The variance of an estimator General definition with the estimation error ~ Δ ˆ the epected value of the square of the estimation error is the estimator s variance or MS [ ] ~ MS var ( ˆ ) ( ˆ ) biased or unibiased The square root of the variance (or MS) ˆ var( ˆ ) is its standard error, also called standard deviation of the estimation error if ˆ if ˆ unbiased

28 The variance of an estimator Comparison of variances of an ML and MAP estimator Single observation This is due to the availability of of the a priori information [ ] { } { } ˆ ˆ var Δ ML ML [ ] { } [ ] ML MAP MAP w w ˆ var ) ( ˆ ˆ var <

29 Consistency and efficiency of estimators Consistency is here an asymptotic property An estimator of a nonrandom parameter is consistent estimator, if the estimate converges to the true value in some sense Using convergence in the mean square criterion then, Over Z lim ˆ, Z is the condition for consistency in the mean square sense for a random parameter [ ] ] lim [ ] ] ˆ, Z [ ~ (, )] Z lim Over Z and For unbiased estimators

30 Consistency and efficiency of estimators The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matri (FIM) CRLB: The mean square error corresponding to the estimator of a parameter cannot be smaller than a certain quantity Scalar case Scalar nonrandom parameter with an unbiased estimator [ ] ] ˆ, Z Δ J Λ J is the Fisher Information Λ Λ() is the lielihood function the true value of the parameter. J where

31 Consistency and efficiency of estimators The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matri (FIM) Scalar case For scalar random parameter with an unbiased estimator J is the Fisher Information If the estimator variance is equal to CRLB, then such an estimator is called efficient. [ ] [ ] Δ,, J where, ˆ Z p Z p J Z

32 Consistency and efficiency of estimators The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matri (FIM) Vector case For nonrandom vector parameter, the CRLB states that the covariance matri of an unbiased estimator is bounded from below: [ ˆ ][ ˆ ] ] T Z Z Δ J [ ] T T ln Λ [ ln Λ ] ln Λ J is the Fisher Information Matri (FIM) Λ() is the lielihood function the true value of the parameter J where [ [ ] ] T A B C Δ A B pos. def.

2 Statistical Estimation: Basic Concepts

2 Statistical Estimation: Basic Concepts Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof. N. Shimkin 2 Statistical Estimation: