Basic concepts in estimation

Basic concepts in estimation. Random and nonrandom parameters. Definitions of estimates: ML (Maximum Likelihood), MAP (Maximum A Posteriori), LS (Least Squares), MMSE (Minimum Mean Square Error). Measures of quality of estimates: unbiasedness, variance, consistency, the Cramer-Rao lower bound, Fisher information, efficiency.

The problem of parameter estimation. The parameter $x$ is usually assumed to be a time-invariant quantity, as opposed to a time-varying parameter. Given the measurements $z(j) = h[j, x, w(j)]$, $j = 1, \ldots, k$, made in the presence of disturbances $w(j)$, find a function of the observations $Z^k \triangleq \{z(j)\}_{j=1}^{k}$, $\hat{x}(k) = \hat{x}[k, Z^k]$, that estimates the parameter. This function is called the estimator, and its value is called the estimate. The estimation error is $\tilde{x} \triangleq x - \hat{x}$.

The problem of parameter estimation. Models for estimation of a parameter: 1. Nonrandom: $x$ is an unknown constant with an unknown true value $x_0$ (the non-Bayesian, or Fisher, approach). 2. Random: the parameter is a random variable with a prior (a priori) pdf $p(x)$ (the Bayesian approach). In the Bayesian approach, from the prior pdf one obtains the posterior (a posteriori) pdf via Bayes' formula $p(x \mid Z) = \frac{p(Z \mid x)\, p(x)}{p(Z)} = \frac{1}{c}\, p(Z \mid x)\, p(x)$, where $c = p(Z)$ is a normalization constant. The posterior pdf can be used in several ways to estimate $x$.

The problem of parameter estimation. In the non-Bayesian approach there is no prior pdf associated with the parameter, so one cannot define a posterior pdf for it. One has only the pdf of the measurements conditioned on the parameter, the likelihood function (LF) of the parameter, $\Lambda_Z(x) \triangleq p(Z \mid x)$ or $\Lambda_k(x) \triangleq p(Z^k \mid x)$. As a measure of how likely a parameter value is given the observations, the LF serves as a measure of the evidence from the data.

ML and MAP estimators. Maximization of the LF with respect to $x$ produces the Maximum Likelihood Estimator (ML): $\hat{x}^{ML}(Z) = \arg\max_x \Lambda_Z(x) = \arg\max_x p(Z \mid x)$. Being a function of the random observations $Z$, $\hat{x}^{ML}$ is a random variable. The ML estimate is a solution of the likelihood equation $\frac{d\Lambda_Z(x)}{dx} = \frac{d\, p(Z \mid x)}{dx} = 0$.
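
A minimal numerical sketch of this definition follows (not part of the original slides): it assumes the Gaussian measurement model $z(j) = x + w(j)$, $w(j) \sim \mathcal{N}(0, \sigma^2)$ used later in these slides, with illustrative values for sigma and the simulated data, and maximizes the log-likelihood with a generic optimizer, i.e., it solves the likelihood equation numerically.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Sketch: ML by numerically maximizing the likelihood for the Gaussian model
# z(j) = x + w(j), w(j) ~ N(0, sigma^2).  sigma, x_true and k are illustrative.
rng = np.random.default_rng(0)
sigma, x_true, k = 1.0, 2.5, 50
z = x_true + sigma * rng.standard_normal(k)               # measurements Z^k

def neg_log_likelihood(x):
    # -ln Lambda(x) = -sum_j ln N(z(j); x, sigma^2)
    return -np.sum(norm.logpdf(z, loc=x, scale=sigma))

x_ml = minimize_scalar(neg_log_likelihood).x              # d ln Lambda / dx = 0
print(x_ml, z.mean())                                     # the two agree
```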

ML and MAP estimators. The Maximum A Posteriori Estimator (MAP), for a random parameter, follows from maximization of the posterior pdf: $\hat{x}^{MAP}(Z) = \arg\max_x p(x \mid Z) = \arg\max_x \left[ p(Z \mid x)\, p(x) \right]$. The MAP estimate is a random variable. The normalization constant $c$ (from Bayes' formula) is irrelevant for the maximization.

ML versus MAP with Gaussian prior. Consider the single measurement $z = x + w$, where $p(w) = \mathcal{N}(w; 0, \sigma^2)$. Assume that $x$ is an unknown constant with no a priori information. Then $\Lambda(x) = p(z \mid x) = \mathcal{N}(z; x, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(z - x)^2 / 2\sigma^2}$, and $\hat{x}^{ML}(z) = \arg\max_x \Lambda(x) = z$, since the peak (mode) occurs at $x = z$.

ML versus MAP with Gaussian prior. Assume now that the a priori information about $x$ is Gaussian, $p(x) = \mathcal{N}(x; \bar{x}, \sigma_0^2)$, with $x$ independent of $w$. Then the posterior pdf of $x$, conditioned on the observation $z$, is $p(x \mid z) = \frac{p(z \mid x)\, p(x)}{c} = \frac{1}{c}\, \frac{1}{2\pi\,\sigma\,\sigma_0}\, \exp\!\left[ -\frac{(z - x)^2}{2\sigma^2} - \frac{(x - \bar{x})^2}{2\sigma_0^2} \right]$, where $c$ is a normalization constant independent of $x$.

ML versus MAP with Gaussian prior. Rearrangement of the exponent (completing the square in $x$) gives $-\frac{(z - x)^2}{2\sigma^2} - \frac{(x - \bar{x})^2}{2\sigma_0^2} = -\frac{(x - \xi(z))^2}{2\sigma_1^2} + \text{terms independent of } x$, with $\xi(z) \triangleq \frac{\sigma_0^2\, z + \sigma^2\, \bar{x}}{\sigma_0^2 + \sigma^2}$ and $\sigma_1^2 \triangleq \frac{\sigma_0^2\, \sigma^2}{\sigma_0^2 + \sigma^2}$.

ML versus MAP with Gaussian prior. The posterior pdf of $x$, conditioned on the observation $z$, therefore becomes $p(x \mid z) = \mathcal{N}(x; \xi(z), \sigma_1^2) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-(x - \xi(z))^2 / 2\sigma_1^2}$, where $\xi(z) \triangleq \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}\, z + \frac{\sigma^2}{\sigma_0^2 + \sigma^2}\, \bar{x}$ and $\sigma_1^2 \triangleq \frac{\sigma_0^2\, \sigma^2}{\sigma_0^2 + \sigma^2}$, so that $\hat{x}^{MAP} = \arg\max_x p(x \mid z) = \xi(z)$.

ML versus MAP with Gaussian prior. Note that the MAP estimator $\hat{x}^{MAP} = \arg\max_x p(x \mid z) = \xi(z)$ is a weighted combination of 1. the ML estimate $z$, the peak of the likelihood function, and 2. the peak $\bar{x}$ of the a priori pdf of the parameter. The weight of each term is inversely proportional to its variance, i.e., directly proportional to its information (inverse variance).
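
A minimal sketch of this weighted combination follows (the numbers are illustrative, not from the slides): it evaluates the closed-form $\xi(z)$ and checks it against a direct numerical maximization of the log-posterior.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Sketch: MAP with a Gaussian prior as a precision-weighted combination of the
# ML estimate z and the prior mean x_bar.  All numbers are illustrative.
z, sigma = 3.0, 1.0            # single measurement z = x + w, w ~ N(0, sigma^2)
x_bar, sigma0 = 0.0, 2.0       # prior x ~ N(x_bar, sigma0^2)

# Closed form: xi(z) = (sigma0^2 * z + sigma^2 * x_bar) / (sigma0^2 + sigma^2)
xi = (sigma0**2 * z + sigma**2 * x_bar) / (sigma0**2 + sigma**2)

def neg_log_posterior(x):
    # -ln[p(z | x) p(x)]; the normalization constant c plays no role here
    return -(norm.logpdf(z, loc=x, scale=sigma)
             + norm.logpdf(x, loc=x_bar, scale=sigma0))

x_map = minimize_scalar(neg_log_posterior).x
print(xi, x_map)               # closed-form xi(z) and numerical MAP agree
```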

The sufficient statistic and the likelihood equation. If the LF can be decomposed as $\Lambda(x) = p(Z \mid x) = f[g(Z), x]\, h(Z)$, then the ML estimate depends on the data only through the function $g(Z)$, called a sufficient statistic: it summarizes the information contained in the data set about $x$. Example: $k$ scalar measurements $z(j) = x + w(j)$, $j = 1, \ldots, k$, with mutually independent noises $w(j) \sim \mathcal{N}(0, \sigma^2)$.

The sufficient statistic and the likelihood equation. The LF of $x$ in terms of $Z^k$ is $\Lambda(x) = p(Z^k \mid x) = \prod_{j=1}^{k} \mathcal{N}(z(j); x, \sigma^2) = c\, \exp\!\left[ -\frac{1}{2\sigma^2} \sum_{j=1}^{k} (z(j) - x)^2 \right] = c\, \exp\!\left[ -\frac{1}{2\sigma^2} \sum_{j=1}^{k} z(j)^2 \right] \exp\!\left[ -\frac{1}{2\sigma^2} \Big( k x^2 - 2 x \sum_{j=1}^{k} z(j) \Big) \right] = h(Z^k)\, f[g(Z^k), x]$, so $g(Z^k) = \sum_{j=1}^{k} z(j)$ is a sufficient statistic.

The sufficient statistic and the likelihood equation. The likelihood equation is $\frac{d\Lambda(x)}{dx} = \frac{d f[g(Z), x]}{dx} = 0$. Since the logarithm is a monotonic transformation, the equivalent log-likelihood equation can be used: $\frac{d \ln \Lambda(x)}{dx} = \frac{d \ln f[g(Z), x]}{dx} = 0$, which here yields $\hat{x}^{ML} = \frac{1}{k} \sum_{j=1}^{k} z(j)$, the sample mean.
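
The following minimal sketch (with made-up data) illustrates sufficiency: two data sets with the same $g(Z^k) = \sum_j z(j)$ and the same $k$ give the same ML estimate, here found by a simple grid search over the log-likelihood.

```python
import numpy as np

# Sketch: for the Gaussian-mean model the ML estimate depends on Z^k only
# through g(Z^k) = sum_j z(j).  Data sets and sigma are illustrative.
sigma, k = 1.0, 4
z_a = np.array([1.0, 2.0, 3.0, 4.0])          # sum = 10
z_b = np.array([2.5, 2.5, 2.5, 2.5])          # same sum = 10, different samples

def log_likelihood(x, z):
    # ln Lambda(x) up to an additive constant: -(1/(2 sigma^2)) sum_j (z(j)-x)^2
    return -np.sum((z - x) ** 2) / (2 * sigma**2)

x_grid = np.linspace(0.0, 5.0, 1001)
ml_a = x_grid[np.argmax([log_likelihood(x, z_a) for x in x_grid])]
ml_b = x_grid[np.argmax([log_likelihood(x, z_b) for x in x_grid])]
print(ml_a, ml_b, z_a.mean())                 # all equal the sample mean 2.5
```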

Least Squares and Minimum Mean Square Error estimation. Another common estimation procedure for nonrandom parameters is the Least Squares (LS) method. For measurements $z(j) = h(j, x) + w(j)$, $j = 1, \ldots, k$, the LS estimate of $x$ is $\hat{x}^{LS} = \arg\min_x \sum_{j=1}^{k} \left[ z(j) - h(j, x) \right]^2$. This is the nonlinear LS problem; the linear LS problem is treated in detail later. No assumptions are made about the measurement errors.

Least Squares and Minimum Mean Square Error estimation. If the measurement errors are independent Gaussian random variables, $p(w(j)) = \mathcal{N}(w(j); 0, \sigma^2)$, then LS coincides with ML, which is shown as follows. The likelihood function of $x$ is $\Lambda(x) \triangleq p(Z^k \mid x) = p[z(1), \ldots, z(k) \mid x] = \prod_{j=1}^{k} \mathcal{N}\big(z(j); h(j, x), \sigma^2\big) = c\, \exp\!\left[ -\frac{1}{2\sigma^2} \sum_{j=1}^{k} \big( z(j) - h(j, x) \big)^2 \right]$.

Least Squares and Minimum Mean Square Error estimation. $\hat{x}^{ML} = \arg\max_x \ln \Lambda(x) = \arg\max_x \left[ -\frac{1}{2\sigma^2} \sum_{j=1}^{k} \big( z(j) - h(j, x) \big)^2 \right] = \arg\min_x \sum_{j=1}^{k} \left[ z(j) - h(j, x) \right]^2 = \hat{x}^{LS}$. Under the Gaussian assumption, the LS method is a disguised ML approach.
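
A minimal sketch of this equivalence follows, using a hypothetical nonlinear measurement function $h(j, x) = e^{-x t_j}$ and illustrative numbers (neither is from the slides): the LS criterion and the negative Gaussian log-likelihood have the same minimizer.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Sketch: with independent Gaussian errors, nonlinear LS and ML coincide.
# The model h(j, x) = exp(-x * t_j) and all numbers are illustrative.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0, 20)
sigma, x_true = 0.05, 0.7
z = np.exp(-x_true * t) + sigma * rng.standard_normal(t.size)

def sum_sq(x):                       # LS criterion: sum_j [z(j) - h(j, x)]^2
    return np.sum((z - np.exp(-x * t)) ** 2)

def neg_log_lik(x):                  # -ln Lambda(x) for Gaussian errors
    return -np.sum(norm.logpdf(z, loc=np.exp(-x * t), scale=sigma))

x_ls = minimize_scalar(sum_sq, bounds=(0.0, 5.0), method="bounded").x
x_ml = minimize_scalar(neg_log_lik, bounds=(0.0, 5.0), method="bounded").x
print(x_ls, x_ml)                    # identical up to numerical tolerance
```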

Least Squares and Minimum Mean Square Error estimation. For a random parameter, the counterpart of LS is the Minimum Mean Square Error (MMSE) estimator, $\hat{x}^{MMSE}(Z) = \arg\min_{\hat{x}} E\big[ (x - \hat{x})^2 \mid Z \big]$. Setting $\frac{d}{d\hat{x}} E\big[ (x - \hat{x})^2 \mid Z \big] = -2 \big( E[x \mid Z] - \hat{x} \big) = 0$ gives $\hat{x}^{MMSE}(Z) = E[x \mid Z] \triangleq \int x\, p(x \mid Z)\, dx$, that is, the solution is the conditional mean of $x$.
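
A minimal sketch of the conditional mean follows (illustrative numbers, not from the slides): it normalizes $p(z \mid x)\, p(x)$ on a grid, integrates $x\, p(x \mid z)$ numerically, and compares the result with $\xi(z)$ from the Gaussian example above.

```python
import numpy as np
from scipy.stats import norm

# Sketch: MMSE estimate = conditional mean E[x | z], computed on a grid for the
# single-measurement Gaussian model.  All numbers are illustrative.
z, sigma = 3.0, 1.0              # measurement z = x + w, w ~ N(0, sigma^2)
x_bar, sigma0 = 0.0, 2.0         # prior x ~ N(x_bar, sigma0^2)

x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
unnorm = norm.pdf(z, loc=x, scale=sigma) * norm.pdf(x, loc=x_bar, scale=sigma0)
posterior = unnorm / (unnorm.sum() * dx)             # p(x | z) on the grid
x_mmse = np.sum(x * posterior) * dx                  # E[x | z]

xi = (sigma0**2 * z + sigma**2 * x_bar) / (sigma0**2 + sigma**2)
print(x_mmse, xi)                # the conditional mean matches xi(z) = 2.4 here
```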

Least Squares and Minimum Mean Square Error estimation. Some LS estimators. Single measurement of the unknown parameter, $z = x + w$: $\hat{x}^{LS} = \arg\min_x (z - x)^2 = z$, which equals $\hat{x}^{ML}$ if the noise is Gaussian. Several measurements of the unknown parameter, $z(j) = x + w(j)$, $j = 1, \ldots, k$, with independent Gaussian noise terms: $\hat{x}^{LS} = \arg\min_x \sum_{j=1}^{k} \left[ z(j) - x \right]^2 = \frac{1}{k} \sum_{j=1}^{k} z(j) = \hat{x}^{ML}$, the sample mean (sample average).

Least Squares and Minimum Mean Square Error estimation. MMSE versus MAP estimator. Single measurement $z = x + w$ of an unknown random parameter with a Gaussian a priori pdf; the noise is also Gaussian. The posterior is $p(x \mid z) = \mathcal{N}(x; \xi(z), \sigma_1^2) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-(x - \xi(z))^2 / 2\sigma_1^2}$. The mean of this Gaussian pdf is $\xi(z)$, which is also its mode. Thus $\hat{x}^{MMSE} = E[x \mid z] = \xi(z) = \hat{x}^{MAP}$: the MMSE estimator coincides with the MAP estimator due to the fact that the mean and the mode of a Gaussian coincide.

Unbiased estimators. For a nonrandom parameter, an estimator is unbiased if $E[\hat{x}] = \int \hat{x}(Z)\, p(Z \mid x_0)\, dZ = x_0$, where $x_0$ is the true value of the parameter and $p(Z \mid x_0)$ is the pdf of the observations. In the Bayesian case, where $x$ is random with a prior pdf $p(x)$, the estimator is unbiased if $E[\hat{x}] = E[x]$. General definition in terms of the estimation error $\tilde{x} \triangleq x - \hat{x}$: the estimator is unbiased if $E[\tilde{x}] = 0$.

Unbiasedness of the ML and MAP estimators. Single measurement $z = x_0 + w$ with ML estimator $\hat{x}^{ML} = z$: $E[\hat{x}^{ML}] = E[z] = E[x_0 + w] = x_0 + E[w] = x_0$, so the ML estimator is unbiased. Single measurement with MAP estimator and prior pdf $p(x) = \mathcal{N}(x; \bar{x}, \sigma_0^2)$, $\hat{x}^{MAP} = \xi(z)$: $E[\hat{x}^{MAP}] = E[\xi(z)] = \frac{\sigma_0^2\, E[z] + \sigma^2\, \bar{x}}{\sigma_0^2 + \sigma^2} = \frac{\sigma_0^2\, \bar{x} + \sigma^2\, \bar{x}}{\sigma_0^2 + \sigma^2} = \bar{x} = E[x]$ (since $E[z] = E[x + w] = \bar{x}$), so the MAP estimator is also unbiased.

Bias in the ML estimation of two parameters. Estimating by ML the unknown mean $\mu$ and variance $\sigma^2$ of a set of $k$ measurements $z(j) \sim \mathcal{N}(\mu, \sigma^2)$: $\ln \Lambda(\mu, \sigma^2) = \ln p[z(1), \ldots, z(k) \mid \mu, \sigma^2] = -\frac{k}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{j=1}^{k} (z(j) - \mu)^2$. Setting the partial derivatives with respect to $\mu$ and $\sigma^2$ to zero yields the sample mean $\hat{\mu}^{ML} = \frac{1}{k} \sum_{j=1}^{k} z(j)$ and the sample variance $\widehat{\sigma^2}^{\,ML} = \frac{1}{k} \sum_{j=1}^{k} \big( z(j) - \hat{\mu}^{ML} \big)^2$.

Bias in the ML estimation of two parameters. $E[\hat{\mu}^{ML}] = \mu$: the sample mean estimator is unbiased. $E\big[\widehat{\sigma^2}^{\,ML}\big] = \frac{k-1}{k}\, \sigma^2$: the sample variance estimator is biased (though asymptotically unbiased). In order to be unbiased it should use the factor $\frac{1}{k-1}$ instead of $\frac{1}{k}$: $\widehat{\sigma^2} = \frac{1}{k-1} \sum_{j=1}^{k} \big( z(j) - \hat{\mu}^{ML} \big)^2$.
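
A minimal Monte Carlo sketch of this bias follows (all numbers are illustrative, not from the slides): the $1/k$ estimator averages to $\frac{k-1}{k}\sigma^2$, while the $1/(k-1)$ version averages to $\sigma^2$.

```python
import numpy as np

# Sketch: bias of the ML sample-variance estimator.  mu, sigma2, k, trials
# are illustrative.
rng = np.random.default_rng(2)
mu, sigma2, k, trials = 1.0, 4.0, 5, 200_000

z = mu + np.sqrt(sigma2) * rng.standard_normal((trials, k))
mu_ml = z.mean(axis=1, keepdims=True)                      # sample mean per trial
var_ml = np.sum((z - mu_ml) ** 2, axis=1) / k              # ML (biased) estimator
var_unbiased = np.sum((z - mu_ml) ** 2, axis=1) / (k - 1)  # Bessel-corrected

print(var_ml.mean(), (k - 1) / k * sigma2)   # both close to 3.2
print(var_unbiased.mean(), sigma2)           # both close to 4.0
```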

The variance of an estimator. Non-Bayesian case (ML or LS): $\operatorname{var}[\hat{x}(Z)] \triangleq E\big\{ \big[ \hat{x}(Z) - E[\hat{x}(Z)] \big]^2 \big\}$. If the estimator is unbiased, $\operatorname{var}[\hat{x}(Z)] = E\big\{ [\hat{x}(Z) - x_0]^2 \big\}$. If the estimator is biased, the relevant quantity is the Mean Square Error (MSE), $\operatorname{MSE}[\hat{x}(Z)] = E\big\{ [\hat{x}(Z) - x_0]^2 \big\} \ne \operatorname{var}[\hat{x}(Z)]$. Bayesian case (MAP): the unconditional MSE is $\operatorname{MSE}[\hat{x}(Z)] = E\big\{ [\hat{x}(Z) - x]^2 \big\} = E\Big[ E\big\{ [\hat{x}(Z) - x]^2 \mid Z \big\} \Big]$; the quantity inside the brackets is the conditional MSE.

The variance of an estimator. For the MMSE estimator, the conditional MSE is $E\big\{ \big[ x - E[x \mid Z] \big]^2 \mid Z \big\} = \operatorname{var}(x \mid Z)$, that is, the conditional variance of $x$ given $Z$. Averaging over $Z$ yields $E\big[ \operatorname{var}(x \mid Z) \big]$, which is the unconditional variance of the estimate $\hat{x}^{MMSE}$: the average squared error over all possible observations.

The variance of an estimator. General definition in terms of the estimation error $\tilde{x} \triangleq x - \hat{x}$: the expected value of the square of the estimation error, $E[\tilde{x}^2]$, is the estimator's variance if $\hat{x}$ is unbiased and its MSE if $\hat{x}$ is biased. The square root of the variance (or MSE), $\sigma_{\hat{x}} = \sqrt{\operatorname{var}(\hat{x})}$, is its standard error, also called the standard deviation of the estimation error.

The variance of an estimator. Comparison of the variances of the ML and MAP estimators for a single observation $z = x + w$: $\operatorname{var}[\hat{x}^{ML}] \triangleq E\{ (\hat{x}^{ML} - x)^2 \} = E\{ w^2 \} = \sigma^2$, whereas $\hat{x}^{MAP} = \xi(z)$ responds to the noise only through the term $\frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}\, w$, so $\operatorname{var}[\hat{x}^{MAP}] = \Big( \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} \Big)^2 \sigma^2 < \sigma^2 = \operatorname{var}[\hat{x}^{ML}]$. This reduction is due to the availability of the a priori information.
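
A minimal Monte Carlo sketch of this comparison follows (all numbers are illustrative, not from the slides): for a fixed parameter value, the spread of $z$ is $\sigma^2$ while the spread of $\xi(z)$ is $\big(\sigma_0^2/(\sigma_0^2+\sigma^2)\big)^2 \sigma^2$.

```python
import numpy as np

# Sketch: variance of the ML estimate z versus the MAP estimate xi(z) for a
# fixed parameter value.  x_true, x_bar, sigma0, sigma are illustrative.
rng = np.random.default_rng(3)
x_true, x_bar, sigma0, sigma, trials = 1.5, 0.0, 2.0, 1.0, 500_000

z = x_true + sigma * rng.standard_normal(trials)            # z = x + w
x_ml = z
x_map = (sigma0**2 * z + sigma**2 * x_bar) / (sigma0**2 + sigma**2)

shrink = sigma0**2 / (sigma0**2 + sigma**2)
print(np.var(x_ml), sigma**2)                 # ~1.0
print(np.var(x_map), (shrink * sigma)**2)     # ~0.64, smaller thanks to the prior
```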

Consistency and efficiency of estimators. Consistency is here an asymptotic property. An estimator of a nonrandom parameter is consistent if the estimate converges to the true value in some sense. Using convergence in the mean square sense, $\lim_{k \to \infty} E\big\{ [\hat{x}(k, Z^k) - x_0]^2 \big\} = 0$ (expectation over $Z^k$) is the condition for consistency in the mean square sense. For a random parameter the condition is $\lim_{k \to \infty} E\big\{ [\tilde{x}(k, Z^k)]^2 \big\} = 0$, with the expectation over both $Z^k$ and $x$. For unbiased estimators this amounts to the variance tending to zero.
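
A minimal Monte Carlo sketch of mean square consistency follows (all numbers are illustrative, not from the slides): the sample mean of $k$ Gaussian measurements has MSE $\sigma^2/k$, which tends to zero as $k$ grows.

```python
import numpy as np

# Sketch: the sample-mean estimator is consistent in the mean square sense.
# x0, sigma, trials and the values of k are illustrative.
rng = np.random.default_rng(4)
x0, sigma, trials = 2.0, 1.0, 20_000

for k in (10, 100, 1000):
    z = x0 + sigma * rng.standard_normal((trials, k))
    x_hat = z.mean(axis=1)                              # sample-mean estimator
    print(k, np.mean((x_hat - x0) ** 2), sigma**2 / k)  # empirical MSE vs sigma^2/k
```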

Consistency and efficiency of estimators. The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matrix (FIM). CRLB: the mean square error of an estimator of a parameter cannot be smaller than a certain quantity. Scalar case, scalar nonrandom parameter with an unbiased estimator: $E\big\{ [\hat{x}(Z) - x_0]^2 \big\} \ge J^{-1}$, where $J \triangleq E\Big\{ \Big[ \frac{\partial \ln \Lambda(x)}{\partial x} \Big]^2 \Big\}\Big|_{x = x_0} = -E\Big\{ \frac{\partial^2 \ln \Lambda(x)}{\partial x^2} \Big\}\Big|_{x = x_0}$ is the Fisher information, $\Lambda(x)$ is the likelihood function and $x_0$ is the true value of the parameter.

Consistency and efficiency of estimators. The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matrix (FIM). Scalar case, scalar random parameter with an unbiased estimator: $E\big\{ [\hat{x}(Z) - x]^2 \big\} \ge J^{-1}$, where $J \triangleq E\Big\{ \Big[ \frac{\partial \ln p(Z, x)}{\partial x} \Big]^2 \Big\}$ is the Fisher information, with the expectation taken over both $Z$ and $x$. If the estimator variance is equal to the CRLB, then such an estimator is called efficient.

Consistency and efficiency of estimators. The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matrix (FIM). Vector case: for a nonrandom vector parameter $x$, the CRLB states that the covariance matrix of an unbiased estimator is bounded from below, $E\big\{ [\hat{x}(Z) - x_0] [\hat{x}(Z) - x_0]^T \big\} \ge J^{-1}$, where $J \triangleq E\big\{ [\nabla_x \ln \Lambda(x)] [\nabla_x \ln \Lambda(x)]^T \big\}\big|_{x = x_0} = -E\big\{ \nabla_x \nabla_x^T \ln \Lambda(x) \big\}\big|_{x = x_0}$ is the Fisher Information Matrix (FIM), $\Lambda(x)$ is the likelihood function and $x_0$ is the true value of the parameter. The matrix inequality $A \ge B$ means that $A - B \triangleq C$ is positive semidefinite.
Consistency and efficiency of estimators The Cramer-Rao Lower Bound (CRLB) and the Fisher Information Matri (FIM) Vector case For nonrandom vector parameter, the CRLB states that the covariance matri of an unbiased estimator is bounded from below: [ ˆ ][ ˆ ] ] T Z Z Δ J [ ] T T ln Λ [ ln Λ ] ln Λ J is the Fisher Information Matri (FIM) Λ() is the lielihood function the true value of the parameter J where [ [ ] ] T A B C Δ A B pos. def.