Advanced Signal Processing: Introduction to Estimation Theory
Advanced Signal Processing: Introduction to Estimation Theory
Danilo Mandic, room 813, ext: 46271
Department of Electrical and Electronic Engineering, Imperial College London, UK
d.mandic@imperial.ac.uk, URL: mandic
© D. P. Mandic, Advanced Signal Processing
Aims of this lecture
- To introduce the notions of an estimator, estimate, and estimandum
- To discuss bias and variance in statistical estimation theory, and asymptotically unbiased and consistent estimators
- Performance metrics, such as the Mean Square Error (MSE)
- The bias-variance dilemma and the MSE in this context
- To derive feasible MSE estimators
- A class of Minimum Variance Unbiased (MVU) estimators
- Extension to the vector parameter case
- Point estimators, confidence intervals, the statistical goodness of an estimator, and the role of noise
Role of estimation in signal processing
(Try also the function specgram in Matlab; it produces a time-frequency diagram, with time on the horizontal axis and frequency on the vertical axis.)
Estimation is an enabling technology in many DSP applications:
- Radar and sonar: estimation of range and azimuth
- Image analysis: motion estimation, segmentation
- Speech: features used in recognition and speaker verification
- Seismics: oil reservoirs
- Communications: equalisation, symbol detection
- Biomedicine: various applications
From Lecture 2: Estimation of the parameters of an AR(2) process (under- vs. over-fitting)
Original AR(2) process x[n] = 0.2x[n-1] - 0.9x[n-2] + w[n], w[n] ~ N(0, 1), estimated using AR(1), AR(2), and AR(20) models.
(Figure: the original and estimated signals over time, and the estimated AR coefficients for each model order, with the prediction error of each fit.)
- Can we put this into a bigger estimation theory framework?
- Can we quantify the goodness of the estimators above (bias, variance)?
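The under-/over-fitting experiment above can be reproduced numerically. The sketch below (not from the slides; the least-squares fitting routine `fit_ar` and all parameter values are illustrative assumptions) simulates the AR(2) process and fits AR(p) models by ordinary least squares; the AR(2) fit recovers coefficients close to (0.2, -0.9), while the AR(1) fit cannot capture the dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the AR(2) process x[n] = 0.2 x[n-1] - 0.9 x[n-2] + w[n], w[n] ~ N(0, 1)
N = 5000
x = np.zeros(N)
w = rng.standard_normal(N)
for n in range(2, N):
    x[n] = 0.2 * x[n - 1] - 0.9 * x[n - 2] + w[n]

def fit_ar(x, p):
    """Least-squares fit of an AR(p) model x[n] ~ a1 x[n-1] + ... + ap x[n-p]."""
    y = x[p:]
    # column k holds the lag-(k+1) regressor x[n-(k+1)]
    X = np.column_stack([x[p - 1 - k : len(x) - 1 - k] for k in range(p)])
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

a2 = fit_ar(x, 2)   # close to the true coefficients (0.2, -0.9)
a1 = fit_ar(x, 1)   # under-fit: one coefficient cannot capture the dynamics
print(a2, a1)
```

An AR(20) fit on the same data would illustrate over-fitting: the extra coefficients mostly model the noise.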
The statistical estimation problem
For simplicity, consider a DC level in WGN: x[n] = A + w[n], w[n] ~ N(0, σ²).
Problem statement: We seek to determine a set of parameters θ = [θ_1, ..., θ_p]^T from a set of data points x = [x[0], ..., x[N-1]]^T such that the values of these parameters would yield the highest probability of obtaining the observed data. In other words,
max over θ of p(x; θ)   (read: the PDF of x, parametrised by θ)
- The unknown parameters may be seen as either deterministic or random variables
- There are essentially two alternatives in the statistical case:
  - No a priori distribution assumed: Maximum Likelihood
  - A priori distribution known: Bayesian estimation
Key problem: to estimate a group of parameters from a discrete-time signal or dataset.
Estimation of a scalar random variable
Given an N-point dataset x[0], x[1], ..., x[N-1] which depends on an unknown scalar parameter θ, define an estimator as some function g of the dataset, that is,
θ̂ = g(x[0], x[1], ..., x[N-1])
which may be used to estimate θ (single-parameter case). (In our DC level estimation problem, θ = A.)
- This defines the problem of parameter estimation
- We also need to determine g(·)
- Theory and techniques of statistical estimation are available
- Estimation based on PDFs which contain unknown but deterministic parameters is termed classical estimation
- In Bayesian estimation, the unknown parameters are assumed to be random variables, which may be prescribed a priori to lie within some range of allowable parameters (or desired performance)
The statistical estimation problem (continued)
First step: to model the data mathematically. We employ a Probability Density Function (PDF) to describe the inherently random measurement process, that is,
p(x[0], x[1], ..., x[N-1]; θ)
which is parametrised by the unknown parameter θ.
Example: for N = 1, with θ denoting the mean value, a generic form of PDF for the class of Gaussian PDFs with any value of θ is given by
p(x[0]; θ) = (1/√(2πσ²)) exp[-(1/(2σ²)) (x[0] - θ)²]
Clearly, the observed value of x[0] critically impacts upon the likely value of θ.
Estimator vs. estimate
- The parameter to be estimated is then viewed as a realisation of the random variable θ
- Data are described by the joint PDF of the data and parameters:
p(x, θ) = p(x | θ) p(θ)   (conditional PDF × prior PDF)
- An estimator is a rule that assigns a value to the parameter θ for each realisation of x = [x[0], ..., x[N-1]]^T
- An estimate of the true value θ, also called the estimandum, is obtained for a given realisation of x = [x[0], ..., x[N-1]]^T in the form θ̂ = g(x)
Example: for a noisy straight line,
p(x; θ) = (1/(2πσ²)^(N/2)) exp[-(1/(2σ²)) Σ_{n=0}^{N-1} (x[n] - A - Bn)²]
Performance is critically dependent upon this PDF assumption; the estimator should be robust to a slight mismatch between the measurement and the PDF assumption.
Example 1: Finding the parameters of a straight line
(Specification of the PDF is critical in determining a good estimator.)
In practice, we choose a PDF which fits the problem constraints and any a priori information, but it must also be mathematically tractable.
Example: Assume that, on average, the data values are increasing.
Data: a straight line embedded in random noise w[n] ~ N(0, σ²):
x[n] = A + Bn + w[n],   n = 0, 1, ..., N-1
p(x; A, B) = (1/(2πσ²)^(N/2)) exp[-(1/(2σ²)) Σ_{n=0}^{N-1} (x[n] - A - Bn)²]
(Figure: the noisy line x[n] scattered about the ideal noiseless line A + Bn.)
Unknown parameters: A and B, i.e. θ = [A B]^T
Careful: what would be the effects of bias in A and B?
Bias in parameter estimation
Estimation theory (scalar case): estimate the value of an unknown parameter θ from a set of observations of a random variable described by that parameter,
θ̂ = g(x[0], x[1], ..., x[N-1])
Example: given a set of observations from a Gaussian distribution, estimate the mean or variance from these observations.
Recall that in linear mean square estimation, when estimating the value of a random variable y from an observation of a related random variable x, the coefficients A and B within the estimator ŷ = Ax + B depend upon the mean and variance of x and y, as well as on their correlation.
The difference between the expected value of the estimate and the actual value θ is called the bias and will be denoted by B:
B = E{θ̂_N} - θ
where θ̂_N denotes estimation over N data samples, x[0], ..., x[N-1].
Asymptotic unbiasedness
If the bias is zero, then the expected value of the estimate is equal to the true value, that is,
E{θ̂_N} = θ  ⇔  B = E{θ̂_N} - θ = 0
and the estimate is said to be unbiased. If B ≠ 0, then the estimator θ̂ = g(x) is said to be biased.
Example: Consider the following sample-mean-type estimator of the DC level in WGN, x[n] = A + w[n], w[n] ~ N(0, 1), given by
Â = (1/(N+2)) Σ_{n=0}^{N-1} x[n],   with θ = A
Is this estimator of the true mean A biased?
Observe: this estimator is biased, E{Â} = NA/(N+2) ≠ A, but the bias B → 0 as N → ∞:
lim_{N→∞} E{θ̂_N} = θ
Such an estimator is said to be asymptotically unbiased.
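The shrinking bias of the (1/(N+2))-scaled sample mean can be checked by Monte Carlo. This is a sketch, not part of the slides; the value A = 5 and the number of trials are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
A, trials = 5.0, 20000

def biased_mean(N):
    # the asymptotically unbiased estimator (1/(N+2)) * sum of x[n], x[n] = A + w[n]
    x = A + rng.standard_normal((trials, N))
    return x.sum(axis=1) / (N + 2)

bias_10 = biased_mean(10).mean() - A      # theory: -2A/(N+2) = -5/6
bias_1000 = biased_mean(1000).mean() - A  # theory: -2A/(N+2), much closer to 0
print(bias_10, bias_1000)
```

The empirical bias matches the analytical value E{Â} - A = -2A/(N+2), which vanishes as N grows.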
How about the variance?
It is desirable that an estimator be either unbiased or asymptotically unbiased (think about the power of the estimation error due to a DC offset).
For an estimate to be meaningful, it is necessary that we use the available statistics effectively, that is, var → 0 as N → ∞, or in other words
lim_{N→∞} var{θ̂_N} = lim_{N→∞} E{|θ̂_N - E{θ̂_N}|²} = 0
If θ̂_N is unbiased, then E{θ̂_N} = θ, and from the Tchebycheff inequality, for any ε > 0,
Pr{|θ̂_N - θ| ≥ ε} ≤ var{θ̂_N}/ε²
If var → 0 as N → ∞, then the probability that θ̂_N differs by more than ε from the true value will go to zero (showing consistency). In this case, θ̂_N is said to converge to θ with probability one.
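The Tchebycheff bound above can be illustrated for the sample mean of a DC level in noise. The sketch below is an illustration only (A, σ, ε, and the trial count are assumed values); it shows that the empirical deviation probability always stays below var/ε² and shrinks as N grows.

```python
import numpy as np

rng = np.random.default_rng(2)
A, sigma, eps, trials = 1.0, 2.0, 0.5, 50000

results = {}
for N in (10, 100, 1000):
    est = (A + sigma * rng.standard_normal((trials, N))).mean(axis=1)
    p_dev = np.mean(np.abs(est - A) >= eps)   # empirical P(|est - A| >= eps)
    bound = (sigma**2 / N) / eps**2           # Tchebycheff bound: var / eps^2
    results[N] = (p_dev, bound)
    print(N, p_dev, bound)
```

The bound is loose for small N (it can even exceed 1), but both the bound and the empirical probability go to zero with N, which is the consistency argument on this slide.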
Mean square convergence
NB: the mean square error criterion is very different from the variance criterion.
Another form of convergence, stronger than convergence with probability one, is mean square convergence. An estimate θ̂_N is said to converge to θ in the mean square sense if
lim_{N→∞} E{|θ̂_N - θ|²} = 0   (the quantity under the limit is the mean square error)
- For an unbiased estimator, this is equivalent to the previous condition that the variance of the estimate goes to zero
- An estimate is said to be consistent if it converges, in some sense, to the true value of the parameter
- We say that an estimator is consistent if it is asymptotically unbiased and has a variance that goes to zero as N → ∞
Example 2: Assessing the performance of the sample mean as an estimator
Consider the estimation of a DC level A in random noise, which can be modelled as x[n] = A + w[n], where w[n] is some zero-mean iid random process.
Aim: to estimate A given {x[0], x[1], ..., x[N-1]}.
Intuitively, the sample mean is a reasonable estimator, and has the form
Â = (1/N) Σ_{n=0}^{N-1} x[n]
Q1: How close will Â be to A?
Q2: Are there better estimators than the sample mean?
Mean and variance of the sample mean estimator
x[n] = A + w[n],   w[n] ~ N(0, σ²)
Estimator = f(random data), so it is a random variable itself, and its performance must be judged statistically.
(1) What is the mean of Â?
E{Â} = E{(1/N) Σ_{n=0}^{N-1} x[n]} = (1/N) Σ_{n=0}^{N-1} E{x[n]} = A   → unbiased
(2) What is the variance of Â? (Assumption: the samples of w[n] are uncorrelated.)
var{Â} = E{[Â - E{Â}]²} = (1/N²) var{Σ_{n=0}^{N-1} x[n]} = (1/N²) Σ_{n=0}^{N-1} var{x[n]} = (1/N²) N σ² = σ²/N
(The variance measures the variability around the mean.)
Notice that the variance → 0 as N → ∞, so the estimator is consistent (see your P&A sets).
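Both results, E{Â} = A and var{Â} = σ²/N, are easy to verify by simulation. A minimal sketch (the values of A, σ, and N are arbitrary assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A, sigma, N, trials = 2.0, 3.0, 50, 100000

x = A + sigma * rng.standard_normal((trials, N))
a_hat = x.mean(axis=1)    # the sample mean for each independent data record

print(a_hat.mean())       # ≈ A: the estimator is unbiased
print(a_hat.var())        # ≈ sigma^2 / N: the variance of the sample mean
```

With σ = 3 and N = 50, the empirical variance of Â comes out close to 9/50 = 0.18, as the derivation predicts.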
Minimum Variance Unbiased (MVU) estimation
Aim: to establish good estimators of unknown deterministic parameters.
An unbiased estimator on average yields the true value of the unknown parameter, independently of its particular value, i.e.
E{θ̂} = θ   for all a < θ < b
where (a, b) denotes the range of possible values of θ.
Example 3: Unbiased estimator for a DC level in white Gaussian noise (WGN), observed as
x[n] = A + w[n],   n = 0, 1, ..., N-1
where A is the unknown, but deterministic, parameter to be estimated, which lies within the interval (-∞, ∞). Then the sample mean can be used as an estimator of A, namely
Â = (1/N) Σ_{n=0}^{N-1} x[n]
Careful: the estimator is parameter dependent!
An estimator may be unbiased for certain values of the unknown parameter but not all; such an estimator is not unbiased.
Example 4: Consider another sample-mean-type estimator of a DC level:
Â = (1/(2N)) Σ_{n=0}^{N-1} x[n]
Therefore E{Â} = A/2, which equals A when A = 0, but E{Â} = A/2 ≠ A when A ≠ 0 (parameter dependent). Hence Â is not an unbiased estimator.
A biased estimator introduces a systematic error which should not generally be present. Our goal is to avoid bias if we can, as we are interested in stochastic signal properties and bias is largely deterministic.
Remedy: how about averaging? (But do we average data segments or estimates?)
(Also look at the coursework assignment dealing with PSD estimation.)
Several unbiased estimators of the same quantity may be averaged together. For example, given the L independent estimates {θ̂_1, θ̂_2, ..., θ̂_L}, we may choose to average them to yield
θ̄ = (1/L) Σ_{l=1}^{L} θ̂_l
Our assumption was that the individual estimators, θ̂_l = g(x), are unbiased, with equal variances, and mutually uncorrelated. Then (NB: averaging biased estimators will not remove the bias)
E{θ̄} = θ   and   var{θ̄} = (1/L²) Σ_{l=1}^{L} var{θ̂_l} = (1/L) var{θ̂_l}
Note: as L → ∞, θ̄ → θ (a consistent estimator).
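The 1/L variance reduction from averaging uncorrelated, equal-variance, unbiased estimates can be seen directly in simulation. A sketch under assumed values (θ = 1.5, per-estimate variance 4):

```python
import numpy as np

rng = np.random.default_rng(4)
theta, var_single, trials = 1.5, 4.0, 100000

var_of_avg = {}
for L in (1, 4, 16):
    # L independent unbiased estimates of theta, each with variance var_single
    ests = theta + np.sqrt(var_single) * rng.standard_normal((trials, L))
    var_of_avg[L] = ests.mean(axis=1).var()
    print(L, var_of_avg[L])   # ≈ var_single / L
```

Quadrupling L cuts the variance of the averaged estimate by a factor of four, exactly as var{θ̄} = var{θ̂_l}/L predicts.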
Effects of averaging for real-world data
Problem 3.4 from your P/A sets: heart rate estimation. The heart rate h of a patient is automatically recorded by a computer every 100 ms. One second of the measurements {ĥ_1, ĥ_2, ..., ĥ_10} is averaged to obtain
ĥ = (1/10) Σ_{i=1}^{10} ĥ_i
Given that E{ĥ_i} = αh for some constant α and var{ĥ_i} = 1 for all i, determine whether averaging improves the estimator, for α = 1 and α = 1/2.
E{ĥ} = (1/10) Σ_{i=1}^{10} αh = αh
For α = 1, the estimator is unbiased. For α = 1/2 it will not be unbiased unless the estimator is instead formed as ĥ = (1/(10α)) Σ_{i=1}^{10} ĥ_i = (1/5) Σ_{i=1}^{10} ĥ_i.
var{ĥ} = (1/10²) Σ_{i=1}^{10} var{ĥ_i} = 1/10
so averaging reduces the variance from 1 to 1/10.
(Figure: PDFs of the estimates before and after averaging, centred at h for α = 1 and at h/2 for α = 1/2, and narrowing after averaging.)
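The heart-rate problem can be checked numerically. This sketch is illustrative only (h = 70 bpm and Gaussian unit-variance measurement noise are assumptions, not part of the problem statement):

```python
import numpy as np

rng = np.random.default_rng(5)
h, trials = 70.0, 200000

def measurements(alpha):
    # 10 readings per second, each with mean alpha*h and unit variance
    return alpha * h + rng.standard_normal((trials, 10))

# alpha = 1: plain averaging is unbiased and cuts the variance from 1 to 1/10
avg = measurements(1.0).mean(axis=1)
print(avg.mean(), avg.var())      # ≈ 70, ≈ 0.1

# alpha = 1/2: rescale by 1/alpha, i.e. use (1/5) * sum, to restore unbiasedness
avg2 = measurements(0.5).sum(axis=1) / 5.0
print(avg2.mean(), avg2.var())    # ≈ 70, ≈ 0.4
```

Note the trade-off in the corrected α = 1/2 estimator: removing the bias by rescaling inflates the variance by 1/α² (here 0.4 instead of 0.1), though it is still better than a single raw measurement.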
Minimum variance criterion
An optimality criterion is necessary to define an optimal estimator. Such a criterion is the Mean Square Error (MSE), given by
MSE(θ̂) = E{(θ̂ - θ)²}   (= E{|error|²})
which measures the average squared deviation of the estimate from the true value (the error power). Without any constraints, this criterion leads to unrealisable estimators, i.e. those which are not solely a function of the data (see Example 5).
MSE(θ̂) = E{[(θ̂ - E{θ̂}) + (E{θ̂} - θ)]²} = var(θ̂) + [E{θ̂} - θ]² = var(θ̂) + B²(θ̂)
MSE = VARIANCE OF THE ESTIMATOR + SQUARED BIAS
Example 5: An MSE estimator with a gain factor
Consider the following estimator for a DC level in WGN:
Â = a (1/N) Σ_{n=0}^{N-1} x[n]
Task: find the value of a which results in the minimum MSE.
Solution: E{Â} = aA and var(Â) = a²σ²/N, so that we have
MSE(Â) = a²σ²/N + (a - 1)²A²
Of course, the choice a = 1 removes the bias, but does it minimise the MSE?
Example 5 (continued): An MSE estimator with a gain (is a biased estimator feasible?)
But can we find a analytically? Differentiating with respect to a yields
∂MSE(Â)/∂a = 2aσ²/N + 2(a - 1)A²
and setting the result to zero gives the optimal value
a_opt = A²/(A² + σ²/N)
but we do not know the value of A! The optimal value a_opt depends on A, which is the unknown parameter.
Comment: any criterion which depends on the value of an unknown parameter to be found is likely to yield unrealisable estimators. Practically, the minimum MSE estimator needs to be abandoned, and the estimator must be constrained to be unbiased.
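A quick numerical check of a_opt: sweep the gain a over a grid, evaluate the analytical MSE expression from the previous slide, and confirm the minimiser matches A²/(A² + σ²/N). The values A = σ = 1, N = 10 are assumptions for illustration.

```python
import numpy as np

A, sigma, N = 1.0, 1.0, 10

a = np.linspace(0.0, 1.5, 1501)
mse = a**2 * sigma**2 / N + (a - 1)**2 * A**2   # MSE(a) = a^2 sigma^2/N + (a-1)^2 A^2

a_best = a[np.argmin(mse)]
a_opt = A**2 / (A**2 + sigma**2 / N)   # = 10/11 here; note it depends on the unknown A
print(a_best, a_opt)
```

The MSE-optimal gain is strictly less than 1, i.e. a slightly biased (shrunk) estimator beats the unbiased one in MSE, but it is unrealisable because a_opt contains A.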
Example 6: A counter-example, a little bias can help (but the estimator is difficult to control)
Q: Let {y[n]}, n = 1, ..., N, be iid Gaussian variables ~ N(0, σ²). Consider the following estimate of σ²:
σ̂² = (α/N) Σ_{n=1}^{N} y²[n],   α > 0
Find the α which minimises the MSE of σ̂².
S: It is straightforward to show that E{σ̂²} = ασ² and
MSE(σ̂²) = E{(σ̂² - σ²)²} = E{σ̂⁴} + σ⁴(1 - 2α)
= (α²/N²) Σ_{n=1}^{N} Σ_{s=1}^{N} E{y²[n] y²[s]} + σ⁴(1 - 2α)   (hint: the double sum has N diagonal terms with E{y⁴} = 3σ⁴)
= (α²/N²)(N²σ⁴ + 2Nσ⁴) + σ⁴(1 - 2α) = σ⁴[α²(1 + 2/N) + (1 - 2α)]
The minimum MSE is obtained for α_min = N/(N+2) and has the value min MSE(σ̂²) = 2σ⁴/(N+2).
Given that the minimum variance of an unbiased estimator (the CRLB, later) is 2σ⁴/N, this is an example of a biased estimator which obtains a lower MSE than the CRLB.
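Example 6 is easy to verify by Monte Carlo: compare the MSE of the unbiased variance estimate (α = 1) with the shrunk estimate α = N/(N+2). The sketch below uses assumed values σ² = 2 and N = 10.

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2, N, trials = 2.0, 10, 400000

y = np.sqrt(sigma2) * rng.standard_normal((trials, N))
s = (y**2).mean(axis=1)            # alpha = 1: unbiased, MSE = 2*sigma2^2/N

alpha = N / (N + 2)
mse_unbiased = np.mean((s - sigma2)**2)         # ≈ 2*sigma2^2/N     = 0.8
mse_shrunk = np.mean((alpha * s - sigma2)**2)   # ≈ 2*sigma2^2/(N+2) = 2/3
print(mse_unbiased, mse_shrunk)
```

The shrunk (biased) estimator achieves a smaller MSE than the unbiased one, matching the closed-form values 2σ⁴/N and 2σ⁴/(N+2) on the slide.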
Desired: the minimum variance unbiased (MVU) estimator
Minimising the variance of an unbiased estimator concentrates the PDF of the estimation error about zero; the estimation error is therefore less likely to be large.
Existence of the MVU estimator: the MVU estimator is an unbiased estimator with minimum variance for all θ, that is, θ̂_3 on the graph.
(Figure: variance curves of several unbiased estimators plotted against θ; θ̂_3 has the lowest variance for every value of θ.)
Methods to find the MVU estimator
The MVU estimator may not always exist, and even if it exists we need to examine whether it is optimal. Indeed, a single unbiased estimator may not exist, in which case a search for the MVU estimator is fruitless. Some solutions are:
- Determine the Cramér-Rao lower bound (CRLB) and find some estimator which satisfies the MVU criteria
- Apply the Rao-Blackwell-Lehmann-Scheffé (RBLS) theorem
- Restrict the class of estimators to be not only unbiased, but also linear in both the parameters and data (BLUE, Lecture 5)
- Model-based approaches, such as Least Squares and Sequential Least Squares (Lecture 6)
- Adaptive estimators (Lecture 7)
Extensions to the vector parameter case
If θ̂ = [θ̂_1, θ̂_2, ..., θ̂_p]^T ∈ R^{p×1} is an estimator of a vector of unknown parameters θ = [θ_1, ..., θ_p]^T, the estimator is said to be unbiased if
E{θ̂_i} = θ_i,  where a_i < θ_i < b_i, for i = 1, 2, ..., p.
By defining
E{θ̂} = [E{θ̂_1}, E{θ̂_2}, ..., E{θ̂_p}]^T
an unbiased estimator has the property E{θ̂} = θ within the p-dimensional space of parameters spanned by θ = [θ_1, ..., θ_p]^T.
An MVU estimator has the additional property that var{θ̂_i}, for i = 1, 2, ..., p, is the minimum among all unbiased estimators.
Summary and food for thought
- We are now equipped with performance metrics for assessing the goodness of any estimator (bias, variance, MSE)
- Since MSE = var + bias², some biased estimators may yield low MSE; however, we prefer minimum variance unbiased (MVU) estimators
- Even the simple sample mean estimator is a very rich example of the advantages of statistical estimators
- Knowledge of the parametrised PDF p(data; parameters) is very important for designing efficient estimators
- We have introduced statistical point estimators; would it also be useful to know the confidence we have in our point estimate?
- In many disciplines it is useful to design so-called set membership estimates, where the output of an estimator belongs to a pre-defined bound (range) of values
- We will next address linear, best linear unbiased, maximum likelihood, least squares, sequential least squares, and adaptive estimators
Homework: Check another proof of the MSE expression MSE(θ̂) = var(θ̂) + bias²(θ̂)
Note: var(x) = E[x²] - (E[x])²   (*)
Idea: let x = θ̂ - θ and substitute into (*) to give
var(θ̂ - θ) = E[(θ̂ - θ)²] - (E[θ̂ - θ])²   (**)
with term (1) = var(θ̂ - θ), term (2) = E[(θ̂ - θ)²], term (3) = (E[θ̂ - θ])².
Let us now evaluate these terms:
(1) var(θ̂ - θ) = var(θ̂)   (θ is deterministic)
(2) E[(θ̂ - θ)²] = MSE
(3) (E[θ̂ - θ])² = (E[θ̂] - θ)² = bias²(θ̂)
Substituting (1), (2), (3) into (**) gives var(θ̂) = MSE - bias², i.e. MSE = var(θ̂) + bias²(θ̂).
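The decomposition MSE = var + bias² also holds exactly for empirical moments, which makes it easy to sanity-check numerically. In this sketch, the deliberately biased estimator 0.8 × (sample mean) and all parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)
theta, trials, N = 3.0, 200000, 20

# A deliberately biased estimator: 0.8 * sample mean of x[n] = theta + w[n]
x = theta + rng.standard_normal((trials, N))
est = 0.8 * x.mean(axis=1)

mse = np.mean((est - theta)**2)      # empirical MSE
var = est.var()                      # empirical variance of the estimator
bias2 = (est.mean() - theta)**2      # empirical squared bias (≈ (0.2*theta)^2)
print(mse, var + bias2)              # the two quantities agree
```

With population moments replaced by sample moments, the identity holds to floating-point precision, since the algebra in the proof above does not rely on the expectation being a population average.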
Recap: Unbiased estimators
Due to the linearity of the statistical expectation operator E{·}, that is, E{a + b} = E{a} + E{b}, the sample mean estimator can be shown to be unbiased, i.e.
E{Â} = (1/N) Σ_{n=0}^{N-1} E{x[n]} = (1/N) Σ_{n=0}^{N-1} A = A
In some applications, the value of A may be constrained to be positive. For example, the value of an electronic component such as an inductor, capacitor or resistor would be positive (prior knowledge).
For N data points in random noise, unbiased estimators generally have symmetric PDFs centred about their true value, that is,
Â ~ N(A, σ²/N)
More informationECE 275A Homework 7 Solutions
ECE 275A Homework 7 Solutions Solutions 1. For the same specification as in Homework Problem 6.11 we want to determine an estimator for θ using the Method of Moments (MOM). In general, the MOM estimator
More informationParameter Estimation
Parameter Estimation Consider a sample of observations on a random variable Y. his generates random variables: (y 1, y 2,, y ). A random sample is a sample (y 1, y 2,, y ) where the random variables y
More informationBasic concepts in estimation
Basic concepts in estimation Random and nonrandom parameters Definitions of estimates ML Maimum Lielihood MAP Maimum A Posteriori LS Least Squares MMS Minimum Mean square rror Measures of quality of estimates
More informationEstimation Theory. as Θ = (Θ 1,Θ 2,...,Θ m ) T. An estimator
Estimation Theory Estimation theory deals with finding numerical values of interesting parameters from given set of data. We start with formulating a family of models that could describe how the data were
More informationik () uk () Today s menu Last lecture Some definitions Repeatability of sensing elements
Last lecture Overview of the elements of measurement systems. Sensing elements. Signal conditioning elements. Signal processing elements. Data presentation elements. Static characteristics of measurement
More informationChapter 8.8.1: A factorization theorem
LECTURE 14 Chapter 8.8.1: A factorization theorem The characterization of a sufficient statistic in terms of the conditional distribution of the data given the statistic can be difficult to work with.
More informationV. Properties of estimators {Parts C, D & E in this file}
A. Definitions & Desiderata. model. estimator V. Properties of estimators {Parts C, D & E in this file}. sampling errors and sampling distribution 4. unbiasedness 5. low sampling variance 6. low mean squared
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationLECTURE 5 NOTES. n t. t Γ(a)Γ(b) pt+a 1 (1 p) n t+b 1. The marginal density of t is. Γ(t + a)γ(n t + b) Γ(n + a + b)
LECTURE 5 NOTES 1. Bayesian point estimators. In the conventional (frequentist) approach to statistical inference, the parameter θ Θ is considered a fixed quantity. In the Bayesian approach, it is considered
More informationIEOR E4703: Monte-Carlo Simulation
IEOR E4703: Monte-Carlo Simulation Output Analysis for Monte-Carlo Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com Output Analysis
More informationParametric Techniques
Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationIntroduction to Estimation Methods for Time Series models Lecture 2
Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:
More informationSystem Identification, Lecture 4
System Identification, Lecture 4 Kristiaan Pelckmans (IT/UU, 2338) Course code: 1RT880, Report code: 61800 - Spring 2012 F, FRI Uppsala University, Information Technology 30 Januari 2012 SI-2012 K. Pelckmans
More informationMethods of evaluating estimators and best unbiased estimators Hamid R. Rabiee
Stochastic Processes Methods of evaluating estimators and best unbiased estimators Hamid R. Rabiee 1 Outline Methods of Mean Squared Error Bias and Unbiasedness Best Unbiased Estimators CR-Bound for variance
More informationSystem Identification, Lecture 4
System Identification, Lecture 4 Kristiaan Pelckmans (IT/UU, 2338) Course code: 1RT880, Report code: 61800 - Spring 2016 F, FRI Uppsala University, Information Technology 13 April 2016 SI-2016 K. Pelckmans
More informationSTAT 730 Chapter 4: Estimation
STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum
More informationParametric Techniques Lecture 3
Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to
More informationAlgorithm Independent Topics Lecture 6
Algorithm Independent Topics Lecture 6 Jason Corso SUNY at Buffalo Feb. 23 2009 J. Corso (SUNY at Buffalo) Algorithm Independent Topics Lecture 6 Feb. 23 2009 1 / 45 Introduction Now that we ve built an
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationEstimation, Detection, and Identification
Estimation, Detection, and Identification Graduate Course on the CMU/Portugal ECE PhD Program Spring 2008/2009 Chapter 5 Best Linear Unbiased Estimators Instructor: Prof. Paulo Jorge Oliveira pjcro @ isr.ist.utl.pt
More informationClassical Estimation Topics
Classical Estimation Topics Namrata Vaswani, Iowa State University February 25, 2014 This note fills in the gaps in the notes already provided (l0.pdf, l1.pdf, l2.pdf, l3.pdf, LeastSquares.pdf). 1 Min
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationMAS223 Statistical Inference and Modelling Exercises
MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,
More informationLecture 7 Introduction to Statistical Decision Theory
Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7
More informationUniversity of Cambridge Engineering Part IIB Module 4F10: Statistical Pattern Processing Handout 2: Multivariate Gaussians
Engineering Part IIB: Module F Statistical Pattern Processing University of Cambridge Engineering Part IIB Module F: Statistical Pattern Processing Handout : Multivariate Gaussians. Generative Model Decision
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationParameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!
Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses
More informationThe Expectation-Maximization Algorithm
1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable
More informationLecture 5 September 19
IFT 6269: Probabilistic Graphical Models Fall 2016 Lecture 5 September 19 Lecturer: Simon Lacoste-Julien Scribe: Sébastien Lachapelle Disclaimer: These notes have only been lightly proofread. 5.1 Statistical
More informationMath 494: Mathematical Statistics
Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/
More informationGaussian Estimation under Attack Uncertainty
Gaussian Estimation under Attack Uncertainty Tara Javidi Yonatan Kaspi Himanshu Tyagi Abstract We consider the estimation of a standard Gaussian random variable under an observation attack where an adversary
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationSTATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN
Massimo Guidolin Massimo.Guidolin@unibocconi.it Dept. of Finance STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN SECOND PART, LECTURE 2: MODES OF CONVERGENCE AND POINT ESTIMATION Lecture 2:
More informationSTAT215: Solutions for Homework 2
STAT25: Solutions for Homework 2 Due: Wednesday, Feb 4. (0 pt) Suppose we take one observation, X, from the discrete distribution, x 2 0 2 Pr(X x θ) ( θ)/4 θ/2 /2 (3 θ)/2 θ/4, 0 θ Find an unbiased estimator
More informationEXAMINERS REPORT & SOLUTIONS STATISTICS 1 (MATH 11400) May-June 2009
EAMINERS REPORT & SOLUTIONS STATISTICS (MATH 400) May-June 2009 Examiners Report A. Most plots were well done. Some candidates muddled hinges and quartiles and gave the wrong one. Generally candidates
More informationLECTURE NOTE #3 PROF. ALAN YUILLE
LECTURE NOTE #3 PROF. ALAN YUILLE 1. Three Topics (1) Precision and Recall Curves. Receiver Operating Characteristic Curves (ROC). What to do if we do not fix the loss function? (2) The Curse of Dimensionality.
More informationStatistics II Lesson 1. Inference on one population. Year 2009/10
Statistics II Lesson 1. Inference on one population Year 2009/10 Lesson 1. Inference on one population Contents Introduction to inference Point estimators The estimation of the mean and variance Estimating
More informationChapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic
Chapter 3: Unbiased Estimation Lecture 22: UMVUE and the method of using a sufficient and complete statistic Unbiased estimation Unbiased or asymptotically unbiased estimation plays an important role in
More informationPerhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.
Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage
More informationLecture 4: September Reminder: convergence of sequences
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 4: September 6 In this lecture we discuss the convergence of random variables. At a high-level, our first few lectures focused
More informationLecture Notes 7 Stationary Random Processes. Strict-Sense and Wide-Sense Stationarity. Autocorrelation Function of a Stationary Process
Lecture Notes 7 Stationary Random Processes Strict-Sense and Wide-Sense Stationarity Autocorrelation Function of a Stationary Process Power Spectral Density Continuity and Integration of Random Processes
More informationMS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari
MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind
More informationCOS513 LECTURE 8 STATISTICAL CONCEPTS
COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions
More information