Parameter Estimation
- Tobias Gilbert
- 5 years ago
Consider a sample of T observations on a random variable Y. This generates T random variables: (y_1, y_2, ..., y_T). A random sample is a sample (y_1, y_2, ..., y_T) where the random variables y_t, t = 1, ..., T, are independently and identically distributed (often written "i.i.d."). This means that the joint probability density function of the sample factors as

f(y_1, y_2, ..., y_T) = f(y_1) f(y_2) ... f(y_T),

where f(y_t) is the marginal probability density function of y_t, t = 1, 2, ..., T.

Estimators

Consider the estimation of a (K×1) parameter vector θ based on a sample (y_1, y_2, ..., y_T). An estimator of the (K×1) parameter vector θ is a function θ^e(y_1, y_2, ..., y_T), where y_t is the t-th random variable in the sample. Since it is a function of random variables, an estimator θ^e(y_1, y_2, ..., y_T) is itself a vector of random variables. An estimate of θ is the value of θ^e(y_1, y_2, ..., y_T) computed at the observed values of the y_t's, t = 1, 2, ..., T. Since the y_t's are observed, the estimate is not a vector of random variables: it takes a specific value for a given sample, but varies across samples.

How do we identify the function θ^e(·)?

The Method of Moments

Consider the r-th sample moment of the random variable Y:

μ^e_r = Σ_{t=1}^T y_t^r / T, r = 1, 2, ..., K.

Assume that the K population moments of Y are related to the (K×1) parameter vector θ by the known functions

μ_r = h_r(θ), r = 1, 2, ..., K.

The method of moments consists in equating the sample moments μ^e_r with the true moments μ_r and solving the resulting system of K equations, μ^e_r = h_r(θ), r = 1, ..., K, for θ. The resulting estimator is the method of moments estimator.
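As a numerical sketch of this recipe in the two-moment case (the sample, the "true" values β = 2 and σ² = 9, and the seed are assumptions chosen for illustration):

```python
import numpy as np

# Hypothetical i.i.d. sample with mean beta = 2 and variance sigma^2 = 9
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=3.0, size=100_000)
T = y.size

# First two sample moments: mu^e_r = sum_t y_t^r / T
m1 = np.sum(y) / T
m2 = np.sum(y**2) / T

# Population moments: mu_1 = beta and mu_2 = sigma^2 + beta^2.
# Equating sample to population moments and solving gives the MoM estimates.
beta_m = m1
sigma2_m = m2 - m1**2

print(beta_m, sigma2_m)
```

With a sample this large, both estimates land close to the assumed values β = 2 and σ² = 9.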
Let (y_1, y_2, ..., y_T) be a sample where y_t is distributed with mean β and variance σ², t = 1, 2, ..., T, or y_t ~ (β, σ²). Therefore θ = (β, σ²). We have

E(Y) = μ_1 = β, and E[(Y - β)²] = E(Y²) - [E(Y)]² = μ_2 - (μ_1)² = σ².

Then, equating the sample moments to the population moments gives

μ^e_1 = Σ_{t=1}^T y_t / T = β
μ^e_2 = Σ_{t=1}^T y_t² / T = σ² + β².

Solving these two equations for θ = (β, σ²) gives the method of moments estimator of the mean β,

β_m = Σ_{t=1}^T y_t / T,

and of the variance σ²,

σ²_m = (Σ_{t=1}^T y_t² / T) - (β_m)² = Σ_{t=1}^T (y_t - β_m)² / T.

The Maximum Likelihood Method

Let the joint probability density function of the sample (y_1, y_2, ..., y_T) be f(y_1, y_2, ..., y_T; θ), where θ is a (K×1) vector of unknown parameters that belongs to the parameter space Ω, θ ∈ Ω. Define the likelihood function of the sample as

l(θ; y_1, y_2, ..., y_T) = f(y_1, y_2, ..., y_T; θ).

The maximum likelihood estimator is the value θ^e_l that solves the maximization problem

max_θ {l(θ; y_1, y_2, ..., y_T): θ ∈ Ω}.

Define the log-likelihood function of the sample as L = ln[l(θ; y_1, y_2, ..., y_T)]. Since the logarithmic function is monotonic, l(·) and ln[l(·)] attain their maxima at the same value of θ. As a result, it is often convenient to define the maximum likelihood estimator of θ as the value θ^e_l that solves

max_θ {L = ln[l(θ; y_1, y_2, ..., y_T)]: θ ∈ Ω}.

Let (y_1, y_2, ..., y_T) be a random sample, where y_t ~ N(β, σ²), θ = (β, σ²), σ² > 0. Thus the probability density function of y_t is

f(y_t; β, σ²) = exp[-(1/2)(y_t - β)²/σ²] / [2π σ²]^(1/2),

and

l(β, σ²; y_1, y_2, ..., y_T) = f(y_1, y_2, ..., y_T; θ) = Π_{t=1}^T f(y_t; β, σ²).
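To make the likelihood concrete, the short Python sketch below evaluates the normal log-likelihood ln l = Σ_t ln f(y_t; β, σ²) on a simulated sample and confirms that it is larger at the true parameters than at nearby wrong ones (the seed, sample, and parameter values are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
beta_true, sigma2_true = 1.5, 4.0               # assumed "true" parameters
y = rng.normal(beta_true, np.sqrt(sigma2_true), size=50_000)

def log_lik(beta, sigma2):
    """ln l(beta, sigma^2; y) = sum_t ln f(y_t; beta, sigma^2) for y_t ~ N(beta, sigma^2)."""
    log_f = -0.5 * (y - beta) ** 2 / sigma2 - 0.5 * np.log(2 * np.pi * sigma2)
    return float(np.sum(log_f))

at_truth = log_lik(beta_true, sigma2_true)
wrong_mean = log_lik(beta_true + 0.5, sigma2_true)   # mis-specified mean
wrong_var = log_lik(beta_true, 2 * sigma2_true)      # mis-specified variance
print(at_truth > wrong_mean, at_truth > wrong_var)
```

Working on the log scale also avoids the numerical underflow that the raw product Π_t f(y_t) would produce for any realistic sample size.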
It follows that the log-likelihood function of the sample is

L = ln[l(β, σ²; y_1, y_2, ..., y_T)]
= ln[Π_{t=1}^T f(y_t; β, σ²)]
= Σ_{t=1}^T [-(1/2)(y_t - β)²/σ² - (1/2) ln(2π σ²)]
= -(T/2) ln(2π) - (T/2) ln(σ²) - (1/2) Σ_{t=1}^T (y_t - β)²/σ².

Note that L is a concave function of θ = (β, σ²). The first-order conditions for a maximum with respect to θ = (β, σ²) are:

∂L/∂β = [Σ_{t=1}^T (y_t - β)]/σ² = 0
∂L/∂(σ²) = -T/(2σ²) + [Σ_{t=1}^T (y_t - β)²]/(2σ⁴) = 0.

Solving these two equations for θ = (β, σ²) gives the maximum likelihood estimator of the mean β,

β_l = Σ_{t=1}^T y_t / T,

and of the variance σ²,

σ²_l = Σ_{t=1}^T (y_t - β_l)² / T.

The maximum likelihood method requires knowing the probability distribution of the y_t's. (Other methods can be less demanding.) In this case, β_l = β_m and σ²_l = σ²_m, i.e. the method of moments and the maximum likelihood method give identical estimators of θ = (β, σ²).

Least-Squares Method

Assume that we know functions h_t(θ) satisfying E(y_t) = h_t(θ), t = 1, 2, ..., T, where θ is a (K×1) vector of parameters, θ ∈ Ω. Define the error term e_t = y_t - h_t(θ). It follows that e_t is a random variable (since it is a function of the random variable y_t) and has mean zero: E(e_t) = E(y_t) - h_t(θ) = 0. Define the error sum of squares

S(y_1, y_2, ..., y_T, θ) = Σ_{t=1}^T [y_t - h_t(θ)]².

The least squares estimator of θ is the value θ^e_s that solves the minimization problem

min_θ {S(y_1, y_2, ..., y_T, θ): θ ∈ Ω}.

Let (y_1, y_2, ..., y_T) be a sample where y_t is distributed with mean β and some finite variance, t = 1, 2, ..., T. Given E(y_t) = β, let h_t(β) = β. Then the least squares estimator of β is obtained by minimizing S = Σ_{t=1}^T (y_t - β)². Note that S is a convex function of β. The first-order necessary condition for a minimum of S is:
∂S/∂β = -2 Σ_{t=1}^T (y_t - β) = 0.

Solving this equation for β gives the least squares estimator of β,

β_s = Σ_{t=1}^T y_t / T.

Note: In this case, β_l = β_m = β_s, i.e. the method of moments, the maximum likelihood method, and the least squares method all give identical estimators of the mean β.

Properties of Estimators

Again, we consider an estimator θ^e(y_1, y_2, ..., y_T) of a (K×1) parameter vector θ based on a sample (y_1, y_2, ..., y_T).

Finite Sample Properties (based on a sample of size T)

Unbiased estimator: An estimator θ^e(y_1, y_2, ..., y_T) of θ is unbiased if E(θ^e) = θ. If E(θ^e) ≠ θ, then the estimator θ^e is said to be biased, its bias being [E(θ^e) - θ] ≠ 0.

Efficient estimator: An estimator θ^e(y_1, y_2, ..., y_T) of θ is efficient if it is unbiased and if it has the smallest possible variance among all unbiased estimators.

Cramer-Rao lower bound: An unbiased estimator θ^e is efficient if its variance satisfies

V(θ^e) = -[E(∂²L(θ)/∂θ²)]⁻¹ = I(θ)⁻¹,

where L(θ) = ln[l(θ; y_1, y_2, ..., y_T)] is the log-likelihood function of the sample, I(θ) = -E(∂²L(θ)/∂θ²) is a (K×K) matrix called the information matrix, and I(θ)⁻¹ is called the Cramer-Rao lower bound. Note: this requires knowing the probability distribution of the y_t's.

Best Linear Unbiased Estimator (BLUE): An estimator θ^e is best linear unbiased if it is linear, i.e. θ^e = Σ_{t=1}^T a_t y_t for some constants a_t; if it is unbiased, i.e. E(θ^e) = θ; and if it has the smallest variance among all linear unbiased estimators. This does not require knowing the probability distribution of the y_t's.
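The least-squares solution just derived can be checked numerically: minimizing S(β) = Σ_t (y_t - β)² by brute force over a fine grid of candidate values recovers the sample mean (the data, grid range, and seed are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(loc=5.0, scale=2.0, size=1_000)   # hypothetical sample with mean beta = 5

def S(beta):
    """Error sum of squares S(beta) = sum_t (y_t - beta)^2."""
    return float(np.sum((y - beta) ** 2))

# Brute-force minimization of S over a fine grid of candidate values for beta
grid = np.linspace(0.0, 10.0, 20_001)            # grid step 5e-4
beta_s = grid[np.argmin([S(b) for b in grid])]

print(beta_s, y.mean())
```

Up to the grid resolution, the minimizer coincides with the closed-form least squares estimator Σ_t y_t / T.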
Let (y_1, y_2, ..., y_T) be a random sample of size T, where y_t ~ (β, σ²), t = 1, 2, ..., T. By definition of the mean, we have E(y_t) = β. Consider the following estimator of the mean β:

β^e = Σ_{t=1}^T y_t / T.

We have

E(β^e) = E(Σ_{t=1}^T y_t / T) = Σ_{t=1}^T E(y_t)/T = Tβ/T = β.

It follows that β^e = Σ_{t=1}^T y_t / T is an unbiased estimator of β. The variance of β^e is

V(β^e) = V(Σ_{t=1}^T y_t / T)
= (1/T²) Σ_{t=1}^T V(y_t) (this uses the independence of the y_t's in a random sample)
= (1/T²) T σ², since V(y_t) = σ², t = 1, 2, ..., T,
= σ²/T.

Noting that the estimator β^e = Σ_{t=1}^T y_t / T is linear, it can be shown to have the smallest variance among all linear unbiased estimators. Thus, β^e is the best linear unbiased estimator (BLUE) of β.

If we know that y_t is normally distributed (i.e., y_t ~ N(β, σ²)), then it can be shown that

I(θ) = -E[∂²L/∂θ²] = [ T/σ²    0
                        0       T/(2σ⁴) ],  where θ = (β, σ²).

It follows that the Cramer-Rao lower bound is

I(θ)⁻¹ = [ σ²/T    0
            0       2σ⁴/T ].

Since the variance of β^e, V(β^e) = σ²/T, is equal to the Cramer-Rao lower bound, it follows that β^e is efficient. This means that, under the normality assumption, β^e has the smallest variance among all unbiased estimators (whether they are linear or not). Since β^e = β_m = β_l = β_s = Σ_{t=1}^T y_t / T, it follows that the estimator of the mean β obtained from the method of moments, the maximum likelihood method, or the least squares method is unbiased, BLUE, and efficient under a normal distribution.
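A small Monte Carlo experiment illustrates these finite-sample claims: across many simulated samples, the sample mean β^e averages to β, and its variance is close to σ²/T, the Cramer-Rao lower bound under normality (the sample size, number of replications, parameter values, and seed are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(3)
beta, sigma2 = 1.0, 4.0        # assumed true mean and variance
T, reps = 50, 20_000           # sample size and number of Monte Carlo replications

# Draw `reps` independent samples of size T and compute the sample mean of each
samples = rng.normal(beta, np.sqrt(sigma2), size=(reps, T))
beta_e = samples.mean(axis=1)

# Unbiasedness: the replications average to beta.
# Efficiency under normality: their variance is close to sigma^2 / T, the CR bound.
print(beta_e.mean(), beta_e.var(), sigma2 / T)
```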
Let (y_1, y_2, ..., y_T) be a random sample of size T, where y_t is distributed with mean β and variance σ², t = 1, 2, ..., T. Consider the following estimator of the variance σ²:

(σ²)^e = Σ_{t=1}^T (y_t - β^e)² / T, where β^e = Σ_{t=1}^T y_t / T.

Since (σ²)^e = σ²_m = σ²_l, the estimator (σ²)^e is identical to both the method of moments estimator and the maximum likelihood estimator (under normality) of σ². Note that

E[(σ²)^e] = E[(1/T) Σ_{t=1}^T (y_t - β^e)²]
= E[(1/T) Σ_{t=1}^T [(y_t - β) - (β^e - β)]²]
= E[(1/T) Σ_{t=1}^T [(y_t - β)² + (β^e - β)² - 2(y_t - β)(β^e - β)]]
= E[Σ_{t=1}^T (y_t - β)²/T + (β^e - β)² - 2(β^e - β) Σ_{t=1}^T (y_t - β)/T]
= E[Σ_{t=1}^T (y_t - β)²/T + (β^e - β)² - 2(β^e - β)²], since Σ_{t=1}^T (y_t - β)/T = β^e - β
= E[Σ_{t=1}^T (y_t - β)²/T - (β^e - β)²]
= Σ_{t=1}^T E[(y_t - β)²]/T - E[(β^e - β)²]
= σ² - V(β^e), since σ² = E[(y_t - β)²] and E[(β^e - β)²] = V(β^e)
= σ² - σ²/T, since V(β^e) = σ²/T
= σ²(T - 1)/T.

It follows that E[(σ²)^e] = [(T-1)/T] σ² < σ². This implies that (σ²)^e = σ²_m = σ²_l is a biased estimator of the variance σ². Thus, both the method of moments and the maximum likelihood method give a biased estimator of the variance σ². This suggests that an unbiased estimator of the variance σ² is

σ²_u = Σ_{t=1}^T (y_t - β^e)² / (T - 1), where β^e = Σ_{t=1}^T y_t / T.

The estimator σ²_u is unbiased since σ²_u = (σ²)^e [T/(T-1)], which implies E(σ²_u) = E[(σ²)^e] [T/(T-1)] = σ².

Asymptotic Properties (the sample size T becomes large, T approaches infinity)

Again, we consider an estimator θ^e(y_1, y_2, ..., y_T) of a (K×1) parameter vector θ based on a sample of size T.

Consistent estimator: An estimator θ^e of θ is said to be consistent if

lim_{T→∞} P(|θ^e - θ| < ε) = 1,

where ε is an arbitrarily small positive number. Equivalently, the estimator θ^e is consistent if it converges in probability to the constant θ, where θ is said to be the probability limit of θ^e:
plim θ^e = θ.

Note: Sufficient conditions for θ^e to be a consistent estimator of θ are that lim_{T→∞} E(θ^e) = θ (i.e. θ^e is asymptotically unbiased) and lim_{T→∞} V(θ^e) = 0.

Let (y_1, y_2, ..., y_T) be a sample where y_t has mean β and variance σ². Consider the estimator β^e = Σ_{t=1}^T y_t / T of β. We have shown that the estimator β^e has mean β and variance σ²/T. We know that the estimator β^e is unbiased (i.e., E(β^e) = β) for any sample size T. It is thus also asymptotically unbiased (as T becomes large). In addition, its variance V(β^e) = σ²/T clearly goes to zero as T becomes large. Thus, β^e = Σ_{t=1}^T y_t / T is a consistent estimator of β.

Central Limit Theorem: Let (y_1, y_2, ..., y_T) be a random sample where y_t ~ (β, σ²), t = 1, 2, ..., T. Let β^e = Σ_{t=1}^T y_t / T. Then, as T → ∞, T^(1/2) (β^e - β) converges in distribution to a N(0, σ²) random variable:

T^(1/2) (β^e - β) →d N(0, σ²).

Implications: This result obtains for any distribution of the y_t's. The Central Limit Theorem says that, if the sample size T is reasonably large (e.g., T > 30), then T^(1/2) (β^e - β) is approximately normally distributed with mean 0 and variance σ². Equivalently stated, when T is reasonably large, β^e is approximately normally distributed: β^e ≈ N(β, σ²/T). Note that this result is consistent with our earlier results that β^e has mean β and variance σ²/T. What is new here is the asymptotic normality of β^e (or of T^(1/2)(β^e - β)) for any distribution of the y_t's.

Asymptotic Efficiency: An estimator θ^e of θ is said to be asymptotically efficient if it is consistent and if it has the smallest possible asymptotic variance among all consistent estimators. An estimator θ^e of θ is asymptotically efficient if it satisfies

T^(1/2) (θ^e - θ) →d N(0, lim_{T→∞} [(1/T) I(θ)]⁻¹),

where I(θ) = -E[∂²L/∂θ²] is the information matrix defined above. Under fairly general conditions, the maximum likelihood estimator θ_l of θ is:

- consistent,
- asymptotically normal,
- asymptotically unbiased,
- asymptotically efficient.

Let (y_1, y_2, ..., y_T) be a random sample of size T, where y_t ~ N(β, σ²), t = 1, 2, ..., T. The maximum likelihood estimator θ^e_l = (β_l, σ²_l) of θ = (β, σ²) is

β_l = Σ_{t=1}^T y_t / T and σ²_l = (1/T) Σ_{t=1}^T (y_t - β_l)².

From the above results, the maximum likelihood estimator θ^e_l = (β_l, σ²_l) of θ is consistent, asymptotically normal, and asymptotically efficient. In addition, we have seen that the information matrix is

I(θ) = -E[∂²L/∂θ²] = [ T/σ²    0
                        0       T/(2σ⁴) ].

It follows that the asymptotic distribution of θ^e_l = (β_l, σ²_l) is

T^(1/2) (θ^e_l - θ) →d N(0, lim_{T→∞} [(1/T) I(θ)]⁻¹) = N(0, [ σ²   0
                                                                0    2σ⁴ ]).

This shows that the asymptotic variances of θ^e_l are V(β_l) ≈ σ²/T (which is identical to the one derived earlier) and V(σ²_l) ≈ 2σ⁴/T (which is a new result).
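These asymptotic variances can be checked by simulation: for a reasonably large T, the Monte Carlo variances of the two maximum likelihood estimators are close to σ²/T and 2σ⁴/T (sample size, number of replications, parameter values, and seed are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(4)
beta, sigma2 = 0.0, 2.0        # assumed true parameters of N(beta, sigma^2)
T, reps = 200, 10_000          # sample size and number of Monte Carlo replications

samples = rng.normal(beta, np.sqrt(sigma2), size=(reps, T))
beta_l = samples.mean(axis=1)                                # MLE of beta
sigma2_l = ((samples - beta_l[:, None]) ** 2).mean(axis=1)   # MLE of sigma^2 (divides by T)

# Compare Monte Carlo variances with the asymptotic ones: sigma^2/T and 2*sigma^4/T
print(beta_l.var(), sigma2 / T)
print(sigma2_l.var(), 2 * sigma2**2 / T)
```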
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Wednesday, October 5, 2011 Lecture 13: Basic elements and notions in decision theory Basic elements X : a sample from a population P P Decision: an action
More informationAdvanced Quantitative Methods: maximum likelihood
Advanced Quantitative Methods: Maximum Likelihood University College Dublin 4 March 2014 1 2 3 4 5 6 Outline 1 2 3 4 5 6 of straight lines y = 1 2 x + 2 dy dx = 1 2 of curves y = x 2 4x + 5 of curves y
More informationy = 1 N y i = 1 N y. E(W )=a 1 µ 1 + a 2 µ a n µ n (1.2) and its variance is var(w )=a 2 1σ a 2 2σ a 2 nσ 2 n. (1.
The probability density function of a continuous random variable Y (or the probability mass function if Y is discrete) is referred to simply as a probability distribution and denoted by f(y; θ) where θ
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationf(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain
0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher
More informationMcGill University. Faculty of Science. Department of Mathematics and Statistics. Part A Examination. Statistics: Theory Paper
McGill University Faculty of Science Department of Mathematics and Statistics Part A Examination Statistics: Theory Paper Date: 10th May 2015 Instructions Time: 1pm-5pm Answer only two questions from Section
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationPh.D. Qualifying Exam Friday Saturday, January 3 4, 2014
Ph.D. Qualifying Exam Friday Saturday, January 3 4, 2014 Put your solution to each problem on a separate sheet of paper. Problem 1. (5166) Assume that two random samples {x i } and {y i } are independently
More informationMathematical statistics
October 18 th, 2018 Lecture 16: Midterm review Countdown to mid-term exam: 7 days Week 1 Chapter 1: Probability review Week 2 Week 4 Week 7 Chapter 6: Statistics Chapter 7: Point Estimation Chapter 8:
More informationChapters 9. Properties of Point Estimators
Chapters 9. Properties of Point Estimators Recap Target parameter, or population parameter θ. Population distribution f(x; θ). { probability function, discrete case f(x; θ) = density, continuous case The
More informationCourse: ESO-209 Home Work: 1 Instructor: Debasis Kundu
Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear
More informationMath 152. Rumbos Fall Solutions to Assignment #12
Math 52. umbos Fall 2009 Solutions to Assignment #2. Suppose that you observe n iid Bernoulli(p) random variables, denoted by X, X 2,..., X n. Find the LT rejection region for the test of H o : p p o versus
More informationMaximum Likelihood (ML) Estimation
Econometrics 2 Fall 2004 Maximum Likelihood (ML) Estimation Heino Bohn Nielsen 1of32 Outline of the Lecture (1) Introduction. (2) ML estimation defined. (3) ExampleI:Binomialtrials. (4) Example II: Linear
More informationLeast Squares Estimation-Finite-Sample Properties
Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More informationDA Freedman Notes on the MLE Fall 2003
DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar
More informationStatistics Ph.D. Qualifying Exam: Part I October 18, 2003
Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your answer
More informationDavid Giles Bayesian Econometrics
David Giles Bayesian Econometrics 1. General Background 2. Constructing Prior Distributions 3. Properties of Bayes Estimators and Tests 4. Bayesian Analysis of the Multiple Regression Model 5. Bayesian
More informationModern Methods of Data Analysis - WS 07/08
Modern Methods of Data Analysis Lecture VIc (19.11.07) Contents: Maximum Likelihood Fit Maximum Likelihood (I) Assume N measurements of a random variable Assume them to be independent and distributed according
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More informationLecture 3 September 1
STAT 383C: Statistical Modeling I Fall 2016 Lecture 3 September 1 Lecturer: Purnamrita Sarkar Scribe: Giorgio Paulon, Carlos Zanini Disclaimer: These scribe notes have been slightly proofread and may have
More information