Graduate Econometrics I: Maximum Likelihood I


1 Graduate Econometrics I: Maximum Likelihood I
Yves Dominicy
Université libre de Bruxelles
Solvay Brussels School of Economics and Management
ECARES

2 Outline

3 Outline

4 Consider: $\mathcal{P} = \{P_\theta = l(y; \theta),\ \theta \in \Theta \subset \mathbb{R}^p\}$. A maximum likelihood estimator of $\theta$ is a solution to the maximization problem:

\[ \max_{\theta \in \Theta} l(y; \theta). \]

Because the solutions to an optimization problem remain unchanged when the objective function is transformed by a strictly increasing mapping, we may equivalently solve:

\[ \max_{\theta \in \Theta} \log l(y; \theta). \]

Note that taking logs makes the function more tractable (products of densities become sums). For conditional models:

\[ \max_{\theta \in \Theta} l(y \mid x; \theta). \]
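As a quick numerical sketch (my own toy likelihood, not from the slides), the argmax is indeed unchanged by the strictly increasing log transform:

```python
# Toy check (not from the slides): the argmax of a positive function is
# unchanged by the strictly increasing log transform.
import numpy as np

theta = np.linspace(0.1, 5.0, 500)
lik = theta**3 * np.exp(-2.0 * theta)   # an arbitrary positive "likelihood" in theta
print(theta[np.argmax(lik)])            # argmax of l
print(theta[np.argmax(np.log(lik))])    # identical argmax of log l
```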

5 ML estimates the unknown parameters by choosing them in such a way that the resulting distribution corresponds as closely as possible to the probability distribution of the observed data. Maximization (or optimization) is done by finding the values that set the gradient to zero:

\[ \frac{\partial \log l(y; \theta)}{\partial \theta} \Big|_{\theta = \hat{\theta}_n} = 0. \]


7 In other words, ML searches for the distribution in the model that is closest to the empirical distribution according to the Kullback-Leibler discrepancy measure.

Definition. Given $P^* = f^*(y)$ and $P = f(y)$,

\[ I(P^* \mid P) = E^*\left[ \log \frac{f^*(y)}{f(y)} \right] = \int_Y \log \frac{f^*(y)}{f(y)}\, f^*(y)\, dy \]

is the Kullback-Leibler discrepancy between $P^*$ and $P$.

Let $f^*(y) = l(y; \theta_0)$ and $f(y) = l(y; \theta)$. Then

\[ I(l(y; \theta_0) \mid l(y; \theta)) = \int_Y \log \frac{l(y; \theta_0)}{l(y; \theta)}\, l(y; \theta_0)\, dy = \int_Y \log l(y; \theta_0)\, l(y; \theta_0)\, dy - \int_Y \log l(y; \theta)\, l(y; \theta_0)\, dy. \]

8 Since we want to minimize the distance between $l(y; \theta_0)$ and $l(y; \theta)$, it is equivalent to minimize

\[ \min_\theta\ -\int_Y \log l(y; \theta)\, l(y; \theta_0)\, dy, \]

or to maximize the log-likelihood ($\Rightarrow$ ML)

\[ \max_\theta \int_Y \log l(y; \theta)\, l(y; \theta_0)\, dy, \]

or to maximize the sample counterpart:

\[ \max_\theta\ \frac{1}{n} \sum_{i=1}^n \log l(y_i; \theta). \]

We will denote the MLE by $\hat{\theta}_n$.
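A hedged numerical sketch (my own example, not part of the slides): for a $N(\theta, 1)$ model with true $\theta_0 = 2$, the sample average log-likelihood peaks near $\theta_0$, the Kullback-Leibler minimizer:

```python
# Sketch: the sample counterpart (1/n) sum log l(y_i; theta) is maximized
# near theta_0, the minimizer of the KL discrepancy. Assumes a N(theta, 1) model.
import numpy as np

rng = np.random.default_rng(0)
theta0 = 2.0
y = rng.normal(theta0, 1.0, size=10_000)        # draws from l(y; theta_0)

def avg_loglik(theta):
    # (1/n) sum_i log of the N(theta, 1) density at y_i
    return np.mean(-0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2)

grid = np.linspace(0.0, 4.0, 401)
print(grid[np.argmax([avg_loglik(t) for t in grid])])   # close to theta_0 = 2
```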

9 Remark: a certain number of problems may be encountered.

1. Non-existence of a solution: sometimes due to the fact that the parameter space is open or the log-likelihood has discontinuities in $\theta$.

Property. If the parameter space $\Theta$ is compact (closed and bounded) and if the likelihood function $\theta \mapsto l(y; \theta)$ is continuous on $\Theta$, then there exists a MLE.

2. Non-uniqueness of the maximizer: when more than one parameter value gives the same likelihood.

Property. If the parameter space $\Theta$ is convex and if the log-likelihood function is strictly concave in $\xi = h(\theta)$, where $h(\cdot)$ is a bijective transformation of the parameter, then the MLE exists and it is unique.

10 Outline

11 Unconstrained

Property. If $\theta = (\theta_1, \ldots, \theta_p)' \in \Theta \subset \mathbb{R}^p$, the log-likelihood function is differentiable in $\theta$, and $\hat{\theta}_n$ belongs to the interior of $\Theta$, then the MLE satisfies:

\[ \frac{\partial L(y; \hat{\theta}_n)}{\partial \theta} = \frac{\partial \log l(y; \hat{\theta}_n)}{\partial \theta} = 0. \]

These equations are called the likelihood equations.

Example: Let $Y_1, \ldots, Y_n$ be a random sample drawn from a Poisson distribution $\mathcal{P}(\lambda)$. The log-likelihood function is:

\[ L(y; \lambda) = -n\lambda + \sum_{i=1}^n y_i \log \lambda - \sum_{i=1}^n \log(y_i!). \]

It attains a maximum at $\hat{\lambda}_n$ satisfying:

\[ 0 = \frac{\partial L(y; \hat{\lambda}_n)}{\partial \lambda} = -n + \frac{\sum_{i=1}^n y_i}{\hat{\lambda}_n} \quad \Longrightarrow \quad \hat{\lambda}_n = \bar{y}. \]
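A quick numerical cross-check of the Poisson example (a sketch using scipy, not from the slides): maximizing the log-likelihood numerically reproduces the closed form $\hat{\lambda}_n = \bar{y}$.

```python
# Sketch: numerical maximization of the Poisson log-likelihood recovers
# lambda_hat = y_bar, as the likelihood equations predict.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

rng = np.random.default_rng(1)
y = rng.poisson(3.5, size=500)
n = y.size

def neg_loglik(lam):
    # minus of L(y; lambda) = -n*lambda + sum(y_i) log(lambda) - sum(log y_i!)
    return -(-n * lam + y.sum() * np.log(lam) - gammaln(y + 1).sum())

res = minimize_scalar(neg_loglik, bounds=(1e-6, 20.0), method="bounded")
print(res.x, y.mean())   # the two values agree
```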

12 Constrained

Econometric models usually impose constraints on the parameters: $f(\theta) = 0$. Maximization of $L(y; \theta)$ must take the constraints $f(\theta) = 0$ into account. To do so, we introduce a vector $\lambda$ of $r$ Lagrange multipliers and we maximize:

\[ \max_\theta\ L(y; \theta) - \lambda' f(\theta). \]

And the first order conditions are:

\[ \frac{\partial L(y; \hat{\theta}_n)}{\partial \theta} - \frac{\partial f(\hat{\theta}_n)'}{\partial \theta}\, \lambda = 0, \qquad f(\hat{\theta}_n) = 0. \]

The same property as in the unconstrained case holds provided that $f(\theta)$ is a function from $\mathbb{R}^p$ to $\mathbb{R}^r$ with $r \le p$.

13 Constrained

Example: Suppose that $Y = (Y_1, \ldots, Y_n)$ follows a binomial distribution:

\[ P(Y = y) = \binom{n}{y} p^y q^{n-y}, \]

where $p$ and $q$ are two probabilities such that $p + q = 1$, i.e. $p + q - 1 = 0$. Therefore the maximization problem is:

\[ \max_\theta\ L(y; \theta) - \lambda (p + q - 1), \]

where $\theta = (p, q)'$.

14 Constrained

First order conditions:

\[ \frac{\partial L}{\partial p} = \frac{\sum_{i=1}^n y_i}{p} - \lambda = 0, \qquad \frac{\partial L}{\partial q} = \frac{n - \sum_{i=1}^n y_i}{q} - \lambda = 0, \qquad p + q - 1 = 0, \]

then:

\[ \lambda = \frac{\sum_{i=1}^n y_i}{p} = \frac{n - \sum_{i=1}^n y_i}{q}, \qquad p = 1 - q, \]

\[ \frac{(1-q)\left(n - \sum_{i=1}^n y_i\right) - q \sum_{i=1}^n y_i}{q(1-q)} = 0 \]
\[ n - \sum_{i=1}^n y_i - qn + q \sum_{i=1}^n y_i - q \sum_{i=1}^n y_i = n - \sum_{i=1}^n y_i - qn = 0 \]
\[ \hat{q}_n = 1 - \frac{\sum_{i=1}^n y_i}{n}, \qquad \hat{p}_n = \frac{\sum_{i=1}^n y_i}{n}. \]
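The same answer can be obtained with an equality-constrained optimizer. The sketch below (my own check, assuming the Bernoulli-trials reading of the example) recovers $\hat{p}_n = \sum y_i / n$:

```python
# Sketch: maximize sum(y_i) log p + (n - sum(y_i)) log q subject to p + q = 1.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
y = rng.binomial(1, 0.3, size=200)   # n trials with success probability 0.3
n, s = y.size, y.sum()

def neg_loglik(theta):
    p, q = theta
    return -(s * np.log(p) + (n - s) * np.log(q))

res = minimize(neg_loglik, x0=[0.5, 0.5], method="SLSQP",
               bounds=[(1e-6, 1.0), (1e-6, 1.0)],
               constraints={"type": "eq", "fun": lambda t: t[0] + t[1] - 1.0})
print(res.x, s / n)   # p_hat matches sum(y_i)/n, q_hat = 1 - p_hat
```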

15 Outline

16 Existence and Consistency

Consider a parametric model and random sampling.

Regularity conditions 1:
A1 The variables $Y_i$, $i = 1, \ldots, n$, are i.i.d. with density $f(y; \theta)$, $\theta \in \Theta$.
A2 The parameter space $\Theta$ is compact (= closed and bounded).
A3 The true, but unknown, parameter value $\theta_0$ is identified.
A4 The log-likelihood function is continuous with respect to $\theta$.
A5 $E_0(\log f(y_i; \theta))$ exists.

17 Existence and Consistency

Property (Existence and consistency). Under assumptions A1-A5, there exists a sequence of MLEs converging to the true parameter value $\theta_0$.

PROOF (sketch): A2 and A4 ensure the existence of the MLE $\hat{\theta}_n$ obtained from maximizing $L_n(\theta)$ or $\frac{1}{n} L_n(\theta)$. Since $\frac{1}{n} L_n(\theta) = \frac{1}{n} \sum_{i=1}^n \log f(y_i; \theta)$ can be interpreted as the sample mean of the random variables $\log f(y_i; \theta)$, by the LLN:

\[ \frac{1}{n} L_n(\theta) \xrightarrow{p} E_0(\log l(Y; \theta)). \]

18 Existence and Consistency

Next, when the convergence is uniform, the solution $\hat{\theta}_n$ converges to the solution of the limit problem:

\[ \operatorname{plim} \hat{\theta}_n = \theta^* = \arg\max_\theta E_0(\log l(Y; \theta)) = \arg\max_\theta \int_Y \log l(y; \theta)\, l(y; \theta_0)\, dy. \]

By the identification condition on $\theta_0$, the solution to the limit problem is unique and equal to $\theta_0$:

\[ \theta^* = \theta_0 \quad \Longrightarrow \quad \operatorname{plim} \hat{\theta}_n = \theta_0. \]
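A small simulation makes the consistency statement concrete. This sketch (an assumed Poisson example, not from the slides) shows $\hat{\lambda}_n = \bar{y}$ concentrating around $\theta_0$ as $n$ grows:

```python
# Sketch: the fraction of replications with |lambda_hat_n - theta_0| > 0.1
# shrinks as n grows, illustrating plim theta_hat_n = theta_0.
import numpy as np

rng = np.random.default_rng(3)
theta0 = 3.5
for n in (10, 100, 10_000):
    draws = rng.poisson(theta0, size=(1_000, n))
    mle = draws.mean(axis=1)                       # lambda_hat_n = y_bar per replication
    print(n, np.mean(np.abs(mle - theta0) > 0.1))  # decreases toward 0
```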

19 Existence and Consistency

Small variations in the assumptions can be made. In particular, instead of working with the whole parameter space $\Theta$, we may replace A2 by:

A2' The interior of $\Theta$ is non-empty and $\theta_0$ belongs to the interior of $\Theta$.

We also need a local LLN. In this case we work with local maxima instead of global ones.

20 Asymptotic Distribution

Since the sequence $\hat{\theta}_n$ converges to $\theta_0$, it is useful to consider the asymptotic behaviour of $\hat{\theta}_n - \theta_0$, or rather, to determine the rate of convergence. We need extra regularity conditions:

Regularity conditions 2:
A6 $L_n(\theta)$ is twice differentiable in an open neighbourhood of $\theta_0$.
A7 $I_1(\theta_0) = E_0\left( -\frac{\partial^2 \log f(Y_1; \theta_0)}{\partial \theta \partial \theta'} \right)$ exists and is non-singular. $I_1$ is the Fisher (expected) information matrix for one random variable.

21 Asymptotic Distribution

Property. Under A1, A2', A3-A5, A6-A7, a consistent sequence $\hat{\theta}_n$ of local maxima is such that $\sqrt{n}(\hat{\theta}_n - \theta_0)$ converges in distribution to a Gaussian distribution with mean zero and variance-covariance matrix $I_1(\theta_0)^{-1}$:

\[ \sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, I_1(\theta_0)^{-1}). \]

PROOF (sketch): Since $\hat{\theta}_n$ satisfies $\frac{\partial L_n(\hat{\theta})}{\partial \theta} = 0$ and it converges to $\theta_0$, a Taylor expansion (1) of the score $\frac{\partial L_n(\hat{\theta})}{\partial \theta}$ in a neighborhood of $\theta = \theta_0$ gives:

(1) Taylor expansion:

\[ f(x) = \sum_{i=0}^p \frac{f^{(i)}(x_0)}{i!} (x - x_0)^i + R_n, \]

where $\sum_{i=0}^p \frac{f^{(i)}(x_0)}{i!} (x - x_0)^i$ is a $p$-degree polynomial and $R_n$ is the remainder.

22 Asymptotic Distribution

\[ 0 = \frac{\partial L_n(\hat{\theta})}{\partial \theta} = \frac{\partial L_n(\theta_0)}{\partial \theta} + \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} (\hat{\theta} - \theta_0) + o_p(1), \]

where the remainder of the expansion is $o_p(1)$. Rearranging:

\[ -\frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} (\hat{\theta}_n - \theta_0) \approx \frac{\partial L_n(\theta_0)}{\partial \theta}, \]

and dividing by $\sqrt{n}$:

\[ \underbrace{\left( -\frac{1}{n} \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} \right)}_{1} \sqrt{n}(\hat{\theta}_n - \theta_0) \approx \underbrace{\frac{1}{\sqrt{n}} \frac{\partial L_n(\theta_0)}{\partial \theta}}_{2}. \]

23 Asymptotic Distribution

Term 1:

\[ -\frac{1}{n} \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} = -\frac{1}{n} \sum_{i=1}^n \frac{\partial^2 \log f(y_i; \theta_0)}{\partial \theta \partial \theta'}, \]

which is an empirical mean. By an appropriate LLN it converges to:

\[ I_1(\theta_0) = E_{\theta_0}\left( -\frac{\partial^2 \log f(y_1; \theta_0)}{\partial \theta \partial \theta'} \right). \]

Term 2:

\[ \frac{1}{\sqrt{n}} \frac{\partial L_n(\theta_0)}{\partial \theta} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} - E_{\theta_0}\left( \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} \right) \right), \]

since the expected score at $\theta_0$ is zero.

24 Asymptotic Distribution

and by the CLT it converges in distribution to

\[ N\left( 0, V_{\theta_0}\left( \frac{\partial \log f(y_1; \theta_0)}{\partial \theta} \right) \right) = N(0, I_1(\theta_0)). \]

Collecting 1 and 2:

\[ I_1(\theta_0) \sqrt{n}(\hat{\theta}_n - \theta_0) \approx \frac{1}{\sqrt{n}} \frac{\partial L_n(\theta_0)}{\partial \theta}, \]

and:

\[ I_1(\theta_0) \sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, I_1(\theta_0)) \]
\[ \sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, I_1(\theta_0)^{-1} I_1(\theta_0) I_1(\theta_0)^{-1}) \]
\[ \sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, I_1(\theta_0)^{-1}). \]
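For the Poisson model, $I_1(\theta_0) = 1/\theta_0$, so the result predicts $V(\sqrt{n}(\hat{\lambda}_n - \theta_0)) \approx \theta_0$. The sketch below (my own simulation, not from the slides) checks this:

```python
# Sketch: for Poisson(theta_0), sqrt(n)(lambda_hat_n - theta_0) should be
# approximately N(0, theta_0) since I_1(theta_0)^{-1} = theta_0.
import numpy as np

rng = np.random.default_rng(4)
theta0, n, reps = 3.5, 2_000, 5_000
mle = rng.poisson(theta0, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (mle - theta0)
print(z.var(), theta0)   # sample variance close to I_1(theta_0)^{-1}
```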

25 Asymptotic Distribution

All this implies that, approximately:

\[ \hat{\theta}_n \stackrel{a}{\sim} N(\theta_0, I_n(\theta_0)^{-1}), \]

where:

\[ \frac{1}{n} I_1(\theta_0)^{-1} = (n I_1(\theta_0))^{-1} = I_n(\theta_0)^{-1} \]

and $I_n(\theta_0)$ is the Fisher information matrix for $n$ observations. Hence, $\hat{\theta}_n$ is consistent, asymptotically efficient and asymptotically Gaussian!

26 Asymptotic Distribution

$I_n(\theta_0)$ depends on $\theta_0$, which is unknown. But it can be estimated consistently by:

\[ \hat{I}_1(\hat{\theta}_n) = -\frac{1}{n} \sum_{i=1}^n \frac{\partial^2 \log f(y_i; \hat{\theta}_n)}{\partial \theta \partial \theta'} \]

or

\[ \hat{I}_1(\hat{\theta}_n) = \frac{1}{n} \sum_{i=1}^n \frac{\partial \log f(y_i; \hat{\theta}_n)}{\partial \theta} \frac{\partial \log f(y_i; \hat{\theta}_n)}{\partial \theta'}. \]

Property. Let $g$ be a continuously differentiable function of $\theta \in \mathbb{R}^p$ with values in $\mathbb{R}^q$. Then under the regularity conditions:
i) $g(\hat{\theta}_n) \xrightarrow{p} g(\theta_0)$
ii) $\sqrt{n}(g(\hat{\theta}_n) - g(\theta_0)) \xrightarrow{d} N\left( 0, \frac{\partial g(\theta_0)}{\partial \theta'} I_1(\theta_0)^{-1} \frac{\partial g(\theta_0)'}{\partial \theta} \right)$.
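Both estimators are easy to compute in the Poisson example. The sketch below (my own illustration, not from the slides) uses the analytic score $y/\lambda - 1$ and second derivative $-y/\lambda^2$, and adds a delta-method standard error for $g(\lambda) = \log \lambda$:

```python
# Sketch: Hessian-based and outer-product-of-gradients estimates of I_1(theta_0)
# for the Poisson model, plus a delta-method standard error for log(lambda).
import numpy as np

rng = np.random.default_rng(5)
theta0, n = 3.5, 50_000
y = rng.poisson(theta0, size=n)
lam = y.mean()                              # the MLE

I_hess = np.mean(y / lam**2)                # -(1/n) sum of second derivatives
I_opg = np.mean((y / lam - 1.0) ** 2)       # (1/n) sum of squared scores
print(I_hess, I_opg, 1.0 / theta0)          # both near I_1(theta_0) = 1/theta_0

# Delta method for g(lambda) = log(lambda): variance g'(lambda)^2 * I_1^{-1} / n
se_log = np.sqrt((1.0 / lam) ** 2 / I_hess / n)
print(np.log(lam), se_log)
```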

27 Asymptotic Distribution

Why is $\sqrt{n}$ so important in the previous proof? For two reasons. Dividing by $\sqrt{n}$, we had:

\[ \underbrace{\left( -\frac{1}{n} \frac{\partial^2 L_n(\theta_0)}{\partial \theta \partial \theta'} \right)}_{\text{First reason}} \sqrt{n}(\hat{\theta}_n - \theta_0) \approx \underbrace{\frac{1}{\sqrt{n}} \frac{\partial L_n(\theta_0)}{\partial \theta}}_{\text{Second reason}}. \]

First reason for $\sqrt{n}$: Law of Large Numbers for the Hessian. If we do not divide by $n$, the LLN cannot be applied.
Second reason for $\sqrt{n}$: Central Limit Theorem for the score.

28 Asymptotic Distribution

We had:

\[ \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} - E_0\left( \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} \right) \right), \]

or

\[ \frac{1}{\sqrt{n}} \left( \sum_{i=1}^n \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} - n\, E_0\left( \frac{\partial \log f(y_i; \theta_0)}{\partial \theta} \right) \right), \]

and the CLT works here. If we do not divide by $\sqrt{n}$, the CLT cannot be applied (the sum does not converge to a Gaussian).
