Maximum Likelihood Estimation
1 Maximum Likelihood Estimation
Eduardo Rossi, University of Pavia
2 Likelihood function

Maximum likelihood chooses the parameter values that make what one has observed more likely to occur than any other parameter values would.

Assumption (Distribution). The pair $(U, V)$ is a random variable and the $N$ pairs $\{(U_1, V_1), \ldots, (U_N, V_N)\}$ are an i.i.d. random sample of $(U, V)$. The conditional distribution $F_{U \mid V}(u \mid v; \theta_0)$ is completely known except for $\theta_0$ (the true value of the real-valued parameter vector), which is unknown, $\theta \in \mathbb{R}^K$. The support of $F_{U \mid V}$ is $S(\theta_0)$:

$$\int_{S(\theta_0)} dF_{U \mid V}(u \mid v; \theta_0) = 1 =
\begin{cases}
\sum_{u \in S(\theta_0)} f(u \mid v; \theta_0) & \text{if } U \text{ is discrete} \\
\int_{S(\theta_0)} f(u \mid v; \theta_0)\, du & \text{if } U \text{ is continuous}
\end{cases}$$

Eduardo Rossi © Econometria finanziaria
3 Likelihood function

Probability function for $(U_1, \ldots, U_N) \mid (V_1, \ldots, V_N)$:

$$\prod_{t=1}^{N} f(u_t \mid v_t; \theta_0)$$

Normal Linear Regression: $y_t = x_t'\beta_0 + \epsilon_t$, with $(y_t, x_t)$ i.i.d. normal, $u_t = y_t$, $v_t = x_t$:

$$f(u_t \mid v_t; \theta_0) = \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left[-\frac{(y_t - x_t'\beta_0)^2}{2\sigma_0^2}\right], \qquad S(\theta_0) = \mathbb{R}.$$

Since the observations are i.i.d. normal, the conditional p.d.f. of the sample is

$$\prod_{t=1}^{N} f(u_t \mid v_t; \theta_0) = \left(2\pi\sigma_0^2\right)^{-N/2} \exp\left[-\frac{(y - X\beta_0)'(y - X\beta_0)}{2\sigma_0^2}\right]$$
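Under the i.i.d. assumption the sample log-likelihood is just the sum of per-observation terms, which is easy to verify numerically. A minimal sketch in Python (simulated data; the function name `normal_loglik` is mine, not from the slides):

```python
import numpy as np

def normal_loglik(beta, sigma2, y, X):
    """Joint conditional log-likelihood: -N/2 log(2 pi sigma^2) - RSS / (2 sigma^2)."""
    resid = y - X @ beta
    return -0.5 * len(y) * np.log(2 * np.pi * sigma2) - (resid @ resid) / (2 * sigma2)

rng = np.random.default_rng(0)
N = 50
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta0, sigma2_0 = np.array([1.0, 2.0]), 0.5
y = X @ beta0 + rng.normal(scale=np.sqrt(sigma2_0), size=N)

# Because the observations are i.i.d., the sample log-likelihood is the sum
# of the per-observation terms log f(y_t | x_t; theta)
per_obs = -0.5 * np.log(2 * np.pi * sigma2_0) - (y - X @ beta0) ** 2 / (2 * sigma2_0)
total = per_obs.sum()
```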
4 Likelihood function

The marginal distribution of $x_t$ does not depend on $\theta_0$.

Student's t Linear Regression: $\dfrac{y_t - x_t'\beta_0}{\sigma_0} \,\Big|\, x_t \sim t_{\nu_0}$

$$f(u_t \mid v_t; \theta_0) = \frac{\Gamma[(\nu_0 + 1)/2]}{\Gamma(\nu_0/2)\sqrt{\pi\nu_0\sigma_0^2}} \left[1 + \frac{(y_t - x_t'\beta_0)^2}{\nu_0\sigma_0^2}\right]^{-(\nu_0 + 1)/2}$$
5 Likelihood function

Laplace Linear Regression:

$$f(u_t \mid v_t; \sigma_0^2) = \frac{1}{\sqrt{2\sigma_0^2}} \exp\left(-\sqrt{2}\,\frac{|y_t - x_t'\beta_0|}{\sigma_0}\right)$$

with $U = y_t$, $V = x_t$, $S(\theta_0) = \mathbb{R}$, $\theta_0 = [\beta_0', \sigma_0^2]'$.

We can obtain

$$h(\theta_0) \equiv E[g(U)] = \int g(u)\, dF(u; \theta_0)$$

$$h(v; \theta_0) \equiv E[g(U, V) \mid V = v] = \int g(u, v)\, dF(u \mid v; \theta_0)$$
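The $\sqrt{2}$ in the Laplace density above is a normalization that makes $\sigma_0^2$ the error variance, so the scale parameter is directly comparable across the normal, Student-t and Laplace specifications. A quick numerical sketch of this claim (grid, truncation point and tolerances are arbitrary choices of mine):

```python
import numpy as np

# Density from the slide: f(e) = (2 sigma^2)^(-1/2) exp(-sqrt(2) |e| / sigma)
sigma = 1.7
e = np.linspace(-40.0, 40.0, 400_001)
f = np.exp(-np.sqrt(2.0) * np.abs(e) / sigma) / np.sqrt(2.0 * sigma**2)

de = e[1] - e[0]
mass = f.sum() * de            # should integrate to one
var = (e**2 * f).sum() * de    # second moment should equal sigma^2
```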
6 The likelihood function

Unconditional specification: $f(u; \theta)$ describes the likely values of every r.v. $U_t$, $t = 1, 2, \ldots, N$, for a specific value of $\theta$. The sample likelihood function instead treats the argument $u$ as given and $\theta$ as variable: it describes the likely values of the unknown $\theta_0$ given the realizations of the r.v. $U$.

The likelihood function of $\theta$ for a random variable $U$ with p.f. $f(u; \theta_0)$ is defined to be

$$l(\theta; U) = f(U; \theta), \qquad L(\theta; U) = \log l(\theta; U)$$
7 The likelihood function

Likelihood function: we evaluate the p.f. at a random variable and consider the result as a function of the variable $\theta$:

$$L(\theta; U_1, \ldots, U_N) = \log\left[\prod_{t=1}^{N} f(U_t; \theta)\right] = \sum_{t=1}^{N} L(\theta; U_t)$$

The conditional likelihood function of $\theta$ for a r.v. $U$ with p.f. $f(u \mid v; \theta_0)$ given the r.v. $V$ is

$$l(\theta; U \mid V) = f(U \mid V; \theta), \qquad L(\theta; U \mid V) = \log l(\theta; U \mid V)$$

$\theta_0 \in \Theta$, where $\Theta$ is the parameter space, the set of parameter values permitted by the model.
8 Dominance condition

Assumption (Dominance condition).

$$E\left[\sup_{\theta \in \Theta} |L(\theta; U \mid V)|\right] \text{ exists.}$$

This means that $L(\theta; U \mid V)$ is dominated by

$$h(U, V) \equiv \sup_{\theta \in \Theta} |L(\theta; U \mid V)|$$

where $h(U, V)$ does not depend on $\theta$. The existence of $E[h(U, V)]$ implies the existence of $E[L(\theta; U \mid V)]$ for all $\theta \in \Theta$.

Lemma. If $L(\theta; U \mid V)$ is the conditional log-likelihood for $\theta$ and the Dominance condition holds, then

$$E[L(\theta; U \mid V) \mid V] \leq E[L(\theta_0; U \mid V) \mid V].$$
9 Expected log-likelihood inequality

Unconditional case:

$$E[L(\theta_0; U)] \geq E[L(\theta; U)]$$

The specification of the p.f. of $U$ determines the expected values of functions of $U$. Therefore define

$$Q(\theta, \theta_0) \equiv E[L(\theta; U)]$$

which depends on $\theta$ because $L$ does, and on $\theta_0$ because $Q$ is the expected value of a function of $U$. The expected log-likelihood inequality states that

$$Q(\theta_0, \theta_0) = \max_{\theta \in \Theta} Q(\theta, \theta_0)$$
10 Normal linear regression model

$$y_t \mid x_t \sim N(x_t'\beta_0, \sigma_0^2)$$

$$\begin{aligned}
E[L(\theta; y_t \mid x_t) \mid x_t] &= -\frac{1}{2}\log(2\pi\sigma^2) - \frac{E[(y_t - x_t'\beta)^2 \mid x_t]}{2\sigma^2} \\
&= -\frac{1}{2}\log(2\pi\sigma^2) - \frac{1}{2}\,\frac{E[(y_t - x_t'\beta_0 + x_t'\beta_0 - x_t'\beta)^2 \mid x_t]}{\sigma^2} \\
&= -\frac{1}{2}\left[\log(2\pi\sigma^2) + \frac{\sigma_0^2 + (x_t'\beta - x_t'\beta_0)^2}{\sigma^2}\right]
\end{aligned}$$

which is uniquely maximized at $x_t'\beta = x_t'\beta_0$ and $\sigma^2 = \sigma_0^2$.
11 Normal linear regression model

The conditional expectation of the conditional log-likelihood of the entire sample is the sum of such terms:

$$E[L(\theta; y \mid X) \mid X] = -\frac{N}{2}\log(2\pi\sigma^2) - \frac{N\sigma_0^2 + (\beta - \beta_0)'X'X(\beta - \beta_0)}{2\sigma^2}$$

which is uniquely maximized at $\beta = \beta_0$ (equivalently $X\beta = X\beta_0$) and $\sigma^2 = \sigma_0^2$ if $X$ has full column rank.
12 Student t Linear Regression

The expected log-likelihood is analytically intractable. We can nevertheless show that $E[L(\theta; U \mid V)]$ exists for $\nu_0 > 2$. By the concavity of the logarithm, $\log(1 + z^2) \leq z^2$, so that

$$E\left[\log\left[1 + \frac{(y_t - x_t'\beta)^2}{\nu\sigma^2}\right] \,\Big|\, x_t\right] \leq E\left[\frac{(y_t - x_t'\beta)^2}{\nu\sigma^2} \,\Big|\, x_t\right] = \frac{\nu_0\sigma_0^2/(\nu_0 - 2) + (x_t'\beta_0 - x_t'\beta)^2}{\nu\sigma^2}$$

Provided that $E[x_t x_t']$ exists, the expected log-likelihood exists.
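Both ingredients of the bound above can be sanity-checked numerically: the inequality $\log(1 + z^2) \leq z^2$, and the second moment $\nu_0/(\nu_0 - 2)$ of a $t_{\nu_0}$ variable that produces the first term on the right-hand side. A sketch (simulated; degrees of freedom, sample size and seed are arbitrary choices of mine):

```python
import numpy as np

# The bound used on the slide: log(1 + z^2) <= z^2 for all z
z = np.linspace(-10.0, 10.0, 2001)
bound_holds = bool(np.all(np.log1p(z**2) <= z**2 + 1e-12))

# Second moment of a t_nu variable is nu / (nu - 2) for nu > 2 (simulated check)
rng = np.random.default_rng(1)
nu = 6
t = rng.standard_t(nu, size=2_000_000)
second_moment = (t**2).mean()
```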
13 Unconditional inequality

The expected log-likelihood inequality implies the unconditional inequality

$$E[L(\theta; U \mid V)] \leq E[L(\theta_0; U \mid V)]$$

Starting from $E[L(\theta; U \mid V) \mid V] \leq E[L(\theta_0; U \mid V) \mid V]$, we can take the expectation over $V$:

$$E[L(\theta; U \mid V)] = E\big[E[L(\theta; U \mid V) \mid V]\big] \leq E\big[E[L(\theta_0; U \mid V) \mid V]\big] = E[L(\theta_0; U \mid V)]$$
14 The ML estimator

Because $\theta_0$ maximizes $E[L(\theta; U \mid V)]$, it is natural to construct an estimator of $\theta_0$ from the value of $\theta$ that maximizes the sample counterpart: the average log-likelihood function of the $N$ observations,

$$E_N[L(\theta; U \mid V)] \equiv \frac{1}{N}\sum_t L(\theta; U_t \mid V_t), \qquad \text{versus} \qquad E[L(\theta; U \mid V)] = \int L(\theta; u \mid v)\, dF(u \mid v; \theta_0)$$

ML estimator: the MLE is a value of the parameter vector that maximizes the sample average log-likelihood function

$$\hat{\theta}_N \in \arg\max_{\theta \in \Theta} E_N[L(\theta)]$$
15 Normal Linear Regression Model

The empirical expectation of the log-likelihood:

$$E_N[L(\theta)] = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{E_N[(y_t - x_t'\beta)^2]}{2\sigma^2} = -\frac{1}{2}\log(2\pi\sigma^2) - \frac{(y - X\beta)'(y - X\beta)/N}{2\sigma^2}$$

The log-likelihood is differentiable. F.O.C.s:

$$E_N[L_\beta(\theta)] = \frac{1}{\sigma^2}\, E_N[x_t(y_t - x_t'\beta)] = \frac{1}{N\sigma^2}\, X'(y - X\beta)$$

$$E_N[L_{\sigma^2}(\theta)] = -\frac{1}{2\sigma^4}\left\{\sigma^2 - E_N[(y_t - x_t'\beta)^2]\right\} = -\frac{1}{2\sigma^4}\left[\sigma^2 - \frac{1}{N}(y - X\beta)'(y - X\beta)\right]$$
16 Normal Linear Regression Model

Solutions:

$$\frac{1}{N\hat{\sigma}^2}\, X'(y - X\hat{\beta}) = 0 \quad \Rightarrow \quad \hat{\beta} = (X'X)^{-1}X'y$$

The MLE of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{1}{N}(y - X\hat{\beta})'(y - X\hat{\beta}) = \frac{\hat{\epsilon}'\hat{\epsilon}}{N} = \frac{N - K}{N}\, s^2$$

The Hessian matrix:

$$E_N[L_{\theta\theta'}(\theta)] = \begin{bmatrix}
-\dfrac{X'X}{N\sigma^2} & -\dfrac{X'(y - X\beta)}{N\sigma^4} \\[1.5ex]
-\dfrac{(y - X\beta)'X}{N\sigma^4} & \dfrac{1}{2\sigma^4} - \dfrac{(y - X\beta)'(y - X\beta)}{N\sigma^6}
\end{bmatrix}$$
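The closed-form solutions above can be verified in a few lines. This sketch (simulated data; variable names are mine) computes the MLE and checks the relation $\hat{\sigma}^2 = \frac{N-K}{N}s^2$ and the first-order condition $X'(y - X\hat{\beta}) = 0$:

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 120, 4
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ rng.normal(size=K) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # MLE coincides with OLS
resid = y - X @ beta_hat
sigma2_mle = resid @ resid / N                  # ML estimator: divides by N
s2 = resid @ resid / (N - K)                    # unbiased OLS variance estimator
```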
17 Normal Linear Regression Model

Evaluated at the MLE, the off-diagonal blocks vanish because $X'(y - X\hat{\beta}) = 0$, and since $(y - X\hat{\beta})'(y - X\hat{\beta}) = N\hat{\sigma}^2$ the bottom-right entry simplifies:

$$E_N[L_{\theta\theta'}(\hat{\theta})] = \begin{bmatrix}
-\dfrac{X'X}{N\hat{\sigma}^2} & 0 \\[1.5ex]
0 & \dfrac{1}{2\hat{\sigma}^4} - \dfrac{(y - X\hat{\beta})'(y - X\hat{\beta})}{N\hat{\sigma}^6}
\end{bmatrix} = \begin{bmatrix}
-\dfrac{X'X}{N\hat{\sigma}^2} & 0 \\[1.5ex]
0 & -\dfrac{1}{2\hat{\sigma}^4}
\end{bmatrix}$$

which is negative definite. The second-order necessary condition for a point to be a local maximum of a twice continuously differentiable function is that the Hessian be negative semidefinite at that point.
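A numerical check that the Hessian at the MLE is block diagonal and negative definite, a sketch on simulated data (all names and values are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 150, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sig2 = resid @ resid / N

# Hessian of E_N[L] at the MLE; the off-diagonal block X'resid/(N sig^4) is zero
H = np.zeros((K + 1, K + 1))
H[:K, :K] = -X.T @ X / (N * sig2)
H[:K, K] = H[K, :K] = -X.T @ resid / (N * sig2**2)
H[K, K] = 1.0 / (2.0 * sig2**2) - (resid @ resid) / (N * sig2**3)

eigs = np.linalg.eigvalsh(H)   # all eigenvalues should be negative
```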
18 Identification

Is the DGP sufficiently informative about the parameters of the model? If $f(u \mid v; \theta_0) = f(u \mid v; \theta_1)$, data drawn from these two distributions will have the same sampling properties: there is no way to distinguish whether $\theta = \theta_0$ or $\theta = \theta_1$.
19 Global Identification

The parameter $\theta_0$ is globally identified in $\Theta$ if, for every $\theta_1 \in \Theta$, $\theta_1 \neq \theta_0$ implies that

$$\Pr\{f(U \mid V; \theta_0) \neq f(U \mid V; \theta_1)\} > 0$$

Assumption (Global identification). Every parameter vector $\theta_0 \in \Theta$ is globally identified.

Lemma (Strict expected log-likelihood inequality). Under the Distribution, Dominance and Global identification assumptions, $\theta \neq \theta_0$ implies

$$E[L(\theta)] < E[L(\theta_0)].$$
20 Example

Exact multicollinearity among the explanatory variables in a linear regression $E[y \mid X] = X\beta_0$ is a failure of global identification. If $\mathrm{rank}(X) < K$, the inequality $E[L(\theta)] \leq E[L(\theta_0)]$ still holds: the normal log-likelihood still attains its maximum in $\beta$ at $\beta_0$, because

$$(\beta - \beta_0)'X'X(\beta - \beta_0) \geq 0$$

but the inequality is not strict for all $\beta \neq \beta_0$. If $\mathrm{rank}(X) = K$, then $\beta_0$ is the unique maximum of $E[L(\theta)]$.
21 Example

Identification concerns $E[L(\theta)]$, not $E_N[L(\theta)]$. One can sometimes discover failures of identification in the sample log-likelihood, but if a sample log-likelihood function fails to have a unique global maximum, this does not always imply a failure of global identification.
22 Example

Exact multicollinearity among the explanatory variables in a LRM $E[y \mid X] = X\beta_0$ is a failure of global identification. Note that if $\mathrm{rank}(X) < K$, the expected log-likelihood inequality $E[L(\theta)] \leq E[L(\theta_0)]$ still holds.
23 Differentiability

When the support of the distribution depends on the unknown parameter values, the MLE cannot be found with simple calculus: in such cases the log-likelihood is not differentiable everywhere in the parameter space.

Assumption (Differentiability). The p.f. $f(u \mid v; \theta)$ is twice continuously differentiable in $\theta$ for all $\theta \in \Theta$. The support $S(\theta)$ does not depend on $\theta$, and differentiation and integration are interchangeable in the sense that

$$\frac{\partial}{\partial \theta} \int_S dF(u \mid v; \theta) = \int_S \frac{\partial}{\partial \theta}\, dF(u \mid v; \theta), \qquad \frac{\partial^2}{\partial \theta\, \partial \theta'} \int_S dF(u \mid v; \theta) = \int_S \frac{\partial^2}{\partial \theta\, \partial \theta'}\, dF(u \mid v; \theta)$$
24 Differentiability

$$\frac{\partial E[L(\theta) \mid V = v]}{\partial \theta} = E\left[\frac{\partial L(\theta)}{\partial \theta} \,\Big|\, V = v\right], \qquad \frac{\partial^2 E[L(\theta) \mid V = v]}{\partial \theta\, \partial \theta'} = E\left[\frac{\partial^2 L(\theta)}{\partial \theta\, \partial \theta'} \,\Big|\, V = v\right]$$

The interchange of differentiation and integration is ensured in part by $S(\theta) = S$. Then $\theta_0 = \arg\max_{\theta \in \Theta} E[L(\theta)]$ translates into the first-order conditions

$$\left.\frac{\partial E[L(\theta)]}{\partial \theta}\right|_{\theta = \theta_0} = 0$$

and the second-order condition that the Hessian matrix

$$\left.\frac{\partial^2 E[L(\theta)]}{\partial \theta\, \partial \theta'}\right|_{\theta = \theta_0}$$

is a negative definite matrix.
25 The score function

The MLE $\hat{\theta}$ is an implicit function of the data $u$:

$$\hat{\theta} = \arg\max_{\theta \in \Theta} E_N[L(\theta)] \quad \leftrightarrow \quad \text{a zero of } E_N[L_\theta(\theta)] \text{ in } \Theta$$

The F.O.C.s, called the normal equations or likelihood equations, are

$$E_N[L_\theta(\hat{\theta})] = 0$$

where the score function is

$$L_\theta \equiv \frac{\partial L(\theta)}{\partial \theta}$$

In general $\hat{\theta}$ has no closed form and must be calculated by numerical methods for maximizing differentiable functions.
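Because the normal log-likelihood is differentiable, the likelihood equations can be solved by a standard numerical method. The sketch below (illustrative code, not from the slides) runs Newton's method on the $\beta$-block of the score, re-profiling $\sigma^2$ at each step; since the score is linear in $\beta$, it reproduces the closed-form OLS/ML solution:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([0.3, -1.2]) + rng.normal(size=N)

def score_beta(beta, sigma2):
    # E_N[L_beta(theta)] = X'(y - X beta) / (N sigma^2)
    return X.T @ (y - X @ beta) / (N * sigma2)

def hess_beta(sigma2):
    # beta-block of E_N[L_{theta theta'}(theta)]: -X'X / (N sigma^2)
    return -X.T @ X / (N * sigma2)

# Newton iterations on the likelihood equations, with sigma^2 set to the
# mean squared residual at the current beta; the score is linear in beta,
# so the iteration converges immediately.
beta = np.zeros(2)
for _ in range(3):
    sigma2 = np.mean((y - X @ beta) ** 2)
    beta = beta - np.linalg.solve(hess_beta(sigma2), score_beta(beta, sigma2))

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
```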
26 Score Identity

Lemma (Score identity). Under the Distribution and Differentiability assumptions,

$$E[L_\theta(\theta_0) \mid V = v] = 0$$

Proof (continuous random variables case):

$$1 = \int_S dF(u \mid v; \theta) = \int_S f(u \mid v; \theta)\, du$$
27 Score Identity

We can differentiate both sides of this equality w.r.t. $\theta$:

$$0 = \frac{\partial}{\partial \theta} \int_S f(u \mid v; \theta)\, du = \int_S f_\theta(u \mid v; \theta)\, du = \int_S \frac{f_\theta(u \mid v; \theta)}{f(u \mid v; \theta)}\, f(u \mid v; \theta)\, du$$

Consider

$$L_\theta(\theta; U \mid V) = \frac{f_\theta(u \mid v; \theta)}{f(u \mid v; \theta)}$$

$$E[L_\theta(\theta; U \mid V) \mid V = v] = \int_S \frac{f_\theta(u \mid v; \theta)}{f(u \mid v; \theta)}\, f(u \mid v; \theta_0)\, du$$
28 Score Identity

The expectation $E[\,\cdot \mid V = v]$ is evaluated at $\theta = \theta_0$. For $\theta \neq \theta_0$, in general

$$E[L_\theta(\theta; U \mid V) \mid V = v] \neq 0$$

But if $\theta = \theta_0$, then

$$E[L_\theta(\theta_0; U \mid V) \mid V = v] = \int_S \frac{f_\theta(u \mid v; \theta_0)}{f(u \mid v; \theta_0)}\, f(u \mid v; \theta_0)\, du = \int_S f_\theta(u \mid v; \theta_0)\, du = 0.$$
29 Score Identity

In the Normal Linear Regression Model:

$$E[L_\beta(\theta)] = \frac{1}{\sigma^2}\, E[x_t x_t']\, (\beta_0 - \beta)$$

$$E[L_{\sigma^2}(\theta)] = -\frac{1}{2\sigma^4}\left(\sigma^2 - \left\{\sigma_0^2 + E[(x_t'\beta_0 - x_t'\beta)^2]\right\}\right)$$

At $\theta_0 = (\beta_0', \sigma_0^2)'$:

$$E[L_\beta(\theta_0)] = \frac{1}{\sigma_0^2}\, E[x_t x_t']\, (\beta_0 - \beta_0) = 0$$

$$E[L_{\sigma^2}(\theta_0)] = -\frac{1}{2\sigma_0^4}\left(\sigma_0^2 - \left\{\sigma_0^2 + E[(x_t'\beta_0 - x_t'\beta_0)^2]\right\}\right) = 0$$
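The expected score can be approximated by Monte Carlo, confirming that it is zero at $\theta_0$ but not at other parameter values. A sketch with one fixed conditioning value $x_t$ (all numbers, names and tolerances are arbitrary choices of mine; the tolerances reflect Monte Carlo error):

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.array([1.0, 0.8])                 # a fixed conditioning value x_t
beta0, sig2_0 = np.array([0.4, 1.1]), 1.5
M = 500_000
y = x @ beta0 + rng.normal(scale=np.sqrt(sig2_0), size=M)

def mc_mean_score_beta(beta):
    # Monte Carlo estimate of E[L_beta(theta) | x] at sigma^2 = sig2_0
    return (np.outer(y - x @ beta, x) / sig2_0).mean(axis=0)

at_truth = mc_mean_score_beta(beta0)     # should be approximately 0
beta1 = beta0 + np.array([0.5, -0.3])
at_beta1 = mc_mean_score_beta(beta1)     # should match x x' (beta0 - beta1) / sigma^2
closed_form = np.outer(x, x) @ (beta0 - beta1) / sig2_0
```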
30 The Information Matrix

If there exists $\hat{\theta}_N$ such that $E_N[L_\theta(\hat{\theta}_N)] = 0$, we must check that we have a global maximum; otherwise our solution is not the MLE. In general, a sufficient condition for $\hat{\theta}_N$ to be a local maximum is that the Hessian matrix

$$E_N[L_{\theta\theta'}(\hat{\theta}_N)] \equiv \left.\frac{\partial^2 E_N[L(\theta)]}{\partial \theta\, \partial \theta'}\right|_{\theta = \hat{\theta}_N}$$

evaluated at $\hat{\theta}_N$ is negative definite:

$$\forall c \in \mathbb{R}^K,\ c \neq 0: \quad c'\, E_N[L_{\theta\theta'}(\hat{\theta}_N)]\, c < 0$$

This guarantees that $E_N[L(\theta)]$ is strictly concave in a neighborhood of $\hat{\theta}_N$.
31 Information Matrix

We investigate the second-order conditions for the maximum of $E[L(\theta)]$.

Assumption (Finite Information). $\mathrm{Var}[L_\theta(\theta_0)]$ exists.

Lemma (Information Identity). Under the Distribution, Differentiability and Finite Information assumptions,

$$E[L_{\theta\theta'}(\theta_0) \mid V = v] = -\mathrm{Var}[L_\theta(\theta_0) \mid V = v]$$

and this matrix is negative semidefinite.
32 Information Matrix

Proof:

$$0 = \int_S L_\theta(\theta; u \mid v)\, f(u \mid v; \theta)\, du$$

Differentiating both sides, and writing $f \equiv f(u \mid v; \theta)$:

$$\frac{\partial\, (L_\theta f)}{\partial \theta'} = L_{\theta\theta'}\, f + L_\theta\, f_{\theta'} = L_{\theta\theta'}\, f + L_\theta\, (L_{\theta'} f) = (L_{\theta\theta'} + L_\theta L_\theta')\, f$$

$$0 = \int_S \left[L_{\theta\theta'}(\theta; u \mid v) + L_\theta(\theta; u \mid v)\, L_\theta(\theta; u \mid v)'\right] dF(u \mid v; \theta)$$
33 Information Matrix

$$\int_S L_{\theta\theta'}(\theta; u \mid v)\, dF(u \mid v; \theta) = -\int_S L_\theta(\theta; u \mid v)\, L_\theta(\theta; u \mid v)'\, dF(u \mid v; \theta)$$

Setting $\theta = \theta_0$:

$$E[L_{\theta\theta'}(\theta_0; U \mid V) \mid V = v] = -E[L_\theta(\theta_0; U \mid V)\, L_\theta(\theta_0; U \mid V)' \mid V = v] = -\mathrm{Var}[L_\theta(\theta_0; U \mid V) \mid V = v]$$

because $E[L_\theta(\theta_0; U \mid V) \mid V] = 0$. The Hessian is negative semidefinite since it is the negative of a variance matrix.
34 Conditional Information

The conditional variance matrix of the score vector $L_\theta(\theta; U \mid V)$ given $V = v$ and evaluated at $\theta_0$ is

$$I(\theta_0 \mid v) \equiv E[L_\theta(\theta_0)\, L_\theta(\theta_0)' \mid V = v] = \mathrm{Var}[L_\theta(\theta_0) \mid V = v]$$

More generally, we can always form the conditional information matrix function

$$I(\theta \mid v) \equiv \int_S L_\theta(\theta; u \mid v)\, L_\theta(\theta; u \mid v)'\, dF(u \mid v; \theta)$$
35 Population Information

The marginal expectation

$$I(\theta_0) \equiv E[L_\theta(\theta_0; U \mid V)\, L_\theta(\theta_0; U \mid V)']$$

is the population information matrix. The population information matrix is the unconditional variance matrix of the conditional score vector: because $E[L_\theta(\theta_0; U \mid V) \mid V] = 0$,

$$\mathrm{Var}[L_\theta(\theta_0; U \mid V)] = E\big[\mathrm{Var}[L_\theta(\theta_0; U \mid V) \mid V]\big] + \mathrm{Var}\big[E[L_\theta(\theta_0; U \mid V) \mid V]\big] = E[I(\theta_0 \mid V)] = I(\theta_0)$$
36 Normal linear regression model

The conditional information matrix for the normal linear regression model:

$$I(\theta_0 \mid x_t) = \begin{bmatrix} \dfrac{x_t x_t'}{\sigma_0^2} & 0 \\[1.5ex] 0 & \dfrac{1}{2\sigma_0^4} \end{bmatrix}$$

The Hessian of the conditional normal regression log-likelihood function:

$$L_{\theta\theta'}(\theta; y_t \mid x_t) = \begin{bmatrix} -\dfrac{x_t x_t'}{\sigma^2} & -\dfrac{x_t (y_t - x_t'\beta)}{\sigma^4} \\[1.5ex] -\dfrac{(y_t - x_t'\beta)\, x_t'}{\sigma^4} & \dfrac{1}{2\sigma^4} - \dfrac{(y_t - x_t'\beta)^2}{\sigma^6} \end{bmatrix}$$

$$E[L_{\theta\theta'}(\theta_0; y_t \mid x_t) \mid x_t] = -I(\theta_0 \mid x_t)$$
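The information identity $E[-L_{\theta\theta'}(\theta_0) \mid x_t] = I(\theta_0 \mid x_t) = \mathrm{Var}[L_\theta(\theta_0) \mid x_t]$ can be checked by simulation for this model. A sketch with one fixed $x_t$ (illustrative values of mine; tolerances reflect Monte Carlo error):

```python
import numpy as np

rng = np.random.default_rng(6)
x = np.array([1.0, 1.5])                      # a fixed conditioning value x_t
sig2 = 2.0
M = 400_000
e = rng.normal(scale=np.sqrt(sig2), size=M)   # y_t - x_t' beta0 given x_t

# Per-observation score at theta_0: [x e / sigma^2, (e^2 - sigma^2) / (2 sigma^4)]
scores = np.column_stack([np.outer(e, x) / sig2, (e**2 - sig2) / (2 * sig2**2)])
V = np.cov(scores.T)                          # Monte Carlo Var[score | x]

# Conditional information matrix from the slide (block diagonal)
I0 = np.zeros((3, 3))
I0[:2, :2] = np.outer(x, x) / sig2
I0[2, 2] = 1.0 / (2.0 * sig2**2)

# Monte Carlo E[-Hessian | x]: bottom-right entry is E[e^2/sigma^6 - 1/(2 sigma^4)]
neg_hess_22 = np.mean(e**2 / sig2**3 - 1.0 / (2.0 * sig2**2))
```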
37 Nonsingular information

The information matrix can be singular even when $\theta_0$ is globally identified and the expected log-likelihood is uniquely maximized at $\theta_0$: the second-order condition that the Hessian be negative definite is sufficient, but not necessary, for a local maximum. We therefore assume this condition explicitly.

Assumption (Nonsingular Information). The information matrix $I(\theta_0)$ is nonsingular for all possible $\theta_0 \in \Theta$.
38 The Cramér-Rao Lower Bound

Information matrix: a measure of how much we can learn about $\theta_0$ from the random sample $\{(U_1, V_1), \ldots, (U_N, V_N)\}$.

Theorem. Let $\tilde{\theta}$ be an unbiased estimator of $\theta_0$ with finite variance matrix, and suppose that differentiation and integration are interchangeable in

$$E[\tilde{\theta} \mid v_1, \ldots, v_N] = \int_S \tilde{\theta} \prod_{t=1}^{N} dF(u_t \mid v_t; \theta_0) = \theta_0$$

If the Distribution, Differentiability, Finite Information and Nonsingularity assumptions also hold, then for any $a \in \mathbb{R}^K$

$$a'\, \mathrm{Var}[\tilde{\theta} \mid v]\, a \geq a'\, \big(N\, E_N[I(\theta_0 \mid v_t)]\big)^{-1} a.$$
39 The Cramér-Rao Lower Bound

In some cases we can find estimators with variances equal to the Cramér-Rao lower bound. The OLS estimator $\hat{\beta}$ is efficient relative to all unbiased estimators of $\beta_0$.

Proof: using

$$I(\theta_0 \mid x_t) = \begin{bmatrix} \dfrac{x_t x_t'}{\sigma_0^2} & 0 \\[1.5ex] 0 & \dfrac{1}{2\sigma_0^4} \end{bmatrix}$$

we obtain

$$\big(N\, E_N[I(\theta_0 \mid x_t)]\big)^{-1} = \begin{bmatrix} \dfrac{X'X}{\sigma_0^2} & 0 \\[1.5ex] 0 & \dfrac{N}{2\sigma_0^4} \end{bmatrix}^{-1} = \begin{bmatrix} \sigma_0^2\, (X'X)^{-1} & 0 \\[1.5ex] 0 & \dfrac{2\sigma_0^4}{N} \end{bmatrix}$$

Because $\mathrm{Var}[\hat{\beta} \mid X] = \sigma_0^2 (X'X)^{-1}$, the OLS/MLE estimator attains the Cramér-Rao lower bound.
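The block-diagonal structure makes the bound easy to compute: inverting the sample information matrix yields exactly $\sigma_0^2 (X'X)^{-1}$ in the $\beta$-block, which is $\mathrm{Var}[\hat{\beta} \mid X]$ for OLS. A numerical sketch (simulated design matrix; names and values are mine):

```python
import numpy as np

rng = np.random.default_rng(7)
N, K = 80, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
sig2_0 = 0.9

# N * E_N[I(theta_0 | x_t)] is block diagonal in (beta, sigma^2)
info = np.zeros((K + 1, K + 1))
info[:K, :K] = X.T @ X / sig2_0
info[K, K] = N / (2.0 * sig2_0**2)

crb = np.linalg.inv(info)                  # Cramér-Rao lower bound

beta_block = crb[:K, :K]                   # should equal sigma_0^2 (X'X)^{-1}
ols_var = sig2_0 * np.linalg.inv(X.T @ X)  # Var[beta_hat | X] for OLS
```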
More information6. MAXIMUM LIKELIHOOD ESTIMATION
6 MAXIMUM LIKELIHOOD ESIMAION [1] Maximum Likelihood Estimator (1) Cases in which θ (unknown parameter) is scalar Notational Clarification: From now on, we denote the true value of θ as θ o hen, view θ
More informationLeast Squares Estimation-Finite-Sample Properties
Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions
More informationParametric Models. Dr. Shuang LIANG. School of Software Engineering TongJi University Fall, 2012
Parametric Models Dr. Shuang LIANG School of Software Engineering TongJi University Fall, 2012 Today s Topics Maximum Likelihood Estimation Bayesian Density Estimation Today s Topics Maximum Likelihood
More informationMath 423/533: The Main Theoretical Topics
Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)
More informationMaximum Likelihood Estimation
Chapter 8 Maximum Likelihood Estimation 8. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function
More informationA Few Notes on Fisher Information (WIP)
A Few Notes on Fisher Information (WIP) David Meyer dmm@{-4-5.net,uoregon.edu} Last update: April 30, 208 Definitions There are so many interesting things about Fisher Information and its theoretical properties
More informationFisher Information & Efficiency
Fisher Information & Efficiency Robert L. Wolpert Department of Statistical Science Duke University, Durham, NC, USA 1 Introduction Let f(x θ) be the pdf of X for θ Θ; at times we will also consider a
More information2.6.3 Generalized likelihood ratio tests
26 HYPOTHESIS TESTING 113 263 Generalized likelihood ratio tests When a UMP test does not exist, we usually use a generalized likelihood ratio test to verify H 0 : θ Θ against H 1 : θ Θ\Θ It can be used
More informationSTAT 512 sp 2018 Summary Sheet
STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}
More informationAdvanced Quantitative Methods: maximum likelihood
Advanced Quantitative Methods: Maximum Likelihood University College Dublin 4 March 2014 1 2 3 4 5 6 Outline 1 2 3 4 5 6 of straight lines y = 1 2 x + 2 dy dx = 1 2 of curves y = x 2 4x + 5 of curves y
More informationLasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices
Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,
More informationStatistics GIDP Ph.D. Qualifying Exam Theory Jan 11, 2016, 9:00am-1:00pm
Statistics GIDP Ph.D. Qualifying Exam Theory Jan, 06, 9:00am-:00pm Instructions: Provide answers on the supplied pads of paper; write on only one side of each sheet. Complete exactly 5 of the 6 problems.
More informationSTATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN
Massimo Guidolin Massimo.Guidolin@unibocconi.it Dept. of Finance STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN SECOND PART, LECTURE 2: MODES OF CONVERGENCE AND POINT ESTIMATION Lecture 2:
More informationProblem Selected Scores
Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected
More informationSection 8: Asymptotic Properties of the MLE
2 Section 8: Asymptotic Properties of the MLE In this part of the course, we will consider the asymptotic properties of the maximum likelihood estimator. In particular, we will study issues of consistency,
More informationThe outline for Unit 3
The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Assume X P θ, θ Θ, with joint pdf (or pmf) f(x θ). Suppose we observe X = x. The Likelihood function is L(θ x) = f(x θ) as a function of θ (with the data x held fixed). The
More informationHT Introduction. P(X i = x i ) = e λ λ x i
MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework
More informationEvaluating the Performance of Estimators (Section 7.3)
Evaluating the Performance of Estimators (Section 7.3) Example: Suppose we observe X 1,..., X n iid N(θ, σ 2 0 ), with σ2 0 known, and wish to estimate θ. Two possible estimators are: ˆθ = X sample mean
More informationMathematical statistics
October 1 st, 2018 Lecture 11: Sufficient statistic Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby 1 Outline of the lecture Regression based methods, 1st part: Introduction (Sec.
More informationMaximum likelihood estimation
Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization
More informationMaximum Likelihood Estimation
Chapter 7 Maximum Likelihood Estimation 7. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function
More informationStat 710: Mathematical Statistics Lecture 27
Stat 710: Mathematical Statistics Lecture 27 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 27 April 3, 2009 1 / 10 Lecture 27:
More informationParametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory
Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007
More informationComputing the MLE and the EM Algorithm
ECE 830 Fall 0 Statistical Signal Processing instructor: R. Nowak Computing the MLE and the EM Algorithm If X p(x θ), θ Θ, then the MLE is the solution to the equations logp(x θ) θ 0. Sometimes these equations
More informationEstimation Theory. as Θ = (Θ 1,Θ 2,...,Θ m ) T. An estimator
Estimation Theory Estimation theory deals with finding numerical values of interesting parameters from given set of data. We start with formulating a family of models that could describe how the data were
More information3.0.1 Multivariate version and tensor product of experiments
ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 3: Minimax risk of GLM and four extensions Lecturer: Yihong Wu Scribe: Ashok Vardhan, Jan 28, 2016 [Ed. Mar 24]
More informationExogeneity and Causality
Università di Pavia Exogeneity and Causality Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy. The econometric
More informationQualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama
Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama Instructions This exam has 7 pages in total, numbered 1 to 7. Make sure your exam has all the pages. This exam will be 2 hours
More informationRegression Estimation Least Squares and Maximum Likelihood
Regression Estimation Least Squares and Maximum Likelihood Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 3, Slide 1 Least Squares Max(min)imization Function to minimize
More information