Introduction to Estimation Methods for Time Series models Lecture 2
1 Introduction to Estimation Methods for Time Series models, Lecture 2

Fulvio Corsi, SNS Pisa
2 Estimators: Large Sample Properties

Purposes:
- study the behavior of $\hat\theta_n$ when $n \to \infty$
- approximate unknown finite sample distributions of $\hat\theta_n$

Since $\hat\theta_n$ has a distribution for each $n$, how do we define $\hat\theta_n \to \theta_0$?

Convergence in Probability (plim): the random variable $\hat\theta_n$ converges in probability to a constant $\theta_0$ if, for all $\epsilon > 0$,
$$\lim_{n\to\infty} P(|\hat\theta_n - \theta_0| < \epsilon) = 1$$
If $\operatorname{plim} \hat\theta_n = \theta_0$ the estimator is Consistent.

Convergence in Quadratic Mean:
$$\lim_{n\to\infty} E(\hat\theta_n - \theta_0)^2 = \lim_{n\to\infty} \mathrm{MSE}[\hat\theta_n] = 0$$
Since $\mathrm{MSE}[\hat\theta_n] = \mathrm{Var}[\hat\theta_n] + \mathrm{Bias}[\hat\theta_n]^2$, we have that convergence in quadratic mean is equivalent to
$$\lim_{n\to\infty} \mathrm{Bias}[\hat\theta_n] = 0 \quad\text{and}\quad \lim_{n\to\infty} \mathrm{Var}[\hat\theta_n] = 0$$
Convergence in Quadratic Mean $\Rightarrow$ Convergence in Probability.
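The definition of convergence in probability can be seen directly in a small Monte Carlo experiment: for the sample mean of i.i.d. draws, the fraction of replications falling within $\epsilon$ of $\theta_0$ approaches 1 as $n$ grows. The sketch below is purely illustrative (the distribution, $\epsilon$, and replication count are arbitrary choices, not from the slides):

```python
import random

# Monte Carlo illustration of convergence in probability: for the sample
# mean of i.i.d. N(2, 1) draws, P(|theta_hat_n - theta_0| < eps) -> 1 as n
# grows. Distribution, eps, and reps are illustrative choices.
random.seed(0)
theta_0, eps, reps = 2.0, 0.1, 500

def coverage(n):
    """Fraction of replications with |sample mean - theta_0| < eps."""
    hits = 0
    for _ in range(reps):
        mean_n = sum(random.gauss(theta_0, 1.0) for _ in range(n)) / n
        hits += abs(mean_n - theta_0) < eps
    return hits / reps

for n in (10, 100, 1000):
    print(n, coverage(n))
```

The printed fractions rise toward 1 with $n$, which is exactly the plim statement above.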
3 Consistency

(figure only; no text content transcribed)
4 OLS without Normality: Large Sample Theory

When H.4 (Normality) is violated, OLS is still unbiased and BLUE; however, confidence intervals and test statistics are not valid.

Assumptions for the large sample theory:
- A.1 $E[x_i \epsilon_i] = 0$: regressors uncorrelated with errors (weaker than strict exogeneity)
- A.2 $\lim_{n\to\infty} \frac{1}{n} X'X = Q$, positive definite

Since
$$\hat\beta = \beta + (X'X)^{-1} X'\epsilon = \beta + \Big(\sum_{i=1}^n x_i x_i'\Big)^{-1} \Big(\sum_{i=1}^n x_i \epsilon_i\Big)$$

Consistency (convergence in probability), $\operatorname{plim} \hat\beta_n = \beta$:
$$\hat\beta_n = \beta + \underbrace{\Big(\tfrac{1}{n}\textstyle\sum_i x_i x_i'\Big)^{-1}}_{\to\, Q^{-1}\ (A.2)} \underbrace{\Big(\tfrac{1}{n}\textstyle\sum_i x_i \epsilon_i\Big)}_{\to\, 0\ (A.1 + \text{LLN})} \to \beta$$

Asymptotic Normality:
$$\sqrt{n}\,(\hat\beta_n - \beta) = \underbrace{\Big(\tfrac{1}{n}\textstyle\sum_i x_i x_i'\Big)^{-1}}_{\to\, Q^{-1}\ (A.2)} \underbrace{\Big(\tfrac{1}{\sqrt{n}}\textstyle\sum_i x_i \epsilon_i\Big)}_{\to\, N(0,\,\sigma^2 Q)\ (A.2 + \text{CLT})} \to N\big(0,\, Q^{-1}(\sigma^2 Q)Q^{-1}\big) = N(0,\, \sigma^2 Q^{-1})$$

Hence $\hat\beta_n \stackrel{a}{\sim} N\big(\beta,\, \sigma^2 (X'X)^{-1}\big)$ as in the Normal case (H.4), but only asymptotically.
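The result can be checked numerically: with clearly non-Normal (uniform) errors, OLS still recovers $\beta$ and its sampling error is on the scale of $\sigma^2 (X'X)^{-1}$. A minimal sketch with a single regressor and no intercept (all numbers below are illustrative assumptions):

```python
import random

# OLS without Normality: uniform errors (mean 0, variance 1/3), scalar
# regressor, no intercept, so beta_hat = sum(x*y) / sum(x*x).
# Illustrative simulated setup, not from the lecture.
random.seed(1)
beta, n = 1.5, 5000
x = [random.uniform(0.5, 2.0) for _ in range(n)]
e = [random.uniform(-1.0, 1.0) for _ in range(n)]   # non-Gaussian errors
y = [beta * xi + ei for xi, ei in zip(x, e)]

beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
# asymptotic variance from the slide: sigma^2 (X'X)^{-1}, sigma^2 = 1/3
se = ((1.0 / 3.0) / sum(xi * xi for xi in x)) ** 0.5
print(beta_hat, se)
```

With $n = 5000$ the estimate sits within a few standard errors of the true $\beta = 1.5$ despite the non-normal errors.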
5 NLS (the idea)

The Nonlinear Regression model is
$$y_i = h(x_i, \beta) + \epsilon_i \quad\text{with}\quad E[\epsilon_i \mid x_i] = 0$$

Nonlinear Least Squares (NLS) estimator:
$$\hat\beta_{NLS} = \arg\min_\beta \sum_{i=1}^n \epsilon_i^2 = \arg\min_\beta \sum_{i=1}^n \big(y_i - h(x_i, \beta)\big)^2$$

FOC: the NLS estimator $\hat\beta_{NLS}$ satisfies
$$\sum_{i=1}^n \big[y_i - h(x_i, \hat\beta_{NLS})\big]\, \frac{\partial h(x_i, \hat\beta_{NLS})}{\partial \beta} = 0$$

In general there is no closed-form solution $\Rightarrow$ numerical minimization.
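One standard numerical scheme for the FOC above is Gauss-Newton: linearize $h$ around the current $\beta$ and iterate least-squares steps. A minimal sketch for a scalar-parameter model $h(x,\beta) = e^{\beta x}$ (the model, starting value, and noise level are illustrative assumptions, not from the lecture):

```python
import math
import random

# NLS by Gauss-Newton for h(x, beta) = exp(beta * x), a scalar parameter.
# Each iteration solves the linearized least-squares problem:
#   beta <- beta + sum(r_i * J_i) / sum(J_i^2),
# with residuals r_i = y_i - h_i and Jacobian J_i = dh/dbeta = x_i * h_i.
random.seed(2)
beta_true, n = 0.7, 400
x = [random.uniform(0.0, 1.0) for _ in range(n)]
y = [math.exp(beta_true * xi) + random.gauss(0.0, 0.05) for xi in x]

beta = 0.0                       # illustrative starting value
for _ in range(50):
    h = [math.exp(beta * xi) for xi in x]
    J = [xi * hi for xi, hi in zip(x, h)]          # dh/dbeta
    r = [yi - hi for yi, hi in zip(y, h)]          # residuals
    step = sum(ri * Ji for ri, Ji in zip(r, J)) / sum(Ji * Ji for Ji in J)
    beta += step
    if abs(step) < 1e-10:        # converged: FOC approximately satisfied
        break
print(beta)
```

At convergence the step is zero, i.e. $\sum_i r_i\, \partial h/\partial\beta = 0$, which is exactly the NLS first-order condition.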
6 Maximum Likelihood

Basic, strong assumption: the distribution of the data is known up to $\theta$.

Likelihood function: the joint density of an i.i.d. random sample $(x_1, x_2, \ldots, x_n)$ from $f(x;\theta_0)$ is
$$f(x_1, x_2, \ldots, x_n; \theta) = f(x_1;\theta)\, f(x_2;\theta) \cdots f(x_n;\theta)$$

A different perspective: view the joint density as a function of the parameters $\theta$ (as opposed to the sample):
$$L(\theta \mid X) \equiv f(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^n f(x_i;\theta)$$

It is usually simpler to work with the log of the likelihood:
$$\ell(\theta \mid x_1, \ldots, x_n) \equiv \ln L(\theta \mid x_1, \ldots, x_n) = \sum_{i=1}^n \ln f(x_i;\theta)$$

ML Estimator: given sample data generated from a parametric model, find the parameters that maximize the probability of observing that sample:
$$\hat\theta_{ML} = \arg\max_\theta L(\theta \mid x_1, \ldots, x_n) = \arg\max_\theta \ell(\theta)$$

F.O.C., the Score:
$$\frac{\partial \ell(\theta \mid x_1, \ldots, x_n)}{\partial \theta} = 0$$
7 Maximum Likelihood: Example

Consider a univariate Normal model, $f(y;\theta) = N(\mu, \sigma^2)$:
$$f(y_i;\theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2}\, \frac{(y_i-\mu)^2}{\sigma^2}\right)$$

The log-likelihood is
$$\ell(\mu, \sigma^2) = \sum_{i=1}^n \ln f(y_i;\theta) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2}\sum_{i=1}^n \frac{(y_i-\mu)^2}{\sigma^2}$$

and then the scores of the $\mu$ and $\sigma^2$ parameters are
$$\frac{\partial \ell(\mu,\sigma^2)}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^n (y_i - \mu) = 0$$
$$\frac{\partial \ell(\mu,\sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (y_i - \mu)^2 = 0$$

Therefore, by first solving for $\mu$ and inserting it in the score of $\sigma^2$, we get the ML estimators
$$\hat\mu_{ML} = \frac{1}{n}\sum_{i=1}^n y_i \qquad\qquad \hat\sigma^2_{ML} = \frac{1}{n}\sum_{i=1}^n (y_i - \hat\mu_{ML})^2$$
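The closed forms above are easy to verify on simulated data: the ML mean is the sample mean, and the ML variance divides by $n$ (not $n-1$), matching the stdlib "population" variance. Data and parameter values below are illustrative:

```python
import random
import statistics

# The Normal ML estimators from the slide: sample mean and the 1/n (not
# 1/(n-1)) sample variance. Simulated N(10, 4) data, illustrative only.
random.seed(3)
y = [random.gauss(10.0, 2.0) for _ in range(1000)]
n = len(y)

mu_ml = sum(y) / n
sigma2_ml = sum((yi - mu_ml) ** 2 for yi in y) / n   # 1/n divisor

# cross-check against the stdlib population variance (also the 1/n version)
print(mu_ml, sigma2_ml, statistics.pvariance(y))
```

Note that $\hat\sigma^2_{ML}$ is biased in finite samples (it divides by $n$), but consistent, consistent with the large-sample focus of these slides.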
8 Maximum Likelihood: Properties

Consistency: $\operatorname{plim} \hat\theta_{ML} = \theta_0$

Asymptotic normality:
$$\hat\theta_{ML} \stackrel{a}{\sim} N\big(\theta_0,\, [I(\theta_0)]^{-1}\big) \quad\text{where}\quad I(\theta) = -E\left[\frac{\partial^2 \ln L(\theta)}{\partial\theta\, \partial\theta'}\right]$$
$I(\theta)$ is the Fisher Information matrix.

Asymptotic efficiency: the MLE has the smallest asymptotic variance, $I(\theta)^{-1}$ being the Cramér-Rao lower bound.

Invariance: if $\hat\theta$ is the MLE for $\theta_0$, and $g(\theta)$ is any (invertible) transformation of $\theta$, then the MLE for $g(\theta_0)$ is $g(\hat\theta)$.
Ex: precision parameter $\gamma^2 = 1/\sigma^2 \ \Rightarrow\ \hat\gamma^2_{ML} = 1/\hat\sigma^2_{ML}$

Bottom line: MLE makes the best use of the information (asymptotically).
9 Maximum Likelihood: Properties of regular densities

Under some regularity conditions (whose goal is to allow Taylor approximations and the interchange of differentiation and expectation):
- D1: $\ln f(y_i;\theta)$, $\;g_i = \dfrac{\partial \ln f(y_i;\theta)}{\partial\theta}$, $\;H_i = \dfrac{\partial^2 \ln f(y_i;\theta)}{\partial\theta\,\partial\theta'}$ are a random sample
- D2: $E_0[g_i(\theta_0)] = 0$
- D3: $\mathrm{Var}_0[g_i(\theta_0)] = -E_0[H_i(\theta_0)]$

D1 is implied by the assumption that $y_i$, $i = 1, \ldots, n$ is a random sample.

D2 is a consequence of $\int f(y_i;\theta_0)\, dy_i = 1$, since by differentiating both sides w.r.t. $\theta_0$:
$$0 = \int \frac{\partial f(y_i;\theta_0)}{\partial\theta_0}\, dy_i = \int \frac{\partial \ln f(y_i;\theta_0)}{\partial\theta_0}\, f(y_i;\theta_0)\, dy_i = E_0[g_i(\theta_0)]$$

D3 is obtained by differentiating once more w.r.t. $\theta_0$.

D1 (random sample) $\Rightarrow \mathrm{Var}_0\big[\sum_{i=1}^n g_i(\theta_0)\big] = n\,\mathrm{Var}_0[g_i(\theta_0)]$, thus
$$\underbrace{J(\theta_0) \equiv \mathrm{Var}_0\left[\frac{\partial \ln L(\theta_0; y_i)}{\partial\theta_0}\right] = -E_0\left[\frac{\partial^2 \ln L(\theta_0; y_i)}{\partial\theta_0\,\partial\theta_0'}\right] \equiv I(\theta_0)}_{\text{Information matrix equality}}$$
10 Asymptotic normality of MLE

Being the max of $\ln L$, the MLE satisfies by construction the likelihood equation $g(\hat\theta) = \sum_{i=1}^n g_i(\hat\theta) = 0$.

Define $H(\theta_0) = \dfrac{\partial^2 \ln L(\theta_0; y_i)}{\partial\theta_0\,\partial\theta_0'} = \displaystyle\sum_{i=1}^n \frac{\partial^2 \ln f(y_i;\theta_0)}{\partial\theta_0\,\partial\theta_0'} = \sum_{i=1}^n H_i(\theta_0)$

1. Take a first order Taylor expansion of the score $g(\hat\theta)$ around $\theta_0$:
$$g(\hat\theta) = g(\theta_0) + H(\theta_0)(\hat\theta - \theta_0) + R_1 = 0$$

2. Rearrange and scale by $\sqrt{n}$:
$$\sqrt{n}\,(\hat\theta - \theta_0) = -\underbrace{\Big(\tfrac{1}{n}\textstyle\sum_i H_i(\theta_0)\Big)^{-1}}_{\to\, E_0[\frac{1}{n}H(\theta_0)]^{-1}} \underbrace{\Big(\tfrac{1}{\sqrt{n}}\textstyle\sum_i g_i(\theta_0)\Big)}_{\to\, N(0,\, \mathrm{Var}_0[\frac{1}{\sqrt{n}} g(\theta_0)])} + \underbrace{R_1}_{\to\, 0}$$

3. Use the LLN (on the first term) and the CLT (on the second term):
$$\sqrt{n}\,(\hat\theta - \theta_0) \to -E_0\Big[\tfrac{1}{n}H(\theta_0)\Big]^{-1} N\Big(0,\, \mathrm{Var}_0\big[\tfrac{1}{\sqrt{n}}\, g(\theta_0)\big]\Big) \sim N\big(0,\, n\, I(\theta_0)^{-1} J(\theta_0)\, I(\theta_0)^{-1}\big)$$

If the information matrix equality $J(\theta_0) = I(\theta_0)$ holds, then
$$\hat\theta \stackrel{a}{\sim} N\big(\theta_0,\, I(\theta_0)^{-1}\big)$$
11 Estimating the asymptotic covariance matrix of MLE

Three asymptotically equivalent estimators of $\mathrm{Asy.Var}[\hat\theta]$:

1. Calculate $E_0[H(\theta_0)]$ (very difficult) and evaluate it at $\hat\theta$ to estimate
$$\{I(\hat\theta)\}^{-1} = \left\{-E_0\left[\frac{\partial^2 \ln L(\hat\theta; y_i)}{\partial\hat\theta\,\partial\hat\theta'}\right]\right\}^{-1}$$

2. Calculate $H(\theta_0)$ (still quite difficult) and evaluate it at $\hat\theta$ to get
$$\{\hat I(\hat\theta)\}^{-1} = \left\{-\frac{\partial^2 \ln L(\hat\theta; y_i)}{\partial\hat\theta\,\partial\hat\theta'}\right\}^{-1}$$

3. BHHH or OPG estimator (easy): use the information matrix equality $I(\theta_0) = J(\theta_0)$:
$$\{\tilde I(\hat\theta)\}^{-1} = \left\{\widehat{\mathrm{Var}}\left[\frac{\partial \ln L(\hat\theta; y_i)}{\partial\hat\theta}\right]\right\}^{-1} = \left\{\sum_{i=1}^n g_i(\hat\theta)\, g_i(\hat\theta)'\right\}^{-1}$$
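In the simplest scalar case all three estimators can be written down explicitly and compared. The sketch below uses the mean of $N(\mu, \sigma_0^2)$ with $\sigma_0^2$ known, an illustrative setup where the score is $g_i = (y_i - \mu)/\sigma_0^2$ and the Hessian is $H_i = -1/\sigma_0^2$:

```python
import random

# Three asymptotically equivalent variance estimators for the MLE of the
# mean of N(mu, sigma0^2), sigma0^2 known. Illustrative scalar example.
random.seed(4)
sigma0_2, n = 4.0, 2000
y = [random.gauss(1.0, sigma0_2 ** 0.5) for _ in range(n)]
mu_hat = sum(y) / n                      # MLE of mu

# 1) expected information: I(mu) = n / sigma0^2, computable exactly here
var_info = sigma0_2 / n
# 2) observed Hessian: -H = n / sigma0^2 (data-free in this model)
var_hess = 1.0 / (n / sigma0_2)
# 3) BHHH / OPG: sum of squared scores g_i = (y_i - mu_hat) / sigma0^2
opg = sum(((yi - mu_hat) / sigma0_2) ** 2 for yi in y)
var_opg = 1.0 / opg

print(var_info, var_hess, var_opg)
```

Here (1) and (2) coincide because the Hessian does not depend on the data; the OPG estimate differs by sampling noise but converges to the same limit, illustrating the asymptotic equivalence claimed on the slide.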
12 Hypothesis testing (idea)

Test of hypothesis $H_0$: $c(\theta) = 0$.

Three tests, asymptotically equivalent (but not in finite samples):

Likelihood ratio test: if $c(\theta) = 0$ is valid, then imposing it should not lead to a large reduction in the log-likelihood function. Therefore, we base the test on the difference
$$2(\ln L - \ln L_R) \sim \chi^2_{df}$$
Both the unrestricted ($\ln L$) and restricted ($\ln L_R$) ML estimators are required.

Wald test: if $c(\theta) = 0$ is valid, then $c(\hat\theta_{ML}) \approx 0$. Only the unrestricted (ML) estimator is required.

Lagrange multiplier test: if $c(\theta) = 0$ is valid, then the restricted estimator should be near the point that maximizes $\ln L$; therefore, the slope of $\ln L$ should be near zero at the restricted estimator. Only the restricted estimator is required.
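A worked scalar case makes the three statistics concrete. For $H_0$: $\mu = 0$ in a $N(\mu, 1)$ model, the MLE is $\bar y$, the information is $n$, and all three statistics reduce to $n\bar y^2$, so they coincide exactly here (in general they only agree asymptotically). Simulated data, illustrative setup:

```python
import random

# LR, Wald and LM statistics for H0: mu = 0 in a N(mu, 1) model.
# In this special case all three equal n * ybar^2. Illustrative data.
random.seed(5)
n = 200
y = [random.gauss(0.0, 1.0) for _ in range(n)]   # H0 is true here
ybar = sum(y) / n

def loglik(mu):
    # Gaussian log-likelihood with sigma^2 = 1, up to an additive constant
    return -0.5 * sum((yi - mu) ** 2 for yi in y)

lr = 2.0 * (loglik(ybar) - loglik(0.0))          # needs both estimators
wald = ybar ** 2 / (1.0 / n)                     # needs unrestricted only
score0 = sum(y)                                  # d loglik / d mu at mu = 0
lm = score0 ** 2 / n                             # score^2 / information
print(lr, wald, lm)
```

Each statistic would be compared to a $\chi^2_1$ critical value; their numerical equality here is a feature of this linear-Gaussian example, not a general result.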
13 Hypothesis testing

(figure only; no text content transcribed)
14 Application of MLE: Linear regression model

Model: $y_i = x_i'\beta + \epsilon_i$ with $y_i \mid x_i \sim N(x_i'\beta, \sigma^2)$.

Log-likelihood based on $n$ conditionally independent observations:
$$\ln L = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{1}{2}\sum_{i=1}^n \frac{(y_i - x_i'\beta)^2}{\sigma^2} = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\sigma^2 - \frac{(y - X\beta)'(y - X\beta)}{2\sigma^2}$$

Likelihood equations:
$$\frac{\partial \ln L}{\partial\beta} = \frac{X'(y - X\beta)}{\sigma^2} = 0 \qquad\qquad \frac{\partial \ln L}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{(y - X\beta)'(y - X\beta)}{2\sigma^4} = 0$$

Solving the likelihood equations:
$$\hat\beta_{ML} = (X'X)^{-1}X'y \qquad\text{and}\qquad \hat\sigma^2_{ML} = \frac{e'e}{n}$$

$\hat\beta_{ML} = \hat\beta_{OLS} \Rightarrow$ OLS has all the desirable asymptotic properties of MLE.
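The identity $\hat\beta_{ML} = \hat\beta_{OLS}$ and the $1/n$ versus $1/(n-k)$ variance estimators can be checked on a simulated single-regressor model without intercept (an illustrative special case where $(X'X)^{-1}X'y$ collapses to a ratio of sums):

```python
import random

# MLE vs OLS in the linear regression model: slope estimates coincide;
# the ML variance estimator divides e'e by n, the unbiased OLS one by
# n - k. Scalar regressor, no intercept (k = 1); illustrative data.
random.seed(6)
n, beta = 300, 2.0
x = [random.uniform(-1.0, 1.0) for _ in range(n)]
y = [beta * xi + random.gauss(0.0, 1.0) for xi in x]

beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
e = [yi - beta_hat * xi for xi, yi in zip(x, y)]
ete = sum(ei * ei for ei in e)
sigma2_ml = ete / n              # ML estimator e'e / n
sigma2_ols = ete / (n - 1)       # unbiased estimator e'e / (n - k)
print(beta_hat, sigma2_ml, sigma2_ols)
```

The two variance estimators differ by the fixed factor $n/(n-k)$, so they are asymptotically equivalent, consistent with the slide's large-sample message.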
15 Maximum Likelihood in time series: AR model

In a time series $y_t$, the innovations $\epsilon_t$ are usually not i.i.d. It is then very convenient to use the prediction error decomposition of the likelihood:
$$L(y_T, y_{T-1}, \ldots, y_1; \theta) = f(y_T \mid \Omega_{T-1}; \theta)\; f(y_{T-1} \mid \Omega_{T-2}; \theta) \cdots f(y_1 \mid \Omega_0; \theta)$$

For example, for the AR(1)
$$y_t = \phi_1 y_{t-1} + \epsilon_t$$
the full log-likelihood can be written as
$$\ell(\phi) = \underbrace{\ln f_{Y_1}(y_1;\phi)}_{\text{marginal of 1st obs}} + \underbrace{\sum_{t=2}^T \ln f_{Y_t \mid Y_{t-1}}(y_t \mid y_{t-1};\phi)}_{\text{conditional likelihood}}$$
where, under normality, the conditional part is
$$-\frac{T-1}{2}\ln(2\pi) - \frac{T-1}{2}\ln\sigma^2 - \frac{1}{2}\sum_{t=2}^T \frac{(y_t - \phi y_{t-1})^2}{\sigma^2}$$
$\Rightarrow$ under normality, OLS = MLE.

Hence, maximizing the conditional likelihood for $\phi$ is equivalent to minimizing
$$\sum_{t=2}^T (y_t - \phi y_{t-1})^2$$
which is the OLS criterion.

In general, for AR(p) processes OLS is consistent and, under gaussianity, asymptotically equivalent to MLE $\Rightarrow$ asymptotically efficient.
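The conditional MLE of $\phi$ therefore has the closed OLS form $\hat\phi = \sum_{t\ge 2} y_t y_{t-1} / \sum_{t\ge 2} y_{t-1}^2$, which can be verified on a simulated AR(1) path (parameter values below are illustrative):

```python
import random

# Conditional MLE for a Gaussian AR(1) = OLS of y_t on y_{t-1}:
#   phi_hat = sum(y_t * y_{t-1}) / sum(y_{t-1}^2).
# Simulated series, illustrative parameter values.
random.seed(7)
phi, T = 0.5, 2000
y = [0.0]
for _ in range(T - 1):
    y.append(phi * y[-1] + random.gauss(0.0, 1.0))

num = sum(y[t] * y[t - 1] for t in range(1, T))
den = sum(y[t - 1] ** 2 for t in range(1, T))
phi_hat = num / den
print(phi_hat)
```

Conditioning on the first observation drops the marginal term of the decomposition, which is asymptotically negligible for stationary AR processes.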
16 Maximum Likelihood in time series: ARMA model

For a general ARMA(p,q)
$$Y_t = \phi_1 Y_{t-1} + \ldots + \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \ldots + \theta_q \epsilon_{t-q}$$
$Y_{t-1}$ is correlated with $\epsilon_{t-1}, \ldots, \epsilon_{t-q}$ $\Rightarrow$ OLS is not consistent.

$\Rightarrow$ MLE with numerical optimization procedures.
17 Maximum Likelihood in time series: GARCH model

A GARCH process with gaussian innovations
$$r_t \mid \Omega_{t-1} \sim N\big(\mu_t(\theta), \sigma^2_t(\theta)\big)$$
has conditional densities:
$$f(r_t \mid \Omega_{t-1}; \theta) = \frac{1}{\sigma_t(\theta)\sqrt{2\pi}} \exp\left(-\frac{1}{2}\, \frac{(r_t - \mu_t(\theta))^2}{\sigma^2_t(\theta)}\right)$$

Using the prediction error decomposition, the log-likelihood becomes:
$$\log L(r_T, r_{T-1}, \ldots, r_1; \theta) = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^T \log\sigma^2_t(\theta) - \frac{1}{2}\sum_{t=1}^T \frac{(r_t - \mu_t(\theta))^2}{\sigma^2_t(\theta)}$$

Nonlinear function in $\theta$ $\Rightarrow$ numerical optimization techniques.
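The decomposition above translates directly into a recursive log-likelihood evaluation. The sketch below does this for a GARCH(1,1) with $\mu_t = 0$ and $\sigma^2_t = \omega + \alpha r_{t-1}^2 + \beta \sigma^2_{t-1}$; the parameter values and the initialization of $\sigma^2_1$ at the unconditional variance are illustrative assumptions (a numerical optimizer would then maximize this function over $\theta$):

```python
import math
import random

# Gaussian GARCH(1,1) log-likelihood via the prediction error
# decomposition, with mu_t = 0. Initialization at the unconditional
# variance omega / (1 - alpha - beta) is an illustrative convention.
random.seed(8)

def garch_loglik(r, omega, alpha, beta):
    T = len(r)
    sigma2 = omega / (1.0 - alpha - beta)     # sigma2_1: unconditional var
    ll = -0.5 * T * math.log(2.0 * math.pi)
    for t in range(T):
        if t > 0:                              # variance recursion
            sigma2 = omega + alpha * r[t - 1] ** 2 + beta * sigma2
        ll += -0.5 * math.log(sigma2) - 0.5 * r[t] ** 2 / sigma2
    return ll

# simulate a GARCH(1,1) path with the same recursion
omega, alpha, beta = 0.1, 0.1, 0.8
r, sigma2 = [], omega / (1.0 - alpha - beta)
for _ in range(1000):
    r.append(random.gauss(0.0, math.sqrt(sigma2)))
    sigma2 = omega + alpha * r[-1] ** 2 + beta * sigma2

print(garch_loglik(r, omega, alpha, beta))
```

Evaluating the same series under a badly misspecified constant-variance model gives a much lower log-likelihood, which is what the numerical optimizer exploits.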
18 Quasi Maximum Likelihood

ML requires the complete specification of $f(y_i \mid x_i; \theta)$; usually Normality is assumed.

Nevertheless, even if the true distribution is not Normal, assuming Normality gives consistency and asymptotic normality, provided that the conditional mean and variance processes are correctly specified.

However, the information matrix equality no longer holds, i.e. $J(\theta_0) \neq I(\theta_0)$; hence the covariance matrix of $\hat\theta_{ML}$ is not $I(\theta_0)^{-1} = \left(-E\left[\frac{\partial^2 \ell(\theta_0)}{\partial\theta_0\,\partial\theta_0'}\right]\right)^{-1}$ but
$$\hat\theta_{QML} \stackrel{a}{\sim} N\big(\theta_0,\, [I(\theta_0)]^{-1} J(\theta_0)\, [I(\theta_0)]^{-1}\big)$$
where
$$J(\theta_0) = \lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^T E\left[\frac{\partial \ell_t(\theta_0)}{\partial\theta_0}\, \frac{\partial \ell_t(\theta_0)}{\partial\theta_0'}\right]$$

The standard errors provided by $\big[\hat I(\theta_0)\big]^{-1} \hat J(\theta_0) \big[\hat I(\theta_0)\big]^{-1}$ are called robust standard errors.
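The sandwich form can be seen in a deliberately misspecified scalar example: a Gaussian quasi-likelihood for the mean with the variance wrongly fixed at 1, while the data actually have variance 4. The Hessian-based ("naive") variance is then wrong, and $I^{-1} J I^{-1}$ corrects it. All numbers below are illustrative assumptions:

```python
import random

# Sandwich (robust) variance in a scalar QML case: quasi-log-likelihood
# per obs l_t = -(y_t - mu)^2 / 2 (variance fixed at 1), so the score is
# g_t = y_t - mu and the Hessian is H_t = -1, while the data have
# variance 4. Illustrative setup.
random.seed(9)
n = 1000
y = [random.gauss(0.0, 2.0) for _ in range(n)]   # true variance = 4
mu_hat = sum(y) / n                              # QML estimate of the mean

I_hat = float(n)                                 # -sum(H_t)
J_hat = sum((yi - mu_hat) ** 2 for yi in y)      # sum of squared scores

var_naive = 1.0 / I_hat                          # wrong: assumes J = I
var_robust = J_hat / I_hat ** 2                  # sandwich I^-1 J I^-1
print(var_naive, var_robust)
```

The robust variance is roughly four times the naive one here, recovering the usual $\sigma^2/n$ for the sample mean even though the quasi-likelihood assumed the wrong variance.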
19 GMM (idea)

Idea: Sample moments $\stackrel{p}{\to}$ Population moments = function(parameters).

A vector function $g(\theta, w_t)$ which, under the true value $\theta_0$, satisfies
$$E_0[g(\theta_0, w_t)] = 0$$
is called a set of orthogonality or moment conditions.

Goal: estimate $\theta_0$ from the informational content of the moment conditions $\Rightarrow$ semiparametric approach, i.e. no distributional assumptions!

Replace the population moment conditions with sample moments
$$\hat g_T(\theta) = \frac{1}{T}\sum_{t=1}^T g(\theta, w_t)$$
and minimize, w.r.t. $\theta$, a quadratic form of $\hat g_T(\theta)$ with a certain weighting matrix $W$:
$$\hat\theta_{GMM} = \arg\min_\theta \left(\frac{1}{T}\sum_{t=1}^T g(\theta, w_t)\right)' W \left(\frac{1}{T}\sum_{t=1}^T g(\theta, w_t)\right)$$
20 GMM: optimal weighting matrix

Exactly identified (Method of Moments, MM): # orthogonality conditions = # parameters $\Rightarrow$ the moment equations are satisfied exactly, i.e.
$$\frac{1}{T}\sum_{t=1}^T g(\hat\theta, w_t) = 0 \quad\Rightarrow\quad W \text{ irrelevant}$$

Overidentified (Generalized MM): # orthogonality conditions > # parameters $\Rightarrow$ $W$ is relevant.

The optimal weighting matrix $W$ is the inverse of the asymptotic var-cov matrix of $g(\theta_0, w_t)$:
$$W = \mathrm{Var}[g(\theta_0, w_t)]^{-1}$$
but it depends on the unknown $\theta_0$.

Feasible two-step procedure:
- Step 1: use $W = I$ to obtain a consistent estimator $\hat\theta_1$, then estimate
$$\hat\Phi = \frac{1}{T}\sum_{t=1}^T g(\hat\theta_1, w_t)\, g(\hat\theta_1, w_t)'$$
- Step 2: compute the second step GMM estimator using the weighting matrix $\hat\Phi^{-1}$:
$$\hat\theta_{GMM} = \arg\min_\theta \left(\frac{1}{T}\sum_{t=1}^T g(\theta, w_t)\right)' \hat\Phi^{-1} \left(\frac{1}{T}\sum_{t=1}^T g(\theta, w_t)\right)$$

The two-step estimator $\hat\theta_{GMM}$ is asymptotically efficient in the GMM class.
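The two-step procedure can be sketched end-to-end in a small overidentified example: estimating a Poisson rate $\lambda$ from the two moment conditions $E[x - \lambda] = 0$ and $E[(x-\lambda)^2 - \lambda] = 0$ (mean = variance = $\lambda$). The Poisson model, grid minimization, and hand-coded $2\times 2$ inverse are all illustrative simplifications:

```python
import random

# Two-step GMM: scalar lambda, two moment conditions (overidentified).
# Step 1 uses W = I; step 2 uses the inverse of the estimated moment
# covariance Phi. Grid search replaces a numerical optimizer.
random.seed(10)

def rpois(lam):
    # inverse-CDF Poisson draw: accumulate P(X = k) until it covers u
    u, k = random.random(), 0
    t = 2.718281828459045 ** (-lam)      # P(X = 0) = exp(-lam)
    s = t
    while u > s:
        k += 1
        t *= lam / k
        s += t
    return k

lam0, T = 3.0, 1000
x = [rpois(lam0) for _ in range(T)]

def gbar(lam):
    g1 = sum(xi - lam for xi in x) / T               # mean condition
    g2 = sum((xi - lam) ** 2 - lam for xi in x) / T  # variance condition
    return g1, g2

def objective(lam, W):
    g1, g2 = gbar(lam)
    (a, b), (c, d) = W
    return g1 * (a * g1 + b * g2) + g2 * (c * g1 + d * g2)

def argmin(W):
    grid = [1.0 + 0.005 * i for i in range(801)]     # lambda in [1, 5]
    return min(grid, key=lambda lam: objective(lam, W))

# Step 1: identity weighting -> consistent first-step estimate
lam1 = argmin(((1.0, 0.0), (0.0, 1.0)))
# Estimate Phi = (1/T) sum g_t g_t' at lam1, then invert the 2x2 matrix
g = [(xi - lam1, (xi - lam1) ** 2 - lam1) for xi in x]
p11 = sum(a * a for a, b in g) / T
p12 = sum(a * b for a, b in g) / T
p22 = sum(b * b for a, b in g) / T
det = p11 * p22 - p12 * p12
W2 = ((p22 / det, -p12 / det), (-p12 / det, p11 / det))
# Step 2: efficient GMM with the estimated optimal weighting matrix
lam2 = argmin(W2)
print(lam1, lam2)
```

Both steps give consistent estimates; the second-step weighting downweights the noisier variance condition, which is where the asymptotic efficiency gain comes from.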
21 GMM, OLS and ML

Many estimation methods can be seen as GMM. Examples:

1) OLS is a GMM with orthogonality condition
$$E[x_i \epsilon_i] = E\big[x_i (y_i - x_i'\theta)\big] = 0$$

2) ML is a GMM on the score:
$$E\left[\frac{\partial \log f(y_t;\theta)}{\partial\theta}\right] = E[g(\theta, w_t)] = 0$$
Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate
More informationLECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test.
Economics 52 Econometrics Professor N.M. Kiefer LECTURE 1: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING NEYMAN-PEARSON LEMMA: Lesson: Good tests are based on the likelihood ratio. The proof is easy in the
More informationGMM - Generalized method of moments
GMM - Generalized method of moments GMM Intuition: Matching moments You want to estimate properties of a data set {x t } T t=1. You assume that x t has a constant mean and variance. x t (µ 0, σ 2 ) Consider
More informationLecture 17: Likelihood ratio and asymptotic tests
Lecture 17: Likelihood ratio and asymptotic tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f
More informationGraduate Econometrics I: Maximum Likelihood II
Graduate Econometrics I: Maximum Likelihood II Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood
More informationRegression #3: Properties of OLS Estimator
Regression #3: Properties of OLS Estimator Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #3 1 / 20 Introduction In this lecture, we establish some desirable properties associated with
More informationEstimation of Dynamic Regression Models
University of Pavia 2007 Estimation of Dynamic Regression Models Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy.
More informationGARCH Models Estimation and Inference
GARCH Models Estimation and Inference Eduardo Rossi University of Pavia December 013 Rossi GARCH Financial Econometrics - 013 1 / 1 Likelihood function The procedure most often used in estimating θ 0 in
More informationEstimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators
Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let
More informationPOLI 8501 Introduction to Maximum Likelihood Estimation
POLI 8501 Introduction to Maximum Likelihood Estimation Maximum Likelihood Intuition Consider a model that looks like this: Y i N(µ, σ 2 ) So: E(Y ) = µ V ar(y ) = σ 2 Suppose you have some data on Y,
More informationGeneralized Linear Models. Kurt Hornik
Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general
More informationStatistics 3858 : Maximum Likelihood Estimators
Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,
More informationGeneralized Method of Moments (GMM) Estimation
Econometrics 2 Fall 2004 Generalized Method of Moments (GMM) Estimation Heino Bohn Nielsen of29 Outline of the Lecture () Introduction. (2) Moment conditions and methods of moments (MM) estimation. Ordinary
More informationMathematics Ph.D. Qualifying Examination Stat Probability, January 2018
Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from
More information8. Hypothesis Testing
FE661 - Statistical Methods for Financial Engineering 8. Hypothesis Testing Jitkomut Songsiri introduction Wald test likelihood-based tests significance test for linear regression 8-1 Introduction elements
More informationDA Freedman Notes on the MLE Fall 2003
DA Freedman Notes on the MLE Fall 2003 The object here is to provide a sketch of the theory of the MLE. Rigorous presentations can be found in the references cited below. Calculus. Let f be a smooth, scalar
More informationECE531 Lecture 10b: Maximum Likelihood Estimation
ECE531 Lecture 10b: Maximum Likelihood Estimation D. Richard Brown III Worcester Polytechnic Institute 05-Apr-2011 Worcester Polytechnic Institute D. Richard Brown III 05-Apr-2011 1 / 23 Introduction So
More informationMaximum Likelihood Estimation
Chapter 8 Maximum Likelihood Estimation 8. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function
More informationHT Introduction. P(X i = x i ) = e λ λ x i
MODS STATISTICS Introduction. HT 2012 Simon Myers, Department of Statistics (and The Wellcome Trust Centre for Human Genetics) myers@stats.ox.ac.uk We will be concerned with the mathematical framework
More informationEconomics 583: Econometric Theory I A Primer on Asymptotics
Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:
More informationGraduate Econometrics I: Unbiased Estimation
Graduate Econometrics I: Unbiased Estimation Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Unbiased Estimation
More informationChapter 8.8.1: A factorization theorem
LECTURE 14 Chapter 8.8.1: A factorization theorem The characterization of a sufficient statistic in terms of the conditional distribution of the data given the statistic can be difficult to work with.
More informationRegression #4: Properties of OLS Estimator (Part 2)
Regression #4: Properties of OLS Estimator (Part 2) Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #4 1 / 24 Introduction In this lecture, we continue investigating properties associated
More information2014/2015 Smester II ST5224 Final Exam Solution
014/015 Smester II ST54 Final Exam Solution 1 Suppose that (X 1,, X n ) is a random sample from a distribution with probability density function f(x; θ) = e (x θ) I [θ, ) (x) (i) Show that the family of
More information