GMM Estimation and Testing


Whitney Newey, July 2007

Idea: estimate parameters by setting sample moments close to their population counterparts.

Definitions:
β : p × 1 parameter vector, with true value β₀.
g_i(β) = g(w_i, β) : m × 1 vector of functions of the i-th data observation w_i and the parameter.

Model (or moment restriction): E[g_i(β₀)] = 0.

ĝ(β) = Σ_{i=1}^n g_i(β)/n : sample average of the moments.
Â : m × m positive semi-definite matrix.

GMM ESTIMATOR: β̂ = argmin_β ĝ(β)′Âĝ(β).

GMM ESTIMATOR: β̂ = argmin_β ĝ(β)′Âĝ(β).

Interpretation: choose β̂ so that the sample moments are close to zero. For the norm ‖g‖_Â = √(g′Âg), this is the same as minimizing ‖ĝ(β) − 0‖_Â. When m = p, the β̂ with ĝ(β̂) = 0 will be the GMM estimator for any Â. When m > p, Â matters.

Method of Moments is a special case:
Moments: E[y^j] = h_j(β₀), (1 ≤ j ≤ p).
Moment functions: g_i(β) = (y_i − h_1(β), ..., y_i^p − h_p(β))′.
Estimator: ĝ(β̂) = 0, i.e. the sample moments satisfy ȳ^j = h_j(β̂), (1 ≤ j ≤ p), where ȳ^j = Σ_{i=1}^n y_i^j/n.
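As a hypothetical illustration of the exactly identified case (m = p, so ĝ(β̂) = 0 can be solved directly), the sketch below matches the first two moments of a simulated normal sample; the distribution and parameter values are assumptions made for the example:

```python
import numpy as np

# Method of moments as exactly identified GMM (m = p): set the sample
# moments equal to their population counterparts and solve g-hat(beta) = 0.
# Moments: E[y] = mu and E[y^2] = mu^2 + sigma^2 (assumed for illustration).
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.5, size=100_000)  # true mu = 2, sigma^2 = 2.25

mu_hat = y.mean()                        # solves ybar = h_1(beta)
sigma2_hat = (y**2).mean() - mu_hat**2   # solves mean(y^2) = h_2(beta)
```

Because m = p here, no weighting matrix is needed: any positive definite Â yields the same estimator.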

Two-stage least squares as GMM:
Model: y_i = X_i′β + ε_i, E[Z_i ε_i] = 0, (i = 1, ..., n).
Moment functions: g_i(β) = Z_i(y_i − X_i′β).
Sample moments: ĝ(β) = Σ_{i=1}^n Z_i(y_i − X_i′β)/n = Z′(y − Xβ)/n, where Z = [Z_1, ..., Z_n]′, X = [X_1, ..., X_n]′, y = (y_1, ..., y_n)′.
Distance (weighting) matrix: Â = (Z′Z/n)⁻¹.
The GMM estimator is 2SLS:
β̂ = argmin_β [(y − Xβ)′Z/n](Z′Z/n)⁻¹[Z′(y − Xβ)/n]
  = argmin_β (y − Xβ)′Z(Z′Z)⁻¹Z′(y − Xβ).
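A minimal simulated sketch of this equivalence (the data generating process and instrument count are assumptions for illustration); the closed form below is the minimizer of the GMM objective with Â = (Z′Z/n)⁻¹:

```python
import numpy as np

# 2SLS as GMM with A-hat = (Z'Z/n)^{-1}, on simulated data with an
# endogenous regressor (the shared shock u appears in both x and y).
rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=(n, 2))                       # two excluded instruments
u = rng.normal(size=n)                            # common shock -> endogeneity
x = z @ np.array([1.0, -0.5]) + u + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])              # p = 2
Z = np.column_stack([np.ones(n), z])              # m = 3 > p: overidentified
y = X @ np.array([0.5, 2.0]) + u + rng.normal(size=n)

# beta = (X'Z (Z'Z)^{-1} Z'X)^{-1} X'Z (Z'Z)^{-1} Z'y
ZX, Zy = Z.T @ X, Z.T @ y
C = np.linalg.solve(Z.T @ Z, ZX)                  # (Z'Z)^{-1} Z'X
beta_2sls = np.linalg.solve(ZX.T @ C, C.T @ Zy)
```

OLS would be inconsistent here because u enters both x and the error; the instruments restore consistency.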

Intertemporal CAPM: nonlinear in parameters. Let c_i be consumption at time i, R_i the asset return between i and i + 1, α₀ the time discount factor, u(c, γ₀) the utility function, and I_i the variables observed at time i. The agent maximizes
E[Σ_{j=0}^∞ α₀^j u(c_{i+j}, γ₀)]
subject to an intertemporal budget constraint.

The first-order conditions are E[R_i α₀ u_c(c_{i+1}, γ₀)/u_c(c_i, γ₀) | I_i] = 1: the marginal rate of substitution between i and i + 1 equals the rate of return (in expected value). Let Z_i denote an m × 1 vector of variables observable at time i (like lagged consumption, lagged returns, and nonlinear functions of these). The moment function is
g_i(β) = Z_i{R_i α u_c(c_{i+1}, γ)/u_c(c_i, γ) − 1}, for β = (α, γ′)′.
Here GMM is nonlinear instrumental variables. Empirical example: Hansen and Singleton (1982, Econometrica).

IDENTIFICATION: Identification precedes estimation; if a parameter is not identified, no consistent estimator exists. Identification from moment functions: let ḡ(β) = E[g_i(β)]. The identification condition is that β₀ IS THE ONLY SOLUTION TO ḡ(β) = 0. A necessary condition for identification is m ≥ p. When m < p, i.e. there are fewer equations to solve than parameters, there will typically be multiple solutions to the moment conditions. In IV, we need at least as many instrumental variables as right-hand-side variables.

Rank condition for identification: let G = E[∂g_i(β₀)/∂β]. The rank condition is rank(G) = p. This is necessary and sufficient for identification when g_i(β) is linear in β:
0 = ḡ(β) = ḡ(β₀) + G(β − β₀) = G(β − β₀)
has a unique solution at β₀ if and only if the rank of G equals the number of columns of G. For IV, G = E[Z_i X_i′], so rank(G) = p is the usual rank condition.
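The rank condition can be checked numerically through the smallest singular value of the sample analogue Ĝ = Z′X/n; the simulated design below, with one relevant and one irrelevant instrument configuration, is an assumption made for illustration:

```python
import numpy as np

# Rank condition for IV: G = E[Z_i X_i'] must have rank p. The smallest
# singular value of the sample G-hat is bounded away from zero when the
# condition holds, and is near zero (O(1/sqrt(n))) when it fails.
rng = np.random.default_rng(6)
n = 10_000
z1 = rng.normal(size=n)
x = z1 + rng.normal(size=n)                         # x correlated with z1
X = np.column_stack([np.ones(n), x])
Z_good = np.column_stack([np.ones(n), z1])          # relevant instrument
Z_irrel = np.column_stack([np.ones(n), rng.normal(size=n)])  # irrelevant

s_good = np.linalg.svd(Z_good.T @ X / n, compute_uv=False).min()
s_irrel = np.linalg.svd(Z_irrel.T @ X / n, compute_uv=False).min()
```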

In the general nonlinear case it is difficult to specify conditions for uniqueness of the solution to ḡ(β) = 0. rank(G) = p is sufficient for local identification: there exists a neighborhood of β₀ such that β₀ is the unique solution to ḡ(β) = 0 for all β in that neighborhood. Exact identification is m = p; for IV, as many instruments as right-hand-side variables. Here the GMM estimator will satisfy ĝ(β̂) = 0 asymptotically; see notes. Overidentification is m > p.

TWO-STEP OPTIMAL GMM ESTIMATOR: When m > p, the GMM estimator depends on the weighting matrix Â. The optimal  minimizes the asymptotic variance of β̂. Let Ω be the asymptotic variance of √n ĝ(β₀) = Σ_{i=1}^n g_i(β₀)/√n, i.e. √n ĝ(β₀) →d N(0, Ω). In general, a central limit theorem for time series gives Ω = lim_{n→∞} E[n ĝ(β₀)ĝ(β₀)′]. The optimal  satisfies  →p Ω⁻¹. Take  = Ω̂⁻¹ for Ω̂ a consistent estimator of Ω.

Estimating Ω: let β̃ be a preliminary GMM estimator (based on a known Â), and let g̃_i = g_i(β̃) − ĝ(β̃). If E[g_i(β₀)g_{i+ℓ}(β₀)′] = 0 for all ℓ ≠ 0 (no autocorrelation in the moment conditions), then
Ω̂ = Σ_{i=1}^n g̃_i g̃_i′/n.
An example is the CAPM model above.

If there is autocorrelation in the moment conditions, then
Ω̂ = Λ̂₀ + Σ_{ℓ=1}^L w_{ℓL}(Λ̂_ℓ + Λ̂_ℓ′), Λ̂_ℓ = Σ_{i=1}^{n−ℓ} g̃_i g̃_{i+ℓ}′/n.
The weights w_{ℓL} make Ω̂ positive semi-definite. Bartlett weights w_{ℓL} = 1 − ℓ/(L + 1), as in Newey and West (1987). The lag length L must still be chosen; for consistency it should grow with the sample size.
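A sketch of this estimator with Bartlett weights (the function name and interface are assumptions, not from the notes):

```python
import numpy as np

def newey_west_omega(g, L):
    """HAC estimate of Omega with Bartlett weights w_l = 1 - l/(L+1).

    g : (n, m) array whose rows are the moment contributions g_i.
    A sketch of the estimator described above, not a tuned implementation.
    """
    n, m = g.shape
    g = g - g.mean(axis=0)          # center, as in g-tilde
    omega = g.T @ g / n             # Lambda_0
    for l in range(1, L + 1):
        lam = g[:-l].T @ g[l:] / n  # Lambda_l = sum_i g_i g_{i+l}' / n
        omega += (1 - l / (L + 1)) * (lam + lam.T)
    return omega
```

With L = 0 this reduces to the no-autocorrelation estimator of the previous slide; the Bartlett weights guarantee the result is positive semi-definite for any L.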

Two-step optimal GMM: β̂ = argmin_β ĝ(β)′Ω̂⁻¹ĝ(β).
Two-step optimal instrumental variables: for ĝ(β) = Z′(y − Xβ)/n,
β̂ = (X′Z Ω̂⁻¹ Z′X)⁻¹ X′Z Ω̂⁻¹ Z′y.
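The two steps can be sketched for the linear IV case with heteroskedasticity but no autocorrelation (simulated data; the design is an assumption for illustration):

```python
import numpy as np

# Two-step optimal GMM for linear IV: step 1 is 2SLS (A-hat = (Z'Z/n)^{-1}),
# step 2 plugs the residual-based Omega-hat into the closed form above.
rng = np.random.default_rng(2)
n = 4000
z = rng.normal(size=(n, 3))
u = rng.normal(size=n) * (1 + 0.5 * np.abs(z[:, 0]))  # heteroskedastic shock
x = z @ np.array([1.0, 0.7, -0.4]) + u + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
y = X @ np.array([1.0, -1.5]) + u + rng.normal(size=n)

ZZ, ZX, Zy = Z.T @ Z, Z.T @ X, Z.T @ y
C = np.linalg.solve(ZZ, ZX)
b1 = np.linalg.solve(ZX.T @ C, C.T @ Zy)              # step 1: 2SLS

e = y - X @ b1
Omega = (Z * (e**2)[:, None]).T @ Z / n               # Omega-hat, no autocorr.
W = np.linalg.inv(Omega)
b2 = np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)    # step 2: optimal GMM
```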

CONSISTENT ASYMPTOTIC VARIANCE ESTIMATION: The optimal two-step GMM estimator has
√n(β̂ − β₀) →d N(0, V), V = (G′Ω⁻¹G)⁻¹.
To form asymptotic t-statistics and confidence intervals we need a consistent estimator V̂ of V. Let Ĝ = ∂ĝ(β̂)/∂β. A consistent estimator of V is V̂ = (Ĝ′Ω̂⁻¹Ĝ)⁻¹. One could also update Ω̂.

ASYMPTOTIC THEORY FOR GMM: Consistency for i.i.d. data. If the data are i.i.d. and i) E[g_i(β)] = 0 if and only if β = β₀ (identification); ii) the GMM minimization takes place over a compact set B containing β₀; iii) g_i(β) is continuous at each β with probability one and E[sup_{β∈B} ‖g_i(β)‖] is finite; iv) Â →p A positive definite; then β̂ →p β₀. Proof in Newey and McFadden (1994). Idea: ĝ(β₀) →p 0 by the law of large numbers, so ĝ(β₀)′Âĝ(β₀) →p 0. Also 0 ≤ ĝ(β̂)′Âĝ(β̂) ≤ ĝ(β₀)′Âĝ(β₀), so ĝ(β̂)′Âĝ(β̂) →p 0. The only way this can happen is if β̂ →p β₀.

Asymptotic normality: if the data are i.i.d., β̂ →p β₀, and i) β₀ is in the interior of the parameter set over which minimization occurs; ii) g_i(β) is continuously differentiable on a neighborhood N of β₀; iii) E[sup_{β∈N} ‖∂g_i(β)/∂β‖] is finite; iv) Â →p A and G′AG is nonsingular, for G = E[∂g_i(β₀)/∂β]; v) Ω = E[g_i(β₀)g_i(β₀)′] exists; then
√n(β̂ − β₀) →d N(0, V), V = (G′AG)⁻¹G′AΩAG(G′AG)⁻¹.
See Newey and McFadden (1994) for the proof. A derivation can be given:

Let Ĝ = ∂ĝ(β̂)/∂β. The first-order conditions are
0 = Ĝ′Âĝ(β̂).
Expand ĝ(β̂) around β₀ to obtain
ĝ(β̂) = ĝ(β₀) + Ḡ(β̂ − β₀),
where Ḡ = ∂ĝ(β̄)/∂β, β̄ lies on the line joining β̂ and β₀, and β̄ actually differs from row to row of Ḡ. Substitute this back into the first-order conditions to get
0 = Ĝ′Âĝ(β₀) + Ĝ′ÂḠ(β̂ − β₀).
Solve for β̂ − β₀ and multiply through by √n to get
√n(β̂ − β₀) = −(Ĝ′ÂḠ)⁻¹Ĝ′Â √n ĝ(β₀).

Recall √n(β̂ − β₀) = −(Ĝ′ÂḠ)⁻¹Ĝ′Â √n ĝ(β₀). By the central limit theorem, √n ĝ(β₀) →d N(0, Ω). Also Â →p A, Ĝ →p G, and Ḡ →p G, so by the continuous mapping theorem
(Ĝ′ÂḠ)⁻¹Ĝ′Â →p (G′AG)⁻¹G′A.
Then by the Slutsky lemma,
√n(β̂ − β₀) →d −(G′AG)⁻¹G′A N(0, Ω) = N(0, V).

The fact that A = Ω⁻¹ minimizes the asymptotic variance follows from the Gauss-Markov theorem. Consider a linear model E[Y] = Gδ, Var(Y) = Ω. Then (G′Ω⁻¹G)⁻¹ is the variance of generalized least squares (GLS) in this model, while V = (G′AG)⁻¹G′AΩAG(G′AG)⁻¹ is the variance of the linear estimator δ̂ = (G′AG)⁻¹G′AY. GLS has the smallest variance by the Gauss-Markov theorem, so V − (G′Ω⁻¹G)⁻¹ is positive semi-definite.

TESTING IN GMM: The overidentification test statistic is n ĝ(β̂)′Ω̂⁻¹ĝ(β̂). Its limiting distribution is chi-squared with m − p degrees of freedom.
Tests of H₀: s(β) = 0:
Wald test statistic: n s(β̂)′[(∂s(β̂)/∂β)(Ĝ′Ω̂⁻¹Ĝ)⁻¹(∂s(β̂)/∂β)′]⁻¹ s(β̂).
Likelihood ratio (sum of squared residuals) test statistic: min_{β: s(β)=0} n ĝ(β)′Ω̂⁻¹ĝ(β) − n ĝ(β̂)′Ω̂⁻¹ĝ(β̂).
There is also a Lagrange multiplier test statistic.
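The overidentification (J) statistic can be sketched in the simulated IV setting below, where the instruments are valid by construction, so J should behave as a chi-squared(m − p) draw (the design is an assumption for illustration):

```python
import numpy as np

# Overidentification test: with m = 2 instruments and p = 1 parameter,
# the statistic n * ghat' Omega^{-1} ghat is chi^2(1) under the null.
rng = np.random.default_rng(3)
n = 5000
Z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = Z @ np.array([1.0, 1.0]) + u + rng.normal(size=n)
y = 2.0 * x + u + rng.normal(size=n)          # both instruments valid
X = x[:, None]

ZZ, ZX, Zy = Z.T @ Z, Z.T @ X, Z.T @ y
C = np.linalg.solve(ZZ, ZX)
b1 = np.linalg.solve(ZX.T @ C, C.T @ Zy)      # first step (2SLS)
e = y - X @ b1
W = np.linalg.inv((Z * (e**2)[:, None]).T @ Z / n)
b2 = np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

ghat = Z.T @ (y - X @ b2) / n
J = n * ghat @ W @ ghat                       # compare to chi^2(1) quantile
```

Under the null, J exceeds the chi-squared(1) 95% critical value (about 3.84) only 5% of the time.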

USING MANY MOMENTS IN GMM: Using more moments increases the asymptotic efficiency of optimal two-step GMM. Suppose that g_i(β) = (g_i¹(β)′, g_i²(β)′)′. The optimal GMM estimator for just the first set of moment conditions g_i¹(β) uses
Â = [ (Ω̂¹)⁻¹  0 ]
    [    0     0 ],
where Ω̂¹ estimates the variance of the first set of moments. This Â is not generally optimal for the entire moment function vector g_i(β). The optimal GMM estimator with more moments is asymptotically more efficient, but gives more bias for two-step GMM.

To see the bias we look at the expectation of the GMM objective function in the i.i.d. case; the time series case is similar. The idea is that if the expectation is not minimized at the truth then there is bias, because the estimator gets close to the minimizer of the expectation. Assume A is not random. Let ḡ(β) = E[g_i(β)] and Ω(β) = E[g_i(β)g_i(β)′]. Note that
E[g_i(β)′Ag_i(β)] = E[tr{g_i(β)′Ag_i(β)}] = E[tr{Ag_i(β)g_i(β)′}] = tr{A E[g_i(β)g_i(β)′]} = tr{AΩ(β)}.

The expectation of the GMM objective function Q̂(β) = ĝ(β)′Aĝ(β) is
E[Q̂(β)] = Σ_{i,j=1}^n E[g_i(β)′Ag_j(β)]/n²
= Σ_{i≠j} ḡ(β)′Aḡ(β)/n² + Σ_{i=1}^n E[g_i(β)′Ag_i(β)]/n²
= (1 − n⁻¹)ḡ(β)′Aḡ(β) + tr(AΩ(β))/n.
The first term, (1 − n⁻¹)ḡ(β)′Aḡ(β), is minimized at β₀. However, the second term is NOT generally minimized at β₀, and hence neither is E[Q̂(β)]. This is the Han and Phillips (2005) noise term.

E[Q̂(β)] = (1 − n⁻¹)ḡ(β)′Aḡ(β) + tr(AΩ(β))/n.
The second, "bias" term shrinks at rate n⁻¹, so GMM is consistent. The bias of GMM behaves like the size of the second term, m/n; higher-order asymptotics can be used to show that this does indeed happen. There are two known approaches to reducing the bias of GMM from this source:
A) Subtract from the GMM objective function an estimate of the bias.
B) Make the bias not depend on β by the right choice of A.

It turns out that B) is better. Choose A = Ω(β)⁻¹. Then
E[Q̂(β)] = (1 − n⁻¹)ḡ(β)′Ω(β)⁻¹ḡ(β) + tr(Ω(β)⁻¹Ω(β))/n
= (1 − n⁻¹)ḡ(β)′Ω(β)⁻¹ḡ(β) + m/n.
The bias term does not depend on β, so the expected objective function is minimized at the true value. This estimator is not feasible, because Ω(β) is not known. A feasible version can be obtained by replacing Ω(β) with its estimator Ω̂(β). The estimator that minimizes the resulting objective function is
β̃ = argmin_β ĝ(β)′Ω̂(β)⁻¹ĝ(β).
This is the continuous updating estimator of Hansen, Heaton, and Yaron (1996). Time series extension: replace Ω̂(β) with an autocorrelation consistent estimator like that described above.
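For a scalar parameter, the continuous updating objective can be profiled on a grid, recomputing Ω̂(β) at every trial value (a simulated sketch; the centered form of Ω̂(β) and the grid bounds are choices made for the example):

```python
import numpy as np

# Continuous updating estimator (CUE): minimize ghat(b)' Omega-hat(b)^{-1}
# ghat(b), with the weighting matrix recomputed at each candidate b.
rng = np.random.default_rng(4)
n = 2000
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)
x = Z.sum(axis=1) + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)          # true beta = 1

def cue_objective(b):
    g = Z * (y - b * x)[:, None]              # g_i(b) = Z_i (y_i - x_i b)
    ghat = g.mean(axis=0)
    Omega = (g - ghat).T @ (g - ghat) / n     # centered Omega-hat(b)
    return ghat @ np.linalg.solve(Omega, ghat)

grid = np.linspace(0.5, 1.5, 1001)
beta_cue = grid[np.argmin([cue_objective(b) for b in grid])]
```

Unlike two-step GMM, no preliminary estimate is needed: the weighting matrix updates continuously with β, which is what removes the β-dependence of the bias term.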

GENERALIZED EMPIRICAL LIKELIHOOD (GEL): These estimators share the low-bias property of the continuous updating estimator. Let ρ(u) be a concave function of the scalar argument u. The estimator is
β̂ = argmin_β max_λ Σ_{i=1}^n ρ(λ′g_i(β)).
This is the continuous updating estimator for ρ(u) quadratic. For ρ(u) = ln(1 − u) it is empirical likelihood. For ρ(u) = −e^u it is exponential tilting. Time series version, Kitamura (1997): replace g_i(β) with the smoothed moments
g_i^L(β) = Σ_{ℓ=−L}^{L} w_{ℓL} g_{i+ℓ}(β).

GEL has small bias like continuous updating because λ̂ is close to zero, so ρ(u) is approximately quadratic, i.e. the estimator behaves like continuous updating. This can also be seen from the first-order conditions. Focus on empirical likelihood for i.i.d. data. The GMM first-order conditions are
Ĝ′Ω̂(β̃)⁻¹ĝ(β̂) = 0.
The first-order conditions for empirical likelihood are, for ĝ_i = g_i(β̂):

λ: 0 = Σ_{i=1}^n [1/(1 − λ̂′ĝ_i)] ĝ_i = n ĝ(β̂) + Σ_{i=1}^n [1/(1 − λ̂′ĝ_i)] ĝ_i ĝ_i′ λ̂,
β: 0 = Σ_{i=1}^n [1/(1 − λ̂′ĝ_i)] (∂g_i(β̂)/∂β)′λ̂.
Define
G̃ = Σ_{i=1}^n [1/(1 − λ̂′ĝ_i)] ∂g_i(β̂)/∂β,  Ω̃ = Σ_{i=1}^n [1/(1 − λ̂′ĝ_i)] ĝ_i ĝ_i′.
Solving gives
G̃′Ω̃⁻¹ĝ(β̂) = 0.
This differs from GMM in replacing the sample averages Ĝ and Ω̂(β̃) by the weighted averages G̃ and Ω̃. The replacement removes a source of bias because the weights are exactly those that make the moment conditions sum to zero (from the first-order conditions for λ̂). Newey and Smith (2004) show that among bias-corrected estimators the empirical likelihood estimator is higher-order efficient with i.i.d. data. It inherits the famous higher-order efficiency properties of maximum likelihood.
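The saddle-point structure of GEL can be sketched for exponential tilting, ρ(u) = −e^u: the inner maximization over λ is smooth and concave, so a few Newton steps suffice, while the outer minimization is a grid search over a scalar β (the simulated design, step count, and grid are assumptions for illustration):

```python
import numpy as np

# GEL with exponential tilting, rho(u) = -exp(u): inner concave Newton
# maximization over lambda, outer grid search over a scalar beta.
rng = np.random.default_rng(5)
n = 1000
Z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = Z.sum(axis=1) + u + rng.normal(size=n)
y = 0.8 * x + u + rng.normal(size=n)      # true beta = 0.8

def gel_profile(b, steps=25):
    g = Z * (y - b * x)[:, None]          # g_i(b), shape (n, m)
    lam = np.zeros(g.shape[1])
    for _ in range(steps):                # Newton steps on lambda
        w = np.exp(g @ lam)
        grad = -(g * w[:, None]).sum(axis=0)
        hess = -(g * w[:, None]).T @ g    # negative definite: concave
        lam -= np.linalg.solve(hess, grad)
    return -np.exp(g @ lam).sum()         # max_lambda sum_i rho(lam' g_i)

grid = np.linspace(0.4, 1.2, 401)
beta_et = grid[np.argmin([gel_profile(b) for b in grid])]
```

At the solution, λ̂ is near zero when the moments are correctly specified, which is the source of the low-bias behavior discussed above.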