Generalized Method of Moments Estimation

Similar documents
generalized method of moments estimation

Interview with Lars Peter Hansen

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p

Economic modelling and forecasting

Nonlinear GMM. Eric Zivot. Winter, 2013

Diagnostic Test for GARCH Models Based on Absolute Residual Autocorrelations

The properties of L p -GMM estimators

Research Article Optimal Portfolio Estimation for Dependent Financial Returns with Generalized Empirical Likelihood

A Course on Advanced Econometrics

Proofs for Large Sample Properties of Generalized Method of Moments Estimators

GMM Estimation and Testing

Specification Test for Instrumental Variables Regression with Many Instruments

Generalized Method of Moment

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications

Estimating Deep Parameters: GMM and SMM

Volume 30, Issue 1. Measuring the Intertemporal Elasticity of Substitution for Consumption: Some Evidence from Japan

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

Joint Estimation of Risk Preferences and Technology: Further Discussion

Generalized Method of Moments (GMM) Estimation

Notes on Generalized Method of Moments Estimation

Motivation Non-linear Rational Expectations The Permanent Income Hypothesis The Log of Gravity Non-linear IV Estimation Summary.

Chapter 1. GMM: Basic Concepts

Single Equation Linear GMM

The generalized method of moments

Generalized Method of Moments: I. Chapter 9, R. Davidson and J.G. MacKinnon, Econometric Theory and Methods, 2004, Oxford.

Chapter 11 GMM: General Formulas and Application

DSGE Methods. Estimation of DSGE models: GMM and Indirect Inference. Willi Mutschler, M.Sc.

1. GENERAL DESCRIPTION

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

DSGE-Models. Limited Information Estimation General Method of Moments and Indirect Inference

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Weak Identification in Maximum Likelihood: A Question of Information

The Hansen Singleton analysis

This chapter reviews properties of regression estimators and test statistics based on

A Primer on Asymptotics

What s New in Econometrics? Lecture 14 Quantile Methods

Econometrics of Panel Data

Uncertainty and Disagreement in Equilibrium Models

INFERENCE ON SETS IN FINANCE

Introduction to the Mathematical and Statistical Foundations of Econometrics Herman J. Bierens Pennsylvania State University

ORTHOGONALITY TESTS IN LINEAR MODELS SEUNG CHAN AHN ARIZONA STATE UNIVERSITY ABSTRACT

Empirical likelihood and self-weighting approach for hypothesis testing of infinite variance processes and its applications

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Follow links for Class Use and other Permissions. For more information send to:

Method of Moments. 2. Minimum Chi-square Estimation. 1. Introduction. Method of Moments

New Developments in Econometrics Lecture 16: Quantile Estimation

GMM - Generalized method of moments

Financial Econometrics and Quantitative Risk Managenent Return Properties

L4. EULER EQUATIONS, NONLINEAR GMM, AND OTHER ADVENTURES

University of Pavia. M Estimators. Eduardo Rossi

1 Estimation of Persistent Dynamic Panel Data. Motivation

Flexible Estimation of Treatment Effect Parameters

INFERENCE APPROACHES FOR INSTRUMENTAL VARIABLE QUANTILE REGRESSION. 1. Introduction

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Follow links for Class Use and other Permissions. For more information send to:

Lecture 7: Dynamic panel models 2

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Introduction to Econometrics

Specification Test Based on the Hansen-Jagannathan Distance with Good Small Sample Properties

Birkbeck Working Papers in Economics & Finance

Comparing Forecast Accuracy of Different Models for Prices of Metal Commodities

Multivariate Distributions

A Guide to Modern Econometric:

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

ECOM 009 Macroeconomics B. Lecture 2

Exponential Tilting with Weak Instruments: Estimation and Testing

Long-run Uncertainty and Value

A strong consistency proof for heteroscedasticity and autocorrelation consistent covariance matrix estimators

Inference in VARs with Conditional Heteroskedasticity of Unknown Form

Long-run Uncertainty and Value

Introduction to Estimation Methods for Time Series models. Lecture 1

Regression: Ordinary Least Squares

Financial Econometrics

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools

Chapter 1. Introduction. 1.1 Background

ASSET PRICING MODELS

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

COMPARISON OF GMM WITH SECOND-ORDER LEAST SQUARES ESTIMATION IN NONLINEAR MODELS. Abstract

1 The Basic RBC Model

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Asymptotic distribution of GMM Estimator

1 Appendix A: Matrix Algebra

Dynamic Panels. Chapter Introduction Autoregressive Model

6. The econometrics of Financial Markets: Empirical Analysis of Financial Time Series. MA6622, Ernesto Mordecki, CityU, HK, 2006.

Understanding Regressions with Observations Collected at High Frequency over Long Span

Improving GMM efficiency in dynamic models for panel data with mean stationarity

Financial Econometrics and Volatility Models Estimation of Stochastic Volatility Models

Predicting bond returns using the output gap in expansions and recessions

Sample Exam Questions for Econometrics

ADVANCED FINANCIAL ECONOMETRICS PROF. MASSIMO GUIDOLIN

Asymptotics for Nonlinear GMM

Bayesian Averaging, Prediction and Nonnested Model Selection

If we want to analyze experimental or simulated data we might encounter the following tasks:

Cointegration Lecture I: Introduction

ARIMA Modelling and Forecasting

Dynamic Panel Data Models

Dynamic Discrete Choice Structural Models in Empirical IO

Equilibrium Valuation with Growth Rate Uncertainty

Transcription:

Generalized Method of Moments Estimation Lars Peter Hansen March 0, 2007 Introduction Generalized methods of moments (GMM) refers to a class of estimators which are constructed from exploiting the sample moment counterparts of population moment conditions (sometimes known as orthogonality conditions) of the data generating model. GMM estimators have become widely used, for the following reasons. GMM estimators have large sample properties that are easy to characterize in ways that facilitate comparison. A family of such estimators can be studied a priori in ways that make asymptotic efficiency comparisons easy. The method also provides a natural way to construct tests which take account of both sampling and estimation error. 2. In practice, researchers find it useful that GMM estimators can be constructed without specifying the full data generating process (which would be required to write down the maximum likelihood estimator.) This has been the case in the study of single equations in a simultaneous system, in the study of potentially misspecified dynamic models designed to match target moments, and in the construction of stochastic discount factor models that link asset pricing to sources of macroeconomic risk. Books with good discussions of GMM estimation with a wide array of applications include: Cochrane (200), Arellano (2003), Hall (2005), Cochrane (200), and Singleton (2006). For a theoretical treatment of this method see Hansen (982) along with the self discussions in the books. 2 Setup There are alternative ways to specify generalized method of moments estimators. Data are a finite number of realizations of the process {x t : t =, 2,...}. The model is specified as a vector of moment conditions: Ef(x t, β) = 0

where f has r coordinates and β is an unknown vector in a parameter space P R k. To achieve identification we assume that E[f(x t, β)] = 0 () on the parameter space P if, and only if β = β 0. The parameter β is typically not sufficient to write down a likelihood function. Other parameters are needed to specify fully the probability model that underlies the data generation. In other words, the model is only partially specified. Examples included: linear and nonlinear versions of instrumental variables estimators as in Sargan (958), Sargan (959) and Amemiya; asset pricing restrictions deduced from Euler equations; matching target moments of possibly misspecified models. 2. Central limit theory and Martingale approximation The sum g (β) = f(x t, β) is central to what we consider. As a refinement of the identification condition: t= g (β 0 ) = ormal(o, V ) where V is assumed to be nonsingular. In an iid data setting, V is the covariance matrix of the random vector f(x t, β 0 ). In a time series setting: V = lim E [g (β 0 )g (β 0 ) ], which is the long run counterpart to a covariance matrix. Central limit theory for time series is typically built on martingale approximation. (See Gordin (969) or Hall and Heyde (980)). For many time series models, the martingale approximators can be constructed directly and there is specific structure to the V matrix. A leading example is when f(x t, β) defines a conditional moment restriction. Suppose that x t, t = 0,,... generates a sigma algebra F t, that E [ f(x t, β 0 ) 2 ] < and E [f(x t+l, β 0 ) F t ] = 0 for some l. If l =, then g is itself a martingale; but when l > it is straightforward to find a martingale m with stationary increments and finite second moments such that lim E [ g (β 0 ) m (β 0 ) 2] = 0. 2

Moreover, V = l j= l+ E [f(x t, β 0 )f(x t+l, β 0 ) ] When there is no exploitable structure to the martingale approximator, the matrix V is the spectral density at frequency zero. 2.2 Minimizing a Quadratic Form One approach for constructing a GMM estimator is to minimize the quadratic form: b = arg min g (β) W g (β) P for some positive definite weighting matrix W. Alternative weighting matrices W are associated with alternative estimators. Part of the justification for this approach is that β 0 = arg min E[(x t, β)] W g (β) P The GMM estimator mimics this identification scheme by using a sample counterpart. There are a variety of ways to prove consistency of GMM estimators. Hansen (982) established a uniform law of large numbers for random functions when the data generation is stationary and ergodic. This uniformity is applied to show that sup β P g (β) E [f(x t, β)] = 0 and presumes a compact parameter space. The uniformity in the approximation carries over directly the GMM criterion function g (β) W g (β). See ewey and MacFadden for a more complete catalog of approaches of this type. The compactness of the parameter is seldom taken seriously in applications and thus this commonly invoke result is of less use than it might seem. It is substitute for checking tail behavior of the approximating function. This tail behavior can be important in practice, so a direct investigation of it can be fruitful. For models with parameter separation: f(x, β) = Xh(β) where X is a r m matrix constructed from x and h is a one-to-one function mapping P into R m, there is an alternative way to establish consistency. See Hansen (982). Models that are either linear in the variables or models based on matching moments that are nonlinear functions of the underlying parameters can be written in this form. The choice of W = V receives special attention, in part because g (β) V g (β) = χ 2 (r). 3

Since the matrix V is typically not known it can be replaced by a consistent estimator without altering the large sample properties of the estimator. When using martingale approximation, the implied structure of V can often be exploited. When there is no such exploitable structure, the method of ewey and West (Econometrica) can be employed based on approaches from the spectral analysis of time series. For asset pricing models there are other choices of weighting matrix motivated by considerations of misspecification. These are models with parameterized stochastic discount factors and the alternative weighting matrix is constructed by either minimizing the maximum pricing error or equivalently finding the least squares distance between a model of a stochastic discount factor with misspecification and the closest in least squares since stochastic discount factor that prices correctly the asset payoff under consideration in an investigation. See Hansen and Jagannathan (Journal of Finance) and Hansen - Heaton and Luttmer (Review of Financial Studies). 2.3 Selection Matrices An alternative depiction is to introduce a selection matrix a that has dimension k r and solve the equation system: ag (b ) = 0. The selection matric reduces the number of equations to be solved from r to k. Alternative selection matrices are associated with alternative GMM estimators. The selection matrix a reduces the problem to one of selecting among the array of possible equation systems. This approach builds on an approach of Sargan (958) and Sargan (959) and is most useful for characterizing limiting distributions. The aim is to study simultaneously the behavior of a family of estimators. Since alternative choices of a may give rise to alternative GMM estimators, index alternative estimators by the choice of a. Alternative choices of a imply alternative estimators. In what follows, replacing a by a consistent estimator does not alter the limiting distribution. Let [ ] f(xt, β o ) d = E β Then two results are central the study of GMM estimators: b (ad) a g (β 0 ) (2) and g (b ) [ I d(ad) a ] g (β 0 ). (3) These results are obtained by standard local methods. They require that the square matrix ad be nonsingular. For there to exist a valid selection matrix, d must have full column rank k. otice from (3) that the sample moment conditions evaluated at b have a degenerate 4

distribution. Premultiplying by a makes the right-hand side zero. This is to be expected because linear combinations of the sample moment conditions are set to zero in estimation. Let cov(a) = (ad) av a (d a ), which is the asymptotic covariance matrix for a GMM estimator with selection matrix a. otice that a selection matrix in effect over-parameterizes a GMM estimator. Two such estimators with selection matrices of the form a and ea for a nonsingular matrix e imply cov(ea) = cov(a) because the same linear combination of moment conditions are being used in estimation. Thus without loss of generality we may assume that ad = I. With this restriction we may imitate the proof of the famed Gauss-Markov Theorem to show that d V d cov(a) (4) and that the lower bound on left is attained by any ã such that ã = e(d V d) d V for some nonsingular e. The quadratic form version of a GMM estimator satisfies this restriction provided that W is a consistent estimator of V. To explore further the implications of this choice, factor the inverse covariance matrix V as V = Λ Λ and form d = Λd. Then V d(d V d) d V = Λ [ d( d d) d ]Λ The matrices d( d d) d and I d( d d) d are each idempotent and [ ] ([ ] [ ]) [I d( d d) d ] 0 I d( d d) d 0 d( d Λg (β 0 ) ormal, d) d 0. 0 d( d d) d The first coordinate block is an approximation for Λg (b ) and the sum of the two coordinate blocks is an approximation for Λg (β o ). Thus we may decompose the quadratic form [g (β o )] V g (β o ) [g (β )] V g (β ) + [g (β o )] V d(d V d) d V g (β o ). (5) where the two terms on the right-hand side are distributed as independent chi-square. The first has r degrees of freedom and the second one has r k degrees of freedom. 3 Implementation For a parameter vector β let V (β) denote an estimator of the long run covariance matrix. Given an initial consistent estimator b, suppose that V (β) is a consistent estimator of β 0 and d = f(b ) β. t= 5

Then use of the selection a = d [V (b )] attains the efficiency bound for GMM estimators. This is the so-called two step approach to GMM estimation. Repeating this procedure is the so-called iterative estimator. Finally consider the continuous-updating estimator. This is obtained by solving: min L (β) β P where L (β) = [g (β)] [V (β)] g (β) Let b denote the minimized value. While local methods of approximation give one way to construct approximate inferences, consider three alternative approaches: i) {β P : L (β) C} where C is a critical value from a χ 2 (r) distribution. ii) {β P : L (β) L (b ) C} where C is critical value from a χ 2 (k) distribution. iii) Choose a prior π. Mehcanically, treat 2 L (β) as a log-likelihood and compute exp [ L 2 (β) ] π(β) [ exp L 2 (β) ]. π( β)d β Approach i is based on the left-hand side of (5. It was suggested and studied in Hansen et al. (995) and Stock and Wright (Econometrica). As emphasized by Stock and Wright, it avoids using the local identification condition (a rank condition on the matrix d. On the other hand, it combines evidence about the parameter as reflected by the curvature of the objective with overall evidence about the model. A misspecified model will be reflected in an empty confidence interval. Approach ii is based on the second term on right-hand side of (5). By translating the objective function, evidence against the model is netted out. Of course it remains important to consider such evidence as parameter inference may be hard to interpret for a misspecified model. It s advantage is that it that degrees of freedom of the chi-square distribution are reduced from r to k. This approach was used by Hansen and Singleton, Journal of Business and Economic Statistics. Approach iii was suggested by Chernozhukov and Hong (2003). It requires an integrability condition which will be satisfied by specifying a uniform distribution π over a compact parameter space. The resulting histograms can be sensitive to this choice of this set or more generally to the choice of π. All three methods look at the shape of the objective function. 4 Conditional Moment Conditions The bound (4) presumes a finite number of moment conditions and characterizes how to use these moment conditions efficiently. If we start from the conditional moment moment There are no general argument that repeated iteration will converge. 6

restriction: E [f(x t+l, β 0 ) F t ] = 0 then in fact there are many moment conditions at our disposal. Functions of variables in the conditioning information set can be used to extend the number of moment conditions. By allowing for these conditions, we can improve on the asymptotic efficiency bound. For a characterization appropriate for cross sectional data see Chamberlain (Journal of Econometrics), and for a time series characterization see Hansen (Journal of Econometrics, 985). 7

References Arellano, M. 2003. Panel Data Econometrics. ew York: Oxford University Press. Chernozhukov, V. and H. Hong. 2003. An MCMC Approach to Classical Estimation 5:293 346. Cochrane, John. 200. Asset Pricing. Princeton University Press. Gordin, M. I. 969. The Central Limit Theorem for Stationary Processes. Soviet Mathematics Doklady 0:74 76. Hall, A. R. 2005. Generalized Method of Moments. ew York: Oxford University Press. Hall, P. and C. C. Heyde. 980. Martingale Limit Theory and Its Application. Boston: Academic Press. Hansen, L. P., J. Heaton, and E. G. J. Luttmer. 995. Econometric Evaluation of Asset Pricing Models. Review of Financial Studies 8:237 274. Hansen, Lars P. 982. Large Sample Properties of Generalized Method of Moments Estimators. Econometrica 50:029 054. Sargan, J. D. 958. The Estimation of Economic Relationships Using Instrumental Variables. Econometrica 26:393 45.. 959. The Estimation of Relationships with Autocorrelated Residuals by the Use of Instrumental Variables. Journal of the Royal Statistical Society: Series B 2:9 05. Singleton, K. J. 2006. Empirical Dynamic Asset Pricing: Model Specification and Econometric Assessment. Princeton University Press. 8