GMM, HAC estimators, & Standard Errors for Business Cycle Statistics

Similar documents
An estimate of the long-run covariance matrix, Ω, is necessary to calculate asymptotic

Asymptotic distribution of GMM Estimator

Single Equation Linear GMM with Serially Correlated Moment Conditions

Heteroskedasticity and Autocorrelation Consistent Standard Errors

Timevarying VARs. Wouter J. Den Haan London School of Economics. c Wouter J. Den Haan

Single Equation Linear GMM with Serially Correlated Moment Conditions

Chapter 11 GMM: General Formulas and Application

Spatial Regression. 11. Spatial Two Stage Least Squares. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved

Generalized Method of Moments (GMM) Estimation

LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH

HAR Inference: Recommendations for Practice

Testing in GMM Models Without Truncation

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Economic modelling and forecasting

Parameterized Expectations Algorithm

A Course on Advanced Econometrics

Heteroskedasticity- and Autocorrelation-Robust Inference or Three Decades of HAC and HAR: What Have We Learned?

Econ 423 Lecture Notes

LECTURE 18: NONLINEAR MODELS

9. AUTOCORRELATION. [1] Definition of Autocorrelation (AUTO) 1) Model: y t = x t β + ε t. We say that AUTO exists if cov(ε t,ε s ) 0, t s.

Freeing up the Classical Assumptions. () Introductory Econometrics: Topic 5 1 / 94

Dynamic Regression Models (Lect 15)

GMM and SMM. 1. Hansen, L Large Sample Properties of Generalized Method of Moments Estimators, Econometrica, 50, p

Estimating Deep Parameters: GMM and SMM

Econometrics Summary Algebraic and Statistical Preliminaries

Regression with time series

Lecture 4: Heteroskedasticity

Department of Economics, UCSD UC San Diego

Econometrics of Panel Data

Time Series and Forecasting Lecture 4 NonLinear Time Series

GMM - Generalized method of moments

u t = T u t = ū = 5.4 t=1 The sample mean is itself is a random variable. What if we sample again,

Lecture Stat Information Criterion

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

F9 F10: Autocorrelation

HETEROSKEDASTICITY, TEMPORAL AND SPATIAL CORRELATION MATTER

Lecture 7: Dynamic panel models 2

University of Pavia. M Estimators. Eduardo Rossi

Heteroskedasticity in Panel Data

LECTURE 11. Introduction to Econometrics. Autocorrelation

Heteroskedasticity in Panel Data

First published on: 29 November 2010 PLEASE SCROLL DOWN FOR ARTICLE

GMM estimation is an alternative to the likelihood principle and it has been

Econometría 2: Análisis de series de Tiempo

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

Repeated observations on the same cross-section of individual units. Important advantages relative to pure cross-section data

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E.

Econ 582 Fixed Effects Estimation of Panel Data

Economics 582 Random Effects Estimation

Learning in Macroeconomic Models

ECON 4160, Autumn term 2017 Lecture 9

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models

Economics 620, Lecture 14: Lags and Dynamics

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Introduction to Econometrics

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

ECON 366: ECONOMETRICS II. SPRING TERM 2005: LAB EXERCISE #10 Nonspherical Errors Continued. Brief Suggested Solutions

GMM estimation is an alternative to the likelihood principle and it has been

Econometrics - 30C00200

Unit Root and Cointegration

Time series models in the Frequency domain. The power spectrum, Spectral analysis

Generalized Method Of Moments (GMM)

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Multiple Equation GMM with Common Coefficients: Panel Data

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

CAE Working Paper # A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests. Nicholas M. Kiefer and Timothy J.

A New Asymptotic Theory for Heteroskedasticity-Autocorrelation Robust Tests

Simultaneous Equations and Weak Instruments under Conditionally Heteroscedastic Disturbances

Time Series Econometrics For the 21st Century

GMM-based inference in the AR(1) panel data model for parameter values where local identi cation fails

Panel Data Models. James L. Powell Department of Economics University of California, Berkeley

B y t = γ 0 + Γ 1 y t + ε t B(L) y t = γ 0 + ε t ε t iid (0, D) D is diagonal

Questions and Answers on Heteroskedasticity, Autocorrelation and Generalized Least Squares

DSGE Methods. Estimation of DSGE models: GMM and Indirect Inference. Willi Mutschler, M.Sc.

ECONOMETFUCS FIELD EXAM Michigan State University May 11, 2007

Heteroskedasticity. We now consider the implications of relaxing the assumption that the conditional

A New Approach to the Asymptotics of Heteroskedasticity-Autocorrelation Robust Testing

The generalized method of moments

Heteroskedasticity and Autocorrelation

CCP Estimation. Robert A. Miller. March Dynamic Discrete Choice. Miller (Dynamic Discrete Choice) cemmap 6 March / 27

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

DSGE-Models. Limited Information Estimation General Method of Moments and Indirect Inference

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

This chapter reviews properties of regression estimators and test statistics based on

Short T Panels - Review

Introduction to Estimation Methods for Time Series models Lecture 2

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

A New Approach to Robust Inference in Cointegration

ECONOMETRICS HONOR S EXAM REVIEW SESSION

10. Time series regression and forecasting

Economics 308: Econometrics Professor Moody

Optimal Bandwidth Selection in Heteroskedasticity-Autocorrelation Robust Testing

Lecture 6: Dynamic panel models 1

Heteroscedasticity and Autocorrelation

Pseudo-marginal MCMC methods for inference in latent variable models

Trend-Cycle Decompositions

Accurate Asymptotic Approximation in the Optimal GMM Frame. Stochastic Volatility Models

Final Exam. Economics 835: Econometrics. Fall 2010

6. Assessing studies based on multiple regression

Transcription:

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics Wouter J. Den Haan London School of Economics c Wouter J. Den Haan

Overview Generic GMM problem Estimation Heteroskedastic and Autocorrelation Consistent (HAC) estimators to calcuate optimal weighting matrix and standard errors Simple applications OLS with correct standard errors IV with multiple instruments standard errors for business cycle statistics

GMM problem Underlying true model: E [h(x t ; θ)] = 0 p θ : m 1 vector with parameters x : n 1 vector of observables h( ) : p 1 vector-valued function with p m 0 p : p 1 vector with zeros

Examples OLS: q t = a + bp t + u t [ ut E u t p t ] = [ 0 0 ] IV: q t = a + bp t + u t [ ut E u t z t ] = [ 0 0 ]

Examples DSGE: [ ] c γ t = E t β (1 + r t+1 ) c γ t+1 u t+1 = β (1 + r t+1 ) c γ t+1 c γ t E t [u t+1 f (z t )] = 0 p = E [u t+1 f (z t )] = 0 p where z t is a vector with variables in information set in period t and f ( ) is a measurable function

GMM estimation Let where Y T contains data g(θ; Y T ) = T h (x t ; θ) /T, t=1 Idea of estimation: choose θ such that g(θ; Y T ) is as small as possible Throughout the slides, remember that g(θ; Y T ) is a mean

GMM estimation θ T = arg min θ Θ g(θ; Y T) W T g(θ; Y T ) where W T : p p weighting matrix No weighting matrix needed if p = m

Asymptotic standard errors T ( θ T θ 0 ) N(0, V) V = ( DWD ) 1 DWΣ0 W D ( DWD ) 1

Asymptotic standard errors W = plim W T T D = plim T θ 0 : true θ Σ 0 = E j= g (θ; Y T ) θ θ=θ0 [h (x t ; θ 0 ) h ( x t j ; θ 0 ) ]

Terms showing up in V D measures how sensitive g is to changes in θ 0. less precise estimates of θ 0 if g is not very sensitive to changes in θ. Σ 0 is the variance-covariance matrix of Tg(θ 0 ; Y T ) as T. if the mean underlying the estimation is more volatile = estimates of θ 0 less precise obtaining an estimate for Σ 0 is often the tricky bit (more on this below)

Two cases when formula for V simplifies 1 p = m (no overidentifying restrictions) T ( θ T θ 0 ) V = N(0, V) ( DΣ 1 0 D ) 1 2 using optimal weighting matrix, i.e., the matrix W that minimizes V. W optimal = Σ 1 0 T ( θ T θ 0 ) V = N(0, V) ( DΣ 1 0 D ) 1

Example 1 Y T = {x t } T t=1 and we want to estimate the mean µ h (x t ; µ) = x t µ g (µ; Y T ) = T t=1 (x t µ) T µ T = T t=1 x t T

Example 1 D = 1 Σ T equals the variance of ( T equals variance of T ( T t=1 x t ) T t=1 (x t µ) /T, which ) /T Variance of T t=1 x t equals variance of x 1 + covariance of x 1 &x 2 + + covariance of x 1 &x T + covariance of x 1 &x 2 + variance x 2 + etc. IF x t serially uncorrelated, then variance of ( T equals variance of x t T t=1 x t ) /T

Example 2 x 1,t and x 2,t have the same mean for simplicity assume that x 1,t and x 2,t are not correlated with each other and are not serially correlated Y T = {x 1,t, x 2,t } T t=1 h (x t ; µ) = Σ 0 = W = [ ] x1,t µ x 2,t µ & D = [ 1 1 ] [ σ 2 x1 0 ] 0 σ 2 x 2 [ 1/σ 2 x1 0 ] 0 1/σ 2 x 2

Estimating variance-covariance matrix g(θ; Y T ) is the mean of h(x t ; θ) variance of g(θ; Y T ) is easy to calculate if h (x t ; θ) is serially uncorrelated but in general it is diffi cult

Estimate variance of a mean The (limit of the) variance-covariance of Tg(θ; Y T ) equals Σ 0 = E [h (x t ; θ 0 ) h ( ) ] x t j ; θ 0 j= This is the spectral density of h (x t ; θ 0 ) at frequency 0 Since Σ 0 is a variance-covariance matrix, it should be positive semi-definite (PSD), that is, z Σ 0 z should be non-negative for any non-zero column vector z

HAC estimators 1 Kernel-based (truncated, Newey-West, Andrews) 2 Parametric (Den Haan & Levin)

Estimate variance of a mean Σ T = J [h κ(j; J) min{t,t+j} (x t=max{1,j+1} t ; θ 0 ) h ( ) ] x t j ; θ 0 T j= J where κ( ; J) is the kernel with bandwidth parameter J.

Truncated kernel Truncated κ(j; J) = { 1 if j J 0 o.w. disadvantages potential bias because autocorrelation at j > J is ignored answer not necessarily positive semi-definite

Newey-West (Bartlett) kernel κ(j; J) = 1 j J + 1 for j J Newey-West kernel always gives PSD disadvantages potential bias because autocorrelation at j > J is ignored bias because k (j; J) < 1 for included terms

Choice of bandwidth parameter a high (low) J leads to less (more) bias more (less) sampling variability J should be higher if h ( ; θ 0 ) is more persistent J optimal as T but at a slower rate

Choice of bandwidth parameter There are papers that give expressions for J optimal, but adding a constant to these expressions does not affect optimality (that is, the analysis only gives an optimal rate). Thus, in a finite sample you simply have to experiment and hope your answer is robust

Cost of imposing PSD Suppose h (x t ; θ 0 ) is an MA(1) = E [ h (x t ; θ 0 ) h ( x t j ; θ 0 )] = 0 for j > 1 If J = 1 then you have a biased estimate since k(1; 1) = 1/2 If J > 1 then you must include estimates of expectations that we know are zero Suppose h(x t ; θ 0 ) has a persistent and not persistent component = you must use the same J for both components

VARHAC Idea: Estimate a flexible time-series process (VAR) for h t, that is, h (x t ; θ 0 ) = J A j h ( ) x t j ; θ 0 + εt j=1 Use implied spectrum at frequency zero as the estimate for Σ 0

VARHAC h (x t ; θ 0 ) = J A j h ( ) x t j ; θ 0 + εt j=1 Then the implied spectrum at frequency zero, i.e., Σ 0, equals Σ T = [ I p J j=1 A j ] 1 Σ ε,t [I p ] J 1 A j j=1 with Σ ε,t = T j=j+1 ε tε t T

VARHAC Estimate is PSD by construction (PSD is obtained without imposing additional bias) Only bias is due to lag length potentially being too short You can use standard model selection criteria (AIC, BIC) to determine lag length You could estimate a VARMA VARHAC also gives a consistent estimate for nonlinear processes since Σ 0 only depends on second-order processes which can be captured with VAR

Back to GMM Exactly identified case: estimate θ T and calculate V T Over-identified case: You need W T to estimate θ T W optimal T depends on Σ 0, which depends on θ 0

Back to GMM What to do in practice? obtain (consistent) estimate of θ 0 with simple W T (identity although scaling is typically a good idea) use this estimate to calculate W optimal T use W optimal T to get a more effi cient estimate of θ 0 calculate variance of θ T using ( D T Σ T 1 D ) 1 T you could iterate on this

Examples OLS with heteroskedastic and serially correlated errors IV with multiple instruments Business cycle statistics

Key lesson of today s lecture If you can write the estimation problem as E [h(x t ; θ] = 0 p then you can use GMM and we know how to calculate standard errors

OLS q t = θp t + u t h (x t ) = u t p t = q t p t θp 2 t

OLS & GMM ) T ( θ T θ 0 N(0, V) V = ( DΣ0 1 1 ( ) D = E p 2 t

OLS & GMM ( ( ( )) ( ( )) ) 1 V = E p 2 t Σ0 1 E p 2 t If E[u t u t+τ ] = 0 for τ = 0, then [ Σ0 1 = E (u t p t ) 2] and ( ( ( )) [ V = E p 2 t E (u t p t ) 2] ( ( )) ) 1 E p 2 t If errors are homoskedastic, then [ E (u t p t ) 2] [ = E ( V = E u 2 t ] ( p 2 t [ E p 2 t ] )) 1 E [ u 2 t ]

OLS & GMM Suppose that q t = θp t + u t u t = ε t p t ε t i.i.d, E t [p t ε t ] = 0, and ε t homoskedastic Then you should do GLS q t p t = θ + ε t GMM does not by itself do the transformation, i.e., it does not do GLS, but it does give you standard errors that correct for heteroskedasticity

IV Suppose that q t = θp t + u t, E [u t z 1,t ] = 0 and E [u t z 2,t ] = 0 With GMM you can find the optimal weighting of the two instruments

Business cycles statistics 1 2 ψ = standard deviation (c t) standard deviation (y t ) ρ = correlation (c t, y t ) Statistics are easy to estimate, but what is the standard error?

Business cycles statistics GMM problem for ψ h (x t ; θ 0 ) = c t µ c y t µ y ( y t µ y ) 2 ψ 2 (c t µ c ) 2

Business cycles statistics corresponding D matrix D = = 1 0 0 0 1 0 ) [ ( 2E [(c t µ c )] 2E [(y t µ y ψ 2] ) ] 2 2E y t µ y ψ 1 0 0 0 1 0 [ ( ) ] 2 0 0 2E y t µ y ψ

Steps to follow 1 Use standard estimates for µ c, µ y, and ψ 2 Obtain estimate for D 3 Obtain estimate for Σ 0 using ) h (x t ; θ T = ( y t µ y,t ) 2 ψ 2 c t µ c,t y t µ y,t ) 2 (c t µ c,t 4 Obtain estimate for V using ( D T Σ T 1 D ) 1 T

Business cycles statistics GMM problem for ρ h ( x t; θ ) = c t µ c y t µ ( ) y 2 y t µ y ψ 2 (c t µ c ) 2 ( ) 2 ) y t µ y ψρ (ct µ c ) (y t µ y

Business cycles statistics corresponding D matrix D = 1 0 0 0 0 1 0 0 [ ( ) ] 2 0 0 2E y t µ y ψ 0 [ ( ) ] [ 2 ( ) ] 2 0 0 E y t µ y ρ E y t µ y ψ

References Den Haan, W.J., and A. Levin, 1997, A practioner s guide to robust covariance matrix estimation a detailed survey of all the ways to estimate Σ 0 with more detailed information