LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH


LECTURE ON HAC COVARIANCE MATRIX ESTIMATION AND THE KVB APPROACH
CHUNG-MING KUAN
Institute of Economics, Academia Sinica
October 20, 2006
ckuan@econ.sinica.edu.tw
www.sinica.edu.tw/~ckuan

Outline (C.-M. Kuan, HAC.1)

- Preliminary
- HAC (heteroskedasticity and autocorrelation consistent) covariance matrix estimation
  - Kernel HAC estimators
  - Choices of the kernel function and bandwidth
- KVB approach: constructing asymptotically pivotal tests without consistent estimation of the asymptotic covariance matrix
  - Tests of parameters
  - M tests for general moment conditions
  - Over-identifying restrictions tests

Preliminary (C.-M. Kuan, HAC.2)

Consider the specification: $y_t = x_t'\beta + e_t$, $t = 1, \ldots, T$.

[A1] For some $\beta_o$, $\epsilon_t = y_t - x_t'\beta_o$ is such that $\mathbb{E}(x_t\epsilon_t) = 0$ and
$$T^{-1/2}\sum_{t=1}^{[Tr]} x_t\epsilon_t \Rightarrow S_o\,W_k(r), \qquad r \in [0,1],$$
where $W_k$ is a $k$-dimensional standard Wiener process, $\Sigma_o = \lim_{T\to\infty}\mathrm{var}\big(T^{-1/2}\sum_{t=1}^{T} x_t\epsilon_t\big)$, and $S_o$ is its matrix square root (i.e., $\Sigma_o = S_o S_o'$).

[A2] $M_{[Tr]} := [Tr]^{-1}\sum_{t=1}^{[Tr]} x_t x_t' \xrightarrow{\;\mathbb{P}\;} M_o$ uniformly in $r \in (0,1]$.

Asymptotic normality of the OLS estimator:
$$\sqrt{T}(\hat\beta_T - \beta_o) = M_T^{-1}\,T^{-1/2}\sum_{t=1}^{T} x_t\epsilon_t \xrightarrow{\;D\;} M_o^{-1}S_o\,W_k(1),$$
which has the distribution $N(0,\,M_o^{-1}\Sigma_o M_o^{-1})$.

(C.-M. Kuan, HAC.3)

The null hypothesis is $R\beta_o = r$, where $R$ ($q \times k$) has full row rank. Then
$$\sqrt{T}(R\hat\beta_T - r) \xrightarrow{\;D\;} N(0,\,R M_o^{-1}\Sigma_o M_o^{-1}R'),$$
$$\sqrt{T}\,\big(R M_T^{-1}\hat\Sigma_T M_T^{-1}R'\big)^{-1/2}(R\hat\beta_T - r) \xrightarrow{\;D\;} N(0,\,I_q).$$
The Wald test is
$$\mathcal{W}_T = T\,(R\hat\beta_T - r)'\big(R M_T^{-1}\hat\Sigma_T M_T^{-1}R'\big)^{-1}(R\hat\beta_T - r) \xrightarrow{\;D\;} \chi^2(q).$$
Consistent estimation of $\Sigma_o$ is crucial; $\mathcal{W}_T$ would not have a limiting $\chi^2$ distribution if $\hat\Sigma_T$ were not a consistent estimator. Other large-sample tests (e.g., the LM test) also depend on consistent estimation of the asymptotic covariance matrix. When $\hat\Sigma_T$ is consistent, the resulting tests are said to be robust to heteroskedasticity and serial correlations of unknown form.
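The statistic above is straightforward to compute once an estimate of $\Sigma_o$ is available. The following is a minimal Python/numpy sketch (the function name and arguments are mine, not from the lecture); `Sigma_hat` is taken as an input because its construction is precisely the subject of the next slides.

```python
# Minimal sketch of the robust Wald statistic of slide HAC.3 (illustrative helper only).
import numpy as np

def robust_wald(X, beta_hat, Sigma_hat, R, r):
    """W_T = T (R b - r)' [R M^{-1} Sigma M^{-1} R']^{-1} (R b - r); compare with chi^2(q)."""
    T = X.shape[0]
    M = X.T @ X / T                        # M_T = T^{-1} sum_t x_t x_t'
    Minv = np.linalg.inv(M)
    V = R @ Minv @ Sigma_hat @ Minv @ R.T  # asy. covariance of sqrt(T) R (beta_hat - beta_o)
    d = R @ beta_hat - r
    return T * d @ np.linalg.solve(V, d)
```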

Asymptotic Covariance Matrix $\Sigma_o$ (C.-M. Kuan, HAC.4)

A general form:
$$\Sigma_o = \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\sum_{s=1}^{T}\mathbb{E}(x_t\epsilon_t\epsilon_s x_s') = \lim_{T\to\infty}\sum_{j=-T+1}^{T-1}\Gamma_T(j),$$
with autocovariances
$$\Gamma_T(j) = \begin{cases} \dfrac{1}{T}\displaystyle\sum_{t=j+1}^{T}\mathbb{E}(x_t\epsilon_t\epsilon_{t-j}x_{t-j}'), & j = 0, 1, 2, \ldots, \\[1ex] \dfrac{1}{T}\displaystyle\sum_{t=-j+1}^{T}\mathbb{E}(x_{t+j}\epsilon_{t+j}\epsilon_t x_t'), & j = -1, -2, \ldots. \end{cases}$$
When $x_t\epsilon_t$ are covariance stationary, $\Gamma_T(j) = \Gamma(j)$, and the spectral density of $x_t\epsilon_t$ at frequency $\omega$ is
$$f(\omega) = \frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\Gamma(j)\,e^{-ij\omega}.$$
Thus $\Sigma_o = 2\pi f(0)$ and hence is also known as the long-run variance of $x_t\epsilon_t$.
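All the estimators discussed below are built from the sample analogues of these autocovariances, with $x_t\epsilon_t$ replaced by $x_t\hat e_t$. A minimal numpy sketch (the function name is mine; the rows of `V` are assumed to hold $x_t\hat e_t$):

```python
# Sample autocovariance Gamma_hat(j) of v_t = x_t * e_hat_t, as in slides HAC.4 and HAC.6.
import numpy as np

def sample_autocov(V, j):
    """Gamma_hat(j) = T^{-1} sum_{t=j+1}^T v_t v_{t-j}' for j >= 0; Gamma_hat(-j) = Gamma_hat(j)'."""
    T = V.shape[0]
    jj = abs(j)
    G = V[jj:].T @ V[:T - jj] / T
    return G if j >= 0 else G.T
```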

Examples of $\Sigma_o$ (C.-M. Kuan, HAC.5)

When $x_t\epsilon_t$ are serially uncorrelated:
$$\Sigma_o = \lim_{T\to\infty}\Gamma_T(0) = \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}(\epsilon_t^2 x_t x_t'),$$
which can be consistently estimated by White's heteroskedasticity-consistent estimator:
$$\hat\Sigma_T = \frac{1}{T}\sum_{t=1}^{T}\hat e_t^2\,x_t x_t'.$$
When $x_t\epsilon_t$ are serially uncorrelated and $\epsilon_t$ is conditionally homoskedastic:
$$\Sigma_o = \sigma_o^2\lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}(x_t x_t') = \sigma_o^2 M_o,$$
which can be consistently estimated by $\hat\Sigma_T = \hat\sigma_T^2 M_T$. Estimating $\Sigma_o$ is much more difficult when heteroskedasticity and serial correlations are present and of unknown form.

A Consistent Estimator of $\Sigma_o$ (C.-M. Kuan, HAC.6)

Recall $\Sigma_o = \lim_{T\to\infty}\sum_{j=-T+1}^{T-1}\Gamma_T(j)$. A consistent estimator (White, 1984):
$$\hat\Sigma_T = \sum_{j=-l(T)}^{l(T)}\hat\Gamma_T(j),$$
with sample autocovariances
$$\hat\Gamma_T(j) = \begin{cases} \dfrac{1}{T}\displaystyle\sum_{t=j+1}^{T} x_t\hat e_t\hat e_{t-j}x_{t-j}', & j = 0, 1, 2, \ldots, \\[1ex] \dfrac{1}{T}\displaystyle\sum_{t=-j+1}^{T} x_{t+j}\hat e_{t+j}\hat e_t x_t', & j = -1, -2, \ldots, \end{cases}$$
where $l(T)$ grows with $T$ but at a slower rate.

Drawbacks:
- This estimator is not guaranteed to be a positive semi-definite matrix.
- One must determine $l(T)$ in a given sample.

Kernel HAC Estimators (C.-M. Kuan, HAC.7)

Newey and West (1987) and Gallant (1987): A consistent estimator that is also positive semi-definite is
$$\hat\Sigma_T^{\kappa} = \sum_{j=-T+1}^{T-1}\kappa\!\Big(\frac{j}{l(T)}\Big)\hat\Gamma_T(j),$$
where $\kappa$ is a kernel function and $l(T)$ is its bandwidth.
- $\kappa$ and $l(T)$ jointly determine the weighting scheme on $\hat\Gamma_T(j)$ and must be selected by users.
- For a given $j$, $\kappa(j/l(T)) \to 1$ as $T \to \infty$, so as to ensure consistency.
- For a given $T$, $\kappa(j/l(T))$ should be small when $j$ is large, so as to ensure positive semi-definiteness; more formally (Andrews, 1991), $\int \kappa(x)e^{-ix\omega}\,dx \ge 0$ for all $\omega \in \mathbb{R}$.
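A compact sketch of this estimator (my own coding, under the assumption that the rows of `V` hold $x_t\hat e_t$). The Bartlett kernel of the next slide is used as the default weight, so the default behaves like a Newey-West-type estimator, but any kernel $\kappa$ and bandwidth $l(T)$ can be passed in.

```python
# Kernel HAC estimator of slide HAC.7: Sigma_hat = sum_j kappa(j / l(T)) * Gamma_hat(j).
import numpy as np

def kernel_hac(V, bandwidth, kernel=lambda x: np.maximum(1.0 - np.abs(x), 0.0)):
    T, k = V.shape
    Sigma = V.T @ V / T                # Gamma_hat(0)
    for j in range(1, T):
        w = float(kernel(j / bandwidth))
        if w == 0.0:
            continue                   # kernels with bounded support truncate here
        G = V[j:].T @ V[:T - j] / T    # Gamma_hat(j)
        Sigma += w * (G + G.T)         # Gamma_hat(j) + Gamma_hat(-j)
    return Sigma
```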

Kernel Functions (C.-M. Kuan, HAC.8)

Bartlett kernel (Newey and West, 1987):
$$\kappa(x) = \begin{cases} 1 - |x|, & |x| \le 1, \\ 0, & \text{otherwise;} \end{cases}$$
Parzen kernel (Gallant, 1987):
$$\kappa(x) = \begin{cases} 1 - 6x^2 + 6|x|^3, & |x| \le 1/2, \\ 2(1-|x|)^3, & 1/2 \le |x| \le 1, \\ 0, & \text{otherwise;} \end{cases}$$
Quadratic spectral kernel (Andrews, 1991):
$$\kappa(x) = \frac{25}{12\pi^2 x^2}\Big(\frac{\sin(6\pi x/5)}{6\pi x/5} - \cos(6\pi x/5)\Big);$$
Daniell kernel (Ng and Perron, 1996):
$$\kappa(x) = \frac{\sin(\pi x)}{\pi x}.$$
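For reference, these kernels can be coded directly as vectorized functions of $x$ (my own coding; the only subtlety is guarding the removable singularity at $x = 0$ for the quadratic spectral and Daniell kernels, whose limiting value there is 1).

```python
# The Bartlett, Parzen, quadratic spectral, and Daniell kernels of slide HAC.8.
import numpy as np

def bartlett(x):
    x = np.abs(x)
    return np.where(x <= 1.0, 1.0 - x, 0.0)

def parzen(x):
    x = np.abs(x)
    return np.where(x <= 0.5, 1.0 - 6.0 * x**2 + 6.0 * x**3,
                    np.where(x <= 1.0, 2.0 * (1.0 - x)**3, 0.0))

def quadratic_spectral(x):
    x = np.asarray(x, dtype=float)
    xs = np.where(x == 0.0, 1.0, x)    # placeholder value to avoid 0/0
    a = 6.0 * np.pi * xs / 5.0
    val = 25.0 / (12.0 * np.pi**2 * xs**2) * (np.sin(a) / a - np.cos(a))
    return np.where(x == 0.0, 1.0, val)

def daniell(x):
    x = np.asarray(x, dtype=float)
    xs = np.where(x == 0.0, 1.0, x)
    return np.where(x == 0.0, 1.0, np.sin(np.pi * xs) / (np.pi * xs))
```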

Bandwidths (C.-M. Kuan, HAC.9)

Andrews (1991): By minimizing the MSE, the optimal bandwidth growth rates are
$$l^*(T) = 1.1447\,(c_1 T)^{1/3} \quad\text{(Bartlett)},$$
$$l^*(T) = 2.6614\,(c_2 T)^{1/5} \quad\text{(Parzen)},$$
$$l^*(T) = 1.3221\,(c_2 T)^{1/5} \quad\text{(quadratic spectral)},$$
where $c_1$ and $c_2$ are unknown numbers depending on the spectral density, and $\sqrt{T/l^*(T)}\,\big(\hat\Sigma_T^{\kappa} - \Sigma_o\big) = O_{\mathbb{P}}(1)$.

As far as MSE is concerned, the quadratic spectral kernel is to be preferred: $\hat\Sigma_T^{B} - \Sigma_o = O_{\mathbb{P}}(T^{-1/3})$, whereas $\hat\Sigma_T^{P} - \Sigma_o$ and $\hat\Sigma_T^{QS} - \Sigma_o$ are $O_{\mathbb{P}}(T^{-2/5})$; $\hat\Sigma_T^{QS}$ is more efficient than $\hat\Sigma_T^{P}$, and $\hat\Sigma_T^{B}$ is the least efficient.
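These growth rates are simple to encode. The sketch below (my own helper) leaves $c_1$ and $c_2$ as user-supplied placeholders, since in practice they must come from the plug-in step mentioned on the next slide.

```python
# Andrews (1991) optimal bandwidth growth rates of slide HAC.9; c1 and c2 are unknown
# constants depending on the spectral density and must be supplied by the user.
def andrews_bandwidth(T, kernel="bartlett", c1=1.0, c2=1.0):
    if kernel == "bartlett":
        return 1.1447 * (c1 * T) ** (1.0 / 3.0)
    if kernel == "parzen":
        return 2.6614 * (c2 * T) ** (1.0 / 5.0)
    if kernel == "qs":
        return 1.3221 * (c2 * T) ** (1.0 / 5.0)
    raise ValueError("unknown kernel")
```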

Problems with the Kernel Estimators (C.-M. Kuan, HAC.10)

- The performance of the kernel HAC estimator varies with the choices of the kernel and its bandwidth.
- The kernel weighting scheme yields negative bias, and such bias can be substantial in finite samples. The tests based on HAC estimators therefore usually over-reject the null.
- The choices of kernel and bandwidth are somewhat arbitrary in practice, and hence the statistical inferences are vulnerable.
- The HAC estimator with the quadratic spectral kernel need not have better performance in finite samples.
- Andrews (1991) suggested a plug-in method to estimate the optimal growth rate $l^*(T)$, but this method requires estimation of a user-selected model to determine $c_1$ and $c_2$.

Other Improved HAC Estimators (C.-M. Kuan, HAC.11)

- Andrews and Monahan (1992): Pre-whitened estimator. Apply a VAR model to whiten $x_t\hat e_t$ and estimate the covariance matrix based on its residuals. The choices of the model for pre-whitening and of the VAR lag order are, again, arbitrary.
- Kuan and Hsieh (2006): Compute sample autocovariances based on the forecast errors $y_t - x_t'\hat\beta_{t-1}$ instead of the OLS residuals. This does not require another user-chosen parameter. It yields a smaller bias (but a larger MSE); the resulting tests have more accurate size without sacrificing power. Bias reduction thus seems more important for improving HAC estimators.

KVB Approach (C.-M. Kuan, HAC.12)

Kiefer, Vogelsang, and Bunzel (2000): A Wald-type test is
$$\mathcal{W}_T^{\dagger} = T\,(R\hat\beta_T - r)'\big(R M_T^{-1}\hat C_T M_T^{-1}R'\big)^{-1}(R\hat\beta_T - r),$$
where a normalizing matrix $\hat C_T$ is used in place of $\hat\Sigma_T^{\kappa}$. $\hat C_T$ is inconsistent for $\Sigma_o$ but is able to eliminate the nuisance parameters in $\Sigma_o$.

Advantages:
- One does not have to choose a kernel bandwidth.
- The resulting test remains asymptotically pivotal.
- The limiting distribution of the test approximates the finite-sample distribution very well (i.e., little size distortion).

KVB's Normalizing Matrix (C.-M. Kuan, HAC.13)

Let $\hat\varphi_t = T^{-1/2}\sum_{i=1}^{t} x_i\hat e_i$. The normalizing matrix $\hat C_T$ is
$$\hat C_T = \frac{1}{T}\sum_{t=1}^{T}\hat\varphi_t\hat\varphi_t' = \frac{1}{T^2}\sum_{t=1}^{T}\Big(\sum_{i=1}^{t} x_i\hat e_i\Big)\Big(\sum_{i=1}^{t}\hat e_i x_i'\Big).$$
The limit of $\hat\varphi_{[Tr]}$:
$$\hat\varphi_{[Tr]} = \frac{1}{\sqrt{T}}\sum_{i=1}^{[Tr]} x_i\epsilon_i - \frac{[Tr]}{T}\Big(\frac{1}{[Tr]}\sum_{i=1}^{[Tr]} x_i x_i'\Big)\sqrt{T}(\hat\beta_T - \beta_o) \Rightarrow S_o W_k(r) - r M_o M_o^{-1}S_o W_k(1) = S_o B_k(r).$$
Hence,
$$\hat C_T \Rightarrow S_o\Big(\int_0^1 B_k(r)B_k(r)'\,dr\Big)S_o' =: S_o P_k S_o'.$$
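A numpy sketch of $\hat C_T$ (my own coding; `X` holds the regressors row-wise and `ehat` the OLS residuals):

```python
# KVB normalizing matrix of slide HAC.13, built from partial sums of x_t * e_hat_t.
import numpy as np

def kvb_C(X, ehat):
    """C_hat = T^{-1} sum_t phi_t phi_t', with phi_t = T^{-1/2} sum_{i<=t} x_i e_hat_i."""
    T = X.shape[0]
    V = X * ehat[:, None]                    # rows are x_t * e_hat_t
    phi = np.cumsum(V, axis=0) / np.sqrt(T)  # phi_t for t = 1, ..., T
    return phi.T @ phi / T
```

Substituting this matrix for `Sigma_hat` in the `robust_wald` sketch given after slide HAC.3 yields the KVB statistic $\mathcal{W}_T^{\dagger}$; no kernel or bandwidth choice is involved.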

(C.-M. Kuan, HAC.14)

Let $G_o$ denote the matrix square root of $R M_o^{-1}S_o S_o' M_o^{-1}R'$. Then
$$R M_T^{-1}\hat C_T M_T^{-1}R' \Rightarrow R M_o^{-1}S_o P_k S_o' M_o^{-1}R' \stackrel{d}{=} G_o P_q G_o',$$
and
$$\sqrt{T}\,R(\hat\beta_T - \beta_o) \xrightarrow{\;D\;} R M_o^{-1}S_o W_k(1) \stackrel{d}{=} G_o W_q(1).$$
So $\mathcal{W}_T^{\dagger}$ is asymptotically pivotal:
$$\mathcal{W}_T^{\dagger} \xrightarrow{\;D\;} W_q(1)'G_o'\big(G_o P_q G_o'\big)^{-1}G_o W_q(1) = W_q(1)'P_q^{-1}W_q(1).$$
Lobato (2001) reported some quantiles; cf. Kiefer et al. (2000). For the null of $\beta_i = r$, a t-type test is
$$t_T^{\dagger} = \frac{\sqrt{T}(\hat\beta_{i,T} - r)}{\hat\delta_i} \xrightarrow{\;D\;} \frac{W(1)}{\big[\int_0^1 B(r)^2\,dr\big]^{1/2}}.$$
This distribution is more dispersed than the standard normal distribution.
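The limiting variable $W_q(1)'P_q^{-1}W_q(1)$ is nonstandard, and its quantiles are those tabulated by Lobato (2001) and Kiefer et al. (2000). Purely as an illustration (not the lecture's method), one can also approximate these quantiles by Monte Carlo, replacing the Wiener process with normalized partial sums of i.i.d. normal draws:

```python
# Simulating quantiles of W_q(1)' P_q^{-1} W_q(1), where P_q = int_0^1 B_q(r) B_q(r)' dr.
import numpy as np

def simulate_kvb_quantile(q=1, n_steps=500, n_rep=10000, prob=0.95, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(1.0 / n_steps, 1.0, n_steps)[:, None]
    stats = np.empty(n_rep)
    for rep in range(n_rep):
        dW = rng.standard_normal((n_steps, q)) / np.sqrt(n_steps)
        W = np.cumsum(dW, axis=0)            # W_q(r) on a grid of r values
        B = W - grid * W[-1]                 # Brownian bridge B_q(r) = W_q(r) - r W_q(1)
        P = B.T @ B / n_steps                # approximates P_q
        stats[rep] = W[-1] @ np.linalg.solve(P, W[-1])
    return np.quantile(stats, prob)
```

For $q = 1$, the square roots of the simulated draws approximate the distribution of the absolute value of the t-type statistic above.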

Kernel-Based Normalizing Matrices (C.-M. Kuan, HAC.15)

Kiefer and Vogelsang (2002a): $2\hat C_T = \hat\Sigma_T^{B}$ without truncation, i.e., $l(T) = T$. The usual Wald test based on $\hat\Sigma_T^{B}$ without truncation is thus the same as $\mathcal{W}_T^{\dagger}/2$. In particular, the t test based on $\hat\Sigma_T^{B}$ without truncation is also $t_T^{\dagger}/\sqrt{2}$.

Kiefer and Vogelsang (2002b): $\hat\Sigma_T^{\kappa}$ without truncation $\Rightarrow S_o P_k^{\kappa} S_o'$, with
$$P_k^{\kappa} = -\int_0^1\!\!\int_0^1 \kappa''(r-s)\,B_k(r)B_k(s)'\,dr\,ds;$$
the Wald test based on $\hat\Sigma_T^{\kappa}$ without truncation can also serve as a KVB-type robust test.

A test based on $\hat\Sigma_T^{B}$ without truncation compares favorably with that based on $\hat\Sigma_T^{QS}$ in terms of test power. Hence, the Bartlett kernel is to be preferred in constructing a KVB test, in contrast with HAC estimation.

M Tests (C.-M. Kuan, HAC.16)

The null hypothesis: $\mathbb{E}[f(\eta_t;\theta_o)] = 0$, where $\theta_o$ is the $k \times 1$ true parameter vector and $f$ is a $q \times 1$ vector of functions.

$\theta_o$ Is Known

Define $m_{[Tr]}(\theta) = [Tr]^{-1}\sum_{t=1}^{[Tr]} f(\eta_t;\theta)$ for $r \in (0,1]$. An M test is based on $m_T(\theta_o)$, the sample counterpart of the null. By a CLT,
$$T^{1/2}m_T(\theta_o) \xrightarrow{\;D\;} N(0,\,\Sigma_o),$$
and the conventional M test is
$$T\,m_T(\theta_o)'\hat\Sigma_T^{-1}m_T(\theta_o) \xrightarrow{\;D\;} \chi^2(q),$$
when $\hat\Sigma_T$ is a consistent estimator of $\Sigma_o$. The limiting $\chi^2$ distribution hinges on consistent estimation of $\Sigma_o$.

(C.-M. Kuan, HAC.17)

[B1](a) Under the null, $\dfrac{[Tr]}{\sqrt{T}}\,m_{[Tr]}(\theta_o) \Rightarrow S_o W_q(r)$ for $0 \le r \le 1$, where $S_o$ is the nonsingular matrix square root of $\Sigma_o$.

Let
$$C_T(\theta_o) = \frac{1}{T}\sum_{t=1}^{T}\varphi_t(\theta_o)\varphi_t(\theta_o)' \quad\text{with}\quad \varphi_t(\theta_o) = \frac{1}{\sqrt{T}}\sum_{i=1}^{t}\big[f(\eta_i;\theta_o) - m_T(\theta_o)\big].$$
Analogous to KVB's Wald-type test, an M test is
$$\mathcal{M}_T = T\,m_T(\theta_o)'\,C_T(\theta_o)^{-1}\,m_T(\theta_o) \xrightarrow{\;D\;} W_q(1)'P_q^{-1}W_q(1).$$
By [B1](a), $T^{1/2}m_T(\theta_o) \Rightarrow S_o W_q(1)$,
$$\varphi_{[Tr]}(\theta_o) \Rightarrow S_o\big[W_q(r) - rW_q(1)\big] = S_o B_q(r), \qquad 0 \le r \le 1,$$
and $C_T(\theta_o) \Rightarrow S_o P_q S_o'$ with $P_q = \int_0^1 B_q(r)B_q(r)'\,dr$.

$\theta_o$ Is Unknown (C.-M. Kuan, HAC.18)

Replace $\theta_o$ in $m_T$ and $\varphi_t$ with a root-$T$ consistent estimator $\hat\theta_T$ that satisfies
$$\sqrt{T}(\hat\theta_T - \theta_o) = Q_o\Big[\frac{1}{\sqrt{T}}\sum_{t=1}^{T} q(\eta_t;\theta_o)\Big] + o_{\mathbb{P}}(1).$$
Kuan and Lee (2006): The M test is
$$\mathcal{M}_T = T\,m_T(\hat\theta_T)'\,\hat C_T^{-1}\,m_T(\hat\theta_T),$$
where
$$\hat C_T = C_T(\hat\theta_T) = \frac{1}{T}\sum_{t=1}^{T}\varphi_t(\hat\theta_T)\varphi_t(\hat\theta_T)' \quad\text{with}\quad \varphi_t(\hat\theta_T) = \frac{1}{\sqrt{T}}\sum_{i=1}^{t}\big[f(\eta_i;\hat\theta_T) - m_T(\hat\theta_T)\big].$$
The limit of $\mathcal{M}_T$ depends on the estimation effect of replacing $\theta_o$ with $\hat\theta_T$.

(C.-M. Kuan, HAC.19)

[B1](b) Under the null,
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\begin{pmatrix} f(\eta_t;\theta_o) \\ q(\eta_t;\theta_o) \end{pmatrix} \xrightarrow{\;D\;} G_o W_{q+k}(1),$$
where $G_o$ is nonsingular.

[B2] $F_{[Tr]}(\theta_o) = [Tr]^{-1}\sum_{t=1}^{[Tr]}\nabla_\theta f(\eta_t;\theta_o) \xrightarrow{\;\mathbb{P}\;} F_o$, uniformly in $r \in (0,1]$, where $F_o$ is a $q \times k$ non-stochastic matrix; $\nabla_\theta F_{[Tr]}(\theta_o) = O_{\mathbb{P}}(1)$.

A Taylor expansion about $\theta_o$ gives
$$\frac{[Tr]}{\sqrt{T}}\,m_{[Tr]}(\hat\theta_T) = \frac{[Tr]}{\sqrt{T}}\,m_{[Tr]}(\theta_o) + \frac{[Tr]}{T}\,F_{[Tr]}(\theta_o)\big[\sqrt{T}(\hat\theta_T - \theta_o)\big] + o_{\mathbb{P}}(1);$$
the second term is the estimation effect and converges to $rF_o Q_o A_o W_k(1)$, where $A_o$ is the matrix square root of $G_{22}G_{22}' + G_{21}G_{21}'$.

(C.-M. Kuan, HAC.20)

[B1](b) and [B2] imply
$$\sqrt{T}\,m_T(\hat\theta_T) \xrightarrow{\;D\;} [\,I_q \;\; F_o Q_o\,]\,G_o W_{q+k}(1) \stackrel{d}{=} V_o W_q(1),$$
where $V_o$ is the matrix square root of $[\,I_q \;\; F_o Q_o\,]\,G_o G_o'\,[\,I_q \;\; F_o Q_o\,]'$. Note $V_o = S_o$ when $F_o = 0$ (i.e., no estimation effect).

Due to centering, the estimation effects in $\varphi_{[Tr]}(\hat\theta_T)$ cancel out:
$$\varphi_{[Tr]}(\hat\theta_T) = \frac{[Tr]}{\sqrt{T}}\,m_{[Tr]}(\theta_o) + \frac{[Tr]}{T}\,F_{[Tr]}(\theta_o)\big[\sqrt{T}(\hat\theta_T - \theta_o)\big] - \frac{[Tr]}{\sqrt{T}}\,m_T(\theta_o) - \frac{[Tr]}{T}\,F_T(\theta_o)\big[\sqrt{T}(\hat\theta_T - \theta_o)\big] + o_{\mathbb{P}}(1) = \frac{[Tr]}{\sqrt{T}}\,m_{[Tr]}(\theta_o) - \frac{[Tr]}{\sqrt{T}}\,m_T(\theta_o) + o_{\mathbb{P}}(1).$$
Hence $\hat C_T = C_T(\theta_o) + o_{\mathbb{P}}(1) \Rightarrow S_o P_q S_o'$, regardless of the estimation effect.

(C.-M. Kuan, HAC.21)

When the estimation effect is present, $\hat C_T$ is unable to eliminate $V_o$, and
$$\mathcal{M}_T \xrightarrow{\;D\;} W_q(1)'V_o'\big[S_o P_q S_o'\big]^{-1}V_o W_q(1).$$
That is, $\mathcal{M}_T$ depends on $S_o$ and $V_o$ and is not asymptotically pivotal. When there is no estimation effect ($F_o = 0$), $V_o = S_o$, and
$$\mathcal{M}_T \xrightarrow{\;D\;} W_q(1)'P_q^{-1}W_q(1),$$
which is also the limit of the M test when $\theta_o$ is known.

Remark: The nonsingularity of $G_o$ required in [B1](b) is crucial for the M tests here. It excludes the cases in which the moment functions ($f$) and the estimator (which depends on $q$) are asymptotically correlated, e.g., the over-identifying restrictions in the context of GMM.

M Test under Estimation Effect (C.-M. Kuan, HAC.22)

Kuan and Lee (2006): Let
$$\tilde C_T = \frac{1}{T}\sum_{t=k+1}^{T}\tilde\varphi_t\tilde\varphi_t' \quad\text{with}\quad \tilde\varphi_t = \tilde\varphi_t(\tilde\theta_t,\tilde\theta_T) = \frac{1}{\sqrt{T}}\sum_{i=1}^{t}\big[f(\eta_i;\tilde\theta_t) - m_T(\tilde\theta_T)\big],$$
where $\tilde\theta_t$ are the recursive estimators based on the first $t$ observations. The M test is
$$\tilde{\mathcal{M}}_T = T\,m_T(\hat\theta_T)'\,\tilde C_T^{-1}\,m_T(\hat\theta_T) \xrightarrow{\;D\;} W_q(1)'P_q^{-1}W_q(1),$$
which has the same limit as $\mathcal{M}_T$ in the absence of an estimation effect: $T^{1/2}m_T(\hat\theta_T) \Rightarrow V_o W_q(1)$, $\tilde\varphi_{[Tr]} \Rightarrow V_o B_q(r)$, and hence $\tilde C_T \Rightarrow V_o P_q V_o'$. While HAC estimation of $V_o$ is practically difficult, $\tilde{\mathcal{M}}_T$ avoids estimating $V_o$ and hence is also robust to the estimation effect.
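A sketch of $\tilde C_T$ in Python (my own coding, not taken from Kuan and Lee, 2006). Here `estimate` and `f` are hypothetical user-supplied callables: `estimate(data)` returns the estimator computed from the given observations, and `f(obs, theta)` returns the $q$-vector of moment functions; the recursive estimators are recomputed naively, purely for illustration.

```python
# Recursive-estimator normalizing matrix C_tilde of slide HAC.22.
import numpy as np

def recursive_C(data, estimate, f, k):
    T = len(data)
    theta_full = estimate(data)                                    # full-sample estimator
    m_T = np.mean([f(data[t], theta_full) for t in range(T)], axis=0)
    phis = []
    for t in range(k, T):                                          # use the first t+1 observations
        theta_t = estimate(data[:t + 1])                           # recursive estimator
        s = np.sum([f(data[i], theta_t) for i in range(t + 1)], axis=0)
        phis.append((s - (t + 1) * m_T) / np.sqrt(T))              # phi_tilde_t
    phis = np.asarray(phis)
    return phis.T @ phis / T                                       # C_tilde_T
```

The test statistic is then $\tilde{\mathcal{M}}_T = T\,m_T(\hat\theta_T)'\,\tilde C_T^{-1}\,m_T(\hat\theta_T)$, compared with the $W_q(1)'P_q^{-1}W_q(1)$ quantiles.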

Example: Tests of Serial Correlations (C.-M. Kuan, HAC.23)

Specification: $y_t = h(x_t;\theta) + e_t(\theta)$, with NLS estimator $\hat\theta_T$; $\mathbb{E}(y_t \mid x_t) = h(x_t;\theta_o)$ and $\varepsilon_t := e_t(\theta_o) = y_t - h(x_t;\theta_o)$.

The null hypothesis is $\mathbb{E}[f_{t,q}(\theta_o)] = \mathbb{E}(\varepsilon_t\,\varepsilon_{t-1,q}) = 0$, where $\varepsilon_{t-1,q} = [\varepsilon_{t-1},\ldots,\varepsilon_{t-q}]'$. Letting $T_q = T - q$, define
$$m_{T_q}(\theta) = \frac{1}{T_q}\sum_{t=q+1}^{T} e_t(\theta)\,e_{t-1,q}(\theta).$$
We can base an M test on $m_{T_q}(\hat\theta_T) = \frac{1}{T_q}\sum_{t=q+1}^{T} e_t(\hat\theta_T)\,e_{t-1,q}(\hat\theta_T)$. However, $T_q^{1/2}m_{T_q}(\hat\theta_T)$ and $T_q^{1/2}m_{T_q}(\theta_o)$ are not asymptotically equivalent unless $F_{T_q}(\theta_o)$ converges to $F_o = 0$.

(C.-M. Kuan, HAC.24)

Here,
$$F_{T_q}(\theta_o) = -\frac{1}{T_q}\sum_{t=q+1}^{T}\big[\varepsilon_{t-1,q}\,\nabla_\theta h_t(\theta_o) + \varepsilon_t\,\nabla_\theta h_{t-1,q}(\theta_o)\big].$$
$F_o$ would be zero if $\{x_t\}$ and $\{\varepsilon_t\}$ are mutually independent. When $h(x_t;\theta_o) = x_t'\theta_o$, $F_o = 0$ when $\{x_t\}$ and $\{\varepsilon_t\}$ are mutually uncorrelated.

The M test based on model residuals is
$$\mathcal{M}_{T_q} = T_q\,m_{T_q}(\hat\theta_T)'\,\hat C_{T_q}^{-1}\,m_{T_q}(\hat\theta_T) \xrightarrow{\;D\;} W_q(1)'P_q^{-1}W_q(1),$$
where the normalizing matrix is $\hat C_{T_q} = \frac{1}{T_q}\sum_{t=q+1}^{T}\hat\varphi_t\hat\varphi_t'$ with
$$\hat\varphi_t = \frac{1}{\sqrt{T_q}}\sum_{i=q+1}^{t} e_i(\hat\theta_T)\,e_{i-1,q}(\hat\theta_T) - \frac{t-q}{T_q}\,\frac{1}{\sqrt{T_q}}\sum_{i=q+1}^{T} e_i(\hat\theta_T)\,e_{i-1,q}(\hat\theta_T).$$
$\mathcal{M}_{T_q}$ includes the test of Lobato (2001) for raw time series as a special case.
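For the case without an estimation effect ($F_o = 0$), e.g., residuals from a static linear regression with exogenous regressors, the residual-based test above can be sketched as follows (my own coding; `ehat` holds the model residuals):

```python
# Residual-based serial-correlation M test of slide HAC.24 (appropriate when F_o = 0).
import numpy as np

def m_test_serial_corr(ehat, q):
    ehat = np.asarray(ehat, dtype=float)
    T = len(ehat)
    n = T - q                                                      # T_q
    # rows are f_t = e_t * (e_{t-1}, ..., e_{t-q}) for t = q+1, ..., T
    F = np.column_stack([ehat[q:] * ehat[q - j:T - j] for j in range(1, q + 1)])
    m = F.mean(axis=0)                                             # m_{T_q}
    phi = (np.cumsum(F, axis=0) - np.arange(1, n + 1)[:, None] * m) / np.sqrt(n)
    C = phi.T @ phi / n                                            # C_hat_{T_q}
    return n * m @ np.linalg.solve(C, m)                           # compare with W_q(1)' P_q^{-1} W_q(1)
```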

(C.-M. Kuan, HAC.25)

$F_o \ne 0$ for the residuals of dynamic models, such as AR models and models with lagged dependent variables. The M test based on the residuals of dynamic models is
$$\tilde{\mathcal{M}}_{T_q} = T_q\,m_{T_q}(\hat\theta_T)'\,\tilde C_{T_q}^{-1}\,m_{T_q}(\hat\theta_T) \xrightarrow{\;D\;} W_q(1)'P_q^{-1}W_q(1),$$
where the normalizing matrix is $\tilde C_{T_q} = \frac{1}{T_q}\sum_{t=q+1}^{T}\tilde\varphi_t\tilde\varphi_t'$ with
$$\tilde\varphi_t = \frac{1}{\sqrt{T_q}}\sum_{i=q+1}^{t} e_i(\tilde\theta_t)\,e_{i-1,q}(\tilde\theta_t) - \frac{t-q}{T_q}\,\frac{1}{\sqrt{T_q}}\sum_{i=q+1}^{T} e_i(\tilde\theta_T)\,e_{i-1,q}(\tilde\theta_T),$$
and $e_i(\tilde\theta_t) = y_i - h(x_i;\tilde\theta_t)$ is the $i$th residual evaluated at the recursive NLS estimator $\tilde\theta_t$. $\tilde{\mathcal{M}}_{T_q}$ is a specification test that does not require consistent estimation of the asymptotic covariance matrix.

(C.-M. Kuan, HAC.26)

(C.-M. Kuan, HAC.27)

Figure 1: The Bartlett, Parzen, quadratic spectral, and Daniell kernels.

(C.-M. Kuan, HAC.28)

Figure 2: The asymptotic local powers, at the 5% level, of the standard M test (solid), $\mathcal{M}_T$ (dashed), and $\tilde{\mathcal{M}}_T$ (dotted).