Single Equation Linear GMM with Serially Correlated Moment Conditions

Eric Zivot

November 2, 2011

Univariate Time Series

Let $\{y_t\}$ be an ergodic-stationary time series with $E[y_t] = \mu$ and $\mathrm{var}(y_t) < \infty$. A fundamental decomposition result is the following:

Wold Representation Theorem. $y_t$ has the representation
$$y_t = \mu + \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} = \mu + \varepsilon_t + \psi_1\varepsilon_{t-1} + \cdots$$
where $\psi_0 = 1$, $\sum_{j=0}^{\infty}\psi_j^2 < \infty$, and $\varepsilon_t \sim \mathrm{MDS}(0,\sigma^2)$.

Remarks

1. The Wold representation shows that $y_t$ has a linear structure. As a result, the Wold representation is often called the linear process representation of $y_t$.

2. $\sum_{j=0}^{\infty}\psi_j^2 < \infty$ is called square-summability and controls the memory of the process. It implies that $\psi_j \to 0$ as $j \to \infty$ at a sufficiently fast rate.

Variance
$$\gamma_0 = \mathrm{var}(y_t) = \mathrm{var}\left(\sum_{k=0}^{\infty}\psi_k\varepsilon_{t-k}\right) = \sum_{k=0}^{\infty}\psi_k^2\,\mathrm{var}(\varepsilon_t) = \sigma^2\sum_{k=0}^{\infty}\psi_k^2 < \infty$$

Autocovariances
$$\gamma_j = E[(y_t - \mu)(y_{t-j} - \mu)] = E\left[\left(\sum_{k=0}^{\infty}\psi_k\varepsilon_{t-k}\right)\left(\sum_{h=0}^{\infty}\psi_h\varepsilon_{t-h-j}\right)\right]$$
$$= E[(\psi_0\varepsilon_t + \psi_1\varepsilon_{t-1} + \cdots + \psi_j\varepsilon_{t-j} + \cdots)(\psi_0\varepsilon_{t-j} + \psi_1\varepsilon_{t-j-1} + \cdots)] = \sigma^2\sum_{k=0}^{\infty}\psi_{j+k}\psi_k, \quad j = 0, 1, 2, \ldots$$

Ergodicity

Ergodicity requires
$$\sum_{j=0}^{\infty}|\gamma_j| < \infty$$
It can be shown that
$$\sum_{j=0}^{\infty}|\gamma_j| < \infty \ \text{ implies } \ \sum_{j=0}^{\infty}\psi_j^2 < \infty$$

Asymptotic Properties of Linear Processes

LLN for Linear Processes (Phillips and Solo, Annals of Statistics, 1992). Assume
$$y_t = \mu + \psi(L)\varepsilon_t = \mu + \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}, \quad \varepsilon_t \sim \mathrm{MDS}(0,\sigma^2), \quad \psi(L) = \sum_{j=0}^{\infty}\psi_j L^j$$
and that $\psi(L)$ is 1-summable; i.e.,
$$\sum_{j=0}^{\infty} j|\psi_j| = 1|\psi_1| + 2|\psi_2| + \cdots < \infty$$

Then
$$\hat{\mu} = \frac{1}{T}\sum_{t=1}^{T} y_t \xrightarrow{p} E[y_t] = \mu$$
$$\hat{\gamma}_j = \frac{1}{T}\sum_{t=j+1}^{T}(y_t - \hat{\mu})(y_{t-j} - \hat{\mu}) \xrightarrow{p} \mathrm{cov}(y_t, y_{t-j}) = \gamma_j$$
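As an illustration of this LLN, here is a small simulation sketch comparing the sample moments with their population counterparts for an AR(1) process (parameter values are arbitrary; only numpy is assumed):

```python
import numpy as np

# Sketch: sample mean and sample autocovariances converging to their
# population counterparts for an AR(1) with mu = 2, phi = 0.5, sigma = 1.
rng = np.random.default_rng(1)
mu, phi, sigma, T = 2.0, 0.5, 1.0, 100_000

y = np.empty(T)
y[0] = mu
for t in range(1, T):
    y[t] = mu + phi * (y[t - 1] - mu) + rng.normal(0.0, sigma)

mu_hat = y.mean()
gamma0_hat = np.mean((y - mu_hat) ** 2)
gamma1_hat = np.mean((y[1:] - mu_hat) * (y[:-1] - mu_hat))

print(mu_hat, "vs", mu)                                   # mu_hat -> mu
print(gamma0_hat, "vs", sigma**2 / (1 - phi**2))          # gamma_0
print(gamma1_hat, "vs", phi * sigma**2 / (1 - phi**2))    # gamma_1
```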

CLT for Linear Processes (Phillips and Solo, Annals of Statistics, 1992). Assume
$$y_t = \mu + \psi(L)\varepsilon_t = \mu + \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j}, \quad \varepsilon_t \sim \mathrm{MDS}(0,\sigma^2), \quad \psi(L) = \sum_{j=0}^{\infty}\psi_j L^j$$
where $\psi(L)$ is 1-summable and
$$\psi(1) = \sum_{j=0}^{\infty}\psi_j \neq 0$$

Then
$$\sqrt{T}(\hat{\mu} - \mu) \xrightarrow{d} N(0, \mathrm{LRV})$$
$$\mathrm{LRV} = \text{long-run variance} = \sum_{j=-\infty}^{\infty}\gamma_j = \gamma_0 + 2\sum_{j=1}^{\infty}\gamma_j, \ \text{since } \gamma_{-j} = \gamma_j$$
$$= \sigma^2\psi(1)^2 = \sigma^2\left(\sum_{j=0}^{\infty}\psi_j\right)^2$$

Intuition behind the LRV formula

Consider
$$\mathrm{var}(\sqrt{T}\,\bar{y}) = \mathrm{var}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T} y_t\right) = \frac{1}{T}\,\mathrm{var}\left(\sum_{t=1}^{T} y_t\right)$$
Using the fact that
$$\sum_{t=1}^{T} y_t = \mathbf{1}'\mathbf{y}, \quad \mathbf{1} = (1,\ldots,1)', \quad \mathbf{y} = (y_1,\ldots,y_T)'$$
it follows that
$$\mathrm{var}\left(\sum_{t=1}^{T} y_t\right) = \mathrm{var}(\mathbf{1}'\mathbf{y}) = \mathbf{1}'\,\mathrm{var}(\mathbf{y})\,\mathbf{1}$$

Now
$$\mathrm{var}(\mathbf{y}) = E[(\mathbf{y} - \mu\mathbf{1})(\mathbf{y} - \mu\mathbf{1})'] = \Gamma =
\begin{pmatrix}
\gamma_0 & \gamma_1 & \gamma_2 & \cdots & \gamma_{T-1} \\
\gamma_1 & \gamma_0 & \gamma_1 & \cdots & \gamma_{T-2} \\
\gamma_2 & \gamma_1 & \gamma_0 & \cdots & \gamma_{T-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\gamma_{T-1} & \gamma_{T-2} & \gamma_{T-3} & \cdots & \gamma_0
\end{pmatrix}$$
where $\gamma_j = \mathrm{cov}(y_t, y_{t-j})$ and $\gamma_{-j} = \gamma_j$. Therefore,
$$\mathrm{var}\left(\sum_{t=1}^{T} y_t\right) = \mathbf{1}'\,\mathrm{var}(\mathbf{y})\,\mathbf{1} = \mathbf{1}'\Gamma\mathbf{1}$$

Now, $\mathbf{1}'\Gamma\mathbf{1}$ is the sum of all elements in the $T \times T$ matrix $\Gamma$. This sum may be computed by summing across the rows, or the columns, or along the diagonals. Given the band-diagonal structure of $\Gamma$, it is most convenient to sum along the diagonals. Doing so gives
$$\mathbf{1}'\Gamma\mathbf{1} = T\gamma_0 + 2(T-1)\gamma_1 + 2(T-2)\gamma_2 + \cdots + 2\gamma_{T-1}$$

Then
$$\frac{1}{T}\mathbf{1}'\Gamma\mathbf{1} = \gamma_0 + 2\frac{(T-1)}{T}\gamma_1 + 2\frac{(T-2)}{T}\gamma_2 + \cdots + 2\frac{1}{T}\gamma_{T-1} = \gamma_0 + 2\sum_{j=1}^{T-1}\left(1 - \frac{j}{T}\right)\gamma_j$$
As $T \to \infty$, it can be shown that
$$\frac{1}{T}\mathbf{1}'\Gamma\mathbf{1} \to \gamma_0 + 2\sum_{j=1}^{\infty}\gamma_j = \mathrm{LRV}$$

Remark

Since $\gamma_{-j} = \gamma_j$, we may re-express $T^{-1}\mathbf{1}'\Gamma\mathbf{1}$ as
$$\frac{1}{T}\mathbf{1}'\Gamma\mathbf{1} = \sum_{j=-(T-1)}^{T-1}\left(1 - \frac{|j|}{T}\right)\gamma_j$$
Then, an alternative representation for LRV is
$$\mathrm{LRV} = \sum_{j=-\infty}^{\infty}\gamma_j$$

Example: MA(1) Process
$$Y_t = \mu + \varepsilon_t + \theta\varepsilon_{t-1}, \quad |\theta| < 1, \quad \varepsilon_t \sim \mathrm{iid}\,(0,\sigma^2)$$
Recall,
$$\psi(L) = 1 + \theta L, \quad \gamma_0 = \sigma^2(1 + \theta^2), \quad \gamma_1 = \sigma^2\theta$$
Then
$$\mathrm{LRV} = \gamma_0 + 2\sum_{j=1}^{\infty}\gamma_j = \sigma^2(1 + \theta^2) + 2\sigma^2\theta = \sigma^2(1 + \theta)^2 = \sigma^2\psi(1)^2$$
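A quick way to see this formula at work is a simulation sketch: the variance of $\sqrt{T}\,\bar{y}$ across replications should be close to $\sigma^2(1+\theta)^2$ (parameter values below are illustrative, not from the notes):

```python
import numpy as np

# Minimal simulation check of LRV = sigma^2 (1 + theta)^2 for an MA(1) process.
rng = np.random.default_rng(0)
theta, sigma, T, n_sims = 0.5, 1.0, 500, 2000

means = np.empty(n_sims)
for i in range(n_sims):
    eps = rng.normal(0.0, sigma, T + 1)
    y = eps[1:] + theta * eps[:-1]          # MA(1) with mu = 0
    means[i] = y.mean()

print("var(sqrt(T) * ybar)   ~", T * means.var())   # simulated long-run variance
print("sigma^2 (1 + theta)^2 =", sigma**2 * (1 + theta)**2)
```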

Remarks

1. If $\theta = 0$ then $\mathrm{LRV} = \sigma^2$.

2. If $\theta = -1$ then $\psi(1) = 0$ and $\mathrm{LRV} = \sigma^2\psi(1)^2 = 0$.

(a) This motivates the condition $\psi(1) \neq 0$ in the CLT for stationary and ergodic linear processes.

Example: AR(1) Process
$$Y_t - \mu = \phi(Y_{t-1} - \mu) + \varepsilon_t, \quad \varepsilon_t \sim \mathrm{WN}(0,\sigma^2), \quad |\phi| < 1, \quad E[Y_t] = \mu$$
Recall,
$$\psi(L) = (1 - \phi L)^{-1}, \quad \gamma_0 = \frac{\sigma^2}{1 - \phi^2}, \quad \gamma_j = \phi^j\gamma_0$$
Then
$$\mathrm{LRV} = \sigma^2\psi(1)^2 = \frac{\sigma^2}{(1 - \phi)^2}$$

Straightforward algebra gives
$$\mathrm{LRV} = \gamma_0 + 2\sum_{j=1}^{\infty}\gamma_j = \gamma_0 + 2\gamma_0\sum_{j=1}^{\infty}\phi^j = \frac{\sigma^2}{(1 - \phi)^2}$$

Remark: $\mathrm{LRV} \to \infty$ as $\phi \to 1$.
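A tiny numerical illustration of this remark (with $\sigma = 1$ for simplicity; the $\phi$ values are arbitrary):

```python
import numpy as np

# How the AR(1) long-run variance sigma^2 / (1 - phi)^2 blows up as phi -> 1.
sigma = 1.0
for phi in [0.0, 0.5, 0.9, 0.99, 0.999]:
    lrv = sigma**2 / (1.0 - phi)**2
    print(f"phi = {phi:6.3f}   LRV = {lrv:12.2f}")
```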

Estimating the Long-Run Variance
$$y_t = \mu + \psi(L)\varepsilon_t, \quad \varepsilon_t \sim \mathrm{MDS}(0,\sigma^2)$$
$$\mathrm{LRV} = \sum_{j=-\infty}^{\infty}\gamma_j = \gamma_0 + 2\sum_{j=1}^{\infty}\gamma_j = \sigma^2\psi(1)^2$$
There are two types of estimators: parametric (assume a parametric model for $y_t$) and nonparametric (do not assume a parametric model for $y_t$).

Example: MA(1) Process
$$Y_t = \mu + \varepsilon_t + \theta\varepsilon_{t-1}, \quad |\theta| < 1, \quad \varepsilon_t \sim \mathrm{iid}\,(0,\sigma^2), \quad \mathrm{LRV} = \sigma^2(1 + \theta)^2$$
A parametric LRV estimate is
$$\widehat{\mathrm{LRV}} = \hat{\sigma}^2(1 + \hat{\theta})^2, \quad \hat{\sigma}^2 \xrightarrow{p} \sigma^2, \quad \hat{\theta} \xrightarrow{p} \theta$$

Remarks

1. Parametric estimates require us to specify a model for $y_t$.

2. For the parametric estimate, we only need consistent estimates of $\sigma^2$ and $\theta$ (e.g., GMM or ML estimates) in order for $\widehat{\mathrm{LRV}}$ to be consistent.

3. Parametric estimates are generally more efficient than nonparametric estimates, provided they are based on the correct model.

For the general linear process
$$y_t = \mu + \psi(L)\varepsilon_t, \quad \varepsilon_t \sim \mathrm{MDS}(0,\sigma^2)$$
a parametric estimator is not feasible because $\psi(L) = \sum_{j=0}^{\infty}\psi_j L^j$ has an infinite number of parameters. A natural nonparametric estimator is the truncated sum of sample autocovariances
$$\widehat{\mathrm{LRV}}_q = \hat{\gamma}_0 + 2\sum_{j=1}^{q}\hat{\gamma}_j, \quad q = \text{truncation lag}$$
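A minimal sketch of this truncated-sum estimator (function names are mine; only numpy is assumed):

```python
import numpy as np

def sample_autocov(y, j):
    """Sample autocovariance: gamma_hat_j = (1/T) sum_{t=j+1}^T (y_t - ybar)(y_{t-j} - ybar)."""
    y = np.asarray(y, dtype=float)
    T, ybar = len(y), y.mean()
    return np.sum((y[j:] - ybar) * (y[: T - j] - ybar)) / T

def lrv_truncated(y, q):
    """Truncated-sum LRV estimate: gamma_hat_0 + 2 * sum_{j=1}^q gamma_hat_j."""
    return sample_autocov(y, 0) + 2.0 * sum(sample_autocov(y, j) for j in range(1, q + 1))
```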

Problems

1. How to pick $q$?

2. $q$ must grow with $T$ in order for $\widehat{\mathrm{LRV}}_q \xrightarrow{p} \mathrm{LRV}$.

3. $\widehat{\mathrm{LRV}}_q$ can be negative in finite samples, particularly if $q$ is close to $T$. This is related to the identity (a consequence of Parseval's result)
$$\hat{\gamma}_0 + 2\sum_{j=1}^{T-1}\hat{\gamma}_j = 0$$

Kernel-Based Estimators

These are nonparametric estimators of the form (Hayashi notation)
$$\widehat{\mathrm{LRV}}_{\mathrm{ker}} = \sum_{j=-(T-1)}^{T-1}\kappa\!\left(\frac{j}{q(T)}\right)\hat{\gamma}_j$$
where
$$\kappa(\cdot) = \text{kernel weight function}, \quad q(T) = \text{bandwidth parameter}$$

The kernel estimators are motivated by the result
$$\mathrm{var}(\sqrt{T}\,\bar{y}) = \sum_{j=-(T-1)}^{T-1}\left(1 - \frac{|j|}{T}\right)\gamma_j$$
which suggests the kernel
$$\kappa\!\left(\frac{j}{q(T)}\right) = \left(1 - \frac{|j|}{T}\right), \quad q(T) = T$$
applied to the sample autocovariances $\hat{\gamma}_j$.

Examples of Kernel Weight Functions

Truncated kernel: $q(T) = q < T$,
$$\kappa(x) = \begin{cases} 1 & \text{for } |x| \le 1 \\ 0 & \text{for } |x| > 1 \end{cases}$$
Note
$$\kappa\!\left(\frac{j}{q}\right) = \begin{cases} 1 & \text{for } |j| \le q \\ 0 & \text{for } |j| > q \end{cases}$$

Example: $q = 2$
$$\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{Trunc}} = \sum_{j=-(T-1)}^{T-1}\kappa\!\left(\frac{j}{q(T)}\right)\hat{\gamma}_j = \sum_{j=-2}^{2}\hat{\gamma}_j = \hat{\gamma}_{-2} + \hat{\gamma}_{-1} + \hat{\gamma}_0 + \hat{\gamma}_1 + \hat{\gamma}_2 = \hat{\gamma}_0 + 2(\hat{\gamma}_1 + \hat{\gamma}_2)$$

Remarks:

1. The bandwidth parameter $q(T) = q$ acts as a lag truncation parameter.

2. Here, $\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{Trunc}}$ is not guaranteed to be positive, particularly if $q$ is close to $T$.

Bartlett kernel: $q(T) < T$,
$$\kappa(x) = \begin{cases} 1 - |x| & \text{for } |x| \le 1 \\ 0 & \text{for } |x| > 1 \end{cases}$$
$\widehat{\mathrm{LRV}}_{\mathrm{ker}}$ computed with the Bartlett kernel gives the so-called Newey-West estimator.

Example: $q(T) = 3$
$$\kappa\!\left(\frac{j}{q(T)}\right) = \kappa\!\left(\frac{j}{3}\right) = \begin{cases} 1 - \frac{|j|}{3} & \text{for } |j| \le 2 \\ 0 & \text{for } |j| > 2 \end{cases}$$

Then
$$\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{Bartlett}} = \sum_{j=-(T-1)}^{T-1}\kappa\!\left(\frac{j}{q(T)}\right)\hat{\gamma}_j = \sum_{j=-2}^{2}\left(1 - \frac{|j|}{3}\right)\hat{\gamma}_j$$
$$= \tfrac{1}{3}\hat{\gamma}_{-2} + \tfrac{2}{3}\hat{\gamma}_{-1} + \hat{\gamma}_0 + \tfrac{2}{3}\hat{\gamma}_1 + \tfrac{1}{3}\hat{\gamma}_2 = \hat{\gamma}_0 + 2\left(\tfrac{2}{3}\hat{\gamma}_1 + \tfrac{1}{3}\hat{\gamma}_2\right) = \hat{\gamma}_0 + 2\sum_{j=1}^{2}\left(1 - \frac{j}{3}\right)\hat{\gamma}_j$$
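A sketch of the univariate Newey-West (Bartlett-kernel) estimator written exactly as above; with $q(T)=3$ the weights on $\hat{\gamma}_1$ and $\hat{\gamma}_2$ are $2/3$ and $1/3$ (helper names are mine; only numpy is assumed):

```python
import numpy as np

def sample_autocov(y, j):
    # gamma_hat_j = (1/T) sum_{t=j+1}^T (y_t - ybar)(y_{t-j} - ybar)
    y = np.asarray(y, dtype=float)
    T, ybar = len(y), y.mean()
    return np.sum((y[j:] - ybar) * (y[: T - j] - ybar)) / T

def lrv_newey_west(y, q):
    # gamma_hat_0 + 2 * sum_{j=1}^{q-1} (1 - j/q) * gamma_hat_j  (Bartlett weights)
    lrv = sample_autocov(y, 0)
    for j in range(1, q):
        lrv += 2.0 * (1.0 - j / q) * sample_autocov(y, j)
    return lrv

# Example with q(T) = 3 on an arbitrary simulated series.
y = np.random.default_rng(2).normal(size=200)
print(lrv_newey_west(y, 3))
```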

Remarks

1. Use of the Bartlett kernel guarantees that $\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{Bartlett}} > 0$ (see Newey and West, 1987, Econometrica, for a proof).

2. In order for $\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{Bartlett}} \xrightarrow{p} \mathrm{LRV}$, it must be the case that $q(T) \to \infty$ as $T \to \infty$. There is a bias-variance tradeoff: if $q(T) \to \infty$ too slowly then there will be bias; if $q(T) \to \infty$ too fast then there will be an increase in variance.

3. Andrews (1991, Econometrica) proved that if $q(T) \to \infty$ at rate $T^{1/3}$ then $\mathrm{MSE}(\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{Bartlett}}, \mathrm{LRV})$ will be minimized asymptotically.

4. Andrews' result that $q(T) \to \infty$ at rate $T^{1/3}$ suggests setting $q(T) = cT^{1/3}$, with $c$ some constant.

However, Andrews' result says nothing about what $c$ should be. Hence, Andrews' result is not practically useful.

Remarks continued

5. Many software programs use the following rule due to Newey and West (1987):
$$q(T) = \mathrm{int}\left[4\left(\frac{T}{100}\right)^{1/4}\right]$$

6. Andrews (1991) studied two other kernels, the Parzen kernel and the quadratic spectral (QS) kernel (these are commonly programmed into software). He showed that $q(T) \to \infty$ at rate $T^{1/5}$ in order to asymptotically minimize $\mathrm{MSE}(\widehat{\mathrm{LRV}}_{\mathrm{ker}}, \mathrm{LRV})$. He also showed that $\mathrm{MSE}(\widehat{\mathrm{LRV}}_{\mathrm{ker}}^{\mathrm{QS}}, \mathrm{LRV})$ is the smallest among the different kernels.

Remarks continued

7. Newey and West (1994, Econometrica) gave Monte Carlo evidence that the choice of bandwidth, $q(T)$, is more important than the choice of kernel. They suggested some data-based methods for automatically choosing $q(T)$. However, as discussed in den Haan and Levin (1997), these methods are not really automatic because they depend on an initial estimate of LRV and some pre-specified weights. See Hall (2005), Section 3.5, for more details.

Figure 1: Common kernel functions (Truncated, Bartlett, Parzen, and Quadratic Spectral weights plotted against the lag).
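For reference, a sketch of the four kernel weight functions shown in Figure 1 (the Parzen and QS formulas follow Andrews, 1991; evaluating them at lags 1-10 with bandwidth 5 is just illustrative):

```python
import numpy as np

def k_truncated(x):
    return np.where(np.abs(x) <= 1.0, 1.0, 0.0)

def k_bartlett(x):
    return np.where(np.abs(x) <= 1.0, 1.0 - np.abs(x), 0.0)

def k_parzen(x):
    a = np.abs(x)
    return np.where(a <= 0.5, 1.0 - 6.0 * a**2 + 6.0 * a**3,
                    np.where(a <= 1.0, 2.0 * (1.0 - a)**3, 0.0))

def k_quadratic_spectral(x):
    x = np.asarray(x, dtype=float)
    z = 6.0 * np.pi * x / 5.0
    with np.errstate(divide="ignore", invalid="ignore"):
        k = 25.0 / (12.0 * np.pi**2 * x**2) * (np.sin(z) / z - np.cos(z))
    return np.where(x == 0.0, 1.0, k)          # kappa(0) = 1 by continuity

lags, q = np.arange(1, 11), 5.0
for name, k in [("Truncated", k_truncated), ("Bartlett", k_bartlett),
                ("Parzen", k_parzen), ("QS", k_quadratic_spectral)]:
    print(name, np.round(k(lags / q), 3))
```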

Multivariate Time Series

Consider $n$ time series variables $\{y_{1t}\},\ldots,\{y_{nt}\}$. A multivariate time series is the $(n \times 1)$ vector time series $\{\mathbf{Y}_t\}$ whose $i$th row is $\{y_{it}\}$. That is, for any time $t$, $\mathbf{Y}_t = (y_{1t},\ldots,y_{nt})'$.

Multivariate time series analysis is used when one wants to model and explain the interactions and co-movements among a group of time series variables: consumption and income; stock prices and dividends; forward and spot exchange rates; interest rates, money growth, income, and inflation; GMM moment conditions ($g_t = \mathbf{x}_t\varepsilon_t$).

Stationary and Ergodic Multivariate Time Series

A multivariate time series $\mathbf{Y}_t$ is covariance stationary and ergodic if all of its component time series are stationary and ergodic. Then
$$E[\mathbf{Y}_t] = \mu = (\mu_1,\ldots,\mu_n)'$$
$$\mathrm{var}(\mathbf{Y}_t) = \Gamma_0 = E[(\mathbf{Y}_t - \mu)(\mathbf{Y}_t - \mu)'] =
\begin{pmatrix}
\mathrm{var}(y_{1t}) & \mathrm{cov}(y_{1t},y_{2t}) & \cdots & \mathrm{cov}(y_{1t},y_{nt}) \\
\mathrm{cov}(y_{2t},y_{1t}) & \mathrm{var}(y_{2t}) & \cdots & \mathrm{cov}(y_{2t},y_{nt}) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(y_{nt},y_{1t}) & \mathrm{cov}(y_{nt},y_{2t}) & \cdots & \mathrm{var}(y_{nt})
\end{pmatrix}$$
The correlation matrix of $\mathbf{Y}_t$ is the $(n \times n)$ matrix
$$\mathrm{corr}(\mathbf{Y}_t) = R_0 = D^{-1}\Gamma_0 D^{-1}$$
where $D$ is a diagonal matrix with $j$th diagonal element $(\gamma_{jj}^0)^{1/2} = \mathrm{var}(y_{jt})^{1/2}$.

The parameters $\mu$, $\Gamma_0$ and $R_0$ are estimated from data $(\mathbf{Y}_1,\ldots,\mathbf{Y}_T)$ using the sample moments
$$\bar{\mathbf{Y}} = \frac{1}{T}\sum_{t=1}^{T}\mathbf{Y}_t \xrightarrow{p} E[\mathbf{Y}_t] = \mu$$
$$\hat{\Gamma}_0 = \frac{1}{T}\sum_{t=1}^{T}(\mathbf{Y}_t - \bar{\mathbf{Y}})(\mathbf{Y}_t - \bar{\mathbf{Y}})' \xrightarrow{p} \mathrm{var}(\mathbf{Y}_t) = \Gamma_0$$
$$\hat{R}_0 = \hat{D}^{-1}\hat{\Gamma}_0\hat{D}^{-1} \xrightarrow{p} \mathrm{corr}(\mathbf{Y}_t) = R_0$$
where $\hat{D}$ is the $(n \times n)$ diagonal matrix with the sample standard deviations of $y_{jt}$ along the diagonal. The Ergodic Theorem justifies convergence of the sample moments to their population counterparts.

All of the lag-$k$ cross covariances and correlations are summarized in the $(n \times n)$ lag-$k$ cross covariance and lag-$k$ cross correlation matrices
$$\Gamma_k = E[(\mathbf{Y}_t - \mu)(\mathbf{Y}_{t-k} - \mu)'] =
\begin{pmatrix}
\mathrm{cov}(y_{1t},y_{1t-k}) & \mathrm{cov}(y_{1t},y_{2t-k}) & \cdots & \mathrm{cov}(y_{1t},y_{nt-k}) \\
\mathrm{cov}(y_{2t},y_{1t-k}) & \mathrm{cov}(y_{2t},y_{2t-k}) & \cdots & \mathrm{cov}(y_{2t},y_{nt-k}) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(y_{nt},y_{1t-k}) & \mathrm{cov}(y_{nt},y_{2t-k}) & \cdots & \mathrm{cov}(y_{nt},y_{nt-k})
\end{pmatrix}$$
$$R_k = D^{-1}\Gamma_k D^{-1}$$
The matrices $\Gamma_k$ and $R_k$ are not symmetric in $k$, but it is easy to show that $\Gamma_{-k} = \Gamma_k'$ and $R_{-k} = R_k'$.

The matrices $\Gamma_k$ and $R_k$ are estimated from data $(\mathbf{Y}_1,\ldots,\mathbf{Y}_T)$ using
$$\hat{\Gamma}_k = \frac{1}{T}\sum_{t=k+1}^{T}(\mathbf{Y}_t - \bar{\mathbf{Y}})(\mathbf{Y}_{t-k} - \bar{\mathbf{Y}})'$$
$$\hat{R}_k = \hat{D}^{-1}\hat{\Gamma}_k\hat{D}^{-1}$$
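A minimal numpy sketch of these estimators for a $T \times n$ data matrix (function names are mine):

```python
import numpy as np

def cross_cov(Y, k):
    """Lag-k sample cross covariance: (1/T) sum_{t=k+1}^T (Y_t - Ybar)(Y_{t-k} - Ybar)'."""
    Y = np.asarray(Y, dtype=float)           # Y is T x n
    T = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    return Yc[k:].T @ Yc[: T - k] / T

def cross_corr(Y, k):
    """Lag-k sample cross correlation matrix R_hat_k = D_hat^{-1} Gamma_hat_k D_hat^{-1}."""
    Dinv = np.diag(1.0 / np.sqrt(np.diag(cross_cov(Y, 0))))
    return Dinv @ cross_cov(Y, k) @ Dinv
```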

Multivariate Wold Representation

Any $(n \times 1)$ covariance stationary multivariate time series $\mathbf{Y}_t$ has a Wold or linear process representation of the form
$$\mathbf{Y}_t = \mu + \varepsilon_t + \Psi_1\varepsilon_{t-1} + \Psi_2\varepsilon_{t-2} + \cdots = \mu + \sum_{k=0}^{\infty}\Psi_k\varepsilon_{t-k}, \quad \Psi_0 = I_n, \quad \varepsilon_t \sim \mathrm{WN}(0,\Sigma)$$
$\Psi_k$ is an $(n \times n)$ matrix with $(i,j)$th element $\psi_{ij}^k$.

In lag operator notation, the Wold form is
$$\mathbf{Y}_t = \mu + \Psi(L)\varepsilon_t, \quad \Psi(L) = \sum_{k=0}^{\infty}\Psi_k L^k$$
The moments of $\mathbf{Y}_t$ are given by
$$E[\mathbf{Y}_t] = \mu, \quad \mathrm{var}(\mathbf{Y}_t) = \sum_{k=0}^{\infty}\Psi_k\Sigma\Psi_k'$$

Digression on Vector Autoregressive Models (VARs)

The most popular multivariate time series model is the vector autoregressive (VAR) model. The VAR model is a multivariate extension of the univariate autoregressive model. For example, a bivariate VAR(1) model has the form
$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} + \begin{pmatrix} \pi_{11}^1 & \pi_{12}^1 \\ \pi_{21}^1 & \pi_{22}^1 \end{pmatrix}\begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix}$$
or
$$y_{1t} = c_1 + \pi_{11}^1 y_{1t-1} + \pi_{12}^1 y_{2t-1} + \varepsilon_{1t}$$
$$y_{2t} = c_2 + \pi_{21}^1 y_{1t-1} + \pi_{22}^1 y_{2t-1} + \varepsilon_{2t}$$
where
$$\begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} \sim \mathrm{iid}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}\right)$$

The general VAR($p$) model for $\mathbf{Y}_t = (y_{1t}, y_{2t},\ldots,y_{nt})'$ has the form
$$\mathbf{Y}_t = \mathbf{c} + \Pi_1\mathbf{Y}_{t-1} + \Pi_2\mathbf{Y}_{t-2} + \cdots + \Pi_p\mathbf{Y}_{t-p} + \varepsilon_t, \quad t = 1,\ldots,T, \quad \varepsilon_t \sim \mathrm{iid}\,(0,\Sigma)$$
In lag operator notation, we have
$$\Pi(L)\mathbf{Y}_t = \mathbf{c} + \varepsilon_t, \quad \Pi(L) = I_n - \Pi_1 L - \cdots - \Pi_p L^p$$
Note: For $\{\mathbf{Y}_t\}$ to be covariance stationary and ergodic, there must be restrictions on the elements of the $\Pi_k$ matrices. These are discussed in Econ 584.

If $\det(\Pi(z)) = 0$ has all roots outside the complex unit circle, then $\mathbf{Y}_t$ is covariance stationary and ergodic, and the Wold representation for $\mathbf{Y}_t$ can be found by inverting the VAR polynomial:
$$\mathbf{Y}_t = \Pi(L)^{-1}(\mathbf{c} + \varepsilon_t) = \mu + \Psi(L)\varepsilon_t$$
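For a VAR(1), $\det(I_n - \Pi_1 z) = 0$ has all roots outside the unit circle exactly when all eigenvalues of $\Pi_1$ lie inside it. A small sketch of that check (the coefficient matrix is just an illustrative example):

```python
import numpy as np

# Stationarity check for a bivariate VAR(1): eigenvalues of Pi_1 inside the
# unit circle <=> roots of det(I - Pi_1 z) = 0 outside the unit circle.
Pi1 = np.array([[0.5, 0.1],
                [0.2, 0.4]])        # illustrative coefficient matrix
eigvals = np.linalg.eigvals(Pi1)
print("eigenvalues:", eigvals)
print("covariance stationary:", bool(np.all(np.abs(eigvals) < 1.0)))
```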

Long-Run Variance

Let $\mathbf{Y}_t$ be an $(n \times 1)$ stationary and ergodic multivariate time series with $E[\mathbf{Y}_t] = \mu$. Phillips and Solo's CLT for linear processes states that
$$\sqrt{T}(\bar{\mathbf{Y}} - \mu) \xrightarrow{d} N(0, \mathrm{LRV})$$
$$\mathrm{LRV}_{(n \times n)} = \sum_{j=-\infty}^{\infty}\Gamma_j = \Psi(1)\Sigma\Psi(1)', \quad \Psi(1) = \sum_{k=0}^{\infty}\Psi_k$$
Since $\Gamma_{-j} = \Gamma_j'$, LRV may be alternatively expressed as
$$\mathrm{LRV} = \Gamma_0 + \sum_{j=1}^{\infty}(\Gamma_j + \Gamma_j')$$

Nonparametric Estimate of the Long-Run Variance

A consistent estimate of LRV may be computed using nonparametric methods. A popular estimator is the Newey-West weighted autocovariance estimator based on Bartlett weights
$$\widehat{\mathrm{LRV}} = \hat{\Gamma}_0 + \sum_{j=1}^{q(T)-1}\left(1 - \frac{j}{q(T)}\right)\left(\hat{\Gamma}_j + \hat{\Gamma}_j'\right)$$
$$\hat{\Gamma}_j = \frac{1}{T}\sum_{t=j+1}^{T}(\mathbf{Y}_t - \bar{\mathbf{Y}})(\mathbf{Y}_{t-j} - \bar{\mathbf{Y}})', \quad q(T) = \text{bandwidth}$$
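A sketch of this multivariate Newey-West estimator for a $T \times n$ data matrix (function name is mine; only numpy is assumed):

```python
import numpy as np

def lrv_newey_west_multivariate(Y, q):
    """Newey-West LRV estimate with Bartlett weights for a T x n data matrix Y."""
    Y = np.asarray(Y, dtype=float)
    T = Y.shape[0]
    Yc = Y - Y.mean(axis=0)

    def gamma_hat(j):
        # (1/T) sum_{t=j+1}^T (Y_t - Ybar)(Y_{t-j} - Ybar)'
        return Yc[j:].T @ Yc[: T - j] / T

    lrv = gamma_hat(0)
    for j in range(1, q):
        G = gamma_hat(j)
        lrv += (1.0 - j / q) * (G + G.T)
    return lrv
```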

Single Equation Linear GMM with Serial Correlation
$$y_t = \mathbf{z}_t'\delta_0 + \varepsilon_t, \quad t = 1,\ldots,n$$
$\mathbf{z}_t$ = $L \times 1$ vector of explanatory variables
$\delta_0$ = $L \times 1$ vector of unknown coefficients
$\varepsilon_t$ = random error term
$\mathbf{x}_t$ = $K \times 1$ vector of exogenous instruments

Moment Conditions and Identification for the General Model
$$g_t(\mathbf{w}_t,\delta_0) = \mathbf{x}_t\varepsilon_t = \mathbf{x}_t(y_t - \mathbf{z}_t'\delta_0)$$
$$E[g_t(\mathbf{w}_t,\delta_0)] = E[\mathbf{x}_t\varepsilon_t] = E[\mathbf{x}_t(y_t - \mathbf{z}_t'\delta_0)] = 0$$
$$E[g_t(\mathbf{w}_t,\delta)] \neq 0 \ \text{for } \delta \neq \delta_0$$

Serially Correlated Moments

It is assumed that $\{g_t\} = \{\mathbf{x}_t\varepsilon_t\}$ is a mean zero, 1-summable linear process
$$g_t = \Psi(L)\varepsilon_t, \quad \varepsilon_t \sim \mathrm{MDS}(0,\Sigma)$$
satisfying the Phillips-Solo CLT with long-run variance
$$S = \mathrm{LRV} = \Gamma_0 + \sum_{j=1}^{\infty}(\Gamma_j + \Gamma_j') = \Psi(1)\Sigma\Psi(1)'$$
$$\Gamma_0 = E[g_t g_t'] = E[\mathbf{x}_t\mathbf{x}_t'\varepsilon_t^2], \quad \Gamma_j = E[g_t g_{t-j}'] = E[\mathbf{x}_t\mathbf{x}_{t-j}'\varepsilon_t\varepsilon_{t-j}]$$

Remark

If $\{g_t\}$ is not serially correlated, then $\Gamma_j = 0$ for $j > 0$ and $\mathrm{LRV} = \Gamma_0 = E[g_t g_t']$.

Estimation of LRV

LRV is typically estimated using the Newey-West estimator (kernel estimator with Bartlett kernel)
$$\widehat{\mathrm{LRV}} = \hat{\Gamma}_0 + \sum_{j=1}^{q(T)-1}\left(1 - \frac{j}{q(T)}\right)\left(\hat{\Gamma}_j + \hat{\Gamma}_j'\right)$$
$$\hat{\Gamma}_0 = \frac{1}{T}\sum_{t=1}^{T}\hat{g}_t\hat{g}_t' = \frac{1}{T}\sum_{t=1}^{T}\mathbf{x}_t\mathbf{x}_t'\hat{\varepsilon}_t^2$$
$$\hat{\Gamma}_j = \frac{1}{T}\sum_{t=j+1}^{T}\hat{g}_t\hat{g}_{t-j}' = \frac{1}{T}\sum_{t=j+1}^{T}\mathbf{x}_t\mathbf{x}_{t-j}'\hat{\varepsilon}_t\hat{\varepsilon}_{t-j}$$
$$\hat{\varepsilon}_t = y_t - \mathbf{z}_t'\hat{\delta}, \quad \hat{\delta} \xrightarrow{p} \delta_0$$

Remark

The estimate $\widehat{\mathrm{LRV}}$ is often denoted $\hat{S}_{\mathrm{HAC}}$, where HAC stands for heteroskedasticity and autocorrelation consistent.
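A sketch of $\hat{S}_{\mathrm{HAC}}$ for the GMM moments $g_t = \mathbf{x}_t\varepsilon_t$, using the Newey-West bandwidth rule from Remark 5 above as a default (function and argument names are mine; only numpy is assumed):

```python
import numpy as np

def s_hac(X, eps_hat, q=None):
    """Newey-West estimate of S = LRV of g_t = x_t * eps_t (X is T x K, eps_hat has length T)."""
    X, eps_hat = np.asarray(X, float), np.asarray(eps_hat, float)
    T = X.shape[0]
    if q is None:
        q = int(4 * (T / 100.0) ** 0.25)      # Newey-West rule of thumb from Remark 5
    g = X * eps_hat[:, None]                  # T x K matrix of moment contributions

    def gamma_hat(j):
        # (1/T) sum_{t=j+1}^T g_t g_{t-j}'
        return g[j:].T @ g[: T - j] / T

    S = gamma_hat(0)
    for j in range(1, q):
        G = gamma_hat(j)
        S += (1.0 - j / q) * (G + G.T)
    return S
```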

Efficient GMM
$$\hat{\delta}(\hat{S}_{\mathrm{HAC}}^{-1}) = \arg\min_{\delta}\; J(\hat{S}_{\mathrm{HAC}}^{-1},\delta) = \arg\min_{\delta}\; n\,g_n(\delta)'\hat{S}_{\mathrm{HAC}}^{-1}g_n(\delta)$$
As with GMM with heteroskedastic errors, the following efficient GMM estimators can be computed:

1. 2-step efficient

2. Iterated efficient

3. Continuously updated (CU)
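A sketch of the 2-step efficient estimator for the linear model above. Because the moments are linear in $\delta$, the minimizer has the closed form $\hat{\delta}(W) = (S_{xz}'WS_{xz})^{-1}S_{xz}'WS_{xy}$, so no numerical optimizer is needed. The helper s_hac is the sketch from the previous code block, and all names are mine:

```python
import numpy as np

def gmm_two_step(y, Z, X, q=None):
    """2-step efficient GMM for y_t = z_t' delta + eps_t with instruments x_t.

    y: (T,), Z: (T, L), X: (T, K) with K >= L. Uses the closed-form linear GMM
    solution at each step and the s_hac() sketch above for the HAC weight matrix.
    """
    y, Z, X = (np.asarray(a, float) for a in (y, Z, X))
    T = len(y)
    Sxz, Sxy = X.T @ Z / T, X.T @ y / T

    def delta_hat(W):
        A = Sxz.T @ W @ Sxz
        return np.linalg.solve(A, Sxz.T @ W @ Sxy)

    # Step 1: inefficient weight matrix W = (X'X/T)^{-1} (a 2SLS-type first step).
    d1 = delta_hat(np.linalg.inv(X.T @ X / T))
    # Step 2: efficient weight matrix W = S_hac^{-1} built from first-step residuals.
    S = s_hac(X, y - Z @ d1, q)
    return delta_hat(np.linalg.inv(S))
```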

Note: the CU estimator solves
$$\hat{\delta}(\hat{S}_{\mathrm{HAC,CU}}^{-1}) = \arg\min_{\delta}\; n\,g_n(\delta)'\hat{S}_{\mathrm{HAC}}^{-1}(\delta)g_n(\delta)$$
where
$$\hat{S}_{\mathrm{HAC}}(\delta) = \hat{\Gamma}_0(\delta) + \sum_{j=1}^{q(T)-1}\left(1 - \frac{j}{q(T)}\right)\left(\hat{\Gamma}_j(\delta) + \hat{\Gamma}_j(\delta)'\right)$$
$$\hat{\Gamma}_j(\delta) = \frac{1}{T}\sum_{t=j+1}^{T}\mathbf{x}_t\mathbf{x}_{t-j}'\varepsilon_t(\delta)\varepsilon_{t-j}(\delta), \quad \varepsilon_t(\delta) = y_t - \mathbf{z}_t'\delta$$
This can be quite numerically unstable!