Econometrics 2
Linear Regression with Time Series Data
Heino Bohn Nielsen

Outline
(1) The linear regression model, identification, and estimation.
(2) Assumptions and results:
    (a) Consistency.
    (b) Unbiasedness and bias in dynamic models.
    (c) Asymptotic distribution.
(3) Autocorrelation of the error term:
    (a) Consequences of autocorrelation.
    (b) Interpretations of residual autocorrelation.
    (c) Tests for no-autocorrelation.
Examples of Time Series Regression Models

Consider the linear regression model
    y_t = x_t'β + ε_t = x_{1t}β_1 + x_{2t}β_2 + ... + x_{kt}β_k + ε_t,   t = 1, 2, ..., T.
Note that y_t and ε_t are 1×1, while x_t and β are k×1. The properties of the model depend on the variables in x_t.
(1) If x_t contains only contemporaneously dated variables, the model is called a static regression:
        y_t = x_t'β + ε_t.
(2) A simple model for y_t given its own past is the autoregressive (AR) model:
        y_t = θ y_{t-1} + ε_t.
(3) More complicated dynamics appear in the autoregressive distributed lag (ADL) model:
        y_t = θ_1 y_{t-1} + x_t'φ_0 + x_{t-1}'φ_1 + ε_t.

Interpretation of Regression Models

Consider again the regression model
    y_t = x_t'β + ε_t,   t = 1, 2, ..., T.   (∗)
As it stands, the equation is a tautology: it is not informative about β. Why? For any β we can find a residual ε_t so that (∗) holds.
We have to impose restrictions on ε_t to ensure a unique solution to (∗). This is called identification in econometrics.
Assume that (∗) represents the conditional expectation, E[y_t | x_t] = x_t'β, so that
    E[ε_t | x_t] = 0.   (∗∗)
This is a zero-conditional-mean condition; we say that x_t is predetermined.
Under assumption (∗∗) the coefficients are the partial (ceteris paribus) effects
    ∂E[y_t | x_t] / ∂x_{jt} = β_j.
Identification

Predeterminedness implies the so-called moment condition
    E[x_t ε_t] = 0,   (∗∗∗)
stating that x_t and ε_t are uncorrelated.
Now insert the model definition, ε_t = y_t − x_t'β, in (∗∗∗) to obtain
    E[x_t (y_t − x_t'β)] = 0
    E[x_t y_t] − E[x_t x_t']β = 0.
This is a system of k equations in the k unknown parameters β, and if E[x_t x_t'] is non-singular we can find the so-called population estimator
    β = E[x_t x_t']^{-1} E[x_t y_t],
which is unique.
The parameters in β are identified by (∗∗∗) and the non-singularity condition. The latter is the well-known condition of no perfect multicollinearity.

Method of Moments (MM) Estimation

From a given sample (y_t, x_t')', t = 1, 2, ..., T, we cannot calculate expectations. In practice we replace them with sample averages and obtain the MM or OLS estimator
    β̂ = (T^{-1} Σ_{t=1}^{T} x_t x_t')^{-1} (T^{-1} Σ_{t=1}^{T} x_t y_t).
For MM to work, i.e. β̂ → β, we need a law of large numbers (LLN) to apply, i.e.
    T^{-1} Σ x_t y_t → E[x_t y_t]   and   T^{-1} Σ x_t x_t' → E[x_t x_t'].
Note the two distinct conditions for OLS to converge to the true value:
(1) The moment condition (∗∗∗), i.e. predeterminedness, should be satisfied.
(2) A law of large numbers should apply.
A central part of econometric analysis is to ensure these conditions.
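The sample-moment construction of the OLS/MM estimator can be sketched in a few lines; the simulated DGP, coefficient values, and sample size below are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Sketch: beta_hat = (T^{-1} sum x_t x_t')^{-1} (T^{-1} sum x_t y_t),
# the sample analogue of beta = E[x_t x_t']^{-1} E[x_t y_t].
rng = np.random.default_rng(0)
T = 500
x = np.column_stack([np.ones(T), rng.normal(size=T)])  # constant + one regressor
beta_true = np.array([1.0, 0.5])                       # illustrative values
y = x @ beta_true + rng.normal(size=T)                 # epsilon_t ~ N(0, 1)

Sxx = x.T @ x / T          # sample analogue of E[x_t x_t']
Sxy = x.T @ y / T          # sample analogue of E[x_t y_t]
beta_hat = np.linalg.solve(Sxx, Sxy)
print(beta_hat)            # close to (1.0, 0.5) for moderate T
```

Dividing both moments by T changes nothing numerically, but makes the link to the population moments explicit.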
Main Assumption

We impose assumptions to ensure that a LLN applies to the sample averages.
Main Assumption: Consider a time series y_t and the k×1 vector time series x_t. We assume
(1) that z_t = (y_t, x_t')' has a joint stationary distribution; and
(2) that the process z_t is weakly dependent, so that z_t and z_{t+p} become approximately independent as p → ∞.
Interpretation: Think of (1) as replacing identical distributions for IID data. Think of (2) as replacing independent observations for IID data.
Under the main assumption, most of the results for linear regression on random samples carry over to the time series case.

Consistency

Consistency is the first requirement for an estimator: β̂ should converge to β.
Result 1: Consistency. Let y_t and x_t obey the main assumption. If the regressors obey the moment condition, E[x_t ε_t] = 0, then the OLS estimator is consistent, i.e. β̂ → β as T → ∞.
To get consistency of OLS, the explanatory variables should be predetermined. OLS is consistent if the regression represents the conditional expectation of y_t given x_t.
The conditions in Result 1 are sufficient but not necessary. We will see examples of estimators that are consistent even if the conditions are not satisfied (related to unit roots and cointegration).
Illustration of Consistency

Consider the regression model with a single explanatory variable, k = 1,
    y_t = x_t β + ε_t,   t = 1, 2, ..., T.
Write the OLS estimator as
    β̂ = (T^{-1} Σ_{t=1}^{T} y_t x_t) / (T^{-1} Σ_{t=1}^{T} x_t²)
      = (T^{-1} Σ (x_t β + ε_t) x_t) / (T^{-1} Σ x_t²)
      = β + (T^{-1} Σ ε_t x_t) / (T^{-1} Σ x_t²),
and look at the terms as T → ∞:
    plim T^{-1} Σ x_t² = q,   0 < q < ∞,   (◦)
    plim T^{-1} Σ ε_t x_t = E[ε_t x_t] = 0,   (◦◦)
where the LLN applies under the main assumption; (◦) holds for a stationary process (q is the limiting variance of x_t), and (◦◦) follows from predeterminedness.

Unbiasedness

A stronger requirement for an estimator β̂ is unbiasedness: E[β̂] = β.
Result 2: Unbiasedness. Let y_t and x_t obey the main assumption. If the regressors are strictly exogenous,
    E[ε_t | x_1, x_2, ..., x_t, ..., x_T] = 0,
then the OLS estimator is unbiased, i.e. E[β̂ | x_1, x_2, ..., x_T] = β.
Unbiasedness requires strict exogeneity, which is not fulfilled in a dynamic regression. Consider the first-order autoregressive model
    y_t = θ y_{t-1} + ε_t.
Here y_t is a function of ε_t, so ε_t cannot be uncorrelated with y_t, y_{t+1}, ..., y_T.
Result 3: Estimation bias in dynamic models. In general, the OLS estimator is biased in a regression with a lagged dependent variable.
Finite Sample Bias in an AR(1)

In a Monte Carlo simulation we take an AR(1) as both the DGP and the estimation model:
    y_t = 0.9 y_{t-1} + ε_t,   ε_t ∼ N(0, 1).
[Figure: Mean of the OLS estimate in the AR(1) model for sample sizes T = 10, 20, ..., 100, shown together with the true value 0.9 and the bands MEAN ± 2 MCSD.]

Asymptotic Distribution

To derive the asymptotic distribution we need a CLT, and hence additional restrictions on ε_t.
Result 4: Asymptotic distribution. Let y_t and x_t obey the main assumption. Furthermore, assume homoskedasticity and no serial correlation, i.e.
    E[ε_t² | x_t] = σ²
    E[ε_t ε_s | x_t, x_s] = 0   for all t ≠ s.
Then as T → ∞, the OLS estimator is asymptotically normal:
    √T (β̂ − β) → N(0, σ² E[x_t x_t']^{-1}).
Inserting natural estimators, we can test hypotheses using
    β̂ ≈ N(β, σ̂² (Σ_{t=1}^{T} x_t x_t')^{-1}).
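The Monte Carlo experiment behind the figure can be sketched as follows; the sample size T = 25, the burn-in length, and the number of replications are illustrative choices, not values stated on the slides.

```python
import numpy as np

# Monte Carlo sketch of finite-sample bias of OLS in the AR(1)
# y_t = 0.9 y_{t-1} + eps_t, eps_t ~ N(0, 1).
rng = np.random.default_rng(42)

def ols_ar1_estimate(T, theta=0.9, burn=50):
    """Simulate one AR(1) path and return the OLS estimate of theta."""
    eps = rng.normal(size=T + burn)
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        y[t] = theta * y[t - 1] + eps[t]
    y = y[burn:]                      # drop burn-in to approximate stationarity
    ylag, ycur = y[:-1], y[1:]
    return (ylag @ ycur) / (ylag @ ylag)

estimates = [ols_ar1_estimate(T=25) for _ in range(2000)]
print(np.mean(estimates))  # noticeably below the true value 0.9
```

Repeating this over a grid of sample sizes T reproduces the pattern in the figure: the bias shrinks as T grows, consistent with OLS being consistent but biased in dynamic models.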
Complete Dynamic Model

The precise condition for no serial correlation looks strange. Often we disregard the conditioning and simply consider whether ε_t and ε_s are uncorrelated.
We say that a model is dynamically complete if
    E[y_t | x_t, y_{t-1}, x_{t-1}, y_{t-2}, x_{t-2}, ..., y_1, x_1] = E[y_t | x_t] = x_t'β,
i.e. x_t contains all relevant information in the available information set.
No serial correlation is practically the same as dynamic completeness: all systematic information in the past of y_t and x_t is used in the regression model.
This is often taken as an important design criterion for a dynamic regression model. We should always test for no-autocorrelation in time series models.

Autocorrelation of the Error Term

If Cov(ε_t, ε_s) ≠ 0 for some t ≠ s, we have autocorrelation of the error term.
This is detected from the estimated residuals, and Cov(ε̂_t, ε̂_s) ≠ 0 is referred to as residual autocorrelation. The two terms are often used synonymously, but residual autocorrelation does not imply that the DGP has autocorrelated errors.
Autocorrelation is taken as a signal of misspecification. Different possibilities:
(i) Autoregressive errors in the DGP.
(ii) Dynamic misspecification.
(iii) Omitted variables and non-modelled structural shifts.
(iv) Misspecified functional form.
The solution to the problem depends on the interpretation.
Consequences of Autocorrelation

Autocorrelation will not violate the assumptions for Result 1 in general. But E[x_t ε_t] = 0 is violated if the model includes a lagged dependent variable.
Look at an AR(1) model with error autocorrelation, i.e. the two equations
    y_t = θ y_{t-1} + ε_t
    ε_t = ρ ε_{t-1} + v_t,   v_t ∼ IID(0, σ_v²).
Both y_{t-1} and ε_t depend on ε_{t-1}, so E[y_{t-1} ε_t] ≠ 0.
Result 5: Inconsistency of OLS. In a regression model including the lagged dependent variable, the OLS estimator is not consistent in the presence of autocorrelation of the error term.
Even if OLS is consistent, the standard variance formula in Result 4 is no longer valid. It is possible to derive the correct variance, leading to so-called heteroskedasticity-and-autocorrelation-consistent (HAC) standard errors.

(i): Autoregressive Errors in the DGP

Consider the case where the errors are truly autoregressive:
    y_t = x_t'β + ε_t
    ε_t = ρ ε_{t-1} + v_t,   v_t ∼ IID(0, σ_v²).
If ρ is known we can write
    (y_t − ρ y_{t-1}) = (x_t − ρ x_{t-1})'β + (ε_t − ρ ε_{t-1})
    y_t = ρ y_{t-1} + x_t'β − x_{t-1}'ρβ + v_t.
The transformation is analogous to the GLS transformation in the case of heteroskedasticity.
The GLS model is subject to a so-called common factor restriction: three blocks of regressors but only two sets of parameters, ρ and β. Estimation is non-linear.
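Result 5 can be illustrated by simulation; the parameter values θ = ρ = 0.5 and the sample size are assumptions chosen to make the inconsistency visible, not values from the slides.

```python
import numpy as np

# Sketch: OLS in y_t = theta*y_{t-1} + eps_t is inconsistent when
# eps_t = rho*eps_{t-1} + v_t, because y_{t-1} is correlated with eps_t.
rng = np.random.default_rng(1)
T, theta, rho = 20000, 0.5, 0.5   # large T to approximate the probability limit
v = rng.normal(size=T)
eps = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + v[t]
    y[t] = theta * y[t - 1] + eps[t]

ylag, ycur = y[:-1], y[1:]
theta_hat = (ylag @ ycur) / (ylag @ ylag)
print(theta_hat)  # settles well above theta = 0.5 even as T grows
```

Increasing T does not remove the gap between the estimate and θ, which is exactly what inconsistency means; a consistent estimator would close the gap.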
Consistent estimation of the parameters in the GLS model requires
    E[(x_t − ρ x_{t-1})(ε_t − ρ ε_{t-1})] = 0.
But then ε_{t-1} should be uncorrelated with x_t, i.e. E[ε_t x_{t+1}] = 0. Consistency of GLS therefore requires stronger assumptions than consistency of OLS.
The GLS transformation is rarely used in modern econometrics:
(1) Residual autocorrelation does not imply that the error term is autoregressive. There is no a priori reason to believe that the transformation is correct.
(2) The requirement for consistency of GLS is strong.

(ii): Dynamic Misspecification

Residual autocorrelation indicates that the model is not dynamically complete, so
    E[y_t | x_t] ≠ E[y_t | x_t, y_{t-1}, x_{t-1}, y_{t-2}, x_{t-2}, ..., y_1, x_1].
The (dynamic) model is misspecified and should be reformulated. The natural remedy is to extend the list of lags of x_t and y_t.
If the autocorrelation seems to be of order one, a starting point is the GLS transformation. But the AR(1) structure is only indicative, and we instead look at the unrestricted ADL model
    y_t = α_0 y_{t-1} + x_t'α_1 + x_{t-1}'α_2 + η_t.
The finding of AR(1) errors is only used to extend the list of regressors; the common factor (COMFAC) restriction is removed.
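The recommended remedy, estimating the unrestricted ADL(1,1) by OLS rather than imposing the common factor restriction, can be sketched as follows; the DGP coefficients (0.6, 1.0, −0.3) and sample size are illustrative assumptions.

```python
import numpy as np

# Sketch: estimate y_t = a0*y_{t-1} + a1*x_t + a2*x_{t-1} + eta_t by OLS.
# No non-linear common-factor restriction is imposed on (a0, a1, a2).
rng = np.random.default_rng(5)
T = 400
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * y[t - 1] + 1.0 * x[t] - 0.3 * x[t - 1] + rng.normal()

Z = np.column_stack([y[:-1], x[1:], x[:-1]])   # (y_{t-1}, x_t, x_{t-1})
a = np.linalg.lstsq(Z, y[1:], rcond=None)[0]
print(a)  # close to (0.6, 1.0, -0.3) in large samples
```

If the data really satisfied the common factor restriction, one would have a2 = −a0·a1 up to sampling error; the ADL form lets the data decide instead of imposing it.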
(iii): Omitted Variables

Omitted variables can in general also produce autocorrelation. Let the DGP be
    y_t = x_{1t}β_1 + x_{2t}β_2 + ε_t,   (†)
and consider the estimation model
    y_t = x_{1t}β_1 + u_t.   (††)
Then the error term is u_t = x_{2t}β_2 + ε_t, which is autocorrelated if x_{2t} is persistent.
An example is a DGP exhibiting a level shift, e.g. (†) includes the dummy variable
    x_{2t} = 0 for t < T_0,   x_{2t} = 1 for t ≥ T_0.
If x_{2t} is not included in (††), the residuals will be systematic. Again the solution is to extend the list of regressors, x_t.

(iv): Misspecified Functional Form

If the true relationship
    y_t = g(x_t) + ε_t
is non-linear, the residuals from a linear regression will typically be autocorrelated. The solution is to reformulate the functional form of the regression line.
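The level-shift example can be made concrete in a few lines; the break date T_0, the coefficients, and the sample size are assumptions made for the sketch.

```python
import numpy as np

# Sketch: omitting a level-shift dummy leaves systematic, autocorrelated
# residuals in the short regression of y_t on a constant only.
rng = np.random.default_rng(7)
T, T0 = 200, 100
x2 = (np.arange(T) >= T0).astype(float)   # level-shift dummy, omitted below
y = 1.0 + 2.0 * x2 + rng.normal(size=T)   # DGP: constant + shift + noise

beta1_hat = y.mean()                      # OLS of y on the constant alone
resid = y - beta1_hat
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]  # first-order autocorrelation
print(r1)  # strongly positive: residuals inherit the un-modelled shift
```

The residuals sit below zero before T_0 and above zero after it, so neighbouring residuals share a sign; that persistence is what the autocorrelation tests pick up.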
Test for No-Autocorrelation

Let ε̂_t (t = 1, 2, ..., T) be the residuals from the original regression model y_t = x_t'β + ε_t.
A test for no-autocorrelation is based on the hypothesis γ = 0 in the auxiliary regression
    ε̂_t = x_t'δ + γ ε̂_{t-1} + u_t,
where x_t is included because it may be correlated with ε_{t-1}.
A valid test is the t-ratio for γ = 0. Alternatively there is the Breusch-Godfrey LM test
    LM = T·R² → χ²(1).
Note that x_t and ε̂_t are orthogonal, so any explanatory power in the auxiliary regression is due to ε̂_{t-1}.
The Durbin-Watson (DW) test is derived for finite samples, but it is based on strict exogeneity and is therefore not valid in many models.
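A minimal version of the auxiliary regression and the LM statistic can be sketched as below; the simulated data are illustrative, and in practice one would use a packaged implementation rather than this hand-rolled sketch.

```python
import numpy as np

# Sketch of the Breusch-Godfrey test for first-order autocorrelation:
# regress residuals on (x_t, resid_{t-1}); LM = T*R^2 ~ chi^2(1) under H0.
rng = np.random.default_rng(3)
T = 300
x = np.column_stack([np.ones(T), rng.normal(size=T)])
y = x @ np.array([1.0, 0.5]) + rng.normal(size=T)   # DGP with no autocorrelation

b = np.linalg.lstsq(x, y, rcond=None)[0]
e = y - x @ b                                       # OLS residuals e_t

Z = np.column_stack([x[1:], e[:-1]])                # auxiliary regressors
g = np.linalg.lstsq(Z, e[1:], rcond=None)[0]
fit = Z @ g
R2 = 1 - np.sum((e[1:] - fit) ** 2) / np.sum((e[1:] - e[1:].mean()) ** 2)
LM = (T - 1) * R2
print(LM)  # compare with the chi^2(1) critical value 3.84 at the 5% level
```

Because x_t is orthogonal to the residuals by construction, any explanatory power in the auxiliary regression comes from the lagged residual, exactly as the slide notes.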