The linear regression model: functional form and structural breaks

Ragnar Nymoen
Department of Economics, UiO

16 January 2009
Overview

- Dynamic models: a little bit more about dynamics
- Extending inference to parameters of interest that are non-linear functions of regression coefficients
- Modelling and testing of structural breaks

Main reference is Greene Ch 6.1-6.4. See the course page syllabus for overlapping references to Biørn and Kennedy.
Models with short and long-run derivative coefficients

We ended the lecture 1 slide set with a note on the dynamic model:

y_t = β_2 y_{t-1} + ε_t,  ε_t ~ N(0, 1),  t = 1, ..., T,

which we found can be estimated consistently by OLS despite the correlation between the regressor y_{t-1} and past disturbances. As noted, this suggests that OLS can be used to estimate dynamic models that contain both exogenous regressors and a lagged dependent variable. The autoregressive distributed lag model, ADL, is

y_t = β_1 + β_2 y_{t-1} + β_3 x_t + β_4 x_{t-1} + ε_t,  ε_t ~ N(0, σ²).  (1)

ADLs have obvious relevance in economics (e.g. ECON 34310/4410): impact, dynamic and long-run multipliers.
Greene has numerous references to regression models that are ADLs, cf. Example 4.7 on p. 69. In ADLs, one parameter of interest is the long-run multiplier:

B_2 = (β_3 + β_4)/(1 - β_2).

B_2 is a non-linear function of the parameters of the regression model, so can we test a hypothesis about this parameter of interest, i.e. H_0: B_2 = B_2^0? The answer is that we can, based on an asymptotically valid computation of the variance of B̂_2 = (b_3 + b_4)/(1 - b_2). We use ^ to denote the estimator of the long-run multiplier here. Greene makes reference to the delta method on page 68, but we state directly a result due to Bårdsen (1989):
First, re-write the ADL as

Δy_t = β_1 + (β_2 - 1) y_{t-1} + β_3 Δx_t + (β_3 + β_4) x_{t-1} + ε_t
     = β_1 + α y_{t-1} + β_3 Δx_t + γ x_{t-1} + ε_t,  (2)

where Δ is the difference operator, so Δy_t = y_t - y_{t-1}, and α = β_2 - 1, γ = β_3 + β_4. (1) and (2) give identical SSEs, so statistically they are the same model (although the R²s are very different, since the dependent variable in (2) is Δy_t). (2) is easier to use, since

B̂_2 = (b_3 + b_4)/(1 - b_2) = -γ̂/α̂,

and Var[B̂_2] can be obtained as

Var[B̂_2] ≈ (1/α̂²) Var(γ̂) + (γ̂²/α̂⁴) Var(α̂) - (2γ̂/α̂³) Cov(γ̂, α̂).  (3)
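The equivalence of (1) and (2) can be illustrated with a small simulation. The sketch below (all DGP values are assumed, not from Greene) estimates both forms by OLS and checks that they imply the same long-run multiplier estimate.

```python
import numpy as np

# Simulate an ADL(1,1): y_t = b1 + b2*y_{t-1} + b3*x_t + b4*x_{t-1} + e_t
rng = np.random.default_rng(0)
T = 500
beta1, beta2, beta3, beta4 = 0.5, 0.7, 0.3, 0.2   # assumed true values
x = rng.normal(size=T + 1)
y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = beta1 + beta2 * y[t - 1] + beta3 * x[t] + beta4 * x[t - 1] + rng.normal()

# Levels form (1): regress y_t on (1, y_{t-1}, x_t, x_{t-1})
X1 = np.column_stack([np.ones(T), y[:-1], x[1:], x[:-1]])
b = np.linalg.lstsq(X1, y[1:], rcond=None)[0]          # b1, b2, b3, b4
B_levels = (b[2] + b[3]) / (1 - b[1])

# Reparametrized form (2): regress dy_t on (1, y_{t-1}, dx_t, x_{t-1})
dy, dx = np.diff(y), np.diff(x)
X2 = np.column_stack([np.ones(T), y[:-1], dx, x[:-1]])
c = np.linalg.lstsq(X2, dy, rcond=None)[0]             # b1, alpha, b3, gamma
B_reparam = -c[3] / c[1]                               # -gamma_hat / alpha_hat

# The two agree up to floating point; the DGP long-run multiplier is 0.5/0.3
print(B_levels, B_reparam)
```

The regressors in (2) span the same space as those in (1), which is why the mapping α̂ = b_2 - 1, γ̂ = b_3 + b_4 holds exactly in-sample.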
This method applies to models with K - 1 exogenous regressors, and with higher-order lags in both the dependent variable and in the x's. Estimates of long-run derivative coefficients and their variances are part of the output of PC Give. But Bårdsen's formula is convenient if you use other software, since you only need the covariance matrix of the estimates.
Example: Table 4.7, p. 69 in Greene

Consider the long-run elasticity of gasoline demand with respect to income. Income is variable number 3 in the model, so with γ̂ = 0.164097 and α̂ = -0.169090291,

B̂_3 = -γ̂/α̂ = 0.164097/0.169090291 = 0.97047,

Var[B̂_3] = 0.0030279/(0.169090291)² + ((0.164097)²/(0.169090291)⁴) × 0.0020943 - (2 × 0.164097/(-0.169090291)³) × (-0.0021881) = 0.026349,

which comes close to the delta-method estimate reported by Greene on page 70.
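The arithmetic can be checked directly. The numbers below are copied from the example; the signs follow formula (3):

```python
# gamma_hat and alpha_hat are the reparametrized coefficients; the variances
# and the covariance come from the estimated covariance matrix of the OLS
# coefficients (values as in the slide above).
gamma_hat = 0.164097
alpha_hat = -0.169090291
var_gamma = 0.0030279
var_alpha = 0.0020943
cov_ga = -0.0021881

B_hat = -gamma_hat / alpha_hat
var_B = (var_gamma / alpha_hat**2
         + gamma_hat**2 / alpha_hat**4 * var_alpha
         - 2 * gamma_hat / alpha_hat**3 * cov_ga)
print(B_hat, var_B)   # close to the slide's 0.97047 and 0.026349
```

Note that the cross term is large and negative here; dropping it would roughly triple the variance estimate.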
Linearity in parameters and intrinsic linearity

We have already made the point that linearity in parameters, not linearity in variables, is the defining trait of the linear regression model. We can extend the relevance of the model to the case where our parameters of interest are one-to-one functions of the coefficients of the regression model. Greene p. 119 calls this intrinsic linearity. The long-run multiplier is an example! There are many others in econometrics:

- The CES production function, Greene p. 119 and Ch 16.64
- The natural rate of unemployment: in π_t = β_1 + β_2 u_t + ε_t, where u_t is the rate of unemployment and π_t is inflation, the Phillips-curve natural rate (the u_t at which predicted inflation is zero) is u^phil = -β_1/β_2.
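As a minimal numeric sketch (all estimates and the covariance matrix below are hypothetical, for illustration only), the natural rate and its delta-method variance follow the same pattern as the long-run multiplier:

```python
# Hypothetical Phillips-curve estimates pi_t = b1 + b2*u_t, b2 < 0.
# The implied natural rate u = -b1/b2 is a non-linear function of (b1, b2).
b1, b2 = 0.15, -0.025
var_b1, var_b2, cov_b12 = 0.0004, 0.00002, -0.00008   # assumed covariances

u_nat = -b1 / b2                      # = 6.0 (per cent)
# Delta method: gradient of -b1/b2 is (-1/b2, b1/b2**2)
g1, g2 = -1 / b2, b1 / b2**2
var_u = g1**2 * var_b1 + g2**2 * var_b2 + 2 * g1 * g2 * cov_b12
print(u_nat, var_u)
```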
If we have a hypothesis about when a structural break occurs, we can test that hypothesis. Let T_1 denote the last period of the old regime and let T_1 + 1 denote the first period of the new; then

y_t = β_1 + β_2 X_2t + ε_t,  t = 1, 2, ..., T_1, and
y_t = γ_1 + γ_2 X_2t + ε_t,  t = T_1 + 1, ..., T.

H_0: β_1 = γ_1, β_2 = γ_2  vs  H_1: β_1 ≠ γ_1 or β_2 ≠ γ_2.

In the multivariate case: H_0: β_1 = γ_1, β_2 = γ_2, β_3 = γ_3, ..., β_K = γ_K. There are two well-known statistics for these cases, both due to Chow (1960) and referred to as Chow tests.
2-sample Chow-test

SSE_1 is for the first sample (t = 1, 2, ..., T_1), SSE_2 is for the second, and SSE_U = SSE_1 + SSE_2. SSE_R is the SSE when the whole sample is used, i.e. under H_0:

F_Chow2 = ((SSE_R - SSE_U)/2) / (SSE_U/(T - 4)) ~ F(2, T - 4).

In general,

F_Chow2 = ((SSE_R - SSE_U)/K) / (SSE_U/(T - 2K)) ~ F(K, T - 2K).
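A minimal sketch of the 2-sample Chow test on simulated data (break date, coefficients and sample sizes all assumed for illustration):

```python
import numpy as np

# Simulate a bivariate regression with a break at T1: both intercept and
# slope change in the second regime (K = 2 parameters per regime).
rng = np.random.default_rng(1)
T, T1, K = 120, 60, 2
x = rng.normal(size=T)
y = np.where(np.arange(T) < T1,
             1.0 + 0.5 * x,      # regime 1
             2.0 + 1.5 * x)      # regime 2
y = y + rng.normal(size=T)

def sse(y, x):
    """SSE from OLS of y on an intercept and x."""
    X = np.column_stack([np.ones(len(x)), x])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

sse_r = sse(y, x)                                   # restricted: pooled sample
sse_u = sse(y[:T1], x[:T1]) + sse(y[T1:], x[T1:])   # unrestricted: two regimes
F = ((sse_r - sse_u) / K) / (sse_u / (T - 2 * K))
print(F)   # compare with the F(2, 116) 5% critical value, about 3.07
```

With a break of this size the statistic is far above the critical value, so parameter constancy is rejected.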
Predictive Chow-test

Consider T - T_1 < K, so that the second sample has too few observations to estimate the model on it separately. We use the same SSE_R (full sample), but SSE_U is computed on the basis of the first T_1 observations only. The predictive Chow-test is given as

F_ChowP = ((SSE_R - SSE_U)/(T - T_1)) / (SSE_U/(T_1 - K)) ~ F(T - T_1, T_1 - K).

If we have no clear idea about the dating of a regime shift, graphs of the whole sequence of predictive Chow tests are useful. Chow tests rely on constant and equal variances of the disturbances. Hence, it is good practice to also plot the sequence of s² as a function of t.
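A sketch of the predictive variant, with only T - T_1 = 5 post-break observations, fewer than the K = 2 parameters (all numbers hypothetical):

```python
import numpy as np

# Simulate a level shift confined to the last 5 observations.
rng = np.random.default_rng(2)
T, T1, K = 100, 95, 2
x = rng.normal(size=T)
y = 1.0 + 0.5 * x + rng.normal(size=T)
y[T1:] += 3.0                          # assumed shift in the forecast period

def sse(y, x):
    """SSE from OLS of y on an intercept and x."""
    X = np.column_stack([np.ones(len(x)), x])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

sse_r = sse(y, x)              # full sample
sse_u = sse(y[:T1], x[:T1])    # estimation sample only
F = ((sse_r - sse_u) / (T - T1)) / (sse_u / (T1 - K))
print(F)   # compare with F(T - T1, T1 - K) = F(5, 93) critical values
```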
Testing by dummies

Dummy variables are flexible tools for modelling and testing parameter changes in both cross-section and time-series data, Greene Ch. 6.2. Consider a temporary change in β_1 in period T_1:

H_0: β_1 = γ_1  vs  H_1: β_1 ≠ γ_1 for t = T_1.

Test H_0 with the t-statistic of λ_1 = 0 in

y_t = β_1 + λ_1 D_t + β_2 x_t + ε_t,  t = 1, 2, ..., T,

where D_t = 1 when t = T_1 and 0 elsewhere.
If the break also affects the slope, use

y_t = β_1 + λ_1 D_t + β_2 x_t + λ_2 x_t D_t + ε_t,  t = 1, 2, ..., T,

to test H_0: λ_1 = λ_2 = 0 vs H_1: λ_1 ≠ 0 or λ_2 ≠ 0. The F-statistic is distributed F(2, T - 4), since SSE_U is based on 3 regressors and an intercept.
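A sketch of the dummy-variable F-test, here with a step dummy for a permanent shift rather than the single-period dummy above (break date, coefficients and data all assumed):

```python
import numpy as np

# Simulate a permanent break in intercept and slope at t = T1.
rng = np.random.default_rng(3)
T, T1 = 100, 50
x = rng.normal(size=T)
D = (np.arange(T) >= T1).astype(float)    # step dummy: 0 before T1, 1 after
y = 1.0 + 0.5 * x + 0.8 * D + 0.7 * x * D + rng.normal(size=T)

def sse(cols, y):
    """SSE from OLS of y on the given list of regressor columns."""
    X = np.column_stack(cols)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

ones = np.ones(T)
sse_u = sse([ones, D, x, x * D], y)   # unrestricted: 3 regressors + intercept
sse_r = sse([ones, x], y)             # restricted: lambda1 = lambda2 = 0
F = ((sse_r - sse_u) / 2) / (sse_u / (T - 4))   # F(2, T - 4) under H0
print(F)   # compare with the F(2, 96) 5% critical value, about 3.09
```

With a full-sample step dummy this F-test is numerically equivalent to the 2-sample Chow test, which is why dummies are a convenient way to implement it.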