Reliability of inference (1 of 2 lectures)
Ragnar Nymoen, University of Oslo, 5 March 2013 1 / 19
This lecture (#13 and 14)
The optimality of the OLS estimators and tests depends on the assumptions of the regression model being true. We now answer three questions: 1. What are the consequences of residual mis-specification (heteroskedasticity, autocorrelation, non-normality) and of non-constancy of parameters? 2. How can we discover empirically departures from the classical assumptions and from constancy of parameters? 3. What do we do if departures from the assumptions are detected? Lecture based on Ch. 8 in HGL; Ch. 8 in BN. 2 / 19
Departures from the normality assumption
The OLS estimators are BLUE even if the disturbances have a distribution that is different from the normal. They are also consistent. The problem with non-normality is that we do not know the exact (finite-sample) distribution of the t-ratios and F-statistics that we use in testing. Inference may become unreliable, at least in small samples. With large samples (n >= 30) we can often appeal to asymptotic normality under relatively mild assumptions. To test for departures from normality: the Jarque-Bera (J-B) test. 3 / 19
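The J-B statistic is easy to compute directly from sample skewness and kurtosis. The sketch below is a minimal numpy implementation on simulated data (it is not HGL's or any package's code); under H0 the statistic is asymptotically chi-squared with 2 degrees of freedom.

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4),
    where S is sample skewness and K is sample kurtosis.
    Under H0 (normality), JB ~ chi2(2) asymptotically."""
    x = np.asarray(x, dtype=float)
    n = x.size
    m = x - x.mean()
    m2 = np.mean(m**2)            # second central moment
    S = np.mean(m**3) / m2**1.5   # skewness (0 for the normal)
    K = np.mean(m**4) / m2**2     # kurtosis (3 for the normal)
    return n / 6.0 * (S**2 + (K - 3.0)**2 / 4.0)

rng = np.random.default_rng(42)
jb_normal = jarque_bera(rng.standard_normal(5000))
jb_skewed = jarque_bera(rng.exponential(size=5000))
print(jb_normal, jb_skewed)  # the skewed sample gives a far larger statistic
```

Compare the statistic with the chi2(2) critical value (5.99 at the 5 percent level) to decide whether to reject normality.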
Heteroskedasticity I
If the variances of the disturbances are not all identical, the homoskedasticity assumption Var(ε_i) = σ² of the regression model is violated, and we have heteroskedasticity. Note that the definition of heteroskedasticity is in terms of the theoretical disturbances. In practice the question is therefore whether the heteroskedasticity that we observe in the residuals is a sign of a significant departure from homoskedasticity. 4 / 19
Consequences of heteroskedasticity I
Without loss of generality, consider the consequences in the simple regression model. The OLS estimator β̂_1 is unbiased also in the case of heteroskedastic disturbances, since Var(ε_i) = σ²_i does not enter into the proof of unbiasedness. By the same argument, the OLS estimator β̂_1 is also consistent. 5 / 19
Consequences of heteroskedasticity II
The OLS estimator is no longer efficient (BLUE), and the conventional variance formula

  Var̂(β̂_1) = σ̂² / Σ(X_i − X̄)²

will either over- or underestimate the true variance of β̂_1 when Var(ε_i) = σ²_i: the estimated variance is biased! The t-ratio, which makes use of this expression, will also be biased. This means that statistical inference is no longer reliable under heteroskedasticity. The direction of the bias in Var̂(β̂_1) depends on the direction of the association between (X_i − X̄)² and σ²_i. If (X_i − X̄)² and σ²_i are positively related, the formula underestimates the true variance. Hence we will make Type I errors more frequently than the nominal significance level. 6 / 19
Consequences of heteroskedasticity III
If (X_i − X̄)² and σ²_i are negatively related, the (absolute value of the) t-ratio is underestimated. Hence we will conclude too often that a regressor is insignificant when it is in fact significant. 7 / 19
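The over-rejection under positively related (X_i − X̄)² and σ²_i can be illustrated by a small Monte Carlo experiment. The sketch below (simulated data, not an HGL example) makes the disturbance standard deviation grow with |X_i − X̄| and counts how often a conventional t-test rejects a true H0: β_1 = 0.

```python
import numpy as np

# Monte Carlo illustration: sigma_i positively related to (X_i - Xbar)^2
# makes the conventional t-test reject a true H0: beta1 = 0 more often
# than the nominal 5 percent level.
rng = np.random.default_rng(1)
n, reps, crit = 50, 2000, 2.01   # crit ~ two-sided 5% value of t(48)
x = np.linspace(1.0, 10.0, n)
xc = x - x.mean()
sigma_i = 0.5 + np.abs(xc)       # disturbance sd grows with |X_i - Xbar|

rejections = 0
for _ in range(reps):
    y = 1.0 + 0.0 * x + sigma_i * rng.standard_normal(n)  # true beta1 = 0
    b1 = np.sum(xc * y) / np.sum(xc**2)        # OLS slope
    resid = y - y.mean() - b1 * xc
    s2 = np.sum(resid**2) / (n - 2)
    se_b1 = np.sqrt(s2 / np.sum(xc**2))        # conventional (homoskedastic) SE
    if abs(b1 / se_b1) > crit:
        rejections += 1

print(rejections / reps)  # noticeably above the nominal 0.05
```

The empirical rejection rate exceeds the nominal level because the conventional standard error underestimates the true sampling variability of β̂_1 in this design.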
Forms of heteroskedasticity I
In the following it will be necessary to assume some form for the variation in σ²_i. A form that is sometimes referred to as classical heteroskedasticity is

  σ²_i = σ² W_i^h, with h > 0   (1)

where W_i is an observable variable. Harald's first lecture mentioned this for the situation where the scatter plot suggested Var(Y|X) = σ²X². 8 / 19
Forms of heteroskedasticity II
As we shall see, if heteroskedasticity is of this type (with h known), the problem it creates for inference is easily corrected. However, for the purpose of testing, (1) is inconvenient, since the null hypothesis of homoskedasticity cannot be formulated as a parametric restriction on a general model with h > 0. Therefore, mixed heteroskedasticity forms have been suggested. This form models the variances as a function of s observable variables:

  σ²_i = a_0 + a_1 Z_1i + ... + a_s Z_si   (2)

9 / 19
Forms of heteroskedasticity III
i.e., a variance function. The null hypothesis of homoskedasticity is H_0: a_1 = a_2 = ... = a_s = 0. (A good feature of the tests below is that they have power to reject H_0 even when the variance function is non-linear in the Z's.) A form that is much used in models for time-series data, and in financial econometrics in particular, is autoregressive conditional heteroskedasticity, ARCH. The first-order ARCH model is

  σ²_t = a_0 + a_1 ε²_{t−1}   (3)

10 / 19
Testing the null of homoskedasticity I
Main point: the residuals contain all the variation in Y_i that is unexplained by our model, i.e., by the specification we have chosen for the conditional expectation:

  ε̂_i = Y_i − Ê(Y_i | X_i)

Therefore we can use the residuals to test the assumptions we have made about the disturbances of the regression model. Mis-specification testing is a large field, and we can only mention a few popular tests here. 11 / 19
Testing the null of homoskedasticity II
Informal tests: as Harald showed, the scatter plot is often instructive as an informal test. After estimating a model that assumes homoskedasticity, ε̂_i (or ε̂²_i) can be plotted against X_i. Formal test: we mention White's version of the Lagrange multiplier test in Ch. 8.2.2 in HGL. In the case of one regressor, White's test replaces the theoretical variance function (2) with

  ε̂²_i = a_0 + a_1 X_i + a_2 X²_i + v_i,  i = 1, 2, ..., n   (4)

where v_i is a disturbance with assumed classical properties. 12 / 19
Testing the null of homoskedasticity III
(4) is an example of an auxiliary regression that simplifies the testing of residual mis-specification. In this case H_0: a_1 = a_2 = 0 is tested with an "existence of a relationship" test:

  F_het = (R²_het / (1 − R²_het)) · ((n − 3)/2) ~ F(2, n − 3)

where R²_het is the R-squared from the auxiliary regression (4). HGL also mentions the χ² version of this test:

  χ²_het(2) = n · R²_het

The two are equivalent in large samples. Research shows that the F version has better properties in small samples. 13 / 19
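The two steps of White's test, the original OLS regression followed by the auxiliary regression (4), can be sketched with numpy alone. The data below are simulated with a standard deviation proportional to X_i, so the test should reject; variable names are illustrative and not from HGL.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
x = rng.uniform(1.0, 10.0, n)
eps = 0.5 * x * rng.standard_normal(n)   # sd proportional to x: heteroskedastic
y = 2.0 + 1.5 * x + eps

def ols_resid_r2(X, y):
    """Residuals and R-squared from OLS of y on X (X includes a constant)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    r2 = 1.0 - np.sum(e**2) / np.sum((y - y.mean())**2)
    return e, r2

# Step 1: OLS of y on (1, x); keep the residuals
X = np.column_stack([np.ones(n), x])
e, _ = ols_resid_r2(X, y)

# Step 2: auxiliary regression (4) of e^2 on (1, x, x^2)
Z = np.column_stack([np.ones(n), x, x**2])
_, r2_het = ols_resid_r2(Z, e**2)

F_het = (r2_het / (1.0 - r2_het)) * (n - 3) / 2.0   # ~ F(2, n-3) under H0
chi2_het = n * r2_het                               # ~ chi2(2) under H0
print(F_het, chi2_het)
```

With this strongly heteroskedastic design both statistics land far above their 5 percent critical values (about 3.0 for F(2, n − 3) and 5.99 for chi2(2)).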
Testing the null of homoskedasticity IV
In the case of k regressors in the regression model, we get

  F_het = (R²_het / (1 − R²_het)) · ((n − 2k − 1)/(2k)) ~ F(2k, n − 2k − 1)

when all the squared regressors are included. With k > 1 we can also include cross-products of the regressors, but since the number of cross-products,

  k! / ((k − 2)! 2!) = k(k − 1)/2,

increases rapidly in k, this is not practical in moderate sample sizes. A common case is that the k-variable regression model contains both continuous variables and dummies as regressors; since the square of a dummy equals the dummy itself, such squared terms must be dropped from the auxiliary regression. 14 / 19
Testing the null of homoskedasticity V
Testing for ARCH: in this case the auxiliary regression is the direct counterpart of the ARCH formulation in (3):

  ε̂²_t = a_0 + a_1 ε̂²_{t−1} + v_t,

and we test H_0: a_1 = 0 against a_1 ≠ 0. Hence report

  F_arch(1, T − 2) = (R²_het / (1 − R²_het)) · (T − 2),

even if a two-sided t-test can also be used. Higher-order ARCH effects: include longer lags of ε̂²_t and adjust the degrees of freedom accordingly. 15 / 19
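The first-order ARCH test can be sketched the same way: simulate an ARCH(1) series, regress the squared residuals on their own lag, and compare T·R² with the chi2(1) critical value. This is simulated data in numpy, not HGL's example; parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 1000
a0, a1 = 0.4, 0.5                     # true ARCH(1) parameters
e = np.zeros(T)
sigma2 = a0 / (1.0 - a1)              # start at the unconditional variance
for t in range(T):
    e[t] = np.sqrt(sigma2) * rng.standard_normal()
    sigma2 = a0 + a1 * e[t]**2        # conditional variance for next period

# Auxiliary regression: e_t^2 on a constant and e_{t-1}^2
y = e[1:]**2
Z = np.column_stack([np.ones(T - 1), e[:-1]**2])
b, *_ = np.linalg.lstsq(Z, y, rcond=None)
resid = y - Z @ b
r2 = 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)

lm_arch = (T - 1) * r2                # compare with chi2(1): 3.84 at 5 percent
print(lm_arch)
```

Because the series really is ARCH(1), the LM statistic is far above 3.84 and the null of no ARCH is rejected.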
Example: Andy's I

  sales_i = 1.19 (0.106) + 1.19 (0.106) price_i − 0.08922 (0.00826) advert_i

1–75, (n = 44), R² = 0.44826
χ²_normality(2) = 6.2498 [0.0439] (J-B test)
F_het(4, 70) = 0.98576 [0.4211] (X² version)
F_het(5, 69) = 0.88864 [0.4936] (X² and X×X version)
16 / 19
Example: Norwegian PCM I

  π_t = 10.5 (1.453) − 1.83 (0.423) U_t

1975–2005 (T = 27), R² = 0.44826
χ²_normality(2) = 1.0925 [0.5791] (J-B test)
F_het(2, 24) = 2.6057 [0.0946] (X² version)
F_arch(1, 25) = 7.5486 [0.0110]
17 / 19
Inference and estimation under heteroskedasticity I
When a test of homoskedasticity rejects (as in HGL p. 306), inference based on the OLS estimates is not reliable without further qualifications. What can we do? First: we can try to robustify our conclusions. Informal robustification: if the purpose is to test the significance of a regressor, and the heteroskedasticity is of a form that leads to an overestimated t-ratio, we know that a non-rejection outcome is robust. Formal robustification: Stata and other software can compute standard errors of β̂_j (j = 0, 1, ..., k) that are robust to unknown forms of heteroskedasticity (this builds on White's approach). Use these heteroskedasticity-consistent standard errors to calculate robust t-ratios. 18 / 19
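The heteroskedasticity-consistent standard errors that software packages report can be computed by hand with the "sandwich" formula. The sketch below is a numpy implementation of the basic (HC0) variant on simulated data; it is illustrative, not Stata's exact output, which may apply small-sample corrections.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(1.0, 10.0, n)
y = 2.0 + 1.5 * x + 0.5 * x * rng.standard_normal(n)  # heteroskedastic errors

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b

# Conventional variance: s^2 (X'X)^{-1}, valid only under homoskedasticity
s2 = np.sum(e**2) / (n - 2)
se_conv = np.sqrt(np.diag(s2 * XtX_inv))

# White HC0 "sandwich": (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = X.T @ (e[:, None]**2 * X)
V_hc0 = XtX_inv @ meat @ XtX_inv
se_hc0 = np.sqrt(np.diag(V_hc0))

print(se_conv, se_hc0)   # robust t-ratio for the slope: b[1] / se_hc0[1]
```

The robust standard errors are consistent whatever the (unknown) form of heteroskedasticity, while the conventional ones are not.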
Inference and estimation under heteroskedasticity II
Second: if we can specify the form of heteroskedasticity, we can restore a regression model with homoskedastic disturbances. This leads to new estimators of β_j (j = 0, 1, ..., k) called generalized least squares (GLS) or weighted least squares (WLS). More about weighted least squares in class. Third: acknowledge that the first modelling attempt failed; back to the drawing board ("Re-make/Re-model"). 19 / 19
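When the form of heteroskedasticity is known, WLS amounts to OLS on a transformed equation. A minimal sketch, assuming Var(ε_i) = σ²X²_i as in the scatter-plot example from slide 8: dividing the whole equation by X_i gives homoskedastic disturbances. Simulated data, illustrative parameter values.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
x = rng.uniform(1.0, 10.0, n)
beta0, beta1 = 2.0, 1.5
y = beta0 + beta1 * x + 0.5 * x * rng.standard_normal(n)  # sd = 0.5 * x_i

# Transformed model: y_i/x_i = beta0 * (1/x_i) + beta1 * 1 + (eps_i/x_i),
# whose disturbance eps_i/x_i has constant variance
ys = y / x
Xs = np.column_stack([1.0 / x, np.ones(n)])
b_wls, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
b0_wls, b1_wls = b_wls

# Ordinary OLS on the untransformed model, for comparison
X = np.column_stack([np.ones(n), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b0_wls, b1_wls, b_ols)
```

Both estimators are unbiased here, but WLS is efficient and its conventional standard errors are valid, because the transformed model satisfies the classical assumptions.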