Heteroscedasticity
Pierre Nguimkeu, BUEC 333, Summer 2011
Based on P. Lavergne, Lecture notes
Outline
- Pure Versus Impure Heteroscedasticity
- Consequences and Detection
- Remedies
Pure Heteroscedasticity

Homoscedasticity: the variance of the error term is constant.

Heteroscedasticity: the variance of the error term varies across observations, that is
$\mathrm{Var}(\varepsilon_i) = \sigma_i^2, \quad i = 1, \dots, n.$
This violates Classical Assumption 5, which states that $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for $i = 1, \dots, n$.

Pure heteroscedasticity: the model is well specified, i.e. Classical Assumptions 1, 2 and 3 hold, but there is heteroscedasticity.

Occurs in particular:
- In cross-sections, when there is large variation in the dependent variable.
- In time series, when there is large variation in the dependent variable over time.
- When the quality of data collection changes a lot across the sample.
Heteroscedasticity: Examples

$R_i$ = rent of renter $i$, $I_i$ = income of renter $i$.
$R_i = \beta_0 + \beta_1 I_i + \varepsilon_i$
It seems sensible to expect not only that the mean of rent increases with income, but also that the variance (or s.d.) of rent increases with income.

$W_i$ = wage of worker $i$, $E_i$ = education level of worker $i$, $X_i$ = experience level of worker $i$.
$W_i = \beta_0 + \beta_1 E_i + \beta_2 X_i + \varepsilon_i$
Mean wage increases with education and experience, but wage dispersion also increases with education and experience.
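Impure Heteroscedasticity

Caused by misspecification of the model, i.e. Classical Assumption 1 does not hold.

If
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
but we omit $X_{2i}$, then
$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i^*$, where $\varepsilon_i^* = \varepsilon_i + \beta_2 X_{2i}$.
If $X_2$ is relevant, i.e. $\beta_2 \neq 0$, then $\mathrm{Var}(\varepsilon_i^*) = f(X_{2i})$.

We should write
$Y_i = \beta_0^* + \beta_1^* X_{1i} + \varepsilon_i^*$
$\varepsilon_i^* = \varepsilon_i + \beta_0 - \beta_0^* + (\beta_1 - \beta_1^*) X_{1i} + \beta_2 X_{2i}$
because nothing ensures the coefficients are the same in the two equations.

If
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
but we specify a linear equation, then
$Y_i = \beta_0^* + \beta_1^* X_{1i} + \varepsilon_i^*$
$\varepsilon_i^* = \varepsilon_i + f(X_{1i})$.
$\varepsilon_i^*$ depends on $X_{1i}$, and so does its s.d.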
Consequences
- OLS estimates remain unbiased.
- OLS is in general no longer BLUE: it no longer has minimum variance.
- The estimated standard errors are biased.
- t-scores don't have a t-distribution, so confidence intervals and tests are unreliable. t-scores are often too large.
Preliminary Checks
- Are there any obvious specification errors? Delay testing for heteroscedasticity until you are confident in your specification.
- Is the dependent variable likely to be afflicted with heteroscedasticity? Consider the range of the dependent variable, previous studies, ...
- Is there any likely factor of heteroscedasticity? Graph the residuals against this variable.
Rent versus Income

[Figure: scatter plot of rent of renter against income of renter.]

Dependent Variable: RENT
Method: Least Squares
Date: 11/09/09 Time: 17:38
Sample: 1 108
Included observations: 108

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          5455.483      602.7776     9.050573      0.0000
INCOME     0.063568      0.014390     4.417505      0.0000

R-squared            0.155475    Mean dependent var      7718.111
Adjusted R-squared   0.147508    S.D. dependent var      3577.000
S.E. of regression   3302.662    Akaike info criterion   19.06119
Sum squared resid    1.16E+09    Schwarz criterion       19.11086
Log likelihood       -1027.304   F-statistic             19.51435
Durbin-Watson stat   2.012384    Prob(F-statistic)       0.000024

[Figure: residuals (RESID01) plotted against income of renter.]
The Park Test

Assume that
$\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^2, \quad i = 1, \dots, n$
for some proportionality factor $Z$ you can observe. Then
$\ln E(\varepsilon_i^2) = \ln \sigma^2 + 2 \ln Z_i$
The same reasoning applies if $\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^{\alpha}$, $i = 1, \dots, n$.

1. Estimate your equation by OLS and get the residuals $e_i$.
2. Run the OLS auxiliary regression $\ln e_i^2 = \alpha_0 + \alpha_1 \ln Z_i + u_i$.
3. Test the significance of $\alpha_1$ with a t-test.

The Park test assumes that there is only one proportionality factor and that you know which one. We look at whether the squared residuals are related to $Z$ (all in logs).
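The three steps above can be sketched in Python as follows. This is only an illustrative sketch: the helper names `ols` and `park_test` are hypothetical, and the data are simulated under the assumption that the error s.d. is proportional to income. They are not the rent data used in these notes.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, classical standard errors, and residuals."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)              # estimate of sigma^2
    cov = s2 * np.linalg.inv(X.T @ X)         # classical OLS covariance
    return beta, np.sqrt(np.diag(cov)), resid

def park_test(y, x, z):
    """Return the estimate of alpha_1 and its t-score in the Park regression."""
    X = np.column_stack([np.ones_like(x), x])
    _, _, e = ols(X, y)                                  # step 1: residuals
    A = np.column_stack([np.ones_like(z), np.log(z)])
    alpha, se, _ = ols(A, np.log(e ** 2))                # step 2: auxiliary regression
    return alpha[1], alpha[1] / se[1]                    # step 3: t-score of alpha_1

# Simulated data (assumption: error s.d. proportional to income, so Z = income)
rng = np.random.default_rng(0)
n = 500
income = rng.uniform(10_000, 100_000, n)
eps = rng.normal(0.0, 1.0, n) * 0.03 * income
rent = 5000 + 0.06 * income + eps

alpha1_hat, t_stat = park_test(rent, income, income)
print(f"alpha_1 estimate: {alpha1_hat:.3f}, t-score: {t_stat:.2f}")
```

With a variance proportional to $Z_i^2$, the estimate of $\alpha_1$ should be in the neighbourhood of 2, and a large t-score leads to rejecting homoscedasticity.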
Rent versus Income

[Figure: LOGRESIDSQ vs. Log Income of Renter.]

Dependent Variable: LOG(RESID01^2)
Method: Least Squares
Date: 03/12/08 Time: 21:30
Sample: 1 108
Included observations: 108

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             2.083771      2.994793     0.695798      0.4881
LOG(INCOME)   1.216408      0.290830     4.182539      0.0001

R-squared            0.141656    Mean dependent var      14.58387
Adjusted R-squared   0.133559    S.D. dependent var      2.142308
S.E. of regression   1.994121    Akaike info criterion   4.236629
Sum squared resid    421.5110    Schwarz criterion       4.286298
Log likelihood       -226.7780   F-statistic             17.49363
Durbin-Watson stat   1.843326    Prob(F-statistic)       0.000060

What is the outcome of the test?
The White Test

1. Estimate your equation
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
by OLS and get the residuals $e_i$.
2. Run the OLS auxiliary regression
$e_i^2 = \alpha_0 + \alpha_1 X_{1i} + \alpha_2 X_{2i} + \alpha_3 X_{1i}^2 + \alpha_4 X_{2i}^2 + \alpha_5 X_{1i} X_{2i} + u_i$
That is, regress the squared residuals on all the independent variables, their squares and their cross-products.
3. Test the significance of all coefficients but $\alpha_0$ with an F-test:
$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_5 = 0$ against $H_A$: at least one is not 0.

Beware of perfect multicollinearity: if the equation is
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
regress the squared residuals on an intercept, $X_{1i}$, $X_{1i}^2$, $X_{1i}^3$ and $X_{1i}^4$.
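A minimal Python sketch of these steps for a model with two regressors is given below. The function names (`fit_r2`, `f_from_r2`, `white_f`) are hypothetical and the data are simulated for illustration. The F-statistic is computed from the $R^2$ of the auxiliary regression with the usual formula $F = \frac{R^2/q}{(1-R^2)/(n-q-1)}$, where $q$ is the number of slope coefficients being tested.

```python
import numpy as np

def fit_r2(X, y):
    """OLS fit; return the R-squared and the residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / tss, resid

def f_from_r2(r2, n, q):
    """F-statistic for H0: all q slope coefficients are zero."""
    return (r2 / q) / ((1.0 - r2) / (n - q - 1))

def white_f(y, x1, x2):
    """F-statistic of the White auxiliary regression with two regressors."""
    n = len(y)
    const = np.ones(n)
    _, e = fit_r2(np.column_stack([const, x1, x2]), y)           # step 1
    aux = np.column_stack([const, x1, x2, x1**2, x2**2, x1*x2])  # step 2
    r2, _ = fit_r2(aux, e ** 2)
    return f_from_r2(r2, n, aux.shape[1] - 1)                    # step 3

# Simulated data (assumption: in y_het the error s.d. is proportional to x1)
rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(1.0, 10.0, n)
y_het = 1 + 2 * x1 + 3 * x2 + rng.normal(0.0, 1.0, n) * x1
y_hom = 1 + 2 * x1 + 3 * x2 + rng.normal(0.0, 5.0, n)

f_het = white_f(y_het, x1, x2)
f_hom = white_f(y_hom, x1, x2)
print(f"F (heteroscedastic errors): {f_het:.2f}")
print(f"F (homoscedastic errors):   {f_hom:.2f}")
```

The statistic is compared with the critical value of an F distribution with $q$ and $n-q-1$ degrees of freedom. As a check on the formula, `f_from_r2(0.077555, 108, 2)` gives about 4.41, the F-statistic reported in the White auxiliary regression output for the rent example.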
Rent versus Income

Dependent Variable: RESID01^2
Method: Least Squares
Date: 11/09/09 Time: 17:58
Sample: 1 108
Included observations: 108

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -14296693     9696170.     -1.474468     0.1433
INCOME     1173.190      516.8715     2.269791      0.0253
INCOME^2   -0.009549     0.005590     -1.708341     0.0905

R-squared            0.077555    Mean dependent var      10705585
Adjusted R-squared   0.059985    S.D. dependent var      31078674
S.E. of regression   30132136    Akaike info criterion   37.30747
Sum squared resid    9.53E+16    Schwarz criterion       37.38197
Log likelihood       -2011.603   F-statistic             4.413975
Durbin-Watson stat   1.860540    Prob(F-statistic)       0.014433

What is the outcome of the test?
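Weighted Least-Squares

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2 Z_i^2$.
Then
$\frac{Y_i}{Z_i} = \beta_0 \frac{1}{Z_i} + \beta_1 \frac{X_{1i}}{Z_i} + \beta_2 \frac{X_{2i}}{Z_i} + u_i$
where $u_i = \varepsilon_i / Z_i$ is such that $\mathrm{Var}(u_i) = \mathrm{Var}(\varepsilon_i / Z_i) = \sigma^2$.
Now we can use OLS! But be careful: there may be no intercept in the equation.

If $Z_i = X_{1i}$,
$\frac{Y_i}{X_{1i}} = \beta_0 \frac{1}{X_{1i}} + \beta_1 + \beta_2 \frac{X_{2i}}{X_{1i}} + u_i$

The transformation is only used to get OLS estimates; interpretation relies on the original equation.
$R_i = \beta_0 + \beta_1 I_i + \varepsilon_i$
$\frac{R_i}{I_i} = \beta_0 \frac{1}{I_i} + \beta_1 + \frac{\varepsilon_i}{I_i}$
$\beta_1$: marginal effect of income on rent.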
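The transformation can be sketched in Python as follows, for the case $Z_i = X_{1i}$. The data are simulated and the variable names are hypothetical; the true coefficients (2.0, 1.5, -0.8) are chosen only so that the estimates can be compared with known values.

```python
import numpy as np

# Simulated data (assumption: Var(eps_i) = sigma^2 * x1_i^2, i.e. Z = x1)
rng = np.random.default_rng(1)
n = 2000
x1 = rng.uniform(1.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)
eps = rng.normal(0.0, 1.0, n) * 0.5 * x1
y = 2.0 + 1.5 * x1 - 0.8 * x2 + eps

z = x1
# Divide every term of the original equation by Z.
# Columns: 1/Z (carries beta_0), X1/Z (equals 1 here, carries beta_1), X2/Z (carries beta_2).
# No separate column of ones is added: the original intercept is now the slope on 1/Z.
X_w = np.column_stack([1.0 / z, x1 / z, x2 / z])
y_w = y / z

beta_wls, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
print("WLS estimates of (beta_0, beta_1, beta_2):", beta_wls)
```

Note that the coefficient on the constant column of the transformed equation estimates $\beta_1$, the marginal effect of $X_1$ in the original equation, while the coefficient on $1/Z_i$ estimates the original intercept $\beta_0$.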
Rent versus Income

[Figure: scatter plot of RATIO against INVINCOME.]

Dependent Variable: RENT/INCOME
Method: Least Squares
Date: 11/09/09 Time: 18:05
Sample: 1 108
Included observations: 108

Variable   Coefficient   Std. Error   t-Statistic   Prob.
1/INCOME   4811.862      322.2745     14.93094      0.0000
C          0.085679      0.016701     5.130303      0.0000

R-squared            0.677746    Mean dependent var      0.291701
Adjusted R-squared   0.674706    S.D. dependent var      0.171429
S.E. of regression   0.097774    Akaike info criterion   -1.793980
Sum squared resid    1.013325    Schwarz criterion       -1.744311
Log likelihood       98.87494    F-statistic             222.9331
Durbin-Watson stat   1.900821    Prob(F-statistic)       0.000000

What is the marginal effect of income on rent?
Redefining the Model

$C_i$ = expenditure in city $i$, $Y_i$ = income in city $i$, $POP_i$ = population in city $i$, $W_i$ = average wage in city $i$.
$C_i = \beta_0 + \beta_1 Y_i + \beta_2 POP_i + \beta_3 W_i + \varepsilon_i$
When estimated by OLS, this formulation gives a large weight to the large cities. See Figure 10.5.

It makes sense to consider a specification that redefines the variables with respect to the size of the city, i.e.
$\frac{C_i}{POP_i} = \alpha_0 + \alpha_1 \frac{Y_i}{POP_i} + \alpha_2 W_i + u_i$
This is a new formulation that relates per capita consumption to per capita income. There may still be heteroscedasticity.
Heteroscedasticity-Corrected Standard Errors

In place of another estimation method or another model, we can use OLS (unbiased and consistent) and correct the standard errors.

Heteroscedasticity-robust standard errors (White standard errors):
- Estimate the standard deviation of the OLS coefficients whether there is heteroscedasticity or not.
- Are often larger than the OLS standard errors.
- Can be used to construct tests and confidence intervals in the usual way.
- Work well in large samples.
- Are given by EViews, see Options/Heteroscedasticity consistent coefficient covariance.
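As an illustration, the basic White estimator of the coefficient covariance matrix is $(X'X)^{-1} \left( \sum_i e_i^2 x_i x_i' \right) (X'X)^{-1}$. The Python sketch below computes it next to the usual OLS standard errors. It assumes the simplest version of the estimator (often labelled HC0), without any degrees-of-freedom adjustment, so statistical software that applies such an adjustment may report slightly different numbers. The function name and the simulated data are hypothetical.

```python
import numpy as np

def ols_with_white_se(X, y):
    """Return OLS coefficients, classical s.e., and White (HC0) robust s.e."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    # Classical standard errors: s^2 (X'X)^{-1}
    s2 = e @ e / (n - k)
    se_ols = np.sqrt(np.diag(s2 * XtX_inv))
    # White standard errors: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}
    meat = X.T @ (X * (e ** 2)[:, None])
    se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_ols, se_white

# Simulated data (assumption: error s.d. grows with x squared)
rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n) * 0.1 * x ** 2
X = np.column_stack([np.ones(n), x])

beta, se_ols, se_white = ols_with_white_se(X, y)
print("coefficients:", beta)
print("OLS s.e.:    ", se_ols)
print("White s.e.:  ", se_white)
```

The coefficient estimates are the OLS ones in both cases; only the standard errors change. In this simulated design the robust standard error of the slope exceeds the classical one, since the error variance is largest where $x$ is far from its mean.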
Rent versus Income

Dependent Variable: RENT
Method: Least Squares
Date: 11/09/09 Time: 17:38
Sample: 1 108
Included observations: 108

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          5455.483      602.7776     9.050573      0.0000
INCOME     0.063568      0.014390     4.417505      0.0000

R-squared            0.155475    Mean dependent var      7718.111
Adjusted R-squared   0.147508    S.D. dependent var      3577.000
S.E. of regression   3302.662    Akaike info criterion   19.06119
Sum squared resid    1.16E+09    Schwarz criterion       19.11086
Log likelihood       -1027.304   F-statistic             19.51435
Durbin-Watson stat   2.012384    Prob(F-statistic)       0.000024

Dependent Variable: RENT
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          5455.483      403.2469     13.52889      0.0000
INCOME     0.063568      0.014759     4.307218      0.0000

R-squared            0.155475    Mean dependent var      7718.111
Adjusted R-squared   0.147508    S.D. dependent var      3577.000
S.E. of regression   3302.662    Akaike info criterion   19.06119
Sum squared resid    1.16E+09    Schwarz criterion       19.11086
Log likelihood       -1027.304   F-statistic             19.51435
Durbin-Watson stat   2.012384    Prob(F-statistic)       0.000024

OK, the difference is small here, but not always! Is the sample size large enough?
Log Hourly Wage versus Educ and Age

Dependent Variable: LWAGE
Method: Least Squares
Date: 03/12/08 Time: 21:39
Sample: 1 340
Included observations: 340

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -0.056965     0.227325     -0.250589     0.8023
EDUC       0.122578      0.013614     9.003560      0.0000
AGE        0.020087      0.002445     8.213880      0.0000

R-squared            0.274597    Mean dependent var      2.424713
Adjusted R-squared   0.270292    S.D. dependent var      0.602183
S.E. of regression   0.514402    Akaike info criterion   1.517162
Sum squared resid    89.17345    Schwarz criterion       1.550947
Log likelihood       -254.9175   F-statistic             63.78458
Durbin-Watson stat   2.139412    Prob(F-statistic)       0.000000

Dependent Variable: RESID01^2
Method: Least Squares
Date: 11/09/09 Time: 18:31
Sample: 1 340
Included observations: 340

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          1.226049      1.302637     0.941206      0.3473
EDUC       -0.067572     0.162368     -0.416164     0.6776
AGE        -0.038086     0.018546     -2.053550     0.0408
EDUC^2     0.002388      0.005441     0.438937      0.6610
AGE^2      0.000419      0.000145     2.894857      0.0040
EDUC*AGE   0.000550      0.000980     0.561242      0.5750

R-squared            0.046380    Mean dependent var      0.262275
Adjusted R-squared   0.032105    S.D. dependent var      0.420260
S.E. of regression   0.413459    Akaike info criterion   1.088974
Sum squared resid    57.09682    Schwarz criterion       1.156544
Log likelihood       -179.1256   F-statistic             3.248895
Durbin-Watson stat   1.892939    Prob(F-statistic)       0.007039

Seems like there is heteroscedasticity!
Log Hourly Wage versus Educ and Age

Dependent Variable: LWAGE
Method: Least Squares
Date: 11/09/09 Time: 18:29
Sample: 1 340
Included observations: 340

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -1.461583     0.345819     -4.226436     0.0000
EDUC       0.122720      0.013108     9.361960      0.0000
AGE        0.094515      0.014381     6.572153      0.0000
AGE^2      -0.000904     0.000172     -5.246208     0.0000

R-squared            0.329518    Mean dependent var      2.424713
Adjusted R-squared   0.323531    S.D. dependent var      0.602183
S.E. of regression   0.495281    Akaike info criterion   1.444314
Sum squared resid    82.42203    Schwarz criterion       1.489360
Log likelihood       -241.5333   F-statistic             55.04395
Durbin-Watson stat   2.090796    Prob(F-statistic)       0.000000

Dependent Variable: RESID02^2
Method: Least Squares
Date: 11/09/09 Time: 18:32
Sample: 1 340
Included observations: 340

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          1.375719      1.340224     1.026485      0.3054
EDUC       -0.127623     0.167053     -0.763969     0.4454
AGE        -0.023028     0.019081     -1.206850     0.2283
EDUC^2     0.003993      0.005598     0.713217      0.4762
AGE^2      0.000188      0.000149     1.262912      0.2075
EDUC*AGE   0.000818      0.001009     0.811230      0.4178

R-squared            0.025134    Mean dependent var      0.242418
Adjusted R-squared   0.010540    S.D. dependent var      0.427649
S.E. of regression   0.425389    Akaike info criterion   1.145866
Sum squared resid    60.43937    Schwarz criterion       1.213436
Log likelihood       -188.7973   F-statistic             1.722212
Durbin-Watson stat   1.981363    Prob(F-statistic)       0.128881

It was likely impure!