CHAPTER 11: HETEROSCEDASTICITY: WHAT HAPPENS WHEN ERROR VARIANCE IS NONCONSTANT?

Basic Econometrics, Gujarati and Porter

11.1 (a) False. The estimators are unbiased but are inefficient.
(b) True. See Sec. 11.4.
(c) False. Typically, but not always, the variance will be overestimated. See Sec. 11.4 and Exercise 11.9.
(d) False. Besides heteroscedasticity, such a pattern may result from autocorrelation, model specification errors, etc.
(e) True. Since the true $\sigma_i^2$ are not directly observable, some assumption about the nature of heteroscedasticity is inevitable.
(f) True. See answer to (d) above.
(g) False. Heteroscedasticity is about the variance of the error term $u_i$ and not about the variance of a regressor.

11.2 (a) As equation (1) shows, as N increases by a unit, on average, wages increase by about .9 dollars. If you multiply the second equation through by N, you will see that the results are quite similar to Eq. (1).
(b) Apparently, the author was concerned about heteroscedasticity, since he divided the original equation by N. This amounts to assuming that the error variance is proportional to the square of N. Thus the author is using weighted least-squares in estimating Eq. (2).
(c) The intercept coefficient in Eq. (1) is the slope coefficient in Eq. (2), and the slope coefficient in Eq. (1) is the intercept in Eq. (2).
(d) No. The dependent variables in the two models are not the same.

11.3 (a) No. These models are non-linear in the parameters and cannot be estimated by OLS.
(b) There are specialized non-linear estimating procedures. We discuss this topic in the chapter on non-linear regression models. Informally, we can estimate the parameters by a process of trial and error.
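
The transformation discussed in 11.2 can be illustrated with a short sketch. It assumes a two-variable model W = b1 + b2*N whose error variance is proportional to the square of N; the function names and the noise-free data are hypothetical and serve only to show how the roles of intercept and slope swap after dividing through by N.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def wls_divide_by_n(n_values, wages):
    """Estimate W = b1 + b2*N by dividing the equation through by N.

    The transformed model is W/N = b2 + b1*(1/N), so its intercept
    estimates b2 and its slope estimates b1.
    """
    inv_n = [1.0 / n for n in n_values]
    w_over_n = [w / n for w, n in zip(wages, n_values)]
    intercept_t, slope_t = ols(inv_n, w_over_n)
    return slope_t, intercept_t  # (b1, b2) of the original model


# Hypothetical, noise-free data generated from W = 7 + 0.5*N.
N = [1.0, 2.0, 4.0, 5.0, 10.0]
W = [7.0 + 0.5 * n for n in N]
b1, b2 = wls_divide_by_n(N, W)
print(round(b1, 6), round(b2, 6))
```

With real data the two sets of estimates would be similar rather than identical, which is the point made in part (a).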

11.4 (a) See Exercise 7.14 and Section 6.9.
(b) No. $\ln[E(u_i)] = \ln(1) = 0$. But $E[\ln(u_i)] < \ln E(u_i)$ because of the concavity property of the log transformation. The expectation of the log of a random variable is less than the log of its expectation, unless the variable has a zero variance, in which case they are equal.
(c) Let
$$\ln Y_i = \ln\beta_1 + \beta_2 \ln X_i + \ln u_i = \alpha + \beta_2 \ln X_i + u_i^*$$
where $u_i^* = [\ln u_i - E(\ln u_i)]$ and $\alpha = [\ln\beta_1 + E(\ln u_i)]$. Now
$$E(u_i^*) = E[\ln u_i - E(\ln u_i)] = 0.$$
Incidentally, notice that we do not get a direct estimate of $\beta_1$.

11.5 This is a matter of substituting the definitional terms and simplifying.

11.6 (a) The assumption made is that the error variance is proportional to the square of GNP, as is described in the postulation. The authors make this assumption by looking at the data over time and observing this relationship.
(b) The results are essentially the same, although the standard errors for two of the coefficients are lower in the second model; this may be taken as empirical justification of the transformation for heteroscedasticity.
(c) No. The $R^2$ terms may not be directly compared, as the dependent variables in the two models are not the same.

11.7 As will be seen in Problem 11.13, the Bartlett test shows that there was no problem of heteroscedasticity in this data set. Therefore, this finding is not surprising. Also, see Problem 11.11.

11.8 Substituting $w_i = w$ in (11.3.8), we obtain:
$$\hat\beta_2^* = \frac{(nw)(w\sum X_iY_i) - (w\sum X_i)(w\sum Y_i)}{(nw)(w\sum X_i^2) - (w\sum X_i)^2} = \frac{n\sum X_iY_i - (\sum X_i)(\sum Y_i)}{n\sum X_i^2 - (\sum X_i)^2} = \hat\beta_2$$
The equality of the variances may be shown similarly.

11.9 From Eq. (11.2.2), we have

$$\operatorname{var}(\hat\beta_2) = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2}$$
Substituting $\sigma_i^2 = \sigma^2 k_i$ in the preceding equation, we get
$$\operatorname{var}(\hat\beta_2) = \frac{\sigma^2 \sum x_i^2 k_i}{(\sum x_i^2)^2} = \frac{\sigma^2}{\sum x_i^2} \cdot \frac{\sum x_i^2 k_i}{\sum x_i^2}$$
The first term on the right is the variance shown in Eq. (11.2.3). Thus, if $\sum x_i^2 k_i / \sum x_i^2 > 1$, then the heteroscedastic variance given above is greater than the homoscedastic variance. In this case, the homoscedastic variance will underestimate the heteroscedastic variance, leading to inflated t and F statistics. One cannot draw any general conclusions because the result is based on a specific form of heteroscedasticity.

11.10 From Appendix 3A.3 and 6A.1, we have
$$\operatorname{var}(\hat\beta_2) = \frac{\sum X_i^2 \operatorname{var}(u_i)}{(\sum X_i^2)^2}$$
Given that $\operatorname{var}(u_i) = \sigma^2 X_i^2$, we obtain
$$\operatorname{var}(\hat\beta_2) = \frac{\sum X_i^2 \sigma^2 X_i^2}{(\sum X_i^2)^2} = \frac{\sigma^2 \sum X_i^4}{(\sum X_i^2)^2}$$

Empirical Exercises

11.11 The regression results are already given in (11.5.3). If average productivity increases by a dollar, on average, compensation increases by about 3 cents.
(a) The residuals from this regression are as follows:
-775.6579, -5.481, 165.851, 183.9356, 199.3785, 54.6657, 11.841, 15.639, 113.41
(b) This is a matter of straightforward verification.
(c) The regression results are:
$$|\hat u_i| = 47.3455 - .3\,X_i, \qquad t = (.6433)\ (-.313), \qquad r^2 = .18$$
$$|\hat u_i| = 575.976 - 3.797\,\sqrt{X_i}, \qquad t = (.4479)\ (-.787), \qquad r^2 = .19$$

As these results show, there is little evidence of heteroscedasticity on the basis of the Glejser tests.
(d) If you rank the absolute residuals from low to high value, similarly rank the average productivity figures from low to high value, and compute Spearman's rank correlation coefficient as given in (11.5.5), you will observe that this coefficient is about -.5167. Using the t formula given in (11.5.6), the t value is about -.856. This t value is not statistically significant; the 5% critical t value for 7 d.f. is .447 in absolute value. Hence, on the basis of the rank correlation test, we have no reason to expect heteroscedasticity.
In sum, all the preceding tests suggest that we do not have the problem of heteroscedasticity.

11.12 (a) & (b)
[Figure: Mean vs. Standard deviation; vertical axis MEAN, horizontal axis SD]
(c) The regression results are:
$$\widehat{SD}_i = .991 - .65\,\text{Mean}_i, \qquad t = (.3756)\ (-.1795), \qquad r^2 = .64$$
Since the slope coefficient is statistically not different from zero, there is no systematic relationship between the two variables, which can be seen from the figure in (a).
(d) There is no need for any transformation, because there is no systematic relationship between the mean sales/cash ratio and the standard deviation in the various asset classes.

11.13 Using Bartlett's test, the $\chi^2$ value is 6.6473, whose p value is .5748. Therefore, do not reject the null that the variances are equal.
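
For readers who want to see how a Bartlett statistic such as the one in 11.13 is put together, here is a minimal sketch using the standard textbook formula. The function names and the three small groups are hypothetical; they are not the asset-class data of the exercise, so the printed value is not meant to reproduce the result above.

```python
import math


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def bartlett_statistic(groups):
    """Return (statistic, d.f.) for Bartlett's test of equal variances.

    Under the null of equal variances the statistic is approximately
    chi-square with k - 1 d.f., where k is the number of groups.
    """
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [sample_variance(g) for g in groups]
    total_df = sum(dfs)
    # Pooled variance: d.f.-weighted average of the group variances.
    pooled = sum(d * s for d, s in zip(dfs, variances)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        d * math.log(s) for d, s in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / d for d in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction, k - 1


# Hypothetical groups, for illustration only.
groups = [[6.9, 7.1, 7.4, 7.0], [7.2, 7.8, 6.8, 7.5], [7.0, 7.3, 7.1, 7.6]]
stat, df = bartlett_statistic(groups)
print(round(stat, 4), df)
```

The statistic is zero when all group variances coincide and grows as they diverge; it is compared with a chi-square critical value for k - 1 d.f.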

11.14 Using the formula (11.3.8) for weighted least-squares, it can be shown that
$$\hat\beta^* = \tfrac{1}{3}(2Y_1 - Y_2) \quad \text{and} \quad \operatorname{var}(\hat\beta^*) = \tfrac{2}{3}\sigma^2$$
If we use OLS, then from Eq. (6.1.6), we obtain:
$$\hat\beta = \frac{\sum X_iY_i}{\sum X_i^2} = \frac{Y_1 - Y_2}{2} = \tfrac{1}{2}(Y_1 - Y_2)$$
and using (6.1.7), we get:
$$\operatorname{var}(\hat\beta) = \frac{\sigma^2}{\sum X_i^2} = \tfrac{1}{2}\sigma^2$$
Comparing the two estimates, we see that weighted least-squares gives a weight of 2/3 to $Y_1$ and 1/3 to $Y_2$, whereas OLS gives equal weight to the two Y observations. The variance of the slope estimator is larger in the weighted least-squares than in the OLS.

11.15 (a) The regression results are as follows:

MPG-hat = 189.9597  - 1.716 SP  + .394 HP  - 1.93 WT
se      = (.587)      (.331)      (.76)      (.1855)
t       = (8.4318)    (-5.4551)   (5.17)     (-1.593)
R-squared = .888

As expected, MPG is positively related to HP and negatively related to speed and weight.
(b) Since these are cross-sectional data involving a diversity of cars, a priori one would expect heteroscedasticity.
(c) Regressing the squared residuals obtained from the model shown in (a) on the three regressors, their squared terms, and their cross-product terms, we obtain an $R^2$ value of .394. Multiplying this value by the number of observations (= 81), we obtain 5.646, which, under the null hypothesis that there is no heteroscedasticity, has the Chi-square distribution with 9 d.f. (3 regressors, 3 squared regressors, and 3 cross-product terms). The p value of obtaining a Chi-square value of as much as 5.646 or greater (under the null hypothesis) is .9, which is very small. Hence, we must reject the null hypothesis. That is, there is heteroscedasticity.
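
The count of 9 d.f. in part (c) follows from the terms that enter White's auxiliary regression. The sketch below simply enumerates them for the three regressors of this exercise; the helper name `white_auxiliary_terms` is hypothetical and the code does not estimate anything.

```python
from itertools import combinations


def white_auxiliary_terms(regressors):
    """List the right-hand-side terms of White's auxiliary regression:
    the regressors, their squares, and their pairwise cross-products."""
    levels = list(regressors)
    squares = [f"{name}^2" for name in regressors]
    cross = [f"{a}*{b}" for a, b in combinations(regressors, 2)]
    return levels + squares + cross


terms = white_auxiliary_terms(["SP", "HP", "WT"])
print(terms)
print("d.f. =", len(terms))
```

In general, k regressors give k + k + k(k - 1)/2 terms, which is why the degrees of freedom grow quickly as regressors are added.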

(d) The results based on White's procedure are as follows:

Dependent Variable: MPG
Method: Least Squares
Sample: 1 81
Included observations: 81
White Heteroscedasticity-Consistent Standard Errors & Covariance

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           189.9597      33.965       5.6531        .
SP          -1.71697      .33639       -3.784375     .3
HP          .39433        .18781       3.58918       .6
WT          -1.9373       .8577        -6.67635      .

R-squared .88864; Durbin-Watson 1.37

When you compare these results with the OLS results, you will find that the values of the estimated coefficients are the same, but their variances and standard errors are different. As you can see, the standard errors of all the estimated slope coefficients are higher under the White procedure, hence the t ratios are lower, suggesting that OLS had underestimated the standard errors. This could all be due to heteroscedasticity.
(e) There is no simple formula to determine the exact nature of heteroscedasticity in the present case. Perhaps one could make some simple assumptions and try various transformations. For example, if it is believed that the "culprit" variable is HP, and if we believe that the error variance is proportional to the square of HP, we could divide through by HP and see what happens. Of course, any other regressor is a likely candidate for transformation.
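
The contrast in (d) between conventional and White standard errors can be sketched for the two-variable case. The code below is only an illustration: the function name and the tiny data set are hypothetical, and it uses the basic White estimator with no small-sample correction, whereas statistical packages may apply one, so it is not expected to reproduce the output above.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, conventional OLS se, White se) for y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    dx = [xi - mx for xi in x]          # deviations of x from its mean
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - my) for d, yi in zip(dx, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Conventional: sigma^2-hat / sum(x_i^2), sigma^2-hat = RSS / (n - 2).
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_ols = math.sqrt(sigma2 / sxx)
    # White: sum(x_i^2 * u_i-hat^2) / (sum(x_i^2))^2.
    se_white = math.sqrt(sum(d * d * e * e for d, e in zip(dx, resid))) / sxx
    return slope, se_ols, se_white


# Hypothetical data, for illustration only.
x = [-1.0, -1.0, 1.0, 1.0]
y = [0.0, -2.0, 2.0, 0.0]
print(slope_standard_errors(x, y))
```

The coefficient estimate is the same under both procedures; only the estimated standard error changes, which is the pattern noted in the comparison with the OLS results.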

11.16 (a) The regression results are as follows:

Dependent Variable: FOODEXP

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           94.878        5.85635      1.85449       .695
TOTALEXP    .43689        .7833        5.57747       .

R-squared .36984

The residuals obtained from this regression look as follows:
[Figure: residuals from the regression, plotted by observation number]
(b) Plotting residuals (R1) against total expenditure, we observe
[Figure: residuals R1 plotted against TOTALEXP]
It seems that as total expenditure increases, the absolute value of the residuals also increases, perhaps nonlinearly.

(c) Park Test

Dependent Variable: LOG(RESQ)

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -16.8688      1.14         -1.68653      .977
LOG(totalexp)   3.7335        1.551873     .3863         .6

R-squared .9718

Since the estimated slope coefficient is significant, the Park test confirms heteroscedasticity.

Glejser Test

Dependent Variable: $|\hat u_i|$, absolute value of residuals

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -3.1965       9.48998      -1.9563       .795
TOTALEXP    .1379         .45417       .877997       .58

R-squared .135158

Since the estimated slope coefficient is statistically significant, the Glejser test also suggests heteroscedasticity.

White Test

Dependent Variable: $\hat u_i^2$

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             1344.         1156.58      .616546       .54
TOTALEXP      -53.16        71.48347     -.743145      .467
TOTALEXPSQ    .59795        .5886        1.15887       .3144

R-squared .1348

If you multiply the R-squared value by 55, then under the null hypothesis that there is no heteroscedasticity the resulting product of 7.3745 follows the Chi-square distribution with 2 d.f., and the p value of such a Chi-square value is about 0.025, which is small. Thus, like the Park and Glejser tests, the White test also suggests heteroscedasticity.
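
The p value attached to the White statistic can be checked by hand, on the assumption that the auxiliary regression has two regressors (TOTALEXP and its square) and hence 2 d.f. For a Chi-square variable with 2 d.f. the upper-tail probability has the closed form exp(-x/2), so no tables are needed. The function name below is hypothetical.

```python
import math


def chi2_upper_tail_2df(x):
    """Upper-tail probability P(chi-square > x) for exactly 2 d.f.

    This closed form holds only for 2 d.f.; other degrees of freedom
    need the incomplete gamma function or a statistics library.
    """
    return math.exp(-x / 2.0)


# n * R-squared reported for the White test in this exercise.
statistic = 7.3745
print(round(chi2_upper_tail_2df(statistic), 4))
```

A value of roughly 0.025 lies below the conventional 5% level, which is consistent with rejecting the null of homoscedasticity.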

(d) The White heteroscedasticity-corrected results are as follows:

Dependent Variable: FOODEXP

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           94.878        43.635       .177581       .339
TOTALEXP    .43689        .7454        5.88597       .

R-squared .36984

Compared with the OLS regression results given in (a), there is not much difference in the standard error of the slope coefficient, although the standard error of the intercept has declined. Whether this difference is worth bothering about is hard to tell. But unless we go through this exercise, we will not know how large or small the difference is between the OLS and White's procedures.

11.17 The regression results are as follows:

Variable       Coefficient   Std. Error   t-Statistic   Prob.
C              1.15433       .777959      1.483795      .1438
LOG(TotalEx)   .73636        .1713        6.99834       .

R-squared .41469

The Park, Glejser, and White tests applied to the residuals obtained from the double-log regression showed no evidence of heteroscedasticity. This example shows that the log transformation can often reduce heteroscedasticity. Hence, the functional form in which a regression model is expressed can be critical in deciding whether there is heteroscedasticity or not.

11.18 The squared residuals from the regression of food expenditure on total expenditure were first obtained, denoted by R1. Then they were regressed on the forecast and forecast-squared values obtained from the regression of food expenditure on total expenditure. The results were as follows:

Dependent Variable: R1

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            78.63         394.59       .69594        .4896
FOODEXF      -18.669       1.554        -.815434      .4185
FOODEXF^2    .313387       .38486       1.15887       .3144

R-squared .1348

Multiplying the preceding $R^2$ by 55, we obtain 7.3745. Under the null hypothesis that there is no heteroscedasticity, this value follows the Chi-square distribution with 2 d.f. The p value of obtaining a Chi-square value of as much as 7.3745 or greater is about 0.025, which is quite small. Hence, the conclusion is that the error variance is heteroscedastic.
It can be shown that if the preceding procedure is applied to the squared residuals obtained from the regression of the log of food expenditure on the log of total expenditure, there is no evidence of heteroscedasticity.

11.19 There is no reason to believe that the results will be any different because profits and sales are highly correlated, as can be seen from the following regression of profits on sales.

Dependent Variable: PROFITS

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -338.5385     115.311      -.3683        .7636
SALES       .1713         .1197        9.75346       .

R-squared .845936

11.20 (a)
[Figure: Salaries vs. Rank; horizontal axis: Rank, in Years]
As this figure shows, median salary increases with years in rank, but not perfectly linearly.
(b) From the figure given in (a) it would seem that model (2) might be more appropriate, which also fits in with the economic theory of human capital.
(c) The results of fitting both the linear and quadratic models are as follows:

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1779.96       387.67       7.817         .
X           971.787       9.86         4.8           .9

R-squared .699

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           185.8         5855.786     18.45         .
X           893.785       974.137      .9175         .3894
X^2         .447          9.571        .87           .9364

R-squared .691

(c) White's heteroscedasticity test applied to model (1) showed that there was no evidence of heteroscedasticity. The value of $n \cdot R^2$ from the auxiliary regression of squared residuals was .937 with a p value of .3176, suggesting homoscedasticity. When the same test was applied to model (2), $n \cdot R^2$ was .75, with a p value of .436, suggesting that there was no heteroscedasticity at the 5% level.
(d) Since there was no apparent heteroscedasticity, no further procedures are necessary. Often in this type of situation, where each observation is an average of several items (professors, here), the error variance is proportional to the square of years of experience. A remedy for this type of situation would be to divide model (1) through by X to remove the heteroscedasticity.

11.21 The calculated test statistic, $\lambda$ (= F), is
$$\lambda = \frac{RSS_2/df}{RSS_1/df} = \frac{140/25}{55/25} = 2.5454$$
The 5% critical F for 25 d.f. in the numerator and denominator is 1.97. Since the estimated value of 2.5454 exceeds this critical value, reject the null of homoscedasticity.

11.22 (a) The graph is as follows.
[Figure: scatter plot of Y against X]

(b) The regression results are:

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           4.618         1.8496       4.49478       .5
X           .757433       .149941      5.51559       .1

R-squared .58638

The residuals from this regression, when plotted against X, showed the following picture.
[Figure: residuals plotted against X]
One residual, that belonging to Chile, dominates the other residuals.
(c) Excluding the observation for Chile, the regression results were as follows:

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           6.7388        .38486       .85358        .117
X           .1484         .555568      .398663       .6951

R-squared .96

As you can see, in (b) the slope coefficient was very significant, but in this regression it is not. See how a single extreme point, an outlier, can distort regression results. The squared residuals from this regression, when plotted against X, showed the following graph.

[Figure: residual plot against X, with Chile excluded]
(d) Comparing the residual graphs in (b) and (c), we see that once Chile is removed from the data there is little relationship between Y and X. Hence, any appearance of heteroscedasticity is spurious.

11.23 (a) Regression results from EViews are as follows:

Dependent Variable: SALARY
Method: Least Squares
Date: 7/1/8 Time: 1:37
Sample: 1 447
Included observations: 447

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           998.795       63.6954      1.6177        .11
TENURE      31.6779       9.46597      3.3467        .9
AGE         5.49393       11.4686      .4793         .63
SALES       .1487         .6614        .164          .313
PROFITS     .1413         .68845       .5471         .47
ASSETS      .763          .136         5.754849      .

R-squared            .4889      Mean dependent var      7.517
Adjusted R-squared   .431       S.D. dependent var      17.566
S.E. of regression   151.39     Akaike info criterion   17.4795
Sum squared resid    9.94E+8    Schwarz criterion       17.53457
Log likelihood       -39.669    F-statistic             9.166
Durbin-Watson stat   .1486      Prob(F-statistic)       .

The results for the White heteroscedasticity test are:

White Heteroskedasticity Test:
F-statistic      .89313    Probability   .419
Obs*R-squared    .4473     Probability   .5349

With a p value of .53, there is apparent heteroscedasticity in the data. It is left as an exercise to the reader to construct the Breusch-Pagan statistic, which also indicates heteroscedasticity in this dataset.
(b) Results for the log-lin model and White's heteroscedasticity test are as follows:

Dependent Variable: LN_SAL
Method: Least Squares
Date: 7/1/8 Time: 13:56
Sample: 1 447
Included observations: 447

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           6.753659      .3683        8.51778       .
TENURE      .851          .3594        .95836        .
AGE         .78           .435         1.6611        .974
SALES       6.9E-6        .51E-6       .45693        .157
PROFITS     5.7E-5        .61E-5       .186738       .93
ASSETS      .3E-6         5.3E-7       4.35537       .1

R-squared            .8984      Mean dependent var      7.391898
Adjusted R-squared   .16        S.D. dependent var      .637388
S.E. of regression   .5791      Akaike info criterion   1.779
Sum squared resid    143.368    Schwarz criterion       1.78359
Log likelihood       -38.497    F-statistic             3.317
Durbin-Watson stat   1.917      Prob(F-statistic)       .

White Heteroskedasticity Test:
F-statistic      .58193    Probability   .4784
Obs*R-squared    4.9978    Probability   .5363

Apparently there is still some heteroscedasticity in the data.

(c)
[Figures: three scattergrams, with horizontal axes Tenure, Age, and Sales]

[Figures: two scattergrams, with horizontal axes Profits and Assets]
Based on these scattergrams, there are several variables that might be adding to the heteroscedasticity. It is left to the reader to try several models to see which helps decrease it sufficiently.