CHAPTER 8

SOLUTIONS TO PROBLEMS

8.1 Parts (ii) and (iii). The homoskedasticity assumption played no role in Chapter 5 in showing that OLS is consistent. But we know that heteroskedasticity causes statistical inference based on the usual t and F statistics to be invalid, even in large samples. As heteroskedasticity is a violation of the Gauss-Markov assumptions, OLS is no longer BLUE.

8.3 False. The unbiasedness of WLS and OLS hinges crucially on Assumption MLR.4, and, as we know from Chapter 4, this assumption is often violated when an important variable is omitted. When MLR.4 does not hold, both WLS and OLS are biased. Without specific information on how the omitted variable is correlated with the included explanatory variables, it is not possible to determine which estimator has the smaller bias. WLS could have more bias than OLS, or less. Because we cannot know, we should not claim to use WLS in order to solve "biases" associated with OLS.

8.5 (i) No. For each coefficient, the usual standard errors and the heteroskedasticity-robust ones are practically very similar.

(ii) The effect is $-.029(4) = -.116$, so the probability of smoking falls by about .116.

(iii) As usual, we compute the turning point in the quadratic: $.020/[2(.00026)] \approx 38.46$, so about 38 and one-half years.

(iv) Holding other factors in the equation fixed, a person in a state with restaurant smoking restrictions has a .101 lower chance of smoking. This is similar to the effect of having four more years of education.

(v) We just plug the values of the independent variables into the OLS regression line:

$$\widehat{\mathit{smokes}} = .656 - .069\,\log(67.44) + .012\,\log(6{,}500) - .029(16) + .020(77) - .00026(77^2) \approx .0052.$$

Thus, the estimated probability of smoking for this person is close to zero. (In fact, this person is not a smoker, so the equation predicts well for this particular observation.)

8.7 (i) This follows from the simple fact that, for uncorrelated random variables, the variance of the sum is the sum of the variances:

$$\mathrm{Var}(f_i + v_{i,e}) = \mathrm{Var}(f_i) + \mathrm{Var}(v_{i,e}) = \sigma_f^2 + \sigma_v^2.$$

(ii) We compute the covariance between any two of the composite errors as

$$\mathrm{Cov}(u_{i,e}, u_{i,g}) = \mathrm{Cov}(f_i + v_{i,e},\, f_i + v_{i,g}) = \mathrm{Cov}(f_i, f_i) + \mathrm{Cov}(f_i, v_{i,g}) + \mathrm{Cov}(v_{i,e}, f_i) + \mathrm{Cov}(v_{i,e}, v_{i,g}) = \mathrm{Var}(f_i) + 0 + 0 + 0 = \sigma_f^2,$$

where we use the fact that the covariance of a random variable with itself is its variance and the assumptions that $f_i$, $v_{i,e}$, and $v_{i,g}$ are pairwise uncorrelated.
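The results in parts (i) and (ii) of Problem 8.7 can be checked numerically. The following is a minimal simulation sketch, not part of the original solution. It assumes normally distributed, independent components (the problem itself only requires them to be uncorrelated), and the values of `sigma_f` and `sigma_v`, the sample size, and the helper names are all hypothetical choices made for illustration.

```python
import random


def simulate_composite_errors(n_firms, sigma_f, sigma_v, seed=0):
    """Draw pairs (u_e, u_g) of composite errors u = f + v for two employees of one firm.

    Assumption for illustration only: f and v are independent normal draws.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_firms):
        f = rng.gauss(0.0, sigma_f)        # firm effect, shared by both employees
        u_e = f + rng.gauss(0.0, sigma_v)  # composite error for employee e
        u_g = f + rng.gauss(0.0, sigma_v)  # composite error for employee g
        pairs.append((u_e, u_g))
    return pairs


def sample_cov(xs, ys):
    """Sample covariance with the usual n - 1 divisor (sample variance when xs is ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


sigma_f, sigma_v = 1.5, 2.0  # hypothetical standard deviations
pairs = simulate_composite_errors(200_000, sigma_f, sigma_v)
u_e = [p[0] for p in pairs]
u_g = [p[1] for p in pairs]

var_u = sample_cov(u_e, u_e)  # should be near sigma_f**2 + sigma_v**2
cov_u = sample_cov(u_e, u_g)  # should be near sigma_f**2

print("Var(u_e):", var_u, "theory:", sigma_f**2 + sigma_v**2)
print("Cov(u_e, u_g):", cov_u, "theory:", sigma_f**2)
```

With these settings the sample variance should land near $\sigma_f^2 + \sigma_v^2$ and the sample covariance near $\sigma_f^2$, up to simulation noise.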
(iii) This is most easily solved by writing

$$m_i^{-1}\sum_{e=1}^{m_i} u_{i,e} = m_i^{-1}\sum_{e=1}^{m_i} (f_i + v_{i,e}) = f_i + m_i^{-1}\sum_{e=1}^{m_i} v_{i,e}.$$

Now, by assumption, $f_i$ is uncorrelated with each term in the last sum; therefore, $f_i$ is uncorrelated with $m_i^{-1}\sum_{e=1}^{m_i} v_{i,e}$. It follows that

$$\mathrm{Var}\Big(f_i + m_i^{-1}\sum_{e=1}^{m_i} v_{i,e}\Big) = \mathrm{Var}(f_i) + \mathrm{Var}\Big(m_i^{-1}\sum_{e=1}^{m_i} v_{i,e}\Big) = \sigma_f^2 + \sigma_v^2/m_i,$$

where we use the fact that the variance of an average of $m_i$ uncorrelated random variables with common variance ($\sigma_v^2$ in this case) is simply the common variance divided by $m_i$, the usual formula for a sample average from a random sample.

(iv) The standard weighting ignores the variance of the firm effect, $\sigma_f^2$. Thus, the (incorrect) weight function used is $1/h_i = m_i$. A valid weighting function is obtained by writing the variance from (iii) as

$$\mathrm{Var}(\bar{u}_i) = \sigma_f^2\,[1 + (\sigma_v^2/\sigma_f^2)/m_i] = \sigma_f^2 h_i,$$

where $\bar{u}_i = m_i^{-1}\sum_{e=1}^{m_i} u_{i,e}$. But obtaining the proper weights requires us to know (or be able to estimate) the ratio $\sigma_v^2/\sigma_f^2$. Estimation is possible, but we do not discuss that here. In any event, the usual weight is incorrect. When the $m_i$ are large or the ratio $\sigma_v^2/\sigma_f^2$ is small, so that the firm effect is more important than the individual-specific effect, the correct weights are close to being constant. Thus, attaching large weights to large firms may be quite inappropriate.

SOLUTIONS TO COMPUTER EXERCISES

C8.1 (i) Given the equation

$$\mathit{sleep} = \beta_0 + \beta_1\,\mathit{totwrk} + \beta_2\,\mathit{educ} + \beta_3\,\mathit{age} + \beta_4\,\mathit{age}^2 + \beta_5\,\mathit{yngkid} + \beta_6\,\mathit{male} + u,$$

the assumption that the variance of $u$ given all explanatory variables depends only on gender is

$$\mathrm{Var}(u \mid \mathit{totwrk}, \mathit{educ}, \mathit{age}, \mathit{yngkid}, \mathit{male}) = \mathrm{Var}(u \mid \mathit{male}) = \delta_0 + \delta_1\,\mathit{male}.$$

Then the variance for women is simply $\delta_0$ and that for men is $\delta_0 + \delta_1$; the difference in variances is $\delta_1$.

(ii) After estimating the above equation by OLS, we regress $\hat{u}_i^2$ on $\mathit{male}_i$, $i = 1, 2, \ldots, 706$ (including, of course, an intercept). We can write the results as
$$
\begin{array}{rll}
\hat{u}^2 = & 189{,}359.2 & -\,28{,}849.6\,\mathit{male} + \mathit{residual} \\
 & (20{,}546.4) & (27{,}296.5)
\end{array}
$$

$n = 706$, $R^2 = .0016$.

Because the coefficient on male is negative, the estimated variance is higher for women.

(iii) No. The t statistic on male is only about $-1.06$, which is not significant at even the 20% level against a two-sided alternative.

C8.3 After estimating equation (8.18), we obtain the squared OLS residuals $\hat{u}^2$. The full-blown White test is based on the R-squared from the auxiliary regression (with an intercept),

$\hat{u}^2$ on llotsize, lsqrft, bdrms, llotsize$^2$, lsqrft$^2$, bdrms$^2$, llotsize·lsqrft, llotsize·bdrms, and lsqrft·bdrms,

where "l" in front of lotsize and sqrft denotes the natural log. [See equation (8.19).] With 88 observations, the n-R-squared version of the White statistic is $88(.109) \approx 9.59$, and this is the outcome of an (approximately) $\chi^2_9$ random variable. The p-value is about .385, which provides little evidence against the homoskedasticity assumption.

C8.5 (i) By regressing sprdcvr on an intercept only we obtain $\hat{\mu} \approx .515$ (se $\approx .021$). The asymptotic t statistic for $H_0\colon \mu = .5$ is $(.515 - .5)/.021 \approx .71$, which is not significant at the 10% level, or even the 20% level.

(ii) 35 games were played on a neutral court.

(iii) The estimated LPM is

$$
\begin{array}{rlllll}
\widehat{\mathit{sprdcvr}} = & .490 & +\,.035\,\mathit{favhome} & +\,.118\,\mathit{neutral} & -\,.023\,\mathit{fav25} & +\,.018\,\mathit{und25} \\
 & (.045) & (.050) & (.095) & (.050) & (.092)
\end{array}
$$

$n = 553$, $R^2 = .0034$.

The variable neutral has by far the largest effect: if the game is played on a neutral court, the probability that the spread is covered is estimated to be about .12 higher. Except for the intercept, its t statistic is the only t statistic greater than one in absolute value (about 1.24).

(iv) Under $H_0\colon \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$, the response probability does not depend on any explanatory variables, which means neither the mean nor the variance depends on the explanatory variables. [See equation (8.38).]

(v) The F statistic for joint significance, with 4 and 548 df, is about .47 with p-value $\approx .76$. There is essentially no evidence against $H_0$.
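The White statistic in C8.3 and the t statistics quoted in C8.5 are simple functions of the reported numbers, so they can be reproduced by hand. The sketch below is an illustration only, not the software output behind the original solutions. The helper `chi2_sf` is a hypothetical name; it evaluates the chi-square upper-tail probability from the standard power series for the regularized lower incomplete gamma function, using only the Python standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square(df) variable, for x > 0.

    Computes the regularized lower incomplete gamma function P(a, z), with
    a = df/2 and z = x/2, from its power series, then returns 1 - P(a, z).
    """
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)  # next term: z^n / (a (a+1) ... (a+n))
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# C8.3: n-R-squared form of the White test, 9 regressors in the auxiliary regression
n_obs, r2_aux, n_regressors = 88, 0.109, 9
lm_stat = n_obs * r2_aux
lm_pvalue = chi2_sf(lm_stat, n_regressors)

# C8.5 (i): asymptotic t statistic for H0: mu = .5
t_mu = (0.515 - 0.5) / 0.021

# C8.5 (iii): t statistic on neutral
t_neutral = 0.118 / 0.095

print("White LM statistic:", round(lm_stat, 2), "p-value:", round(lm_pvalue, 3))
print("t for H0 mu = .5:", round(t_mu, 2))
print("t on neutral:", round(t_neutral, 2))
```

Because the inputs are rounded values taken from the text, the results agree with the reported figures only up to rounding.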
(vi) Based on these variables, it is not possible to predict whether the spread will be covered. The explanatory power is very low, and the explanatory variables are jointly very insignificant. The coefficient on neutral may indicate something is going on with games played on a neutral court, but we would not want to bet money on it unless it could be confirmed with a separate, larger sample.

C8.7 (i) The heteroskedasticity-robust standard error for $\hat{\beta}_{white} \approx .129$ is about .026, which is notably higher than the nonrobust standard error (about .020). The heteroskedasticity-robust 95% confidence interval is about .078 to .179, while the nonrobust CI is, of course, narrower, about .090 to .168. The robust CI still excludes the value zero by some margin.

(ii) There are no fitted values less than zero, but there are 231 greater than one. Unless we do something to those fitted values, we cannot directly apply WLS, as $\hat{h}_i$ will be negative in 231 cases.

C8.9 (i) I now get $R^2 = .0527$, but the other estimates seem okay.

(ii) One way to ensure that the unweighted residuals are being provided is to compare them with the OLS residuals. They will not be the same, of course, but they should not be wildly different.

(iii) The R-squared from the regression

$\breve{u}_i^2$ on $\breve{y}_i$, $\breve{y}_i^2$, $i = 1, \ldots, 807$

is about .027. We use this as $R^2_{\hat{u}^2}$ in equation (8.15) but with $k = 2$. This gives $F = 11.15$, and so the p-value is essentially zero.

(iv) The substantial heteroskedasticity found in part (iii) shows that the feasible GLS procedure described on page 79 does not, in fact, eliminate the heteroskedasticity. Therefore, the usual standard errors, t statistics, and F statistics reported with weighted least squares are not valid, even asymptotically.
(v) Weighted least squares estimation with robust standard errors gives

$$
\begin{array}{rllll}
\widehat{\mathit{cigs}} = & 5.64 & +\,1.30\,\log(\mathit{income}) & -\,2.94\,\log(\mathit{cigpric}) & -\,.463\,\mathit{educ} \\
 & (37.31) & (.54) & (8.97) & (.149) \\
 & +\,.482\,\mathit{age} & -\,.0056\,\mathit{age}^2 & -\,3.46\,\mathit{restaurn} & \\
 & (.115) & (.0012) & (.72) &
\end{array}
$$

$n = 807$, $R^2 = .1134$.

The substantial differences in standard errors compared with equation (8.36) further indicate that our proposed correction for heteroskedasticity did not fully solve the heteroskedasticity problem. With the exception of restaurn, all standard errors got notably bigger; for example, the standard error for log(cigpric) doubled. All variables that were statistically significant with the nonrobust standard errors remain significant, but the confidence intervals are much wider in several cases.
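The F statistic in C8.9 (iii) uses the R-squared form of the test for joint significance, $F = (R^2/k)\,/\,[(1 - R^2)/(n - k - 1)]$, which is the form the text attributes to equation (8.15). The following sketch simply evaluates that expression with the rounded values reported above; the function name `r2_form_f` is a hypothetical label chosen here.

```python
def r2_form_f(r2, k, n):
    """R-squared form of the F statistic for joint significance of k regressors.

    r2 is the R-squared of the regression, k the number of slope
    coefficients tested, and n the number of observations.
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# C8.9 (iii): squared weighted residuals on weighted fitted values and their squares
f_c89 = r2_form_f(r2=0.027, k=2, n=807)

# C8.5 (v): joint significance of the four regressors in the LPM
f_c85 = r2_form_f(r2=0.0034, k=4, n=553)

print("C8.9 (iii) F statistic:", f_c89)
print("C8.5 (v) F statistic:", f_c85)
```

Both values should sit within rounding error of the reported 11.15 and .47. Any small gap comes from using an R-squared rounded to three or four decimals.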
C8.11 (i) The usual OLS standard errors are in ( ), the heteroskedasticity-robust standard errors are in [ ]:

$$
\begin{array}{rllll}
\widehat{\mathit{nettfa}} = & -17.20 & +\,.628\,\mathit{inc} & +\,.0251\,(\mathit{age} - 25)^2 & +\,2.54\,\mathit{male} \\
 & (2.82) & (.080) & (.0026) & (2.04) \\
 & [3.23] & [.098] & [.0044] & [2.06] \\
 & -\,3.83\,\mathit{e401k} & +\,.343\,\mathit{e401k}\cdot\mathit{inc} & & \\
 & (4.40) & (.124) & & \\
 & [6.25] & [.220] & &
\end{array}
$$

$n = 2{,}017$, $R^2 = .131$.

Although the usual OLS t statistic on the interaction term is about 2.8, the heteroskedasticity-robust t statistic is just under 1.6. Therefore, using OLS, we must conclude the interaction term is only marginally significant. But the coefficient is nontrivial: it implies a much more sensitive relationship between financial wealth and income for those eligible for a 401(k) plan.

(ii) The WLS estimates, with usual WLS standard errors in ( ) and the robust ones in [ ], are

$$
\begin{array}{rllll}
\widehat{\mathit{nettfa}} = & -14.09 & +\,.619\,\mathit{inc} & +\,.0175\,(\mathit{age} - 25)^2 & +\,1.78\,\mathit{male} \\
 & (2.27) & (.084) & (.0019) & (1.56) \\
 & [2.53] & [.091] & [.0026] & [1.31] \\
 & -\,2.17\,\mathit{e401k} & +\,.295\,\mathit{e401k}\cdot\mathit{inc} & & \\
 & (3.66) & (.130) & & \\
 & [3.51] & [.160] & &
\end{array}
$$

$n = 2{,}017$, $R^2 = .114$.

The robust t statistic is about 1.84, and so the interaction term is marginally significant (two-sided p-value is about .066).

(iii) The coefficient on e401k literally gives the estimated difference in financial wealth at inc = 0, which obviously is not interesting. It is not surprising that it is not statistically different from zero; we obviously cannot hope to estimate the difference at inc = 0, nor do we care to.

(iv) When we replace e401k·inc with e401k·(inc − 30), the coefficient on e401k becomes 6.68 (robust t = 3.20). Now, this coefficient is the estimated difference in nettfa between those with and without 401(k) eligibility at roughly the average income, $30,000. Naturally, we can estimate this much more precisely, and its magnitude ($6,680) makes sense.
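The t statistics in parts (i) and (ii) and the recentered coefficient in part (iv) follow directly from the reported estimates. The sketch below reproduces that arithmetic. It is an illustration under stated assumptions: the coefficient and standard-error values are the rounded figures from the equations above, the two-sided p-value uses a standard normal approximation in place of the t distribution, and the names `e401k_effect` and `two_sided_normal_pvalue` are hypothetical.

```python
import math


def e401k_effect(b_e401k, b_interaction, inc):
    """Estimated nettfa difference (eligible minus not eligible) at income inc."""
    return b_e401k + b_interaction * inc


def two_sided_normal_pvalue(t):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Part (i): OLS interaction term, usual and robust standard errors
t_ols_usual = 0.343 / 0.124
t_ols_robust = 0.343 / 0.220

# Part (ii): WLS interaction term with the robust standard error
t_wls_robust = 0.295 / 0.160
p_wls_robust = two_sided_normal_pvalue(t_wls_robust)

# Parts (iii)-(iv): the e401k coefficient is the eligibility effect at inc = 0;
# recentering income at 30 reports the effect at inc = 30 instead (WLS estimates)
effect_at_0 = e401k_effect(-2.17, 0.295, 0)
effect_at_30 = e401k_effect(-2.17, 0.295, 30)

print("OLS t, usual and robust:", t_ols_usual, t_ols_robust)
print("WLS robust t and approximate p-value:", t_wls_robust, p_wls_robust)
print("Eligibility effect at inc = 0 and inc = 30:", effect_at_0, effect_at_30)
```

The effect at inc = 30 equals the coefficient on e401k after recentering, which is why the two specifications give the same fitted values while reporting the eligibility effect at different income levels.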