Multiple Regression Analysis: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u$
6. Heteroskedasticity
What is Heteroskedasticity?
- Recall that the assumption of homoskedasticity implied that, conditional on the explanatory variables, the variance of the unobserved error $u$ was constant
- If this is not true, that is, if the variance of $u$ differs for different values of the $x$'s, then the errors are heteroskedastic
- Example: we estimate returns to education, but ability is unobservable, and we think the variance in ability differs by educational attainment (a simulated illustration follows below)
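To make the definition concrete, here is a minimal simulation (my own construction, not from the lecture; numpy assumed) in which the standard deviation of $u$ grows with $x$, so $\mathrm{Var}(u|x)$ is not constant:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(1, 10, n)        # stand-in for, e.g., years of education
u = x * rng.normal(0, 1, n)      # sd(u) proportional to x, so Var(u|x) = x**2
y = 1.0 + 0.5 * x + u

# Residual spread in the lower vs. upper half of x: the second is much larger
lo = x < np.median(x)
print(u[lo].var(), u[~lo].var())
```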
Example of Heteroskedasticity
[Figure: conditional densities $f(y|x)$ shown at $x_1$, $x_2$, $x_3$ around the regression line $E(y|x) = \beta_0 + \beta_1 x$, with differing spreads]
Why Worry About Heteroskedasticity?
- We didn't need homoskedasticity to show that our OLS estimates are unbiased and consistent
- However, the estimated standard errors of the $\hat\beta$'s are biased with heteroskedasticity, i.e., we cannot use the usual $t$, $F$, or LM statistics for drawing inferences
- And this problem doesn't go away as $n$ increases
- Our estimates are also no longer BLUE: we could find other estimators with smaller variance
Variance with Heteroskedasticity
- Suppose that in a simple (univariate) regression model the first 4 Gauss-Markov assumptions hold, but $\mathrm{Var}(u_i|x_i) = \sigma_i^2$ (i.e., the variance depends on $x$)
- Then the OLS estimator is just
$$\hat\beta_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x})\,u_i}{\mathrm{SST}_x}, \quad \text{where } \mathrm{SST}_x = \sum_i (x_i - \bar{x})^2$$
- We can show that
$$\mathrm{Var}(\hat\beta_1) = \frac{\sum_i (x_i - \bar{x})^2\,\sigma_i^2}{\mathrm{SST}_x^2}$$
- Without the subscript $i$ (homoskedasticity) we could show that this collapses to $\sigma^2/\mathrm{SST}_x$ (a simulation check follows below)
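As a check on this formula, here is a small Monte Carlo sketch (mine, with an assumed variance function $\sigma_i^2 = 0.5 + x_i^2$): the empirical variance of $\hat\beta_1$ across simulated samples should match $\sum_i (x_i - \bar{x})^2 \sigma_i^2 / \mathrm{SST}_x^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 20000
x = rng.uniform(0, 5, n)                       # held fixed in repeated samples
sigma2_i = 0.5 + x**2                          # assumed variance function
xd = x - x.mean()
sst_x = (xd**2).sum()

theory = (xd**2 * sigma2_i).sum() / sst_x**2   # the formula from the slide

b1 = np.empty(reps)
for r in range(reps):
    u = rng.normal(0, np.sqrt(sigma2_i))       # heteroskedastic draws
    y = 1.0 + 2.0 * x + u
    b1[r] = (xd * y).sum() / sst_x             # OLS slope
print(theory, b1.var())                        # should be close
```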
Heteroskedasticity-Robust SE
- So, we need some way to estimate this
- A valid estimator of this, for heteroskedasticity of any form, is
$$\frac{\sum_i (x_i - \bar{x})^2\,\hat{u}_i^2}{\mathrm{SST}_x^2}, \quad \text{where the } \hat{u}_i \text{ are the OLS residuals}$$
- For the general multiple regression model, a valid estimator of $\mathrm{Var}(\hat\beta_j)$ with heteroskedasticity is
$$\widehat{\mathrm{Var}}(\hat\beta_j) = \frac{\sum_i \hat{r}_{ij}^2\,\hat{u}_i^2}{\mathrm{SSR}_j^2}$$
- where $\hat{r}_{ij}$ is the $i$th residual from regressing $x_j$ on all the other $x$'s, and $\mathrm{SSR}_j$ is the sum of squared residuals from this regression
Heteroskedasticity-Robust SE
- To see this, recall that under homoskedasticity
$$\mathrm{Var}(\hat\beta_j) = \frac{\sum_i \hat{r}_{ij}^2\,\sigma^2}{\left(\sum_i \hat{r}_{ij}^2\right)^2} = \frac{\sigma^2}{\sum_i \hat{r}_{ij}^2} = \frac{\sigma^2}{\mathrm{SSR}_j} = \frac{\sigma^2}{\mathrm{SST}_j\,(1 - R_j^2)}$$
- With heteroskedasticity we can't pull $\sigma^2$ out front (a by-hand check follows below)
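Here is a sketch (my own, with invented data; statsmodels assumed) of the robust variance for one coefficient, computed by hand from the partialling-out formula above and cross-checked against statsmodels' HC0 covariance, which implements the same estimator:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
u = rng.normal(size=n) * (1 + np.abs(x1))       # heteroskedastic error
y = 1 + 1.5 * x1 - 0.7 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
uhat = sm.OLS(y, X).fit().resid

# Partial out: residuals from regressing x1 on a constant and x2
r1 = sm.OLS(x1, sm.add_constant(x2)).fit().resid
ssr1 = (r1**2).sum()
var_b1 = (r1**2 * uhat**2).sum() / ssr1**2       # sum(r^2 u^2) / SSR^2
print(np.sqrt(var_b1))                           # robust SE for beta1, by hand

# Should match statsmodels' HC0 result to numerical precision
print(sm.OLS(y, X).fit(cov_type='HC0').bse[1])
```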
Robust Standard Errors
- Now that we have a consistent estimate of the variance, its square root can be used as a standard error for inference
- We typically call these robust standard errors
- Sometimes the estimated variance is corrected for degrees of freedom by multiplying by $n/(n - k - 1)$
- As $n$ tends to infinity it's all the same, though
- Note: the robust standard errors only have asymptotic justification
Robust Standard Errors (cont)
- We can use the robust SEs to construct robust $t$-statistics if heteroskedasticity is a problem (just substitute the robust SE when forming the $t$-stat)
- With small sample sizes, $t$-statistics formed with robust standard errors will not have a distribution close to the $t$, and inferences will not be correct
  - As the sample size gets large this problem disappears (i.e., het-robust $t$-stats are asymptotically $t$-distributed)
- Thus, if homoskedasticity holds, it is better to use the usual standard errors in small samples
- In EViews, robust standard errors are easily obtained (a statsmodels sketch follows below)
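For readers not using EViews, here is how robust inference is typically requested in statsmodels (a hedged sketch with invented data; HC1 is the $n/(n-k-1)$-corrected variant mentioned above):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(0, 4, n)
y = 2 + 0.8 * x + rng.normal(size=n) * x        # Var(u|x) grows with x

X = sm.add_constant(x)
usual = sm.OLS(y, X).fit()                      # classical SEs
robust = sm.OLS(y, X).fit(cov_type='HC1')       # df-corrected robust SEs

print(usual.bse, usual.tvalues)
print(robust.bse, robust.tvalues)               # t-stats formed with robust SEs
```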
A Robust LM Statistic
- We can compute LM statistics that are robust to heteroskedasticity to test multiple exclusion restrictions
- Suppose $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u$ and we want to test $H_0\colon \beta_1 = 0,\ \beta_2 = 0$
1. Run OLS on the restricted model and save the residuals $\hat{u}$
2. Regress each of the excluded variables on all of the included variables ($q$ different regressions) and save each set of residuals $\hat{r}_1, \hat{r}_2$, i.e., $x_1$ on $x_3, \ldots, x_k$ and $x_2$ on $x_3, \ldots, x_k$
A Robust LM Statistic (cont)
3. Create a variable equal to 1 for all observations and regress this on $\hat{r}_1 \hat{u},\ \hat{r}_2 \hat{u}$ with no intercept
- The heteroskedasticity-robust LM statistic turns out to be $n - \mathrm{SSR}_1$, where $\mathrm{SSR}_1$ is the sum of squared residuals from this final regression
- LM still has a chi-squared distribution with $q$ degrees of freedom
- I offer no explanation as to why or how this works, as the derivation is tedious; just know it can be done (a worked sketch follows below)
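Since the slides skip the derivation, here is a sketch (my own construction; statsmodels and scipy assumed, variable names hypothetical) that simply follows the three steps mechanically, testing exclusion of $x_1$ and $x_2$ with $x_3$ included:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
n = 500
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 0.5 * x3 + rng.normal(size=n) * (1 + x3**2)   # H0 true, het errors

# Step 1: restricted model (constant and x3 only), save residuals
Xr = sm.add_constant(x3)
u_r = sm.OLS(y, Xr).fit().resid

# Step 2: regress each excluded variable on the included ones, save residuals
r1 = sm.OLS(x1, Xr).fit().resid
r2 = sm.OLS(x2, Xr).fit().resid

# Step 3: regress a constant 1 on r1*u and r2*u with NO intercept;
# LM = n - SSR1 from this regression, chi-squared with q = 2 df under H0
ones = np.ones(n)
Z = np.column_stack([r1 * u_r, r2 * u_r])
ssr1 = sm.OLS(ones, Z).fit().ssr
lm = n - ssr1
print(lm, stats.chi2.sf(lm, df=2))                    # statistic and p-value
```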
Testing for Heteroskedasticity
- We may still want to know whether heteroskedasticity is present, for example if we have a small sample or if we want to know the form of the heteroskedasticity
- Essentially we want to test $H_0\colon \mathrm{Var}(u|x_1, x_2, \ldots, x_k) = \sigma^2$
- This is equivalent to $H_0\colon E(u^2|x_1, x_2, \ldots, x_k) = E(u^2) = \sigma^2$ (since $E(u) = 0$)
- So to test for heteroskedasticity we want to test whether $u^2$ is related to any of the $x$'s
The Breusch-Pagan Test
- If we assume the relationship between $u^2$ and the $x_j$'s is linear, we can test this as a linear restriction
- So, for $u^2 = \delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k + v$, this means testing $H_0\colon \delta_1 = \delta_2 = \ldots = \delta_k = 0$
- We don't observe the error, but we can estimate it with the residuals from the OLS regression
- Thus, regress the squared residuals on all the $x$'s
- The $R^2$ from this regression can be used to form an $F$ or LM statistic to test for overall significance
The Breusch-Pagan Test (cont)
F-statistic:
- The $F$ statistic is just the reported $F$ statistic for overall significance of the regression
- $F = \dfrac{R^2/k}{(1 - R^2)/(n - k - 1)}$, which is distributed $F_{k,\,n-k-1}$
LM statistic (Breusch-Pagan):
- The LM statistic is $LM = nR^2$, which is distributed $\chi^2_k$
- Note that we did not have to run any additional auxiliary regressions in this case (unlike the robust LM statistic earlier)
- We could carry out this test for any of the $x$'s individually (instead of for all $x$'s together); a sketch follows below
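Here is a hedged sketch of the Breusch-Pagan LM statistic on invented data (names mine), computed by hand as $nR^2$ and cross-checked against statsmodels' het_breuschpagan, whose LM component is the same $nR^2$ statistic:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(5)
n = 400
x1, x2 = rng.normal(size=(2, n))
y = 1 + x1 + x2 + rng.normal(size=n) * (1 + np.abs(x1))   # het in x1

X = sm.add_constant(np.column_stack([x1, x2]))
uhat = sm.OLS(y, X).fit().resid

aux = sm.OLS(uhat**2, X).fit()                  # regress u-hat^2 on the x's
lm = n * aux.rsquared
print(lm, stats.chi2.sf(lm, df=2))              # by hand: LM and p-value

print(het_breuschpagan(uhat, X)[:2])            # (LM, p-value) from statsmodels
```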
The White Test
- The Breusch-Pagan test will detect any linear forms of heteroskedasticity
- The White test allows for nonlinearities by using squares and cross-products of all the $x$'s
- e.g., with two $x$'s, estimate
$$\hat{u}^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_1^2 + \delta_4 x_2^2 + \delta_5 x_1 x_2 + \text{error}$$
- We still just use an $F$ or LM statistic to test whether all the $x_j$, $x_j^2$, and $x_j x_h$ are jointly significant
- This can get unwieldy pretty quickly as the number of $x$ variables gets bigger
Alternate Form of the White Test
- We can get all the squares and cross-products on the RHS in an easier way
- Recognize that the fitted values from OLS, the $\hat{y}$'s, are a function of all the $x$'s
- Thus, $\hat{y}^2$ will be a function of the squares and cross-products of the $x_j$'s
- So another way to do the White test is to regress the squared residuals on $\hat{y}$ and $\hat{y}^2$:
$$\hat{u}^2 = \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + \text{error}$$
- Then, as before, use the $R^2$ to form an $F$ or LM statistic (a sketch follows below)
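A minimal sketch of this fitted-values form of the White test (my own data and names; statsmodels and scipy assumed), using $LM = nR^2$ with 2 degrees of freedom:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(6)
n = 400
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + x1 - x2 + 0.5 * x3 + rng.normal(size=n) * np.exp(0.4 * x1)

X = sm.add_constant(np.column_stack([x1, x2, x3]))
fit = sm.OLS(y, X).fit()
uhat, yhat = fit.resid, fit.fittedvalues

# Regress u-hat^2 on y-hat and y-hat^2 only
W = sm.add_constant(np.column_stack([yhat, yhat**2]))
aux = sm.OLS(uhat**2, W).fit()
lm = n * aux.rsquared
print(lm, stats.chi2.sf(lm, df=2))   # only 2 restrictions, whatever k is
```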
Alternate Form (cont)
- Here the null hypothesis is $H_0\colon \delta_1 = 0,\ \delta_2 = 0$
- So there are only 2 restrictions, regardless of the number of independent variables in the original model
- It is important to test for functional form first; otherwise we could reject $H_0$ even when it is true
  - Make sure your functional form is right before doing heteroskedasticity tests
- e.g., don't estimate a model with experience only when you should have experience and experience$^2$
  - Your heteroskedasticity test (using the misspecified model) could lead to rejection of the null when the null is true
Weighted Least Squares
- While it's always possible to estimate robust standard errors for OLS estimates (see above), the OLS estimates themselves may not be efficient
- If we know something about the specific form of the heteroskedasticity, we can obtain more efficient estimates than OLS
- The basic idea is to transform the model into one that has homoskedastic errors
- We call this the weighted least squares approach
Heteroskedasticity Known up to a Multiplicative Constant
- Suppose the heteroskedasticity can be modeled as $\mathrm{Var}(u|x) = \sigma^2 h(x)$
  - $h(x)$ is some function of the $x$'s, greater than zero
  - Here $h(x)$ is assumed to be known (Case 1)
- The trick is to figure out what $h(x)$ looks like
- Thus, for a random sample from the population, $\sigma_i^2 = \mathrm{Var}(u_i|x_i) = \sigma^2 h_i$
- How can we use this information to get homoskedastic errors and therefore estimate the betas more efficiently?
Heteroskedasticity Is Known
- First note that $E\!\left(u_i/\sqrt{h_i}\,\middle|\,x\right) = 0$, since $h_i$ is a function of the $x$'s
- And therefore $\mathrm{Var}\!\left(u_i/\sqrt{h_i}\,\middle|\,x\right) = \sigma^2$, since $\mathrm{Var}(u_i|x) = \sigma^2 h_i$
- So, if we divide the whole equation by $\sqrt{h_i}$, we have a model where the error is homoskedastic:
$$\frac{y_i}{\sqrt{h_i}} = \frac{\beta_0}{\sqrt{h_i}} + \beta_1 \frac{x_{1i}}{\sqrt{h_i}} + \ldots + \beta_k \frac{x_{ki}}{\sqrt{h_i}} + \frac{u_i}{\sqrt{h_i}}$$
Heteroskedasticity Is Known (cont)
$$y_i^* = \beta_0 x_{i0}^* + \beta_1 x_{i1}^* + \ldots + \beta_k x_{ik}^* + u_i^*$$
- So, estimate the transformed equation to get more efficient estimates of the parameters
- Notice that the $\beta$'s have the same interpretation as in the original model
- Example: $Cigs = \beta_0 + \beta_1 Cigprice + \beta_2 Rest + \ldots + \beta_k Income + u$
  - We might think that $\mathrm{Var}(u_i|x_i) = \sigma^2 \cdot income$, i.e., that $h(x) = income$ (a sketch follows below)
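Here is a minimal sketch of the transformation by hand, under the assumed variance model $\mathrm{Var}(u|x) = \sigma^2 \cdot income$ (data and names are my own, simplified to one regressor; statsmodels assumed). Note the transformed intercept becomes the variable $1/\sqrt{h_i}$:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
income = rng.uniform(1, 10, n)
u = rng.normal(size=n) * np.sqrt(income)         # Var(u|x) = sigma^2 * income
cigs = 3.0 + 0.4 * income + u

w = np.sqrt(income)                              # sqrt(h_i)
y_star = cigs / w
X_star = np.column_stack([1.0 / w, income / w])  # transformed intercept and x
gls = sm.OLS(y_star, X_star).fit()               # no add_constant: 1/w plays that role
print(gls.params)                                # same betas as the original model
```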
Generalized Least Squares
- Estimating the transformed equation by OLS is an example of generalized least squares (GLS)
- GLS will be BLUE as long as the original model satisfies the Gauss-Markov assumptions except homoskedasticity
- GLS is a weighted least squares (WLS) procedure where each squared residual is weighted by the inverse of $\mathrm{Var}(u_i|x_i)$
- So, less weight is given to observations with a higher error variance
Weighted Least Squares
- While it is intuitive to see why performing OLS on a transformed equation is appropriate, it can be tedious to do the transformation
- Weighted least squares is a way of getting the same thing without the transformation
- The idea is to minimize the sum of squared residuals, weighted by $1/h_i$:
$$\min \sum_i \frac{\left(y_i - b_0 - b_1 x_{i1} - \ldots - b_k x_{ik}\right)^2}{h_i}$$
- This is the same as the transformed regression
- EViews will do this without transforming variables (a statsmodels sketch follows below)
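The same estimates without transforming by hand: statsmodels' WLS minimizes exactly this weighted sum of squares, so with weights $1/h_i$ it reproduces the transformed-OLS fit. A sketch on the same simulated data as above (same seed, my own construction):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)                     # same draws as the sketch above
n = 500
income = rng.uniform(1, 10, n)
cigs = 3.0 + 0.4 * income + rng.normal(size=n) * np.sqrt(income)

X = sm.add_constant(income)
wls = sm.WLS(cigs, X, weights=1.0 / income).fit()  # weights are 1/h_i
print(wls.params)                                  # identical to the transformed OLS
```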
More on WLS
- WLS is great if we know what $\mathrm{Var}(u_i|x_i)$ looks like
- In most cases, we won't know the form of the heteroskedasticity (Case 2)
Feasible GLS
- In the more typical case where you don't know the form of the heteroskedasticity, you can estimate $h(x_i)$
- Typically, we start by assuming a fairly flexible model, such as
$$\mathrm{Var}(u|x) = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k), \quad \text{where } h(x_i) = \exp(\delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k)$$
- We use $\exp(\cdot)$ to ensure that $h(x_i) > 0$
- Since we don't know the $\delta$'s, we must estimate them
Feasible GLS (continued)
- Our assumption implies that $u^2 = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \ldots + \delta_k x_k)\,v$
- where $E(v|x) = 1$; then, if $v$ is independent of $x$,
$$\ln(u^2) = \alpha_0 + \delta_1 x_1 + \ldots + \delta_k x_k + e$$
- where $E(e) = 0$ and $e$ is independent of $x$
- The intercept has a different interpretation ($\alpha_0 \neq \delta_0$)
- Now, since $\hat{u}^2$ is an estimate of $u^2$, we can estimate this by OLS: regress $\ln(\hat{u}^2)$ on the $x$'s
Feasible GLS (continued)
- We then obtain the fitted values from this regression, the $\hat{g}_i$
- The estimate of each $h_i$ is then $\hat{h}_i = \exp(\hat{g}_i)$
- Now all we need to do is run a WLS regression using $1/\hat{h}_i$ as the weights
Notes:
- FGLS is biased (due to the use of the $\hat{h}_i$) but consistent, and asymptotically more efficient than OLS
- Remember, the estimated $\beta$'s give the marginal effect of $x_j$ on $y$, not of $x_j^*$ on $y^*$ (a sketch follows below)
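A sketch of the full FGLS recipe under the exponential variance model above (my own simulated data and names; statsmodels assumed): OLS first, then estimate $h(x)$ from $\ln(\hat{u}^2)$, then WLS with weights $1/\hat{h}_i$.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 600
x1, x2 = rng.normal(size=(2, n))
h = np.exp(0.5 + 0.8 * x1)                      # true (unknown) variance shape
y = 1 + x1 + x2 + rng.normal(size=n) * np.sqrt(h)

X = sm.add_constant(np.column_stack([x1, x2]))

# Step 1: OLS, save residuals
uhat = sm.OLS(y, X).fit().resid

# Step 2: regress log(u-hat^2) on the x's, keep fitted values g-hat
g_hat = sm.OLS(np.log(uhat**2), X).fit().fittedvalues

# Step 3: h-hat = exp(g-hat); WLS with weights 1/h-hat
h_hat = np.exp(g_hat)
fgls = sm.WLS(y, X, weights=1.0 / h_hat).fit()
print(fgls.params, fgls.bse)                    # typically tighter SEs than OLS
```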
WLS Wrap-up
- $F$-statistics after WLS (whether in $R^2$ or SSR form) are a bit tricky
  - You need to form the weights from the unrestricted model and use those to estimate WLS for the restricted model as well (i.e., use the same weighted $x$'s)
- The OLS and WLS estimates will differ due to sampling error
  - However, if they are very different, then it's likely that some other Gauss-Markov assumption is false (e.g., zero conditional mean)