Topic 7: Heteroskedasticity
Advanced Econometrics (I)
Dong Chen, School of Economics, Peking University

1 Introduction

If the disturbance variance is not constant across observations, the regression is heteroskedastic. That is,

$$\mathrm{Var}(\varepsilon_i) = \sigma_i^2, \quad i = 1, \ldots, n. \tag{1}$$

We continue to assume that the disturbances are pairwise uncorrelated. This implies that

$$E(\varepsilon\varepsilon') = \sigma^2\Omega = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}. \tag{2}$$

Heteroskedasticity may arise in many applications, especially with cross-sectional data.

Example 1:
(i) The variation in profits of large firms may be greater than that of small ones, even after accounting for differences in firm size.
(ii) The variation of expenditure on certain commodity groups may be higher for high-income families than for low-income ones.
(iii) When estimating the return to education, ability is unobservable and thus enters the disturbance. It is possible that the variance of ability varies with the level of education.
(iv) Sometimes heteroskedasticity is a consequence of aggregation (e.g., taking averages of data).

By eyeballing the pattern of the residuals from OLS estimation, we may find some evidence of heteroskedasticity.

Example 2: Consider the following model

$$EXP = \beta_1 + \beta_2\, AGE + \beta_3\, INCOME + \beta_4\, (INCOME)^2 + \beta_5\, OWNER + \varepsilon, \tag{3}$$

where EXP is credit card expenditure and OWNER is a dummy variable indicating whether an individual owns a house. Model (3) is estimated by OLS and the residuals are saved. In Figure 1 the residuals are plotted against INCOME, and in Figure 2 against AGE. In Figure 1, the spread of the residuals becomes wider at higher incomes, while in Figure 2 the residuals look largely random. Figures 1 and 2 suggest that a common cause of heteroskedasticity is that the variances of the disturbance terms may depend on some of the x variables, i.e., $\sigma_i^2 = h(x_i)$. In this case, it appears that $\sigma_i^2$ is positively related to INCOME.

Fig. 1: Plot of the OLS Residuals against INCOME

STATA Tips: To obtain graphs like those in Figures 1 and 2, use the following commands in STATA:

    reg exp age income income2 owner
    predict e, resid
    graph twoway scatter e income, msymbol(oh) yline(0)

2 Consequences

Recall from our previous discussion that if we use OLS when $\mathrm{Var}(\varepsilon) = \sigma^2\Omega$, then
(i) $b$ is unbiased;
(ii) $b$ is inefficient, while the GLS estimator, $\hat\beta$, is BLUE;
(iii) $\mathrm{Var}(b) = \sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}$.

So using $\sigma^2 (X'X)^{-1}$ is incorrect: it leads to incorrect standard errors and unreliable inferences about the population parameters.
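To make point (iii) concrete, the following is a minimal numerical sketch that compares the sandwich formula with the usual OLS formula on a small made-up design. The function names, the design matrix, and the choice of variances proportional to the squared regressor are all assumptions for illustration; the variances are scaled so that they average to one, a normalisation chosen here and not taken from the notes.

```python
import numpy as np

def sandwich_variance(X, var_i):
    """(X'X)^{-1} [sum_i var_i x_i x_i'] (X'X)^{-1} for per-observation variances var_i."""
    X = np.asarray(X, dtype=float)
    var_i = np.asarray(var_i, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ (var_i[:, None] * X)      # X' diag(var_i) X
    return xtx_inv @ meat @ xtx_inv

def naive_variance(X, sigma2):
    """Usual OLS formula sigma^2 (X'X)^{-1}, valid only under homoskedasticity."""
    X = np.asarray(X, dtype=float)
    return sigma2 * np.linalg.inv(X.T @ X)

# Hypothetical design: a constant and one regressor x = 1, ..., 10.
x = np.arange(1.0, 11.0)
X = np.column_stack([np.ones_like(x), x])
sigma2 = 1.0
omega = x**2 / np.mean(x**2)               # assumed form: variance grows with x^2

V_true = sandwich_variance(X, sigma2 * omega)   # correct Var(b)
V_naive = naive_variance(X, sigma2)             # what the usual formula reports
se_true = np.sqrt(np.diag(V_true))
se_naive = np.sqrt(np.diag(V_naive))
```

When every element of `var_i` equals the same constant, the two functions agree, which is the homoskedastic special case. Passing squared OLS residuals as `var_i` in place of the true variances gives the residual-based estimator discussed in Section 3.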
Fig. 2: Plot of the OLS Residuals against AGE

3 Robust Estimation of Asymptotic Covariance Matrix

The above discussion suggests that if we are to continue using OLS in the presence of heteroskedasticity, then we should at least use the correct formula for $\mathrm{Var}(b)$. Note that in the expression for $\mathrm{Var}(b)$, $\sigma^2$ and $\Omega$ are both unknown. To estimate $\mathrm{Var}(b)$, we need to estimate the matrix $\sigma^2 X'\Omega X$. White (1980, Econometrica) shows that under very general conditions, the matrix

$$S_0 = \frac{1}{n}\sum_{i=1}^{n} e_i^2\, x_i x_i' \tag{4}$$

is a consistent estimator of

$$\Sigma = \frac{1}{n}\sigma^2 X'\Omega X = \frac{1}{n}\sum_{i=1}^{n} \sigma_i^2\, x_i x_i', \tag{5}$$

where $e_i$ is the OLS residual for observation $i$ and $x_i = [\,x_{i1}\ x_{i2}\ \cdots\ x_{iK}\,]'$. Therefore, we can obtain a consistent estimator of $\mathrm{Var}(b)$, which is given by

$$\mathrm{Est.Asy.Var}(b) = (X'X)^{-1}\left(\sum_{i=1}^{n} e_i^2\, x_i x_i'\right)(X'X)^{-1}. \tag{6}$$

This is usually called the White heteroskedasticity-consistent (or robust) estimator of the covariance matrix of $b$. Note that in forming this estimator, we do not have to assume any specific form of heteroskedasticity, so it is a very useful result. The asymptotic properties of the estimator are unambiguous, but its usefulness in small samples is open to question. Some Monte Carlo studies suggest that in small samples the White estimator tends to underestimate the variance matrix.

Remark 1: With the White robust estimator of the covariance matrix, we can construct the t statistic as usual; it is called the heteroskedasticity-robust t statistic. Note that this robust statistic follows a t distribution only asymptotically. In small samples, its sampling distribution is unknown.

Remark 2: We cannot use the F test for testing exact linear restrictions, because the F distribution of that statistic is derived under homoskedasticity. But we can use a Wald test. The statistic is

$$W = (Rb - q)'\left\{R\,[\mathrm{Est.Asy.Var}(b)]\,R'\right\}^{-1}(Rb - q) \overset{a}{\sim} \chi^2_J \tag{7}$$

under $H_0: R\beta = q$. That is, the statistic is asymptotically distributed as $\chi^2$ with degrees of freedom equal to the number of restrictions, $J$.

STATA Tips: In STATA, to obtain the White estimator, we simply add the option robust to the regress command. For example,

    reg y x1 x2 x3, robust

The output will then report standard errors computed from the White estimator of the covariance matrix of $b$.

4 Testing for Heteroskedasticity

Among others, three tests are commonly used in practice to detect heteroskedasticity: (1) White's general test; (2) the Goldfeld-Quandt test; and (3) the Breusch-Pagan LM test. These tests are based on the following strategy. The OLS estimator of $\beta$ is consistent even in the presence of heteroskedasticity. Therefore, the OLS residuals will mimic the heteroskedasticity of the true disturbances. Hence, tests designed to detect heteroskedasticity are applied to the OLS residuals.

4.1 White's General Test

The hypotheses under examination are

$$H_0: \sigma_i^2 = \sigma^2 \ \text{for all } i \quad \text{vs} \quad H_1: \text{not } H_0.$$

Note that to conduct White's test, we do not have to assume any specific form of heteroskedasticity. The test is motivated by the observation that if the model is homoskedastic, then $\varepsilon_i^2$ should not be correlated with any of the regressors, their squares, or their cross products. A simple operational version of White's test is carried out by obtaining $nR^2$ from the auxiliary regression of $e_i^2$ on a constant, all unique variables contained in $x_i$, and all the squares and cross products of the variables in $x_i$.
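The operational version just described can be sketched as follows. This is a minimal illustration under stated assumptions, not a replacement for a packaged routine: the function name `white_test_statistic`, the simulated data, and the rule used to drop duplicate columns are all hypothetical choices, and the design matrix is assumed to carry the constant in its first column.

```python
import numpy as np
from itertools import combinations

def white_test_statistic(y, X):
    """Return (n * R^2, P) from White's auxiliary regression.

    X must contain a column of ones as its first column.
    P is the number of auxiliary regressors, including the constant.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # OLS residuals from the original model, squared.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2

    # Candidate auxiliary regressors: constant, levels, squares, cross products.
    V = X[:, 1:]
    k = V.shape[1]
    cols = [np.ones(n)] + [V[:, j] for j in range(k)]
    cols += [V[:, j] ** 2 for j in range(k)]
    cols += [V[:, i] * V[:, j] for i, j in combinations(range(k), 2)]

    # Keep only unique columns (for example, a 0/1 dummy equals its own square).
    unique = []
    for c in cols:
        if not any(np.allclose(c, u) for u in unique):
            unique.append(c)
    Z = np.column_stack(unique)

    # R^2 from regressing the squared residuals on Z.
    a, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    resid = e2 - Z @ a
    r2 = 1.0 - (resid @ resid) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2, Z.shape[1]

# Simulated example (assumed data-generating process, for illustration only).
rng = np.random.default_rng(0)
n = 200
V = rng.normal(size=(n, 3))
X = np.column_stack([np.ones(n), V])
y = X @ np.array([1.0, 0.5, -0.5, 2.0]) + rng.normal(size=n) * np.exp(0.5 * V[:, 0])
stat, P = white_test_statistic(y, X)
```

The statistic is compared with a chi-squared critical value whose degrees of freedom equal the number of non-constant auxiliary regressors, `P - 1`.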
Example 3: Suppose the model has four regressors: $x_1$, $x_2$, $x_3$, and a constant term. White's test is carried out by first obtaining the residuals, $e_i$, from OLS on the original model and then estimating an auxiliary regression of $e_i^2$ on a constant and $x_1, x_2, x_3, x_1^2, x_2^2, x_3^2, x_1x_2, x_1x_3, x_2x_3$. Finally, record the $R^2$ from the auxiliary regression and construct the test statistic $nR^2$.

The test statistic, $nR^2$, is asymptotically distributed as chi-squared with $P - 1$ degrees of freedom, where $P$ is the number of regressors in the auxiliary regression, including the constant (here $P = 10$):

$$nR^2 \overset{a}{\sim} \chi^2_{P-1}. \tag{8}$$

Remark 3: White's test is very general in that it does not specify any particular form of heteroskedasticity.

Remark 4: Due to its generality, White's test may simply pick up some other specification error (such as the omission of $x^2$ from a simple regression) instead of heteroskedasticity.

Remark 5: The power of White's test may be low in some cases.

Remark 6: White's test is nonconstructive: if we reject the null hypothesis, the result of the test does not provide any guidance for the next step.

STATA Tips: To perform White's test in STATA, you can either construct the test statistic manually as in (8) or use the whitetst command following a regress command on the original model. whitetst is not an official STATA command and has to be downloaded. Type findit whitetst in STATA and follow the link, and the command will be installed automatically.

4.2 Goldfeld-Quandt Test

The Goldfeld-Quandt test assumes a particular form of heteroskedasticity. It tests that $E(\varepsilon_i^2) = \sigma^2 h(x_{ik})$, e.g., $\sigma^2 x_{ik}^2$. This test is applicable if one of the x variables is thought to cause the heteroskedasticity.

Steps:
1. Reorder the observations by the values of $x_k$.
2. Omit $c$ central observations, which leaves two samples of $(n - c)/2$ observations each.
3. Let $\sigma_1^2$ ($\sigma_2^2$) be the error variance of the first (second) sample. Test $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_1: \sigma_1^2 > \sigma_2^2$.
4. Estimate the regression $y = X\beta + \varepsilon$ in each sub-sample (which requires that $(n - c)/2 > K$). Obtain $e_1'e_1$ and $e_2'e_2$, where $e_1$ and $e_2$ are the residual vectors from the two sub-samples, respectively.
5. Form $R = e_1'e_1 / e_2'e_2$, so that the sub-sample with the larger variance under $H_1$ is in the numerator.

It can be shown that under $H_0$,

$$R \sim F_{n^*,\,n^*}, \tag{9}$$

where $n^* = (n - c - 2K)/2$.

Remark 7: $c$ can be zero. Omitting $c$ central observations is intended to increase the power of the test. However, as $c$ increases, $(n - c)/2$ decreases, which lowers the degrees of freedom in each sub-sample estimation and tends to diminish the power of the test. So there is a trade-off in choosing the appropriate $c$. Some studies suggest that no more than a third of the observations should be dropped. One rule of thumb ties the choice of $c$ to $n/3$ and $K$.

Remark 8: The Goldfeld-Quandt statistic is exactly distributed as F under $H_0$ if the disturbances are normally distributed. If not, the F distribution is only an approximation.

4.3 Breusch-Pagan LM Test

The Goldfeld-Quandt test is reasonably powerful if we know, or are able to identify correctly, the variable to use in separating the sample. This limits its generality. For example, what if a set of regressors jointly determines the nature of the heteroskedasticity? In this regard, the Breusch-Pagan LM test is more general. Assume

$$\sigma_i^2 = h(z_i'\alpha),$$

where $h(\cdot)$ is some function, $\alpha$ is a coefficient vector unrelated to $\beta$, and $z_i$ is a vector of variables causing the heteroskedasticity, with its first element equal to 1. Within this framework, if $\alpha_2 = \alpha_3 = \cdots = \alpha_P = 0$, then $\sigma_i^2 = h(\alpha_1) = \sigma^2$, i.e., homoskedasticity. Therefore, we test

$$H_0: \alpha_2 = \alpha_3 = \cdots = \alpha_P = 0 \quad \text{vs} \quad H_1: \text{not } H_0.$$

Steps:
1. Regress $y$ on $X$. Obtain the OLS residual vector $e$.
2. Compute $\hat\sigma^2 = e'e/n$ and $g_i = e_i^2/\hat\sigma^2$.
3. Estimate, by OLS, the auxiliary regression

$$g_i = \alpha_1 + \alpha_2 z_{i2} + \alpha_3 z_{i3} + \cdots + \alpha_P z_{iP} + v_i. \tag{10}$$

4. Compute the regression (explained) sum of squares,

$$SSR = \sum_{i=1}^{n} (\hat g_i - \bar g)^2, \quad \text{where } \bar g = \frac{1}{n}\sum_{i=1}^{n} g_i. \tag{11}$$

Under $H_0$,

$$LM = \tfrac{1}{2}\, SSR \overset{a}{\sim} \chi^2_{P-1}. \tag{12}$$
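The four Breusch-Pagan steps above can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the function name `breusch_pagan_lm` and the simulated data (disturbance standard deviation proportional to a single variable z) are hypothetical, and both X and Z are assumed to carry a column of ones first.

```python
import numpy as np

def breusch_pagan_lm(y, X, Z):
    """Return (LM, degrees of freedom) for the Breusch-Pagan test.

    X: regressors of the original model; Z: variables thought to drive the
    variance. Both must contain a column of ones as their first column.
    """
    y, X, Z = (np.asarray(a, dtype=float) for a in (y, X, Z))
    n = len(y)

    # Step 1: OLS residuals from the original model.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b

    # Step 2: scaled squared residuals g_i = e_i^2 / (e'e / n).
    sigma2_hat = (e @ e) / n
    g = e**2 / sigma2_hat

    # Step 3: auxiliary regression of g on Z.
    a, *_ = np.linalg.lstsq(Z, g, rcond=None)
    g_hat = Z @ a

    # Step 4: explained sum of squares; LM is one half of it.
    ssr = np.sum((g_hat - g.mean()) ** 2)
    return 0.5 * ssr, Z.shape[1] - 1

# Simulated example: the disturbance standard deviation is proportional to z.
rng = np.random.default_rng(1)
n = 1000
z = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), z])
y = 1.0 + 2.0 * z + z * rng.normal(size=n)

lm, df = breusch_pagan_lm(y, X, X)   # here z_i is taken to coincide with x_i
CRIT_5PCT_DF1 = 3.841                # 5% critical value of chi-squared with 1 d.f.
reject = lm > CRIT_5PCT_DF1
```

By construction the $g_i$ average to one, so the auxiliary regression only has to explain deviations of the scaled squared residuals from one.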
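STATA Tips: To perform the Breusch-Pagan LM test in STATA, you can use the hettest or the bpagan command following the regress command on the original model. bpagan is unofficial and thus needs to be downloaded. The syntax is the following:

    hettest var_list

where var_list specifies $z_i$ without the leading 1. The same syntax applies to bpagan.

5 Generalized Least Squares Estimator

5.1 Weighted Least Squares when Ω Is Known

Suppose the variance matrix of $\varepsilon$ is given by (2), where $\Omega$ is known. Without loss of generality, we may write

$$\sigma_i^2 = \sigma^2 \omega_i. \tag{13}$$

So,

$$\Omega = \begin{bmatrix} \omega_1 & 0 & \cdots & 0 \\ 0 & \omega_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega_n \end{bmatrix}. \tag{14}$$

Now consider a weight matrix, $P$, as follows:

$$P = \begin{bmatrix} 1/\sqrt{\omega_1} & 0 & \cdots & 0 \\ 0 & 1/\sqrt{\omega_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1/\sqrt{\omega_n} \end{bmatrix}. \tag{15}$$

Hence, $P'P = \Omega^{-1}$ and

$$Py = \begin{bmatrix} y_1/\sqrt{\omega_1} \\ y_2/\sqrt{\omega_2} \\ \vdots \\ y_n/\sqrt{\omega_n} \end{bmatrix}, \qquad PX = \begin{bmatrix} x_1'/\sqrt{\omega_1} \\ x_2'/\sqrt{\omega_2} \\ \vdots \\ x_n'/\sqrt{\omega_n} \end{bmatrix}.$$

Regressing $Py$ on $PX$ using OLS gives the GLS estimator,

$$\hat\beta = (X'P'PX)^{-1} X'P'Py = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y = \left[\sum_{i=1}^{n} w_i\, x_i x_i'\right]^{-1}\left[\sum_{i=1}^{n} w_i\, x_i y_i\right], \tag{16}$$

where $w_i = 1/\omega_i$. In this case, $\hat\beta$ is also called the weighted least squares (WLS) estimator.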
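The equivalence between OLS on the transformed data and the weighted-sum formula can be checked numerically. The sketch below is illustrative only: the function names and the simulated data (with $\omega_i = x_i^2$) are assumptions, not part of the notes.

```python
import numpy as np

def wls_by_transformation(y, X, omega):
    """OLS of Py on PX, where P = diag(1 / sqrt(omega_i))."""
    p = 1.0 / np.sqrt(omega)
    beta, *_ = np.linalg.lstsq(X * p[:, None], y * p, rcond=None)
    return beta

def wls_by_weighted_sums(y, X, omega):
    """[sum_i w_i x_i x_i']^{-1} [sum_i w_i x_i y_i] with w_i = 1 / omega_i."""
    w = 1.0 / omega
    A = X.T @ (w[:, None] * X)   # sum of w_i x_i x_i'
    c = X.T @ (w * y)            # sum of w_i x_i y_i
    return np.linalg.solve(A, c)

# Simulated example with known omega_i = x_i^2 (assumed for illustration).
rng = np.random.default_rng(2)
n = 100
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
omega = x**2
y = 1.0 + 2.0 * x + np.sqrt(omega) * rng.normal(size=n)

beta_transformed = wls_by_transformation(y, X, omega)
beta_weighted = wls_by_weighted_sums(y, X, omega)
```

Both routes return the same coefficient vector up to floating-point error. The transformation route reuses any OLS routine, while the weighted-sum route avoids forming the transformed data.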
A common specification is that the variance is proportional to one of the regressors or its square. For example, if

$$\sigma_i^2 = \sigma^2 x_{ik}^2 \tag{17}$$

for some $k$, then the transformed regression model for GLS (or WLS) is

$$\frac{y}{x_k} = \beta_k + \beta_1\left(\frac{x_1}{x_k}\right) + \beta_2\left(\frac{x_2}{x_k}\right) + \cdots + \beta_K\left(\frac{x_K}{x_k}\right) + \frac{\varepsilon}{x_k}, \tag{18}$$

where the terms on the right-hand side run over the regressors other than $x_k$. If the variance is proportional to $x_k$ instead of $x_k^2$, then the weight applied to each observation is $1/\sqrt{x_k}$ instead of $1/x_k$.

STATA Tips: In STATA, you can perform WLS either by manually transforming the data and then running OLS, or by using the aweight feature of the regress command. The syntax is as follows:

    regress y x1 x2 ... xk [aweight=var_name]

The weight to be used is $1/\omega_i$. For example, if $\sigma_i^2 = \sigma^2 x_{ik}^2$, then you should first generate a variable, say w, which equals $1/x_{ik}^2$, and then write [aweight=w] in the regress command. If $\sigma_i^2 = \sigma^2 x_{ik}$, then w should be $1/x_{ik}$.

5.2 Estimation when Ω Is Unknown

It is rare that the form of $\Omega$ is known, so usually it has to be estimated. The general form of the heteroskedastic regression model has too many parameters to estimate. Typically, the model is restricted by formulating $\sigma^2\Omega$ as a function of a few parameters, $\alpha$. Write this function as $\Omega(\alpha)$. FGLS based on a consistent estimator of $\Omega(\alpha)$ is asymptotically equivalent to full GLS.

Recall that for the heteroskedastic model, the GLS estimator is

$$\hat\beta = \left[\sum_{i=1}^{n} \frac{1}{\sigma_i^2}\, x_i x_i'\right]^{-1}\left[\sum_{i=1}^{n} \frac{1}{\sigma_i^2}\, x_i y_i\right]. \tag{19}$$

Basically, we first need to obtain estimates of $\sigma_i^2$, say $\hat\sigma_i^2$, usually using some function of the OLS residuals. Then we can compute $\hat\beta$ from (19) and $\hat\sigma_i^2$.

Note that $E(\varepsilon_i^2) = \sigma_i^2$, so

$$\varepsilon_i^2 = \sigma_i^2 + v_i, \tag{20}$$

where $v_i$ is the difference between $\varepsilon_i^2$ and its expectation. Since $\varepsilon_i$ is unobservable, we would use the least squares residuals, for which

$$e_i = \varepsilon_i - x_i'(b - \beta) = \varepsilon_i + u_i. \tag{21}$$

Then

$$e_i^2 = \varepsilon_i^2 + u_i^2 + 2\varepsilon_i u_i. \tag{22}$$

However, we know that $b$ is consistent, i.e., $b \overset{p}{\to} \beta$. Therefore, the terms in $u_i$ become negligible, and approximately we have

$$e_i^2 = \sigma_i^2 + v_i. \tag{23}$$
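For readers who want the intermediate step written out, the approximation can be sketched as below. It uses only the decomposition of the squared disturbance into its mean plus a deviation and the definition of the residual; the symbol $u_i = -x_i'(b - \beta)$ is as defined in the text, and the argument is a heuristic one for fixed $x_i$ rather than a formal proof.

```latex
\begin{align*}
e_i^2 &= (\varepsilon_i + u_i)^2, \qquad u_i = -x_i'(b - \beta) \\
      &= \varepsilon_i^2 + u_i^2 + 2\varepsilon_i u_i \\
      &= \sigma_i^2 + v_i + \underbrace{u_i^2 + 2\varepsilon_i u_i}_{\text{vanishes in probability as } b \xrightarrow{p} \beta}
\end{align*}
```

Since $u_i$ is linear in $b - \beta$, both $u_i^2$ and $\varepsilon_i u_i$ shrink toward zero as the sample grows, which leaves $\sigma_i^2 + v_i$ as the leading part of $e_i^2$.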
The above reasoning leads to the following estimation strategy. If $\sigma_i^2 = h(z_i'\alpha)$, where $z_i$ may or may not coincide with $x_i$, then we can obtain a consistent estimator of $\alpha$ by estimating

$$e_i^2 = h(z_i'\alpha) + v_i. \tag{24}$$

Obtaining the fitted value of $e_i^2$, say $\hat e_i^2$, we can use it in place of $\sigma_i^2$ in (19) to construct $\hat\beta$, the feasible generalized least squares (FGLS) estimator. This estimation method is called two-step estimation.

A common functional form for $h(\cdot)$ is exponential. Suppose we have a model

$$y_i = \beta_1 + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + \varepsilon_i, \quad \text{where } \varepsilon_i \sim (0, \sigma_i^2). \tag{25}$$

We may write

$$\sigma_i^2 = \exp(\alpha_1 + \alpha_2 z_{2i} + \cdots + \alpha_P z_{Pi})\, v_i, \tag{26}$$

where $v_i$ is uncorrelated with the z's and has expectation 1. Then

$$\ln(\sigma_i^2) = \alpha_1 + \alpha_2 z_{2i} + \cdots + \alpha_P z_{Pi} + \ln v_i. \tag{27}$$

The mean of $\ln v_i$ is generally not zero, but it only shifts the intercept $\alpha_1$. This rescales all the weights by a common factor and therefore does not change the WLS estimator.

In this case, the procedure to obtain the FGLS estimator is the following:
1. Regress $y$ on $(1, x_2, \ldots, x_K)$ and obtain $e_i$.
2. Compute $\ln(e_i^2)$ and use it as the dependent variable in model (27). Obtain the fitted value, $\widehat{\ln(e_i^2)}$.
3. Compute $\hat h_i = \exp\big(\widehat{\ln(e_i^2)}\big)$ and its reciprocal, $w_i = 1/\hat h_i$.
4. Use $w_i$ as the weight to compute the weighted least squares estimator of $\beta$.
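The four FGLS steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `fgls_exponential`, the simulated data, and the choice $z_i = x_i$ are hypothetical.

```python
import numpy as np

def fgls_exponential(y, X, Z):
    """Two-step FGLS with an exponential variance function.

    X and Z must each contain a column of ones as their first column.
    Returns the FGLS coefficient vector and the estimated weights w_i.
    """
    y, X, Z = (np.asarray(a, dtype=float) for a in (y, X, Z))

    # Step 1: OLS on the original model, keep the residuals.
    b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b_ols

    # Step 2: regress ln(e_i^2) on Z and keep the fitted values.
    # (An exactly zero residual would break the log; with continuous data
    # this has probability zero.)
    log_e2 = np.log(e**2)
    a, *_ = np.linalg.lstsq(Z, log_e2, rcond=None)
    fitted = Z @ a

    # Step 3: h_hat_i = exp(fitted value) and its reciprocal w_i.
    h_hat = np.exp(fitted)
    w = 1.0 / h_hat

    # Step 4: WLS with weights w_i, done as OLS on data scaled by sqrt(w_i).
    sw = np.sqrt(w)
    b_fgls, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return b_fgls, w

# Simulated example (assumed design): sigma_i^2 = exp(0.5 + 1.0 * x_i).
rng = np.random.default_rng(3)
n = 5000
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
sigma2_i = np.exp(0.5 + 1.0 * x)
y = 1.0 + 2.0 * x + np.sqrt(sigma2_i) * rng.normal(size=n)

b_fgls, w = fgls_exponential(y, X, X)
```

Because the weights come from an exponential of fitted values, they are always positive, which is one practical reason for choosing the exponential form.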