Introductory Econometrics
Violation of Basic Assumptions: Heteroskedasticity
Barbara Pertold-Gebicka
CERGE-EI
16 November 2010
OLS assumptions

1. Disturbances are random variables drawn from a normal distribution.
2. The mean of this distribution is zero: E[ε_i] = 0 / E[ε] = 0.
3. The variance of this distribution is constant: Var[ε_i] = σ² (homoskedasticity).
4. Disturbances are not autocorrelated: Cov[ε_i, ε_j] = 0 for i ≠ j.
3.-4. In matrix notation: Var[ε] = σ²I_n.
1.-4. Can be summarized as: ε ~ NID(0, σ²I_n).
5. Disturbances are not correlated with the explanatory variables: cov(x_ik, ε_i) = 0 / Cov[X, ε] = 0 (consistency assumption).
6. Explanatory variables are not linearly dependent (no multicollinearity).
When all the assumptions are satisfied

The OLS estimator β̂ is a normally distributed random variable:

E[β̂_k] = β_k / E[β̂] = β  ⇒  the OLS estimator is unbiased

Var[β̂_k] = σ² / (SST_k(1 − R²_k)) / Var[β̂] = (XᵀX)⁻¹σ²  ⇒  the OLS estimator is efficient (has the lowest possible variance)

Thus: the OLS estimator is BLUE (best linear unbiased estimator).
The OLS estimator is consistent (it is efficient and unbiased as n → ∞).
Homoskedasticity vs. Heteroskedasticity
Variance-covariance matrix of the disturbance term

ε = \begin{pmatrix} ε_1 \\ ε_2 \\ \vdots \\ ε_n \end{pmatrix},
Var[ε] = \begin{pmatrix} Var[ε_1] & Cov[ε_1, ε_2] & \cdots & Cov[ε_1, ε_n] \\ Cov[ε_2, ε_1] & Var[ε_2] & \cdots & Cov[ε_2, ε_n] \\ \vdots & \vdots & \ddots & \vdots \\ Cov[ε_n, ε_1] & Cov[ε_n, ε_2] & \cdots & Var[ε_n] \end{pmatrix}

Homoskedasticity: Var[ε_i] = E[ε_i²] − (E[ε_i])² = E[ε_i²] = σ²
No autocorrelation: Cov[ε_i, ε_j] = E[ε_i ε_j] − E[ε_i]E[ε_j] = E[ε_i ε_j] = 0

Var[ε] = \begin{pmatrix} σ² & 0 & \cdots & 0 \\ 0 & σ² & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & σ² \end{pmatrix} = σ²I_n
Homoskedasticity vs. heteroskedasticity

Homoskedastic disturbance:
Var[ε] = \begin{pmatrix} σ² & 0 & \cdots & 0 \\ 0 & σ² & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & σ² \end{pmatrix} = σ²I_n

Heteroskedastic disturbance:
Var[ε] = \begin{pmatrix} σ_1² & 0 & \cdots & 0 \\ 0 & σ_2² & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & σ_n² \end{pmatrix} = \begin{pmatrix} σ²ω_1 & 0 & \cdots & 0 \\ 0 & σ²ω_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & σ²ω_n \end{pmatrix} = σ²Ω

or: Var[ε_i] = σ²ω_i
Picturing heteroskedasticity in the 2-variable case

[Figure: scatter plot of y against x illustrating a heteroskedastic disturbance]
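To make the picture concrete, here is a minimal simulation sketch (not from the lecture; all names and parameter values are invented for illustration) that generates data with the fanning-out pattern the figure depicts:

import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 10.0, size=n)

# Heteroskedastic disturbance: Var[eps_i] = sigma^2 * omega_i with omega_i = x_i^2,
# so the spread of the disturbances grows with x.
sigma = 1.0
eps = rng.normal(loc=0.0, scale=sigma * x)

# Hypothetical 2-variable model y = beta0 + beta1*x + eps (coefficients invented).
y = 2.0 + 0.5 * x + eps
# A scatter plot of y against x would show points fanning out as x increases.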
What happens to OLS estimates under heteroskedasticity?

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i, or y = Xβ + ε,
where E[x_ik ε_i] = 0, E[ε_i ε_j] = 0, but Var[ε_i] = σ²ω_i ≠ const,
or E[Xᵀε] = 0, but Var[ε] = σ²Ω.

OLS estimate: β̂ = (XᵀX)⁻¹Xᵀy

E[β̂] = β (by assumption E[ε|X] = 0)  ⇒  β̂ is unbiased

Var[β̂] = (XᵀX)⁻¹ Xᵀ Var[ε] X (XᵀX)⁻¹ = (XᵀX)⁻¹ Xᵀσ²ΩX (XᵀX)⁻¹
⇒ Var[β̂] is different than (XᵀX)⁻¹σ²
What happens to OLS estimates under heteroskedasticity?

Var[β̂] = σ²(XᵀX)⁻¹ XᵀΩX (XᵀX)⁻¹

We used to estimate the OLS estimator variance by:
Var̂[β̂] = σ̂²(XᵀX)⁻¹ = s²(XᵀX)⁻¹, where s² = eᵀe / (n − k − 1).

Is this a good estimator of Var[β̂]?

E[Var̂[β̂]] − Var[β̂] = E[s²](XᵀX)⁻¹ − σ²(XᵀX)⁻¹XᵀΩX(XᵀX)⁻¹

In large samples s² → σ², so the difference approaches
σ²(XᵀX)⁻¹ − σ²(XᵀX)⁻¹XᵀΩX(XᵀX)⁻¹ ≠ 0 if cov(ω, X) ≠ 0.
What happens to OLS estimates under heteroskedasticity?

Standard estimates of OLS standard errors are biased in small samples.
Standard estimates of OLS standard errors are biased even in large samples if heteroskedasticity is correlated with some explanatory variables.
We cannot perform reliable hypothesis testing.
The OLS estimator is no longer BLUE (best linear unbiased estimator); see the simulation sketch below.
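The bias in the usual standard errors can be seen in a small Monte Carlo; this sketch (not part of the lecture, with invented parameter values) compares the true sampling variability of β̂_1 under heteroskedasticity with the average of the usual OLS standard error:

import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 2000
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])          # design matrix with a constant
XtX_inv = np.linalg.inv(X.T @ X)

beta1_hats, usual_ses = [], []
for _ in range(reps):
    eps = rng.normal(0.0, x)                  # Var[eps_i] = x_i^2: heteroskedastic
    y = 1.0 + 0.5 * x + eps
    b = XtX_inv @ X.T @ y                     # OLS estimate
    e = y - X @ b
    s2 = e @ e / (n - 2)                      # s^2 = e'e / (n - k - 1), k = 1
    beta1_hats.append(b[1])
    usual_ses.append(np.sqrt(s2 * XtX_inv[1, 1]))

print("sd of beta1-hat across replications:", np.std(beta1_hats))
print("average usual OLS standard error:  ", np.mean(usual_ses))
# Under heteroskedasticity the two numbers differ noticeably.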
Example - marginal propensity to save

Model 1: OLS estimates using the 100 observations 1-100
Dependent variable: sav

            coefficient   std. error   t-ratio   p-value
  -------------------------------------------------------
  const     -1438.80      1477.88      -0.9736   0.3327
  inc           0.10815      0.06514    1.660    0.1001
  size         65.8668     216.69       0.3040   0.7618
  educ        143.316      105.37       1.361    0.1768

Unadjusted R-squared = 0.0814
Adjusted R-squared = 0.0553
F-statistic (3, 96) = 2.8957 (p-value = 0.045)
Heteroskedasticity-robust standard errors

The OLS estimator β̂ is unbiased even under heteroskedasticity.
The only thing we need to be careful about are the standard errors of the coefficients.

Knowing that Var[β̂] = (XᵀX)⁻¹ Xᵀσ²ΩX (XᵀX)⁻¹ rather than Var[β̂] = σ²(XᵀX)⁻¹, let us find a consistent estimator of Var[β̂].

White (1980) showed that Xᵀσ²ΩX can be estimated by Σ_{i=1}^{n} e_i² x_i x_iᵀ.

The heteroskedasticity-robust variance of the OLS estimator is estimated as:
Var̂[β̂] = (XᵀX)⁻¹ (Σ_{i=1}^{n} e_i² x_i x_iᵀ) (XᵀX)⁻¹
Heteroskedasticity-robust standard errors

Var[β̂] = (XᵀX)⁻¹ Xᵀσ²ΩX (XᵀX)⁻¹

Xᵀσ²ΩX = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{n1} \\ x_{12} & x_{22} & \cdots & x_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1k} & x_{2k} & \cdots & x_{nk} \end{pmatrix} \begin{pmatrix} σ_1² & 0 & \cdots & 0 \\ 0 & σ_2² & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & σ_n² \end{pmatrix} \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix} = Σ_i σ_i² x_i x_iᵀ

Var̂[β̂] = (XᵀX)⁻¹ (Σ_{i=1}^{n} e_i² x_i x_iᵀ) (XᵀX)⁻¹ is a good estimate of the above.
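As a sketch (assuming X is an n×(k+1) numpy array that includes the constant column and y has length n), White's estimator can be computed directly from the sandwich formula above:

import numpy as np

def white_robust_se(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Heteroskedasticity-robust (White, 1980 / 'HC0') standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    e = y - X @ beta_hat                      # OLS residuals
    meat = (X * (e ** 2)[:, None]).T @ X      # sum_i e_i^2 * x_i x_i'
    cov = XtX_inv @ meat @ XtX_inv            # (X'X)^-1 [sum e_i^2 x_i x_i'] (X'X)^-1
    return np.sqrt(np.diag(cov))

For comparison, statsmodels reports the same numbers via sm.OLS(y, X).fit(cov_type='HC0'); packages that apply a small-sample degrees-of-freedom adjustment (e.g. the HC1 variant) can differ slightly.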
Example - marginal propensity to save

Model 2: OLS estimates using the 100 observations 1-100
Heteroskedasticity-robust standard errors
Dependent variable: sav

            coefficient   std. error    t-ratio   p-value
  --------------------------------------------------------
  const     -1438.80      2120.51       -0.6785   0.4991
  inc           0.10815      0.0756476   1.430    0.1560
  size         65.8668     217.064       0.3034   0.7622
  educ        143.316      152.576       0.9393   0.3499

Unadjusted R-squared = 0.0814
Adjusted R-squared = 0.0553
F-statistic (3, 96) = 2.3654 (p-value = 0.0795)
Why don't we always apply heteroskedasticity-robust standard errors?

Robust standard errors can be used for valid hypothesis testing only in large samples:
t = (β̂ − β) / se(β̂) → t(n − k − 1) as n → ∞

In small samples robust t-statistics might be distributed differently.
We prefer to use robust standard errors only where the presence of heteroskedasticity is justified.
We would like to test whether heteroskedasticity is present.
Testing for heteroskedasticity

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i

Under homoskedasticity (H_0): Var[ε] = σ²I or Var(ε_i) = σ² (const)
Under heteroskedasticity (H_A): Var[ε] = σ²Ω or Var(ε_i) ≠ const

Let us remember that all these assumptions are conditional on the explanatory variables, i.e.:

Under homoskedasticity (H_0): Var[ε|X] = σ²I or Var(ε_i|x_i) = σ² (const)
Under heteroskedasticity (H_A): Var[ε|X] = σ²Ω or Var(ε_i|x_i) ≠ const
Testing for heteroskedasticity

Under homoskedasticity (H_0): Var[ε|X] = σ²I or Var(ε_i|x_i) = σ² (const)
Under heteroskedasticity (H_A): Var[ε|X] = σ²Ω or Var(ε_i|x_i) ≠ const

Another assumption states that E[ε_i|x_i] = 0, thus:
Var(ε_i|x_i) = E[ε_i²|x_i] − (E[ε_i|x_i])² = E[ε_i²|x_i]

H_0: E[ε_i²|x_i] = const
H_A: E[ε_i²|x_i] ≠ const

Estimate ε_i² by the squared residuals e_i² and find out if E[e_i²|x_i] = const.
Testing for heteroskedasticity

H_0: E[ε_i²|x_i] = const
H_A: E[ε_i²|x_i] ≠ const

Estimate ε_i² by the squared residuals e_i² and find out if E[e_i²|x_i] = const:

e_i² = δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik + u_i

H_0: δ_1 = δ_2 = ... = δ_k = 0
H_A: at least one of the deltas is significant

Use the F-test for the overall significance of the above regression:
F = [(SSR_R − SSR_U)/k] / [SSR_U/(n − k − 1)] = (R²_u/k) / ((1 − R²_u)/(n − k − 1))
(because SSR_R = SST_R, since R²_R = 0)
Testing for heteroskedasticity

e_i² = δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik + u_i

H_0 (homoskedasticity): δ_1 = δ_2 = ... = δ_k = 0
H_A (heteroskedasticity): at least one of the deltas is significant

Test statistic:
F = (R²_u/k) / ((1 − R²_u)/(n − k − 1)) ~ F(k, n − k − 1)

We reject the null hypothesis (reject homoskedasticity) if the test statistic is higher than the appropriate critical value.
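A minimal implementation sketch of this test (the Breusch-Pagan test in its F form), again assuming X is a numpy design matrix that includes the constant and y is the dependent variable:

import numpy as np

def het_f_test(X: np.ndarray, y: np.ndarray) -> float:
    """F statistic for regressing squared OLS residuals on the regressors."""
    n, k_plus_1 = X.shape
    k = k_plus_1 - 1                            # number of slope coefficients
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e2 = (y - X @ b) ** 2                       # squared OLS residuals
    d = np.linalg.lstsq(X, e2, rcond=None)[0]   # auxiliary regression
    u = e2 - X @ d
    r2 = 1.0 - (u @ u) / np.sum((e2 - e2.mean()) ** 2)
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

statsmodels ships the same test as statsmodels.stats.diagnostic.het_breuschpagan(residuals, X), which returns both the LM and the F versions together with their p-values.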
Example - marginal propensity to save

Model 4: OLS estimates using the 100 observations 1-100
Dependent variable: e_sq

            coefficient    std. error    t-ratio   p-value
  ----------------------------------------------------------
  const     -4.0264E+07    2.36545E+07   -1.702    0.0920
  inc         226.309      1042.6         0.2171   0.8286
  size          2.91615E+06 3.46890E+06   0.8408   0.4025
  educ          3.03488E+06 1.68580E+06   1.800    0.0750

Unadjusted R-squared = 0.0524
Adjusted R-squared = 0.0228
F-statistic (3, 96) = 1.7699 (p-value = 0.158)
Heteroskedasticity - summary

In small samples, heteroskedasticity always means that the OLS estimates of the coefficients' standard errors are biased.
In large samples, the OLS estimates of the coefficients' standard errors are biased only if heteroskedasticity is correlated with the explanatory variables.
Heteroskedasticity-robust standard errors might not produce a t-distribution in small samples.
In large samples, robust standard errors do produce a t-distribution.
Especially in small samples, we would like to test for the presence of heteroskedasticity before applying robust standard errors.
Special form of heteroskedasticity

Heteroskedasticity is problematic when correlated with X. We could model this relationship by Var(ε_i|x_i) = σ²h(x_i), with h(x_i) > 0.

Assume we know h(x_i):
y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i
where E[ε|X] = 0, E[ε_i ε_j|X] = 0 ∀i,j and Var[ε|X] = σ²h(x)

Note that although Var[ε|x] = E[ε²|x] = σ²h(x),
Var[ε/√h(x) | x] = (1/h(x)) Var[ε|x] = (1/h(x)) σ²h(x) = σ²

Moreover, E[ε/√h(x) | x] = (1/√h(x)) E[ε|x] = (1/√h(x)) · 0 = 0.
Special form of heteroskedasticity

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i,
where Var[ε/√h(x) | x] = σ² and E[ε/√h(x) | x] = 0.

Dividing through by √h(x_i):
y_i/√h(x_i) = β_0 (1/√h(x_i)) + β_1 x_i1/√h(x_i) + ... + β_k x_ik/√h(x_i) + ε_i/√h(x_i)

or y*_i = β_0 x*_i0 + β_1 x*_i1 + ... + β_k x*_ik + ε*_i,

which satisfies the assumptions that E[ε*|x*] = 0, E[ε*_i ε*_j | x*] = 0 ∀i,j and Var[ε*|x*] = σ².
Weighted Least Squares (WLS)

y_i/√h(x_i) = β_0 (1/√h(x_i)) + β_1 x_i1/√h(x_i) + ... + β_k x_ik/√h(x_i) + ε_i/√h(x_i)
or y*_i = β_0 x*_i0 + β_1 x*_i1 + ... + β_k x*_ik + ε*_i

OLS estimates of this equation (β_0, β_1, ..., β_k) are called the WLS estimates.
Each observation (including the constant term) is weighted by 1/√h(x_i).
WLS estimators are more efficient than the OLS estimators in the presence of heteroskedasticity (see the sketch below).
Weighted Least Squares (WLS) estimation is a special case of Generalized Least Squares (GLS) estimation.
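A hedged sketch of WLS with a known variance function, using the invented example h(x_i) = x_i; note that statsmodels' WLS expects weights proportional to the inverse variance, i.e. 1/h(x_i), which is equivalent to dividing every variable by √h(x_i):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
eps = rng.normal(0.0, np.sqrt(x))           # Var[eps_i] = sigma^2 * h(x_i), h(x_i) = x_i
y = 1.0 + 0.5 * x + eps                     # hypothetical coefficients

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()
wls = sm.WLS(y, X, weights=1.0 / x).fit()   # weight each observation by 1/h(x_i)

print("OLS std. errors:", ols.bse)
print("WLS std. errors:", wls.bse)          # typically smaller: WLS is more efficient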
Feasible GLS - estimating the heteroskedasticity function

In the above example we knew the form of heteroskedasticity. Usually, we do not know this form.
We can assume some general functional form and estimate it using the data.

The assumed functional form of heteroskedasticity:
Var(ε_i|x_i) = σ² exp(δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik)

Thus, we assume that h(x_i) = exp(δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik).
We use the exponential function to ensure that h(x_i) is positive.

Since Var(ε_i|x_i) = E[ε_i²|x_i], we can write:
ε_i² = σ² exp(δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik) · v_i,
where v_i is a random variable with E[v_i|x_i] = 1.
Feasible GLS - estimating the heteroskedasticity function

Original regression: y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i
Heteroskedasticity form: Var(ε_i|x_i) = σ² exp(δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik)

ε_i² = σ² exp(δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik) · v_i

Assuming that v_i is independent of x_i and taking logs:
log(ε_i²) = log σ² + δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik + log v_i
          = α_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik + u_i,

where α_0 absorbs log σ², δ_0 and the mean of log v_i, so that E[u_i|x_i] = 0 when v_i is independent of x_i.
Feasible GLS - estimating the heteroskedasticity function

Original regression: y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i
Var(ε_i) = σ²h(x_i), h(x_i) = exp(δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik)
log[h(x_i)] = δ_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik

We can estimate h(x_i) by regressing the log of the squared residuals e_i² from the original regression on all explanatory variables:
log(e_i²) = α_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik + u_i

Note that the fitted values \widehat{log(e_i²)} are estimates of α_0 + δ_1 x_i1 + ... + δ_k x_ik, and thus exp(\widehat{log(e_i²)}) = \widehat{σ²h(x_i)} can then be used to estimate the original equation by WLS.
Feasible GLS - the procedure

1. Estimate the original equation by OLS:
   y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ... + β_k x_ik + ε_i
   and record the residuals e_i.
2. Calculate log(e_i²).
3. Estimate the following regression by OLS:
   log(e_i²) = α_0 + δ_1 x_i1 + δ_2 x_i2 + ... + δ_k x_ik + u_i
   and record the fitted values \widehat{log(e_i²)}.
4. Calculate ĥ(x_i) = exp(\widehat{log(e_i²)}), i.e. estimate h(x_i) by the exponential of the fitted values.
5. Finally, estimate the original equation by WLS, using 1/ĥ(x_i) as weights.

A code sketch of these five steps follows.
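Translated into code, the procedure might look as follows (a sketch assuming numpy arrays X, including the constant column, and y; not the only way to implement FGLS):

import numpy as np
import statsmodels.api as sm

def fgls(X: np.ndarray, y: np.ndarray):
    # 1. Estimate the original equation by OLS and record the residuals e_i.
    e = sm.OLS(y, X).fit().resid
    # 2. Calculate log(e_i^2).
    log_e2 = np.log(e ** 2)
    # 3. Regress log(e_i^2) on the explanatory variables; record the fitted values.
    fitted = sm.OLS(log_e2, X).fit().fittedvalues
    # 4. Estimate h(x_i) by the exponential of the fitted values.
    h_hat = np.exp(fitted)
    # 5. Estimate the original equation by WLS, using 1/h_hat as weights.
    return sm.WLS(y, X, weights=1.0 / h_hat).fit()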
Summary

What is heteroskedasticity?
What happens to OLS estimates under heteroskedasticity?
How to test for the presence of heteroskedasticity?
Three methods to deal with heteroskedasticity:
- heteroskedasticity-robust standard errors
- Weighted Least Squares (Generalized Least Squares)
- Feasible Generalized Least Squares