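Heteroskedasticity

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + e_i$

where $E(e_i^2) \neq \sigma^2$, i.e., non-constant variance.

Common problem with samples over individuals.

[Figure: residuals $\hat{e}_i$ and squared residuals $\hat{e}_i^2$ plotted against $x_k$]

AREC-ECON 535 Lec F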
Suppose

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + e_i$

where $E(e_i) = 0$, $E(e_i^2) = \sigma_i^2$, and $E(e_i e_j) = 0$ for $i \neq j$.

Assuming the error variance changes over the sample and that we can explain the heteroskedasticity.
Why does heteroskedasticity occur?
- Variance of dependent variable increases with increases in the level of the dependent variable. Pattern in random variables.
- Variance of dependent variable increases or decreases with changes in independent variable: error-learning, discretionary income, data collection.
- Outliers in data. A small number of outliers results in a large variance.
- Specification bias: missing variable or incorrect functional form. Systematic process remains in error term.

Consequences of Heteroskedasticity
- $\hat{\beta}$ are unbiased but inefficient.
- Formulas for variances of OLS estimators are biased and inconsistent.
- Variances are generally too small and hypothesis test statistics are too large. However, the opposite can happen.
Detection of Heteroskedasticity

1) Graph the residuals against all X's and y. (Not as simple as autocorrelation. Temporal processes are much simpler than multi-dimensional ones.)

2) Breusch-Pagan (LM) test

$e_i^2 = \alpha_0 + \alpha_1 z_{1i} + \alpha_2 z_{2i} + \dots + \alpha_p z_{pi} + u_i$

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_p = 0$

Steps of test:
a) Run OLS regression and obtain $\hat{e}_i$.
b) Calculate $\tilde{\sigma}^2 = \sum \hat{e}_i^2 / N$.
c) Construct $v_i = \hat{e}_i^2 / \tilde{\sigma}^2$.
d) Regress $v_i$ on the Z's (usually we use the X's so p = k, but we don't have to).
e) Obtain the Estimated SS (ESS) from this auxiliary regression.
If sample size is large, $\tfrac{1}{2}\,ESS \sim \chi^2_p$.

ex) $\tfrac{1}{2}(10.788) = 5.394 > 3.8415 = \chi^2_1$ (5%)

f) Econometric research shows an F-test on the auxiliary regression has better small sample properties.

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_p = 0$
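Steps a) to e) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data are simulated with error standard deviation proportional to x, the Z's are taken to be the X's, and the helper names `ols` and `breusch_pagan` are hypothetical, not from any package.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients and residuals; X must already include a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def breusch_pagan(X, y, Z=None):
    """Return (ESS/2, p) for the Breusch-Pagan LM test; Z defaults to X."""
    Z = X if Z is None else Z
    _, e = ols(X, y)                       # a) OLS residuals
    sigma2 = np.sum(e**2) / len(y)         # b) sigma-tilde squared
    v = e**2 / sigma2                      # c) scaled squared residuals
    _, u = ols(Z, v)                       # d) auxiliary regression of v on Z
    tss = np.sum((v - v.mean())**2)
    ess = tss - np.sum(u**2)               # e) explained sum of squares
    p = Z.shape[1] - 1                     # slopes only, constant excluded
    return 0.5 * ess, p

# Simulated example (assumed data-generating process, for illustration only)
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, n)
y = 2.0 + 0.5 * x + x * rng.standard_normal(n)   # error s.d. proportional to x
X = np.column_stack([np.ones(n), x])

lm, p = breusch_pagan(X, y)
print(f"LM = {lm:.2f} with {p} d.o.f.; compare with the 5% chi-square critical value 3.8415")
```

Because Z contains a constant, ESS can be computed as the total sum of squares of v minus the residual sum of squares of the auxiliary regression.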
3) White (LM) test

$e_i^2 = \alpha_0 + \alpha_1 x_{1i} + \alpha_2 x_{2i} + \dots + \alpha_k x_{ki} + \alpha_{k+1} x_{1i}^2 + \alpha_{k+2} x_{2i}^2 + \dots + \alpha_{k+k} x_{ki}^2 + \alpha_{2k+1} x_{1i} x_{2i} + \alpha_{2k+2} x_{1i} x_{3i} + \dots + \alpha_m x_{k-1,i} x_{ki} + u_i$

$m = ((k^2 - k)/2) + 2k$

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_m = 0$

$N \cdot R^2 \sim \chi^2_m$

Steps of test are the same as for the Breusch-Pagan test (except use $\hat{e}_i^2$ instead of $v_i$).

White test examines for more complex heteroskedasticity. But be careful of sample size and looking for too much... (Including irrelevant variables does what?)
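A minimal sketch of the White auxiliary regression, assuming simulated data with k = 2 regressors. The function name `white_test` is hypothetical. It builds the levels, squares and cross-products, then returns N·R² and m.

```python
import numpy as np
from itertools import combinations

def white_test(x, y):
    """x: (N, k) regressors without the constant. Returns (N * R^2, m)."""
    n, k = x.shape
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ beta) ** 2                          # squared OLS residuals
    cross = [x[:, i] * x[:, j] for i, j in combinations(range(k), 2)]
    W = np.column_stack([np.ones(n), x, x**2] + cross)  # levels, squares, cross-products
    gamma, *_ = np.linalg.lstsq(W, e2, rcond=None)
    u = e2 - W @ gamma
    r2 = 1.0 - np.sum(u**2) / np.sum((e2 - e2.mean())**2)
    m = W.shape[1] - 1                                # equals (k^2 - k)/2 + 2k
    return n * r2, m

# Simulated example (assumed data-generating process, for illustration only)
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1.0, 10.0, size=(n, 2))
y = 1.0 + 0.5 * x[:, 0] - 0.3 * x[:, 1] + x[:, 0] * rng.standard_normal(n)

stat, m = white_test(x, y)
print(f"N*R^2 = {stat:.2f} with m = {m} d.o.f.")
```

With k = 2 the auxiliary regression has m = 5 slope terms, which illustrates how quickly the White test uses up degrees of freedom as k grows.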
4) ARCH test - time series data (Autoregressive Conditional Heteroskedasticity)

Suppose

$y_t = \beta_0 + \beta_1 x_{1t} + \dots + \beta_k x_{kt} + e_t$

and

$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \dots + \alpha_p e_{t-p}^2 + u_t$

Model has strong intuitive appeal.

[Figure: $e_t$ and $e_t^2$ plotted against $t$]
$e_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \dots + \alpha_p e_{t-p}^2 + u_t$

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_p = 0$

Steps of test:
a) Run OLS regression and obtain $\hat{e}_t$.
b) Regress $\hat{e}_t^2$ on the X's in the model and $\hat{e}_{t-1}^2, \dots, \hat{e}_{t-p}^2$.
c) Obtain $R^2$.
d) $(T - p) R^2 \sim \chi^2_p$ under $H_0$.
e) F-test on the auxiliary regression.

5) Other tests: Park test, Glejser test, Goldfeld-Quandt test...

These can be special cases of, or at least thought of as variants of, the Breusch-Pagan test. However, their small sample properties may be better, and they look for specific forms of heteroskedasticity.

As with serial correlation, use caution. We do not want the result to be due to an assumption made by the researcher.
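A rough sketch of the ARCH LM statistic on simulated ARCH(1) errors. Note the simplification: the auxiliary regression here uses only a constant and the p lagged squared residuals (the common textbook form of the test), not the X's as well. The function name `arch_lm` and all parameter values are assumptions made for illustration.

```python
import numpy as np

def arch_lm(resid, p):
    """(T - p) * R^2 from regressing e_t^2 on a constant and p lags of e_t^2."""
    e2 = resid ** 2
    T = len(e2)
    y = e2[p:]                                        # e_t^2 for t = p, ..., T-1
    lags = [e2[p - j: T - j] for j in range(1, p + 1)]  # e_{t-j}^2 aligned with y
    W = np.column_stack([np.ones(T - p)] + lags)
    g, *_ = np.linalg.lstsq(W, y, rcond=None)
    u = y - W @ g
    r2 = 1.0 - np.sum(u**2) / np.sum((y - y.mean())**2)
    return (T - p) * r2

# Simulated ARCH(1) errors: sigma_t^2 = 1.0 + 0.5 * e_{t-1}^2 (assumed values)
rng = np.random.default_rng(1)
T = 2000
z = rng.standard_normal(T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = np.sqrt(1.0 + 0.5 * e[t - 1] ** 2) * z[t]

x = rng.uniform(0.0, 5.0, T)
X = np.column_stack([np.ones(T), x])
y = 1.0 + 2.0 * x + e

beta, *_ = np.linalg.lstsq(X, y, rcond=None)          # a) OLS on the mean equation
stat = arch_lm(y - X @ beta, p=1)                     # b) to d)
print(f"(T - p) R^2 = {stat:.2f}; compare with the 5% chi-square(1) value 3.8415")
```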
What to do about heteroskedasticity?

Heteroskedastic Consistent Variance-Covariance Matrix

OLS: $V(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1}$

White's: $V(\hat{\beta}) = (X'X)^{-1} \left( \sum \hat{e}_i^2 x_i x_i' \right) (X'X)^{-1}$

Compare the two. Correct if problem in interpretation is present.

Many regression packages will generate White standard errors or covariance matrix.

Likewise, Serial Correlation Consistent Variance-Covariance Matrix

Newey-West: $V(\hat{\beta}) = (X'X)^{-1} (X' \Omega(\hat{\rho}) X) (X'X)^{-1}$

Many regression packages will do this but I don't think you should use it unless
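To make "compare the two" concrete, here is a small sketch that computes both covariance matrices for the same OLS fit. The data are simulated, the function name `ols_and_white_cov` is hypothetical, and no finite-sample correction is applied to White's matrix (packages often offer several variants).

```python
import numpy as np

def ols_and_white_cov(X, y):
    """Return OLS coefficients, the usual OLS covariance, and White's covariance."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    s2 = e @ e / (n - k)                       # sigma-hat squared
    V_ols = s2 * XtX_inv                       # sigma^2 (X'X)^-1
    meat = (X * (e**2)[:, None]).T @ X         # sum of e_i^2 x_i x_i'
    V_white = XtX_inv @ meat @ XtX_inv         # sandwich form
    return beta, V_ols, V_white

# Simulated example (assumed data-generating process, for illustration only)
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 0.5 * x + x * rng.standard_normal(n)

beta, V_ols, V_white = ols_and_white_cov(X, y)
print("OLS s.e.:  ", np.sqrt(np.diag(V_ols)))
print("White s.e.:", np.sqrt(np.diag(V_white)))
```

The coefficient estimates are the same in both cases; only the standard errors, and therefore the test statistics, change.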
Practical Note: If

$y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + e_i$

is heteroskedastic, conduct the test on

$\ln(y_i) = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + e_i$

or

$\ln(y_i) = \beta_0 + \beta_1 \ln(x_{1i}) + \dots + \beta_k \ln(x_{ki}) + e_i$.

Use the model without heteroskedasticity unless theory suggests a functional form.

Correcting for Heteroskedasticity

Procedure: transform the error term into a random variable that meets the OLS assumption, i.e., $E(e_i^{*2}) = \sigma^2$. Results in Generalized Least Squares.

Be careful. Interpretation of slope coefficients does not necessarily remain the same, as it does in models corrected for serial correlation.
Generalized Least Squares: when $\sigma_i$ is known. Also called Weighted Least Squares (WLS).

$y_i = \beta_0 + \beta_1 x_i + e_i$

$(y_i/\sigma_i) = (\beta_0 + \beta_1 x_i + e_i)/\sigma_i$

$(y_i/\sigma_i) = \beta_0 (1/\sigma_i) + \beta_1 (x_i/\sigma_i) + (e_i/\sigma_i)$

$y_i^* = \beta_0 x_{0i}^* + \beta_1 x_i^* + e_i^*$, where $x_{0i}^* = 1/\sigma_i$

$V(e_i^*) = E(e_i^{*2}) = E(e_i^2/\sigma_i^2) = E(e_i^2)/\sigma_i^2 = 1$

Run OLS on this model. $\hat{\beta}$ are BLUE.

If we knew $\sigma_i$ we could fix the problem.
Generalized Least Squares: estimation

Assume a particular form of heteroskedasticity.

1) Error variance proportional to independent variable squared. (Standard error model heteroskedasticity.)

$y_i = \beta_0 + \beta_1 x_i + e_i$ and $E(e_i^2) = \sigma_i^2 = \sigma^2 x_i^2$, or $\sigma_i = \sigma x_i$

$y_i / x_i = \beta_0 / x_i + \beta_1 + e_i / x_i$

$y_i^* = \beta_0 (1/x_i) + \beta_1 + e_i^*$

$E(e_i^{*2}) = E(e_i^2 / x_i^2) = E(e_i^2)/x_i^2 = \sigma^2$

2) Error variance proportional to independent variable. (Variance model heteroskedasticity.)

$y_i = \beta_0 + \beta_1 x_i + e_i$ and $E(e_i^2) = \sigma_i^2 = \sigma^2 x_i$

$y_i / \sqrt{x_i} = \beta_0 / \sqrt{x_i} + \beta_1 \sqrt{x_i} + e_i / \sqrt{x_i}$

$y_i^* = \beta_0 (1/\sqrt{x_i}) + \beta_1 (\sqrt{x_i}) + e_i^*$

$E(e_i^{*2}) = E(e_i^2 / x_i) = E(e_i^2)/x_i = \sigma^2$
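Case 1) can be checked numerically. The sketch below uses simulated data with $\sigma_i = \sigma x_i$ (all values assumed), runs OLS on the transformed variables, and confirms that this matches the weighted normal equations with weights $1/x_i^2$. Note that in the transformed regression the coefficient on $1/x_i$ is $\beta_0$ and the constant is $\beta_1$.

```python
import numpy as np

# Simulated data with sigma_i = sigma * x_i and sigma = 1 (assumed, for illustration)
rng = np.random.default_rng(2)
n = 400
x = rng.uniform(1.0, 10.0, n)
y = 2.0 + 0.5 * x + x * rng.standard_normal(n)

# Transformed model: y_i/x_i = beta_0 * (1/x_i) + beta_1 + e_i/x_i
y_star = y / x
X_star = np.column_stack([1.0 / x, np.ones(n)])   # columns: 1/x_i (beta_0), 1 (beta_1)
b_star, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
beta0_wls, beta1_wls = b_star

# Same estimator from the weighted normal equations, weights 1/x_i^2
X = np.column_stack([np.ones(n), x])
w2 = 1.0 / x**2
b_direct = np.linalg.solve(X.T @ (X * w2[:, None]), X.T @ (w2 * y))

print("transformed OLS:", b_star)
print("weighted normal equations:", b_direct)
```

For case 2) the same code applies with `np.sqrt(x)` in place of `x` in the transformation and weights $1/x_i$.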
So the weight function in EViews is:

[Figure: $\hat{e}_i$ plotted against $x_k$. Dark: $w_i = 1/x_i$. Dashed: $w_i = 1/\sqrt{x_i}$.]
3) Error variance proportional to dependent variable. (Dependent variable model heteroskedasticity.)

$y_i = \beta_0 + \beta_1 x_i + e_i$ and $E(e_i^2) = \sigma_i^2 = \sigma^2 [E(y_i)]^2 \approx \sigma^2 [\hat{y}_i]^2$

$(y_i / \hat{y}_i) = (\beta_0 / \hat{y}_i) + \beta_1 (x_i / \hat{y}_i) + (e_i / \hat{y}_i)$

$y_i^* = \beta_0 (1/\hat{y}_i) + \beta_1 (x_i / \hat{y}_i) + e_i^*$

$E(e_i^{*2}) = E(e_i^2 / \hat{y}_i^2) \approx E(e_i^2)/[E(y_i)]^2 = \sigma^2$
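GLS in Matrix Notation

$\hat{\beta} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1} y$

$V(\hat{\beta}) = (X'\Omega^{-1}X)^{-1}$

(under OLS: $\Omega = \sigma^2 I$)

where with heteroskedasticity

$$\Omega = \begin{bmatrix} \sigma_1^2 & 0 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & 0 & \cdots & 0 \\ 0 & 0 & \sigma_3^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma_N^2 \end{bmatrix}$$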
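and with serial correlation

$$\Omega = \sigma_e^2 \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{T-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \rho^{T-3} & \cdots & 1 \end{bmatrix}$$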
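$\Omega^{-1}$ can be rewritten as $\Omega^{-1} = H'H$ so that EGLS is OLS on the transformed model

$(Hy) = (HX)\beta + (He)$ or $y^* = X^*\beta + e^*$

where with heteroskedasticity

$$H = \begin{bmatrix} 1/\sigma_1 & 0 & 0 & \cdots & 0 \\ 0 & 1/\sigma_2 & 0 & \cdots & 0 \\ 0 & 0 & 1/\sigma_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1/\sigma_N \end{bmatrix}$$

and with serial correlation (up to a scalar factor)

$$H = \begin{bmatrix} (1-\rho^2)^{1/2} & 0 & 0 & \cdots & 0 \\ -\rho & 1 & 0 & \cdots & 0 \\ 0 & -\rho & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & -\rho & 1 \end{bmatrix}$$

These transformations make $e^*$ have good properties.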
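The equivalence between the GLS formula and OLS on the transformed model can be verified numerically. In this sketch, which uses assumed values throughout, $\Omega$ has an AR(1) pattern with unit variance, and H is taken as the inverse of the Cholesky factor of $\Omega$. That is one valid choice satisfying $\Omega^{-1} = H'H$, not necessarily the H shown on the slide.

```python
import numpy as np

rng = np.random.default_rng(3)
T, rho = 60, 0.6                                      # assumed sample size and rho
idx = np.arange(T)
Omega = rho ** np.abs(idx[:, None] - idx[None, :])    # AR(1) pattern: rho^|i-j|
L = np.linalg.cholesky(Omega)                         # Omega = L L'

x = rng.uniform(0.0, 5.0, T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, 2.0]) + L @ rng.standard_normal(T)  # errors with covariance Omega

# GLS formula: (X' Omega^-1 X)^-1 X' Omega^-1 y
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
V_gls = np.linalg.inv(X.T @ Omega_inv @ X)

# OLS on the transformed model with H = L^-1, so that H'H = Omega^-1
y_star = np.linalg.solve(L, y)                        # H y
X_star = np.linalg.solve(L, X)                        # H X
beta_star, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

print("GLS formula:    ", beta_gls)
print("transformed OLS:", beta_star)
```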
Estimated GLS (EGLS) or Feasible GLS (FGLS)

Replace $\rho$ and $\sigma_i$ with consistent estimates $\hat{\rho}$ and $\hat{\sigma}_i$ and iterate... (Careful about the number of parameters. The problem is iterating until convergence. So just twice.)

Maximum Likelihood Alternative to GLS and EGLS

$\ln L = -(T/2)\ln(2\pi) - (1/2)\sum \ln(\sigma_t^2) - (1/2)\sum \left( (y_t - \mu_t)^2 / \sigma_t^2 \right)$

The common model is:

$\mu_t = \beta_0 + \beta_1 x_{1t} + \dots + \beta_k x_{kt}$
$\sigma_t^2 = \sigma^2$

However, this can also be a common model:

$\mu_t = \beta_0 + \beta_1 x_{1t} + \dots + \beta_k x_{kt}$
$\sigma_t^2 = \alpha_0 + \alpha_1 z_{1t} + \dots + \alpha_p z_{pt}$

Think about, specify, estimate, test models of the conditional variance...
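The log-likelihood above is easy to evaluate for any choice of mean and variance equations. The sketch below only evaluates it (maximization would be handed to a numerical optimizer) on simulated data with a single z variable. The function name `gaussian_loglik` and the parameter values are assumptions for illustration.

```python
import numpy as np

def gaussian_loglik(y, mu, sigma2):
    """ln L = -(T/2) ln(2 pi) - (1/2) sum ln(sigma2_t) - (1/2) sum (y_t - mu_t)^2 / sigma2_t."""
    T = len(y)
    return (-0.5 * T * np.log(2.0 * np.pi)
            - 0.5 * np.sum(np.log(sigma2))
            - 0.5 * np.sum((y - mu) ** 2 / sigma2))

# Simulated data with sigma_t^2 = alpha_0 + alpha_1 * z_t (assumed values)
rng = np.random.default_rng(4)
T = 300
x = rng.uniform(0.0, 5.0, T)
z = rng.uniform(0.0, 2.0, T)
alpha = np.array([0.5, 1.5])
y = 1.0 + 2.0 * x + np.sqrt(alpha[0] + alpha[1] * z) * rng.standard_normal(T)

# Mean equation evaluated at the OLS estimates
X = np.column_stack([np.ones(T), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
mu = X @ beta_ols

# "Common model": constant variance, sigma^2 estimated by mean squared residual
s2 = np.mean((y - mu) ** 2)
ll_const = gaussian_loglik(y, mu, np.full(T, s2))

# Variance equation evaluated at the alpha values used in the simulation
ll_het = gaussian_loglik(y, mu, alpha[0] + alpha[1] * z)

print(f"lnL, constant variance: {ll_const:.2f}")
print(f"lnL, variance equation: {ll_het:.2f}")
```

Comparing the two log-likelihoods at their maximized values is the basis for a likelihood ratio test of the variance equation.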
Several computer regression packages will do these procedures (LIMDEP: HREG, SAS: AUTOREG, STATA:...).

LIMDEP:
$y_t = x_t \beta + e_t$
$\sigma_t^2 = \exp\{\alpha_0 + \alpha_1 z_t\}$
$\sigma_t^2 = \sigma^2 [x_t \beta]^2$

SAS & SHAZAM:
$y_t = x_t \beta + e_t$
$\sigma_t^2 = \exp\{\alpha_0 + \alpha_1 z_t\}$
$\sigma_t^2 = \sigma^2 [x_t \beta]^2$
$\sigma_t^2 = \sigma^2 [\alpha_0 + \alpha_1 z_t]$
$\sigma_t^2 = \sigma^2 [\alpha_0 + \alpha_1 z_t]^2$
Example application of mean and variance equations:

Transaction Price = f(..., Market Institution,...)

The introduction of Mandatory Price Reporting (the market institution) had a negative effect as a variable in the mean equation: the conditional mean of transaction prices was reduced once the institution was introduced.

The market institution also had a negative coefficient in the variance equation, i.e., it decreased the variance of transaction prices.

MPR decreased the mean price and decreased price risk.

So heteroskedasticity is not just something to fix. It can have an economic interpretation, and if that is the case then it should be explored.