Econometrics Multiple Regression Analysis: Heteroskedasticity


Econometrics: Multiple Regression Analysis, Heteroskedasticity
João Valle e Azevedo
Faculdade de Economia, Universidade Nova de Lisboa
Spring Semester, Lisbon, April 2011

Properties of OLS: Variance
Assumption MLR.5 (Homoskedasticity): the error u has the same variance given any value of the explanatory variables, Var(u | x_1, ..., x_k) = σ², leading to Var(β̂ | X) = σ² (X'X)⁻¹.
With MLR.1 through MLR.5 we derived the variance of the OLS estimators and further concluded that OLS is asymptotically normal: enough to conduct inference as usual.
If MLR.5 does not hold, that is, if the conditional variance of u is allowed to vary with the x's, then the errors are heteroskedastic and the results above are NOT valid: we cannot conduct inference as usual (t tests, F tests, LM tests).

Heteroskedastic Case
Suppose y is wage and x is education.
[Figure: conditional densities f(y|x) around the regression line E(y|x) = β_0 + β_1 x, drawn at x_1, x_2, x_3; it illustrates how spread out the distribution is at different values of x.]

Properties of OLS: Variance (Cont.)
Theorem. Under assumptions MLR.1 through MLR.5,
    Var(β̂_j) = σ² / [SST_j (1 − R_j²)],  j = 0, 1, ..., k,
where SST_j = Σ_{i=1}^{n} (x_ij − x̄_j)² and R_j² is the coefficient of determination from regressing x_j on all the other regressors (it tells us how much the other regressors explain x_j).

Variance with Heteroskedasticity
Now assume Var(u_i | x_i1, ..., x_ik) = σ_i².
For the simple regression case,
    β̂_1 = β_1 + Σ_i (x_i − x̄) u_i / Σ_i (x_i − x̄)²,
so, conditional on the x's,
    Var(β̂_1) = Σ_i (x_i − x̄)² σ_i² / [Σ_i (x_i − x̄)²]².
A valid estimator when σ_i² ≠ σ² is
    Var̂(β̂_1) = Σ_i (x_i − x̄)² û_i² / [Σ_i (x_i − x̄)²]²,
where the û_i are the OLS residuals.

Variance with Heteroskedasticity (Cont.)
For the multiple regression model, a valid (consistent) estimator of Var(β̂_j) under heteroskedasticity is
    Var̂(β̂_j) = Σ_i r̂_ij² û_i² / SSR_j²,
where r̂_ij is the i-th residual from regressing x_j on all the other independent variables, SSR_j is the sum of squared residuals from that regression, and the û_i are the OLS residuals.
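As an aside (not part of the original slides), the following minimal Python sketch checks this formula on simulated data: it partials x_1 out of the other regressors, applies the expression above, and compares the result with the HC0 robust standard error reported by statsmodels. All variable names and the data-generating process are illustrative assumptions.

    import numpy as np
    import statsmodels.api as sm

    # Simulated data; names and the heteroskedasticity pattern are illustrative only.
    rng = np.random.default_rng(0)
    n = 300
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))          # columns: const, x1, x2
    y = 1 + x1 - 0.5 * x2 + rng.normal(size=n) * (1 + x1**2)

    u_hat = sm.OLS(y, X).fit().resid                        # OLS residuals from the full model

    # Partial the other regressors (const, x2) out of x1 and keep the residuals r_i1
    r1 = sm.OLS(X[:, 1], X[:, [0, 2]]).fit().resid
    ssr1 = np.sum(r1**2)                                    # SSR_1 from that regression

    var_b1 = np.sum(r1**2 * u_hat**2) / ssr1**2             # the robust variance estimator above
    print(np.sqrt(var_b1))                                  # robust standard error for beta_1
    print(sm.OLS(y, X).fit(cov_type="HC0").bse[1])          # same number via statsmodels (HC0)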

Robust Standard Errors
The square root of this variance can be used as a standard error for inference (robust standard error). With these standard errors it turns out that
    t = (β̂_j − β_j) / se(β̂_j) →a Normal(0, 1).
This is a heteroskedasticity-robust t statistic. Often the estimated variance is corrected for degrees of freedom by multiplying by n/(n − k − 1) (irrelevant for large n).
Why not always use robust standard errors? In small samples, t statistics using robust standard errors will not have a distribution close to the normal (or t), and inferences will not be correct.
We will not deal with heteroskedasticity-robust F statistics; instead, use heteroskedasticity-robust LM tests.
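In practice, robust standard errors are usually requested from the estimation routine rather than computed by hand. A hedged sketch with Python's statsmodels and simulated data follows; cov_type="HC1" applies a degrees-of-freedom correction of the kind mentioned above, and the wage/education variables are made up for illustration.

    import numpy as np
    import statsmodels.api as sm

    # Simulated wage/education data; names and the variance pattern are illustrative only.
    rng = np.random.default_rng(1)
    n = 200
    educ = rng.uniform(8, 20, size=n)
    wage = 2 + 0.5 * educ + rng.normal(size=n) * educ / 10   # error spread grows with educ

    X = sm.add_constant(educ)
    usual = sm.OLS(wage, X).fit()                  # conventional (homoskedasticity) SEs
    robust = sm.OLS(wage, X).fit(cov_type="HC1")   # heteroskedasticity-robust, df-corrected

    print(usual.bse)        # usual standard errors
    print(robust.bse)       # robust standard errors
    print(robust.tvalues)   # robust t statistics, compared with the standard normal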

A Robust LM Statistic
Suppose we have a standard model y = β_0 + β_1 x_1 + β_2 x_2 + ... + β_k x_k + u and our null hypothesis is H_0: β_{k−q+1} = β_{k−q+2} = ... = β_k = 0 (the number of restrictions is q).
First, run OLS on the restricted model and save the residuals ŭ.
Regress each of the excluded variables on all of the included variables (q different regressions) and save each set of residuals r_1, r_2, ..., r_q.
Regress a variable defined to be identically 1 on the products ŭ·r_1, ŭ·r_2, ..., ŭ·r_q, with no intercept.
The LM statistic is n − SSR_1, where SSR_1 is the sum of squared residuals from this final regression; under the null it has a chi-square distribution with q degrees of freedom.
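These steps translate directly into code. The sketch below (not from the slides) uses simulated data and statsmodels; the hypothesis tested, the number of restrictions q = 2, and the data-generating process are all illustrative assumptions.

    import numpy as np
    import statsmodels.api as sm

    # Simulated data; we test H0: beta_3 = beta_4 = 0 (q = 2).
    rng = np.random.default_rng(2)
    n = 500
    X = rng.normal(size=(n, 4))
    y = 1 + X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n) * (1 + np.abs(X[:, 0]))

    incl = sm.add_constant(X[:, :2])                 # regressors kept under H0
    excl = X[:, 2:]                                  # the q excluded regressors

    u_r = sm.OLS(y, incl).fit().resid                # step 1: restricted-model residuals

    # step 2: residuals from regressing each excluded variable on the included ones
    r = np.column_stack([sm.OLS(excl[:, j], incl).fit().resid for j in range(excl.shape[1])])

    # steps 3-4: regress a constant 1 on the products u_r * r_j, with no intercept
    ssr1 = sm.OLS(np.ones(n), r * u_r[:, None]).fit().ssr
    lm_robust = n - ssr1                             # compare with chi2(q) critical values
    print(lm_robust)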

Testing for Heteroskedasticity
We want to test H_0: Var(u | x_1, ..., x_k) = σ², which is equivalent to H_0: E(u² | x_1, ..., x_k) = E(u²) = σ².
If we assume the relationship between u² and the x_j is linear, we can test this as a linear restriction.
Thus, for u² = δ_0 + δ_1 x_1 + ... + δ_k x_k + ν, this means testing H_0: δ_1 = δ_2 = ... = δ_k = 0.
We do not observe the error, but we can use the residuals from the OLS regression.

The Breusch-Pagan Test
Estimate û² = δ_0 + δ_1 x_1 + ... + δ_k x_k + ν by OLS (using the OLS residuals in place of the unobserved errors); we want to test H_0: δ_1 = δ_2 = ... = δ_k = 0.
Take the R² of this regression. With assumptions MLR.1 through MLR.4 still in place, we can use an F test or an LM-type test.
The F statistic is just the reported F statistic for overall significance of this regression:
    F = (R²/k) / [(1 − R²)/(n − k − 1)] ~ F(k, n − k − 1).
Alternatively, we can form the LM statistic LM = nR², which is approximately distributed as χ²_k under the null (this R² is that of the auxiliary regression above; this is not the usual LM test).
These tests are usually called the Breusch-Pagan tests for heteroskedasticity.
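statsmodels ships a Breusch-Pagan routine that regresses the squared residuals on the supplied regressors and reports both the LM and the F versions. A small sketch on simulated data (all names illustrative):

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.diagnostic import het_breuschpagan

    # Simulated data with variance depending on x1.
    rng = np.random.default_rng(3)
    n = 400
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    X = sm.add_constant(np.column_stack([x1, x2]))
    y = 1 + x1 + x2 + rng.normal(size=n) * np.exp(0.5 * x1)

    resid = sm.OLS(y, X).fit().resid
    lm, lm_pval, fval, f_pval = het_breuschpagan(resid, X)   # auxiliary regression of resid^2 on X
    print(f"LM = nR^2 = {lm:.2f} (p = {lm_pval:.4f}),  F = {fval:.2f} (p = {f_pval:.4f})")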

The White Test
The Breusch-Pagan tests detect linear forms of heteroskedasticity. The White test allows for nonlinearities by using the squares and cross-products of all the x's.
Estimate by OLS:
    u² = δ_0 + δ_1 x_1 + ... + δ_k x_k + δ_{k+1} x_1² + ... + δ_{2k} x_k² + δ_{2k+1} x_1 x_2 + ... + δ_{k+k(k+1)/2} x_{k−1} x_k + error.
We want to test H_0: δ_1 = δ_2 = ... = δ_{k+k(k+1)/2} = 0.
Take the R² of this regression and again use the F or LM statistic to test whether all the x_j, x_j², and x_j x_h are jointly significant:
    F = (R²/q) / [(1 − R²)/(n − q − 1)] ~ F(q, n − q − 1) (approx.) under the null, and LM = nR² ~ χ²_q (approx.) under the null, with q = k + k(k+1)/2.
If k is large and n is small, these approximations are poor.

Alternate Form of the White Test
The fitted values from OLS, ŷ, are a function of all the x's; thus ŷ² will be a function of the squares and cross-products, so ŷ and ŷ² can substitute for all of the x_j, x_j², and x_j x_h.
Regress the squared residuals on ŷ and ŷ² (as well as a constant) and use the R² to form an F or LM statistic (as for the BP or White tests).
We are now only testing 2 restrictions.
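A minimal sketch of this special form on simulated data (design and names are illustrative, not from the slides): regress the squared OLS residuals on a constant, ŷ and ŷ², then form LM = nR² or use the regression's overall F statistic.

    import numpy as np
    import statsmodels.api as sm

    # Simulated data with heteroskedasticity driven by the first regressor.
    rng = np.random.default_rng(4)
    n = 300
    X = sm.add_constant(rng.normal(size=(n, 3)))
    y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(size=n) * np.exp(0.5 * X[:, 1])

    fit = sm.OLS(y, X).fit()
    u2 = fit.resid**2
    yhat = fit.fittedvalues

    # Regress the squared residuals on a constant, yhat and yhat^2 (only 2 restrictions)
    Z = sm.add_constant(np.column_stack([yhat, yhat**2]))
    aux = sm.OLS(u2, Z).fit()

    lm = n * aux.rsquared          # compare with chi2(2)
    f_stat = aux.fvalue            # the auxiliary regression's overall F statistic
    print(lm, f_stat)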

WLS: Weighted Least Squares
We can always estimate robust standard errors for OLS. However, if we know something about the specific form of the heteroskedasticity, we can obtain estimators with a smaller variance than OLS.
If we in fact know that form, we are able to transform the model into one that has homoskedastic errors.

WLS: Case of a Form Known up to a Multiplicative Constant
    y = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_3 + ... + β_k x_k + u
Suppose we know that Var(u | x) = σ² h(x), or Var(u_i | x_i) = σ² h(x_i) = σ² h_i.
Example: wage = β_0 + β_1 Education + β_2 Experience + β_3 Tenure + u.
We know that E(u_i/√h_i | x) = 0, because h_i depends only on x, and Var(u_i/√h_i | x) = σ², because Var(u_i | x) = σ² h_i.
So, if we divide the regression equation by √h_i, we get a model where the error is homoskedastic (MLR.1 through MLR.5 hold again).

WLS: Generalized Least Squares
Estimating the transformed equation by OLS is an example of generalized least squares (GLS). GLS will be BLUE (Best Linear Unbiased Estimator) in this case.
The GLS estimator for the particular case where we divide the regression equation by √h_i is called a weighted least squares (WLS) estimator. Why? Because it minimizes
    Σ_{i=1}^{n} (y*_i − β̂_0/√h_i − β̂_1 x*_i1 − ... − β̂_k x*_ik)²,  where y*_i = y_i/√h_i and x*_ij = x_ij/√h_i,
which equals
    Σ_{i=1}^{n} (y_i − β̂_0 − β̂_1 x_i1 − ... − β̂_k x_ik)² / h_i,
a weighted sum of squared residuals with weights 1/h_i.
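As an illustration (not from the slides), here is a sketch with statsmodels where the form of the heteroskedasticity is assumed known, Var(u | x) = σ² x_1, so h_i = x_i1; WLS with weights 1/h_i is numerically the same as OLS on the equation divided by √h_i. The data and variable names are made up.

    import numpy as np
    import statsmodels.api as sm

    # Assumed variance form: Var(u|x) = sigma^2 * x1, purely for illustration.
    rng = np.random.default_rng(5)
    n = 400
    x1 = rng.uniform(1, 5, size=n)
    y = 3 + 2 * x1 + np.sqrt(x1) * rng.normal(size=n)   # error sd proportional to sqrt(h_i)

    X = sm.add_constant(x1)
    h = x1
    wls = sm.WLS(y, X, weights=1.0 / h).fit()   # weights 1/h_i
    ols = sm.OLS(y, X).fit()

    print(wls.params, wls.bse)   # WLS estimates, typically more precise here
    print(ols.params, ols.bse)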

WLS: More on WLS
We interpret the WLS estimates in the original (not the transformed) model, but we get the variances of the WLS estimators from the transformed model.
WLS is optimal if we know the form of Var(u_i | x_i). In most cases we won't know the form of the heteroskedasticity, but we can often estimate it.
Example: wage = β_0 + β_1 Education + β_2 Experience + β_3 Tenure + u, with Var(u | Education, Experience, Tenure) = σ² exp(δ_0 + δ_1 Education), where δ_0 and δ_1 are unknown.

WLS: Must Estimate the Form of the Heteroskedasticity (Feasible GLS)
First, we assume a model for the heteroskedasticity.
Example: Var(u | x) = E(u² | x) = σ² exp(δ_0 + δ_1 x_1 + ... + δ_k x_k) > 0.
Since we don't know the δ's, we must estimate them. We can write the above model as
    u² = σ² exp(δ_0 + δ_1 x_1 + ... + δ_k x_k) ν,  where E(ν | x) = 1.
Assume further that ν is independent of x. Then
    ln(u²) = α_0 + δ_1 x_1 + ... + δ_k x_k + e,  where E(e) = 0 and e is independent of x.

WLS: Feasible GLS (Continued)
    ln(u²) = α_0 + δ_1 x_1 + ... + δ_k x_k + e,  where E(e) = 0 and e is independent of x.
We can use û (from OLS) instead of u to estimate this equation by OLS. Then obtain an estimate of h_i as ĥ_i = exp(ĝ_i), where ĝ_i is the fitted value from this regression. Finally, use 1/ĥ_i as the weights in WLS.
Summary: run OLS on the original model and save the residuals û; square them and take logs; regress ln(û²) on all of the independent variables (plus a constant) and get the fitted values ĝ; do WLS using 1/exp(ĝ) as the weights.
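The summary above translates almost line by line into code. A hedged sketch on simulated data follows; the assumed variance model and all variable names are illustrative, not part of the slides.

    import numpy as np
    import statsmodels.api as sm

    # Assumed variance model: Var(u|x) = sigma^2 * exp(d0 + d1*x1), for illustration only.
    rng = np.random.default_rng(6)
    n = 500
    x1 = rng.normal(size=n)
    X = sm.add_constant(x1)
    y = 1 + 0.8 * x1 + np.exp(0.5 * (0.3 + 0.6 * x1)) * rng.normal(size=n)

    # Step 1: OLS on the original model, save the residuals
    u_hat = sm.OLS(y, X).fit().resid

    # Step 2: regress log(u_hat^2) on all regressors (plus constant), keep the fitted values g_hat
    g_hat = sm.OLS(np.log(u_hat**2), X).fit().fittedvalues

    # Step 3: WLS using 1/exp(g_hat) as the weights
    fgls = sm.WLS(y, X, weights=1.0 / np.exp(g_hat)).fit()
    print(fgls.params, fgls.bse)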

WLS: Notes on GLS
OLS is still unbiased and consistent under heteroskedasticity (as long as MLR.1 through MLR.4 hold); we use GLS just for efficiency (smaller variance of the estimators).
If we know the weights to use in WLS, then GLS is unbiased. Otherwise, and assuming that we estimate a correctly specified model for the heteroskedasticity, FGLS (feasible GLS) is not unbiased but is consistent and asymptotically efficient.
Remember, with FGLS we are estimating the parameters of the original model; standard errors in the transformed model also refer to standard errors in the original model, and we can use the usual t and F tests for inference.
When doing F tests with WLS, form the weights from the unrestricted model and use those weights to do WLS on both the restricted and the unrestricted models.