Multiple Regression Analysis: Heteroskedasticity


Multiple Regression Analysis: Heteroskedasticity

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Read chapter 8.

EE45 - Chaiyuth Punyasavatsut

Topics
8.1 Heteroskedasticity and OLS
8.2 Robust estimation
8.3 Testing for heteroskedasticity
8.4 Weighted least squares estimation
8.5 LPM revisited

8.1 What is Heteroskedasticity?
Recall that the assumption of homoskedasticity implies that, conditional on the explanatory variables, the variance of the unobserved error u is constant. If this is not true, that is, if the variance of u differs for different values of the x's, then the errors are heteroskedastic. Example: in estimating returns to education, ability is unobservable, and we might think the variance in ability differs by educational attainment.


8.1 Why Worry About Heteroskedasticity?
OLS is still unbiased and consistent even if we do not assume homoskedasticity (MLR.1-MLR.4 are enough), and R² and adjusted R² are unaffected. But the standard errors of the estimates are biased if we have heteroskedasticity, and if the standard errors are biased we cannot use the usual t, F, or LM statistics for drawing inferences. Moreover, OLS is no longer BLUE under heteroskedasticity, which means we can find more efficient estimators.
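To illustrate the first point, here is a minimal Monte Carlo sketch (NumPy, simulated data; all variable names and parameter values are my own): even with strongly heteroskedastic errors, the OLS slope estimates center on the true value.

```python
import numpy as np

# Monte Carlo sketch: OLS slope stays (nearly) unbiased under heteroskedasticity
rng = np.random.default_rng(5)
n, reps, beta1 = 200, 2000, 2.0
est = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 5, n)
    u = rng.normal(0, 1 + x, n)      # error sd depends on x: heteroskedastic
    y = 1.0 + beta1 * x + u
    xd = x - x.mean()
    est[r] = np.sum(xd * y) / np.sum(xd ** 2)   # OLS slope
print(est.mean())                    # close to the true slope 2.0
```

The average of the 2,000 slope estimates is very close to 2.0 despite the error variance growing with x; what heteroskedasticity distorts is the usual variance formula, not the point estimate.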

8.2 Heteroskedasticity-Robust Inference
We adjust standard errors and the t, F, and LM statistics so that they are valid in the presence of heteroskedasticity of unknown form; thus OLS is still useful. Let us begin with how the variance Var(β̂1) can be estimated under heteroskedasticity. Consider a simple regression
y_i = β0 + β1 x_i + u_i, with Var(u_i | x_i) = σ_i².

8.2 Variance with Heteroskedasticity
For OLS,
β̂1 = β1 + Σ_i (x_i − x̄) u_i / SSTx, where SSTx = Σ_i (x_i − x̄)²,
so
Var(β̂1) = Σ_i (x_i − x̄)² σ_i² / SSTx².

8.2 Variance with Heteroskedasticity
If σ_i² = σ² for all i, then
Var(β̂1) = Σ_i (x_i − x̄)² σ² / SSTx² = σ² / SSTx
(see eq. 2.57), so we cannot use this formula when heteroskedasticity is present.

8.2 Variance with Heteroskedasticity
When σ_i² ≠ σ², a valid estimator of Var(β̂1) is
Σ_i (x_i − x̄)² û_i² / SSTx²,
where the û_i are the OLS residuals. We can compute this after the OLS regression.
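As a sketch of this estimator (pure NumPy, simulated data; all names are my own), we can compute the robust variance directly from the formula and compare it with the usual homoskedastic formula σ̂²/SSTx:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 10, n)
u = rng.normal(0, 0.5 * x)           # heteroskedastic: error sd grows with x
y = 1.0 + 2.0 * x + u

xbar = x.mean()
sst_x = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * y) / sst_x  # OLS slope
b0 = y.mean() - b1 * xbar
uhat = y - b0 - b1 * x               # OLS residuals

# Robust variance: sum((x_i - xbar)^2 * uhat_i^2) / SSTx^2
se_robust = np.sqrt(np.sum((x - xbar) ** 2 * uhat ** 2) / sst_x ** 2)

# Usual formula for comparison: sigma2_hat / SSTx
sigma2 = np.sum(uhat ** 2) / (n - 2)
se_usual = np.sqrt(sigma2 / sst_x)
print(se_robust, se_usual)
```

With the error variance increasing in x, the two standard errors differ noticeably; only the robust one is valid here.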

8.2 Variance with Heteroskedasticity
For the multiple regression model, a valid estimator of Var(β̂j) with heteroskedasticity is
Var̂(β̂j) = Σ_i r̂_ij² û_i² / SSRj²,
where r̂_ij is the i-th residual from regressing x_j on all the other independent variables, and SSRj is the sum of squared residuals from that regression.
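A small NumPy sketch (simulated data; names are my own) shows that this partialling-out formula Σ r̂_ij² û_i² / SSRj² gives exactly the j-th diagonal element of the familiar "sandwich" matrix (X'X)⁻¹ X' diag(û²) X (X'X)⁻¹:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 2))
u = rng.normal(size=n) * (1 + X[:, 0] ** 2)   # heteroskedastic errors
y = 0.5 + X @ np.array([1.0, -2.0]) + u

Z = np.column_stack([np.ones(n), X])          # design matrix with intercept
beta = np.linalg.lstsq(Z, y, rcond=None)[0]
uhat = y - Z @ beta

# Slide formula for Var(beta_j): partial out the other regressors first
j = 1                                          # coefficient on X[:, 0]
others = np.delete(Z, j, axis=1)
gamma = np.linalg.lstsq(others, Z[:, j], rcond=None)[0]
rhat = Z[:, j] - others @ gamma                # residuals r_ij
ssr_j = np.sum(rhat ** 2)
var_j = np.sum(rhat ** 2 * uhat ** 2) / ssr_j ** 2

# Same quantity via the sandwich matrix (Z'Z)^-1 Z' diag(uhat^2) Z (Z'Z)^-1
ZtZ_inv = np.linalg.inv(Z.T @ Z)
V = ZtZ_inv @ (Z.T * uhat ** 2) @ Z @ ZtZ_inv
print(var_j, V[j, j])                          # the two agree
```

The agreement follows from the Frisch-Waugh-Lovell theorem: β̂j can be written as Σ r̂_ij y_i / SSRj, so its robust variance is Σ r̂_ij² û_i² / SSRj².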

8.2 Variance with Heteroskedasticity: Robust Standard Errors
Now that we have a consistent estimate of the variance, its square root can be used as a standard error for inference, called the heteroskedasticity-robust standard error. Sometimes the estimated variance is corrected for degrees of freedom by multiplying by n/(n − k − 1); as n → ∞ it is all the same. The robust standard errors allow us to compute t statistics that are asymptotically valid whether or not heteroskedasticity is present.

8.2 Robust Standard Errors (cont.)
It is important to remember that these robust standard errors have only an asymptotic justification. With small sample sizes, t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct. That is why we like the usual t statistic under the homoskedasticity assumption: it has an exact t distribution. In Stata, robust standard errors are easily obtained using the robust option of reg.

8.2 Robust Standard Errors (cont.)
Note that the robust standard errors can be either larger or smaller than the usual ones; in empirical work we often find that the robust standard errors are larger. Similarly, we can compute heteroskedasticity-robust F or LM statistics.

8.2 Robust Standard Errors (cont.): A Robust LM Statistic
Consider the model
y = β0 + β1x1 + β2x2 + ... + β5x5 + u, H0: β4 = 0, β5 = 0.
To obtain the usual LM statistic, we first estimate the restricted model to obtain the residuals ŭ; then we regress ŭ on all the x's. The LM statistic is LM = n·R²_ŭ.

8.2 Robust Standard Errors (cont.): A Robust LM Statistic
For the robust LM statistic:
1. Run OLS on the restricted model and save the residuals ŭ.
2. Regress each of the excluded variables on all of the included variables (q different regressions) and save each set of residuals ř1, ř2, ..., řq.
3. Regress 1 on ř1ŭ, ř2ŭ, ..., řqŭ, with no intercept.
4. The LM statistic is n − SSR1, where SSR1 is the sum of squared residuals from the regression in step 3.
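The four steps above can be sketched directly in NumPy (simulated data with two truly excludable regressors; all names and parameter values are my own):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 5))
# true model uses only x1..x3, so H0: beta4 = beta5 = 0 holds; hetero errors
y = 1.0 + X[:, :3] @ np.array([1.0, 0.5, -0.5]) \
    + rng.normal(size=n) * (1 + np.abs(X[:, 0]))

# Step 1: restricted model (x4, x5 excluded), save residuals
Z_r = np.column_stack([np.ones(n), X[:, :3]])
u_r = y - Z_r @ np.linalg.lstsq(Z_r, y, rcond=None)[0]

# Step 2: regress each excluded variable on the included ones, save residuals
resids = []
for col in (3, 4):
    g = np.linalg.lstsq(Z_r, X[:, col], rcond=None)[0]
    resids.append(X[:, col] - Z_r @ g)

# Step 3: regress 1 on r_q * u_r, with no intercept
P = np.column_stack([r * u_r for r in resids])
ones = np.ones(n)
coef = np.linalg.lstsq(P, ones, rcond=None)[0]
ssr1 = np.sum((ones - P @ coef) ** 2)

# Step 4: robust LM statistic, compared to chi2 with q = 2 df
lm_robust = n - ssr1
print(lm_robust)
```

Since H0 is true in the simulated data, the statistic is typically small relative to the χ²(2) critical value of 5.99.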

8.3 Testing for Heteroskedasticity
We test in order to decide whether to use the robust standard errors or the usual ones. If heteroskedasticity is present, OLS is not BLUE; if the form of the heteroskedasticity is known, we can find a better estimator than OLS: weighted least squares (WLS). Consider the model
y = β0 + β1x1 + β2x2 + ... + βkxk + u, assuming MLR.1-MLR.4 hold.
Essentially, we want to test H0: Var(u | x1, x2, ..., xk) = σ², which is equivalent to H0: E(u² | x1, x2, ..., xk) = E(u²) = σ².

8.3 Testing for Heteroskedasticity
This tests whether MLR.5 is true. If the data violate the null hypothesis, it means we can find some relationship between u² and the x_j. A simple approach is to assume a linear function:
u² = δ0 + δ1x1 + ... + δkxk + v.
The null hypothesis of homoskedasticity is then H0: δ1 = δ2 = ... = δk = 0. Remember that u is the error in the population model and is not observed, so we have to estimate it.

8.3 Testing for Heteroskedasticity (cont.): BP test
We can estimate u² with the squared residuals from the OLS regression. After regressing the squared residuals on all of the x's, we can use the R² to form an F or LM test. The F statistic is just the reported F statistic for overall significance of the auxiliary regression, F = [R²/k]/[(1 − R²)/(n − k − 1)], which is distributed F(k, n − k − 1). The LM statistic is LM = nR², which is distributed χ²(k). The LM version is also called the Breusch-Pagan test for heteroskedasticity (BP test for short).
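A minimal NumPy sketch of the BP test (simulated data with deliberately heteroskedastic errors; all names are my own):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 400, 3
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([1.0, -1.0, 0.5]) \
    + rng.normal(size=n) * np.exp(0.5 * X[:, 0])   # variance depends on x1

# OLS residuals from the original regression
Z = np.column_stack([np.ones(n), X])
uhat = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
u2 = uhat ** 2

# Auxiliary regression: squared residuals on all the x's
d = np.linalg.lstsq(Z, u2, rcond=None)[0]
r2 = 1 - np.sum((u2 - Z @ d) ** 2) / np.sum((u2 - u2.mean()) ** 2)

lm_bp = n * r2                                      # BP LM statistic ~ chi2(k)
f_bp = (r2 / k) / ((1 - r2) / (n - k - 1))          # F version ~ F(k, n-k-1)
print(lm_bp, f_bp)
```

Under H0 the LM statistic would be compared with the χ²(3) critical value (7.81 at the 5% level); with the strong heteroskedasticity built into the simulation, large values are expected.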

8.3 Testing for Heteroskedasticity (cont.): White test
The Breusch-Pagan test will detect any linear forms of heteroskedasticity. The White test allows for nonlinearities by using squares and cross products of all the x's: we still just use an F or LM test of whether all the x_j, x_j², and x_j x_h are jointly significant. This can get unwieldy pretty quickly.

8.3 Testing for Heteroskedasticity (cont.): White test
For k = 3, we have 9 terms in total (3 levels x_j, 3 squares x_j², and 3 cross products x_j x_h), so H0: δ1 = 0, ..., δ9 = 0, and we lose many degrees of freedom. We can conserve on degrees of freedom with White's idea of using the OLS fitted values.

8.3 Testing for Heteroskedasticity (cont.): White test
Consider that the fitted values from OLS,
ŷ_i = β̂0 + β̂1 x_i1 + ... + β̂k x_ik,
are a function of all the x's; thus ŷ² will be a function of the squares and cross products, so ŷ and ŷ² can proxy for all of the x_j, x_j², and x_j x_h.

8.3 Testing for Heteroskedasticity (cont.): White test
We can then regress the squared residuals on ŷ and ŷ² and use the R² to form an F or LM statistic. Note that we now need to test only two restrictions, δ1 = 0 and δ2 = 0; this alternative method saves degrees of freedom.
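The special-case White test can be sketched in a few lines of NumPy (simulated data; names are my own):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
X = rng.normal(size=(n, 3))
y = 1.0 + X @ np.array([1.0, 0.5, -0.5]) \
    + rng.normal(size=n) * (1 + X[:, 1] ** 2)   # nonlinear heteroskedasticity

Z = np.column_stack([np.ones(n), X])
beta = np.linalg.lstsq(Z, y, rcond=None)[0]
yhat = Z @ beta
u2 = (y - yhat) ** 2

# Special-case White test: regress uhat^2 on yhat and yhat^2 only
W = np.column_stack([np.ones(n), yhat, yhat ** 2])
d = np.linalg.lstsq(W, u2, rcond=None)[0]
r2 = 1 - np.sum((u2 - W @ d) ** 2) / np.sum((u2 - u2.mean()) ** 2)

lm_white = n * r2            # ~ chi2(2) under homoskedasticity
print(lm_white)
```

Only two slope restrictions are tested regardless of how many regressors the original model has, which is the point of the fitted-values shortcut.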

8.3 Testing for Heteroskedasticity (cont.): White test
The LM statistic has a chi-squared distribution with degrees of freedom equal to the number of restrictions; in the previous example, df = 2. If we use the F statistic, it has an F distribution with (2, n − 3) degrees of freedom in the previous example.

8.3 Testing for Heteroskedasticity (cont.)
A heteroskedasticity problem can sometimes be viewed as a model misspecification problem. In practice, we test the model specification first; once we are satisfied with the functional form, we then test for heteroskedasticity.

8.3 Testing for Heteroskedasticity: Example 8.4 (housing price equation)
price = −21.77 + .00207 lotsize + .123 sqrft + 13.85 bdrms
        (29.48)  (.00064)         (.013)       (9.01)
n = 88, R² = 0.67.
If we regress the squared OLS residuals on the independent variables, we get R² = 0.16, with n = 88 and k = 3.

8.3 Testing for Heteroskedasticity: Example 8.4 (housing price equation)
The F test (formula 8.15) for heteroskedasticity is
F = [0.16/3] / [(1 − 0.16)/(88 − 3 − 1)] ≈ 5.34,
so we reject the null: there is evidence of heteroskedasticity. What if we take logs of all the variables?
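A quick arithmetic check of the F statistic above (using the rounded R² = 0.16 shown on the slide; the unrounded auxiliary R² gives the reported 5.34):

```python
# Example 8.4 arithmetic: F = (R^2/k) / ((1 - R^2)/(n - k - 1))
r2, n, k = 0.16, 88, 3
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
print(round(f_stat, 2))   # 5.33 with the rounded R^2, matching ~5.34
```

The 5% critical value of F(3, 84) is about 2.7, so the rejection is clear either way.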

8.3 Testing for Heteroskedasticity: Example 8.5 (White test): see the program.