Introductory Econometrics


Based on the textbook by Wooldridge: Introductory Econometrics: A Modern Approach. Robert M. Kunst (robert.kunst@univie.ac.at), University of Vienna and Institute for Advanced Studies, Vienna. December 11, 2012

Outline: heteroskedasticity; the White-Eicker method; testing for heteroskedasticity; weighted least squares.

Heteroskedasticity: the definition. Heteroskedasticity refers to the feature that the variance σᵢ² of the regression errors uᵢ is not constant in i:

E(uᵢ² | xᵢ₁, ..., xᵢₖ) = σᵢ² ≠ σ².

The variation of the unobserved influences depends on the observed explanatory variables. Example: Engel curves, etc.

The consequences of heteroskedasticity. OLS remains unbiased and consistent (good news). OLS becomes inefficient, as the conditions of the Gauss-Markov theorem are no longer valid: there are better estimators (in large samples, this inefficiency may not be a serious problem). The standard errors are incorrect, so inference based on t statistics becomes misleading (bad news).

Three topics in heteroskedasticity: repairing the incorrect standard errors while keeping the inefficient least-squares estimates (White-Eicker method); testing for heteroskedasticity (Breusch-Pagan test, White test); efficient estimation under heteroskedasticity (weighted least squares, feasible generalized least squares).

The White-Eicker method: the idea in matrices. Presume y = Xβ + u with E(uu′) = Ω, where Ω is the diagonal matrix formed from (σ₁², ..., σₙ²). Then the variance of the OLS estimate β̂ is not σ²(X′X)⁻¹ but [conditional on X where needed]

var(β̂) = (X′X)⁻¹ X′ΩX (X′X)⁻¹.

Evaluating this formula would require an estimate of Ω, which is typically unavailable. It can be shown, however, that replacing Ω by diag(û₁², ..., ûₙ²), formed from the OLS residuals ûᵢ, yields a consistent estimate of the variance:

var(β̂)ᵣ = (X′X)⁻¹ X′ diag(û₁², ..., ûₙ²) X (X′X)⁻¹.

We cannot estimate the big n × n matrix Ω, but we can estimate the small (k+1) × (k+1) matrix X′ΩX.
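A minimal numpy sketch of this sandwich formula, using simulated data of my own (the function name and the data-generating process are illustrative, not from the slides):

```python
import numpy as np

def hc0_cov(X, y):
    """OLS estimate plus White-Eicker (HC0) sandwich covariance:
    (X'X)^-1 X' diag(u_hat^2) X (X'X)^-1."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)        # OLS coefficients
    u = y - X @ beta                                # OLS residuals
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (u[:, None] ** 2 * X)              # X' diag(u^2) X without forming an n x n matrix
    return beta, bread @ meat @ bread

# simulated data whose error standard deviation grows with x
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 3.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, x)              # error sd equals x
beta, V = hc0_cov(X, y)
robust_se = np.sqrt(np.diag(V))
```

Note that the "meat" is computed row-wise, exactly the (k+1) × (k+1) quantity the slide says is estimable even though Ω itself is not.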

The White-Eicker method: White-Eicker in simple regression. In a simple regression yᵢ = β₀ + β₁xᵢ + uᵢ, the OLS estimate of β₁ is

β̂₁ = β₁ + Σᵢ(xᵢ − x̄)uᵢ / Σᵢ(xᵢ − x̄)²,

and hence

var(β̂₁) = Σᵢ(xᵢ − x̄)²σᵢ² / SSTₓ²,

which is estimated by

var(β̂₁)ᵣ = Σᵢ(xᵢ − x̄)²ûᵢ² / SSTₓ².
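The scalar formula for var(β̂₁)ᵣ can be checked numerically; on simulated data of my own it coincides with the slope element of the matrix sandwich from the previous slide:

```python
import numpy as np

# simulated simple regression with heteroskedastic errors
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 + x)        # error sd rises with x

xbar = x.mean()
sst_x = np.sum((x - xbar) ** 2)                     # SST_x
b1 = np.sum((x - xbar) * y) / sst_x                 # OLS slope
b0 = y.mean() - b1 * xbar                           # OLS intercept
u = y - b0 - b1 * x                                 # OLS residuals

# robust variance estimate: sum_i (x_i - xbar)^2 u_i^2 / SST_x^2
var_b1_r = np.sum((x - xbar) ** 2 * u ** 2) / sst_x ** 2
robust_se_b1 = np.sqrt(var_b1_r)
```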

The White-Eicker method: White-Eicker in practice. The square roots of the estimates var(β̂ⱼ)ᵣ are called heteroskedasticity-robust standard errors. The ratio β̂ⱼ / √var(β̂ⱼ)ᵣ is the heteroskedasticity-robust t statistic. Similarly, robust F statistics (Wald statistics) can be computed.

Testing for heteroskedasticity: the Breusch-Pagan test. The way σᵢ² depends on the regressors xᵢ₁, ..., xᵢₖ (the functional form) is unknown. Breusch and Pagan considered linear dependence,

σᵢ² = δ₀ + δ₁xᵢ₁ + ... + δₖxᵢₖ,

and constructed a test based on the nR² approximation to the LM test. Such a test has the homoskedastic null δ₁ = ... = δₖ = 0. In large samples, the statistic is χ² distributed under the null with k degrees of freedom.

Testing for heteroskedasticity: the Breusch-Pagan test, construction. Like other LM tests, the Breusch-Pagan test can be implemented via an auxiliary regression and its R²: 1. Estimate the main equation y = Xβ + u by OLS and save the OLS residuals ûᵢ; 2. Regress ûᵢ², i = 1, ..., n, on all original regressors (the auxiliary regression), ûᵢ² = δ₀ + δ₁xᵢ₁ + ... + δₖxᵢₖ + νᵢ, and keep the corresponding R²; 3. The statistic nR² from this auxiliary regression is the approximate LM test statistic. It is χ²(k) distributed under the null in large samples. An approximate F version is also available. For large values of the statistic, reject homoskedasticity.
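The three steps can be sketched in a few lines of numpy; the simulated data and function names below are my own illustration:

```python
import numpy as np

def r_squared(Z, t):
    """R^2 of an OLS regression of t on Z (Z includes a constant column)."""
    coef = np.linalg.solve(Z.T @ Z, Z.T @ t)
    resid = t - Z @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((t - t.mean()) ** 2)

def breusch_pagan(X, y):
    """nR^2 form of the Breusch-Pagan LM statistic, following the three steps."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)        # step 1: OLS on the main equation
    u2 = (y - X @ beta) ** 2                        # squared OLS residuals
    return len(y) * r_squared(X, u2)                # steps 2-3: auxiliary regression, n R^2

rng = np.random.default_rng(2)
n = 1000
x = rng.uniform(1.0, 3.0, n)
X = np.column_stack([np.ones(n), x])
lm_het = breusch_pagan(X, 1 + 2 * x + rng.normal(0.0, x))       # heteroskedastic errors
lm_hom = breusch_pagan(X, 1 + 2 * x + rng.normal(0.0, 1.0, n))  # homoskedastic errors
```

With k = 1 here, lm_het is compared against a χ²(1) critical value (3.84 at the 5% level); the heteroskedastic sample rejects, while lm_hom stays small.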

Testing for heteroskedasticity: the White test. Presume that σᵢ² also depends on squares of the explanatory variables and on cross-terms. This quite general form may pick up heteroskedasticity that is missed by Breusch-Pagan:

û² = δ₀ + δ₁x₁ + ... + δₖxₖ + δ₁₁x₁² + ... + δₖₖxₖ² + δ₁₂x₁x₂ + ... + δₖ₋₁,ₖxₖ₋₁xₖ + error.

A drawback is that there are k(k+3)/2 regressors (excluding the constant), which may be too many when k is large and n is small. An alternative suggestion is to use

û² = δ₀ + δ₁ŷ + δ₂ŷ² + error,

with ŷ the fitted values from the original main regression.

Testing for heteroskedasticity: the White test, construction. Again, the White test is implemented in the nR² form of the LM tests: 1. Estimate the main equation y = Xβ + u by OLS and save the OLS residuals ûᵢ; 2. Regress ûᵢ², i = 1, ..., n, on all original regressors and on their squares and cross-products (auxiliary regression), and keep the R² from this regression; 3. Compute nR², which is asymptotically χ²(k(k+3)/2) distributed under the null. Reject homoskedasticity for large values of the test statistic. With the alternative version û² = δ₀ + δ₁ŷ + δ₂ŷ² + error, the asymptotic null distribution of the test statistic is χ²(2).
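The alternative (fitted-values) version is the easiest to sketch, since its auxiliary regressors are just ŷ and ŷ² regardless of k. The data below are my own simulation:

```python
import numpy as np

def white_special(X, y):
    """Special form of the White test: regress u_hat^2 on y_hat and y_hat^2;
    the statistic n R^2 is asymptotically chi2(2) under homoskedasticity."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)        # OLS on the main equation
    yhat = X @ beta
    u2 = (y - yhat) ** 2
    Z = np.column_stack([np.ones(len(y)), yhat, yhat ** 2])   # auxiliary regressors
    g = np.linalg.solve(Z.T @ Z, Z.T @ u2)
    resid = u2 - Z @ g
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((u2 - u2.mean()) ** 2)
    return len(y) * r2

rng = np.random.default_rng(3)
n = 1000
x1 = rng.uniform(1.0, 3.0, n)
x2 = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1 + 2 * x1 - x2 + rng.normal(0.0, x1)           # error variance depends on x1
lm = white_special(X, y)                            # compare with chi2(2) 5% value, 5.99
```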

Weighted least squares: suppose the σᵢ² are known. If the error variances σᵢ² are known, it is fairly easy to construct an efficient estimator. Just divide the regression equation by σᵢ:

yᵢ/σᵢ = β₀/σᵢ + β₁xᵢ₁/σᵢ + ... + βₖxᵢₖ/σᵢ + uᵢ/σᵢ,

a homogeneous regression (no constant term) with homoskedastic errors. The Gauss-Markov assumptions are fulfilled, and least squares is efficient. If the σᵢ², i = 1, ..., n, are known up to a constant factor, the same technique can be applied. If only the functional form of their dependence on the regressors is known, the σᵢ² can be estimated.

Weighted least squares: in matrices. Presume the regression y = Xβ + u has E(uu′) = Ω with Ω = diag(σ₁², ..., σₙ²), a diagonal matrix. Then the regression ỹ = X̃β + ũ, with ỹ = Ω^(−1/2)y, X̃ = Ω^(−1/2)X, ũ = Ω^(−1/2)u, fulfils the conditions of Gauss-Markov, and OLS is efficient. In the original variables, this OLS estimate is written as

β̂_w = (X′Ω⁻¹X)⁻¹X′Ω⁻¹y.

It is called weighted least squares (WLS), as observations with strong variation in their errors obtain smaller weights, and observations with low variation obtain larger weights. It is also called generalized least squares (GLS), as it is an example of a more general technique that can be used whenever Ω is known but not a scalar matrix, not only for heteroskedasticity with its diagonal Ω.
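A short numpy sketch, on simulated data with known variances, showing that the matrix formula and the divide-by-σᵢ route from the previous slide give the same estimate:

```python
import numpy as np

def wls(X, y, sigma2):
    """Weighted least squares (X' Omega^-1 X)^-1 X' Omega^-1 y, Omega = diag(sigma2)."""
    w = 1.0 / sigma2                                # inverse-variance weights
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

rng = np.random.default_rng(4)
n = 400
x = rng.uniform(1.0, 3.0, n)
X = np.column_stack([np.ones(n), x])
sigma2 = x ** 2                                     # known error variances
y = 1 + 2 * x + rng.normal(0.0, np.sqrt(sigma2))
beta_w = wls(X, y, sigma2)

# equivalent route: divide every variable by sigma_i and run OLS
Xs = X / np.sqrt(sigma2)[:, None]
ys = y / np.sqrt(sigma2)
beta_s = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)
```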

Weighted least squares: standard errors for WLS. WLS is obtained by transforming the original regression model (re-scaling the observations) and then applying OLS. Hence the variance of the WLS estimate is immediate:

var(β̂_w) = (X′Ω⁻¹X)⁻¹.

The square roots of the diagonal elements are standard errors and can be used for t statistics. Heuristically, observations with large variation in their errors contribute strongly to these standard errors. It can be shown that this variance is smaller (in the matrix sense) than the variance of the OLS estimate, var(β̂) = (X′X)⁻¹X′ΩX(X′X)⁻¹.

Weighted least squares: the feasible GLS estimator. The variances σᵢ² are usually unknown, so they must be estimated. Functional forms that may yield negative variance estimates, such as the linear form, must be discarded. Consider instead the form

var(u|x) = σ² exp(δ₀ + δ₁x₁ + ... + δₖxₖ).

An obvious suggestion is to run a regression on the OLS residuals û,

log ûᵢ² = δ₀ + δ₁xᵢ₁ + ... + δₖxᵢₖ + error,

and to plug the exponentiated fitted values in for σᵢ². This is an example of a feasible GLS (FGLS) method. These methods can actually be very unreliable in small samples, but as n → ∞ they can be shown to achieve the efficiency of the (infeasible and efficient) GLS.

Weighted least squares: a feasible GLS estimator in heteroskedastic models. 1. Regress y on x₁, ..., xₖ using OLS; keep the residuals û. 2. Create log(û²) from the OLS residuals of the main regression. 3. Regress log(û²) on x₁, ..., xₖ using OLS; keep the fitted values ĝ. 4. Exponentiate the fitted values, ĥ = exp(ĝ), to obtain variance estimates. 5. Regress y on x₁, ..., xₖ using WLS with weights 1/ĥᵢ, or equivalently transform all variables via division by ĥᵢ^(1/2) and regress by OLS.
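The five steps above can be sketched directly; the simulated data (with a variance function that happens to match the exponential model) are my own illustration:

```python
import numpy as np

def fgls(X, y):
    """Feasible GLS following the five steps above (variance function exp(x'delta))."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)        # 1. OLS on the main equation
    logu2 = np.log((y - X @ beta) ** 2)             # 2. log of squared residuals
    delta = np.linalg.solve(X.T @ X, X.T @ logu2)   # 3. auxiliary OLS regression
    h = np.exp(X @ delta)                           # 4. exponentiated fitted values
    w = 1.0 / h                                     # 5. WLS with weights 1/h_i
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

rng = np.random.default_rng(5)
n = 2000
x = rng.uniform(0.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
y = 1 + 2 * x + rng.normal(0.0, np.exp(0.5 * x))    # Var(u|x) = exp(x)
beta_fgls = fgls(X, y)
```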

Weighted least squares: what if the specific functional form for heteroskedasticity is wrong? It is unrealistic to assume an exact match between the entertained functional form and the true one, even when such a true form exists. Nevertheless, as long as the specification captures the variation in dispersion across the sample reasonably well, FGLS will outperform OLS in larger samples. The same remark holds for heteroskedasticity tests: as long as the functional form is reasonable, it will pick up existing heteroskedasticity, even when the form is not exactly correct. The logarithmic form is convenient for FGLS; however, it is not recommended for the construction of tests in analogy to White and Breusch-Pagan (the Park test).