Reliability of inference (1 of 2 lectures)


Reliability of inference (1 of 2 lectures). Ragnar Nymoen, University of Oslo, 5 March 2013

This lecture (#13 and 14)

The optimality of the OLS estimators and tests depends on the assumptions of the regression model being true. We now answer three questions:
1. What are the consequences of residual mis-specification (heteroskedasticity, autocorrelation, non-normality) and of non-constancy of parameters?
2. How can we discover empirically departures from the classical assumptions and from constancy of parameters?
3. What do we do if departures from the assumptions are detected?
References: Ch 8 in HGL; ch 8 in BN.

Departures from the normality assumption

The OLS estimators are BLUE even if the disturbances have a distribution different from the normal, and they are also consistent. The problem with non-normality is that we do not know the exact (finite-sample) distribution of the t-ratios and F-statistics that we use in testing, so inference may become unreliable, at least in small samples. With large samples (30+) we can often appeal to asymptotic normality under relatively mild assumptions. A test for departure from normality is the Jarque-Bera (J-B) test.
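As a small illustration (not from the slides; simulated data), the J-B statistic can be computed directly from a residual series. It combines sample skewness and excess kurtosis and is asymptotically χ²(2) under the null of normal disturbances, so values far above the 5% critical value of about 5.99 signal non-normality:

```python
import numpy as np

def jarque_bera(residuals):
    """Jarque-Bera statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4),
    where S is sample skewness and K is sample kurtosis.
    Asymptotically chi-squared with 2 df under normality."""
    e = np.asarray(residuals, dtype=float)
    n = e.size
    e = e - e.mean()
    s2 = (e**2).mean()                     # (biased) variance estimate
    skew = (e**3).mean() / s2**1.5
    kurt = (e**4).mean() / s2**2
    return n / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)

rng = np.random.default_rng(0)
jb_normal = jarque_bera(rng.standard_normal(5000))   # small under normality
jb_skewed = jarque_bera(rng.exponential(size=5000))  # huge for skewed draws
```

On the normal draws JB stays near its χ²(2) range, while the strongly skewed exponential draws produce a statistic orders of magnitude above any conventional critical value.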

Heteroskedasticity

If the variances of the disturbances are not all identical, the homoskedasticity assumption Var(ε_i) = σ² of the regression model is violated, and we have heteroskedasticity. Note that the definition of heteroskedasticity is in terms of the theoretical disturbances. In practice the question is therefore whether the empirical heteroskedasticity that we observe in the residuals is a sign of a significant departure from homoskedasticity.

Consequences of heteroskedasticity I

Without loss of generality, consider the consequences in simple regression. The OLS estimator β̂₁ is unbiased also in the case of heteroskedastic disturbances, since Var(ε_i) = σ²_i does not enter into the proof of unbiasedness. By the same argument, the OLS estimator β̂₁ is also consistent.

Consequences of heteroskedasticity II

The OLS estimator is no longer efficient (BLUE), and the conventional variance formula

Var(β̂₁) = σ² / Σ_i (X_i − X̄)²

will either over- or underestimate the true variance of β̂₁ when Var(ε_i) = σ²_i. The estimated variance is biased, and the t-ratio that makes use of this expression is therefore also biased. This means that statistical inference is no longer reliable under heteroskedasticity. The direction of the bias in Var(β̂₁) depends on the direction of the association between (X_i − X̄)² and σ²_i. If (X_i − X̄)² and σ²_i are positively related, the formula underestimates the true variance, so t-ratios are too large and we make Type I errors more frequently than the nominal significance level.
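A small Monte Carlo sketch (not from the slides; simple simulated data in which the disturbance standard deviation grows with |X_i − X̄|) illustrates the positive-association case: averaged across replications, the conventional variance formula falls short of the true sampling variance of β̂₁:

```python
import numpy as np

# Monte Carlo: when sigma_i^2 rises with (X_i - Xbar)^2, the conventional
# OLS variance formula underestimates the sampling variance of beta1-hat.
rng = np.random.default_rng(1)
n, reps = 100, 2000
x = np.linspace(1.0, 10.0, n)
xc = x - x.mean()
sigma_i = 1.0 + np.abs(xc)          # disturbance sd largest at extreme x

beta1_hats, conv_vars = [], []
for _ in range(reps):
    y = 1.0 + 2.0 * x + sigma_i * rng.standard_normal(n)
    b1 = (xc @ (y - y.mean())) / (xc @ xc)   # OLS slope
    b0 = y.mean() - b1 * x.mean()
    e = y - b0 - b1 * x
    s2 = (e @ e) / (n - 2)
    conv_vars.append(s2 / (xc @ xc))         # conventional Var-hat(beta1)
    beta1_hats.append(b1)

true_var = np.var(beta1_hats)                # Monte Carlo sampling variance
avg_conventional = np.mean(conv_vars)        # mean conventional estimate
```

With this design the conventional estimate is clearly too small on average, which is exactly why t-ratios become overstated and Type I errors too frequent.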

Consequences of heteroskedasticity III

If (X_i − X̄)² and σ²_i are negatively related, the (absolute value of the) t-ratio is underestimated. Hence we will conclude too often that a regressor is insignificant when it is in fact significant.

Forms of heteroskedasticity I

In the following it will be necessary to assume some form for the variation in σ²_i. A form that is sometimes referred to as classical heteroskedasticity is

σ²_i = σ² W_i^h, with h > 0   (1)

where W_i is an observable variable. Harald's first lecture mentioned this for the situation where the scatter plot suggested Var(Y|X) = σ² X².

Forms of heteroskedasticity II

As we shall see, if heteroskedasticity is of this type (with h known), the problem it creates for inference is easily corrected. However, for the purpose of testing, (1) is inconvenient, since the null hypothesis of homoskedasticity cannot be formulated as a parametric restriction on a general model with h > 0. Therefore, mixed heteroskedasticity forms have been suggested. This form models the variances as a function of s observable variables:

σ²_i = a₀ + a₁ Z_1i + ... + a_s Z_si   (2)

Forms of heteroskedasticity III

i.e., a variance function. The null hypothesis of homoskedasticity is

H₀: a₁ = a₂ = ... = a_s = 0

(A good feature of the tests below is that they have power to reject H₀ even when the variance function is non-linear in a₀ + a₁ Z_1i + ... + a_s Z_si.) A form that is much used in models for time-series data, and in financial econometrics in particular, is autoregressive conditional heteroskedasticity, ARCH. The first-order ARCH is

σ²_t = a₀ + a₁ ε²_{t−1}   (3)

Testing the null of homoskedasticity I

Main point: the residuals contain all the variation in Y_i that is unexplained by our model, i.e., by the specification we have chosen for the conditional expectation:

ε̂_i = Y_i − Ê(Y_i | X_i)

Therefore we can use the residuals to test the assumptions we have made about the disturbances of the regression model. Mis-specification testing is a large field, and we can only mention a few popular tests here.

Testing the null of homoskedasticity II

Informal tests: as Harald showed, the scatter plot is often instructive as an informal test. After estimation of a model that assumes homoskedasticity, ε̂_i (or ε̂²_i) can be plotted against X_i.

Formal test: we mention White's version of the Lagrange multiplier test in ch. 8.2.2 in HGL. In the case of one regressor, White's test replaces the theoretical variance function (2) with the auxiliary regression

ε̂²_i = a₀ + a₁ X_i + a₂ X²_i + v_i,  i = 1, 2, ..., n   (4)

where v_i is a disturbance with assumed classical properties.

Testing the null of homoskedasticity III

(4) is an example of an auxiliary regression that simplifies the testing of residual mis-specification. In this case H₀: a₁ = a₂ = 0 is tested by an existence-of-a-relationship test:

F_het = [R²_het / (1 − R²_het)] × (n − 3)/2 ∼ F(2, n − 3)

where R²_het is the R-squared from the auxiliary regression (4). HGL also mention the χ² version of this test,

χ²_het(2) = n R²_het,

and the two are equivalent in large samples. Research shows that the F version has better properties in small samples.
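The one-regressor F version above is easy to compute by hand. The following sketch (not from the slides; simulated data, numpy only) runs the auxiliary regression (4) and forms F_het, once for homoskedastic and once for heteroskedastic disturbances:

```python
import numpy as np

def white_test_F(x, y):
    """White's heteroskedasticity test, F version, one regressor:
    regress y on (1, x); regress squared residuals on (1, x, x^2);
    F_het = R2/(1-R2) * (n-3)/2 ~ F(2, n-3) under homoskedasticity."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2
    Z = np.column_stack([np.ones(n), x, x**2])     # auxiliary regressors
    a, *_ = np.linalg.lstsq(Z, e2, rcond=None)
    u = e2 - Z @ a
    r2 = 1.0 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))
    return r2 / (1.0 - r2) * (n - 3) / 2.0

rng = np.random.default_rng(2)
x = rng.uniform(1.0, 10.0, 500)
y_homo = 1.0 + 2.0 * x + rng.standard_normal(500)       # constant variance
y_het = 1.0 + 2.0 * x + rng.standard_normal(500) * x    # sd grows with x
f_homo = white_test_F(x, y_homo)
f_het = white_test_F(x, y_het)
```

Under the null, f_homo stays near the typical range of an F(2, 497) variate; for the heteroskedastic data f_het is far above the 5% critical value of about 3.0.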

Testing the null of homoskedasticity IV

In the case of k regressors in the regression model, we get

F_het = [R²_het / (1 − R²_het)] × (n − 2k − 1)/(2k) ∼ F(2k, n − 2k − 1)

when all the squared regressors are included. In the case of k > 1 we can also include cross-products of regressors, but since the number of cross-products, k!/((k − 2)! 2!) = k(k − 1)/2, increases rapidly in k, this is not practical in moderate sample sizes. A common case is that the k-variable regression model contains both continuous variables and dummies as regressors.

Testing the null of homoskedasticity V

Testing for ARCH: in this case the auxiliary regression is the direct counterpart to the ARCH formulation in (3):

ε̂²_t = a₀ + a₁ ε̂²_{t−1} + v_t,

and H₀: a₁ = 0 is tested against a₁ ≠ 0. Hence report

F_arch = [R²_het / (1 − R²_het)] × (T − 2) ∼ F(1, T − 2),

even if a two-sided t-test can also be used. Higher-order ARCH effects: include longer lags of ε̂²_t and adjust the degrees of freedom accordingly.
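A sketch of the first-order ARCH test (not from the slides; a simulated ARCH(1) series with illustrative parameters a₀ = 0.2, a₁ = 0.5), using the degrees of freedom of the auxiliary regression itself:

```python
import numpy as np

def arch1_F(e):
    """First-order ARCH test: regress e_t^2 on a constant and e_{t-1}^2,
    and report the F statistic for H0: a1 = 0 (no ARCH)."""
    e2 = np.asarray(e, dtype=float) ** 2
    y, z = e2[1:], e2[:-1]               # lose one observation to the lag
    m = y.size
    Z = np.column_stack([np.ones(m), z])
    a, *_ = np.linalg.lstsq(Z, y, rcond=None)
    u = y - Z @ a
    r2 = 1.0 - (u @ u) / ((y - y.mean()) @ (y - y.mean()))
    return r2 / (1.0 - r2) * (m - 2)     # ~ F(1, m - 2) under H0

# Simulate an ARCH(1) series and an i.i.d. benchmark series.
rng = np.random.default_rng(3)
T = 1000
e = np.zeros(T)
for t in range(1, T):
    sigma2 = 0.2 + 0.5 * e[t - 1] ** 2   # conditional variance, eq. (3)
    e[t] = np.sqrt(sigma2) * rng.standard_normal()

f_arch = arch1_F(e)                      # rejects H0 clearly
f_iid = arch1_F(rng.standard_normal(T))  # stays in the usual F range
```

The ARCH(1) series produces a very large F statistic because squared shocks are strongly autocorrelated, while the i.i.d. series does not.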

Example: Andy's sales equation

sales_i = 1.19 + 1.19 price_i − 0.08922 advert_i
          (0.106) (0.106)       (0.00826)

Sample 1-75 (n = 75), R² = 0.44826
χ²_normality(2) = 6.2498 [0.0439] (J-B test)
F_het(4, 70) = 0.98576 [0.4211] (X² version)
F_het(5, 69) = 0.88864 [0.4936] (X² and cross-product version)

Example: Norwegian PCM

π_t = 10.5 − 1.83 U_t
      (1.453) (0.423)

Sample 1975-2005 (T = 27), R² = 0.44826
χ²_normality(2) = 1.0925 [0.5791] (J-B test)
F_het(2, 24) = 2.6057 [0.0946] (X² version)
F_arch(1, 25) = 7.5486 [0.0110]

Inference and estimation under heteroskedasticity I

When a test of homoskedasticity rejects (as in HGL p. 306), inference based on the OLS estimates is not reliable without further qualifications. What can we do?

First: we can try to robustify our conclusions.
Informal robustification: if the purpose is to test the significance of a regressor, and the heteroskedasticity is of a form that leads to an overestimated t-ratio, we know that a non-rejection outcome is robust.
Formal robustification: Stata and other software can compute standard errors of β̂_j (j = 0, 1, ..., k) that are robust to unknown forms of heteroskedasticity (this builds on White's approach). Use these heteroskedasticity-consistent standard errors to calculate robust t-ratios.
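White's heteroskedasticity-consistent covariance estimator can be sketched directly (not from the slides; the HC1 small-sample scaling and the simulated design are illustrative choices):

```python
import numpy as np

def ols_conventional_and_robust_se(X, y):
    """OLS estimates with conventional and White (HC1)
    heteroskedasticity-consistent standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    e = y - X @ b
    s2 = (e @ e) / (n - k)
    se_conv = np.sqrt(np.diag(s2 * XtX_inv))
    meat = (X * (e**2)[:, None]).T @ X              # sum_i e_i^2 x_i x_i'
    V_rob = XtX_inv @ meat @ XtX_inv * n / (n - k)  # HC1 scaling
    se_rob = np.sqrt(np.diag(V_rob))
    return b, se_conv, se_rob

# Heteroskedastic design where sigma_i rises with |x_i - mean|, so the
# conventional slope standard error is too small on average.
rng = np.random.default_rng(4)
n = 1000
x = rng.uniform(1.0, 10.0, n)
sigma = 1.0 + np.abs(x - 5.5)
y = 1.0 + 2.0 * x + sigma * rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
b, se_conv, se_rob = ols_conventional_and_robust_se(X, y)
```

Here the robust slope standard error exceeds the conventional one, matching the positive-association case discussed above: robust t-ratios are smaller and more trustworthy.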

Inference and estimation under heteroskedasticity II

Second: if we can specify the form of heteroskedasticity, we can restore a regression model with homoskedastic disturbances. This leads to new estimators of β_j (j = 0, 1, ..., k) called generalized least squares (GLS) or weighted least squares (WLS). More about weighted least squares in class.

Third: acknowledge that the first modelling attempt failed: back to the drawing board ("Re-make, Re-model").
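For the classical form (1) with known h, WLS amounts to dividing each observation by σ_i and running OLS on the transformed data. A sketch (not from the slides; simulated data with Var(Y|X) = σ²X², i.e. W_i = X_i and h = 2):

```python
import numpy as np

# Classical heteroskedasticity sigma_i^2 = sigma^2 * x_i^2: dividing each
# observation by x_i restores homoskedastic disturbances, so OLS on the
# transformed data is the weighted least squares (GLS) estimator.
rng = np.random.default_rng(5)
n = 2000
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + x * rng.standard_normal(n)   # disturbance sd = x_i

X = np.column_stack([np.ones(n), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)    # unbiased but inefficient

Xw = X / x[:, None]                              # transformed regressors
yw = y / x                                       # transformed regressand
b_wls, *_ = np.linalg.lstsq(Xw, yw, rcond=None)  # WLS estimate
```

Both estimators are unbiased for (β₀, β₁) = (1, 2); WLS is the efficient (BLUE) estimator under this known variance form.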