Econ 423 Lecture Notes

Similar documents
Introduction to Econometrics

ECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 54

Empirical Macroeconomics

Lecture#17. Time series III

Autocorrelation. Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time

Heteroscedasticity and Autocorrelation

ECON2228 Notes 10. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 48

ECON3150/4150 Spring 2016

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

Econ 423 Lecture Notes

Lecture 8: Instrumental Variables Estimation

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

7 Introduction to Time Series

Model Mis-specification

Auto correlation 2. Note: In general we can have AR(p) errors which implies p lagged terms in the error structure, i.e.,

Final Exam. 1. Definitions: Briefly Define each of the following terms as they relate to the material covered in class.

7 Introduction to Time Series Time Series vs. Cross-Sectional Data Detrending Time Series... 15

ECON Introductory Econometrics. Lecture 13: Internal and external validity

Econometrics. 9) Heteroscedasticity and autocorrelation

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression

Lecture 5. In the last lecture, we covered. This lecture introduces you to

Applied Statistics and Econometrics

Applied Statistics and Econometrics

GMM, HAC estimators, & Standard Errors for Business Cycle Statistics

Problem Set 5 ANSWERS

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Empirical Macroeconomics

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

8. Nonstandard standard error issues 8.1. The bias of robust standard errors

Empirical Application of Simple Regression (Chapter 2)

Empirical Macroeconomics

LECTURE 10: MORE ON RANDOM PROCESSES

ECON3150/4150 Spring 2016

9) Time series econometrics

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

ECONOMETRICS HONOR S EXAM REVIEW SESSION

Econ 423 Lecture Notes: Additional Topics in Time Series 1

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Introduction to Econometrics

Stationary and nonstationary variables

Empirical Application of Panel Data Regression

LECTURE 11. Introduction to Econometrics. Autocorrelation

Handout 12. Endogeneity & Simultaneous Equation Models

Specification Error: Omitted and Extraneous Variables

Economics 308: Econometrics Professor Moody

Autoregressive models with distributed lags (ADL)

Heteroskedasticity Example

Econometrics Homework 1

Quantitative Methods Final Exam (2017/1)

Lecture #8 & #9 Multiple regression

ECON 594: Lecture #6

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Hypothesis Tests and Confidence Intervals in Multiple Regression

Graduate Econometrics Lecture 4: Heteroskedasticity

Week 11 Heteroskedasticity and Autocorrelation

Chapter 6: Linear Regression With Multiple Regressors

4. MA(2) +drift: y t = µ + ɛ t + θ 1 ɛ t 1 + θ 2 ɛ t 2. Mean: where θ(l) = 1 + θ 1 L + θ 2 L 2. Therefore,

Lab 07 Introduction to Econometrics

AUTOCORRELATION. Phung Thanh Binh

10) Time series econometrics

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Lab 11 - Heteroskedasticity

Introduction to Econometrics. Regression with Panel Data

ECON 366: ECONOMETRICS II. SPRING TERM 2005: LAB EXERCISE #10 Nonspherical Errors Continued. Brief Suggested Solutions

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

. *DEFINITIONS OF ARTIFICIAL DATA SET. mat m=(12,20,0) /*matrix of means of RHS vars: edu, exp, error*/

Lecture 4: Multivariate Regression, Part 2

Time Series Econometrics For the 21st Century

Extensions to the Basic Framework I

Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis. 17th Class 7/1/10

Lecture 14. More on using dummy variables (deal with seasonality)

Linear Regression with Multiple Regressors

Question 1 [17 points]: (ch 11)

Dynamic Panel Data Models

Mediation Analysis: OLS vs. SUR vs. 3SLS Note by Hubert Gatignon July 7, 2013, updated November 15, 2013

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

Answers: Problem Set 9. Dynamic Models

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

Econ 582 Fixed Effects Estimation of Panel Data

Reading Assignment. Serial Correlation and Heteroskedasticity. Chapters 12 and 11. Kennedy: Chapter 8. AREC-ECON 535 Lec F1 1

1 The Multiple Regression Model: Freeing Up the Classical Assumptions

ECON2228 Notes 7. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 41

Introduction to Time Series Regression and Forecasting

Lecture#12. Instrumental variables regression Causal parameters III

Heteroskedasticity. Occurs when the Gauss Markov assumption that the residual variance is constant across all observations in the data set

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Applied Econometrics. Professor Bernard Fingleton

Introduction to Econometrics. Multiple Regression (2016/2017)

ECON3150/4150 Spring 2015

Heteroskedasticity- and Autocorrelation-Robust Inference or Three Decades of HAC and HAR: What Have We Learned?

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

Statistical Inference with Regression Analysis

****Lab 4, Feb 4: EDA and OLS and WLS

Heteroskedasticity. (In practice this means the spread of observations around any given value of X will not now be constant)

Introduction to Econometrics

Lecture notes to Stock and Watson chapter 8

Transcription:

Econ 423 Lecture Notes (hese notes are modified versions of lecture notes provided by Stock and Watson, 2007. hey are for instructional purposes only and are not to be distributed outside of the classroom.) 5-

Heteroskedasticity and Autocorrelation-Consistent (HAC) Standard Errors Consider a generalization of the distributed lag model, where the errors u t are not necessarily i.i.d., i.e., Y t = β 0 + β X t + + β r+ X t r + u t. Suppose that u t is serially correlated; then, OLS will still yield consistent* estimators of the coefficients β 0, β,., β r+ (*consistent but possibly biased!) he sampling distribution of ˆ β, etc., is normal 5-2

BU the formula for the variance of this sampling distribution is not the usual one from cross-sectional (i.i.d.) data, because u t is not i.i.d. in this case since, in particular, u t is serially correlated! his means that the usual OLS standard errors (usual SAA printout) are wrong! We need to use, instead, SEs that are robust to autocorrelation as well as to heteroskedasticity his is easy to do using SAA and most (but not all) other statistical software. 5-3

HAC standard errors, ctd. he math for the simplest case with no lags: Y t = β 0 + β X t + u t he OLS estimator: Using the usual regression algebra, we obtain ( X t X ) u ˆ t= β β = 2 ( X t X ) t= t vt t= 2 σ X where v t = (X t X )u t. (in large samples) 5-4

HAC standard errors, ctd. hus, in large samples, var( ˆ β ) = var = 2 t = s = vt / t = ( σ ) 2 2 X cov( vt, vs )/( σ X ) 2 2 In i.i.d. cross sectional data, cov(v t, v s ) = 0 for t s, so var( ˆ β ) = var( v ) 2 t )/ = t ( σ ) = 2 2 X σ ( σ ) 2 v 2 2 x his is our usual cross-sectional result. 5-5

HAC standard errors, ctd. But in time series data, cov(v t, v s ) 0 in general. Consider = 2: var vt = var[½(v +v 2 )] t = = ¼[var(v ) + var(v 2 ) + 2cov(v,v 2 )] = ½ 2 2 σ v + ½ρ σ (ρ = corr(v,v 2 )) = ½σ 2 v f 2, where f 2 = (+ρ ) In i.i.d. data, ρ = 0 so f 2 =, yielding the usual formula In time series data, if ρ 0 then var( ˆ β ) is not given by the usual formula. v 5-6

Expression for var( ˆ β ), general var so var( ˆ β ) = where f = vt = t = 2 σ v f σ ( σ ) 2 v 2 2 X f j + 2 ρ j j= Conventional OLS SE s are wrong when u t is serially correlated (SAA printout is wrong). he OLS SEs are off by the factor f We need to use a different SE formula!!! 5-7

HAC Standard Errors Conventional OLS SEs (heteroskedasticity-robust or not) are wrong when u t is autocorrelated So, we need a new formula that produces SEs that are robust to autocorrelation as well as heteroskedasticity We need Heteroskedasticity- and Autocorrelation- Consistent (HAC) standard errors If we knew the factor f, we could just make the adjustment. However, in most practical applications, we must estimate f. 5-8

HAC SEs, ctd. 2 var( ˆ σ v j β ) = 2 2 ( σ X ) f, where f = + 2 ρ j j= he most commonly used estimator of f is: f ˆ = ρ% j is an estimator of ρ j m m j + 2 % j j m ρ (Newey-West) = his is the Newey-West HAC SE estimator m is called the truncation parameter Why not just set m =? hen how should you choose m? o Use the Goldilocks method o Or, use the rule of thumb, m = 0.75 /3 5-9

Empirical Example he Orange Juice Data Data Monthly, Jan. 950 Dec. 2000 ( = 62) Price = price of frozen OJ (a sub-component of the producer price index; US Bureau of Labor Statistics) %ChgP = percentage change in price at an annual rate, so %ChgP t = 200 ln(price t ) FDD = number of freezing degree-days during the month, recorded in Orlando FL o Example: If November has 2 days with lows < 32 o, one at 30 o and at 25 o, then FDD Nov = 2 + 7 = 9 5-0

5-

Initial OJ regression % ChgP t = -.40 +.47FDD t (.22) (.3) Statistically significant positive relation More freezing degree days price increase Standard errors are heteroskedasticity and autocorrelationconsistent (HAC) SE s 5-2

Example: OJ and HAC estimators in SAA. gen l0fdd = fdd; generate lag #0. gen lfdd = L.fdd; generate lag #. gen l2fdd = L2.fdd; generate lag #2. gen l3fdd = L3.fdd;.. gen l4fdd = L4.fdd;.. gen l5fdd = L5.fdd;.. gen l6fdd = L6.fdd;. reg dlpoj fdd if tin(950m,2000m2), r; NO HAC SEs Linear regression Number of obs = 62 F(, 60) = 2.2 Prob > F = 0.0005 R-squared = 0.0937 Root MSE = 4.826 ------------------------------------------------------------------------------ Robust dlpoj Coef. Std. Err. t P> t [95% Conf. Interval] -------------+---------------------------------------------------------------- fdd.466282.339293 3.48 0.00.203998.7292367 _cons -.4022562.89372-2.2 0.034 -.774549 -.0303575 ------------------------------------------------------------------------------ 5-3

Example: OJ and HAC estimators in SAA, ctd Rerun this regression, but with Newey-West SEs:. newey dlpoj fdd if tin(950m,2000m2), lag(7); Regression with Newey-West standard errors Number of obs = 62 maximum lag: 7 F(, 60) = 2.23 Prob > F = 0.0005 ------------------------------------------------------------------------------ Newey-West dlpoj Coef. Std. Err. t P> t [95% Conf. Interval] -------------+---------------------------------------------------------------- fdd.466282.33342 3.50 0.00.2044077.7280288 _cons -.4022562.259802 -.86 0.063 -.82642.028987 ------------------------------------------------------------------------------ Uses autocorrelations up to m = 7 to compute the SEs rule-of-thumb: 0.75*(62 /3 ) = 6.4 7, rounded up a little. OK, in this case the difference in SEs is small, but not always so! 5-4

Example: OJ and HAC estimators in SAA, ctd.. global lfdd6 "fdd lfdd l2fdd l3fdd l4fdd l5fdd l6fdd";. newey dlpoj $lfdd6 if tin(950m,2000m2), lag(7); Regression with Newey-West standard errors Number of obs = 62 maximum lag : 7 F( 7, 604) = 3.56 Prob > F = 0.0009 ------------------------------------------------------------------------------ Newey-West dlpoj Coef. Std. Err. t P> t [95% Conf. Interval] -------------+---------------------------------------------------------------- fdd.46932.359686 3.45 0.00.2022834.7363407 lfdd.43052.0837047.7 0.088 -.023364.3074388 l2fdd.0564234.056724.00 0.36 -.0538936.667404 l3fdd.0722595.0468776.54 0.24 -.098033.643223 l4fdd.0343244.02954.6 0.245 -.0236383.092287 l5fdd.0468222.030879.52 0.30 -.03822.074657 l6fdd.0485.0446404.08 0.282 -.0395577.357807 _cons -.650583.2336986-2.78 0.006 -.09479 -.95578 ------------------------------------------------------------------------------ global lfdd6 defines a string which is all the additional lags What are the estimated dynamic multipliers (dynamic effects)? 5-5

FAQ: Do I need to use HAC SEs when I estimate an AR or an ADL model? A: No, only if one is sure that the true model is an AR or an ADL in the purest sense so that there is no serial correlation or heteroskedasticity in the errors. In AR and ADL models with homoskedastic errors, one may argue that the errors will be serially uncorrelated if you include enough lags of Y o If you include enough lags of Y, then the error term can t be predicted using past Y, or equivalently by past u so u is serially uncorrelated However, the safer and more robust choice would be to always use HAC SE s. 5-6