Autocorrelation. Jamie Monogan. Intermediate Political Methodology. University of Georgia.


Autocorrelation
Jamie Monogan
University of Georgia
Intermediate Political Methodology
Jamie Monogan (UGA), Autocorrelation, POLS 7014

Objectives
By the end of this meeting, participants should be able to:
- Define autocorrelation and describe the problems it produces.
- Distinguish between issues of autocorrelation and problems of functional form.
- Identify when autocorrelation is present in real data analysis.
- Use feasible GLS to correct for autocorrelation.

What is Autocorrelation?
- The Gauss-Markov assumptions include that disturbances are independent of each other: $\operatorname{cov}(u_i, u_j \mid x_i, x_j) = E(u_i u_j) = 0$ for $i \neq j$.
- Whenever this is not true, we have serial correlation or error autocorrelation: $E(u_i u_j) \neq 0$ for $i \neq j$.
- Why might this emerge?
  - Time-referenced data: Could today's disturbance in a model of Obama's approval be related to yesterday's?
  - Spatially-referenced data: Could the disturbance for one state's policy actions be related to the disturbances of a state's neighbors? (New issue in political science.)
- The consequence:
  - OLS estimates $\hat\beta$ are still unbiased. (Provided this is purely autocorrelation and not a problem of functional form.)
  - The estimates are no longer efficient, however.

Percent Identifying as Liberal Over Time: A Model with a Lagged Dependent Variable

                                 Static Model         Lagged DV Model
                                 Estimate   S.E.      Estimate   S.E.
Great Society intervention        -5.97     0.64       -2.65     0.71
Party control duration            -0.12     0.03       -0.05     0.03
Post-intervention trend           -0.10     0.02       -0.03     0.02
Liberal identification (t-1)                            0.60     0.09
Intercept                         44.12     0.34       17.58     4.11
Adjusted R^2                       0.84                 0.90
N=70

Citation: Ellis, Christopher & James A. Stimson. 2012. Ideology in America. New York: Cambridge University Press. Table 4.4, page 87.

Panel Model of Log Wage: A Real Example of Autocorrelation

                          OLS                   GLSE
                          Estimate   S.E.       Estimate   S.E.
Experience                 0.0132    0.0011      0.0133    0.0017
Bad health                -0.0843    0.0412     -0.0300    0.0363
Unemployed last year      -0.0015    0.0267     -0.0402    0.0207
Nonwhite                  -0.0853    0.0328     -0.0878    0.0518
Union                      0.0450    0.0191      0.0374    0.0296
Schooling                  0.0669    0.0033      0.0676    0.0052

$\hat\sigma_u$: 0.3210, 0.1920. $\hat\rho$: 0.6320. N=750, T=2.

Citations:
Greene, William H. 2003. Econometric Analysis. 5th ed. Upper Saddle River, NJ: Prentice Hall. (p. 306)
Hausman, Jerry A. and William E. Taylor. 1981. Panel Data and Unobservable Individual Effects. Econometrica 49:1377-1398.

Identifying Error Autocorrelation
Visual Diagnosis
- Plot residuals against time. (Also a good diagnosis of model misspecification.)
- Plot residuals against predictors. (Also a good diagnosis of model misspecification.)
- Plot residuals against lagged residuals.
- Plot the autocorrelation function. (Really useful for higher-order autocorrelation.)
Hypothesis Tests
- Durbin-Watson's d
- Durbin's h
- Breusch-Godfrey

The Durbin-Watson d Statistic as a Test
\[ d = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2} \approx 2 - 2\,\mathrm{ACF}(1) \]
- The 1951 solution: If d is non-significant, go ahead and estimate with OLS.
- Biggest problem with this test: Does not allow a lagged dependent variable.
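To make the formula concrete, here is a minimal sketch in plain Python (no external libraries). The function names and the residual values are made up for illustration and do not come from the slides. It computes d directly and compares it with the 2 - 2 ACF(1) approximation; the two differ only by an end-point term, $(\hat{u}_1^2 + \hat{u}_T^2) / \sum \hat{u}_t^2$, which becomes negligible in long series.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(u * u for u in residuals)
    return num / den


def acf1(residuals):
    """Lag-1 autocorrelation, treating the residual mean as zero
    (true for OLS residuals from a model with an intercept)."""
    num = sum(residuals[t] * residuals[t - 1]
              for t in range(1, len(residuals)))
    den = sum(u * u for u in residuals)
    return num / den


# Hypothetical residuals that drift slowly from positive to negative,
# the typical look of positive autocorrelation.
resid = [1.0, 0.8, 0.5, 0.6, 0.1, -0.3, -0.5, -0.2, -0.6, -0.4]

d = durbin_watson(resid)
approx = 2 - 2 * acf1(resid)
# End-point term that separates the exact d from the approximation.
end_term = (resid[0] ** 2 + resid[-1] ** 2) / sum(u * u for u in resid)

print("d =", round(d, 4))
print("2 - 2*ACF(1) =", round(approx, 4))
print("approximation minus end-point term =", round(approx - end_term, 4))
```

With only ten observations the approximation is rough; the point of the sketch is the direction (d well below 2 for positively autocorrelated residuals), not the exact value.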

Durbin-Watson d
- d has an expected value of 2.0 for white noise residuals.
- In the common case of positive autocorrelation, it takes on values < 2.0.
- Significance of d is given by d tables. Given the number of observations and number of predictors, a table provides $d_L$ and $d_U$.
- If $d > d_U$, then there is no evidence of first-order serial correlation.
- If $d < d_L$, then there is evidence of first-order serial correlation.
- If $d_L \leq d \leq d_U$, then there is inconclusive evidence on the presence or absence of first-order serial correlation.
- Gujarati & Porter lay out expectations for the less-common case of negative autocorrelation on page 435.
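The decision rule for the positive-autocorrelation case can be written as a small function. This is only a sketch: the function name is invented here, and the bounds in the usage lines are placeholders rather than values taken from a real d table. In practice $d_L$ and $d_U$ must be looked up for the actual number of observations and predictors.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson d against table bounds (positive-autocorrelation case)."""
    if not 0 <= d_lower <= d_upper:
        raise ValueError("bounds must satisfy 0 <= d_lower <= d_upper")
    if d < d_lower:
        return "evidence of first-order serial correlation"
    if d > d_upper:
        return "no evidence of first-order serial correlation"
    return "inconclusive"


# Placeholder bounds for illustration only; look up real values in a d table.
D_LOWER, D_UPPER = 1.55, 1.67

for d_value in (0.90, 1.60, 2.05):
    print(d_value, "->", dw_decision(d_value, D_LOWER, D_UPPER))
```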

Breusch-Godfrey Test
- After the initial regression, estimate the following:
\[ \hat{u}_t = \alpha_1 + \alpha_2 X_t + \rho_1 \hat{u}_{t-1} + \rho_2 \hat{u}_{t-2} + \cdots + \rho_p \hat{u}_{t-p} + \epsilon_t \]
- Compute the $R^2$ from this auxiliary regression.
- Our test statistic is: $(n - p) R^2_{aux} \sim \chi^2_p$ (chi-squared with p degrees of freedom).
- We use this to test the hypothesis:
  - $H_0$: independent observations
  - $H_1$: non-independent observations (error autocorrelation)

Software
In R:
- dwtest for Durbin-Watson d. DO NOT USE THIS WITH A LAGGED DEPENDENT VARIABLE.
- bgtest for Breusch-Godfrey test (allows lagged dependent variable).
- Both from library(lmtest).
In Stata:
- First: reg y indvars
- Then: estat dwatson
- This is the Durbin-Watson d statistic. DO NOT USE THIS WITH A LAGGED DEPENDENT VARIABLE.
- Use estat bgodfrey or estat durbinalt instead with a lagged DV.

OLS in the Presence of Autocorrelated Error
- Assume for a moment that a static functional form is the correct functional form, and we have specified this correctly.
- This is often a wrong assumption. We will discuss what to do for a non-static functional form shortly.
- The OLS assumptions specifically include no autocorrelation. Therefore the Gauss-Markov proof of BLUE does not follow.
- In the presence of autocorrelation, $\hat\beta$ is unbiased, but inefficient.
- However, $\hat\sigma^2_{\hat\beta}$ is biased (downward), as are t (upward) and p (downward), i.e., in favor of finding significance.
- That is, OLS is LUE, but not BLUE.
- How then do we get BLUE?

Setting Up GLS: Example for T=5
- Assume first-order serial correlation: $u_t = \rho u_{t-1} + \nu_t$, where $\rho \neq 0$.
- Then if we know $\rho$ (which we don't), $\beta_{GLS} = [X' \Omega^{-1} X]^{-1} X' \Omega^{-1} y$ (Aitken 1922).
- Here, $\Omega$ is the matrix of the form:
\[ \Omega = \sigma^2 \begin{bmatrix} 1 & \rho & \rho^2 & \rho^3 & \rho^4 \\ \rho & 1 & \rho & \rho^2 & \rho^3 \\ \rho^2 & \rho & 1 & \rho & \rho^2 \\ \rho^3 & \rho^2 & \rho & 1 & \rho \\ \rho^4 & \rho^3 & \rho^2 & \rho & 1 \end{bmatrix} \]
- Note the exponential decay moving across or up/down from the major diagonal.
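As a quick check on the pattern in $\Omega$, the sketch below builds the matrix in plain Python using the fact that entry (i, j) is $\sigma^2 \rho^{|i-j|}$. The function name and the value $\rho = 0.5$ are illustrative assumptions, not part of the slides.

```python
def ar1_omega(T, rho, sigma2=1.0):
    """T x T error covariance matrix under first-order serial correlation:
    entry (i, j) equals sigma2 * rho ** |i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(T)] for i in range(T)]


# Illustrative values: T = 5 as on the slide, rho = 0.5 chosen arbitrarily.
omega = ar1_omega(5, 0.5)
for row in omega:
    print(["%.4f" % value for value in row])

# With rho = 0 every off-diagonal entry vanishes, leaving sigma2 * I.
identity_like = ar1_omega(4, 0.0)
```

Setting rho to zero returns the identity pattern, which is the OLS case.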

What If We Impose OLS Assumptions?
- $\rho = 0.0$ at all lags (no autocorrelation).
- Then $\Omega$ reduces to $\sigma^2 I$:
\[ \Omega = \sigma^2 \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \sigma^2 I \]
- And since $I^{-1} = I$, $IX = X$, and $\sigma^2 / \sigma^2 = 1$, we have OLS: $\hat\beta = [X'X]^{-1} X'y$.
- Thus OLS is the GLS estimator when $\rho = 0$ and $\sigma^2$ is constant.

Estimation of FGLS: Three Steps
1. Estimate OLS and extract residuals.
2. Estimate $\hat\rho$ = ACF(1).
3. Estimate FGLS using the estimated $\hat\rho$ to construct $\Omega$.

An Iterative Alternative: Cochrane-Orcutt
- Designate $\hat\rho_k$ as the $\rho$ estimated after step k.
- Then $\hat\rho_1$ has the inefficiency properties of OLS.
- But $\hat\rho_1$ is a superior estimate of $\rho$ than was the 0.0 assumed by OLS.
- Thus $\hat\rho_2$, estimated after GLS step 1, should be superior to $\hat\rho_1$.
- More generally, the $\hat\rho_k$ estimated after any round should be superior to the estimate which produced it.
- This can be continued until $\hat\rho_{input} = \hat\rho_{output}$, which we declare to be the correct estimate of $\rho$.
- This is the Cochrane-Orcutt estimator (corc in Stata).
- In R: Compile Simon Jackman's program: http://ow.ly/ugzi
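The iteration can be sketched in plain Python for the simplest case of one predictor. This is not Jackman's program or Stata's prais; it is a minimal illustration under stated assumptions: a single regressor, first-order serial correlation, the first observation dropped after quasi-differencing, and simulated data whose true values (intercept 1, slope 2, rho 0.6) are invented for the example.

```python
import random


def ols(x, y):
    """Bivariate OLS; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_rho(resid):
    """Lag-1 autocorrelation of the residuals, i.e. ACF(1)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    return num / sum(u * u for u in resid)


def cochrane_orcutt(x, y, tol=1e-6, max_iter=100):
    """Iterate between estimating rho and re-fitting on quasi-differenced data."""
    a, b = ols(x, y)          # step 0: plain OLS, which implicitly assumes rho = 0
    rho = 0.0
    for _ in range(max_iter):
        # Residuals always come from the ORIGINAL data and current coefficients.
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        new_rho = estimate_rho(resid)
        # Quasi-difference: z*_t = z_t - rho * z_{t-1}; first observation is lost.
        x_star = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        y_star = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(x_star, y_star)
        a = a_star / (1 - new_rho)   # transformed intercept equals a * (1 - rho)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return a, b, rho


# Simulated data with AR(1) errors; all true values here are made up.
random.seed(7014)
T, TRUE_RHO = 500, 0.6
x = [random.gauss(0, 1) for _ in range(T)]
u = [0.0] * T
for t in range(1, T):
    u[t] = TRUE_RHO * u[t - 1] + random.gauss(0, 1)
y = [1.0 + 2.0 * x[t] + u[t] for t in range(T)]

a_hat, b_hat, rho_hat = cochrane_orcutt(x, y)
print("intercept:", round(a_hat, 3), "slope:", round(b_hat, 3), "rho:", round(rho_hat, 3))
```

Stopping when successive estimates of rho agree within a tolerance is the practical version of the "input equals output" rule above. Dropping the first observation is what distinguishes this from Prais-Winsten, which keeps it with a special transformation.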

Three FGLS Estimation Strategies
1. Single-shot estimation
   - R: pggls function in plm library (for panel data)
   - Stata: prais depvar indvars, twostep
2. Iterative estimation
   - Cochrane-Orcutt
     - R: Jackman code
     - Stata: prais depvar indvars, corc
   - Hildreth-Lu
   - Prais-Winsten
     - Stata: prais depvar indvars
3. Maximum likelihood
   - R & Stata arima functions

Dynamic Specification
- GLS-like corrections for autocorrelation put emphasis on the error term at the cost of static specification.
- That is the wrong priority. Getting the causal specification right is much more important than tidying up the error term. For that we often need dynamics.
- One common solution is the Koyck distributed lag scheme: $y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 X_t + u_t$.
- Think about how this model with a lagged dependent variable works:
  - Suppose X increases by 1 unit at time 0. That means we expect $y_0$ to increase by $\beta_3$ on average, ceteris paribus.
  - One time period later, suppose X returns to its original level without the one unit increase. We still expect y to be a bit different. This is because $y_1$ is a function of $y_0$, but $y_0$ is expected to be $\beta_3$ larger. Thus, we expect $y_1$ to be $\beta_2 \beta_3$ higher, on average, ceteris paribus.
  - A second time period later, $y_2$ is expected to be $\beta_2^2 \beta_3$ higher, on average, ceteris paribus.
  - At k time periods later, $y_k$ is expected to be $\beta_2^k \beta_3$ higher, on average, ceteris paribus.
- This spillover is called a dynamic effect.
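The decay of a one-period pulse in X can be traced numerically. The sketch below uses made-up coefficients ($\beta_2 = 0.5$, $\beta_3 = 2.0$) and an invented function name; it simply evaluates $\beta_2^k \beta_3$ for each period. Summing the effects over a long horizon approaches $\beta_3 / (1 - \beta_2)$, the geometric-series total, provided $|\beta_2| < 1$.

```python
def dynamic_effects(beta2, beta3, periods):
    """Expected change in y at k = 0..periods after a one-time unit pulse in X
    at time 0, under y_t = beta1 + beta2 * y_{t-1} + beta3 * X_t + u_t."""
    if not -1 < beta2 < 1:
        raise ValueError("effects only die out when |beta2| < 1")
    return [beta3 * beta2 ** k for k in range(periods + 1)]


# Hypothetical coefficients chosen for easy arithmetic.
BETA2, BETA3 = 0.5, 2.0

effects = dynamic_effects(BETA2, BETA3, 3)
for k, effect in enumerate(effects):
    print("period", k, "effect", effect)

# Total effect accumulated over all periods (geometric series limit).
total_effect = BETA3 / (1 - BETA2)
print("cumulative effect:", total_effect)
```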

Extra Credit Assignment
- Due November 18 at the start of class.
- Suggestions for how to improve Political Analysis Using R.
- Double-spaced, 12-point font, 1" margins.
- 1 point per page of commentary, with a maximum of 3 bonus points.

Looking Ahead: Research Papers
- Due December 2 at the start of class. Details in the syllabus.
- Each person's data analysis should be solo work.
- Demonstrate that you can use the tools from class.
- You'll always want to do these regression diagnostics, whether they are reported or not.
- Please include an appendix addressing the following:
  1. What is the most likely complaint a reviewer might raise about your model specification? How can you address this issue?
  2. Are your residuals autocorrelated? (For cross-sectional data, just say no. If a Durbin-Watson test is not applicable here, use one as part of your answer to the previous question.)
  3. Do your residuals have a homoscedastic variance?
  4. Are the residuals normally distributed?
  5. Are any data points influential?
  6. Optional: Is there multicollinearity in your predictors?
  7. Also, report all of your software code. (Not outputs, just code.)
- The model you present in the main text ideally will correct any problems you come across.

For Next Time
- Read Gujarati & Porter Chapter 13.
- Study some data on Bush's job approval ratings: http://monogan.myweb.uga.edu/teaching/ts/bushjob.dta
- Notes: (1) These are Stata data. (2) It may take some work to time set and lag the variables.
- Model Bush's approval rating (approve) as a function of September 11 (s11) and the onset of the Iraq war (iraq). Report the following:
  - A plot of Bush's approval rating against months in office (t).
  - An OLS model where approval is only a function of the two inputs, along with a Durbin-Watson test for autocorrelation.
  - A Cochrane-Orcutt FGLS model where approval is only a function of the two inputs.
  - An OLS model where approval is a function of the two inputs AND lagged approval, along with a Breusch-Godfrey test for autocorrelation.
  - Which of these three models do you trust the most? Why?