Autocorrelation Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Autocorrelation POLS 7014 1 / 20

Objectives
By the end of this meeting, participants should be able to:
- Define autocorrelation and describe the problems it produces.
- Distinguish between issues of autocorrelation and problems of functional form.
- Identify when autocorrelation is present in real data analysis.
- Use feasible GLS to correct for autocorrelation.

What is Autocorrelation?
The Gauss-Markov assumptions require that disturbances are independent of each other:
$\mathrm{cov}(u_i, u_j \mid x_i, x_j) = E(u_i u_j) = 0$ for $i \neq j$.
Whenever this is not true, we have serial correlation or error autocorrelation:
$E(u_i u_j) \neq 0$ for $i \neq j$.
Why might this emerge?
- Time-referenced data: Could today's disturbance in a model of Obama's approval be related to yesterday's?
- Spatially-referenced data: Could the disturbance for one state's policy actions be related to the disturbances of a state's neighbors? (New issue in political science.)
The consequence:
- OLS estimates $\hat{\beta}$ are still unbiased. (Provided this is purely autocorrelation and not a problem of functional form.)
- The estimates are no longer efficient, however.

Percent Identifying as Liberal Over Time: A Model with a Lagged Dependent Variable
                                  Static Model         Lagged DV Model
                                  Estimate    S.E.     Estimate    S.E.
Great Society intervention          -5.97     0.64       -2.65     0.71
Party control duration              -0.12     0.03       -0.05     0.03
Post-intervention trend             -0.10     0.02       -0.03     0.02
Liberal identification (t-1)                              0.60     0.09
Intercept                           44.12     0.34       17.58     4.11
Adjusted R^2                         0.84                 0.90
N = 70
Citation: Ellis, Christopher & James A. Stimson. 2012. Ideology in America. New York: Cambridge University Press. Table 4.4, page 87.

Panel Model of Log Wage: A Real Example of Autocorrelation
                          OLS                   GLSE
                          Estimate    S.E.      Estimate    S.E.
Experience                 0.0132    0.0011      0.0133    0.0017
Bad health                -0.0843    0.0412     -0.0300    0.0363
Unemployed last year      -0.0015    0.0267     -0.0402    0.0207
Nonwhite                  -0.0853    0.0328     -0.0878    0.0518
Union                      0.0450    0.0191      0.0374    0.0296
Schooling                  0.0669    0.0033      0.0676    0.0052
$\hat{\sigma}_u$: 0.3210 (OLS), 0.1920 (GLSE)
$\hat{\rho}$: 0.6320
N = 750, T = 2
Citations:
Greene, William H. 2003. Econometric Analysis. 5th ed. Upper Saddle River, NJ: Prentice Hall. (p. 306)
Hausman, Jerry A. and William E. Taylor. 1981. Panel Data and Unobservable Individual Effects. Econometrica 49:1377-1398.

Identifying Error Autocorrelation
Visual Diagnosis
- Plot residuals against time. (Also a good diagnosis of model misspecification.)
- Plot residuals against predictors. (Also a good diagnosis of model misspecification.)
- Plot residuals against lagged residuals.
- Plot the autocorrelation function. (Really useful for higher-order autocorrelation.)
Hypothesis Tests
- Durbin-Watson's d
- Durbin's h
- Breusch-Godfrey

The Durbin-Watson d Statistic as a Test
$d = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2} \approx 2 - 2\,\mathrm{ACF}(1)$
The 1951 solution: If d is non-significant, go ahead and estimate with OLS.
Biggest problem with this test: Does not allow a lagged dependent variable.
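
To make the formula for d concrete, here is a minimal sketch in plain Python. The residual values are hypothetical and chosen only to show positive serial correlation; the function names are illustrative and are not taken from any package.

```python
def durbin_watson(resid):
    """Durbin-Watson d: squared first differences over squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


def acf1(resid):
    """Lag-1 autocorrelation of residuals (assumed to have mean zero)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


# Hypothetical residuals that drift slowly, i.e. positive autocorrelation.
resid = [1.0, 0.8, 0.5, 0.1, -0.3, -0.7, -0.9, -0.6, -0.2, 0.3, 0.7, 0.9]

d = durbin_watson(resid)
approx = 2 - 2 * acf1(resid)
# The gap between d and 2 - 2*ACF(1) is exactly the end-point term below.
end_term = (resid[0] ** 2 + resid[-1] ** 2) / sum(e ** 2 for e in resid)

print("d =", round(d, 3))
print("2 - 2*ACF(1) =", round(approx, 3))
print("end-point term =", round(end_term, 3))
```

The approximation differs from d only by the end-point term, which shrinks as T grows, so it is rough in a series this short.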

Durbin-Watson d
- d has an expected value of 2.0 for white noise residuals.
- In the common case of positive autocorrelation, it takes on values < 2.0.
- Significance of d is given by d tables. Given the number of observations and number of predictors, a table provides $d_L$ and $d_U$.
  - If $d > d_U$, then there is no evidence of first-order serial correlation.
  - If $d < d_L$, then there is evidence of first-order serial correlation.
  - If $d_L \leq d \leq d_U$, then there is inconclusive evidence on the presence or absence of first-order serial correlation.
- Gujarati & Porter lay out expectations for the less-common case of negative autocorrelation on page 435.
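
The three-way decision rule can be written out as a small function. This is only a sketch for the positive-autocorrelation case; the bounds used below are hypothetical placeholders, not values from a real d table.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify d against tabled bounds (positive autocorrelation case)."""
    if d < d_lower:
        return "evidence of first-order serial correlation"
    if d > d_upper:
        return "no evidence of first-order serial correlation"
    return "inconclusive"


# Hypothetical bounds for illustration only; look up real ones in a d table
# for your number of observations and predictors.
D_LOWER, D_UPPER = 1.55, 1.67

results = [dw_decision(d, D_LOWER, D_UPPER) for d in (0.90, 1.60, 1.95)]
for line in results:
    print(line)
```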

Breusch-Godfrey Test
After the initial regression, estimate the following:
$\hat{u}_t = \alpha_1 + \alpha_2 X_t + \rho_1 \hat{u}_{t-1} + \rho_2 \hat{u}_{t-2} + \cdots + \rho_p \hat{u}_{t-p} + \epsilon_t$
Compute the $R^2$ from this auxiliary regression. Our test statistic is:
$(n - p) R^2_{aux} \sim \chi^2_p$ (chi-squared with p degrees of freedom).
We use this to test the hypothesis:
$H_0$: independent observations
$H_1$: non-independent observations (error autocorrelation)
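
The auxiliary-regression recipe can be sketched in plain Python for a model with one predictor. Everything here is an illustrative assumption: the simulated data, the helper names (solve, ols, breusch_godfrey), and the choice to drop the first p observations so that n - p cases enter the auxiliary regression. For real work, use bgtest in R or estat bgodfrey in Stata.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols(X, y):
    """Return (coefficients, residuals, R^2); X must include a constant column."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    ybar = sum(y) / len(y)
    tss = sum((yi - ybar) ** 2 for yi in y)
    rss = sum(e ** 2 for e in resid)
    return beta, resid, 1 - rss / tss


def breusch_godfrey(x, y, p=1):
    """(n - p) * R^2 from regressing residuals on X and p lagged residuals."""
    n = len(y)
    _, u_hat, _ = ols([[1.0, xi] for xi in x], y)
    X_aux = [[1.0, x[t]] + [u_hat[t - j] for j in range(1, p + 1)]
             for t in range(p, n)]
    _, _, r2_aux = ols(X_aux, u_hat[p:])
    return (n - p) * r2_aux


# Simulated data with AR(1) errors (rho = 0.8); all values are hypothetical.
random.seed(7014)
n, rho = 200, 0.8
x = [random.gauss(0, 1) for _ in range(n)]
u, prev = [], 0.0
for _ in range(n):
    prev = rho * prev + random.gauss(0, 1)
    u.append(prev)
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

stat = breusch_godfrey(x, y, p=1)
CRITICAL_5PCT_DF1 = 3.841  # chi-squared critical value, 1 df, 5% level
print("reject H0 of independence:", stat > CRITICAL_5PCT_DF1)
```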

Software
In R:
- dwtest for Durbin-Watson d. DO NOT USE THIS WITH A LAGGED DEPENDENT VARIABLE.
- bgtest for Breusch-Godfrey test (allows lagged dependent variable).
- Both from library(lmtest).
In Stata:
- First: reg y indvars
- Then: estat dwatson
- This is the Durbin-Watson d statistic. DO NOT USE THIS WITH A LAGGED DEPENDENT VARIABLE.
- Use estat bgodfrey or estat durbinalt instead with a lagged DV.

OLS in the Presence of Autocorrelated Error
- Assume for a moment that a static functional form is the correct functional form, and we have specified this correctly.
- This is often a wrong assumption. We will discuss what to do for a non-static functional form shortly.
- The OLS assumptions specifically include no autocorrelation. Therefore the Gauss-Markov proof that OLS is BLUE no longer holds.
- In the presence of autocorrelation, $\hat{\beta}$ is unbiased, but inefficient.
- However, $\hat{\sigma}^2_{\hat{\beta}}$ is biased (downward), as are t (upward) and p (downward), i.e., in favor of finding significance.
- That is, OLS is LUE, but not BLUE.
- How then do we get BLUE?
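
A small Monte Carlo sketch can illustrate both claims: the OLS slope stays centered on the truth, while the conventional standard error understates its real sampling variability. The setup is an assumption made for illustration: one predictor, with both the predictor and the error following AR(1) processes with coefficient 0.8, and a true slope of 2.

```python
import random
import statistics


def ar1(n, rho, rng):
    """Generate n draws from an AR(1) process with standard normal shocks."""
    out, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + rng.gauss(0, 1)
        out.append(prev)
    return out


def slope_and_se(x, y):
    """OLS slope and its conventional standard error (simple regression)."""
    n = len(y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b1, (rss / (n - 2) / sxx) ** 0.5


rng = random.Random(7014)
slopes, ses = [], []
for _ in range(1000):
    x = ar1(50, 0.8, rng)          # autocorrelated predictor
    u = ar1(50, 0.8, rng)          # autocorrelated disturbance
    y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]
    b1, se = slope_and_se(x, y)
    slopes.append(b1)
    ses.append(se)

mean_slope = statistics.mean(slopes)   # should sit near the true value 2
true_sd = statistics.stdev(slopes)     # actual sampling variability
mean_se = statistics.mean(ses)         # what OLS reports on average

print("mean slope:", round(mean_slope, 3))
print("SD of slopes across samples:", round(true_sd, 3))
print("mean conventional S.E.:", round(mean_se, 3))
```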

Setting Up GLS: Example for T=5
Assume first-order serial correlation: $u_t = \rho u_{t-1} + \nu_t$, where $\rho \neq 0$.
Then if we know $\rho$ (which we don't), $\hat{\beta}_{GLS} = [X' \Omega^{-1} X]^{-1} X' \Omega^{-1} y$ (Aitken 1922).
Here, $\Omega$ is the matrix of the form:
$\Omega = \sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \rho^3 & \rho^4 \\ \rho & 1 & \rho & \rho^2 & \rho^3 \\ \rho^2 & \rho & 1 & \rho & \rho^2 \\ \rho^3 & \rho^2 & \rho & 1 & \rho \\ \rho^4 & \rho^3 & \rho^2 & \rho & 1 \end{pmatrix}$
Note the exponential decay moving across or up/down from the major diagonal.
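
As a quick illustration of this structure, the sketch below builds the matrix for T=5. The value of rho (0.5) and the function name ar1_omega are hypothetical choices for the example, and sigma squared is set to 1.

```python
def ar1_omega(rho, T, sigma2=1.0):
    """Omega for first-order serial correlation: sigma^2 * rho^|i - j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(T)] for i in range(T)]


omega = ar1_omega(rho=0.5, T=5)
for row in omega:
    print("  ".join(f"{v:.4f}" for v in row))
```

Each entry depends only on the distance from the major diagonal, which is the exponential decay noted on the slide.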

What If We Impose OLS Assumptions?
$\rho = 0.0$ at all lags (no autocorrelation).
Then $\Omega$ reduces to $\sigma^2 I$:
$\Omega = \sigma^2 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \sigma^2 I$
And since $I^{-1} = I$, $IX = X$, and $\sigma^2 / \sigma^2 = 1$, we have OLS: $\hat{\beta} = [X'X]^{-1} X'y$.
Thus OLS is the GLS estimator when $\rho = 0$ and $\sigma^2$ is constant.
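
The intermediate algebra can be written out step by step. This is a sketch of the substitution $\Omega = \sigma^2 I$ into the GLS formula, using only the identities listed on the slide.

```latex
\begin{align*}
\hat{\beta}_{GLS}
  &= [X' \Omega^{-1} X]^{-1} X' \Omega^{-1} y \\
  &= [X' (\sigma^2 I)^{-1} X]^{-1} X' (\sigma^2 I)^{-1} y \\
  &= [\sigma^{-2} X' X]^{-1} \sigma^{-2} X' y \\
  &= \frac{\sigma^2}{\sigma^2} [X' X]^{-1} X' y \\
  &= [X' X]^{-1} X' y = \hat{\beta}_{OLS}
\end{align*}
```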

Estimation of FGLS: Three Steps
1. Estimate OLS and extract residuals.
2. Estimate $\hat{\rho}$ = ACF(1).
3. Estimate FGLS using the estimated $\hat{\rho}$ to construct $\Omega$.
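
The three steps can be sketched in plain Python for a model with one predictor. Several things here are assumptions for illustration: the data are simulated, the helper names are made up, and instead of inverting $\Omega$ explicitly the code applies the Prais-Winsten transformation, which yields the same GLS estimates for first-order serial correlation.

```python
import random


def ols2(c0, c1, y):
    """OLS of y on two columns c0 and c1 (no automatic intercept)."""
    s00 = sum(a * a for a in c0)
    s01 = sum(a * b for a, b in zip(c0, c1))
    s11 = sum(b * b for b in c1)
    t0 = sum(a * v for a, v in zip(c0, y))
    t1 = sum(b * v for b, v in zip(c1, y))
    det = s00 * s11 - s01 ** 2
    return (s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det


def prais_winsten(col, rho):
    """Rescale the first value, quasi-difference the rest."""
    first = (1 - rho ** 2) ** 0.5 * col[0]
    return [first] + [col[t] - rho * col[t - 1] for t in range(1, len(col))]


# Simulated data: true intercept 1, slope 2, AR(1) errors with rho = 0.8.
rng = random.Random(7014)
n, true_rho = 300, 0.8
x = [rng.gauss(0, 1) for _ in range(n)]
u, prev = [], 0.0
for _ in range(n):
    prev = true_rho * prev + rng.gauss(0, 1)
    u.append(prev)
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]
ones = [1.0] * n

# Step 1: OLS and residuals.
b0_ols, b1_ols = ols2(ones, x, y)
resid = [yi - b0_ols - b1_ols * xi for xi, yi in zip(x, y)]

# Step 2: rho-hat as the lag-1 autocorrelation of the residuals.
rho_hat = (sum(resid[t] * resid[t - 1] for t in range(1, n))
           / sum(e * e for e in resid))

# Step 3: FGLS via OLS on the transformed constant, predictor, and outcome.
b0_fgls, b1_fgls = ols2(prais_winsten(ones, rho_hat),
                        prais_winsten(x, rho_hat),
                        prais_winsten(y, rho_hat))

print("rho-hat:", round(rho_hat, 3))
print("OLS slope:", round(b1_ols, 3), " FGLS slope:", round(b1_fgls, 3))
```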

An Iterative Alternative: Cochrane-Orcutt
- Designate $\hat{\rho}_k$ as the $\rho$ estimated after step k.
- Then $\hat{\rho}_1$ has the inefficiency properties of OLS.
- But $\hat{\rho}_1$ is a better estimate of $\rho$ than the 0.0 assumed by OLS.
- Thus $\hat{\rho}_2$, estimated after GLS step 1, should be superior to $\hat{\rho}_1$.
- More generally, the $\hat{\rho}_k$ estimated after any round should be superior to the estimate which produced it.
- This can be continued until $\hat{\rho}_{input} = \hat{\rho}_{output}$, which we declare to be the correct estimate of $\rho$.
- This is the Cochrane-Orcutt estimator (corc in Stata).
- In R: Compile Simon Jackman's program: http://ow.ly/ugzi
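
The iteration can be sketched as a loop that stops once the input and output values of rho-hat agree to within a tolerance. This is an illustrative plain-Python version for one predictor, not Jackman's program or Stata's implementation; the simulated data, helper names, tolerance, and iteration cap are all assumptions.

```python
import random


def ols2(c0, c1, y):
    """OLS of y on two columns c0 and c1 (no automatic intercept)."""
    s00 = sum(a * a for a in c0)
    s01 = sum(a * b for a, b in zip(c0, c1))
    s11 = sum(b * b for b in c1)
    t0 = sum(a * v for a, v in zip(c0, y))
    t1 = sum(b * v for b, v in zip(c1, y))
    det = s00 * s11 - s01 ** 2
    return (s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Iterate between estimating rho and re-estimating beta until rho settles."""
    n = len(y)
    b0, b1 = ols2([1.0] * n, x, y)      # start from OLS, i.e. rho = 0
    rho, rounds = 0.0, 0
    for rounds in range(1, max_iter + 1):
        # Residuals from the original (untransformed) equation.
        u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        # Regress u_t on u_{t-1} (no intercept) to update rho.
        rho_new = (sum(u[t] * u[t - 1] for t in range(1, n))
                   / sum(e * e for e in u[:-1]))
        # Quasi-difference the data; the first observation is dropped.
        ys = [y[t] - rho_new * y[t - 1] for t in range(1, n)]
        xs = [x[t] - rho_new * x[t - 1] for t in range(1, n)]
        cs = [1.0 - rho_new] * (n - 1)  # transformed constant column
        b0, b1 = ols2(cs, xs, ys)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return b0, b1, rho, rounds


# Simulated data: true intercept 1, slope 2, AR(1) errors with rho = 0.8.
rng = random.Random(7014)
n, true_rho = 300, 0.8
x = [rng.gauss(0, 1) for _ in range(n)]
u_true, prev = [], 0.0
for _ in range(n):
    prev = true_rho * prev + rng.gauss(0, 1)
    u_true.append(prev)
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u_true)]

b0, b1, rho_hat, rounds = cochrane_orcutt(x, y)
print("slope:", round(b1, 3), " rho-hat:", round(rho_hat, 3), " rounds:", rounds)
```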

Three FGLS Estimation Strategies
1. Single shot estimation
   - R: pggls function in plm library (for panel data)
   - Stata: prais depvar indvars, twostep
2. Iterative estimation
   - Cochrane-Orcutt
     - R: Jackman code
     - Stata: prais depvar indvars, corc
   - Hildreth-Lu
   - Prais-Winsten
     - Stata: prais depvar indvars
3. Maximum likelihood
   - R & Stata arima functions

Dynamic Specification
- GLS-like corrections for autocorrelation put emphasis on the error term at the cost of static specification.
- That is the wrong priority. Getting the causal specification right is much more important than tidying up the error term. For that we often need dynamics.
- One common solution is the Koyck distributed lag scheme: $y_t = \beta_1 + \beta_2 y_{t-1} + \beta_3 X_t + u_t$.
- Think about how this model with a lagged dependent variable works:
  - Suppose X increases by 1 unit at time 0. That means we expect y to increase by $\beta_3$ on average, ceteris paribus.
  - One time period later, suppose X returns to its original level without the one unit increase. We still expect y to be a bit different.
  - This is because $y_1$ is a function of $y_0$, but $y_0$ is expected to be $\beta_3$ larger. Thus, we expect $y_1$ to be $\beta_2 \beta_3$ higher, on average, ceteris paribus.
  - A second time period later, $y_2$ is expected to be $\beta_2^2 \beta_3$ higher, on average, ceteris paribus.
  - At k time periods later, $y_k$ is expected to be $\beta_2^k \beta_3$ higher, on average, ceteris paribus.
  - This spillover is called a dynamic effect.
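
The decay of a one-period shock can be traced numerically. The sketch below compares the expected path of y with and without a one-unit pulse in X at time 0, ignoring the disturbance. The coefficient values are hypothetical and chosen only for illustration.

```python
def expected_path(x_path, beta1, beta2, beta3, y_start):
    """Expected y under the Koyck model, with the disturbance set to zero."""
    y, prev = [], y_start
    for x in x_path:
        prev = beta1 + beta2 * prev + beta3 * x
        y.append(prev)
    return y


# Hypothetical coefficients.
beta1, beta2, beta3 = 1.0, 0.6, 2.0
y_start = beta1 / (1 - beta2)      # equilibrium level of y when X = 0

periods = 6
baseline = expected_path([0.0] * periods, beta1, beta2, beta3, y_start)
pulse = expected_path([1.0] + [0.0] * (periods - 1), beta1, beta2, beta3, y_start)

# Difference between the two paths is the dynamic effect at each lag k.
effects = [p - b for p, b in zip(pulse, baseline)]
for k, eff in enumerate(effects):
    print(f"k={k}: simulated {eff:.4f}   beta2^k * beta3 = {beta2 ** k * beta3:.4f}")
```

The simulated differences match $\beta_2^k \beta_3$ at every lag, so the effect shrinks geometrically as long as $|\beta_2| < 1$.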

Extra Credit Assignment
- Due November 18 at the start of class.
- Suggestions for how to improve Political Analysis Using R.
- Double-spaced, 12-point font, 1-inch margins.
- 1 point per page of commentary, with a maximum of 3 bonus points.

Looking Ahead: Research Papers
- Due December 2 at the start of class.
- Details in the syllabus.
- Each person's data analysis should be solo work.
- Demonstrate that you can use the tools from class. You'll always want to do these regression diagnostics, whether they are reported or not.
- Please include an appendix addressing the following:
  1. What is the most likely complaint a reviewer might raise about your model specification? How can you address this issue?
  2. Are your residuals autocorrelated? (For cross-sectional data, just say no. If a Durbin-Watson test is not applicable here, use one as part of your answer to the previous question.)
  3. Do your residuals have a homoscedastic variance?
  4. Are the residuals normally distributed?
  5. Are any data points influential?
  6. Optional: Is there multicollinearity in your predictors?
  7. Also, report all of your software code. (Not outputs, just code.)
- The model you present in the main text ideally will correct any problems you come across.

For Next Time
- Read Gujarati & Porter Chapter 13.
- Study some data on Bush's job approval ratings: http://monogan.myweb.uga.edu/teaching/ts/bushjob.dta
- Notes: (1) These are Stata data. (2) It may take some work to time set and lag the variables.
- Model Bush's approval rating (approve) as a function of September 11 (s11) and the onset of the Iraq war (iraq). Report the following:
  - A plot of Bush's approval rating against months in office (t).
  - An OLS model where approval is only a function of the two inputs, along with a Durbin-Watson test for autocorrelation.
  - A Cochrane-Orcutt FGLS model where approval is only a function of the two inputs.
  - An OLS model where approval is a function of the two inputs AND lagged approval, along with a Breusch-Godfrey test for autocorrelation.
- Which of these three models do you trust the most? Why?