VII. Serially Correlated Residuals/ Autocorrelation


Sarah Martin
5 years ago

A. Graphical Detection of Serially Correlated (Autocorrelated) Residuals

In multiple regression, we typically assume

$\varepsilon \sim N(0, I\sigma^2)$

and

$\operatorname{cov}(\varepsilon_i, \varepsilon_j) = 0, \quad i \neq j$

When the second assumption is not met, we have Serially Correlated Residuals or Autocorrelation. When autocorrelation is present:

- The estimated regression coefficients are still unbiased but may no longer have the minimum variance among all unbiased estimates (they are inefficient).
- The Mean Squared Error ($s^2$) tends to underestimate $\sigma^2$ (often by a great deal). This leads directly to:
  - underestimation of $s_{b_j}$, which in turn results in confidence intervals that are too narrow and overstated significance for hypothesis tests that a regression parameter equals zero
  - overestimation of the test statistic for the F test (and so of significance), with correspondingly undersized confidence regions, for tests of hypotheses about combinations of parameters

Note that the existence of autocorrelation suggests that residuals are related in one of two manners:

- chronologically, or
- in some other logical order suggested by practical circumstances

This implies that residuals may be related across multiple observations. We refer to the correlation of residuals across s observations as $\rho_s$, and call this a lag-s serial correlation.

- If $\rho_s$ is positive, residuals tend to have the same sign as their lag-s counterpart (for lag-1 this is commonly called attraction).
- If $\rho_s$ is negative, residuals tend to have the opposite sign of their lag-s counterpart (for lag-1 this is commonly called repulsion).

Note that if autocorrelation is present, we would expect a plot of residuals in chronological order (or some other logical order suggested by practical circumstances) to have some distinct pattern, depending on the nature of the relationship between the residuals.

Let's look at some examples of chronological (or ordered) residual plots and corresponding plots of residuals against their lag-s values.
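The lag-s serial correlation described above can be estimated directly from the ordered residuals. The following is a minimal Python sketch, not part of the original slides: the function name `lag_correlation` and the toy residual series are illustrative assumptions. It computes the ordinary Pearson correlation between each residual and its lag-s counterpart over the overlapping pairs.

```python
def lag_correlation(residuals, s):
    """Pearson correlation between e_u and e_(u-s) over the n - s overlapping pairs."""
    if s < 1 or s > len(residuals) - 2:
        raise ValueError("need at least two overlapping pairs")
    current = residuals[s:]    # e_u for u = s+1, ..., n
    lagged = residuals[:-s]    # e_(u-s) for the same u
    m = len(current)
    mean_c = sum(current) / m
    mean_l = sum(lagged) / m
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged))
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_l = sum((l - mean_l) ** 2 for l in lagged)
    return cov / (var_c * var_l) ** 0.5


# Hypothetical residuals that flip sign every period.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(lag_correlation(alternating, 1))  # strongly negative: lag-1 "repulsion"
print(lag_correlation(alternating, 2))  # strongly positive at lag 2
```

A series that alternates in sign shows negative lag-1 correlation and positive lag-2 correlation, which is why it helps to look at more than one lag when theory suggests it.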

We have already used a chronological or ordered plot of residuals to aid in detection of serial correlation. Here we appear to have a first-order positive autocorrelation:

[Figure: Chronological Plot of Residuals (residuals vs. Time)]

The pattern is even more evident when we plot the residual against its one-period lag:

[Figure: First Order Positive Serial Correlation (residual vs. its lag-1 value)]

This chronological or ordered plot of residuals suggests that we have a first-order negative autocorrelation:

[Figure: Chronological Plot of Residuals (residuals vs. Time)]

Again, the pattern is even more evident when we plot the residual against its one-period lag:

[Figure: First Order Negative Serial Correlation (residual vs. its lag-1 value)]

What does this chronological (or ordered) plot of residuals suggest?

[Figure: Chronological Plot of Residuals (residuals vs. Time)]

It is difficult to discern patterns beyond lag-1 by using the chronological (or ordered) plot of residuals. Let's look at the plot of residuals against their two-period lag:

[Figure: Second Order Negative Serial Correlation (residual vs. its lag-2 value)]

What does this chronological (or ordered) plot of residuals suggest?

[Figure: Chronological Plot of Residuals (residuals vs. Time)]

Again, let's look at the plot of residuals against their two-period lag:

[Figure: Second Order Positive Serial Correlation (residual vs. its lag-2 value)]

Of course, certain orders appear with greater frequency in time series data. We often see

- 1st-order autocorrelation
- 2nd-order autocorrelation
- 4th-order autocorrelation
- 7th-order autocorrelation
- 12th-order autocorrelation

As always, the analyst should have a theoretical reason to suspect a certain order of autocorrelation (if we just look through the data at every possible order, we are bound to find something that is actually spurious).

SAS Procedures that could produce such plots:

    /* Fit the regression; save predicted values and residuals */
    PROC REG DATA=Salary;
      MODEL income=age yrseduc;
      OUTPUT OUT=regdata PREDICTED=yhat RESIDUAL=error;

    /* Add one- and two-period lags of the residuals */
    DATA REGDATA;
      SET REGDATA;
      lag1res=lag1(error);
      lag2res=lag2(error);

    /* Residuals vs. order (num is taken to be the observation-order variable) */
    PROC PLOT;
      PLOT error*num;

    /* Residuals vs. their lag-1 and lag-2 values */
    PROC PLOT;
      PLOT error*lag1res;
    PROC PLOT;
      PLOT error*lag2res;

    /* Correlation of the residuals with each lag */
    PROC CORR;
      VAR error;
      WITH lag1res lag2res;
    RUN;

The SAS PROC PLOT output for the plot of residuals vs. order (time) looks like this:

[Figure: SAS PROC PLOT output (Residual vs. Order of Data)]

A more aesthetically appealing graphical presentation can be produced after the residuals are imported into Excel:

[Figure: Chronological Plot of Residuals (residuals vs. Time)]

The SAS PROC PLOT output for the plot of residuals vs. lag-1 residuals looks like this:

[Figure: SAS PROC PLOT output (Residual vs. lag1res)]

Again, the plot of residuals vs. lag-1 residuals produced using Excel is more aesthetically appealing:

[Figure: First Order Serial Correlation (residual vs. its lag-1 value)]

The SAS PROC PLOT output for the plot of residuals vs. lag-2 residuals looks like this:

[Figure: SAS PROC PLOT output (Residual vs. lag2res)]

Again for improved clarity, the plot of residuals vs. lag-2 residuals produced using Excel is provided:

[Figure: Second Order Serial Correlation (residual vs. its lag-2 value)]

B. Testing for Autocorrelation: The Durbin-Watson Test

If we wish to fit a postulated linear model

$Y_u = \beta_0 + \sum_{i=1}^{k} \beta_i X_{iu} + \varepsilon_u$

by least squares to observations $Y_u, X_{1u}, X_{2u}, \ldots, X_{ku}$, $u = 1, \ldots, n$, we would assume

$\varepsilon \sim N(0, \sigma^2)$

and

$\operatorname{cov}(\varepsilon_i, \varepsilon_j) = 0, \quad i \neq j$

which is equivalent to saying $\rho_s = 0$.

Given the potentially serious consequences associated with autocorrelation, testing is critical when theory suggests its potential existence. The Durbin-Watson test (Von Neumann, 1941; Durbin & Watson, 1951) can be used to test the hypothesis

$H_0: \rho_s = 0$
$H_1: \rho_s = \rho(s)$

where $\rho(s)$ is any $\rho_s$ such that $0 < \rho_s < 1$.

Note that the alternate hypothesis arises from the assumption that the errors are such that

$\varepsilon_u = \rho\,\varepsilon_{u-1} + z_u, \quad z_u \sim N(0, \sigma^2)$

with $\operatorname{cov}(z_u, z_{u'}) = 0$ for $u \neq u'$ and $\operatorname{cov}(\varepsilon_{u'}, z_u) = 0$ for $u' < u$. This leads to the conclusion that

$\varepsilon_u \sim N\!\left(0, \dfrac{\sigma^2}{1-\rho^2}\right)$
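A short derivation of that variance result may help. It is a sketch under stated assumptions rather than part of the original slides: the errors follow the first-order model $\varepsilon_u = \rho\,\varepsilon_{u-1} + z_u$ with $|\rho| < 1$, the error process is stationary (so every $\varepsilon_u$ has the same variance $\sigma_\varepsilon^2$), and $z_u$ is uncorrelated with all earlier errors.

```latex
\begin{aligned}
\operatorname{Var}(\varepsilon_u) &= \rho^2 \operatorname{Var}(\varepsilon_{u-1}) + \operatorname{Var}(z_u) \\
\sigma_\varepsilon^2 &= \rho^2 \sigma_\varepsilon^2 + \sigma^2 \\
\sigma_\varepsilon^2 &= \frac{\sigma^2}{1-\rho^2} \\
\operatorname{cov}(\varepsilon_u, \varepsilon_{u-s}) &= \rho^{s}\,\sigma_\varepsilon^2
\quad\Longrightarrow\quad \rho_s = \rho^{s}
\end{aligned}
```

Under this model the lag-s correlation decays geometrically in s, so a test aimed at the lag-1 correlation addresses the whole first-order structure.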

The Durbin-Watson test statistic is

$d = \dfrac{\sum_{u=2}^{n} (e_u - e_{u-1})^2}{\sum_{u=1}^{n} e_u^2}$

to detect first-order autocorrelation (a relationship between the residual and its one-period lag). It can be shown that:

- the distribution of the Durbin-Watson test statistic depends on the regressor data
- the distribution of the Durbin-Watson test statistic is symmetric about 2.00 and ranges from 0 to 4 (although extreme values are only possible for very large samples)
- positive serial correlation results in a d near 0
- negative serial correlation results in a d near 4

Steps in the Two-Tailed Durbin-Watson Test

1. When d < 2 we test d itself, while if d > 2 we test 4 − d.
2. Compare the appropriate result (d or 4 − d) to values of $d_L$ and $d_U$ (from appropriate tables):
   - If the appropriate result (d or 4 − d) < $d_L$, reject $H_0$ at the 2α level of significance, and conclude that serial correlation may exist (positive if we are using d, negative if we are using 4 − d).
   - If the appropriate result (d or 4 − d) > $d_U$, do not reject $H_0$ at the 2α level of significance, and conclude that serial correlation probably doesn't exist.
   - If $d_L \le$ (d or 4 − d) $\le d_U$, the results are indeterminate at the 2α level of significance.
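The statistic is simple to compute from the ordered residuals. The following Python sketch is an illustration only (the slides use SAS); the function name `durbin_watson` and the two toy residual series are assumptions made up for the example.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[u] - residuals[u - 1]) ** 2 for u in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: neighbours alike (positive lag-1 correlation) ...
smooth = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
# ... and neighbours opposite in sign (negative lag-1 correlation).
jagged = [3.0, -3.0, 2.0, -2.0, 1.0, -1.0]

print(durbin_watson(smooth))  # well below 2
print(durbin_watson(jagged))  # well above 2
```

When neighbouring residuals are alike, the successive differences are small and d falls toward 0; when they alternate in sign, the differences are large and d rises toward 4.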

Steps in the Lower-Tailed Durbin-Watson Test

1. Test 4 − d.
2. Compare 4 − d to values of $d_L$ and $d_U$ (from appropriate tables):
   - If 4 − d < $d_L$, reject $H_0$ at the α level of significance, and conclude that negative serial correlation may exist.
   - If 4 − d > $d_U$, do not reject $H_0$ at the α level of significance, and conclude that negative serial correlation probably doesn't exist.
   - If $d_L \le 4 - d \le d_U$, the results are indeterminate at the α level of significance.

Steps in the Upper-Tailed Durbin-Watson Test

1. Test d.
2. Compare d to values of $d_L$ and $d_U$ (from appropriate tables):
   - If d < $d_L$, reject $H_0$ at the α level of significance, and conclude that positive serial correlation may exist.
   - If d > $d_U$, do not reject $H_0$ at the α level of significance, and conclude that positive serial correlation probably doesn't exist.
   - If $d_L \le d \le d_U$, the results are indeterminate at the α level of significance.
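The three decision procedures can be collected into one small function. This is a hedged Python sketch, not the slides' method of computation: the function name `dw_decision`, its return format, and the bounds 1.2 and 1.4 used in the usage lines are illustrative placeholders. Real values of $d_L$ and $d_U$ must come from Durbin-Watson tables for the given n, number of regressors, and α.

```python
def dw_decision(d, d_lower, d_upper, tail="two"):
    """Apply the Durbin-Watson decision rule; returns (decision, direction tested)."""
    if tail == "upper":        # tests d itself: positive serial correlation
        stat, direction = d, "positive"
    elif tail == "lower":      # tests 4 - d: negative serial correlation
        stat, direction = 4 - d, "negative"
    elif tail == "two":        # tests whichever of d and 4 - d is below 2
        stat, direction = (d, "positive") if d < 2 else (4 - d, "negative")
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")

    if stat < d_lower:
        return "reject", direction
    if stat > d_upper:
        return "do not reject", direction
    return "indeterminate", direction


# Placeholder bounds for illustration only (not taken from any table).
print(dw_decision(0.89, 1.2, 1.4, tail="upper"))
print(dw_decision(3.20, 1.2, 1.4, tail="two"))
print(dw_decision(1.30, 1.2, 1.4, tail="upper"))
```

Remember that the two-tailed version operates at the 2α level when the tabled bounds are for level α.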

Note that

- in each case (two-tailed, lower-tailed, and upper-tailed tests) some researchers combine the reject and indeterminate regions into a larger reject region
- the indeterminate region narrows rapidly as the sample size is increased
- some researchers argue that inclusion of a lag of the response variable as a regressor renders the Durbin-Watson test ineffective (they argue that such a model biases the DW test toward nonrejection). Rayner (1994) argues that the DW test is still superior to other approaches under such conditions.

C. Testing for Autocorrelation: The Runs Test

This is a quick nonparametric approximation of the Durbin-Watson test. The Runs test considers the patterns in the signs of the residuals in chronological (or other) order. For the Runs test we have

- $n_1$ (or $n_+$) = number of positive residuals
- $n_2$ (or $n_-$) = number of negative residuals
- r = number of runs (the number of times the ordered sequence changes sign, plus 1)

For example, if we had the following ordered signs:

[sign sequence not recoverable]

we would have $n_1 = 14$, $n_2 = 12$, and r = 11.
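Counting the three quantities is mechanical. The following is a minimal Python sketch with a made-up residual series; the function name `count_runs` is hypothetical, and dropping residuals that are exactly zero is an assumption of this sketch, since the slides do not say how to treat them.

```python
def count_runs(residuals):
    """Return (n_pos, n_neg, r) for the ordered residuals, ignoring exact zeros."""
    signs = [e > 0 for e in residuals if e != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if not signs:
        return 0, 0, 0
    # A new run starts every time the sign changes, plus one for the first run.
    sign_changes = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return n_pos, n_neg, sign_changes + 1


# Hypothetical ordered residuals with signs + + - - + - -
example = [2.0, 1.0, -1.0, -3.0, 4.0, -2.0, -1.0]
print(count_runs(example))  # (3, 4, 4)
```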

For extremely small problems we would enumerate all the possible orders in which $n_1$ positives and $n_2$ negatives could be arranged, i.e.,

$\binom{n_1+n_2}{n_1}$ or $\binom{n_1+n_2}{n_2}$,

then find the relative frequency (in this case, the probability) that r or fewer runs would occur.

For example, suppose there are five observations or residuals, two of which are positive. The only possible orderings are:

Arrangement    # of Runs
+ + - - -      2
+ - + - -      4
+ - - + -      4
+ - - - +      3
- + + - -      3
- + - + -      5
- + - - +      4
- - + + -      3
- - + - +      4
- - - + +      2

so

$\binom{n_1+n_2}{n_1} = \dfrac{5!}{2!\,(5-2)!} = 10$

If our data actually have only two runs, the probability of this occurring randomly is 2/10 = 0.20.
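For small samples the enumeration can be done by brute force. The Python sketch below is illustrative only: the function name `exact_runs_cdf` is hypothetical, and the approach (listing every arrangement) is practical only when $n_1 + n_2$ is small.

```python
from itertools import combinations


def exact_runs_cdf(n_pos, n_neg, r):
    """P(number of runs <= r) when all arrangements of the signs are equally likely."""
    n = n_pos + n_neg
    total = 0
    hits = 0
    # Each arrangement is determined by which positions hold the positive signs.
    for positive_positions in combinations(range(n), n_pos):
        chosen = set(positive_positions)
        signs = [i in chosen for i in range(n)]
        runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
        total += 1
        if runs <= r:
            hits += 1
    return hits / total


# Five residuals, two of them positive: two of the ten arrangements have 2 runs.
print(exact_runs_cdf(2, 3, 2))  # 0.2
```

A small probability of seeing so few runs is evidence of positive first-order serial correlation.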

Think about this: what is the maximum number of runs in a data set of size $n_1 + n_2 = n$?

max(r) = $2\min(n_1, n_2)$ when $n_1 = n_2$, and $2\min(n_1, n_2) + 1$ when $n_1 \neq n_2$. Why?

What type of autocorrelation does a large number of runs suggest?

[Figure: First Order Negative Serial Correlation (residual vs. its lag-1 value)]

Why does a large number of runs suggest negative autocorrelation?

Now think about this: what is the minimum number of runs in a data set of size $n_1 + n_2 = n$?

min(r) = 2 (when both signs occur). Why?

What type of autocorrelation does a small number of runs suggest?

[Figure: First Order Positive Serial Correlation (residual vs. its lag-1 value)]

Why does a small number of runs suggest positive autocorrelation?
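The bounds on r can be checked by brute force for small counts. This Python sketch is an illustration under the assumption that both signs occur; the function name `run_range` is hypothetical. Note that when the two counts differ, the more numerous sign can both open and close the sequence, which allows one more run than twice the smaller count.

```python
from itertools import combinations


def run_range(n_pos, n_neg):
    """Smallest and largest number of runs over all arrangements of the signs."""
    n = n_pos + n_neg
    counts = []
    for positive_positions in combinations(range(n), n_pos):
        chosen = set(positive_positions)
        signs = [i in chosen for i in range(n)]
        counts.append(1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b))
    return min(counts), max(counts)


print(run_range(3, 3))  # (2, 6): equal counts alternate perfectly
print(run_range(2, 3))  # (2, 5): unequal counts allow one extra run
```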

Obviously, there are issues with using the Runs test:

- it ignores much information on the correlation by considering only the signs of residuals
- it lacks power for small samples
- it is difficult to do exactly for large samples

However, the second and third issues can be addressed:

- there are tables that provide critical values of r for the Runs test (the tables in our book are provided for $3 \le n_1, n_2 \le \ldots$)
- there is a normal approximation of the Runs test for larger $n_1, n_2$ (at least 10 each in the example that follows)

Let

$\mu = \dfrac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma^2 = \dfrac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}$

Now we have (approximately)

$z = \dfrac{r - \mu + \tfrac{1}{2}}{\sigma}$

for a lower-tailed test (a test for positive first-order serial correlation), and (approximately)

$z = \dfrac{r - \mu - \tfrac{1}{2}}{\sigma}$

for an upper-tailed test (a test for negative first-order serial correlation), and (approximately)

$z = \dfrac{r - \mu}{\sigma}$

for a two-tailed test (a test for any first-order serial correlation). The $\tfrac{1}{2}$ terms in the one-tailed statistics are a continuity correction.
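The normal approximation can be sketched in a few lines of Python. This is an illustration rather than the slides' own computation: the names `normal_cdf` and `runs_test_z` are hypothetical, the p-values use the standard normal distribution via `math.erf`, and the continuity correction is applied only to the one-tailed statistics, as in the formulas above.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def runs_test_z(n_pos, n_neg, runs, tail="two"):
    """Normal approximation to the Runs test; returns (mu, sigma, z, p_value)."""
    n = n_pos + n_neg
    mu = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)) / (n ** 2 * (n - 1))
    sigma = sqrt(variance)
    if tail == "lower":      # too few runs: positive serial correlation
        z = (runs - mu + 0.5) / sigma
        p_value = normal_cdf(z)
    elif tail == "upper":    # too many runs: negative serial correlation
        z = (runs - mu - 0.5) / sigma
        p_value = 1.0 - normal_cdf(z)
    elif tail == "two":
        z = (runs - mu) / sigma
        p_value = 2.0 * normal_cdf(-abs(z))
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return mu, sigma, z, p_value


# Counts used in the worked example: n1 = 10, n2 = 12, r = 5.
mu, sigma, z, p_value = runs_test_z(10, 12, 5, tail="two")
print(round(mu, 3), round(sigma, 3), round(z, 3), round(p_value, 4))
```

With far fewer runs than expected, z is strongly negative and the two-tailed p-value is small.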

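Example: Suppose we have the following ordered residuals (which yield a Durbin-Watson statistic of d = 0.89)

$\tilde{e}$ = [residual values not recoverable]

and we hypothesize no first-order serial correlation. Our null and alternate hypotheses are:

$H_0: \rho_s = 0$
$H_1: \rho_s = \rho(s)$

and we have n = 22, $n_1 = 10$ (≥ 10), $n_2 = 12$ (≥ 10), and r = 5, so

$\mu = \dfrac{2 n_1 n_2}{n_1 + n_2} + 1 = \dfrac{2(10)(12)}{10 + 12} + 1 = \dfrac{120}{11} + 1 \approx 11.909$

and

$\sigma^2 = \dfrac{2 n_1 n_2 \,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 \,(n_1 + n_2 - 1)} = \dfrac{2(10)(12)\,[2(10)(12) - 10 - 12]}{(10 + 12)^2 \,(10 + 12 - 1)} = \dfrac{52320}{10164} \approx 5.148$, so $\sigma \approx 2.269$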

So the calculated value of our test statistic for our two-tailed test is:

$z = \dfrac{r - \mu}{\sigma} = \dfrac{5 - 11.909}{2.269} \approx -3.045$

which has a p-value of approximately 0.0023, so we reject $H_0$ at any reasonable level of significance and conclude that serial correlation does exist.

Notice that this pattern of ordered residuals (which yield a Durbin-Watson statistic of d = 1.55)

$\tilde{e}$ = [residual values not recoverable]

yields exactly the same results for the Runs test (even though the serial correlation is obviously much weaker)!

SOME questions you should be able to answer:

1. What is autocorrelation (serial correlation)? What are the ramifications of autocorrelation in a regression model? How do you test for/assess the presence of autocorrelation? How do you correct for the presence of autocorrelation? How do you use SAS to test for autocorrelation?
2. What is the purpose of the Durbin-Watson test? Explain how the Durbin-Watson test statistic works (i.e., explain the interpretations of the Durbin-Watson statistic over its range and why the equation effectively performs its specific function). How do you use SAS to perform the Durbin-Watson test?
3. What is the Runs test? How is it used to test for/assess the presence of autocorrelation? What are the strength(s) and weakness(es) of this test for autocorrelation?
