Problem Set: Box-Jenkins methodology

1) For an AR(1) process we have:

   γ(0) = σ²_ε / (1 − φ²)  ⟹  σ²_ε / γ(0) = 1 − φ²

Hence,

   plim R² = 1 − σ²_ε / γ(0) = φ²

For a MA(1) process,

   γ(0) = (1 + θ²) σ²_ε  ⟹  σ²_ε / γ(0) = 1 / (1 + θ²)

Therefore,

   plim R² = 1 − 1/(1 + θ²) = θ²/(1 + θ²)

We can see that a consistent estimator of the R² would "approach" different values depending on the true parameters of the process, and not on the goodness of fit of the estimation procedure. So R² is no longer an appropriate goodness-of-fit measure. Instead, information criteria and the principle of parsimony should be heeded.

2) To identify the series you should plot the ACF and the PACF. In EViews choose Quick, then Series Statistics, and then Correlogram. When you get there, specify the series you would like to plot, and then select the number of lags you would like to include (default = 36). When you plot the correlogram for the series Return you are given the following results:
Correlogram of Return, 24 lags (EViews output, condensed): AC(1) = PAC(1) = 0.147; the remaining autocorrelations and partial autocorrelations are all small in absolute value, yet the Q-statistic has a p-value of 0.000 at every lag.

We should recall that:
- An AR(p) process has a declining ACF and a PACF that is zero for lags greater than p.
- A MA(q) process has an ACF that is zero for lags greater than q and a PACF that declines exponentially.

Looking at the table, we can see that the cumulative correlation, as described by the Q-statistic, is significant at every lag, and therefore we reject the null hypothesis ρ(k) = 0 at each lag. As a result, the series does not seem to be a White Noise process. For example, the Q-statistic for ρ(1) is significant at the 95% level, which implies that the null hypothesis ρ(1) = 0 is rejected.

We will attempt to identify the series following a "from specific to general" approach. Since the estimate of ρ(1) is so small, it is very difficult (looking at the above table) to tell which type of process, MA or AR, best characterizes the series under scrutiny. We first try an AR(1) structure. In EViews go to Quick, Estimate Equation, and then type the dependent and independent variables that you will include in the regression:
Returns C AR(1). We obtain the following results:

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.000542      0.000272     1.992515      0.0463
AR(1)       0.147150      0.032482     4.530161      0.0000

R-squared           0.021659    Mean dependent var      0.000543
Adjusted R-squared  0.020604    S.D. dependent var      0.007132
S.E. of regression  0.007058    Akaike info criterion  -7.067102
Sum squared resid   0.046181    Schwarz criterion      -7.056713
Log likelihood      3284.677    F-statistic            20.5236
Durbin-Watson stat  1.989755    Prob(F-statistic)       0.000007
Inverted AR Roots   .15

The series Return seems to be an AR(1), and it appears to be stationary because the AR root 1/0.15 > 1 lies outside the unit circle (equivalently, the inverted root .15 is smaller than one in absolute value). Nevertheless, we can only be pleased with our representation of the series if the residuals are clean. Therefore we need to check the correlogram of the residuals to see whether they are a White Noise process. If they are not, we continue augmenting the model.
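The Q-statistic used throughout these correlograms can be reproduced by hand. A minimal Python sketch of the Ljung-Box Q; the simulated series below are illustrative, not the Return data:

```python
import random

def acf(x, nlags):
    """Sample autocorrelations r_1 .. r_nlags (the AC column of a correlogram)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    denom = sum(v * v for v in d)
    return [sum(d[t] * d[t - k] for t in range(k, n)) / denom
            for k in range(1, nlags + 1)]

def ljung_box_q(x, nlags):
    """Ljung-Box statistic Q = T(T+2) * sum_k r_k^2 / (T-k); under the null of
    White Noise it is approximately chi-squared with nlags degrees of freedom."""
    n = len(x)
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(acf(x, nlags), start=1))

# White Noise should give a small Q; an autocorrelated series a large one
random.seed(0)
eps = [random.gauss(0, 1) for _ in range(500)]
y, prev = [], 0.0
for e in eps:
    prev = 0.9 * prev + e   # strongly autocorrelated AR(1) with made-up phi = 0.9
    y.append(prev)
print(ljung_box_q(eps, 24) < ljung_box_q(y, 24))  # → True
```

Comparing Q against the chi-squared critical value with `nlags` degrees of freedom reproduces the accept/reject decision reported in the Prob. column.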
Correlogram of the AR(1) residuals, 24 lags (EViews output, condensed): the autocorrelations are all small and the Q-statistic p-values are well above 5% at the first few lags (e.g. 0.356 at lag 2), but from lag 6 they fall to roughly the 5% level or below.

We can see that from the 6th lag onwards the p-values associated with the Q-statistic are smaller than the 5% critical level, and therefore the Q-statistic indicates that lag 6 is significant. This test shows that at the sixth lag there is either an AR or a MA component. We estimate the following equation:

   y_t = c + φ₁ y_{t−1} + φ₆ y_{t−6} + ε_t

To estimate this model we go in the EViews menu to Quick, Estimate Equation and type: Return C AR(1) AR(6), obtaining the following results:
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.000541      0.000246     2.20030       0.0280
AR(1)       0.149820      0.032386     4.625564      0.0000
AR(6)      -0.087959      0.032345    -2.719380      0.0067

R-squared           0.029410    Mean dependent var      0.000543
Adjusted R-squared  0.027314    S.D. dependent var      0.007132
S.E. of regression  0.007034    Akaike info criterion  -7.072913
Sum squared resid   0.045815    Schwarz criterion      -7.057311
Log likelihood      3288.372    F-statistic            14.0948
Durbin-Watson stat  2.000632    Prob(F-statistic)       0.000001
Inverted AR Roots   .60±.33i    .02±.66i   -.55±.33i

The t-statistics for the AR(1) and AR(6) lags are both significant at the 5% level. Looking at the correlogram of the residuals of this regression, we find that we cannot reject the null hypothesis that the residuals are a White Noise process. We can therefore stop the identification procedure and conclude that the above model characterizes the data correctly.
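EViews fits the AR terms internally; as a rough stand-in, the same AR(1), AR(6) structure can be fitted by conditional least squares on the lagged series. A Python sketch on simulated data with hypothetical coefficients (0.15 and −0.09 are made up, not the Return estimates):

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ar(y, lags):
    """Conditional least squares for y_t = c + sum_j phi_j * y_{t-lag_j} + e_t.
    Returns [c, phi_lag1, phi_lag2, ...] in the order of `lags`."""
    p = max(lags)
    rows = [[1.0] + [y[t - L] for L in lags] for t in range(p, len(y))]
    target = y[p:]
    k = 1 + len(lags)
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    return solve(XtX, Xty)

# simulate a series with an AR(1) and an AR(6) term (hypothetical coefficients)
random.seed(1)
y = [0.0] * 6
for _ in range(3000):
    y.append(0.15 * y[-1] - 0.09 * y[-6] + random.gauss(0, 1))

c, phi1, phi6 = fit_ar(y, [1, 6])
print(round(phi1, 2), round(phi6, 2))  # should land near 0.15 and -0.09
```

This omits the refinements of EViews' estimator (e.g. backcasting), but it recovers the lag structure that the correlogram pointed to.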
Correlogram of the residuals from the AR(1), AR(6) model, 24 lags (EViews output, condensed): the autocorrelations are all small and every Q-statistic p-value lies above the 5% level, so the residuals look like White Noise.

Nevertheless, since we were not sure about the nature of the sixth lag (it seemed that it could be either an AR(6) or a MA(6) term), we will also estimate the following model:

   y_t = c + φ₁ y_{t−1} + θ₆ ε_{t−6} + ε_t
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           0.000539      0.000250     2.155962      0.0314
AR(1)       0.145798      0.032547     4.479613      0.0000
MA(6)      -0.075680      0.032824    -2.305661      0.0213

R-squared           0.027516    Mean dependent var      0.000543
Adjusted R-squared  0.025416    S.D. dependent var      0.007132
S.E. of regression  0.007041    Akaike info criterion  -7.070972
Sum squared resid   0.045904    Schwarz criterion      -7.055361
Log likelihood      3287.467    F-statistic            13.1005
Durbin-Watson stat  1.989908    Prob(F-statistic)       0.000002
Inverted AR Roots   .15
Inverted MA Roots   .65    .33±.56i   -.33±.56i   -.65

The residuals of the above regression also seem to be a White Noise process, and therefore we can conclude that this model also fits the data correctly. We are therefore faced with a choice. Since both models seem to fit the series, we choose the model with the lower value of the relevant selection criteria. We consider the Akaike info criterion (AIC) and the Schwarz criterion (SCH). These selection criteria favour the AR(1), AR(6) model for the series Return.

3) In this exercise we will derive theoretical forecasts of the following stationary AR(1) process:

   y_t = c + φ y_{t−1} + ε_t

We will not, as in an econometrics course, compute the predictor of the random variable (an estimator used to provide information about future or otherwise unavailable values). So, when we talk about the forecast error we will not be talking about the prediction error covered in your previous econometrics course: there will be no sampling error involved, since our analysis is theoretical and no estimation is undertaken or considered.

i. You have learnt that the minimum mean squared error forecast is the conditional mean. Imagine you have a sample of size T and consider the problem of obtaining the one-step-ahead forecast:

   y_T(1) = E_T(y_{T+1}) = E_T(c + φ y_T + ε_{T+1}) = c + φ y_T
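The one-step rule above can be iterated to any horizon, and the iteration can be checked against the closed form obtained by summing the resulting geometric series. A small Python sketch with illustrative parameter values (c = 0.5, φ = 0.8, y_T = 3.0 are made up):

```python
def ar1_forecast(c, phi, y_T, k):
    """Closed-form k-step-ahead conditional mean of y_t = c + phi*y_{t-1} + eps_t:
    y_T(k) = (1 - phi**k) * mu + phi**k * y_T, with mu = c / (1 - phi)."""
    mu = c / (1.0 - phi)
    return (1.0 - phi**k) * mu + phi**k * y_T

def ar1_forecast_iter(c, phi, y_T, k):
    """The same forecast obtained by iterating y_T(k) = c + phi * y_T(k-1)."""
    f = y_T
    for _ in range(k):
        f = c + phi * f
    return f

# illustrative values: mu = 0.5 / (1 - 0.8) = 2.5, last observation y_T = 3.0
print(round(ar1_forecast(0.5, 0.8, 3.0, 4), 4))       # → 2.7048
print(round(ar1_forecast_iter(0.5, 0.8, 3.0, 4), 4))  # → 2.7048
```

For large k both versions settle at the unconditional mean 2.5, which is the long-horizon behaviour derived in part ii.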
Now consider the two-step-ahead forecast:

   y_T(2) = E_T(y_{T+2}) = E_T[y_{T+1}(1)] = E_T[c + φ y_{T+1}] = c + φ y_T(1) = c(1 + φ) + φ² y_T

In general, the k-step-ahead forecast satisfies:

   y_T(k) = c + φ y_T(k−1)

Solving recursively back to k = 1 we get:

   y_T(k) = c(1 + φ + φ² + ... + φ^{k−1}) + φ^k y_T(0) = c(1 + φ + φ² + ... + φ^{k−1}) + φ^k y_T

Under stationarity this can be written as:

   y_T(k) = c (1 − φ^k)/(1 − φ) + φ^k y_T = (1 − φ^k) μ + φ^k y_T,   with μ = c/(1 − φ)

We find that the optimal k-step-ahead forecast (the conditional mean) is a convex combination of the last available observation and the unconditional mean.

ii. This is straightforward: simply take limits as k tends to infinity.

   lim_{k→∞} y_T(k) = μ lim_{k→∞} (1 − φ^k) + y_T lim_{k→∞} φ^k

   |φ| < 1  ⟹  lim_{k→∞} φ^k = 0  ⟹  lim_{k→∞} y_T(k) = μ

As the forecast horizon lengthens, the importance of the last available observation diminishes.

iii. Let e_T(k) be the k-step-ahead optimal forecast error.

   e_T(k) = y_{T+k} − y_T(k) = φ (y_{T+k−1} − y_T(k−1)) + ε_{T+k} = φ e_T(k−1) + ε_{T+k}
Solving recursively once again,

   e_T(k) = ε_{T+k} + φ ε_{T+k−1} + φ² ε_{T+k−2} + ... + φ^{k−1} ε_{T+1} = Σ_{i=0}^{k−1} φ^i ε_{T+k−i}

   E(e_T(k)) = 0

   σ²_{e_T}(k) = Σ_{i=0}^{k−1} φ^{2i} σ²_ε = σ²_ε (1 + φ² + φ⁴ + ... + φ^{2(k−1)}) = σ²_ε (1 − φ^{2k})/(1 − φ²) = (1 − φ^{2k}) γ(0)

iv. Once again, simply take limits, recalling the stationarity hypothesis:

   lim_{k→∞} σ²_{e_T}(k) = γ(0)

v. If the process is Gaussian, the forecast error is Gaussian too, as it is a linear combination of independent and identically distributed normal random variables. Hence, knowing its mean and variance allows us to identify its whole distribution:

   e_T(k) ~ N(0, (1 − φ^{2k}) γ(0))  ⟹  e_T(k) / √((1 − φ^{2k}) γ(0)) ~ N(0, 1)

Therefore,

   1 − α = P( −z_{1−α/2} ≤ e_T(k)/√((1 − φ^{2k}) γ(0)) ≤ z_{1−α/2} )
         = P( −z_{1−α/2} √((1 − φ^{2k}) γ(0)) ≤ e_T(k) ≤ z_{1−α/2} √((1 − φ^{2k}) γ(0)) )
         = P( y_T(k) − z_{1−α/2} √((1 − φ^{2k}) γ(0)) ≤ y_{T+k} ≤ y_T(k) + z_{1−α/2} √((1 − φ^{2k}) γ(0)) )
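Numerically, the half-width of this interval widens with the horizon and levels off at z√γ(0). A short Python sketch with illustrative values (φ = 0.8 and σ²_ε = 1 are made up):

```python
import math

def ar1_interval_halfwidth(phi, sigma2_eps, k, z=1.96):
    """Half-width z * sqrt((1 - phi**(2k)) * gamma0) of the k-step-ahead
    forecast interval for an AR(1), with gamma0 = sigma2_eps / (1 - phi**2)."""
    gamma0 = sigma2_eps / (1.0 - phi**2)
    return z * math.sqrt((1.0 - phi**(2 * k)) * gamma0)

# the interval widens with k and approaches z * sqrt(gamma(0))
for k in (1, 2, 5, 20):
    print(k, round(ar1_interval_halfwidth(0.8, 1.0, k), 3))
```

At k = 1 the half-width is just zσ_ε, since only the next shock is unpredictable; the printed values then climb toward the unconditional bound.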
Since we do not actually know the true values of the parameters, we should use consistent estimators. Then an asymptotically valid forecast interval is:

   P( ŷ_T(k) − z_{1−α/2} √((1 − φ̂^{2k}) γ̂(0)) ≤ y_{T+k} ≤ ŷ_T(k) + z_{1−α/2} √((1 − φ̂^{2k}) γ̂(0)) ) = 1 − α

4) Consider now the following invertible MA(1) process:

   y_t = μ + ε_t + θ ε_{t−1}

i. For our sample of size T:

   y_T(1) = E_T(y_{T+1}) = E_T(μ + ε_{T+1} + θ ε_T) = μ + θ ε_T

Now consider the two-step-ahead forecast:

   y_T(2) = E_T(y_{T+2}) = E_T[y_{T+1}(1)] = E_T[μ + θ ε_{T+1}] = μ

So we now have:

   y_T(k) = μ + θ ε_T   for k = 1
   y_T(k) = μ           for k ≥ 2

In general, the optimal k-step-ahead forecast returns abruptly to the mean as soon as we surpass the moving-average order (in this case, 1). Although the forecast function seems easy to compute, generally this is not the case: the information set does not, in general, contain the error term, so different approaches are employed to approximate it. If we have the whole history of y (an infinite number of observations) and the MA process is invertible, we can recover the error term from an AR(∞) representation:

   (1 + θL) ε_T = y_T − μ,  so  ε_T = (y_T − μ)(1 + θL)^{−1}
and hence

   ε_T = (y_T − μ) − θ(y_{T−1} − μ) + θ²(y_{T−2} − μ) − θ³(y_{T−3} − μ) + ...

If we have a finite data set, there are two approaches. On the one hand, we can approximate the optimal forecast under some assumptions, for example by setting the error at some date, and all the errors preceding that date, to zero. The other approach computes the exact finite-sample forecast. Both approaches are detailed in Hamilton (1994).

ii. This is even more straightforward than before: since y_T(k) = μ for every k ≥ 2, the forecast is already at the unconditional mean, so its limit as k → ∞ is μ.

iii. Let e_T(k) be the k-step-ahead optimal forecast error:

   e_T(k) = y_{T+k} − y_T(k) = μ + ε_{T+1} + θ ε_T − μ − θ ε_T           for k = 1
   e_T(k) = y_{T+k} − y_T(k) = μ + ε_{T+k} + θ ε_{T+k−1} − μ             for k ≥ 2

Then,

   e_T(k) = ε_{T+1}                  for k = 1
   e_T(k) = ε_{T+k} + θ ε_{T+k−1}    for k ≥ 2

which implies

   E(e_T(k)) = 0

   σ²_{e_T}(k) = σ²_ε                    for k = 1
   σ²_{e_T}(k) = (1 + θ²) σ²_ε = γ(0)    for k ≥ 2

iv. Even more straightforward than ever: for k ≥ 2 the forecast-error variance already equals γ(0), so its limit is γ(0).

v. As before, if the process is Gaussian, the forecast error is Gaussian too. Let us focus on the k ≥ 2 case:

   e_T(k) ~ N(0, γ(0))  ⟹  e_T(k)/√γ(0) ~ N(0, 1)
Therefore,

   1 − α = P( −z_{1−α/2} ≤ e_T(k)/√γ(0) ≤ z_{1−α/2} )
         = P( −z_{1−α/2} √γ(0) ≤ e_T(k) ≤ z_{1−α/2} √γ(0) )
         = P( y_T(k) − z_{1−α/2} √γ(0) ≤ y_{T+k} ≤ y_T(k) + z_{1−α/2} √γ(0) )

The asymptotically valid forecast interval is now:

   P( ŷ_T(k) − z_{1−α/2} √γ̂(0) ≤ y_{T+k} ≤ ŷ_T(k) + z_{1−α/2} √γ̂(0) ) = 1 − α
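The MA(1) forecasts and intervals above collapse into a few lines of code. A Python sketch with illustrative values (μ = 0, θ = 0.5, σ²_ε = 1 and ε_T = 0.8 are made up):

```python
import math

def ma1_forecast_interval(mu, theta, sigma2_eps, eps_T, k, z=1.96):
    """Point forecast and interval half-width for the invertible MA(1)
    y_t = mu + eps_t + theta*eps_{t-1}; z = 1.96 gives a 95% interval."""
    if k == 1:
        point = mu + theta * eps_T            # the last shock still matters
        var = sigma2_eps                      # sigma_eps^2
    else:
        point = mu                            # back at the mean for k >= 2
        var = (1.0 + theta**2) * sigma2_eps   # = gamma(0)
    return point, z * math.sqrt(var)

f1, w1 = ma1_forecast_interval(0.0, 0.5, 1.0, eps_T=0.8, k=1)
f3, w3 = ma1_forecast_interval(0.0, 0.5, 1.0, eps_T=0.8, k=3)
print(f1, f3)   # → 0.4 0.0
print(w3 > w1)  # → True: the k >= 2 interval uses the full gamma(0)
```

Unlike the AR(1) case, the half-width takes only two values: the forecast reverts to the mean after one step, and the interval jumps straight to its unconditional width.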