Fitting an ARIMA Process to Data

Brian Borchers

April 6, 1

Now that we understand the theoretical behavior of ARIMA processes, we will consider how to take an actual observed time series and fit an ARIMA model to the data. In the final lecture, we'll consider how to use the fitted model to predict future values of the time series.

The stages in our process for fitting an ARIMA model to a time series are as follows:

1. Identify the appropriate degree of differencing $d$ by differencing the time series until it appears to be stationary.
2. Remove any nonzero mean from the differenced time series.
3. Estimate the autocorrelation and PACF of the differenced time series. Use these to determine the autoregressive order $p$ and the moving average order $q$.
4. Estimate the coefficients $\phi_1, \ldots, \phi_p$, $\theta_1, \ldots, \theta_q$. This can be done in a variety of ways. However, the most robust method is maximum likelihood estimation.

The first step in our process is finding the appropriate value of $d$. It is a good idea to begin this process by simply plotting the data. If the time series shows a strong trend (growth or decline), then the process is clearly not stationary, and it should be differenced at least once. The second test that we will use is to examine the estimated autocorrelation of the time series. For a stationary time series, the autocorrelations will typically decay rapidly to 0. For a nonstationary time series, the autocorrelations will typically decay slowly, if at all. Computing the differenced time series is easy to do with the diff command in MATLAB.

The second step in our process is removing any nonzero mean from the differenced time series $W$. This is a very straightforward computation: just compute the mean of the time series, and subtract it from each element of the time series. If the mean is small relative to the standard deviation of the differenced time series $W$, then it should be safe to simply skip this step.

Example 1. Figures 1 through 3 show the analysis of a time series. In Figure 1a, notice that there are clear long-term shifts in the average.
In Figure 1b, the autocorrelations do not decay quickly to zero. In Figure 2a, there is still a strong trend, and Figure 2b shows that the autocorrelations are still not going to 0 very quickly. Finally, Figure 3a shows data that appear to be stationary, and Figure 3b confirms that the autocorrelations die out quickly. The mean for the second difference of the original time series is .1, and the standard deviation for this time series is 1.58, so it seems wise to include the mean in fitting an ARIMA model to the data.

Figure 1: Example time series with no differencing. (a) The time series. (b) Estimated autocorrelations.
Figure 2: Example time series with d = 1. (a) The differenced time series. (b) Estimated autocorrelations.

Figure 3: Example time series with d = 2. (a) The twice-differenced time series. (b) Estimated autocorrelations.
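The first two stages, together with the autocorrelation estimate used to judge stationarity, can be sketched in a few lines. The notes use MATLAB's diff command; what follows is only a Python sketch using the standard library. The helper names (`difference`, `remove_mean`, `sample_acf`) and the simulated random walk are hypothetical and are not the data set shown in the figures. The autocorrelation estimate assumed here is $r_k = c_k / c_0$ with $c_k = \frac{1}{N} \sum_t (w_t - \bar{w})(w_{t+k} - \bar{w})$.

```python
import random
from statistics import fmean


def difference(z, d=1):
    """Difference the series d times (each pass mirrors MATLAB's diff)."""
    w = list(z)
    for _ in range(d):
        w = [b - a for a, b in zip(w, w[1:])]
    return w


def remove_mean(w):
    """Return the centered series and the mean that was subtracted."""
    m = fmean(w)
    return [x - m for x in w], m


def sample_acf(w, max_lag):
    """Estimated autocorrelations r_1..r_max_lag, using r_k = c_k / c_0."""
    n = len(w)
    m = fmean(w)
    dev = [x - m for x in w]
    c0 = sum(x * x for x in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
        for k in range(1, max_lag + 1)
    ]


# Hypothetical data: a random walk, which is nonstationary by construction.
rng = random.Random(0)
z, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0.0, 1.0)
    z.append(level)

r_z = sample_acf(z, 10)             # undifferenced: r_k decays slowly
w, w_mean = remove_mean(difference(z, d=1))
r_w = sample_acf(w, 10)             # differenced once: r_k should be small

print("r_1 before differencing:", round(r_z[0], 3))
print("r_1 after differencing: ", round(r_w[0], 3))
print("mean removed from the differenced series:", round(w_mean, 3))
```

For a random walk, one difference leaves uncorrelated noise, so the second set of autocorrelations should sit near zero while the first stays close to 1. A series like the one in Example 1 would need `d=2` before the same thing happens.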
    Order (p, q)   $\rho_k$                          $\phi_{kk}$
    (1, 0)         exp decay                         only $\phi_{11}$ nonzero
    (0, 1)         only $\rho_1$ nonzero             exp decay
    (2, 0)         exp or damped sine wave           only $\phi_{11}$, $\phi_{22}$ nonzero
    (0, 2)         only $\rho_1$, $\rho_2$ nonzero   exp or damped sine wave
    (1, 1)         exp decay                         exp decay

Table 1: Rules for selecting p and q.

Figure 4: Estimated autocorrelation and PACF for the example time series.

The next step in the process is determining the autoregressive order $p$ and the moving average order $q$ for the differenced time series $W$. Table 1 (taken from BJR) summarizes the behavior of ARMA(p, q) processes for $p = 0, 1, 2$ and $q = 0, 1, 2$.

Example 2. Continuing with the time series from the last example, we removed the mean of .1, and then estimated the autocorrelations and partial autocorrelation function. Figure 4 shows the estimated autocorrelations and partial autocorrelations. Notice that both $\hat{\phi}_{11}$ and $\hat{\phi}_{22}$ are nonzero, and that the $r_k$ appear to decay exponentially. This suggests that an ARIMA(2,2,0) model would be appropriate.

Once we've determined $p$, $d$, and $q$, the final step is to estimate the actual parameters $\phi_1, \ldots, \phi_p$, $\theta_1, \ldots, \theta_q$. One very simple approach can be used if we have formulas for the autocorrelations in terms of the parameters. For example,
for an AR(2) process, we know that

$$\rho_1 = \frac{\phi_1}{1 - \phi_2}$$

and

$$\rho_2 = \frac{\phi_1^2}{1 - \phi_2} + \phi_2.$$

We can substitute our estimates $r_1$ and $r_2$ for the differenced time series and solve these equations to obtain $\phi_1$ and $\phi_2$. Table 2 summarizes the equations to be solved for the (1,0), (0,1), (2,0), (0,2) and (1,1) cases.

    Order (p, q)   $\rho_1$                                                                                     $\rho_2$
    (1, 0)         $\rho_1 = \phi_1$
    (0, 1)         $\rho_1 = -\theta_1 / (1 + \theta_1^2)$
    (2, 0)         $\rho_1 = \phi_1 / (1 - \phi_2)$                                                             $\rho_2 = \phi_1^2 / (1 - \phi_2) + \phi_2$
    (0, 2)         $\rho_1 = -\theta_1 (1 - \theta_2) / (1 + \theta_1^2 + \theta_2^2)$                          $\rho_2 = -\theta_2 / (1 + \theta_1^2 + \theta_2^2)$
    (1, 1)         $\rho_1 = (1 - \theta_1 \phi_1)(\phi_1 - \theta_1) / (1 + \theta_1^2 - 2 \phi_1 \theta_1)$   $\rho_2 = \rho_1 \phi_1$

Table 2: Rules for selecting p and q.

Example 3. Continuing our earlier example, we've decided to fit an AR(2) model to $w$. We have $r_1 = .7434$ and $r_2 = .6844$. Solving the above equations for $\phi_1$ and $\phi_2$, we get the estimates $\phi_1 = .5246$ and $\phi_2 = .2943$. In fact, this series was generated using $\phi_1 = .5$ and $\phi_2 = .3$.

A more sophisticated approach is to use maximum likelihood estimation to obtain the parameters. Unfortunately, this function is included in a MATLAB toolbox (the System Identification toolbox) that we don't have at NMT. Thus we'll use a different package, Minitab, to do the maximum likelihood estimation.

Example 4. The following output was produced by using Minitab's Stats/Time Series/ARIMA procedure on the differenced time series with the mean removed.

    ARIMA Model: C1

    ARIMA model for C1

    Estimates at each iteration
    Iteration   SSE      Parameters
    0           373.84   .1     .1
    1           747.53   .5     .166
    2           179.54   .4     .3
    3           3.31     .54    .87
    4           .91      .53    .9
    5           .91      .53    .9
    Relative change in each estimate less than .0010
    Final Estimates of Parameters
    Type    Coef    SE Coef   T       P
    AR 1    .5299   .0214     24.73   .000
    AR 2    .2903   .0214     13.55   .000

    Number of observations: 1998
    Residuals: SS = .69 (backforecasts excluded)
               MS = 1.1  DF = 1996

    Modified Box-Pierce (Ljung-Box) Chi-Square statistic
    Lag          12     24     36     48
    Chi-Square   7.7    17.3   25.6   35.1
    DF           10     22     34     46
    P-Value      .659   .747   .848   .878

The optimal parameters were $\phi_1 = .5299$ and $\phi_2 = .2903$. These are quite similar to the estimates obtained by matching the first two autocorrelations.

Minitab produces a number of other useful statistics. For example, both $\phi_1$ and $\phi_2$ have standard errors of .0214. For both coefficients, the reported p-value is essentially zero, indicating that these coefficients are definitely nonzero. The Box-Pierce chi-square p-values are all large, so there is no evidence of autocorrelation left in the residuals; this tells us that the model fits the data quite well.
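The moment-matching estimates of Example 3 can also be checked numerically. Solving the two AR(2) equations $\rho_1 = \phi_1 / (1 - \phi_2)$ and $\rho_2 = \phi_1^2 / (1 - \phi_2) + \phi_2$ for the coefficients gives $\phi_2 = (\rho_2 - \rho_1^2) / (1 - \rho_1^2)$ and $\phi_1 = \rho_1 (1 - \phi_2)$. The Python sketch below applies these formulas to the $r_1$ and $r_2$ quoted in Example 3 and also round-trips a known pair of coefficients. The function names are hypothetical, and this is only the simple moment-matching approach, not the maximum likelihood fit that Minitab performs.

```python
def ar2_autocorrelations(phi1, phi2):
    """rho_1 and rho_2 of an AR(2) process with coefficients phi1, phi2."""
    rho1 = phi1 / (1.0 - phi2)
    rho2 = phi1 ** 2 / (1.0 - phi2) + phi2
    return rho1, rho2


def ar2_from_autocorrelations(r1, r2):
    """Invert the two AR(2) equations to recover phi1 and phi2."""
    phi2 = (r2 - r1 ** 2) / (1.0 - r1 ** 2)
    phi1 = r1 * (1.0 - phi2)
    return phi1, phi2


# Round trip: coefficients -> autocorrelations -> coefficients.
rho1, rho2 = ar2_autocorrelations(0.5, 0.3)
back1, back2 = ar2_from_autocorrelations(rho1, rho2)

# Moment-matching estimates from the r_1 and r_2 quoted in Example 3.
est1, est2 = ar2_from_autocorrelations(0.7434, 0.6844)

print("round trip:", round(back1, 6), round(back2, 6))
print("estimates: ", round(est1, 4), round(est2, 4))
```

With $r_1$ and $r_2$ rounded to four digits, the recovered coefficients come out near 0.52 and 0.29, in line with both the moment-matching and the maximum likelihood estimates.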