Theoretical and Simulation-guided Exploration of the AR(1) Model

Overview:
Motivation
Section 1: Expectation (A: Theory; B: Simulation)
Section 2: Variance (A: Theory; B: Simulation)
Section 3: ACF (A: Theory; B: Simulation)
Section 4: Forecasting with AR(1)
Section 5: A Glance at Imaginary Processes
Section 6: A Glance at Exponentially Distributed Error Terms
Motivation: The first-order autoregressive model, or AR(1), is a probability model of time series in which each step of the series is proportional to the previous step, plus random noise. The AR(1) is a special case of the AR(p) model, in which each step of the time series is a weighted sum of the previous p steps, plus random noise. The AR(p) model is itself a special case of the ARIMA(p,d,q) model, which is widely used in forecasting time series. Among the properties of the AR(p) model that make it useful for forecasting is that, for certain parameter values, it is stationary in mean and stationary in variance. This allows one to compute forecasts quite far into the future with relatively low error bounds. In real-world cases, where the time-series data does not perfectly match a mathematical model, the autocorrelation function can be used to identify particular ARIMA models when they are simple cases; the AR(1) model is one such example. Knowing which ARIMA model fits your data, and knowing that this model is stationary in mean and variance, allows one to compute accurate forecasts for a time series. These properties are explored for the AR(1) in this project. As some team members had prior experience with different software packages, each person ran their own simulations using their respective software. As such, graphs answering the simulation questions are provided from Excel, R and Matlab wherever each is applicable.

Model: Y_t = α0 + α1·Y_{t-1} + ε_t

Assumptions:
- ε_i and ε_j are independent for all i ≠ j.
- Y_i and ε_j are independent for all i < j.
- ε_t ~ N(0, σ²).
- Y_0 = 0.
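The recursion above is straightforward to simulate directly. The following is a minimal Python sketch mirroring the model and assumptions just stated (the function name `simulate_ar1` and its defaults are our own):

```python
import random

def simulate_ar1(alpha0, alpha1, sigma, n, seed=None):
    """Simulate n steps of Y_t = alpha0 + alpha1*Y_{t-1} + eps_t,
    with eps_t ~ N(0, sigma^2) and Y_0 = 0, per the assumptions above."""
    rng = random.Random(seed)
    path = [0.0]                      # Y_0 = 0
    for _ in range(n):
        eps = rng.gauss(0.0, sigma)   # independent Gaussian noise term
        path.append(alpha0 + alpha1 * path[-1] + eps)
    return path
```

With sigma = 0 the recursion becomes deterministic, which makes the geometric structure of the process easy to inspect by hand.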
Section 1A: Theory for Expectation

Guided Theory: Consider the model expressed above: Y_t = α0 + α1·Y_{t-1} + ε_t.

Q1. Based on the model above, can Y_t be expressed in terms of Y_{t-k} using induction?

Q2. Using the procedure suggested above, and the assumptions stated at the start of the document, can we express the expectation of Y_t as a geometric series in α1?

Q3. Use these guidelines to show that the expectation of the AR(1) process converges, as t increases, to the following:

E[Y_t] → α0 / (1 − α1)   (eq. 1)

This result holds for any α1 ∈ (−1, 1).

NOTE: A full answer to this question is provided in the Appendix.

The above theory can be visualised using simulation, and the following section provides a graphical representation to complement it. The theory states that the expectation of the model depends only on α0 and α1, and not on σ. Simulation will verify this.
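The convergence claimed in (eq. 1) can be checked numerically: taking expectations of the model gives E[Y_t] = α0 + α1·E[Y_{t−1}] (the noise term vanishes in expectation), and iterating this recursion traces out exactly the geometric series in question. A small sketch, with illustrative parameter values of our own choosing:

```python
alpha0, alpha1 = 1.0, 0.5

# E[Y_t] = alpha0 + alpha1 * E[Y_{t-1}], starting from E[Y_0] = 0;
# unrolling gives alpha0 * (1 + alpha1 + ... + alpha1^(t-1)).
e = 0.0
for t in range(100):
    e = alpha0 + alpha1 * e

limit = alpha0 / (1 - alpha1)  # (eq. 1): here 2.0
print(abs(e - limit))          # vanishingly small after 100 steps
```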
Section 1B: Simulations

The following graphs are single examples of an AR(1) process. Simulations varying the values of α0, α1 and σ can be found in the Appendix.

[Figures: single realisations of the generic AR(1) process Y(t) = Alpha(0) + Alpha(1)·Y(t−1) + ε, for α1 = 0.5 and α1 = 0.7, produced in Excel, in R via arima.sim(list(order = c(1,0,0), ar = 0.5), n = …), and in Matlab ("Example of an AR(1) Process").]

By varying the value of α0 we discovered that the expected value of the model changes. By choosing a value of α1 ∈ (0, 1), the model is convergent, and the expected value is greater than α0. What happens if α1 ∈ (−1, 0)?
[Figures: further "Generic AR(1)" realisations in Excel and Matlab, including a negative value of α1 and values of α1 outside (−1, 1).]

By choosing a value of α1 ∈ (−1, 0), the model is convergent, and the expected value is less than α0. The effect of taking α1 outside (−1, 1) was also examined.
These simulations corroborate what was shown theoretically about the convergence of the mean of the AR(1) process. Convergence is seen above only when |α1| < 1, and divergence is seen otherwise. In the case where −1 < α1 < 0, the process undergoes a slight flurry before it settles down. This phenomenon is not immediately obvious from the theory and highlights the usefulness of simulation to complement the findings of theory. It is also noteworthy that the process diverges to +∞ when α1 > 1, but that it fans out to span (−∞, +∞) when α1 < −1: in both cases the absolute value of the process grows with time, but in the latter case the sign alternates.

These findings on the convergence of the mean of the AR(1) process can be demonstrated more convincingly by taking many iterations of an AR(1) process. The following figures display, in the top graph, how the mean (y-axis) of the iterations at time t (x-axis) changes as t increases; the bottom graph shows the theoretical convergence value, as calculated from (eq. 1). Note: these will disagree outside the allowed bounds for α1.

[Figures: "Expectation of AR(1) Process Stabilizes" versus "Theoretical Expectation for Given Parameters" for several parameter choices, including α1 = 0.7 and a negative α1.]
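The ensemble-mean plots above can be reproduced with a quick Monte Carlo experiment: average many independent realisations at a late time step and compare against (eq. 1). A sketch under our own choice of parameters and seed:

```python
import random

rng = random.Random(0)
alpha0, alpha1, sigma = 1.0, 0.7, 1.0
n_paths, t_end = 5000, 100

finals = []
for _ in range(n_paths):
    y = 0.0                       # Y_0 = 0 for every iteration
    for _ in range(t_end):
        y = alpha0 + alpha1 * y + rng.gauss(0.0, sigma)
    finals.append(y)

ensemble_mean = sum(finals) / n_paths
theory = alpha0 / (1 - alpha1)    # (eq. 1): here 10/3
# the sample mean of 5000 paths should sit close to the theoretical limit
```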
[Figures: the same ensemble-mean comparison for values of α1 outside (−1, 1); here the sample mean diverges and disagrees with the value given by (eq. 1).]

These plots further solidify what we have discussed regarding the convergence of the mean of the AR(1) process. Interestingly, when α1 ∈ (−1, 0), the process appears to be slightly erratic for a short while before settling down to convergence (as was noted from the single iteration earlier). This suggests that such parameter values are appropriate for representing the time series of systems which are initially not under control, but which subsequently become controlled. In the subsequent section we discuss the variance of the AR(1) model.

Section 2a: Theory for Variance

Guided Theory:

Q. Using a similar method as in Section 1A, we can express the variance of Y_t as a geometric series in α1². Use these guidelines to show that the variance of the AR(1) process converges to the following as t increases:

Var[Y_t] → σ² / (1 − α1²)   (eq. 2)

This result holds, again, for any α1 ∈ (−1, 1). Note: a full answer to this question is provided in the Appendix.

The above theory can be visualised using simulation; the following section provides a graphical representation to complement it. The theory states that the variance of the model depends only on α1 and σ, and not on α0. Simulation will verify this. Similarly to the expectation section above, the following figures display, in the top graph, how the variance (y-axis) of the iterations at time t (x-axis) changes as t increases; the bottom graph shows the theoretical convergence value, as calculated from (eq. 2). Note: these will disagree outside the allowed bounds for α1.
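As with the expectation, the variance satisfies the recursion Var[Y_t] = α1²·Var[Y_{t−1}] + σ² (since the new noise term is independent of the past), and iterating it numerically lets (eq. 2) emerge. A sketch with illustrative parameters of our own:

```python
alpha1, sigma = 0.5, 1.0

# Var[Y_t] = alpha1^2 * Var[Y_{t-1}] + sigma^2, starting from Var[Y_0] = 0;
# unrolling gives sigma^2 * (1 + alpha1^2 + ... + alpha1^(2(t-1))).
v = 0.0
for _ in range(100):
    v = alpha1**2 * v + sigma**2

limit = sigma**2 / (1 - alpha1**2)   # (eq. 2): here 4/3
```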
Section 2b: Simulations for Variance

[Figures: "Variance of AR(1) Process Stabilizes" versus the theoretical limit for several parameter choices, including α1 = 0.5 and α1 = 0.7, two different values of α0, and two different values of σ².]

The above plots are in agreement with the theory discussed previously. Changing the value of α0 did not impact the variance. Varying α1 within the allowed range changed the variance of the model, but kept it convergent. Doubling σ² doubled the variance, as predicted by the theory.
[Figures: the ensemble variance for values of α1 outside (−1, 1).]

Finally, when |α1| > 1, the variance diverged, as expected.

Section 3a: Theory for Autocorrelation (ACF)

The autocorrelation function (acf) is a function of the lag k of a time series Y_t. The function is given by

ρ(k) = Corr(Y_t, Y_{t+k}).

In order to find the acf, one first needs to find Cov(Y_t, Y_{t+k}), and then divide by the variance of the time series. Recall that the covariance of two random variables X, Y is given by:

Cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]·E[Y]

Q. Show that Cov(Y_t, Y_{t+k}) is given by:

Cov(Y_t, Y_{t+k}) = α1^k · σ² / (1 − α1²)

(Hint: use properties of the model found earlier to express Y_{t+k} in terms of Y_t.) We required the expectation of an AR(1) model and the variance of an AR(1) model found above to obtain this result. Both of these required that α1 be in the interval (−1, 1), and so this is the interval for which the above covariance holds. The value of α0 does not impact the covariance.

Q. Show that: Corr(Y_t, Y_{t+k}) = α1^k.   (eq. 3)
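The result Corr(Y_t, Y_{t+k}) = α1^k can be checked against a simulated series: the sample autocorrelation of a long AR(1) realisation should track α1^k. A minimal sketch; the helper `sample_acf` is the standard naive estimator, and the parameters and seed are our own illustrative choices:

```python
import random

def sample_acf(y, k):
    """Naive sample autocorrelation of the series y at lag k."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    cov = sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k)) / n
    return cov / var

# simulate one long AR(1) realisation with alpha1 = 0.5
rng = random.Random(1)
alpha0, alpha1, sigma = 1.0, 0.5, 1.0
y, prev = [], 0.0
for _ in range(100_000):
    prev = alpha0 + alpha1 * prev + rng.gauss(0.0, sigma)
    y.append(prev)

# the empirical acf should track alpha1 ** k, per (eq. 3)
for k in (1, 2, 3):
    print(k, round(sample_acf(y, k), 2))
```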
N.B. A full answer to this question is provided in the .pdf document accompanying this document.

Section 3b: Studying the ACF

What happens to the ACF when α1 takes the following values? Explain the shape of the acf in each plot with reference to theory.
a. 0
b. 0.7 and a smaller positive value
c. −0.7 and a smaller negative value
d. a value greater than 1
e. a value less than −1

Simulations of the ACF for each of the above values, and a comparison with the theoretical ACF, can be found in the Appendix. As before, these will disagree outside the allowed values of α1.

Answers to the above:
a. The acf takes the value 1 at lag 0 and takes the value 0 elsewhere. (This can be seen in the simulation below, noting that 1 on the x-axis corresponds to a lag of 0, etc.) This is consistent with theory, as 0⁰ = 1 and 0ⁿ = 0 for any non-zero n.
b. The acf undergoes an exponential decay with a limit of 0. This is consistent with theory, as α1ⁿ gets very small as n gets very large when |α1| < 1. Also consistent with theory is that the decay occurs more quickly for smaller values of α1.
c. The acf is a damped sine wave, again with limit 0. This is consistent with theory because, as stated above, |α1|ⁿ gets very small as n gets very large when |α1| < 1. Furthermore, α1ⁿ is positive for n even and negative for n odd when α1 < 0.
d. The acf again undergoes exponential decay, as demonstrated by simulation. Note that in this case the simulation disagrees with the theoretical graph on the bottom. This is because α1 is outside the allowed range and thus the theory above does not account for it.
e. The acf decreases in absolute value for a short while but seems to stabilise at an absolute value just under 0.5; from there it alternates between positive and negative correlations. Again, this disagrees with the theoretical graph on the bottom because α1 is outside the allowed range.

The following plots are simulations of the acf for AR(1) corresponding to the above values (0.7, rather than the smaller value, in the case of b. and c.). The bottom graph in each corresponds to the theoretical acf.
Note that in all cases the theoretical acf is calculated using (eq. 3), even when α1 is outside the allowed interval. Thus, as before, in these cases the two plots will disagree.
Section 4: Forecasting with AR(1)

It was claimed at the beginning of this document that the AR(1) model can be used for generating forecasts of time series. Such forecasts are accomplished by the following procedure. First, one calculates a sample standard deviation s for the time series. Second, one determines how much error the forecast will have for each step after the last data point collected; this is done in a similar manner to the procedure used to prove convergence of the variance of the model. Doing this, one finds that the 95% confidence interval for the forecast of Y_{t+k} is given by:

Ŷ_{t+k} ± 1.96·s·√(1 + α1² + α1⁴ + ⋯ + α1^(2(k−1)))

Simulations demonstrating the accuracy of such a forecast are given below. The top of each plot is a single iteration of a time series simulated as before; the bottom of each plot is the same iteration, cut off early, with the remainder replaced by the forecast. The reader is invited to inspect how the forecast compares with the actual outcome.

[Figures: "Example of an AR(1) Process" panels paired with "Forecast with 95% Confidence Interval" panels, for two parameter choices.]

Thus, it is evident that the further ahead in time we forecast, the wider the prediction intervals become, as a result of the accumulated variability of the error terms.

Section 5: A Glance at Imaginary Processes
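The forecasting procedure above can be sketched as a small helper (the name `ar1_forecast` is ours; we take the standard k-step AR(1) point forecast, and the interval half-width follows the geometric sum in the formula above):

```python
def ar1_forecast(last_y, alpha0, alpha1, s, k):
    """k-step-ahead forecast from the last observed value.
    Point forecast: alpha0*(1 + alpha1 + ... + alpha1^(k-1)) + alpha1^k * last_y.
    95% half-width: 1.96 * s * sqrt(1 + alpha1^2 + ... + alpha1^(2(k-1)))."""
    point = alpha0 * sum(alpha1 ** j for j in range(k)) + alpha1 ** k * last_y
    half_width = 1.96 * s * sum(alpha1 ** (2 * j) for j in range(k)) ** 0.5
    return point, half_width
```

Note that the half-width grows with k but, for |α1| < 1, levels off at 1.96·s/√(1 − α1²), which is why even distant forecasts remain bounded for a stationary AR(1).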
Since the theory found that the expectation, variance and autocorrelation were defined whenever |α1| < 1, it was thought that the process might also work for complex α1 with |α1| < 1. Indeed, this was found to be the case when the real and imaginary parts were plotted separately. However, the results found in the real case for the variance and the correlation were slightly off. Experimentation led to the guess that, instead of

Var[Y_t] = σ² / (1 − α1²),  we instead had  Var[Y_t] = σ² / (1 − |α1|²),

and, for the correlation, instead of

Corr(Y_t, Y_{t+k}) = α1^k,  we instead had  Corr(Y_t, Y_{t+k}) = |α1|^k.

The following are graphs of one example illustrating the above:

[Figures: "Example of an AR(1) Process" and its imaginary part, for α1 = 0.5i, plotted with the amended formulae.]
[Figures: real and imaginary expectation of the process versus the (amended) theoretical expectation; variance versus the amended theoretical limit; real and imaginary ACF versus the amended theoretical ACF. Plots of the same process with the original formulae are also shown for comparison.]

We were unable to find the theoretical justification for these amendments to the variance and acf formulae; however, simulation seems to support our guess. Nevertheless, there seems to be evidence that the AR(1) process is indeed stationary, in both the real and imaginary plane, when |α1| < 1.
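The guessed amendment to the variance formula can itself be sanity-checked by simulation. With a complex slope, the natural notion of variance is E[|Y_t − E[Y_t]|²], whose recursion picks up a factor |α1|² rather than α1², which is exactly the amended denominator. A Monte Carlo sketch with α1 = 0.5i (parameters and seed our own):

```python
import random

rng = random.Random(3)
alpha0, alpha1, sigma = 1.0, 0.5j, 1.0   # complex slope, |alpha1| = 0.5 < 1
n_paths, t_end = 4000, 60

finals = []
for _ in range(n_paths):
    y = 0 + 0j                           # noise stays real; the state is complex
    for _ in range(t_end):
        y = alpha0 + alpha1 * y + rng.gauss(0.0, sigma)
    finals.append(y)

mean = sum(finals) / n_paths
var = sum(abs(v - mean) ** 2 for v in finals) / n_paths
amended = sigma**2 / (1 - abs(alpha1) ** 2)   # = 4/3
original = sigma**2 / (1 - alpha1**2)         # = 0.8, which the data rejects
```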
Section 6: A Glance at Exponentially Distributed Error Terms

Not all random error in real-life situations is normally distributed. Some processes which vary by random small amounts over time have occasional very large deviations. Such processes are relevant in domains such as webpage hits and the number of people in a retail centre at any one time. The error of such processes can be captured by the Exponential distribution. It turns out that the theory of the AR(1) process with exponentially distributed error terms can be derived quite similarly to the way it was derived for normally distributed error terms. One amendment must be made to the assumptions underlying this theory: in place of assuming ε_t ~ N(0, σ²), we instead assume that ε_t ~ Exp(λ). As such, we have that:

E[ε_t] = 1/λ  and  Var[ε_t] = 1/λ².

We find that the following changes occur in the results for the expectation and the variance of the AR(1) process (we find that the ACF remains unchanged):

E[Y_t] → (α0 + 1/λ) / (1 − α1)

Var[Y_t] → 1 / (λ²·(1 − α1²))

We conclude the document with graphical illustrations of this additional theory for two values of λ, the second being λ = 0.5. Note that in the first of these cases we have provided two graphs comparing the forecast with the actual outcome. This is because, by chance, the first simulation generated a forecast from data whose last value was exceptional; as such, the error bounds were around an unusual value and did not make an accurate prediction. The second simulation provided a better forecast. We included both graphs to illustrate the danger of using a forecast based on the most recent data point when the error of the system is exponentially distributed.
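The amended expectation and variance formulas can likewise be checked by Monte Carlo, simply replacing the Gaussian draws with exponential ones (parameters and seed our own):

```python
import random

rng = random.Random(2)
alpha0, alpha1, lam = 1.0, 0.5, 2.0
n_paths, t_end = 2000, 200

finals = []
for _ in range(n_paths):
    y = 0.0
    for _ in range(t_end):
        y = alpha0 + alpha1 * y + rng.expovariate(lam)  # eps_t ~ Exp(lam)
    finals.append(y)

mean = sum(finals) / n_paths
var = sum((v - mean) ** 2 for v in finals) / n_paths
mean_theory = (alpha0 + 1 / lam) / (1 - alpha1)         # here 3.0
var_theory = 1 / (lam**2 * (1 - alpha1**2))             # here 1/3
```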
[Figures: two realisations of the process with α1 = 0.5 and the first value of λ, each paired with its "Forecast with 95% Confidence Interval" panel.]

Note that this model has occasional large upward spikes, caused by occasional large error terms from the Exponential distribution.

[Figures: the expectation and variance of the ensemble stabilising at the amended theoretical values; the ACF and theoretical ACF; and a realisation with λ = 0.5 together with its forecast.]
[Figures: "Expectation of AR(1) Process Stabilizes", "Variance of AR(1) Process Stabilizes", "Theoretical Expectation for Given Parameters", and the simulated versus theoretical ACF, for the λ = 0.5 case.]