Theoretical and Simulation-guided Exploration of the AR(1) Model


Overview:
Motivation
Section 1: Expectation (A: Theory; B: Simulation)
Section 2: Variance (A: Theory; B: Simulation)
Section 3: ACF (A: Theory; B: Simulation)
Section 4: Forecasting with AR(1)
Section 5: A Glance at Imaginary Processes
Section 6: A Glance at Exponentially Distributed Error Terms

Motivation: The first-order autoregressive model, or AR(1), is a probability model for time series in which each step of the series is proportional to the previous step, plus random noise. The AR(1) is a special case of the AR(p) model, in which each step of the time series is a weighted average of the previous p steps, plus random noise. The AR(p) model is, in turn, a special case of the ARIMA(p,d,q) model, which is widely used in forecasting time series. Among the properties of the AR(p) model that make it useful for forecasting is that, for certain parameter values, it is stationary in both mean and variance. This allows one to compute forecasts quite far into the future with relatively low error bounds. In real-world cases, where the time-series data do not perfectly match a mathematical model, the autocorrelation function can be used to identify particular ARIMA models when they are simple cases; the AR(1) model is one such example. Knowing which ARIMA model fits your data, and knowing that the model is stationary in mean and variance, allows one to compute accurate forecasts for a time series. These properties are explored for the AR(1) in this project. As team members had prior experience with different software packages, each person ran their own simulations in their respective software; graphs answering the simulation questions are therefore provided from Excel, R and Matlab wherever each is applicable.

Model: Y_t = α_0 + α_1 Y_{t-1} + ε_t

Assumptions:
1. ε_i and ε_j are independent for all i ≠ j.
2. Y_i and ε_j are independent for all i < j.
3. ε_t ~ N(0, σ²).
4. Y_0 = 0.
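The model and assumptions above can be simulated directly. The following is a minimal Python sketch (the function name and all parameter values are illustrative; the project's own simulations were done in Excel, R and Matlab):

```python
import random

def simulate_ar1(alpha0, alpha1, sigma, n, seed=None):
    """Simulate n steps of Y_t = alpha0 + alpha1 * Y_{t-1} + eps_t,
    with eps_t ~ N(0, sigma^2), starting from Y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(alpha0 + alpha1 * y[-1] + rng.gauss(0.0, sigma))
    return y

# One sample path with alpha0 = 1, alpha1 = 0.5, sigma^2 = 1
path = simulate_ar1(alpha0=1.0, alpha1=0.5, sigma=1.0, n=200, seed=1)
```

Plotting `path` against its index reproduces the kind of single-iteration graphs shown in the simulation sections below.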

Section 1A: Theory for Expectation

Guided Theory: Consider the model expressed above: Y_t = α_0 + α_1 Y_{t-1} + ε_t.

Q1. Based on the model above, using induction, can Y_t be expressed in terms of Y_{t-k}?

Q2. Using the procedure suggested above, and the assumptions stated at the start of the document, can we express the expectation of Y_t as a geometric series in α_1?

Q3. Use these guidelines to show that the expectation of the AR(1) process converges, as t increases, to the following:

E[Y_t] → α_0 / (1 - α_1)    (eq. 1)

This result holds for any α_1 ∈ (-1, 1).

NOTE: A full answer to this question is provided in the Appendix.

The above theory can be visualised using simulation; the following section provides a graphical representation to complement it. The theory states that the expectation of the model depends only on α_0 and α_1, and not on σ². Simulation will verify this.
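The limiting expectation α_0 / (1 - α_1) can also be checked numerically. The sketch below (illustrative parameter values, not the project's original code) averages Y_100 over many independent iterations and compares the result with the theoretical limit, using two different noise scales to confirm that σ does not affect the mean:

```python
import random

def final_value(alpha0, alpha1, sigma, n, rng):
    """Run n steps of Y_t = alpha0 + alpha1*Y_{t-1} + eps_t from Y_0 = 0
    and return Y_n."""
    y = 0.0
    for _ in range(n):
        y = alpha0 + alpha1 * y + rng.gauss(0.0, sigma)
    return y

rng = random.Random(42)
alpha0, alpha1 = 1.0, 0.7
theoretical = alpha0 / (1.0 - alpha1)   # = 10/3 for these parameters

# Empirical mean of Y_100 over many iterations, for sigma = 1 and sigma = 2;
# the theory says the noise scale should not matter.
mean_s1 = sum(final_value(alpha0, alpha1, 1.0, 100, rng) for _ in range(2000)) / 2000
mean_s2 = sum(final_value(alpha0, alpha1, 2.0, 100, rng) for _ in range(2000)) / 2000
```

Both empirical means land close to the theoretical value, mirroring the graphical comparison in the next section.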

Section 1B: Simulations for Expectation

The following graphs are single examples of AR(1) processes. Simulations with varying values of α_0, α_1 and σ² can be found in the Appendix.

[Figures: single AR(1) sample paths from Excel ("Generic AR(1): Y(t) = Alpha(0) + Alpha(1).Y(t-1) + ε"), R (via arima.sim(list(order = c(1, 0, 0), ar = 0.5), ...)) and Matlab, for settings including α_0 = 1, α_1 = 0.5, σ² = 1 and α_0 = 1, α_1 = 0.7, σ² = 1.]

By varying the value of α_0, we discovered that the expected value of the model changes. By choosing a value of α_1 ∈ (0, 1), the model is convergent and the expected value is greater than α_0. What happens if α_1 ∈ (-1, 0)?

[Figure: Excel "Generic AR(1)" sample path for a negative value of α_1.]

By choosing a value of α_1 ∈ (-1, 0), the model is convergent and the expected value is less than α_0.

The effect of values of α_1 outside (-1, 1) was also examined.

[Figures: sample paths for α_1 > 1, where the process explodes towards +∞ (values on the order of 10^8), and for α_1 < -1, where the process explodes with alternating sign (values on the order of 10^7).]

These simulations corroborate what was shown theoretically about the convergence of the mean of the AR(1) process. Convergence is seen above only when |α_1| < 1, and divergence is seen otherwise. In the case where -1 < α_1 < 0, the process undergoes a slight flurry before it settles down. This phenomenon is not immediately obvious from the theory and highlights the usefulness of simulation to complement the findings of theory. It is also noteworthy that the process diverges to +∞ when α_1 > 1, but that it fans out to span (-∞, +∞) when α_1 < -1. In both cases the absolute value of the process rises with time; in the latter case, however, the sign alternates.

These findings on the convergence of the mean of the AR(1) process can be demonstrated more convincingly by taking multiple iterations of an AR(1) process. The following figures display, on the top graph, how the mean (y-axis) of repeated AR(1) iterations at time t (x-axis) changes as t increases, and compare this with the theoretical convergence value, as calculated from (eq. 1), on the bottom graph. Note: these will disagree outside the allowed bounds for α_1.

[Figures: empirical mean of repeated AR(1) iterations (top) versus the theoretical expectation (bottom), for convergent settings such as α_0 = 1, α_1 = 0.7, σ² = 1 and for a negative α_1.]

[Figures: for α_1 outside (-1, 1), the empirical mean diverges (values on the order of 10^8) and disagrees with the theoretical value.]

These plots further solidify what we have discussed regarding the convergence of the mean of the AR(1) process. Interestingly, when α_1 ∈ (-1, 0), the process appears to be slightly erratic for a short while before settling down to convergence (as was noted from the single iteration earlier). This suggests that such parameter values are appropriate for representing the time series of systems which are initially not under control, but which subsequently become controlled. In the next section we discuss the variance of the AR(1) model.

Section 2A: Theory for Variance

Guided Theory:

Q1. Using a similar method as in Section 1A, we can express the variance of Y_t as a geometric series in α_1². Use these guidelines to show that the variance of the AR(1) process converges, as t increases, to the following:

Var[Y_t] → σ² / (1 - α_1²)    (eq. 2)

This result holds, again, for any α_1 ∈ (-1, 1).

Note: A full answer to this question is provided in the Appendix.

The above theory can be visualised using simulation; the following section provides a graphical representation to complement it. The theory states that the variance of the model depends only on α_1 and σ², and not on α_0. Simulation will verify this. Similarly to the expectation section above, the following figures display, on the top graph, how the variance (y-axis) of repeated AR(1) iterations at time t (x-axis) changes as t increases, and compare this with the theoretical convergence value (bottom graph), as calculated from (eq. 2). Note: these will disagree outside the allowed bounds for α_1.
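The limiting variance σ² / (1 - α_1²) can be checked numerically in the same style as the expectation. The sketch below (illustrative parameter values) computes the variance of Y_100 across independent iterations; note that α_0 is deliberately set to a non-zero value, since the theory says it should have no effect on the variance:

```python
import random
import statistics

def final_value(alpha0, alpha1, sigma, n, rng):
    """Y_n of one AR(1) iteration started from Y_0 = 0."""
    y = 0.0
    for _ in range(n):
        y = alpha0 + alpha1 * y + rng.gauss(0.0, sigma)
    return y

rng = random.Random(7)
alpha1, sigma = 0.7, 1.0

# Distribution of Y_100 across independent iterations; alpha0 = 5.0
# should not affect the spread.
samples = [final_value(5.0, alpha1, sigma, 100, rng) for _ in range(3000)]
empirical_var = statistics.pvariance(samples)
theoretical_var = sigma**2 / (1.0 - alpha1**2)   # approx. 1.96 here
```

The empirical variance settles near the theoretical value, in line with the plots that follow.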

Section 2B: Simulations for Variance

[Figures: empirical variance of repeated AR(1) iterations (top) versus the theoretical value from (eq. 2) (bottom), for settings such as α_1 = 0.5 and α_1 = 0.7 with σ² = 1, and for settings varying α_0 and σ².]

The above plots are in agreement with the theory discussed previously. Changing the value of α_0 did not impact the variance. Varying α_1 in the allowed range changed the variance of the model, but kept it convergent. Doubling σ² doubled the variance, as was predicted by the theory.

[Figures: for α_1 outside (-1, 1), the empirical variance diverges.]

Finally, when |α_1| > 1, the variance diverged, as expected.

Section 3A: Theory for Autocorrelation (ACF)

The autocorrelation function (ACF) is a function of the lag of a time series Y_t, given by ρ(k) = Corr(Y_t, Y_{t+k}). In order to find the ACF, one first needs to find Cov(Y_t, Y_{t+k}), and then divide by the variance of the time series. Recall that the covariance of two random variables X, Y is given by:

Cov(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X] E[Y]

Q1. Show that Cov(Y_t, Y_{t+k}) is given by:

Cov(Y_t, Y_{t+k}) = α_1^k σ² / (1 - α_1²)

(Hint: use properties of the model found earlier to express Y_{t+k} in terms of Y_t.) Obtaining this result requires the expectation and the variance of the AR(1) model found above. Both of these required that α_1 lie in the interval (-1, 1), and so this is the interval for which the above covariance holds. The value of α_0 does not impact the covariance.

Q2. Show that:

Corr(Y_t, Y_{t+k}) = α_1^k    (eq. 3)

N.B. A full answer to this question is provided in the .pdf document accompanying this document.

Section 3B: Studying the ACF

What happens to the ACF when α_1 takes the following values? Explain the shape of the ACF in each plot with reference to theory.

a. 0
b. 0.3, 0.7
c. -0.3, -0.7
d. a value greater than 1
e. a value less than -1

Simulations of the ACF for each of the above values, and a comparison with the theoretical ACF, can be found in the Appendix. As before, these will disagree outside the allowed values of α_1.

Answers to the above:

a. The ACF takes the value 1 at lag 0 and the value 0 elsewhere. (This can be seen in the simulation below, noting that 1 on the x-axis corresponds to a lag of 0, etc.) This is consistent with theory, as 0^0 = 1 and 0^n = 0 for any non-zero n.

b. The ACF undergoes exponential decay with a limit of 0. This is consistent with theory, as α_1^n gets very small as n gets very large when |α_1| < 1. Also consistent with theory is that the decay occurs more quickly for smaller values of α_1.

c. The ACF is a damped sine wave, again with limit 0. This is consistent with theory because, as stated above, |α_1|^n gets very small as n gets very large when |α_1| < 1. Furthermore, α_1^n is positive for n even and negative for n odd when α_1 < 0.

d. The ACF again undergoes exponential decay, as demonstrated by simulation. Note that in this case the simulation disagrees with the theoretical graph on the bottom. This is because α_1 is outside the allowed range, and thus the theory above does not account for it.

e. The ACF decreases in absolute value for a short while but seems to stabilise at an absolute value just under 0.5, from where it alternates between positive and negative correlations. Again, this disagrees with the theoretical graph on the bottom, because α_1 is outside the allowed range.

The following plots are simulations of the ACF for AR(1) corresponding to the above values (0.7 and not 0.3 in the cases of b. and c.). The bottom graph in each corresponds to the theoretical ACF.
Note that in all cases the theoretical ACF is calculated using (eq. 3), even when α_1 is outside the allowed interval; thus, as before, in these cases the two plots will disagree.
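The relationship ρ(k) = α_1^k can be checked by computing a sample ACF from a long simulated path. The sketch below (illustrative parameters; the `sample_acf` helper is ours, not from the project) uses α_1 = 0.7 with α_0 = 0:

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelation function of a series up to max_lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series) / n
    return [sum((series[t] - mean) * (series[t + k] - mean)
                for t in range(n - k)) / (n * c0)
            for k in range(max_lag + 1)]

rng = random.Random(3)
alpha1 = 0.7
y, path = 0.0, []
for _ in range(20000):
    y = alpha1 * y + rng.gauss(0.0, 1.0)   # AR(1) with alpha0 = 0
    path.append(y)

acf = sample_acf(path, 5)
theoretical = [alpha1 ** k for k in range(6)]   # rho(k) = alpha1^k
```

Comparing `acf` against `theoretical` lag by lag mirrors the top/bottom plot pairs above, and the two agree closely for this in-range value of α_1.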

Section 4: Forecasting with AR(1)

It was claimed at the beginning of this document that the AR(1) model could be used for generating forecasts of time series. Such forecasts are accomplished by the following procedure. First, one calculates a sample standard deviation s for the time series. Second, one determines how much error the forecast will have for each step after the last data point collected; this is done in a manner similar to the procedure used to prove convergence of the variance of the model. Doing this, one finds that a 95% confidence interval for the forecast of Y_{t+k} is given by:

Ŷ_{t+k} ± 1.96 · s · √(1 + α_1² + α_1⁴ + ... + α_1^{2(k-1)})

where Ŷ_{t+k} is the point forecast obtained by iterating the model forward from Y_t with the error terms set to zero.

Simulations demonstrating the accuracy of such a forecast are given below. The top of each plot is a single iteration of a time series simulated as before; the bottom of each plot is the same iteration, cut off early, with the remainder replaced by the forecast. The reader is invited to inspect how the forecast compares with the actual outcome.

[Figures: simulated AR(1) paths with forecasts and 95% confidence intervals, for two parameter settings.]

Thus, it is evident that the further ahead in time we forecast, the wider the prediction intervals become, as a result of the greater accumulated variability in the error terms.

Section 5: A Glance at Imaginary Processes

Since the theory found that the expectation, variance and autocorrelation were defined whenever |α_1| < 1, it was thought that the process might also work when α_1 is imaginary with |α_1| < 1. Indeed, this was found to be the case when the real and imaginary parts were plotted separately. However, the results found in the real case for the variance and the correlation were slightly off. Experimentation led to the guess that, instead of having for the variance:

Var[Y_t] = σ² / (1 - α_1²),

we instead had

Var[Y_t] = σ² / (1 - |α_1|²),

and for the correlation, instead of having:

Corr(Y_t, Y_{t+k}) = α_1^k,

we instead had

Corr(Y_t, Y_{t+k}) = |α_1|^k.

The following are graphs of one example illustrating the above, for α_0 = 1, α_1 = 0.5i, σ² = 1 (with the amended formulae):

[Figures: real and imaginary parts of a simulated AR(1) sample path with imaginary α_1.]

[Figures: expectation, variance and ACF of the simulated process (real and imaginary parts, top) compared with the amended theoretical values (bottom), followed by plots of the same process using the original formulae.]

We were unable to find a theoretical justification for these amendments to the variance and ACF formulae; however, simulation seems to support our guess. Nevertheless, there seems to be evidence that the AR(1) process is indeed stationary, in both the real and imaginary plane, when α_1 is imaginary with |α_1| < 1.
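The amended variance formula can be checked numerically, since Python supports complex arithmetic natively. The sketch below (illustrative; it uses α_0 = 0 for simplicity and real-valued Gaussian noise driving a complex-valued process) measures the spread of Y_100 as E[|Y − E[Y]|²] and compares it with σ² / (1 − |α_1|²):

```python
import random

rng = random.Random(11)
alpha1 = 0.5j            # purely imaginary coefficient, |alpha1| = 0.5 < 1
sigma = 1.0

# Y_100 across many independent iterations of the complex-valued process.
finals = []
for _ in range(4000):
    y = 0 + 0j
    for _ in range(100):
        y = alpha1 * y + rng.gauss(0.0, sigma)   # real-valued noise
    finals.append(y)

mean = sum(finals) / len(finals)
# Variance of a complex random variable: E[|Y - E[Y]|^2]
empirical_var = sum(abs(z - mean) ** 2 for z in finals) / len(finals)
amended_var = sigma**2 / (1.0 - abs(alpha1) ** 2)   # = 4/3 here
```

The simulated spread matches the amended formula rather than σ²/(1 − α_1²), which for α_1 = 0.5i would give σ²/1.25 instead.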

Section 6: A Glance at Exponentially Distributed Error Terms

Not all random error in real-life situations is normally distributed. Some processes which vary by random small amounts over time have occasional very large deviations; such processes are relevant in domains such as webpage hits and the number of people in a retail centre at any one time. The error of such processes can be captured by the exponential distribution.

It turns out that the theory of the AR(1) process with exponentially distributed error terms can be derived in much the same way as it was derived previously for normally distributed error terms. One amendment must be made to the assumptions underlying this theory: in place of assuming ε_t ~ N(0, σ²), we instead assume that ε_t ~ Exp(λ). As such, we have that E[ε_t] = 1/λ and Var[ε_t] = 1/λ². We find that the following changes occur in the results for the expectation and the variance of the AR(1) process (the ACF remains unchanged):

E[Y_t] = (α_0 + 1/λ) / (1 - α_1)

Var[Y_t] = 1 / (λ² (1 - α_1²))

We conclude the document with graphical illustrations of this additional theory for two values of λ (λ = 0.5 in the second case). Note that in the first of these cases we have provided two graphs comparing the forecast with the actual outcome. This is because, by chance, the first simulation generated a forecast from data whose last value was exceptional; as such, the error bounds were centred on an unusual value and did not make an accurate prediction. The second simulation provided a better forecast. We include both graphs to illustrate the danger of basing a forecast on the most recent data point when the error of the system is exponentially distributed.
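The amended expectation and variance formulas can be verified numerically. The sketch below (illustrative parameter values, chosen here as α_0 = 1, α_1 = 0.5, λ = 1 rather than taken from the project) compares the empirical mean and variance of Y_100 with the formulas above:

```python
import random

rng = random.Random(5)
alpha0, alpha1, lam = 1.0, 0.5, 1.0

# Y_100 across many independent iterations with exponential errors.
finals = []
for _ in range(4000):
    y = 0.0
    for _ in range(100):
        y = alpha0 + alpha1 * y + rng.expovariate(lam)   # eps_t ~ Exp(lam)
    finals.append(y)

emp_mean = sum(finals) / len(finals)
theo_mean = (alpha0 + 1.0 / lam) / (1.0 - alpha1)        # = 4 here
emp_var = sum((v - emp_mean) ** 2 for v in finals) / len(finals)
theo_var = 1.0 / (lam**2 * (1.0 - alpha1**2))            # = 4/3 here
```

Both empirical quantities land near the theoretical values; note that the mean is shifted upward relative to the Gaussian case because the exponential errors have mean 1/λ rather than 0.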

[Figures for α_1 = 0.5 and the first value of λ: a sample path, and the two forecast plots with 95% confidence intervals described above.]

Note that this model has occasional large upward spikes, due to occasional large error terms from the exponential distribution.

[Figures: expectation and variance of the process (top) compared with their theoretical values (bottom), the sample versus theoretical ACF, and, for λ = 0.5, a sample path and forecast with 95% confidence interval.]

[Figures for λ = 0.5: expectation and variance of the process compared with their theoretical values, and the sample versus theoretical ACF.]