Chapter 6: Model Specification for Time Series

The ARIMA(p, d, q) class of models is broad enough to describe many real time series. Model specification for ARIMA(p, d, q) models involves:

1. Choosing appropriate values for p, d, and q;
2. Estimating the parameters (e.g., the $\phi$'s, $\theta$'s, and $\sigma_e^2$) of the ARIMA(p, d, q) model;
3. Checking model adequacy and, if necessary, improving the model.

This process of iteratively proposing, checking, adjusting, and re-checking the model is known as the Box-Jenkins method for fitting time series models.

The Sample Autocorrelation Function

We know that the autocorrelations are important characteristics of our time series models. To get an idea of the autocorrelation structure of a process based on observed data, we look at the sample autocorrelation function:

$$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}$$

We look for patterns in the $r_k$ values that are similar to the known patterns of the $\rho_k$ for the ARMA models we have studied. Since the $r_k$ are merely estimates of the $\rho_k$, we cannot expect the $r_k$ patterns to match the $\rho_k$ patterns of a model exactly.
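As a concrete companion to this formula, here is a minimal NumPy sketch (the helper name sample_acf is ours; statsmodels.tsa.stattools.acf provides a ready-made equivalent):

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series y."""
    y = np.asarray(y, dtype=float)
    dev = y - y.mean()                  # deviations Y_t - Ybar
    denom = np.sum(dev ** 2)            # sum_{t=1}^{n} (Y_t - Ybar)^2
    # numerator for lag k: sum_{t=k+1}^{n} (Y_t - Ybar)(Y_{t-k} - Ybar)
    return np.array([np.sum(dev[k:] * dev[:-k]) / denom
                     for k in range(1, max_lag + 1)])
```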

The Sampling Distribution of $r_k$

Because of its form, and because it is a function of possibly correlated variables, $r_k$ does not have a simple sampling distribution. For large sample sizes, the approximate sampling distribution of $r_k$ can be found when the data come from ARMA-type models. This sampling distribution is approximately normal with mean $\rho_k$, so a strategy for checking model adequacy is to see whether the $r_k$'s fall within 2 standard errors of their expected values (the $\rho_k$'s). For our purposes, we consider several common models and find the approximate expected values, variances, and correlations of the $r_k$'s under these models.

The Sampling Distribution of $r_k$ Under Common Models

First, under general conditions, for large $n$, $r_k$ is approximately normal with expected value $\rho_k$. If $\{Y_t\}$ is white noise, then for large $n$, $\text{var}(r_k) \approx 1/n$ and $\text{corr}(r_k, r_j) \approx 0$ for $k \neq j$. If $\{Y_t\}$ is AR(1) with $\rho_k = \phi^k$ for $k > 0$, then $\text{var}(r_1) \approx (1 - \phi^2)/n$. Note that $r_1$ has smaller variance when $\phi$ is near 1 or $-1$. And for large $k$,

$$\text{var}(r_k) \approx \frac{1}{n}\left[\frac{1 + \phi^2}{1 - \phi^2}\right]$$

This variance tends to be larger for large $k$ than for small $k$, especially when $\phi$ is near 1 or $-1$. So when $\phi$ is near 1 or $-1$, we can expect $r_k$ to be relatively close to $\rho_k = \phi^k$ for small $k$, but not especially close to $\rho_k = \phi^k$ for large $k$.
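The contrast between the small-lag and large-lag variances is easy to evaluate numerically; a small helper of ours for these two approximations:

```python
def ar1_rk_var(phi, k, n):
    """Large-sample variance approximations for r_k under an AR(1) model:
    var(r_1) ~ (1 - phi^2)/n; for large k, var(r_k) ~ (1 + phi^2)/((1 - phi^2) n)."""
    if k == 1:
        return (1 - phi ** 2) / n
    return (1 + phi ** 2) / ((1 - phi ** 2) * n)

# With phi = 0.9 and n = 100: var(r_1) ~ 0.0019 but var(r_k) ~ 0.095 at large k,
# so r_1 is tightly determined while high-lag r_k values wander widely.
print(ar1_rk_var(0.9, 1, 100), ar1_rk_var(0.9, 50, 100))
```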

The Sampling Distribution of $r_k$ Under the AR(1) Model

In the AR(1) model,

$$\text{corr}(r_1, r_2) \approx 2\phi \sqrt{\frac{1 - \phi^2}{1 + 2\phi^2 - 3\phi^4}}$$

For example, if $\phi = 0.9$, then $\text{corr}(r_1, r_2) \approx 0.97$; similarly, if $\phi = -0.9$, then $\text{corr}(r_1, r_2) \approx -0.97$. If $\phi = 0.2$, then $\text{corr}(r_1, r_2) \approx 0.38$; similarly, if $\phi = -0.2$, then $\text{corr}(r_1, r_2) \approx -0.38$. Exhibit 6.1 on page 111 of the textbook gives other example values of $\text{corr}(r_1, r_2)$ and $\text{var}(r_k)$ for selected values of $k$ and $\phi$, assuming an AR(1) process. To determine whether a certain model (like AR(1) with $\phi = 0.9$) is reasonable, we can examine the sample autocorrelation function and compare the observed values from our data to those we would expect under that model.
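To reproduce the example values above, the formula can be evaluated directly (a throwaway helper of ours):

```python
import numpy as np

def ar1_corr_r1_r2(phi):
    """Approximate corr(r_1, r_2) under an AR(1) model with parameter phi."""
    return 2 * phi * np.sqrt((1 - phi ** 2) / (1 + 2 * phi ** 2 - 3 * phi ** 4))

print(round(ar1_corr_r1_r2(0.9), 2), round(ar1_corr_r1_r2(-0.2), 2))  # 0.97 -0.38
```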

The Sampling Distribution of $r_k$ Under the MA(1) Model

For the MA(1) model, $\text{var}(r_1) \approx (1 - 3\rho_1^2 + 4\rho_1^4)/n$ and $\text{var}(r_k) \approx (1 + 2\rho_1^2)/n$ for $k > 1$. Exhibit 6.2 on page 112 of the textbook gives other example values of $\text{corr}(r_1, r_2)$ and $\text{var}(r_k)$ for any $k$ and selected values of $\theta$, under the MA(1) model. For the MA(q) model, $\text{var}(r_k) \approx \left(1 + 2\sum_{j=1}^{q} \rho_j^2\right)/n$ for $k > q$.
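The MA(q) expression gives the standard error used to judge whether the sample ACF has "cut off" after lag q. A plug-in sketch (helper name ours), substituting sample autocorrelations for the unknown $\rho_j$'s:

```python
import numpy as np

def ma_cutoff_se(r, q, n):
    """Approximate SE of r_k for k > q under an MA(q) model:
    sqrt((1 + 2 * (r_1^2 + ... + r_q^2)) / n), with the sample r_j
    plugged in for rho_j. r is the vector (r_1, r_2, ...)."""
    r = np.asarray(r, dtype=float)
    return np.sqrt((1 + 2 * np.sum(r[:q] ** 2)) / n)

# e.g., treat an MA(q) model as doubtful if |r_{q+1}| > 2 * ma_cutoff_se(r, q, n)
```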

The Need to Go Beyond the Autocorrelation Function

The sample autocorrelation function (ACF) is a useful tool for checking whether the lag correlations that we see in a data set match what we would expect under a specific model. For example, in an MA(q) model, we know the autocorrelations should be zero for lags beyond $q$, so we could check the sample ACF to see where the autocorrelations cut off for an observed data set. But for an AR(p) model, the autocorrelations don't cut off at a certain lag; they die off gradually toward zero.

Partial Autocorrelation Functions

The partial autocorrelation function (PACF) can be used to determine the order $p$ of an AR(p) model. The PACF at lag $k$ is denoted $\phi_{kk}$ and is defined as the correlation between $Y_t$ and $Y_{t-k}$ after removing the effect of the variables in between: $Y_{t-1}, \ldots, Y_{t-k+1}$. If $\{Y_t\}$ is a normally distributed time series, the PACF can be defined as the correlation coefficient of a conditional bivariate normal distribution:

$$\phi_{kk} = \text{corr}(Y_t, Y_{t-k} \mid Y_{t-1}, \ldots, Y_{t-k+1})$$

Partial Autocorrelations in the AR(p) and MA(q) Processes

In the AR(1) process, $\phi_{kk} = 0$ for all $k > 1$. So the partial autocorrelation at lag 1 is nonzero, but at all higher lags it is zero. More generally, in the AR(p) process, $\phi_{kk} = 0$ for all $k > p$. Clearly, examining the PACF of an AR process can help us determine the order of that process. For an MA(1) process,

$$\phi_{kk} = -\frac{\theta^k (1 - \theta^2)}{1 - \theta^{2(k+1)}}$$

So the partial autocorrelation of an MA(1) process never equals zero exactly, but it decays to zero quickly as $k$ increases. In general, the PACF of an MA(q) process behaves like the ACF of an AR(q) process.
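To see the decay, the formula can be tabulated directly; a quick sketch of ours, using the convention $Y_t = e_t - \theta e_{t-1}$:

```python
def ma1_pacf(theta, k):
    """Theoretical lag-k partial autocorrelation of an MA(1) process
    Y_t = e_t - theta * e_{t-1}."""
    return -(theta ** k) * (1 - theta ** 2) / (1 - theta ** (2 * (k + 1)))

# For theta = 0.5 the PACF decays geometrically but never hits zero exactly:
print([round(ma1_pacf(0.5, k), 3) for k in range(1, 6)])
# [-0.4, -0.19, -0.094, -0.047, -0.023]
```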

Expressions for the Partial Autocorrelation Function

For a stationary process with known autocorrelations $\rho_1, \ldots, \rho_k$, the $\phi_{kk}$ satisfy the Yule-Walker equations:

$$\rho_j = \phi_{k1}\rho_{j-1} + \phi_{k2}\rho_{j-2} + \cdots + \phi_{kk}\rho_{j-k}, \quad j = 1, 2, \ldots, k$$

For a given $k$, these equations can be solved for $\phi_{k1}, \phi_{k2}, \ldots, \phi_{kk}$ (though we only care about $\phi_{kk}$), and we can do this for every $k$. If the stationary process is actually AR(p), then $\phi_{pp} = \phi_p$, and the order $p$ of the process is the highest lag with a nonzero $\phi_{kk}$.
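For each $k$, the Yule-Walker system is linear in the $\phi_{kj}$'s, so it can be solved directly; a NumPy sketch of ours:

```python
import numpy as np

def pacf_from_rho(rho):
    """Solve the Yule-Walker equations for phi_kk, k = 1..len(rho), given
    autocorrelations rho = (rho_1, rho_2, ...). Returns (phi_11, phi_22, ...)."""
    rho = np.asarray(rho, dtype=float)
    pacf = []
    for k in range(1, len(rho) + 1):
        # k x k Toeplitz matrix with (i, j) entry rho_{|i-j|} (and rho_0 = 1)
        R = np.array([[1.0 if i == j else rho[abs(i - j) - 1]
                       for j in range(k)] for i in range(k)])
        phi = np.linalg.solve(R, rho[:k])
        pacf.append(phi[-1])            # phi_kk is the last coefficient
    return np.array(pacf)

# Sanity check for AR(1) with phi = 0.6, where rho_j = 0.6**j:
print(np.round(pacf_from_rho([0.6 ** j for j in range(1, 6)]), 6))
# phi_11 = 0.6 and phi_kk = 0 for all k > 1
```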

The Sample Partial Autocorrelation Function

By replacing the $\rho_k$'s in the previous set of linear equations by the $r_k$'s, we can solve these equations for estimates of the $\phi_{kk}$'s. Equation (6.2.9) on page 115 of the textbook gives a formula for solving recursively for the $\phi_{kk}$'s in terms of the $\rho_k$'s. Replacing the $\rho_k$'s by the $r_k$'s, we get the sample partial autocorrelations, the $\hat{\phi}_{kk}$'s.
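A sketch of that recursive scheme (a Levinson-Durbin-type recursion), applied here to sample autocorrelations; statsmodels.tsa.stattools.pacf offers a ready-made alternative:

```python
import numpy as np

def sample_pacf(r):
    """Recursive sample PACF from sample autocorrelations r = (r_1, r_2, ...);
    returns (phihat_11, phihat_22, ...)."""
    r = np.asarray(r, dtype=float)
    K = len(r)
    phi = np.zeros((K + 1, K + 1))      # phi[k, j] stores phi_{k,j}
    phi[1, 1] = r[0]
    for k in range(2, K + 1):
        num = r[k - 1] - sum(phi[k - 1, j] * r[k - 1 - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1, j] * r[j - 1] for j in range(1, k))
        phi[k, k] = num / den
        for j in range(1, k):           # update the lower-order coefficients
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return np.array([phi[k, k] for k in range(1, K + 1)])
```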

Using the Sample PACF to Assess the Order of an AR Process

If the AR(p) model is correct, then the sample partial autocorrelations at lags greater than $p$ are approximately normally distributed with mean 0 and variance $1/n$. So for any lag $k > p$, if the sample partial autocorrelation $\hat{\phi}_{kk}$ is within 2 standard errors of zero (between $-2/\sqrt{n}$ and $2/\sqrt{n}$), this indicates that we do not have evidence against the AR(p) model. If for some lag $k > p$ we have $\hat{\phi}_{kk} < -2/\sqrt{n}$ or $\hat{\phi}_{kk} > 2/\sqrt{n}$, then we may need to change the order $p$ in our model (or possibly choose a model other than AR). This is a somewhat informal test, since it doesn't account for the multiple decisions being made across the set of values of $k > p$.
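This screening rule is one line of code; a sketch of ours, reusing sample_pacf from above:

```python
import numpy as np

def flag_pacf_lags(pacf_hat, n):
    """Return the lags whose sample PACF lies outside the +/- 2/sqrt(n) band."""
    bound = 2.0 / np.sqrt(n)
    return [k + 1 for k, v in enumerate(pacf_hat) if abs(v) > bound]

# Under a correct AR(p) model, flagged lags beyond p should be rare
# (though, as noted above, a few false flags are expected across many lags).
```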

Summary of ACF and PACF to Identify AR(p) and MA(q) Processes

The ACF and PACF are useful tools for identifying pure AR(p) and MA(q) processes. For an AR(p) model, the true ACF decays toward zero, while the true PACF cuts off (becomes zero) after lag $p$. For an MA(q) model, the true ACF cuts off (becomes zero) after lag $q$, while the true PACF decays toward zero. If we propose either an AR(p) model or an MA(q) model for an observed time series, we can examine the sample ACF or PACF to see whether it is close to what the true ACF or PACF would look like under the proposed model.
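These patterns are easy to see in simulation; a sketch using statsmodels (the AR(2) parameters are arbitrary choices of ours):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf, pacf

# AR(2) process (1 - 0.6L + 0.3L^2) Y_t = e_t, i.e. phi_1 = 0.6, phi_2 = -0.3;
# ArmaProcess takes lag-polynomial coefficients, so the phi's enter with flipped signs.
y = ArmaProcess(ar=[1, -0.6, 0.3], ma=[1]).generate_sample(nsample=500)

print(np.round(acf(y, nlags=5)[1:], 2))    # decays gradually toward zero
print(np.round(pacf(y, nlags=5)[1:], 2))   # roughly zero beyond lag 2
```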

Extended Autocorrelation Function for Identifying ARMA Models

For an ARMA(p, q) model, the true ACF and true PACF both have infinitely many nonzero values; neither cuts off entirely after a certain number of lags. So it is hard to determine the correct orders of an ARMA(p, q) model simply by using the ACF and PACF. The extended autocorrelation function (EACF) is one method proposed to assess the orders of an ARMA(p, q) model. Other methods for specifying ARMA(p, q) models include the corner method and the smallest canonical correlation (SCAN) method, which we will not discuss here.

Some Details about the Extended Autocorrelation Function Method

In an ARMA(p, q) model, if we filter out (i.e., subtract off) the autoregressive component, we are left with a pure MA(q) process that can be specified using the ACF approach. For example, consider an ARMA(1, 1) model: $Y_t = \phi Y_{t-1} + e_t - \theta e_{t-1}$. If we regress $Y_t$ on $Y_{t-1}$, we get an inconsistent estimator of $\phi$, but this regression's residuals do tell us about the behavior of the error process $\{e_t\}$. So we then regress $Y_t$ on $Y_{t-1}$ and on the lag-1 residuals of the first regression, which stand in for $e_{t-1}$. In this second regression, the estimated coefficient of $Y_{t-1}$ (call it $\tilde{\phi}$) is a consistent estimator of $\phi$. Then $W_t = Y_t - \tilde{\phi} Y_{t-1}$, with the autoregressive part filtered out, should be approximately MA(1).

More Details about the Extended Autocorrelation Function Method

For higher-order ARMA processes, we need more of these sequential regressions to consistently estimate the AR coefficients (we'd need $q$ extra regressions for an ARMA(p, q) model). In practice, both the AR order $p$ and the MA order $q$ are unknown, so we proceed iteratively, considering a grid of values for $p$ and $q$. This iterative estimation of the AR coefficients, assuming a hypothetical ARMA(k, j) model, produces the filtered values

$$W_{t,k,j} = Y_t - \tilde{\phi}_1 Y_{t-1} - \cdots - \tilde{\phi}_k Y_{t-k}$$
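As an illustration of the filtering step, here is a deliberately simplified NumPy sketch of ours: each pass regresses $Y_t$ on its $k$ lags plus the lag-1 residuals of the previous pass, mirroring the ARMA(1, 1) example above. Production implementations (e.g., the eacf function in R's TSA package) refine this scheme.

```python
import numpy as np

def lagmat(x, lags):
    """Matrix whose columns are x_{t-1}, ..., x_{t-lags}, NaN-padded."""
    n = len(x)
    out = np.full((n, lags), np.nan)
    for i in range(1, lags + 1):
        out[i:, i - 1] = x[:n - i]
    return out

def eacf_filtered(y, k, j):
    """Filtered series W_{t,k,j} for a hypothesized ARMA(k, j): run j + 1
    sequential least-squares regressions, then subtract the estimated AR part."""
    y = np.asarray(y, dtype=float)
    X = lagmat(y, k)
    resid = None
    for _ in range(j + 1):
        Z = X if resid is None else np.column_stack([X, lagmat(resid, 1)])
        ok = ~np.isnan(Z).any(axis=1)
        beta, *_ = np.linalg.lstsq(Z[ok], y[ok], rcond=None)
        resid = np.full(len(y), np.nan)
        resid[ok] = y[ok] - Z[ok] @ beta    # residuals stand in for the e_t's
    w = y - X @ beta[:k]                    # subtract off the estimated AR(k) part
    return w[~np.isnan(w)]
```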

Extended Sample Autocorrelations

The extended sample autocorrelations are the sample autocorrelations of $W_{t,k,j}$. If the hypothesized AR order $k$ equals the correct AR order $p$, and if the hypothesized MA order $j \geq q$, then $\{W_{t,k,j}\}$ is approximately an MA(q) process. In that case, the true autocorrelations of $W_{t,k,j}$ at lags $q + 1$ and higher should be zero. We can compute the extended sample autocorrelations over a grid of values $k = 0, 1, 2, \ldots$ and $j = 0, 1, 2, \ldots$.

The EACF in Table Form

We can summarize the EACF by creating a table with an x in the $k$-th row and $j$-th column if the lag $j + 1$ sample autocorrelation of $W_{t,k,j}$ is significantly different from zero. Since the sample autocorrelations are approximately $N(0, 1/(n - k - j))$ under the MA(j) process, the sample autocorrelation is significantly different from zero if its absolute value exceeds $1.96/\sqrt{n - k - j}$. The table gets a 0 in its $k$-th row and $j$-th column if the lag $j + 1$ sample autocorrelation of $W_{t,k,j}$ is NOT significantly different from zero.

The EACF in Table Form (Continued)

The EACF table for an ARMA(p, q) process should theoretically have a triangular pattern of zeroes, with the top-left zero occurring in the $p$-th row and $q$-th column (with the row and column labels both starting from 0). (In reality, the sample EACF table will not be as clear-cut as the examples that follow, since the sample EACF values have sampling variability.)
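Putting the pieces together, a sketch of ours that builds the x/0 table from the sample_acf and eacf_filtered helpers defined earlier:

```python
import numpy as np

def eacf_table(y, max_k=7, max_j=10):
    """Print-ready EACF table: entry (k, j) is 'x' if the lag-(j+1) sample
    autocorrelation of W_{t,k,j} exceeds 1.96/sqrt(n - k - j) in absolute
    value, and '0' otherwise. Look for a triangle of 0's with its corner
    at row p, column q."""
    n = len(y)
    lines = []
    for k in range(max_k + 1):
        cells = []
        for j in range(max_j + 1):
            w = eacf_filtered(y, k, j)
            r = sample_acf(w, j + 1)[-1]        # lag j + 1 autocorrelation
            crit = 1.96 / np.sqrt(n - k - j)
            cells.append('x' if abs(r) > crit else '0')
        lines.append(f'{k}  ' + ' '.join(cells))
    return '\n'.join(lines)
```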

Theoretical EACF Table for an ARMA(1, 1) Process

AR/MA  0  1  2  3  4  5  6  7  8  9  10
  0    x  x  x  x  x  x  x  x  x  x  x
  1    x  0  0  0  0  0  0  0  0  0  0
  2    x  x  0  0  0  0  0  0  0  0  0
  3    x  x  x  0  0  0  0  0  0  0  0
  4    x  x  x  x  0  0  0  0  0  0  0
  5    x  x  x  x  x  0  0  0  0  0  0
  6    x  x  x  x  x  x  0  0  0  0  0
  7    x  x  x  x  x  x  x  0  0  0  0

Theoretical EACF Table for an ARMA(2, 3) Process

AR/MA  0  1  2  3  4  5  6  7  8  9  10
  0    x  x  x  x  x  x  x  x  x  x  x
  1    x  x  x  x  x  x  x  x  x  x  x
  2    x  x  x  0  0  0  0  0  0  0  0
  3    x  x  x  x  0  0  0  0  0  0  0
  4    x  x  x  x  x  0  0  0  0  0  0
  5    x  x  x  x  x  x  0  0  0  0  0
  6    x  x  x  x  x  x  x  0  0  0  0
  7    x  x  x  x  x  x  x  x  0  0  0