ARIMA Models. Jamie Monogan. University of Georgia. January 16, 2018.

Objectives
By the end of this meeting, participants should be able to:
- Argue why standard regressions on trending series are inappropriate.
- Describe the logic of the Box-Jenkins modeling strategy.
- Define stationarity and describe the processes of a stationary series.
- Explain why maximum likelihood estimation is essential for ARIMA models and how MLE is tailored to time series needs.
- Identify, estimate, and diagnose ARIMA models.
- Identify and estimate seasonal elements of an ARIMA model.

Spurious Regressions in Time Series: Trend-on-Trend Spuriousness
We start with error specification because it is essential to accurately testing our theories. Imagine that we have a simple hypothesis, say Population → GDP. Our measures of each trend (upward, though it does not matter if they trend in opposite directions).
Granger and Newbold's thesis: any two variables that trend will produce spurious evidence of a causal connection. In the negative: it is impossible to observe the absence of a statistical relationship between any two trending series!
Source: Granger and Newbold. 1974. "Spurious Regressions in Econometrics." Journal of Econometrics 2: 111-120.

Spurious Regressions in Time Series: What's the Bias with Trending Series?
The definition of unbiasedness is E(β̂) = β. For the case of two trending series we estimate two components: (1) β itself, and (2) the ratio of the two trends. Thus
β̂ = β + Trend-Ratio
so β̂ is unbiased if and only if Trend-Ratio = 0.0, and it cannot be zero if both series trend.
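The trend-ratio bias is easy to see by simulation. Below is a minimal numpy sketch (not from the slides; the series, slopes, and noise levels are invented for illustration): two independent series share nothing but deterministic trends, yet OLS recovers roughly the ratio of the trend slopes rather than the true causal β of zero.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
t = np.arange(T)

# Two independent series that share nothing but an upward trend
x = 0.5 * t + rng.normal(0, 5, T)
y = 0.3 * t + rng.normal(0, 5, T)

# OLS of y on x (with an intercept); the true causal beta is 0
X = np.column_stack([np.ones(T), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0][1]

# beta_hat instead lands near the ratio of the two trends, 0.3 / 0.5
trend_ratio = 0.3 / 0.5
print(beta_hat, trend_ratio)
```

Despite no causal connection, the estimated slope sits near the trend ratio, exactly the β̂ = β + Trend-Ratio decomposition above.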

Spurious Regressions in Time Series: First Differencing Trending Series?
First differencing performs the operation w_t = (1 - B)z_t, or simply w_t = z_t - z_{t-1}.
In R: use the diff() function.
In Stata: tsset month, then d.approval.
So, under what conditions is it allowable to regress one trending time series on another? Never!
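In numpy terms (a sketch paralleling the R and Stata commands above, with invented data), first differencing is a single call:

```python
import numpy as np

z = np.array([100.0, 102.0, 101.0, 105.0, 104.0])

# w_t = z_t - z_{t-1}; the differenced series is one observation shorter
w = np.diff(z)
print(w)  # [ 2. -1.  4. -1.]
```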

Structure, Error, and Procedure: Modeling Form, the End Game for Inference and Forecasting
Regression: y_t = {β_0 + β_1 x_1 + β_2 x_2 + ...} + u_t = {structure} + error. The two components are structure and error: Y = structure + error.
Box-Jenkins: y_t = [transfer function] + ARIMA model, that is, y_t = f(x) + N_t.
So we start out working on ARIMA models of error aggregation, the N_t, and only later develop transfer functions as tests of the theories we care about (the opposite of what we usually do). The causal flow of the transfer function cannot be observed until we successfully model the error process.

Structure, Error, and Procedure: How a Time Series Is Produced
We assume the data generating process is: a_t → [linear filter] → z_t. A white-noise input is systematically filtered into the observed time series.
Getting back to white noise: a_t → [linear filter] → z_t implies z_t → [inverse of linear filter] → a_t. So, if we can solve for z = f(a), then we can invert f and produce a (which is white noise).

Structure, Error, and Procedure: The Box-Jenkins Procedure
Identification: What class of models probably produced z_t?
Estimation: What are the model parameters?
Diagnosis: Are the residuals, a_t, from the estimated model white noise?
Empirically identifying the error process: we can infer the data generating process because, knowing the mathematics, we know the empirical signature. We will develop the signatures of a family of error-aggregation models that are autoregressive (AR), integrated (I), moving average (MA), and all combinations: ARIMA(p, d, q).

Structure, Error, and Procedure: AR(1) Example
AR(1), a very important special case. To simplify notation: AR(1) means autoregressive, first order; only the first lag of z appears in the equation:
z_t = θ_0 + φ_1 z_{t-1} + a_t
What is its signature? To answer that question it is useful to transform the equation into shock form, where z is a function of all previous a's:
z_t = θ_0 + a_t + φa_{t-1} + φ²a_{t-2} + φ³a_{t-3} + ... + φ^{t-1}a_1
That is an ugly equation with T terms, but it has useful information about the expected association of each of the a's with z_t:
Lag 1: φ (that is, φ¹). Lag 2: φ². Lag 3: φ³. ... Lag k: φ^k.

Structure, Error, and Procedure: Autocorrelation Function for AR(1)
Since φ is constrained to be less than 1.0 in absolute value, each exponential power of φ is a progressively smaller number, so the ACF decays geometrically: 1, φ, φ², φ³, ...
(Figure: plot of the exponentially decaying ACF.)
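A quick simulation confirms this signature; the sketch below (with an invented φ = 0.7) shows the sample ACF of a simulated AR(1) tracking φ^k at lags 1, 2, 3.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, T = 0.7, 20000

# Simulate an AR(1): z_t = phi * z_{t-1} + a_t
a = rng.normal(size=T)
z = np.empty(T)
z[0] = a[0]
for t in range(1, T):
    z[t] = phi * z[t - 1] + a[t]

def acf(x, k):
    """Sample autocorrelation at lag k."""
    x = x - x.mean()
    return np.dot(x[:-k], x[k:]) / np.dot(x, x)

# Should track phi**k: 0.7, 0.49, 0.343, ...
sample = [acf(z, k) for k in (1, 2, 3)]
print(sample)
```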

Structure, Error, and Procedure: Next Steps
If we observe an empirical series that shows this (very common) pattern of autocorrelation (and a couple of other details), we judge it to be AR(1). This is the IDENTIFICATION stage: we are using empirical evidence such as correlograms to determine the error process.
Once we have tentatively judged the class of model, we estimate the parameters of such a model using MLE (ARIMA ESTIMATION; more later).
After we estimate the model, we calculate the residuals, a_t. Now the really neat part: if our judgment was correct, the estimated residuals a_t must be white noise, and we know how to test for that property! (DIAGNOSIS)
If they are white noise, then we can use this filtered series for our analysis.

Stationarity and Integration
A stationary series is one that tends to return to some equilibrium level after being disturbed. A nonstationary series, or an integrated series, has no equilibrium. The most common integrated series is the random walk (e.g., the DJIA). Box-Jenkins models are defined only for stationary time series. The good news: integrated series can be made stationary by differencing them (which you know how to do).
How to know stationarity: the ACF of a stationary series tends to approach zero after just a few lags, and stay there. Integrated series show systematic behavior over very long lag lengths. In the regression tradition, we will develop the Dickey-Fuller test for unit roots.

Stationarity and Integration: Macropartisanship, a Nonstationary Series
(Figure: slowly decaying autocorrelation function.)

Notation: Three Models and Their Notation
The AR(1) model.
Functional: z_t = φz_{t-1} + a_t, where -1 < φ < 1.
Backshift: (1 - φB)z_t = a_t.
Shock form: z_t = a_t + φa_{t-1} + φ²a_{t-2} + φ³a_{t-3} + ... + φ^t a_0.
The MA(1) model.
Functional: z_t = θ_0 + a_t - θ_1 a_{t-1}.
Backshift: z_t = θ_0 + (1 - θ_1 B)a_t.
Shock form: already an MA, so there is no expansion.
The I(1) model.
Functional: z_t = z_{t-1} + a_t.
Shock form: z_t = a_t + a_{t-1} + a_{t-2} + a_{t-3} + ... + a_0.
Different from the AR(1): no decay terms on the shocks; permanent memory.
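The "permanent memory" of the I(1) shock form shows up as variance that grows without bound. A minimal sketch (all parameters invented): simulate many random-walk paths and compare the cross-sectional variance at an early and a late date.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, T = 2000, 400

# I(1): z_t = z_{t-1} + a_t, so z_t is the running sum of all shocks
a = rng.normal(size=(n_paths, T))
walk = np.cumsum(a, axis=1)

# With unit-variance shocks, Var(z_t) grows linearly in t: no equilibrium
v_early = walk[:, 49].var()   # 50 shocks accumulated, so roughly 50
v_late = walk[:, 399].var()   # 400 shocks accumulated, so roughly 400
print(v_early, v_late)
```

A stationary AR(1), by contrast, would show a bounded variance of σ²/(1 - φ²) at both dates.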

Notation: The General ARMA(p, q) Model
z_t = θ_0 + Σ_{i=1}^{p} φ_i z_{t-i} + Σ_{i=1}^{q} θ_i a_{t-i} + a_t
In econometric notation:
y_t = α_0 + Σ_{i=1}^{p} α_i y_{t-i} + Σ_{i=1}^{q} β_i ε_{t-i} + ε_t

Identification Issues: Some Low-Order Models
White noise: z_t = a_t.
Random walk: z_t - z_{t-1} = a_t, i.e., (1 - B)z_t = a_t [z is cumulated white noise].
AR(1): z_t = φz_{t-1} + a_t, i.e., (1 - φB)z_t = a_t. Note that (1 - φB)z_t = a_t with φ = 1.0 is then exactly a random walk; φ = 1.0 is called a unit root.
MA(1): z_t = a_t - θa_{t-1}, i.e., z_t = (1 - θB)a_t.
IMA(1,1): (1 - B)z_t = (1 - θB)a_t. Thus an IMA(1,1) is simply an MA(1) operating on first differences.
ARMA(1,1): (1 - φB)z_t = (1 - θB)a_t.

Identification Issues: Autocorrelation Function Expectations
AR(1): z_t = φz_{t-1} + a_t, and z_{t-1} = φz_{t-2} + a_{t-1} (from stationarity). Substituting:
z_t = φ(φz_{t-2} + a_{t-1}) + a_t
z_t = φa_{t-1} + φ²z_{t-2} + a_t
z_t = φa_{t-1} + φ²a_{t-2} + φ³z_{t-3} + a_t
...
z_t = a_t + φa_{t-1} + φ²a_{t-2} + φ³a_{t-3} + ... + φ^l a_{t-l} + ...
We expect exponential decay in the ACF: 1, φ, φ², φ³, ...
MA(1): z_t = θ_0 + a_t - θ_1 a_{t-1}. ACF: E(ρ_1) = -θ_1 / (1 + θ_1²).
Crude empirical rule of thumb: ACF(1) ≈ -θ_1/2. This implies ACF(1) is negative (for positive θ_1) and less than 0.5 in absolute value.
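The MA(1) expectation can be checked by simulation; the sketch below uses an invented θ₁ = 0.5, for which E(ρ₁) = -0.5/(1 + 0.25) = -0.4.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, T = 0.5, 100000

# Simulate an MA(1): z_t = a_t - theta * a_{t-1}
a = rng.normal(size=T + 1)
z = a[1:] - theta * a[:-1]

# Sample lag-1 autocorrelation
z0 = z - z.mean()
rho1 = np.dot(z0[:-1], z0[1:]) / np.dot(z0, z0)

expected = -theta / (1 + theta**2)  # the formula from the slide
print(rho1, expected)
```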

Identification Issues: Cookbook Rules for Identification
(Fuller list: Table 6.1 in Box, Jenkins, and Reinsel 2008, 199.)
AR(p): exponential decay in the ACF; p significant spikes in the PACF.
I(1): slow decay in the ACF; 1 significant spike in the PACF.
MA(q): q significant spikes in the ACF; exponential decay in the PACF.
The partial autocorrelation function: the PACF shows the autocorrelation at lag k controlling for all previous lags. Thus it shows the effects at lag k which could not have been predicted from lower lags; in effect, it shows the independent effects of processes at lag k. PACF(1) = ACF(1).

Estimation: Least Squares and MLE
Some ARIMA models, e.g., AR(1), are essentially linear and could be estimated by least squares. For example, z_t = φ_1 z_{t-1} + a_t can be estimated by least squares regression if you just drop the first case.
R:
z <- ts(data$z1)
l.z <- lag(z, -1)
data2 <- ts.union(z, l.z)
reg.1 <- lm(z ~ l.z, data = data2)
Stata: tsset month, then reg z l.z.
The coefficient on the lagged dependent variable is an LS estimate of φ_1. In practice, ARIMA software uses a generalized maximum likelihood algorithm for all ARIMA models. The φ's estimated by LS and ML are not identical, but the difference is nearly always trivial. This is not a case like OLS, where the LS and ML solutions are proven identical when the OLS assumptions hold.
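A Python analogue of the regression above (a sketch with a simulated series and an invented true φ₁ = 0.6): regress z_t on z_{t-1}, dropping the first case, and the slope recovers φ₁.

```python
import numpy as np

rng = np.random.default_rng(4)
phi, T = 0.6, 5000

# Simulate an AR(1) with known phi
a = rng.normal(size=T)
z = np.empty(T)
z[0] = a[0]
for t in range(1, T):
    z[t] = phi * z[t - 1] + a[t]

# Least squares: z_t on z_{t-1} with an intercept, first case dropped
X = np.column_stack([np.ones(T - 1), z[:-1]])
phi_hat = np.linalg.lstsq(X, z[1:], rcond=None)[0][1]
print(phi_hat)
```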

Estimation: Maximum Likelihood Unmasked
Maximum likelihood estimation is really nothing more than efficient trial and error. It has three components:
1. A function to be maximized, the log of the likelihood for the equation. (Why the log instead of the likelihood itself?)
2. An algorithm for generating efficient guesses of parameter values.
3. Starting values for the parameters.
Log-likelihood for ARIMA estimation:
LL(θ) = -(T/2) log(2π) - (T/2) log(σ²) - Σ_{t=1}^{T} a_t² / (2σ²)
This applies to any ARIMA(p, d, q) model. When the likelihood is known, as here, the problem reduces to finding out how to estimate θ and a_t.
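The log-likelihood can be coded directly as a function of the residuals; the sketch below uses an invented helper name, and as a sanity check confirms that, for fixed residuals, the LL is maximized near mean(a_t²), the ML estimate of the innovation variance.

```python
import numpy as np

def arima_loglik(resid, sigma2):
    """LL = -(T/2)log(2*pi) - (T/2)log(sigma2) - sum(a_t^2)/(2*sigma2)."""
    T = len(resid)
    return (-T / 2 * np.log(2 * np.pi)
            - T / 2 * np.log(sigma2)
            - np.sum(resid ** 2) / (2 * sigma2))

# "Efficient trial and error" in its crudest form: a grid of guesses
rng = np.random.default_rng(5)
a = rng.normal(0, 2, 1000)           # pretend residuals with variance ~4
grid = np.linspace(1.0, 9.0, 161)    # candidate values of sigma2
best = grid[np.argmax([arima_loglik(a, s2) for s2 in grid])]
print(best, np.mean(a ** 2))
```

Real ARIMA software replaces the grid with a smarter hill-climbing algorithm and sensible starting values, which are exactly components 2 and 3 above.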

Estimation: MA(1) Illustration
z_t = -θ_1 a_{t-1} + a_t + θ_0. Drop θ_0 for simplicity.
Note the inherent nonlinearity of -θ_1 a_{t-1}: both θ_1 and a_{t-1} are unobserved quantities to be estimated.
Step by step: presume for the moment that we somehow know θ. How do we estimate a_t? Except for the first case, just solve one case at a time: z_t is given.

Estimation: Solve the MA(1) Equation for a_t
Just algebra:
1. z_t = -θa_{t-1} + a_t
2. z_t + θa_{t-1} = a_t (adding θa_{t-1} to both sides)
3. a_t = z_t + θa_{t-1} (reversing)
So, beginning at time zero: if we know a_0, we can solve for a_1; if we know a_1, we can solve for a_2; if we know a_2, we can solve for a_3; and so on, recursively, all the way to a_T.
So assuming or computing a value for a_0 is the key to everything.
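The recursion codes directly; a minimal sketch (the function name is invented) that round-trips: build z from known shocks with θ given and a_0 = 0, then recover the shocks exactly.

```python
import numpy as np

def ma1_residuals(z, theta, a0=0.0):
    """Recover a_t from z_t = -theta*a_{t-1} + a_t via a_t = z_t + theta*a_{t-1}."""
    a = np.empty(len(z))
    prev = a0
    for t, zt in enumerate(z):
        a[t] = zt + theta * prev
        prev = a[t]
    return a

rng = np.random.default_rng(6)
theta = 0.4
true_a = rng.normal(size=50)

# Build z from the known shocks, treating the pre-sample shock a_0 as 0
z = np.empty(50)
z[0] = true_a[0]
z[1:] = true_a[1:] - theta * true_a[:-1]

recovered = ma1_residuals(z, theta)
print(np.max(np.abs(recovered - true_a)))  # zero up to float error
```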

Estimation: Conditional Maximum Likelihood
E(a_t) = 0.0; therefore initialize a_0 = 0.0. Then maximize the log-likelihood conditional on that starting value.
The assumption will be false, but its effect is transient; that is guaranteed by the stationarity condition (for MA terms, by invertibility).
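Putting the pieces together, conditional ML for the MA(1) can be sketched as a crude grid search (names and the true θ₁ = 0.6 are invented for illustration): initialize a_0 = 0, compute the residuals recursively for each candidate θ, and minimize the conditional sum of squares, which for a Gaussian likelihood is equivalent to maximizing the conditional log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(7)
theta_true, T = 0.6, 4000

# Simulate MA(1) data: z_t = -theta * a_{t-1} + a_t
a = rng.normal(size=T + 1)
z = a[1:] - theta_true * a[:-1]

def css(theta, z):
    """Conditional sum of squares, initializing a_0 = 0."""
    prev, total = 0.0, 0.0
    for zt in z:
        at = zt + theta * prev   # a_t = z_t + theta * a_{t-1}
        total += at * at
        prev = at
    return total

# Trial and error over candidate thetas in the invertible region
grid = np.linspace(-0.95, 0.95, 191)
theta_hat = grid[np.argmin([css(th, z) for th in grid])]
print(theta_hat)
```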

Estimation: Unconditional Maximum Likelihood with Backforecasting
Unconditional maximum likelihood: backforecast a_0. Because the backforecast is a product of the known z's and maximum likelihood estimates of θ and the a's, it will be optimal. Hence full ML is preferred to conditional ML.
Backforecasting for MA(1). Given z_t = -θ_1 a_{t-1} + a_t:
z_t - a_t = -θ_1 a_{t-1}
(z_t - a_t)/(-θ_1) = a_{t-1}
Initialize a_{T+1} = 0. Solve for each previous a_{t-1} recursively, even a_0.
Why is this optimal? Backforecasting's assumption about unobservables is made at the series' end, and the error in that assumption is transient, so the transient error will not affect the a_0 forecast at the series' origin.

Seasonality: Cyclical Data
So far, difference equations have focused on trend and error processes in relation to recent values. Perhaps the data cycle across quarters or months.
One approach: linear seasonal terms. Perhaps just add another autoregressive or moving-average term to the difference equation:
ARMA(1,4): y_t = a_1 y_{t-1} + ε_t + β_1 ε_{t-1} + β_4 ε_{t-4}
ARMA(4,1): y_t = a_1 y_{t-1} + a_4 y_{t-4} + ε_t + β_1 ε_{t-1}
The rub: nonseasonal patterns will interact with the seasonal ones. This influences the empirical signature.

Seasonality: Multiplicative Seasonality
The multiplicative model accounts for interaction among terms:
ARMA(1,1) with an MA seasonal term: (1 - a_1 L)y_t = (1 + β_1 L)(1 + β_4 L⁴)ε_t
ARMA(1,1) with an AR seasonal term: (1 - a_1 L)(1 - a_4 L⁴)y_t = (1 + β_1 L)ε_t
The MA seasonal model, in functional terms:
y_t = a_1 y_{t-1} + ε_t + β_1 ε_{t-1} + β_4 ε_{t-4} + β_1 β_4 ε_{t-5}
Notation: our task now will be spotting the empirical signature of such a series. The seasonal ARIMA model is written ARIMA(p, d, q)(P, D, Q)_s. When differencing a series, the subscript refers to the seasonal period and the superscript to the number of differences; using the above terms, the differencing notation is ∇^d ∇_s^D.
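The β₁β₄ε_{t-5} interaction term falls out mechanically from multiplying the two lag polynomials, which is just a convolution of their coefficient vectors. A minimal sketch with invented values β₁ = 0.3, β₄ = 0.5:

```python
import numpy as np

b1, b4 = 0.3, 0.5

# Lag polynomials, coefficients ordered by power of L:
# (1 + b1*L) and (1 + b4*L^4)
p1 = np.array([1.0, b1])
p4 = np.array([1.0, 0.0, 0.0, 0.0, b4])

# Polynomial multiplication = convolution of coefficient vectors;
# the last entry is the lag-5 interaction coefficient b1*b4
product = np.convolve(p1, p4)
print(product)  # [1.  0.3 0.  0.  0.5 0.15]
```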

Assignments: For Next Time
- Write down the research question for your term paper. What is the status of the project? Brand new? Data gathered? What?
- Complete questions #1 and #2.a-2.c from page 185 of Political Analysis Using R.
- Reading: Time Series Analysis for the Social Sciences, chapter 2 (pp. 58-67) and section 7.3 (pp. 187-205).