Time Series 4. Robert Almgren. Oct. 5, 2009


1 Nonstationarity

How should you model a process that has drift? ARMA models are intrinsically stationary; that is, they are mean-reverting: when the value of x_t is above its long-term mean, its next motions will likely be downward, and when it is below its long-term mean, the next motion will likely be up. These models also have decaying autocorrelation: the present value is completely forgotten if you go far enough into the future.

In practice, many series have some sort of steady drift. For example, if you look at the daily volume of some stock, there is likely to be a substantial upward trend, since overall trading volumes have increased. We now discuss three possible ways that such drift can be included in a model.

1.1 Unit roots

When we discussed AR models, we argued that solutions to the model exploded exponentially if P(z) had any roots strictly inside the unit disk. We therefore assumed that all roots of P(z) were strictly outside the unit disk, giving exponential decay. We deliberately ignored the boundary case where P(z) has a root on the unit circle. We shall consider only the case of a root at z = 1 (other cases can be reduced to this by suitable complex rotations).

Example. Recall the AR(1) model x_t = c + φ x_{t-1} + w_t. Then P(z) = 1 - φz has a single root at z = 1/φ, and the root condition requires |φ| < 1. If φ = 1, then the model is the random walk x_t = c + x_{t-1} + w_t.
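The contrast between |φ| < 1 and φ = 1 is easy to see by simulation. A minimal numpy sketch (the helper name and parameter values are ours, chosen for illustration):

```python
import numpy as np

def simulate_ar1(phi, c=0.0, n=2000, sigma=1.0, seed=0):
    """Simulate x_t = c + phi * x_{t-1} + w_t with Gaussian noise w_t."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, sigma, n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = c + phi * x[t - 1] + w[t]
    return x

# |phi| < 1: stationary and mean-reverting around c / (1 - phi) = 2
stationary = simulate_ar1(phi=0.5, c=1.0)

# phi = 1: a random walk with drift c, which wanders without bound
walk = simulate_ar1(phi=1.0, c=1.0)
```

The stationary path stays within a few standard deviations of its long-term mean, while the φ = 1 path trends away indefinitely.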

If we define the differenced series y by y = (I - B)x, so y_t = x_t - x_{t-1}, then y satisfies y_t = c + w_t, which is white noise with mean c (an MA(0) process) and hence stationary.

This example illustrates the general case. Suppose that P(z) has a zero at z = 1 of order r > 0, and all other roots are strictly outside the unit circle. Then we may write P(z) = P̃(z) (1 - z)^r, where P̃(z) is a polynomial of degree p - r, with P̃(0) = 1 and all roots strictly outside the unit circle. The original model can then be written as P̃(B) (I - B)^r x = c + w, and if we define the rth difference series y = (I - B)^r x, then y satisfies the AR(p - r) model P̃(B) y = c + w.

An ARIMA(p, r, q) model is one whose rth difference series follows an ARMA(p, q) model. The I is for "integrated". An ARIMA(p, r, q) model has solutions that grow not exponentially, as they would if P had a root strictly inside the unit circle, but algebraically, with an order equal to r:

If r = 1, then x_t - x_{t-1} is stationary, and x_t = O(t);
If r = 2, then x_t - 2x_{t-1} + x_{t-2} is stationary, and x_t = O(t^2); etc.

These asymptotic orders assume that c ≠ 0, so that y_t has a nonzero mean. If c = 0, then cancellation reduces the growth and we would need a more nuanced probabilistic statement. For example, for r = 1 with c = 0 we get the classic random walk, with E(x_t) = 0 but E(|x_t|) = O(√t). The most common case is r = 1.

Long-memory processes. So far we have always assumed that autocorrelations decayed exponentially in time. This is very convenient theoretically, but not always justified by real data.
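Returning to the driftless random walk above: the E(|x_t|) = O(√t) law is easy to check by Monte Carlo. A quick sketch, assuming Gaussian steps (for which E|x_t| = σ√(2t/π) exactly):

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps = 10000, 400

# Simulate many driftless random walks x_t = x_{t-1} + w_t (sigma = 1)
paths = np.cumsum(rng.normal(size=(n_paths, n_steps)), axis=1)

# For Gaussian steps, E|x_t| = sqrt(2 t / pi) exactly, so the ratio
# E|x_t| / sqrt(t) should be flat at sqrt(2/pi), roughly 0.8
mean_abs = np.abs(paths).mean(axis=0)
t = np.arange(1, n_steps + 1)
ratio = mean_abs / np.sqrt(t)
```

The ratio is approximately constant across t, which is exactly the √t growth claimed for the c = 0, r = 1 case.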

In practice, autocorrelations often decay with power-law behavior: γ_l = O(l^(-β)). We can capture this by interpolating between ARMA and ARIMA models:

ARMA(p, q): P(B) x = c + Q(B) w (exponential decay)
ARIMA(p, 1, q): P(B)(1 - B) x = c + Q(B) w (no decay).

It is then natural to consider models of the form

P(B)(1 - B)^α x = c + Q(B) w (decay O(l^(2α - 1))),

where -0.5 < α < 0.5. In practice, this is equivalent to an ARMA(p, q) model with q = ∞, determined by the series expansion of (1 - z)^α.

1.2 Drift

The other way to get nonstationary behavior is to add it explicitly, by making the constant term explicitly time dependent. Most commonly we just use linear dependence, µ_t = µ_0 + µ_1 t or c_t = c_0 + c_1 t. If the model variable x_t is the log of something, then these linear terms correspond to exponential behavior. We would then typically assume that the associated ARMA model has roots strictly outside the unit circle, so that it reverts to a moving mean value. These models have E(x_t) = a + bt but Var(x_t) finite. Thus they generally stay within a finite distance of a trending mean value, in contrast to a unit-root model, whose variance grows without bound. Although this distinction is clear in theory, on a particular sample of data it can be difficult to tell the difference, and often you have to make the choice based on theoretical preferences.

One might summarize the modeling hierarchy as follows. First, do you think that the distribution should be stationary? If yes, then look for an ARMA model with no drift; the autocorrelation should decay rapidly to zero. If no, then ask what causes the nonstationarity. If you think it is a general linear or exponential growth (e.g., the company you are studying grows with the general economy), then look for an ARMA model with explicit drift.

If you think the value drifts randomly, then look for an ARMA model for differences. Of course, if the quality of your data permits, you will always formulate a model that contains all possible dependencies and see which coefficients come out zero.

Figure 1: Sample paths for a mean-reverting AR(1) model, an AR(1) model with drift, and an ARIMA(1,1,0) random walk. In each panel one realization is highlighted, to emphasize that in practice a single path is all you see. From just this path, you would have trouble distinguishing the mean-reverting model with drift (middle) from the random walk (right).

1.3 Seasonality

Some series may have periodic components: for example, daily market data may have a weekly effect, corresponding to a signal in the data of period 5 (with occasional disruptions when one or more days in a week are holidays). Special techniques should be used to remove this component.

1.4 Variable volatility

Even if a series has no drift, integrated effects, or seasonality, the magnitude of its changes may vary, for example the varying volatility of asset price returns. These problems are well covered by the ARCH/GARCH methodology, which we will discuss in a few weeks.
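For the period-5 weekly effect just mentioned, one crude approach (a sketch only, not a full seasonal-adjustment method; the series and pattern here are synthetic) is to estimate the periodic component as the average value at each position in the cycle and subtract it:

```python
import numpy as np

def remove_seasonal(x, period=5):
    """Estimate the periodic component as the average value at each
    position in the cycle, and subtract it from the series."""
    x = np.asarray(x, dtype=float)
    phases = np.arange(len(x)) % period
    seasonal = np.array([x[phases == p].mean() for p in range(period)])
    return x - seasonal[phases], seasonal

# Synthetic daily series: a period-5 "weekly" pattern plus noise
rng = np.random.default_rng(2)
n = 500
pattern = np.array([2.0, 0.5, 0.0, -0.5, -2.0])
x = pattern[np.arange(n) % 5] + rng.normal(0.0, 0.3, n)

adjusted, est = remove_seasonal(x, period=5)   # est recovers the pattern
```

Standard software packages implement far more careful versions of this idea (handling holidays, moving seasonality, and so on), but the principle is the same.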

1.5 Breaks

The entire data series may change at some specific point, for example because of exogenous effects such as decimalization in equity prices, or Reg NMS. Or there might be intrinsic regime shifts, when the entire model moves from one configuration to another. Again, specialized techniques exist for identifying these changes.

2 Fitting strategy

Now suppose you have a data sequence x_1, ..., x_n (equally spaced in time) and you want to determine a model that fits it to some acceptable extent. We advocate the following sequence of steps:

1. Remove obvious trends and seasonality. At a minimum, this will probably include subtracting an average drift. You could determine the average drift by a linear regression x_j ≈ µ_0 + µ_1 j + ξ_j, where the residuals ξ_j are your new series. Or you could simply take the drift from the endpoints: (x_n - x_1)/n. For seasonality, there are standard techniques built into most software packages, or you can use sophisticated Kalman filter methods, or you can construct some approximation that makes sense to you. This step is inevitably rather messy and there is no easy formula.

2. Look for the best AR model. Carry out a sequence of linear regressions of each term on its predecessors, at increasing order:

x_t ≈ c + β_1 x_{t-1} + w_t
x_t ≈ c + β_1 x_{t-1} + β_2 x_{t-2} + w_t
...

At each stage, the last coefficient β_k is the partial autocorrelation coefficient. Hopefully, these coefficients will have nontrivial magnitude for the first few orders, and then will suddenly drop to a negligible magnitude at a particular order, telling you exactly where to stop.
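Steps 1 and 2 can be sketched with plain numpy (the helper names `detrend` and `pacf_by_regression`, and the synthetic AR(2) example, are ours):

```python
import numpy as np

def detrend(x):
    """Step 1: subtract a fitted linear drift mu_0 + mu_1 * t by OLS."""
    t = np.arange(len(x))
    mu1, mu0 = np.polyfit(t, x, 1)
    return x - (mu0 + mu1 * t)

def pacf_by_regression(x, max_lag):
    """Step 2: regress x_t on a constant and its first k lags, for
    k = 1, ..., max_lag; the last coefficient of each regression is the
    partial autocorrelation at lag k."""
    x = np.asarray(x, dtype=float)
    out = []
    for k in range(1, max_lag + 1):
        cols = [np.ones(len(x) - k)]
        cols += [x[k - j:len(x) - j] for j in range(1, k + 1)]   # lags 1..k
        beta, *_ = np.linalg.lstsq(np.column_stack(cols), x[k:], rcond=None)
        out.append(beta[-1])          # coefficient on lag k
    return np.array(out)

# Synthetic AR(2) series: partial autocorrelations should be sizeable at
# lags 1 and 2 and near zero beyond (within about 1/sqrt(n))
rng = np.random.default_rng(3)
n = 5000
x = np.zeros(n)
w = rng.normal(size=n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + w[t]
pacf = pacf_by_regression(detrend(x), max_lag=5)
```

The estimated coefficients drop from nontrivial values at lags 1 and 2 to negligible values at lags 3 and beyond, which is the cutoff pattern described above.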

More precisely, if x is truly described by an AR(p) process, then β_k will be close to zero for k > p. If the noise terms have finite variance, and are not too far from Gaussian, then the variance in the estimate of β_k is 1/n for k > p (note that the coefficients β_j are nondimensional, so the noise variance does not enter into this estimate). Tsay describes a more sophisticated approach that chooses the order to maximise an information criterion, balancing precision of fit against number of parameters. By whatever means, choose the optimal AR order.

3. Fit an MA model. After fitting the best AR model, the residuals may not yet be white noise. Examine their empirical autocorrelation, and see whether it plausibly drops to zero beyond a particular finite order q. Then the residuals can be fit by an MA(q) model, and you have constructed an ARMA(p, q) model.

4. Test the final residuals as white noise. If the model constructed in the previous steps is adequate, then the final residuals should have no identifiable structure. They should pass empirical tests for white noise: zero mean, constant variance, and zero correlation. If you can convince yourself that they pass those tests, then you are done.

Unit root tests. To identify ARIMA models from the AR fit, we need to determine whether a polynomial of order p has a root at z = 1, which sounds complicated. In fact, by rewriting the problem it becomes equally easy. The main observation is that we can write the polynomial as

P(z) = 1 - φ_1 z - ... - φ_p z^p = 1 - β z - z(1 - z) P̃(z),

where P̃(z) = α_0 + α_1 z + ... + α_{p-2} z^{p-2} has degree p - 2. (To convince yourself that you can rewrite P(z) this way, write out the conditions that determine β, α_0, ..., α_{p-2} in terms of φ_1, ..., φ_p, and convince yourself they can always be solved.) In this form, P(1) = 0 if and only if β = 1.
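To see concretely that the rewrite always works, here is a small sketch (the function name is ours) that solves for β and the α_j by matching coefficients, then checks the identity numerically; matching the z^1 coefficient shows β = φ_1 - α_0, which telescopes to β = φ_1 + ... + φ_p:

```python
import numpy as np

def unit_root_form(phi):
    """Match coefficients in
        1 - phi_1 z - ... - phi_p z^p
            = 1 - beta z - z(1 - z)(alpha_0 + ... + alpha_{p-2} z^{p-2}).
    The z^p term gives alpha_{p-2} = -phi_p; the z^k terms for
    k = p-1, ..., 2 give alpha_{k-2} = alpha_{k-1} - phi_k; and
    beta = phi_1 + ... + phi_p, so P(1) = 0 iff beta = 1."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    alpha = np.zeros(p - 1)
    if p >= 2:
        alpha[p - 2] = -phi[p - 1]
        for k in range(p - 1, 1, -1):
            alpha[k - 2] = alpha[k - 1] - phi[k - 1]
    return phi.sum(), alpha

# Check the identity at a few points z for an arbitrary AR(4) polynomial
phi = [0.4, -0.2, 0.3, 0.1]
beta, alpha = unit_root_form(phi)
for z in (0.3, -0.7, 1.0, 2.5):
    P = 1 - sum(phi[k] * z ** (k + 1) for k in range(len(phi)))
    rhs = 1 - beta * z - z * (1 - z) * sum(a * z ** j for j, a in enumerate(alpha))
    assert abs(P - rhs) < 1e-12
```

The backward recursion always has a solution, which is the "convince yourself" exercise in the text made explicit.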

Then the AR(p) model P(B) x = c + w can be written as

x_t = c_t + β x_{t-1} + Σ_{l=1}^{p-1} α_{l-1} (Δx)_{t-l} + w_t,

where Δ = I - B is the difference operator, and we have allowed for potential time dependence in c_t. Thus all we have to do is regress x_t on

(1, t, x_{t-1}, x_{t-1} - x_{t-2}, ..., x_{t-p+1} - x_{t-p}),

which is almost as easy as the original regression. Standard packages give us the estimate of β and its uncertainty, and we evaluate it as above.
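This regression can be sketched with plain numpy (the helper name is ours). One caveat worth knowing: under the unit-root null, the t statistic for β = 1 follows the nonstandard Dickey-Fuller distribution rather than a normal one, so the standard error below should not be fed blindly into normal critical values:

```python
import numpy as np

def unit_root_regression(x, p):
    """Regress x_t on (1, t, x_{t-1}, dx_{t-1}, ..., dx_{t-p+1}) by OLS,
    where dx_t = x_t - x_{t-1}. Returns the estimate of beta (the
    coefficient on x_{t-1}) and its nominal standard error."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                      # dx[i] = x[i+1] - x[i]
    n = len(x)
    t = np.arange(p, n, dtype=float)     # usable targets are x_p, ..., x_{n-1}
    cols = [np.ones(n - p), t, x[p - 1:n - 1]]
    for l in range(1, p):                # lagged differences dx_{t-l}
        cols.append(dx[p - 1 - l:n - 1 - l])
    X = np.column_stack(cols)
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[2], np.sqrt(cov[2, 2])

# A pure random walk should give beta close to 1
rng = np.random.default_rng(7)
walk = np.cumsum(rng.normal(size=3000))
beta, se = unit_root_regression(walk, p=3)
```

For a stationary AR series the same regression returns a β estimate well below 1, which is the distinction the test is after.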