The autocorrelation and autocovariance functions - helpful tools in the modelling problem


J. Nowicka-Zagrajek, A. Wyłomańska
Institute of Mathematics and Computer Science, Wrocław University of Technology, Poland
E-mail address: agnieszka.wylomanska@pwr.wroc.pl

One of the important steps towards constructing an appropriate mathematical model for real-life data is to determine the structure of dependence. A conventional way of gaining information about the dependence structure (in the second-order case) of a given set of observations is to estimate the autocovariance or the autocorrelation function (ACF), which can provide useful guidance in the choice of a satisfactory model or family of models. As in some cases the ACF of real-life data may turn out to be insufficient to solve the model selection problem, we propose to consider the autocorrelation function of the squared series as well. Using this approach, we investigate the dependence structure for several cases of time series models.

PACS: 05.45.Tp, 02.50.Cw, 89.65.Gh

1. Motivation

One of the important steps towards constructing an appropriate mathematical model for real-life data is to determine the structure of dependence. A conventional way of gaining information about the dependence structure (in the second-order case) of a given set of observations is to estimate one of the most popular measures of dependence, the autocovariance (ACVF) or the autocorrelation (ACF) function, and this can provide useful guidance in the choice of a satisfactory model or family of models. This traditional method is well established in time series textbooks, see e.g. [1, 2, 3], and widely used in various areas, e.g. physics [4, 5], electrical engineering [6], economics and finance [7, 8], and meteorology [9, 10]. However, inspection of the correlations of the data can sometimes lead to a wrong preliminary choice of the model.
For example, looking at the sample autocorrelation functions in the top left and bottom left panels of Fig. 1, we might guess that these two data sets were drawn from the same time series model. But we avoid this mistake if we compare the sample autocorrelations of the squared data: as can be seen from the right panels of Fig. 1, the sample ACFs (circles) of the squares behave quite differently for the two samples. Therefore, in this paper we propose to consider not only the autocovariance or the autocorrelation function but also higher-order dependence, namely the dependence structure of the squared series. In Section 2 we recall the well-known definitions of the autocovariance and autocorrelation functions and note that it is sufficient to restrict our attention to the autocorrelation function. Our aim is then to show the differences in the behaviour of the ACF for a process and for its squares across assorted models. Therefore, in Section 3 we consider

some time series models and, in order to keep the results simple and illustrative, we focus on particular low-order models. We study examples of Autoregressive Moving Average (ARMA) models, as they belong to the class of linear processes and are often used for modelling empirical time series. Although linear processes are appropriate for describing many real-life phenomena, they do not capture some features of financial time series (e.g. periods of high and low volatility tend to persist in the market longer than can be explained by linear processes). Therefore, ARCH (Autoregressive Conditionally Heteroskedastic) and GARCH (Generalized ARCH) processes were developed to model the behaviour of stock, stock index, currency, etc. returns, and thus we discuss two examples of such models. In the same section we also consider a specific SARIMA model; we provide explicit expressions for the autocorrelation functions and show that they might prove a helpful guide in selecting a possible model for real-life data that exhibit seasonal behaviour. Finally, Section 4 gives a few concluding remarks.

[Figure 1 near here: theoretical ACF (stars) and sample ACF (circles) for iid standard Gaussian noise (top panels) and an ARCH(2) model (bottom panels); series on the left, squares on the right.]

2. Measures of dependence for finite-variance time series

If we want to describe the dependence structure of a finite-variance time series {X_n}, we can use one of the most popular measures of dependence, i.e. the autocovariance or the autocorrelation function:

cov(X_n, X_m) = E(X_n X_m) − EX_n EX_m,
corr(X_n, X_m) = cov(X_n, X_m) / ( √Var(X_n) √Var(X_m) ).

It is worth noting that, contrary to some measures defined for infinite-variance time series (see e.g. [11]), both measures considered here are symmetric, and the rescaled (standardized) autocovariance gives the autocorrelation.
Moreover, it is obvious that for a stationary time series {X_n} both the autocovariance and the autocorrelation function depend only on the difference between n and m, and therefore we can write:

cov_X(m − n) = cov(X_n, X_m),  corr_X(m − n) = corr(X_n, X_m).
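In practice these quantities are replaced by their sample counterparts. A minimal Python sketch of the standard sample-ACF estimator (the same formula recalled below), with the full-sample sum of squares in the denominator; the function name is an illustrative choice:

```python
def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(k), k = 0..max_lag: the sum of
    lagged mean-removed products divided by the total sum of squares
    (the biased but positive semi-definite version of the estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]

# To inspect the dependence structure of the squares, as proposed in this
# paper, simply apply the same routine to the squared observations:
# sample_acf([v * v for v in x], max_lag)
```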

Usually in model building it is convenient to use the autocorrelation function, so from now on we study only this measure. Of course, if we have the formula for the autocorrelation function of a stationary time series, we can obtain the autocovariance function by multiplying by Var(X_n). We denote by ρ(k) the theoretical ACF of a stationary time series {X_n} and by ρ̂(k) the sample autocorrelation function (sample ACF) for n observations x_1, x_2, ..., x_n, given by:

ρ̂(k) = Σ_{i=1}^{n−k} (x_i − x̄)(x_{i+k} − x̄) / Σ_{i=1}^{n} (x_i − x̄)²,  where x̄ = (1/n) Σ_{i=1}^{n} x_i.

As we will study the dependence structure not only of the time series but of the squared time series as well, let us assume that EX_n⁴ exists for every n and that {X_n²} is stationary, and denote Y_n = X_n²; we write ρ_X(k) and ρ_Y(k) for the ACF of {X_n} and {Y_n}, respectively. Moreover, as a very wide class of stationary processes can be generated by using iid noise (independent identically distributed random variables) as the forcing terms in a set of equations, we will use the symbol {ξ_n} to denote a series of zero-mean iid random variables with finite fourth moment. We adopt the notation σ² = Eξ_n², γ = Eξ_n⁴.

3. The ACF for time series models

In this section we consider a few time series models. For each of them we provide the formula for the ACF of the time series together with the ACF of its squares.

3.1. Independent identically distributed variables

Let us first consider a time series of zero-mean independent identically distributed random variables with finite fourth moment, i.e. X_n = ξ_n. It is clear that this series is stationary and that the autocorrelation functions are given by:

ρ_X(k) = I_{k=0},  ρ_Y(k) = I_{k=0}.

This means that not only are the autocorrelations of {X_n} zero (except at k = 0), but the same holds for the squared process {Y_n}. We can now look at the top panels of Fig. 1 again, because these graphs were plotted for iid standard Gaussian noise.

3.2. MA(1) model

The sequence {X_n} is said to be an MA(1) (Moving Average) process if for every n:

X_n = ξ_n + aξ_{n−1},  (3.1)
where the innovations ξ_n are defined as in Section 2. It is clear that this process is stationary, and one can show that:

ρ_X(k) = ( (1 + a²) I_{k=0} + a I_{|k|=1} ) / (1 + a²),

ρ_Y(k) = 1 for k = 0,
ρ_Y(k) = a²(γ − σ⁴) / ( γ(1 + a⁴) + σ⁴(4a² − 1 − a⁴) ) for |k| = 1,
ρ_Y(k) = 0 for |k| > 1.
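A quick numerical sanity check of the formulas above (a sketch; parameter values are illustrative). For Gaussian innovations with σ² = 1 we have γ = Eξ⁴ = 3, and the lag-1 autocorrelation of the squares then collapses to ρ_X(1)², consistent with the general fact that squaring a zero-mean jointly Gaussian pair squares its correlation:

```python
def ma1_lag1_acfs(a, sigma2=1.0, gamma=3.0):
    """Lag-1 ACF of an MA(1) series and of its squares, from the
    closed-form expressions above (gamma = E xi^4; the default
    gamma = 3*sigma2**2 corresponds to Gaussian innovations)."""
    rho_x1 = a / (1.0 + a * a)
    s4 = sigma2 * sigma2
    rho_y1 = a * a * (gamma - s4) / (
        gamma * (1.0 + a ** 4) + s4 * (4.0 * a * a - 1.0 - a ** 4)
    )
    return rho_x1, rho_y1

rx, ry = ma1_lag1_acfs(0.8)        # the a = 0.8 example considered here
# For Gaussian noise, rho_Y(1) equals rho_X(1) squared:
assert abs(ry - rx * rx) < 1e-12
```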

[Figure 2 near here: theoretical ACF (stars) and sample ACF (circles) for an MA(1) model with a = 0.8 (top panels) and an AR(1) model with b = 0.7 (bottom panels); series on the left, squares on the right.]

This means that for both the MA(1) model and its squares the autocorrelation function is equal to zero except at k = 0 and |k| = 1. Note also that if a > 0 then ρ_X(1) is greater than zero, while for a < 0, ρ_X(1) falls below zero. Let us consider an example of the MA(1) model, i.e. let {X_n} be given by (3.1) with a = 0.8 and standard Gaussian noise, ξ_n iid N(0,1). The theoretical and sample ACFs for {X_n} and {X_n²} are presented in the top panels of Fig. 2. It is easily seen that for k > 1 the theoretical ACF is equal to zero and the sample ACF lies within the confidence intervals, so it should be treated as equal to zero. It is worth noting that similar results hold for MA(q) models, but then the measures are non-zero only for |k| ≤ q. Therefore, if the sample autocorrelation functions of both the data and the squared data are zero except at the first few lags, one can try to describe the data with MA models.

3.3. AR(1) models

The sequence {X_n} given by

X_n − bX_{n−1} = ξ_n,  (3.2)

where the innovations ξ_n are defined as in Section 2, is called an AR(1) (Autoregressive) model. We assume that −1 < b < 1 in order to obtain a so-called causal model, and then X_n = Σ_{j=0}^{∞} b^j ξ_{n−j}. For this model we have:

ρ_X(k) = b^{|k|},

which means that for b > 0 this measure decays exponentially as k tends to infinity. For b < 0 the autocorrelation function oscillates, and it is convenient to investigate its behaviour by taking even and odd lags separately: the ACF tends exponentially to zero along each of these two subsequences. The autocorrelation function of the squared AR(1) model tends to zero even faster (though still exponentially), as it is given by:

ρ_Y(k) = b^{2|k|}.
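The exponential decay b^|k| versus the faster b^{2|k|} can be checked by simulation, matching the b = 0.7 Gaussian example considered here (a sketch; the sample size, seed and burn-in below are arbitrary choices):

```python
import random

random.seed(1)
b, n = 0.7, 50_000
x, prev = [], 0.0
for _ in range(n + 500):          # 500 burn-in steps to forget X_0 = 0
    prev = b * prev + random.gauss(0.0, 1.0)
    x.append(prev)
x = x[500:]

def acf_at(z, k):
    """Sample ACF at a single lag k (standard estimator)."""
    m = sum(z) / len(z)
    den = sum((v - m) ** 2 for v in z)
    return sum((z[i] - m) * (z[i + k] - m) for i in range(len(z) - k)) / den

print(acf_at(x, 1))                   # near b = 0.7
print(acf_at([v * v for v in x], 1))  # near b**2 = 0.49
```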

Now let us consider an example of the AR(1) model, i.e. let {X_n} be given by (3.2) with b = 0.7 and ξ_n iid N(0,1). According to our results, the theoretical and sample ACFs for {X_n} and {X_n²} tend to zero quite quickly, see the bottom left and bottom right panels of Fig. 2. Moreover, as AR(1) can be treated as an MA(∞) model, one can use the results obtained for the AR(1) model to imagine the behaviour of the ACF for some infinite-order moving average processes. So when we deal with real-life data and the sample ACF tends to zero rapidly, this could suggest the use of an ARMA model, as for this type of model the autocorrelation function is bounded by functions tending exponentially to zero.

3.4. ARCH(2) models

Although ARMA processes are useful in modelling real-life data of various kinds, they belong to the class of linear processes and do not capture the structure of financial data. Therefore, the class of ARCH processes was introduced in [12] to allow the conditional variance of a time series process to depend on past information. ARCH models have become popular as they provide a good description of financial time series: they are nonlinear stochastic processes, their distributions are heavy-tailed with time-dependent conditional variance, and they model clustering of volatility. Let us consider the ARCH(2) process with Gaussian innovations defined by the equations:

X_n = √h_n ξ_n,  h_n = θ + ψ₁X²_{n−1} + ψ₂X²_{n−2},  (3.3)

where ξ_n iid N(0,1) and θ, ψ₂ > 0, ψ₁ ≥ 0. For such a model the conditional variance h_n of X_n depends on the past through the two most recent values of X_n². We assume that

ψ₂ + 3ψ₁² + 3ψ₂² + 3ψ₁²ψ₂ − 3ψ₂³ < 1,

as this condition guarantees the existence of the fourth moment of the process {X_n}, see [13]. As shown in [13], the following formulae hold:

ρ_X(k) = I_{k=0},

ρ_Y(k) = 1 for k = 0,
ρ_Y(1) = ψ₁ / (1 − ψ₂),
ρ_Y(2) = (ψ₁² + ψ₂ − ψ₂²) / (1 − ψ₂),
ρ_Y(k) = ψ₁ ρ_Y(k−1) + ψ₂ ρ_Y(k−2) for k > 2.
So the most important fact is that for the ARCH(2) model the ACF ρ_X is zero (only ρ_X(0) = 1), which means that the X_n's are uncorrelated (but not independent). Therefore, if we look only at the sample ACF of data generated from an ARCH(2) model, see e.g. the bottom left panel of Fig. 1, we can come to the wrong conclusion that we are dealing with iid noise, cf. the top left panel of Fig. 1. Fortunately, if we check the behaviour of the ACF of the squared time series, we notice a significant difference between the ARCH(2) process and iid variables, compare the right panels of Fig. 1. The same results hold for general ARCH(q) models, i.e. the X_n's are uncorrelated while the ACF of the squared time series is not equal to zero.
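This "uncorrelated but not independent" behaviour is easy to reproduce by simulation (a sketch; the parameters θ = 1, ψ₁ = 0.4, ψ₂ = 0.1 are illustrative choices satisfying the fourth-moment condition above, and the seed and sample size are arbitrary):

```python
import random

random.seed(2)
theta, psi1, psi2 = 1.0, 0.4, 0.1  # psi2 + 3psi1^2 + ... = 0.655 < 1
x1 = x2 = 0.0                      # X_{n-1} and X_{n-2}
xs = []
for _ in range(100_000):
    h = theta + psi1 * x1 * x1 + psi2 * x2 * x2  # conditional variance
    xn = (h ** 0.5) * random.gauss(0.0, 1.0)
    x2, x1 = x1, xn
    xs.append(xn)

def acf_at(z, k):
    """Sample ACF at a single lag k (standard estimator)."""
    m = sum(z) / len(z)
    den = sum((v - m) ** 2 for v in z)
    return sum((z[i] - m) * (z[i + k] - m) for i in range(len(z) - k)) / den

print(acf_at(xs, 1))                   # near 0: the series looks white
print(acf_at([v * v for v in xs], 1))  # near psi1/(1 - psi2) = 0.444...
```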

3.5. GARCH(1,1) models

In empirical applications of the ARCH model a relatively long lag in the conditional variance equation is often called for. In order to avoid problems with negative variance parameter estimates, a more general class of processes, the GARCH process, was introduced in [14], allowing for a much more flexible lag structure. These kinds of models seem to describe stock, stock index and currency returns very well, so they are now widely used to model financial time series (see [15] and references therein). The Gaussian GARCH(1,1) model is defined by the equations:

X_n = √h_n ξ_n,  h_n = θ + φh_{n−1} + ψX²_{n−1},  (3.4)

where ξ_n iid N(0,1) and θ, φ, ψ > 0. We assume that 3ψ² + 2ψφ + φ² < 1, as this condition guarantees the existence of the fourth moment of the process {X_n}, see [13]. For this model we have:

ρ_X(k) = I_{k=0},

ρ_Y(k) = 1 for k = 0,
ρ_Y(1) = ψ(1 − ψφ − φ²) / (1 − 2ψφ − φ²),
ρ_Y(k) = (ψ + φ)^{k−1} ρ_Y(1) for k > 1,

see [13]. As in the case of the ARCH model, for the GARCH(1,1) model the ACF ρ_X is zero for k ≠ 0, which means that the X_n's are uncorrelated (but not independent). The ACF of {X_n²} is not equal to zero and tends to zero exponentially. We can observe this behaviour in the top panels of Fig. 3, where an example of GARCH(1,1) is given: the ACF of the time series is zero while the same function for the squared series is greater than zero but tends to zero quite quickly. For general GARCH(p,q) models we obtain similar results, i.e. the X_n's are uncorrelated while the ACF of the squared time series is non-zero. Therefore, if we analyse data whose sample ACF is zero at all non-zero lags, we should look at the ACF of the squared data in order to obtain more information.

3.6. SARIMA models

Many real-life phenomena exhibit seasonal behaviour. The traditional time series method of dealing with them is the classical decomposition, in which the model of the observed series incorporates trend, seasonality and random noise.
However, in modelling empirical data it might not be reasonable to assume, as in the classical decomposition, that the seasonal component repeats itself precisely in the same way cycle after cycle. SARIMA (Seasonal Autoregressive Integrated Moving Average) models allow for randomness in the seasonal pattern from one cycle to the next, see [1]. A SARIMA(p,d,q)×(P,D,Q) model with period s is a process {X_n} that satisfies the equation

φ(B)Φ(B^s)(1 − B)^d (1 − B^s)^D X_n = θ(B)Θ(B^s)ξ_n,

where φ(z) = 1 − φ₁z − ... − φ_p z^p, Φ(z) = 1 − Φ₁z − ... − Φ_P z^P, θ(z) = 1 + θ₁z + ... + θ_q z^q, Θ(z) = 1 + Θ₁z + ... + Θ_Q z^Q, and B is the backward shift operator.
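The seasonal structure shows up directly in the ACF. A minimal sketch for a purely seasonal autoregression with period s = 3 (the special case analysed next), whose ACF is non-zero only at lags that are multiples of the period; the coefficient φ = 0.7 is an illustrative choice:

```python
phi, s = 0.7, 3  # illustrative seasonal AR coefficient and period

def seasonal_ar_acf(k, squared=False):
    """ACF of X_n = phi * X_{n-s} + xi_n at lag k >= 0: phi**(k/s) at
    lags divisible by s and zero otherwise; for the squared series the
    decay is phi**(2k/s) at those lags."""
    if k % s != 0:
        return 0.0
    p = 2 * (k // s) if squared else (k // s)
    return phi ** p

acf_x = [seasonal_ar_acf(k) for k in range(7)]
# spikes only at lags 0, 3, 6, with heights 1, phi, phi**2
```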

[Figure 3 near here: theoretical ACF (stars) and sample ACF (circles) for a GARCH(1,1) model (top panels) and a SARIMA model (bottom panels); series on the left, squares on the right.]

Let us consider the SARIMA(0,0,0)×(1,0,0) system with period s = 3, standard normal innovations, and −1 < φ < 1. In this case the equation for X_n takes the following form:

X_n − φX_{n−3} = ξ_n.  (3.5)

The solution of this equation is X_n = Σ_{j=0}^{∞} φ^j ξ_{n−3j}, which implies that the unique solution of equation (3.5) is stationary. Moreover, EX_n = 0 and

ρ_X(k) = φ^{|k|/3} when k mod 3 = 0, and 0 otherwise.

For the process Y_n = X_n² we determined the explicit formula for the autocorrelation function:

ρ_Y(k) = φ^{2|k|/3} when k mod 3 = 0, and 0 otherwise.

The considered model is called seasonal with period s = 3 and the ACF reflects this character: the autocorrelation functions of both X_n and Y_n oscillate with period 3. The behaviour of the theoretical and sample ACF for this model with φ = 0.7 can be observed in the bottom panels of Fig. 3.

4. Conclusions

In this paper we recall that, in the second-order case, the commonly applied method of gaining information about the dependence structure of a given set of observations, estimating the autocorrelation (or autocovariance) function, can provide useful guidance in the choice of a satisfactory model or family of models. However, the results reported above clearly suggest that considering the autocorrelation function of the squares might be helpful in distinguishing between different types of time series models and prove even a

better guide in selecting a set of possible model candidates when modelling real-life data. This method is worth applying as it is very simple: in order to get the sample ACF of the squares one does not need to prepare new computer programs, it is sufficient to apply any well-known routine for the ACF to the squared observations.

REFERENCES

[1] P.J. Brockwell, R.A. Davis, Introduction to Time Series and Forecasting, Springer-Verlag, New York 1996.
[2] G.E. Box, G.M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco 1970.
[3] J.D. Hamilton, Time Series Analysis, Princeton University Press, 1994.
[4] I. Megahed, A. Nossier, Journal of Physics E 2, 88 (1969).
[5] J. Li, F. Nekka, Physica A 376, 47 (2007).
[6] J. Nowicka-Zagrajek, R. Weron, Signal Process. 82(12), 1903 (2002).
[7] L. Palatella, J. Perelló, M. Montero and J. Masoliver, The European Physical Journal B 38(4), 671 (2004).
[8] D. Challet, T. Galla, Quantitative Finance 5(6), 569 (2005).
[9] D.W. Meek, J.H. Prueger, T.J. Sauer, W.P. Kustas, L.E. Hipps and J.L. Hatfield, Agricultural and Forest Meteorology 96, 9 (1999).
[10] T. Hall, S. Jewson, arXiv:physics/0509024v1, 2005.
[11] G. Samorodnitsky, M.S. Taqqu, Stable Non-Gaussian Random Processes, Chapman & Hall, New York 1994.
[12] R.F. Engle, Econometrica 50, 987 (1982).
[13] T. Bollerslev, J. Time Ser. Anal. 9(2), 121 (1988).
[14] T. Bollerslev, J. Econometrics 31, 307 (1986).
[15] T. Bollerslev, R.Y. Chou and K.F. Kroner, J. Econometrics 52, 5 (1992).

List of Figures

1. The theoretical ACF (stars) and sample ACF (circles) for X_n (left panels) and Y_n = X_n² (right panels), where X_n is iid standard Gaussian noise (top panels) and X_n is an ARCH(2) model with ψ₁ = 0.4, ψ₂ = 0.1 and standard Gaussian noise (bottom panels).
2. The theoretical ACF (stars) and sample ACF (circles) for X_n (left panels) and Y_n = X_n² (right panels), where X_n is an MA(1) model with a = 0.8 and standard Gaussian noise (top panels) and X_n is an AR(1) model with b = 0.7 and standard Gaussian noise (bottom panels).

3. The theoretical ACF (stars) and sample ACF (circles) for X_n (left panels) and Y_n = X_n² (right panels), where X_n is a GARCH(1,1) model with φ = 0.6, ψ = 0.2 and standard Gaussian noise (top panels) and X_n is a SARIMA(0,0,0)×(1,0,0) model with period s = 3, φ = 0.7 and standard Gaussian noise (bottom panels).