ST4064 Time Series Analysis. Lecture notes


Outline

I   Introduction to time series analysis
II  Stationarity and ARMA modelling
    1. Stationarity
       a. Definitions
       b. Strict stationarity
       c. Weak stationarity
    2. Autocovariance, autocorrelation and partial autocorrelation
       a. Autocovariance
       b. Autocorrelation
       c. Partial autocorrelation
       d. Estimation of the ACF and PACF
    3. ARMA modelling
       a. AR models
       b. MA models
       c. ARMA models
    4. Backward Shift Operator and Difference Operator
    5. AR(p) models, stationarity and the Yule-Walker equations
       a. The AR(1) model
       b. The AR(p) model and stationarity
       c. Yule-Walker equations
    6. MA(q) models and invertibility
       a. The MA(1) model
       b. The MA(q) model and invertibility
    7. ARMA(p,q) models
    8. ARIMA(p,d,q) models
       a. Non-ARMA processes
       b. The I(d) notation
    9. The Markov property
III Non-stationarity: trends and techniques
    1. Typical trends
    2. Least squares trend removal
    3. Differencing
       a. Linear trend removal
       b. Selection of d
    4. Seasonal differencing

    5. Method of moving averages
    6. Seasonal means
    7. Filtering, smoothing
    8. Transformations
IV  Box-Jenkins methodology
    1. Overview
    2. Model selection
       a. Identification of white noise
       b. Identification of MA(q)
       c. Identification of AR(p)
    3. Model fitting
       a. Fitting an ARMA(p,q) model
       b. Parameter estimation: LS and ML
       c. Parameter estimation: method of moments
       d. Diagnostic checking
V   Forecasting
    1. The Box-Jenkins approach
    2. Forecasting ARIMA processes
    3. Exponential smoothing and Holt-Winters
    4. Filtering
VI  Multivariate time series analysis
    1. Principal component analysis and dimension reduction
    2. Vector AR processes
    3. Cointegration
    4. Other common models
       a. Bilinear models
       b. Threshold AR models
       c. Random coefficient AR models
    5. ARCH and GARCH
       a. ARCH
       b. GARCH

I. Introduction to time series analysis

A time series is a stochastic process in discrete time with a continuous state space.

Notation: {X_1, X_2, ..., X_n} denotes a time series process, whereas {x_1, x_2, ..., x_n} denotes a univariate time series, i.e. a sequence of realisations of the time series process.

[Diagram: state space S = (−∞, ∞); the variables X_1, ..., X_n are observed as realisations x_1, ..., x_n at times 1, ..., n, with X_{n+1} still unknown.]

I.1 Purposes of time series analysis

- Describe the observed time series data: mean, variance, correlation structure, ... (e.g. the correlation coefficient between sales 1 month apart, 2 months apart, etc.), using the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF).
- Construct a model which fits the data: from the class of ARMA models, select a model which best fits the data based on the ACF and PACF of the observed time series, and apply the Box-Jenkins methodology:
  o identify a tentative model
  o estimate the model parameters
  o diagnostic checks: does the model fit?
- Forecast future values of the time series process: easy, once a model has been fitted to past data.

All ARMA models are stationary. If an observed time series is non-stationary (e.g. upward trend), it must be converted to a stationary time series (e.g. by differencing).

I.2 Other forms of analysis

Another important approach to the analysis of time series relies on the spectral density function, which is derived from the autocorrelation function of a time series model. This approach is not covered in this course.

II. Stationarity and ARMA modelling

II.1 Stationarity

a. Definition

A stochastic process is (strictly) stationary if its statistical properties remain unchanged over time:

Joint distribution of X_{t1}, X_{t2}, ..., X_{tn} = joint distribution of X_{k+t1}, X_{k+t2}, ..., X_{k+tn}, for all k and for all n.

Example: joint distribution of X_5, X_6, ..., X_10 = joint distribution of X_20, X_21, ..., X_25:
- for any "chunk" of variables
- for any shift of start.

Implications of (strict) stationarity. Take n = 1: the distribution of X_t equals the distribution of X_{t+k} for all integers k. In particular:
- X discrete: P(X_t = i) = P(X_{t+k} = i) for any k
- X continuous: f(x_t) = f(x_{t+k}) for any k
- E(X_t) = E(X_{t+k}) and Var(X_t) = Var(X_{t+k}) for any k.

A stationary process therefore has constant mean and variance, and the variables X_t in a stationary process must be identically distributed (but not necessarily independent).

Take n = 2: joint distribution of (X_s, X_t) = joint distribution of (X_{s+k}, X_{t+k}):
- for all lags (t − s)
- for all integers k
- it depends only on the lag (t − s).

In particular, Cov(X_s, X_t) = Cov(X_{s+k}, X_{t+k}), where Cov(X_s, X_t) = E[(X_s − E(X_s))(X_t − E(X_t))]. Thus Cov(X_s, X_t) depends only on the lag (t − s) and not on the time s.

b. Strict stationarity

- Very stringent requirement; hard to prove that a process is stationary.
- To show that a process is not stationary, show that one condition doesn't hold.

Examples: the simple random walk {X_t} is not identically distributed, hence NOT stationary; a white noise process {Z_t} i.i.d. is trivially stationary.

c. Weak stationarity

This requires only that E(X_t) is constant AND Cov(X_s, X_t) depends only on (t − s). Since Var(X_t) = Cov(X_t, X_t), this implies that Var(X_t) is constant. Weak stationarity does not imply strict stationarity. For weak stationarity, Cov(X_t, X_{t+k}) is constant with respect to t for all lags k. Here (and often), "stationary" is shorthand for "weakly stationary".

Question: show that if the joint distribution of the X_t's is multivariate normal, then weak stationarity implies strict stationarity.
Solution: if X ~ N(µ, Σ) then the distribution of X is completely determined by µ and Σ (property of the multivariate normal distribution). If these do not depend on t, neither does the distribution of X.

Example: X_t = sin(ωt + U), U ~ U[0, 2π]. Then E(X_t) = 0, and Cov(X_t, X_{t+k}) = cos(ωk) E(sin²(U)), which does not depend on t, so X is weakly stationary.

Question: if we know X_0, then we can work out U, since X_0 = sin(U). We then know all the values of X_t = sin(ωt + U): X_t is completely determined by X_0.

Definition: X is purely indeterministic if the values of X_1, ..., X_n are progressively less useful at predicting X_N as N → ∞. From here on, "stationary time series" means a weakly stationary, purely indeterministic process.
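
As a quick numerical check of the sin(ωt + U) example, the following sketch (ours, not part of the original notes; it assumes numpy and an arbitrary choice of ω, t and k) compares a Monte Carlo estimate of Cov(X_t, X_{t+k}) with cos(ωk) E(sin²(U)) = cos(ωk)/2:

    import numpy as np

    rng = np.random.default_rng(6)
    omega, t, k = 0.8, 5, 3                      # arbitrary frequency, date and lag
    U = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)
    xt = np.sin(omega * t + U)                   # X_t = sin(omega*t + U), mean 0
    xtk = np.sin(omega * (t + k) + U)            # X_{t+k}
    print(np.mean(xt * xtk))                     # Monte Carlo Cov(X_t, X_{t+k})
    print(0.5 * np.cos(omega * k))               # cos(omega*k) * E(sin^2 U)

Changing t leaves the estimate unchanged, illustrating weak stationarity.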

II.2 Autocovariance, autocorrelation and partial autocorrelation

a. Autocovariance function

For a stationary process, E(X_t) = µ_t = µ for any t. We define γ_k = Cov(X_t, X_{t+k}) = E(X_t X_{t+k}) − E(X_t)E(X_{t+k}), the autocovariance at lag k. This function does not depend on t. The autocovariance function of X is {γ_0, γ_1, γ_2, ...} = {γ_k : k ≥ 0}. Note: γ_0 = Var(X_t). The properties of covariance are needed when calculating autocovariances for specified models.

b. Autocorrelation function (ACF)

Recall that corr(X, Y) = Cov(X, Y)/(σ_X σ_Y). For a stationary process, we define ρ_k = corr(X_t, X_{t+k}) = γ_k/γ_0, the autocorrelation at lag k. (This is the usual correlation coefficient, since Var(X_t) = Var(X_{t+k}) = γ_0.) The ACF of X is {ρ_0, ρ_1, ρ_2, ...} = {ρ_k : k ≥ 0}. Note: ρ_0 = 1. For a purely indeterministic process, we expect ρ_k → 0 as k → ∞ (i.e. values far apart will not be correlated).

Recall (ST3053): a sequence of i.i.d. random variables {Z_t} is called a white noise process and is trivially stationary.

Example: {e_t} is a zero-mean white noise process if E(e_t) = 0 for any t, and Cov(e_t, e_{t+k}) = γ_k = σ² if k = 0, and 0 otherwise. Note: the variables e_t have zero mean, variance σ² and are uncorrelated. A sequence of i.i.d. variables with zero mean will be a white noise process, according to this definition. In particular, Z_t independent with Z_t ~ N(0, σ²) is a white noise process.

Result: γ_k = γ_{−k} and ρ_k = ρ_{−k}.

Correlogram = plot of the ACF {ρ_k : k ≥ 0} as a function of the lag k. It is widely used as it tells a lot about the time series.

c. Partial autocorrelation function (PACF)

Let r(x, y | z) = corr(x, y | z) denote the partial correlation coefficient between x and y, adjusted for z (or with z held constant). Denote:

ψ_2 = corr(X_t, X_{t+2} | X_{t+1})
ψ_3 = corr(X_t, X_{t+3} | X_{t+1}, X_{t+2})
ψ_k = corr(X_t, X_{t+k} | X_{t+1}, ..., X_{t+k−1}) = partial autocorrelation coefficient at lag k.

The partial autocorrelation function (PACF) is {ψ_1, ψ_2, ...} = {ψ_k : k ≥ 1}. The ψ_k's are related to the ρ_k's: ψ_1 = corr(X_t, X_{t+1}) = ρ_1. Recall that

r(x, y | z) = [r(x, y) − r(x, z) r(y, z)] / [√(1 − r²(x, z)) √(1 − r²(y, z))].

Applying this here, with x = X_t, y = X_{t+2}, z = X_{t+1}, so that ψ_2 = corr(X_t, X_{t+2} | X_{t+1}) = r(x, y | z), along with ρ_1 = r(x, z) = r(y, z) and ρ_2 = r(x, y), yields:

ψ_2 = (ρ_2 − ρ_1²) / (1 − ρ_1²).

d. Estimation of the ACF and PACF

We assume that the sequence of observations {x_1, x_2, ..., x_n} comes from a stationary time series process. The following functions are central to the analysis of time series:

{γ_k}  autocovariance function
f(ω)   spectral density function
{ρ_k}  autocorrelation function (ACF)
{ψ_k}  partial autocorrelation function (PACF)

To find a model to fit the sequence {x_1, x_2, ..., x_n}, we must be able to estimate the ACF of the process of which the data is a realisation. Since the model underlying the data is assumed to be stationary, its mean can be estimated using the sample mean

µ̂ = (1/n) Σ_{t=1}^n x_t.

The autocovariance function γ_k can be estimated using the sample autocovariance function:

γ̂_k = (1/n) Σ_{t=k+1}^n (x_t − µ̂)(x_{t−k} − µ̂),

from which are derived estimates r_k of the autocorrelations ρ_k:

r_k = γ̂_k / γ̂_0.

The collection {r_k : k ∈ Z} is called the sample autocorrelation function (SACF). The plot of r_k against k is called a correlogram.

Recall that the partial autocorrelation coefficients ψ_k are calculated as ψ_1 = ρ_1 and ψ_2 = (ρ_2 − ρ_1²)/(1 − ρ_1²); in general, ψ_k is given as a ratio of determinants involving ρ_1, ρ_2, ..., ρ_k. The sample partial autocorrelation coefficients are given by these formulae, but with the ρ_k replaced by their estimates r_k:

ψ̂_1 = r_1, ψ̂_2 = (r_2 − r_1²)/(1 − r_1²), etc.

The collection {ψ̂_k} is called the sample partial autocorrelation function (SPACF). The plot of {ψ̂_k} against k is called the partial correlogram.

[Figure: plots of the SACF r_k and of the SPACF ψ̂_k against the lag k.]

These are the main tools in identifying a model for a stationary time series.
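
The estimators above translate directly into code. The following sketch (ours; it assumes numpy and uses simulated white noise as data) computes µ̂, the sample autocovariances with the 1/n divisor used in these notes, the SACF r_k, and the lag-2 SPACF value ψ̂_2:

    import numpy as np

    def sample_acf(x, max_lag):
        # gamma_hat_k = (1/n) * sum_{t=k+1}^n (x_t - mu_hat)(x_{t-k} - mu_hat)
        x = np.asarray(x, dtype=float)
        n = len(x)
        mu_hat = x.mean()
        gamma = np.array([np.sum((x[k:] - mu_hat) * (x[:n - k] - mu_hat)) / n
                          for k in range(max_lag + 1)])
        return gamma, gamma / gamma[0]           # (autocovariances, SACF r_k)

    rng = np.random.default_rng(0)
    x = rng.normal(size=500)                     # white noise: r_k should be near 0
    gamma_hat, r = sample_acf(x, max_lag=5)
    psi2_hat = (r[2] - r[1] ** 2) / (1 - r[1] ** 2)   # SPACF at lag 2
    print(r, psi2_hat)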

II.3 ARMA modelling

Autoregressive moving average (ARMA) models constitute the main class of linear models for time series. More specifically:
- Autoregressive (AR)
- Moving Average (MA)
- Autoregressive Moving Average (ARMA)
- Autoregressive Integrated Moving Average (ARIMA)

The last type is non-stationary; the others are stationary.

a. AR models

Recall: a Markov chain is a process such that the conditional distribution of X_{n+1}, given X_n, X_{n−1}, ..., X_0, depends only on X_n, i.e. the future depends on the present, but not on the past. The simplest type of autoregressive model, AR(1), has this property:

X_t = αX_{t−1} + e_t, where e_t is zero-mean white noise.

For AR(1), we prove that ψ_2 = corr(X_t, X_{t−2} | X_{t−1}) = 0. Similarly, ψ_k = 0 for k > 1.

A more general form of an AR(1) model is X_t = µ + α(X_{t−1} − µ) + e_t, where µ = E(X_t) is the process mean.

Autoregressive process of order p, AR(p):

X_t = µ + α_1(X_{t−1} − µ) + α_2(X_{t−2} − µ) + ... + α_p(X_{t−p} − µ) + e_t.

b. MA models

A realisation of a white noise process is very jagged, since successive observations are realisations of independent variables. Most time series observed in practice have a smoother time series plot than a realisation of a white noise process. In that respect, taking a moving average is a standard way of smoothing an observed time series:

Observed data: x_1, x_2, x_3, x_4, ...

Moving average: ((x_1 + x_2 + x_3)/3, (x_2 + x_3 + x_4)/3, ...)

[Figure: plot of the data and of its moving average.]

A moving average process is "smoothed white noise". The simplest type of moving average (MA) process is

X_t = µ + e_t + βe_{t−1}, where e_t is zero-mean white noise.

The e_t's are uncorrelated, but the X_t's are not: X_{t−1} and X_t share the term e_{t−1}. For MA(1) we prove that ρ_2 = corr(X_t, X_{t−2}) = 0; similarly, ρ_k = 0 for k > 1.

Moving average process of order q, MA(q):

X_t = µ + e_t + β_1 e_{t−1} + ... + β_q e_{t−q}.

c. ARMA models

ARMA processes "combine" AR and MA parts:

X_t = µ + α_1(X_{t−1} − µ) + ... + α_p(X_{t−p} − µ) + e_t + β_1 e_{t−1} + ... + β_q e_{t−q}.

Note: ARMA(p,0) = AR(p) and ARMA(0,q) = MA(q).

II.4 Backward Shift Operator and Difference Operator

The following operators will be useful.

Backward shift operator: BX_t = X_{t−1}, Bµ = µ; B²X_t = B(BX_t) = BX_{t−1} = X_{t−2}.

Difference operator: ∇ = 1 − B, hence ∇X_t = X_t − X_{t−1}, and

∇²X_t = ∇(X_t − X_{t−1}) = (X_t − X_{t−1}) − (X_{t−1} − X_{t−2}) = (1 − B)²X_t = (1 − 2B + B²)X_t = X_t − 2X_{t−1} + X_{t−2}.
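
As a small check of the difference-operator algebra (a sketch of ours, assuming numpy is available), ∇²x_t computed by np.diff agrees with the expansion x_t − 2x_{t−1} + x_{t−2}:

    import numpy as np

    x = np.array([3.0, 5.0, 4.0, 8.0, 7.0])      # made-up data
    d2 = np.diff(x, n=2)                         # second difference (1 - B)^2 x_t
    manual = x[2:] - 2 * x[1:-1] + x[:-2]        # x_t - 2 x_{t-1} + x_{t-2}
    print(np.allclose(d2, manual))               # True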

II.5 AR(p) models, stationarity and the Yule-Walker equations

a. The AR(1) model

Recall X_t = µ + α(X_{t−1} − µ) + e_t. Substituting in for X_{t−1}, then for X_{t−2}, and so on:

X_t = µ + α[α(X_{t−2} − µ) + e_{t−1}] + e_t = µ + α²(X_{t−2} − µ) + e_t + αe_{t−1}
X_t = µ + α^t(X_0 − µ) + e_t + αe_{t−1} + ... + α^{t−1}e_1 = µ + α^t(X_0 − µ) + Σ_{j=0}^{t−1} α^j e_{t−j}

Note: X_0 is a random variable. Since E(e_t) = 0 for any t,

µ_t = E(X_t) = µ + α^t(µ_0 − µ).

Since the e_t's are uncorrelated with each other and with X_0,

Var(X_t) = α^{2t} Var(X_0) + σ² Σ_{j=0}^{t−1} α^{2j}.

Question: when will the AR(1) process be stationary?
Answer: this will require constant mean and variance. If µ_0 = µ then µ_t = µ + α^t(µ_0 − µ) = µ. If Var(X_0) = σ²/(1 − α²), then

Var(X_t) = α^{2t} σ²/(1 − α²) + σ²(1 − α^{2t})/(1 − α²) = σ²/(1 − α²).

Neither µ_t nor Var(X_t) then depend on t. We also require |α| < 1 for the AR(1) process to be stationary, in which case both the term α^t(µ_0 − µ) in µ_t and the term α^{2t}Var(X_0) in Var(X_t) decay away to zero for large t: X_t is "almost stationary" for large t. Equivalently, if we assume that the process has already been running for a very long time, it will be stationary: the steady state is reached before the observed time series starts. Any AR(1) process with infinite history and |α| < 1 will be stationary.

An AR(1) process can be represented as

X_t = µ + Σ_{j=0}^∞ α^j e_{t−j},

and this converges only if |α| < 1. Indeed, the AR(1) model X_t = µ + α(X_{t−1} − µ) + e_t can be written as (1 − αB)(X_t − µ) = e_t. If |α| < 1, then (1 − αB) is invertible and

X_t − µ = (1 − αB)^{−1} e_t = (1 + αB + α²B² + ...) e_t = e_t + αe_{t−1} + α²e_{t−2} + ...

From this representation, µ_t = E(X_t) = µ and Var(X_t) = σ² Σ_{j=0}^∞ α^{2j} = σ²/(1 − α²) if |α| < 1.

So, if |α| < 1, the mean and variance are constant, as required for stationarity. We must also calculate the autocovariance γ_k = Cov(X_t, X_{t+k}) and show that it depends only on the lag k. We need the properties of covariance: Cov(X + Y, W) = Cov(X, W) + Cov(Y, W), and Cov(X, e) = 0 when e is uncorrelated with X. Since X_{t−1} involves only e_{t−1}, e_{t−2}, ..., the variables e_t and X_{t−1} are uncorrelated, hence

Cov(e_t, X_{t−1}) = 0, Cov(e_t, X_{t−k}) = 0 for k ≥ 1, and Cov(e_t, X_t) = σ².

γ_1 = Cov(X_t, X_{t−1}) = Cov(µ + α(X_{t−1} − µ) + e_t, X_{t−1}) = α Cov(X_{t−1}, X_{t−1}) + Cov(e_t, X_{t−1}) = αγ_0 + 0
γ_2 = Cov(X_t, X_{t−2}) = α Cov(X_{t−1}, X_{t−2}) + Cov(e_t, X_{t−2}) = αγ_1 + 0 = α²γ_0

In general, γ_k = Cov(X_t, X_{t−k}) = α Cov(X_{t−1}, X_{t−k}) + Cov(e_t, X_{t−k}) = αγ_{k−1} + 0. Hence

γ_k = α^k γ_0 = α^k σ²/(1 − α²) for k ≥ 0, and ρ_k = γ_k/γ_0 = α^k for k ≥ 0:

the ACF decreases geometrically with k.
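
The geometric decay ρ_k = α^k is easy to verify by simulation. A minimal sketch (ours, assuming numpy), simulating a long AR(1) path with α = 0.7 and comparing the sample ACF with α^k:

    import numpy as np

    rng = np.random.default_rng(1)
    alpha, n = 0.7, 100_000
    e = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):                        # X_t = alpha * X_{t-1} + e_t
        x[t] = alpha * x[t - 1] + e[t]

    xc = x - x.mean()
    gamma0 = np.mean(xc * xc)
    for k in range(1, 5):
        r_k = np.mean(xc[k:] * xc[:-k]) / gamma0
        print(k, round(r_k, 3), round(alpha ** k, 3))   # sample ACF vs alpha^k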

Recall that the partial autocorrelations ψ_1 and ψ_2 satisfy ψ_1 = ρ_1 and ψ_2 = (ρ_2 − ρ_1²)/(1 − ρ_1²). Here ψ_1 = ρ_1 = α and ψ_2 = (α² − α²)/(1 − α²) = 0. In fact, ψ_k = 0 for k > 1. In summary, for the AR(1) model:
- the ACF tails off to zero
- the PACF cuts off after lag 1.

Example: consumer price index Q_t. r_t = ln(Q_t/Q_{t−1}) models the force of inflation. Assume r_t is an AR(1) process: r_t = µ + α(r_{t−1} − µ) + e_t. Note: here µ is the long-run mean. Ignoring e_t, r_t − µ = α(r_{t−1} − µ). If |α| < 1, then r_t − µ → 0 and so r_t → µ as t → ∞. In this case r_t is said to be mean-reverting.

b. The AR(p) model and stationarity

Recall that the AR(p) model can be written either in its generic form

X_t = µ + α_1(X_{t−1} − µ) + α_2(X_{t−2} − µ) + ... + α_p(X_{t−p} − µ) + e_t

or using the B operator as

(1 − α_1B − α_2B² − ... − α_pB^p)(X_t − µ) = e_t.

Result: AR(p) is stationary IFF the roots of the characteristic equation 1 − α_1z − α_2z² − ... − α_pz^p = 0 are all greater than 1 in absolute value.

1 − α_1z − α_2z² − ... − α_pz^p ≡ characteristic polynomial.

Explanation for this result: write the AR(p) process in the form

(1 − B/z_1)(1 − B/z_2)...(1 − B/z_p)(X_t − µ) = e_t,

where z_1, ..., z_p are the roots of the characteristic polynomial:

1 − α_1z − ... − α_pz^p = (1 − z/z_1)(1 − z/z_2)...(1 − z/z_p).

In the AR(1) case, 1 − αz = 1 − z/z_1, where z_1 = 1/α, and we can invert the term (1 − B/z_1) in (1 − B/z_1)(X_t − µ) = e_t IFF |z_1| > 1. In the AR(p) case, we need to be able to invert all of the factors (1 − B/z_i). This will be the case IFF |z_i| > 1 for i = 1, 2, ..., p.

Example: AR(2): X_t = 5 − 2(X_{t−1} − 5) + 3(X_{t−2} − 5) + e_t, or (1 + 2B − 3B²)(X_t − 5) = e_t; 1 + 2z − 3z² = 0 is the characteristic equation here.

Question: when is an AR(1) process stationary?
Answer: we have X_t = µ + α(X_{t−1} − µ) + e_t, i.e. (1 − αB)(X_t − µ) = e_t, so 1 − αz = 0 is the characteristic equation, with solution z = 1/α. So |α| < 1 is equivalent to |z| > 1, as required.

Question: consider the AR(2) process X_n = X_{n−1} − ½X_{n−2} + e_n. Is it stationary?

Answer: use the B-operator: (1 − B + ½B²)X_n = e_n. So the characteristic equation is 1 − z + ½z² = 0, with roots 1 ± i, and |1 ± i| = √2 > 1. Since both roots satisfy |z_i| > 1, the process is stationary.

In the AR(1) model, we had γ_1 = αγ_0 and γ_0 = αγ_1 + σ². These are a particular case of the Yule-Walker equations for AR(p):

Cov(X_t, X_{t−k}) = Cov(µ + α_1(X_{t−1} − µ) + ... + α_p(X_{t−p} − µ) + e_t, X_{t−k})
                 = α_1 Cov(X_{t−1}, X_{t−k}) + ... + α_p Cov(X_{t−p}, X_{t−k}) + (σ² if k = 0, 0 otherwise).

c. Yule-Walker equations

The Yule-Walker equations are defined by the following relationship, for 0 ≤ k ≤ p:

γ_k = α_1γ_{k−1} + α_2γ_{k−2} + ... + α_pγ_{k−p} + (σ² if k = 0, 0 otherwise).

Considering the AR(1) (i.e. p = 1): for k = 1 we get γ_1 = αγ_0, and for k = 0 we get γ_0 = αγ_1 + σ².

Example (p = 3):
γ_3 = α_1γ_2 + α_2γ_1 + α_3γ_0
γ_2 = α_1γ_1 + α_2γ_0 + α_3γ_1
γ_1 = α_1γ_0 + α_2γ_1 + α_3γ_2
γ_0 = α_1γ_1 + α_2γ_2 + α_3γ_3 + σ²

Example: consider the AR(3) model X_t = 0.6X_{t−1} + 0.4X_{t−2} − 0.1X_{t−3} + e_t. Yule-Walker equations:

γ_0 = 0.6γ_1 + 0.4γ_2 − 0.1γ_3 + σ²   (0)
γ_1 = 0.6γ_0 + 0.4γ_1 − 0.1γ_2        (1)
γ_2 = 0.6γ_1 + 0.4γ_0 − 0.1γ_1        (2)
γ_3 = 0.6γ_2 + 0.4γ_1 − 0.1γ_0        (3)

From (1), γ_1 = γ_0 − γ_2/6. From (2), γ_2 = 0.4γ_0 + 0.5γ_1, hence γ_1 = (56/65)γ_0 and γ_2 = (54/65)γ_0. From (3), γ_3 = (483/650)γ_0. From (0), σ² = 0.2251γ_0. Hence γ_0 = 4.443σ², γ_1 = 3.828σ², γ_2 = 3.690σ², γ_3 = 3.301σ², and so, since ρ_k = γ_k/γ_0: ρ_0 = 1, ρ_1 = 0.862, ρ_2 = 0.831, ρ_3 = 0.743.
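
The Yule-Walker system of this example is linear in (γ_0, γ_1, γ_2, γ_3), so it can also be solved numerically. A sketch (ours, assuming numpy), taking σ² = 1:

    import numpy as np

    # Equations (0)-(3) for X_t = 0.6 X_{t-1} + 0.4 X_{t-2} - 0.1 X_{t-3} + e_t,
    # rearranged with unknowns (gamma_0, gamma_1, gamma_2, gamma_3), sigma^2 = 1:
    A = np.array([
        [ 1.0, -0.6, -0.4,  0.1],   # (0) g_0 - 0.6 g_1 - 0.4 g_2 + 0.1 g_3 = sigma^2
        [-0.6,  0.6,  0.1,  0.0],   # (1) g_1 = 0.6 g_0 + 0.4 g_1 - 0.1 g_2
        [-0.4, -0.5,  1.0,  0.0],   # (2) g_2 = 0.4 g_0 + 0.5 g_1
        [ 0.1, -0.4, -0.6,  1.0],   # (3) g_3 = 0.6 g_2 + 0.4 g_1 - 0.1 g_0
    ])
    b = np.array([1.0, 0.0, 0.0, 0.0])
    g = np.linalg.solve(A, b)
    print(g)          # approx [4.443, 3.828, 3.690, 3.301]
    print(g / g[0])   # rho_k: approx [1, 0.862, 0.831, 0.743]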

It may be shown that for AR(p) models:
- the ACF tails off to zero
- the PACF cuts off after lag p, i.e. ψ_k = 0 for k > p.

II.6 MA(q) models and invertibility

a. The MA(1) model

The model is given by X_t = µ + e_t + βe_{t−1}, where µ = E(X_t), and

γ_0 = Var(e_t + βe_{t−1}) = (1 + β²)σ²
γ_1 = Cov(e_t + βe_{t−1}, e_{t−1} + βe_{t−2}) = βσ²
γ_k = 0 for k > 1.

Hence the ACF for MA(1) is: ρ_0 = 1, ρ_1 = β/(1 + β²), ρ_k = 0 for k > 1.

Since the mean E(X_t) and the covariances γ_k do not depend on t, the MA(1) process is (weakly) stationary for all values of the parameter β. However, we require MA models to be invertible, and this imposes conditions on the parameters.

Recall: if |α| < 1, then in the AR(1) model (1 − αB)(X_t − µ) = e_t the operator (1 − αB) is invertible and

X_t − µ = Σ_{j=0}^∞ α^j e_{t−j} = e_t + αe_{t−1} + α²e_{t−2} + ...,

i.e. an AR(1) process is MA(∞). Conversely, an MA(1) process can be written as X_t − µ = (1 + βB)e_t, or (1 + βB)^{−1}(X_t − µ) = e_t, i.e.

(X_t − µ) − β(X_{t−1} − µ) + β²(X_{t−2} − µ) − ... = e_t.

So an MA(1) process is represented as an AR(∞) one, but only if |β| < 1, in which case the MA(1) process is invertible.

Example: MA(1) with β = 0.5 or β = 2. For both values of β we have:

ρ_1 = β/(1 + β²) = 0.5/(1 + (0.5)²) = 2/(1 + 2²) = 0.4.

So both models have the same ACF. However, only the model with β = 0.5 is invertible.

Question: interpretation of invertibility. Consider the MA(1) model X_n = µ + e_n + βe_{n−1}. We have

e_n = X_n − µ − βe_{n−1} = X_n − µ − β(X_{n−1} − µ − βe_{n−2}) = ...
    = X_n − µ − β(X_{n−1} − µ) + β²(X_{n−2} − µ) − ... + (−β)^{n−1}(X_1 − µ) + (−β)^n e_0.

As n gets large, the dependence of e_n on e_0 will be small if |β| < 1.

Note: AR(1) is stationary IFF |α| < 1; MA(1) is invertible IFF |β| < 1.

For an MA(1) process, we have ρ_k = 0 for k > 1, so the ACF cuts off after lag 1. It may be shown that the PACF tails off to zero.

         AR(1)                   MA(1)
ACF      tails off to zero       cuts off after lag 1
PACF     cuts off after lag 1    tails off to zero

b. The MA(q) model and invertibility

An MA(q) process is modelled by X_t = µ + e_t + β_1e_{t−1} + ... + β_qe_{t−q}, where {e_t} is zero-mean white noise. For this model, γ_k = Cov(X_t, X_{t−k}) = 0 for k > q, and for k ≤ q:

γ_k = E[(e_t + β_1e_{t−1} + ... + β_qe_{t−q})(e_{t−k} + β_1e_{t−k−1} + ... + β_qe_{t−k−q})]
    = Σ_{i=0}^q Σ_{j=0}^q β_i β_j E(e_{t−i} e_{t−j−k})   [where β_0 = 1]

    = σ² Σ_{j=0}^{q−k} β_{j+k} β_j   [since j = i − k ≤ q − k],

since the only non-zero terms occur when the subscripts of e_{t−i} and e_{t−j−k} match, i.e. when i = j + k, for k ≤ q. In summary, γ_k = 0 for k > q:
- for MA(q), the ACF cuts off after lag q
- for AR(p), the PACF cuts off after lag p.

Question: ACF of the MA(2) process X_n = 1 + e_n − 5e_{n−1} + 6e_{n−2}, where E(e_n) = 0 and Var(e_n) = σ².

γ_0 = Cov(1 + e_n − 5e_{n−1} + 6e_{n−2}, 1 + e_n − 5e_{n−1} + 6e_{n−2}) = (1 + 25 + 36)σ² = 62σ²
γ_1 = Cov(1 + e_n − 5e_{n−1} + 6e_{n−2}, 1 + e_{n−1} − 5e_{n−2} + 6e_{n−3}) = [(−5)(1) + (6)(−5)]σ² = −35σ²
γ_2 = Cov(1 + e_n − 5e_{n−1} + 6e_{n−2}, 1 + e_{n−2} − 5e_{n−3} + 6e_{n−4}) = (6)(1)σ² = 6σ²
γ_k = 0, k > 2.

Recall that an AR(p) process is stationary IFF the roots z of the characteristic equation satisfy |z| > 1. For an MA(q) process, we have X_t − µ = (1 + β_1B + β_2B² + ... + β_qB^q)e_t. Consider the equation 1 + β_1z + β_2z² + ... + β_qz^q = 0. The MA(q) process is invertible IFF all roots z of this equation satisfy |z| > 1. In summary:
- if AR(p) is stationary, then AR(p) = MA(∞)
- if MA(q) is invertible, then MA(q) = AR(∞).

Question: assess the invertibility of the MA(2) process X_t = 1 + e_t − 5e_{t−1} + 6e_{t−2}. We have X_t = 1 + (1 − 5B + 6B²)e_t. The characteristic equation is 1 − 5z + 6z² = (1 − 2z)(1 − 3z) = 0, with roots z = 1/2 and z = 1/3: not invertible.
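
Root conditions like this are conveniently checked with a polynomial root finder. A sketch (ours, assuming numpy; np.roots expects the highest-degree coefficient first):

    import numpy as np

    # theta(z) = 1 - 5z + 6z^2 for X_t = 1 + e_t - 5 e_{t-1} + 6 e_{t-2}
    roots = np.roots([6.0, -5.0, 1.0])
    print(roots)                                 # [0.5, 0.333...]
    print(bool(np.all(np.abs(roots) > 1)))       # False: not invertible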

II.7 ARMA(p,q) models

Recall that the ARMA(p,q) model can be written either in its generic form

X_t = µ + α_1(X_{t−1} − µ) + ... + α_p(X_{t−p} − µ) + e_t + β_1e_{t−1} + ... + β_qe_{t−q}

or using the B operator:

(1 − α_1B − ... − α_pB^p)(X_t − µ) = (1 + β_1B + ... + β_qB^q)e_t,

i.e. φ(B)(X_t − µ) = θ(B)e_t, where φ(λ) = 1 − α_1λ − ... − α_pλ^p and θ(λ) = 1 + β_1λ + ... + β_qλ^q.

If φ(λ) and θ(λ) have factors in common, we simplify the defining relation. Consider the simple ARMA(1,1) process with β_1 = −α_1 = −α, written either

X_t = αX_{t−1} + e_t − αe_{t−1}, or (1 − αB)X_t = (1 − αB)e_t, with |α| < 1.

Dividing through by (1 − αB), we obtain X_t = e_t. Therefore the process is actually an ARMA(0,0), also called white noise. From now on, we assume that φ(λ) and θ(λ) have no common factors.

Properties of ARMA(p,q) are a mixture of those of AR(p) and those of MA(q):
- characteristic polynomial of ARMA(p,q) = 1 − α_1z − ... − α_pz^p (as for AR(p))
- ARMA(p,q) is stationary IFF all the roots z of 1 − α_1z − ... − α_pz^p = 0 satisfy |z| > 1
- ARMA(p,q) is invertible IFF all the roots z of 1 + β_1z + ... + β_qz^q = 0 satisfy |z| > 1.

Example: the ARMA(1,1) process X_t = αX_{t−1} + e_t + βe_{t−1} is stationary if |α| < 1 and invertible if |β| < 1.

Example: ACF of ARMA(1,1). For the model given by X_t = αX_{t−1} + e_t + βe_{t−1} we have

Cov(e_t, X_{t−1}) = 0
Cov(e_t, e_{t−1}) = 0
Cov(e_t, X_t) = α Cov(e_t, X_{t−1}) + Cov(e_t, e_t) + β Cov(e_t, e_{t−1}) = σ²
Cov(e_{t−1}, X_t) = α Cov(e_{t−1}, X_{t−1}) + Cov(e_{t−1}, e_t) + β Cov(e_{t−1}, e_{t−1}) = ασ² + 0 + βσ² = (α + β)σ²

γ_0 = Cov(X_t, X_t) = α Cov(X_t, X_{t−1}) + Cov(X_t, e_t) + β Cov(X_t, e_{t−1}) = αγ_1 + σ² + β(α + β)σ² = αγ_1 + (1 + αβ + β²)σ²
γ_1 = Cov(X_{t−1}, X_t) = α Cov(X_{t−1}, X_{t−1}) + Cov(X_{t−1}, e_t) + β Cov(X_{t−1}, e_{t−1}) = αγ_0 + βσ²
γ_k = Cov(X_{t−k}, X_t) = α Cov(X_{t−k}, X_{t−1}) + Cov(X_{t−k}, e_t) + β Cov(X_{t−k}, e_{t−1}) = αγ_{k−1}, for k > 1

(analogues of the Yule-Walker equations). Solving for γ_0 and γ_1:

γ_0 = [(1 + 2αβ + β²)/(1 − α²)]σ²
γ_1 = [(1 + αβ)(α + β)/(1 − α²)]σ²
γ_k = α^{k−1}γ_1, for k > 1.

Hence ρ_1 = (1 + αβ)(α + β)/(1 + 2αβ + β²) and ρ_k = α^{k−1}ρ_1 for k > 1 (compare with ρ_k = α^k, k ≥ 0, for AR(1)).

For a (stationary) ARMA(p,q):
- the ACF tails off to zero
- the PACF tails off to zero.

Question: is the ARMA(2,2) process 12X_t = 10X_{t−1} − 2X_{t−2} + 12e_t − 12e_{t−1} + 2e_{t−2} stationary?

In B-operator form, (12 − 10B + 2B²)X_t = (12 − 12B + 2B²)e_t. The roots of 12 − 10z + 2z² = 2(z − 2)(z − 3) = 0 are z = 2 and z = 3; |z| > 1 for both roots, so the process is stationary.

II.8 ARIMA(p,d,q) models

a. Non-ARMA processes

Given time series data x_1, ..., x_n, we want to find a model for this data: calculate sample statistics (sample mean, sample ACF, sample PACF) and compare with the known ACF/PACF of the class of ARMA models to select a suitable model. All the ARMA models considered are stationary, so they can only be used for stationary time series data. If the time series data is non-stationary:
- transform it to a stationary time series (e.g. by differencing)
- model this transformed series using an ARMA model
- take the inverse transform of this model as a model for the original non-stationary time series.

Example: random walk X_0 = 0, X_n = X_{n−1} + Z_n, where Z_n is a white noise process. X_n is non-stationary, but ∇X_n = X_n − X_{n−1} = Z_n is stationary.

Question: given x_0, x_1, ..., x_N, the first order differences are w_i = x_i − x_{i−1}, i = 1, ..., N. From the differences w_1, w_2, ..., w_N and x_0 we can calculate the original time series: w_1 = x_1 − x_0, so x_1 = x_0 + w_1; w_2 = x_2 − x_1, so x_2 = x_1 + w_2 = x_0 + w_1 + w_2, etc. The inverse process of differencing is integration, since we must sum the differences to obtain the original time series.

b. The I(d) notation ("integrated of order d")

X is said to be I(0) if X_t is stationary.

X is said to be I(1) if X_t is not stationary but Y_t = X_t − X_{t−1} is stationary. X is said to be I(2) if X_t is not stationary but ∇X_t is I(1). Thus X is I(d) if X must be differenced d times to make it stationary.

Example: if the first differences ∇x_n = x_n − x_{n−1} of x_1, x_2, ..., x_n are modelled by the (stationary) AR(1) model ∇X_n = 0.5∇X_{n−1} + e_n, then X_n − X_{n−1} = 0.5(X_{n−1} − X_{n−2}) + e_n, so X_n = 1.5X_{n−1} − 0.5X_{n−2} + e_n is the model for the original time series. This AR(2) model is non-stationary: written as (1 − 1.5B + 0.5B²)X_n = e_n, its characteristic equation is 1 − 1.5z + 0.5z² = 0, with roots z = 1 and z = 2. The model is non-stationary since |z| > 1 does not hold for BOTH roots.

X is ARIMA(p,1,q) if X is non-stationary but ∇X (the first difference of X) is a stationary ARMA(p,q) process. Recall that a process X is I(1) if X is non-stationary but ∇X_t = X_t − X_{t−1} is stationary. Note: if X is ARIMA(p,1,q) then X is I(1).

Example: random walk. X_t − X_{t−1} = e_t, where e_t is a white noise process. We have X_t = X_0 + Σ_{j=1}^t e_j. So E(X_t) = E(X_0) if E(e_t) = 0, but Var(X_t) = Var(X_0) + tσ². Hence X is non-stationary, but ∇X_t = e_t is a stationary white noise process.

Example: Z_t = closing share price on day t. Here the model is given by Z_t = Z_{t−1} exp(µ + e_t). Let Y_t = ln Z_t; then Y_t = µ + Y_{t−1} + e_t. This is a random walk with drift. Now consider the daily returns Y_t − Y_{t−1} = ln(Z_t/Z_{t−1}). Since Y_t − Y_{t−1} = µ + e_t and the e_t's are independent, Y_t − Y_{t−1} is independent of Y_1, ..., Y_{t−1}, i.e. ln(Z_t/Z_{t−1}) is independent of the past prices Z_0, Z_1, ..., Z_{t−1}.

Example: recall Q_t = consumer price index at time t, where r_t = ln(Q_t/Q_{t−1}) follows the AR(1) model r_t = µ + α(r_{t−1} − µ) + e_t. Then

ln(Q_t/Q_{t−1}) = µ + α(ln(Q_{t−1}/Q_{t−2}) − µ) + e_t
∇ln(Q_t) = µ + α(∇ln(Q_{t−1}) − µ) + e_t,

thus ∇ln(Q_t) is AR(1), and so ln(Q_t) is ARIMA(1,1,0).

If X needs to be differenced at least d times to reduce it to stationarity, and Y = ∇^d X is a stationary ARMA(p,q), then X is an ARIMA(p,d,q) process. An ARIMA(p,d,q) process is I(d).

Example: identify as ARIMA(p,d,q) the following model:

X_t = 0.6X_{t−1} + 0.3X_{t−2} + 0.1X_{t−3} + e_t − 0.25e_{t−1}
(1 − 0.6B − 0.3B² − 0.1B³)X_t = (1 − 0.25B)e_t

Check for a factor (1 − B) on the LHS:

(1 − B)(1 + 0.4B + 0.1B²)X_t = (1 − 0.25B)e_t.

The model is ARIMA(2,1,1). Characteristic equation of the ARMA part: 1 + 0.4z + 0.1z² = 0, with roots −2 ± i√6. Since |z| = √10 > 1 for both roots, ∇X is stationary, as required.

Alternative method: write the model in terms of ∇X_t = X_t − X_{t−1}, ∇X_{t−1}, etc.:

X_t − X_{t−1} = −0.4(X_{t−1} − X_{t−2}) − 0.1(X_{t−2} − X_{t−3}) + e_t − 0.25e_{t−1}
∇X_t = −0.4∇X_{t−1} − 0.1∇X_{t−2} + e_t − 0.25e_{t−1}

Hence ∇X_t is ARMA(2,1) (check for stationarity as above), and so X_t is ARIMA(2,1,1).

Note: if ∇^d X is ARMA(1,q), to check for stationarity we only need to see that |α_1| < 1.
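
Both the (1 − B) factorisation and the root condition in this example can be verified numerically. A sketch (ours, assuming numpy; polynomial coefficients are listed highest power first):

    import numpy as np

    phi = [-0.1, -0.3, -0.6, 1.0]                     # 1 - 0.6z - 0.3z^2 - 0.1z^3
    prod = np.polymul([-1.0, 1.0], [0.1, 0.4, 1.0])   # (1 - z)(1 + 0.4z + 0.1z^2)
    print(np.allclose(phi, prod))                     # True: (1 - B) factor confirmed

    roots = np.roots([0.1, 0.4, 1.0])                 # remaining factor 1 + 0.4z + 0.1z^2
    print(roots, np.abs(roots))                       # -2 ± i*sqrt(6), |z| = sqrt(10) > 1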

II.9 The Markov property

AR(1) model: X_t = µ + α(X_{t−1} − µ) + e_t. The conditional distribution of X_{n+1}, given X_n, X_{n−1}, ..., X_0, depends only on X_n: AR(1) has the Markov property.

AR(2) model: X_t = µ + α_1(X_{t−1} − µ) + α_2(X_{t−2} − µ) + e_t. The conditional distribution of X_{n+1}, given X_n, X_{n−1}, ..., X_0, depends on X_{n−1} as well as X_n: AR(2) does not have the Markov property.

Consider now X_{n+1} = µ + α_1X_n + α_2X_{n−1} + e_{n+1}, written in vector form:

( X_{n+1} )   ( µ )   ( α_1  α_2 ) ( X_n     )   ( e_{n+1} )
( X_n     ) = ( 0 ) + (  1    0  ) ( X_{n−1} ) + (    0    )

Define Y_n = (X_n, X_{n−1})^T; then Y_{n+1} = (µ, 0)^T + A Y_n + (e_{n+1}, 0)^T, with A the matrix above. Y is said to be a vector autoregressive process of order 1; notation: VAR(1). Y has the Markov property.

In general, AR(p) does not have the Markov property for p > 1, but Y_n = (X_n, X_{n−1}, ..., X_{n−p+1})^T does.

Recall: the random walk ARIMA(0,1,0) defined by X_t − X_{t−1} = e_t has independent increments and hence does have the Markov property. It may be shown that for p + d > 1, ARIMA(p,d,0) does not have the Markov property, but Y_t = (X_t, X_{t−1}, ..., X_{t−p−d+1})^T does.

Consider the MA(1) process X_t = µ + e_t + βe_{t−1}. It is clear that knowing X_n will never be enough to deduce the value of e_n, on which the distribution of X_{n+1} depends. Hence an MA(1) process does not have the Markov property. Now consider an MA(q) = AR(∞) process. It is known that AR(p) processes have the Markov property when considered as the p-dimensional vector process Y = (X_t, X_{t−1}, ..., X_{t−p+1})^T (p finite). It follows that an MA(q) process has no finite-dimensional Markov representation.

Question: associate a vector-valued Markov process with 2X_t = 5X_{t−1} − 4X_{t−2} + X_{t−3} + 2e_t.

We have (X_t − X_{t−1}) = (3/2)(X_{t−1} − X_{t−2}) − (1/2)(X_{t−2} − X_{t−3}) + e_t, i.e.

∇X_t = (3/2)∇X_{t−1} − (1/2)∇X_{t−2} + e_t
∇²X_t = (1/2)∇²X_{t−1} + e_t

so X_t is ARIMA(1,2,0), i.e. ARIMA(p,d,q) with p = 1 and d = 2. Since p + d = 3 > 1, Y_t = (X_t, X_{t−1}, ..., X_{t−p−d+1})^T = (X_t, X_{t−1}, X_{t−2})^T is Markov.

Question: let X_n = e_n + e_{n−1} be an MA(1) process, where e_n = 1 with probability 1/2 and −1 with probability 1/2. Then

P(X_n = 2 | X_{n−1} = 0) = P(e_n = 1, e_{n−1} = 1 | e_{n−1} + e_{n−2} = 0)
                        = P(e_n = 1) P(e_{n−1} = 1 | e_{n−1} + e_{n−2} = 0) = (1/2)(1/2) = 1/4

P(X_n = 2 | X_{n−1} = 0, X_{n−2} = 2) = P(e_n = 1, e_{n−1} = 1 | e_{n−1} + e_{n−2} = 0, e_{n−2} + e_{n−3} = 2) = 0,

since X_{n−2} = 2 forces e_{n−2} = 1, and then X_{n−1} = 0 forces e_{n−1} = −1. Not Markov: since the two probabilities differ, the distribution of X_n does not depend on the immediate past X_{n−1} only.

III. Non-stationarity: trends and techniques

III.1 Typical trends

Possible causes of non-stationarity in a time series are:
- a deterministic trend (e.g. linear or exponential growth)
- a deterministic cycle (e.g. seasonal effects)
- the time series is integrated.

Example: X_n = X_{n−1} + Z_n, where Z_n = +1 with probability 0.6 and −1 with probability 0.4. Here X_n is I(1), since Z_n = X_n − X_{n−1} is stationary. Also, E(X_n) = E(X_{n−1}) + 0.2, so the process has a deterministic trend.

Many techniques allow to detect non-stationary series; among the simplest methods:
- plot of the time series against t
- sample ACF (the estimate of the theoretical ACF defined in II.2.d).

A plot of the time series will highlight a trend in the data and will show up any cyclic variation.

[Figure: three simulated series showing a trend, a seasonal pattern, and trend + seasonal combined.]

Recall: for a stationary time series, ρ_k → 0 as k → ∞, i.e. the (theoretical) ACF converges toward zero. Hence the sample ACF should also converge toward zero. If the sample ACF decreases slowly, the time series is non-stationary and needs to be differenced before fitting a model.

[Figure: sample ACFs r_k against k, decaying rapidly for a stationary series and slowly for a non-stationary one.]

If the sample ACF exhibits periodic oscillation, there is probably a seasonal pattern in the data. This should be removed before fitting a model (see Figures 7.3a and 7.3b). The first graph (Fig 7.3a) shows the number of hotel rooms occupied over several years. Inspection shows the clear seasonal dependence, manifested as a cyclic effect. The next graph (Fig 7.3b) shows the sample autocorrelation function for this data. It is clear that the seasonal effect shows up as a cycle in this function. In particular, the period of this cycle looks to be 12 months, reinforcing the idea that it is a seasonal effect.

[Figure 7.3: seasonal variation in hotel room occupancy, 1963-1976 (7.3a), and its sample ACF (7.3b).]

Methods for removing a linear trend:

- least squares
- differencing.

Methods for removing a seasonal effect:
- seasonal differencing
- method of moving averages
- method of seasonal means.

III.2 Least squares trend removal

Fit a model X_t = a + bt + Y_t, where Y_t is a zero-mean stationary process. Recall: the e_t are the error variables ("true residuals") in a regression model; assume e_t ~ IN(0, σ²). Estimate the parameters a and b using linear regression, then fit a stationary model to the residuals:

ŷ_t = x_t − (â + b̂t).

Note: least squares may also be used to remove nonlinear trends from a time series. It is naturally possible to model any observed nonlinear trend by some term ξ(t) within X_t = ξ(t) + Y_t, which can be estimated using least squares. For example, a plot of hourly data of daily energy loads against temperature, over a one-day time frame, may indicate quadratic variations over the day; in this case one could use ξ(t) = a + bt².

III.3 Differencing

a. Differencing and linear trend removal

Use differencing if the sample ACF decreases slowly. If there is a linear trend, e.g. x_t = a + bt + y_t, then ∇x_t = x_t − x_{t−1} = b + ∇y_t, so differencing has removed the linear trend. If x is I(d), then differencing x d times will make it stationary. Differencing x once will remove any linear trend, as above.

Suppose x is I(1) with a linear trend. If we difference x once, then ∇x is stationary and we have removed the trend. However, if we remove the trend using linear regression, we will still be left with an I(1) process that is non-stationary.

Example: X_n = X_{n−1} + Z_n, where Z_n = +1 with probability 0.6 and −1 with probability 0.4. Let X_0 = 0. Then E(X_1) = 0.2, since E(Z_t) = 0.2, and E(X_2) = 0.2(2), ..., E(X_n) = 0.2n. So X_n is I(1) AND X_n has a linear trend. Let Y_n = X_n − 0.2n. Then E(Y_n) = 0, so we have removed the linear trend, but Y_n − Y_{n−1} = X_n − X_{n−1} − 0.2 = Z_n − 0.2. Hence Y_n is a random walk (which is non-stationary) and ∇Y_n is stationary, so Y_n is an I(1) process.

b. Selection of d

How many times (d) do we have to difference the time series X to convert it to stationarity? This will determine the parameter d in the fitted ARIMA(p,d,q) model. Recall the three causes of non-stationarity: trend; cycle; the time series is an integrated series. We are assuming that linear trends and cycles have been removed, so if the plot of the time series and its SACF indicate non-stationarity, it could be that the time series is a realisation of an integrated process and so must be differenced a number of times to achieve stationarity.

Choosing an appropriate value of d:
- Look at the SACF. If the SACF decays slowly to zero, this indicates a need for differencing (for a stationary ARMA model, the SACF decays rapidly to zero).
- Look at the sample variance of the original time series X and of its differences. Let σ̂²_(d) be the sample variance of z_t = ∇^d x_t. It is normally the case that σ̂²_(d) first decreases with d until stationarity is reached, and then starts to increase, since differencing too much introduces correlation. Take d equal to the value that minimises σ̂²_(d).

[Figure: plot of σ̂²_(d) against d = 0, 1, 2, 3; in this example the estimated variance is minimised at d = 1, so take d = 1.]
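
A sketch of this variance criterion (ours, assuming numpy and a simulated I(1) series; np.diff with n = 0 returns the input unchanged):

    import numpy as np

    rng = np.random.default_rng(2)
    x = np.cumsum(rng.normal(size=2000))         # a random walk, i.e. an I(1) series

    for d in range(4):
        z = np.diff(x, n=d)                      # d-th difference of the series
        print(d, round(z.var(), 3))              # variance is smallest at d = 1 here

Over-differencing shows up directly: for d = 2 the series is e_t − e_{t−1}, whose variance is twice that of the underlying white noise.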

III.4 Seasonal differencing

Example: let X_t be the monthly average temperature in London. Suppose that the model x_t = µ + Θ_t + y_t applies, where Θ_t is a periodic function with period 12 and y_t is stationary. The seasonal difference of X is defined as:

∇_12 x_t = x_t − x_{t−12}.

But x_t − x_{t−12} = (µ + Θ_t + y_t) − (µ + Θ_{t−12} + y_{t−12}) = y_t − y_{t−12}, since Θ_t = Θ_{t−12}. Hence x_t − x_{t−12} is a stationary process. We can model x_t − x_{t−12} as a stationary process and thus get a model for x_t.

Example: in the UK, monthly inflation figures are obtained by seasonal differencing of the retail prices index (RPI). If x_t is the value of the RPI for month t, then the annual inflation figure for month t is

[(x_t − x_{t−12}) / x_{t−12}] × 100%.

Remark 1: the number of seasonal differences taken is denoted by D. For example, for the seasonal differencing X_t − X_{t−12} = ∇_12 X_t we have D = 1.
Remark 2: in practice, for most time series we would need at most d = 2 and D = 1.

III.5 Method of moving averages

This method makes use of a simple linear filter to eliminate the effects of periodic variation. If X is a time series with seasonal effects with even period d = 2h, we define a smoothed process Y by

y_t = (1/d)[½x_{t−h} + x_{t−h+1} + ... + x_{t−1} + x_t + x_{t+1} + ... + x_{t+h−1} + ½x_{t+h}].

This ensures that each period makes an equal contribution to y_t. Example with quarterly data: a yearly period will have d = 4 = 2h, so h = 2, and (a numerical sketch follows below)

y_t = ¼(½x_{t−2} + x_{t−1} + x_t + x_{t+1} + ½x_{t+2}).
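
A sketch of this centred moving average (ours, assuming numpy and made-up quarterly data with a linear trend plus a zero-sum seasonal pattern):

    import numpy as np

    weights = np.array([0.5, 1.0, 1.0, 1.0, 0.5]) / 4.0   # (1/4)[1/2, 1, 1, 1, 1/2]

    t = np.arange(40)
    seasonal = np.tile([3.0, -1.0, -4.0, 2.0], 10)        # period-4 pattern, sums to 0
    x = 10 + 0.2 * t + seasonal                           # trend + seasonal pattern
    y = np.convolve(x, weights, mode="valid")             # centred moving average
    print(y[:6])     # close to 10 + 0.2t: the seasonal pattern is averaged out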

This is a centred moving average, since the average is taken symmetrically around the time t. Such an average can only be calculated retrospectively. For odd periods d = 2h + 1, the end terms x_{t−h} and x_{t+h} need not be halved:

y_t = [1/(2h + 1)](x_{t−h} + x_{t−h+1} + ... + x_{t−1} + x_t + x_{t+1} + ... + x_{t+h−1} + x_{t+h}).

Example: with data every 4 months, a yearly period will have d = 3 = 2h + 1, so h = 1 and y_t = (1/3)(x_{t−1} + x_t + x_{t+1}).

III.6 Seasonal means

In fitting the seasonal model x_t = µ + Θ_t + y_t with E(Y_t) = 0 (additive model) to a monthly time series x_t extending over 10 years from January 1990, the estimate of µ is x̄ (the average over all 120 observations) and the estimate of Θ_January is

Θ̂_January = (x_1 + x_13 + ... + x_109)/10 − x̄,

the difference between the average value for January and the overall average over all the months. Recall that Θ_t is a periodic function with period 12 and y_t is stationary; thus Θ_t contains the deviation of the model (from the overall mean µ) at time t due to the seasonal effect.

Month/Year   1     2     ...  10     mean
January      x_1   x_13  ...  x_109  Θ̂_January + x̄
...          ...   ...   ...  ...    ...
December     x_12  x_24  ...  x_120  Θ̂_December + x̄
                                     overall mean x̄

III.7 Filtering, smoothing

Filtering and exponential smoothing techniques are commonly applied to time series in order to clean the original series from undesired artifacts. The moving average is an example of a filtering technique. Other filters may be applied depending on the nature of the input series. Exponential smoothing is another common set of techniques. It is typically used to simplify the input time series by dampening its variations so as to retain in priority the underlying dynamics.

III.8 Transformations

Recall: in the simple linear model y_i = β_0 + β_1x_i + e_i, where e_i ~ IN(0, σ²), we use regression diagnostic plots of the residuals ê_i to test the assumptions about the model (e.g. the normality of the error variables e_i, or their constant variance). To test the latter assumption we plot the residuals against the fitted values.

[Figure: residuals ê_i plotted against fitted values ŷ_i, scattered evenly about 0.]

If the plot does not appear as above, the data is transformed; the most common transformation is the logarithmic transformation. Similarly, if after fitting an ARMA model to a time series x_t a plot of the residuals versus the fitted values indicates a dependence, then we should consider modelling a transformation of the time series x_t, and the most common transformation is the logarithmic transformation Y_t = ln(X_t).

IV. Box-Jenkins methodology

IV.1 Overview

We consider how to fit an ARIMA(p,d,q) model to historical data {x_1, x_2, ..., x_n}. We assume that trends and seasonal effects have been removed from the data. The methodology developed by Box and Jenkins consists in 3 distinct steps:
- tentative identification of an ARIMA model
- estimation of the parameters of the identified model
- diagnostic checks.

If the tentatively identified model passes the diagnostic tests, it can be used for forecasting. If it does not, the diagnostic tests should indicate how the model should be modified, and a new cycle of identification, estimation and diagnostic checks is performed.

IV.2 Model selection

a. Identification of white noise

Recall: in a simple linear regression model, y_i = β_0 + β_1x_i + e_i, e_i ~ IN(0, σ²), we use regression diagnostic plots of the residuals ê_i to test the goodness of fit of the model, i.e. whether the assumptions e_i ~ IN(0, σ²) are justified. The error variables e_i form a zero-mean white noise process: they are uncorrelated, with common variance σ².

Recall: {e_t} is a zero-mean white noise process if E(e_t) = 0 and γ_k = Cov(e_t, e_{t+k}) = σ² if k = 0, and 0 otherwise.

Thus the ACF and PACF of a white noise process (when plotted against k) show a single spike ρ_0 = 1 at lag 0 and are zero everywhere else:

[Figure: ACF ρ_k and PACF ψ_k of a white noise process, zero at all lags k ≥ 1.]

i.e. apart from ρ_0 = 1, we have ρ_k = 0 for k = 1, 2, ... and ψ_k = 0 for k = 1, 2, ...

Question: how do we test whether the residuals from a time series model look like a realisation of a white noise process?
Answer: we look at the SACF and SPACF of the residuals. In studying the SACF and SPACF, we realise that even if the original process was white noise, we would not expect r_k = 0 for k = 1, 2, ... and ψ̂_k = 0 for k = 1, 2, ..., as r_k is only an estimate of ρ_k and ψ̂_k is only an estimate of ψ_k.

Question: how close to 0 should r_k and ψ̂_k be, if ρ_k = 0 and ψ_k = 0 for k = 1, 2, ...?
Answer: if the original model is white noise, X_t = µ + e_t, then for each k the SACF and SPACF satisfy approximately

r_k ~ N(0, 1/n) and ψ̂_k ~ N(0, 1/n).

This is true for large samples, i.e. for large values of n. Values of r_k or ψ̂_k outside the range (−2/√n, +2/√n) can be taken as suggesting that a white noise model is inappropriate. However, these are only approximate 95% confidence intervals: if ρ_k = 0, we can be 95% certain that r_k lies between these limits. This means that 1 value in 20 will lie outside these limits even if the white noise model is correct. Hence a single value of r_k or ψ̂_k outside these limits would not be regarded as significant on its own, but three such values might well be significant.

There is an overall goodness-of-fit test, based on all the r_k's in the SACF rather than on individual r_k's, called the portmanteau test of Ljung and Box. It consists in checking whether the first m sample autocorrelation coefficients of the residuals are too large to resemble those of a white noise process (for which they should all be negligible). Given the residuals of an estimated ARMA(p,q) model, under the null hypothesis that all the ρ_k = 0, the Q-statistic

Q = n(n + 2) Σ_{k=1}^m r_k²/(n − k)

is asymptotically χ²-distributed with s = m − p − q degrees of freedom (or s = m − p − q − 1 degrees of freedom if a constant, say µ, is included). If the Q-statistic is found to be greater than the 95th percentile of that χ² distribution, the null hypothesis is rejected, which means that the alternative hypothesis that at least one autocorrelation is non-zero is accepted. Statistical packages print these statistics; a sketch of the computation is given below.
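
A sketch of the Ljung-Box computation (ours; it assumes numpy and scipy for the χ² tail probability):

    import numpy as np
    from scipy import stats

    def ljung_box(resid, m, n_params=0):
        # Q = n(n+2) * sum_{k=1}^m r_k^2 / (n - k), compared with chi^2_{m - n_params}
        resid = np.asarray(resid, dtype=float)
        n = len(resid)
        rc = resid - resid.mean()
        denom = np.sum(rc ** 2)
        r = np.array([np.sum(rc[k:] * rc[:-k]) / denom for k in range(1, m + 1)])
        q = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))
        p_value = stats.chi2.sf(q, df=m - n_params)
        return q, p_value

    rng = np.random.default_rng(3)
    print(ljung_box(rng.normal(size=500), m=20))   # large p-value for white noise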

For large n, the Ljung-Box Q-statistic tends to closely approximate the Box-Pierce statistic:

n(n + 2) Σ_{k=1}^m r_k²/(n − k) ≈ n Σ_{k=1}^m r_k².

The overall diagnostic test is therefore performed as follows (for centred realisations): fit the ARMA(p,q) model; estimate the (p + q) parameters; test whether Q = n(n + 2) Σ_{k=1}^m r_k²/(n − k) ~ χ²_{m−p−q}.

Remark: the Ljung-Box Q-statistic was first suggested to improve upon the simpler Box-Pierce test statistic Q = n Σ_{k=1}^m r_k², which was found to perform poorly even for moderately large sample sizes.

b. Identification of MA(q)

Recall: for an MA(q) process, ρ_k = 0 for all k > q, i.e. the ACF cuts off after lag q. To test whether an MA(q) model is appropriate, we see if r_k is close to 0 for all k > q. If the data do come from an MA(q) model, then for k > q (only the first q + 1 coefficients ρ_0, ..., ρ_q are non-zero)

r_k ~ N(0, (1/n)(1 + 2Σ_{i=1}^q ρ_i²)),

and 95% of the r_k's should lie in the interval

± (1.96/√n) √(1 + 2Σ_{i=1}^q ρ_i²)

(note that it is common to use 2 instead of 1.96 in this formula). We would expect 1 in 20 values to lie outside the interval. In practice, the ρ_i's are replaced by the r_i's. The confidence limits on SACF plots are based on this. If r_k lies outside these limits, it is significantly different from zero and we conclude that ρ_k ≠ 0. Otherwise, r_k is not significantly different from zero and we conclude that ρ_k = 0.

[Figure: SACF plot with dashed confidence limits.]

For q = 0, the limits for k = 1 are ±1.96/√n, as for testing for a white noise model; the coefficient r_1 is compared with these limits. For q = 1, the limits for k = 2 are ±(1.96/√n)√(1 + 2r_1²), and r_2 is compared with these limits. Again, 2 is often used in place of 1.96.

c. Identification of AR(p)

Recall: for an AR(p) process, ψ_k = 0 for all k > p, i.e. the PACF cuts off after lag p. To test whether an AR(p) model is appropriate, we see if the sample estimate ψ̂_k is close to 0 for all k > p. If the data do come from an AR(p) model, then for k > p,

ψ̂_k ~ N(0, 1/n),

and 95% of the sample estimates should lie in the interval ±2/√n. The confidence limits on SPACF plots are based on this: if the sample estimate of ψ_k lies outside these limits, it is significant.

[Figure: sample PACF of an AR series; the early partial autocorrelations are large, and the later ones lie within the ±2/√n limits.]

IV.3 Model fitting

a. Fitting an ARMA(p,q) model

We make the following assumptions:
- an appropriate value of d has been found and {z_{d+1}, z_{d+2}, ..., z_n} is stationary
- the sample mean z̄ = 0; if not, subtract µ̂ = z̄ from each z_i
- for simplicity, d = 0 (to simplify the upper and lower limits of sums).

We look for an ARMA(p,q) model for the data z:
- if the SACF appears to cut off after lag q, an MA(q) model is indicated (we use the tests of significance described previously)
- if the SPACF appears to cut off after lag p, an AR(p) model is indicated
- if neither the SACF nor the SPACF cuts off, mixed models must be considered, starting with ARMA(1,1).

b. Parameter estimation: LS and ML

Having identified the values of the parameters p and q, we must now estimate the values of the parameters α_1, α_2, ..., α_p and β_1, β_2, ..., β_q in the model

Z_t = α_1Z_{t−1} + ... + α_pZ_{t−p} + e_t + β_1e_{t−1} + ... + β_qe_{t−q}.

Least squares (LS) estimation is equivalent to maximum likelihood (ML) estimation if e_t is assumed normally distributed.

Example: in the AR(p) model, e_t = Z_t − α_1Z_{t−1} − ... − α_pZ_{t−p}. The estimators α̂_1, ..., α̂_p are chosen to minimise

Σ_{t=p+1}^n (z_t − α̂_1z_{t−1} − ... − α̂_pz_{t−p})².

Once these estimates are obtained, the residual at time t is given by

ê_t = z_t − α̂_1z_{t−1} − ... − α̂_pz_{t−p}.

For general ARMA models, ê_t cannot be deduced from the z_t alone. In the MA(1) model for instance, ê_t = z_t − β̂ê_{t−1}. We can solve this iteratively for ê_t as long as some starting value ê_0 is assumed. For an ARMA(p,q) model, the list of starting values is (ê_0, ê_{−1}, ..., ê_{1−q}). The starting values are estimated recursively by backforecasting:

1. Assume (ê_0, ê_{−1}, ..., ê_{1−q}) are all zero. Estimate the α_i and β_j.
2. Use forecasting on the time-reversed process {z_n, ..., z_1} to predict values for (ê_0, ê_{−1}, ..., ê_{1−q}).
3. Repeat cycle (1)-(2) until the estimates converge.

c. Parameter estimation: method of moments

Calculate the theoretical ACF of the ARMA(p,q): the ρ_k's will be a function of the α's and β's. Set ρ_k = r_k and solve for the α's and β's. These are the method of moments estimators.

Example: you have decided to fit the MA(1) model x_n = e_n + βe_{n−1}, e_n ~ N(0, σ²), and you have calculated γ̂_0 = 2, γ̂_1 = −0.5. Estimate β.

We have r_1 = γ̂_1/γ̂_0 = −0.25. Recall: γ_0 = (1 + β²)σ² and γ_1 = βσ² here, from which ρ_1 = β/(1 + β²). Setting ρ_1 = r_1 = −0.25 and solving for β gives β = −0.268 or β = −3.732. Recall: the MA(1) process is invertible IFF |β| < 1. So for β = −0.268 the model is invertible, but for β = −3.732 the model is not invertible.

Note: if instead r_1 = γ̂_1/γ̂_0 = −0.5, then setting ρ_1 = r_1 = β/(1 + β²) = −0.5 gives (β + 1)² = 0, so β = −1 (a double root), and neither estimate gives an invertible model.
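
The quadratic in this example can be solved mechanically. A sketch (ours, assuming numpy):

    import numpy as np

    gamma0_hat, gamma1_hat = 2.0, -0.5
    r1 = gamma1_hat / gamma0_hat                 # = -0.25

    # rho_1 = beta / (1 + beta^2) = r1  <=>  r1*beta^2 - beta + r1 = 0
    beta = np.roots([r1, -1.0, r1])
    print(beta)                                  # [-3.732, -0.268]
    beta_inv = beta[np.abs(beta) < 1][0]         # keep the invertible root
    sigma2 = gamma0_hat / (1 + beta_inv ** 2)    # from gamma_0 = (1 + beta^2) sigma^2
    print(beta_inv, sigma2)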

Now, let us estimate σ² = Var(e_t). Recall that in the simple linear model Y_i = β_0 + β_1X_i + e_i, e_i ~ IN(0, σ²), σ² is estimated by

σ̂² = [1/(n − 2)] Σ_{i=1}^n ê_i²,

where ê_i = y_i − β̂_0 − β̂_1x_i is the i-th residual. Here we use

σ̂² = [1/(n − p)] Σ_{t=p+1}^n ê_t², where ê_t = z_t − α̂_1z_{t−1} − ... − α̂_pz_{t−p} − β̂_1ê_{t−1} − ... − β̂_qê_{t−q}.

No matter which estimation method is used, this parameter is estimated last, as estimates of the α's and β's are required first.

Note: in using either least squares or maximum likelihood estimation we also find the residuals ê_t, whereas when using the method of moments to estimate the α's and β's these residuals have to be calculated afterwards.

Note: for large n, there will be little difference between the LS, ML and method of moments estimators.

d. Diagnostic checking

Assume we have identified a tentative ARIMA(p,d,q) model and calculated the estimates µ̂, σ̂², α̂_1, ..., α̂_p, β̂_1, ..., β̂_q. We must perform diagnostic checks based on the residuals. If the ARMA(p,q) model is a good approximation to the underlying time series process, then the residuals ê_t will form a good approximation to a white noise process.

(I) Tests to see if the residuals are white noise:
- Study the SACF and SPACF of the residuals: do r_k and ψ̂_k lie outside (−1.96/√n, 1.96/√n)?
- Portmanteau test of the residuals (carried out on the residual SACF): Q = n(n + 2) Σ_{k=1}^m r_k²/(n − k) ~ χ²_{m−s}, where s = number of parameters of the model.

If the SACF or SPACF of the residuals has too many values outside the interval (−1.96/√n, 1.96/√n), we conclude that the fitted model does not have enough parameters, and a new model with additional parameters should be fitted. The portmanteau test may also be used for this purpose. Other tests are:
- inspection of the graph of {ê_t}
- counting turning points
- study of the sample spectral density function of the residuals.

(II) Inspection of the graph of {ê_t}: plot ê_t against t, and plot ê_t against z_t. Any patterns evident in these plots may indicate that the residuals are not a realisation of a set of independent (uncorrelated) variables, and so that the model is inadequate.

(III) Counting turning points: this is a test of independence. Are the residuals a realisation of a set of independent variables?

[Figure: the six possible configurations of three consecutive points; all except (a) and (b), the monotone ones, exhibit a turning point.]

Since four out of the six possible configurations exhibit a turning point, the probability of observing one is 4/6 = 2/3. If y_1, y_2, ..., y_n is a sequence of numbers, the sequence has a turning point at time k if either

y_{k−1} < y_k AND y_k > y_{k+1}, or y_{k−1} > y_k AND y_k < y_{k+1}.

Result: if Y_1, Y_2, ..., Y_N is a sequence of independent random variables, then the probability of a turning point at time k is 2/3; the expected number of turning points is (2/3)(N − 2); the variance is (16N − 29)/90 [Kendall and Stuart, The Advanced Theory of Statistics, 1966, vol 3, p. 351]. Therefore, the number of turning points in a realisation of Y_1, Y_2, ..., Y_N should lie within the 95% confidence interval

[(2/3)(N − 2) − 1.96 √((16N − 29)/90), (2/3)(N − 2) + 1.96 √((16N − 29)/90)].
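
A sketch of the turning point test (ours, assuming numpy and applying the test to simulated i.i.d. data):

    import numpy as np

    def turning_point_test(y):
        # Count turning points and return the 95% interval based on
        # mean 2(N-2)/3 and variance (16N-29)/90 for independent data.
        y = np.asarray(y, dtype=float)
        N = len(y)
        mid, left, right = y[1:-1], y[:-2], y[2:]
        T = np.sum(((mid > left) & (mid > right)) | ((mid < left) & (mid < right)))
        mean = 2.0 * (N - 2) / 3.0
        sd = np.sqrt((16.0 * N - 29.0) / 90.0)
        return T, (mean - 1.96 * sd, mean + 1.96 * sd)

    rng = np.random.default_rng(4)
    T, ci = turning_point_test(rng.normal(size=1000))
    print(T, ci)     # T should fall inside the interval for independent data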

(IV) Study the sample spectral density function of the residuals. Recall: the spectral density function of a white noise process is f(ω) = σ²/2π, −π < ω < π. So the sample spectral density function of the residuals should be roughly constant for a white noise process.

V. Forecasting

V.1 The Box-Jenkins approach

Having fitted an ARMA model to {x_1, x_2, ..., x_n}, we have the equation

X_{n+k} = µ + α_1(X_{n+k−1} − µ) + ... + α_p(X_{n+k−p} − µ) + e_{n+k} + β_1e_{n+k−1} + ... + β_qe_{n+k−q}.

x̂_n(k) = forecast value of x_{n+k}, given all observations up until time n = k-step ahead forecast at time n. In the Box-Jenkins approach, x̂_n(k) is taken as E(X_{n+k} | X_1, ..., X_n), i.e. x̂_n(k) is the conditional expectation of the future value of the process, given the information currently available. From a result in ST3053 (section A), we know that E(X_{n+k} | X_1, ..., X_n) minimises the mean square error E[(X_{n+k} − h(X_1, ..., X_n))²] over all functions h(X_1, ..., X_n).

x̂_n(k) is calculated as follows from the equation for X_{n+k}:
- replace all unknown parameters by their estimated values
- replace the random variables X_1, ..., X_n by their observed values x_1, ..., x_n
- replace the random variables X_{n+1}, ..., X_{n+k−1} by their forecast values x̂_n(1), ..., x̂_n(k−1)
- replace the variables e_1, ..., e_n by the residuals ê_1, ..., ê_n
- replace the variables e_{n+1}, ..., e_{n+k} by their expectations, 0.

Example: AR(2) model x_n = µ + α_1(x_{n−1} − µ) + α_2(x_{n−2} − µ) + e_n. Since

X_{n+1} = µ + α_1(X_n − µ) + α_2(X_{n−1} − µ) + e_{n+1}
X_{n+2} = µ + α_1(X_{n+1} − µ) + α_2(X_n − µ) + e_{n+2}

we have

x̂_n(1) = µ̂ + α̂_1(x_n − µ̂) + α̂_2(x_{n−1} − µ̂)
x̂_n(2) = µ̂ + α̂_1(x̂_n(1) − µ̂) + α̂_2(x_n − µ̂)
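
The AR(2) forecast recursion above can be coded directly. A sketch (ours; the fitted parameter values and the data are made up for illustration):

    def ar2_forecast(x, mu, a1, a2, k):
        # k-step forecasts for x_t = mu + a1(x_{t-1} - mu) + a2(x_{t-2} - mu) + e_t
        hist = [x[-2], x[-1]]                    # last two observed values
        out = []
        for _ in range(k):
            f = mu + a1 * (hist[-1] - mu) + a2 * (hist[-2] - mu)  # future e's -> 0
            hist.append(f)                       # future X's -> their forecasts
            out.append(f)
        return out

    x = [10.2, 9.8, 10.5, 10.1]                  # observed series (made-up numbers)
    print(ar2_forecast(x, mu=10.0, a1=0.5, a2=0.2, k=3))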

Example: 2-step ahead forecast of the ARMA(2,2) model x_n = µ + α_1(x_{n−1} − µ) + α_2(x_{n−2} − µ) + e_n + β_1e_{n−1} + β_2e_{n−2}. Since

X_{n+2} = µ + α_1(X_{n+1} − µ) + α_2(X_n − µ) + e_{n+2} + β_1e_{n+1} + β_2e_n,

we have (the future white noise terms e_{n+2} and e_{n+1} are replaced by 0):

x̂_n(2) = µ̂ + α̂_1(x̂_n(1) − µ̂) + α̂_2(x_n − µ̂) + β̂_2ê_n.

The (forecast) error of the forecast x̂_n(k) is X_{n+k} − x̂_n(k). The expected value of this error is

E(X_{n+k} − x̂_n(k) | x_1, ..., x_n) = x̂_n(k) − x̂_n(k) = 0.

Hence the variance of the forecast error is E[(X_{n+k} − x̂_n(k))² | x_1, ..., x_n]. This is needed for confidence interval forecasts, as it is more useful than a point estimate. For stationary processes, it may be shown that x̂_n(k) → µ̂ as k → ∞. Hence the variance of the forecast error tends to E[(X_{n+k} − µ)²] = γ_0 as k → ∞, where γ_0 is the variance of the process.

V.2 Forecasting ARIMA processes

If X is ARIMA(p,d,q), then Z = ∇^d X is ARMA(p,q):
- use the methods reviewed above to produce forecasts for Z
- reverse the differencing procedure to produce forecasts for X.

Example: if X is ARIMA(0,1,1), then Z = ∇X is ARMA(0,1), leading to the forecast ẑ_n(1). But X_{n+1} = X_n + Z_{n+1}, so x̂_n(1) = x_n + ẑ_n(1).

Question: find x̂_n(2) for an ARIMA(1,2,1) process. Let Z_n = ∇²X_n and assume Z_n = µ + α(Z_{n−1} − µ) + e_n + βe_{n−1}. But

Z_{n+2} = ∇²X_{n+2} = (X_{n+2} − X_{n+1}) − (X_{n+1} − X_n) = X_{n+2} − 2X_{n+1} + X_n,

so X_{n+2} = 2X_{n+1} − X_n + Z_{n+2}. Hence

x̂_n(2) = 2x̂_n(1) − x_n + ẑ_n(2) = 2x̂_n(1) − x_n + µ̂ + α̂(ẑ_n(1) − µ̂).

V.3 Exponential smoothing and Holt-Winters

The Box-Jenkins method requires a skilled operator in order to obtain reliable results. For cases where only a simple forecast is needed, exponential smoothing is much simpler (Holt, 1958). A weighted combination of past values is used to predict future observations; for example, the one-step forecast is obtained as

x̂_n(1) = α Σ_{i=0}^∞ (1 − α)^i x_{n−i} = [α/(1 − (1 − α)B)] x_n,

or x̂_n(1) = α(x_n + (1 − α)x_{n−1} + (1 − α)²x_{n−2} + ...).

The sum of the weights is α Σ_{i=0}^∞ (1 − α)^i = α/(1 − (1 − α)) = 1.

Generally we use a value of α such that 0 < α < 1, so that there is less emphasis on historic values further back in time (usually 0.2 ≤ α ≤ 0.3). There is only one parameter to control, usually estimated via least squares. The weights decrease geometrically, hence the name "exponential smoothing".

Updating forecasts is easy with exponential smoothing. It is easy to see that

x̂_n(1) = (1 − α)x̂_{n−1}(1) + αx_n = x̂_{n−1}(1) + α(x_n − x̂_{n−1}(1)),

i.e. current forecast = previous forecast + α × (error in previous forecast).
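
A sketch of the updating recursion (ours; the starting forecast is taken to be the first observation, one common choice, and the data are made up):

    def exp_smoothing_forecasts(x, alpha):
        # One-step forecasts: x_hat_n(1) = x_hat_{n-1}(1) + alpha*(x_n - x_hat_{n-1}(1))
        f = x[0]
        out = []
        for xn in x[1:]:
            f = f + alpha * (xn - f)             # previous forecast + alpha * error
            out.append(f)
        return out

    print(exp_smoothing_forecasts([10.0, 12.0, 11.0, 13.0, 12.5], alpha=0.3))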

Simple exponential smoothing cannot cope with trend or seasonal variation; Holt-Winters smoothing can cope with trend and seasonal variation, and can sometimes outperform Box-Jenkins forecasts.

V.4 Linear filtering

[Diagram: input process x_t → linear filter (weights a_k) → output process y_t.]

A linear filter is a transformation of a time series {x_t} (the input series) to create an output series {y_t} which satisfies

y_t = Σ_{k=−∞}^{∞} a_k x_{t−k}.

The collection of weights {a_k : k ∈ Z} forms a complete description of the filter. The objective of the filtering is to modify the input series to meet particular objectives, or to display specific features of the data. For example, an important problem in the analysis of economic time series is the detection, isolation and removal of deterministic trends. In practice, a filter {a_k : k ∈ Z} normally contains only a relatively small number of non-zero components.

Example: regular differencing. This is used to remove a linear trend. Here a_0 = 1, a_1 = −1, a_k = 0 otherwise. Hence y_t = x_t − x_{t−1}.

Example: seasonal differencing. Here a_0 = 1, a_12 = −1, a_k = 0 otherwise, and y_t = x_t − x_{t−12}.

Example: if the input series is a white noise and the filter takes the form {β_0 = 1, β_1, ..., β_q}, then the output series is MA(q), since y_t = Σ_{k=0}^q β_k e_{t−k}.

Example: if the input series x is AR(p) and the filter takes the form {a_0 = 1, a_1 = −α_1, ..., a_p = −α_p}, then the output series is white noise:

y_t = x_t − Σ_{k=1}^p α_k x_{t−k} = e_t.
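
A sketch of the last example (ours, assuming numpy): applying the filter {1, −α} to a simulated AR(1) input recovers the white noise driving the process:

    import numpy as np

    rng = np.random.default_rng(5)
    n, alpha = 5000, 0.7
    e = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):                        # AR(1) input series (x_0 = 0)
        x[t] = alpha * x[t - 1] + e[t]

    y = x[1:] - alpha * x[:-1]                   # filter {a_0 = 1, a_1 = -alpha}
    print(np.allclose(y, e[1:]))                 # True: the output is the white noise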