
Levinson-Durbin Recursions: I

- note: B&D and S&S say "Durbin-Levinson," but "Levinson-Durbin" is more commonly used (Levinson, 1947, and Durbin, 1960, are the source articles; sometimes just "Levinson" is used)

- the recursions solve $\Gamma_n a_n = \gamma_n(1)$ efficiently, giving us the coefficients $a_n$ needed for the best linear predictor $\hat{X}_{n+1} = a_n' X_n$ of $X_{n+1}$ given $X_n = [X_n, \ldots, X_1]'$

- in doing so, the L-D recursions also give us the coefficients $a_m$ for $\hat{X}_{m+1} = a_m' X_m$, $m = 1, \ldots, n-1$, the best linear predictor of $X_{m+1}$ given $X_m = [X_m, \ldots, X_1]'$

- as a by-product, the recursions also yield the partial autocorrelation function (PACF), also known as the partial autocorrelation sequence or reflection coefficient sequence

- will state the L-D recursions without proof (B&D have one; S&S leave it as an exercise; an alternative proof will be given in Stat/EE 520)

BD 69, CC 113, SS 112, 165        XI 1

Levinson-Durbin Recursions: II

- to keep track of best linear predictors as the sample size $n$ increases (and to emphasize certain connections with AR processes), will switch notation from $a_n$ to $\phi_n$ henceforth

- we now write

    $\hat{X}_{n+1} = \phi_{n,1} X_n + \phi_{n,2} X_{n-1} + \cdots + \phi_{n,n} X_1 = \phi_n' X_n$,

  where $\phi_n \equiv [\phi_{n,1}, \phi_{n,2}, \ldots, \phi_{n,n}]'$

- simplify $\gamma_n(1)$ to just $\gamma_n$, so that $\gamma_n = [\gamma(1), \gamma(2), \ldots, \gamma(n)]'$

- in the new notation, the L-D recursions solve for $\phi_n$ in $\Gamma_n \phi_n = \gamma_n$

- recall that $\Gamma_n$ is the covariance matrix for $X_n$, so its $(i,j)$th element is $\mathrm{cov}\{X_i, X_j\} = \gamma(i-j)$

XI 2

Levinson-Durbin Recursions: III

- referring back to overhead X 13, will denote the mean square error (MSE) associated with the predictor $\hat{X}_{n+1}$ as

    $v_n \equiv E\{(X_{n+1} - \hat{X}_{n+1})^2\} = \mathrm{var}\{X_{n+1} - \hat{X}_{n+1}\}$
        $= \gamma(0) - \phi_n' \gamma_n$
        $= \mathrm{var}\{X_{n+1}\} - \phi_n' \mathrm{cov}\{X_{n+1}, X_n\}$

BD 69, 70        XI 3

Levinson-Durbin Recursions: IV

- for $n = 1$, have $\hat{X}_2 = \phi_{1,1} X_1$

- the equation $\Gamma_n \phi_n = \gamma_n$ becomes $\gamma(0)\phi_{1,1} = \gamma(1)$   (*)

- the solution is $\phi_{1,1} = \gamma(1)/\gamma(0) = \rho(1)$

- the associated MSE is

    $v_1 = \gamma(0) - \phi_1' \gamma_1 = \gamma(0) - \phi_{1,1}\gamma(1)$
        $= \gamma(0) - \phi_{1,1}[\phi_{1,1}\gamma(0)]$   (making use of (*))
        $= \gamma(0)(1 - \phi_{1,1}^2) = v_0(1 - \phi_{1,1}^2)$,  with $v_0 \equiv \gamma(0)$

- Q: why is $\gamma(0)$ a natural definition for $v_0$?

- note the connection to the AR(1) model $X_t = \phi_{1,1} X_{t-1} + Z_t$ with $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2(1 - \phi_{1,1}^2))$, for which $\gamma(0) = \sigma^2$

BD 70, SS 112        XI 4

Levinson-Durbin Recursions: V

- given $\phi_{n-1}$ & $v_{n-1}$, the L-D recursion gets $\phi_n$ & $v_n$ in 3 steps

1. get the $n$th order partial autocorrelation (more on this later!):

    $\phi_{n,n} = \frac{\gamma(n) - \sum_{j=1}^{n-1} \phi_{n-1,j}\gamma(n-j)}{v_{n-1}}$

   note: the sum is the inner product of $\phi_{n-1}$ and the order reversal of $\gamma_{n-1}$

2. get the remaining $\phi_{n,j}$'s:

    $\begin{bmatrix} \phi_{n,1} \\ \vdots \\ \phi_{n,n-1} \end{bmatrix} = \begin{bmatrix} \phi_{n-1,1} \\ \vdots \\ \phi_{n-1,n-1} \end{bmatrix} - \phi_{n,n} \begin{bmatrix} \phi_{n-1,n-1} \\ \vdots \\ \phi_{n-1,1} \end{bmatrix}$

3. get the $n$th order MSE:

    $v_n = v_{n-1}(1 - \phi_{n,n}^2)$

BD 70, SS 112        XI 5
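
To make the three steps concrete, here is a minimal R sketch of the recursions (illustrative only, not from the overheads; the function name `levinson_durbin` and its return structure are made up). It takes the ACVF values $\gamma(0), \ldots, \gamma(n)$ and returns the coefficient vectors $\phi_m$, the partial autocorrelations $\phi_{m,m}$, and the MSEs $v_0, \ldots, v_n$.

```r
## Minimal sketch of the Levinson-Durbin recursions (illustrative only).
## `acvf` is assumed to hold gamma(0), gamma(1), ..., gamma(n).
levinson_durbin <- function(acvf) {
  n <- length(acvf) - 1
  v <- numeric(n + 1)
  v[1] <- acvf[1]                  # v_0 = gamma(0)
  coeffs <- vector("list", n)      # coeffs[[m]] = c(phi_{m,1}, ..., phi_{m,m})
  pacf <- numeric(n)               # phi_{1,1}, phi_{2,2}, ..., phi_{n,n}
  phi <- numeric(0)
  for (m in 1:n) {
    ## step 1: mth order partial autocorrelation
    num <- acvf[m + 1]
    if (m > 1) num <- num - sum(phi * acvf[m:2])   # sum_j phi_{m-1,j} gamma(m-j)
    phi_mm <- num / v[m]
    ## step 2: phi_{m,j} = phi_{m-1,j} - phi_{m,m} phi_{m-1,m-j}, j = 1, ..., m-1
    if (m > 1) phi <- phi - phi_mm * rev(phi)
    phi <- c(phi, phi_mm)
    coeffs[[m]] <- phi
    pacf[m] <- phi_mm
    ## step 3: mth order MSE
    v[m + 1] <- v[m] * (1 - phi_mm^2)
  }
  list(coeffs = coeffs, pacf = pacf, v = v)
}
```

With the AR(1) ACVF $\gamma(h) = \sigma^2\phi^h/(1-\phi^2)$, the sketch should reproduce $\phi_n = [\phi, 0, \ldots, 0]'$ and $v_n = \sigma^2$, which is exactly what the next three overheads verify algebraically.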

Levinson-Durbin Recursions: VI

- as a first example, reconsider the AR(1) process $X_t = \phi X_{t-1} + Z_t$, where $|\phi| < 1$ and $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$

- have already argued (overhead X 14) that $\hat{X}_{n+1} = \phi X_n$, i.e., $\phi_n = [\phi, 0, \ldots, 0]'$, and that $v_n = \sigma^2$ for all $n$ since the MSE is $\sigma^2$

- accordingly, let's apply the L-D recursions to $\phi_{n-1} = [\phi, 0, \ldots, 0]'$ & $v_{n-1} = \sigma^2$ and see if the required forms for $\phi_n$ and $v_n$ pop out

- step 1: recalling that $\gamma(h) = \sigma^2\phi^h/(1 - \phi^2)$ for $h \ge 0$, we have

    $\phi_{n,n} = \frac{\gamma(n) - \sum_{j=1}^{n-1}\phi_{n-1,j}\gamma(n-j)}{v_{n-1}} = \frac{\sigma^2[\phi^n - \phi_{n-1,1}\phi^{n-1}]}{v_{n-1}(1 - \phi^2)} = \frac{\sigma^2[\phi^n - \phi^n]}{v_{n-1}(1 - \phi^2)} = 0$

XI 6

Levinson-Durbin Recursions: VII

- step 2 yields

    $\begin{bmatrix} \phi_{n,1} \\ \phi_{n,2} \\ \vdots \\ \phi_{n,n-2} \\ \phi_{n,n-1} \end{bmatrix} = \begin{bmatrix} \phi_{n-1,1} \\ \phi_{n-1,2} \\ \vdots \\ \phi_{n-1,n-2} \\ \phi_{n-1,n-1} \end{bmatrix} - \phi_{n,n} \begin{bmatrix} \phi_{n-1,n-1} \\ \phi_{n-1,n-2} \\ \vdots \\ \phi_{n-1,2} \\ \phi_{n-1,1} \end{bmatrix} = \begin{bmatrix} \phi \\ 0 \\ \vdots \\ 0 \\ 0 \end{bmatrix} - 0 \cdot \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \phi \end{bmatrix}$,

  so $\phi_n = [\phi, 0, \ldots, 0]'$ as required

XI 7

Levinson-Durbin Recursions: VIII

- step 3: $v_n = v_{n-1}(1 - \phi_{n,n}^2) = v_{n-1} = \sigma^2$, as required

- note: the partial autocorrelation $\phi_{n,n}$ for the AR(1) process is $\phi$ for $n = 1$ and is zero for $n = 2, 3, \ldots$

- homework exercise: run the L-D recursions on an MA(1) process

- as a 2nd example, reconsider the stationary process of Problem 3(b): $X_t = Z_1\cos(\omega t) + Z_2\sin(\omega t)$, where $Z_1$ and $Z_2$ are independent $\mathcal{N}(0, 1)$ RVs

- the ACVF for $\{X_t\}$ is $\gamma(h) = \cos(\omega h)$ (the same as its ACF $\rho(h)$, since $\gamma(0) = 1$)

- starting with $\hat{X}_2 = \phi_{1,1} X_1$ ($n = 1$ case), we have $\phi_{1,1} = \rho(1) = \cos(\omega)$ and $v_1 = \gamma(0)(1 - \phi_{1,1}^2) = 1 - \cos^2(\omega) = \sin^2(\omega)$

XI 8

Levinson-Durbin Recursions: IX

- now let us get the coefficients for $\hat{X}_3 = \phi_{2,1} X_2 + \phi_{2,2} X_1$ ($n = 2$ case) using the L-D recursions

- the first step,

    $\phi_{n,n} = \frac{\gamma(n) - \sum_{j=1}^{n-1}\phi_{n-1,j}\gamma(n-j)}{v_{n-1}}$,

  yields, for $n = 2$ (recalling $\gamma(h) = \cos(\omega h)$ & $\phi_{1,1} = \cos(\omega)$),

    $\phi_{2,2} = \frac{\gamma(2) - \phi_{1,1}\gamma(1)}{v_1} = \frac{\cos(2\omega) - \cos(\omega)\cos(\omega)}{\sin^2(\omega)} = -1$

  because of the trig identity $\cos(2\omega) - \cos^2(\omega) = -\sin^2(\omega)$

XI 9

Levinson-Durbin Recursions: X

- the second step of the L-D recursions, namely,

    $\begin{bmatrix} \phi_{n,1} \\ \vdots \\ \phi_{n,n-1} \end{bmatrix} = \begin{bmatrix} \phi_{n-1,1} \\ \vdots \\ \phi_{n-1,n-1} \end{bmatrix} - \phi_{n,n} \begin{bmatrix} \phi_{n-1,n-1} \\ \vdots \\ \phi_{n-1,1} \end{bmatrix}$,

  yields, for $n = 2$,

    $\phi_{2,1} = \phi_{1,1} - \phi_{2,2}\phi_{1,1} = \cos(\omega)[1 - (-1)] = 2\cos(\omega)$

- the third step of the L-D recursions, namely, $v_n = v_{n-1}(1 - \phi_{n,n}^2)$, yields, for $n = 2$, $v_2 = v_1[1 - (-1)^2] = 0$

- thus $X_3$ is perfectly predictable given $X_2$ & $X_1$: $\hat{X}_3 = 2\cos(\omega)X_2 - X_1 = X_3$

- thus, for all $t$, $X_t$ is perfectly predictable given $X_{t-1}$ & $X_{t-2}$: $\hat{X}_t = 2\cos(\omega)X_{t-1} - X_{t-2} = X_t$ (Q: why?)

BD 77        XI 10
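
As a numerical check of this example (not part of the overheads), the hypothetical `levinson_durbin` sketch given after overhead XI 5 can be run with $\gamma(h) = \cos(\omega h)$; it should return $\phi_{2,1} = 2\cos(\omega)$, $\phi_{2,2} = -1$ and $v_2 = 0$ up to rounding error.

```r
## Numerical check of the harmonic-process example (illustrative only),
## reusing the levinson_durbin sketch with gamma(h) = cos(omega * h).
omega <- pi / 5
out <- levinson_durbin(cos(omega * 0:2))   # gamma(0), gamma(1), gamma(2)
out$coeffs[[2]]                            # expect c(2 * cos(omega), -1)
out$v                                      # expect c(1, sin(omega)^2, 0)
```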

Aside - Step-Down Levinson-Durbin Recursions: I

- application of the L-D recursions to the AR(p) process

    $Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + Z_t$

  yields, for $n \ge p$,

    $\hat{Y}_{n+1} = \phi_{n,1} Y_n + \cdots + \phi_{n,n} Y_1 = \phi_1 Y_n + \cdots + \phi_p Y_{n-p+1}$,

  i.e., $\hat{Y}_{n+1}$ only depends on the $p$ most recent values and, when $n > p$, not on the remote values $Y_{n-p}, \ldots, Y_1$

- the associated prediction error is $Y_{n+1} - \hat{Y}_{n+1} = Y_{n+1} - \phi_1 Y_n - \cdots - \phi_p Y_{n-p+1} = Z_{n+1}$, so the MSE is $v_n = \mathrm{var}\{Y_{n+1} - \hat{Y}_{n+1}\} = \mathrm{var}\{Z_{n+1}\} = \sigma^2$

- given $\phi_{p,1} = \phi_1$, $\phi_{p,2} = \phi_2$, ..., $\phi_{p,p} = \phi_p$ and $\sigma^2$, can invert the L-D recursions to get the coefficients for the best linear predictors of orders $p-1, p-2, \ldots, 1$ and the associated MSEs

XI 11

Aside - Step-Down Levinson-Durbin Recursions: II

- given $\phi_{h,1}, \ldots, \phi_{h,h}$ & $v_h$, compute

  1. $\phi_{h-1,j} = \frac{\phi_{h,j} + \phi_{h,h}\phi_{h,h-j}}{1 - \phi_{h,h}^2}$,  $1 \le j \le h-1$

  2. $v_{h-1} = v_h/(1 - \phi_{h,h}^2)$

- the step-down L-D recursion yields $\phi_{h-1,1}, \ldots, \phi_{h-1,h-1}$ & $v_{h-1}$

- start with $\phi_{p,1} = \phi_1, \ldots, \phi_{p,p} = \phi_p$ & $v_p = \sigma^2$

- apply the step-down recursions to get the $\phi_{p-1,j}$'s & $v_{p-1}$, the $\phi_{p-2,j}$'s & $v_{p-2}$, ..., $\phi_{1,1}$ & $v_1$

- as opposed to the usual L-D recursions, the step-down L-D recursions do not make use of the ACVF $\gamma(h)$ for $\{Y_t\}$

- in fact, given $\phi_1, \phi_2, \ldots, \phi_p$ & $\sigma^2$, can use the results of the step-down L-D recursions to compute $\gamma(h)$ (yet another method!)

XI 12
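
A minimal R sketch of the step-down recursions (illustrative only; the function name `step_down_ld` and its return structure are made up). Starting from the AR(p) coefficients and $\sigma^2$, it works down to order 1.

```r
## Step-down Levinson-Durbin recursions (illustrative sketch).
## `phi` holds phi_1, ..., phi_p and `sigma2` the white-noise variance.
step_down_ld <- function(phi, sigma2) {
  p <- length(phi)
  coeffs <- vector("list", p)      # coeffs[[h]] = c(phi_{h,1}, ..., phi_{h,h})
  v <- numeric(p + 1)              # v[h + 1] = v_h
  coeffs[[p]] <- phi
  v[p + 1] <- sigma2               # v_p = sigma^2
  for (h in p:1) {
    phh <- coeffs[[h]][h]          # phi_{h,h}
    v[h] <- v[h + 1] / (1 - phh^2)
    if (h > 1) {
      j <- 1:(h - 1)
      coeffs[[h - 1]] <- (coeffs[[h]][j] + phh * coeffs[[h]][h - j]) / (1 - phh^2)
    }
  }
  list(coeffs = coeffs, v = v)
}
```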

Aside - Step-Down Levinson-Durbin Recursions: III

- to do so, return to overhead XI 4 and note that

    $\gamma(0) \equiv v_0 = v_1/(1 - \phi_{1,1}^2)$   and   $\gamma(1) = \gamma(0)\phi_{1,1}$

- next go to overhead XI 5, grab

    $\phi_{n,n} = \frac{\gamma(n) - \sum_{j=1}^{n-1}\phi_{n-1,j}\gamma(n-j)}{v_{n-1}}$

  and manipulate it to get

    $\gamma(n) = \phi_{n,n}v_{n-1} + \sum_{j=1}^{n-1}\phi_{n-1,j}\gamma(n-j)$

  and thus

    $\gamma(2) = \phi_{2,2}v_1 + \phi_{1,1}\gamma(1)$
    $\gamma(3) = \phi_{3,3}v_2 + \phi_{2,1}\gamma(2) + \phi_{2,2}\gamma(1)$

  etc., ending with

    $\gamma(p) = \phi_{p,p}v_{p-1} + \phi_{p-1,1}\gamma(p-1) + \cdots + \phi_{p-1,p-1}\gamma(1)$

XI 13

Aside - Step-Down Levinson-Durbin Recursions: IV

- to get $\gamma(p+1), \gamma(p+2), \ldots$, make use of an equation stated on overhead IX 50:

    $\gamma(k) = \phi_1\gamma(k-1) + \cdots + \phi_p\gamma(k-p)$,

  which holds for all $k \ge p + 1$

- note: can now argue that the AR coefficients $\phi_1, \phi_2, \ldots, \phi_p$ and the sequence of partial autocorrelations $\phi_{1,1}, \phi_{2,2}, \ldots, \phi_{p,p}$ are equivalent to one another (in particular, $\phi_{p,p} = \phi_p$)

- we now return to our regularly scheduled program ...

XI 14
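
Combining overheads XI 13 and XI 14 gives yet another way to compute the ACVF of a causal AR(p) process; a minimal R sketch follows (illustrative only; the name `ar_acvf_stepdown` is made up and it reuses the hypothetical `step_down_ld` function from the earlier sketch).

```r
## ACVF of a causal AR(p) process via the step-down recursions (illustrative sketch).
ar_acvf_stepdown <- function(phi, sigma2, max_lag) {
  p <- length(phi)
  sd <- step_down_ld(phi, sigma2)        # phi_{h,j}'s and v_0, ..., v_p
  gam <- numeric(max_lag + 1)            # gam[h + 1] = gamma(h)
  gam[1] <- sd$v[1]                      # gamma(0) = v_0
  if (max_lag >= 1) gam[2] <- gam[1] * sd$coeffs[[1]][1]   # gamma(1) = gamma(0) phi_{1,1}
  if (p >= 2 && max_lag >= 2) {
    for (n in 2:min(p, max_lag))         # gamma(n) = phi_{n,n} v_{n-1} + sum_j phi_{n-1,j} gamma(n-j)
      gam[n + 1] <- sd$coeffs[[n]][n] * sd$v[n] + sum(sd$coeffs[[n - 1]] * gam[n:2])
  }
  if (max_lag >= p + 1) {
    for (k in (p + 1):max_lag)           # gamma(k) = phi_1 gamma(k-1) + ... + phi_p gamma(k-p)
      gam[k + 1] <- sum(phi * gam[k:(k - p + 1)])
  }
  gam
}
```

For instance, `ar_acvf_stepdown(0.9, 1, 5)` should agree with $\sigma^2\phi^h/(1-\phi^2)$ for the AR(1) case with $\phi = 0.9$ and $\sigma^2 = 1$.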

One-Step-Ahead Prediction Errors (Innovations): I

- given a time series $X_1, X_2, \ldots$, can use the L-D recursions to find the coefficients $\phi_{m-1}$ for $\hat{X}_m$, i.e., the best linear predictor of $X_m$ given $X_{m-1}, \ldots, X_1$

- define $\hat{X}_1 = 0$ and $\hat{X}_n = [\hat{X}_n, \hat{X}_{n-1}, \ldots, \hat{X}_1]'$

- letting $m = 1, 2, \ldots, n$, can generate a series of one-step-ahead prediction errors (or innovations): $U_m = X_m - \hat{X}_m$

- collect these into $U_n = [U_n, U_{n-1}, \ldots, U_1]'$ so that we can write $U_n = X_n - \hat{X}_n$

BD 71, SS 114        XI 15

One-Step-Ahead Prediction Errors (Innovations): II

- can write $U_n = A_n'X_n$, where $A_n$ is lower triangular:

    $A_n = \begin{bmatrix}
    1 & 0 & \cdots & 0 & 0 & 0 \\
    -\phi_{n-1,1} & 1 & \cdots & 0 & 0 & 0 \\
    \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
    -\phi_{n-1,n-3} & -\phi_{n-2,n-4} & \cdots & 1 & 0 & 0 \\
    -\phi_{n-1,n-2} & -\phi_{n-2,n-3} & \cdots & -\phi_{2,1} & 1 & 0 \\
    -\phi_{n-1,n-1} & -\phi_{n-2,n-2} & \cdots & -\phi_{2,2} & -\phi_{1,1} & 1
    \end{bmatrix}$

- the inverse of $A_n$ is also lower triangular, so let's write it as

    $C_n \equiv \begin{bmatrix}
    1 & 0 & \cdots & 0 & 0 & 0 \\
    \theta_{n-1,1} & 1 & \cdots & 0 & 0 & 0 \\
    \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
    \theta_{n-1,n-3} & \theta_{n-2,n-4} & \cdots & 1 & 0 & 0 \\
    \theta_{n-1,n-2} & \theta_{n-2,n-3} & \cdots & \theta_{2,1} & 1 & 0 \\
    \theta_{n-1,n-1} & \theta_{n-2,n-2} & \cdots & \theta_{2,2} & \theta_{1,1} & 1
    \end{bmatrix}$

BD 72, SS 114        XI 16

One-Step-Ahead Prediction Errors (Innovations): III

- since $C_n$ is the inverse of $A_n$, $U_n = A_n'X_n$ leads to $X_n = C_n'U_n$; i.e., the time series can be reexpressed in terms of its innovations

- recall that the L-D recursions give $v_{m-1} = E\{(X_m - \hat{X}_m)^2\} = \mathrm{var}\{U_m\}$, $m = 1, 2, \ldots, n$

- can use the so-called innovations algorithm to get both $v_m$ and the elements of $C_m$ (note: take a sum with upper limit $-1$ to be 0):

    $\theta_{m,m-k} = \frac{\gamma(m-k) - \sum_{j=0}^{k-1}\theta_{k,k-j}\theta_{m,m-j}v_j}{v_k}$,  $0 \le k < m$

    $v_m = \gamma(0) - \sum_{j=0}^{m-1}\theta_{m,m-j}^2 v_j$

- start with $v_0 = \gamma(0)$; get $\theta_{1,1}$ & $v_1$; get $\theta_{2,2}$, $\theta_{2,1}$ & $v_2$; etc.

BD 72, 73, SS 114        XI 17
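
A minimal R sketch of the innovations algorithm for a stationary process (illustrative only; the function name `innovations_algorithm` is made up). It returns the $\theta_{m,j}$'s for $m = 1, \ldots, n$ and the MSEs $v_0, \ldots, v_n$ given the ACVF $\gamma(0), \ldots, \gamma(n)$.

```r
## Innovations algorithm for a stationary process (illustrative sketch).
## `acvf` holds gamma(0), ..., gamma(n).
innovations_algorithm <- function(acvf) {
  n <- length(acvf) - 1
  v <- numeric(n + 1)
  v[1] <- acvf[1]                         # v_0 = gamma(0)
  theta <- vector("list", n)              # theta[[m]] = c(theta_{m,1}, ..., theta_{m,m})
  for (m in 1:n) {
    th <- numeric(m)
    for (k in 0:(m - 1)) {                # compute theta_{m,m-k}
      s <- 0
      if (k >= 1) {
        for (j in 0:(k - 1))              # sum_{j=0}^{k-1} theta_{k,k-j} theta_{m,m-j} v_j
          s <- s + theta[[k]][k - j] * th[m - j] * v[j + 1]
      }
      th[m - k] <- (acvf[m - k + 1] - s) / v[k + 1]
    }
    theta[[m]] <- th
    v[m + 1] <- acvf[1] - sum(th^2 * rev(v[1:m]))   # v_m = gamma(0) - sum theta_{m,m-j}^2 v_j
  }
  list(theta = theta, v = v)
}
```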

One-Step-Ahead Prediction Errors (Innovations): IV

- since $X_n = C_n'U_n$, can write (with $\theta_{m,0} \equiv 1$)

    $X_{m+1} = \sum_{j=0}^{m}\theta_{m,j}U_{m-j+1}$,  $m = 1, 2, \ldots, n-1$,

  i.e., a linear combination of innovations yields the time series

- since $\hat{X}_n = X_n - U_n = C_n'U_n - U_n = (C_n - I_n)'U_n$, where $I_n$ is the $n \times n$ identity matrix, can also write

    $\hat{X}_{m+1} = \sum_{j=1}^{m}\theta_{m,j}U_{m-j+1}$,  $m = 1, 2, \ldots, n-1$,

  i.e., a linear combination of innovations also yields the predictions

- HW exercise: the innovations $U_1, U_2, \ldots, U_n$ are uncorrelated

BD 72, SS 114        XI 18

Aside - Simulation of ARMA Processes: I

- often of interest to generate realizations of ARMA processes

- first consider a stationary & causal Gaussian AR(p) process:

    $Y_t - \phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p} = Z_t$,  $\{Z_t\} \sim$ Gaussian WN$(0, \sigma^2)$

- recall that, for any $t \ge p + 1$, the best linear predictor $\hat{Y}_t$ of $Y_t$ given $Y_{t-1}, \ldots, Y_1$ takes the form $\hat{Y}_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p}$

- the innovations are $U_t = Y_t - \hat{Y}_t = Z_t$ and have MSE $v_{t-1} \equiv \mathrm{var}\{U_t\} = \sigma^2$

- can use the step-down L-D recursions to get the coefficients for

    $\hat{Y}_t = \phi_{t-1,1} Y_{t-1} + \cdots + \phi_{t-1,t-1} Y_1$,  $t = 2, 3, \ldots, p$,

  and the associated MSEs $v_{t-1}$ (recall that $\hat{Y}_1 = 0$ by definition)

XI 19

Aside - Simulation of ARMA Processes: II

- the innovations $U_t = Y_t - \hat{Y}_t$, $t = 1, \ldots, p$, are such that

  1. $E\{U_t\} = 0$ and $\mathrm{var}\{U_t\} = v_{t-1}$
  2. $U_1, U_2, \ldots, U_p$ are uncorrelated RVs (homework exercise), which implies independence under the Gaussian assumption

- easy to simulate the $U_t$'s: generate $p$ independent realizations of $\mathcal{N}(0,1)$ RVs, say $\tilde{Z}_1, \ldots, \tilde{Z}_p$, and set $U_t = v_{t-1}^{1/2}\tilde{Z}_t$

- can unroll the $U_t$'s to get simulations of the $Y_t$'s, $t = 1, \ldots, p$:

    $U_1 = Y_1 - \hat{Y}_1 = Y_1$ yields $Y_1 = U_1$
    $U_2 = Y_2 - \hat{Y}_2 = Y_2 - \phi_{1,1}Y_1$ yields $Y_2 = \phi_{1,1}Y_1 + U_2$
    $U_3 = Y_3 - \hat{Y}_3 = Y_3 - \phi_{2,1}Y_2 - \phi_{2,2}Y_1$ yields $Y_3 = \phi_{2,1}Y_2 + \phi_{2,2}Y_1 + U_3$

XI 20

Aside - Simulation of ARMA Processes: III

- finally,

    $U_p = Y_p - \hat{Y}_p = Y_p - \phi_{p-1,1}Y_{p-1} - \cdots - \phi_{p-1,p-1}Y_1$

  yields

    $Y_p = \phi_{p-1,1}Y_{p-1} + \cdots + \phi_{p-1,p-1}Y_1 + U_p$

- can now generate the remainder of the desired simulated series using

    $Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \sigma\tilde{Z}_t$,  $t = p+1, p+2, \ldots$,

  where the $\tilde{Z}_t$'s are independent realizations of $\mathcal{N}(0,1)$ RVs (these are also independent of $\tilde{Z}_1, \ldots, \tilde{Z}_p$)

XI 21

Aside - Simulation of ARMA Processes: IV

- knowing how to simulate the AR process $\phi(B)Y_t = Z_t$, can in turn simulate the ARMA process $\phi(B)X_t = \theta(B)Z_t$, since we can create the ARMA process $\{X_t\}$ by applying the filter $\theta(B)$ to the AR process $\{Y_t\}$:

    $X_t = \theta(B)Y_t = \theta(B)\phi^{-1}(B)Z_t$,  i.e.,  $\phi(B)X_t = \theta(B)Z_t$

  (see overhead IX 47)

- hence can generate a simulated ARMA series of length $n$ via

    $X_t = Y_t + \theta_1 Y_{t-1} + \cdots + \theta_q Y_{t-q}$,  $t = q+1, \ldots, q+n$;

  i.e., need to make a simulated AR series of length $n + q$

XI 22
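
The whole simulation scheme can be collected into one short R sketch (illustrative only; `sim_arma_exact` and its arguments are made up, it assumes $p \ge 1$, and it reuses the hypothetical `step_down_ld` function from the earlier sketch).

```r
## Exact simulation of a causal ARMA(p,q) process via stationary initial
## conditions (illustrative sketch in the spirit of Kay, 1981); assumes p >= 1.
sim_arma_exact <- function(phi, theta, sigma2, n) {
  p <- length(phi); q <- length(theta)
  m <- n + q                               # need an AR series of length n + q
  sd <- step_down_ld(phi, sigma2)          # phi_{t-1,j}'s and v_0, ..., v_p
  y <- numeric(m)
  for (t in 1:p) {                         # Y_1, ..., Y_p by unrolling innovations
    u <- sqrt(sd$v[t]) * rnorm(1)          # U_t has variance v_{t-1}
    yhat <- if (t == 1) 0 else sum(sd$coeffs[[t - 1]] * y[(t - 1):1])
    y[t] <- yhat + u
  }
  if (m > p) {                             # remainder via the AR(p) difference equation
    for (t in (p + 1):m)
      y[t] <- sum(phi * y[(t - 1):(t - p)]) + sqrt(sigma2) * rnorm(1)
  }
  x <- numeric(n)                          # apply the MA filter theta(B)
  for (t in 1:n) {
    ma_part <- if (q > 0) sum(theta * y[(t + q - 1):t]) else 0
    x[t] <- y[t + q] + ma_part
  }
  x
}
```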

Example - Simulation of ARMA(2,2) Process: I

- consider the ARMA(2,2) process given by

    $X_t = \tfrac{3}{4}X_{t-1} - \tfrac{1}{2}X_{t-2} + Z_t + \tfrac{7}{10}Z_{t-1} - \tfrac{1}{10}Z_{t-2}$,  $\{Z_t\} \sim \mathrm{WN}(0, 1)$,

  so that $v_2 = 1$

- to simulate the AR(2) process $Y_t = \tfrac{3}{4}Y_{t-1} - \tfrac{1}{2}Y_{t-2} + Z_t$, need to run the step-down (reverse) L-D recursions once to obtain

    $\phi_{1,1} = \frac{\phi_{2,1} + \phi_{2,2}\phi_{2,1}}{1 - \phi_{2,2}^2} = \frac{\tfrac{3}{4} - \tfrac{1}{2}\cdot\tfrac{3}{4}}{1 - \tfrac{1}{4}} = \tfrac{1}{2}$,   $v_1 = \frac{v_2}{1 - \phi_{2,2}^2} = \tfrac{4}{3}$,

  and hence

    $v_0 = \frac{v_1}{1 - \phi_{1,1}^2} = \tfrac{16}{9}$

XI 23

Example - Simulation of ARMA(2,2) Process: II

- thus would generate the AR(2) process using

    $Y_1 = \tfrac{4}{3}\tilde{Z}_1$
    $Y_2 = \tfrac{1}{2}Y_1 + \tfrac{2}{\sqrt{3}}\tilde{Z}_2$
    $Y_3 = \tfrac{3}{4}Y_2 - \tfrac{1}{2}Y_1 + \tilde{Z}_3$
    $\vdots$
    $Y_{n+2} = \tfrac{3}{4}Y_{n+1} - \tfrac{1}{2}Y_n + \tilde{Z}_{n+2}$,

  where the $\tilde{Z}_t$'s are IID $\mathcal{N}(0,1)$ RVs

- the desired ARMA(2,2) process is given by

    $X_t = Y_{t+2} + \tfrac{7}{10}Y_{t+1} - \tfrac{1}{10}Y_t$,  $t = 1, \ldots, n$

- overhead VIII 24 shows the AR(2) series ($n = 100$) used to form the ARMA(2,2) simulation ($n = 98$) shown on the next overhead

XI 24
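
The hypothetical `sim_arma_exact` sketch above could be called with these coefficients to produce a realization in the same spirit (illustrative only; a different seed gives a different-looking series than the one plotted on the next overheads).

```r
## Simulating the ARMA(2,2) example (illustrative; seed chosen arbitrarily).
set.seed(443)
x <- sim_arma_exact(phi = c(3/4, -1/2), theta = c(7/10, -1/10), sigma2 = 1, n = 98)
plot(x, type = "l", xlab = "t", ylab = expression(x[t]))
```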

Realization of Second AR(2) Process

[plot of the simulated AR(2) series: x_t versus t, t = 0 to 100]

VIII 24

Realization of ARMA(2,2) Process

[plot of the simulated ARMA(2,2) series: x_t versus t, t = 0 to 100]

XI 25

Aside - Simulation of ARMA Processes: V

- the method described here is deemed exact because of its use of so-called stationary initial conditions (the method used in the R function arima.sim is not exact; it makes use of a burn-in period)

- the source article is Kay (1981), which is just over a page in length, making it one of the shortest useful articles relevant to time series analysis (the shortest is undoubtedly David, 1985!)

XI 26

Multi-Step-Ahead Prediction: I

- reconsider the one-step-ahead predictor $\hat{X}_{n+1}$ of $X_{n+1}$ given $X_n, X_{n-1}, \ldots, X_1$

- in preparation for considering multi-step-ahead prediction, will now denote $\hat{X}_{n+1}$ by $\hat{X}_{n+1|n}$

- $\hat{X}_{n+1|n}$ can be written as either a linear combination of previous time series values or previous innovations:

    $\hat{X}_{n+1|n} = \sum_{j=1}^{n}\phi_{n,j}X_{n-j+1}$   or   $\hat{X}_{n+1|n} = \sum_{j=1}^{n}\theta_{n,j}U_{n-j+1}$

- for a given $h \ge 2$, want to formulate the best linear predictor $\hat{X}_{n+h|n}$ of $X_{n+h}$ given $X_n, X_{n-1}, \ldots, X_1$

XI 27

Multi-Step-Ahead Prediction: II

- first approach: replacing $n$ in

    $\hat{X}_{n+1|n} = \sum_{j=1}^{n}\phi_{n,j}X_{n-j+1}$

  with $n + h - 1$ gives

    $\hat{X}_{n+h|n+h-1} = \sum_{j=1}^{n+h-1}\phi_{n+h-1,j}X_{n+h-j}$

- the above involves the unobserved $X_{n+h-1}, \ldots, X_{n+1}$, but replacing these with $\hat{X}_{n+h-1|n}, \ldots, \hat{X}_{n+1|n}$ gives the desired predictor:

    $\hat{X}_{n+h|n} = \sum_{j=1}^{h-1}\phi_{n+h-1,j}\hat{X}_{n+h-j|n} + \sum_{j=h}^{n+h-1}\phi_{n+h-1,j}X_{n+h-j}$

XI 28

Multi-Step-Ahead Prediction: III

- leads to a recursive scheme for computing $\hat{X}_{n+h|n}$, starting with the one-step-ahead predictor $\hat{X}_{n+1|n}$ (we know how to get this!)

- two-step-ahead predictor: replace $X_{n+1}$ in

    $\hat{X}_{n+2|n+1} = \sum_{j=1}^{n+1}\phi_{n+1,j}X_{n+2-j}$

  with $\hat{X}_{n+1|n}$ to get

    $\hat{X}_{n+2|n} = \phi_{n+1,1}\hat{X}_{n+1|n} + \sum_{j=2}^{n+1}\phi_{n+1,j}X_{n+2-j}$

XI 29

Multi-Step-Ahead Prediction: IV

- three-step-ahead predictor: replace $X_{n+2}$ & $X_{n+1}$ in

    $\hat{X}_{n+3|n+2} = \sum_{j=1}^{n+2}\phi_{n+2,j}X_{n+3-j}$

  with $\hat{X}_{n+2|n}$ & $\hat{X}_{n+1|n}$ to get

    $\hat{X}_{n+3|n} = \phi_{n+2,1}\hat{X}_{n+2|n} + \phi_{n+2,2}\hat{X}_{n+1|n} + \sum_{j=3}^{n+2}\phi_{n+2,j}X_{n+3-j}$

- yadda, yadda, yadda, coming eventually to the desired

    $\hat{X}_{n+h|n} = \sum_{j=1}^{h-1}\phi_{n+h-1,j}\hat{X}_{n+h-j|n} + \sum_{j=h}^{n+h-1}\phi_{n+h-1,j}X_{n+h-j}$

XI 30
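
A minimal R sketch of this recursive scheme (illustrative only; `predict_multistep` is made up and reuses the hypothetical `levinson_durbin` sketch, which here must be supplied with the ACVF out to lag $n + h - 1$).

```r
## Recursive h-step-ahead prediction from X_1, ..., X_n (illustrative sketch).
## `x` = observed series, `acvf` = gamma(0), ..., gamma(n + h - 1).
predict_multistep <- function(x, acvf, h) {
  n <- length(x)
  ld <- levinson_durbin(acvf)              # phi_{m,j} for m = 1, ..., n + h - 1
  preds <- numeric(h)                      # preds[k] = k-step-ahead predictor
  xrev <- rev(x)                           # X_n, X_{n-1}, ..., X_1
  for (k in 1:h) {
    phi <- ld$coeffs[[n + k - 1]]          # phi_{n+k-1,1}, ..., phi_{n+k-1,n+k-1}
    past <- xrev                           # values multiplying phi_{n+k-1,j}, j = k, ..., n+k-1
    if (k > 1) past <- c(rev(preds[1:(k - 1)]), xrev)   # prepend earlier predictors
    preds[k] <- sum(phi * past)
  }
  preds
}
```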

Multi-Step-Ahead Prediction: V

- since $\hat{X}_{n+1|n}, \ldots, \hat{X}_{n+h-1|n}$ are all linear combinations of $X_n, \ldots, X_1$, it follows that $\hat{X}_{n+h|n}$ is also such:

    $\hat{X}_{n+h|n} = \sum_{j=1}^{h-1}\phi_{n+h-1,j}\hat{X}_{n+h-j|n} + \sum_{j=h}^{n+h-1}\phi_{n+h-1,j}X_{n+h-j} = \sum_{j=1}^{n}a_j X_{n-j+1}$

- can show that $a_n = [a_1, \ldots, a_n]'$ so defined is a solution to $\Gamma_n a_n = \gamma_n(h)$, where the $n \times n$ matrix $\Gamma_n$ has $(i,j)$th entry $\gamma(i-j)$, while $\gamma_n(h) = [\gamma(h), \ldots, \gamma(h+n-1)]'$

XI 31

Multi-Step-Ahead Prediction: VI

- second approach: replacing $n$ in

    $\hat{X}_{n+1|n} = \sum_{j=1}^{n}\theta_{n,j}U_{n-j+1}$

  with $n + h - 1$ gives

    $\hat{X}_{n+h|n+h-1} = \sum_{j=1}^{n+h-1}\theta_{n+h-1,j}U_{n+h-j}$

- the above involves the unobserved $U_{n+h-1}, \ldots, U_{n+1}$, but replacing these with their expected values (zero!) gives the desired predictor:

    $\hat{X}_{n+h|n} = \sum_{j=h}^{n+h-1}\theta_{n+h-1,j}U_{n+h-j} = \sum_{j=1}^{n}\theta_{n+h-1,n+h-j}U_j$

XI 32

Multi-Step-Ahead Prediction: VII

- the MSE of the h-step-ahead forecast is

    $E\{(X_{n+h} - \hat{X}_{n+h|n})^2\} = E\{X_{n+h}^2\} - 2E\{X_{n+h}\hat{X}_{n+h|n}\} + E\{\hat{X}_{n+h|n}^2\}$
        $= \gamma(0) - E\{\hat{X}_{n+h|n}^2\} = \gamma(0) - \mathrm{var}\{\hat{X}_{n+h|n}\}$

  since $E\{X_{n+h}^2\} = \gamma(0)$ and $E\{X_{n+h}\hat{X}_{n+h|n}\} = E\{\hat{X}_{n+h|n}^2\}$ (homework exercise!)

- since $\mathrm{var}\{U_j\} = v_{j-1}$ and the $U_j$'s are uncorrelated,

    $\mathrm{var}\{\hat{X}_{n+h|n}\} = \mathrm{var}\left\{\sum_{j=1}^{n}\theta_{n+h-1,n+h-j}U_j\right\} = \sum_{j=1}^{n}\theta_{n+h-1,n+h-j}^2 v_{j-1}$

- the MSE is thus given by

    $E\{(X_{n+h} - \hat{X}_{n+h|n})^2\} = \gamma(0) - \sum_{j=1}^{n}\theta_{n+h-1,n+h-j}^2 v_{j-1} \equiv \sigma_n^2(h)$

BD 74, 75        XI 33
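
A small illustrative sketch of $\sigma_n^2(h)$ (the helper name `forecast_mse` is made up; it reuses the hypothetical `innovations_algorithm` sketch and needs the ACVF out to lag $n + h - 1$).

```r
## MSE of the h-step-ahead forecast, sigma_n^2(h) (illustrative sketch).
forecast_mse <- function(acvf, n, h) {
  ia <- innovations_algorithm(acvf)                 # theta_{m,j}'s and v_0, ..., v_{n+h-1}
  th <- ia$theta[[n + h - 1]]                       # theta_{n+h-1,1}, ..., theta_{n+h-1,n+h-1}
  acvf[1] - sum(th[(n + h - 1):h]^2 * ia$v[1:n])    # gamma(0) - sum_j theta_{n+h-1,n+h-j}^2 v_{j-1}
}
```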

Multi-Step-Ahead Prediction: VIII

- under a Gaussian assumption, can use the above to form 95% prediction bounds for the unknown $X_{n+h}$:  $\hat{X}_{n+h|n} \pm 1.96\,\sigma_n(h)$

- as an example, consider the 1st part of the wind speed series, $x_1, \ldots, x_{100}$

- after centering $x_t$ by subtracting off its sample mean $\bar{x}$, we model $\tilde{x}_t = x_t - \bar{x}$ as an AR(1) process $X_t = \phi X_{t-1} + Z_t$ with $\phi$ estimated by $\hat{\phi} = \hat{\rho}(1) \doteq 0.856$ (cf. overhead X 16)

- based on $x_1, \ldots, x_{100}$, forecast the last 28 values $x_{101} - \bar{x}, \ldots, x_{128} - \bar{x}$ of the time series and see how well we do

- the following overheads show results from homegrown R code based on the theory presented above and from the built-in R functions ar and predict

BD 74, 75        XI 34
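
For the built-in route, something along the following lines could be used (illustrative only; `wind` is a made-up name for a vector holding the 128 wind speed values, and the plotting details will differ from the overheads).

```r
## Forecasting the centered wind speed series with built-in R functions
## (illustrative; `wind` is a made-up name for the data vector).
x <- wind[1:100] - mean(wind[1:100])                     # centered training series
fit <- ar(x, order.max = 1, aic = FALSE, method = "yule-walker")
fc <- predict(fit, n.ahead = 28)                         # multi-step-ahead forecasts
plot(1:128, wind - mean(wind[1:100]), type = "l", xlab = "t", ylab = expression(x[t]))
lines(101:128, fc$pred, lty = 2)                         # point forecasts
lines(101:128, fc$pred + 1.96 * fc$se, lty = 3)          # approximate 95% bounds
lines(101:128, fc$pred - 1.96 * fc$se, lty = 3)
```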

Multi-Step-Ahead Prediction of Wind Speed

[plot of the centered wind speed series and its multi-step-ahead predictions: x_t versus t]

XI 35

Multi-Step-Ahead Prediction of Wind Speed using R

[plot of the centered wind speed series and its multi-step-ahead predictions from ar and predict: x_t versus t]

XI 36

Predictions Based on Infinite Past: I

- rather than using $X_n, \ldots, X_1$ to predict $X_{n+h}$, suppose we use, for some $m \ge 0$, $X_n, \ldots, X_1, X_0, X_{-1}, \ldots, X_{-m}$ and form a predictor to be denoted by $\hat{X}_{n+h|n,m}$

- by letting $m \to \infty$ and assuming the limit exists (in the MS sense), can write

    $\hat{X}_{n+h|n,\infty} = \sum_{j=1}^{\infty}\alpha_j X_{n-j+1}$,

  where the $\alpha_j$'s are set by a version of the orthogonality principle:

    $\mathrm{cov}\left\{X_{n+h} - \sum_{j=1}^{\infty}\alpha_j X_{n-j+1},\, X_{n-i}\right\} = 0$,  $i = 0, 1, \ldots$

BD 75, SS 115        XI 37

Predictions Based on Infinite Past: II

- refer to $\hat{X}_{n+h|n,\infty}$ as the predictor of $X_{n+h}$ based on the infinite past $X_n, X_{n-1}, \ldots$

- the associated prediction error $X_{n+h} - \hat{X}_{n+h|n,\infty}$ has MSE

    $E\{(X_{n+h} - \hat{X}_{n+h|n,\infty})^2\} = \mathrm{var}\{X_{n+h} - \hat{X}_{n+h|n,\infty}\}$,

  which can be compared to $\mathrm{var}\{X_{n+h} - \hat{X}_{n+h|n}\}$ to see how much can be gained from having lots more data (recall that $\hat{X}_{n+h|n}$ is based on just $X_n, X_{n-1}, \ldots, X_1$)

BD 75, 76, SS 115        XI 38

Predictions Based on Infinite Past: III

- applying the representation $X_t = \sum_{j=0}^{\infty}\psi_j Z_{t-j}$ at $t = n + h$ yields

    $X_{n+h} = \sum_{j=0}^{\infty}\psi_j Z_{n+h-j}$

- consider the $Z_t$'s that make up $X_{n+h}$ but not $X_n$; i.e., $Z_{n+h}, Z_{n+h-1}, \ldots, Z_{n+1}$

- replacing these $h$ RVs by their expected values (zero) gives

    $\hat{X}_{n+h|n,\infty} = \sum_{j=h}^{\infty}\psi_j Z_{n+h-j}$

- the prediction error is thus

    $X_{n+h} - \hat{X}_{n+h|n,\infty} = \sum_{j=0}^{\infty}\psi_j Z_{n+h-j} - \sum_{j=h}^{\infty}\psi_j Z_{n+h-j} = \sum_{j=0}^{h-1}\psi_j Z_{n+h-j}$

XI 39

Predictions Based on Infinite Past: IV

- since $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$, the variance of $X_{n+h} - \hat{X}_{n+h|n,\infty} = \sum_{j=0}^{h-1}\psi_j Z_{n+h-j}$, i.e., the MSE of $\hat{X}_{n+h|n,\infty}$, is given by

    $\mathrm{var}\{X_{n+h} - \hat{X}_{n+h|n,\infty}\} = \sigma^2\sum_{j=0}^{h-1}\psi_j^2$

- in particular, for $h = 1$, the MSE is $\mathrm{var}\{X_{n+1} - \hat{X}_{n+1|n,\infty}\} = \sigma^2$, rather than $v_n = \mathrm{var}\{X_{n+1} - \hat{X}_{n+1|n}\}$

- homework exercise: compare the MSEs for specific MA(1) and AR(1) processes with specific sample sizes $n$

CC 196, SS 116        XI 40

References

H. A. David (1985), "Bias of S^2 Under Dependence," The American Statistician, 39, p. 201

J. Durbin (1960), "The Fitting of Time Series Models," Revue de l'Institut International de Statistique/Review of the International Statistical Institute, 28, pp. 233-44

S. M. Kay (1981), "Efficient Generation of Colored Noise," Proceedings of the IEEE, 69, pp. 480-1

N. Levinson (1947), "The Wiener RMS Error Criterion in Filter Design and Prediction," Journal of Mathematical Physics, 25, pp. 261-78

XI 41