
Forecasting

Let $\{y_t\}$ be a covariance stationary and ergodic process, e.g. an ARMA(p, q) process, with Wold representation

$$y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j} = \mu + \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \cdots, \qquad \varepsilon_t \sim WN(0, \sigma^2)$$

Let $I_t = \{y_t, y_{t-1}, \ldots\}$ denote the information set available at time $t$. Recall,

$$E[y_t] = \mu, \qquad \operatorname{var}(y_t) = \sigma^2 \sum_{j=0}^{\infty} \psi_j^2$$

Goal: Using $I_t$, produce optimal forecasts of $y_{t+h}$ for $h = 1, 2, \ldots, s$.

Define $y_{t+h|t}$ as the forecast of $y_{t+h}$ based on $I_t$, assuming known parameters. The forecast error is

$$\varepsilon_{t+h|t} = y_{t+h} - y_{t+h|t}$$

and the mean squared error of the forecast is

$$\operatorname{MSE}(\varepsilon_{t+h|t}) = E[\varepsilon_{t+h|t}^2] = E[(y_{t+h} - y_{t+h|t})^2]$$

Theorem: The minimum MSE forecast (best forecast) of $y_{t+h}$ based on $I_t$ is

$$y_{t+h|t} = E[y_{t+h} \mid I_t]$$

Proof: See Hamilton, pages 72-73.

Note:

$$y_{t+h} = \mu + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1} + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots$$
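As a quick numerical check of the unconditional moments recalled above, the following sketch simulates a truncated Wold representation with geometrically decaying weights $\psi_j = 0.5^j$ and compares the sample mean and variance with $\mu$ and $\sigma^2 \sum_j \psi_j^2$. The code is illustrative only; the weights, truncation point, and parameter values are assumptions, not from the notes.

```python
import numpy as np

# Sketch: simulate y_t = mu + sum_{j<J} psi_j * eps_{t-j}, a truncated
# Wold (MA(infinity)) representation with psi_j = 0.5**j (illustrative).
rng = np.random.default_rng(0)
mu, sigma, T, J = 2.0, 1.0, 100_000, 50
psi = 0.5 ** np.arange(J)                     # psi_0 = 1, psi_j = 0.5**j
eps = rng.normal(0.0, sigma, T + J - 1)
y = mu + np.convolve(eps, psi, mode="valid")  # y has length T

print(y.mean(), mu)                           # sample mean vs mu
print(y.var(), sigma**2 * np.sum(psi**2))     # sample var vs sigma^2 * sum psi_j^2
```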

Remarks

1. The computation of $E[y_{t+h} \mid I_t]$ depends on the distribution of $\{\varepsilon_t\}$ and may be a very complicated nonlinear function of the history of $\{\varepsilon_t\}$. Even if $\{\varepsilon_t\}$ is an uncorrelated process (e.g. white noise), it may be the case that $E[\varepsilon_{t+1} \mid I_t] \neq 0$.

2. If $\{\varepsilon_t\}$ is independent white noise, then $E[\varepsilon_{t+1} \mid I_t] = 0$ and $E[y_{t+h} \mid I_t]$ will be a simple linear function of $\{\varepsilon_t\}$:

$$y_{t+h|t} = \mu + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots$$

Linear Predictors

A linear predictor of $y_{t+h}$ is a linear function of the variables in $I_t$.

Theorem: The minimum MSE linear forecast (best linear predictor) of $y_{t+h}$ based on $I_t$ is

$$y_{t+h|t} = \mu + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots$$

Proof: See Hamilton, page 74.

The forecast error of the best linear predictor is

$$\begin{aligned}
\varepsilon_{t+h|t} &= y_{t+h} - y_{t+h|t} \\
&= \mu + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1} + \psi_h \varepsilon_t + \cdots - (\mu + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots) \\
&= \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1}
\end{aligned}$$

and the MSE of the forecast error is

$$\operatorname{MSE}(\varepsilon_{t+h|t}) = \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)$$

Remarks

1. $E[\varepsilon_{t+h|t}] = 0$.
2. $\varepsilon_{t+h|t}$ is uncorrelated with any element in $I_t$.
3. The form of $y_{t+h|t}$ is closely related to the IRF.
4. $\operatorname{MSE}(\varepsilon_{t+h|t}) = \operatorname{var}(\varepsilon_{t+h|t}) \le \operatorname{var}(y_t)$.
5. $\lim_{h \to \infty} y_{t+h|t} = \mu$.
6. $\lim_{h \to \infty} \operatorname{MSE}(\varepsilon_{t+h|t}) = \operatorname{var}(y_t)$.

Example: BLP for MA(1) process

Here

$$y_t = \mu + \varepsilon_t + \theta \varepsilon_{t-1}, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

so that $\psi_1 = \theta$ and $\psi_h = 0$ for $h > 1$. Therefore,

$$y_{t+1|t} = \mu + \theta \varepsilon_t$$
$$y_{t+2|t} = \mu$$
$$y_{t+h|t} = \mu \text{ for } h > 1$$

The forecast errors and MSEs are

$$\varepsilon_{t+1|t} = \varepsilon_{t+1}, \qquad \operatorname{MSE}(\varepsilon_{t+1|t}) = \sigma^2$$
$$\varepsilon_{t+2|t} = \varepsilon_{t+2} + \theta \varepsilon_{t+1}, \qquad \operatorname{MSE}(\varepsilon_{t+2|t}) = \sigma^2(1 + \theta^2)$$
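The MA(1) forecasts are simple enough to code directly. A minimal sketch (hypothetical function names; it assumes $\mu$, $\theta$, $\sigma$, and the current shock $\varepsilon_t$ are known):

```python
# Sketch: BLP forecasts for y_t = mu + eps_t + theta*eps_{t-1}
mu, theta, sigma = 1.0, 0.4, 1.0   # illustrative parameter values
eps_t = 0.8                        # last observed shock (assumed known)

def ma1_forecast(h):
    """BLP of y_{t+h}: mu + theta*eps_t for h = 1, mu for h > 1."""
    return mu + theta * eps_t if h == 1 else mu

def ma1_mse(h):
    """Forecast MSE: sigma^2 for h = 1, sigma^2*(1 + theta^2) for h > 1."""
    return sigma**2 if h == 1 else sigma**2 * (1.0 + theta**2)

for h in (1, 2, 3):
    print(h, ma1_forecast(h), ma1_mse(h))
```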

Prediction Confidence Intervals

If $\{\varepsilon_t\}$ is Gaussian, then

$$y_{t+h} \mid I_t \sim N\left(y_{t+h|t},\; \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)\right)$$

A 95% confidence interval for the $h$-step prediction has the form

$$y_{t+h|t} \pm 1.96 \sqrt{\sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)}$$

Predictions with Estimated Parameters

Let $\hat{y}_{t+h|t}$ denote the BLP with estimated parameters:

$$\hat{y}_{t+h|t} = \hat{\mu} + \hat{\psi}_h \hat{\varepsilon}_t + \hat{\psi}_{h+1} \hat{\varepsilon}_{t-1} + \cdots$$

where $\hat{\varepsilon}_t$ is the estimated residual from the fitted model. The forecast error with estimated parameters is

$$\hat{\varepsilon}_{t+h|t} = y_{t+h} - \hat{y}_{t+h|t} = (\mu - \hat{\mu}) + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1} + (\psi_h \varepsilon_t - \hat{\psi}_h \hat{\varepsilon}_t) + (\psi_{h+1} \varepsilon_{t-1} - \hat{\psi}_{h+1} \hat{\varepsilon}_{t-1}) + \cdots$$

Obviously, $\operatorname{MSE}(\hat{\varepsilon}_{t+h|t}) \neq \operatorname{MSE}(\varepsilon_{t+h|t}) = \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)$.

Note: Most software computes

$$\widehat{\operatorname{MSE}}(\varepsilon_{t+h|t}) = \hat{\sigma}^2(1 + \hat{\psi}_1^2 + \cdots + \hat{\psi}_{h-1}^2)$$

which ignores the estimation error in the parameters.
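The interval formula translates directly into code. A small sketch (illustrative names; the $\psi$-weights and $\sigma$ are taken as known, whereas in practice the estimated versions would be plugged in):

```python
import numpy as np

def prediction_interval(y_hat, psi, sigma, h, z=1.96):
    """95% interval for the h-step forecast y_hat, given psi = [psi_1, psi_2, ...].
    Uses MSE(h) = sigma^2 * (1 + psi_1^2 + ... + psi_{h-1}^2)."""
    mse = sigma**2 * (1.0 + np.sum(np.asarray(psi[: h - 1]) ** 2))
    half = z * np.sqrt(mse)
    return y_hat - half, y_hat + half

# Example with AR(1)-style weights psi_j = phi**j and a 3-step forecast
phi, sigma = 0.7, 1.0
psi = [phi**j for j in range(1, 10)]
print(prediction_interval(y_hat=2.0, psi=psi, sigma=sigma, h=3))
```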

Computing the Best Linear Predictor

The BLP $y_{t+h|t}$ may be computed in many different but equivalent ways. The algorithm for computing $y_{t+h|t}$ from an AR(1) model is simple, and the methodology allows for the computation of forecasts for general ARMA models as well as multivariate models.

Example: AR(1) Model

$$y_t - \mu = \phi(y_{t-1} - \mu) + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

where $\mu$, $\phi$, $\sigma^2$ are known. In the Wold representation, $\psi_j = \phi^j$. Starting at $t$ and iterating forward $h$ periods gives

$$y_{t+h} = \mu + \phi^h(y_t - \mu) + \varepsilon_{t+h} + \phi \varepsilon_{t+h-1} + \cdots + \phi^{h-1} \varepsilon_{t+1} = \mu + \phi^h(y_t - \mu) + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1}$$

The best linear forecasts of $y_{t+1}, y_{t+2}, \ldots, y_{t+h}$ are computed using the chain rule of forecasting (law of iterated projections):

$$y_{t+1|t} = \mu + \phi(y_t - \mu)$$
$$y_{t+2|t} = \mu + \phi(y_{t+1|t} - \mu) = \mu + \phi(\phi(y_t - \mu)) = \mu + \phi^2(y_t - \mu)$$
$$\vdots$$
$$y_{t+h|t} = \mu + \phi(y_{t+h-1|t} - \mu) = \mu + \phi^h(y_t - \mu)$$

The corresponding forecast errors are

$$\varepsilon_{t+1|t} = y_{t+1} - y_{t+1|t} = \varepsilon_{t+1}$$
$$\varepsilon_{t+2|t} = y_{t+2} - y_{t+2|t} = \varepsilon_{t+2} + \phi \varepsilon_{t+1} = \varepsilon_{t+2} + \psi_1 \varepsilon_{t+1}$$
$$\vdots$$
$$\varepsilon_{t+h|t} = y_{t+h} - y_{t+h|t} = \varepsilon_{t+h} + \phi \varepsilon_{t+h-1} + \cdots + \phi^{h-1} \varepsilon_{t+1} = \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1}$$
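The chain rule is straightforward to code. A minimal sketch (hypothetical function name; parameters assumed known), which also accumulates the forecast error variances derived just below:

```python
import numpy as np

def ar1_forecasts(y_t, mu, phi, sigma, H):
    """Chain-rule forecasts y_{t+h|t} and error variances for an AR(1):
    y_t - mu = phi*(y_{t-1} - mu) + eps_t, eps_t ~ WN(0, sigma^2)."""
    forecasts, variances = [], []
    y_hat, var_h = y_t, 0.0
    for h in range(1, H + 1):
        y_hat = mu + phi * (y_hat - mu)    # y_{t+h|t} = mu + phi*(y_{t+h-1|t} - mu)
        var_h = sigma**2 + phi**2 * var_h  # sigma^2*(1 + phi^2 + ... + phi^(2(h-1)))
        forecasts.append(y_hat)
        variances.append(var_h)
    return np.array(forecasts), np.array(variances)

# Forecasts decay toward mu; variances grow toward sigma^2/(1 - phi^2)
f, v = ar1_forecasts(y_t=3.0, mu=1.0, phi=0.8, sigma=1.0, H=20)
print(f[-1], v[-1], 1.0 / (1.0 - 0.8**2))
```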

The forecast error variances are

$$\operatorname{var}(\varepsilon_{t+1|t}) = \sigma^2$$
$$\operatorname{var}(\varepsilon_{t+2|t}) = \sigma^2(1 + \phi^2) = \sigma^2(1 + \psi_1^2)$$
$$\vdots$$
$$\operatorname{var}(\varepsilon_{t+h|t}) = \sigma^2(1 + \phi^2 + \cdots + \phi^{2(h-1)}) = \sigma^2 \frac{1 - \phi^{2h}}{1 - \phi^2} = \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)$$

Clearly,

$$\lim_{h \to \infty} y_{t+h|t} = \mu = E[y_t]$$
$$\lim_{h \to \infty} \operatorname{var}(\varepsilon_{t+h|t}) = \frac{\sigma^2}{1 - \phi^2} = \sigma^2 \sum_{j=0}^{\infty} \psi_j^2 = \operatorname{var}(y_t)$$

AR(p) Models

Consider the AR(p) model

$$\phi(L)(y_t - \mu) = \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$
$$\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$$

The forecasting algorithm for AR(p) models is essentially the same as that for AR(1) models once we put the AR(p) model in state space form. Let $X_t = y_t - \mu$. The AR(p) in state space form is

$$\begin{bmatrix} X_t \\ X_{t-1} \\ \vdots \\ X_{t-p+1} \end{bmatrix} = \begin{bmatrix} \phi_1 & \phi_2 & \cdots & \phi_p \\ 1 & 0 & \cdots & 0 \\ & \ddots & & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} X_{t-1} \\ X_{t-2} \\ \vdots \\ X_{t-p} \end{bmatrix} + \begin{bmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

or

$$\xi_t = F \xi_{t-1} + w_t, \qquad \operatorname{var}(w_t) = \Sigma_w$$

Starting at $t$ and iterating forward $h$ periods gives

$$\xi_{t+h} = F^h \xi_t + w_{t+h} + F w_{t+h-1} + \cdots + F^{h-1} w_{t+1}$$

Then the best linear forecasts of $y_{t+1}, y_{t+2}, \ldots, y_{t+h}$ computed using the chain rule of forecasting are

$$\xi_{t+1|t} = F \xi_t$$
$$\xi_{t+2|t} = F \xi_{t+1|t} = F^2 \xi_t$$
$$\vdots$$
$$\xi_{t+h|t} = F \xi_{t+h-1|t} = F^h \xi_t$$

The forecast for $y_{t+h}$ is given by $\mu$ plus the first row of $\xi_{t+h|t} = F^h \xi_t$:

$$\xi_{t+h|t} = \begin{bmatrix} \phi_1 & \phi_2 & \cdots & \phi_p \\ 1 & 0 & \cdots & 0 \\ & \ddots & & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix}^h \begin{bmatrix} y_t - \mu \\ y_{t-1} - \mu \\ \vdots \\ y_{t-p+1} - \mu \end{bmatrix}$$

The forecast errors are given by

$$w_{t+1|t} = \xi_{t+1} - \xi_{t+1|t} = w_{t+1}$$
$$w_{t+2|t} = \xi_{t+2} - \xi_{t+2|t} = w_{t+2} + F w_{t+1}$$
$$\vdots$$
$$w_{t+h|t} = \xi_{t+h} - \xi_{t+h|t} = w_{t+h} + F w_{t+h-1} + \cdots + F^{h-1} w_{t+1}$$

and the corresponding forecast MSE matrices are

$$\operatorname{var}(w_{t+1|t}) = \operatorname{var}(w_{t+1}) = \Sigma_w$$
$$\operatorname{var}(w_{t+2|t}) = \operatorname{var}(w_{t+2}) + F \operatorname{var}(w_{t+1}) F' = \Sigma_w + F \Sigma_w F'$$
$$\vdots$$
$$\operatorname{var}(w_{t+h|t}) = \sum_{j=0}^{h-1} F^j \Sigma_w F^{j\prime}$$

Notice that

$$\operatorname{var}(w_{t+h|t}) = \Sigma_w + F \operatorname{var}(w_{t+h-1|t}) F'$$
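A compact sketch of the companion-form algorithm (illustrative Python; parameters assumed known). It builds $F$ and $\Sigma_w$, iterates $\xi_{t+h|t} = F \xi_{t+h-1|t}$, and updates the MSE matrix with the recursion $\operatorname{var}(w_{t+h|t}) = \Sigma_w + F \operatorname{var}(w_{t+h-1|t}) F'$:

```python
import numpy as np

def arp_forecasts(y_hist, mu, phi, sigma, H):
    """Companion-form forecasts of y_{t+h} and their MSEs for an AR(p).
    y_hist = (y_t, y_{t-1}, ..., y_{t-p+1}); phi = (phi_1, ..., phi_p)."""
    p = len(phi)
    F = np.zeros((p, p))
    F[0, :] = phi                  # first row: phi_1, ..., phi_p
    F[1:, :-1] = np.eye(p - 1)     # ones below the diagonal
    Sigma_w = np.zeros((p, p))
    Sigma_w[0, 0] = sigma**2       # only the first state equation is shocked
    xi = np.asarray(y_hist, dtype=float) - mu
    V = np.zeros((p, p))
    out = []
    for h in range(1, H + 1):
        xi = F @ xi                # xi_{t+h|t} = F xi_{t+h-1|t}
        V = Sigma_w + F @ V @ F.T  # var(w_{t+h|t}) = Sigma_w + F var(w_{t+h-1|t}) F'
        out.append((mu + xi[0], V[0, 0]))  # y_{t+h|t} and the MSE of the y-forecast
    return out

# Example: AR(2) with phi = (0.5, 0.3); the MSE at h = 2 is sigma^2*(1 + phi_1^2)
for h, (f, mse) in enumerate(arp_forecasts([2.0, 1.5], 1.0, [0.5, 0.3], 1.0, 3), 1):
    print(h, round(f, 4), round(mse, 4))
```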

Forecast Evaluation

Diebold-Mariano Test for Equal Predictive Accuracy

Let $\{y_t\}$ denote the series to be forecast, and let $y_{t+h|t}^1$ and $y_{t+h|t}^2$ denote two competing forecasts of $y_{t+h}$ based on $I_t$. For example, $y_{t+h|t}^1$ could be computed from an AR(p) model and $y_{t+h|t}^2$ could be computed from an ARMA(p, q) model. The forecast errors from the two models are

$$\varepsilon_{t+h|t}^1 = y_{t+h} - y_{t+h|t}^1, \qquad \varepsilon_{t+h|t}^2 = y_{t+h} - y_{t+h|t}^2$$

The $h$-step forecasts are assumed to be computed for $t = t_0, \ldots, T$, for a total of $T_0$ forecasts, giving

$$\{\varepsilon_{t+h|t}^1\}_{t=t_0}^{T}, \qquad \{\varepsilon_{t+h|t}^2\}_{t=t_0}^{T}$$

Because the $h$-step forecasts use overlapping data, the forecast errors in $\{\varepsilon_{t+h|t}^1\}_{t=t_0}^{T}$ and $\{\varepsilon_{t+h|t}^2\}_{t=t_0}^{T}$ will be serially correlated.

The accuracy of each forecast is measured by a particular loss function

$$L(y_{t+h}, y_{t+h|t}^i) = L(\varepsilon_{t+h|t}^i), \quad i = 1, 2$$

Some popular loss functions are:

$$L(\varepsilon_{t+h|t}^i) = (\varepsilon_{t+h|t}^i)^2 \quad \text{(squared error loss)}$$
$$L(\varepsilon_{t+h|t}^i) = |\varepsilon_{t+h|t}^i| \quad \text{(absolute value loss)}$$

To determine if one model predicts better than another, we may test the null hypothesis

$$H_0: E[L(\varepsilon_{t+h|t}^1)] = E[L(\varepsilon_{t+h|t}^2)]$$

against the alternative

$$H_1: E[L(\varepsilon_{t+h|t}^1)] \neq E[L(\varepsilon_{t+h|t}^2)]$$

The Diebold-Mariano test is based on the loss differential

$$d_t = L(\varepsilon_{t+h|t}^1) - L(\varepsilon_{t+h|t}^2)$$

The null of equal predictive accuracy is then

$$H_0: E[d_t] = 0$$

The Diebold-Mariano test statistic is

$$S = \frac{\bar{d}}{\left(\widehat{\operatorname{avar}}(\bar{d})\right)^{1/2}} = \frac{\bar{d}}{\left(\widehat{\operatorname{LRV}}_d / T_0\right)^{1/2}}$$

where

$$\bar{d} = \frac{1}{T_0} \sum_{t=t_0}^{T} d_t, \qquad \operatorname{LRV}_d = \gamma_0 + 2 \sum_{j=1}^{\infty} \gamma_j, \quad \gamma_j = \operatorname{cov}(d_t, d_{t-j})$$

Note: The long-run variance is used in the statistic because the sample of loss differentials $\{d_t\}_{t=t_0}^{T}$ is serially correlated for $h > 1$.

Diebold and Mariano (1995) show that, under the null of equal predictive accuracy,

$$S \overset{A}{\sim} N(0, 1)$$

So we reject the null of equal predictive accuracy at the 5% level if $|S| > 1.96$. One-sided tests may also be computed.
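A sketch of the test in Python (hypothetical helper, not from the notes). It uses a rectangular truncation of the long-run variance at $h - 1$ lags, a common choice since $h$-step forecast errors are at most MA($h-1$) under the null; other kernel and bandwidth choices are possible:

```python
import numpy as np
from scipy import stats

def diebold_mariano(e1, e2, h=1, loss=np.square):
    """DM test for equal predictive accuracy from two h-step forecast
    error series e1, e2; loss is e.g. np.square or np.abs."""
    d = loss(np.asarray(e1)) - loss(np.asarray(e2))  # loss differential d_t
    T0 = d.size
    d_bar = d.mean()
    dc = d - d_bar
    lrv = dc @ dc / T0                               # gamma_0
    for j in range(1, h):                            # + 2 * sum of gamma_j
        lrv += 2.0 * (dc[j:] @ dc[:-j]) / T0
    S = d_bar / np.sqrt(lrv / T0)
    pval = 2.0 * (1.0 - stats.norm.cdf(abs(S)))      # two-sided p-value
    return S, pval

# Example with simulated 1-step errors: model 2 made slightly noisier
rng = np.random.default_rng(1)
e1, e2 = rng.normal(0, 1.0, 200), rng.normal(0, 1.2, 200)
print(diebold_mariano(e1, e2, h=1))
```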