AR, MA and ARMA models


By Hedibert Lopes. Based on Tsay's Analysis of Financial Time Series (3rd edition).

Outline: stationarity; autocorrelation and white noise; linear time series; AR models; model selection and forecasting; MA models.

Linear Time Series Analysis and Its Applications

For basic concepts of linear time series analysis, see Box, Jenkins, and Reinsel (1994, Chapters 2-3), Brockwell and Davis (1996, Chapters 1-3), or Shumway and Stoffer (2011, Chapters 1-3). The theories of linear time series discussed include stationarity, dynamic dependence, the autocorrelation function, modeling, and forecasting.

The econometric models introduced include (a) simple autoregressive models, (b) simple moving-average models, (c) mixed autoregressive moving-average models, (d) seasonal models, (e) unit-root nonstationarity, (f) regression models with time series errors, and (g) fractionally differenced models for long-range dependence.

Strict stationarity

The foundation of time series analysis is stationarity. A time series $\{r_t\}$ is said to be strictly stationary if the joint distribution of $(r_{t_1},\dots,r_{t_k})$ is identical to that of $(r_{t_1+t},\dots,r_{t_k+t})$ for all $t$, where $k$ is an arbitrary positive integer and $(t_1,\dots,t_k)$ is a collection of $k$ positive integers. In other words, the joint distribution of $(r_{t_1},\dots,r_{t_k})$ is invariant under time shifts. This is a very strong condition that is hard to verify empirically.

Weak stationarity

A time series $\{r_t\}$ is weakly stationary if both the mean of $r_t$ and the covariance between $r_t$ and $r_{t-l}$ are time invariant, where $l$ is an arbitrary integer. More specifically, $\{r_t\}$ is weakly stationary if (a) $E(r_t) = \mu$, a constant, and (b) $\mathrm{Cov}(r_t, r_{t-l}) = \gamma_l$, which depends only on $l$. In practice, suppose we have observed $T$ data points $\{r_t : t = 1,\dots,T\}$. Weak stationarity implies that the time plot of the data shows the $T$ values fluctuating with constant variation around a fixed level. In applications, weak stationarity enables one to make inference concerning future observations (e.g., prediction).

Properties

The covariance $\gamma_l = \mathrm{Cov}(r_t, r_{t-l})$ is called the lag-$l$ autocovariance of $r_t$. It has two important properties: (a) $\gamma_0 = \mathrm{Var}(r_t)$, and (b) $\gamma_{-l} = \gamma_l$. The second property holds because
$$\mathrm{Cov}(r_t, r_{t-(-l)}) = \mathrm{Cov}(r_{t-(-l)}, r_t) = \mathrm{Cov}(r_{t+l}, r_t) = \mathrm{Cov}(r_{t_1}, r_{t_1-l}),$$
where $t_1 = t + l$.

Autocorrelation function

The lag-$l$ autocorrelation of $r_t$ is
$$\rho_l = \frac{\mathrm{Cov}(r_t, r_{t-l})}{\sqrt{\mathrm{Var}(r_t)\,\mathrm{Var}(r_{t-l})}} = \frac{\mathrm{Cov}(r_t, r_{t-l})}{\mathrm{Var}(r_t)} = \frac{\gamma_l}{\gamma_0},$$
where the property $\mathrm{Var}(r_t) = \mathrm{Var}(r_{t-l})$ for a weakly stationary series is used. In general, the lag-$l$ sample autocorrelation of $r_t$ is defined as
$$\hat\rho_l = \frac{\sum_{t=l+1}^{T}(r_t - \bar r)(r_{t-l} - \bar r)}{\sum_{t=1}^{T}(r_t - \bar r)^2}, \qquad 0 \le l < T-1.$$
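
As a quick numerical check of the sample autocorrelation formula (an illustration with simulated data, not part of Tsay's code), the sketch below computes $\hat\rho_l$ directly from the definition and compares it with R's built-in acf():

# Sample ACF computed from the definition, compared with acf()
set.seed(42)
r = rnorm(500)                     # simulated returns (white noise)
rbar = mean(r); n = length(r)
rho.hat = function(l) sum((r[(l+1):n] - rbar) * (r[1:(n-l)] - rbar)) / sum((r - rbar)^2)
sapply(1:5, rho.hat)               # hand-computed lags 1 to 5
acf(r, lag.max = 5, plot = FALSE)$acf[2:6]   # should agree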

Portmanteau test

Box and Pierce (1970) propose the portmanteau statistic
$$Q^*(m) = T \sum_{l=1}^{m} \hat\rho_l^2$$
as a test statistic for the null hypothesis $H_0: \rho_1 = \dots = \rho_m = 0$ against the alternative hypothesis $H_a: \rho_i \neq 0$ for some $i \in \{1,\dots,m\}$. Under the assumption that $\{r_t\}$ is an iid sequence with certain moment conditions, $Q^*(m)$ is asymptotically $\chi^2_m$.

Ljung and Box (1978)

Ljung and Box (1978) modify the $Q^*(m)$ statistic as below to increase the power of the test in finite samples:
$$Q(m) = T(T+2) \sum_{l=1}^{m} \frac{\hat\rho_l^2}{T-l}.$$
The decision rule is to reject $H_0$ if $Q(m) > \chi^2_\alpha$, where $\chi^2_\alpha$ denotes the $100(1-\alpha)$th percentile of a $\chi^2_m$ distribution.
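
A minimal sketch (illustrative, using simulated data rather than the IBM returns used below) that computes the Ljung-Box statistic directly from the sample autocorrelations and compares it with Box.test():

# Ljung-Box Q(m) from its definition vs. Box.test()
set.seed(123)
r = rnorm(300)                                  # placeholder series
m = 10; n = length(r)
rho = acf(r, lag.max = m, plot = FALSE)$acf[2:(m+1)]
Q = n * (n + 2) * sum(rho^2 / (n - 1:m))        # T(T+2) * sum of rho_l^2/(T-l)
Q
Box.test(r, lag = m, type = "Ljung")$statistic  # should match Q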

# Load data
da = read.table("http://faculty.chicagobooth.edu/ruey.tsay/teaching/fts3/m-ibm3dx2608.txt", header=TRUE)
# IBM simple returns and squared returns
sibm = da[,2]
sibm2 = sibm^2
# Sample ACFs
par(mfrow=c(1,2))
acf(sibm)
acf(sibm2)
# Ljung-Box statistic Q(30)
Box.test(sibm, lag=30, type="Ljung")
Box.test(sibm2, lag=30, type="Ljung")

> Box.test(sibm, lag=30, type="Ljung")
        Box-Ljung test
data:  sibm
X-squared = 38.241, df = 30, p-value = 0.1437

> Box.test(sibm2, lag=30, type="Ljung")
        Box-Ljung test
data:  sibm2
X-squared = 182.12, df = 30, p-value < 2.2e-16

White noise

A time series $r_t$ is called a white noise if $\{r_t\}$ is a sequence of independent and identically distributed random variables with finite mean and variance. All its autocorrelations are zero. If $r_t$ is normally distributed with mean zero and variance $\sigma^2$, the series is called a Gaussian white noise.

Linear time series

A time series $r_t$ is said to be linear if it can be written as
$$r_t = \mu + \sum_{i=0}^{\infty} \psi_i a_{t-i},$$
where $\mu$ is the mean of $r_t$, $\psi_0 = 1$, and $\{a_t\}$ is white noise. Here $a_t$ denotes the new information at time $t$ of the time series and is often referred to as the innovation or shock at time $t$. If $r_t$ is weakly stationary, we can obtain its mean and variance easily by using the independence of $\{a_t\}$:
$$E(r_t) = \mu, \qquad V(r_t) = \sigma_a^2 \sum_{i=0}^{\infty} \psi_i^2,$$
where $\sigma_a^2$ is the variance of $a_t$.

The lag-$l$ autocovariance of $r_t$ is
$$\gamma_l = \mathrm{Cov}(r_t, r_{t-l}) = E\!\left[\left(\sum_{i=0}^{\infty}\psi_i a_{t-i}\right)\left(\sum_{j=0}^{\infty}\psi_j a_{t-l-j}\right)\right] = \sum_{j=0}^{\infty} \psi_{j+l}\psi_j\, E(a_{t-l-j}^2) = \sigma_a^2 \sum_{j=0}^{\infty} \psi_j \psi_{j+l},$$
so
$$\rho_l = \frac{\gamma_l}{\gamma_0} = \frac{\sum_{i=0}^{\infty}\psi_i\psi_{i+l}}{1 + \sum_{i=1}^{\infty}\psi_i^2}.$$
Linear time series models are econometric and statistical models used to describe the pattern of the $\psi$ weights of $r_t$.
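
As an illustration of the $\psi$-weight representation (not from Tsay's code), R's ARMAtoMA() returns the $\psi_i$ weights implied by an ARMA specification; for a stationary AR(1) with coefficient 0.6 they are $0.6^i$, and the truncated sum $\sigma_a^2\sum_i \psi_i^2$ approximates the variance $\sigma_a^2/(1-0.6^2)$:

# psi-weights of an AR(1) with phi1 = 0.6 and the implied variance
psi = c(1, ARMAtoMA(ar = 0.6, lag.max = 200))   # psi_0 = 1, then psi_1, psi_2, ...
head(psi)                                       # 1, 0.6, 0.36, 0.216, ...
sigma2a = 1
sigma2a * sum(psi^2)                            # approx. 1/(1 - 0.6^2) = 1.5625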

AR(1)

For instance, a stationary AR(1) model can be written as
$$r_t - \mu = \phi_1(r_{t-1} - \mu) + a_t,$$
where $\{a_t\}$ is white noise. It is easy to see that
$$r_t - \mu = \sum_{i=0}^{\infty} \phi_1^i a_{t-i} \quad\text{and}\quad V(r_t) = \frac{\sigma_a^2}{1 - \phi_1^2},$$
provided that $\phi_1^2 < 1$. In other words, the weak stationarity of an AR(1) model implies that $|\phi_1| < 1$. Using $\phi_0 = (1-\phi_1)\mu$, one can rewrite a stationary AR(1) as
$$r_t = \phi_0 + \phi_1 r_{t-1} + a_t,$$
with $\phi_1$ measuring the persistence of the process.

Autocovariance function

The autocovariance function of a stationary AR(1) process is
$$\gamma_0 = E[(r_t-\mu)(r_t-\mu)] = E[(\phi_1(r_{t-1}-\mu) + a_t)(r_t-\mu)] = \phi_1 E[(r_{t-1}-\mu)(r_t-\mu)] + E[a_t(r_t-\mu)] = \phi_1\gamma_1 + \sigma_a^2,$$
and, for $l \ge 1$,
$$\gamma_l = E[(r_t-\mu)(r_{t-l}-\mu)] = E[(\phi_1(r_{t-1}-\mu) + a_t)(r_{t-l}-\mu)] = \phi_1 E[(r_{t-1}-\mu)(r_{t-l}-\mu)] = \phi_1\gamma_{l-1} = \phi_1^l\gamma_0.$$

Autocorrelation function

From $\gamma_1 = \phi_1\gamma_0$ and $\gamma_0 = \phi_1\gamma_1 + \sigma_a^2$, it follows that
$$\gamma_0 = \frac{\sigma_a^2}{1 - \phi_1^2}.$$
The ACF of a (weakly) stationary AR(1) process is
$$\rho_l = \phi_1^{|l|}, \qquad l = 0, \pm 1, \pm 2, \dots,$$
which decays exponentially at rate $\phi_1$.
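
A small check (not from the original slides) that the theoretical AR(1) ACF equals $\phi_1^l$, using the built-in ARMAacf():

# Theoretical AR(1) ACF: phi1^l vs. ARMAacf()
phi1 = 0.7
ARMAacf(ar = phi1, lag.max = 5)   # lags 0..5
phi1^(0:5)                        # same values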

AR(2)

An AR(2) model assumes the form
$$r_t = \phi_0 + \phi_1 r_{t-1} + \phi_2 r_{t-2} + a_t,$$
where
$$E(r_t) = \mu = \frac{\phi_0}{1 - \phi_1 - \phi_2},$$
provided that $\phi_1 + \phi_2 \neq 1$. It is easy to see that
$$\gamma_0 = \phi_1\gamma_1 + \phi_2\gamma_2 + \sigma_a^2,$$
$$\gamma_l = \phi_1\gamma_{l-1} + \phi_2\gamma_{l-2}, \quad l > 0,$$
$$\rho_l = \phi_1\rho_{l-1} + \phi_2\rho_{l-2}, \quad l \ge 2.$$

Stationarity

It is easy to see that
$$\rho_1 = \frac{\phi_1}{1 - \phi_2} \quad\text{and}\quad \rho_2 = \phi_1\rho_1 + \phi_2,$$
and
$$\gamma_0 = \phi_1\rho_1\gamma_0 + \phi_2\rho_2\gamma_0 + \sigma_a^2 = \big(\phi_1\rho_1 + \phi_2(\phi_1\rho_1 + \phi_2)\big)\gamma_0 + \sigma_a^2 = \big(\phi_1(1+\phi_2)\rho_1 + \phi_2^2\big)\gamma_0 + \sigma_a^2 = \left(\frac{\phi_1^2(1+\phi_2)}{1-\phi_2} + \phi_2^2\right)\gamma_0 + \sigma_a^2,$$
such that
$$\gamma_0 = \frac{(1-\phi_2)\,\sigma_a^2}{(1+\phi_2)\big[(1-\phi_2)^2 - \phi_1^2\big]}.$$
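
As a numerical sanity check (an illustration under the assumption $\sigma_a^2 = 1$, not part of the slides), one can simulate a long AR(2) path and compare the sample variance with the $\gamma_0$ formula above:

# Check gamma_0 of an AR(2) by simulation (sigma_a^2 = 1)
set.seed(1)
phi1 = 0.9; phi2 = 0.05
y = arima.sim(model = list(ar = c(phi1, phi2)), n = 1e5)
var(y)                                                       # sample variance
(1 - phi2) / ((1 + phi2) * ((1 - phi2)^2 - phi1^2))          # theoretical gamma_0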

R code for the AR(2) stationarity region (see https://stats.stackexchange.com/questions/118019/a-proof-for-the-stationarity-of-an-ar2):

pdf(file="ar2-stationarityregion.pdf", width=7, height=7)
phi1 <- seq(from = -2.5, to = 2.5, length = 51)
plot(phi1, 1+phi1, lty="dashed", type="l", xlab="", ylab="", cex.axis=.8, ylim=c(-1.5,1.5))
abline(a = -1, b = 0, lty="dashed")
abline(a = 1, b = -1, lty="dashed")
title(ylab=expression(phi[2]), xlab=expression(phi[1]), cex.lab=.8)
# Stationarity triangle: phi2 < 1 - phi1, phi2 < 1 + phi1, phi2 > -1
polygon(x = phi1[6:46], y = 1-abs(phi1[6:46]), col="gray")
# Below the parabola phi2 = -phi1^2/4 the characteristic roots are complex
lines(phi1, -phi1^2/4)
text(0, -.5, expression(phi[2] < -phi[1]^2/4), cex=.7)
text(1.2, .5, expression(phi[2] > 1-phi[1]), cex=.7)
text(-1.75, .5, expression(phi[2] > 1+phi[1]), cex=.7)
points(1.75, -0.8, pch=16, col=1)
points(0.9, 0.05, pch=16, col=2)
dev.off()

Two sets of φ's

Let us compute the ACF for two cases: $(\phi_1, \phi_2) = (1.75, -0.8)$ and $(\phi_1, \phi_2) = (0.9, 0.05)$.

[Figure: AR(2) stationarity triangle in the $(\phi_1, \phi_2)$ plane, with boundaries $\phi_2 = 1 - \phi_1$, $\phi_2 = 1 + \phi_1$, $\phi_2 = -1$, the parabola $\phi_2 = -\phi_1^2/4$ separating real from complex roots, and the two parameter points marked.]

Theoretical ACFs

acf.hedi = function(k, phi1, phi2){
  # Recursively compute the AR(2) ACF: rho_l = phi1*rho_{l-1} + phi2*rho_{l-2}
  rhos = rep(0, k)
  rhos[1] = phi1/(1-phi2)
  rhos[2] = phi1*rhos[1] + phi2
  for (l in 3:k) rhos[l] = phi1*rhos[l-1] + phi2*rhos[l-2]
  plot(0:k, c(1, rhos), type="h", xlab="lag", ylab="")
  title(substitute(paste(phi[1], "=", a, ", ", phi[2], "=", b), list(a=phi1, b=phi2)))
  abline(h=0, lty=2)
}

pdf(file="ar2-acf.pdf", width=7, height=5)
par(mfrow=c(1,2))
acf.hedi(100, 1.75, -0.8)
acf.hedi(150, 0.9, 0.05)
dev.off()

Theoretical ACFs

[Figure: theoretical AR(2) ACFs for $(\phi_1, \phi_2) = (1.75, -0.8)$ (damped oscillations, lags 0-100) and $(\phi_1, \phi_2) = (0.9, 0.05)$ (smooth exponential decay, lags 0-150).]

Empirical ACFs

Let us simulate some data and compute the empirical ACFs.

acf1.hedi = function(n, k, phi1, phi2){
  # Simulate 2n observations from the AR(2) and discard the first n as burn-in
  y = rep(0, 2*n)
  for (t in 3:(2*n)) y[t] = phi1*y[t-1] + phi2*y[t-2] + rnorm(1, 0, 1)
  y = y[(n+1):(2*n)]
  acf(y, k)
}

pdf(file="ar2-acf-1.pdf", width=10, height=7)
par(mfrow=c(2,2))
acf.hedi(100, 1.75, -0.8)
acf.hedi(150, 0.9, 0.05)
set.seed(1234)
acf1.hedi(10000, 100, 1.75, -0.8)
acf1.hedi(10000, 150, 0.9, 0.05)
dev.off()

Theoretical vs. empirical ACFs

[Figures: theoretical ACFs (top row) versus empirical ACFs of simulated series (bottom row) for $(\phi_1, \phi_2) = (1.75, -0.8)$ and $(0.9, 0.05)$, with sample sizes n = 10,000, 1,000, 500 and 200. The empirical ACFs track the theoretical ones closely for large n and become noisier as n decreases.]

US real GNP

As an illustration, consider the quarterly growth rate of U.S. real gross national product (GNP), seasonally adjusted, from the second quarter of 1947 to the first quarter of 1991. Here we simply employ an AR(3) model for the data. Denoting the growth rate by $r_t$, the fitted model is
$$r_t = 0.0047 + 0.348\,r_{t-1} + 0.179\,r_{t-2} - 0.142\,r_{t-3} + a_t, \qquad \hat\sigma_a = 0.0097.$$

Alternatively,
$$r_t - 0.348\,r_{t-1} - 0.179\,r_{t-2} + 0.142\,r_{t-3} = 0.0047 + a_t,$$
with the corresponding third-order difference equation
$$1 - 0.348B - 0.179B^2 + 0.142B^3 = 0,$$
or
$$(1 + 0.521B)(1 - 0.869B + 0.274B^2) = 0.$$
The first factor $(1 + 0.521B)$ shows an exponentially decaying feature of the GNP.

Business cycles

The second factor $(1 - 0.869B + 0.274B^2)$ confirms the existence of stochastic business cycles. For an AR(2) model with a pair of complex characteristic roots, the average length of the stochastic cycles is
$$k = \frac{2\pi}{\cos^{-1}\!\big[\phi_1/(2\sqrt{-\phi_2})\big]},$$
which here gives $k = 10.62$ quarters, about 3 years. Fact: if one uses a nonlinear model to separate the U.S. economy into expansion and contraction periods, the data show that the average duration of contraction periods is about 3 quarters and that of expansion periods is about 12 quarters. The average duration of 10.62 quarters is a compromise between the two separate durations. The periodic feature obtained here is common among growth rates of national economies. For example, similar features can be found for many OECD countries.

R code

gnp = scan(file="http://faculty.chicagobooth.edu/ruey.tsay/teaching/fts3/dgnp82.txt")
# Create a time-series object
gnp1 = ts(gnp, frequency=4, start=c(1947,2))
par(mfrow=c(1,1))
plot(gnp1)
points(gnp1, pch="*")
# Find the AR order
m1 = ar(gnp, method="mle")
m1$order
m2 = arima(gnp, order=c(3,0,0))
m2
# In R, "intercept" denotes the mean of the series.
# Therefore, the constant term is obtained below:
(1-.348-.1793+.1423)*0.0077
# Residual standard error
sqrt(m2$sigma2)
# Characteristic equation and solutions
p1 = c(1, -m2$coef[1:3])
roots = polyroot(p1)
# Compute the absolute values (moduli) of the solutions
Mod(roots)
[1] 1.913308 1.920152 1.913308
# Average length of business cycles
# (1.590253 is the real part of the complex roots; 1.913308 is their modulus)
k = 2*pi/acos(1.590253/1.913308)

All moduli exceed 1, confirming the stationarity of the fitted AR(3).

AR(p)

The results for the AR(1) and AR(2) models can readily be generalized to the general AR(p) model:
$$r_t = \phi_0 + \phi_1 r_{t-1} + \dots + \phi_p r_{t-p} + a_t,$$
where $p$ is a nonnegative integer and $\{a_t\}$ is white noise. The mean of a stationary series is
$$E(r_t) = \frac{\phi_0}{1 - \phi_1 - \dots - \phi_p},$$
provided that the denominator is not zero. The associated characteristic equation of the model is
$$1 - \phi_1 x - \phi_2 x^2 - \dots - \phi_p x^p = 0.$$
If all the solutions of this equation are greater than 1 in modulus, then the series $r_t$ is stationary.
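
Following the same idea as the GNP example above, a small helper (illustrative, not from the slides) that checks stationarity of an arbitrary AR(p) coefficient vector by inspecting the moduli of the solutions of the characteristic equation:

# Stationarity check for an AR(p): all solutions of 1 - phi1*x - ... - phip*x^p = 0
# must lie outside the unit circle
is.stationary.ar = function(phi) all(Mod(polyroot(c(1, -phi))) > 1)
is.stationary.ar(c(0.9, 0.05))    # TRUE  (stationary AR(2))
is.stationary.ar(c(0.6, 0.5))     # FALSE (phi1 + phi2 >= 1)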

Partial autocorrelation function (PACF)

The PACF of a stationary time series is a function of its ACF and is a useful tool for determining the order $p$ of an AR model. A simple, yet effective, way to introduce the PACF is to consider the following AR models in consecutive orders:
$$r_t = \phi_{0,1} + \phi_{1,1} r_{t-1} + e_{1,t},$$
$$r_t = \phi_{0,2} + \phi_{1,2} r_{t-1} + \phi_{2,2} r_{t-2} + e_{2,t},$$
$$r_t = \phi_{0,3} + \phi_{1,3} r_{t-1} + \phi_{2,3} r_{t-2} + \phi_{3,3} r_{t-3} + e_{3,t},$$
and so on. The estimate $\hat\phi_{i,i}$ of the $i$th equation is called the lag-$i$ sample PACF of $r_t$.

For a stationary Gaussian AR(p) model, it can be shown that the sample PACF has the following properties: $\hat\phi_{p,p} \to \phi_p$ as $T \to \infty$; $\hat\phi_{l,l} \to 0$ for all $l > p$; and $\mathrm{Var}(\hat\phi_{l,l}) \approx 1/T$ for $l > p$. These results say that, for an AR(p) series, the sample PACF cuts off at lag $p$.
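
A quick illustration of the cut-off property (simulated data, not from the slides): simulate an AR(2) and inspect pacf(), which should show two significant spikes and roughly zero values afterwards:

# PACF of a simulated AR(2): spikes at lags 1 and 2, then approximately zero
set.seed(7)
y = arima.sim(model = list(ar = c(0.9, 0.05)), n = 2000)
pacf(y, lag.max = 10)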

AIC

The well-known Akaike information criterion (AIC) (Akaike, 1973) is defined as
$$\mathrm{AIC} = \underbrace{-\frac{2}{T}\log(\text{likelihood})}_{\text{goodness of fit}} + \underbrace{\frac{2}{T}(\text{number of parameters})}_{\text{penalty function}},$$
where the likelihood function is evaluated at the maximum-likelihood estimates and $T$ is the sample size. For a Gaussian AR(l) model, AIC reduces to
$$\mathrm{AIC}(l) = \log(\tilde\sigma_l^2) + \frac{2l}{T},$$
where $\tilde\sigma_l^2$ is the maximum-likelihood estimate of $\sigma_a^2$, the variance of $a_t$, and $T$ is the sample size.

BIC

Another commonly used criterion function is the Schwarz (Bayesian) information criterion (BIC). For a Gaussian AR(l) model, the criterion is
$$\mathrm{BIC}(l) = \log(\tilde\sigma_l^2) + \frac{l\log(T)}{T}.$$
The penalty for each parameter is 2 for AIC and $\log(T)$ for BIC. Thus, BIC tends to select a lower-order AR model when the sample size is moderate or large.
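
A short sketch (illustrative; the function and variable names are my own) that computes AIC(l) and BIC(l) over candidate AR orders for a simulated series and picks the minimizing order:

# AR order selection by AIC and BIC for a simulated AR(2)
set.seed(99)
y = arima.sim(model = list(ar = c(0.5, 0.3)), n = 500)
n = length(y); pmax = 8
ic = t(sapply(1:pmax, function(l) {
  fit = arima(y, order = c(l, 0, 0), method = "ML")
  s2  = fit$sigma2                      # ML estimate of the innovation variance
  c(AIC = log(s2) + 2*l/n, BIC = log(s2) + l*log(n)/n)
}))
ic
which.min(ic[, "AIC"]); which.min(ic[, "BIC"])   # selected orders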

Forecasting

For the AR(p) model, suppose that we are at time index $h$ and are interested in forecasting $r_{h+l}$, where $l \ge 1$. The time index $h$ is called the forecast origin and the positive integer $l$ is the forecast horizon. Let $\hat r_h(l)$ be the forecast of $r_{h+l}$ under the minimum squared error loss function, i.e.
$$E\{[r_{h+l} - \hat r_h(l)]^2 \mid F_h\} = \min_g E[(r_{h+l} - g)^2 \mid F_h],$$
where $g$ is a function of the information available at time $h$ (inclusive), that is, a function of $F_h$. We refer to $\hat r_h(l)$ as the $l$-step-ahead forecast of $r_t$ at the forecast origin $h$.

1-step-ahead forecast

It is easy to see that
$$\hat r_h(1) = E(r_{h+1} \mid F_h) = \phi_0 + \sum_{i=1}^{p} \phi_i r_{h+1-i},$$
and the associated forecast error is
$$e_h(1) = r_{h+1} - \hat r_h(1) = a_{h+1}, \qquad V(e_h(1)) = V(a_{h+1}) = \sigma_a^2.$$

2-step-ahead forecast

Similarly,
$$\hat r_h(2) = \phi_0 + \phi_1 \hat r_h(1) + \phi_2 r_h + \dots + \phi_p r_{h+2-p},$$
with
$$e_h(2) = a_{h+2} + \phi_1 a_{h+1} \quad\text{and}\quad V(e_h(2)) = (1 + \phi_1^2)\sigma_a^2.$$

Multistep-ahead forecast

In general,
$$r_{h+l} = \phi_0 + \sum_{i=1}^{p} \phi_i r_{h+l-i} + a_{h+l},$$
and
$$\hat r_h(l) = \phi_0 + \sum_{i=1}^{p} \phi_i \hat r_h(l-i),$$
where $\hat r_h(i) = r_{h+i}$ if $i \le 0$.
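
In R, forecasts corresponding to this recursion can be obtained with predict() on a fitted arima object; a minimal sketch (illustrative data and horizon):

# Multistep-ahead forecasts from a fitted AR(3), at forecast origin T
set.seed(11)
y = arima.sim(model = list(ar = c(0.4, 0.2, -0.1)), n = 400)
fit = arima(y, order = c(3, 0, 0))
predict(fit, n.ahead = 8)   # $pred: point forecasts, $se: forecast standard errors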

Mean reversion

It can be shown that for a stationary AR(p) model, $\hat r_h(l) \to E(r_t)$ as $l \to \infty$, meaning that for such a series the long-term point forecast approaches its unconditional mean. This property is referred to as mean reversion in the finance literature. For an AR(1) model, the speed of mean reversion is measured by the half-life, defined as
$$\text{half-life} = \frac{\log(0.5)}{\log(|\phi_1|)}.$$
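
For instance, the half-life of an AR(1) with $\phi_1 = 0.9$ is about 6.6 periods; a one-line check (illustrative):

# Half-life of mean reversion for an AR(1)
half.life = function(phi1) log(0.5) / log(abs(phi1))
half.life(0.9)   # approx. 6.58 periods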

MA(1) model

There are several ways to introduce MA models. One approach is to treat the model as a simple extension of white noise series. Another approach is to treat the model as an infinite-order AR model with some parameter constraints. We adopt the second approach.

We may entertain, at least in theory, an AR model with infinite order:
$$r_t = \phi_0 + \phi_1 r_{t-1} + \phi_2 r_{t-2} + \dots + a_t.$$
However, such an AR model is not realistic because it has infinitely many parameters. One way to make the model practical is to assume that the coefficients $\phi_i$ satisfy some constraints so that they are determined by a finite number of parameters. A special case of this idea is
$$r_t = \phi_0 - \theta_1 r_{t-1} - \theta_1^2 r_{t-2} - \theta_1^3 r_{t-3} - \dots + a_t,$$
where the coefficients depend on a single parameter $\theta_1$ via $\phi_i = -\theta_1^i$ for $i \ge 1$.

Obviously,
$$r_t + \theta_1 r_{t-1} + \theta_1^2 r_{t-2} + \theta_1^3 r_{t-3} + \dots = \phi_0 + a_t,$$
and
$$\theta_1(r_{t-1} + \theta_1 r_{t-2} + \theta_1^2 r_{t-3} + \theta_1^3 r_{t-4} + \dots) = \theta_1(\phi_0 + a_{t-1}),$$
so that, subtracting,
$$r_t = \phi_0(1 - \theta_1) + a_t - \theta_1 a_{t-1},$$
i.e., $r_t$ is a weighted average of the shocks $a_t$ and $a_{t-1}$. Therefore, the model is called an MA model of order 1, or MA(1) model for short.

MA(q)

The general form of an MA(q) model is
$$r_t = c_0 + a_t - \sum_{i=1}^{q} \theta_i a_{t-i},$$
or
$$r_t = c_0 + (1 - \theta_1 B - \dots - \theta_q B^q) a_t,$$
where $q > 0$. Moving-average models are always weakly stationary because they are finite linear combinations of a white noise sequence for which the first two moments are time invariant. In particular,
$$E(r_t) = c_0, \qquad V(r_t) = (1 + \theta_1^2 + \theta_2^2 + \dots + \theta_q^2)\sigma_a^2.$$

ACF of an MA(1)

Assume that $c_0 = 0$ for simplicity. Then
$$r_{t-l} r_t = r_{t-l} a_t - \theta_1 r_{t-l} a_{t-1}.$$
Taking expectations, we obtain
$$\gamma_1 = -\theta_1 \sigma_a^2 \quad\text{and}\quad \gamma_l = 0, \ \text{for } l > 1.$$
Since $V(r_t) = (1 + \theta_1^2)\sigma_a^2$, it follows that
$$\rho_0 = 1, \qquad \rho_1 = \frac{-\theta_1}{1 + \theta_1^2}, \qquad \rho_l = 0, \ \text{for } l > 1.$$
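
A quick check of $\rho_1 = -\theta_1/(1+\theta_1^2)$ against a simulated MA(1) (illustrative values; note that R's arima.sim uses the opposite sign convention for MA coefficients, $r_t = a_t + \theta a_{t-1}$, so Tsay's $\theta_1$ enters with a minus sign):

# MA(1) with theta1 = 0.8: theoretical rho_1 vs. sample ACF
theta1 = 0.8
-theta1 / (1 + theta1^2)                             # theoretical lag-1 ACF = -0.4878
set.seed(3)
y = arima.sim(model = list(ma = -theta1), n = 5000)  # minus sign: R's MA convention is +theta*a_{t-1}
acf(y, lag.max = 3, plot = FALSE)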

ACF of an MA(2)

For the MA(2) model, the autocorrelation coefficients are
$$\rho_1 = \frac{-\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}, \qquad \rho_2 = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2}, \qquad \rho_l = 0, \ \text{for } l > 2.$$
Here the ACF cuts off at lag 2. This property generalizes to other MA models: for an MA(q) model, the lag-$q$ autocorrelation is not zero, but $\rho_l = 0$ for $l > q$. Thus an MA(q) series is only linearly related to its first $q$ lagged values and hence is a finite-memory model.

Estimation

Maximum-likelihood estimation is commonly used to estimate MA models. There are two approaches for evaluating the likelihood function of an MA model. The first approach assumes that $a_t = 0$ for $t \le 0$, so that $a_1 = r_1 - c_0$, $a_2 = r_2 - c_0 + \theta_1 a_1$, etc. This approach is referred to as the conditional-likelihood method. The second approach treats the initial shocks $a_t$, $t \le 0$, as additional parameters of the model and estimates them jointly with the other parameters. This approach is referred to as the exact-likelihood method.
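
In R's arima(), method = "CSS" roughly corresponds to the conditional approach (conditional sum of squares) and method = "ML" to exact maximum likelihood; a small comparison on simulated data (illustrative):

# Conditional vs. exact likelihood estimation of an MA(1)
set.seed(21)
y = arima.sim(model = list(ma = -0.5), n = 300)
arima(y, order = c(0, 0, 1), method = "CSS")$coef   # conditional sum of squares
arima(y, order = c(0, 0, 1), method = "ML")$coef    # exact maximum likelihood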

Forecasting an MA(1)

For the 1-step-ahead forecast of an MA(1) process, the model says
$$r_{h+1} = c_0 + a_{h+1} - \theta_1 a_h.$$
Taking the conditional expectation, we have
$$\hat r_h(1) = E(r_{h+1} \mid F_h) = c_0 - \theta_1 a_h, \qquad e_h(1) = r_{h+1} - \hat r_h(1) = a_{h+1},$$
with $V[e_h(1)] = \sigma_a^2$. Similarly,
$$\hat r_h(2) = E(r_{h+2} \mid F_h) = c_0, \qquad e_h(2) = r_{h+2} - \hat r_h(2) = a_{h+2} - \theta_1 a_{h+1},$$
with $V[e_h(2)] = (1 + \theta_1^2)\sigma_a^2$.

Forecasting an MA(2)

Similarly, for an MA(2) model we have
$$r_{h+l} = c_0 + a_{h+l} - \theta_1 a_{h+l-1} - \theta_2 a_{h+l-2},$$
from which we obtain
$$\hat r_h(1) = c_0 - \theta_1 a_h - \theta_2 a_{h-1}, \qquad \hat r_h(2) = c_0 - \theta_2 a_h, \qquad \hat r_h(l) = c_0, \ \text{for } l > 2.$$
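
The fact that MA(q) forecasts revert to the mean after q steps is easy to see numerically (illustrative sketch): forecasts from a fitted MA(2) beyond horizon 2 all equal the estimated intercept:

# MA(2) forecasts are constant (equal to the mean) beyond horizon 2
set.seed(5)
y = arima.sim(model = list(ma = c(-0.6, -0.3)), n = 500)
fit = arima(y, order = c(0, 0, 2))
predict(fit, n.ahead = 5)$pred   # horizons 3, 4, 5 coincide with the fitted intercept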

Summary

A brief summary of AR and MA models is in order. We have discussed the following properties. For MA models, the ACF is useful in specifying the order because the ACF cuts off at lag $q$ for an MA(q) series. For AR models, the PACF is useful in order determination because the PACF cuts off at lag $p$ for an AR(p) process. An MA series is always stationary, but for an AR series to be stationary, all of its characteristic roots must be less than 1 in modulus. Carefully read Section 2.6 of Tsay (2010) on ARMA models.