Multivariate Time Series: VAR(p) Processes and Models

A VAR(p) model, for p > 0, is

X_t = φ_0 + Φ_1 X_{t-1} + ... + Φ_p X_{t-p} + A_t,

where X_t, φ_0, and X_{t-i} are k-vectors, Φ_1, ..., Φ_p are k × k matrices with Φ_p ≠ 0, and {A_t} is a sequence of serially uncorrelated k-vectors with mean 0 and constant positive definite variance-covariance matrix Σ.

We can also write this using the back-shift operator as

(I - Φ_1 B - ... - Φ_p B^p) X_t = φ_0 + A_t,

or

Φ(B) X_t = φ_0 + A_t.
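
As a concrete illustration (not from the text), here is a minimal R sketch that simulates a stationary bivariate VAR(1) and recovers its coefficients with VAR() from the vars package; the matrix Phi1 and the simulation settings are hypothetical.

# Simulate a stationary bivariate VAR(1) and fit it with vars::VAR().
library(vars)
set.seed(1)
n <- 500
Phi1 <- matrix(c(0.5, 0.1, 0.2, 0.3), nrow = 2)  # hypothetical Phi_1
A <- matrix(rnorm(2 * n), n, 2)                  # iid N(0, I) shocks
X <- matrix(0, n, 2)
for (t in 2:n) X[t, ] <- Phi1 %*% X[t - 1, ] + A[t, ]
fit <- VAR(X, p = 1, type = "const")             # estimates phi_0 and Phi_1
Acoef(fit)                                       # estimated Phi matrices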

Companion Matrix

We can sometimes get a better understanding of a k-dimensional VAR(p) process by writing it as a kp-dimensional VAR(1) process,

Y_t = Φ* Y_{t-1} + B_t,

where Y_t = (X_{t-p+1}^T, ..., X_t^T)^T, B_t = (0, ..., 0, A_t^T)^T, and

Φ* = [ 0      I        0        ...  0
       0      0        I        ...  0
       .      .        .             .
       0      0        0        ...  I
       Φ_p    Φ_{p-1}  Φ_{p-2}  ...  Φ_1 ].

The matrix Φ* is sometimes called the companion matrix. The key fact here is that stationarity can be assessed by looking at the eigenvalues of Φ*: the process is stationary if all of them are less than 1 in modulus.
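
A minimal base-R sketch of this construction; the helper companion() is hypothetical and uses the layout above, with the coefficient blocks in the last block row. (For a fitted model, vars::roots() returns the moduli of the same eigenvalues.)

# Build the companion matrix from a list of k x k coefficient matrices
# Phi[[1]], ..., Phi[[p]] and check the stationarity condition.
companion <- function(Phi) {
  p <- length(Phi); k <- nrow(Phi[[1]])
  shift <- cbind(matrix(0, k * (p - 1), k), diag(k * (p - 1)))
  rbind(shift, do.call(cbind, rev(Phi)))  # Phi_p, ..., Phi_1 on the bottom
}
Phi <- list(matrix(c(0.9, 0.1, 0.0, 0.5), nrow = 2),
            matrix(c(0.05, 0.0, 0.0, 0.05), nrow = 2))
max(Mod(eigen(companion(Phi))$values))    # < 1 here, so stationary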

Number of Terms in Time Series Models

The two common general types of time series models incorporate past history either through linear combinations of past observations (AR) or of previous errors (shocks) in the system (MA). To use either type of model, we need to decide on the order. For an AR model, we do that by using a sequence of partial models:

R_t = φ_{0,1} + φ_{1,1} R_{t-1}
R_t = φ_{0,2} + φ_{1,2} R_{t-1} + φ_{2,2} R_{t-2}
...
R_t = φ_{0,p} + φ_{1,p} R_{t-1} + ... + φ_{p,p} R_{t-p}

The coefficients φ_{i,i} constitute the partial autocorrelation function (PACF). (The argument of the function is the index i.)

The Partial Autocorrelation Function (PACF) in AR Models

The PACF is useful for an AR model because we can partial out the intermediate dependence. Consider AR(1): R_t = φ R_{t-1} + A_t. We have γ(2) = φ^2 γ(0) for R_t and R_{t-2}, so the covariance does not vanish at lag 2. Could we construct something whose covariance does go to 0 at lag 2? Consider R_t - φ R_{t-1} and R_{t-2} - φ R_{t-1}; their covariance is 0. This is the idea behind the PACF: for R_t and R_{t+h}, regress each on the R_k's between them. The important result is that the PACF in an AR(p) model is 0 beyond lag p, that is, φ_{p+1,p+1} = 0; hence, we can use it to identify p.

The Partial Autocorrelation Function (PACF) in AR Models

The question is how to use the sample PACF. Often, since we are using it to build a model in any case, we just use simple graphs to decide at what order the sample PACF has died off. More formally, if the errors in the AR(p) model are iid with mean 0, then the sample PACFs beyond lag p are asymptotically iid N(0, 1/n).
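
A quick base-R check of the identification idea (a sketch): the sample PACF of a simulated AR(2) series should die off after lag 2, and pacf() draws the approximate bounds implied by the N(0, 1/n) result.

# Simulate an AR(2) series and inspect its sample PACF.
set.seed(1)
x <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)
pacf(x)   # spikes at lags 1 and 2; roughly N(0, 1/n) noise beyond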

Number of Terms in a VAR(p) Model

We use similar ideas to determine the p in a VAR(p) model. Instead of scalar partial correlations, however, we have partial covariance matrices. It's a little harder even to get started. We take a parametric approach using a multivariate normal distribution. Residuals from a partial true model have a PDF of the form

f(r) = (2π)^{-k/2} |Σ|^{-1/2} exp( -(r - µ_r)^T Σ^{-1} (r - µ_r)/2 ).

Sequential Tests for Φ_j = 0 in a VAR(p) Model

We consider a sequence of VAR models,

X_t = φ_0 + Φ_1 X_{t-1} + A_t
X_t = φ_0 + Φ_1 X_{t-1} + Φ_2 X_{t-2} + A_t
...
X_t = φ_0 + Φ_1 X_{t-1} + ... + Φ_i X_{t-i} + A_t
...

where X_t, φ_0, and X_{t-i} are k-vectors, Φ_1, ..., Φ_i are k × k matrices with Φ_i ≠ 0, and {A_t} is a sequence of serially uncorrelated k-vectors with mean 0 and constant positive definite variance-covariance matrix Σ.

We test sequentially that Φ_h = 0, using likelihood ratio tests. The likelihood ratio leads to two similar tests, Wald tests and score tests (also called Rao tests and Lagrange multiplier tests). In a Wald test, we use the MLE under the larger (unrestricted) model.

Sequential Tests for Φ_j = 0 in a VAR(p) Model

We'll use a Wald test, using given data x_1, ..., x_n. To test a model with i - 1 terms versus a model with i terms, the log of the likelihood ratio only involves

log( |Σ_i| / |Σ_{i-1}| ),

where the Σ's are the MLEs of the variance-covariance matrix of the errors in the models with the appropriate numbers of terms. When i = 1, Σ_0 is just the sample variance-covariance matrix of the x's. With the proper normalizing factors shown in equation (8.18) on page 406 (derived by Tiao and Box), the log likelihood ratio has an asymptotic chi-squared distribution with k^2 degrees of freedom under the null hypothesis. (This asymptotic distribution holds under what I call the Le Cam regularity conditions; see Gentle (2013), page 169. These are satisfied if our likelihood is correct in the first place!)

Sequential Tests for Φ_j = 0 in a VAR(p) Model

We first test H_0: Φ_1 = 0 versus H_1: Φ_1 ≠ 0. What next? In principle, whether or not we reject, we could go on to test H_0: Φ_2 = 0; usually, however, we proceed to the next model only if we reject the preceding hypothesis. I am not sure whether there is an R function that does these tests directly, but the output of VAR in the vars package can easily be used to compute the statistic, as in the sketch below.
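
Here is a sketch of how the statistic might be computed from VAR() output; the helper seq_lr_tests() is hypothetical, and the normalizing factor is my reading of equation (8.18), to be checked against the text.

# Sequential likelihood-ratio tests for the order of a VAR.
library(vars)
seq_lr_tests <- function(X, pmax) {
  n <- nrow(X); k <- ncol(X)
  detSig <- function(i) {            # determinant of MLE-style residual
    if (i == 0) return(det(cov(X)))  # covariance (i = 0: no AR terms)
    det(crossprod(resid(VAR(X, p = i, type = "const"))) / n)
  }
  for (i in 1:pmax) {
    M <- -(n - k - i - 3/2) * log(detSig(i) / detSig(i - 1))
    cat(sprintf("Phi_%d = 0: M = %8.2f, p-value = %.4f\n",
                i, M, pchisq(M, df = k^2, lower.tail = FALSE)))
  }
}
data(Canada)                         # example data shipped with vars
seq_lr_tests(as.matrix(Canada), pmax = 4)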

The ARCH Effect in a VAR(p) Model

An extension to the VAR model allows the volatility to vary as in an ARCH or GARCH model. The R function arch.test in the vars package computes a multivariate ARCH-LM test for this effect in the residuals of a fitted VAR; serial.test in the same package computes a portmanteau test for remaining serial correlation.
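
A sketch of both diagnostics on a fitted VAR, using the Canada data shipped with the vars package:

# Fit a VAR and test its residuals for ARCH effects and serial correlation.
library(vars)
data(Canada)
fit <- VAR(Canada, p = 2, type = "const")
arch.test(fit, lags.multi = 5)   # H0: no ARCH effect in the residuals
serial.test(fit)                 # H0: no remaining serial correlation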

Forecasting with a VAR(p) Model

Forecasting with a VAR(p) model is similar to forecasting with a univariate model. Given x_t, ..., x_{t-p+1}, the 1-step-ahead forecast at time t is

X_t(1) = φ_0 + Σ_{i=1}^{p} Φ_i X_{t+1-i},

and the forecast error is A_{t+1}. Substituting, we get the 2-step-ahead forecast at time t,

X_t(2) = φ_0 + Φ_1 X_t(1) + Σ_{i=2}^{p} Φ_i X_{t+2-i},

and the forecast error is A_{t+2} + Φ_1 A_{t+1}.
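
A sketch of these recursions using predict() on a fitted VAR; the method iterates the fitted equations exactly as above.

# Multistep forecasts from a fitted VAR.
library(vars)
data(Canada)
fit <- VAR(Canada, p = 2, type = "const")
pred <- predict(fit, n.ahead = 2)   # 1- and 2-step-ahead forecasts
pred$fcst$e                         # forecasts and intervals for component "e"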

Impulse Response Function

We can also express a causal VAR(p) as an infinite moving average model, just as we did with a univariate model:

X_t = θ_0 + A_t + Ψ_1 A_{t-1} + Ψ_2 A_{t-2} + ...

The coefficient matrices in such an infinite MA model are called impulse response functions.

What's causal? A process is causal if it can be written, as above, in terms of present and past shocks only; for a VAR(p), this is the case when all eigenvalues of the companion matrix are less than 1 in modulus.
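
A sketch of computing the Ψ matrices and the impulse responses from a fitted VAR with the vars package (Phi() returns the MA coefficient matrices; irf() adds bootstrap bands):

# MA(infinity) coefficients and impulse responses of a fitted VAR.
library(vars)
data(Canada)
fit <- VAR(Canada, p = 2, type = "const")
Phi(fit, nstep = 4)            # Psi_0 = I, Psi_1, ..., Psi_4
plot(irf(fit, n.ahead = 10))   # impulse response functions with bands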

Vector Moving-Average or VMA(q) Models and VARMA(p, q) Models

The vector moving-average or VMA(q) model is the obvious extension of the univariate MA model. We can write it as

X_t = θ_0 + A_t - Θ_1 A_{t-1} - ... - Θ_q A_{t-q},

or

X_t = θ_0 + Θ(B) A_t.

The differences between a VMA and an MA are similar to the differences between a VAR and an AR. Also, just as we combine an AR model and an MA model, we combine a VAR and a VMA to get a vector ARMA or VARMA(p, q) model.

Marginal Models of Components of VMA(q) Models

The marginal models of the components of a VMA(q) model are just MA(q) models. We see this because the cross-correlation matrix of X_t vanishes after lag q, and so we can write X_{it} as

X_{it} = θ_{i0} + B_{i,t} + Σ_{j=1}^{q} θ_{i,j} B_{i,t-j},

where {B_{i,t}} is a sequence of uncorrelated random variables with mean 0 and constant variance.

Marginal Models of Components of VAR(p) Models

One approach to studying the marginal components of VAR(p) models is by use of the structural equations. These are formed by diagonalizing the variance-covariance matrix of A_t, as we discussed last week for a VAR(p) model. This approach shows the concurrent relationships of one component to all the others.

Another approach is to obtain explicit representations of all of the component series as AR models. We can do this if we can diagonalize the AR polynomial coefficient matrix in a VAR(p) model.

Marginal Models of Components and Diagonalizing Matrices

Some technical notes are in order here. A nonnegative definite matrix can always be diagonalized by use of a Cholesky decomposition, but not all square matrices can be diagonalized. A matrix that can be diagonalized is called a regular matrix. (See Gentle, 2007, pages 116 and following, for conditions and a general discussion of the problem.) One general method of diagonalizing a regular matrix A is to use the matrix V whose columns are linearly independent eigenvectors of A; then C = V^{-1} A V is the diagonal matrix whose elements are the eigenvalues of A. This requires both premultiplication and postmultiplication of A, and if the matrix is not of full rank, it requires some rearrangement of the rows and columns.
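
A base-R sketch of this eigenvector construction, with a hypothetical 2 × 2 regular matrix:

# Diagonalize a regular matrix via its eigenvectors: C = V^{-1} A V.
A <- matrix(c(2, 1, 0, 3), nrow = 2)   # distinct eigenvalues, so regular
V <- eigen(A)$vectors                  # columns: independent eigenvectors
round(solve(V) %*% A %*% V, 10)        # diagonal matrix of eigenvalues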

Marginal Models of Components

We'll just do the example in the text for the VAR(1) case with k = 2. The bivariate VAR(1) model is

[ 1 - Φ_11 B    -Φ_12 B    ] [ X_1t ]   [ A_1,t ]
[ -Φ_21 B       1 - Φ_22 B ] [ X_2t ] = [ A_2,t ].

We premultiply both sides by

[ 1 - Φ_22 B    Φ_12 B     ]
[ Φ_21 B        1 - Φ_11 B ].

This gives us the marginal models, in which each AR component has the coefficient polynomial

(1 - Φ_11 B)(1 - Φ_22 B) - Φ_12 Φ_21 B^2.

Note, however, that we now have AR(2) operators on the left side and MA(1) operators on the right side; that is, a bivariate VAR(1) model became two marginal ARMA(2,1) models.

Marginal Models of Components

This idea generalizes (with a lot of tedious algebra). A k-variate VAR(p) model yields k marginal ARMA(kp, (k-1)p) models. The VMA(q) part of a VARMA model may add up to q additional MA terms; in general, the MA order of the marginal models is at most (k-1)p + q.

We next consider some other ways that decomposing a VAR(p) model can lead to new insights about the process in some cases. The most interesting such case is where we have cointegration.

Unit-Root Nonstationarity

Many economic time series exhibit either (apparent) random walk behavior,

P_t = P_{t-1} + A_t,

or random walk with drift behavior,

P_t = µ + P_{t-1} + A_t,

where {A_t} is iid with variance σ_A^2. Either of these processes has unit-root nonstationarity. These processes can be made stationary by differencing; that is, the series is integrated. We speak of an integrated series of order d, denoted I(d), if d differences result in a stationary process. Notice the effects of the nonstationarity on the following slides.

Simple Random Walk Process

In the simple random walk process, the k-step-ahead forecast is

P_t(k) = E(P_{t+k} | p_t, p_{t-1}, ..., p_0) = p_t.

It is not mean reverting. The forecast error is

e_t(k) = a_{t+k} + ... + a_{t+1},

and its variance is

V(e_t(k)) = k σ_A^2,

which grows without bound in k. The forecast has no value.
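
A quick Monte Carlo check of this forecast error variance (a base-R sketch with σ_A^2 = 1):

# The k-step forecast error of a random walk has variance k * sigma_A^2.
set.seed(1)
k <- 5
err <- replicate(10000, sum(rnorm(k)))  # e_t(k) = a_{t+1} + ... + a_{t+k}
var(err)                                # close to k = 5 here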

Random Walk Process with Drift

In the random walk process with drift, the k-step-ahead forecast is

P_t(k) = E(P_{t+k} | p_t, p_{t-1}, ..., p_0) = kµ + p_t.

It is not mean reverting. The conditional variance of P_t given the initial value is t σ_A^2, which grows without bound.

I should mention one more type of nonstationary process: the trend-stationary process,

P_t = α_0 + α_1 t + A_t.

Notice that this process is not stationary because of its mean; its variance, however, is time invariant. This process can be made stationary by detrending, that is, by subtracting the trend α_1 t.

Spurious Regressions

First, consider two trend-stationary processes,

Y_t = α_0 + α_1 t + A_t

and

X_t = δ_0 + δ_1 t + B_t,

that have nothing to do with each other (i.e., everything is independent). Now, consider the regression of Y_t on X_t:

Y_t = β_0 + β_1 X_t + ɛ_t
    = β_0 + β_1 (δ_0 + δ_1 t + B_t) + ɛ_t
    = γ_0 + (β_1 δ_1) t + ɛ*_t,

where the error ɛ*_t absorbs β_1 B_t. The regression test will probably be significant. This results from the trends; it is spurious, however. Everybody knows this.

Spurious Regressions

Next, consider two random walks,

Y_t = Y_{t-1} + A_t

and

X_t = X_{t-1} + B_t,

that have nothing to do with each other (i.e., everything is independent). For simplicity, assume that A_t and B_t are iid N(0,1). Now, consider the regression of Y_t on X_t (without intercept):

Y_t = β X_t + ɛ_t.

We see that β = Cov(Y_t, X_t)/V(X_t) and ɛ_t ∼ N(0, t).

Spurious Regressions of Random Walks

Granger and Newbold, in a very famous Monte Carlo study in 1974, found that the standard t test of H_0: β = 0 rejected this true hypothesis 76% of the time. This example is very different from the spurious regression of one trend-stationary series on another. The problem here is unit-root nonstationarity.
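
A base-R sketch reproducing the flavor of the Granger and Newbold experiment; the settings (n = 100, 1000 replications, 5% nominal level) are arbitrary.

# Regress one random walk on an independent one; count t-test rejections.
set.seed(1)
n <- 100
rej <- replicate(1000, {
  y <- cumsum(rnorm(n)); x <- cumsum(rnorm(n))
  abs(coef(summary(lm(y ~ x)))["x", "t value"]) > 1.96
})
mean(rej)   # far above 0.05, in the neighborhood of the famous 76%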

Spurious Regressions of Random Walks: A Technical Aside

Consider a regression model of the form Y_t = β x_t + ɛ_t, with the usual assumptions that the ɛ_t are uncorrelated with each other and V(ɛ_t) = σ^2. What about the relationship between x_t and ɛ_t? The asymptotic properties (relating to normality) will hold if x_t and ɛ_t are independent. This is fine if x_t is a constant. What about if x_t is a random variable? This happens all the time in financial applications. In these applications, however, we cannot assume that x_t and ɛ_t are independent. Can we find a weaker condition?

Spurious Regressions of Random Walks: A Technical Aside (continued)

A weaker sufficient condition is called the martingale difference assumption:

E(ɛ_t | x_t, ɛ_{t-1}, x_{t-1}, ..., ɛ_1, x_1) = 0 for all t,

and

lim_{t→∞} E(ɛ_t^2 | x_t, ɛ_{t-1}, x_{t-1}, ..., ɛ_1, x_1) = σ^2, almost surely.

The punchline is that the second condition is not satisfied in the regression of one random walk on another. The problem is that n^{-2} Σ_t x_t^2 has a nondegenerate limiting distribution.

Unit-Root Nonstationarity and Cointegration

The spurious regression problem (as well as other issues) makes consideration of unit-root nonstationarity in multivariate time series important. Now let's consider unit-root nonstationarity in the context of a VARMA model. There are different kinds of situations. In some cases the component time series may not have any relationships to each other (although spurious regressions may exist). In some interesting cases, however, even though the component series are unit-root nonstationary, a linear combination of some of them is stationary. This phenomenon is called cointegration.

Unit-Root Nonstationarity and Cointegration

The example in the text (p. 428) is a good simple one to illustrate the idea. We have the bivariate ARMA(1,1) model

X_t - Φ X_{t-1} = A_t - Θ A_{t-1},

with

Φ = [  0.50  -1.00 ]        Θ = [  0.20  -0.40 ]
    [ -0.25   0.50 ],           [ -0.10   0.20 ].

We first determine the eigenvalues of the AR coefficient matrix:

> phi <- matrix(c(0.50,-0.25,-1.00,0.50),nrow=2)
> eigen(phi)$values
[1] 1.000000e+00 -5.421011e-20

We note that the AR coefficient matrix is singular: one eigenvalue is 0 (to within rounding), and the other eigenvalue is 1. (Eigenvalues of 0 and 1 are a necessary condition for an idempotent matrix, but not a sufficient one. We note in this case, however, that the coefficient matrix is indeed idempotent.) As illustrated on the previous slides, we write the model in the form that uses the backshift operator, and then we obtain the marginal components by premultiplication by

[ 1 - 0.50B    -1.00B     ]
[ -0.25B       1 - 0.50B  ].

Unit-Root Nonstationarity and Cointegration

This premultiplication yields the coefficient matrix on the left as

[ 1 - B    0     ]
[ 0        1 - B ];

hence, we see that each component is unit-root nonstationary. Now we seek a linear combination of the component time series that is stationary. Following Tsay, we transform the system as in equation (8.32).

Unit-Root Nonstationarity and Cointegration

By premultiplying by a generalized inverse of the AR coefficient matrix,

L = [ 1.0  -2.0 ]
    [ 0.5   1.0 ],

we get equation (8.32), which is in terms of two linear combinations of X_1t and X_2t:

[ Y_1t ]   [ 1.0  0 ] [ Y_1,t-1 ]   [ B_1,t ]   [ 0.4  0 ] [ B_1,t-1 ]
[ Y_2t ] = [ 0    0 ] [ Y_2,t-1 ] + [ B_2,t ] - [ 0    0 ] [ B_2,t-1 ].

The two linear combinations of X_1t and X_2t, that is, Y_1t and Y_2t, are uncoupled. Their concurrent correlation is the correlation between B_1t and B_2t (which is not 0). Y_1t is unit-root nonstationary, but Y_2t is stationary.

Cointegration

Y_1t = X_1t - 2X_2t is called the common trend of X_1t and X_2t. In Y_2t = 0.5X_1t + X_2t = b^T (X_1t, X_2t)^T, the vector b = (0.5, 1.0)^T, which yields a stationary process, is called the cointegration vector.

In general, cointegration of order m exists within a multivariate time series whenever all of the component series are unit-root nonstationary but there exist m > 0 linearly independent cointegration vectors. A financial interpretation of a cointegrated multivariate time series is that the components have some common threads that result in linear combinations with a long-run equilibrium, even though the individual components are nonstationary and have variances diverging to ∞.
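
A sketch simulating the bivariate ARMA(1,1) example above (with the signs of Φ and Θ as reconstructed on the earlier slide) and comparing the wandering components with the stationary cointegrating combination:

# Simulate the bivariate ARMA(1,1) example and plot 0.5*X1 + X2.
set.seed(1)
n <- 1000
Phi   <- matrix(c(0.5, -0.25, -1.0, 0.5), nrow = 2)
Theta <- matrix(c(0.2, -0.1, -0.4, 0.2), nrow = 2)
A <- matrix(rnorm(2 * n), n, 2)
X <- matrix(0, n, 2)
for (t in 2:n)
  X[t, ] <- Phi %*% X[t - 1, ] + A[t, ] - Theta %*% A[t - 1, ]
op <- par(mfrow = c(2, 1))
ts.plot(X[, 1], X[, 2], col = 1:2, main = "X1, X2: unit-root nonstationary")
ts.plot(0.5 * X[, 1] + X[, 2], main = "0.5*X1 + X2: stationary")
par(op)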

Error Corrections

Unit-root nonstationarity problems can often be overcome by differencing. For a multivariate ARMA(p, q) process {X_t} that is cointegrated of order m, we seek some meaningful representation of ∆X_t = X_t - X_{t-1}. In a cointegrated time series, we represent the differenced time series as

∆X_t = C B^T X_{t-1} + Σ_{j=1}^{p-1} Φ*_j ∆X_{t-j} + A_t - Σ_{i=1}^{q} Θ_i A_{t-i},

where C and B are k × m full-rank matrices, the columns of B are the cointegrating vectors, and, for j = 1, ..., p-1,

Φ*_j = - Σ_{i=j+1}^{p} Φ_i.

In this representation, B^T X_{t-1} is stationary.

Error Correction Model (ECM) for a VAR(p) Process

Let {X_t} be an I(0) or I(1) VAR(p) process. Following the form of the representation for the cointegrated ARMA(p, q) process on the previous slide, we write the model in the form

∆X_t = µ_t + Π X_{t-1} + Σ_{j=1}^{p-1} Φ*_j ∆X_{t-j} + A_t,

where Π = C B^T. The term ΠX_{t-1} is called the error correction term, and the model is called the error correction model or ECM. The rank of Π determines the extent of cointegration. If rank(Π) = 0, there is no cointegration, and ∆X_t actually follows a VAR(p-1) process.

If rank(π) = k, that is the matrix is full-rank, there is no cointegration, and the process is just VAR(p). If < 0rank(Π) = m < k, there is cointegration of order m.

Johansen's Test

To test for cointegration in a nominal VAR(p) process is essentially to test the rank of Π. There is a likelihood ratio test for this, called Johansen's test. It is implemented in the R function ca.jo in the urca package.
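
A sketch of the test on the denmark data shipped with urca; the column names are an assumption to be checked with ?denmark.

# Johansen's trace test for the rank of Pi, and the implied ECM.
library(urca)
data(denmark)
jo <- ca.jo(denmark[, c("LRM", "LRY", "IBO", "IDE")],
            type = "trace", ecdet = "const", K = 2)
summary(jo)          # trace statistics vs. critical values for each rank
cajorls(jo, r = 1)   # estimated ECM under cointegrating rank m = 1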

Cointegrated Financial Time Series

The only way to receive returns uniformly above a risk-adjusted rate is by arbitrage, and in a fair and stable market there is no arbitrage. When cointegrated time series exist, however, there is often the possibility that the individual series temporarily do not reflect true value. An example of a trading strategy based on this idea is pairs trading.