VAR Models and Applications
Laurent Ferrara, University of Paris West
M2 EIPMC, Oct. 2016
Overview of the presentation
1. Vector Auto-Regressions: definition, estimation, testing
2. Impulse response functions (IRF): concept, generalized IRF
3. Forecasting: concept, variance decomposition
4. Extensions
5. Applications: US-EA GDP relationships
Vector Auto-Regressions: Short introduction
VAR models are widely used in economic analysis. While simple and easy to estimate, they conveniently capture the dynamics of multivariate systems. The popularity of VARs is mainly due to Sims's (1980) influential work. Reference textbooks: Hamilton (1994), Tsay (2014).
Vector Auto-Regressions: Notations
Let $y_t$ denote an $(n \times 1)$ vector of random variables. $y_t$ follows a $p$-th order Gaussian VAR if, for all $t$,
$$y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$$
where $c$ is an $n$-vector, the $\Phi_i$ are $n \times n$ matrices, and $\varepsilon_t \sim N(0, \Omega)$ with $\Omega$ a positive-definite covariance matrix. Consequently, the conditional distribution is
$$y_t \mid y_{t-1}, y_{t-2}, \dots, y_{t-p} \sim N(c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p},\ \Omega).$$
Vector Auto-Regressions: Example
$n = 2$, VAR(1) for $y_t = (y_{1,t}, y_{2,t})'$:
$$y_{1,t} = c_1 + \phi_{11} y_{1,t-1} + \phi_{12} y_{2,t-1} + \varepsilon_{1,t}$$
$$y_{2,t} = c_2 + \phi_{21} y_{1,t-1} + \phi_{22} y_{2,t-1} + \varepsilon_{2,t}$$
where $\varepsilon_{1,t} \sim GWN(\sigma^2_{\varepsilon_1})$, $\varepsilon_{2,t} \sim GWN(\sigma^2_{\varepsilon_2})$ and $\rho(\varepsilon_{1,t}, \varepsilon_{2,t}) = 0$. Here $\phi_{11}$ and $\phi_{22}$ are autoregressive coefficients, while $\phi_{21}$ and $\phi_{12}$ measure the linear dependence between $y_1$ and $y_2$.
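To make the example concrete, here is a minimal Python sketch simulating this bivariate VAR(1); all coefficient values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
c = np.array([0.1, 0.2])                  # intercepts c_1, c_2 (assumed)
Phi = np.array([[0.5, 0.2],               # phi_11, phi_12 (assumed)
                [0.1, 0.4]])              # phi_21, phi_22 (assumed)
T = 500

y = np.zeros((T, 2))
for t in range(1, T):
    eps = rng.normal(size=2)              # uncorrelated Gaussian white noise
    y[t] = c + Phi @ y[t - 1] + eps

# The sample mean should be close to the unconditional mean (I - Phi)^{-1} c
print(y.mean(axis=0))
print(np.linalg.solve(np.eye(2) - Phi, c))
```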
Vector Auto-Regressions: Notations
In lag operator notation: $\Phi(B) y_t = c + \varepsilon_t$, where $\Phi(B) = I_n - \Phi_1 B - \dots - \Phi_p B^p$ and $B$ is the backward operator. The VAR(p) is stationary if all the roots of $\det(I_n - \Phi_1 z - \dots - \Phi_p z^p) = 0$ lie outside the unit circle (i.e. have modulus greater than 1). If $y_t$ is stationary, the unconditional mean vector is
$$\mu = (I_n - \Phi_1 - \dots - \Phi_p)^{-1} c = (\Phi(1))^{-1} c.$$
Rewriting a VAR(p) as a VAR(1)
Define the stacked $np$-vector $Y_t = (y_t', y_{t-1}', \dots, y_{t-p+1}')'$. The VAR(p) process can then be rewritten as
$$Y_t = c^* + \Phi^* Y_{t-1} + \varepsilon_t^*$$
with $c^* = (c', 0')'$ and $\varepsilon_t^* = (\varepsilon_t', 0')'$, where $0$ is an $n(p-1)$-vector of zeros, and
$$\Phi^* = \begin{pmatrix} \Phi_1 & \Phi_2 & \dots & \Phi_{p-1} & \Phi_p \\ I_n & 0 & \dots & 0 & 0 \\ 0 & I_n & \dots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & I_n & 0 \end{pmatrix}$$
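A short sketch of the companion-form construction; the stationarity condition of the previous slide is equivalent to all eigenvalues of the companion matrix lying inside the unit circle.

```python
import numpy as np

def companion(Phis):
    """Stack VAR(p) coefficient matrices into the (np x np) companion matrix."""
    n = Phis[0].shape[0]
    p = len(Phis)
    top = np.hstack(Phis)                    # [Phi_1 ... Phi_p]
    bottom = np.eye(n * (p - 1), n * p)      # identity blocks, zeros in last n cols
    return np.vstack([top, bottom])

# Illustrative (assumed) coefficient matrices for n = 2, p = 2
Phis = [np.array([[0.5, 0.2], [0.1, 0.4]]),
        np.array([[0.1, 0.0], [0.0, 0.1]])]
F = companion(Phis)
# Stationarity check: all eigenvalues of the companion matrix inside the unit circle
print(np.all(np.abs(np.linalg.eigvals(F)) < 1))
```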
Rewriting a VAR(p) as an MA process
For the sake of simplicity, assume $y_t$ is a VAR(1) with $\mu = 0$:
$$y_t = \varepsilon_t + \phi_1 y_{t-1}$$
$$y_t = \varepsilon_t + \phi_1(\varepsilon_{t-1} + \phi_1 y_{t-2})$$
$$y_t = \varepsilon_t + \phi_1 \varepsilon_{t-1} + \phi_1^2(\varepsilon_{t-2} + \phi_1 y_{t-3})$$
$$\dots$$
i.e. the general form:
$$y_t = \varepsilon_t + \phi_1 \varepsilon_{t-1} + \phi_1^2 \varepsilon_{t-2} + \phi_1^3 \varepsilon_{t-3} + \dots = \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \psi_3 \varepsilon_{t-3} + \dots$$
Rewriting a VAR(p) as an MA process
If $y_t$ is a general VAR(p) process, we get
$$y_t = \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \psi_3 \varepsilon_{t-3} + \dots$$
with $\psi_i = \sum_{j=1}^{\min(i,p)} \phi_j \psi_{i-j}$ (and $\psi_0 = I_n$).
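The recursion for the MA matrices is straightforward to code; a minimal sketch, assuming the coefficient matrices are given as a Python list.

```python
import numpy as np

def ma_coefficients(Phis, H):
    """MA matrices of a VAR(p): Psi_0 = I, Psi_i = sum_{j=1}^{min(i,p)} Phi_j Psi_{i-j}."""
    n = Phis[0].shape[0]
    p = len(Phis)
    Psi = [np.eye(n)]
    for i in range(1, H + 1):
        Psi.append(sum(Phis[j - 1] @ Psi[i - j] for j in range(1, min(i, p) + 1)))
    return Psi
```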
Parameter estimation methods
Three main types of techniques for VAR models:
1. Least Squares (LS)
2. Maximum Likelihood (ML)
3. Bayesian methods
Under the multivariate normality assumption for the error terms, the ML estimates are asymptotically equivalent to the LS estimates (see Hamilton, 1994, for a proof).
VAR estimation: LS
Rewrite $y_t$, assumed to follow a VAR(p) model, for $t = p+1, \dots, T$:
$$y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$$
as
$$y_t = \Pi x_t + \varepsilon_t$$
where $x_t = (1, y_{t-1}', \dots, y_{t-p}')'$ is an $(np+1)$-vector and $\Pi = [c\ \Phi_1\ \Phi_2\ \dots\ \Phi_p]$ is an $n \times (np+1)$ matrix. In matrix notation: $Z = X\Pi' + E$, where $Z$ is a $(T-p) \times n$ matrix with $i$-th row $y_{p+i}'$, $X$ is a $(T-p) \times (np+1)$ design matrix with $i$-th row $x_{p+i}'$, and $E$ is a $(T-p) \times n$ matrix with $i$-th row $\varepsilon_{p+i}'$.
VAR estimation: LS
The LS estimate of $\Pi$, denoted $\hat\Pi_{LS}$, is given by
$$\hat\Pi_{LS}' = (X'X)^{-1} X'Z \quad (1)$$
or equivalently
$$\hat\Pi_{LS} = \left[\sum_t y_t x_t'\right] \left[\sum_t x_t x_t'\right]^{-1} \quad (2)$$
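A minimal numpy sketch of the LS estimator, following equation (1); the residual covariance computed at the end anticipates the ML estimate of $\Omega$ derived below (here using the effective sample size $T - p$).

```python
import numpy as np

def var_ols(y, p):
    """LS estimate of Pi = [c Phi_1 ... Phi_p] from a (T x n) array y."""
    T, n = y.shape
    # Design matrix: i-th row is x_t' = (1, y_{t-1}', ..., y_{t-p}')
    X = np.hstack([np.ones((T - p, 1))] +
                  [y[p - i:T - i] for i in range(1, p + 1)])
    Z = y[p:]                                            # i-th row is y_t'
    Pi = np.linalg.solve(X.T @ X, X.T @ Z).T             # n x (np+1)
    resid = Z - X @ Pi.T
    Omega = resid.T @ resid / (T - p)                    # ML-type covariance estimate
    return Pi, Omega
```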
Vector Auto-Regressions: MLE
With $\Pi = [c\ \Phi_1\ \Phi_2\ \dots\ \Phi_p]$ and $x_t = (1, y_{t-1}', y_{t-2}', \dots, y_{t-p}')'$ as before, the log-likelihood is given by
$$\mathcal{L}(Y_T; \theta) = -(Tn/2)\log(2\pi) + (T/2)\log|\Omega^{-1}| - \frac{1}{2}\sum_t \left[(y_t - \Pi x_t)'\,\Omega^{-1}\,(y_t - \Pi x_t)\right].$$
The MLE of $\Pi$, denoted $\hat\Pi$, is given by
$$\hat\Pi = \left[\sum_t y_t x_t'\right]\left[\sum_t x_t x_t'\right]^{-1}. \quad (3)$$
Vector Auto-Regressions: MLE
Proof of equation (3). Rewrite the last term of the log-likelihood:
$$\sum_t (y_t - \Pi x_t)'\,\Omega^{-1}\,(y_t - \Pi x_t) = \sum_t \left[(y_t - \hat\Pi x_t) + (\hat\Pi - \Pi)x_t\right]'\Omega^{-1}\left[(y_t - \hat\Pi x_t) + (\hat\Pi - \Pi)x_t\right]$$
$$= \sum_t \left[\hat\varepsilon_t + (\hat\Pi - \Pi)x_t\right]'\Omega^{-1}\left[\hat\varepsilon_t + (\hat\Pi - \Pi)x_t\right]$$
where the $j$-th element of the $(n \times 1)$ vector $\hat\varepsilon_t$ is the sample residual for observation $t$ from an OLS regression of $y_{jt}$ on $x_t$.
Vector Auto-Regressions: MLE
$$\sum_t (y_t - \Pi x_t)'\,\Omega^{-1}\,(y_t - \Pi x_t) = \sum_t \hat\varepsilon_t'\Omega^{-1}\hat\varepsilon_t + 2\sum_t \hat\varepsilon_t'\Omega^{-1}(\hat\Pi - \Pi)x_t + \sum_t x_t'(\hat\Pi - \Pi)'\Omega^{-1}(\hat\Pi - \Pi)x_t$$
Vector Auto-Regressions: MLE
Apply the trace operator to the second term (which is a scalar):
$$\sum_t \hat\varepsilon_t'\Omega^{-1}(\hat\Pi - \Pi)x_t = \mathrm{trace}\left(\sum_t \hat\varepsilon_t'\Omega^{-1}(\hat\Pi - \Pi)x_t\right) = \mathrm{trace}\left(\sum_t \Omega^{-1}(\hat\Pi - \Pi)x_t\hat\varepsilon_t'\right) = \mathrm{trace}\left(\Omega^{-1}(\hat\Pi - \Pi)\sum_t x_t\hat\varepsilon_t'\right)$$
Vector Auto-Regressions: MLE
Since, by construction, the sample residuals are orthogonal to the explanatory variables, this term is equal to zero. Setting $\tilde x_t = (\hat\Pi - \Pi)x_t$, we have
$$\sum_t (y_t - \Pi x_t)'\,\Omega^{-1}\,(y_t - \Pi x_t) = \sum_t \hat\varepsilon_t'\Omega^{-1}\hat\varepsilon_t + \sum_t \tilde x_t'\Omega^{-1}\tilde x_t.$$
Since $\Omega$ is a positive definite matrix, $\Omega^{-1}$ is as well. Consequently, the smallest value the last term can take is obtained when $\tilde x_t = 0$, i.e. when $\Pi = \hat\Pi$.
Vector Auto-Regressions: MLE
Given $\hat\Pi$, the MLE of $\Omega$ is the matrix $\hat\Omega$ that maximizes $\Omega \mapsto \mathcal{L}(Y_T; \hat\Pi, \Omega)$. Denoting by $\hat\varepsilon_t$ the estimated residual $y_t - \hat\Pi x_t$, we have
$$\mathcal{L}(Y_T; \hat\Pi, \Omega) = -(Tn/2)\log(2\pi) + (T/2)\log|\Omega^{-1}| - \frac{1}{2}\sum_t \hat\varepsilon_t'\Omega^{-1}\hat\varepsilon_t.$$
$\hat\Omega$ must be a symmetric positive definite matrix. Fortunately, it turns out that the unrestricted matrix maximizing the latter expression is symmetric positive definite. Indeed,
$$\frac{\partial \mathcal{L}}{\partial \Omega^{-1}} = \frac{T}{2}\Omega - \frac{1}{2}\sum_t \hat\varepsilon_t\hat\varepsilon_t' = 0 \;\Rightarrow\; \hat\Omega = \frac{1}{T}\sum_t \hat\varepsilon_t\hat\varepsilon_t'.$$
Vector Auto-Regressions: Likelihood-Ratio test
The simplicity of the VAR framework and the tractability of its MLE make various econometric tests convenient. We illustrate this here with the likelihood-ratio test. The maximum value achieved by the MLE is
$$\mathcal{L}(Y_T; \hat\Pi, \hat\Omega) = -(Tn/2)\log(2\pi) + (T/2)\log|\hat\Omega^{-1}| - \frac{1}{2}\sum_t \hat\varepsilon_t'\hat\Omega^{-1}\hat\varepsilon_t.$$
Vector Auto-Regressions: Likelihood-Ratio test
The last term is
$$\sum_t \hat\varepsilon_t'\hat\Omega^{-1}\hat\varepsilon_t = \mathrm{trace}\left(\sum_t \hat\varepsilon_t'\hat\Omega^{-1}\hat\varepsilon_t\right) = \mathrm{trace}\left(\hat\Omega^{-1}\sum_t \hat\varepsilon_t\hat\varepsilon_t'\right) = \mathrm{trace}\left(\hat\Omega^{-1}\,(T\hat\Omega)\right) = Tn.$$
Therefore
$$\mathcal{L}(Y_T; \hat\Pi, \hat\Omega) = -(Tn/2)\log(2\pi) + (T/2)\log|\hat\Omega^{-1}| - Tn/2,$$
which is easy to calculate.
Vector Auto-Regressions: Likelihood-Ratio test
For instance, assume that we want to test the null hypothesis that a set of variables follows a VAR($p_0$) against the alternative specification of $p_1$ lags (with $p_1 > p_0$). Let us denote by $\hat{\mathcal{L}}_0$ and $\hat{\mathcal{L}}_1$ the maximum log-likelihoods obtained with $p_0$ and $p_1$ lags respectively. Under the null hypothesis,
$$2\left(\hat{\mathcal{L}}_1 - \hat{\mathcal{L}}_0\right) = T\left(\log|\hat\Omega_1^{-1}| - \log|\hat\Omega_0^{-1}|\right)$$
asymptotically has a $\chi^2$ distribution with degrees of freedom equal to the number of restrictions imposed under $H_0$ (compared with $H_1$), i.e. $n^2(p_1 - p_0)$.
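A sketch of the likelihood-ratio lag test, reusing the var_ols helper above; note that both models must be fitted on the same effective sample.

```python
import numpy as np
from scipy import stats

def lr_lag_test(y, p0, p1):
    """LR test of VAR(p0) against VAR(p1), with p1 > p0 (sketch)."""
    _, Omega0 = var_ols(y[p1 - p0:], p0)     # drop extra obs so samples align
    _, Omega1 = var_ols(y, p1)
    T, n = y.shape[0] - p1, y.shape[1]       # effective sample size
    lr = T * (np.log(np.linalg.det(Omega0)) - np.log(np.linalg.det(Omega1)))
    df = n ** 2 * (p1 - p0)                  # restrictions under H0
    return lr, stats.chi2.sf(lr, df)         # statistic and p-value
```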
Vector Auto-Regressions: Criteria
Adding lags quickly consumes degrees of freedom: if the lag length is $p$, each of the $n$ equations contains $np$ coefficients plus the intercept term. Adding lags improves the in-sample fit but is likely to result in over-parameterization and hurt out-of-sample prediction performance. To select an appropriate lag length, the following criteria can be used (they have to be minimized):
$$AIC(p) = \log|\hat\Omega| + \frac{2}{T} N$$
$$SBIC(p) = \log|\hat\Omega| + \frac{\log T}{T} N$$
$$HQ(p) = \log|\hat\Omega| + \frac{2\log(\log T)}{T} N$$
where $N = n^2 p + n$ is the total number of estimated coefficients.
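In practice, lag selection is available off the shelf; a sketch with statsmodels, assuming `data` holds the $(T \times n)$ stationary series.

```python
from statsmodels.tsa.api import VAR

model = VAR(data)                           # `data`: (T x n) array or DataFrame
selection = model.select_order(maxlags=8)   # compares AIC, BIC, HQIC, FPE
print(selection.summary())
p = selection.aic                           # lag length minimizing AIC
results = model.fit(p)
```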
Vector Auto-Regressions: Granger Causality
Granger (1969) developed a method to systematically analyze the causal relationship among variables. The approach consists in determining whether the past values of $y_{1,t}$ can help to explain the current $y_{2,t}$. Let us denote three information sets:
$$I_{1,t} = \{y_{1,t}, y_{1,t-1}, \dots\},\quad I_{2,t} = \{y_{2,t}, y_{2,t-1}, \dots\},\quad I_t = \{y_{1,t}, y_{1,t-1}, \dots, y_{2,t}, y_{2,t-1}, \dots\}.$$
We say that $y_{1,t}$ Granger-causes $y_{2,t}$ if
$$E[y_{2,t} \mid I_{2,t-1}] \neq E[y_{2,t} \mid I_{t-1}].$$
Vector Auto-Regressions: Granger Causality
To get the intuition behind the testing procedure, consider the following bivariate VAR(p) process:
$$y_{1,t} = \Phi_{10} + \sum_{i=1}^p \Phi_{11}(i)\, y_{1,t-i} + \sum_{i=1}^p \Phi_{12}(i)\, y_{2,t-i} + u_{1,t}$$
$$y_{2,t} = \Phi_{20} + \sum_{i=1}^p \Phi_{21}(i)\, y_{1,t-i} + \sum_{i=1}^p \Phi_{22}(i)\, y_{2,t-i} + u_{2,t}.$$
Then $y_{1,t}$ does not Granger-cause $y_{2,t}$ if $\Phi_{21}(1) = \Phi_{21}(2) = \dots = \Phi_{21}(p) = 0$. Therefore the hypothesis test is
$$H_0: \Phi_{21}(1) = \Phi_{21}(2) = \dots = \Phi_{21}(p) = 0$$
$$H_A: \Phi_{21}(1) \neq 0 \text{ or } \Phi_{21}(2) \neq 0 \text{ or } \dots \text{ or } \Phi_{21}(p) \neq 0.$$
Vector Auto-Regressions: Granger Causality
Rejection of $H_0$ implies that some of the coefficients on the lagged $y_{1,t}$'s are statistically significant. This can be tested using the $F$-test or the asymptotic chi-square test. The $F$-statistic is
$$F = \frac{(RSS - USS)/p}{USS/(T - 2p - 1)}$$
where $RSS$ is the restricted residual sum of squares and $USS$ the unrestricted residual sum of squares. Under $H_0$, the $F$-statistic is distributed as $F(p, T - 2p - 1)$. In addition, $pF \rightarrow \chi^2(p)$.
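A sketch of the Granger-causality $F$-test with statsmodels, assuming `results` is the fitted VAR from the previous sketch; the column names 'y1' and 'y2' are hypothetical.

```python
# Test H0: 'y1' does not Granger-cause 'y2' (kind='f' gives the F-test)
gc = results.test_causality(caused='y2', causing='y1', kind='f')
print(gc.summary())
```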
Vector Auto-Regressions: Granger Causality See RATS example on the US-EA GDP growth relationships
Vector Auto-Regressions: Impulse responses
Objective: analyze the effect of a given shock on the endogenous variables. Consider the stationary VAR(p) system
$$y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t$$
and assume the system receives a shock at $t$: $\varepsilon_t = \delta$.
Definition: a standard $IRF(h, \delta)$ describes the effects of the shock at date $t+h$ compared to a zero shock $\varepsilon_t = 0$, assuming that $\varepsilon_{t+h} = 0$ for all $h > 0$. The Generalized IRF of Koop, Pesaran and Potter (1996) is
$$GIRF(h, \delta, F_{t-1}) = E\{y_{t+h} \mid \varepsilon_t = \delta;\ \varepsilon_{t+h} = 0,\ h > 0;\ F_{t-1}\} - E\{y_{t+h} \mid \varepsilon_{t+h} = 0,\ h \geq 0;\ F_{t-1}\}$$
Vector Auto-Regressions: Impulse responses
Example of a centered univariate AR(1): $x_t = \phi x_{t-1} + \varepsilon_t$. Assume $x_{t-1} = 0$, thus $x_t = \varepsilon_t = \delta$. Then
$$IRF(1, \delta) = E(x_{t+1} \mid \varepsilon_t = \delta, \varepsilon_{t+1} = 0, F_{t-1}) - E(x_{t+1} \mid \varepsilon_t = \varepsilon_{t+1} = 0, F_{t-1}) = \phi\delta$$
$$IRF(2, \delta) = \phi^2\delta,\ \dots,\ IRF(h, \delta) = \phi^h\delta.$$
Remark: the IRF is proportional to the size of the shock and independent of past history.
Vector Auto-Regressions: Impulse responses
Let us consider a stationary vector random variable $y_t$ with the following Wold decomposition:
$$y_t = \varepsilon_t + \sum_{j=1}^{\infty} \Psi_j \varepsilon_{t-j}.$$
The $h$-th impulse response of the shock $\varepsilon_t = \delta$ on $y_t, y_{t+1}, \dots$ is given by $\Psi_h \delta$ and vanishes as $h \rightarrow \infty$. Formally, the impulse response of the shock $\varepsilon_t$ on the variable $y$ is defined as
$$\frac{\partial y_{t+h}}{\partial \varepsilon_t'} = \Psi_h.$$
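The (non-orthogonalized) IRF can thus be read directly off the MA matrices; a sketch reusing the ma_coefficients helper above, with an assumed unit shock.

```python
import numpy as np

Psi = ma_coefficients(Phis, H=12)
delta = np.array([1.0, 0.0])                 # assumed: unit shock to variable 1
irf = np.array([P @ delta for P in Psi])     # responses at horizons 0..12

# With a fitted statsmodels VAR (see the lag-selection sketch), equivalently:
# results.irf(12).plot(orth=False)
```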
Vector Auto-Regressions: Impulse responses
[Figure: dynamics of $y_t, y_{t+1}, y_{t+2}, \dots$ when $\varepsilon_t = 1$, $\varepsilon_{t+1} = 0$, $\varepsilon_{t+2} = 0, \dots$]
Vector Auto-Regressions: Unconditional variance
The unconditional variance-covariance matrix of $y_t$ is
$$Var(y) = \lim_{t \to \infty} E_0\left((y_t - \bar y_t)(y_t - \bar y_t)'\right)$$
where $\bar y_t$ denotes the unconditional mean of $y$. Denoting by $y_t^*$ the stacked vector $(y_t', y_{t-1}', \dots, y_{t-p+1}')'$, we have the companion form of the earlier slide:
$$y_t^* = c^* + \Phi^* y_{t-1}^* + \varepsilon_t^*$$
with
$$c^* = \begin{pmatrix} c \\ 0 \\ \vdots \\ 0 \end{pmatrix},\quad \Phi^* = \begin{pmatrix} \Phi_1 & \Phi_2 & \dots & \Phi_p \\ I_n & 0 & \dots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \dots & I_n & 0 \end{pmatrix},\quad \varepsilon_t^* = \begin{pmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}$$
Vector Auto-Regressions: Unconditional variance
It is then easy to get the Wold decomposition of $y_t^*$:
$$y_t^* = c^* + \Phi^*\left(c^* + \Phi^* y_{t-2}^* + \varepsilon_{t-1}^*\right) + \varepsilon_t^* = \dots = c^* + \varepsilon_t^* + \Phi^*(c^* + \varepsilon_{t-1}^*) + \dots + \Phi^{*k}(c^* + \varepsilon_{t-k}^*) + \dots$$
The $\varepsilon_t^*$'s being i.i.d. with covariance $\Omega^*$, we have
$$Var(y^*) = \Omega^* + \Phi^*\Omega^*\Phi^{*\prime} + \dots + \Phi^{*k}\Omega^*\Phi^{*k\prime} + \dots$$
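Rather than summing the series, $Var(y^*)$ can be obtained in closed form as the solution of the discrete Lyapunov equation $V = \Phi^* V \Phi^{*\prime} + \Omega^*$; a sketch reusing the companion helper above, with an assumed innovation covariance.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

n, p = 2, 2
F = companion(Phis)                          # companion matrix Phi*
Omega_star = np.zeros((n * p, n * p))
Omega_star[:n, :n] = np.eye(n)               # assumed Omega in the top-left block
V = solve_discrete_lyapunov(F, Omega_star)   # solves V = F V F' + Omega*
print(V[:n, :n])                             # Var(y): top-left n x n block
```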
Vector Auto-Regressions: Forecasting
If the parameters are known, the best linear 1-step-ahead forecast (minimizing the MSE) is the conditional expectation:
$$\hat y_t(1) = c + \Phi_1 y_t + \dots + \Phi_p y_{t-p+1}.$$
For $h > 1$, iterating leads to
$$\hat y_t(h) = c + \Phi_1 \hat y_t(h-1) + \dots + \Phi_p \hat y_t(h-p)$$
where $\hat y_t(j) = y_{t+j}$ for $j \leq 0$.
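A sketch of the iterative forecast recursion; statsmodels provides the same via `results.forecast(y[-p:], steps=H)`.

```python
import numpy as np

def var_forecast(y, c, Phis, H):
    """Iterative h-step forecasts; y is (T x n), Phis a list of n x n matrices."""
    p = len(Phis)
    path = [y[-p + i] for i in range(p)]        # y_{t-p+1}, ..., y_t (oldest first)
    for _ in range(H):
        # hat y_t(h) = c + Phi_1 * previous value + ... + Phi_p * p-th previous
        nxt = c + sum(Phis[i] @ path[-1 - i] for i in range(p))
        path.append(nxt)
    return np.array(path[p:])                   # the H forecasts
```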
Forecast variance decomposition
The $h$-step-ahead forecast error in $y_t$ is defined as the difference between the actual value of $y_{t+h}$ and its VAR-based forecast $\hat y_t(h)$. Using the infinite MA representation
$$y_t = \sum_{j=0}^{\infty} \Psi_j \varepsilon_{t-j},$$
we obtain
$$y_{t+h} - \hat y_t(h) = \sum_{j=0}^{h-1} \Psi_j \varepsilon_{t+h-j}.$$
Forecast variance decomposition
It is straightforward to compute the forecast error variance of a variable in $y_t$ for the $h$-step forecast horizon, as well as the corresponding shares of individual innovations (Lütkepohl, 2015). In applications, $h$ often corresponds to the business-cycle horizon (between 1.5 and 8 years):
$$V(y_{t+h} - \hat y_t(h)) = \sum_{j=0}^{h-1} \Psi_j \Omega \Psi_j'.$$
With orthogonalized shocks, the contribution of the $i$-th shock to the forecast error variance of the $k$-th variable is
$$\sum_{j=0}^{h-1} \{e_k' \Psi_j P e_i\}^2$$
where $P$ is the Cholesky factor of $\Omega$ (so that $\Omega = PP'$) and $e_i$ is the $i$-th column of $I_n$.
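A sketch of the forecast error variance decomposition with statsmodels; orthogonalization is done via the Cholesky factor of $\hat\Omega$, so the shares depend on the chosen variable ordering.

```python
# `results` is the fitted VAR from the earlier sketches
fevd = results.fevd(20)     # variance shares at horizons 1..20
print(fevd.summary())
fevd.plot()
```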
Vector Auto-Regressions: Applications See RATS example on the US-EA GDP growth relationships
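The RATS example itself is not reproduced in these slides; as a rough Python equivalent of the workflow, here is a sketch in which the file name and column names are hypothetical.

```python
import pandas as pd
from statsmodels.tsa.api import VAR

# Hypothetical file with US and euro-area GDP growth series
df = pd.read_csv('us_ea_gdp_growth.csv', index_col=0, parse_dates=True)

model = VAR(df[['us', 'ea']])
results = model.fit(model.select_order(8).aic)                 # lag selection + fit
print(results.summary())
print(results.test_causality('ea', 'us', kind='f').summary())  # Granger causality
results.irf(12).plot()                                         # impulse responses
results.fevd(12).plot()                                        # variance decomposition
```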
Extension: VAR-X
Let $y_t$ denote an $(n \times 1)$ vector of random variables. $y_t$ is a Gaussian VAR(p) with exogenous variables $x_t = (x_t^1, \dots, x_t^m)'$ of dimension $m$ if, for all $t$,
$$y_t = \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + C x_t + \varepsilon_t$$
where $\varepsilon_t \sim N(0, \Omega)$ and
$$C = \begin{pmatrix} c_{11} & \dots & c_{1m} \\ \vdots & c_{ij} & \vdots \\ c_{n1} & \dots & c_{nm} \end{pmatrix}$$
VAR-X: Example
$n = 2$, VAR-X(1) for $y_t = (y_{1,t}, y_{2,t})'$ with $m = 2$ exogenous variables:
$$y_{1,t} = \phi_{11} y_{1,t-1} + \phi_{12} y_{2,t-1} + c_{11} x_{t-1}^1 + c_{12} x_{t-1}^2 + \varepsilon_{1,t}$$
$$y_{2,t} = \phi_{21} y_{1,t-1} + \phi_{22} y_{2,t-1} + c_{21} x_{t-1}^1 + c_{22} x_{t-1}^2 + \varepsilon_{2,t}.$$
Vector Auto-Regressions: Extensions
- VAR in levels
- Bayesian VAR
- Non-linear VAR (Smooth-Transition VAR, Markov-Switching VAR)
- Factor-Augmented VAR