A Primer on Vector Autoregressions

A Primer on Vector Autoregressions Ambrogio Cesa-Bianchi VAR models 1

[DISCLAIMER] These notes are meant to provide intuition on the basic mechanisms of VARs As such, most of the material covered here is treated in a very informal way If you crave a formal treatment of these topics, you should stop here and buy a copy of Hamilton s Time Series Analysis VAR models VAR Representation 2

VARs & Macro-econometricians job According to a well-known paper by Stock & Watson (21, JEP) macroeconometricians (would like to) do four things 1. Describe and summarize macroeconomic time series 2. Make forecasts 3. Recover the true structure of the macroeconomy from the data 4. Advise macroeconomic policymakers Vector autoregressive models are a statistical tool to address these tasks VAR models VAR Representation 3

What can we do with vector autoregressive models? 3 variables: real GDP growth ( y), inflation (π) and the policy rate (i) A VAR can help us answering the following questions 1. What is the dynamic behaviour of these variables? How do these variables interact? 2. What is the profile of GDP conditional on a specific future path for the policy rate? 3. What is the effect of a monetary policy shock on GDP and inflation? 4. What has been the contribution of monetary policy shocks to the behaviour of GDP over time? VAR models VAR Representation 4

What is a Vector Autoregression (VAR)? Given, for example, a (3 1) vector of time series x t where y 1 y 2... y T x t = π 1 π 2... π T r 1 r 2... r T A stationary structural VAR of order 1 is Ax t = Bx t 1 + ε t for t = 1,..., T A and B are (3 3) matrices of coefficients ε t is an (3 1) vector of unobservable zero mean white noise processes VAR models VAR Representation 5

Three different ways of writing the same thing There are different ways to represent the VAR(1) Ax t = Bx t 1 + ε t For example, we can also write it: In matrix form a 11 a 12 a 13 y t b 11 b 12 b 13 a 21 a 22 a 23 π t = b 21 b 22 b 23 a 31 a 31 a 33 r t b 31 b 32 b 33 y t 1 π t 1 r t 1 + As a system of linear equation a 11 y t + a 12 π t + a 13 r t = b 11 y t 1 + b 12 π t 1 + b 13 r t 1 + ε yt a 21 y t + a 22 π t + a 13 r t = b 21 y t 1 + b 22 π t 1 + b 23 r t 1 + ε πt a 31 y t + a 32 π t + a 33 r t = b 31 y t 1 + b 32 π t 1 + b 33 r t 1 + ε rt ε yt ε πt ε rt VAR models VAR Representation 6

The structural innovations We defined ε t as a vector of unobservable zero mean white noise processes. What does it mean? This simply means that they are serially uncorrelated and independent of each other In other words or VCV(ε t ) = ε t = (ε yt, ε πt, ε rt) N (, I) 1 1 1 and CORR(ε t ) = 1 1 1 VAR models VAR Representation 7

[Back to basics] What is a variance-covariance matrix? The formula for the variance of a univariate time series x = [x 1, x 2,..., x T ] is VAR = T (x t x) 2 N t= = T (x t x) (x t x) N t= If we have a bivariate time series [ x1 x x t = 2... x T y 1 y 2... y T ] the formula becomes [ VCV = T t= T t= (x t x)(x t x) N T (x t x)(y t ȳ) t= N (y t ȳ)(x t x) N T (y t ȳ)(y t ȳ) t= N ] [ = VAR(x) COV(x, y) COV(x, y) VAR(y) ] VAR models VAR Representation 8

The general form of the stationary structural VAR(p) model The basic VAR(1) model may be too poor to sufficiently summarize the main characteristics of the data Deterministic terms (such as time trend or seasonal dummy variables) Exogenous variables (such as the price of oil) The general form of the VAR(p) model with deterministic terms (Z t ) and exogenous variables (W t ) is given by Ax t = B 1 x t 1 + B 2 x t 2 +... + B p x t p + ΛZ t + ΨW t + ε t VAR models VAR Representation 9

Why is it called structural VAR? The equations of a structural VAR define the true structure of the economy The fact that ε t = (ε yt, ε πt, ε rt ) N (, I) implies that we can interpret ε t as structural shocks For example we could interpret ε yt as an aggregate shock ε πt as a cost-push shock ε rt as a monetary policy shock VAR models VAR Representation 1

Why is it called stationary VAR? One of the main assumptions of standard VARs is stationarity of the data A stochastic process is said covariance stationary if its first and second moments, E (x) and VCV (x) respectively, exist and are constant over time 11 1 Real GDP 8 6 4 Real GDP (Detrended).3.2 Real GDP (QoQ) 9 2.1 8 7 2 4.1 6 6.2 5 85 92 99 6 13 8.3 85 92 99 6 13 85 92 99 6 13 VAR models VAR Representation 11

Structural VARs potentially answers many interesting questions For example in our VAR(1) a 11 a 12 a 13 y t b 11 b 12 b 13 a 21 a 22 a 23 π t = b 21 b 22 b 23 a 31 a 32 a 33 r t b 31 b 32 b 33 y t 1 π t 1 r t 1 + ε yt ε πt ε rt a 31 is the impact multiplier of monetary policy shocks on GDP a 32 is the impact multiplier of monetary policy shocks on inflation If we simulate the model, we can evaluate the time profile of a monetary policy shock on GDP We can add additional variables (and equations) and simulate other shocks: credit supply, oil price, QE,... VAR models VAR Representation 12

However... the estimation of structural VARs is problematic The equations of Ax t = Bx t 1 + ε t cannot be estimated with OLS because they violate one important assumption = the regressor cannot be correlated with the error term To see that, take the GDP equation a 11 y t + a 12 π t + a 13 r t = b 11 y t 1 + b 12 π t 1 + b 13 r t 1 + ε yt and compute COV[π t, ε yt ] OLS estimation of Ax t = Bx t 1 + ε t would produce inconsistent estimates of the parameters, impulse responses, etc VAR models VAR Representation 13

How to solve this problem? In the GDP equation a 11 y t + a 12 π t + a 13 r t = b 11 y t 1 + b 12 π t 1 + b 13 r t 1 + ε yt the terms a 12 π t and a 13 r t are the ones generating problems for OLS estimation This endogeneity problem disappears if we remove the contemporaneous dependence of y t on the other endogenous variables More in general the A matrix is problematic (since it includes all the contemporaneous relation among the endogenous variables) VAR models VAR Representation 14

We can solve this problem by simply pre-multiplying the VAR by A 1 A 1 Ax t = A 1 Bx t 1 + A 1 ε t x t = Fx t 1 + A 1 ε t x t = Fx t 1 + u t That is, we moved the contemporaneous dependence of the endogenous variables (which is given by A) into the modified error terms u t This implies that now CORR(u t ) = I VAR models VAR Representation 15

[Back to basics] The inverse of a matrix The inverse of a 2 2 matrix [ ] a b X = c d X 1 = 1 ad bc [ d ] b c a This implies that u t = A 1 ε t can be computed as where = a 11 a 22 a 12 a 21 u 1t = a 22ε 1t a 21 ε 2t u 2t = a 21ε 1t +a 11 ε 2t VAR models VAR Representation 16

The reduced-form VAR This alternative formulation of the VAR x t = Fx t 1 + u t is called the reduced-form representation In matrix form y t π t = r t where f 11 f 12 f 13 f 21 f 22 f 23 f 31 f 32 f 33 y t 1 π t 1 r t 1 u t N (, Σ u ) and CORR(u t ) = + u yt u πt u it 1 ρ 12 ρ 13 1 ρ 23 1 VAR models VAR Representation 17

VAR Estimation VAR models VAR Estimation 18

We can now estimate the VAR with some real data UK quarterly data from 1985.I to 213.III VAR(1) with a constant x t = c + Fx t 1 + u t 1 Real GDP (Annualized QoQ, %) 1 Inflation (Annualized QoQ, %) 15 Policy rate (%) 5 5 1 5 5 1 15 85 92 99 6 13 5 85 92 99 6 13 85 92 99 6 13 VAR models VAR Estimation 19

OLS estimation Typical VAR output Matrix of coefficients F Real GDP GDP Deflator Policy Rate c 1.14 1.19 -.11 Real GDP(-1).61 -.7.6 GDP Deflator(-1) -.9.2.3 Policy Rate(-1).1.3.96 Correlation between reduced-form residuals Real GDP GDP Deflator Policy Rate Real GDP 1. -.178.373 GDP Deflator 1..137 Policy Rate 1. VAR models VAR Estimation 2

Debunking the typical VAR output The constant is not the mean nor the long-run equilibrium of a variable In our example, the mean of the policy rate is not negative Real GDP GDP Deflator Policy Rate c 1.14 1.19 -.11 The correlation of the residuals reflects the contemporaneous relation between our variables In our example GDP growth and inflation are contemporaneously negatively correlated Real GDP GDP Deflator Policy Rate Real GDP 1. -.178.373 VAR models VAR Estimation 21

Model checking & tuning We do not cover this in detail but before interpreting the VAR results you should check a number of assumptions Loosely speaking we need to check that the reduced-form residuals are Normally distributed Not autocorrelated Not heteroskedastic (i.e., have constant variance)... and that the VAR is stationary (we ll see in a second what it means) VAR models VAR Estimation 22

Model checking: why the residuals? The VAR believes that x N (µ, σ) = y N (µ y, σ y ) π N (µ π, σ π ) i N (µ i, σ i ) If the data that we feed into the VAR has not these features, the residuals will inherit them Note that Mean (µ) and variance (σ) are constant = the data have to be stationary! The mean (µ) and variance (σ) are not known a priori At each point in time x t does not generally coincide with µ because of (i) shocks hitting at t and (ii) shocks that hit in the past and that are slowly dying out The average of x t over a given sample does not necessarily coincide with µ VAR models VAR Estimation 23

Can we recover the mean of our endogenous variables? Yes. It is given by µ =E[x t ] E[x t ] = E[c + Fx t 1 + u t ] = = E[c + F(c + Fx t 2 +u t 1 ) + u t ] = E[c + Fc + F 2 x t 2 +Fu t 1 +u t ] = = E[c + Fc + F 2 c +... + F t 2 c + F t 1 x 1 +F t 2 u 2 +... + F 2 u t 2 +Fu t 1 +u t ] = =... = [ ] = E F j c + F t 1 x 1 + t 2 F j u t j j= t 2 j= If VAR is stationary, from the last expression we get E[x t ] = (I F) 1 c = µ VAR models VAR Estimation 24

[Back to basics] Geometric series Consider the following sum T Y j j= When T we have: if and only if: eig(y) < 1 when Y is a matrix Y < 1 when Y is a number (1 + Y + Y 2 +... + Y ) = (I Y) 1 Therefore, for T large enough we have [ ] t 2 E F j c + F t 1 t 2 [ ] x + F j u t j = E (I F) 1 c j= j= + F t 1 x }{{} 1 + E [ t 2 j= F j u t j ] } {{ } VAR models VAR Estimation 25

Stationary VAR, stationary data A VAR is stationary (or stable) when eig(f) < 1 In that case, the VAR thinks that (I F) 1 c is the unconditional mean of the stochastic processes governing our variables VAR models stationary data GDP level clearly does not have a well defined mean GDP growth does! VAR models VAR Estimation 26

Unconditional mean in practice The unconditional mean is an interesting element of VAR analysis but it is often ignored From the estimated VAR, recover both c and F Compute the unconditional mean as (I F) 1 c = 2.32.28 1.89 1.34 1.24 1.58 4.89.67 33.87 1.14 1.19.11 = 2.51 1.8 2.49 = µ y µ π µ i VAR models VAR Estimation 27

Why stationarity of the data is important In absence of shocks, each variable will converge to its unconditional mean 4.5 For example, start from a point in time T where: y T = 2% π T = 2% r T = 4% 4. 3.5 3. 2.5 2. 1.5 T T+8 T+16 T+24 T+32 T+4 T+48 T+56 T+64 T+72 T+8 T+88 T+96 T+14 T+112 T+12 T+128 T+136 T+144 T+152 T+16 T+168 Real GDP GDP Deflator Policy Rate VAR models VAR Estimation 28

In light of these considerations: is our data sensible? UK quarterly data from 1985.I to 213.III 1 Real GDP (Annualized QoQ, %) 1 Inflation (Annualized QoQ, %) 15 Policy rate (%) 5 5 1 5 5 1 15 85 92 99 6 13 5 85 92 99 6 13 85 92 99 6 13 VAR models VAR Estimation 29

Some VAR limitations VARs are linear models of stationary data... but often macro data is non-linear (crisis periods) is non-stationary (trends, breaks, etc) displays time-varying variance (Great Moderation Vs Great Recession) All these elements have to be taken into account when analyzing the output from VAR analysis (especially when estimated over the sample period we use these days which features all the above) VAR models VAR Estimation 3

Back to our estimated VAR Real GDP GDP Deflator Policy Rate c 1.14 1.19 -.11 Real GDP(-1).61 -.7.6 GDP Deflator(-1) -.9.2.3 Policy Rate(-1).1.3.96 What can we do with it? What are the dynamic properties of these variables? [Look at lagged coefficients] How do these variables interact? [Look at cross-variable coefficients] What will be inflation tomorrow? [???] VAR models VAR Estimation 31

Forecasting Forecasting is one of the main objectives of multivariate time series analysis The best linear predictor (in terms of minimum mean squared error) of x T+1 based on information available at time T (today) is x f T+1 = Fx T For example, if today (T) we have y T = 2% x T π T = 2% = x f T+1 = Fx T r T = 4% y T+1 = 2.15% π T+1 = 2.31% r T+1 = 3.94% Note that we can also construct conditional forecasts by specifying an exogenous path for a variable (the policy rate for example) VAR models VAR Estimation 32

The Identification Problem VAR models The identification problem 33

Reduced-form VARs do not tell us anything about the structure of the economy We cannot interpret the reduced-form error terms (u) as structural shocks How do we interpret a movement in u r? Since it is a linear combination of ε r, ε y, and ε π it is hard to know what is the nature of the shock Is it a shock to aggregate demand that induces policy to move the interest rate? Or is it a monetary policy shock? This is the very nature of the identification problem To answer this question we need to get back to the structural representation (where the error terms are uncorrelated) VAR models The identification problem 35

From the reduced-form back to the structural form We normally estimate x t = Fx t 1 + u t and Σ u But our ultimate objective is to recover Ax t = Bx t 1 + ε t Does not sound too difficult... We know that F = A 1 B u t = A 1 ε t VAR models The identification problem 36

We also know that Σ u = E [ [ u t u t ] ( ) ] ( = E A 1 ε A 1 ε = A 1 Σ ε A 1) = A 1 A 1 since Σ ε = I In other words if we pin down A 1 we are done, since we can recover ε t = Au t B = AF and we would know the structural representation of the economy You can think of the idnetified Ax t = Bx t 1 + ε t model as the 3 equation New Keynesian model VAR models The identification problem 37

[Back to basics] The variance covariance matrix of residuals Why Σ u = E [u t u t ]? The formula for the variance of a univariate time series is x = [x, x 1,..., x T ] VAR = T (x t x) 2 N t= But the residuals (both the u t and the ε t ) have zero mean which implies that formula would be T x VAR = 2 t N t= VAR models The identification problem 38

[Back to basics] The variance covariance matrix of residuals In a bivariate VAR we would have [ u t u t u 1 = 1 u 1... u 1 ] T u 2 1 u 2 2... u 2 T u 1 u 2 1 u 1 2 u 2 2...... u 1 T u2 T = [ T ( ) t= u 1 2 t T ( t= u 1 t u 2 ) ] t T ( ) t= u 2 2 t And therefore E [ [ [ u t u t ] VAR u 1 ] COV [ u 1 u 2] = VAR [ u 1] ] VAR models The identification problem 39

The identification problem Identification problem boils down to pinning down A 1 If we write Σ u = A 1 A 1 in matrices Σ u }{{} σ 2 u1 σ 2 u1,u2 σ 2 u1,u3 σ 2 u2 σ 2 u2,u3 σ 2 u3 = A} 1 {{ A 1 } we can derive a system of equations a 11 a 12 a 13 a 21 a 22 a 23 a 31 a 32 a 33 1 a 11 a 12 a 13 a 21 a 22 a 23 a 31 a 32 a 33 However, there are 9 unknowns (the elements of A 1 A 1 ) but only 6 equations (because the variance-covariance matrix is symmetric) The system is not identified! 1 VAR models The identification problem 4

Common Identification Schemes VAR models Common identification schemes 41

Common identification schemes Zero short-run restrictions (also known as recursive, Cholesky, orthogonal) Zero long-run restrictions (also known as Blanchard-Quah) Theory-based restrictions Sign restrictions VAR models Common identification schemes 42

Zero short-run restrictions Assume that A is lower triangular a 11 y t b 11 b 12 b 13 a 21 a 22 π t = b 21 b 22 b 23 a 31 a 32 a 33 r t b 31 b 32 b 33 Or in other words that y t 1 π t 1 r t 1 ε yt affects contemporaneously all variables, namely y t, π t and r t ε πt affects contemporaneously only π t and r t, but no y t ε rt affects contemporaneously only r t We now have 6 unknowns and 6 equations! + ε yt ε πt ε rt VAR models Common identification schemes: zero short-run restrictions 43

To see what are the implications of our assumptions on A, first remember that the inverse of a lower triangular matrix is also lower triangular [ ] a11 1 [ ] ã11 a 21 a 22 = ã 21 ã 22 a 31 a 32 a 33 ã 31 ã 32 ã 33 Now pre-multiply the VAR by A 1 [ ] [ ] [ ] yt f11 f 12 f 13 yt 1 π t = f 21 f 22 f 23 + r t f 31 f 32 f 33 π t 1 r t 1 [ ã11 ã 21 ã 22 ã 31 ã 32 ã 33 ] [ ] ε yt ε πt ε rt Which implies that y t π t r t =... + ã 11 ε yt =... + ã 21 ε yt + ã 22 ε πt =... + ã 31 ε yt + ã 32 ε πt + ã 33 ε rt VAR models Common identification schemes: zero short-run restrictions 44

We normally implement this identification scheme via a Cholesky decomposition of Σ u Σ u = P P where P is lower triangular Note that Σ u = P P but also Σ u = A 1 A 1 and that A is lower triangular Then it musty follow that P = A 1 = Identification! VAR models Common identification schemes: zero short-run restrictions 45

[Back to basics] Cholesky decomposition of a matrix Don t be scared of Cholesky decomposition! It s a kind of square root of a matrix As in Excel you type sqrt() in Matlab you type chol() A symmetric and positive definite matrix X can be decomposed as: X = P P where P is an upper triangular matrix (and therefore P is lower triangular) The formula is [ ] a b X = b c a a b P = c b2 a VAR models Common identification schemes: zero short-run restrictions 46

Zero long-run restrictions Re-write the VAR as x t = Fx t 1 + A 1 ε t If a shock hits in t, its cumulative (long run) impact on x t would be x t,t+ = A 1 ε t +FA 1 ε t +F 2 A 1 ε t +... + F A 1 ε t We can rewrite x t,t+ = F j A 1 ε t = (I F) 1 A 1 ε t = Dε t j= where D is the cumulative effect of the shock ε t from time t to VAR models Common identification schemes: zero long-run restrictions 47

What is the intuition for D? y t,t+ π t,t+ i t,t+ = d 11 d 12 d 13 d 21 d 22 d 23 d 31 d 32 d 33 ε yt ε πt ε rt Take the first equation: y t,t+ = d 11 ε yt + d 12 ε πt + d 13 ε rt d 13 represents the cumulative impact of a monetary policy shock (hitting in t) on the level GDP in the long-run If you believe in the neutrality of money you would expect d 13 = VAR models Common identification schemes: zero long-run restrictions 48

To achieve identification note that DD = (I F) 1 A 1 A 1 (I F) 1 = (I F) 1 Σ u (I F) 1 Note also that The right-hand side of the above equation is known Both DD and (I F) 1 Σ u (I F) 1 are symmetric matrices There exists an upper triangular matrix P such that P P = (I F) 1 Σ u (I F) 1 Therefore, if we assume that D is lower triangular, it must be that D = P Finally A 1 = (I F) D = Identification! VAR models Common identification schemes: zero long-run restrictions 49

Sign restrictions In the zero short-run restriction identification we used the fact that Σ u = A 1 A 1 and Σ u = P P where the lower triangular P matrix is the Cholesky decomposition of Σ u For a given random orthonormal matrix (i.e., such that S S = I) we have that Σ u = A 1 A 1 = P S SP =P P where P is generally not lower triangular anymore VAR models Common identification schemes: sign restrictions 5

A 1 = P is clearly a valid solution to the identification problem But S is a random matrix... is the solution A 1 = P plausible? Identification is achieved by checking whether the impulse responses implied by S satisfy a set of a priori (and possibly theory-driven) sign restrictions We can draw as many S as we want and construct a distribution of the solutions that satisfy the sign restrictions VAR models Common identification schemes: sign restrictions 51

Sign restriction in steps 1. Draw a random orthonormal matrix S 2. Compute A 1 = P S where P is the Cholesky decomposition of the reduced form residuals Σ u 3. Compute the impulse response associated with A 1 4. Are the sign restrictions satisfied? 4.1 Yes. Store the impulse response 4.2 No. Discard the impulse response 5. Perform N replications and report the median impulse response (and its confidence intervals) VAR models Common identification schemes: sign restrictions 52

Structural Dynamic Analysis VAR models Structural Dynamic Analysis 53

Why do we need identification? According to Stock & Watson s list so far we have done 1. Describe and summarize macroeconomic time series 2. Make forecasts 3. Recover the true structure of the macroeconomy from the data How about 1. Advise macroeconomic policymakers (A good) identification allows us to address the last point This is normally done by means of Impulse responses, Forecast error variance decompositions, and Historical decompositions VAR models Structural Dynamic Analysis 54

Impulse response functions Impulse response functions (IR) answer the question What is the response of current and future values of each of the variables to a one-unit increase in the current value of one of the structural errors, assuming that this error returns to zero in subsequent periods and that all other errors are equal to zero The implied thought experiment of changing one error while holding the others constant makes sense only when the errors are uncorrelated across equations VAR models Structural Dynamic Analysis 55

How to compute impulse response functions As an example, we compute the IR for a bivariate VAR x t = ( x 1t, ) x 2t Define a vector of exogenous impulses (s τ ) that we want to to impose to the structural errors of the system Time (τ) 1 2 h Impulse to ε 1 (s 1,τ ) s 1,1 = 1 s 1,2 = s 1,h = Impulse to ε 2 (s 2,τ ) s 2,1 = s 2,2 = s 2,h = VAR models Structural Dynamic Analysis 56

We can use the following hybrid representation to compute the IR x t = Fx t 1 + A 1 s t, The impulse response IR τ is given by { IR1 = A 1 s 1, IR τ = F IR τ 1, for τ = 2,..., h. VAR models Structural Dynamic Analysis 57

Forecast error variance decompositions Forecast error variance decompositions (V D) answer the question What portion of the variance of the forecast error in predicting x i,t+h is due to the structural shock ε i? Provide information about the relative importance of each structural shock in affecting the variables in the VAR VAR models Structural Dynamic Analysis 58

How to compute forecast error variance decompositions As an example, we compute the VD for the 1-step ahead forecast error in a bivariate VAR x t = ( x 1t, ) x 2t First, let s define the 1-step ahead forecast error ξ T+1 = x T+1 x f T+1 = x T+1 Fx T It follows that ξ T+1 = u T+1 = A 1 ε T+1 Forecast error is the yet unobserved realization of the shocks VAR models Structural Dynamic Analysis 59

In a bivariate VAR we have [ ] [ ] [ ξ 1,T+1 ã = 11 ã 12 ξ 2,T+1 ã 21 ã 22 where ã are the elements of A 1 Or ] ε 1,T+1 ε 2,T+1 { ξ1,t+1 = ã 11 ε 1,T+1 + ã 12 ε 2,T+1 ξ 2,T+1 = ã 21 ε 1,T+1 + ã 22 ε 2,T+1 What is the variance of the forecast error? VAR ( ξ 1,T+1 ) = ã 2 11 VAR (ε 1,T+1 ) + ã 2 12 VAR (ε 2,T+1) = ã 2 11 + ã2 12 VAR ( ξ 2,T+1 ) = ã 2 21 VAR (ε 1,T+1 ) + ã 2 22 VAR (ε 2,T+1) = ã 2 21 + ã2 22 VAR models Structural Dynamic Analysis 6

[Back to basics] Basic properties of the variance If X is a random variable X and a is a constant VAR (X + a) = VAR (X) VAR (ax) = a 2 VAR (X) If Y is a random variable and b is a constant VAR (ax+by) = a 2 VAR (X) + b 2 VAR (Y) + 2abCOV (X, Y) Since the structural errors are independent, it follows that COV (ε 1, ε 2 ) = VAR models Structural Dynamic Analysis 61

Therefore, we just showed that variance of the 1-step ahead forecast error is VAR ( ξ 1,T+1 ) = ã 2 11 + ã 2 12 VAR ( ξ 2,T+1 ) = ã 2 21 + ã 2 22 Which portion of the variance is due to each structural error? VD ε 1 x 1 = ã2 11 ã 2 VD ε 1 11 +ã2 x 2 = ã2 21 12 ã 2 21 +ã2 22 VD ε 2 x 1 = ã2 12 ã 2 11 +ã2 12 }{{} This sums up to 1 VD ε 2 x 2 = ã2 22 ã 2 21 +ã2 22 }{{} This sums up to 1 VAR models Structural Dynamic Analysis 62

Historical decompositions Historical decompositions (HD) answer the question What portion of the deviation of x i,t from its unconditional mean is due to the structural shock ε i? We showed before that each observation of a variable does not generally coincide with its unconditional mean This is because, in each period, the structural shocks realize and push all variables away from their equilibrium values VAR models Structural Dynamic Analysis 63

[Back to basics] Wold representation of a VAR Each observation of our original data can be re-written as the cumulative sum of the structural shocks x 2 = Fx 1 + A 1 ε 2 x 3 = Fx 2 + A 1 ε 3 = F(Fx 1 + A 1 ε 2 ) + A 1 ε 3 = F 2 x 1 + FA 1 ε 2 + A 1 ε 3 =... = x T = F T 1 x 1 + (F T 2 A 1 ε 2 +... + FA 1 ε T 1 + A 1 ε T ) Or, more in general x t = F t 1 x 1 + t 2 F j A 1 ε t j for t > 1 j= VAR models Structural Dynamic Analysis 64

How to compute historical decompositions As an example, we compute the HD of the third observation in a bivariate VAR x t = ( x 1t, ) x 2t First write x 3 as a function of past errors (ε 2 and ε 3 ) and the initial conditions (x 1 ) x 3 = F 2 x }{{} 1 + FA }{{ 1 } ε 2 + }{{} A 1 ε 3 init 3 Θ 1 Θ Re-write x 3 in matrices [ ] [ ] x 1,3 init = 1,3 + x 2,3 init 2,3 [ θ 1 11 θ 1 12 θ 1 21 θ 1 22 ] [ ] [ ε 1,2 + ε 2,2 θ 11 θ 12 θ 21 θ 22 ] [ ] ε 1,3 ε 2,3 VAR models Structural Dynamic Analysis 65

Therefore x 3 can be expressed as { x1,3 = init 1,3 + θ 1 11 ε 1,2 + θ 1 12 ε 2,2 + θ 11 ε 1,3 + θ 12 ε 2,3 x 2,3 = init 2,3 + θ 1 21 ε 1,2 + θ 1 22 ε 2,2 + θ 21 ε 1,3 + θ 22 ε 2,3 The historical decomposition is given by HD ε 1 1,3 = θ1 11 ε 1,2 + θ 11 ε 1,3 HD ε 2 1,3 = θ1 12 ε 2,2 + θ 12 ε 2,3 HD init 1,3 = init 1,3 }{{} This sums up to x 1,3 HD ε 1 2,3 = θ1 21 ε 1,2 + θ 21 ε 1,3 HD ε 2 2,3 = θ1 22 ε 2,2 + θ 22 ε 2,3 HD init 2,3 = init 2,3 }{{} This sums up to x 2,3 VAR models Structural Dynamic Analysis 66

Famous VAR Examples VAR models Examples 67

Examples of different identification schemes Zero short-run restrictions Stock & Watson (21). Vector Autoregressions, Journal of Economic Perspectives Zero long-run restrictions Blanchard & Quah (1989). The Dynamic Effects of Aggregate Demand and Supply Disturbances, American Economic Review Sign Restrictions Uhlig (25). What are the effects of monetary policy on output? Results from an agnostic identification procedure, Journal of Monetary Economics VAR models Examples 68

Example: zero short-run restrictions Stock & Watson (21). Vector Autoregressions, Journal of Economic Perspectives US quarterly data from 196.I to 2.IV 15 Inflation (Annualized QoQ, %) 12 Unemployment (%) 2 Policy rate (%) 1 5 1 8 6 4 15 1 5 6 7 8 9 2 6 7 8 9 6 7 8 9 VAR models Examples: Stock & Watson (21, JEP) 69

Monetary policy shocks, inflation and unemployment Objective: infer the causal influence of monetary policy on unemployment, inflation and interest rates Assumptions MP (r t ) reacts contemporaneously to movements in inflation and in unemployment MP shocks (ε rt ) do not affect inflation and unemployment within the quarter of the shock a 11 a 21 a 22 a 31 a 32 a 33 π t ur t r t = b 11 b 12 b 13 b 21 b 22 b 23 b 31 b 32 b 33 π t 1 ur t 1 r t 1 + ε πt ε urt ε rt Do these assumptions make sense? VAR models Examples: Stock & Watson (21, JEP) 7

The effect of a monetary policy shock.5 Inflat. to Fed Funds.2 Unempl. to Fed Funds.5 5 1 15 2 1 Fed Funds to Fed Funds.2 5 1 15 2 1 5 1 15 2 VAR models Examples: Stock & Watson (21, JEP) 71

The other two shocks are identified by definition... but how can we interpret them? How about ε π and ε ur? They represent an aggregate supply and a demand equation... The shock to ε π affects all variables contemporaneously The shock to ε ur affects r t contemporaneously but not π t Do these assumptions make sense? [Some shocks may be better identified than others] VAR models Examples: Stock & Watson (21, JEP) 72

Aggregate demand and aggregate supply shocks Shock to ε πt behaves as a negative aggregate supply shock Shock to ε ut behaves as a negative aggregate demand shock 1 Inflat. to Inflat..5 Unempl. to Inflat..5 Inflat. to Unempl..5 Unempl. to Unempl. 1 5 1 15 2.5 5 1 15 2.5 5 1 15 2.5 5 1 15 2 1 Fed Funds to Inflat. 1 Fed Funds to Unempl. 1 5 1 15 2 1 5 1 15 2 VAR models Examples: Stock & Watson (21, JEP) 73

Forecast error variance decomposition Inflation Unemployment Fed Funds ε π ε ur ε r ε π ε ur ε r ε π ε ur ε r t = 1 1.... 1...2.2.79 t = 4.88.1.1.2.96.2.9.51.41 t = 8.83.16.1.1.76.13.11.6.29 t = 12.83.15.2.21.6.19.15.59.26 VAR models Examples: Stock & Watson (21, JEP) 74

Example: zero long-run restrictions Blanchard & Quah (1989). The Dynamic Effects of Aggregate Demand and Supply Disturbances, American Economic Review US quarterly data from 1948.I to 1987.IV 4 GDP Growth (QoQ, %) 4 Unemployment (%) 2 2 2 2 4 48 58 68 77 87 4 48 58 68 77 87 VAR models Examples: Blanchard & Quah (1989, AER) 75

Economic theory and the long-run Economic theory usually tells us a lot more about what will happen in the long-run, rather than exactly what will happen today Demand-side shocks have no long-run effect on output, while supply-side shocks do Monetary policy shocks have no long-run effect on output... This suggests an alternative approach: to use these theoretically-inspired long-run restrictions to identify shocks and impulse responses VAR models Examples: Blanchard & Quah (1989, AER) 76

Blanchard & Quah s identification assumptions There are two types of disturbances affecting unemployment and output The first has no long-run effect on either unemployment or output level The second has no long-run effect on unemployment, but may have a long-run effect on output level Blanchard & Quah refer to the first as demand disturbances, and to the second as supply disturbances (traditional Keynesian view of fluctuations) VAR models Examples: Blanchard & Quah (1989, AER) 77

Identification We showed that the long-run cumulative impact of a structural shocks is Assume that ε y t IR τ,τ+ = F j A 1 ε t = (I F) 1 A 1 ε t = Dε t j= is the supply shock and that ε ur t is the demand shock We can rewrite the VAR such that the cumulated effect of ε ur t on y t is equal to zero by assuming [ ] [ ] [ y ] yt d11 = ε t ur t d 21 d 22 ε ur t VAR models Examples: Blanchard & Quah (1989, AER) 78

Aggregate demand and supply shocks Aggregate supply shock initially increases unemployment (puzzle of hours to producticvity shocks) Aggregate demand shocks have a hump-shaped effect on output and unemployment GDP Growth to GDP Growth.6 Unempl. to GDP Growth.4 GDP Growth to Unempl..4.6 Unempl. to Unempl..4.3.2.2.5.4.2.1.2.3.4.2.2.1.2.6.8.1.4 1 2 3 4.3 1 2 3 4 1 1 2 3 4.1 1 2 3 4 VAR models Examples: Blanchard & Quah (1989, AER) 79

How can we check the long-run neutrality of demand shocks on output level? Let s simply plot the cumulative sum of the impulse responses of output growth 1.2 1.8 Supply shock GDP Level Unemployment 1.5 1 Demand shock GDP Level Unemployment.6.5.4.2.5.2 1 2 3 4 1 1 2 3 4 VAR models Examples: Blanchard & Quah (1989, AER) 8

Example: sign restrictions Uhlig (25). What are the effects of monetary policy on output? Results from an agnostic identification procedure, Journal of Monetary Economics US monthly data from 1965.I to 23.XII 15 GDP 15 CPI Commodity x 19 2 1 1 1.5 5 5 1 65 74 84 94 3 65 74 84 94 3.5 65 74 84 94 3 2 Fed Fund 8 Non Borr. Res. 8 Tot. Res. 15 6 6 1 4 4 5 2 2 65 74 84 94 3 65 74 84 94 3 65 74 84 94 3 VAR models Examples: Uhlig (25, JME) 81

What are the effects of monetary policy on output? Before asking what are the effects of a monetary policy shocks we should be asking, what is a monetary policy shock? In the inflation targeting era a monetary policy shock is an increase in the policy rate that is ordered last in a Cholesky decomposition? has no permanent effect on output?...? VAR models Examples: Uhlig (25, JME) 82

What is a monetary policy shock? According to conventional wisdom, monetary contractions should 1. Raise the federal funds rate 2. Lower prices 3. Decrease non-borrowed reserves 4. Reduce real output VAR models Examples: Uhlig (25, JME) 83

Again... What are the effects of monetary policy on output? Standard identification schemes do not fully accomplish the 4 points above Liquidity puzzle: when identifying monetary policy shocks as surprise increases in the stock of money, interest rates tend to go down, not up Price puzzle: after a contractionary monetary policy shock, even with interest rates going up and money supply going down, inflation goes up rather than down Successful identification needs to deliver results matching the conventional wisdom VAR models Examples: Uhlig (25, JME) 84

Uhlig s identification assumptions Uhlig s assumption: a contractionary monetary policy shock does not lead to Increases in prices Increase in nonborrowed reserves Decreases in the federal funds rate How about output? Since is the response of interest, we leave it un-restricted VAR models Examples: Uhlig (25, JME) 85

[Reminder] How to compute sign restrictions 1. Estimate from the reduced-form VAR F, u t, and Σ u 2. Draw a random orthonormal matrix S, compute P =chol(σ u ) and recover A 1 = P S 3. Compute the impulse response using IR 1 = A 1 ε t 4. Are the sign restrictions satisfied? Yes. Store the impulse response // No. Discard the impulse response 5. Perform N replications and report the median impulse response (and its confidence intervals) VAR models Examples: Uhlig (25, JME) 86

What happens when you do sign restrictions First draw: signs are correct, I keep it!.5 Real GDP.5 CPI 2 Commodity Price.5 2 4.5 2 4 6 1 2 4 6 6 2 4 6 1 Fed Funds 1 NonBorr. Reserves 2 Total Reserves.5 1 2 2.5 2 4 6 3 2 4 6 4 2 4 6 VAR models Examples: Uhlig (25, JME) 87

What happens when you do sign restrictions Second draw: signs are not correct, I discard it!.5 Real GDP.5 CPI 2 Commodity Price.5 2 4.5 2 4 6 1 2 4 6 6 2 4 6 1 Fed Funds 1 NonBorr. Reserves 2 Total Reserves.5 1 2 2.5 2 4 6 3 2 4 6 4 2 4 6 VAR models Examples: Uhlig (25, JME) 88

What happens when you do sign restrictions After a while....5 Real GDP.5 CPI 2 Commodity Price.5 2 4.5 2 4 6 1 2 4 6 6 2 4 6 1 Fed Funds 2 NonBorr. Reserves 2 Total Reserves.5 2 1 1.5 2 4 6 4 2 4 6 2 2 4 6 VAR models Examples: Uhlig (25, JME) 89

What are the effects of monetary policy on output? Ambiguous effect on real GDP = Long-run monetary neutrality.6 Real GDP to Mon. Pol. CPI to Mon. Pol. Commodity Price to Mon. Pol..4.2 1.2.4 2.6 3.2 1 2 3 4 5 6.8 1 2 3 4 5 6 4 1 2 3 4 5 6.6 Fed Funds to Mon. Pol. NonBorr. Reserves to Mon. Pol..5 Total Reserves to Mon. Pol..5.4.2.5 1.5 1.5 1.2 1 2 3 4 5 6 2 1 2 3 4 5 6 1.5 1 2 3 4 5 6 VAR models Examples: Uhlig (25, JME) 9