Econometrics 2: Time Series Analysis. Karoll GOMEZ, kgomezp@unal.edu.co, http://karollgomez.wordpress.com. Second semester 2016
IX. Vector Time Series Models
A. VAR Models
1. Motivation: The vector autoregression (VAR) model is one of the most successful, flexible, and easy-to-use models for the analysis of multivariate time series. It is a natural extension of the univariate autoregressive model to dynamic multivariate time series. It has proven to be especially useful for describing the dynamic behavior of economic and financial time series, and for forecasting. It often provides superior forecasts to those from univariate time series models and from elaborate theory-based simultaneous equations models.
Famous papers are Chris Sims's "Macroeconomics and Reality" (ECTA, 1980) and Stock and Watson's "Vector Autoregressions" (JEP, 2001). Vector autoregressive models are a statistical tool to address the following tasks: describe and summarize economic time series; make forecasts; recover the true structure of the macroeconomy from the data; advise macroeconomic policymakers. In consequence, this kind of analysis is commonly called Macroeconometrics. A VAR can help us answer questions like the following:
Example 1. Problem: You want to study sales performance for a company. Research question: Is there a relationship between the amounts a firm spends on advertising and its sales revenue and volume? Goal: To establish the relationship between advertising and sales revenue and volume. Variables: sales revenue (R), sales volume (S), prices (P), sales force (F) and advertising expenditure (E).
Example 2. Consider three variables: real GDP growth (Y), inflation (π) and the policy rate (r). A VAR can help us answer the following questions: 1. What is the dynamic behavior of these variables? How do these variables interact? 2. What is the profile of GDP conditional on a specific future path for the policy rate? 3. What is the effect of a monetary policy shock on GDP and inflation? 4. What has been the contribution of monetary policy shocks to the behavior of GDP over time?
1.A) What is a Vector Autoregression (VAR)?
1.B) The general form of the stationary structural VAR(p) model
2. Structural and Reduced Form of a VAR. The VAR has a very important role as a statistical model that underlies identified structural econometric models (endogenous systems). However, we can write the model in a reduced form, i.e., a stationary reduced-form VAR.
The structural innovations:
What is a variance-covariance matrix? (Reminder)
Why is it called structural VAR?
Why is it called stationary VAR?
Example Structural VARs potentially answers many interesting questions
However... the estimation of structural VARs is problematic
How to solve the problem?
The reduced-form VAR
3. VAR stability (stationarity): DEFINITION 12: A stable VAR(p) process is stationary and ergodic with time-invariant means, variances, and autocovariances. Two ways to check stationarity:
3.1 Lag-operator (B) representation
3.2 System representation
3.2.1) Using the system of linear equations form
3.2.2) Using matrix form
3.1 Lag-operator (B) representation. Considering a VAR(1) model:
x_t = F x_{t-1} + ε_t
x_t − F x_{t-1} = ε_t
(I − FB) x_t = ε_t
Φ(B) x_t = ε_t
so the VAR(1) is invertible. For the process to be stationary, the zeros (roots) of the determinantal equation |I − FB| = 0 must lie outside the unit circle.
The zeros of |I − FB| are related to the eigenvalues of F. Let λ_1, ..., λ_m be the eigenvalues and h_1, ..., h_m the associated eigenvectors of F, collected in Λ = diag(λ_1, ..., λ_m) and H = [h_1, ..., h_m], such that:
FH = HΛ, so F = HΛH^{-1}
Thus
|I − FB| = |I − HΛH^{-1}B| = |H(I − ΛB)H^{-1}| = |I − ΛB| = ∏_{i=1}^{m} (1 − λ_i B)
Hence, the zeros of |I − FB| lie outside the unit circle iff all eigenvalues λ_i lie inside the unit circle, i.e., |λ_i| < 1.
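The eigenvalue condition above is easy to check numerically. A minimal sketch (the VAR(1) coefficient matrix F below is hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical VAR(1) coefficient matrix F (illustrative values only)
F = np.array([[0.5, 0.1],
              [0.2, 0.4]])

# The process is stationary iff all eigenvalues of F lie inside the unit circle
eigenvalues = np.linalg.eigvals(F)
is_stable = bool(np.all(np.abs(eigenvalues) < 1))
print(eigenvalues, is_stable)
```

For this F the eigenvalues are 0.6 and 0.3, both inside the unit circle, so the process is stable.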
3.2. System representation 3.2.1) System of linear equations form: Consider a bivariate VAR(1) model:
3.2.2) The (companion) matrix form
In other words, |λ_i| < 1 for all i.
For the VAR(p) Model
where:
Nonstationary VAR models: As we already know, in time series analysis it is very common to observe series that exhibit nonstationary behavior. The way to reduce a nonstationary series to a stationary one is differencing. A natural extension (of the univariate case) to the VAR process is: Φ(B)(I − IB)^d Z_t = ε_t
Remarks for nonstationary VAR models: The differencing order for each component series may or may not be the same. When the differencing order is the same for each component series and a linear combination of the nonstationary series is stationary, the series are cointegrated.
4. Forecasting
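For a stationary VAR, multi-step forecasts are obtained by iterating the estimated system forward. A minimal sketch for a VAR(1) with intercept (all coefficient values hypothetical):

```python
import numpy as np

# Hypothetical stable VAR(1): x_t = c + F x_{t-1} + eps_t (illustrative values)
c = np.array([1.0, 0.5])
F = np.array([[0.5, 0.1],
              [0.2, 0.4]])
x_t = np.array([2.0, 1.0])   # last observed value

# h-step-ahead forecasts: E[x_{t+h} | x_t] = c + F E[x_{t+h-1} | x_t]
forecasts = []
x_hat = x_t
for h in range(1, 9):
    x_hat = c + F @ x_hat
    forecasts.append(x_hat)

# As h grows, the forecast converges to the unconditional mean (I - F)^{-1} c
mu = np.linalg.solve(np.eye(2) - F, c)
print(forecasts[0], mu)
```

The convergence of the forecast path to the unconditional mean is a direct consequence of stability: the influence of the initial condition decays like the largest eigenvalue of F raised to the horizon.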
5. Impulse-response function (IRFs) Impulse responses trace out the response of current and future values of each of the variables to a one-unit increase (or to a one-standard deviation increase, when the scale matters) in the current value of one of the VAR errors, assuming that this error returns to zero in subsequent periods and that all other errors are equal to zero.
Characteristics: The implied thought experiment of changing one error while holding the others constant makes most sense when the errors are uncorrelated across equations, so impulse responses are typically calculated for recursive and structural VARs.
Example 1: Considering the bivariate VAR we have already seen, we can write the following, where:
Characteristics (continued): IRFs are based on the VMA(∞) representation of the VAR(p) model. The VMA representation is an especially useful tool to examine the interaction between variables in the VAR.
Considering the model VAR(p):
In other words, the matrix Ψ_s collects the marginal effects of the innovations on the system, where: ψ_{ij,s} = ∂y_{i,t+s}/∂ε_{j,t}, holding all other innovations at all other dates constant. The function that evaluates these derivatives for s > 0 is called the IRF.
Remarks: If the correlations are high, it doesn't make much sense to ask what happens if ε_{1,t} has a unit impulse with no change in ε_{2,t}, since both usually move at the same time. For impulse response analysis, it is therefore desirable to express the VAR in such a way that the shocks become orthogonal (that is, the ε_{i,t}'s are uncorrelated). Additionally, it is convenient to rescale the shocks so that they have unit variance. In consequence, we need to compute an orthogonalization of the correlated shocks in the original VAR. One generally used method is the Cholesky decomposition of the matrix Σ_ε.
Cholesky decomposition:
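A minimal numerical sketch of this orthogonalization (the covariance matrix Σ below is hypothetical): factor Σ_ε = PP′ with P lower triangular, then η_t = P^{-1}ε_t has identity covariance.

```python
import numpy as np

# Hypothetical innovation covariance matrix (illustrative values only)
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# Cholesky factor: Sigma = P P', with P lower triangular
P = np.linalg.cholesky(Sigma)

# Orthogonalized shocks: eta_t = P^{-1} eps_t has identity covariance
eps = np.array([0.8, -0.3])          # one draw of correlated innovations
eta = np.linalg.solve(P, eps)

# Check: P^{-1} Sigma (P^{-1})' = I
I_check = np.linalg.inv(P) @ Sigma @ np.linalg.inv(P).T
print(P, I_check)
```

Note that the ordering of the variables matters: because P is lower triangular, the first shock is allowed to affect all variables contemporaneously, the second only variables after the first, and so on.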
Example 2: We can calculate the IRFs to a unit shock of ε once we know A^{-1}. Suppose we are interested in tracing the dynamics of a shock to the first variable in a two-variable VAR. Thus ε_0 = [1, 0]′ and
x_0 = A^{-1} ε_0 for s = 0
x_s = A_1 x_{s-1} for s > 0
To summarize, the impulse response function is a practical way of representing the behavior over time of x in response to shocks to the vector ε.
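A minimal sketch of this recursion, using a hypothetical reduced-form VAR(1) coefficient matrix A_1 and taking A^{-1} = I for simplicity (i.e., a unit impact on the first variable):

```python
import numpy as np

# Hypothetical VAR(1) coefficient matrix (illustrative values only)
A1 = np.array([[0.5, 0.1],
               [0.2, 0.4]])

# Unit shock to the first variable at s = 0 (impact matrix taken as identity)
eps0 = np.array([1.0, 0.0])

# IRF recursion: x_0 = eps0, x_s = A1 x_{s-1} for s > 0
horizon = 10
irf = np.zeros((horizon + 1, 2))
irf[0] = eps0
for s in range(1, horizon + 1):
    irf[s] = A1 @ irf[s - 1]

print(irf[:3])
```

Because the hypothetical A1 is stable, the responses die out geometrically: both variables respond on impact and in the following periods, then return to zero.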
Example: Data are P = 100×log(GDP deflator), Y = 100×log(GDP), M = M2, R = Fed Funds Rate, on US quarterly data running from 1960 to 2002. We estimate a VAR(4).
Remember that: Impulse responses trace out the response of current and future values of each of the variables to a unit increase in the current value of one of the VAR structural errors, assuming that this error returns to zero thereafter.
We observe that:
6. Forecast error variance decomposition (FEVD). Variance decomposition can tell a researcher the percentage of the fluctuation in a time series attributable to other variables at selected time horizons. In other words, it tells us the proportion of the movements in a variable due to its own shocks versus shocks to the other variables. Thus, the variance decomposition provides information about the relative importance of each random innovation in affecting the variables in the VAR. In addition, it can indicate which variables have short-term and long-term impacts on another variable of interest.
Example 3: Given the structural model Φ(B)x_t = ε_t, the VMA representation is given by
x_t = Ψ_0 ε_t + Ψ_1 ε_{t-1} + Ψ_2 ε_{t-2} + ...
and the error in forecasting x at each horizon s is
x_{t+s} − E_t[x_{t+s}] = Ψ_0 ε_{t+s} + Ψ_1 ε_{t+s-1} + ... + Ψ_{s-1} ε_{t+1}
from which the variance of the forecast error is
var(x_{t+s} − E_t[x_{t+s}]) = Ψ_0 Σ_ε Ψ_0′ + Ψ_1 Σ_ε Ψ_1′ + ... + Ψ_{s-1} Σ_ε Ψ_{s-1}′
Now define the forecast error e_{t+s} = x_{t+s} − E_t[x_{t+s}], and suppose that: 1. the shocks are both serially and contemporaneously uncorrelated; 2. all shock components have unit variance (Σ_ε = I). This implies:
var(e_{t+s}) = Ψ_0 Ψ_0′ + Ψ_1 Ψ_1′ + ... + Ψ_{s-1} Ψ_{s-1}′
Comparing this to the sum of innovation responses, we get a relative measure of how important variable j's innovations are in explaining the variation in variable i at different step-ahead forecasts. In other words, we compute the share of the total variance of the forecast error for each variable attributable to the variance of each structural shock.
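A minimal sketch of this computation for a hypothetical VAR(1), where the orthogonalized VMA matrices are Ψ_s = A_1^s P with P the Cholesky factor of the innovation covariance (all numerical values illustrative):

```python
import numpy as np

# Hypothetical VAR(1) coefficients and innovation covariance (illustrative)
A1 = np.array([[0.5, 0.1],
               [0.2, 0.4]])
P = np.linalg.cholesky(np.array([[1.0, 0.5],
                                 [0.5, 2.0]]))

h = 8  # forecast horizon
Psi = [np.linalg.matrix_power(A1, s) @ P for s in range(h)]

# FEVD: share of the h-step forecast error variance of variable i
# attributable to orthogonal shock j (rows sum to one)
contrib = sum(p ** 2 for p in Psi)            # elementwise squared responses
fevd = contrib / contrib.sum(axis=1, keepdims=True)
print(fevd)
```

Each row of `fevd` decomposes one variable's forecast error variance across the orthogonal shocks; with the Cholesky ordering used here, the first variable's variance is dominated by its own shock at short horizons.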
In summary: While the impulse response function traces the effects of a shock to one endogenous variable onto the other variables in the VAR, the variance decomposition separates the variation in an endogenous variable into the component shocks to the VAR.
Example: FEVD (in %) of Nicaragua's inflation rate
7. Granger causality. One of the main uses of VAR models is forecasting. The following intuitive notion of a variable's forecasting ability is due to Granger (1969). If a variable, or group of variables, y_1 is found to be helpful for predicting another variable, or group of variables, y_2, then y_1 is said to Granger-cause y_2; otherwise it is said to fail to Granger-cause y_2. The notion of Granger causality does not imply true causality. It only implies forecasting ability.
In other words: A variable y_1 fails to Granger-cause y_2 if y_2 CANNOT be better predicted using the histories of both y_1 and y_2 than using the history of y_2 alone.
Example: Bivariate VAR model
Formally, this corresponds to an invertible VMA representation: Z_t = C + Θ(B)u_t:
In other words, Granger non-causality corresponds to the restriction that all cross-lag coefficients are zero, which can be tested with a traditional F test. For instance, for the following VAR(p) model:
y_t = a_1 y_{t-1} + a_2 y_{t-2} + ... + a_p y_{t-p} + b_1 x_{t-1} + b_2 x_{t-2} + ... + b_p x_{t-p} + ε_{y,t}
x_t = c_1 y_{t-1} + c_2 y_{t-2} + ... + c_p y_{t-p} + d_1 x_{t-1} + d_2 x_{t-2} + ... + d_p x_{t-p} + ε_{x,t}
x does not Granger-cause y: H0: b_1 = b_2 = ... = b_p = 0 vs H1: at least one b_i ≠ 0
y does not Granger-cause x: H0: c_1 = c_2 = ... = c_p = 0 vs H1: at least one c_i ≠ 0
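A minimal sketch of this F test on simulated data, where by construction x Granger-causes y (all simulation parameters hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a bivariate system in which x Granger-causes y (illustrative values)
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.4 * y[t - 1] + 0.6 * x[t - 1] + rng.standard_normal()

def rss(Y, X):
    """Residual sum of squares from an OLS fit of Y on X."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    return resid @ resid

p = 1  # lag order
Y = y[p:]
X_unres = np.column_stack([np.ones(T - p), y[:-1], x[:-1]])  # both histories
X_res = np.column_stack([np.ones(T - p), y[:-1]])            # y's history only

# F test of H0: the coefficients on lagged x are all zero
rss_u, rss_r = rss(Y, X_unres), rss(Y, X_res)
F = ((rss_r - rss_u) / p) / (rss_u / (T - p - X_unres.shape[1]))
print(F)
```

Since the true cross-lag coefficient is 0.6, the F statistic is far above conventional critical values and H0 of non-causality is rejected.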
Conceptually, the idea has several components: 1. Temporality: only past values of x can cause y. 2. Exogeneity: Sims (1972) points out that a necessary condition for x to be exogenous with respect to y is that x fails to Granger-cause y. 3. Independence: similarly, variables x and y are independent only if both fail to Granger-cause the other.
Example: Trivariate VAR model
8. Estimation. Note that if the disturbances in one equation are, for example, autocorrelated, the theory does not apply. Then we need IV estimators, including GMM.
Traditionally, VAR models are designed for stationary variables without time trends. So, first we need to make sure the vector series is stationary or has been properly stationarized. How do we test the VAR assumptions?
1. Determination of p (specification testing). Information criteria: the general approach is to fit VAR(p) models with orders p = 0, ..., pmax and choose the value of p which minimizes some model selection criterion: Schwarz (SC), Hannan-Quinn (HQ), or Akaike (AIC). Alternatively, start with a large p and test successively that the coefficients of the largest lag in the VAR are zero, i.e., a sequence of F-tests. Under-specification of p might result in residuals that are autocorrelated. Information criteria and a sequence of tests can, of course, be combined.
REMARKS: The AIC criterion asymptotically overestimates the order with positive probability. The BIC and HQ criteria estimate the order consistently, under fairly general conditions, if the true order p is less than or equal to pmax.
2. Testing assumptions about ε_{it} (mis-specification testing). Since each equation is estimated by OLS, we can use a test battery: tests for residual autocorrelation, ARCH disturbances, White tests of heteroskedasticity, and non-normality tests. Note that the degrees of freedom tend to be very large for these tests, so even if the size of the test is OK, mis-specification may be hidden (due to low power). Significant departures from the hypothesis of Gaussian disturbances can often be resolved by: a larger p; increasing the dimension of the VAR (more variables in the y_t vector); introducing exogenous stochastic explanatory variables (VAR-X model, a conditional or partial model); introducing deterministic variables in the VAR.
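As one element of such a test battery, residual autocorrelation in a single equation can be checked with a Ljung-Box Q statistic. A minimal sketch on simulated white-noise residuals (so the test should not reject):

```python
import numpy as np

rng = np.random.default_rng(2)

# Residuals from one equation; here white noise by construction, so no
# autocorrelation should be detected
resid = rng.standard_normal(500)
T = len(resid)
m = 10  # number of autocorrelations tested

r = resid - resid.mean()
acf = np.array([(r[k:] * r[:-k]).sum() for k in range(1, m + 1)]) / (r * r).sum()

# Ljung-Box Q statistic; under H0 (no autocorrelation) Q ~ chi2(m)
Q = T * (T + 2) * np.sum(acf ** 2 / (T - np.arange(1, m + 1)))
print(Q)
```

Q is compared with a chi-squared critical value with m degrees of freedom (about 18.31 at the 5% level for m = 10); a large Q signals autocorrelated residuals and hence an under-specified lag order.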
The economic relevance of a statistically well-specified VAR is a matter in itself. It is of little help if p is set so large that there are no degrees of freedom left, or if the VAR-X introduces variables that are difficult to rationalize or interpret theoretically or historically. One may then want to estimate a simpler model with GMM for each equation instead.
How to build flexibility into the VAR?
B. VARMA Models
Considering the following VARMA(p,q) model: Φ(B)Z_t = Θ(B)u_t, where
Φ(B) = I − F_1 B − F_2 B^2 − ... − F_p B^p
Θ(B) = I − Θ_1 B − Θ_2 B^2 − ... − Θ_q B^q
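A minimal sketch of simulating a VARMA(1,1) process from this definition, with Φ(B) = I − Φ_1B and Θ(B) = I − Θ_1B (all coefficient values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical bivariate VARMA(1,1): z_t = Phi1 z_{t-1} + u_t - Theta1 u_{t-1}
Phi1 = np.array([[0.5, 0.1], [0.2, 0.4]])
Theta1 = np.array([[0.3, 0.0], [0.0, 0.3]])

T = 300
z = np.zeros((T, 2))
u = rng.standard_normal((T, 2))
for t in range(1, T):
    z[t] = Phi1 @ z[t - 1] + u[t] - Theta1 @ u[t - 1]

# As in the pure VAR case, stationarity depends on the AR polynomial alone
# (eigenvalues of Phi1 inside the unit circle); invertibility depends on the
# MA polynomial alone (eigenvalues of Theta1 inside the unit circle)
stationary = bool(np.all(np.abs(np.linalg.eigvals(Phi1)) < 1))
invertible = bool(np.all(np.abs(np.linalg.eigvals(Theta1)) < 1))
print(stationary, invertible)
```

This separation mirrors the univariate ARMA case: the AR side governs stationarity and the MA side governs invertibility.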