Stationarity and Cointegration Analysis. By Tinashe Bvirindi, tbvirindi@gmail.com

Layout: unit root testing; cointegration; vector autoregressions; cointegration in multivariate systems.

Introduction: Whether or not a series is stationary strongly influences its behaviour and properties. For instance, a shock to a stationary series dies away, whereas a shock to a non-stationary series is persistent. Spurious regressions: if variables are trending over time, a regression of one on the other may produce significant coefficients and a high R² even though the relationship is meaningless. There are two types of trend: the stochastic trend (random walk) and the deterministic trend. Why distinguish between them? They may look the same but have very different properties.

Deterministic trends

Deterministic trends: Taking the first difference of a trend-stationary series removes the non-stationarity, but at the cost of introducing an MA(1) process in the residuals. This MA process is non-invertible, i.e. it cannot be written as an AR process.
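A short derivation makes the MA(1) claim concrete. It assumes the trend-stationary form $y_t = \alpha + \beta t + \varepsilon_t$ with white-noise $\varepsilon_t$; the symbols are illustrative and not taken from the slides.

```latex
\begin{aligned}
y_t        &= \alpha + \beta t + \varepsilon_t \\
\Delta y_t &= y_t - y_{t-1} = \beta + \varepsilon_t - \varepsilon_{t-1}
\end{aligned}
```

The differenced series is an MA(1) with coefficient $-1$ on $\varepsilon_{t-1}$. Its MA polynomial $1 - L$ has a root on the unit circle, which is why the process cannot be inverted into an AR representation.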

Deterministic trend

Stochastic trends Consider the following

Stochastic trends: Take this process forward S periods in time. As S approaches infinity, the values of Y do not become independent of the error terms, and the drift term increases over time. This process is known as a stochastic trend because it depends on the drift and on the stochastic progression of the error terms.
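The equations for this slide did not survive, so the following is a sketch under the assumption that the process is the usual random walk with drift, $Y_t = \mu + Y_{t-1} + \varepsilon_t$, with white-noise errors of variance $\sigma^2$.

```latex
\begin{aligned}
Y_t     &= \mu + Y_{t-1} + \varepsilon_t \\
Y_{t+S} &= Y_t + S\mu + \sum_{i=1}^{S} \varepsilon_{t+i} \\
E_t\left[Y_{t+S}\right] &= Y_t + S\mu, \qquad
\operatorname{Var}_t\left[Y_{t+S}\right] = S\sigma^2
\end{aligned}
```

Every shock enters the sum with weight one, so none of them dies away, and both the drift component $S\mu$ and the forecast variance $S\sigma^2$ grow without bound as $S$ increases.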

Stochastic trend

Detecting unit roots, Dickey-Fuller tests: Dickey and Fuller (Fuller, 1976; Dickey and Fuller, 1979) pioneered testing for a unit root in time series. The basic objective of the test is to examine the null hypothesis that the series contains a unit root ($\phi = 1$ in $Y_t = \phi Y_{t-1} + \varepsilon_t$) against the one-sided alternative that it is stationary ($\phi < 1$).

Dickey-Fuller tests: Reject the null if the DF statistic is more negative than the critical value.
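The decision rule can be illustrated with a minimal sketch in plain Python. It assumes the simplest Dickey-Fuller regression with a constant, dy[t] = a + gamma * y[t-1] + e[t], and uses simulated data. The function names, seeds and sample size are hypothetical choices for illustration, not part of the original demonstration.

```python
import random


def ols_slope_tstat(x, y):
    """Regress y on a constant and x by OLS; return (slope, t-statistic of the slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    const = my - slope * mx
    resid = [yi - const - slope * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in resid) / (n - 2)
    return slope, slope / (sigma2 / sxx) ** 0.5


def dickey_fuller_stat(series):
    """DF statistic: the t-ratio on y[t-1] in dy[t] = a + gamma * y[t-1] + e[t]."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    _, tstat = ols_slope_tstat(lagged, diffs)
    return tstat


def simulate_ar1(phi, n, seed):
    """Simulate y[t] = phi * y[t-1] + e[t] with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


stationary = simulate_ar1(0.5, 500, seed=1)    # phi < 1: stationary
random_walk = simulate_ar1(1.0, 500, seed=1)   # phi = 1: unit root
df_stationary = dickey_fuller_stat(stationary)
df_random_walk = dickey_fuller_stat(random_walk)
print("DF statistic, stationary series:", round(df_stationary, 2))
print("DF statistic, random walk:      ", round(df_random_walk, 2))
```

The statistic is compared with Dickey-Fuller critical values (about -2.86 at the 5% level for a regression with a constant), not with the usual t table. The stationary series should give a statistic far below that value.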

Augmented Dickey Fuller Unit root Test

Augmented Dickey-Fuller unit root test: The ADF test requires a specific lag length to augment the autoregressive process of $Y_t$, so as to soak up any dynamic structure present in the dependent variable and to expunge any serial correlation in the regression residuals. However, a larger lag length may increase the standard errors of the coefficients because degrees of freedom are used up, while a shorter lag length will not remove all the autocorrelation and will bias the estimated results (Enders, 2010; Brooks, 2010). Use an information criterion to choose the lag length.
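For reference, the augmented regression is usually written as below. The notation is an assumption, since the slide equation is missing: $\alpha$ and $\beta t$ are the optional constant and trend, and $p$ is the lag length chosen by the information criterion.

```latex
\Delta Y_t = \alpha + \beta t + \gamma Y_{t-1}
           + \sum_{i=1}^{p} \delta_i \, \Delta Y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \quad \text{against} \quad H_1:\ \gamma < 0
```

The lagged differences $\Delta Y_{t-i}$ are the augmentation terms. Each added lag costs one observation and one degree of freedom, which is the trade-off described above.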

An Eviews Demonstration

Unit root testing

Augmented Dickey Fuller results Computed value

Order of integration: A time series is said to be integrated of order d, I(d), if it becomes stationary after being differenced d times. A stationary variable is said to be I(0). If the first difference of a non-stationary variable is stationary, the variable is said to be I(1). Most economic data are I(1).
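A small sketch of what "differencing d times" means in practice. The shock values are made up for illustration. An I(1) series is built by cumulating the shocks once and an I(2) series by cumulating twice, so differencing once or twice recovers the original shocks.

```python
from itertools import accumulate


def difference(series, d=1):
    """Difference a series d times; each pass shortens it by one observation."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out[:-1], out[1:])]
    return out


shocks = [1, -2, 0, 3, -1]             # stands in for a stationary I(0) series
i1_series = list(accumulate(shocks))    # cumulated once: behaves like I(1)
i2_series = list(accumulate(i1_series)) # cumulated twice: behaves like I(2)

print(difference(i1_series, 1))  # first difference recovers shocks[1:]
print(difference(i2_series, 2))  # second difference recovers shocks[2:]
```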

Why are we concerned about the order of integration? It has a direct bearing on the appropriateness and statistical validity of regression results. If we wish to regress Y on X: when Yt and Xt are both stationary, classical OLS is valid; when Yt and Xt are integrated of different orders, the regression is meaningless; when Yt and Xt are integrated of the same order and the residuals are non-stationary, the regression may be spurious; when Yt and Xt are integrated of the same order and the residuals are stationary, the regression may indicate a cointegrating relationship. Classical linear regression (CLR) is founded on asymptotic theory, which implies that sample variances converge to constants. This is not the case when variables are non-stationary: sample moments converge to functionals of Brownian motion (Wiener processes).

Consequences of non-stationarity: Sampling distributions take a non-standard form, so we can no longer rely on the t and F distributions for statistical inference and standard hypothesis testing is invalidated. There is a tendency to reject the null of no association, both for individual regressors and for all regressors jointly, and the problem only intensifies as the sample size increases.

High-level overview of cointegration

Introduction: Modern econometric analysis emphasises the importance of unit root testing in empirical work. Granger and Newbold (1974) show that non-stationary data yield misleading or spurious regression results, i.e. regressions that do not make sense.

Introduction: Spurious results exhibit high R² values, high F and t statistics, and very low Durbin-Watson statistics (serial correlation in the residuals). Phillips (1986), a pioneer of asymptotic theory with I(1) variables, concurs with Granger and Newbold and proves that in such a regression the beta estimates, which should converge to zero as the sample size approaches infinity, remain non-zero; the R² statistic does not tend to zero; and the t-statistics approach infinity. Brooks (2010) notes that the results reflect contemporaneously correlated time trends rather than true underlying relationships.
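The spurious regression problem is easy to reproduce by simulation. The sketch below is a hedged illustration with hypothetical settings (100 observations, 300 replications, seed 0). It regresses one independent series on another and counts how often the slope looks "significant" at the nominal 5% level, first for random walks and then for white noise.

```python
import random


def random_walk(n, rng):
    """Cumulated standard normal shocks: an I(1) series."""
    y, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        y.append(level)
    return y


def white_noise(n, rng):
    """Independent standard normal draws: an I(0) series."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def slope_tstat(x, y):
    """t-statistic of the slope from an OLS regression of y on a constant and x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    const = my - slope * mx
    sigma2 = sum((yi - const - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return slope / (sigma2 / sxx) ** 0.5


def rejection_rate(make_series, n_obs=100, n_reps=300, seed=0):
    """Share of replications in which two independent series appear related."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        x = make_series(n_obs, rng)
        y = make_series(n_obs, rng)
        if abs(slope_tstat(x, y)) > 1.96:
            rejections += 1
    return rejections / n_reps


rate_walks = rejection_rate(random_walk)
rate_noise = rejection_rate(white_noise)
print("Rejection rate, independent random walks:", rate_walks)
print("Rejection rate, independent white noise: ", rate_noise)
```

For white noise the rejection rate should sit near the nominal 5%. For independent random walks it is typically far higher, even though the series are unrelated by construction.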

Introduction: To avoid spurious regressions and to obtain BLUE parameter estimates, variables are often differenced to achieve stationarity. However, economic and finance theory is anchored on long-run relationships, and differencing removes long-run information from the data series.

Cointegration: Engle and Granger (1987) show that it is possible to estimate valid regressions using non-stationary data. They develop a technique to estimate valid parameters and to test for long-run relationships between non-stationary variables (the Granger Representation Theorem). If a set of non-stationary variables integrated of the same order, say I(1), combine to form a series of lower order, I(0), they are linked by an equilibrium relationship spanning the long run and are said to be cointegrated.

Cointegration: This implies that the variables move closely together and do not drift arbitrarily apart over time, and that the distance between them is stationary. The concept of cointegration mimics the existence of a long-run equilibrium relationship to which the variables converge over time. The distance that the system is away from equilibrium at any given time is termed the equilibrium error.

Cointegration: Cointegration allows for a richer study of the short-run dynamics of adjustment towards equilibrium through the use of error correction models.

Cointegration in bivariate systems

Testing for Cointegration (residuals based test)

Cointegration and error correction

Procedure for testing for cointegration: the two-step Engle and Granger procedure. Step 1: Run a static regression in levels between the variables and save the residual series. Step 2: Test the residuals for stationarity. If they are stationary, the variables are cointegrated and we proceed to estimate an ECM; if they are non-stationary, there is no cointegration.
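The two steps can be sketched outside EViews as follows. This is an illustration on simulated data, not the money demand data used in the demonstration. The data-generating process (y = 1 + 2x + noise with x a random walk), the seed and the helper name `ols` are all assumptions made for the example.

```python
import random


def ols(x, y, intercept=True):
    """Simple OLS of y on x. Returns (const, slope, residuals, slope t-statistic)."""
    n = len(x)
    mx = sum(x) / n if intercept else 0.0
    my = sum(y) / n if intercept else 0.0
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    const = my - slope * mx
    resid = [yi - const - slope * xi for xi, yi in zip(x, y)]
    dof = n - (2 if intercept else 1)
    se = (sum(e * e for e in resid) / dof / sxx) ** 0.5
    return const, slope, resid, slope / se


# Simulated data: x is a random walk and y is tied to it, so they are
# cointegrated by construction with long-run slope 2.
rng = random.Random(42)
n_obs = 500
x, level = [], 0.0
for _ in range(n_obs):
    level += rng.gauss(0.0, 1.0)
    x.append(level)
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Step 1: static regression in levels; save the residuals (equilibrium error).
const, beta, resid, _ = ols(x, y)

# Step 2: Dickey-Fuller regression on the residuals with no constant,
# d_resid[t] = gamma * resid[t-1] + error.
lagged = resid[:-1]
d_resid = [b - a for a, b in zip(resid[:-1], resid[1:])]
_, gamma, _, adf_stat = ols(lagged, d_resid, intercept=False)

print("Long-run slope estimate:", round(beta, 3))
print("DF statistic on residuals:", round(adf_stat, 2))
```

The residual-based statistic should be judged against Engle-Granger (MacKinnon) critical values, which are more negative than the standard Dickey-Fuller ones because the residuals come from an estimated regression.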

Step 1: Estimating a static long-run equation. Go to Quick, then select Estimate Equation on the drop-down menu.

Step 1: Estimating a static long-run equation. In the equation dialog box, type the equation you wish to estimate. Always remember to include a constant.

Step 1: Long-run equation results. Recall spurious regressions.

Caution: Check whether the coefficients in the long-run equation conform to a priori expectations in terms of the direction of the impact, not the magnitude or significance of the coefficients. The R² statistic is uninformative here and should not be interpreted.

Step 1: Creating a residual series (equilibrium error). To create a residual series, go to the Proc button and select the Make Residual Series option.

Create residual series (equilibrium error)

Creating residual series (equilibrium error): Name the residual series and click OK.

Testing for unit roots in the residuals: Once the residual series is created, click on View and select the Unit Root Test option.

Step 2: Testing for cointegration. Select the ADF test in the Test type window and select the Level button. Always select None for the ADF model when conducting the unit root test on the residuals.

Step 2: Residual test. Since the ADF statistic is more negative than the critical values, we reject the null that the variables are not cointegrated. Theoretically, the standard ADF critical values are not valid for estimated residuals; the test should ordinarily be based on MacKinnon response surface critical values (Harris, 1995). In practice, however, the ADF critical values are used as a proxy for the true critical values.

Step 2: Estimating an error correction model. The error correction model, also known as the dynamics of adjustment, is estimated using the lagged differences of the data series and the lag of the equilibrium error calculated above.
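In equation form, a two-step error correction model for a dependent variable $y$ and one regressor $x$ can be sketched as below. The notation is assumed for illustration: $\hat{u}_{t-1}$ is the lagged residual from the static long-run regression and $p$, $q$ are the numbers of lagged differences included.

```latex
\begin{aligned}
\hat{u}_{t-1} &= y_{t-1} - \hat{\beta}_0 - \hat{\beta}_1 x_{t-1} \\
\Delta y_t &= a_0 + \sum_{i=1}^{p} a_i \, \Delta y_{t-i}
            + \sum_{j=0}^{q} b_j \, \Delta x_{t-j}
            + \lambda \, \hat{u}_{t-1} + v_t
\end{aligned}
```

The coefficient $\lambda$ is the speed of adjustment. A value between $-1$ and $0$ means that a fraction $|\lambda|$ of last period's equilibrium error is corrected in the current period.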

Estimating an error correction model: In the equation dialogue box, enter the variables in their differences: the first difference of lm3 (i.e. money supply), a constant, and the lagged equilibrium error.

Estimate an error correction (2 step model) Error correction term/ Speed of adjustment Valid regressions?

Estimating a one-step EG cointegration ECM equation: After testing for cointegration as above, proceed to estimate an error correction model. In the equation dialogue box, type the equation to be estimated. The coefficient on lm3(-1) captures the speed of adjustment towards equilibrium; the lagged levels form the long-run component and the differenced terms capture the short-run dynamics.

Cointegration, one-step Engle-Granger procedure: The error correction term (speed of adjustment) should always be negative.

Calculation of elasticities: The elasticities of money demand with respect to the long-run variables are calculated as follows, e.g. for nominal GDP: LNGDP elasticity = (coefficient of GDP)/(coefficient of adjustment) = 0.1575/0.0875 = 1.8. We would therefore say that a 10% change in nominal income results in an 18% change in money demanded in the long run. Note that the coefficient in the long-run equation estimated in the two-step procedure and the elasticity above are almost the same. This is due to the superconsistency property of OLS.
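The arithmetic can be wrapped in a small helper. This sketch assumes the one-step ECM convention in which the adjustment coefficient enters with a negative sign, so the long-run elasticity is minus the ratio of the two coefficients. The slide quotes the magnitudes 0.1575 and 0.0875; the negative sign on the adjustment coefficient and the function name are assumptions made here.

```python
def long_run_elasticity(level_coef, adjustment_coef):
    """Long-run elasticity implied by a one-step ECM: -(level coef) / (adjustment coef)."""
    if adjustment_coef >= 0:
        raise ValueError("adjustment coefficient should be negative for error correction")
    return -level_coef / adjustment_coef


# Magnitudes quoted on the slide; the sign of the adjustment term is assumed.
elasticity = long_run_elasticity(0.1575, -0.0875)
pct_change_in_money = elasticity * 10   # effect of a 10% change in nominal income

print(round(elasticity, 4), round(pct_change_in_money, 2))
```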

Diagnostic tests: Subject the equation to a battery of tests. Whilst the equation is still open, click on View to see the menu of diagnostic tests.

Diagnostic testing (plot of residual series)

Correlogram of residuals

Correlogram of squared residuals

Residual tests- normality test

Serial correlation test Fail to reject the null of no serial correlation

Heteroskedasticity tests Fail to reject the null of no Heteroskedasticity at 5%

Cumulative sum of residuals test

Cumulative sum of squared residuals test: The equation is unstable. Investigate and find out why. In this case it is the run-up to the attainment of independence and the end of apartheid. To control for this we need dummy variables.

Multivariate Cointegration and Vector Auto-regressions

Advantages of the Engle and Granger approach: It is relatively simple. It is useful as a first indication of the existence of a long-run equilibrium relationship. Where a cointegrating vector exists, the superconsistency property of OLS gives consistent estimates of that vector. It provides long-run equilibrium information, the short-term dynamics and the speed of adjustment to equilibrium.

Limitations of the Engle-Granger approach: The distribution of the test statistics is only a rough guide and will differ slightly in any application. With more than two variables it is no longer possible to demonstrate the uniqueness of the cointegrating vector: with a vector of N variables, each integrated of the same order, there can be up to N-1 cointegrating vectors. The approach has no systematic procedure for estimating multiple cointegrating vectors. Results are based on asymptotic theory, but we do not have infinitely large samples in practice. Errors made in the first step carry over into the second step (carry-over error bias).

Multivariate cointegration: Johansen and Juselius (1988) and Stock and Watson (1988) develop maximum likelihood procedures to test for cointegration. Their tests can estimate and test the number of cointegrating equations, and test restricted versions of the cointegrating vectors and speeds of adjustment. This allows verification of theories through coefficient restrictions. The test is based on the VAR.

Vector autoregressive (VAR) models: VARs were popularised by Sims (1980) as a natural generalisation of univariate autoregressive models. Variables should be treated symmetrically to avoid incredible identification restrictions. Let the data speak for itself, i.e. make no a priori assumption about the exogeneity of variables. VARs are very helpful in identifying the relationships among a set of macroeconomic variables.

Vector autoregressive (VAR) models: A VAR is a multi-equation time series model that considers a number of interrelated variables. It imposes no zero restrictions on the parameters to be estimated. It is atheoretical, i.e. there is no strict reliance on theory to formulate the model: everything causes everything. However, the number of estimated parameters makes the model difficult to interpret.

Vector autoregressive (VAR) models: Advantages of VARs over simple regression models: every variable is endogenous (no incredible exogeneity assumptions); every variable depends on the others (no incredible exclusion restrictions); they are simple to estimate and use. General disadvantages: a VAR is a reduced-form model, so no economic interpretation of the dynamics is possible; it is potentially difficult to relate VAR dynamics to DSGE dynamics (which have an ARMA structure); it cannot be used for certain policy analyses (Lucas critique).

Vector autoregressive (VAR) models: A multi-equation time series model can be specified as follows: $Y_t = \mu + \sum_{i=1}^{k} \theta_i Y_{t-i} + \varepsilon_t$, where $Y_t$ is an $(m \times 1)$ vector of I(0) variables, $\mu$ is an $(m \times 1)$ vector of constants, $\theta_1, \ldots, \theta_k$ are $(m \times m)$ matrices of parameters, $k$ is the appropriate lag length of the model, and $\varepsilon_t$ is an $(m \times 1)$ vector of normally distributed error terms.

Vector autoregressive (VAR) models: The properties are: the variables are stationary; the error terms are white noise disturbances with constant variance; the error terms are not serially correlated; the structure of the system allows for feedback effects. If contemporaneous effects are assumed to be zero, the VAR is said to be in standard form and estimation can proceed using OLS.

Vector autoregressive (VAR) models, stationarity and the VAR: Brooks (2010) argues that it is important that all of the variables in the VAR be stationary, otherwise hypothesis tests are invalid. Sims (1980) and Sims, Stock and Watson (1990), as cited in Enders (2010), recommend against differencing even if the variables contain a unit root. They argue that the objective of VAR analysis is to determine the interrelationships among variables and not the parameter estimates, and they also argue against detrending data in a VAR. Canova (2005): if we want a constant-coefficient VAR, we need stationarity of the variables. If non-stationarities are present a VAR representation exists, but with time-varying coefficients.

Vector autoregressive (VAR) models: To determine the appropriate lag length (order) of the VAR, use: the Akaike information criterion; the Schwarz information criterion; the likelihood ratio test; the final prediction error; the Hannan-Quinn (HQ) information criterion; the maximum lag for autoregressive models; or experimentation, i.e. general-to-specific modelling. Choose the lag length that soaks up or expunges serial correlation in the residuals. If the maximum lag length is p, then the model is a VAR(p).

Example of a VAR

Variance decompositions: According to Enders (2010), a variance decomposition enables us to study the variation in Y that is due to its own shocks versus the component that is due to shocks in other variables, and so helps determine the relative importance of each innovation in explaining the variables in the system. To conduct variance decompositions, the AR process is inverted into an MA process of the errors using Wold's decomposition theorem. This rewrites the AR process $Y_t = \mu + \sum_{i=1}^{k} \theta_i Y_{t-i} + \varepsilon_t$ as $Y_t = \bar{\mu} + \sum_{i=0}^{\infty} \phi_i \varepsilon_{t-i}$, where $\bar{\mu}$ is the unconditional mean of $Y_t$ and the moving-average coefficient matrices $\phi_i$ are functions of the $\theta_i$.

Forecast error variance decomposition: the forecast error in the two-variable case reduces to the variance in Y due to its own shocks plus the variance due to shocks in the other variable.

Variance decomposition: If the forecast error variance of a variable is explained entirely by its own shocks, the variable is exogenous. It is typical for a variable to explain almost all of its forecast error variance at short horizons and smaller proportions at longer horizons (Enders, 2010). The decomposition is subject to an under-identification problem, as is the impulse response function, so additional restrictions may need to be placed on the system in order to obtain the decompositions and impulse responses. One such restriction is the Cholesky decomposition, under which the contemporaneous value of Y has no contemporaneous effect on X. This implies an ordering of the variables. Brooks and Tsolacos (1998) and Enders (2010) note that the Cholesky ordering of the variables has important ramifications for the resulting impulse responses and variance decompositions and is equivalent to an identifying restriction on the VAR.
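To show how the Cholesky ordering enters the calculation, here is a minimal sketch of a forecast error variance decomposition for a two-variable VAR(1) with no constant. The coefficient matrix `a`, the residual covariance `sigma` and all function names are made-up illustrations, not estimates from the demonstration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cholesky(sigma):
    """Lower-triangular P such that P P' equals the covariance matrix sigma."""
    n = len(sigma)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(p[i][k] * p[j][k] for k in range(j))
            if i == j:
                p[i][j] = (sigma[i][i] - s) ** 0.5
            else:
                p[i][j] = (sigma[i][j] - s) / p[j][j]
    return p


def fevd(a, sigma, horizon):
    """shares[j][k]: share of variable j's forecast error variance at the given
    horizon that is attributed to orthogonalised shock k (Cholesky ordering)."""
    n = len(a)
    p = cholesky(sigma)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    contrib = [[0.0] * n for _ in range(n)]
    for _ in range(horizon):
        theta = matmul(power, p)          # orthogonalised MA coefficients A^i P
        for j in range(n):
            for k in range(n):
                contrib[j][k] += theta[j][k] ** 2
        power = matmul(power, a)          # move on to A^(i+1)
    return [[c / sum(row) for c in row] for row in contrib]


a = [[0.5, 0.1], [0.2, 0.3]]       # hypothetical VAR(1) coefficients
sigma = [[1.0, 0.3], [0.3, 1.0]]   # hypothetical residual covariance matrix
shares_h1 = fevd(a, sigma, 1)
shares_h10 = fevd(a, sigma, 10)
print(shares_h1)
print(shares_h10)
```

With this ordering the first variable is not affected contemporaneously by the second shock, so at horizon 1 all of its forecast error variance is attributed to its own shock. At longer horizons the other shock picks up a share. Reordering the variables changes the decomposition, which is the sense in which the ordering is an identifying restriction.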

Impulse response functions: Impulse responses allow us to trace the time profile of the effects of various shocks on the variables in the VAR system.

Impulse response function: Impulse response functions are a practical tool which aids in visualising the behaviour of the variables under study in response to various shocks. They show the dynamics of the transmission of shocks, and their direction and magnitude. In practice you should always plot your impulse responses together with their standard error bands.
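For a VAR(1) without a constant, the responses to a one-unit (non-orthogonalised) shock are simply powers of the coefficient matrix. The sketch below uses a hypothetical coefficient matrix `a` and omits the standard error bands, which require the estimated covariance of the coefficients.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def impulse_responses(a, horizon):
    """Return [A^0, A^1, ..., A^horizon]. Entry [i][j][k] is the response of
    variable j, i periods after a one-unit shock to variable k."""
    n = len(a)
    current = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    responses = [current]
    for _ in range(horizon):
        current = matmul(current, a)
        responses.append(current)
    return responses


a = [[0.5, 0.1], [0.2, 0.3]]   # hypothetical VAR(1) coefficients
irf = impulse_responses(a, 20)

# Time profile of the first variable's response to a shock in the second.
profile = [step[0][1] for step in irf]
print([round(v, 4) for v in profile[:6]])
```

Because both eigenvalues of this matrix lie inside the unit circle, the responses die out as the horizon grows, which is the stationary case described at the start of the deck.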

Multivariate cointegration: Johansen and Juselius enhance the VAR(p) by including the long-run components (cointegrating relations) in the VAR(p) process, i.e. separating permanent effects from transitory effects. The approach specifies a VECM among the variables.

Multivariate Cointegration

The cointegrated VAR and VECM
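The slide equations are missing here, so the following is the standard textbook reparameterisation, written in the notation of the VAR above ($\theta_i$, $k$, $\mu$, $\varepsilon_t$) but with $Y_t$ now a vector of I(1) variables. The symbols $\Pi$, $\Gamma_i$, $\alpha$ and $\beta$ are the usual ones and are assumed rather than taken from the source.

```latex
\begin{aligned}
Y_t &= \mu + \sum_{i=1}^{k} \theta_i Y_{t-i} + \varepsilon_t \\
\Delta Y_t &= \mu + \Pi Y_{t-1} + \sum_{i=1}^{k-1} \Gamma_i \, \Delta Y_{t-i} + \varepsilon_t, \\
\Pi &= \sum_{i=1}^{k} \theta_i - I = \alpha \beta', \qquad
\Gamma_i = -\sum_{j=i+1}^{k} \theta_j
\end{aligned}
```

The rank $r$ of $\Pi$ is the number of cointegrating vectors. The columns of $\beta$ hold the long-run relations and $\alpha$ holds the speeds of adjustment, so $\Pi Y_{t-1}$ is the long-run component and the $\Gamma_i$ terms carry the short-run dynamics.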

Multivariate cointegration

Testing for cointegration: Johansen and Juselius suggest five sets of assumptions under which the test can be conducted:
1. No deterministic trends in the VAR system, and the cointegrating relationship has no intercept and no trend.
2. No deterministic trends in the VAR system, and the cointegrating relationship has an intercept and no trend.
3. A linear trend in the VAR system, and the cointegrating relationship has an intercept but no trend.
4. A linear trend in the VAR system, and the cointegrating relationship has a deterministic trend.
5. A quadratic trend in the VAR system, and the cointegrating relationship has a linear deterministic trend.

Testing for cointegration

Testing for cointegration: The trace statistic tests the null hypothesis that there are at most r cointegrating vectors against the alternative of more than r; the first test in the sequence is r = 0 (no cointegration) against r > 0 (one or more cointegrating vectors). The maximum eigenvalue statistic, on the other hand, tests the null hypothesis that the number of cointegrating vectors is r against the specific alternative of r + 1 cointegrating vectors.
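Both statistics are computed from the estimated eigenvalues of the long-run matrix. In the standard notation (assumed here), $T$ is the number of usable observations, $m$ the number of variables, and $\hat{\lambda}_1 > \hat{\lambda}_2 > \dots > \hat{\lambda}_m$ the ordered eigenvalues.

```latex
\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{m} \ln\bigl(1 - \hat{\lambda}_i\bigr),
\qquad
\lambda_{\max}(r,\, r+1) = -T \ln\bigl(1 - \hat{\lambda}_{r+1}\bigr)
```

An eigenvalue close to zero contributes almost nothing to either statistic, so large values of the statistics indicate additional cointegrating vectors beyond the $r$ assumed under the null.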

Testing for cointegration rank: The critical values for the tests are obtained using a Monte Carlo approach. The distribution of the statistics depends on two components: the number of non-stationary components under the null hypothesis, and the form of the deterministic components (constant, trend or both), which is similar to the Dickey-Fuller test. Sometimes the two tests give conflicting results. Harris (1995) notes that the maximum eigenvalue test has a sharper alternative hypothesis and is preferred for pinning down the number of cointegrating vectors. The sequence of trace tests leads to a consistent procedure.

Testing for cointegration rank: Cheung and Lai (1993) propose choosing the cointegration rank based on the trace statistic. They state that the trace statistic is more robust to skewness and excess kurtosis in the residuals than the maximum eigenvalue statistic. Enders (2010) concurs with these findings and states that when the two tests for cointegration rank are in conflict, the trace statistic is likely to give more reliable results. Current practice is to consider only the trace test.

Practical demonstration of multivariate cointegration in EViews

Step 1: Pretest data. Pretest all variables to determine their order of integration, i.e. test for unit roots. Plot the variables to see whether a linear time trend is likely to appear in the data series.

Step 2: Estimating an unrestricted VAR. Go to Quick and select Estimate VAR.

Estimating a VAR: Enter the variables of interest and click OK. Choose the sample over which to estimate the VAR.

Step 3: Choosing the optimal lag length. In the estimated VAR window, go to View, Lag Structure, Lag Length Criteria.

Choosing the optimal lag length: Leave the default and click OK.

Selecting the lag length: An asterisk indicates the lag length selected by each information criterion. If a long lag is required to make the residuals white noise, reconsider the choice of variables and look for another important explanatory variable to include in the information set.

Selecting the lag length: A summary test statistic that measures the magnitude of the residual autocorrelation is given by the Portmanteau test. EViews uses Wald lag exclusion tests to determine the default lag. In our case we select the lag based on the SIC (a stricter test); the AIC gives a generous lag length, and HQ is a middle-of-the-road approach. The chosen lag length for this exercise is 2.
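The ranking of the three criteria follows from their penalty terms. The sketch below uses the common univariate forms based on the log residual variance; for a VAR the log determinant of the residual covariance matrix takes its place. The candidate residual variances and parameter counts are made-up numbers for illustration only.

```python
import math


def information_criteria(resid_variance, n_params, n_obs):
    """AIC, SIC and HQ in their log-residual-variance form (smaller is better)."""
    log_var = math.log(resid_variance)
    return {
        "AIC": log_var + 2 * n_params / n_obs,
        "SIC": log_var + n_params * math.log(n_obs) / n_obs,
        "HQ": log_var + 2 * n_params * math.log(math.log(n_obs)) / n_obs,
    }


def choose_lag(candidates, n_obs, criterion):
    """candidates maps lag -> (residual variance, number of parameters)."""
    scores = {lag: information_criteria(var, k, n_obs)[criterion]
              for lag, (var, k) in candidates.items()}
    return min(scores, key=scores.get)


# Hypothetical fit results for lags 1 to 4 with 100 observations.
candidates = {1: (1.00, 2), 2: (0.90, 3), 3: (0.87, 4), 4: (0.865, 5)}
lag_aic = choose_lag(candidates, 100, "AIC")
lag_hq = choose_lag(candidates, 100, "HQ")
lag_sic = choose_lag(candidates, 100, "SIC")
print("AIC:", lag_aic, "HQ:", lag_hq, "SIC:", lag_sic)
```

With 100 observations the penalty per parameter is 0.02 for AIC, about 0.031 for HQ and about 0.046 for SIC, so SIC never picks a longer lag than AIC on the same set of fits.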

Step 4: Deterministic trend specification of the VAR. The variables may have non-zero means and deterministic as well as stochastic trends. Similarly, the cointegrating equations may have intercepts and deterministic trends. Since the asymptotic distribution of the LR test statistic for cointegration does not have the usual chi-square distribution and depends on the restrictions we make with respect to deterministic trends, we need to make assumptions regarding the trends underlying our data. EViews allows for the five trend specifications of Johansen and Juselius.

Step 5: Estimation and determination of rank. Go to View and select Cointegration Test.

Step 5: Estimation and determination of rank. Choose the trend assumption you made in step 4; only in unusual circumstances would you consider a trend in the cointegrating vector. Enter the chosen lag length from step 3 and click OK.

Step 5: Estimation and determination of rank. The test is done in a specific order, from the largest eigenvalue to the smallest. We use the Pantula principle, testing for significance until we no longer reject the null. The first null is that there are no stationary relations in the data (r = 0). So long as the trace statistic (TS) or maximum eigenvalue statistic (MES) exceeds the critical value, reject the null. Use the p-values to make the decision.
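The sequential decision rule can be sketched as a short loop. The statistics and critical values below are illustrative made-up numbers for a four-variable system, not the output of the demonstration, and the function name is hypothetical.

```python
def select_rank(test_stats, critical_values):
    """Return the first r whose null (at most r cointegrating vectors) is not rejected.

    test_stats[r] and critical_values[r] refer to the null hypothesis of rank r,
    listed in the order r = 0, 1, 2, ...
    """
    for r, (stat, cv) in enumerate(zip(test_stats, critical_values)):
        if stat <= cv:
            return r            # stop at the first null we fail to reject
    return len(test_stats)      # every null rejected: full rank


# Illustrative trace statistics and 5% critical values for r = 0, 1, 2, 3.
trace_stats = [62.1, 31.4, 9.8, 1.2]
critical_5pct = [47.9, 29.8, 15.5, 3.8]
rank = select_rank(trace_stats, critical_5pct)
print("Selected cointegration rank:", rank)
```

Here the nulls r = 0 and r at most 1 are rejected and the null r at most 2 is not, so the procedure stops at two cointegrating vectors. If every null is rejected the matrix has full rank, which points to the variables being stationary in levels.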

Step 5: Estimation and determination of rank. Click Estimate, select Vector Error Correction, cross-check that the lag interval is correct and click on the Cointegration tab.

Step 5: Estimation and determination of rank. Enter the number of cointegrating equations and click OK.

Estimation: The output shows the cointegrating equations (long-run component) and the speeds of adjustment. The 2 separate long-run relationships enter into each of the 4 equations.

Estimation: NB: even if we are only interested in the first cointegrating relationship, both cointegrating relationships should enter that equation separately; there are in effect two ECM terms in the equation. There is a positive long-run relationship between money supply and inflation and a negative relationship between money supply and the interest rate. Approximately 8% of the deviation of money supply from its long-run equilibrium is cleared in the next quarter.

Step 6: Diagnostic testing. Once a VEC is estimated, a number of diagnostic tests should be performed. These tests assist in checking the appropriateness of the estimated VAR. Residual tests: The Portmanteau autocorrelation test computes the multivariate Box-Pierce/Ljung-Box Q statistics for serial correlation up to a specified order; EViews reports both statistics under the null hypothesis of no serial correlation. The autocorrelation LM test reports the multivariate LM test statistics for residual serial correlation. Under the null hypothesis of no serial correlation of order h, the LM statistic is asymptotically chi-square distributed with K^2 degrees of freedom, where K is the number of variables in the system.

Step 6: Diagnostic testing. To conduct residual tests go to View, Residual Tests, Portmanteau Test.

Residual tests: Reject the null if the probabilities of the Q-stat and the Adj Q-stat are less than 0.05.

Diagnostics, residual tests: We reject the null hypothesis of no serial correlation at lag order 2 in the residuals of the VECM and conclude that the residuals are autocorrelated.

Normality tests: EViews reports the multivariate extension of the Jarque-Bera test. For the multivariate test you must choose a factorisation of the residuals into components that are orthogonal to each other: Cholesky; the inverse square root of the residual correlation matrix (Doornik and Hansen, 1994); the inverse square root of the residual covariance matrix (Urzua, 1997); or the factorisation from an identified VECM. EViews reports the test statistics for each orthogonal component.

Normality: Choose the factorisation method; select Cholesky.

Normality: P is the inverse of the lower triangular Cholesky factor of the residual covariance matrix. The output reports the joint normality test for our four-equation system.

Normality: NB: Paruolo (1997) points out that if normality of the error terms is rejected because of excess kurtosis rather than skewness, the Johansen results are not affected. That is, we should not worry if our skewness results are fine.

White's heteroskedasticity test: This is an extension of White's (1980) test. No cross terms: uses only the levels and squares of the regressors. Cross terms: includes all non-redundant cross products of the regressors (heteroskedasticity of an unknown form).

Heteroskedasticity Reject the null of no heteroskedasticity

Impulse responses: Click on Impulse, then in the Impulses box select the variables you wish to shock and in the Responses box select the variables whose responses you want to see. Choose the number of quarters for the graph, then click on Impulse Definition.

Cholesky ordering: Select the ordering of the variables, i.e. as identified in theory. This is similar to an identification restriction on the impulses.

Impulse responses Profile and direction of shocks

Variance decompositions: Click on View, then choose Variance Decomposition.

Variance decomposition: Select the Table option and specify the Cholesky ordering.

Variance decomposition: LNGDP accounts for about 28% of the observed variation in money supply.

Testing theoretical restrictions: Click Estimate, then VEC Restrictions, and impose the restrictions.

What can go wrong in the Johansen methodology: We need normally distributed white noise errors. The test is asymptotic and can be sensitive to how we formulate the VECM in limited samples. The test assumes there are no structural breaks. If we put a stationary variable in the model, the number of cointegrating vectors may increase. Weak exogeneity: if weak exogeneity is found, then use a single-equation model.

References Enders, W., 2010, Applied Econometric Time Series 3e, Wiley, USA Brookes