1 THE COINTEGRATED VECTOR AUTOREGRESSIVE MODEL WITH AN APPLICATION TO THE ANALYSIS OF SEA LEVEL AND TEMPERATURE by Søren Johansen Department of Economics, University of Copenhagen and CREATES, Aarhus University
CONTENT OF LECTURES 1. AN EXAMPLE: SEA LEVEL AND TEMPERATURE 1881-1995 2. A SIMPLE EXAMPLE AND SOME LITERATURE 3. THE VECTOR AUTOREGRESSIVE MODEL AND ITS SOLUTION 4. HYPOTHESES OF INTEREST IN THE I(1) MODEL 5. THE STATISTICAL ANALYSIS 6. THE ANALYSIS OF THE DATA 7. ASYMPTOTIC ANALYSIS 8. CONCLUSION
1. AN EXAMPLE: SEA LEVEL AND TEMPERATURE 1881-1995 Data in levels and differences 3
4 125 Sea Level 0.6 Temperature 100 75 0.4 50 0.2 25 0 0.0 25 50 0.2 75 1881 1890 1899 1908 1917 1926 1935 1944 1953 1962 1971 1980 1989 1998 0.4 1881 1890 1899 1908 1917 1926 1935 1944 1953 1962 1971 1980 1989 1998 15 Change of Sea Level 0.3 Change of Temperature 10 0.2 5 0.1 0 0.0 5 0.1 10 0.2 15 1882 1891 1900 1909 1918 1927 1936 1945 1954 1963 1972 1981 1990 1999 0.3 1882 1891 1900 1909 1918 1927 1936 1945 1954 1963 1972 1981 1990 1999 1.Data : 1881:01 to 1995:01, Hansen, J. et al J. Geophys. Res. Atmos. 106 23947 (2001)
5 2. A SIMPLE EXAMPLE AND SOME LITERATURE The mathematical formulation of cointegration by a simple example. Take two stochastic processes which have a stochastic trend (random walk) tx X 1t = a " 1i + " 2t I(1) X 2t = b tx " 1i + " 3t I(1) bx 1t ax 2t = b" 2t a" 3t I(0) 1. X(t) is nonstationary and X t is stationary: X t is I(1) 2. 0 X t is stationary with = (b; a) 0 3. The common stochastic trend is P t " 1i
6 The cointegrated vector autoregressive model X t = 0 X t 1 + Xk 1 ix t i + " t Literature 1. Johansen S. (1996) `Likelihood-Based Inference in Cointegrated Vector Autoregressive Models.' Oxford University Press, Oxford. www.math.ku.dk/~sjo 2. Juselius, K. (2006) `The Cointegrated VAR Model: Methodology and Applications.' Oxford University Press, Oxford. www.econ.ku.dk/~katarina.juselius 3. Dennis, J., Johansen, S. and Juselius, K. (2006), `CATS for RATS: Manual to Cointegration Analysis of Time Series.' Estima, Illinois. See also Johansen, S. (2006) Cointegration: a survey. In: T.C. Mills and K. Patterson (eds.) Palgrave Handbook of Econometrics: Volume 1, Econometric Theory, Basingstoke, Palgrave Macmillan.
7 3. THE VECTOR AUTOREGRESSIVE MODEL AND ITS SOLUTION Error correction formulation of the vector autoregressive model X t = 0 X t 1 + Xk 1 (z) = (1 z)i p 0 z ix t i + " t Xk 1 (1 z)z i i det (1) = det 0 = 0 =) z = 1 is a root of det (z) = 0 Question: If the VAR has unit roots and the other roots larger than one, what is the moving average representation and what are the properties of the process? I(1) condition : det((z)) = 0 =) z = 1 or jzj > 1 and Xk 1 = I p i; det( 0?? ) 6= 0
8 X t = 0 X t 1 + Xk 1 (z) = (1 z)i p 0 z ix t i + " t Xk 1 (1 z)z i i THEOREM If det((z)) = 0 =) z = 1 or jzj > 1 and det( 0? (z) 1 = C 1 1 z + 1 X X t = C tx " i + i=0 C i z i 1X Ci " t i + A; 0 A = 0 i=0 C =? ( 0?? ) 1 0? 1. X t is nonstationary, X t is stationary: X t is called I(1) 2. 0 X t is stationary: X t is cointegrating (r relations) 3. The common trends are 0? P t " i (p r trends)?) 6= 0; then
9 Example X 1t = 1 4 (X 1t 1 X 2t 1 ) + " 1t X 2t = 1 4 (X 1t 1 X 2t 1 ) + " 2t (X 1t X 2t ) = 1 2 (X 1t 1 X 2t 1 ) + " 1t " 2t =) X 1t X 2t stationary (= y t ) (X 1t + X 2t ) = " 1t + " 2t =) X 1t + X 2t nonstationary random walk (= S t ) Granger Representation Theorem X 1t = 1 2 (S t + y t ) X 2t = 1 2 (S t y t ) 1. X t is nonstationary, X t is stationary 2. 0 X t is stationary with cointegration vector = (1; 1) 0 3. P t (" 1i + " 2i ) is a common trend
10 Two applications of the Granger Representation Theorem 1. The role of deterministic terms X t = 0 X t 1 + X t = C X t = C Xk 1 tx (" i + ) + ix t 1X Ci (" t i=0 i + + " t i + ) + A tx " i + Ct + stationary process Thus 1. Linear trend in general 2. If C =? ( 0??) 1 0? = 0 (or 0? = 0) : only constant term Other deterministics.
11 2. Asymptotics X [T u] p T X t = C = C tx " i + Ct + stationary process P [T u] " i p T [T u] + C p + T stationary process p T # d CW (u) The process X [T u] is proportional to T in the direction C; but orthogonal to this direction it behaves like T 1=2 : # P 0
Sealevel t 6 X t Xb X XXz 0 X t b X 1;t sp(? ) XXy X X 12 0 0? P t " i - T t 2.The process X 0 t = [SeaLevel t ; T t ] is pushed along the attractor set by the common trends and pulled towards the attractor set by the adjustment coefcients
13 4. HYPOTHESES OF INTEREST IN THE I(1) MODEL 1. Hypotheses on the rank H r : X t = 0 X t 1 + 1 X t 1 + D t + " t ; pr ; pr H 0 : : : H r : : : H p 2. Hypotheses on the long run relations 3. Hypotheses on
14 5. THE STATISTICAL ANALYSIS Test for misspecication of the VAR The VAR model assumes X t = X t 1 + Xk 1 ix t i + D t + " t 1. Linear conditional mean explained by the past observations and deterministic terms (Check for unmodelled systematic variation, the choice of lag length, choice of information set (data), possible outliers, nonlinearity, constant parameters) 2. Constant conditional variance (Check for ARCH effects, but also for regime shifts in the variance) 3. Independent Normal errors, mean zero, variance (Check for lack of autocorrelation, distributional form)
15 Estimation of the I(1) model H r : X t = 0 X t 1 + Xk 1 ix t i + D t + " t ; where " t i.i.d. N p (0; ) and and are (p r): Maximum likelihood is calculated by reduced rank regression of X t on X t for 1 corrected X t 1 ; : : : ; X t k+1 ; D t (T. W. Anderson, 1951). The estimate of are the r linear combinations of the data which have the largest empirical correlation with the stationary process X t : (Canonical variates and canonical correlations) If i, are the squared canonical correlations, then L 2=T max (H r ) _ j^j _ ry (1 ^i ); 2 log LR(H r jh p ) = T px i=r+1 log(1 ^i )
6. THE ANALYSIS OF THE DATA 16
17 DLEVEL 15 10 Actual and Fitted 1.00 0.75 0.50 Autocorrelations 5 0.25 0 0.00 0.25 5 0.50 10 0.75 15 1883 1890 1897 1904 1911 1918 1925 1932 1939 1946 1953 1960 1967 1974 1981 1988 1995 2002 1.00 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 Lag 3 2 Standardized Residuals 0.45 0.40 0.35 Histogram SB DH: ChiSqr(2) = 0.07 [0.96] K S = 0.95 [5% C.V. = 0.08] J B: ChiSqr(2) = 0.29 [0.86] 1 0.30 0 0.25 0.20 1 0.15 2 0.10 0.05 3 1883 1890 1897 1904 1911 1918 1925 1932 1939 1946 1953 1960 1967 1974 1981 1988 1995 2002 0.00 3.2 1.6 0.0 1.6 3.2 3.Residual analysis: Plot of SeaLevel t ; and tted value, normalized residuals, autocorrelations function of residuals and histogram
18 DTEMP 0.3 0.2 Actual and Fitted 1.00 0.75 0.50 Autocorrelations 0.1 0.25 0.0 0.00 0.25 0.1 0.50 0.2 0.75 0.3 1883 1890 1897 1904 1911 1918 1925 1932 1939 1946 1953 1960 1967 1974 1981 1988 1995 2002 1.00 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 Lag 3 2 Standardized Residuals 0.40 0.35 Histogram SB DH: ChiSqr(2) = 1.47 [0.48] K S = 0.89 [5% C.V. = 0.08] J B: ChiSqr(2) = 1.34 [0.51] 0.30 1 0.25 0 0.20 1 0.15 0.10 2 0.05 3 1883 1890 1897 1904 1911 1918 1925 1932 1939 1946 1953 1960 1967 1974 1981 1988 1995 2002 0.00 3.2 1.6 0.0 1.6 3.2 4.Residual analysis: Plot of T emperature t ; and tted value, normalized residuals, autocorrelations function of residuals and histogram
19 Roots of the Companion Matrix 1.0 Rank(PI)=2 0.5 0.0 0.5 1.0 1.0 0.5 0.0 0.5 1.0 1.5
20 The two eigenvectors for temperature sea level 0.24 0.12 0.00 0.12 0.24 0.36 1881 1890 1899 1908 1917 1926 1935 1944 1953 1962 1971 1980 1989 3.6 2.4 1.2 0.0 1.2 2.4 1881 1890 1899 1908 1917 1926 1935 1944 1953 1962 1971 1980 1989 5.The rst canonical variate is T t 1 0:0031 (t= 7:37) h t 1
21 0.6 Temperature versus Sea Level 0.4 Temperature 0.2 0.0 0.2 0.4 80 40 0 40 80 120 Sea Level
22 Rank determination for temperature and sea level data. Rank r = 0 is rejected and rank r = 1 is accepted p r r EigVal 2 log LR(H r jh p ) 95%Fract p-val 2 0 0:168 20:76 15:41 0:005 1 1 0:003 0:36 3:84 0:54 The tted ECM model for temperature and sea level data h t = 4:15 (T t 1 0:0031 h t 1) 0:2805 h t 1 + 3:04 T t 1 + 2:22 (t=0:86) (t= 7:37) (t= 3:11) (t=0:60) (t=3:55) T t = 0:40 (T t 1 0:0031 h t 1) 0:0024 h t 1 0:053 T t 1 0:023 (t= 4:26) (t= 7:37) (t= 1:40) (t= 0:54) (t= 1:91) Note h t weakly and strongly exogenous because the coefcients 4:15 to (T t 1 0:0031 h t 1) (t= 7:37) and 3:04 to T t 1 are insignicant.
A partial model for (T t ; h t ) conditional on the forcing (weakly exogenous) variables W mgg (CO2 and Methan) and Aerosols (Sulphate) Tt h t Cointegrating relation = (T + 1 h + 2 W mgg + 3 Aerosol) t 1 + + " t ^ 0 X t = T t 0:0065 (t= 3:52) h t 0:768 (t=4:68) W mgg t 1:478 (t=3:51) Aerosol t 23
24 0.4 Temperature against Temperature* 0.3 0.2 0.1 0.0 0.1 0.2 0.3 0.4 0.3 0.2 0.1 0.0 0.1 0.2 0.3
30 2.0 1.5 1.0 TEMP WMGGS LEVELSTAR AEROSOLSTAR TEMPSTAR 0.5 0.0 0.5 1.0 1.5 2.0 2.5 1881 1892 1903 1914 1925 1936 1947 1958 1969 1980 1991 6.Plot of the components of the tted Temperature for simplied forcing variables
25 7. ASYMPTOTIC ANALYSIS Asymptotic properties of the process X t and its product moments for model without deterministic terms. Three basic results T 1=2 X [T u] = CT 1=2 P [T u] " i + T 1=2 P 1 i=0 C i " t i d! CW (u) T 2 P T t=1 X t 1X 0 t 1 d! R 1 0 W (u)w (u)0 du where W is Brownian motion with variance : Test for rank px 2 log LR(H r jh p ) = T log(1 ^i )! d trf i=r+1 T 1 P T t=1 X t 1" 0 t Z 1 0 d! R 1 0 W (dw )0 Z 1 1 Z 1 (db)b 0 BB 0 B(dB) 0 g where B is standard Brownian motion. Limit invariant to distribution of i.i.d. (0; ) errors but depends on the choice of deterministic terms 0 0
26 4. ASYMPTOTIC DISTRIBUTION OF ^ Consider the model with no deterministic terms, r = 2; and identied by = ( h 1 p1 THEOREM If " t i.i.d. (0; ); then + H 1 ' 1 ; h 2 pm 1 m 1 1 p1 + H 2 ' 2 ): pm 2 m 2 1 TX T 1=2 d x [T u]! CW = G; and T 1 S 11 = T 2 x t 1 x 0 t 1 T ^'1 ^' 2 d! t=1 d! C Z 1 11 H1GH 0 1 12 H1GH 0 1 R! 2 H 0 1 21 H2GH 0 1 22 H2GH 0 1 R0 G(dV 1) 2 H2 0 1 0 G(dV ; 2) 0 W W 0 duc 0 = G where ij = 0 j 1 j ; and V = 0 1 W = (V 1 ; V 2 ) 0. Note that V = 0 1 W is independent of CW =? ( 0??) 1 0? W: Hence limit is mixed Gaussian. The estimators of the remaining parameters are asymptotically Gaussian and asymptotically independent of ^' and ^.
27 Example The simplest example is a cointegrating regression T (^ x 1t = x 2t ) = T 1 P T t=1 x 2t 1" 1t T 2 P T t=1 x2 2t 1 1 + " 1t ; and x 2t = " 2t where " t are i.i.d. Gaussian and f" 1t g and f" 2t g are independent. Then R 1 0 W 2dW 1 For given fx 2t ; t = 1; : : : ; T g ^j fx2t g N(; d! R 1 0 W 2 2 2 1 P T ) and (^ ) ( t=1 x2 2t 1 1 This implies that the marginal distributions satisfy ^ q ^ TX not Gaussian and ( V ar(^) 1 t=1 (u)du Mixed Gaussian TX x 2 2t 1) 1=2 j fx2t g N(0; 1) t=1 x 2 2t 1) 1=2 is Gaussian
28 The mixed Gaussian distribution. A simple simulation of the model X 1t = 1 (X 1t 1 X 2t 1 ) + " 1t and X 2t = " 2t where " t are i.i.d. N 2 (0; diag( 2 1; 2 2)). DGP ( 1 = 1; = 1; 2 1 = 2 2 = 1) : X 1t = X 2t 1 + " 1t ; X t = " 2t. MLE by regression: t=1 X 1t = 1 X 1t 1 + 2 X 2t 1 + " 1t ; ^ 1 = ^ 1 ; ^ = ^ 2 =^ 1 Asymptotic results for testing using the Wald test 1 TX ( (X 1t 1 ^X 2t 1 ) 2 ) 1=2 (^ 1 1 )! d N(0; 1); or (1 + ^2 ) 1=2 T 1=2 (^ ) ^ 1 ^ t=1 1 j^ 1 j TX ( X2t 2 ^ 1) 1=2 (^ )! d N(0; 1); or j^ Z 1 1j ( W 2 (u) 2 du) 1=2 T (^ )! d N(0; 1); 2 ^ 2 0 We need the distribution of ^ 1 and the joint distribution of ^; T 2 P T t=1 X2 2t 1 inference. to conduct
31 alpha alpha gamma 1.15 1.10 1.05 1.00 0.95 0.90 0.85 0.72 0.80 0.88 0.96 1.04 1.12 1.20 1.28 0.72 0.80 0.88 0.96 1.04 1.12 1.20 1.28 Estimate of gamma against information per observation 0 100 200 300 400 Information per observation Estimate of alpha against information per observation 0 100 200 300 400 Information per observation Estimate of alpha against information per observation 3.6 4.8 6.0 7.2 8.4 9.6 Information per observation 7.From the model x 1t = (x 1t 1 x 2t 1 ) + " 1t ; x t = " 2t ; T = 100; SIM = 400
29 8. CONCLUSION When modelling stochastically trending variables, the usual regression methods need to be modied, in order to guarantee valid inference. The vector autoregressive model allowing for I(1) variables and cointegration has been analysed in detail, and the challenge is to apply it to nonstationary climate data, in order to avoid the fallacies involved in spurious correlation and regression.