Co-integration rank determination in partial systems using information criteria

Giuseppe Cavaliere, Luca Fanelli, Luca De Angelis
Department of Statistical Sciences, University of Bologna
February 15, 2016
PRELIMINARY VERSION

Abstract

We investigate the asymptotic and finite sample properties of the most widely used information criteria for co-integration rank determination in partial systems, i.e. in co-integrated Vector Autoregressive (VAR) models in which a subset of variables of interest is modeled conditional on another subset of variables. The asymptotic properties of the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Hannan-Quinn Information Criterion (HQC) are established, and the consistency of the BIC and HQC is proved. Notably, the consistency of the BIC and HQC is robust to violations of the hypothesis of weak exogeneity of the conditioning variables with respect to the co-integration parameters. This result opens up interesting possibilities for practitioners, who can determine the co-integration rank in partial systems without being concerned with the hypothesis of weak exogeneity of the conditioning variables. A Monte Carlo experiment which uses a dynamic macro model as data generating process shows that the BIC and the HQC perform reasonably well in small samples, and comparatively better than traditional approaches for co-integration rank determination. We further show the usefulness of our approach by investigating the co-integration implications of a dynamic monetary business cycle model estimated on U.S. quarterly data. Overall, our analysis shows clear gains from combining information criteria with partial systems analysis.

Keywords: Information criteria, Co-integration, Partial system, Conditional model, VAR

1 Introduction

Co-integration rank determination in vector autoregressive (VAR) systems in which some variables of interest, $Y_t$, are modeled conditional on some other variables, $Z_t$, has been addressed in Harbo, Johansen, Nielsen and Rahbek (1998) [HJNR hereafter]. The idea of the conditional analysis is that the dimensionality of the system is reduced and, when the conditioning is valid, the co-integration tests may have better power properties. The partial system approach by HJNR, see also Johansen (1992, 1996), proves to be useful in situations in which (i) the practitioner is solely interested in modeling $Y_t$ and the dimension of the whole VAR system, $p := \dim(X_t)$ with $X_t := (Y_t', Z_t')'$, is large relative to the sample size $T$; (ii) it is known that $Z_t$ contributes to the long run equilibrium of the system, but either the theory is silent about the short run dynamics of $Z_t$, or the practitioner is not concerned with it; (iii) the time series used for $Z_t$ are considered poor proxies of the corresponding theoretical variables they should measure, especially as concerns their transitory dynamics. In general, the partial system approach is useful when it is known that the true co-integration rank, $r_0$, satisfies the condition $0 \le r_0 \le p_Y := \dim(Y_t)$ and $Z_t$ is integrated of order one (I(1)) and not co-integrated.

HJNR derive the asymptotic distribution of the likelihood ratio test for the co-integration rank in the partial (conditional) model for $Y_t$ given $Z_t$, and provide tables of critical values, under the assumption of weak exogeneity of $Z_t$ for the co-integrating parameters. In practice, however, the weak exogeneity requirement on $Z_t$ can nullify the benefits of the partial system analysis and limit its usefulness in applied work. Indeed, when the weak exogeneity of $Z_t$ fails, the practitioner cannot apply the asymptotic critical values tabulated in HJNR and must resort to the full system analysis.
On the other hand, in order to verify the hypothesis of weak exogeneity of $Z_t$ the practitioner cannot ignore the marginal model for $Z_t$. It is not surprising, therefore, that the partial system approach has only seldom been used in applied work; see e.g. Doornik et al. (1998) and Bruggeman et al. (2003).

In this paper, we put forth an information criteria-based approach for co-integration rank determination in the partial (conditional) model for $Y_t$ given $Z_t$, which does not require the hypothesis of weak exogeneity of $Z_t$ with respect to the co-integration parameters nor prior knowledge of the co-integrating order of $Z_t$. Our main result is that the use of either the BIC or the HQC in partial systems yields weakly consistent estimates of the true co-integration rank (the same result does not apply to the AIC). Compared to HJNR, the suggested approach does not require the use of tables of critical values: the selected co-integration rank is the one which minimizes the BIC or HQC across ranks $r = 0, 1, \dots, p_Y$. Notably, we prove that the consistency of the BIC and HQC is valid irrespective of whether $Z_t$ is weakly exogenous or not. The main implication of this robustness result is that the practitioner can determine the co-integration

rank in partial systems by the BIC and/or HQC while disregarding the marginal model for $Z_t$. The only requirement on the marginal model is that $Z_t$ be I(1) and not co-integrated. Thus, the suggested approach restores the main benefit of co-integration rank determination in partial systems.

Information-based methods are a well-established alternative to approaches based on Neyman-Pearson type tests for co-integration rank determination. In particular, Aznar and Salvador (2002) and Cavaliere, De Angelis, Rahbek and Taylor (2015a) [CDRT hereafter], among others, show that standard information criteria such as the familiar Bayesian Information Criterion (BIC) (Schwarz, 1978) and Hannan-Quinn Information Criterion (HQC) (Hannan and Quinn, 1979) provide an interesting alternative to Johansen's traditional co-integration rank tests, as they are weakly consistent estimators of the co-integration rank. Conversely, the Akaike Information Criterion (AIC) (Akaike, 1974) does not deliver a consistent estimate of the co-integration rank, as its penalty does not satisfy a required rate condition (formally defined in Section 3); it is therefore an asymptotically upward biased estimator of the co-integration rank both in full and partial systems.

We document the usefulness and advantages in finite samples of our approach through a Monte Carlo experiment which uses a dynamic macro model as data generating process. More precisely, we investigate the co-integration implications of a small-scale dynamic stochastic general equilibrium (DSGE) monetary model from a partial system approach, and compare results with those obtained with the full system analysis. We consider two scenarios: one in which the conditioning variables are weakly exogenous with respect to the co-integration parameters, and the other in which the weak exogeneity constraint does not hold, which is the typical situation in applied work.
Our simulation studies illustrate the benefit of the partial system analysis and show that the BIC and HQC tend to outperform traditional co-integration rank tests in small samples, irrespective of whether the conditioning variable is weakly exogenous or not for the co-integration parameters. In small samples, the frequency with which the HJNR test accepts the null hypothesis of co-integration rank less than or equal to the true co-integration rank is markedly lower than the corresponding selection frequencies provided by the BIC and the HQC. This is especially true when the weak exogeneity of the conditioning variables does not hold. Compared to the full system analysis, the information criteria perform in partial systems and finite samples no worse than they do in the full (unconditional) co-integrated VAR. More precisely, under weak exogeneity, the BIC and HQC in partial systems clearly outperform their analogues in full systems, in the sense that they tend to select the true co-integration rank more often. When weak exogeneity does not hold, the BIC and HQC in partial systems perform similarly to their analogues in full systems, in the sense that selection frequencies are roughly the same in the two situations, and do not perform worse than Johansen's (1996)

sequential test for co-integration rank.

We also provide an empirical illustration in which the small DSGE model used in the Monte Carlo study is taken to U.S. quarterly data over the Great Moderation period. We show that there are cases in which, even if $p := \dim(X_t)$ is not too large, it may be convenient to determine the co-integration rank by conditioning $Y_t$ with respect to $Z_t$. Our empirical results confirm that co-integration rank determination by the BIC and HQC delivers reasonable and reliable results in samples of lengths typically available to practitioners.

The remainder of the paper is organized as follows. Section 2 describes the co-integrated VAR model and Section 3 outlines the information criteria for co-integration rank determination in partial systems. The results from a Monte Carlo simulation experiment based on a small DSGE model are reported in Section 4, while a more comprehensive simulation study is summarized in the Technical Supplement associated with this paper.[1] Section 5 shows the usefulness of the suggested approach by testing the co-integration implications of the small DSGE model on U.S. quarterly data. Section 6 concludes. The proof of our main result, Theorem 1, is given in the Technical Supplement.

2 The co-integrated VAR model

In this section, we introduce the (full system) co-integrated VAR model and, in Section 2.1, define the corresponding conditional (or partial) model upon which the analysis will be based. Consider the $p$-dimensional process $\{X_t\}$ (the full system) which satisfies the $k$-th order reduced rank VAR model:

$$\Delta X_t = \alpha\beta' X_{t-1} + \sum_{i=1}^{k-1} \Gamma_i \Delta X_{t-i} + \alpha\rho' D_t + \phi d_t + \varepsilon_t, \qquad t = 1, \dots, T \qquad (2.1)$$

where $X_t := (X_{1t}, \dots, X_{pt})'$ and the initial values, $X_{1-k}, \dots, X_0$, are fixed in the statistical analysis.
In the context of (2.1), we assume that the standard I(1, $r_0$) conditions defined by Cavaliere, Rahbek and Taylor (2012) hold, where $r_0 \in \{0, \dots, p\}$ denotes the co-integration rank of the system; that is, the characteristic polynomial associated with (2.1) has $p - r_0$ roots equal to 1 with all other roots lying outside the unit circle, and $\alpha$ and $\beta$ have full column rank $r_0$. The innovation process $\varepsilon_t := (\varepsilon_{1t}, \dots, \varepsilon_{pt})'$ is assumed independent and distributed as $N_p(0, \Sigma)$, where $\Sigma$ is a $(p \times p)$ positive definite and symmetric matrix. The deterministic variables in (2.1) are taken to satisfy one of the following cases (see, e.g., Johansen, 1996): (i) $D_t = 0$, $d_t = 0$ (no deterministic component); (ii) $D_t = 1$, $d_t = 0$ (restricted constant); or (iii) $D_t = t$, $d_t = 1$ (restricted linear trend).

[1] The Technical Supplement is available at http://www2.stat.unibo.it/deangelis/ts_partial_ic.pdf
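To make the I(1, $r_0$) conditions concrete, the following is a minimal Python sketch, not the authors' OxMetrics code, for case (i) (no deterministic terms) with $k = 2$. It rewrites (2.1) in levels, counts unit eigenvalues of the companion matrix (which correspond to unit roots of the characteristic polynomial), and simulates a sample path. The function names, the burn-in length and the bivariate parameter values in the usage note are illustrative assumptions.

```python
import numpy as np


def companion_eigenvalues(alpha, beta, gamma1):
    """Eigenvalues of the companion matrix of (2.1) with k = 2, case (i).

    In levels: X_t = A1 X_{t-1} + A2 X_{t-2} + eps_t, with
    A1 = I + alpha beta' + Gamma_1 and A2 = -Gamma_1.
    Roots of the characteristic polynomial outside the unit circle
    correspond to companion eigenvalues inside it.
    """
    p = alpha.shape[0]
    a1 = np.eye(p) + alpha @ beta.T + gamma1
    a2 = -gamma1
    top = np.hstack([a1, a2])
    bottom = np.hstack([np.eye(p), np.zeros((p, p))])
    return np.linalg.eigvals(np.vstack([top, bottom]))


def count_unit_roots(alpha, beta, gamma1, tol=1e-6):
    """Number of companion eigenvalues equal to one (should be p - r0)."""
    eig = companion_eigenvalues(alpha, beta, gamma1)
    return int(np.sum(np.abs(eig - 1.0) < tol))


def simulate_vecm(alpha, beta, gamma1, sigma, T, burn_in=200, seed=0):
    """Simulate T observations from (2.1) with k = 2 and Gaussian errors.

    Initial values are fixed at zero and burn_in observations are dropped
    (both are illustrative choices).
    """
    rng = np.random.default_rng(seed)
    p = alpha.shape[0]
    chol = np.linalg.cholesky(sigma)
    n = T + burn_in
    x = np.zeros((n + 2, p))  # rows 0 and 1 hold the two initial values
    for t in range(2, n + 2):
        eps = chol @ rng.standard_normal(p)
        dx_lag = x[t - 1] - x[t - 2]
        dx = alpha @ (beta.T @ x[t - 1]) + gamma1 @ dx_lag + eps
        x[t] = x[t - 1] + dx
    return x[-T:]
```

For example, with the hypothetical values `alpha = [[-0.5], [0.0]]`, `beta = [[1.0], [-1.0]]` and `gamma1 = 0`, the system has $p = 2$, $r_0 = 1$ and `count_unit_roots` returns $p - r_0 = 1$.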

2.1 The partial model

Suppose that $X_t$ is decomposed as $X_t = (Y_t', Z_t')'$ where $Y_t$ is of dimension $p_Y$ and $Z_t$ is of dimension $p_Z$, with $p_Y + p_Z = p$. Therefore, $\varepsilon_t = (\varepsilon_{Yt}', \varepsilon_{Zt}')'$ in (2.1) are i.i.d. Gaussian with mean zero and covariance matrix

$$\Sigma := \begin{pmatrix} \Sigma_{YY} & \Sigma_{YZ} \\ \Sigma_{ZY} & \Sigma_{ZZ} \end{pmatrix}.$$

All other matrices of parameters of system (2.1) are partitioned conformably. As in HJNR, we define the partial and marginal models assuming, for simplicity but without loss of generality, that $k = 2$. Extension to the case $k > 2$ is straightforward. The conditional (or partial) model is defined as

$$\Delta Y_t = \omega \Delta Z_t + (\alpha_Y - \omega\alpha_Z)\beta' X_{t-1} + (\Gamma_{YY} - \omega\Gamma_{ZY})\Delta X_{t-1} + (\alpha_Y\rho_Y' - \omega\alpha_Z\rho_Z') D_t + (\phi_Y - \omega\phi_Z) d_t + \varepsilon_{ct} \qquad (2.2)$$

where $\omega := \Sigma_{YZ}\Sigma_{ZZ}^{-1}$ and $\varepsilon_{ct} := \varepsilon_{Yt} - \omega\varepsilon_{Zt}$, with covariance matrix $\Sigma_c = \Sigma_{YY.Z} := \Sigma_{YY} - \Sigma_{YZ}\Sigma_{ZZ}^{-1}\Sigma_{ZY}$. The marginal model is defined as

$$\Delta Z_t = \alpha_Z\beta' X_{t-1} + \Gamma_{ZY}\Delta X_{t-1} + \alpha_Z\rho_Z' D_t + \phi_Z d_t + \varepsilon_{Zt}. \qquad (2.3)$$

The following assumption is required on $Z_t$.

Assumption 1: The conditioning variable $Z_t$ is I(1) and such that its components are not co-integrated.

Remark 1: Assumption 1 implies that the co-integration rank of the partial model equals the co-integration rank of the full system. In particular, if Assumption 1 does not hold, then the co-integration rank in the partial model is $r_{Y.Z} = \min(r_0, p_Y)$. On the other hand, if Assumption 1 holds then $r_{Y.Z} = r_0$ and $p_Y \ge r_0 \ge 0$.

3 Information criteria for co-integration rank determination in partial systems

The statistical analysis of the likelihood function of the partial model in (2.2) can be performed by reduced rank regression of the process $\Delta Y_t$ on $X_{t-1}$, corrected for lagged differences and the conditioning (stationary) variable $\Delta Z_t$ (see Chapter 8 in Johansen, 1996, and HJNR).[2]

[2] For simplicity, here we consider the case of no deterministic components, i.e., case (i) in (2.1).

By

concentrating the likelihood function of the partial model in (2.2) with respect to the parameters $\omega$, $\alpha_Y$, $\beta$, $\Gamma_c$ and $\Sigma_c$, one obtains the residuals $R_{0t}$, $R_{1t}$ and $R_{\varepsilon t}$ of $\Delta Y_t$, $X_{t-1}$ and $\varepsilon_{ct}$, respectively, and the corresponding residual product moments

$$S_{ij} := T^{-1} \sum_{t=1}^{T} R_{it} R_{jt}', \qquad i, j = 0, 1, \varepsilon.$$

Up to a constant term which does not depend on $r$, the maximised pseudo log-likelihood function associated with (2.2) is $l_T(r) = -\frac{T}{2} \log|\hat\Sigma_r|$, where $\hat\Sigma_r$ is the residual covariance matrix estimate from the conventional reduced rank regression estimation of (2.2) under rank $r$ (see Johansen, 1996). Moreover, $|\hat\Sigma_r| = |S_{00}| \prod_{i=1}^{r} (1 - \hat\lambda_i)$, and so

$$l_T(r) = -\frac{T}{2} \log|S_{00}| - \frac{T}{2} \sum_{i=1}^{r} \log(1 - \hat\lambda_i),$$

where $S_{00}$ is $p_Y \times p_Y$ and $\hat\lambda_1 > \dots > \hat\lambda_{p_Y}$ are the $p_Y$ largest solutions to the eigenvalue problem

$$\left| \lambda S_{11} - S_{10} S_{00}^{-1} S_{01} \right| = 0. \qquad (3.1)$$

Note that $S_{11}$ is $p \times p$ but the rank of $S_{10} S_{00}^{-1} S_{01}$ is only $p_Y$ since $S_{00}$ is $p_Y \times p_Y$ and, thus, the eigenvalue problem in (3.1) has $p_Z$ zero solutions.

Let $IC(r) := -2 l_T(r) + p_T$ denote an information criterion, where $p_T = c_T \pi(r)$ is a penalty function which depends on the number of parameters $\pi(r)$ and on a term $c_T$, specified below. We have that

$$IC(r) = T \log|\hat\Sigma_r| + c_T \pi(r) = T \log|S_{00}| + T \sum_{i=1}^{r} \log(1 - \hat\lambda_i) + c_T \pi(r) \qquad (3.2)$$

where $\pi(r) = r(p_Y + p - r)$ when no deterministic component is involved, $\pi(r) = r(p_Y + p - r + 1)$ in the case of a restricted constant and $\pi(r) = r(p_Y + p - r + 1) + p_Y$ in the restricted trend case. Different values of the coefficient $c_T$ yield different information criteria through the resulting penalty function, $p_T$. As is well known, $c_T = 2$, $\log T$ and $2 \log\log T$ yield the AIC, BIC and HQC, respectively. The co-integration rank estimator is then given, in generic form, by

$$\hat r_{IC} := \arg\min_{r = 0, 1, \dots, p_Y} IC(r). \qquad (3.3)$$

It is worth noting that the determination of the co-integration rank based on the estimator in (3.3) exempts practitioners from the use of tables of asymptotic critical values, on which sequential testing procedures such as the one proposed by HJNR are based. Importantly, we do not need any assumption on $\alpha_Z$ in (2.3). In other words, our approach can also be applied when the variables in $Z_t$ adjust to $\beta' X_{t-1}$.

In the following theorem, we state our main result about the consistency of the information criteria. The key requirement is that the penalty term $c_T$ diverges at a rate lower than $T$, as $T \to \infty$.
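Before turning to the theorem, the criterion in (3.2) and the estimator in (3.3) can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: it covers only the case of no deterministic component, takes the residual matrices $R_0$ ($T \times p_Y$) and $R_1$ ($T \times p$) as given, and drops the term $T \log|S_{00}|$ because it is common to all ranks. The function names are assumptions introduced here.

```python
import numpy as np


def rrr_eigenvalues(r0_resid, r1_resid):
    """Solutions of |lambda S11 - S10 S00^{-1} S01| = 0, sorted descending.

    r0_resid: (T, p_Y) residuals of Delta Y_t; r1_resid: (T, p) residuals
    of X_{t-1}. Returns p eigenvalues, of which p_Z are numerically zero.
    """
    T = r0_resid.shape[0]
    s00 = r0_resid.T @ r0_resid / T
    s01 = r0_resid.T @ r1_resid / T
    s11 = r1_resid.T @ r1_resid / T
    m = s01.T @ np.linalg.solve(s00, s01)       # S10 S00^{-1} S01
    lam = np.linalg.eigvals(np.linalg.solve(s11, m)).real
    return np.sort(lam)[::-1]


def penalty_coefficient(criterion, T):
    """c_T for the AIC, BIC and HQC."""
    if criterion == "AIC":
        return 2.0
    if criterion == "BIC":
        return float(np.log(T))
    if criterion == "HQC":
        return float(2.0 * np.log(np.log(T)))
    raise ValueError(f"unknown criterion: {criterion}")


def select_rank(eigenvalues, T, p_Y, p, criterion="BIC"):
    """Return (argmin_r IC(r), array of IC(r)) for r = 0, ..., p_Y.

    IC(r) is computed up to the constant T log|S00|, with
    pi(r) = r (p_Y + p - r), i.e. no deterministic component.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1][:p_Y]
    c_T = penalty_coefficient(criterion, T)
    ic = np.array([
        T * np.sum(np.log1p(-lam[:r])) + c_T * r * (p_Y + p - r)
        for r in range(p_Y + 1)
    ])
    return int(np.argmin(ic)), ic
```

A typical call would be `select_rank(rrr_eigenvalues(R0, R1), T, p_Y, p, "HQC")`. Since $c_T$ enters only through the penalty, switching between criteria changes nothing in the eigenvalue step.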

Theorem 1: Let $\{X_t\}$ be generated as in (2.1) with the true parameters satisfying the I(1, $r_0$) conditions and let Assumption 1 hold. Then, as $T \to \infty$:
(i) for $r_0 < r \le p_Y$, $\mathrm{Prob}(IC(r) > IC(r_0)) \to 1$, provided $c_T \to \infty$;
(ii) for $0 \le r < r_0$, $\mathrm{Prob}(IC(r) > IC(r_0)) \to 1$, provided $c_T / T \to 0$.

Remark 2: The results in Theorem 1 imply that the use of either the BIC or the HQC penalty in the partial system (2.2) will yield a weakly consistent estimate of $r_0$. Conversely, the AIC will not deliver a consistent estimate of the co-integration rank, as its penalty $c_T = 2$ does not diverge and hence does not satisfy condition (i) in Theorem 1. Therefore, we will not consider the AIC in the Monte Carlo simulations discussed in Section 4 and in the empirical illustration summarized in Section 5.

Remark 3: Theorem 1 has been derived without any assumption about the weak exogeneity of $Z_t$ for the co-integration parameters. The robustness of the BIC and HQC to possible violations of weak exogeneity of the conditioning variables allows the practitioner to fully benefit from reductions of the dimensionality of the system in co-integration rank determination. HJNR's sequential test for co-integration rank in partial systems is based on asymptotic critical values tabulated under the maintained assumption of weak exogeneity of $Z_t$ for $\beta$. With the BIC and HQC, the practitioner can determine the co-integration rank without being concerned about the possible adjustment of $Z_t$ to $\beta' X_{t-1}$. Her only concern will be that $Z_t$ is I(1) and that its components do not co-integrate, as stated in Assumption 1.

Remark 4: Theorem 1 assumes that the system lag length $k$ in (2.1) is known. This assumption, which is unreasonable in practice, can be relaxed, and information criteria can be used to determine both the autoregressive lag length and the co-integration rank following either a joint or a sequential procedure; see Cavaliere, De Angelis, Rahbek and Taylor (2015b).
Moreover, Aznar and Salvador (2002) provide an approach based on information criteria to also determine the form of the deterministic component. While we do not consider the determination of the system lag order and the choice of the form of the deterministic component in this paper, the strategies outlined by Cavaliere et al. (2015b) and Aznar and Salvador (2002), respectively, can also be extended to the present framework.

4 Rank determination in a small DSGE model

In this section, we investigate the finite sample performance of the BIC and the HQC in partial systems through a Monte Carlo experiment which uses a dynamic macro model as data generating process. The design of the experiment is taken from Bårdsen and Fanelli (2015),

who consider a slightly adapted version of the small-scale DSGE monetary model estimated by Benati and Surico (2009).[3] The structural model features the following three equations:

$$\tilde{x}_t = \gamma E_t \tilde{x}_{t+1} + (1 - \gamma)\tilde{x}_{t-1} - \delta(R_t - E_t \pi_{t+1}) + \omega_{\tilde{x},t} \qquad (4.1)$$

$$\pi_t = \frac{\varrho}{1 + \varrho\chi} E_t \pi_{t+1} + \frac{\chi}{1 + \varrho\chi} \pi_{t-1} + \kappa \tilde{x}_t + \omega_{\pi,t} \qquad (4.2)$$

$$R_t = \rho R_{t-1} + (1 - \rho)(\varphi_\pi \pi_t + \varphi_{\tilde{x}} \tilde{x}_t) + \omega_{R,t} \qquad (4.3)$$

where

$$\omega_{a,t} = \rho_a \omega_{a,t-1} + u_{a,t}, \qquad -1 < \rho_a < 1, \qquad u_{a,t} \sim WN(0, \sigma_a^2), \qquad a = \tilde{x}, \pi, R \qquad (4.4)$$

and expectations are conditional on the agents' information set $\mathcal{F}_t$, i.e. $E_t \cdot := E(\cdot \mid \mathcal{F}_t)$. The first equation is a forward-looking IS curve, the second is the New Keynesian Phillips curve and the last is the policy rule through which the Central Bank fixes the short term interest rate. $\tilde{x}_t := (x_t - x_t^n)$ is the output gap, where $x_t$ is the log of output and $x_t^n$ the natural rate of output; $\pi_t$ is the inflation rate and $R_t$ is the nominal interest rate; $\omega_{a,t}$, $a = \tilde{x}, \pi, R$, are stochastic disturbances autocorrelated of order one, while the $u_{a,t}$, $a = \tilde{x}, \pi, R$, can be interpreted as demand, supply and monetary shocks, respectively. The structural parameters are collected in the vector $\theta = (\gamma, \delta, \varrho, \chi, \kappa, \rho, \varphi_\pi, \varphi_{\tilde{x}}, \rho_{\tilde{x}}, \rho_\pi, \rho_R, \sigma_{\tilde{x}}^2, \sigma_\pi^2, \sigma_R^2)'$ and their economic interpretation may be found in Benati and Surico (2009).

System (4.1)-(4.4) does not specify how the natural rate of output is generated. To complete the specification, we depart from Benati and Surico's (2009) baseline model and assume that $x_t^n$ obeys the equation

$$\Delta x_t^n = \psi(R_{t-1} - \pi_{t-1}) + \omega_{x^n,t}, \qquad \omega_{x^n,t} \sim WN(0, \sigma_{x^n}^2) \qquad (4.5)$$

in which $\psi \le 0$ is an additional auxiliary parameter which plays an important role in our setup. It is seen that when $\psi = 0$, $x_t^n$ in (4.5) follows a driftless random walk which captures the effects of technology shocks and represents the I(1) stochastic trend driving the system. Instead, with $\psi < 0$, $x_t^n$ is still I(1) but is negatively affected by the level of the lagged (ex-post) real interest rate.
It is worth remarking that equation (4.5) makes no claim to realism. We include it artificially in the small DSGE model simply to illustrate the consequences of the validity or violation of the condition of weak exogeneity of $x_t^n$ on co-integration rank determination, see below.

[3] A comprehensive simulation study which considers different data generating processes outside the DSGE framework may be found in the Technical Supplement.

System (4.1)-(4.5) with $\psi = 0$ is analyzed in Bårdsen and Fanelli (2015) to investigate the testable implications of small DSGE models, including their common trend/co-integration properties. Indeed, the appealing feature of system (4.1)-(4.5) is that, under particular conditions, its rational expectations solution can be approximated by a four-dimensional ($p = 4$)

co-integrated VAR system for $X_t := (x_t, \pi_t, R_t, x_t^n)'$. The testable co-integration implications of this DSGE model are captured by the term

$$\beta' X_{t-1} := \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_{t-1} \\ \pi_{t-1} \\ R_{t-1} \\ x_{t-1}^n \end{pmatrix} \qquad (4.6)$$

which implies (true) co-integration rank $r_0 = 3$, and requires that the output gap $x_t - x_t^n$, the inflation rate and the policy rate are jointly stationary. Moreover, it can be shown that when $\psi = 0$ in (4.5), the variable $x_t^n$ does not adjust to the three co-integrating relations, i.e. it is weakly exogenous with respect to the matrix $\beta$ in (4.6), while when $\psi < 0$, the variable $x_t^n$ adjusts to the inflation rate and the policy rate.

It is seen that in this model, the natural rate of output enters the co-integration relations in (4.6). However, a practitioner might not be interested in modeling $x_t^n$ directly within the VAR framework. Indeed, the natural rate of output is a variable which captures a highly theoretical concept: it refers to the highest level of real output that can be sustained over the long term, and there is some disagreement among economists as to how to characterize such a concept in applied work. In many empirical business cycle models $x_t^n$ is treated as an unobserved (latent) state variable, while in others $x_t^n$ is proxied with measures of economic activity provided by official institutions or national statistical institutes (see the empirical illustration in Section 5). Thus, although the dimension of the full system is not particularly large, in this example the practitioner might prefer not to model $x_t^n$ directly.

In light of the above considerations, $x_t^n$ seems to be a natural conditioning variable for the purpose of testing the co-integration rank in $X_t := (x_t, \pi_t, R_t, x_t^n)'$. We thus consider the partition $X_t = (Y_t', Z_t')'$ with $Y_t = (x_t, \pi_t, R_t)'$ ($p_Y = 3$) and $Z_t = (x_t^n)$ ($p_Z = 1$), and the partial system approach discussed in the previous sections.
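As a quick numerical check of the structure implied by (4.6), the short Python sketch below builds $\beta'$, confirms that its rank is $r_0 = 3$, and recovers the orthogonal complement $\beta_\perp$, which spans the direction along which the single common stochastic trend loads on $X_t$. The variable names and the normalisation of $\beta_\perp$ are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

# beta' from (4.6); columns correspond to (x_t, pi_t, R_t, x^n_t).
beta_T = np.array([
    [1.0, 0.0, 0.0, -1.0],   # output gap x_t - x^n_t
    [0.0, 1.0, 0.0, 0.0],    # inflation rate
    [0.0, 0.0, 1.0, 0.0],    # policy rate
])

p = beta_T.shape[1]                        # dimension of the full system
r0 = int(np.linalg.matrix_rank(beta_T))    # co-integration rank

# Orthogonal complement of beta via the SVD: the last p - r0 right
# singular vectors span the null space of beta'.
_, _, vh = np.linalg.svd(beta_T)
beta_perp = vh[r0:].T                      # shape (p, p - r0)
beta_perp = beta_perp / beta_perp[0, 0]    # normalise the first entry to one

# Partition X_t = (Y_t', Z_t')' used in the partial system analysis.
y_idx, z_idx = [0, 1, 2], [3]
p_Y, p_Z = len(y_idx), len(z_idx)
```

Under this normalisation `beta_perp` is proportional to $(1, 0, 0, 1)'$: the stochastic trend moves output and the natural rate of output one-for-one and leaves inflation and the policy rate stationary.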
Thus, we can devote attention solely to the conditional system (2.2).

In the generation of the data, the values of the structural parameters $\theta$ in system (4.1)-(4.5) are fixed as in Bårdsen and Fanelli (2015), and the structural shocks $u_{a,t}$, $a = \tilde{x}, \pi, R$, are assumed Gaussian. The auxiliary parameter $\psi$ is fixed as detailed below. Given $\theta$ and $\psi$, it is possible to obtain the coefficients that characterize the four-dimensional VAR for $X_t$ from the rational expectations solution of system (4.1)-(4.5); see Bårdsen and Fanelli (2015), Section 5, for details. Samples of length $T = 100$ and $T = 200$ are generated $M = 5000$ times from VARs of the form (2.2) with $k = 2$ lags. The matrix $\beta$ is fixed as in (4.6) ($r_0 = 3$), while the remaining parameters in $\alpha$, $\Gamma_1$ and $\Sigma$ depend nonlinearly on $\theta$ and $\psi$. No deterministic component is included because the variables generated from the DSGE model represent deviations from constant steady states (means, linear deterministic trends, etc.). The initial conditions $X_{-1}$ and $X_0$ are fixed to zero. For each replication, a sample of $T + 200$ observations is generated

and the first 200 observations are then discarded.[4] On each generated sample, the partial (conditional) system (2.2) is estimated and the procedure for co-integration rank determination discussed in Section 3 is then applied. Following Theorem 1, the co-integration rank is determined by the BIC and HQC.

We consider two scenarios. In the first, denoted Scenario A, the auxiliary parameter $\psi$ is set to $\psi = 0$ in Eq. (4.5). This implies that the matrix of adjustment coefficients, $\alpha$, incorporates the hypothesis of weak exogeneity of $Z_t = x_t^n$ with respect to $\beta$ in (4.6), hence the last row of $\alpha$ is zero in the data generating process. In the second, denoted Scenario B, the auxiliary parameter $\psi$ is fixed to a negative value in Eq. (4.5), chosen such that the uniqueness and stability of the rational expectations solution of the DSGE model is still satisfied. This implies a structure for the matrix $\alpha$ in which the last row is not zero: in particular, $Z_t = x_t^n$ adjusts to the (lagged) inflation and policy rates, hence only the (4,1) element of the matrix $\alpha$ is equal to zero in the data generating process.

Scenario A: weak exogeneity

Results are reported in Table 1, which summarizes selection frequencies, i.e., the percentage of times in which the BIC (second column) and the HQC (third column) select $r = 0, 1, 2, 3 = p_Y$, respectively, out of $M = 5000$ Monte Carlo replications. We compare the finite sample performance of these information criteria with that of HJNR's sequential test for co-integration rank in partial systems, using the 0.05 and 0.10 nominal levels of significance (fourth and fifth columns). The asymptotic critical values for this test are taken from Table 4 of HJNR. Results obtained from the partial system approach are also compared with those obtained by the (unconditional) full VAR specification in (2.1).
Specifically, we consider the BIC and HQC (sixth and seventh columns) as outlined in CDRT and Johansen's (1996) sequential test for co-integration rank (eighth and ninth columns). Also in this case, Johansen's test is performed at the 0.05 and 0.10 nominal levels of significance.

[4] All calculations in this and in the next section have been performed in OxMetrics. Codes are available from the authors upon request.

From the left panel of Table 1, we notice that for $T = 100$, the BIC (HQC) selects the true co-integration rank, $r = r_0 = 3$, 91.6% (98.9%) of the time. Instead, $r = 2$ is selected less than 5% (1%) of the time, $r = 1$ 3.1% (0.1%) of the time and $r = 0$ less than 1% of the time for both information criteria. For $T = 200$, the BIC and the HQC select $r = r_0 = 3$ in all Monte Carlo replications. The two criteria clearly outperform HJNR's sequential test in terms of selection frequencies. Indeed, when $T = 100$ HJNR's sequential test determines the true co-integration rank 61.6% (80.3%) of the time at the 5% (10%) nominal significance level. On the other hand, when $T = 200$, HJNR's sequential test performs equally well as BIC and

HQC with a selection frequency of the true co-integration rank which exceeds 99% for both significance levels.

Comparing the results in the left and right panels of Table 1, we observe that the two information criteria perform remarkably better in the partial model than in the (unconditional) full system. In particular, in the full system, the BIC and HQC select $r = 3$ only 7.2% and 51.9% of the time for $T = 100$, and these frequencies increase to 64.5% and 90.5% for $T = 200$. The selection frequencies for the true rank, $r = r_0 = 3$, for Johansen's sequential tests are 33.4% and 90.7% (48.4% and 88.9%) at the 5% (10%) significance level, when $T = 100$ and 200, respectively. These results highlight the benefit of analyzing the partial system in place of the (unconditional) full VAR system in this case.

Scenario B: no weak exogeneity

Results reported in Table 2 show the selection frequencies, i.e., the percentage of times in which the BIC (second column) and the HQC (third column) select $r = 0, 1, 2, 3$, respectively. Also in this case we report the selection frequencies associated with HJNR's sequential test (fourth and fifth columns) and, as in the previous scenario, we also determine the co-integration rank through a full (unconditional) system approach, using the BIC (sixth column) and the HQC (seventh column) as in CDRT, and Johansen's sequential test (eighth and ninth columns).

From the left panel of Table 2 we notice that for $T = 100$, the true co-integration rank $r = r_0 = 3$ is selected by the BIC (HQC) 59% (80%) of the time; $r = 2$ is selected less than 30% (20%) of the time, $r = 1$ 9.5% (0.5%) of the time and, finally, rank $r = 0$ is selected 2.5% (0%) of the time. For $T = 200$, these selection frequencies become 86% (94.5%) for $r = r_0 = 3$ and 14% (5.5%) for $r = 2$, and are equal to zero for both $r = 1$ and $r = 0$.
Both information criteria outperform HJNR's sequential test, which selects the true co-integration rank only 29% of the time for $T = 100$ and 69% of the time for $T = 200$ when the 5% asymptotic critical values are used. These selection frequencies increase to 44% ($T = 100$) and 80% ($T = 200$), respectively, when the 10% asymptotic critical values are used. This evidence was expected: the asymptotic critical values upon which the test is computed are still taken from Table 4 of HJNR, which assumes the weak exogeneity of the conditioning variables, and hence are not the correct ones in this scenario.

Comparing the results in the right panel of Table 2 (full system analysis) with those in the left panel (partial system analysis), we observe that for both $T = 100$ and $T = 200$, the empirical performance of the BIC and HQC is substantially the same in the two situations (the BIC performs slightly better in the partial system analysis, while the HQC performs slightly better in the full system analysis). Notably, while the BIC in the partial system analysis is outperformed by Johansen's sequential test in the full system analysis when the 5% asymptotic critical values are used, the HQC in the partial system performs almost as

well as Johansen's sequential test in the complete VAR. The important message we derive from Table 2 is that when the conditioning variables are not weakly exogenous for the co-integration parameters, the BIC and especially the HQC perform in partial systems no worse than they perform in full systems in samples of lengths typically available to practitioners. Thus, there is no loss of information in determining the co-integration rank from the conditional system of reduced dimension.

Overall, the results of our Monte Carlo simulations in Scenarios A and B, along with the simulation results in the Technical Supplement, show clear benefits from combining information criteria with partial systems analyses.

5 Empirical illustration

In this section, we investigate the co-integration implications of the DSGE model (4.1)-(4.5) on U.S. quarterly data. The variables in $X_t = (x_t, \pi_t, i_t, x_t^n)'$ are chosen as follows: $x_t$ is real GDP; $\pi_t$ is measured by the quarterly growth rate of the GDP deflator; $i_t$ is measured by the effective federal funds rate expressed in quarterly terms (averages of monthly values); $x_t^n$ is approximated with the official measure provided by the Congressional Budget Office (CBO). The data source is the web site of the Federal Reserve Bank of St. Louis. The data cover the Great Moderation period, 1985Q1-2008Q3, hence we have $T = 95$ observations (not including initial lags). The choice of the sample is motivated in detail in Bårdsen and Fanelli (2015).

If the VAR model for $X_t = (x_t, \pi_t, i_t, x_t^n)'$ is interpreted as the solution of the small DSGE model (4.1)-(4.5), then the system should be driven by a single stochastic trend capturing the cumulated effects of technology shocks. As observed in Section 4, although the dimension of $X_t$ is not particularly large, in this setup $x_t^n$ represents a natural conditioning variable for the purpose of testing the co-integration implications of our small DSGE model.
Hence, we consider the partition $X_t = (Y_t', Z_t')'$, with $Y_t = (x_t, \pi_t, i_t)'$ ($p_Y = 3$) and $Z_t = (x_t^n)$ ($p_Z = 1$), and apply the BIC and HQC information criteria. For comparative purposes, we also compute the sequential co-integration rank testing procedure of HJNR. Furthermore, to show the usefulness of the partial system approach relative to the (unconditional) full system analysis, we also determine the co-integration rank through the VAR model for $X_t$.

The partial (conditional) system (2.2) is estimated on the period 1985Q1-2008Q3 with $k = 2$ lags and a restricted constant. Results are reported in the left panel of Table 3, which summarizes the co-integration rank selected by the BIC and HQC, and by HJNR's sequential test for co-integration rank. It is seen that both the BIC and HQC select co-integration rank $r = 3$, as predicted by the DSGE model. HJNR's sequential test also selects rank $r = 3$ if one considers the 0.05 nominal level of significance, but it also leaves some room for rank $r = 2$,

which would be selected using the 0.01 nominal level of significance. Recall, however, that the p-values associated with HJNR's sequential test have been computed using asymptotic critical values tabulated under the maintained hypothesis of weak exogeneity of $Z_t = (x^n_t)$. The right panel of Table 3 summarizes the results obtained with the full VAR system analysis and reports the co-integration rank selected by the BIC and HQC and by Johansen's sequential test for co-integration rank. In this case, it is the VAR system (2.1) which is estimated on the period 1985Q1-2008Q3 with k = 2 lags and a restricted constant. We notice that now the BIC and Johansen's sequential test select rank r = 1, while the HQC selects r = 3 as in the conditional model (see footnote 5).

We conclude our empirical illustration by testing whether, for fixed co-integration rank r = 3, the variable $Z_t = x^n_t$ is weakly exogenous with respect to $\beta$ in (2.1). This amounts to testing whether the last row of the $p \times 3$ matrix $\alpha$ is zero in system (2.1) for a given (exactly) identified formulation of the $p \times 3$ matrix $\beta$. The likelihood-ratio statistic for the three zero restrictions on the $\alpha$ matrix is equal to 35.2 and, when compared with the asymptotic critical value taken from the $\chi^2$ distribution with 3 degrees of freedom, leads to a firm rejection of the null hypothesis of weak exogeneity. This evidence suggests that the p-values associated with HJNR's sequential test reported in the left panel of Table 3 must be interpreted with caution, because they have been computed using asymptotic critical values that are valid only under weak exogeneity.

6 Conclusions

Information criteria have recently been used as a valid alternative to co-integration rank tests in co-integrated VAR systems. There are situations in which it is convenient to devote attention to the model that characterizes the dynamics of the variables of interest, conditional on some other variables whose dynamic characterization is not of primary interest.
This approach is useful either in large systems, or when the practitioner benefits from marginalizing out the variables which contribute to the long run equilibrium but do not play a key role for the purposes of the analysis. In this paper we have extended the use of information criteria for co-integration rank determination to the case of partial systems. We have proved the consistency of the BIC and HQC in partial systems and shown the robustness of these criteria to violations of the weak exogeneity of the conditioning variables for the co-integration parameters. A Monte Carlo experiment and an empirical illustration based on a small-scale DSGE model have shown that these criteria perform reasonably well in small samples and represent valid alternatives to the traditional approaches to co-integration rank determination.

Footnote 5. The analysis in Bårdsen and Fanelli (2015) suggests that, if the co-integration rank is fixed at r = 3, a likelihood-ratio test of the restriction that $\beta$ takes the form given in (4.6) leads to a p-value of 0.009 when the asymptotic critical values are used, and to a bootstrap p-value of 0.04 when the bootstrap version of the likelihood-ratio test is computed along the lines of Boswijk et al. (2015). Overall, we can interpret the results in Table 3, together with this evidence, as substantially supportive of the common trend/co-integration implications of the small DSGE model (4.1)-(4.5) over the Great Moderation period.

References

Akaike, H. (1974): A new look at the statistical model identification, IEEE Transactions on Automatic Control 19, 716-723.

Aznar, A. and M. Salvador (2002): Selecting the rank of the cointegration space and the form of the intercept using an information criterion, Econometric Theory 18, 926-947.

Bårdsen, G. and L. Fanelli (2015): Frequentist evaluation of small DSGE models, Journal of Business and Economic Statistics 33, 307-322.

Benati, L. and P. Surico (2009): VAR analysis and the Great Moderation, American Economic Review 99, 1636-1652.

Boswijk, P., G. Cavaliere, A. Rahbek and A.M.R. Taylor (2015): Inference on co-integration parameters in heteroskedastic vector autoregression, Journal of Econometrics, forthcoming.

Bruggeman, A., P. Donati and A. Warne (2003): Is the demand for Euro area M3 stable?, ECB Working Paper Series No. 255.

Cavaliere, G., A. Rahbek and A.M.R. Taylor (2012): Bootstrap determination of the cointegration rank in VAR models, Econometrica 80, 1721-1740.

Cavaliere, G., L. De Angelis, A. Rahbek and A.M.R. Taylor (2015a): A comparison of sequential and information-based methods for determining the co-integration rank in heteroskedastic VAR models, Oxford Bulletin of Economics and Statistics 77, 106-128.

Cavaliere, G., L. De Angelis, A. Rahbek and A.M.R. Taylor (2015b): Determining the cointegration rank in heteroskedastic VAR models of unknown order, http://www2.stat.unibo.it/deangelis/ic2-wp.pdf.

Doornik, J., D.F. Hendry and B. Nielsen (1998): Inference in cointegrating models: UK M1 revisited, Journal of Economic Surveys 12, 533-572.

Hannan, E.J. and B.G. Quinn (1979): The determination of the order of an autoregression, Journal of the Royal Statistical Society, Series B 41, 190-195.

Harbo, I., S. Johansen, B. Nielsen and A. Rahbek (1998): Asymptotic inference on cointegrating rank in partial systems, Journal of Business & Economic Statistics 16, 388-399.

Johansen, S. (1992): Cointegration in partial systems and the efficiency of single-equation analysis, Journal of Econometrics 52, 389-402.

Johansen, S. (1996): Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, Oxford: Oxford University Press.

Schwarz, G. (1978): Estimating the dimension of a model, Annals of Statistics 6, 461-464.

Table 1: Scenario A - Case of weak exogeneity
DGP: same as Bårdsen and Fanelli (2015, JBES). Entries are the percentage of Monte Carlo replications in which each rank is selected.

                      Partial system                           Full system
             BIC     HQC   HJNR(5%)  HJNR(10%)      BIC     HQC  Johan(5%)  Johan(10%)
T = 100
  r = 0      0.4     0.0      0.0       0.0        32.0     0.4      0.0        0.0
  r = 1      3.1     0.1      3.3       1.0        42.5     9.6     10.9        4.3
  r = 2      4.9     1.0     35.1      18.7        17.9    31.7     52.1       38.8
  r = 3     91.6    98.9     61.6      80.3         7.2    51.9     33.4       48.4
  r = 4       -       -        -         -          0.4     6.4      3.6        8.5
T = 200
  r = 0      0.0     0.0      0.0       0.0         0.0     0.0      0.0        0.0
  r = 1      0.0     0.0      0.0       0.0         4.7     0.0      0.0        0.0
  r = 2      0.0     0.0      0.6       0.1        29.0     1.8      4.7        1.4
  r = 3    100.0   100.0     99.4      99.9        64.5    90.5     90.7       88.9
  r = 4       -       -        -         -          1.8     7.7      4.6        9.7

Critical values of HJNR tests taken from Table 4 of Harbo et al. (1998); critical values of Johansen's test taken from Table 15.3 of Johansen (1996).

Table 2: Scenario B - Case of no weak exogeneity
DGP: same as Bårdsen and Fanelli (2015, JBES) but with weak exogeneity relaxed in the potential output equation. Entries are the percentage of Monte Carlo replications in which each rank is selected.

                      Partial system                           Full system
             BIC     HQC   HJNR(5%)  HJNR(10%)      BIC     HQC  Johan(5%)  Johan(10%)
T = 100
  r = 0      2.5     0.0      0.1       0.0         1.4     0.0      0.0        0.0
  r = 1      9.5     0.5     14.6       6.7        19.0     0.5      0.3        0.0
  r = 2     28.8    19.5     56.2      49.4        23.3     4.5     15.9        6.3
  r = 3     59.2    80.0     29.1      43.9        52.9    84.5     78.0       82.4
  r = 4       -       -        -         -          3.4    10.5      5.8       11.3
T = 200
  r = 0      0.0     0.0      0.0       0.0         0.0     0.0      0.0        0.0
  r = 1      0.0     0.0      0.0       0.0         0.0     0.0      0.0        0.0
  r = 2     13.7     5.5     30.7      20.0         0.5     0.0      0.0        0.0
  r = 3     86.3    94.5     69.3      80.0        96.5    91.3     94.6       89.3
  r = 4       -       -        -         -          3.0     8.7      5.4       10.7

Critical values of HJNR tests taken from Table 4 of Harbo et al. (1998); critical values of Johansen's test taken from Table 15.3 of Johansen (1996).
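The r = 3 rows of Tables 1 and 2 can be set side by side to compare how often the BIC and HQC pick that rank in the partial and in the full system. The short Python sketch below does only this bookkeeping. It assumes that r = 3 is the true rank in the data generating process (the rank implied by the DSGE model in Section 5), and the dictionary layout and variable names are hypothetical, chosen for illustration.

```python
# Percentage of replications selecting r = 3, copied from Tables 1 and 2.
# Assumption: r = 3 is the true co-integration rank in the DGP.
# Keys: (scenario, T, criterion) -> (partial system, full system)
freq_r3 = {
    ("A", 100, "BIC"): (91.6, 7.2),
    ("A", 100, "HQC"): (98.9, 51.9),
    ("A", 200, "BIC"): (100.0, 64.5),
    ("A", 200, "HQC"): (100.0, 90.5),
    ("B", 100, "BIC"): (59.2, 52.9),
    ("B", 100, "HQC"): (80.0, 84.5),
    ("B", 200, "BIC"): (86.3, 96.5),
    ("B", 200, "HQC"): (94.5, 91.3),
}

# Gap in percentage points: partial system minus full system.
gaps = {key: round(partial - full, 1) for key, (partial, full) in freq_r3.items()}

for (scenario, T, criterion), gap in gaps.items():
    print(f"Scenario {scenario}, T = {T}, {criterion}: {gap:+.1f} points")
```

A positive gap means that the criterion selects r = 3 more often in the partial system than in the full system for that scenario and sample size.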

Table 3: Empirical results from systems estimated on U.S. quarterly data
Sample: 1985Q1-2008Q3 (T = 95)

                 Partial system                       Full system
          BIC    HQC    HJNR                   BIC    HQC    Johansen
  r = 0    *      *     71.1 [0.00]             *      *     108.7 [0.00]
  r = 1    *      *     28.8 [0.00]            sel     *      33.0 [0.08]
  r = 2    *      *     11.2 [0.02]             *      *      15.1 [0.23]
  r = 3   sel    sel      -                     *     sel      2.4 [0.69]
  r = 4                                         *      *        -

sel stands for selected, * stands for rejected; p-values in brackets. The co-integration rank predicted by the theory under the null of the DSGE model is r = 3.
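Two numerical statements in Section 5 can be checked directly from the figures printed in the text and in Table 3: the rank picked by a sequential test at a given nominal level, and the size of the p-value behind the weak exogeneity rejection. The Python sketch below is illustrative only. The function names are hypothetical. The stopping rule (start at r = 0 and stop at the first hypothesis that is not rejected) is assumed to be the usual sequential procedure. The p-values are used as rounded in the table, and the upper tail of the chi-square distribution with 3 degrees of freedom is computed from its standard closed form.

```python
import math


def sequential_rank(p_values, level, max_rank):
    """Return the first rank r whose null H(r) is not rejected at `level`.

    p_values[r] is the p-value of the test of rank r; if every tabulated
    hypothesis is rejected, the largest admissible rank is returned.
    """
    for r, p in enumerate(p_values):
        if p >= level:
            return r
    return max_rank


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# p-values as printed (rounded) in Table 3
hjnr_p = [0.00, 0.00, 0.02]            # partial system, r = 0, 1, 2
johansen_p = [0.00, 0.08, 0.23, 0.69]  # full system, r = 0, 1, 2, 3

partial_5 = sequential_rank(hjnr_p, 0.05, max_rank=3)
partial_1 = sequential_rank(hjnr_p, 0.01, max_rank=3)
full_5 = sequential_rank(johansen_p, 0.05, max_rank=4)

# Weak exogeneity: LR statistic 35.2 with 3 restrictions
weak_exog_p = chi2_sf_3df(35.2)

print("HJNR, 5% level:    r =", partial_5)
print("HJNR, 1% level:    r =", partial_1)
print("Johansen, 5% level: r =", full_5)
print("Weak exogeneity p-value: %.1e" % weak_exog_p)
```

With the tabulated p-values this reproduces the choices described in the text: r = 3 at the 5% level and r = 2 at the 1% level in the partial system, and r = 1 at the 5% level in the full system.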