UNIT ROOT MODEL SELECTION

BY PETER C. B. PHILLIPS

COWLES FOUNDATION PAPER NO. 1231

COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
YALE UNIVERSITY
Box 208281
New Haven, Connecticut 06520-8281

2008

http://cowles.econ.yale.edu/
J. Japan Statist. Soc.
Vol. 38 No. 1 2008 65–74

UNIT ROOT MODEL SELECTION*

Peter C. B. Phillips**

Some limit properties for information based model selection criteria are given in the context of unit root evaluation and various assumptions about initial conditions. Allowing for a nonparametric short memory component, standard information criteria are shown to be weakly consistent for a unit root provided the penalty coefficient $C_n \to \infty$ and $C_n/n \to 0$ as $n \to \infty$. Strong consistency holds when $C_n/(\log\log n)^3 \to \infty$ under conventional assumptions on initial conditions and under a slightly stronger condition when initial conditions are infinitely distant in the unit root model. The limit distribution of the AIC criterion is obtained.

Key words and phrases: AIC, consistency, model selection, nonparametric, unit root.

1. Introduction

Following Akaike (1969, 1973, 1977), information criteria have been systematically explored for order selection purposes, often in the context of time series models like autoregressions. The methods have been studied in both stationary and nonstationary models (Tsay (1984), Pötscher (1989), Wei (1992), Nielsen (2006)) and are widely used in practical work.

A commonly occurring problem in modern time series, particularly econometrics, is model evaluation that involves testing for a unit root and cointegration. Again, order selection methods have been considered in this context (Phillips and Ploberger (1996), Phillips (1996), Kim (1998)). If the focus is on these particular features of a time series then it is not necessary to build a complete model and it is often desirable to perform the evaluation in a semiparametric context allowing for a general short memory component in the series.

The present note looks at the specific issue of unit root evaluation by information criteria. We seek to distinguish processes with a unit root (UR) from stationary series (SS). The UR model has the autoregressive form

$$X_t = \rho X_{t-1} + u_t, \quad \rho = 1, \quad t \in \{1, \ldots, n\}, \tag{1.1}$$

where $u_t$ is a weakly dependent stationary time series with zero mean and continuous spectral density $f_u(\lambda)$.
The series $X_t$ is initialized at $t = 0$ by some (possibly random) quantity $X_0$. The SS model has the form $X_t = u_t$, so that $\rho = 0$ in (1.1). We aim to treat (1.1) semiparametrically with regard to $u_t$, and in this context $\rho = 0$ is effectively equivalent to $|\rho| < 1$ in (1.1).

Accepted December 8, 2007.
*Thanks go to Xu Cheng and a referee for comments on the original version. Partial support is acknowledged from a Kelly Fellowship and from the NSF under Grant No. SES 06-47086.
**University of Auckland, University of York, Singapore Management University, Yale University; Department of Economics, Yale University, P.O. Box 208281, New Haven, CT 06520-8281, U.S.A. Email: peter.phillips@yale.edu
Standard order selection criteria may be used to evaluate whether $\rho = 1$ or $\rho = 0$ in (1.1). The criteria have the following form

$$IC_k = \log \hat\sigma_k^2 + \frac{k C_n}{n} \tag{1.2}$$

with coefficient $C_n = \log n$, $\log\log n$, $2$ corresponding to the BIC (Schwarz (1978), Akaike (1977), Rissanen (1978)), Hannan and Quinn (1979), and Akaike (1973) penalties, respectively. Sample information-based versions of the coefficient $C_n$ may also be employed, such as those in Wei's (1992) FIC criterion and Phillips and Ploberger's (1996) PIC criterion.

In the unit root (UR) autoregression, $\rho = 1$ and there is no unknown autoregressive parameter to estimate, so in this case we set the parameter count to $k = 0$ in (1.2). In the stationary model (SS), a parametric autoregressive model may still be fitted by least squares regression with

$$\hat\rho = \sum_{t=1}^{n} X_t X_{t-1} \Big/ \sum_{t=1}^{n} X_{t-1}^2 = \sum_{t=1}^{n} u_t u_{t-1} \Big/ \sum_{t=1}^{n} u_{t-1}^2,$$

and so in this case the parameter count is set to $k = 1$. The residual variance estimates in (1.2) for the two models are formed in the usual manner, viz.,

$$\hat\sigma_0^2 = \frac{1}{n}\sum_{t=1}^{n} (\Delta X_t)^2, \qquad \hat\sigma_1^2 = \frac{1}{n}\sum_{t=1}^{n} (X_t - \hat\rho X_{t-1})^2.$$

Model evaluation based on $IC_k$ then leads to the selection criterion

$$\hat k = \arg\min_{k \in \{0,1\}} IC_k.$$

As shown below, the information criterion $IC_k$ is weakly consistent for testing a unit root provided the penalty term in (1.2) satisfies the weak requirements that $C_n \to \infty$ and $C_n/n \to 0$ as $n \to \infty$. No specific expansion rate for $C_n$ is required. Strong consistency also holds provided $C_n \to \infty$ faster than $(\log\log n)^3$ under commonly used assumptions about initial conditions.

2. Results

The following assumptions make specific the semiparametric and initialization components of (1.1), the second being important when $\rho = 1$. Assumption LP is a standard linear process condition of the type that is convenient in developing partial sum limit theory (cf. Phillips and Solo (1992)). Assumption IN gives, for the unit root case, a partial sum structure to the initial observation $X_0$ in terms of past innovations, making $X_0$ analogous to later observations $X_t$ of the series, which take the form of partial sums measured from $X_0$.
The sequence $\kappa_n$ in (2.2) determines how many past innovations are included in the initialization, with larger values of $\kappa_n$ associated with the more distant past. This type of initial condition has been used in other recent limit theory in econometrics (Phillips and Magdalinos (2007)).
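The selection rule built from (1.2) can be illustrated with a minimal sketch. The function names `information_criteria` and `select_k` are hypothetical and are not part of the paper; the code simply evaluates $IC_0$ and $IC_1$ from a sample $X_1, \ldots, X_n$ with a given initial value $X_0$ and a user-supplied penalty coefficient $C_n$, using only the Python standard library.

```python
import math
import random


def information_criteria(x, x0, penalty):
    """Return (IC_0, IC_1) as in (1.2) for x = [X_1, ..., X_n].

    x0 is the initial value X_0 and penalty is the coefficient C_n.
    Names and signature are illustrative assumptions.
    """
    n = len(x)
    lagged = [x0] + list(x[:-1])                  # X_0, ..., X_{n-1}
    # k = 0: unit root imposed, residuals are the first differences.
    s0 = sum((xt - xl) ** 2 for xt, xl in zip(x, lagged)) / n
    # k = 1: least squares autoregression without intercept.
    rho_hat = (sum(xt * xl for xt, xl in zip(x, lagged))
               / sum(xl * xl for xl in lagged))
    s1 = sum((xt - rho_hat * xl) ** 2 for xt, xl in zip(x, lagged)) / n
    return math.log(s0), math.log(s1) + penalty / n


def select_k(x, x0, penalty):
    """Return k-hat: 0 selects the unit root model, 1 the stationary model."""
    ic0, ic1 = information_criteria(x, x0, penalty)
    return 0 if ic0 <= ic1 else 1


# Usage: white noise (the SS model) evaluated with the BIC penalty C_n = log n.
rng = random.Random(0)
n = 2000
noise = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
k_hat_noise = select_k(noise[1:], noise[0], math.log(n))
print(k_hat_noise)
```

Passing `penalty=2.0` gives the AIC version and `penalty=math.log(math.log(n))` the Hannan and Quinn version of the same rule.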
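Assumption LP. Let $d(L) = \sum_{j=0}^{\infty} d_j L^j$, with $d_0 = 1$ and $d(1) \neq 0$, and let $u_s$ have Wold representation

$$u_s = d(L)\varepsilon_s = \sum_{j=0}^{\infty} d_j \varepsilon_{s-j}, \quad \text{with } \sum_{j=0}^{\infty} j^{1/2} |d_j| < \infty, \tag{2.1}$$

where $\varepsilon_t$ is iid$(0, \sigma_\varepsilon^2)$. Define $\lambda = \sum_{h=1}^{\infty} E(u_t u_{t-h})$, $\omega^2 = \sum_{h=-\infty}^{\infty} E(u_t u_{t-h})$, and $\sigma^2 = E\{u_t^2\}$.

Assumption IN. The initialization of (1.1) when $\rho = 1$ has the general form

$$X_0(n) = \sum_{j=0}^{\kappa_n} u_{-j} \tag{2.2}$$

with $u_j$ satisfying Assumption LP and $\kappa_n$ an integer valued sequence satisfying $\kappa_n \to \infty$ and

$$\frac{\kappa_n}{n} \to \tau \in [0, \infty] \quad \text{as } n \to \infty. \tag{2.3}$$

The following cases are distinguished:
(i) If $\tau = 0$, $X_0(n)$ is said to be a recent past initialization.
(ii) If $\tau \in (0, \infty)$, $X_0(n)$ is said to be a distant past initialization.
(iii) If $\tau = \infty$, $X_0(n)$ is said to be an infinite past initialization.

Theorem 1. (a) Under Assumptions LP and IN, the criterion $IC_k$ is weakly consistent for distinguishing unit root and stationary time series provided $C_n \to \infty$ and $C_n/n \to 0$ as $n \to \infty$.
(b) The asymptotic distribution of the AIC criterion ($IC_k$ with coefficient $C_n = 2$) is given by

$$\lim_{n\to\infty} P\{\hat k_{AIC} = 0 \mid k = 1\} = 0, \qquad \lim_{n\to\infty} P\{\hat k_{AIC} = 1 \mid k = 1\} = 1,$$

$$\lim_{n\to\infty} P\{\hat k_{AIC} = 0 \mid k = 0\} = P\{\xi^2 < 2\}, \qquad \lim_{n\to\infty} P\{\hat k_{AIC} = 1 \mid k = 0\} = 1 - P\{\xi^2 < 2\},$$

where

$$\xi^2 = \begin{cases} \left(\int_0^1 B\,dB + \lambda\right)^2 \Big/ \left(\sigma^2 \int_0^1 B^2\right) & \text{under IN(i)} \\[4pt] \left(\int_0^1 B_\tau\,dB + \lambda\right)^2 \Big/ \left(\sigma^2 \int_0^1 B_\tau^2\right) & \text{under IN(ii)} \\[4pt] B(1)^2/\sigma^2 & \text{under IN(iii),} \end{cases}$$

$B$ is Brownian motion with variance $\omega^2$,

$$B_\tau(s) = B(s) + \sqrt{\tau}\, B_0(1), \tag{2.4}$$

and $B_0$ is an independent Brownian motion with variance $\omega^2$.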
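The three initializations in Assumption IN can be made concrete with a small simulation sketch. It assumes iid $N(0,1)$ errors $u_t$, the simplest case of Assumption LP with $d_j = 0$ for $j \geq 1$; the particular sequences $\kappa_n = \lfloor \sqrt{n} \rfloor$, $\lfloor \tau n \rfloor$ and $n^2$ and the names `kappa_n` and `unit_root_path` are illustrative choices, not taken from the paper.

```python
import math
import random


def kappa_n(n, regime, tau=1.0):
    """Hypothetical kappa_n sequences illustrating the three cases of (2.3)."""
    if regime == "recent":       # kappa_n / n -> 0
        return math.isqrt(n)
    if regime == "distant":      # kappa_n / n -> tau in (0, infinity)
        return int(tau * n)
    if regime == "infinite":     # kappa_n / n -> infinity
        return n * n
    raise ValueError(f"unknown regime: {regime}")


def unit_root_path(n, kappa, rng):
    """Return (X_0, [X_1, ..., X_n]) with X_0 = sum_{j=0}^{kappa} u_{-j} as in (2.2)."""
    x0 = sum(rng.gauss(0.0, 1.0) for _ in range(kappa + 1))
    path, level = [], x0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)     # X_t = X_{t-1} + u_t
        path.append(level)
    return x0, path


# Usage: one path of length 100 under each initialization.
rng = random.Random(1)
for regime in ("recent", "distant", "infinite"):
    x0, path = unit_root_path(100, kappa_n(100, regime), rng)
    print(regime, round(x0, 3), round(path[-1], 3))
```

Under the infinite past choice the initial value typically dominates the partial sum accumulated over the sample, which is the feature that changes the limit theory in case (iii).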
Remarks.
(1) The weak consistency results in part (a) of Theorem 1 show that information criteria can be used, essentially in their present form, for distinguishing unit root and stationary time series. This approach allows for a nonparametric treatment of the short memory component in both the stationary and nonstationary models. The simple conditions on the penalty coefficient, that $C_n \to \infty$ and $C_n/n \to 0$ as $n \to \infty$, are minimal. Evidently, BIC and the Hannan-Quinn criterion are both consistent. Similar arguments show that the FIC (Wei (1992)) and PIC (Phillips and Ploberger (1996)) criteria are also consistent.

(2) The AIC criterion (with fixed $C_n = 2$) is inconsistent, and the limit distribution of $\hat k$ is seen in part (b) of the theorem to depend on the asymptotic distribution of the squared unit root $t$ statistic $\xi^2$. This distribution involves nuisance parameters. The limit variate $\xi^2$ has a unit root limit distribution under IN(i) and IN(ii) that depends on $\omega^2$, $\sigma^2$ and $\lambda$, and a scaled chi-squared distribution under IN(iii) that depends on $\omega^2$ and $\sigma^2$.

Theorem 2. Under Assumptions LP and IN, the criterion $IC_k$ is strongly consistent for distinguishing unit root and stationary time series provided

$$\frac{C_n}{(\log\log n)^3} \to \infty \quad \text{under IN(i) and IN(ii)}, \qquad \frac{C_n}{\frac{\kappa_n}{n} (\log\log n)^2 \log\log \kappa_n} \to \infty \quad \text{under IN(iii)}, \tag{2.5}$$

and $C_n/n \to 0$ as $n \to \infty$.

Remarks.
(3) The rate condition (2.5) implies that BIC is strongly consistent in distinguishing unit root and stationary models under IN(i) and IN(ii), and strongly consistent under IN(iii) provided $\kappa_n$ does not increase too fast relative to $n$. The results complement those of Wei (1992), who proved strong consistency of the BIC and FIC criteria in order selection for parametric autoregressions allowing for nonstationarity.

(4) The proof of Theorem 2 depends on the asymptotic behavior of the quadratic form $u' P_{-1} u$, which involves the projection matrix $P_{-1} = X_{-1}(X_{-1}' X_{-1})^{-1} X_{-1}'$, where $X_{-1} = (X_0, X_1, \ldots, X_{n-1})'$ and $u = (u_1, \ldots, u_n)'$. The asymptotic properties of projections of this type arising in stochastic regression models were studied by Lai and Wei (1982a) for martingale differences $u_t$ in a general regression setting.
Their Lemma 2(ii) and Theorem 3 (see, in particular, equation (2.18)) give the following order for $u' P_{-1} u$ when $X_t$ is generated by the unit root model UR and $u_t$ is a martingale difference with uniform $2 + \eta$ moments with $\eta > 0$:

$$u' P_{-1} u = O_{a.s.}\left(\log\left(\sum_{t=1}^{n} X_{t-1}^2\right)\right) = O_{a.s.}(\log n). \tag{2.6}$$
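Because $X_{-1}$ is a single column, the quadratic form reduces to the scalar ratio $u' P_{-1} u = (\sum_t X_{t-1} u_t)^2 / \sum_t X_{t-1}^2$, and under the UR model the least squares residual sum of squares equals $u'u - u' P_{-1} u$. The following sketch checks this identity numerically. It assumes iid $N(0,1)$ errors and $X_0 = 0$ purely for convenience, and the function names are hypothetical.

```python
import random


def projection_quadratic_form(lagged, u):
    """u'P u for a single regressor: (sum X_{t-1} u_t)^2 / sum X_{t-1}^2."""
    cross = sum(xl * ut for xl, ut in zip(lagged, u))
    return cross * cross / sum(xl * xl for xl in lagged)


def residual_sum_of_squares(x, lagged):
    """Sum of squared residuals from regressing X_t on X_{t-1} without intercept."""
    rho_hat = (sum(xt * xl for xt, xl in zip(x, lagged))
               / sum(xl * xl for xl in lagged))
    return sum((xt - rho_hat * xl) ** 2 for xt, xl in zip(x, lagged))


# Simulate a unit root series X_t = X_{t-1} + u_t started at X_0 = 0 (an assumption).
rng = random.Random(42)
n = 500
u = [rng.gauss(0.0, 1.0) for _ in range(n)]
x, lagged, level = [], [], 0.0
for ut in u:
    lagged.append(level)        # X_{t-1}
    level += ut
    x.append(level)             # X_t

q = projection_quadratic_form(lagged, u)
rss = residual_sum_of_squares(x, lagged)
uu = sum(ut * ut for ut in u)
print(q, rss, uu - q)           # the last two values agree up to rounding error
```

The quantity `q` is the object whose almost sure order is bounded in (2.6).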
(See also Lemma 2(iii) and equation (3.25) in Lai and Wei (1982b).) In parametric stochastic regression models that include nonstationary autoregressions, Pötscher (1989) used (2.6) to establish strong consistency of information criteria of the form $IC_k$ under an expansion rate for the penalty coefficient $C_n$ that requires $C_n/\log n \to \infty$, thereby excluding BIC. In the proof of Theorem 2, we make explicit use of the fact that $X_{t-1} = S_{t-1} + X_0$, where $S_t$ is a partial sum of the $u_j$, to establish that in this case

$$u' P_{-1} u = \begin{cases} O_{a.s.}\left((\log\log n)^3\right) & \text{under IN(i) and IN(ii)} \\[4pt] O_{a.s.}\left(\frac{\kappa_n}{n} (\log\log n)^2 \log\log \kappa_n\right) & \text{under IN(iii),} \end{cases} \tag{2.7}$$

giving a sharper result than (2.6). In proving (2.6), Lai and Wei (1982a, Lemma 2) consider the sample covariance of a martingale difference with a general random sequence that is not necessarily a partial sum process of the innovations. In view of the unit root structure, we can make use of the following explicit decomposition of the sample covariance

$$\sum_{t=1}^{n} S_{t-1} u_t = \frac{1}{2}\left\{ S_n^2 - \sum_{t=1}^{n} u_t^2 \right\} = O_{a.s.}(n \log\log n),$$

which, in conjunction with a lower bound result for $\sum_{t=1}^{n} S_{t-1}^2$, leads directly to (2.7) under the commonly used initial conditions IN(i) and IN(ii).

3. Proofs

Proof of Theorem 1.
Part (a) Suppose the true model is a UR model with $\rho = 1$ in (1.1). Then

$$IC_0 = \log \hat\sigma_0^2 = \log\left\{ \frac{u'u}{n} \right\}.$$

Define $P_{-1} = X_{-1}(X_{-1}' X_{-1})^{-1} X_{-1}'$, $X_{-1} = (X_0, X_1, \ldots, X_{n-1})'$, and $u = (u_1, \ldots, u_n)'$. The behavior of $IC_1$ depends on $u' P_{-1} u$, which we now investigate under the various initializations. Under IN(i), we have (Phillips (1987))

$$u' P_{-1} u \Rightarrow \left( \int_0^1 B\,dB + \lambda \right)^2 \Big/ \int_0^1 B^2, \quad \text{as } n \to \infty, \tag{3.1}$$

where $\lambda = \sum_{h=1}^{\infty} E(u_t u_{t-h})$, and $B$ is Brownian motion with variance $\omega^2$. Under IN(ii) we have (e.g., Phillips and Magdalinos (2007))

$$u' P_{-1} u \Rightarrow \left( \int_0^1 B_\tau\,dB + \lambda \right)^2 \Big/ \int_0^1 B_\tau^2, \tag{3.2}$$
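where $B_\tau(s) = B(s) + \sqrt{\tau}\, B_0(1)$, $\tau \in (0, \infty)$, and $B_0$ is a Brownian motion with variance $\omega^2$ that is independent of $B$. Under IN(iii), Phillips and Magdalinos (2007, Theorem 2) show that

$$u' P_{-1} u = \frac{\left( \frac{1}{\sqrt{n \kappa_n}} \sum_{t=1}^{n} X_{t-1} u_t \right)^2}{\frac{1}{n \kappa_n} \sum_{t=1}^{n} X_{t-1}^2} \Rightarrow B(1)^2. \tag{3.3}$$

Thus, whether the initialization is recent, distant or infinitely distant, we have $u' P_{-1} u = O_p(1)$. It follows that

$$\begin{aligned} IC_1 &= \log \hat\sigma_1^2 + \frac{C_n}{n} = \log\left\{ \frac{1}{n} \sum_{t=1}^{n} (X_t - \hat\rho X_{t-1})^2 \right\} + \frac{C_n}{n} \\ &= \log\left\{ \frac{1}{n} (u'u - u' P_{-1} u) \right\} + \frac{C_n}{n} \\ &= \log \frac{u'u}{n} + \log\left[ 1 - \frac{u' P_{-1} u}{u'u} \right] + \frac{C_n}{n} \\ &= \log \frac{u'u}{n} + \log\left[ 1 - \frac{u' P_{-1} u}{n \sigma^2} \{1 + o_{a.s.}(1)\} \right] + \frac{C_n}{n} \\ &= \log \frac{u'u}{n} + \frac{C_n}{n} - \frac{u' P_{-1} u}{n \sigma^2} \{1 + o_{a.s.}(1)\}. \end{aligned}$$

Hence,

$$IC_0 - IC_1 = -\frac{C_n}{n} + \frac{u' P_{-1} u}{n \sigma^2} \{1 + o_{a.s.}(1)\} = -\frac{C_n}{n} \left\{ 1 - \frac{u' P_{-1} u}{C_n \sigma^2} + o_{a.s.}\left( \frac{1}{C_n} \right) \right\} < 0 \tag{3.4}$$

when $C_n \to \infty$ because $\frac{u' P_{-1} u}{C_n \sigma^2} = O_p(C_n^{-1})$. Thus, criterion $IC_k$ correctly selects the unit root model in favor of the stationary model when $\rho = 1$. This is the case irrespective of the initial condition and holds provided $C_n \to \infty$.

Next suppose the true model is stationary and $X_t = u_t$. Then we have

$$\hat\rho = \sum_{t=1}^{n} X_t X_{t-1} \Big/ \sum_{t=1}^{n} X_{t-1}^2 = \sum_{t=1}^{n} u_t u_{t-1} \Big/ \sum_{t=1}^{n} u_{t-1}^2 \to_{a.s.} \rho = \frac{E(u_t u_{t-1})}{E(u_{t-1}^2)} := \frac{\gamma_1}{\gamma_0},$$

where $\gamma_h = E\{u_t u_{t-h}\}$. Observe that in this case, by the strong law of large numbers,

$$\begin{aligned} IC_0 = \log \hat\sigma_0^2 &= \log\left\{ \frac{1}{n} (u - u_{-1})'(u - u_{-1}) \right\} = \log\left\{ \frac{u'u - 2u'u_{-1} + u_{-1}'u_{-1}}{n} \right\} \\ &= \log\{ 2\gamma_0 - 2\gamma_1 + o_{a.s.}(1) \} = \log\{ 2\gamma_0 (1 - \rho) \} + o_{a.s.}(1). \end{aligned}$$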
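The stationary limit $IC_0 \to \log\{2\gamma_0(1-\rho)\}$ just derived can be checked in a simple special case. The sketch below assumes a Gaussian AR(1) error process $u_t = \phi u_{t-1} + \varepsilon_t$ with unit innovation variance, for which $\gamma_0 = 1/(1-\phi^2)$ and $\rho = \gamma_1/\gamma_0 = \phi$, so that the limit equals $\log\{2/(1+\phi)\}$. The AR(1) specification, the sample size and the function names are illustrative assumptions rather than part of the paper.

```python
import math
import random


def stationary_ar1(n, phi, rng):
    """Return [u_0, ..., u_n] from a stationary Gaussian AR(1), innovation variance 1."""
    gamma0 = 1.0 / (1.0 - phi * phi)
    u = [rng.gauss(0.0, math.sqrt(gamma0))]      # u_0 drawn from the stationary law
    for _ in range(n):
        u.append(phi * u[-1] + rng.gauss(0.0, 1.0))
    return u


def ic0_stationary(u):
    """IC_0 = log of the mean squared first difference when X_t = u_t."""
    n = len(u) - 1
    return math.log(sum((u[t] - u[t - 1]) ** 2 for t in range(1, n + 1)) / n)


phi = 0.5
u = stationary_ar1(20000, phi, random.Random(7))
gamma0 = 1.0 / (1.0 - phi * phi)
limit = math.log(2.0 * gamma0 * (1.0 - phi))     # log{2 gamma_0 (1 - rho)}, rho = phi
print(ic0_stationary(u), limit)
```

With a sample of this size the two printed values should be close, although the exact sample figure depends on the random draws.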
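Also,

$$n^{-1} u' P_{-1} u = (n^{-1} u' u_{-1})^2 / (n^{-1} u_{-1}' u_{-1}) \to_{a.s.} \gamma_1^2 / \gamma_0,$$

and then

$$\begin{aligned} IC_1 &= \log\left\{ \frac{1}{n} (u'u - u' P_{-1} u) \right\} + \frac{C_n}{n} = \log\left\{ \left( \frac{u'u}{n} \right) \left( 1 - \frac{n^{-1} u' P_{-1} u}{n^{-1} u'u} \right) \right\} + \frac{C_n}{n} \\ &= \log\left\{ \gamma_0 \left( 1 - \frac{\gamma_1^2}{\gamma_0^2} \right) + o_{a.s.}(1) \right\} + \frac{C_n}{n} = \log\{ \gamma_0 (1 - \rho^2) \} + \frac{C_n}{n} + o_{a.s.}(1). \end{aligned}$$

It follows that

$$IC_0 - IC_1 = \log\{ 2\gamma_0 (1 - \rho) \} - \log\{ \gamma_0 (1 - \rho^2) \} - \frac{C_n}{n} + o_{a.s.}(1) = \log \frac{2}{1 + \rho} - \frac{C_n}{n} + o_{a.s.}(1) > 0, \tag{3.5}$$

a.s. as $n \to \infty$ provided $C_n/n \to 0$. Hence, the criterion $IC_k$ correctly selects the stationary model a.s. as $n \to \infty$ for both fixed $C_n$ and $C_n \to \infty$ at a slower rate than $n$.

Part (b) We seek to find the limit distribution of the AIC criterion (i.e., $IC_k$ in (1.2) with $C_n = 2$). Note from the above that AIC makes the correct choice when the model is stationary, as (3.5) holds when $C_n$ is fixed as $n \to \infty$. Then

$$\lim_{n\to\infty} P\{\hat k_{AIC} = 0 \mid k = 1\} = 0, \qquad \lim_{n\to\infty} P\{\hat k_{AIC} = 1 \mid k = 1\} = 1, \tag{3.6}$$

as stated. On the other hand, when the model has a unit root, we have from (3.4) and (3.1)–(3.3)

$$n(IC_0 - IC_1) = -2 + \frac{u' P_{-1} u}{\sigma^2} \{1 + o_{a.s.}(1)\} \Rightarrow -2 + \xi^2,$$

where

$$\xi^2 = \begin{cases} \left(\int_0^1 B\,dB + \lambda\right)^2 \Big/ \left(\sigma^2 \int_0^1 B^2\right) & \text{under IN(i)} \\[4pt] \left(\int_0^1 B_\tau\,dB + \lambda\right)^2 \Big/ \left(\sigma^2 \int_0^1 B_\tau^2\right) & \text{under IN(ii)} \\[4pt] B(1)^2/\sigma^2 & \text{under IN(iii).} \end{cases}$$

It follows that

$$\lim_{n\to\infty} P\{\hat k_{AIC} = 0 \mid k = 0\} = \lim_{n\to\infty} P\{ n(IC_0 - IC_1) < 0 \} = P\{\xi^2 < 2\},$$

$$\lim_{n\to\infty} P\{\hat k_{AIC} = 1 \mid k = 0\} = 1 - P\{\xi^2 < 2\}.$$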
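The contrast between a fixed penalty and a diverging one under the unit root model can be illustrated by a small Monte Carlo sketch. It assumes iid $N(0,1)$ errors (so that $\lambda = 0$ and $\omega^2 = \sigma^2 = 1$), a recent past initialization with the hypothetical choice $\kappa_n = \lfloor \sqrt{n} \rfloor$, and arbitrary values of $n$ and the number of replications; none of these settings come from the paper. The rule $\hat k = 0$ is applied in the equivalent form $n(\log \hat\sigma_0^2 - \log \hat\sigma_1^2) \leq C_n$.

```python
import math
import random


def ic_gap(n, rng):
    """Return n * (log s0 - log s1) for one simulated unit root sample."""
    kappa = math.isqrt(n)                        # recent past initialization, IN(i)
    level = sum(rng.gauss(0.0, 1.0) for _ in range(kappa + 1))   # X_0(n)
    s_xx = s_xu = s_uu = 0.0
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        s_xx += level * level                    # sum of X_{t-1}^2
        s_xu += level * u                        # sum of X_{t-1} u_t
        s_uu += u * u                            # sum of u_t^2
        level += u                               # X_t = X_{t-1} + u_t
    rss1 = s_uu - s_xu * s_xu / s_xx             # u'u - u'P u
    return n * (math.log(s_uu / n) - math.log(rss1 / n))


def selection_frequency(n, reps, penalty, seed=0):
    """Share of replications with IC_0 <= IC_1, i.e. the unit root model chosen."""
    rng = random.Random(seed)
    return sum(ic_gap(n, rng) <= penalty for _ in range(reps)) / reps


n, reps = 200, 400
aic_freq = selection_frequency(n, reps, 2.0)             # AIC: C_n = 2
bic_freq = selection_frequency(n, reps, math.log(n))     # BIC: C_n = log n
print(aic_freq, bic_freq)
```

Because both calls use the same seed, the two rules are applied to identical samples, so the BIC frequency can never fall below the AIC frequency in this experiment. The AIC frequency stays visibly below one, in line with the nondegenerate limit $P\{\xi^2 < 2\}$.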
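Combining this with (3.6) gives the required limit distribution.

Proof of Theorem 2. In view of (3.5), when the stationary model is true the criterion $IC_k$ correctly selects the stationary model a.s. as $n \to \infty$ for fixed $C_n$ and for $C_n \to \infty$ provided $C_n/n \to 0$. To establish strong consistency we therefore need only consider the limit behavior under the unit root model. From (3.4) we have

$$IC_0 - IC_1 = -\frac{C_n}{n} \left\{ 1 - \frac{u' P_{-1} u}{C_n \sigma^2} + o_{a.s.}\left( \frac{1}{C_n} \right) \right\},$$

and so we need to examine the limit behavior of

$$\frac{u' P_{-1} u}{C_n} = \frac{(u' X_{-1})^2}{C_n (X_{-1}' X_{-1})}.$$

By a result of Donsker and Varadhan (1977, equation (4.6) on p. 751), see also equation (3.29) of Lai and Wei (1982a), we have under IN(i)

$$\liminf_{n\to\infty} \frac{\log\log n}{n^2} \sum_{t=1}^{n} X_{t-1}^2 > 0, \quad \text{a.s.,} \tag{3.7}$$

which gives a lower limit of $O_{a.s.}\left( \frac{n^2}{\log\log n} \right)$ to the fluctuations of the sample information $\sum_{t=1}^{n} X_{t-1}^2$. In this case, $n^{-1/2} X_t$ behaves in the limit like a Brownian motion and the lower limit (3.7) is obtained in Donsker and Varadhan (1977, p. 751) by way of the lower limit of the corresponding limiting quantity $\frac{\log\log n}{n^2} \int_0^n B(s)^2\,ds$.

Result (3.7) may also be shown to hold under IN(ii) and IN(iii). In particular, note that in both these cases we can write $X_t = S_t + X_0(n)$, where $S_t = \sum_{j=1}^{t} u_j$. Then

$$\sum_{t=1}^{n} X_{t-1}^2 \geq \sum_{t=1}^{n} (X_{t-1} - \bar X_{-1})^2 = \sum_{t=1}^{n} (S_{t-1} - \bar S_{-1})^2,$$

where $\bar X_{-1} = n^{-1} \sum_{t=1}^{n} X_{t-1}$ and $\bar S_{-1} = n^{-1} \sum_{t=1}^{n} S_{t-1}$, so that

$$\liminf_{n\to\infty} \frac{\log\log n}{n^2} \sum_{t=1}^{n} X_{t-1}^2 \geq \liminf_{n\to\infty} \frac{\log\log n}{n^2} \sum_{t=1}^{n} (S_{t-1} - \bar S_{-1})^2. \tag{3.8}$$

The lower bound (3.8) is the same as

$$\liminf_{n\to\infty} \frac{\log\log n}{n^2} \int_0^n \left( B(s) - \frac{1}{n} \int_0^n B(r)\,dr \right)^2 ds = \liminf_{n\to\infty} \int_0^1 \left( B_n(p) - \int_0^1 B_n \right)^2 dp, \tag{3.9}$$

where

$$B_n(p) = \left( \frac{\log\log n}{n} \right)^{1/2} B(np) \quad \text{for } 0 \leq p \leq 1$$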
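is Brownian motion over $[0, 1]$ with variance $(\log\log n)\,\omega^2$. The lower limit of the variation (3.9) therefore satisfies

$$\liminf_{n\to\infty} \int_0^1 \left( B_n(p) - \int_0^1 B_n \right)^2 dp > 0$$

by virtue of the properties of Brownian motion. Otherwise $B_n(p)$ would necessarily be constant with positive probability as $n \to \infty$. It follows that under IN(ii) and IN(iii) the same lower bound of order $O_{a.s.}\left( \frac{n^2}{\log\log n} \right)$ applies for $\sum_{t=1}^{n} X_{t-1}^2$.

Next note that

$$\begin{aligned} \sum_{t=1}^{n} u_t X_{t-1} &= \sum_{t=1}^{n} u_t S_{t-1} + X_0 \sum_{t=1}^{n} u_t = \frac{1}{2}\left\{ S_n^2 - \sum_{t=1}^{n} u_t^2 \right\} + X_0 \sum_{t=1}^{n} u_t \\ &= O_{a.s.}(n \log\log n) + O_{a.s.}\left( X_0(n) \sqrt{n \log\log n} \right), \end{aligned} \tag{3.10}$$

by virtue of the law of the iterated logarithm (e.g., Phillips and Solo (1992)) for $S_n$. It follows that under IN(i) and IN(ii), and in view of a further application of the law of the iterated logarithm to $X_0(n)$ and using (3.7), we find that

$$u' P_{-1} u = O_{a.s.}\left( \frac{n^2 (\log\log n)^2}{n^2 / \log\log n} \right) = O_{a.s.}\left( (\log\log n)^3 \right).$$

We deduce that

$$IC_0 - IC_1 = -\frac{C_n}{n} \left\{ 1 - \frac{u' P_{-1} u}{C_n \sigma^2} + o_{a.s.}\left( \frac{1}{C_n} \right) \right\} < 0, \quad \text{a.s.}$$

whenever

$$\frac{C_n}{(\log\log n)^3} \to \infty, \tag{3.11}$$

as $n \to \infty$, for then $\frac{u' P_{-1} u}{C_n \sigma^2} = o_{a.s.}(1)$ and $IC_0 < IC_1$ a.s. as $n \to \infty$. This proves strong consistency under the rate condition (3.11) and initial conditions IN(i) and IN(ii).

When IN(iii) applies, (3.10) holds and we have

$$\sum_{t=1}^{n} u_t X_{t-1} = O_{a.s.}(n \log\log n) + O_{a.s.}\left( X_0(n) \sqrt{n \log\log n} \right) = O_{a.s.}\left( \sqrt{\kappa_n \log\log \kappa_n} \sqrt{n \log\log n} \right),$$

by the law of the iterated logarithm for $X_0(n)$. Then, using (3.7), we have

$$u' P_{-1} u = O_{a.s.}\left( \frac{\kappa_n \log\log \kappa_n \; n \log\log n}{n^2 / \log\log n} \right) = O_{a.s.}\left( \frac{\kappa_n}{n} (\log\log n)^2 \log\log \kappa_n \right),$$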
and deduce that

$$IC_0 - IC_1 = -\frac{C_n}{n} \left\{ 1 - \frac{u' P_{-1} u}{C_n \sigma^2} + o_{a.s.}\left( \frac{1}{C_n} \right) \right\} < 0$$

whenever

$$\frac{C_n}{\frac{\kappa_n}{n} (\log\log n)^2 \log\log \kappa_n} \to \infty. \tag{3.12}$$

This proves strong consistency under the rate condition (3.12) and the initial condition IN(iii).

References

Akaike, H. (1969). Fitting autoregressive models for prediction, Annals of the Institute of Statistical Mathematics, 21, 243–247.
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle, Second International Symposium on Information Theory (eds. B. N. Petrov and F. Csaki), Akademiai Kiado, Budapest.
Akaike, H. (1977). On entropy maximization principle, Applications of Statistics (ed. P. R. Krishnaiah), Amsterdam, North-Holland.
Donsker, M. D. and Varadhan, S. R. S. (1977). On laws of the iterated logarithm for local times, Communications on Pure and Applied Mathematics, 30, 707–753.
Hannan, E. J. and Quinn, B. G. (1979). The determination of the order of an autoregression, Journal of the Royal Statistical Society, Series B, 41, 190–195.
Kim, J.-Y. (1998). Large sample properties of posterior densities, Bayesian information criterion and the likelihood principle in nonstationary time series models, Econometrica, 66, 359–380.
Lai, T. L. and Wei, C. Z. (1982a). Asymptotic properties of projections with applications to stochastic regression problems, Journal of Multivariate Analysis, 12, 346–370.
Lai, T. L. and Wei, C. Z. (1982b). Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems, Annals of Statistics, 10, 154–166.
Magdalinos, T. and Phillips, P. C. B. (2006). Limit theory for cointegrated systems with moderately integrated and moderately explosive regressors, unpublished paper, Yale University.
Nielsen, B. (2006). Order determination in general vector autoregressions, IMS Lecture Notes-Monograph Series, 52, 93–112.
Phillips, P. C. B. (1987). Time series regression with a unit root, Econometrica, 55, 277–302.
Phillips, P. C. B. (1996). Econometric model determination, Econometrica, 64, 763–812.
Phillips, P. C. B. and Magdalinos, T. (2007). Unit root and cointegrating limit theory when initialization is in the infinite past, unpublished paper, Yale University.
Phillips, P. C. B. and Ploberger, W. (1996). An asymptotic theory of Bayesian inference for time series, Econometrica, 64, 381–413.
Phillips, P. C. B. and Solo, V. (1992). Asymptotics for linear processes, Annals of Statistics, 20, 971–1001.
Pötscher, B. M. (1989). Model selection under nonstationarity: Autoregressive models and stochastic linear regression models, Annals of Statistics, 17, 1257–1274.
Rissanen, J. (1978). Modeling by shortest data description, Automatica, 14, 465–471.
Schwarz, G. (1978). Estimating the dimension of a model, Annals of Statistics, 6, 461–464.
Tsay, R. S. (1984). Order selection in nonstationary autoregressive models, Annals of Statistics, 12, 1425–1433.
Wei, C. Z. (1992). On predictive least squares principles, Annals of Statistics, 20, 1–42.