Testing Weak Convergence Based on HAR Covariance Matrix Estimators

Jianning Kong†, Peter C. B. Phillips‡, Donggyu Sul§

August 4, 2017

Abstract

Weak $\sigma$-convergence tests based on heteroskedasticity and autocorrelation robust (HAR) covariance matrix estimators are considered, and their asymptotic limits are derived. By means of numerical simulations, we evaluate the performance of the trend regression tests based on HAR estimators. We find, however, that the improvement offered by the alternative tests is very limited.

Keywords: Weak $\sigma$-convergence, HAR estimator, HAC estimator.

JEL Classification: C33

1 Introduction

Valid statistical inference in spurious trend regressions has long been of interest. Sun (2004) pointed out that valid inference can be obtained by using the heteroskedasticity and autocorrelation consistent (HAC) standard error estimator with a bandwidth proportional to the sample size. Later, t statistics based on the HAC estimator without truncation came to be called heteroskedasticity and autocorrelation robust (HAR) test statistics (see Kiefer and Vogelsang, 2002; Phillips, 2005; Phillips, Zhang and Wang, 2012).

Consider the following simple trend regression,

$$x_t = a t + z_t, \tag{1}$$

where $z_t = z_{t-1} + u_t$. Since $z_t$ is $I(1)$, the trend regression in (1) is spurious. Testing $H_0: a = 0$ is not easy with the conventional t-statistic

$$t_a = \frac{\hat a}{\left( T^{-1} \sum_{t=1}^{T} \hat z_t^2 \Big/ \sum_{t=1}^{T} t^2 \right)^{1/2}}, \qquad \text{with } \hat a = \frac{\sum_{t=1}^{T} t x_t}{\sum_{t=1}^{T} t^2}.$$

* Phillips acknowledges NSF support under Grant No. SES-2525.
† Shandong University, China
‡ Yale University, USA; University of Auckland, New Zealand; Singapore Management University, Singapore; University of Southampton, UK.
§ University of Texas at Dallas, USA

Phillips, Zhang and Wang (2012, PZW hereafter) show that $t_a$ diverges under the null, and suggest using the following t-ratio with the HAR estimator:

$$t_a^{HAR} = \frac{\hat a}{\left[ \left( \sum_{t=1}^{T} t^2 \right)^{-1} \left( T \hat\Omega_{HAR} \right) \left( \sum_{t=1}^{T} t^2 \right)^{-1} \right]^{1/2}},$$

where

$$\hat\Omega_{HAR} = \frac{1}{T} \sum_{t=1}^{T} \eta_t^2 + \frac{2}{T} \sum_{\ell=1}^{L} \sum_{t=1}^{T-\ell} \eta_t \eta_{t+\ell},$$

with $L = \lfloor gT \rfloor$ for some $g \in (0,1)$ and $\eta_t = \hat z_t t$. The underlying intuition is rather simple. As the serial dependence of the regression error $z_t$ becomes stronger, a longer lag length is needed. When the error becomes nonstationary, an infinite lag length leads to a constant t-ratio.

Recently, Kong, Phillips and Sul (2017, KPS hereafter) propose the following simple trend regression for testing weak $\sigma$-convergence:

$$K_{nt} = a + \phi t + u_t, \tag{2}$$

where $K_{nt}$ is the sample cross-sectional variance of the idiosyncratic components of a panel. Suppose that $y_{it} = a_i + \mu_t + y_{it}^o$; then

$$K_{nt} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{it} - \frac{1}{n} \sum_{i=1}^{n} y_{it} \right)^2.$$

Since the underlying data generating process is unknown, this trend regression is misspecified unless the data generating process is given by (2). KPS (2017) consider the following t-ratio:

$$t_\phi = \frac{\hat\phi_{nt}}{\sqrt{\hat\Omega_u^2 \Big/ \sum_{t=1}^{T} \tilde t^2}}, \tag{3}$$

where $\hat\phi_{nt}$ is the estimate of the slope coefficient, $\tilde t = t - T^{-1} \sum_{t=1}^{T} t$, and $\hat\Omega_u^2$ is defined as

$$\hat\Omega_u^2 = \frac{1}{T} \sum_{t=1}^{T} \tilde u_t^2 + \frac{2}{T} \sum_{\ell=1}^{L} \sum_{t=1}^{T-\ell} \tilde u_t \tilde u_{t+\ell},$$

where $\tilde u_t = \tilde K_{nt} - \hat\phi_{nt} \tilde t$ with $\tilde K_{nt} = K_{nt} - T^{-1} \sum_{t=1}^{T} K_{nt}$ and $L = \lfloor T^{\kappa} \rfloor$.

In fact, the trend regression in (2) is misspecified if the data generating process is given by

$$K_{nt} = a + b t^{-\lambda} + e_t, \tag{4}$$

where usually $e_t = O_p(n^{-1/2})$. The trend regression considered by KPS in (2) is not spurious, in the sense that the regression error in (4) is $O_p(n^{-1/2})$, but it is misspecified. Interestingly, due to this misspecification, the residual has a spurious trend. That is,

$$\hat u_t = \tilde K_{nt} - \hat\phi \tilde t = b\, \widetilde{t^{-\lambda}} + \tilde e_t - \hat\phi \tilde t,$$

where the tilde stands for the deviation from the time series mean. Unless $\lambda = 0$, the least squares estimator $\hat\phi$ is not equal to zero, so that the regression residual has a spurious trend, which influences the long-run variance calculation.
Here we investigate whether or not the HAR-type correction improves the test proposed by KPS (2017).
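To make the construction of the t-ratio in (3) concrete, the following is a minimal Python sketch, not the authors' code. The function names `bartlett_lrv` and `kps_t_ratio`, the default `kappa=0.5`, and the illustrative data are assumptions introduced here. The Bartlett weights $1 - \ell/(L+1)$ are also an assumption of the sketch: they are a common Newey-West choice that keeps the variance estimate nonnegative, and passing `bartlett=False` gives unweighted sums of autocovariances instead.

```python
import numpy as np


def bartlett_lrv(v, L, bartlett=True):
    """Long-run variance of v using lags 1..L.

    Assumption of this sketch: Bartlett weights 1 - l/(L+1) by default;
    bartlett=False uses unit weights (which can give a negative value).
    """
    v = np.asarray(v, dtype=float)
    T = v.size
    L = max(0, min(int(L), T - 1))          # at most T-1 lags are available
    lrv = v @ v / T                         # lag-0 term
    for ell in range(1, L + 1):
        weight = 1.0 - ell / (L + 1.0) if bartlett else 1.0
        lrv += 2.0 * weight * (v[:-ell] @ v[ell:]) / T
    return lrv


def kps_t_ratio(K, kappa=0.5):
    """Slope estimate and t-ratio from regressing K on a constant and a trend."""
    K = np.asarray(K, dtype=float)
    T = K.size
    t_dm = np.arange(1, T + 1) - (T + 1) / 2.0   # demeaned trend
    K_dm = K - K.mean()                          # demeaned K
    sxx = t_dm @ t_dm
    phi_hat = (t_dm @ K_dm) / sxx                # least squares slope
    u = K_dm - phi_hat * t_dm                    # regression residual
    omega2 = bartlett_lrv(u, np.floor(T ** kappa))
    return phi_hat, phi_hat / np.sqrt(omega2 / sxx)


# Illustrative data only (hypothetical values): a decaying variance plus noise.
rng = np.random.default_rng(0)
t = np.arange(1, 201, dtype=float)
K = 1.0 + t ** -0.5 + 0.01 * rng.standard_normal(t.size)
phi_hat, t_phi = kps_t_ratio(K)
print(phi_hat, t_phi)
```

With a decaying series such as the one above, the slope estimate and the t-ratio are both negative, which is the direction of the weak $\sigma$-convergence alternative.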

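2 Alternative Estimators

Even though the long-run variance formula follows the Newey-West HAC estimator, the variance of $\hat\phi_{nt}$ does not take the typical sandwich form. Hence the use of other long-run variance estimators for $\hat\phi_{nt}$ is of interest. In KPS (2017), the hypothesis of interest is weak $\sigma$-convergence, that is, $H_A: \phi_{nt} < 0$. When $\lambda > 0$, the least squares estimate $\hat\phi_{nt}$ approaches zero as $n, T \to \infty$, but the t statistic in (3) diverges to negative infinity if $\lambda < 1$. Even when the decay parameter $\lambda$ becomes large, the t statistic still converges to $-\sqrt{3}$. More importantly, the t statistic is discontinuous at $\lambda = 0$. When $\lambda = 0$, the limiting distribution of the t statistic is standard normal. However, as $\lambda$ deviates slightly from zero, the t statistic diverges.

We consider the following alternative t-ratios:

$$t_{\phi 2} = \frac{\hat\phi_{nt}}{\sqrt{\hat\Omega_{u2}^2 \Big/ \sum_{t=1}^{T} \tilde t^2}}. \tag{5}$$

Let $\tilde p_t = \tilde u_t \tilde t$ and define

$$t_{HAR} = \frac{\hat\phi_{nt}}{\sqrt{\left( \sum_{t=1}^{T} \tilde t^2 \right)^{-1} T \hat\Omega_{HAR}^2 \left( \sum_{t=1}^{T} \tilde t^2 \right)^{-1}}}, \tag{6}$$

$$t_{HAC} = \frac{\hat\phi_{nt}}{\sqrt{\left( \sum_{t=1}^{T} \tilde t^2 \right)^{-1} T \hat\Omega_{HAC}^2 \left( \sum_{t=1}^{T} \tilde t^2 \right)^{-1}}}, \tag{7}$$

where

$$\hat\Omega_{HAR}^2 = \frac{1}{T} \sum_{t=1}^{T} \tilde p_t^2 + \frac{2}{T} \sum_{\ell=1}^{L} \sum_{t=1}^{T-\ell} \tilde p_t \tilde p_{t+\ell}, \qquad L = \lfloor gT \rfloor \text{ for some } g \in (0,1),$$

$$\hat\Omega_{HAC}^2 = \frac{1}{T} \sum_{t=1}^{T} \tilde p_t^2 + \frac{2}{T} \sum_{\ell=1}^{L} \sum_{t=1}^{T-\ell} \tilde p_t \tilde p_{t+\ell}, \qquad L = \lfloor T^{\kappa} \rfloor \text{ for some } \kappa \in (0,1),$$

$$\hat\Omega_{u2}^2 = \frac{1}{T} \sum_{t=1}^{T} \tilde u_t^2 + \frac{2}{T} \sum_{\ell=1}^{L} \sum_{t=1}^{T-\ell} \tilde u_t \tilde u_{t+\ell}, \qquad L = \lfloor gT \rfloor.$$

Interestingly, the asymptotic behavior of $t_{HAC}$ is somewhat different from that of the $t_\phi$ statistic, while the asymptotic properties of $t_{HAR}$ are very distinct. The next theorem provides more details.

Theorem 1 (Asymptotic Properties) Under regularity conditions, the t-ratio statistics have the following asymptotic behaviors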

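as $n, T \to \infty$:

$$t_\phi = \begin{cases} O\left( T^{1/2 - \kappa/2} \right) & \text{if } \lambda < 1/2, \\ O\left( T^{1/2 - \kappa/2} (\ln T)^{-1/2} \right) & \text{if } \lambda = 1/2, \\ O\left( T^{1 - \lambda - \kappa/2} \right) & \text{if } 1/2 < \lambda < 1/(1+\kappa), \\ O\left( T^{(1-\kappa)(1-\lambda)/2} \right) & \text{if } 1/(1+\kappa) \le \lambda < 1, \\ -\sqrt{6}/\kappa & \text{if } \lambda = 1, \\ -\sqrt{3} & \text{if } \lambda > 1, \end{cases} \tag{8}$$

$$t_{\phi 2} = \begin{cases} O(1) & \text{if } \lambda < 1/2, \\ O\left( (\ln T)^{-1/2} \right) & \text{if } \lambda = 1/2, \\ O(1) & \text{if } 1/2 < \lambda < 1, \\ -\sqrt{6} & \text{if } \lambda = 1, \\ -\sqrt{3} & \text{if } \lambda > 1, \end{cases} \tag{9}$$

$$t_{HAR} = \begin{cases} O(1) > 0 & \text{if } \lambda < 0, \\ 0 & \text{if } \lambda = 0, \\ O(1) < 0 & \text{if } \lambda > 0, \end{cases} \qquad t_{HAC} = \begin{cases} O\left( T^{(1-\kappa)/2} \right) > 0 & \text{if } \lambda < 0, \\ 0 & \text{if } \lambda = 0, \\ O\left( T^{(1-\kappa)/2} \right) < 0 & \text{if } \lambda > 0. \end{cases} \tag{10}$$

See the online appendix for the detailed proof of Theorem 1. Note that the result for the $t_\phi$ statistic in (8) is proved by KPS (2017). Except for a few specific cases, the asymptotic orders of the t-ratios with HAR estimators can be obtained by replacing $\kappa$ by unity.

Here we provide some intuitive results based on the following numerical simulation. We generate $K_{nt}$ by setting $K_{nt} = a + b t^{-\lambda}$, that is, assuming $e_t = 0$ for all $t$, and then calculate $\hat\phi_{nt}$ and the t-ratios. Figure 1 shows the numerical simulation results for the $t_\phi$ and $t_{\phi 2}$ statistics. As KPS demonstrate, the $t_\phi$ statistic is discontinuous at $\lambda = 0$, and as $\lambda$ increases, the t value converges to $-\sqrt{3}$. Of course, the t values decrease as $T$ increases as long as $0 < \lambda < 1$.

The asymptotic behavior of the $t_{\phi 2}$ statistic is somewhat similar to that of $t_\phi$. The $t_{\phi 2}$ statistic is discontinuous at $\lambda = 0$ and converges to $-\sqrt{3}$ as $\lambda \to \infty$. The biggest difference, however, is its behavior as $T$ varies. The $t_{\phi 2}$ statistic is almost independent of the sample size $T$, except for the case $\lambda = 1/2$. Even when $\lambda = 1/2$, the $t_{\phi 2}$ statistic changes little, since the asymptotic order is $O(1/\sqrt{\ln T})$, which is an extremely slowly varying function.

Figure 2 plots the numerical simulation values of the $t_{HAR}$ and $t_{HAC}$ statistics. With a fixed $T$, we observe that $|t_{HAR}| \le |t_{HAC}|$, where the equality holds only when $\lambda = 0$. Moreover, surprisingly, the value of $|t_{HAR}|$ is not big enough to reject the null of no convergence even when $\lambda > 0$. Also, neither t-ratio is discontinuous at $\lambda = 0$, so that there exist regions where the test becomes inconsistent.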
For example, as shown in Figure 2, if $\lambda$ is positive but close to zero, then neither $t_{HAC}$ nor $t_{HAR}$ is small enough to reject the null of no convergence, even with $T = 10{,}000$. The $t_{HAR}$ statistic does not change with $T$, while the absolute value of $t_{HAC}$ becomes larger as $T$ increases.

3 Concluding Remark

We investigate various types of long-run variance estimators for testing weak $\sigma$-convergence. Among them, we show that the original estimator suggested by KPS is the most effective unless the convergence speed is very fast. When the convergence speed is very fast, the typical sandwich form with the HAC estimator provides better results.
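The noise-free design used in the numerical simulations of Section 2 ($K_{nt} = a + b t^{-\lambda}$ with $e_t = 0$) can be sketched as follows for the sandwich-form statistics in (6) and (7). This is a hedged illustration rather than the authors' code: the function names, the values $a = b = 1$, the sample sizes, and the grid of $\lambda$ values are hypothetical, and the Bartlett weights $1 - \ell/(L+1)$ are an assumption made here so that the long-run variance estimate stays positive.

```python
import numpy as np


def bartlett_lrv(v, L):
    """Long-run variance of v with Bartlett weights 1 - l/(L+1) (assumed kernel)."""
    v = np.asarray(v, dtype=float)
    T = v.size
    L = max(0, min(int(L), T - 1))
    lrv = v @ v / T
    for ell in range(1, L + 1):
        lrv += 2.0 * (1.0 - ell / (L + 1.0)) * (v[:-ell] @ v[ell:]) / T
    return lrv


def noise_free_K(T, lam, a=1.0, b=1.0):
    """K_t = a + b * t^(-lam) with no error term; a and b are illustrative."""
    t = np.arange(1, T + 1, dtype=float)
    return a + b * t ** (-lam)


def sandwich_t_ratios(K, g=0.5, kappa=0.5):
    """t-ratios built from p_t = u_t * (demeaned t) with two lag choices."""
    K = np.asarray(K, dtype=float)
    T = K.size
    t_dm = np.arange(1, T + 1) - (T + 1) / 2.0
    K_dm = K - K.mean()
    sxx = t_dm @ t_dm
    phi_hat = (t_dm @ K_dm) / sxx
    p = (K_dm - phi_hat * t_dm) * t_dm           # residual times demeaned trend
    lags = {"HAR": np.floor(g * T), "HAC": np.floor(T ** kappa)}
    return {name: phi_hat / np.sqrt(T * bartlett_lrv(p, L) / sxx ** 2)
            for name, L in lags.items()}


# lam = 0 is skipped: K is then constant, the residual is zero, and the ratio is 0/0.
for lam in (0.2, 0.5, 1.0):
    for T in (200, 1000):
        r = sandwich_t_ratios(noise_free_K(T, lam))
        print(f"lambda={lam}, T={T}: t_HAR={r['HAR']:.3f}, t_HAC={r['HAC']:.3f}")
```

Because the weighting scheme and the constants are assumptions, the printed values need not match Figures 1 and 2; the sketch is only meant to show how the statistics are assembled from $\hat\phi_{nt}$, $\tilde p_t$ and the two lag choices.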

References

[1] Kiefer, N.M., Vogelsang, T.J. (2002a). Heteroskedasticity-autocorrelation robust testing using bandwidth equal to sample size. Econometric Theory 18, 1350-1366.
[2] Kong, J., P.C.B. Phillips, and D. Sul (2017). Weak $\sigma$-convergence: Theory and Applications. Mimeo, University of Texas at Dallas.
[3] Sun, Y. (2004). A convergent t-statistic in spurious regression. Econometric Theory 20, 943-962.
[4] Phillips, P.C.B. (2005). HAC estimation by automated regression. Econometric Theory 21, 116-142.
[5] Phillips, P.C.B., Y. Zhang and X. Wang (2012). Limit theory of three t statistics in spurious regressions. Mimeo, Singapore Management University.

[Figure 1: Numerical Simulations for $t_\phi$ and $t_{\phi 2}$ ($g = \kappa = 0.5$). Vertical axis: t ratio; horizontal axis: lambda. Series: $t_{\phi 2}$ and $t_\phi$, each with T = 1000 and T = 10000.]

[Figure 2: Numerical Simulations of the $t_{HAR}$ and $t_{HAC}$. Vertical axis: t ratio; horizontal axis: lambda. Series: $t_{HAR}$ and $t_{HAC}$, each with T = 1000 and T = 10000.]

4 Online Appendix 4. Proof of (0) We consider the exact orders of ^ 2 and ^ 2, rst. Note that as T! we have T t= ~p2 t = T = T = > t= t2 ~m 2 t t= t2 " XT gt ~t ~t t g XT # 2 ~t 2 t= t= O T 2 2 if O () if = O T 2 2 if 3=2 O T ln T if = 3=2 O T if > 3=2 Next, let and We expand G as P (T ) = T P (T ) = T X = X = t= t= ~p t ~p t+ + ~p t ~p t+ + X X T ~p t ~p t+ T = t= + = X X T tg T = t= + g (t + ) T T ( ) T X 4 X T tg = t= + g (t + ) 2 T T ( ) T X 4 X T et = t= + 2 g (t + ) + (T T ( )) 2 T X 7 X T g (t + ) 2 t e2 = t= + = 2 3 + 4 where T T ( ) = P T ~ t= tt g, which is well de ned in emma 4 in KPS (207). Further note that = X XT tg T = + g (t + ) t= = X XT t 2 + t T = + t= X X T XT T = + T t= t (t + ) t= = 2 7

2 = T T ( ) T X 4 = T T ( ) T 4 X = = = T T ( ) T 4 X 2 22 = t= + + tg + (t + ) 2 XT t= t (t + ) 2 XT T t= t XT t= (t + )2 3 = T T ( ) T X 4 = = T T ( ) T 4 X = = T T ( ) T 4 X 3 32 = t= t= + et + 2 g (t + ) t 2 (t + ) + XT T t= t2 XT t= (t + ) From the direct calculation, we can show the order of each term. Rather than write down all equations, here we show how to get the exact order of 2 and 2 Note that 2 = T T ( ) T X 4 XT = + t= t (t + ) 2 = T T ( ) T X 4 XT t 3 + t 2 + 2t 2 = + t= Consider each term. X X = = XT + t= t3 = X XT + 2 t t= = = = X + O T 4 if 4 O ( ln T ) if = 4 O () if > 4 = = X = = + + 4 (T )4 if 4 ln (T ) if = 4 ( 3) if > 4 2 t= t 2 O T 4 if 2 O T 2 ln T if = 2 O T 2 if > 2 2 (T )2 if 2 ln (T ) if = 2 ( ) if > 2

and X = XT + t= 2t2 = 2 X = = 2 X = = + + XT t= t2 O T 4 if 3 O (T ln T ) if = 3 O (T ) if > 3 3 (T )3 if 3 ln (T ) if = 3 ( 2) if > 3 Combining all terms yeilds 2 = T T ( ) T X 4 = XT + t= t (t + ) 2 O T 4 if 4 = T T ( ) T 4 O ( ln T ) if = 4 + T T ( ) T 4 O () if > 4 O T 4 if 3 +T T ( ) T 4 O (T ln T ) if = 3 O (T ) if > 3 O T 4 if 2 = T T ( ) T 4 O T 2 ln T if = 2 O T 2 if > 2 Next, replacing by T leads to 2 = T T ( ) = = > O T + if 2 O T 2+ ln T if = 2 O T 2+ if > 2 T 2 if T ln T if = () T if > O T 2 2+ if O (T ln T ) if = O T + if 2 O T + ln T if = 2 O T + if > 2 O T + if 2 O T 2+ ln T if = 2 O T 2+ if > 2 O T 4 if 2 O T 2 ln T if = 2 O T 2 if > 2 9

eanwhile to calculate the order of 2 we replace by gt That is, O T 4 if 2 2 = T T ( ) T 4 O T 2 ln T if = 2 O T 2 if > 2 O T 5 if 2 = T T ( ) O T 3 ln T if = 2 O T 3 if > 2 T 2 if O T 5 if 2 = T ln T if = O T 3 ln T if = 2 () T if > O T 3 if > 2 O T 3 2 if O T 2 ln T if = = O T 2 if 2 > O () if > 2 In the below, we provide the nal order of each term. = 2 = > O T 3 2 if 3=2 O (ln T ) if = 3=2 O () if > 3=2 c 2 (2 ) 2 T 3 2 if 2 c 2 T ln 2 T if = 2 2 ( ) (gc c 2 ) T if > 2 = 2 = > O T 2 2+ if 3=2 O T + ln T if = 3=2 O T + if > 3=2 (2 ) 2 T 2 2+ if 2 T 2+ ln 2 T if = 2 2 ( ) T 2+ if > 2 Hence = O T 3 2 if 3=2 O (ln T ) if = 3=2 O () if > 3=2 and = O T 2 2+ if 3=2 O T + ln T if = 3=2 O T + if > 3=2 Next, 2 = 22 = > > O T 3 2 if O T 2 ln T if = O T 2 if 2 O () if > 2 O T 3 2 if O T 2 ln T if = O T 2 if 2 O () if > 2 2 = 22 = > > O T 2 2+ if O T + ln T if = O T + if 2 O T + ln T if = 2 O T + if > 2 O T 2 2+ if O T + ln T if = O T + if 2 O T + ln T if = 2 O T + if > 2 0

Combining these two leads to 2 = > The third term becomes 3 = 32 = > > O T 3 2 if O T 2 ln T if = O T 2 if 2 O () if > 2 O T 2 2 if O T ln T if = O T if 2 O () if > 2 O T 3 2 if O T 2 ln T if = O T 2 if 2 O () if > 2 Combining these two terms yeilds 3 = > O T 3 2 if O T 2 ln T if = O T 2 if 2 O () if > 2 2 = 3 = 32 = > > > 3 = O T 2 2+ if O T + ln T if = O T + if 2 O T + ln T if = 2 O T + if > 2 O T (5 ) 2 if O T (5 ) 3 ln T if = O T (5 ) 3 if 2 O T 3 3 ln T if = 2 O T 3 3 if > 2 > O T 2 2+ if O T + ln T if = O T + if 2 O T + ln T if = 2 O T + if > 2 O T 2 2+ if O T + ln T if = O T + if 2 O T + ln T if = 2 O T + if > 2 ast, the fourth term becomes 4 = O T 3 2 if O T ln 2 T if = O (T ) if > 4 = O T 2 2+ if O T ln 2 T if = O (T ) if > After combining all terms, we have O T 3 2 if P (T ) = O T ln 2 T if = O (T ) if > Finally, ^ 2 = T ^ 2 = T P (T ) = O T 2 2+ if O T ln 2 T if = O (T ) if > O T 3 2 if t= ~p2 t + 2P (T ) = O T ln 2 T if = O (T ) if > t= ~p2 t + 2P (T ) = O T 2 2+ if O T ln 2 T if = O (T ) if >

Therefore we have t HAR = t HAC = > > T T 3 = O () if T =2 T 3=2 T 2 ln T T =2 T =2 ln T T 3 = O () if = T 2 T 3 = O () if > T =2 T =2 T T 3 = T ( )=2 if T =2 T +=2 T 2 ln T T 3 = T ( )=2 if = T =2 (T ln 2 T) =2 T 2 T 3 = T ( )=2 if > T =2 T =2 4.2 Proof of () As it is shown in the previous proof, the order of the t-ratio based on the HAR estimator can be directly obtained by replacing by. The underlying reason was very straightforward. The resulting order was expressed as O T 2 if P (T ) = O ln 2 T if = O () if > P (T ) = O T 2 2 if O ln 2 T if = O () if > so that by replacing by gt, and by T becomes equivalent to replace by However the order of the t is not always expressed as a function of, especially when From KPS (207), we have 6bT 2 ln T 2 T 3 =2 2 =2 if = t = 2 b2 T ln 2 T 6bT > 2 () 2 T 3 =2 (T b 2 f P t= t ( t)g) =2 if > When > the expression of the t 2 becomes the exactly same as that of the t since ^ 2 ' ^ 2 2 if > When = we have t 2 = 6bT 2 ln T 2 T 3 =2 =2 = p 6 ln T 2 b2 T (ln ) 2 ln = p ln T 6 ln g + ln T = p 6 The rest of the orders can be directly obtained by replacing by 2