
USING THE WILD BOOTSTRAP TO IMPLEMENT HETEROSKEDASTICITY-ROBUST TESTS FOR SERIAL CORRELATION IN DYNAMIC REGRESSION MODELS*

by L.G. Godfrey¹ and A.R. Tremayne¹,²

April 2003

Abstract

Conditional heteroskedasticity is a common feature of financial and macroeconomic time series data. When such data are used to estimate dynamic regression models, standard checks for serial correlation are inappropriate. In such circumstances, it is important to have valid tests that are reliable in finite samples. Generalizations of the standard Lagrange multiplier test and a Hausman-type check are examined, as is a new procedure. Monte Carlo results on significance levels and power are reported. Asymptotic critical values fail to give good control of finite sample significance levels. It is, however, found that, if a particularly simple form of the wild bootstrap is used, it is possible to obtain well-behaved tests that are asymptotically robust to conditional heteroskedasticity of unspecified form.

Keywords: heteroskedasticity; serial correlation; wild bootstrap
JEL classification: C22, C52

¹ Department of Economics, University of York, Heslington, York YO10 5DD, United Kingdom
² School of Economics and Political Science, University of Sydney, Sydney NSW 2006, Australia
* We are grateful to Peter N. Smith for helpful suggestions.

1 Introduction

The importance of testing for serial correlation after the least squares estimation of a dynamic regression model has been understood for many years. Lagrange Multiplier (LM) tests allow for flexibility in the choice of alternative hypothesis, are easily implemented, and are now used as a matter of routine in estimation programmes. However, these tests, along with many others, were derived under the assumption of homoskedasticity. It is now recognized that conditional heteroskedasticity may be common, especially (but certainly not exclusively) when the data relate to financial variables. In the presence of conditional heteroskedasticity, standard checks for serial correlation cannot be assumed to be asymptotically valid and may lead to misleading inferences. There is therefore a need to examine heteroskedasticity-robust (HR) procedures.

An obvious way to derive LM-type checks that are asymptotically HR is to employ heteroskedasticity-consistent covariance matrix estimates (HCCME) as discussed by White (1980). The basic idea of using HCCME to obtain diagnostic tests is long established; see, for example, Pagan and Hall (1983). Details of a specific application of HCCME to testing for serial correlation are given by Godfrey (1994). However, there is a considerable body of evidence indicating that tests derived from the HCCME can have finite sample distributions that are quite unlike those predicted by asymptotic theory. In particular, finite sample significance levels can be far from what is desired.

The main purpose of this paper is to report how the wild bootstrap approach can be employed to obtain control of the finite sample significance levels of serial correlation tests based upon HCCME. Given the significant increases in the availability of cheap and powerful computers in the last few years, this approach is attractive.
Once implemented in standard programmes, it could be used not only for serial correlation tests, but also for other variable addition diagnostic checks, e.g. the RESET test. Thus the wild bootstrap described below provides a way for applied workers to obtain misspecification tests that are asymptotically robust to heteroskedasticity (and nonnormality), with the bootstrap being used to improve upon the approximation provided by asymptotic theory. This combination of features matches Hansen's (1999) recommendations for good econometric practice.

In order to assess the marginal value of the results given below, reference should be made to the existing body of published work.¹ There are relatively few contributions. Robinson (1991) considers a static linear regression model and derives an LM test for serial correlation that is asymptotically valid under dynamic conditional heteroskedasticity and nonnormality. No evidence on finite sample performance is provided and, as noted by Robinson, the test is inappropriate in the presence of lagged dependent variables. Whang (1998) derives tests for serial correlation in dynamic models that are asymptotically valid in the presence of heteroskedasticity of unknown form. Thus his test serves the role for which the tests of this paper are designed. However, Whang (1998) does not use the HCCME approach. Instead his test statistic involves a nonparametric kernel estimate of the unknown variance function, and his technique requires the specification of kernel and trimming functions. He reports Monte Carlo results for his asymptotic tests based upon a static simple regression. His estimates show that asymptotic theory cannot be relied upon to give a very good approximation. For example, with 50 observations and a nominal significance level of 5%, the majority of his estimates are between 2% and 3.5%.

¹ Binbin Guo and Peter Phillips have kindly supplied a copy of unpublished work. Guo and Phillips (2000) provide tests that are asymptotically valid in the presence of conditional heteroskedasticity of unspecified form, either when there are only exogenous regressors, or when there are only lagged dependent variables, as regressors. They report Monte Carlo results for a model of the latter type. The use of asymptotic critical values produces mixed results with some evidence of tests being undersized.

The use of HCCME can also be avoided if there is precise information about the error variances. Given such information, more specific adjustments can be made to obtain HR checks for serial correlation. Silvapulle and Evans (1998) give a discussion of ARCH-corrected tests. However, such an approach suffers from the general drawback that applied researchers are in reality not likely to have accurate information about the precise form of heteroskedasticity. Mistakes made in the specification of the variance model will, in general, lead to asymptotically invalid inferences. Moreover, contrary to what is asserted by Silvapulle and Evans (1998, page 35), the Box-Pierce and Ljung-Box portmanteau tests are not asymptotically valid when exogenous variables and lagged values of the dependent variable are both included amongst the regressors. Consequently their Monte Carlo analysis includes inappropriate tests and the corresponding results must be treated with caution.

In view of the above, the novel features of our study are: principally, the use of a wild bootstrap to gain better control over significance levels than is provided by asymptotic theory; the use in Monte Carlo designs of dynamic regression models with exogenous variables; and, in addition, the derivation of a new test as a modification of the HR version of the LM statistic.

The paper consists of six sections. Section 2 contains descriptions of models and tests, including the new variant of the LM test. Bootstrap methods are explained in Section 3. The design of the Monte Carlo experiments is provided in Section 4. Results from these experiments are summarized in Section 5. Finally, Section 6 contains the conclusions.

2 Models and test procedures

The dynamic regression model is written as

$$ y_t = Y_t'\alpha + X_t'\beta + \epsilon_t = W_t'\gamma + \epsilon_t, \qquad (1) $$

in which: $Y_t' = (y_{t-1}, \ldots, y_{t-P})$ with $P \geq 1$; $\alpha' = (\alpha_1, \ldots, \alpha_P)$ with the coefficients $\alpha_i$ being such that the roots of $z^P - \alpha_1 z^{P-1} - \cdots - \alpha_P = 0$ are all strictly inside the unit circle; $X_t$ is $L$ by 1 and is a typical observation vector on regressors that are either nonrandom or strictly exogenous; $W_t' = (Y_t', X_t')$; and $\gamma' = (\alpha', \beta')$. Let $K = P + L$ denote the number of regression coefficients in (1) and $T$ denote the number of observations available for estimation. The null hypothesis to be tested is that the errors $\epsilon_t$ are serially uncorrelated.

It will be argued that the use of a wild bootstrap is very important to the reliable assessment of the statistical significance of HR variants of the LM tests of Breusch (1978) and Godfrey (1978), and the Hausman-type test of Godfrey (1997). Consequently the regularity conditions must not only support the usual HCCME and associated asymptotic tests, but also validate the bootstrap procedure.

All tests are constructed using the results of ordinary least squares (OLS) estimation of (1). It is essential that the OLS estimators are consistent under the null hypothesis. Consequently it is assumed that $RSS_j$, the residual sum of squares from the artificial regression of the $j$th variable of $W_t$ on all other regressors, is at least $O_p(T)$. Restrictions on the coefficients of the lagged dependent variables of $Y_t$ have been given above. The exogenous variables of $X_t$ are not restricted to be nonstochastic or covariance-stationary I(0) terms; see Wooldridge (1999) for a discussion of nonstationary regressors in the context of testing for serial correlation. Let $\tilde\gamma' = (\tilde\alpha', \tilde\beta')$ denote the OLS coefficient estimator for (1) and $e_t = y_t - W_t'\tilde\gamma$ be a typical residual derived from this estimator.
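The stationarity condition on the $\alpha_i$ can be checked numerically. The following is a minimal sketch for the case $P = 2$ only, using the closed-form roots of $z^2 - \alpha_1 z - \alpha_2 = 0$. The function names `ar2_roots` and `is_stationary` and the numerical values are illustrative assumptions, not part of the paper.

```python
import cmath

def ar2_roots(alpha1, alpha2):
    """Roots of z**2 - alpha1*z - alpha2 = 0, the P = 2 case of the polynomial in the text."""
    disc = cmath.sqrt(alpha1 ** 2 + 4.0 * alpha2)   # complex square root handles negative discriminants
    return ((alpha1 + disc) / 2.0, (alpha1 - disc) / 2.0)

def is_stationary(alpha1, alpha2):
    """True when both roots lie strictly inside the unit circle."""
    return all(abs(z) < 1.0 for z in ar2_roots(alpha1, alpha2))

# Illustrative values only: (0.5, 0.3) satisfies the condition, (0.5, 0.6) does not.
print(is_stationary(0.5, 0.3))   # True
print(is_stationary(0.5, 0.6))   # False
```

For larger $P$ a general polynomial root finder would be needed; the closed form is used here only to keep the sketch self-contained.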

Turning to assumptions that are made, under $H_0$, about the errors of (1), the following, provided by Gonçalves and Kilian (2002) for wild bootstraps, are adopted for the case of conditional heteroskedasticity:

a. $E(\epsilon_t \mid F_{t-1}) = 0$, almost surely, where $F_{t-1}$ is the sigma-field generated by $(\epsilon_{t-1}, \epsilon_{t-2}, \ldots)$.
b. $E(\epsilon_t^2) = \sigma^2$, $0 < \sigma^2 < \infty$.
c. $\lim_{T\to\infty} T^{-1}\sum_{t=1}^{T} E(\epsilon_t^2 \mid F_{t-1}) = \sigma^2$ in probability.
d. $E(\epsilon_t^2\epsilon_{t-r}\epsilon_{t-s}) = 0$ for all $r \neq s$, for all $t$, $r \geq 1$, $s \geq 1$.
e. $\lim_{T\to\infty} T^{-1}\sum_{t=1}^{T} \epsilon_{t-r}\epsilon_{t-s}E(\epsilon_t^2 \mid F_{t-1}) = 0$ in probability for any $r \geq 1$, $s \geq 1$.
f. $E(|\epsilon_t|^{4r})$ is uniformly bounded for some $r \geq 2$ and all $t$.

These assumptions are somewhat stronger than those required for the asymptotic validity of the HCCME and, in particular, require that at least eight, rather than at least four, moments exist. For the case of unconditional heteroskedasticity, conditions for models with strictly exogenous regressors are available in the statistics literature; see Liu (1988) and Mammen (1993).

Suppose that the null hypothesis is to be tested against the alternative of $Q$th order autocorrelation, $Q \geq 1$.² A convenient variable addition form of the LM test is a test of $\lambda = (\lambda_1, \ldots, \lambda_Q)' = 0$ in the augmented model

$$ y_t = Y_t'\alpha + X_t'\beta + E_t'\lambda + \epsilon_t = W_t'\gamma + E_t'\lambda + \epsilon_t, \qquad (2) $$

in which $E_t' = (e_{t-1}, \ldots, e_{t-Q})$ and $e_{t-q}$ is set equal to zero when $t - q \leq 0$.³ Let the OLS estimator of the regression coefficients for (2) be denoted by $\hat\gamma$ and $\hat\lambda$. The Breusch-Godfrey test is then a check of the joint significance of the elements of $\hat\lambda$.

² The $Q$th order autoregression (AR($Q$)) and $Q$th order moving average (MA($Q$)) alternatives lead to the same test statistic.
³ Restricted alternatives, e.g. simple AR($Q$) schemes, can easily be accommodated by omitting irrelevant terms from $E_t$.

Although Breusch (1978) and Godfrey (1978) proposed a large sample $\chi^2$ form of the test, subsequent investigations by Kiviet (1986) revealed that better finite sample behaviour was achieved by using an approximate F statistic. Kiviet's version of the test is denoted by $LM_F$ and it is this form that will be shown to be vulnerable to dynamic heteroskedasticity.

An alternative check for serial correlation is derived by Godfrey (1997) as an extension of the work by Dezhbakhsh and Thursby (1994). The test is a check of the significance of the difference between the estimators of $\alpha$ from (1) and (2), i.e. $(\tilde\alpha - \hat\alpha)$. Godfrey (1997) gives a formula based upon a direct application of Hausman's (1978) general result for the variance-covariance matrix of estimator contrasts. However, a variable addition form is also available. Let $A$ denote the cross-product moment matrix for the regressors of (2), i.e.

$$ A = T^{-1}\begin{bmatrix} Y'Y & Y'X & Y'E \\ X'Y & X'X & X'E \\ E'Y & E'X & E'E \end{bmatrix}, $$

in which $Y$ is $T$ by $P$ with typical row $Y_t'$, $X$ is $T$ by $L$ with typical row $X_t'$, and $E$ is $T$ by $Q$ with typical row $E_t'$. The submatrices of the inverse of $A$ corresponding to partitioning by lagged endogenous regressors, exogenous regressors, and lagged residuals are denoted by $A^{YY}$, $A^{YX}$, $A^{YE}$ etc. The variable addition form of Godfrey's (1997, p. 201, eq. 2.6) test is then obtained by testing $\delta = (\delta_1, \ldots, \delta_P)' = 0$ in the extended model

$$ y_t = Y_t'\alpha + X_t'\beta + (A^{YE}E_t)'\delta + \epsilon_t = W_t'\gamma + J_t'\delta + \epsilon_t, \qquad (3) $$

i.e. the test variables are the $P$ linear combinations of the terms in $E_t$ given by $J_t = A^{YE}E_t$. If $P \geq Q$ with rank$(A^{YE}) = Q$, the test variables of (3) are (omitting

any redundant terms) equivalent to those of (2). If $P < Q$ with rank$(A^{YE}) = P$, the Hausman and LM tests are not asymptotically equivalent and there is no generally valid ranking of these procedures by asymptotic local power; see, e.g., Holly (1982).

Using White's (1980) general formula for the HCCME, the HR version of the LM statistic can be shown to be

$$ LM_{HR} = e'E\{E'M(W)\tilde\Omega M(W)E\}^{-1}E'e, \qquad (4) $$

where $e$ is the $T$-dimensional OLS residual vector with typical element $e_t$, $\tilde\Omega = \mathrm{diag}(e_1^2, \ldots, e_T^2)$, and

$$ M(W) = I_T - P(W) = I_T - W(W'W)^{-1}W', $$

in which $W$ is the $T$ by $K$ matrix with typical row $W_t'$. The calculation of the HR-Hausman test of $\delta = 0$ in (3), when $P < Q$, is carried out in a similar fashion. The test statistic for this procedure will be denoted by $HA_{HR}$ and is given by

$$ HA_{HR} = e'J\{J'M(W)\tilde\Omega M(W)J\}^{-1}J'e, \qquad (5) $$

where $J$ is the $T \times P$ matrix with typical row $J_t'$.

It should be noted that the terms $e_t^2$ are used to compute the HCCME for both robust serial correlation tests. This corresponds to using the squared residuals from restricted (under the null hypothesis) estimation and, compared to using the unrestricted squared residuals, has been found elsewhere to give superior finite sample performance; e.g. see Davidson and MacKinnon (1985) and Godfrey and Orme (2002).⁴

⁴ HCCME are sometimes derived by multiplying squared residuals by terms that tend to unity; see, e.g., Davidson and MacKinnon (1985). Davidson and MacKinnon (1985) report that, when restricted residuals are used, such adjustments lead only to slight changes. The simplest version is, therefore, used.
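The calculation behind (4) can be sketched in a few lines. The following is a minimal illustration in plain Python, assuming that $W$ is supplied as a list of $T$ rows and that lagged residuals before the start of the sample are set to zero as described for (2). The helper names (`solve`, `ols_residuals`, `lm_hr`) and the simulated data are hypothetical and are not taken from the paper; a production implementation would normally rely on a linear algebra library.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_residuals(W, y):
    """Residuals M(W)y from the OLS regression of y on the columns of W."""
    K = len(W[0])
    WtW = [[sum(w[i] * w[j] for w in W) for j in range(K)] for i in range(K)]
    Wty = [sum(w[i] * yt for w, yt in zip(W, y)) for i in range(K)]
    g = solve(WtW, Wty)
    return [yt - sum(gi * wi for gi, wi in zip(g, w)) for w, yt in zip(W, y)]

def lm_hr(W, y, Q):
    """HR version of the LM statistic as in (4), using restricted residuals."""
    e = ols_residuals(W, y)
    T = len(e)
    # Row t of E holds (e_{t-1}, ..., e_{t-Q}); pre-sample residuals are zero.
    E = [[e[t - q] if t - q >= 0 else 0.0 for q in range(1, Q + 1)] for t in range(T)]
    # Columns of M(W)E, obtained by regressing each column of E on W.
    ME = [ols_residuals(W, [row[q] for row in E]) for q in range(Q)]
    s = [sum(E[t][q] * e[t] for t in range(T)) for q in range(Q)]        # E'e
    V = [[sum(ME[i][t] * ME[j][t] * e[t] ** 2 for t in range(T))
          for j in range(Q)] for i in range(Q)]                          # E'M(W) Omega M(W)E
    return sum(si * xi for si, xi in zip(s, solve(V, s)))

# Hypothetical simulated data: one lagged dependent variable, an intercept and one regressor.
rng = random.Random(0)
T = 40
x = [rng.gauss(0.0, 1.0) for _ in range(T + 1)]
y = [0.0]
for t in range(1, T + 1):
    y.append(0.5 * y[t - 1] + 1.0 + x[t] + rng.gauss(0.0, 1.0))
W = [[y[t - 1], 1.0, x[t]] for t in range(1, T + 1)]
print(lm_hr(W, y[1:], Q=2))
```

The statistic is a quadratic form in $E'e$, so it is non-negative and is unchanged when $y$ is rescaled with $W$ held fixed; both properties give quick sanity checks on an implementation.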

Consideration of finite sample performance suggests that, in addition to using $e_t^2$ in expressions for HCCME, it may be useful to examine devices that reduce the sampling variability of the HCCME. One such device is used here to derive a modified version of $LM_{HR}$. The basic idea is to eliminate asymptotically negligible terms. To identify and remove such terms, partition $W$ as $(W_1, W_2)$, where, under the null hypothesis,

$$ \mathrm{plim}\, T^{-1}W_1'E \neq 0 \quad \text{and} \quad \mathrm{plim}\, T^{-1}W_2'E = 0, $$

so that $W_1$ contains terms like $y_{t-j}$, $j = 1, \ldots, \min(P, Q)$, and $W_2$ contains the exogenous regressors of $X_t$ and any terms $y_{t-j}$ with $j > Q$. Standard results on projection matrices imply that

$$ M(W) = M(\tilde W_1) - P(W_2), $$

where $\tilde W_1 = M(W_2)W_1$, and $M(W_2)$, $M(\tilde W_1)$ and $P(W_2)$ have the meanings implied by the definitions of $M(W)$ and $P(W)$ in (4). Hence

$$ E'M(W)\tilde\Omega M(W)E = \Psi_1 + \Psi_2 + \Psi_2' + \Psi_3, $$

in which

$$ \Psi_1 = \check E'\tilde\Omega\check E, \quad \Psi_2 = -\check E'\tilde\Omega P(W_2)E, \quad \Psi_3 = E'P(W_2)\tilde\Omega P(W_2)E, $$

and $\check E$ denotes $M(\tilde W_1)E$. Since $\mathrm{plim}\, T^{-1}W_2'E = 0$, $\Psi_2$ and $\Psi_3$ are both $o_p(T)$ and are asymptotically negligible relative to $\Psi_1$, which is $O_p(T)$. Hence the HR-variant of the LM test given by (4) is asymptotically equivalent to the modified procedure based upon

$$ MLM_{HR} = e'E(\check E'\tilde\Omega\check E)^{-1}E'e. \qquad (6) $$

Under the null hypothesis, $LM_{HR}$ and $MLM_{HR}$ are asymptotically distributed as $\chi^2(Q)$. When $P < Q$, the asymptotic null distribution of $HA_{HR}$ is $\chi^2(P)$. (If $P \geq Q$, all three tests are asymptotically equivalent.) Asymptotically valid tests are obtained by comparing sample values of test statistics with critical values from the right-hand tail of the appropriate $\chi^2$ distributions. However, as illustrated by the results provided by Godfrey and Orme (2002), the finite sample distributions of HR statistics may differ appreciably from those predicted by asymptotic theory when several restrictions are being tested. Their results also indicate the potential usefulness of wild bootstrap methods and these techniques will be discussed in the next section.

3 Wild bootstrap methods

In order to describe the wild bootstrap, it is useful to write the OLS estimated version of (1) as

$$ y_t = W_t'\tilde\gamma + e_t, \quad t = 1, \ldots, T. \qquad (7) $$

Let $Z$ be the discrete random variable with the two-point distribution

$$ Z = \begin{cases} -(\sqrt{5} - 1)/2 & \text{with probability } (\sqrt{5} + 1)/(2\sqrt{5}), \\ (\sqrt{5} + 1)/2 & \text{otherwise,} \end{cases} \qquad (8) $$

so that $E(Z) = 0$, $E(Z^2) = 1$, and $E(Z^3) = 1$. The conditions that $E(Z) = 0$ and $E(Z^2) = 1$ are essential for the validity of the bootstrap procedure. The additional restriction pertaining to the third moment of (8) was suggested by Liu (1988).⁵ The wild bootstrap (termed the recursive-design wild bootstrap by Gonçalves and Kilian, 2002) is then implemented as follows.

(i) For $t = 1, \ldots, T$, observations are generated by

$$ y_t^* = \tilde\alpha_1 y_{t-1}^* + \cdots + \tilde\alpha_P y_{t-P}^* + X_t'\tilde\beta + \epsilon_t^*, $$

where bootstrap sample starting values are set equal to actual estimation sample starting values (see Li and Maddala, 1996, Section 2.3), and the errors $\epsilon_t^*$ are given by $\epsilon_t^* = e_t Z_t$, $Z_t$ being a drawing from (8). Note that, in contrast to the standard residual resampling scheme for iid errors, it is not necessary to centre the OLS residuals if the null model does not contain an intercept when using the wild bootstrap; see Liu (1988, pp. 1706-7).

(ii) Given the bootstrap data, the regression model is estimated and the associated bootstrap values of the test statistics $HA_{HR}$, $LM_{HR}$, and $MLM_{HR}$ are calculated.

(iii) Repeat (i) and (ii) $B$ times in order to estimate the p-values of the observed statistics $HA_{HR}$, $LM_{HR}$, and $MLM_{HR}$. The null hypothesis of serial independence is rejected for p-values that are sufficiently small.

⁵ After examining the distribution of a single linear combination of OLS estimators, Liu (1988) shows that, if $E(Z^3) = 1$, the wild bootstrap enjoys second order properties with the first three moments of the relevant test statistic being estimated correctly to $O(T^{-1})$ by the bootstrap.

Several other distributions have been suggested as alternatives to (8); e.g. see Davidson and Flachaire (2000), henceforth DF, Liu (1988), and Mammen (1993). However, with the exception of the DF scheme, the use of these alternative wild bootstraps produced results in the Monte Carlo experiments of this paper that are markedly inferior to those produced by (8). Consequently detailed results are only provided below for (8) and the DF scheme, which is given by the simple two-point distribution

$$ Z = \begin{cases} 1 & \text{with probability } 0.5, \\ -1 & \text{otherwise.} \end{cases} \qquad (9) $$
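The moment conditions for (8) and (9) and steps (i) and (iii) of the procedure can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the names `DIST_8`, `DIST_9`, `bootstrap_sample` and `bootstrap_p_value` are hypothetical, and the p-value rule (the share of bootstrap statistics at least as large as the observed statistic) is a common convention that is assumed here, since the text does not give a formula. Step (ii) would recompute the chosen test statistic on each bootstrap sample.

```python
import math
import random

SQRT5 = math.sqrt(5.0)
P8 = (SQRT5 + 1.0) / (2.0 * SQRT5)
# Two-point distributions as (value, probability) pairs.
DIST_8 = [(-(SQRT5 - 1.0) / 2.0, P8), ((SQRT5 + 1.0) / 2.0, 1.0 - P8)]   # distribution (8)
DIST_9 = [(1.0, 0.5), (-1.0, 0.5)]                                       # distribution (9)

def moment(dist, k):
    """k-th raw moment E(Z**k) of a discrete distribution."""
    return sum(p * z ** k for z, p in dist)

def draw(dist, rng):
    """One drawing of Z from a two-point distribution."""
    (z1, p1), (z2, _) = dist
    return z1 if rng.random() < p1 else z2

def bootstrap_sample(alpha, beta, X, e, y_start, dist, rng):
    """Step (i): recursive-design wild bootstrap observations y*_1, ..., y*_T.

    alpha: OLS estimates for (y_{t-1}, ..., y_{t-P}); beta: OLS estimates for X_t;
    X: list of T regressor rows; e: the T OLS residuals;
    y_start: the P actual starting values, oldest first.
    """
    P = len(alpha)
    path = list(y_start)
    for x_t, e_t in zip(X, e):
        eps_star = e_t * draw(dist, rng)                       # eps*_t = e_t * Z_t
        lag_part = sum(alpha[i] * path[-1 - i] for i in range(P))
        path.append(lag_part + sum(b * v for b, v in zip(beta, x_t)) + eps_star)
    return path[P:]

def bootstrap_p_value(observed, bootstrap_stats):
    """Step (iii), assumed rule: share of bootstrap statistics >= the observed one."""
    return sum(s >= observed for s in bootstrap_stats) / len(bootstrap_stats)

for name, dist in (("(8)", DIST_8), ("(9)", DIST_9)):
    print(name, [round(moment(dist, k), 10) for k in (1, 2, 3)])
```

Because each bootstrap error is the corresponding residual multiplied by an independent $Z_t$ with mean zero and unit variance, the bootstrap errors retain the pattern of the squared residuals observation by observation.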

(The Monte Carlo results for all wild bootstrap schemes are available on request.) The distribution of (9) satisfies the essential requirements that $E(Z) = 0$ and $E(Z^2) = 1$, but has $E(Z^3) = 0$, rather than $E(Z^3) = 1$ as in (8).

Provided the regularity conditions mentioned in Section 2 are satisfied, the wild bootstrap can be used for either conditional or unconditional heteroskedasticity. If the regressors are strictly exogenous and the variances exhibit unconditional heteroskedasticity, with variances being functions of regressor values, a paired bootstrap can be used; see Flachaire (1999). Brownstone and Valetta (2001) discuss why the wild bootstrap is likely to be more accurate than the paired bootstrap when both are appropriate.

4 Monte Carlo Design

The number of Monte Carlo replications for each experiment is $R = 25000$, with the number of bootstraps being $B = 399$. MacKinnon (2002) remarks that, while he would not recommend using $B = 399$ in genuine applications, sampling errors associated with this value tend to cancel out in Monte Carlo experiments. In empirical work, $B = 999$ could be used without incurring important waiting times, given the power of modern personal computers.

The dynamic regression model upon which experiments are based is

$$ y_t = \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \beta_1 + \beta_2 x_t + \epsilon_t, \quad t = 1, \ldots, T, \qquad (10) $$

in which $x_t$ is a scalar variable and $T$ equals 40 or 80. This model corresponds to (7) of Dezhbakhsh (1990). The values used for $(\alpha_1, \alpha_2)$ are (0.5, 0.3), (0.7, -0.2), (1.0, -0.2), (1.3, -0.5), (0.9, -0.3), and (0.6, 0.2), which are regarded by Dezhbakhsh and Thursby (1995) as typical of values observed in applied work. In all experiments, $\beta_1 = \beta_2 = 1$. The exogenous variable $x_t$ is constructed in two ways. First, it is

generated as an artificial variable according to the first order autoregression

$$ x_t = \theta x_{t-1} + v_t, \qquad (11) $$

with $v_t$ being NID$(0, \sigma_v^2)$, $\theta = 0.5$ or $0.9$, and $\sigma_v^2$ selected, given the value of $\theta$, so that Var$(x_t) = 1$.⁶ The starting value $x_0$ is a drawing from a standard normal distribution. Second, $x_t$ is obtained by standardizing the logs of quarterly observations for GDP in the UK, with the basic data being taken from the file GDP95.FIT in the Microfit 4.0 package (Pesaran and Pesaran, 1996). The transformed data appear to exhibit deterministic and stochastic trends, as might be expected for this type of macroeconomic variable. The starting values $y_{-1}$ and $y_0$ are set equal to their unconditional mean with $x = 0$, i.e. $\beta_1/(1 - \alpha_1 - \alpha_2)$. The effect of using these starting values is reduced by discarding the first 40 of any sequence of $(T + 40)$ generated values.

⁶ All pseudo-random numbers are generated using subroutines from the NAG Library.

The finite sample significance levels of versions of the tests described in Section 2 are estimated for nominal levels of 1% and 5%. As a useful example of testing using quarterly data, HR-tests $LM_{HR}$ and $MLM_{HR}$ of $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 0$ in

$$ y_t = \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \beta_1 + \beta_2 x_t + \sum_{j=1}^{4}\lambda_j e_{t-j} + \text{error}, \qquad (12) $$

are used, with $e_{t-j}$ being a typical lagged residual from the OLS estimation of (10). The Hausman-type test $HA_{HR}$ is a test of the joint significance of the estimator contrasts for $\alpha_1$ and $\alpha_2$ arising from a comparison of OLS estimates of (10) and (12), i.e. it is a special case of the test associated with (3).

With conditional heteroskedasticity, the error terms $\epsilon_t$ of (10) can be written as

$$ \epsilon_t = \sqrt{h_t}\,\zeta_t, \qquad (13) $$

where $h_t$ denotes a conditional variance and, under the null hypothesis, the terms $\zeta_t$ are iid(0,1). Various distributions for $\zeta_t$ are used. The normal distribution serves

as a benchmark and standardized forms of the $t(5)$ and $\chi^2(8)$ distributions are also employed. The $t(5)$ distribution is used, following Gonçalves and Kilian (2002), to investigate robustness of the wild bootstrap methods to departures from condition (f) of Section 2. The $\chi^2(8)$ distribution is used to provide evidence on the effects of skewness. This distribution, with a coefficient of skewness equal to 1, is heavily skewed, according to the arguments of Ramberg, Tadikamalla, Dudewicz and Mykytka (1979).

The final component required to derive a typical error $\epsilon_t$, after drawing $\zeta_t$, is the conditional standard deviation $\sqrt{h_t}$. Since the HR-tests for serial correlation are intended for general use, it is important to obtain evidence not only for several forms of heteroskedasticity, but also for the case of homoskedastic errors. The following five specifications for variance schemes are used.

First, the $h_t$ are observation-invariant, i.e. the errors are homoskedastic with

$$ h_t = \sigma^2, \quad t = 1, \ldots, T, \qquad (14) $$

$\sigma^2$ being set equal to 1 or 10.

Second, the ARCH(1) process provides an important example of conditional heteroskedasticity and is used in the form

$$ h_t = \phi_0 + \phi_1\epsilon_{t-1}^2, \qquad (15) $$

in which $\phi_0 = \sigma^2/(1 - \phi_1)$, $\phi_1 = 0.4$ or $0.8$, and $\sigma^2$ is defined as for (14).

The GARCH(1, 1) model is also frequently used in applied studies. This variance scheme is written as

$$ h_t = \psi_0 + \psi_1\epsilon_{t-1}^2 + \psi_2 h_{t-1}, \qquad (16) $$

where $\psi_0 = 1$, $\psi_1 = 0.1$, and $\psi_2 = 0.8$; see Bollerslev (1986). The values of $\psi_1$ and $\psi_2$ are similar to those reported in empirical work. The average $R^2$ calculated from the simulations under this specification is about 0.5.⁷

⁷ The average $R^2$ varies between 0.4 and 0.9 across the five variance schemes, which seems consistent with values observed in applied work.
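The recursions in (13), (15) and (16) can be simulated directly. The sketch below is illustrative only: the function names, the zero starting value for the lagged error, the use of the unconditional variance $\psi_0/(1 - \psi_1 - \psi_2)$ to start the GARCH recursion, the burn-in applied to the errors, and the value `phi0=1.0` are all assumptions made for the example, and the normal case is used for $\zeta_t$.

```python
import math
import random

def arch1_errors(zeta, phi0, phi1, eps0=0.0):
    """Errors eps_t = sqrt(h_t) * zeta_t with h_t = phi0 + phi1 * eps_{t-1}**2, as in (13) and (15)."""
    eps, prev = [], eps0
    for z in zeta:
        h = phi0 + phi1 * prev ** 2
        prev = math.sqrt(h) * z
        eps.append(prev)
    return eps

def garch11_errors(zeta, psi0, psi1, psi2, eps0=0.0, h0=None):
    """Errors with h_t = psi0 + psi1 * eps_{t-1}**2 + psi2 * h_{t-1}, as in (13) and (16)."""
    # Assumption: h_0 is the unconditional variance unless a value is supplied.
    h_prev = psi0 / (1.0 - psi1 - psi2) if h0 is None else h0
    eps, prev = [], eps0
    for z in zeta:
        h = psi0 + psi1 * prev ** 2 + psi2 * h_prev
        prev = math.sqrt(h) * z
        eps.append(prev)
        h_prev = h
    return eps

rng = random.Random(12345)
T, burn_in = 40, 40            # assumed: drop the first 40 generated errors
zeta = [rng.gauss(0.0, 1.0) for _ in range(T + burn_in)]
eps_arch = arch1_errors(zeta, phi0=1.0, phi1=0.8)[burn_in:]               # phi0 is illustrative
eps_garch = garch11_errors(zeta, psi0=1.0, psi1=0.1, psi2=0.8)[burn_in:]  # values from (16)
print(len(eps_arch), len(eps_garch))
```

Setting `phi1=0.0` in `arch1_errors` reproduces the homoskedastic case (14) with $h_t = \phi_0$.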

Given the widespread use of quarterly data in applied work, it seems apposite to examine seasonal schemes of heteroskedasticity. Consideration is, therefore, given to a fourth-order model taken from Engle's (1982) classic article. As in (38) of Engle (1982, page 1002), conditional variances $h_t$ are written as

$$ h_t = \phi_0 + \phi_1(0.4\epsilon_{t-1}^2 + 0.3\epsilon_{t-2}^2 + 0.2\epsilon_{t-3}^2 + 0.1\epsilon_{t-4}^2), \qquad (17) $$

in which $\phi_0$ and $\phi_1$ are as defined for (15).

The fifth and final model to be adopted has unconditional quarterly heteroskedasticity and can be written as

$$ (\omega_1^2, \omega_2^2, \omega_3^2, \omega_4^2) = \left(\frac{2\sigma^2}{1 + c}, \frac{2\sigma^2}{1 + c}, \frac{2c\sigma^2}{1 + c}, \frac{2c\sigma^2}{1 + c}\right), \qquad (18) $$

in which $\omega_j^2$ denotes the variance in quarter $j$, the average of these terms is $\sigma^2$, $\sigma^2$ is as specified in (14), and $c$ equals 4 or 9. The model of (18) is similar in spirit to that used by Burridge and Taylor (2001, p. 104). Note that, although (18) generates unconditional heteroskedasticity, its effects on serial correlation tests cannot be ignored. In the estimation of the covariance matrix of OLS estimators for (12), the second moments of the errors are not asymptotically orthogonal to the squared regressors $e_{t-j}^2$, $j = 1, \ldots, 4$, so that the conventional (homoskedasticity-based) estimator is not consistent; see White (1980, p. 826).

In addition to significance levels, the sensitivity of tests to serial correlation is clearly of interest. For experiments designed to provide evidence on power, various alternatives are used, some of which involve a combination of conditional heteroskedasticity and serial correlation. Five alternatives with homoskedastic errors are obtained as special cases of the stationary fifth-order autoregression

$$ \epsilon_t = \sum_{j=1}^{5}\rho_j\epsilon_{t-j} + \zeta_t, \qquad (19) $$

which replaces (13). The coefficient vectors $\rho' = (\rho_1, \rho_2, \rho_3, \rho_4, \rho_5)$ of these special cases are as follows:

$$ \rho_{(1)}' = (0.7, 0.0, 0.0, 0.0, 0.0); \qquad (20) $$
$$ \rho_{(2)}' = (0.0, 0.0, 0.0, 0.7, 0.0); \qquad (21) $$
$$ \rho_{(3)}' = (1.4, -0.71, 0.15, -0.01, 0.0); \qquad (22) $$
$$ \rho_{(4)}' = (0.8, 0.0, -0.2, 0.06, 0.0); \text{ and} \qquad (23) $$
$$ \rho_{(5)}' = (0.5, 0.0, 0.0, 0.5, -0.25). \qquad (24) $$

Thus the alternative hypothesis of the tests $LM_{HR}$ and $MLM_{HR}$ from (12) is overspecified for the first two cases and is underspecified for the final case.⁸ The roots of the polynomial equations $z^4 - \rho_1 z^3 - \cdots - \rho_4 = 0$ implied by the coefficients of (22) and (23) are (0.5, 0.4, 0.3, 0.2) and (0.5, -0.5, 0.4 ± 0.3i), respectively.

In order to derive power estimates in the presence of ARCH errors, the general random coefficient specification discussed by Bera and Higgins (1993) is adopted. The two specific error models used in the Monte Carlo experiments are

$$ \epsilon_t = (0.7 + 0.1\tau_t)\epsilon_{t-1} + \zeta_t, \qquad (25) $$

and

$$ \epsilon_t = (0.7 + 0.1\tau_t)\epsilon_{t-4} + \zeta_t, \qquad (26) $$

in which the $\tau_t$ are NID(0, 1) and are independent of $\zeta_s$ for all $s$ and $t$. Clearly the probability that the random coefficient of the serial correlation process is strictly between 0 and 1 is high for equations (25) and (26).

⁸ In practical situations, there is little information about the nature of serial correlation and so it is useful to examine overspecified and underspecified tests.
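The link between the stated roots and the autoregressive coefficients can be checked by expanding $\prod_j (z - r_j) = z^4 - \rho_1 z^3 - \cdots - \rho_4$. The short sketch below does this in plain Python; the function name `ar_coefficients_from_roots` is hypothetical. The two sets of roots quoted in the text give coefficients of approximately (1.4, -0.71, 0.154, -0.012) and (0.8, 0.0, -0.2, 0.0625), which can be compared with the first four elements of (22) and (23).

```python
def ar_coefficients_from_roots(roots):
    """Expand prod_j (z - r_j) = z**n - rho_1*z**(n-1) - ... - rho_n and return [rho_1, ..., rho_n]."""
    poly = [1.0 + 0.0j]                              # polynomial coefficients, highest power first
    for r in roots:
        times_z = poly + [0.0j]                      # poly(z) * z
        times_r = [0.0j] + [r * c for c in poly]     # poly(z) * r
        poly = [a - b for a, b in zip(times_z, times_r)]
    # Complex roots come in conjugate pairs, so imaginary parts cancel up to rounding error.
    return [-c.real for c in poly[1:]]

print(ar_coefficients_from_roots([0.5, 0.4, 0.3, 0.2]))
print(ar_coefficients_from_roots([0.5, -0.5, 0.4 + 0.3j, 0.4 - 0.3j]))
```

All of the quoted roots have modulus 0.5 or less, so both processes are comfortably inside the stationary region.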

5 Monte Carlo results

The estimates of significance levels provide clear evidence of the following main features.

(i) As expected from the work of Kiviet (1986), the standard $LM_F$ test is well-behaved when there is neither conditional nor unconditional heteroskedasticity. However, in the presence of either of these forms of heteroskedasticity, $LM_F$ rejects too frequently relative to the nominal value. The extent of the problem usually increases with the sample size, markedly so in the case of some forms of conditional heteroskedasticity. The error distribution does not appear to have any important effects on rejection rates.

(ii) The estimates for $MLM_{HR}$ are quite similar to those for $LM_{HR}$, whichever of the three methods is used to obtain the critical value. When the asymptotic $\chi^2(4)$ distribution is used, $LM_{HR}$ and $MLM_{HR}$ are both markedly undersized at 1% and 5% levels. The problem is especially severe when the nominal size is 1%, as might be appropriate when a check for serial correlation is one of a battery of misspecification tests. When $PD_1$ is used in the wild bootstrap, $LM_{HR}$ and $MLM_{HR}$ have estimates that are slightly greater than required values. The use of $PD_2$ leads to better agreement between estimates and nominal values.⁹ Indeed all the evidence suggests that using $PD_2$ leads to very good control of finite sample significance levels.

⁹ This ranking of $PD_1$ and $PD_2$ by errors in estimated rejection probabilities conflicts with Liu's (1988) asymptotically valid results for a quasi-t test.

(iii) The Hausman-type test $HA_{HR}$ is, like $LM_{HR}$ and $MLM_{HR}$, undersized at the nominal 1% level when asymptotic critical values are used. However, asymptotic critical values produce estimates close to the nominal significance level of 5%. It might be conjectured that the improved performance of asymptotic theory reflects the reduction in the number of restrictions being tested, i.e. two rather than four.

The use of $PD_1$ again leads to slightly oversized tests and good control is derived when $PD_2$ is employed.

The full set of Monte Carlo results is available on request, but, in order to save space, only a representative sample is provided below. This sample is obtained from five cases. Details of these cases are given in Table 1. The cases cover all variance models and both types of exogenous regressor sequence. For variance models (15), (17) and (18), there is a choice between two levels of heteroskedasticity. The stronger of the two forms has been used in all cases in order to obtain stringent checks of the efficacy of the wild bootstrap. For the artificial AR(1) regressor of (11), the smaller value of $\theta$, i.e. $\theta = 0.5$, is used so that the time series behaviour of this variable is not similar to that of the standardized log(UK GDP) series, which can be modelled by an I(1) process. All the cases in Table 1 are for (10) with $(\alpha_1 = 0.5, \alpha_2 = 0.3)$. Different choices of $(\alpha_1, \alpha_2)$ from the pairs of values given in Section 4 do not lead to important changes in results; see Godfrey and Tremayne (2002). When using the cases of Table 1 to obtain information about HR-tests, estimates are only used for $T = 40$ in order to show just how well the wild bootstrap with $PD_2$ works in small samples.

Table 1  Sample of cases used in Tables 2-5

          Variance model                                     Regressor
Case 1    (14) with $\sigma^2 = 1$                           log(UK GDP) (a)
Case 2    (15) with $\phi_1 = 0.8$                           (11) with $\theta = 0.5$
Case 3    (16) with $\psi_1 = 0.1$ and $\psi_2 = 0.8$        log(UK GDP) (a)
Case 4    (17) with $\phi_1 = 0.8$                           (11) with $\theta = 0.5$
Case 5    (18) with $c = 9$                                  log(UK GDP) (a)

Note: (a) These data are standardized before OLS estimation.

Tables 2 to 5, constructed from estimates from Cases 1 to 5, contain examples that reflect the general findings (i) to (iii). In particular, the departures from nominal significance levels observed when $LM_F$ is asymptotically inappropriate indicate the vulnerability of this well-known standard check to ARCH and GARCH forms of conditional heteroskedasticity and to seasonal unconditional heteroskedasticity. The contents of Tables 3 to 5 show very clearly that asymptotic critical values cannot be relied upon when using HR tests. However, the combination of restricted residuals in the HCCME and the very simple $PD_2$ wild bootstrap gives tests with good finite sample behaviour. The results for the $t(5)$ error distribution in Tables 3 to 5 indicate that the wild bootstrap can work well even when regularity condition (f) of Section 2 is not satisfied.

Table 2  Estimated significance levels of $LM_F$ with $T = 40$ and $T = 80$

             nominal significance level 1%    nominal significance level 5%
             T = 40      T = 80               T = 40      T = 80

Distribution for $\zeta_t$ of (13) is N(0, 1)
Case 1        0.92        0.90                 4.61        4.27
Case 2        2.26        6.48                 8.29       15.94
Case 3        1.34        1.59                 5.86        6.38
Case 4        2.35        6.14                 8.95       16.22
Case 5        1.99        2.26                 7.81        8.31

Distribution for $\zeta_t$ of (13) is standardized $t(5)$
Case 1        1.00        1.04                 4.54        4.64
Case 2        2.37        6.18                 7.95       15.05
Case 3        1.87        2.14                 6.44        7.57
Case 4        2.65        7.22                 9.22       17.51
Case 5        2.77        2.78                 8.07        8.23

Distribution for $\zeta_t$ of (13) is standardized $\chi^2(8)$
Case 1        1.06        0.97                 4.72        4.33
Case 2        2.34        6.07                 8.17       15.44
Case 3        1.66        2.14                 6.52        7.50
Case 4        2.61        6.42                 8.93       16.68
Case 5        2.89        2.80                 9.00        8.45

Notes: Cases are as specified in Table 1. All estimates are given as percentages.

Table 3
Estimated significance levels of LM_HR with T = 40

                 nominal significance level 1%    nominal significance level 5%
critical value     χ²(4)    PD_1    PD_2            χ²(4)    PD_1    PD_2

Distribution for ζ_t of (13) is N(0, 1)
Case 1              0.30    1.33    0.98             3.69    5.98    5.04
Case 2              0.17    1.36    1.00             3.41    5.99    5.08
Case 3              0.21    1.43    1.14             3.34    5.86    5.07
Case 4              0.18    1.27    1.02             3.52    6.26    5.20
Case 5              0.16    1.56    1.29             3.36    6.62    5.91

Distribution for ζ_t of (13) is standardized t(5)
Case 1              0.17    1.12    0.85             2.88    5.48    4.41
Case 2              0.14    1.23    0.94             2.90    5.82    4.78
Case 3              0.14    1.20    1.00             2.98    5.92    5.14
Case 4              0.15    1.25    0.90             3.04    5.78    4.87
Case 5              0.12    1.22    1.10             2.62    6.08    5.47

Distribution for ζ_t of (13) is standardized χ²(8)
Case 1              0.22    1.15    0.90             3.24    5.55    4.61
Case 2              0.16    1.15    0.96             3.18    6.01    4.92
Case 3              0.17    1.23    0.90             3.09    5.75    4.94
Case 4              0.15    1.20    0.87             3.20    5.88    4.96
Case 5              0.17    1.28    1.18             2.99    6.40    5.76

Notes: Cases are as specified in Table 1. All estimates are given as percentages.

Table 4
Estimated significance levels of HA_HR with T = 40

                 nominal significance level 1%    nominal significance level 5%
critical value     χ²(2)    PD_1    PD_2            χ²(2)    PD_1    PD_2

Distribution for ζ_t of (13) is N(0, 1)
Case 1              0.50    1.29    1.06             4.77    5.69    4.95
Case 2              0.56    1.35    0.99             5.62    5.96    5.02
Case 3              0.48    1.60    1.30             4.34    6.30    5.75
Case 4              0.57    1.37    1.03             5.60    5.95    4.93
Case 5              0.30    1.38    1.20             3.64    6.24    5.80

Distribution for ζ_t of (13) is standardized t(5)
Case 1              0.40    1.12    0.84             4.32    5.31    4.68
Case 2              0.48    1.31    0.96             5.04    5.92    4.89
Case 3              0.39    1.43    1.22             4.29    6.30    5.71
Case 4              0.50    1.35    0.95             5.33    6.00    5.08
Case 5              0.23    1.16    1.09             3.22    5.82    5.53

Distribution for ζ_t of (13) is standardized χ²(8)
Case 1              0.41    1.20    0.94             4.40    5.39    4.57
Case 2              0.51    1.29    1.06             5.70    6.26    5.31
Case 3              0.34    1.38    1.11             4.32    6.38    5.89
Case 4              0.57    1.35    1.01             5.77    6.30    5.34
Case 5              0.28    1.41    1.16             3.77    6.26    6.01

Notes: Cases are as specified in Table 1. All estimates are given as percentages.

Table 5
Estimated significance levels of MLM_HR with T = 40

                 nominal significance level 1%    nominal significance level 5%
critical value     χ²(4)    PD_1    PD_2            χ²(4)    PD_1    PD_2

Distribution for ζ_t of (13) is N(0, 1)
Case 1              0.24    1.30    0.97             3.25    5.68    4.82
Case 2              0.14    1.25    1.00             2.86    5.78    4.98
Case 3              0.17    1.26    1.05             2.71    5.51    4.84
Case 4              0.13    1.24    0.99             2.98    5.90    5.12
Case 5              0.13    1.43    1.20             2.70    6.22    5.65

Distribution for ζ_t of (13) is standardized t(5)
Case 1              0.11    1.12    0.86             2.65    5.25    4.48
Case 2              0.12    1.14    1.02             2.64    5.63    4.74
Case 3              0.13    1.08    0.90             2.50    5.55    4.96
Case 4              0.14    1.16    0.90             2.64    5.57    4.81
Case 5              0.11    1.18    1.10             2.24    5.80    5.45

Distribution for ζ_t of (13) is standardized χ²(8)
Case 1              0.18    1.02    0.83             2.74    5.12    4.33
Case 2              0.15    1.07    0.90             2.72    5.78    4.92
Case 3              0.13    1.23    0.96             2.63    5.58    4.89
Case 4              0.12    1.18    0.86             2.80    5.72    4.97
Case 5              0.14    1.25    1.16             2.54    6.02    5.53

Notes: Cases are as specified in Table 1. All estimates are given as percentages.

Turning to estimates of power derived from (19)-(26), results for the artificial regressor sequence of (11) with θ = 0.5 are given in Table 6. The following features emerge from consideration of the contents of Table 6.

(iv) When LM_F is asymptotically valid, i.e. in the cases of panel (a), it is usually slightly more powerful than the HR variants LM_HR and MLM_HR. However, the cost of the insurance premium for robustness to unspecified forms of heteroskedasticity is not large. Estimates for LM_F are not given for the ARCH models in panel (b) because the test is asymptotically inappropriate; see Table 2.

(v) As predicted by the results of Holly (1982), there is no generally valid ranking by power that applies to HA_HR on the one hand and the asymptotically equivalent LM_HR and MLM_HR on the other. Differences between Hausman-type and LM-type tests vary in sign and are usually small. However, an important difference is observed in the cases of the simple AR(4) model defined by ρ(2) of (21) and the corresponding ARCH process (26). For these two cases, HA_HR has manifestly inferior power. The insensitivity of HA_HR to serial correlation is worrying even if the implied inconsistencies of OLS estimators are small. The usual HCCME is inappropriate in the presence of serial correlation, so that failure to detect serial correlation may lead to misleading inferences. Overall it may be safer to use either LM_HR or MLM_HR, the differences between the power estimates of these asymptotically equivalent tests being small.

(vi) Finally, comparison of the first and second rows of results in panel (a) with the corresponding results in panel (b) suggests that power estimates are not greatly affected by the presence of random variation in AR coefficients.

Table 6
Estimates of power with T = 40 and T = 80
Model is (10) with (α_1, α_2, β_1, β_2) = (0.5, 0.3, 1.0, 1.0)
AR(1) regressor with θ = 0.5

(a) Errors ε_t generated by stationary AR(5) model (19) with perturbations ζ_t being NID(0,1)

                         T = 40                              T = 80
ρ(j) of      LM_F   LM_HR   HA_HR   MLM_HR       LM_F   LM_HR   HA_HR   MLM_HR
(20)         36.7    31.5    33.4    30.2        80.6    75.1    74.9    74.5
(21)         82.2    71.9     9.0    72.9        99.7    99.3    18.0    99.3
(22)         14.7    17.8    21.7    16.5        69.1    65.5    72.0    64.5
(23)         30.3    28.2    24.5    26.9        91.5    86.6    75.5    86.3
(24)         47.6    39.5    27.2    39.6        95.8    92.3    76.5    92.1

(b) Errors ε_t generated by ARCH models (25) and (26) with perturbations ζ_t being NID(0,1)

                         T = 40                              T = 80
ARCH model   LM_F   LM_HR   HA_HR   MLM_HR       LM_F   LM_HR   HA_HR   MLM_HR
(25)          n/a    30.9    32.2    29.5         n/a    74.3    74.7    73.9
(26)          n/a    67.3     9.7    67.6         n/a    99.0    18.9    99.0

Notes: All critical values estimated using PD_2 wild bootstrap, with nominal significance level of 5%. n/a denotes that the test LM_F is inappropriate.
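Finding (iv), that the power cost of robustness is not large, can be checked with simple arithmetic on panel (a) of Table 6. The short sketch below averages the T = 40 power differences between LM_F and each of the two HR LM-type tests over the five AR alternatives (20)-(24). The helper name mean_gap is hypothetical, and an unweighted average is only one of several reasonable summaries.

```python
# Power estimates (%) from Table 6, panel (a), T = 40.
# Rows correspond to rho(j) of (20), (21), (22), (23), (24).
lm_f = [36.7, 82.2, 14.7, 30.3, 47.6]
lm_hr = [31.5, 71.9, 17.8, 28.2, 39.5]
mlm_hr = [30.2, 72.9, 16.5, 26.9, 39.6]

def mean_gap(a, b):
    """Average difference a - b in percentage points (hypothetical helper)."""
    return sum(ai - bi for ai, bi in zip(a, b)) / len(a)

# Average power advantage of LM_F over each robust variant.
print(round(mean_gap(lm_f, lm_hr), 2))    # 4.52
print(round(mean_gap(lm_f, mlm_hr), 2))   # 5.08
```

On this summary the non-robust test gains about five percentage points of power on average at T = 40, and for alternative (22) both robust variants are in fact more powerful than LM_F.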

6 Conclusions

Time-varying heteroskedasticity is recognized as an important phenomenon in applied studies; see, e.g., the discussions in Bollerslev, Engle and Nelson (1994). It is shown that misleading inferences may be made when standard checks for serial correlation are applied to dynamic regression models in the presence of either conditional or seasonal heteroskedasticity. To be more precise, simulation evidence presented in this paper illustrates a tendency for the well-known Lagrange multiplier test to be oversized, sometimes by a substantial amount. This evidence is derived from Monte Carlo experiments designed to mimic quarterly empirical studies. In these experiments, heteroskedasticity-robust versions of the usual Lagrange multiplier test and a Hausman-type test are also examined, along with a modification of the former procedure. The modification is designed to improve performance by using a less variable estimate of the covariance matrix.

It is clear from the Monte Carlo results that the use of asymptotic critical values cannot be recommended. Heteroskedasticity-robust tests conducted on this basis are found to be undersized. However, a wild bootstrap approach proves to be reliable, with good control over finite sample significance levels being provided by the simple two-point pick distribution discussed by Davidson and Flachaire (2000). Results on power are also reported. The original and modified forms of the Lagrange multiplier test are very similar in performance. Under several alternatives, the Hausman test rejects a false null with a frequency similar to the other two tests, but evidence is found that it has relatively weak power against simple fourth-order autocorrelation. Consequently it may be safer to use the Lagrange multiplier test in the form in which it is robust to heteroskedasticity of unspecified form, or the modified version of this test, rather than the Hausman test.

References

Bera, A.K. and M.L. Higgins, ARCH Models: Properties, Estimation and Testing, Journal of Economic Surveys 7 (1993), 305-366.
Bollerslev, T., Generalized Autoregressive Conditional Heteroskedasticity, Journal of Econometrics 31 (1986), 307-327.
Bollerslev, T., R.F. Engle and D.B. Nelson, ARCH Models, in R.F. Engle and D.L. McFadden (eds.), Handbook of Econometrics, Volume IV (Amsterdam: North Holland, 1994).
Breusch, T.S., Testing for Autocorrelation in Dynamic Linear Models, Australian Economic Papers 17 (1978), 334-355.
Brownstone, D. and R. Valletta, The bootstrap and multiple imputations: harnessing increased computing power for improved statistical tests, Journal of Economic Perspectives 15 (2001), 129-142.
Burridge, P. and A.M.R. Taylor, On regression-based tests for seasonal unit roots in the presence of periodic heteroscedasticity, Journal of Econometrics 104 (2001), 91-118.
Davidson, R. and E. Flachaire, The Wild Bootstrap, Tamed at Last, working paper, GREQAM (2000).
Davidson, R. and J.G. MacKinnon, Heteroskedasticity-Robust Tests in Regression Directions, Annales de l'INSEE 59/60 (1985), 183-218.
Dezhbakhsh, H., The Inappropriate Use of Serial Correlation Tests in Dynamic Linear Models, Review of Economics and Statistics 72 (1990), 126-132.
Dezhbakhsh, H. and J.G. Thursby, Testing for Autocorrelation in the Presence of Lagged Dependent Variables: A Specification Error Approach, Journal of Econometrics 60 (1994), 251-272.
Dezhbakhsh, H. and J.G. Thursby, A Monte Carlo Comparison of Tests Based on the Durbin-Watson Statistic with Other Autocorrelation Tests in Dynamic Models, Econometric Reviews 14 (1995), 347-366.
Engle, R.F., Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of U.K. Inflation, Econometrica 50 (1982), 987-1008.
Flachaire, E., A better way to bootstrap pairs, Economics Letters 64 (1999), 257-262.
Godfrey, L.G., Testing Against General Autoregressive and Moving Average Error Models when the Regressors Include Lagged Dependent Variables, Econometrica 46 (1978), 1293-1302.
Godfrey, L.G., Testing for serial correlation by variable addition in dynamic models estimated by instrumental variables, Review of Economics and Statistics 76 (1994), 550-559.
Godfrey, L.G., Hausman Tests for Autocorrelation in the Presence of Lagged Dependent Variables: Some Further Results, Journal of Econometrics 82 (1997), 197-207.
Godfrey, L.G. and C.D. Orme, Significance Levels of Heteroskedasticity-Robust Tests for Specification and Misspecification: Some Results on the Use of the Wild Bootstrap, unpublished paper, University of York (2002).
Godfrey, L.G. and A.R. Tremayne, Heteroskedasticity-robust testing for serial correlation in regression models with lagged dependent variables, unpublished paper, University of York (2002).
Gonçalves, S. and L. Kilian, Bootstrapping autoregressions with conditional heteroskedasticity of unknown form, European Central Bank Working Paper no. 196 (2002).
Guo, B.B. and P.C.B. Phillips, Testing for Autocorrelation and Unit Roots in the Presence of Conditional Heteroskedasticity of Unknown Form, unpublished paper, Yale University (2000).
Hansen, B.E., Discussion of Data Mining Reconsidered, Econometrics Journal 2 (1999), 192-201.
Hausman, J.A., Specification Error Tests in Econometrics, Econometrica 46 (1978), 1251-1271.
Holly, A., A Remark on Hausman's Specification Test, Econometrica 50 (1982), 749-759.
Kiviet, J., On the rigour of some specification tests for modelling dynamic relationships, Review of Economic Studies 53 (1986), 241-262.
Li, H. and G.S. Maddala, Bootstrapping Time Series Models, Econometric Reviews 15 (1996), 115-158.
Liu, R.Y., Bootstrap Procedures under Some Non-IID Models, Annals of Statistics 16 (1988), 1696-1708.
MacKinnon, J.G., Bootstrap inference in econometrics, unpublished paper, Queen's University (2002).
Mammen, E., Bootstrap and Wild Bootstrap for High Dimensional Linear Models, Annals of Statistics 21 (1993), 255-285.
Pagan, A.R. and A.D. Hall, Diagnostic tests as residual analysis, Econometric Reviews 2 (1983), 159-218.
Pesaran, M.H. and B. Pesaran, Microfit 4.0 (Oxford: Oxford University Press, 1996).
Ramberg, J.S., E.J. Dudewicz, P.R. Tadikamalla and E.F. Mykytka, A Probability Distribution and its Uses in Fitting Data, Technometrics 21 (1979), 201-213.
Robinson, P.M., Testing for serial correlation and dynamic conditional heteroskedasticity in multiple regression, Journal of Econometrics 47 (1991), 67-84.
Silvapulle, P. and M. Evans, Testing for Serial Correlation in the Presence of Dynamic Heteroscedasticity, Econometric Reviews 17 (1998), 31-56.
Whang, Y.-J., A Test of Autocorrelation in the Presence of Heteroskedasticity of Unknown Form, Econometric Theory 14 (1998), 87-122.
White, H., A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity, Econometrica 48 (1980), 817-838.
Wooldridge, J.M., Asymptotic Properties of Some Specification Tests in Linear Models with Integrated Processes, in R.F. Engle and H. White (eds.), Cointegration, Causality and Forecasting: Festschrift in Honour of Clive W.J. Granger (Oxford: Oxford University Press, 1999), 366-384.
