Block Bootstrap HAC Robust Tests: The Sophistication of the Naive Bootstrap


Sílvia Gonçalves
Département de sciences économiques, CIREQ and CIRANO, Université de Montréal

Timothy J. Vogelsang
Department of Economics and Department of Statistical Science, Cornell University

April 27, 2006

Abstract

This paper studies the properties of naive block bootstrap tests that are scaled by zero frequency spectral density estimators (long run variance estimators). The naive bootstrap is a bootstrap where the formula used in the bootstrap world to compute the test is the same as the formula used on the original data. Simulation evidence shows that the naive bootstrap is much more accurate than the standard normal approximation. The larger the HAC bandwidth, the greater the improvement. This improvement holds for a large number of popular kernels, including the Bartlett kernel, and it holds when the i.i.d. bootstrap is used even though the data are serially correlated. The patterns in the simulation evidence are in stark contrast with the conventional wisdom, which suggests that we must use different formulae for computing the standard error estimates in the bootstrap data and in the actual data, and they would appear puzzling at first glance. Using the recently developed fixed bandwidth asymptotics for HAC robust tests, we provide theoretical results that can explain the finite sample patterns. We show that the block bootstrap, including the special case of the i.i.d. bootstrap, has the same limiting distribution as the fixed-b asymptotic distribution. For the special case of a location model with a Bartlett kernel HAC variance estimator, we provide theoretical results that suggest the naive bootstrap provides a refinement over the standard normal approximation. Our theoretical results lay the foundation for a bootstrap asymptotic theory that is an alternative to the traditional approach based on Edgeworth expansions.

For helpful comments and suggestions we thank Lutz Kilian, Guido Kuersteiner, Nour Meddahi, Ulrich Müller, Pierre Perron, Yixiao Sun, Hal White, and seminar participants at Boston University, Queen's University, U. Toronto, Johns Hopkins Biostatistics, Chicago GSB, UCLA, UCSD, U. Michigan, U. Laval, U. Pittsburgh, the 2005 Winter Meetings of the Econometric Society in Philadelphia, the 2005 European Economic Association Meetings in Amsterdam, and the 2004 Forecasting Conference at Duke University. Vogelsang thanks the Center for Analytic Economics at Cornell and gratefully acknowledges financial support from the NSF through grant SES-52577.

1 Introduction

The bootstrap, originally proposed by Efron (1979), has become a standard tool used in statistics as a way to provide critical values for test statistics. The impact of the bootstrap on econometrics has been substantial. The bootstrap is a computationally intensive alternative to using critical values based on asymptotic approximations. Interest in the bootstrap has increased over time for two reasons. First, the computational power of computers has increased to the point where the bootstrap can often be quickly implemented in practice. Second, in simulation studies it is often found that the approximation delivered by the bootstrap is more accurate than the approximation given by first order asymptotics. Theoretical explanations for this improvement in accuracy, often called a refinement, are typically established using higher order asymptotics.

With time series data, implementation of the bootstrap is more complicated than in the i.i.d. case because of the dependence structure in the data. Many variants of the bootstrap have been proposed for dependent data, including the well known moving blocks bootstrap originally proposed by Künsch (1989). Theoretical conditions under which the block bootstrap can be expected to provide refinements have been established by Götze and Künsch (1996), Lahiri (1996), Andrews (2002) and others. Refinements of the block bootstrap in generalized method of moments (GMM) models have been shown by Hall and Horowitz (1996) and Inoue and Shintani (2004). The theoretical results in these papers have been established using Edgeworth expansions with leading terms that are distributed standard normal.

When the moving blocks bootstrap (MBB) is applied to tests based on heteroskedasticity autocorrelation (HAC) robust variance estimators, conventional wisdom makes the following predictions:

1. The i.i.d. bootstrap does not work when data are dependent (Singh, 1981).

2. The naive bootstrap, defined as a bootstrap where the formula used in the bootstrap world is the same as the formula used to compute the statistic using the actual data, is no more accurate than the usual first order asymptotic approximation (Davison and Hall, 1993, and Götze and Künsch, 1996, hereafter GK).¹

3. The bootstrap for the Bartlett kernel based test is no more accurate than the standard first-order asymptotic approximation, whereas other kernels, including the quadratic spectral kernel, can lead to bootstrap tests with higher order refinements (Götze and Künsch, 1996, and Inoue and Shintani, 2004).

In a recent paper, Kiefer and Vogelsang (2005) reported small sample simulation results for HAC robust t-statistics for testing hypotheses about the sample mean of a stationary univariate time series that suggest the naive bootstrap performs much better than predicted by the existing theoretical literature.

¹ Davison and Hall (1993) and Götze and Künsch (1996) suggest alternatives to the naive bootstrap that use centering and variance estimators for the bootstrap that differ from what are used for the original data. We do not analyze variants of these bootstraps in this paper.

Kiefer and Vogelsang (2005) found that the naive bootstrap, including the i.i.d. bootstrap, can dramatically outperform the standard normal approximation, and this improvement over the standard normal approximation occurs for many kernels including the Bartlett kernel.

The purpose of this paper is to provide theoretical results that can explain why the naive bootstrap performs so well in spite of the negative theoretical predictions made by Edgeworth analyses. Whereas the Edgeworth approach seeks to approximate the null distributions of t-statistics with expansions having a standard normal leading term, we develop expansions within the fixed-b asymptotic framework recently proposed by Kiefer and Vogelsang (2005). Under fixed-b asymptotics, the leading term of the expansion is nonstandard and depends on the kernel and bandwidth used to construct the HAC robust variance estimator.

For stationary regression models we show that under fixed-b asymptotics, HAC robust statistics have the same first order asymptotic distribution as the naive MBB versions of the tests. This equivalence holds for a wide range of block length choices, including a block length of one (the i.i.d. bootstrap). These results hold for a wide range of kernels, including the Bartlett kernel. Whereas establishing the fixed-b asymptotic distribution of the naive MBB is relatively straightforward, showing that the naive MBB and fixed-b asymptotics provide a refinement over the standard normal approximation is more difficult. Focusing on Bartlett kernel based HAC robust tests in the simple location model, we show that the naive i.i.d. bootstrap and the fixed-b asymptotic approximation provide a refinement over the standard normal approximation. Although potentially tedious, it should be possible to extend these results to other kernels and to stationary regression models.

The remainder of the paper is organized as follows. In the next section we describe the model and test statistics, and we review the fixed-b asymptotic approximation. Section 3 reports simulation results for a stationary regression model and the special case of a simple location model. The simulations illustrate the performance of the naive MBB relative to the standard normal and fixed-b approximations. In Sections 4 and 5 we provide theoretical explanations for several of the patterns that emerge from the simulations. Section 4 focuses on stationary regression models and establishes the first order asymptotic equivalence between the naive bootstrap and the fixed-b asymptotic approximation. These results could be generalized in straightforward ways to nonlinear models estimated by generalized method of moments. In Section 5 we narrow the focus to the simple location model and provide higher order asymptotic results for Bartlett kernel based tests. These results establish that the fixed-b asymptotic approximation and the naive i.i.d. bootstrap provide asymptotic refinements over the standard normal approximation. In Section 6 we discuss heuristic comparisons between fixed-b asymptotic approximations and the Edgeworth approximations derived

by Velasco and Robinson (2001), in an effort to shed some light on the relative performance of Edgeworth approximations and the naive bootstrap/fixed-b asymptotics in the simple location model. Proofs are given in two mathematical appendices.

2 Model and Test Statistics

Throughout the paper we focus on stationary regression models of the form

$$y_t = x_t'\beta + u_t, \quad t = 1, 2, \ldots, T, \qquad (1)$$

where $x_t$ and $\beta$ are $s \times 1$ vectors. The stationary time series $\{x_t\}$ and $\{u_t\}$ are autocorrelated and possibly conditionally heteroskedastic. It is assumed that $u_t$ is mean zero and is uncorrelated with $x_t$. The parameter of interest is $\beta$ and its estimator is $\hat\beta = \left(\sum_{t=1}^{T} x_t x_t'\right)^{-1}\sum_{t=1}^{T} x_t y_t$, the ordinary least squares (OLS) estimator.

Let $Q = E(x_t x_t')$ and $\Omega = \lim_{T\to\infty} Var\left(T^{-1/2}\sum_{t=1}^{T} v_t\right)$, where $v_t = x_t u_t$. For HAC robust testing we require estimates of $Q$ and $\Omega$. The usual estimate of $Q$ is $\hat Q = T^{-1}\sum_{t=1}^{T} x_t x_t'$. Estimation of $\Omega$ is often implemented with a kernel variance estimator such as

$$\hat\Omega = \sum_{j=-T+1}^{T-1} k\left(\frac{j}{M}\right)\hat\Gamma_j, \qquad (2)$$

where $k(x)$ is a kernel function such that $k(x) = k(-x)$, $k(0) = 1$, $|k(x)| \le 1$, $k(x)$ is continuous at $x = 0$, and $\int_{-\infty}^{\infty} k^2(x)\,dx < \infty$. Here, for $j \ge 0$, $\hat\Gamma_j = T^{-1}\sum_{t=j+1}^{T}\hat v_t \hat v_{t-j}'$ are the sample autocovariances of the score vector $\hat v_t = x_t\hat u_t$, with $\hat u_t = y_t - x_t'\hat\beta$ the OLS residuals, and $\hat\Gamma_j = \hat\Gamma_{-j}'$ for $j < 0$. $M$ is the bandwidth parameter, which can act as a truncation lag for kernels such that $k(x) = 0$ for $|x| > 1$.

Consider testing the null hypothesis $H_0: R\beta = r$ against $H_1: R\beta \ne r$, where $R$ is a $q \times s$ matrix of rank $q$ and $r$ is a $q \times 1$ vector. We consider the following F-type statistic:

$$F = T(R\hat\beta - r)'\left[R\hat Q^{-1}\hat\Omega\hat Q^{-1}R'\right]^{-1}(R\hat\beta - r)/q.$$

In the case where $q = 1$ we can consider t-statistics of the form

$$t = \frac{\sqrt{T}(R\hat\beta - r)}{\sqrt{R\hat Q^{-1}\hat\Omega\hat Q^{-1}R'}}.$$

Under suitable regularity conditions described subsequently, $\sqrt{T}(R\hat\beta - r)$ can be approximated by a vector of normal random variables with variance-covariance matrix $RQ^{-1}\Omega Q^{-1}R'$. Given that $\mathrm{plim}\,\hat Q = Q$, the traditional asymptotic approach seeks to establish consistency of $\hat\Omega$ in order to justify approximating $\hat\Omega$ by $\Omega$. Consistency of $\hat\Omega$ requires that $M \to \infty$ as $T \to \infty$, but $M/T \to 0$.
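To make the estimator in (2) concrete, the following sketch (ours, not from the paper; Python with NumPy is assumed, and the function names `bartlett` and `hac_omega` are illustrative only) computes $\hat\Omega$ for the Bartlett kernel from a matrix of estimated scores:

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel: k(x) = 1 - |x| for |x| <= 1, and 0 otherwise."""
    ax = np.abs(x)
    return np.where(ax <= 1.0, 1.0 - ax, 0.0)

def hac_omega(v, M, kernel=bartlett):
    """Kernel long run variance estimator (2): sum over j of k(j/M)*Gamma_hat_j,
    where v is the (T x s) matrix of estimated scores v_hat_t = x_t * u_hat_t."""
    T, s = v.shape
    omega = v.T @ v / T                      # j = 0 term, Gamma_hat_0
    for j in range(1, T):
        w = kernel(j / M)
        if w == 0.0:                         # truncation: k(x) = 0 for |x| > 1
            continue
        gamma_j = v[j:].T @ v[:T - j] / T    # Gamma_hat_j
        omega += w * (gamma_j + gamma_j.T)   # Gamma_hat_{-j} = Gamma_hat_j'
    return omega

# Fixed-b style bandwidth choice, M = b*T:
rng = np.random.default_rng(0)
scores = rng.standard_normal((200, 2))
print(hac_omega(scores, M=0.5 * 200))        # b = 0.5
```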

Under the traditional approach, $F$ has a limiting chi-square distribution and $t$ has a limiting standard normal distribution.

An alternative approximation for $\hat\Omega$ has been proposed by Kiefer and Vogelsang (2005). Suppose the bandwidth is modelled as $M = bT$, with $b$ a fixed constant in $(0, 1]$. Because $b$ is held fixed in this asymptotic nesting of $M$, this approach has been labelled fixed-b asymptotics. Under fixed-b asymptotics, $\hat\Omega$ converges to a random variable, rather than a constant, that depends on the kernel and bandwidth. As a consequence, $F$ and $t$ have nonstandard limiting distributions. These limiting distributions are useful for testing because they reflect the choice of bandwidth and kernel but are otherwise asymptotically pivotal (i.e. independent of nuisance parameters), so critical values can be tabulated. For example, under suitable regularity conditions to be described subsequently, Kiefer and Vogelsang (2005) showed that

$$F \Rightarrow W_q(1)'\,Q_q(b)^{-1}\,W_q(1)/q, \qquad t \Rightarrow \frac{W_1(1)}{\sqrt{Q_1(b)}}, \qquad (3)$$

where $\Rightarrow$ denotes weak convergence, $W_i(r)$ is an $i \times 1$ vector of independent standard Wiener processes, and $Q_i(b)$ is a random matrix that depends on the kernel. For example, in the case of the Bartlett kernel,

$$Q_i(b) = \frac{2}{b}\int_0^1 \tilde W_i(r)\tilde W_i(r)'\,dr - \frac{1}{b}\int_0^{1-b}\left[\tilde W_i(r+b)\tilde W_i(r)' + \tilde W_i(r)\tilde W_i(r+b)'\right]dr, \qquad (4)$$

where $\tilde W_i(r) = W_i(r) - rW_i(1)$.

An alternative to asymptotic approximations is the bootstrap. In this paper we focus on the MBB of Künsch (1989) and Liu and Singh (1992). Define the vector $w_t = (y_t, x_t')'$ that collects the dependent and the explanatory variables for each observation. Let $l \in \mathbb{N}$ ($1 \le l < T$) be a block length, and let $B_{t,l} = \{w_t, w_{t+1}, \ldots, w_{t+l-1}\}$ be the block of $l$ consecutive observations starting at $w_t$. Note that $l = 1$ gives the standard i.i.d. bootstrap. For simplicity take $T = k\,l$. The MBB draws $k = T/l$ blocks randomly with replacement from the set of overlapping blocks $\{B_{1,l}, \ldots, B_{T-l+1,l}\}$.

Let $F^*$ and $t^*$ denote the naive bootstrap versions of $F$ and $t$. $F^*$ and $t^*$ are computed as follows. Given a bootstrap resample $w_t^* = (y_t^*, x_t^{*\prime})'$, let $\hat\beta^*$ denote the OLS estimate from the regression of $y_t^*$ on $x_t^*$ and let $\hat Q^* = T^{-1}\sum_{t=1}^{T} x_t^* x_t^{*\prime}$. Let $\hat\Omega^*$ denote the bootstrap version of $\hat\Omega$, where $\hat v_t^* = x_t^*\hat u_t^* = x_t^*(y_t^* - x_t^{*\prime}\hat\beta^*)$ is used in place of $\hat v_t$. The naive bootstrap statistics are defined as

$$F^* = T(R\hat\beta^* - r^*)'\left[R\hat Q^{*-1}\hat\Omega^*\hat Q^{*-1}R'\right]^{-1}(R\hat\beta^* - r^*)/q,$$

where $r^* = R\hat\beta$, and, in the case of $q = 1$,

$$t^* = \frac{\sqrt{T}(R\hat\beta^* - r^*)}{\sqrt{R\hat Q^{*-1}\hat\Omega^*\hat Q^{*-1}R'}}.$$
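The following sketch (ours; the helper name `mbb_indices` is hypothetical) implements the MBB resampling scheme just described, drawing $k = T/l$ blocks with replacement from the $T - l + 1$ overlapping blocks; setting $l = 1$ reproduces the i.i.d. bootstrap:

```python
import numpy as np

def mbb_indices(T, l, rng):
    """Moving blocks bootstrap: draw k = T/l block start points uniformly
    from {0, ..., T-l} and expand them into a length-T vector of
    observation indices. With l = 1 this is the ordinary i.i.d. bootstrap."""
    k = T // l
    starts = rng.integers(0, T - l + 1, size=k)
    return (starts[:, None] + np.arange(l)[None, :]).ravel()

# Naive bootstrap: resample the pairs w_t = (y_t, x_t') jointly, then apply
# the *same* formulas used on the original data, centering at r* = R beta_hat.
rng = np.random.default_rng(0)
T, l = 200, 10
idx = mbb_indices(T, l, rng)
# y_star, X_star = y[idx], X[idx]   # then OLS, Q_hat*, Omega_hat*, F*, t*
```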

The bootstrap statistics are naive in the sense that, except for the centering around $r^*$ instead of $r$, they are computed in the same way as $F$ and $t$, using the resampled data in place of the original data. The empirical distributions of $F^*$ and $t^*$ can be accurately estimated using simulations.

3 Finite Sample Performance

In this section we use simulations to compare and contrast the finite sample performance of the standard asymptotic approximation, the fixed-b asymptotic approximation, and the naive bootstrap. We first present results for a stationary regression model with four regressors. We then present results for the special case of a regression with no regressors, the simple location model. There are two reasons for giving results in the simple location model. First, explicit Edgeworth expansion approximations are available for the simple location model from Velasco and Robinson (2001), whereas such expansions have not been derived for the regression model. These Edgeworth expansion approximations are useful tools for understanding the relative performance of the asymptotic approximations and the naive bootstrap. Second, because of the complicated nature of obtaining higher order asymptotic results, the simple location model is a natural starting point where progress can be made. Having finite sample results is useful for motivating theoretical analysis.

For the stationary regression model we adopt the well-studied setup of Andrews (1991). In particular, we consider a linear regression model where $x_t$ contains an intercept ($x_{t1} = 1$) and four regressors. The regressors and errors are generated as mutually independent AR(1) processes with variance 1 and AR parameter $\rho$:

$$x_{ti} = \rho x_{t-1,i} + (1-\rho^2)^{1/2}v_{ti}, \quad i = 2, \ldots, 5; \qquad u_t = \rho u_{t-1} + (1-\rho^2)^{1/2}\varepsilon_t,$$

where $v_{ti}$ and $\varepsilon_t$ are generated as independent standard normal random variables. The true parameter, $\beta$, is equal to zero and we consider three values for the AR parameter $\rho$: 0.3, 0.5 and 0.9. In the simulations, 2,000 random samples are generated for each of the sample sizes $T \in \{25, 50\}$. The bootstrap tests are based on 999 replications for each sample. We report results for block sizes $l \in \{1, 5, 10, 20\}$.
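A sketch (ours; Python/NumPy assumed, function name illustrative) of the data generating process just described:

```python
import numpy as np

def generate_data(T, rho, seed=0):
    """Section 3 DGP: intercept plus four mutually independent AR(1)
    regressors and an independent AR(1) error, each with unconditional
    variance 1 and AR parameter rho; beta = 0, so y_t = u_t."""
    rng = np.random.default_rng(seed)
    X = np.ones((T, 5))
    scale = (1 - rho ** 2) ** 0.5
    for i in range(1, 5):
        X[0, i] = rng.standard_normal()   # start at the stationary distribution
        for t in range(1, T):
            X[t, i] = rho * X[t - 1, i] + scale * rng.standard_normal()
    u = np.empty(T)
    u[0] = rng.standard_normal()
    for t in range(1, T):
        u[t] = rho * u[t - 1] + scale * rng.standard_normal()
    return u.copy(), X                    # y = u since beta = 0

y, X = generate_data(T=50, rho=0.5)
```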

We consider testing the null hypothesis that $\beta_2 = 0$ against the alternative that $\beta_2 \ne 0$ at a nominal level of 5%. The test statistic is the t-test

$$t_{\hat\beta_2} = \frac{\hat\beta_2}{se(\hat\beta_2)},$$

where $se(\hat\beta_2)$ is a HAC standard error estimate. We consider the Bartlett and the QS kernel estimates and report results across 25 different values of the bandwidth: $M = 1, 2, \ldots, 25$ for $T = 25$, and $M = 2, 4, \ldots, 48, 50$ for $T = 50$. For any given method, we reject the null hypothesis whenever $|t_{\hat\beta_2}| > t_c$, where $t_c$ is a critical value. The methods differ in the way they compute the critical values. In particular, $t_c = 1.96$ for the standard asymptotic theory, whereas $t_c$ is the 97.5% percentile of the fixed-b asymptotic distribution derived by Kiefer and Vogelsang (2005). For the bootstrap methods, $t_c$ is the 95% bootstrap percentile of the absolute value of the studentized bootstrap t-statistic. The naive bootstrap computes the t-statistic as

$$t^*_{\hat\beta_2} = \frac{\hat\beta_2^* - \hat\beta_2}{se^*(\hat\beta_2^*)},$$

where $se^*(\hat\beta_2^*)$ is computed using the same formula as $se(\hat\beta_2)$, but with the bootstrap data replacing the real data.
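For reference, the fixed-b critical values can be tabulated by simulating the limiting random variables in (3)-(4); below is a rough sketch (ours, not the authors' code), approximating the Wiener process by a scaled random walk and $Q_1(b)$ for the Bartlett kernel by its discrete analogue:

```python
import numpy as np

def fixed_b_critical_value(b, alpha=0.05, reps=10000, n=1000, seed=1):
    """Simulate the two-sided critical value of t => W(1)/sqrt(Q_1(b))
    for the Bartlett kernel, approximating W by a scaled random walk
    and Q_1(b) in (4) by its discrete analogue."""
    rng = np.random.default_rng(seed)
    m = int(b * n)
    r = np.arange(1, n + 1) / n
    tstats = np.empty(reps)
    for i in range(reps):
        W = np.cumsum(rng.standard_normal(n)) / np.sqrt(n)
        Wt = W - r * W[-1]                              # bridge W~(r) = W(r) - r*W(1)
        q = (2 / b) * np.mean(Wt ** 2)                  # (2/b) * integral of W~^2
        q -= (2 / b) * np.sum(Wt[m:] * Wt[:n - m]) / n  # cross-product term of (4)
        tstats[i] = W[-1] / np.sqrt(q)
    return np.quantile(np.abs(tstats), 1 - alpha)

print(fixed_b_critical_value(b=0.5))   # noticeably larger than 1.96
```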

Figures 1 and 2 contain results for the Bartlett kernel for $T = 25$ and $T = 50$, whereas Figures 3 and 4 contain results for the Quadratic Spectral (QS) kernel for these same two sample sizes. Each figure contains three panels, corresponding to the three values of $\rho$. Each panel depicts the actual null rejection rate as a function of the bandwidth. The figures clearly show that the naive block bootstrap is almost always substantially more accurate than the standard normal asymptotic approximation. The exception is when the block size is too large relative to the sample size ($l = 20$ and $T = 25$) and $\rho$ is equal to 0.3. The superior performance of the naive block bootstrap over the standard normal approximation holds for the two kernels used, although the improvement is larger for the QS kernel than for the Bartlett kernel. The larger the bandwidth, the larger the improvement. The naive i.i.d. bootstrap tends to closely follow the fixed-b asymptotics across all DGPs, bandwidths, sample sizes and kernels, despite the presence of autocorrelation. This pattern strongly suggests a systematic relationship between the naive block bootstrap and the fixed-b asymptotic approximation. It is interesting to note that as $\rho$ increases (e.g. for $\rho = 0.9$), increasing the block size helps in further reducing the size distortions. This suggests that the naive block bootstrap may offer an asymptotic refinement over the fixed-b asymptotics with careful choice of the block length.

We now report simulation results for the special case of a simple location model,

$$y_t = \beta_1 + u_t, \qquad (5)$$

where $u_t$ is still modelled as before. It is useful to consider the simple location model because comparisons can be made to the formal Edgeworth expansions of Velasco and Robinson (2001). We consider testing the null hypothesis that $\beta_1 = 0$ against the alternative that $\beta_1 > 0$ at a nominal level of 5% using

$$t_{\hat\beta_1} = \frac{\hat\beta_1}{se(\hat\beta_1)},$$

where $se(\hat\beta_1)$ is a HAC standard error estimate. We report results in Figures 5-8, again for the Bartlett and QS kernels. In this case we report results for $\rho = 0.0, 0.3, 0.9$, and we only consider block lengths 1 and 5 to conserve space. We also report rejection probabilities using the Edgeworth approximation derived by Velasco and Robinson (2001). Using $t_z$ to denote the right tail standard normal critical value, the Edgeworth critical value is given by

$$t_{edge} = t_z + \tfrac{1}{2}\delta t_z^2 + f(t_z), \qquad (6)$$

where $f(x)$ is a variance correction term of order $M/T$ that depends on the kernel through $\int k(s)\,ds$ and $\int k^2(s)\,ds$ and on $x$ through the polynomial $x^3 - 3x$ (its exact form is given in Velasco and Robinson, 2001), and

$$\delta = \frac{1}{M}\,\Omega^{-1}\sum_{j=-\infty}^{\infty}|j|\,\Gamma_j \text{ for the Bartlett kernel}, \qquad \delta = \frac{18\pi^2}{125}\,\frac{1}{M^2}\,\Omega^{-1}\sum_{j=-\infty}^{\infty}j^2\,\Gamma_j \text{ for the QS kernel}.$$

Given the AR(1) structure in the simulations, the formulas for $\delta$ simplify to $\frac{2\rho}{(1-\rho^2)M}$ and $\frac{36\pi^2\rho}{125(1-\rho)^2M^2}$, respectively, for the Bartlett and QS kernels.²

We implement the Edgeworth approximations in two ways. In the first, we make the (from the perspective of practice) unrealistic assumptions that it is known that the errors are AR(1) and that the value of $\rho$ is known. This provides an infeasible benchmark. In the second, we replace $\Omega$ with $\hat\Omega$ and we replace $\sum_j |j|\Gamma_j$ and $\sum_j j^2\Gamma_j$ with the estimators $\sum_{j=-T+1}^{T-1}k(j/M)|j|\hat\Gamma_j$ and $\sum_{j=-T+1}^{T-1}k(j/M)j^2\hat\Gamma_j$, where $k(x)$ and $M$ are the same as used to construct $\hat\Omega$. This feasible approach preserves the nonparametric nature of the HAC robust test in that we are not assuming any knowledge about the form of the autocovariance structure.

² The regularity conditions used by Velasco and Robinson (2001) appear to exclude the Bartlett and Parzen kernels, because their spectral windows do not truncate outside the range $[-\pi, \pi]$, although our simulation results suggest that their results likely hold for the Bartlett kernel. Velasco and Robinson (2001) conjecture that their proofs could be modified to allow kernels like the Bartlett and Parzen.
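To illustrate how the $\delta$ term in (6) is computed under the AR(1) design, here is a sketch (ours; Python with NumPy/SciPy assumed). It deliberately omits the $f(t_z)$ variance term of Velasco and Robinson (2001), so it implements only the bias part of the correction:

```python
import numpy as np
from scipy.stats import norm

def delta_ar1(rho, M, kernel="bartlett"):
    """AR(1) simplifications of delta given in the text:
    Bartlett: 2*rho / ((1 - rho^2) * M)
    QS:       (36*pi^2/125) * rho / ((1 - rho)^2 * M^2)"""
    if kernel == "bartlett":
        return 2 * rho / ((1 - rho ** 2) * M)
    return (36 * np.pi ** 2 / 125) * rho / ((1 - rho) ** 2 * M ** 2)

def edgeworth_cv_bias_part(rho, M, level=0.05, kernel="bartlett"):
    """Bias part of (6): t_edge ~ t_z + 0.5 * delta * t_z^2.
    The f(t_z) variance-correction term is omitted in this sketch."""
    tz = norm.ppf(1 - level)
    return tz + 0.5 * delta_ar1(rho, M, kernel) * tz ** 2

print(edgeworth_cv_bias_part(rho=0.5, M=10))  # exceeds t_z = 1.645
```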

Looking at the results depicted in Figures 5-8, several interesting patterns emerge. First, the performance of the standard normal approximation, the fixed-b approximation, and the naive block bootstrap is very similar to what was obtained for the regression model. Second, the performance of the fixed-b approximation and the naive block bootstrap relative to the Edgeworth approximations is striking. If we first focus on the case of i.i.d. errors ($\rho = 0$), we see that fixed-b and the naive block bootstrap have rejection probabilities very close to 0.05. For the Bartlett kernel the infeasible Edgeworth approach has rejections just slightly above 0.05, but for the QS kernel the Edgeworth has rejections above 0.05. As $M$ increases, fixed-b and the naive bootstrap have rejections that remain close to 0.05, whereas the infeasible Edgeworth has rejections that systematically increase with $M$. The feasible Edgeworth approximation performs better than the standard normal but is less accurate than fixed-b or the naive bootstrap. These patterns hold for both sample sizes. When the errors have weak serial correlation ($\rho = 0.3$), the patterns continue to hold, except that for smaller values of $M$ the infeasible Edgeworth gives rejections closer to 0.05 than the other approximations. When the errors have strong serial correlation ($\rho = 0.9$), all approximations, except the infeasible Edgeworth with small $M$, have substantial over-rejection problems, although it continues to be the case that fixed-b and the naive block bootstrap perform better than the standard normal approximation. Increasing the block length from 1 to 5 improves the situation for the naive block bootstrap, but over-rejections are still a problem. The infeasible Edgeworth substantially under-rejects when $M$ is small and over-rejects when $M$ is large. Clearly, none of the approximations works systematically when the serial correlation is strong. This is not surprising because the underlying central limit theorem implicit in all of the approximations becomes less adequate as $\rho$ approaches one.

The patterns in the simulations for the regression model and the location model suggest that the fixed-b approximation and the naive block bootstrap are related and that they may provide a refinement over the standard normal approximation. Careful choice of the block length may provide a refinement over fixed-b asymptotics. In the location model with i.i.d. errors, it appears that fixed-b and the naive block bootstrap may provide a refinement over the Edgeworth approximation. Obtaining theoretical results that explain all of these patterns is a very challenging research program. In the remainder of the paper we focus on establishing the asymptotic equivalence of the fixed-b approximation and the naive block bootstrap, and conditions under which they provide refinements over the standard normal approximation.

4 Fixed-b Bootstrap Asymptotics

In this section we derive the asymptotic distribution of naive block bootstrap HAC robust tests under fixed-b asymptotics. In particular, for linear regression models we show that $t^*$ and $F^*$ have the same limiting fixed-b distribution as $t$ and $F$. Define $S_{[rT]} = \sum_{t=1}^{[rT]}v_t$, where $[rT]$ denotes the integer part of $rT$, with $r \in [0, 1]$. Let $X_T(r) = T^{-1/2}S_{[rT]}$ be the corresponding partial sum process. Similarly, define $Q_T(r) = T^{-1}\sum_{t=1}^{[rT]}x_t x_t'$. Following Kiefer and Vogelsang (2002a) and Kiefer and Vogelsang (2002b), we make the following two high level assumptions:

A1. $X_T(r) \Rightarrow \Lambda W_s(r)$, with $\Omega = \Lambda\Lambda' = \lim_{T\to\infty}Var\left(T^{-1/2}\sum_{t=1}^{T}v_t\right)$.

A2. $\sup_{r\in[0,1]}\left\|Q_T(r) - rQ\right\| \to 0$ in probability.

Here, we assume in addition that the statistic of interest, $A_T$, is such that:

A3. $A_T$ can be written as $A_T = g(X_T(r), Q_T(r), D_T(r))$, where $g$ is a continuous functional of $(X_T(r), Q_T(r), D_T(r))$, and $D_T(r)$ is a vector of deterministic functions of $T$ and $r$ such that $D_T(r) \to D(r)$ as $T \to \infty$, uniformly in $r$.

Condition A3 is a general way of expressing statistics that includes $t$ and $F$. The function $D_T(r)$ reflects the choice of kernel. Using the arguments of Kiefer and Vogelsang (2005), it follows that $\hat\Omega$ is a continuous functional of the processes $X_T(r)$, $Q_T(r)$, and $D_T(r)$, where $D_T(r)$ is a function of the kernel $k$. If $k''(r)$ exists, then we can show that $\lim_{T\to\infty}D_T(r) = b^{-2}k''(r/b)$, in which case $D(r) = b^{-2}k''(r/b)$. For kernels that truncate to zero for $|x| \ge 1$, $D_T(r)$ is a $2 \times 1$ vector and $D(r)$ has elements given by $b^{-2}k''(r/b)$ for $|r| \le b$ and $b^{-1}k'(1)$, where $k'(1)$ is the first derivative of $k(x)$ from the left evaluated at $x = 1$. For the Bartlett kernel we have $D_T(r) = (2b^{-1}, b^{-1})'$ and $D(r) = (2b^{-1}, b^{-1})'$. Thus, A3 holds for a wide class of kernels, including the Bartlett kernel. See Kiefer and Vogelsang (2005) for additional details on how $D_T(r)$ is constructed.

Under conditions A1 through A3, an application of the continuous mapping theorem (CMT) implies that, as $T \to \infty$,

$$A_T \Rightarrow g(\Lambda W_s(r), rQ, D(r)) \equiv G.$$

Suppose that the random variable $G$ has a distribution that is pivotal, i.e. invariant to $\Lambda$ and $Q$. For example, this is the case for $F$ and $t$, as indicated by (3). The goal in this section is to provide a set of primitive conditions on $\{x_t\}$ and $\{v_t\}$ under which the naive block bootstrap test, $F^*$, weakly converges to $G$, in which case the naive bootstrap and the fixed-b approximation will be equivalent in a first order sense. Note that results for $t^*$ follow as an obvious corollary.

We now need to introduce some additional notation. Given a bootstrap resample $w_t^* = (y_t^*, x_t^{*\prime})'$, let $v_t^* = x_t^*(y_t^* - x_t^{*\prime}\beta) = x_t^* u_t^*$, and let $\tilde v_t^* = x_t^*(y_t^* - x_t^{*\prime}\hat\beta) = x_t^*\tilde u_t^*$. In order to simplify the notation, we omit the star in the definition of these bootstrap variables, e.g., we write $\tilde v_t$ instead of $\tilde v_t^*$. Notice that $\tilde v_t$, and not $v_t^*$, is the bootstrap analogue of $\hat v_t$, as it replaces $\beta$ with $\hat\beta$. Let $S^*_{[rT]} = \sum_{t=1}^{[rT]}\tilde v_t$ and define the bootstrap partial sum process $X_T^*(r) = T^{-1/2}S^*_{[rT]}$. Similarly, define $Q_T^*(r) = T^{-1}\sum_{t=1}^{[rT]}x_t^* x_t^{*\prime}$.

As usual in the bootstrap literature, $P^*$ denotes the probability measure induced by the bootstrap resampling, conditional on a realization of the original time series. We use the following notation for the bootstrap asymptotics (see Chang and Park, 2003, for similar notation and for several useful bootstrap asymptotic properties). Let $Z_T^*$ be a sequence of bootstrap statistics. We write $Z_T^* = o_{P^*}(1)$ in probability, or $Z_T^* \to_{P^*} 0$ in probability, if for any $\varepsilon > 0$, $\delta > 0$, $\lim_{T\to\infty}P\left[P^*(|Z_T^*| > \delta) > \varepsilon\right] = 0$. Similarly, we write $Z_T^* = O_{P^*}(1)$ in probability if for all $\varepsilon > 0$ there exists an $M_\varepsilon < \infty$ such that $\lim_{T\to\infty}P\left[P^*(|Z_T^*| > M_\varepsilon) > \varepsilon\right] = 0$.

Finally, we write $Z_T^* \Rightarrow_{P^*} Z$ in probability if, conditional on the sample, $Z_T^*$ weakly converges to $Z$ under $P^*$, for all samples contained in a set with probability converging to one.

Suppose the bootstrap processes $X_T^*(r)$ and $Q_T^*(r)$ satisfy the following assumptions, in probability:

A1*. $X_T^*(r) \Rightarrow_{P^*} \Lambda^* W_s(r)$, for some $\Lambda^*$.

A2*. $\sup_{r\in[0,1]}\left\|Q_T^*(r) - rQ^*\right\| \to_{P^*} 0$, for some $Q^*$.

In this section we study the asymptotic behavior of naive bootstrap statistics, i.e., we suppose that:

A3*. The bootstrap statistic $A_T^*$ can be written as $A_T^* = g(X_T^*(r), Q_T^*(r), D_T(r))$, where $g$ and $D_T(r)$ are as defined in A3.

According to condition A3*, the bootstrap statistic is equal to the exact same function as the original statistic, but with the bootstrap data replacing the real data. This is the sense in which the bootstrap statistic is naive. It is a very straightforward algebraic calculation to show that $t^*$ and $F^*$ satisfy condition A3*. It is clear that under Assumptions A1*-A3*, by an application of the CMT, we have that $A_T^* \Rightarrow_{P^*} g(\Lambda^* W_s(r), rQ^*, D(r))$, in probability. Because the distribution of the random variable $g(\cdot,\cdot,\cdot)$ is pivotal, as in the case of the $t$ and $F$ tests, the limiting distribution of $A_T^*$ coincides with the limiting distribution of $A_T$, independently of $\Lambda^*$ and $Q^*$. Thus, the asymptotic equivalence between $A_T$ and $A_T^*$ depends crucially on the conditions A1* and A2*.

Next, we provide primitive conditions on $\{x_t\}$ and $\{v_t\}$ that are sufficient for A1* and A2*. We derive results under the assumption that $\{x_t\}$ and $\{v_t\}$ are near epoch dependent (NED) on an underlying mixing process $\{\varepsilon_t\}$. NED processes allow for very general forms of dependence and contain mixing processes as a special case. For a general time series $\{w_t\}$, we view each coordinate $w_t$ as a measurable function of the potentially infinite history of another underlying process $\{\varepsilon_t\}$, i.e. $w_t = w_t(\ldots, \varepsilon_{t-1}, \varepsilon_t, \varepsilon_{t+1}, \ldots)$. Let $\mathcal{F}_s^t \equiv \sigma(\varepsilon_s, \ldots, \varepsilon_t)$ for any $s \le t$ be the sigma-field generated by $\varepsilon_s, \ldots, \varepsilon_t$, and let $E_{t-k}^{t+k}$ denote the expectation conditional on $\mathcal{F}_{t-k}^{t+k}$. We say $\{w_t\}$ is $L_q$-NED on $\{\varepsilon_t\}$, $q \ge 1$, if $\|w_t\|_q < \infty$ and $\nu_k = \sup_t \left\|w_t - E_{t-k}^{t+k}(w_t)\right\|_q \to 0$ as $k \to \infty$. Here and in what follows, $\|w\|_q = \left(\sum_i E|w_i|^q\right)^{1/q}$ denotes the $L_q$-norm of a random vector $w$. Similarly, we let $\|\cdot\|$ denote the Euclidean norm of the corresponding vector or matrix. If the NED coefficients $\nu_k$ are such that $\nu_k = O(k^{-a-\delta})$ for some $\delta > 0$, we say $\{w_t\}$ is $L_q$-NED of size $-a$.

We assume $\{\varepsilon_t\}$ is strong mixing. The strong mixing coefficients are $\alpha_k = \sup_m \sup_{\{A\in\mathcal{F}_{-\infty}^{m},\,B\in\mathcal{F}_{m+k}^{\infty}\}}|P(A\cap B) - P(A)P(B)|$; we require $\alpha_k \to 0$ as $k \to \infty$ suitably fast. We impose the following assumptions on $\{x_t\}$ and $\{v_t\}$:

Assumption 1
1a. For some $p > 2$, $\|x_t\|_{2p} < \infty$ for all $t = 1, 2, \ldots$.
1b. $\{x_t\}$ is a weakly stationary sequence $L_2$-NED on $\{\varepsilon_t\}$ with NED coefficients of size $-\frac{2(p-1)}{p-2}$.
1c. $\|v_t\|_p < \infty$, and $E(v_t) = 0$ for all $t = 1, 2, \ldots$.
1d. $\{v_t\}$ is a weakly stationary sequence $L_2$-NED on $\{\varepsilon_t\}$ with NED coefficients of size $-\frac{1}{2}$.
1e. $\{\varepsilon_t\}$ is an $\alpha$-mixing sequence with $\alpha_k$ of size $-\frac{2p}{p-2}$.
1f. $\Omega = \lim_{T\to\infty}Var\left(T^{-1/2}\sum_{t=1}^{T}v_t\right)$ is positive definite.

We can show that Assumption 1 is sufficient for the high level conditions A1 and A2. Note that under Assumption 1, $\Omega = \lim_{T\to\infty}Var(T^{-1/2}S_T)$ exists. We further assume $\Omega$ is positive definite, which ensures the existence of a matrix $\Lambda$ such that $\Omega = \Lambda\Lambda'$. Next, we show that the following strengthened version of Assumption 1 is sufficient to ensure that conditions A1* and A2* hold.

Assumption 1'
1c'. For some $p > 2$ and $\delta > 0$, $\|v_t\|_{p+\delta} < \infty$, and $E(v_t) = 0$ for all $t = 1, 2, \ldots$.
1d'. $\{v_t\}$ is a weakly stationary sequence $L_{2+\delta}$-NED on $\{\varepsilon_t\}$ with NED coefficients of size $-1$.
1e'. $\{\varepsilon_t\}$ is an $\alpha$-mixing sequence of size $-\frac{(2+\delta)(p+\delta)}{p-2}$.

Lemma 4.1 Under Assumption 1 strengthened by Assumption 1', it follows that:

(a) For any fixed $l$ such that $1 \le l < \infty$, as $T \to \infty$,

$$X_T^*(r) \Rightarrow_{P^*} \Lambda_l W_s(r), \qquad (7)$$

in probability, where $\Lambda_l$ is the square root matrix of $\Omega_l \equiv \Gamma_0 + \sum_{j=1}^{l-1}\left(1 - \frac{j}{l}\right)(\Gamma_j + \Gamma_j')$.

(b) Let $l = l_T \to \infty$ as $T \to \infty$ such that $l^2/T \to 0$. Then

$$X_T^*(r) \Rightarrow_{P^*} \Lambda W_s(r), \qquad (8)$$

in probability, where $\Lambda$ is the square root matrix of $\Omega$.

(c) Under both sets of assumptions on $l$, it follows that $\sup_{r\in[0,1]}\left\|Q_T^*(r) - rQ\right\| \to_{P^*} 0$, in probability.

Parts (a) and (b) of Lemma 4.1 provide functional central limit theorems (FCLTs) for the bootstrap partial sum process of the bootstrap scores, $X_T^*(r) = T^{-1/2}\sum_{t=1}^{[rT]}\tilde v_t$. To prove these results, we apply a bootstrap FCLT (Lemma A.3, given in the Appendix) for $Z_T^*(r) = T^{-1/2}\sum_{t=1}^{[rT]}(X_t^* - E^*(X_t^*))$, when $\{X_t^*\}$ is an MBB resample of $\{X_t\}$, a NED process on a mixing process. Lemma A.3 is a multivariate extension to the NED case of a univariate bootstrap FCLT given in Paparoditis and Politis (2003) for stationary mixing processes. We consider two cases: (a) one where $l$ is fixed as $T \to \infty$, and (b) another where $l \to \infty$ as $T \to \infty$. Note that the first case includes the i.i.d. bootstrap as a special case.

According to Lemma 4.1, the bootstrap partial sum process $X_T^*(r)$ weakly converges to a Brownian motion with the right covariance matrix $\Omega$ only if the block size $l$ increases with the sample size at an appropriate rate. When $l$ is fixed, the limiting covariance matrix is $\Omega_l$, which is different from $\Omega$ under general autocorrelation. This reflects the well-known fact that the MBB with fixed block size (and therefore the i.i.d. bootstrap) achieves only a partial correction of dependence (cf. Liu and Singh, 1992). Our first formal theoretical result is as follows.

Theorem 4.1 Let $b \in (0, 1]$ be a constant and suppose $M = bT$. Let Assumption 1 strengthened by Assumption 1' hold, and let $k(x)$ be the Bartlett kernel, or let $k(x)$ be such that $k''(x)$ exists and is continuous everywhere with the possible exception of $|x| = 1$. Suppose the block size $l$ is either fixed as $T \to \infty$, or $l \to \infty$ as $T \to \infty$ such that $l^2/T \to 0$. Then, under $H_0: R\beta = r$, as $T \to \infty$,

$$F^* \Rightarrow_{P^*} W_q(1)'\,Q_q(b)^{-1}\,W_q(1)/q,$$

in probability, where $Q_q(b)$ is a random matrix defined in Definition 1 of Kiefer and Vogelsang (2005).

Theorem 4.1 shows that the naive bootstrap $F$ test statistic asymptotically has the same distribution as $F$ derived under the fixed-b asymptotic nesting of Kiefer and Vogelsang (2005). A similar result holds for $t^*$. The first implication of Theorem 4.1 is that the naive bootstrap is as accurate as the new first order fixed-b asymptotics of Kiefer and Vogelsang (2005). The second implication is that a simple i.i.d. bootstrap is asymptotically valid and equivalent to the fixed-b asymptotics, even in the presence of serial correlation. This result is a consequence of the asymptotic pivotalness of the $F$ statistic.

5 Higher-order results

In this section we prove that the naive i.i.d. bootstrap is capable of providing an asymptotic refinement over the standard normal approximation even for dependent data. We focus on the t-statistic in the simple location model given by (5), i.e. we assume $x_t = 1$ for all $t$. Here the score vector $v_t$ is equal to the scalar $u_t$. We derive results for the Bartlett kernel because $\hat\Omega$ can be expressed as a relatively simple function of $X_T(r)$ in this case. We expect our results to extend naturally to other kernels, although the details are likely to be very tedious.

Specifically, we show that the error of the naive i.i.d. bootstrap approximation to the finite sample distribution of a fixed-bandwidth HAC based statistic is of order $o(T^{-1/2+3/(2p)+\epsilon})$ for any $\epsilon > 0$, where $p$ is the number of finite moments of $u_t$. We also show that the error of the fixed-b asymptotic distribution derived by Kiefer and Vogelsang (2005) is of the same magnitude as the error of the naive i.i.d. bootstrap. In contrast, the error of the normal approximation to the distribution of a HAC statistic computed with the optimal MSE bandwidth is of order $O(T^{-1/3})$ for the Bartlett kernel. Thus, the naive i.i.d. bootstrap and the fixed-b asymptotics provide a smaller error than the normal approximation whenever $p > 9$. The i.i.d. bootstrap is capable of providing an asymptotic refinement over the standard normal approximation even for dependent data because it mimics the fixed-b asymptotic distribution, which itself improves upon the normal approximation.

In this section we assume $u_t$ is a linear process. This is a more restrictive dependence assumption than our previous NED Assumption 1. To prove our results, we rely on the method of strong approximations (see below for more details on this method), which is available for linear processes, and this is the main reason why we restrict attention to the special class of linear processes. We are unaware of such results for NED sequences. Thus, we let $u_t = \pi(L)\varepsilon_t = \sum_{j=0}^{\infty}\pi_j\varepsilon_{t-j}$, with $\pi(z) = \sum_{j=0}^{\infty}\pi_j z^j$, and make the following additional assumptions.

Assumption 2
(a) $\varepsilon_t$ are i.i.d. with $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = \sigma^2$ and $E|\varepsilon_t|^p < \infty$ for some $p > 2$.
(b) $\pi(z) \ne 0$ for all $|z| \le 1$ and $\sum_{i=0}^{\infty}i|\pi_i| < \infty$.

Under Assumption 2, the FCLT for linear processes (cf. Theorem 3.4, Phillips and Solo, 1992) implies that

$$W_T(r) \equiv T^{-1/2}\sum_{t=1}^{[rT]}u_t \Rightarrow \Lambda W_1(r),$$

where $\Omega = \Lambda^2 = \pi(1)^2\sigma^2$ is the long run variance under Assumption 2. To establish our results we need a result stronger than this invariance principle. In particular, we need specific rates of convergence of the partial sum process to its limiting process. This can be achieved through the method of strong approximations. Recently, Park (2003) used strong approximations to show that the bootstrap provides an asymptotic refinement for unit root tests. Similarly, Park (2004) relies on this method to show asymptotic refinements of the bootstrap in the context of weakly integrated processes. Our methods of proof closely follow those of Park (2004).

Consider the following probabilistic embedding of the partial sum process of $u_t$:

$$W_T(r) =_d T^{-1/2}\sum_{t=1}^{[rT]}u_t,$$

where $=_d$ denotes equality in distribution. $W_T$ is a process in $D[0,1]$ having the same distribution as the partial sum process of $u_t$. In what follows, we do not make a distinction between $W_T$ and its distributionally equivalent copy; therefore we interpret the distributional equality $=_d$ as the usual equality. The Skorohod representation theorem guarantees that there exists a probability space $(\Omega, \mathcal{F}, P)$ supporting $W_T$ and $W_1$ such that $W_T \to \Lambda W_1$ a.s., uniformly in $[0,1]$. Moreover, we can state the following result, which follows from a strong approximation result due to Akonom (1993, Theorem 3, p. 74).

Lemma 5.1 Under Assumption 2, we have that:
(a) $\sup_{r\in[0,1]}|W_T(r) - \Lambda W_1(r)| = O_P(T^{-1/2+1/p})$.
(b) For any $\epsilon > 0$, $P\left(\sup_{r\in[0,1]}|W_T(r) - \Lambda W_1(r)| \ge T^{-1/2+3/(2p)}\right) = o(T^{-1/2+3/(2p)+\epsilon})$.

Part (a) of Lemma 5.1 shows that the stochastic order of $\sup_{r\in[0,1]}|W_T(r) - \Lambda W_1(r)|$ is equal to $O_P(T^{-1/2+1/p})$. As we show next, the t-statistic can be written as a functional of $W_T(r)$, or of its distributionally equivalent copy $T^{-1/2}\sum_{t=1}^{[rT]}u_t$. Thus, we can use part (a) to determine the stochastic order of the error term in the stochastic expansion of the t-statistic. Part (b) shows that $W_T$ can be approximated by $\Lambda W_1$ with an error that is distributionally³ of order $O(T^{-1/2+3/(2p)})$. Thus, although the approximation error of $W_T$ with $\Lambda W_1$ is of order $O_P(T^{-1/2+1/p})$, its effect is distributionally of a larger order of magnitude, namely $O(T^{-1/2+3/(2p)})$. We rely on this result to derive the error of the fixed-b asymptotic approximation.

³ We follow Park (2003) and say that a random sequence $R_T$ is distributionally of order $o(T^{-a+\epsilon}) = O(T^{-a})$ if $P(|R_T| > T^{-a}) = O(T^{-a})$ for some $\epsilon > 0$.

5.1 Asymptotic expansion of the t-statistic

We first provide an asymptotic expansion for the t-statistic. The t-statistic can be written as

$$t_{\hat\beta_1} = \frac{\sqrt{T}(\hat\beta_1 - \beta_1)}{\sqrt{\hat\Omega}},$$

where $\hat\beta_1 = \bar y$ and $\hat\Omega = \hat\Gamma_0 + 2\sum_{j=1}^{M-1}\left(1 - \frac{j}{M}\right)\hat\Gamma_j$, with $\hat\Gamma_j = T^{-1}\sum_{t=j+1}^{T}\hat u_t\hat u_{t-j}$. Thus, $\hat\Omega$ is the Bartlett kernel variance estimator of $\Omega = \lim_{T\to\infty}Var\left(T^{-1/2}\sum_{t=1}^{T}u_t\right) = \sigma^2\pi(1)^2$. The bandwidth is equal to $M = bT$, where $b$ is a fixed constant. Following Kiefer and Vogelsang (2005), we can write

$$\hat\Omega = \frac{2}{bT^2}\sum_{t=1}^{T}\hat S_t^2 - \frac{2}{bT^2}\sum_{t=1}^{T-[bT]}\hat S_t\hat S_{t+[bT]},$$

where $\hat S_t = \sum_{i=1}^{t}\hat u_i$, i.e. $\hat S_t = S_t - \frac{t}{T}S_T$, with $S_t = \sum_{i=1}^{t}u_i$.
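The partial-sum form above can be verified numerically against the weighted-autocovariance form; a small check (ours, Python/NumPy assumed), which relies on the demeaned residuals satisfying $\hat S_T = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, b = 200, 0.4
M = int(b * T)
u_hat = rng.standard_normal(T)
u_hat -= u_hat.mean()            # location-model OLS residuals: S_hat_T = 0

# Weighted-autocovariance form: Gamma_0 + 2 * sum_{j<M} (1 - j/M) * Gamma_j
omega_acv = u_hat @ u_hat / T
for j in range(1, M):
    omega_acv += 2 * (1 - j / M) * (u_hat[j:] @ u_hat[:T - j]) / T

# Partial-sum form: (2/(M*T)) * (sum_t S_t^2 - sum_t S_t * S_{t+M}),
# where 2/(M*T) = 2/(b*T^2) since M = b*T
S = np.cumsum(u_hat)
omega_ps = (2 / (M * T)) * (S @ S - S[:T - M] @ S[M:])

print(np.allclose(omega_acv, omega_ps))   # True
```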

Lemma 5.2 Under Assumption 2, and for any fixed $b \in (0, 1]$, we have

$$\hat\Omega = \Omega\,Q_1(b) + O_P(T^{-1/2+1/p}),$$

with $Q_1(b)$ given by (4).

Lemma 5.2 provides an asymptotic expansion for $\hat\Omega$ with remainder $O_P(T^{-1/2+1/p})$. The leading term of this expansion is the distribution derived by Kiefer and Vogelsang (2005). The rate of convergence of $\hat\Omega$ increases with $p$, the number of finite moments of $\varepsilon_t$. If all moments of $\varepsilon_t$ exist, we can set $p = \infty$ and get the parametric convergence rate of $O_P(T^{-1/2})$. Our next result provides the asymptotic expansion for the t-statistic.

Theorem 5.1 Under Assumption 2, and for any fixed $b \in (0, 1]$, we have

$$t_{\hat\beta_1} = \frac{W_1(1)}{\sqrt{Q_1(b)}} + O_P(T^{-1/2+1/p}),$$

where $t_{\hat\beta_1}$ and $Q_1(b)$ are defined as above.

The leading term of the expansion for $t_{\hat\beta_1}$ is the fixed-b first-order asymptotic distribution derived by Kiefer and Vogelsang (2005). Using Lemma 5.1(b) and following Park (2003, Corollary 3.8), we can prove the following corollary to Theorem 5.1.

Corollary 5.1 Under Assumption 2, and for any fixed $b \in (0, 1]$, we have

$$P\left(\frac{W_1(1)}{\sqrt{Q_1(b)}} \le x\right) = P\left(t_{\hat\beta_1} \le x\right) + o(T^{-1/2+3/(2p)+\epsilon}),$$

uniformly in $x \in \mathbb{R}$, for any $\epsilon > 0$.

Corollary 5.1 gives the rate at which the fixed-b asymptotic approximation converges to the true sampling distribution of $t_{\hat\beta_1}$. When all moments exist, as in the Gaussian case considered in our simulations, $p = \infty$, and the error of the fixed-b asymptotic approximation is of order $o(T^{-1/2+\epsilon})$ for any $\epsilon > 0$. For Gaussian stationary time series, Velasco and Robinson (2001) show that the error made by the asymptotic normal approximation of a t-statistic studentized with a HAC estimator with bandwidth $M$ is of order $O((M/T)^{1/2})$ (cf. their equation (6)) when MSE optimal bandwidths are used. Thus, when $p = \infty$, the error of the normal approximation is larger than the $o(T^{-1/2+\epsilon})$ associated with the fixed-b asymptotics. This explains why the fixed-b asymptotics outperforms the normal approximation in our Monte Carlo simulations, where errors are Gaussian and therefore $p = \infty$. If we set $M = \text{const}\cdot T^{1/3}$ (the optimal rate of $M$ for the Bartlett kernel), the error incurred by the first-order standard normal approximation is $O(T^{-1/3}) = o(T^{-1/3+\epsilon})$ for any $\epsilon > 0$, which is larger than the $o(T^{-1/2+3/(2p)+\epsilon})$ error incurred by the fixed-b asymptotics for $p > 9$. Thus, whenever $p > 9$, the fixed-b asymptotics outperforms the standard normal approximation when the bandwidth is set to $M = \text{const}\cdot T^{1/3}$.

We should point out that stronger results than Corollary 5.1 have been obtained in some recent work if it is assumed that $u_t$ is Gaussian. Jansson (2004) established that the error of the fixed-b asymptotic approximation is $O(T^{-1}\log T)$ for the case of the Bartlett kernel with $b = 1$. This result has been refined to $O(T^{-1})$ and extended to a general class of kernels and a wider range of $b$ by Phillips, Sun and Jin (2005). While these error rate results are stronger than ours, it is not known whether they continue to hold without the Gaussian assumption. Because the Gaussian assumption cannot hold for the bootstrap, the methods of proof used by Jansson (2004) and Phillips et al. (2005) cannot be directly applied to the bootstrap.

5.2 Asymptotic expansion of the naive i.i.d. bootstrap t-statistic

Next, we provide an asymptotic expansion for the naive i.i.d. bootstrap statistic. Let $u_t^*$ be i.i.d. draws from $\{\hat u_t = y_t - \bar y : t = 1, \ldots, T\}$, an i.i.d. bootstrap sample. Note that $u_t^* = y_t^* - \bar y$, where $y_t^*$ is an i.i.d. bootstrap observation drawn from $\{y_t\}$. The naive i.i.d. bootstrap t-statistic is defined as

$$t^*_{\hat\beta_1^*} = \frac{\sqrt{T}(\hat\beta_1^* - \hat\beta_1)}{\sqrt{\hat\Omega^*}},$$

where $\hat\beta_1^* = \bar y^*$ and $\hat\Omega^*$ is of the same form as $\hat\Omega$ but evaluated with the bootstrap data:

$$\hat\Omega^* = \frac{2}{bT^2}\sum_{t=1}^{T}\hat S_t^{*2} - \frac{2}{bT^2}\sum_{t=1}^{T-[bT]}\hat S_t^*\hat S_{t+[bT]}^*,$$

where $\hat S_t^* = S_t^* - \frac{t}{T}S_T^*$, with $S_t^* = \sum_{i=1}^{t}u_i^*$. Let $\Omega^* = Var^*\left(T^{-1/2}\sum_{t=1}^{T}u_t^*\right)$. We can show that

$$\Omega^* = Var^*(u_t^*) = T^{-1}\sum_{t=1}^{T}\hat u_t^2, \qquad \bar\Omega \equiv \mathrm{plim}\,\Omega^* = E(u_t^2) = \sigma^2\sum_{i=0}^{\infty}\pi_i^2 \ne \sigma^2\pi(1)^2 = \Omega,$$

so the i.i.d. bootstrap does not consistently estimate the long run variance of $\sqrt{T}\hat\beta_1$. However, we will show that the i.i.d. bootstrap can still provide an asymptotic refinement over the standard normal approximation. By a bootstrap FCLT,

$$W_T^*(r) \equiv T^{-1/2}\sum_{t=1}^{[rT]}u_t^* \Rightarrow_{d^*} \Omega^{*1/2}W_1(r),$$

in probability, where $W_1$ denotes a standard Brownian motion independent of the realization of $u_t$. As above, we can find a distributionally equivalent copy of $W_T^*$, conditional on the original sample, such that the following result holds. We write

$$W_T^*(r) =_{d^*} T^{-1/2}\sum_{t=1}^{[rT]}u_t^*,$$

in probability, where the equality is to be interpreted as an equality in distribution under the bootstrap measure. The following result is a strong approximation for the bootstrap partial sum process.

Lemma 5.3 Under Assumption 2, we have:
(a) $\sup_{r\in[0,1]}|W_T^*(r) - \Omega^{*1/2}W_1(r)| = O_{P^*}(T^{-1/2+1/p})$, in probability.
(b) For any $\epsilon > 0$, $P^*\left(\sup_{r\in[0,1]}|W_T^*(r) - \Omega^{*1/2}W_1(r)| \ge T^{-1/2+3/(2p)}\right) = o_P(T^{-1/2+3/(2p)+\epsilon})$.

The next result gives an expansion for $\hat\Omega^*$ and is the bootstrap analogue of Lemma 5.2.

Lemma 5.4 Under Assumption 2, we have

$$\hat\Omega^* = \Omega^*\,Q_1(b) + O_{P^*}(T^{-1/2+1/p}),$$

in probability, where $Q_1(b)$ is as defined previously.

Given Lemma 5.4, we can derive the following asymptotic expansion for the naive i.i.d. bootstrap t-statistic.

Theorem 5.2 Under Assumption 2, we have

$$t^*_{\hat\beta_1^*} = \frac{W_1(1)}{[Q_1(b)]^{1/2}} + O_{P^*}(T^{-1/2+1/p}),$$

in probability.

The following corollary to Theorem 5.2 shows that the effect of the remainder term in the asymptotic expansion of $t^*_{\hat\beta_1^*}$ is distributionally of order $O(T^{-1/2+3/(2p)})$.

Corollary 5.2 Under Assumption 2, we have

$$P\left(\frac{W_1(1)}{[Q_1(b)]^{1/2}} \le x\right) = P^*\left(t^*_{\hat\beta_1^*} \le x\right) + o_P(T^{-1/2+3/(2p)+\epsilon}),$$

uniformly in $x \in \mathbb{R}$, for any $\epsilon > 0$.

It then follows from Corollaries 5.1 and 5.2 that

$$\sup_{x\in\mathbb{R}}\left|P^*\left(t^*_{\hat\beta_1^*} \le x\right) - P\left(t_{\hat\beta_1} \le x\right)\right| = o_P(T^{-1/2+3/(2p)+\epsilon}), \qquad (9)$$

for any $\epsilon > 0$. The result in (9) shows that the i.i.d. bootstrap error is of the same order of magnitude as the error implied by the fixed-b asymptotic approximation. In particular, if $p = \infty$ the i.i.d. bootstrap error is arbitrarily close to $o_P(T^{-1/2+\epsilon})$, smaller than the error implied by the normal approximation. The i.i.d. bootstrap error is smaller than the error associated with the normal approximation, when the optimal bandwidth is used to compute the HAC Bartlett kernel estimator, whenever $p > 9$. The reason why the i.i.d. bootstrap provides a refinement in this context is that it replicates the fixed-b distribution. This is true even when the data are dependent, as we showed more generally before.

6 Heuristic Comparisons of Edgeworth and Fixed-b

While rigorous comparisons of the accuracy of Edgeworth approximations with fixed-b approximations are well beyond the scope of this paper, some heuristic comparisons can be instructive for guiding future work. In deriving formal Edgeworth approximations, Velasco and Robinson (2001) approximate the bias and variance of the HAC variance estimator under the traditional assumption that $M/T$ shrinks to zero.

In the simple location model we have, from Velasco and Robinson (2001) for the QS kernel,

$$bias\left(\frac{\hat\Omega}{\Omega}\right) = E\left(\frac{\hat\Omega}{\Omega}\right) - 1 \approx -\frac{18\pi^2}{125}\,\frac{1}{M^2}\,\Omega^{-1}\sum_{j=-\infty}^{\infty}j^2\Gamma_j, \qquad Var\left(\frac{\hat\Omega}{\Omega}\right) \approx 2\,\frac{M}{T}\int k^2(s)\,ds = 2\,\frac{M}{T}.$$

Although the Bartlett kernel does not satisfy the assumptions used by Velasco and Robinson (2001), existing results in the spectral analysis literature give, for the Bartlett kernel,

$$bias\left(\frac{\hat\Omega}{\Omega}\right) \approx -\frac{1}{M}\,\Omega^{-1}\sum_{j=-\infty}^{\infty}|j|\,\Gamma_j, \qquad Var\left(\frac{\hat\Omega}{\Omega}\right) \approx 2\,\frac{M}{T}\int k^2(s)\,ds = \frac{4}{3}\,\frac{M}{T}.$$

Notice that the Edgeworth approximation (6) is a function of these moments. Alternatively, the fixed-b approximation approximates the entire distribution of $\hat\Omega$: $\hat\Omega/\Omega \Rightarrow Q_1(b)$. The bias of $\hat\Omega/\Omega$ can be approximated by $E(Q_1(b)) - 1$. It is interesting to compare $E(Q_1(b)) - 1$ and $Var(Q_1(b))$ with the traditional bias and variance formulas. For the Bartlett kernel (see Kiefer and Vogelsang, 2005),

$$E(Q_1(b)) - 1 = -b + \frac{1}{3}b^2, \qquad Var(Q_1(b)) = \frac{4}{3}b - \frac{7}{3}b^2 + \frac{14}{15}b^3 + \frac{2}{9}b^4,$$

for $b \le \frac{1}{2}$. Recalling that $b = M/T$, we see that the moments of the fixed-b asymptotic distribution match the traditional moments for terms of order $M/T$. The differences between the two approximations are that the fixed-b approximation does not include the $M^{-1}\Omega^{-1}\sum_j|j|\Gamma_j$ bias term, because it is $o(1)$ under fixed-b asymptotics, but the fixed-b approximation does include terms of order $(M/T)^2$ and higher.
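A quick sketch (ours, Python) tabulating these fixed-b moment formulas against the traditional first-order approximations $-b$ and $\frac{4}{3}b$:

```python
def bartlett_fixed_b_moments(b):
    """Mean and variance of the Bartlett fixed-b limit Q_1(b), b <= 1/2,
    as reported above from Kiefer and Vogelsang (2005)."""
    mean = 1 - b + (1 / 3) * b ** 2
    var = (4 / 3) * b - (7 / 3) * b ** 2 + (14 / 15) * b ** 3 + (2 / 9) * b ** 4
    return mean, var

for b in (0.05, 0.10, 0.25, 0.50):
    mean, var = bartlett_fixed_b_moments(b)
    # The traditional approximation keeps only the O(M/T) terms:
    print(f"b={b:.2f}: E[Q]-1={mean - 1:+.4f} (O(M/T) approx {-b:+.4f}), "
          f"Var={var:.4f} (O(M/T) approx {4 * b / 3:.4f})")
```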

These heuristic observations can shed some light on some of the patterns observed in Figures 5-8. In the case of i.i.d. errors, the bias term of order $M^{-1}$ is exactly zero, because $\sum_j|j|\Gamma_j = \sum_j j^2\Gamma_j = 0$, and the main difference between the fixed-b approximation and the Edgeworth approximation is the higher order terms in the bias and variance formulas for the fixed-b approximation. When $M$ is small, the difference between the Edgeworth and fixed-b approximations is negligible, whereas for large $M$, the fixed-b approximation is slightly more accurate. When $\rho = 0.3$, the Edgeworth approximation is more accurate when $M$ is small because it picks up the $M^{-1}$ term, whereas for larger $M$, the fixed-b approximation is more accurate. These differences become more apparent when $\rho = 0.9$.

An intriguing possibility is apparent. Because of the asymptotic equivalence between fixed-b asymptotics and the naive i.i.d. bootstrap, it appears the i.i.d. bootstrap captures the influence of the bias and variance of $\hat\Omega$ to higher orders than the Edgeworth approximation with respect to terms that depend on powers of $M/T$, but the i.i.d. bootstrap does not capture the bias term that depends on $\sum_j|j|\Gamma_j$ or $\sum_j j^2\Gamma_j$. With careful choice of block length, the naive block bootstrap could capture these bias terms while continuing to capture the $M/T$ and higher order terms. The simulations reported in Figures 5-8 show that when there is serial correlation in the data, increasing the block length from 1 to 5 does improve the approximation. It is possible this improvement is coming, at least in part, through the first bias term.

7 Conclusion

The bootstrap literature for dependent data suggests that the application of the bootstrap should not be automatic if asymptotic refinements over the standard normal approximation are the main aim. In particular, a naive bootstrap (i.e., a bootstrap where the formula used in the bootstrap world is the same as the formula used to compute the test statistic using the actual data) is no more accurate than the standard normal approximation. Thus, the bootstrap literature suggests that we should recenter the bootstrap statistic, and carefully choose the standard error estimates in the real and bootstrap worlds, in order to obtain improvements. In particular, to studentize the original statistic, we should not use a kernel variance estimator with triangular weights, as this will destroy the second-order properties of the block bootstrap.

In this paper, we conduct Monte Carlo simulations that show that a naive bootstrap outperforms the standard normal approximation in finite samples. This improvement holds for several kernels, including the Bartlett kernel, and holds even for an i.i.d. bootstrap, despite the dependence in the data. Our simulations suggest that the performance of the naive bootstrap is tightly linked to the finite sample performance of the recently developed fixed-b (i.e. fixed bandwidth) asymptotics. We provide a theoretical explanation for this result: we prove that the bootstrap distribution of the naive bootstrap statistics is asymptotically the same as the fixed-b asymptotic distribution. In addition, for a simple location model we show that a naive i.i.d. bootstrap can reduce the magnitude of the error in estimating one-sided distribution functions of robust t-statistics, compared with the standard normal approximation error for statistics studentized with a Bartlett kernel variance estimator based on MSE optimal bandwidths.

Our simulations suggest that a naive block bootstrap can offer an asymptotic refinement over the fixed-b asymptotics when the block size is chosen appropriately. Providing a theoretical explanation for this finding is an interesting topic of research, which we will undertake elsewhere.

Appendix A

This Appendix contains the proofs of the results in Section 4. Throughout this Appendix, $K$ denotes a generic constant that may change from one usage to the next. We first state four lemmas that are auxiliary in proving Lemma 4.1 and Theorem 4.1 in Section 4. We then provide the proofs of our main results, followed by the proofs of the auxiliary lemmas. The following result is a maximal inequality for mixingales (see e.g. Davidson, 1994, for a definition of a mixingale) due to Hansen (1991, 1992). Zero mean NED processes on a mixing process are mixingales, and we will repeatedly use this result in our proofs.

Lemma A.1 For some nondecreasing sequence of $\sigma$-fields $\{\mathcal{F}_t\}$ and for some $p > 1$, let $\{X_t, \mathcal{F}_t\}$ be an $L_p$-mixingale with mixingale coefficients $\psi_m$ and mixingale constants $c_t$. Then, letting $S_j = \sum_{t=1}^{j}X_t$ and $\Psi = \sum_{m=1}^{\infty}\psi_m$, it follows that:
(a) If $1 < p \le 2$, $\left\|\max_{j\le T}|S_j|\right\|_p \le K\Psi\left(\sum_{t=1}^{T}c_t^p\right)^{1/p}$.
(b) For $p \ge 2$, $\left\|\max_{j\le T}|S_j|\right\|_p \le K\Psi\left(\sum_{t=1}^{T}c_t^2\right)^{1/2}$.

The following result gives the probability limits of the MBB variance of a scaled bootstrap sample mean under two different assumptions on the block size $l$: (a) when $l$ is fixed as $T \to \infty$; and (b) when $l \to \infty$ as $T \to \infty$ at an appropriate rate. We state the result for a general time series $\{X_t\}$ satisfying the following assumptions:

Assumption A Let $\{X_t\}$ be a weakly stationary sequence of $s \times 1$ random vectors such that the following hold:
(i) For some $p > 2$, $\|X_t\|_p < \infty$ for all $t = 1, 2, \ldots$.
(ii) $\{X_t\}$ is $L_2$-NED on $\{V_t\}$ with NED coefficients of size $-1/2$.
(iii) $\{V_t\}$ is $\alpha$-mixing of size $-\frac{p}{p-2}$.

Let $\{X_t^* : t = 1, 2, \ldots, T\}$ denote an MBB resample obtained from $\{X_t : t = 1, 2, \ldots, T\}$ using block size $l$. Let $\Omega_T^* = Var^*\left(T^{-1/2}\sum_{t=1}^{T}X_t^*\right)$ denote the bootstrap variance of the scaled bootstrap sample mean.

Lemma A.2 Suppose $\{X_t\}$ satisfies Assumption A. Then,