Bootstrap prediction intervals for factor models


Bootstrap prediction intervals for factor models

Sílvia Gonçalves and Benoit Perron
Département de sciences économiques, CIREQ and CIRANO, Université de Montréal

April 2013

Abstract

We propose bootstrap prediction intervals for an observation h periods into the future and for its conditional mean. We assume that these forecasts are made using a set of factors extracted from a large panel of variables. Because we treat these factors as latent, our forecasts depend both on estimated factors and estimated regression coefficients. Under regularity conditions, Bai and Ng (2006) proposed the construction of asymptotic intervals under Gaussianity of the innovations. The bootstrap allows us to relax this assumption and to construct valid prediction intervals under more general conditions. Moreover, even under Gaussianity, the bootstrap leads to more accurate intervals in cases where the cross-sectional dimension is relatively small, as it reduces the bias of the OLS estimator, as shown in a recent paper by Gonçalves and Perron (2013).

Keywords: factor model, bootstrap, forecast, conditional mean.

1 Introduction

Forecasting using factor-augmented regression models has become increasingly popular since the seminal paper of Stock and Watson (2002). The main idea underlying the so-called diffusion index forecasts is that, when forecasting a given variable of interest, a large number of predictors can be summarized by a small number of indexes when the data follow an approximate factor model. The indexes are the latent factors driving the panel and can be estimated by principal components. Point forecasts can be obtained by running a standard OLS regression augmented with the estimated factors.

In this paper, we consider the construction of prediction intervals in factor-augmented regression models using the bootstrap. To be more specific, suppose that y_{t+h} denotes the variable to be forecast, where h is the forecast horizon, and let X_t be an N-dimensional vector of candidate predictors. We assume that y_{t+h} follows a factor-augmented regression model,

y_{t+h} = α′F_t + β′W_t + ε_{t+h},  t = 1, ..., T − h,

where W_t is a vector of observed regressors including, for instance, lags of y_t, which jointly with F_t help forecast y_{t+h}. The r-dimensional vector F_t describes the common latent factors in the panel factor model

X_it = λ_i′F_t + e_it,  i = 1, ..., N,  t = 1, ..., T,

where the r × 1 vector λ_i contains the factor loadings and e_it is an idiosyncratic error term.

Footnote: We are grateful for comments from seminar participants at the University of California, San Diego and the University of Southern California, and from participants at the 23rd meeting of the (EC)^2 in Maastricht and the 2013 North American Winter Meeting of the Econometric Society in San Diego. Gonçalves acknowledges financial support from the NSERC and MITACS, whereas Perron acknowledges financial support from the SSHRC and MITACS.
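As a concrete illustration, the two equations above can be simulated directly. All dimensions, parameter values, and distributions below are illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not from the paper): N predictors, T periods, r factors.
N, T, r, h = 50, 100, 2, 1

F = rng.standard_normal((T, r))        # latent factors F_t
Lam = rng.standard_normal((N, r))      # loadings lambda_i
e = rng.standard_normal((T, N))        # idiosyncratic errors e_it
X = F @ Lam.T + e                      # X_it = lambda_i' F_t + e_it

W = np.ones((T, 1))                    # observed regressors W_t (a constant here)
alpha = np.array([1.0, -0.5])          # hypothetical coefficient values
beta = np.array([0.2])
eps = rng.standard_normal(T)           # regression errors eps_{t+h}

# y_{t+h} = alpha' F_t + beta' W_t + eps_{t+h}, for t = 1, ..., T - h.
y_lead = F[: T - h] @ alpha + W[: T - h] @ beta + eps[h:]
print(X.shape, y_lead.shape)           # (100, 50) (99,)
```

Only X_t, W_t, and y_{t+h} would be observed by the forecaster; F, Lam, and e are latent.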

The goal is to forecast y_{T+h} or its conditional mean y_{T+h|T} ≡ α′F_T + β′W_T using {y_t, X_t, W_t : t = 1, ..., T}, the available data at time T. Since the factors are not observed, the diffusion index forecast approach typically involves a two-step procedure: in the first step we estimate F_t by principal components, yielding F̃_t, and in the second step we regress y_{t+h} on W_t and F̃_t to obtain the regression coefficients. The point forecast is then constructed as ŷ_{T+h|T} = α̂′F̃_T + β̂′W_T.

The main goal of this paper is to propose bootstrap prediction intervals for y_{T+h} and y_{T+h|T}. Because we treat the factors as latent, forecasts for y_{T+h} and its conditional mean depend both on estimated factors and on estimated regression coefficients. These two sources of parameter uncertainty must be accounted for when constructing prediction intervals, as recently shown by Bai and Ng (2006). Under regularity conditions, Bai and Ng (2006) derived the asymptotic distribution of the regression estimates and the corresponding forecast errors and proposed the construction of asymptotic intervals.

Our motivation for using the bootstrap as an alternative method of inference is twofold. First, the finite sample properties of the asymptotic approach of Bai and Ng (2006) can be poor, especially if N is not sufficiently large relative to T. This was recently shown by Gonçalves and Perron (2013) in the context of confidence intervals for the regression coefficients, and as we will show below, the same is true in the context of prediction intervals. In particular, estimation of the factors leads to an asymptotic bias term in the OLS estimator if √T/N → c with c ≠ 0. Gonçalves and Perron (2013) proposed a bootstrap method that removes this bias and outperforms the asymptotic approach of Bai and Ng (2006). Second, the bootstrap allows for the construction of prediction intervals for y_{T+h} that are consistent under more general assumptions than the asymptotic approach of Bai and Ng (2006). In particular, the bootstrap does not require the Gaussianity assumption on the regression errors that justifies the asymptotic prediction intervals of Bai and Ng (2006). As our simulations show, prediction intervals based on the Gaussianity assumption perform poorly when the regression error is asymmetrically distributed, whereas the bootstrap prediction intervals do not suffer significant size distortions.

The remainder of the paper is organized as follows. Section 2 introduces our forecasting model and considers asymptotic prediction intervals. Section 3 describes two bootstrap prediction algorithms. Section 4 presents a set of high-level assumptions on the bootstrap idiosyncratic errors under which the bootstrap distribution of the estimated factors at a given time period is consistent for the distribution of the sample estimated factors. These results, together with the results of Gonçalves and Perron (2013), are used in Section 5 to show the asymptotic validity of wild bootstrap prediction intervals. Section 6 presents our simulation experiments, and Section 7 concludes. Mathematical proofs appear in the Appendix.

2 Prediction intervals based on asymptotic theory

This section introduces our assumptions and reviews the asymptotic theory-based prediction intervals proposed by Bai and Ng (2006).

2.1 Assumptions

Let z_t = (F_t′, W_t′)′, where z_t is p × 1 with p = r + q. Following Bai and Ng (2006), we make the following assumptions.

Assumption 1
(a) E‖F_t‖^4 ≤ M and T^{-1} Σ_{t=1}^T F_t F_t′ →_P Σ_F > 0, where Σ_F is a non-random r × r matrix.
(b) The loadings λ_i are either deterministic such that ‖λ_i‖ ≤ M, or stochastic such that E‖λ_i‖^4 ≤ M. In either case, Λ′Λ/N →_P Σ_Λ > 0, where Σ_Λ is a non-random matrix.

(c) The eigenvalues of the r × r matrix Σ_Λ Σ_F are distinct.

Assumption 2
(a) E(e_it) = 0 and E|e_it|^8 ≤ M.
(b) E(e_it e_js) = σ_ij,ts, with |σ_ij,ts| ≤ σ̄_ij for all (t, s) and |σ_ij,ts| ≤ τ_ts for all (i, j). Furthermore, Σ_{s=1}^T τ_ts ≤ M for each t, and (NT)^{-1} Σ_{t,s,i,j} |σ_ij,ts| ≤ M.
(c) For every (t, s), E|N^{-1/2} Σ_{i=1}^N (e_it e_is − E(e_it e_is))|^4 ≤ M.
(d) Σ_{t,s,l,u} Σ_{i,j} |Cov(e_it e_is, e_jl e_ju)|, suitably normalized, is bounded.
(e) For each t, N^{-1/2} Σ_{i=1}^N λ_i e_it →_d N(0, Γ_t), where Γ_t ≡ lim_{N→∞} Var(N^{-1/2} Σ_{i=1}^N λ_i e_it) > 0.

Assumption 3 The variables {λ_i}, {F_t} and {e_it} are three mutually independent groups. Dependence within each group is allowed.

Assumption 4
(a) E(ε_{t+h}) = 0 and E|ε_{t+h}|^2 < M.
(b) E(ε_{t+h} | y_t, F_t, y_{t−1}, F_{t−1}, ...) = 0 for any h > 0, and F_t and ε_t are independent of the idiosyncratic errors e_is for all (i, s, t).
(c) E‖z_t‖^4 ≤ M and T^{-1} Σ_{t=1}^{T−h} z_t z_t′ →_P Σ_zz > 0.
(d) As T → ∞, T^{-1/2} Σ_{t=1}^{T−h} z_t ε_{t+h} →_d N(0, Ω), where Ω ≡ p lim T^{-1} Σ_{t=1}^{T−h} E(z_t z_t′ ε²_{t+h}) > 0 and E‖T^{-1/2} Σ_{t=1}^{T−h} z_t ε_{t+h}‖^2 < M.

Assumptions 1 and 2 are standard in the approximate factor model literature, allowing in particular for weak cross-sectional and serial dependence in e_it of unknown form. Assumption 3 assumes independence among the factors, the factor loadings and the idiosyncratic error terms. We could allow for weak dependence among these three groups of variables at the cost of introducing restrictions on this dependence. Assumption 4 imposes moment conditions on {ε_{t+h}}, on {z_t} and on the score vector {z_t ε_{t+h}}. Part (c) requires {z_t z_t′} to satisfy a law of large numbers. Part (d) requires the scores to satisfy a central limit theorem, where Ω denotes the limiting variance of the scaled average of the scores. We make the same assumption as Bai and Ng (2006) regarding the form of this covariance matrix.

2.2 Normal-theory prediction intervals

As described in Section 1, the diffusion index forecasts are based on a two-step estimation procedure. The first step consists of extracting the common factors F_t from the N-dimensional panel X_t. In particular, given X, we estimate F and Λ by the method of principal components: F is estimated with the T × r matrix F̃ = (F̃_1, ..., F̃_T)′ composed of √T times the eigenvectors corresponding to the r largest eigenvalues of XX′/(NT), arranged in decreasing order, where the normalization F̃′F̃/T = I_r is used. The matrix containing the estimated loadings is then

Λ̃ = (λ̃_1, ..., λ̃_N)′ = X′F̃(F̃′F̃)^{-1} = X′F̃/T.
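In code, the principal components step just described might be sketched as follows (simulated X and illustrative dimensions; note that `numpy.linalg.eigh` returns eigenvalues in ascending order, so we reorder):

```python
import numpy as np

# Principal components estimation, as in the text:
# F_tilde = sqrt(T) x (eigenvectors of XX'/(NT) for the r largest eigenvalues),
# so that F_tilde' F_tilde / T = I_r, and Lambda_tilde = X' F_tilde / T.
rng = np.random.default_rng(1)
N, T, r = 50, 100, 2
X = rng.standard_normal((T, r)) @ rng.standard_normal((r, N)) \
    + 0.5 * rng.standard_normal((T, N))

eigval, eigvec = np.linalg.eigh(X @ X.T / (N * T))   # T x T problem, ascending
order = np.argsort(eigval)[::-1][:r]                 # indices of r largest
V_tilde = np.diag(eigval[order])                     # the matrix V-tilde used later
F_tilde = np.sqrt(T) * eigvec[:, order]              # T x r estimated factors
Lam_tilde = X.T @ F_tilde / T                        # N x r estimated loadings

# Normalization check: F_tilde' F_tilde / T = I_r.
print(np.allclose(F_tilde.T @ F_tilde / T, np.eye(r)))  # True
```

Because the eigenvectors are orthonormal, the normalization F̃′F̃/T = I_r holds by construction; the signs (and, more generally, the rotation) of the estimated factors are not pinned down, which is exactly the identification issue addressed by the matrix H below.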

In the second step, we run an OLS regression of y_{t+h} on ẑ_t = (F̃_t′, W_t′)′, i.e. we compute

δ̂ ≡ (α̂′, β̂′)′ = (Σ_{t=1}^{T−h} ẑ_t ẑ_t′)^{-1} Σ_{t=1}^{T−h} ẑ_t y_{t+h},

where δ̂ is p × 1 with p = r + q. Suppose the object of interest is y_{T+h|T} ≡ α′F_T + β′W_T, the conditional mean of y_{T+h} at time T. The point forecast is ŷ_{T+h|T} = α̂′F̃_T + β̂′W_T and the forecast error is given by

ŷ_{T+h|T} − y_{T+h|T} = ẑ_T′(δ̂ − δ) + α′H^{-1}(F̃_T − HF_T),

where δ ≡ ((H^{-1}′α)′, β′)′ is the probability limit of δ̂. The matrix H is defined as

H = Ṽ^{-1} (F̃′F/T)(Λ′Λ/N),

where Ṽ is the r × r diagonal matrix containing on its main diagonal the r largest eigenvalues of XX′/(NT), in decreasing order (cf. Bai (2003)). It arises because factor models are only identified up to rotation, implying that the principal components estimator F̃_t converges to HF_t and the OLS estimator α̂ converges to H^{-1}′α. It must be noted that forecasts do not depend on this rotation, since the product α̂′F̃_t is uniquely identified.

The above decomposition shows that the asymptotic distribution of the forecast error depends on two sources of uncertainty: the first is the usual parameter estimation uncertainty associated with the estimation of α and β, and the second is the factors estimation uncertainty. Under Assumptions 1-4, and assuming that √T/N → 0 and √N/T → 0 as N, T → ∞, Bai and Ng (2006) show that the studentized forecast error satisfies

(ŷ_{T+h|T} − y_{T+h|T}) / √B̂_T →_d N(0, 1),

where B̂_T is a consistent estimator of the asymptotic variance of ŷ_{T+h|T} given by

B̂_T = Var(ŷ_{T+h|T}) = T^{-1} ẑ_T′ Σ̂_δ ẑ_T + N^{-1} α̂′ Σ̂_{F̃_T} α̂.

Here, Σ̂_δ consistently estimates Σ_δ = Avar(√T(δ̂ − δ)) and Σ̂_{F̃_T} consistently estimates Σ_{F̃_T} = Avar(√N(F̃_T − HF_T)). In particular, under Assumptions 1-4,

Σ̂_δ = (T^{-1} Σ_{t=1}^{T−h} ẑ_t ẑ_t′)^{-1} (T^{-1} Σ_{t=1}^{T−h} ẑ_t ẑ_t′ ε̂²_{t+h}) (T^{-1} Σ_{t=1}^{T−h} ẑ_t ẑ_t′)^{-1}, and Σ̂_{F̃_T} = Ṽ^{-1} Γ̃_T Ṽ^{-1},

where Γ̃_T is an estimator of Γ_T = lim_{N→∞} Var(N^{-1/2} Σ_{i=1}^N λ_i e_iT) that depends on the cross-sectional dependence and heterogeneity properties of e_iT. Bai and Ng (2006) provide three different estimators of Γ_T. Section 5 below considers one such estimator.
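A minimal sketch of this second-step regression and the resulting point forecast, with simulated inputs standing in for the PCA estimates (not the paper's code; dimensions and data are illustrative):

```python
import numpy as np

# Second-step OLS in the factor-augmented regression, using estimated factors.
rng = np.random.default_rng(2)
T, r, q, h = 100, 2, 1, 1

F_tilde = rng.standard_normal((T, r))   # stand-in for the PCA factor estimates
W = np.ones((T, q))                     # observed regressors (a constant here)
y = rng.standard_normal(T)              # stand-in for the series to forecast

Z_hat = np.hstack([F_tilde, W])         # z_hat_t = (F_tilde_t', W_t')'
Z, y_lead = Z_hat[: T - h], y[h:]       # regress y_{t+h} on z_hat_t, t <= T - h

delta_hat, *_ = np.linalg.lstsq(Z, y_lead, rcond=None)  # (alpha_hat', beta_hat')'
alpha_hat, beta_hat = delta_hat[:r], delta_hat[r:]

# Point forecast: y_hat_{T+h|T} = alpha_hat' F_tilde_T + beta_hat' W_T.
y_hat = Z_hat[-1] @ delta_hat
resid = y_lead - Z @ delta_hat          # eps_hat_{t+h}, feeds Sigma_delta and sigma2_eps
print(delta_hat.shape)                  # (3,)
```

The residuals `resid` are the ε̂_{t+h} that enter the heteroskedasticity-robust estimator Σ̂_δ and, later, σ̂²_ε.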

The central limit theorem result above justifies the construction of an asymptotic 100(1 − α)% level prediction interval for y_{T+h|T} given by

(ŷ_{T+h|T} − z_{1−α/2} √B̂_T,  ŷ_{T+h|T} + z_{1−α/2} √B̂_T),

where z_{1−α/2} is the 1 − α/2 quantile of the standard normal distribution. When the object of interest is a prediction interval for y_{T+h}, Bai and Ng (2006) propose

(ŷ_{T+h|T} − z_{1−α/2} √Ĉ_T,  ŷ_{T+h|T} + z_{1−α/2} √Ĉ_T),

where Ĉ_T = B̂_T + σ̂²_ε, with B̂_T as above and σ̂²_ε = T^{-1} Σ_{t=1}^{T−h} ε̂²_{t+h}. The validity of this interval depends on the additional assumption that ε_t is i.i.d. N(0, σ²_ε).

An important condition justifying both intervals is that √T/N → 0. This condition ensures that the term reflecting the parameter estimation uncertainty in the forecast error decomposition, √T(δ̂ − δ), is asymptotically normal with a mean of zero and a variance-covariance matrix that does not depend on the factors estimation uncertainty. As was recently shown by Gonçalves and Perron (2013), when √T/N → c ≠ 0,

√T(δ̂ − δ) →_d N(−c Δ_δ, Σ_δ),

where Δ_δ is a bias term that reflects the contribution of the factors estimation error to the asymptotic distribution of the regression estimates δ̂. In this case, the two terms in the forecast error decomposition depend on the factors estimation uncertainty, and a natural question is whether this has an effect on the prediction intervals derived by Bai and Ng (2006) under the assumption that c = 0. As we argue next, these intervals remain valid even when c ≠ 0. The main reason is that when √T/N → c ≠ 0, the ratio N/T → 0, which implies that the parameter estimation uncertainty associated with δ̂ is dominated asymptotically by the uncertainty from having to estimate F_T. More formally, when √T/N → c ≠ 0, N/T → 0 and the convergence rate of ŷ_{T+h|T} is √N, implying that

√N(ŷ_{T+h|T} − y_{T+h|T}) = √(N/T) √T(δ̂ − δ)′ẑ_T + α′H^{-1} √N(F̃_T − HF_T) = α′H^{-1} √N(F̃_T − HF_T) + o_P(1).

Thus, the forecast error is asymptotically N(0, α′H^{-1} Σ_{F̃_T} H^{-1}′α). Since

N B̂_T = (N/T) ẑ_T′ Σ̂_δ ẑ_T + α̂′ Σ̂_{F̃_T} α̂ = α′H^{-1} Σ_{F̃_T} H^{-1}′α + o_P(1),

the studentized forecast error given above is still N(0, 1) as N, T → ∞. For the studentized forecast error associated with forecasting y_{T+h}, the variance of ŷ_{T+h|T} is asymptotically (as N, T → ∞) dominated by the variance of the error term, σ²_ε, implying that neither the parameter estimation uncertainty nor the factors estimation uncertainty contributes to the asymptotic variance.

3 Description of bootstrap prediction intervals

Following Gonçalves and Perron (2013), we consider the following bootstrap DGP:

X*_t = Λ̃ F̃_t + e*_t,
y*_{t+h} = α̂′F̃_t + β̂′W_t + ε*_{t+h},

where e*_t = (e*_1t, ..., e*_Nt)′ denotes a bootstrap sample from ẽ_t = X_t − Λ̃F̃_t, and {ε*_{t+h}} is a resampled version of {ε̂_{t+h} = y_{t+h} − α̂′F̃_t − β̂′W_t}. A specific method to generate e*_t and ε*_{t+h} is discussed in Section 5.

We estimate the factors by the method of principal components using the bootstrap panel data set {X*_t : t = 1, ..., T}. We let F̃* = (F̃*_1, ..., F̃*_T)′ denote the T × r matrix of bootstrap estimated factors, equal to √T times the r eigenvectors of X*X*′/(NT) corresponding to its r largest eigenvalues. The N × r matrix of estimated bootstrap loadings is given by Λ̃* = (λ̃*_1, ..., λ̃*_N)′ = X*′F̃*/T. We then run a regression of y*_{t+h} on F̃*_t and W_t using observations t = 1, ..., T − h. We let δ̂* denote the corresponding OLS estimator,

δ̂* = (Σ_{t=1}^{T−h} ẑ*_t ẑ*_t′)^{-1} Σ_{t=1}^{T−h} ẑ*_t y*_{t+h},

where ẑ*_t = (F̃*_t′, W_t′)′. The steps for obtaining a bootstrap prediction interval for y_{T+h|T} are as follows.

Algorithm 1 (Bootstrap prediction interval for y_{T+h|T})

1. For t = 1, ..., T, generate X*_t = Λ̃F̃_t + e*_t, where {e*_it} is a resampled version of {ẽ_it = X_it − λ̃_i′F̃_t}.

2. Estimate the bootstrap factors {F̃*_t : t = 1, ..., T} using X*.

3. For t = 1, ..., T − h, generate y*_{t+h} = α̂′F̃_t + β̂′W_t + ε*_{t+h}, where the error term ε*_{t+h} is a resampled version of ε̂_{t+h}.

4. Regress y*_{t+h} generated in step 3 on the bootstrap estimated factors F̃*_t obtained in step 2 and on the fixed regressors W_t, and obtain the OLS estimator δ̂*.

5. Obtain the bootstrap forecast and bootstrap standard error,

ŷ*_{T+h|T} = α̂*′F̃*_T + β̂*′W_T ≡ δ̂*′ẑ*_T,  B̂*_T = T^{-1} ẑ*_T′ Σ̂*_δ ẑ*_T + N^{-1} α̂*′ Σ̂*_{F̃_T} α̂*,

where the choice of Σ̂*_δ and Σ̂*_{F̃_T} depends on the assumptions on ε*_{t+h} and e*_it.

6. Let y*_{T+h|T} = α̂′F̃_T + β̂′W_T and compute bootstrap prediction errors:

(a) For equal-tailed percentile-t bootstrap intervals, compute studentized bootstrap prediction errors as

s*_{T+h|T} = (ŷ*_{T+h|T} − y*_{T+h|T}) / √B̂*_T.

(b) For symmetric percentile-t bootstrap intervals, compute |s*_{T+h|T}|.

7. Repeat this process B times, resulting in the statistics {s*_{T+h|T,1}, ..., s*_{T+h|T,B}} and {|s*_{T+h|T,1}|, ..., |s*_{T+h|T,B}|}.

8. Compute the corresponding empirical quantiles:

(a) For equal-tailed percentile-t bootstrap intervals, q*_α is the empirical α quantile of {s*_{T+h|T,1}, ..., s*_{T+h|T,B}}.

(b) For symmetric percentile-t bootstrap intervals, q*_{|s|,1−α} is the empirical 1 − α quantile of {|s*_{T+h|T,1}|, ..., |s*_{T+h|T,B}|}.

A 100(1 − α)% equal-tailed percentile-t bootstrap interval for y_{T+h|T} is given by

EQ_{1−α}(y_{T+h|T}) ≡ (ŷ_{T+h|T} − q*_{1−α/2} √B̂_T,  ŷ_{T+h|T} − q*_{α/2} √B̂_T),

whereas a 100(1 − α)% symmetric percentile-t bootstrap interval for y_{T+h|T} is given by

SY_{1−α}(y_{T+h|T}) ≡ (ŷ_{T+h|T} − q*_{|s|,1−α} √B̂_T,  ŷ_{T+h|T} + q*_{|s|,1−α} √B̂_T).

When prediction intervals for a new observation y_{T+h} are the object of interest, the algorithm reads as follows.

Algorithm 2 (Bootstrap prediction interval for y_{T+h})

1. Identical to Algorithm 1.

2. Identical to Algorithm 1.

3. Generate {y*_{1+h}, ..., y*_T, y*_{T+1}, ..., y*_{T+h}} using y*_{t+h} = α̂′F̃_t + β̂′W_t + ε*_{t+h}, where {ε*_{1+h}, ..., ε*_T, ε*_{T+1}, ..., ε*_{T+h}} is a bootstrap sample obtained from {ε̂_{1+h}, ..., ε̂_T}.

4. Not making use of the stretch {y*_{T+1}, ..., y*_{T+h}}, compute δ̂* as in Algorithm 1.

5. Obtain the bootstrap point forecast ŷ*_{T+h|T} as in Algorithm 1, but compute its standard error as Ĉ*_T = B̂*_T + σ̂*²_ε, where σ̂*²_ε is a consistent estimator of σ*²_ε = Var*(ε*_{T+h}) and B̂*_T is as in Algorithm 1.

6. Let y*_{T+h} = α̂′F̃_T + β̂′W_T + ε*_{T+h} and compute bootstrap prediction errors:

(a) For equal-tailed percentile-t bootstrap intervals, compute studentized bootstrap prediction errors as

s*_{T+h} = (ŷ*_{T+h|T} − y*_{T+h}) / √Ĉ*_T.

(b) For symmetric percentile-t bootstrap intervals, compute |s*_{T+h}|.

7. Identical to Algorithm 1.

8. Identical to Algorithm 1.

A 100(1 − α)% equal-tailed percentile-t bootstrap interval for y_{T+h} is given by

EQ_{1−α}(y_{T+h}) ≡ (ŷ_{T+h|T} − q*_{1−α/2} √Ĉ_T,  ŷ_{T+h|T} − q*_{α/2} √Ĉ_T),

whereas a 100(1 − α)% symmetric percentile-t bootstrap interval for y_{T+h} is given by

SY_{1−α}(y_{T+h}) ≡ (ŷ_{T+h|T} − q*_{|s|,1−α} √Ĉ_T,  ŷ_{T+h|T} + q*_{|s|,1−α} √Ĉ_T).

The main difference between the two algorithms is that in step 3 of Algorithm 2 we generate observations for y*_{t+h} for t = 1, ..., T instead of stopping at t = T − h. This allows us to obtain a bootstrap observation for y*_{T+h}, the bootstrap analogue of y_{T+h}, which we use in constructing the studentized statistic s*_{T+h} in step 6 of Algorithm 2. The point forecast is identical to Algorithm 1 and relies only on observations for t = 1, ..., T − h, but the bootstrap variance Ĉ*_T contains an extra term, σ̂*²_ε, that reflects the uncertainty associated with the error ε*_{T+h} attached to the new observation y*_{T+h}.

4 Bootstrap distribution of the estimated factors

The asymptotic validity of the bootstrap prediction intervals for y_{T+h|T} and y_{T+h} described in the previous section depends on the ability of the bootstrap to capture two sources of estimation error: the parameter estimation error and the factors estimation error. In particular, the bootstrap prediction error for the conditional mean is given by

ŷ*_{T+h|T} − y*_{T+h|T} = ẑ*_T′(δ̂* − δ*) + α̂′H*^{-1}(F̃*_T − H*F̃_T),

where δ* ≡ Φ*δ̂ and Φ* ≡ diag(H*^{-1}′, I_q). Here, H* is the bootstrap analogue of the rotation matrix H defined in Section 2, i.e.

H* = Ṽ*^{-1} (F̃*′F̃/T)(Λ̃′Λ̃/N),

where Ṽ* is the r × r diagonal matrix containing on its main diagonal the r largest eigenvalues of X*X*′/(NT), in decreasing order. Note that, contrary to H, which depends on unknown population parameters, H* is fully observed. Adding and subtracting appropriately, we can write

ŷ*_{T+h|T} − y*_{T+h|T} = ẑ_T′ Φ*^{-1}(δ̂* − δ*) + α̂′H*^{-1}(F̃*_T − H*F̃_T) + o_P*(1).

As usual in the bootstrap literature, we use P* to denote the bootstrap probability measure, conditional on a given sample; E* and Var* denote the corresponding bootstrap expected value and variance operators. For any bootstrap statistic T*_{NT}, we write T*_{NT} = o_P*(1), in probability, or T*_{NT} →_P* 0, in probability, when for any δ > 0, P*(|T*_{NT}| > δ) = o_P(1). We write T*_{NT} = O_P*(1), in probability, when for all δ > 0 there exists M_δ < ∞ such that lim_{N,T→∞} P[P*(|T*_{NT}| > M_δ) > δ] = 0. Finally, we write T*_{NT} →_d* D, in probability, if, conditional on a sample with probability converging to one, T*_{NT} weakly converges to the distribution D under P*, i.e. E*(f(T*_{NT})) →_P E(f(D)) for all bounded and uniformly continuous functions f. See Chang and Park (2003) for similar notation and for several useful bootstrap asymptotic properties.

The stochastic expansion above shows that the bootstrap prediction error captures the two forms of estimation uncertainty in the original forecast error provided (1) the bootstrap distribution of √T Φ*^{-1}(δ̂* − δ*) is a consistent estimator of the distribution of √T(δ̂ − δ), and (2) the bootstrap distribution of √N(F̃*_T − H*F̃_T) is a consistent estimator of the distribution of √N(F̃_T − HF_T). Gonçalves and Perron (2013) discussed conditions for the consistency of the bootstrap distribution of √T(δ̂ − δ). Here we propose a set of conditions that justifies using the bootstrap to consistently estimate the distribution of the estimated factors √N(F̃_t − HF_t) at each point t.

Condition A*

A*.1 (a) T^{-1} Σ_{t=1}^T Σ_{s=1}^T |γ*_{st}| = O_P(1), and (b) for each t, Σ_{s=1}^T |γ*_{st}| = O_P(1), where γ*_{st} ≡ E*(N^{-1} Σ_{i=1}^N e*_it e*_is).

A*.2 (a) T^{-2} Σ_{t=1}^T Σ_{s=1}^T E*|N^{-1/2} Σ_{i=1}^N (e*_it e*_is − E*(e*_it e*_is))|^2 = O_P(1), and (b) for each t, T^{-1} Σ_{s=1}^T E*|N^{-1/2} Σ_{i=1}^N (e*_it e*_is − E*(e*_it e*_is))|^2 = O_P(1).

A*.3 For each t, E*‖T^{-1/2} Σ_{s=1}^T F̃_s N^{-1/2} Σ_{i=1}^N (e*_it e*_is − E*(e*_it e*_is))‖^2 = O_P(1).

A*.4 E*‖(TN)^{-1/2} Σ_{t=1}^T Σ_{i=1}^N F̃_t λ̃_i′ e*_it‖^2 = O_P(1).

A*.5 T^{-1} Σ_{t=1}^T E*‖N^{-1/2} Σ_{i=1}^N λ̃_i e*_it‖^2 = O_P(1).

A*.6 For each t, Γ*_t^{-1/2} N^{-1/2} Σ_{i=1}^N λ̃_i e*_it →_d* N(0, I_r), in probability, where Γ*_t ≡ Var*(N^{-1/2} Σ_{i=1}^N λ̃_i e*_it) is uniformly positive definite.

Condition A* is the bootstrap analogue of the assumptions Bai (2003) used to derive the limiting distribution of √N(F̃_t − HF_t). Gonçalves and Perron (2013) also relied on similar high-level assumptions to study the bootstrap distribution of √T(δ̂ − δ). In particular, Conditions A*.4 and A*.5 correspond to their Conditions B*(c) and B*(d), respectively. Since our goal here is to characterize the limiting distribution of the bootstrap estimated factors at each point t, we need to complement some of their other conditions by requiring boundedness in probability of some bootstrap moments at each point in time t, in addition to boundedness in probability of the time average of these bootstrap moments; e.g. Conditions A*.1(b) and A*.2(b) expand Conditions A*(b) and A*(c) in Gonçalves and Perron (2013) in this manner. We also require that a central limit theorem apply to the scaled cross-sectional average of λ̃_i e*_it at each time t (Condition A*.6). This high-level condition ensures asymptotic

normality for the bootstrap estimated factors. It was not required by Gonçalves and Perron (2013) because their goal was only to consistently estimate the distribution of the regression estimates, not of the estimated factors.

Theorem 4.1 Suppose Assumptions 1 and 2 hold. Under Condition A*, as N, T → ∞ such that √N/T^{3/4} → 0, we have that for each t,

√N(F̃*_t − H*F̃_t) = H*Ṽ^{-1} N^{-1/2} Σ_{i=1}^N λ̃_i e*_it + o_P*(1),

in probability, which implies that

Π*_t^{-1/2} √N(H*^{-1}F̃*_t − F̃_t) →_d* N(0, I_r),

in probability, where Π*_t ≡ Ṽ^{-1} Γ*_t Ṽ^{-1}.

Theorem 1 of Bai (2003) shows that, under regularity conditions weaker than Assumptions 1 and 2 and provided √N/T → 0, √N(F̃_t − HF_t) →_d N(0, Π_t), where Π_t = V^{-1} Q Γ_t Q′ V^{-1} and Q = p lim F̃′F/T. Theorem 4.1 is its bootstrap analogue. A stronger rate condition (√N/T^{3/4} → 0 instead of √N/T → 0) is used to show that the remainder terms in the stochastic expansion of √N(F̃*_t − H*F̃_t) are asymptotically negligible. This rate condition is a function of the number of finite moments we assume for F_s. In particular, if we replace Assumption 1(a) with E‖F_t‖^q ≤ M for all t, then the required rate restriction is √N/T^{1−1/q} → 0. See Remarks 1 and 2 below.

To prove the consistency of Π*_t for Π_t we impose the following additional condition.

Condition B* For each t, p lim Γ*_t = Q Γ_t Q′.

Condition B* requires that Γ*_t, the bootstrap variance of the scaled cross-sectional average of the scores λ̃_i e*_it, be consistent for Q Γ_t Q′. This in turn requires that we resample ẽ_it in a way that preserves the cross-sectional dependence and heterogeneity properties of e_it.

Corollary 4.1 Under Assumptions 1 and 2 and Conditions A* and B*, we have that for each t, as N, T → ∞ such that √N/T^{3/4} → 0,

√N(H*^{-1}F̃*_t − F̃_t) →_d* N(0, Π_t),

in probability, where Π_t = V^{-1} Q Γ_t Q′ V^{-1} is the asymptotic covariance matrix of √N(F̃_t − HF_t).

Corollary 4.1 justifies using the bootstrap to construct confidence intervals for the rotated factors HF_t provided Conditions A* and B* hold. These are high-level conditions that can be checked for any particular bootstrap scheme used to generate e*_it. We verify them for a wild bootstrap in Section 5 when proving the consistency of bootstrap prediction intervals for the conditional mean. The fact that the factors and factor loadings are not separately identified implies the need to rotate the bootstrap estimated factors in order to consistently estimate the distribution of the sample factor estimates, i.e. we use √N(H*^{-1}F̃*_t − F̃_t) to approximate the distribution of √N(F̃_t − HF_t). A similar rotation was discussed in Gonçalves and Perron (2013) in the context of bootstrapping the regression coefficients δ̂.

5 Validity of bootstrap prediction intervals

5.1 Intervals for y_{T+1|T}

When prediction intervals for the conditional mean y_{T+1|T} are the object of interest, we use a two-step wild bootstrap scheme, as in Gonçalves and Perron (2013). Specifically, we rely on Algorithm 1 and we let

ε*_{t+1} = ε̂_{t+1} v_{t+1},  t = 1, ..., T − 1,  with v_{t+1} i.i.d.(0, 1),

and

e*_it = ẽ_it η_it,  t = 1, ..., T,  i = 1, ..., N,

where η_it is i.i.d.(0, 1) across (i, t), independently of v_{t+1}. To prove the asymptotic validity of this method we strengthen Assumptions 1-4 as follows.

Assumption 5 The loadings λ_i are either deterministic such that ‖λ_i‖ ≤ M < ∞, or stochastic such that E‖λ_i‖^{12} ≤ M < ∞ for all i; E‖F_t‖^{12} ≤ M < ∞; E|e_it|^{12} ≤ M < ∞ for all (i, t); and for some q > 1, E|ε_{t+1}|^{2q} ≤ M < ∞ for all t.

Assumption 6 E(e_it e_js) = 0 if i ≠ j.

With h = 1, our Assumption 4(b) on ε_{t+h} becomes a martingale difference sequence assumption, and this motivates the use of the wild bootstrap for ε*_{t+1}. This assumption rules out serial correlation in ε_{t+1} but allows for conditional heteroskedasticity. Devising a bootstrap scheme that is valid for h > 1 would require some blocking method and is the subject of ongoing research. Some simulation results for a particular scheme will be provided below. Assumption 6 imposes the absence of cross-sectional correlation in the idiosyncratic errors and motivates the use of the wild bootstrap for e*_it.

As the results in the previous sections show, prediction intervals for y_{T+h|T} or y_{T+h} are a function of the factors estimation uncertainty even when this source of uncertainty is asymptotically negligible for the estimation of the distribution of the regression coefficients (i.e. even when √T/N → c = 0). Since the factors estimation uncertainty depends on the cross-sectional correlation of the idiosyncratic errors e_it via Γ_T = lim_{N→∞} Var(N^{-1/2} Σ_{i=1}^N λ_i e_iT), bootstrap prediction intervals need to mimic this form of correlation to be asymptotically valid. Contrary to the pure time series context, a natural ordering does not exist in the cross-sectional dimension, which implies that proposing a nonparametric bootstrap method (e.g. a block bootstrap) that replicates the cross-sectional dependence is challenging if a parametric model is not assumed. Therefore, we follow Gonçalves and Perron (2013) and use a wild bootstrap to generate e*_it under Assumption 6.

The bootstrap percentile-t method, as described in Algorithm 1, requires the choice of two standard errors, B̂_T and its bootstrap analogue B̂*_T. To compute B̂_T we use the formula of Section 2, where Σ̂_δ is the heteroskedasticity-robust estimator given there and Σ̂_{F̃_T} = Ṽ^{-1} Γ̃_T Ṽ^{-1}, with Γ̃_T = N^{-1} Σ_{i=1}^N λ̃_i λ̃_i′ ẽ²_iT. This is estimator (a) in Bai and Ng (2006), and it is a consistent estimator of a rotated version of Γ_T = lim_{N→∞} Var(N^{-1/2} Σ_{i=1}^N λ_i e_iT) under Assumption 6. We compute B̂*_T analogously, relying on the bootstrap analogues of Σ̂_δ and Σ̂_{F̃_T}.
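To fix ideas, the two-step wild bootstrap just described (Algorithm 1 with Rademacher draws for η_it and v_{t+1}, which are i.i.d.(0, 1)) can be sketched as below on simulated data. This is a simplified illustration, not the paper's procedure: for brevity it forms an equal-tailed interval from the raw bootstrap forecast errors rather than the percentile-t studentization with B̂_T and B̂*_T:

```python
import numpy as np

def pca_factors(X, r):
    """sqrt(T) x leading eigenvectors of XX'/(NT); loadings X'F/T."""
    T, N = X.shape
    eigval, eigvec = np.linalg.eigh(X @ X.T / (N * T))
    idx = np.argsort(eigval)[::-1][:r]
    F = np.sqrt(T) * eigvec[:, idx]
    return F, X.T @ F / T

def wild_bootstrap_interval(X, y, W, r, level=0.95, B=199, seed=0):
    """Sketch of Algorithm 1 with the two-step wild bootstrap (h = 1)."""
    rng = np.random.default_rng(seed)
    T, N = X.shape
    F, Lam = pca_factors(X, r)
    Z = np.hstack([F, W])
    delta = np.linalg.lstsq(Z[:-1], y[1:], rcond=None)[0]
    y_hat = Z[-1] @ delta                      # point forecast of y_{T+1|T}
    eps_hat = y[1:] - Z[:-1] @ delta           # regression residuals
    e_tilde = X - F @ Lam.T                    # idiosyncratic residuals

    err = np.empty(B)
    for b in range(B):
        eta = rng.choice([-1.0, 1.0], size=(T, N))        # e*_it = e~_it eta_it
        v = rng.choice([-1.0, 1.0], size=T - 1)           # eps*_{t+1} = eps^_{t+1} v_{t+1}
        Xs = F @ Lam.T + e_tilde * eta
        ys = Z[:-1] @ delta + eps_hat * v
        Fs, _ = pca_factors(Xs, r)                        # bootstrap factors
        Zs = np.hstack([Fs, W])
        ds = np.linalg.lstsq(Zs[:-1], ys, rcond=None)[0]  # bootstrap OLS
        err[b] = Zs[-1] @ ds - Z[-1] @ delta   # y*_hat_{T+1|T} - y*_{T+1|T}

    q_lo, q_hi = np.quantile(err, [(1 - level) / 2, (1 + level) / 2])
    return y_hat - q_hi, y_hat - q_lo          # equal-tailed interval

rng = np.random.default_rng(3)
T, N, r = 80, 40, 1
F0 = rng.standard_normal((T, r))
X = F0 @ rng.standard_normal((r, N)) + rng.standard_normal((T, N))
y = np.concatenate([[0.0], (F0 @ np.ones(r))[:-1]]) + 0.5 * rng.standard_normal(T)
W = np.ones((T, 1))
lo, hi = wild_bootstrap_interval(X, y, W, r, B=99)
print(lo < hi)
```

Note that the forecast errors `err` can be compared across bootstrap draws without applying the rotation H*, since forecasts are invariant to the rotation of the factors, as discussed in Section 2.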

Theorem 5.1. Suppose Assumptions 1–6 hold and we use Algorithm 1 with ε*_{t+1} = ε̂_{t+1} v_{t+1} and e*_{it} = ẽ_{it} η_{it}, where v_{t+1} is i.i.d.(0, 1) for all t = 1, …, T − 1, η_{it} is i.i.d.(0, 1) for all i = 1, …, N; t = 1, …, T, and v_{t+1} and η_{it} are mutually independent. Moreover, assume that E|η_it|⁴ ≤ C for all (i, t) and E|v_{t+1}|⁴ ≤ C for all t. If √T/N → c, where 0 ≤ c < ∞, and √N/T^{3/4} → 0, then conditional on {y_t, X_t, W_t : t = 1, …, T},

(ŷ*_{T+1} − ŷ_{T+1}) / √B̂* →d* N(0, 1), in probability.

Remark. The rate restriction √N/T^{3/4} → 0 is slightly stronger than the rate √N/T → 0 used by Bai (2003). It is weaker than the corresponding restriction used in our earlier theorem and corollary because we have strengthened the number of factor moments assumed to exist (compare Assumption 5 with Assumption 1(a)). See the Remark in the Appendix.

5.2 Intervals for y_{T+1}

In this section we provide a theoretical justification for bootstrap prediction intervals for y_{T+1} as described in Algorithm 2. We add the following assumption.

Assumption 7. ε_{t+1} is i.i.d.(0, σ²_ε) with a continuous distribution function F_ε(x) = P(ε_{t+1} ≤ x).

Assumption 7 strengthens the m.d.s. assumption by requiring the regression errors to be i.i.d. However, and contrary to Bai and Ng (2006), F_ε does not need to be Gaussian. The continuity assumption on F_ε is used below to prove that the Kolmogorov distance between the bootstrap distribution of the studentized forecast error and the distribution of its sample analogue converges in probability to zero. Let the studentized forecast error be defined as

s_{T+1} ≡ (y_{T+1} − ŷ_{T+1}) / √(B̂ + σ̂²_ε),

where σ̂²_ε is a consistent estimate of σ²_ε = Var(ε_{T+1}) and B̂ = V̂ar(ŷ_{T+1}) = ẑ_T′ Σ̂_δ ẑ_T / T + α̂′ Σ̂_F α̂ / N. Given Assumption 7, we use σ̂²_ε = (T − 1)^{−1} Σ_{t=1}^{T−1} ε̂²_{t+1} and Σ̂_δ = σ̂²_ε ( T^{−1} Σ_t ẑ_t ẑ_t′ )^{−1}. Our goal is to show that the bootstrap can be used to estimate consistently F_{T,s}(x) = P(s_{T+1} ≤ x), the distribution function of s_{T+1}. Note that we can write

y_{T+1} − ŷ_{T+1} = (y_{T+1} − y_{T+1|T}) + (y_{T+1|T} − ŷ_{T+1}) = ε_{T+1} + O_P(1/δ_NT),

given that ŷ_{T+1} − y_{T+1|T} = O_P(1/δ_NT), where δ_NT = min(√N, √T) (this follows under the assumptions of Theorem 5.1). Since σ̂²_ε →P σ²_ε and B̂ = O_P(1/δ²_NT) = o_P(1), it follows that

s_{T+1} = ε_{T+1}/σ_ε + o_P(1).

Thus, as N, T → ∞, s_{T+1} converges in distribution to the random variable ε_{T+1}/σ_ε, i.e.

F_{T,s}(x) ≡ P(s_{T+1} ≤ x) → P(ε_{T+1}/σ_ε ≤ x) = F_ε(xσ_ε) ≡ F_{∞,s}(x), for all x ∈ R.

If we assume that ε_{t+1} is i.i.d. N(0, σ²_ε), as in Bai and Ng (2006), then F_ε(xσ_ε) = Φ(xσ_ε/σ_ε) = Φ(x), implying that F_{T,s}(x) → Φ(x), i.e. s_{T+1} →d N(0, 1). Nevertheless, this is not generally true unless we make the Gaussianity assumption. We note that although asymptotically the variance of the prediction error y_{T+1} − ŷ_{T+1} does not depend on parameter nor factors estimation uncertainty (as it is dominated by σ²_ε for large N and T), we still suggest using Ĉ = √(B̂ + σ̂²_ε) to studentize y_{T+1} − ŷ_{T+1}, since σ̂²_ε alone will underestimate the true forecast variance for finite N and T. Next we show that the bootstrap yields a consistent estimate of the distribution of s_{T+1} without assuming that ε_{t+1} is Gaussian. Our proposal is based on a two-step residual-based bootstrap scheme, as described in Algorithm 2, where in step 3 we generate {ε*_2, …, ε*_T, ε*_{T+1}} as a random sample obtained from the centered residuals {ε̂_2 − ε̄, …, ε̂_T − ε̄}, with ε̄ the sample mean of the residuals. Resampling in an i.i.d. fashion is justified under Assumption 7. We recenter the residuals because ε̄ is not necessarily zero unless W_t contains a constant regressor. Nevertheless, since ε̄ = o_P(1), resampling the uncentered residuals is also asymptotically valid in our context. We compute B̂* and σ̂*²_ε using the bootstrap analogues of Σ̂_δ and σ̂²_ε introduced above. Note that σ̂*²_ε is a consistent estimator of σ²_ε and B̂* = o_P*(1), in probability. As above, we can write

y*_{T+1} − ŷ*_{T+1} = ε*_{T+1} + O_P*(1/δ_NT), in probability,

which in turn implies

s*_{T+1} ≡ (y*_{T+1} − ŷ*_{T+1}) / √(B̂* + σ̂*²_ε) = ε*_{T+1}/σ_ε + o_P*(1).

Thus, F*_{T,s}(x) = P*(s*_{T+1} ≤ x), the bootstrap distribution of s*_{T+1} conditional on the sample, is asymptotically the same as the bootstrap distribution of ε*_{T+1}/σ_ε. Let F*_{T,ε} denote the bootstrap distribution function of ε*_t. It is clear from the two stochastic expansions above that the crucial step is to show that ε*_{T+1} converges weakly in probability to ε_{T+1}, i.e.
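Step 3 of this residual-based scheme amounts to the following (a sketch; the residuals are simulated stand-ins):

```python
import numpy as np

def iid_resample_residuals(eps_hat, size, rng):
    """Draw eps*_2, ..., eps*_{T+1} i.i.d. from the centered residuals.
    Centering matters because the residual mean need not be zero
    unless W_t contains a constant regressor."""
    centered = eps_hat - eps_hat.mean()
    return rng.choice(centered, size=size, replace=True)

rng = np.random.default_rng(2)
eps_hat = rng.standard_normal(100) + 0.3  # stand-in residuals, nonzero mean
eps_star = iid_resample_residuals(eps_hat, size=100_000, rng=rng)
```

By construction the bootstrap errors have conditional mean exactly zero, so the bootstrap distribution mimics the empirical distribution of the residuals without a location distortion.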
d(F*_{T,ε}, F_ε) →P 0,

for any metric d that metrizes weak convergence. In the following we use Mallows metric, which is defined as d₂(F_X, F_Y) = inf (E|X − Y|²)^{1/2}, where the infimum is over all joint distributions of the random variables (X, Y) having marginal distributions F_X and F_Y, respectively.

Lemma 5.1. Under Assumptions 1–7, and as N, T → ∞ such that √T/N → c, 0 ≤ c < ∞, d₂(F*_{T,ε}, F_ε) →P 0.

Corollary 5.1. Under the same assumptions as Theorem 5.1, strengthened by Assumption 7, we have that

sup_{x∈R} |F*_{T,s}(x) − F_{T,s}(x)| → 0, in probability.
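For two equal-sized empirical distributions, the Mallows (Wasserstein-2) distance has a closed form: the infimum over couplings is attained by pairing order statistics. A sketch:

```python
import numpy as np

def mallows_d2(x, y):
    """Mallows/Wasserstein-2 distance between the empirical
    distributions of two equal-length samples: the optimal coupling
    pairs sorted values."""
    x, y = np.sort(x), np.sort(y)
    return np.sqrt(np.mean((x - y) ** 2))

rng = np.random.default_rng(3)
a = rng.standard_normal(50_000)
b = rng.standard_normal(50_000)        # same distribution as a
c = rng.standard_normal(50_000) + 1.0  # location-shifted by 1
d_same, d_shift = mallows_d2(a, b), mallows_d2(a, c)
```

For two distributions differing only by a location shift, d₂ equals the size of the shift, so d_shift should be near 1 while d_same shrinks toward 0 as the sample size grows; this is the sense in which d₂ → 0 captures both weak convergence and convergence of second moments.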

Corollary 5.1 implies the asymptotic validity of the equal-tailed and symmetric bootstrap prediction intervals. Specifically, we can show that P(y_{T+1} ∈ EQ_{1−α}) → 1 − α and P(y_{T+1} ∈ SY_{1−α}) → 1 − α as N, T → ∞. See e.g. Beran (1987) and Wolf and Wunderli (2015), Proposition 3.1. For instance,

P(y_{T+1} ∈ EQ_{1−α}) = P(q*_{α/2} ≤ s_{T+1} ≤ q*_{1−α/2}) = P(F*_{T,s}(s_{T+1}) ≤ 1 − α/2) − P(F*_{T,s}(s_{T+1}) < α/2).

Given Corollary 5.1, we have that F*_{T,s}(s_{T+1}) = F_{T,s}(s_{T+1}) + o_P(1), and we can show that F_{T,s}(s_{T+1}) →d U[0, 1]. Indeed, for any x, P(F_{T,s}(s_{T+1}) ≤ x) = P(s_{T+1} ≤ F^{−1}_{T,s}(x)) = F_{T,s}(F^{−1}_{T,s}(x)) = x.

6 Simulations

We now report results from a simulation experiment to analyze the properties of the normal asymptotic intervals as well as their bootstrap counterparts analyzed above. The data-generating process (DGP) is similar to the one used in Gonçalves and Perron (2013). We consider the single-factor model:

y_{t+1} = αF_t + ε_{t+1},

where F_t is drawn from a standard normal distribution independently over time. Because we are interested in constructing intervals for y_{T+1}, which depend on the distribution of ε_{T+1} and require homoskedasticity and independence over time, the regression error ε_{t+1} will be homoskedastic and independent over time with expectation 0 and variance 1. To analyze the effects of deviations from normality, we have considered five distributions for ε_{t+1}:

Normal: ε_t ~ N(0, 1)
χ²: ε_t ~ [χ²(1) − 1]/√2
Uniform: ε_t ~ U(−√3, √3)
Exponential: ε_t ~ Exp(1) − 1
Mixture: ε_t ~ [p N(−1, 1) + (1 − p) N(9, 1)] (standardized), p ~ Bernoulli(0.9),

but we will report results only for the normal and mixture distributions as they illustrate our conclusions. Results for the other cases are available from the authors upon request. The particular mixture distribution we use was proposed by Pascual, Romo and Ruiz (2001). Most of the data is drawn from a N(−1, 1), but about 10% comes from a second normal with a much larger mean of 9. The matrix of panel variables is generated as:

X_it = λ_i F_t + e_it,

where λ_i is drawn from a U[0, 1] distribution independently across i, and e_it is heteroskedastic but independent over i and t. The variance of e_it is drawn from U[0.5, 1.5] for each i.
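This design can be sketched as follows (a sketch only: the mixture rescaling and the support of the idiosyncratic variances are assumptions here, chosen so that the stated mean-zero/unit-variance requirement holds):

```python
import numpy as np

def simulate_dgp(N, T, alpha=1.0, mixture=False, rng=None):
    """Single-factor DGP: y_{t+1} = alpha*F_t + eps_{t+1},
    X_{it} = lambda_i*F_t + e_{it}."""
    if rng is None:
        rng = np.random.default_rng()
    F = rng.standard_normal(T)                 # factor, N(0,1) over time
    if mixture:
        # contaminated normal: ~90% from N(-1,1), ~10% from N(9,1);
        # mean is already 0, variance = 0.9*2 + 0.1*82 = 10, so rescale
        comp = rng.random(T) < 0.9
        eps = np.where(comp, rng.normal(-1.0, 1.0, T),
                       rng.normal(9.0, 1.0, T))
        eps = eps / np.sqrt(10.0)
    else:
        eps = rng.standard_normal(T)
    lam = rng.uniform(0.0, 1.0, N)             # loadings, U[0,1] across i
    sig2 = rng.uniform(0.5, 1.5, N)            # heteroskedastic over i
    e = rng.standard_normal((T, N)) * np.sqrt(sig2)
    X = np.outer(F, lam) + e                   # T x N panel of predictors
    y_next = alpha * F + eps                   # y_{t+1} aligned with F_t
    return y_next, F, X
```

The contamination makes the error distribution strongly right-skewed while keeping its first two moments fixed, which is exactly the feature that separates the bootstrap intervals from the normal asymptotic ones in the results below.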
As in Gonçalves and Perron (2013), we consider two values for the coefficient α, either 0 or 1. When α = 0, the OLS estimator of α is unbiased. When α = 1, the OLS estimator of α is biased unless N is large. We consider asymptotic and bootstrap confidence intervals at a nominal level of 95%. We use Algorithms 1 and 2 described above to generate the bootstrap data, with B = 99 bootstrap replications.

6.1 Results

We consider the same two types of intervals analyzed above: symmetric percentile-t and equal-tailed percentile-t. We report experiments based on 1,000 replications, with two values for N and five values for T. We report graphically four sets of results: those for our quantities of interest, y_{T+1} and the conditional mean ŷ_{T+1}, and those for their components, the rotated factor at the end of the sample, HF_T, and the rotated coefficient, as these help understand some of the behavior. In the first two cases, we report the frequency with which the true parameter is to the left or to the right of the corresponding 95% confidence interval. For the rotated factor and coefficient, we only report the total rejection rate, as the side of rejection is not identified because of the sign indeterminacy associated with H. Each figure has two rows corresponding to the two values of N. The distribution of ε_{T+1} noticeably affects the results for y_{T+1} only. As a consequence, we only report results with Gaussian ε_{t+1} for each of the other three quantities. On the other hand, the results for y_{T+1} are dominated by the behavior of ε_{T+1}. Given our single-factor model, the asymptotic variance of the conditional mean is:

Var(ŷ_{T+1}) = F_T² Var(δ̂ − δ) + δ² Var(F̃_T − HF_T),

and in our setup both components shrink at rates 1/T and 1/N, respectively. So even in the worst scenario with the smallest N and T, this variance is small (the unconditional expectation of F_T² is 1) compared to the variance of 1 of ε_{T+1} in all setups. This allows us to easily separate the contribution of the conditional mean from the contribution of ε in the forecasts of y_{T+1}.

Forecast of y_{T+1}

We report four figures for this quantity. The first two correspond to the case where α = 0, with ε either normal or from the mixture, while the second two figures provide the same information when α = 1. As mentioned above, the behavior of the forecast intervals is dominated by the distribution of ε_{T+1}, so the value of α is not very important. This means that Figures 1 and 3 are almost identical, and so are Figures 2 and 4.
Figures 1 and 3 show that under normality, inference for y_{T+1} is quite accurate, and it is essentially unaffected by the values of N and T, as predicted, since it is dominated by the behavior of ε. Figures 2 and 4 reveal that when the errors are drawn from a mixture of normals, we see overrejections with asymptotic theory, and these come almost exclusively from the right side. The equal-tailed intervals improve the underrejection on the left side to a large extent and pretty much eliminate the overrejection on the right side. In results we do not report, the findings are similar for the other non-normal distributions. We get underrejection with uniform errors, symmetrically on both sides, while in the case of χ² and exponential errors, we get overall rejection rates that are essentially correct, but all rejections are on the right side because of the asymmetry of the distributions. Again, the equal-tailed bootstrap intervals perform best because they capture some of the asymmetry.

Conditional mean, ŷ_{T+1}

The results for the conditional mean are presented in Figures 5 and 6. Because the results are quite insensitive to the distribution of ε_{T+1}, we only report results for the normal distribution. Figure 5 reports results for the case where α = 0, and Figure 6 does the same for α = 1.

With α = 0, asymptotic intervals underreject the true conditional mean. This can be explained by the fact that the true variance only depends on the variance of the coefficient, but when estimating it, the second term (the factor-uncertainty term) contributes because α̂ is obviously not 0 in finite samples. Thus, the variance is overestimated, and the t-statistics are smaller than they should be. We see this underrejection disappearing as the sample grows, as expected. In Figure 6, we see that with α = 1, there is some overrejection with asymptotic theory (a 95% confidence interval would cover the true conditional mean roughly 90% of the time for the smallest N and T), but this disappears as N and T increase. This is due to a large bias in the estimation of the coefficient, documented by Gonçalves and Perron (2013), that we discuss below. This bias comes from the estimation of the factors and is reduced as N increases. The bootstrap corrects some of these overrejections, in particular the symmetric intervals.

Factor, HF_T

Figure 7 presents results for inference on the rotated factor at the end of the sample. Obviously, this is independent of both the value of the parameter α and of the distribution of ε, so we only report one figure. As mentioned above, the side of rejection is not well defined as it depends on the sign of H. A left-side rejection with a positive H would be a right-side rejection with a negative H. Since the sign of H is arbitrary, the side of rejection is also arbitrary. Thus, only total rejections are meaningful, and this is the quantity we report. The first thing to note is that there are notable size distortions, in particular for the smaller value of N: the true rotated factor falls outside the confidence interval considerably more often than the nominal 5% rate. For this smaller value of N, rejection rates are not sensitive to the value of T. For the larger value of N, size distortions are much smaller, and we do see a slight decrease as T increases. Again, the bootstrap succeeds in correcting these distortions to a large extent.
Symmetric percentile-t intervals are more accurate but still suffer from distortions for small values of N. The equal-tailed intervals are less affected by the value of N, and they also offer a notable improvement over asymptotic theory.

Coefficient

Finally, we report the results for the coefficient in Figures 8 (α = 0) and 9 (α = 1). These results are in line with those of Gonçalves and Perron (2013). Asymptotic theory works well in all cases when α = 0 (Figure 8). The bootstrap has some problems because using the estimated OLS coefficient in the bootstrap DGP introduces a bias in the bootstrap world. The equal-tailed intervals are particularly affected, and this effect diminishes as N and T increase since the OLS estimator concentrates around its true value of 0. When α = 1, a bias appears in the estimation, and this gets reflected in overrejections (the rejection rate is about 33% instead of the nominal 5% for the smallest N and T), which disappear as we increase N. The bootstrap corrects most of this bias and reduces the rejection rate substantially for both the symmetric and the equal-tailed intervals; the two types of intervals are remarkably close. One last interesting observation is that the behavior of the individual components does not translate directly into the prediction intervals. For example, the asymptotic confidence intervals for both the factor and the coefficient tend to overreject with α = 1, but the interval for the conditional mean is reasonably accurate. This is because, contrary to what the theory assumes, there is a negative correlation between the two components of the forecast error in finite samples. This negative correlation arises from the presence of F̃_T in the first term and of F̃_T − HF_T in the second term. This means that the variance

of the sum of the two terms is less than the sum of the variances, as is assumed in the asymptotic approximation. In summary, the bootstrap is particularly useful in two situations. The first is in constructing prediction intervals for y_{T+1} when the distribution of the innovation is non-Gaussian and its variance is large relative to the uncertainty around the conditional mean. The second is in constructing confidence intervals for the conditional mean. With a non-zero coefficient on the factors, the OLS estimator is severely biased unless N is large, and the bootstrap can correct this bias, as documented by Gonçalves and Perron (2013).

6.2 Multi-horizon forecasting

In a second set of experiments, we look at cases where h > 1. We introduce dynamics by letting the factor be an autoregressive process of order 1:

F_t = γF_{t−1} + u_t,

and y_t as:

y_{t+1} = αF_t + v_{t+1}.

Both error terms are i.i.d., and the autoregressive parameter γ is held fixed across designs. We look at the same sample sizes as before. We consider inference for y_{T+h} and ŷ_{T+h} based on direct forecasts for several horizons h. In other words, we regress y_{t+h} on y_t and F̃_t for t = 1, …, T − h:

y_{t+h} = δ_h′ ẑ_t + ε_{t+h},

and generate ŷ_{T+h} = δ̂_h′ ẑ_T, where δ̂_h is the OLS estimate of the projection coefficient; the coefficient on the factor equals αγ^{h−1} for any h. Note this coefficient converges to 0 as h grows, and as a consequence, the magnitude of the bias of the OLS estimator also shrinks with the forecasting horizon. The error term ε_{t+h} is serially correlated up to order h − 1 in this design. Asymptotic theory is conducted in two ways, either with a homoskedastic covariance matrix as before, or using a HAC estimator (quadratic spectral kernel with pre-whitening and Andrews bandwidth selection) to account for serial correlation. We use the wild bootstrap for drawing e*_it as above. For generating v*_{t+h}, we consider two schemes. The first one is the i.i.d. bootstrap and is the same as above. It is obviously not valid for h > 1 as it does not reproduce the serial correlation.
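The second scheme, introduced next, keeps the dependence by multiplying all residuals within non-overlapping blocks of h consecutive observations by a common external draw. A minimal sketch (standard normal multipliers assumed):

```python
import numpy as np

def wild_block_bootstrap(eps_hat, h, rng):
    """Blockwise wild bootstrap: split the residuals into
    non-overlapping blocks of h consecutive observations and give
    every observation in a block the same external N(0,1) multiplier,
    preserving within-block serial dependence."""
    T = eps_hat.shape[0]
    n_blocks = -(-T // h)                    # ceil(T/h)
    eta = np.repeat(rng.standard_normal(n_blocks), h)[:T]
    return eps_hat * eta

rng = np.random.default_rng(6)
eps_hat = rng.standard_normal(100)  # stand-in direct-forecast residuals
eps_star = wild_block_bootstrap(eps_hat, h=4, rng=rng)
```

For s and t in the same block, the conditional expectation of ε*_s ε*_t equals ε̂_s ε̂_t, so the sample autocovariances up to lag h − 1 are (partially) preserved, which the i.i.d. scheme destroys.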
The second scheme is what we call a wild block bootstrap. In this case, we separate the sample residuals ε̂_{t+h} into non-overlapping blocks of h consecutive observations and multiply each observation within a block by the same draw of an external variable. In other words, we generate the error term as:

ε*_t = ε̂_t η_t, with η_1 = … = η_h, η_{h+1} = … = η_{2h}, …, η_{T−h+1} = … = η_T.

We report coverage rates based on symmetric t intervals. While we do not have a formal proof of validity, we expect this scheme to be valid in this context as it preserves the serial dependence among the residuals. The simulation results in Figures 10 to 13 are supportive of this claim. Each figure is organized similarly. The results in the left column are for the smaller N, while those in the right column are for the larger N. Each row provides results for a given horizon. Serial correlation mostly affects inference on the coefficients (Figure 13). For the coefficient on the factors, the large bias documented above adds to the distortions induced by serial correlation that become apparent for h ≥ 3. Autocorrelation-robust inference reduces the distortions due to serial

correlation, but not those from the bias. The bootstrap reduces the bias, and the wild block bootstrap appears to be best. Serial correlation does not seem to affect inference on y_{T+h}. There are some effects when T is small, but this seems to be a finite-sample issue related to estimating the distribution of ε_{T+h}. For the conditional mean, serial correlation affects asymptotic inference to some extent. The HAC correction reduces the distortions, but again the bootstrap helps for small values of T. The wild block bootstrap seems to be best, but the gain is not dramatic.

7 Conclusion

In this paper, we have proposed the bootstrap to construct valid prediction intervals for models involving estimated factors. We considered two objects of interest: the conditional mean y_{T+h|T} and the realization y_{T+h}. The bootstrap improves considerably on asymptotic theory for the conditional mean when the factors are relevant because of the bias in the estimation of the regression coefficients. For the realization, the importance of the bootstrap lies in providing intervals without having to make strong distributional assumptions such as normality, as was done in previous work by Bai and Ng (2006). One key assumption that we had to make to establish our results is that the idiosyncratic errors in the factor model are cross-sectionally independent. This is certainly restrictive, but it allows for the use of the wild bootstrap on the idiosyncratic errors. Nonparametric bootstrapping under more general conditions remains a challenge. The results in this paper could be used to prove the validity of a scheme in that context by showing that Conditions A and B are satisfied.

A Appendix

The proof of Theorem 5.1 requires the following auxiliary result, which is the bootstrap analogue of Lemma A.1 of Bai (2003). It is based on the following identity, which holds for each t:

F̃*_t − H* F̃_t = Ṽ*^{−1} ( T^{−1} Σ_{s=1}^T F̃*_s γ*_st + T^{−1} Σ_{s=1}^T F̃*_s ζ*_st + T^{−1} Σ_{s=1}^T F̃*_s η*_st + T^{−1} Σ_{s=1}^T F̃*_s ξ*_st ) ≡ Ṽ*^{−1} (A*_1t + A*_2t + A*_3t + A*_4t),

where

γ*_st = E*( N^{−1} Σ_{i=1}^N e*_is e*_it ),
ζ*_st = N^{−1} Σ_{i=1}^N ( e*_is e*_it − E*(e*_is e*_it) ),
η*_st = N^{−1} Σ_{i=1}^N λ̃_i′ F̃_s e*_it = F̃_s′ (Λ̃′ e*_t / N), and
ξ*_st = N^{−1} Σ_{i=1}^N λ̃_i′ F̃_t e*_is = η*_ts.

Lemma A.1. Assume Assumptions 1–6 hold. Under Condition A*, we have that for each t, in probability, as N, T → ∞:

(a) T^{−1} Σ_s F̃*_s γ*_st = O_P*(T^{−1/2} δ_NT^{−1}) + O_P*(T^{−3/4});
(b) T^{−1} Σ_s F̃*_s ζ*_st = O_P*(δ_NT^{−2});
(c) T^{−1} Σ_s F̃*_s η*_st = O_P*(N^{−1/2});
(d) T^{−1} Σ_s F̃*_s ξ*_st = O_P*(δ_NT^{−2}).

Remark. The term O_P*(T^{−3/4}) that appears in (a) is of a larger order of magnitude than the corresponding term in Bai (2003), Lemma A.1(i), which is O_P(T^{−1}). The reason why we obtain this larger term is that we rely on Bonferroni's inequality and Chebyshev's inequality to bound max_{s≤T} ‖F_s‖ = O_P(T^{1/4}), using the fourth-order moment assumption on F_s (cf. Assumption 5). In general, if E‖F_s‖^q ≤ M for all s, then max_{s≤T} ‖F_s‖ = O_P(T^{1/q}) and we would obtain a term of order O_P(T^{1/q−1}).

Proof of Lemma A.1. The proof follows closely that of Lemma A.1 of Bai (2003). The only exception is (a), where an additional O_P*(T^{−3/4}) term appears. In particular, we write

T^{−1} Σ_s F̃*_s γ*_st = T^{−1} Σ_s (F̃*_s − H* F̃_s) γ*_st + H* T^{−1} Σ_s F̃_s γ*_st ≡ a*_t + b*_t.

We use Cauchy–Schwarz and Condition A*1 to bound a*_t as follows:

‖a*_t‖ ≤ ( T^{−1} Σ_s ‖F̃*_s − H* F̃_s‖² )^{1/2} ( T^{−1} Σ_s γ*²_st )^{1/2} = O_P*(δ_NT^{−1}) O_P*(T^{−1/2}) = O_P*(T^{−1/2} δ_NT^{−1}),

where T^{−1} Σ_s ‖F̃*_s − H* F̃_s‖² = O_P*(δ_NT^{−2}) by Lemma 3.1 of Gonçalves and Perron (2013) (note that this lemma only requires their Conditions A*(b), A*(c) and B*(d), which correspond to parts of our Condition A*). For b*_t, we have that (ignoring H*, which is O_P*(1))

b*_t = T^{−1} Σ_s F̃_s γ*_st = T^{−1} Σ_s (F̃_s − HF_s) γ*_st + H T^{−1} Σ_s F_s γ*_st ≡ b*_1t + b*_2t,

where b*_1t = O_P*(T^{−1/2} δ_NT^{−1}), using the fact that T^{−1} Σ_s ‖F̃_s − HF_s‖² = O_P(δ_NT^{−2}) under Assumptions 1 and 2, and the fact that T^{−1} Σ_s γ*²_st = O_P(T^{−1}) for each t by Condition A*1(b). For b*_2t, note that (ignoring H = O_P(1))

‖b*_2t‖ ≤ max_{s≤T} ‖F_s‖ · T^{−1} Σ_s |γ*_st| = O_P(T^{1/4}) O_P(T^{−1}) = O_P(T^{−3/4}),

where we have used the fact that E‖F_s‖⁴ ≤ M for all s (Assumption 5) to bound max_{s≤T} ‖F_s‖. Indeed, by Bonferroni's inequality and Chebyshev's inequality, we have that

P( T^{−1/4} max_{s≤T} ‖F_s‖ > M ) ≤ Σ_{s=1}^T P( ‖F_s‖ > T^{1/4} M ) ≤ Σ_{s=1}^T E‖F_s‖⁴ / (T M⁴) ≤ M̄ / M⁴,

which is arbitrarily small for M sufficiently large. For (b), we follow exactly the proof of Bai (2003) and use the second part of Condition A*1 to bound T^{−1} Σ_s ζ*²_st for each t; similarly, we use Condition A*3 to bound T^{−1} Σ_s F̃_s ζ*_st for each t. For (c), we bound T^{−1} Σ_s F̃*_s η*_st, which behaves like H*′ (N^{−1} Σ_i λ̃_i e*_it) = O_P*(N^{−1/2}), by using Condition A*; this same condition is used to bound T^{−1} Σ_s η*²_st = O_P*(N^{−1}) for each t. Finally, for part (d), we use Condition A* to bound T^{−1} Σ_s F̃*_s ξ*_st for each t, and to bound T^{−1} Σ_s ξ*²_st = O_P*(N^{−1}) for each t.

Proof of Theorem (bootstrap distribution of the estimated factors). By Lemma A.1, it follows that the third term in F̃*_t − H* F̃_t is the dominant one (it is O_P*(N^{−1/2})); the first term is O_P*(T^{−1/2} δ_NT^{−1}) + O_P*(T^{−3/4}) = o_P*(N^{−1/2}) if √N/T^{3/4} → 0, whereas the second and the fourth terms are O_P*(δ_NT^{−2}) = o_P*(N^{−1/2}) as N, T → ∞. Thus, we have that

√N (F̃*_T − H* F̃_T) = Ṽ*^{−1} ( T^{−1} Σ_s F̃*_s F̃_s′ ) N^{−1/2} Σ_i λ̃_i e*_iT + o_P*(1) = H* Ṽ*^{−1} Γ*_T^{1/2} [ Γ*_T^{−1/2} N^{−1/2} Σ_i λ̃_i e*_iT ] + o_P*(1),

where the bracketed term →d* N(0, I_r) by Condition A*, given the definition of H* = Ṽ*^{−1} (F̃*′F̃/T) (Λ̃′Λ̃/N) and the fact that Ṽ*^{−1} = O_P*(1). Since det(Γ*_T) > ε > 0 for all N and some ε, Γ*_T^{−1} exists and we can define Γ*_T^{−1/2} = (Γ*_T^{1/2})^{−1}, where Γ*_T^{1/2} Γ*_T^{1/2} = Γ*_T. Let Π*_T^{1/2} = Ṽ*^{−1} Γ*_T^{1/2} and

note that Π*_T^{1/2} is such that Π*_T^{1/2} Π*_T^{1/2}′ = Ṽ*^{−1} Γ*_T Ṽ*^{−1} ≡ Π*_T. The result follows by multiplying by Π*_T^{−1/2} H*^{−1} and using Condition A*.

Proof of Corollary. Condition B* and the fact that Ṽ* →P V under our assumptions imply that Π*_T →P Π_T ≡ V^{−1} Q Γ_T Q′ V^{−1}. This suffices to show the result.

Proof of Theorem 5.1. Using the decomposition of ŷ*_{T+1} − ŷ_{T+1} and the fact that ẑ*_T = Φ*′ ẑ_T + ((F̃*_T − H* F̃_T)′, 0′)′, where Φ* = diag(H*, I_q), it follows that

ŷ*_{T+1} − ŷ_{T+1} = ẑ_T′ Φ* (δ̂* − δ̂) + α̂′ H*^{−1} (F̃*_T − H* F̃_T) + r*_T,

where the remainder is r*_T = (F̃*_T − H* F̃_T)′ (α̂* − H*^{−1}′ α̂), which is asymptotically negligible. First, we argue that

(ŷ*_{T+1} − ŷ_{T+1}) / √B* →d* N(0, 1),

where B* is the asymptotic variance of ŷ*_{T+1} − ŷ_{T+1}, i.e.

B* = avar*(ŷ*_{T+1} − ŷ_{T+1}) = ẑ_T′ Σ_δ ẑ_T / T + α̂′ Π*_T α̂ / N.

To show this, we follow the arguments of Bai and Ng (2006), proof of their Theorem 3, and show that: (1) Z*_1 ≡ √T Φ* (δ̂* − δ̂) converges in distribution, in probability, to a Gaussian limit with variance Σ_δ (and a mean shift governed by c); (2) Z*_2 ≡ √N (F̃*_T − H* F̃_T) →d* N(0, Π_T), in probability; (3) Z*_1 and Z*_2 are asymptotically independent conditional on the original sample. Condition (1) follows from Gonçalves and Perron (2013) under Assumptions 1–5; (2) follows from the Corollary above provided √N/T^{3/4} → 0 and Conditions A* and B* hold for the wild bootstrap, which we verify next; (3) holds because we generate e*_t independently of ε*_{t+1}.

Proof of Condition A* for the wild bootstrap. Starting with A*1, note that A*1(a) was verified in Gonçalves and Perron (2013). We verify A*1(b) for t = T. We have that

T^{−1} Σ_s γ*²_sT = T^{−1} ( N^{−1} Σ_i ẽ²_iT )².

Thus, it suffices to show that N^{−1} Σ_i ẽ²_iT = O_P(1). This follows by using the decomposition

ẽ_it = e_it − λ̃_i′ (F̃_t − H F_t) − (λ̃_i − H′^{−1} λ_i)′ H F_t,

which implies that

ẽ²_it ≤ 3 e²_it + 3 ‖λ̃_i‖² ‖F̃_t − H F_t‖² + 3 ‖λ̃_i − H′^{−1} λ_i‖² ‖H F_t‖².

The first term (averaged over i) is O_P(1) given that E(e²_it) = O(1); the second term is O_P(1) since N^{−1} Σ_i ‖λ̃_i‖² = O_P(1) and ‖F̃_T − H F_T‖² = o_P(1); and the third term is O_P(1) given Lemma C.1(ii) of Gonçalves and Perron (2013) and the fact that ‖F_T‖ = O_P(1). Next, we verify A*2. Part (a)

was verified already by Gonçalves and Perron (2013), so we only need to check part (b) for t = T. Following the verification of condition A*(c) in Gonçalves and Perron (2013), we have that

T^{−1} Σ_s E*| N^{−1/2} Σ_i ( e*_iT e*_is − E*(e*_iT e*_is) ) |² = T^{−1} Σ_s N^{−1} Σ_i ẽ²_iT ẽ²_is Var(η_iT η_is) ≤ Δ_η T^{−1} Σ_s ( N^{−1} Σ_i ẽ⁴_iT )^{1/2} ( N^{−1} Σ_i ẽ⁴_is )^{1/2} = O_P(1),

where Δ_η bounds Var(η_iT η_is), the first factor can be bounded by an argument similar to that used above to bound N^{−1} Σ_i ẽ²_iT, and the second factor can be bounded by Lemma C.1(iii) of Gonçalves and Perron (2013). A*3 follows by an argument similar to that used by Gonçalves and Perron (2013) to verify Condition B*(b). In particular,

T^{−1} Σ_s E*‖ N^{−1/2} Σ_i F̃_s ( e*_is e*_iT − E*(e*_is e*_iT) ) ‖² = T^{−1} Σ_s ‖F̃_s‖² N^{−1} Σ_i ẽ²_is ẽ²_iT Var(η_is η_iT) ≤ Δ_η ( N^{−1} Σ_i ẽ⁴_iT )^{1/2} T^{−1} Σ_s ‖F̃_s‖² ( N^{−1} Σ_i ẽ⁴_is )^{1/2} = O_P(1),

under our assumptions. Conditions A*4 and A*5 correspond to Gonçalves and Perron's (2013) Conditions B*(c) and B*(d), respectively. Finally, we prove Condition A*6 for t = T. Using the fact that e*_iT = ẽ_iT η_iT, where η_iT is i.i.d.(0, 1) across i, note that

Γ*_T ≡ Var*( N^{−1/2} Σ_i λ̃_i e*_iT ) = N^{−1} Σ_i λ̃_i λ̃_i′ ẽ²_iT →P Q Γ_T Q′,

by the results of Bai (2003), where Γ_T ≡ lim_{N→∞} Var( N^{−1/2} Σ_i λ_i e_iT ) > 0 by assumption. Thus, Γ*_T is uniformly positive definite. We now need to verify that

l′ Γ*_T^{−1/2} N^{−1/2} Σ_i λ̃_i e*_iT = N^{−1/2} Σ_i ( l′ Γ*_T^{−1/2} λ̃_i ẽ_iT η_iT ) ≡ N^{−1/2} Σ_i ω_i →d* N(0, 1),

in probability, for any l such that l′l = 1. Since {ω_i} is a heterogeneous array of independent random variables (given that η_iT is i.i.d. across i), we apply a CLT for heterogeneous independent arrays. Note that E*(ω_i) = 0 and

N^{−1} Σ_i Var*(ω_i) = l′ Γ*_T^{−1/2} ( N^{−1} Σ_i λ̃_i λ̃_i′ ẽ²_iT ) Γ*_T^{−1/2} l = l′ Γ*_T^{−1/2} Γ*_T Γ*_T^{−1/2} l = l′l = 1.