Statistica Sinica Preprint No: SS-2017-0013

Title: A Bootstrap Method for Constructing Pointwise and Uniform Confidence Bands for Conditional Quantile Functions
Manuscript ID: SS-2017-0013
URL: http://www.stat.sinica.edu.tw/statistica/
DOI: 10.5705/ss.202017.0013
Complete List of Authors: Joel L. Horowitz and Anand Krishnamurthy
Corresponding Author: Joel L. Horowitz
E-mail: joel-horowitz@northwestern.edu
Notice: Accepted version subject to English editing

A BOOTSTRAP METHOD FOR CONSTRUCTING POINTWISE AND UNIFORM CONFIDENCE BANDS FOR CONDITIONAL QUANTILE FUNCTIONS

by

Joel L. Horowitz
Department of Economics
Northwestern University
Evanston, IL 60201
joel-horowitz@northwestern.edu

Anand Krishnamurthy
Cornerstone Research
599 Lexington Ave.
New York, NY 10022
akrishnamurthy@cornerstone.com

January 2018

ABSTRACT

This paper is concerned with inference about the conditional quantile function in a nonparametric quantile regression model. Any method for constructing a confidence interval or band for this function must deal with the asymptotic bias of nonparametric estimators of the function. In estimation methods such as local polynomial estimation, this is usually done through undersmoothing or explicit bias correction. The latter usually requires oversmoothing. However, there are no satisfactory empirical methods for selecting bandwidths that under- or oversmooth. This paper extends the bootstrap method of Hall and Horowitz (2013) for conditional mean functions to conditional quantile functions. The paper also shows how the bootstrap method can be used to obtain uniform confidence bands. The bootstrap method uses only bandwidths that are selected by standard methods such as cross-validation and plug-in. It does not use under- or oversmoothing. The results of Monte Carlo experiments illustrate the numerical performance of the bootstrap method.

1. INTRODUCTION

This paper is concerned with inference about the unknown function $g$ in the nonparametric quantile regression model

$$Y = g(X) + \varepsilon; \quad P(\varepsilon \le 0) = \tau, \tag{1.1}$$

where $X$ is an observed continuously distributed explanatory variable and $\varepsilon$ is an unobserved continuously distributed random variable that is independent of $X$ and whose $\tau$ quantile ($0 < \tau < 1$) is 0. Hall and Horowitz (2013) (hereinafter HH) describe a bootstrap method for constructing a pointwise confidence band for the unknown function $m(x) = E(Y \mid X = x)$ in a nonparametric mean regression. This paper extends the bootstrap method of HH to $g$ in the quantile regression model (1.1). The paper also shows how the bootstrap method can be used to construct a uniform confidence band for $g$. The method for constructing a uniform confidence band for $g$ can be used to construct a uniform confidence band for $m$, but this is not done here.

Any method for constructing a pointwise or uniform confidence band for $g$ based on a nonparametric estimate must deal with the problem of asymptotic bias. For example, a local polynomial estimate of $g$ with a bandwidth chosen by cross-validation or plug-in methods is asymptotically biased. Denote the estimate by $\hat g$. The expected value of $\hat g$ does not equal $g$, the asymptotic distribution of the scaled estimate is not centered at $g$, and the true coverage probability of an asymptotic confidence interval for $g$ that is constructed from the normal distribution in the usual way is less than the nominal probability. This problem is usually overcome by undersmoothing or explicit bias reduction. Undersmoothing consists of making the bias asymptotically negligible by using a bandwidth whose rate of convergence is faster than the asymptotically optimal rate. In explicit bias reduction, an estimate of the asymptotic bias is used to construct an asymptotically unbiased estimate of $g$. Most explicit bias reduction methods involve some form of oversmoothing, that is, using a bandwidth whose rate of convergence is slower than the asymptotically optimal rate. Undersmoothing and explicit bias correction methods are also available for the conditional mean function $m$.

Methods based on undersmoothing or oversmoothing require a bandwidth whose rate of convergence is faster or slower than the asymptotically optimal rate. As discussed by HH, there are no attractive, effective empirical ways to choose these bandwidths. In addition, undersmoothing can produce very wiggly confidence bands, even for smooth conditional quantile or conditional mean functions.

Explicit bias correction methods that rely on estimation of derivatives can also produce wiggly confidence bands.

The method presented in this paper, like the method of HH, uses bandwidths chosen by standard empirical methods such as cross-validation or a plug-in rule. It does not under- or oversmooth and does not use auxiliary or other non-standard bandwidths. Instead, the method uses the bootstrap to estimate the bias of $\hat g$. The bootstrap estimate of the bias has stochastic noise that is comparable in size to the bias itself. However, combining a suitable quantile of the distribution of the bootstrap bias estimate with $\hat g$ enables us to obtain a pointwise confidence band with an asymptotic coverage probability that equals or exceeds $1 - \alpha$ for any given $\alpha > 0$ at all but a user-specified fraction of the possible values of $x$. The exceptional points are in regions where the function $g$ has sharp peaks or troughs that cause the bias of $\hat g$ to be unusually large. These regions are typically visible in a plot of $\hat g$ and can also be found through a theoretical analysis. An asymptotic uniform confidence band that has no exceptional points is obtained by replacing the bootstrap bias estimate with an upper bound on the estimated bias.

This paper differs from HH in two important ways. First, we obtain confidence bands for $g$ that are uniform in $x$, whereas HH obtain only pointwise bands. Second, our bootstrap method is different from that of HH. To avoid complications caused by the non-smoothness of the quantile objective function, we apply the bootstrap to the leading term of the asymptotic bias of the quantile regression estimator. We do not estimate the conditional quantile function from the bootstrap sample. In contrast, the objective function of a mean-regression model is smooth. This enables HH to estimate the conditional mean function and its bias directly from the bootstrap sample.

Methods that use undersmoothing have been described by Bjerve, Doksum, and Yandell (1985); Hall (1992); Hall and Owen (1993); Neumann (1995); Chen (1996); Neumann and Polzehl (1998); Picard and Tribouley (2000); Chen, Härdle, and Li (2003); Claeskens and Van Keilegom (2003); Härdle, Huet, Mammen, and Sperlich (2004); and McMurry and Politis (2008). Methods based on oversmoothing have been described by Härdle and Bowman (1988); Härdle and Marron (1991); Hall (1992); Eubank and Speckman (1993); Sun and Loader (1994); Härdle, Huet, and Jolivet (1995); Xia (1998); and Schucany and Somers (1977). Calonico, Cattaneo, and Farrell (2016) describe an explicit bias correction method for conditional mean functions that does not require oversmoothing or an auxiliary bandwidth. It is not known whether this method can be extended to conditional quantile functions. There is also a large literature on bootstrap methods for parametric quantile regression models. See, for example, De Angelis, Hall, and Young (1993); Hahn (1995); Horowitz (1998); Feng, He, and Hu (2011); Aguirre and Dominguez (2013); Galvao and Montes-Rojas (2015); and Hagemann (2017).

Section 2 of this paper presents an informal description of our method. The method is similar in some respects to that of HH for conditional mean functions, but the non-smoothness of quantile estimators presents problems that are different from those involved in estimating conditional mean functions. These require a separate treatment and modifications of parts of the method of HH. Section 2 also outlines the extension of our method to a heteroskedastic version of model (1.1). Section 3 presents formal theoretical results. Section 4 presents simulation results that illustrate the numerical performance of the method. Conclusions are presented in Section 5. The proofs of theorems are in the online supplementary appendix.

2. INFORMAL DESCRIPTION OF THE METHOD

Let $\{Y_i, X_i : i = 1, \ldots, n\}$ denote an independent random sample of observations from the distribution of $(Y, X)$ in model (1.1). Let $\hat g(x)$ denote a local polynomial nonparametric estimator of $g(x)$ based on bandwidth $h$. Denote the scaled asymptotic bias and variance of $\hat g(x)$, respectively, by

$$\beta(x) = \lim_{n \to \infty} E\,(nh)^{1/2}[\hat g(x) - g(x)] \quad \text{and} \quad \sigma_g^2(x) = \lim_{n \to \infty} (nh)\,\mathrm{Var}[\hat g(x)].$$

Assume that $(nh)^{1/2}\{\hat g(x) - E[\hat g(x)]\}/\sigma_g(x) \to_d N(0,1)$ as $n \to \infty$. To minimize the complexity of the discussion in the remainder of this paper, we assume that $X$ is a scalar random variable and $\hat g$ is a local linear quantile regression estimator. The main results of the paper continue to apply if $X$ is a vector or $\hat g$ is a local polynomial estimator of odd degree different from 1. This paper does not treat series estimators. The local linear quantile estimation procedure is described in Step 1 in Section 2.1. To avoid boundary effects we restrict attention to a compact set $\mathcal{S}$ that is contained in an open subset of the support of $X$.

Let $h$ denote the bandwidth used in local polynomial estimation of $g$. If $\beta(x)$ were known, an asymptotic $1 - \alpha_0$ confidence interval for $g(x)$ would be

$$\hat g(x) - z_{1-\alpha/2}\,\sigma_g(x)/(nh)^{1/2} \le g(x) \le \hat g(x) + z_{1-\alpha/2}\,\sigma_g(x)/(nh)^{1/2},$$

where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution, $\alpha = \alpha(x, \alpha_0)$ satisfies

$$\Phi[z_{1-\alpha/2} - \beta(x)/\sigma_g(x)] - \Phi[-z_{1-\alpha/2} - \beta(x)/\sigma_g(x)] = 1 - \alpha_0, \tag{2.1}$$

and $\Phi$ is the normal distribution function. In applications, $\beta(x)$ and $\sigma_g(x)$ are unknown. Let $\hat\sigma_g(x)$ be the estimate of $\sigma_g(x)$ that is described in Section 2.1. Let $\hat\lambda(x)$ denote the bootstrap estimate of $\beta(x)/\sigma_g(x)$ that is obtained in Step 5 of the procedure described in Section 2.1, and let $\hat\alpha(x, \alpha_0)$ denote the solution in $\alpha$ to

$$\Phi[z_{1-\alpha/2} - \hat\lambda(x)] - \Phi[-z_{1-\alpha/2} - \hat\lambda(x)] = 1 - \alpha_0.$$
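The equation that defines the effective significance level has no closed-form solution, but its left-hand side is strictly increasing in the critical value, so it can be solved by bisection. The following is a minimal sketch under that observation; the function names `coverage` and `effective_alpha` are hypothetical and not taken from the paper.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, Phi


def coverage(z, lam):
    """Phi(z - lam) - Phi(-z - lam): coverage of a +/- z interval
    when the normalized bias is lam."""
    return _STD_NORMAL.cdf(z - lam) - _STD_NORMAL.cdf(-z - lam)


def effective_alpha(lam, alpha0, tol=1e-12):
    """Solve coverage(z, lam) = 1 - alpha0 for z by bisection and return
    alpha = 2 * (1 - Phi(z)), so that z equals z_{1 - alpha/2}."""
    target = 1.0 - alpha0
    lo, hi = 0.0, 1.0
    while coverage(hi, lam) < target:  # coverage is increasing in z
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coverage(mid, lam) < target:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z))
```

With zero normalized bias the function returns the nominal level, and any nonzero bias yields a smaller effective level, that is, a larger critical value.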

For $\xi \in [0,1]$, let $\hat\alpha_\xi(\alpha_0)$ be the $\xi$ quantile of points in the set $\{\hat\alpha(x, \alpha_0) : x \in \mathcal{S}\}$. Define $\hat z(\alpha_0) = z_{1 - \hat\alpha_\xi(\alpha_0)/2}$. Construct the pointwise confidence band

$$\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)] = \{(x, y) : \hat g(x) - \hat z(\alpha_0)\,\hat\sigma_g(x)/(nh)^{1/2} \le y \le \hat g(x) + \hat z(\alpha_0)\,\hat\sigma_g(x)/(nh)^{1/2};\ x \in \mathcal{S}\}.$$

It is shown in Section 3.3 that $\mathcal{B}_n$ has asymptotic coverage probability equal to or greater than $1 - \alpha_0$ except for a proportion $\xi$ of points $x \in \mathcal{S}$. An argument like that in Section 6 of HH shows that the exceptional points occur in regions where $|g''(x)|$ is large. These points typically occur near peaks and troughs of $g(x)$ and can be identified from a graph of $\hat g(x)$.

The critical value $\hat z(\alpha_0) = z_{1 - \hat\alpha_\xi(\alpha_0)/2}$ is greater than $z_{1-\alpha_0/2}$. Therefore, the confidence band $\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]$ is wider than a band that uses the critical value $z_{1-\alpha_0/2}$. This enables $\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]$ to accommodate the asymptotic bias of $\hat g(x)$. The parameter $\xi$ is selected by the user and controls the fraction of points $x$ at which the asymptotic coverage probability of the confidence band $\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]$ is at least $1 - \alpha_0$. This control is an important advantage of our method over undersmoothing and explicit bias correction. As is illustrated in Section 4, the latter two methods have poor coverage accuracy but provide no information about the extent of this inaccuracy or ability to control it. At the cost of a wider confidence band, the fraction of points at which our method undercovers can be reduced to zero asymptotically by constructing the uniform band described in Step 7 of Section 2.1.

To construct a uniform confidence band for $g$, define

$$\hat\lambda_{\max} = \max_{x \in \mathcal{S}} \hat\lambda(x) \quad \text{and} \quad \hat\lambda_{\min} = \min_{x \in \mathcal{S}} \hat\lambda(x).$$

Let $W_1$ be the mean-zero Gaussian process defined in Section 3.1, and let $\hat t_U$ satisfy

$$P\left[-\hat t_U - \hat\lambda_{\min} \le W_1(x/h) \le \hat t_U - \hat\lambda_{\max}\ \ \forall x \in \mathcal{S}\right] = 1 - \alpha_0,$$

where $h$ is the bandwidth used for local linear quantile estimation of $g$. It is shown in Section 3.3 that

$$\mathcal{B}_U(\alpha_0) = \{(y, x) : \hat g(x) - \hat t_U\,\hat\sigma_g(x)/(nh)^{1/2} \le y \le \hat g(x) + \hat t_U\,\hat\sigma_g(x)/(nh)^{1/2};\ x \in \mathcal{S}\}$$

is an asymptotic uniform confidence band for $g$ whose coverage probability equals or exceeds $1 - \alpha_0$.
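The pointwise band can be assembled mechanically once the estimate, its scaled standard error, and the effective significance levels are available on a grid of points. The sketch below assumes those three sequences have already been computed; the function name `pointwise_band` and the particular empirical-quantile convention are illustrative assumptions, not the paper's prescription.

```python
import math
from statistics import NormalDist


def pointwise_band(g_hat, sigma_hat, alpha_hat, n, h, xi):
    """Return (lower, upper, z_crit) for the pointwise band on a grid.

    g_hat, sigma_hat, alpha_hat: equal-length sequences evaluated on the grid.
    xi: user-chosen fraction of grid points allowed to undercover.
    """
    ordered = sorted(alpha_hat)
    # Empirical xi-quantile (assumed convention): the smallest value whose
    # empirical distribution function is at least xi.
    k = max(math.ceil(xi * len(ordered)) - 1, 0)
    alpha_xi = ordered[k]
    # Critical value z_{1 - alpha_xi / 2}.
    z_crit = NormalDist().inv_cdf(1.0 - alpha_xi / 2.0)
    scale = math.sqrt(n * h)
    lower = [g - z_crit * s / scale for g, s in zip(g_hat, sigma_hat)]
    upper = [g + z_crit * s / scale for g, s in zip(g_hat, sigma_hat)]
    return lower, upper, z_crit
```

A smaller `xi` selects a smaller quantile of the effective levels and therefore a wider band with fewer exceptional points.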

2.1 The Estimation Procedure

This section provides a step-by-step explanation of the method for constructing $\mathcal{B}_n$ and $\mathcal{B}_U$. Steps 1-6 are broadly similar to those of HH, but their implementation details differ because of differences between the mean-regression model of HH and the quantile regression model considered here. Step 7 constructs a uniform confidence band and is new.

Step 1: Local linear estimation of $g$ and estimation of $\sigma_g$. Let $K$ be a kernel function and $h$ be a possibly random bandwidth. For any real $v$, define $K_h(v) = K(v/h)$. Define the check function

$$\rho_\tau(v) = v[\tau - I(v \le 0)],$$

where $0 < \tau < 1$ and $I$ is the indicator function. The local linear estimator of $g(x)$ is $\hat g(x) = \hat b_0$, where

$$(\hat b_0, \hat b_1) = \arg\min_{b_0, b_1} \sum_{i=1}^{n} \rho_\tau[Y_i - b_0 - b_1(X_i - x)]\,K_h(X_i - x).$$

Fan, Hu, and Truong (1994) and Yu and Jones (1997) describe properties of this estimator.

To obtain $\hat\sigma_g(x)$, let $\hat f_X(x)$ be a consistent kernel nonparametric estimator of $f_X(x)$, the probability density function of $X$ at $x$. Let $\hat\varepsilon_i = Y_i - \hat g(X_i)$ be the residuals from estimating model (1.1), and let $\hat f_\varepsilon(0)$ be a consistent kernel nonparametric estimator of $f_\varepsilon(0)$, the probability density of $\varepsilon$ at 0. Specifically,

$$\hat f_\varepsilon(0) = (n h_\varepsilon)^{-1} \sum_{i=1}^{n} K_{h_\varepsilon}(\hat\varepsilon_i),$$

where $h_\varepsilon$ is a bandwidth. Define $B_K = \int K(v)^2\,dv$. It is shown in Section 3 that the scaled variance of the asymptotic distribution of $\hat g(x)$ is

$$\sigma_g^2(x) = (nh)\,\mathrm{Var}[\hat g(x)] = \frac{\tau(1-\tau)B_K}{f_X(x)[f_\varepsilon(0)]^2}.$$

The scaled variance can be estimated by replacing $f_X(x)$ and $f_\varepsilon(0)$ with their consistent estimators to obtain

$$\hat\sigma_g^2(x) = \frac{\tau(1-\tau)B_K}{\hat f_X(x)[\hat f_\varepsilon(0)]^2}.$$

Step 2: Compute centered residuals. Let $q_n$ be the $\tau$ quantile of the residuals $\{\hat\varepsilon_i\}$. That is,

$$q_n = \inf\left\{q : n^{-1} \sum_{i=1}^{n} I(\hat\varepsilon_i \le q) \ge \tau\right\}.$$

The centered residuals are $\tilde\varepsilon_i = \hat\varepsilon_i - q_n$. The $\tau$ quantile of the centered residuals is 0.

Step 3: Construct the bootstrap resample. The bootstrap resample is $\{Y_i^*, X_i : i = 1, \ldots, n\}$, where

$$Y_i^* = \hat g(X_i) + \varepsilon_i^*.$$

The $\varepsilon_i^*$'s are obtained by sampling the $\tilde\varepsilon_i$'s randomly with replacement. The $X_i$'s are not resampled.

Step 4: Compute the bootstrap estimate of the asymptotic bias of $\hat g(x)$. Let there be $B$ bootstrap resamples that are indexed by $b = 1, \ldots, B$. For resample $b$, define

$$T_{nb}^*(x) = n^{-1} \sum_{i=1}^{n} \left\{\tau - I[Y_i^* \le \hat b_0 + \hat b_1(X_i - x)]\right\} K_h(X_i - x).$$

$T_{nb}^*(x)$ is a bootstrap analog of the first-order condition in equation (A8) of the supplementary appendix. Let $E^*(T_{nb}^*)$ denote the bootstrap expectation of $T_{nb}^*$ conditional on the data. Estimate $E^*(T_{nb}^*)$ by

$$\bar T_n^*(x) = B^{-1} \sum_{b=1}^{B} T_{nb}^*(x).$$

As $B \to \infty$, $\bar T_n^*(x)$ converges almost surely to $E^*[T_{nb}^*(x)]$ and can be made arbitrarily close to $E^*[T_{nb}^*(x)]$ by making $B$ sufficiently large. As $n \to \infty$, the bias of $(nh)^{1/2}[\hat g(x) - g(x)]$ converges to $\beta(x) = \lim_{n \to \infty} (nh)^{1/2} E[\hat g(x) - g(x)]$. Equation (3.2) gives an analytic expression for $\beta(x)$. The bootstrap estimate of $\beta(x)$ is

$$\hat\beta(x) = \left(\frac{n}{h}\right)^{1/2} \frac{\hat\sigma_g(x)\,\bar T_n^*(x)}{[\tau(1-\tau)B_K]^{1/2}\,[\hat f_X(x)]^{1/2}}.$$

In contrast to HH, we do not form a bootstrap estimate of $\hat g(x)$. Instead, we form a bootstrap estimate of the asymptotic form of $E\,(nh)^{1/2}[\hat g(x) - g(x)]$ from the analytic expression for this form. This expression is given by equation (3.4). The variance of $\hat\beta(x)$ converges to zero at the same rate as $\beta(x)^2$. Therefore, $\hat\beta(x)$ is not consistent for $\beta(x)$ in the sense that $\hat\beta(x)/\beta(x)$ does not converge in probability to one. The methods for finding pointwise and uniform confidence bands for $g$ take account of this inconsistency. See Steps 6 and 7 below.
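Steps 2 and 3 can be sketched in a few lines. The code below assumes the residuals and fitted values from Step 1 are already available as plain lists; the function names `center_residuals` and `bootstrap_resample` are hypothetical. The infimum in the definition of the residual quantile is attained at an order statistic, which is what the code uses.

```python
import math
import random


def center_residuals(residuals, tau):
    """Step 2: subtract the empirical tau-quantile q_n from each residual.

    q_n = inf{q : n^{-1} * sum I(e_i <= q) >= tau} is the k-th smallest
    residual with k = ceil(n * tau).
    """
    n = len(residuals)
    ordered = sorted(residuals)
    k = max(math.ceil(n * tau), 1)
    q_n = ordered[k - 1]
    return [e - q_n for e in residuals], q_n


def bootstrap_resample(fitted, centered, rng):
    """Step 3: Y*_i = g_hat(X_i) + e*_i, with e*_i drawn with replacement
    from the centered residuals. The X_i (hence the fitted values) are
    held fixed and are not resampled."""
    return [g + rng.choice(centered) for g in fitted]
```

A usage example: `center_residuals([3, -1, 2, 0, 5], 0.5)` centers the residuals at their median, and `bootstrap_resample(fitted, centered, random.Random(0))` draws one resample with a reproducible seed.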

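Step 5: Obtain the normalized estimate of the bias and effective significance level. The normalized bias of $\hat g(x)$ is defined as $\lambda(x) = \beta(x)/\sigma_g(x)$ and is estimated by $\hat\lambda(x) = \hat\beta(x)/\hat\sigma_g(x)$. The effective significance level at point $x$, $\hat\alpha(x, \alpha_0)$, is defined as the solution in $\alpha$ to the equation

$$\Phi[z_{1-\alpha/2} - \hat\lambda(x)] - \Phi[-z_{1-\alpha/2} - \hat\lambda(x)] = 1 - \alpha_0.$$

Step 6: Construct a pointwise confidence band for $g$. Let $\xi \in [0,1]$. Let $\hat\alpha_\xi(\alpha_0)$ be the $\xi$ quantile of points in the set $\{\hat\alpha(x, \alpha_0) : x \in \mathcal{S}\}$. Define $\hat z(\alpha_0) = z_{1 - \hat\alpha_\xi(\alpha_0)/2}$. Construct the pointwise confidence band

$$\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)] = \{(x, y) : \hat g(x) - \hat z(\alpha_0)\,\hat\sigma_g(x)/(nh)^{1/2} \le y \le \hat g(x) + \hat z(\alpha_0)\,\hat\sigma_g(x)/(nh)^{1/2};\ x \in \mathcal{S}\}.$$

It is shown in Section 3.3 that the pointwise band $\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]$ covers $g(x)$ with probability at least $1 - \alpha_0$ except for a proportion $\xi$ of points $x \in \mathcal{S}$. Specifically, for $\alpha(x, \alpha_0)$ as in (2.1), let $\alpha_\xi(\alpha_0)$ denote the $\xi$ quantile of the points $\{\alpha(x, \alpha_0) : x \in \mathcal{S}\}$. Define the set

$$\mathcal{S}_\xi(\alpha_0) = \{x \in \mathcal{S} : \alpha(x, \alpha_0) > \alpha_\xi(\alpha_0)\}. \tag{2.2}$$

Then

$$\liminf_{n \to \infty} P\{[x, g(x)] \in \mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]\} \ge 1 - \alpha_0$$

for each $x \in \mathcal{S}_\xi(\alpha_0)$. $\mathcal{S}_\xi(\alpha_0)$ is the set of points $x$ on which the pointwise confidence band has an asymptotic coverage probability of at least $1 - \alpha_0$. It contains a fraction $1 - \xi$ of points $x \in \mathcal{S}$. Specifically, let $|\mathcal{S}|$ and $|\mathcal{S}_\xi(\alpha_0)|$, respectively, denote the Lebesgue measures of the sets $\mathcal{S}$ and $\mathcal{S}_\xi(\alpha_0)$. Then $|\mathcal{S}_\xi(\alpha_0)|/|\mathcal{S}| = 1 - \xi$.

Step 7: Construct a uniform confidence band for $g$. Define

$$\hat\lambda_{\max} = \max_{x \in \mathcal{S}} \hat\lambda(x) \tag{2.3}$$

and

$$\hat\lambda_{\min} = \min_{x \in \mathcal{S}} \hat\lambda(x). \tag{2.4}$$

Let $W_1$ denote the mean-zero Gaussian process defined in Theorem 3.1 in Section 3.1. Define $\hat t_U$ as the solution in $t$ to

$$P\left[-t - \hat\lambda_{\min} \le W_1(x/h) \le t - \hat\lambda_{\max}\ \ \forall x \in \mathcal{S}\right] = 1 - \alpha_0. \tag{2.5}$$

The asymptotic uniform confidence band is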

$$\mathcal{B}_U(\alpha_0) = \{(y, x) : \hat g(x) - \hat t_U\,\hat\sigma_g(x)/(nh)^{1/2} \le y \le \hat g(x) + \hat t_U\,\hat\sigma_g(x)/(nh)^{1/2};\ x \in \mathcal{S}\}.$$

The quantities $\hat\lambda_{\max}$ and $\hat\lambda_{\min}$ can be computed by replacing $\mathcal{S}$ in (2.3) and (2.4) with a fine grid of equally spaced points. The critical value $\hat t_U$ can be computed by replacing $\mathcal{S}$ in (2.5) with the grid.

2.2 Heteroskedasticity

A heteroskedastic version of (1.1) is

$$Y = g(X) + \sigma(X)\varepsilon; \quad P(\varepsilon \le 0) = \tau, \tag{2.6}$$

where $\sigma(\cdot)$ is a scale function and $\varepsilon$ is independent of $X$. Identification of $\sigma$ requires normalizing the scale of $\varepsilon$. This is done by setting the interquartile range (IQR) of $\varepsilon$ equal to 1. Then $\sigma(x) = \mathrm{IQR}(Y \mid X = x)$. Let $\hat g(x)$ be the local linear quantile regression estimate of $g(x)$ and $\hat\sigma(x)$ be a consistent nonparametric estimate of $\mathrm{IQR}(Y \mid X = x)$. The residuals of model (2.6) are

$$\hat\varepsilon_i = \frac{Y_i - \hat g(X_i)}{\hat\sigma(X_i)}.$$

The centered residuals of (2.6) are as in Step 2 after replacing the residuals used there with the scaled residuals $\hat\varepsilon_i$ defined above. The estimate of the scaled asymptotic variance of the estimate of $g(x)$ in (2.6) is

$$\hat\sigma_g^2(x) = \frac{\hat\sigma(x)^2\,\tau(1-\tau)B_K}{\hat f_X(x)[\hat f_\varepsilon(0)]^2},$$

where $\hat f_\varepsilon$ is now based on the scaled residuals $\hat\varepsilon_i$. Bootstrap sampling is done by setting

$$Y_i^* = \hat g(X_i) + \hat\sigma(X_i)\,\varepsilon_i^*,$$

where the $\varepsilon_i^*$'s are sampled randomly with replacement from the centered $\hat\varepsilon_i$'s. Steps 4-7 for construction of pointwise and uniform confidence bands remain as in Section 2.1 but with the foregoing modifications of $\hat\sigma_g(x)$ and the bootstrap sampling procedure.

We conjecture that our method can be extended to the generalization of model (1.1) in which $P(\varepsilon \le 0) = \tau$ is replaced by $P(\varepsilon \le 0 \mid X) = \tau$, but we do not analyze this extension here.

3. THEORETICAL RESULTS

This section presents theorems giving conditions under which the pointwise and uniform confidence bands constructed in Steps 6-7 of Section 2.1 have the claimed coverage properties when $\hat g$ is a local linear quantile regression estimator. Theorem 3.1 shows that $(nh)^{1/2}[\hat g(x) - g(x)]$ is approximated sufficiently accurately by the sum of its asymptotic bias and a mean-zero Gaussian process. Theorem 3.2 shows that a similar approximation applies to the bootstrap bias estimator. These two approximations are combined in Theorem 3.4 to show that the bootstrap procedure of Section 2.1 yields pointwise confidence intervals with the coverage probabilities explained in Step 6 of Section 2.1. Theorem 3.5 shows that the pointwise confidence intervals of Theorem 3.4 can be widened to construct a uniform confidence band.

We make the following assumptions:

Assumption 1: (i) The data $\{Y_i, X_i : i = 1, \ldots, n\}$ are an independent random sample from model (1.1); (ii) $X$ in (1.1) has compact support; (iii) $\varepsilon$ in (1.1) is independent of $X$ and $P(\varepsilon \le 0) = \tau$ for some $\tau \in (0,1)$.

Assumption 2: (i) The distribution of $X$ is absolutely continuous with respect to Lebesgue measure with probability density function $f_X$; (ii) $f_X$ is bounded away from 0 on $\mathrm{supp}(X)$ and twice continuously differentiable on the interior of $\mathrm{supp}(X)$.

Assumption 3: (i) The distribution of $\varepsilon$ is absolutely continuous with respect to Lebesgue measure with probability density function $f_\varepsilon$; (ii) $f_\varepsilon$ is twice continuously differentiable and $f_\varepsilon(0) > 0$.

Assumption 4: (i) The function $g$ in (1.1) is three times continuously differentiable on the interior of $\mathrm{supp}(X)$; (ii) $\hat g$ is a local linear quantile regression estimator of $g$; (iii) There is a compact set $\Theta$ such that $[g(x), g'(x)] \in \Theta$ for each $x \in \mathcal{S}$.

Assumption 5: The kernel $K$ is a probability density function with support $[-1,1]$, symmetrical around 0, and twice continuously differentiable on $(-1,1)$.

Assumption 6: The bandwidth $h$ used to construct $\hat g$ satisfies: (i) $h = \hat d\,n^{-1/5}$, where $\hat d$ is a function of the data $\{Y_i, X_i : i = 1, \ldots, n\}$ and $\hat d \to_p d_0$ as $n \to \infty$ for some finite constant $d_0 > 0$. (ii) There exists a finite constant $D_1 > 0$ such that $P(|\hat d - d_0| > n^{-D_1}) \to 0$ as $n \to \infty$. (iii) There are constants $D_2$ and $D_3$ such that $0 < D_2 < D_3 < 1$ and $P(n^{-D_3} \le h \le n^{-D_2}) = 1 - O(n^{-C})$ as $n \to \infty$ for all finite $C > 0$.

Assumption 1 defines the data generation process. Assumptions 2-4 are smoothness assumptions. Assumption 5 specifies standard properties of $K$. Assumption 6 is satisfied by standard bandwidth choice methods such as cross-validation and plug-in methods. Under assumptions 1-6, local linear estimates of $g$ obtained using the random bandwidth $h$ and the deterministic bandwidth $h_0 = d_0 n^{-1/5}$ are asymptotically equivalent. See Lemma A1 in the supplementary appendix.

3.1 Asymptotic Approximations to $(nh)^{1/2}[\hat g(x) - g(x)]$ and the Bootstrap Bias Estimate

The asymptotic coverage probabilities of the pointwise and uniform confidence bands defined in Steps 6 and 7 of Section 2.1 depend on strong asymptotic approximations to $(nh)^{1/2}[\hat g(x) - g(x)]$ and the bootstrap estimate of $E\,(nh)^{1/2}[\hat g(x) - g(x)]$. These approximations are given by the following theorems.

Theorem 3.1: Let assumptions 1-6 hold. Define

$$\psi_0(x) = \frac{[\tau(1-\tau)B_K]^{1/2}}{[f_X(x)]^{1/2} f_\varepsilon(0)},$$

$\kappa_2 = \int v^2 K(v)\,dv$, and $h_0 = d_0 n^{-1/5}$. There exists a Gaussian process $W_1(x)$ defined on the same probability space as the data such that $E[W_1(x)] = 0$ for all $x$, $E[W_1(x)^2] = 1$ for all $x$, and for any $\eta > 0$

$$\lim_{n \to \infty} P\left\{\sup_{x \in \mathcal{S}} \left|(nh)^{1/2}[\hat g(x) - g(x)] - \left[\frac{d_0^{5/2}\kappa_2}{2}\,g''(x) + \psi_0(x)\,W_1\!\left(\frac{x}{h_0}\right)\right]\right| > \eta\right\} = 0.$$

It follows from Theorem 3.1 that for each $x \in \mathcal{S}$,

$$(nh)^{1/2}[\hat g(x) - g(x)] \to_d N[\mu_g, V_g(x)], \tag{3.1}$$

where

$$\mu_g = \frac{d_0^{5/2}\kappa_2}{2}\,g''(x) \tag{3.2}$$

and

$$V_g(x) = \psi_0(x)^2. \tag{3.3}$$

Moreover, asymptotically,

$$E\,\hat g(x) - g(x) = \frac{\kappa_2 h_0^2}{2}\,g''(x) \tag{3.4}$$

and

$$(nh_0)\,\mathrm{Var}[\hat g(x)] = \sigma_g^2(x) = \frac{\tau(1-\tau)B_K}{f_X(x)[f_\varepsilon(0)]^2}. \tag{3.5}$$

Properties (3.1)-(3.5) were obtained previously by Fan, Hu, and Truong (1994) and Yu and Jones (1997).
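The plug-in form of the scaled variance lends itself to a short numerical illustration. The sketch below is only an assumption-laden example: it uses the Epanechnikov kernel, for which $B_K = 0.6$, takes both density bandwidths as user-supplied inputs, and uses hypothetical function names (`epanechnikov`, `kde`, `sigma_g_hat`).

```python
import math

B_K = 0.6  # integral of K(v)^2 over [-1, 1] for the Epanechnikov kernel


def epanechnikov(v):
    """K(v) = 0.75 * (1 - v^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0


def kde(points, x, bandwidth):
    """Kernel density estimate at x: (n * h)^{-1} * sum K((p - x) / h)."""
    total = sum(epanechnikov((p - x) / bandwidth) for p in points)
    return total / (len(points) * bandwidth)


def sigma_g_hat(x, xs, residuals, tau, h_x, h_eps):
    """Plug-in estimate of sigma_g(x):
    sqrt(tau * (1 - tau) * B_K / (f_X_hat(x) * f_eps_hat(0)^2))."""
    f_x = kde(xs, x, h_x)              # density of X at x
    f_eps0 = kde(residuals, 0.0, h_eps)  # density of the residuals at 0
    return math.sqrt(tau * (1.0 - tau) * B_K / (f_x * f_eps0 ** 2))
```

The estimate is undefined where either density estimate is zero, which is one practical reason for restricting attention to points away from the boundary of the support.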

The quantity $(nh_0)\,\mathrm{Var}[\hat g(x)]$ can be estimated consistently by replacing $f_X(x)$ and $f_\varepsilon(0)$ on the right-hand sides of (3.3) and (3.5) by the consistent estimators $\hat f_X(x)$ and $\hat f_\varepsilon(0)$. Bootstrap estimation of $E\,(nh_0)^{1/2}[\hat g(x) - g(x)]$ relies on the strong approximation given by the following theorem.

Theorem 3.2: Let assumptions 1-6 hold. Let $E^*$ denote the bootstrap expectation conditional on the data. Define

$$A_1(x) = \tfrac{1}{2}\,d_0^{5/2} f_\varepsilon(0) f_X(x)$$

and

$$A_2(x) = [\tau(1-\tau)B_K f_X(x)]^{1/2}.$$

Then

(i) For all $x \in \mathcal{S}$ and any $\eta > 0$

$$\lim_{n \to \infty} P\left\{\sup_{x \in \mathcal{S}} \left|\left(\frac{n}{h}\right)^{1/2} E^*[T_{nb}^*(x)] - \left[A_1(x)\,\kappa_2\,g''(x) + A_2(x)\,W_1\!\left(\frac{x}{h_0}\right)\right]\right| > \eta\right\} = 0.$$

(ii) There exists a Gaussian process $\Xi(x)$ such that $E[\Xi(x)] = 0$ for all $x$, $E[\Xi(x)^2] = 1$ for all $x \in [0,1]$, and for any $\eta > 0$

$$\lim_{n \to \infty} P\left\{\sup_{x \in \mathcal{S}} \left|\hat\lambda(x) - \left[\frac{\beta(x)}{\sigma_g(x)} + \Xi(x)\right]\right| > \eta\right\} = 0.$$

For any $\alpha \in (0,1)$ and $x \in \mathcal{S}$ define

$$\hat\pi(x, \alpha) = \Phi[z_{1-\alpha/2} - \hat\lambda(x)] - \Phi[-z_{1-\alpha/2} - \hat\lambda(x)].$$

The following corollary to Theorem 3.2 is used to establish the asymptotic coverage probabilities of the confidence bands constructed in Steps 6 and 7 of Section 2.1.

Corollary 3.3: Let assumptions 1-6 hold. Then for any $\eta > 0$ and $0 < \alpha < 1$

$$\lim_{n \to \infty} P\left\{\sup_{x \in \mathcal{S}} \left|\hat\pi(x, \alpha) - \left(\Phi\!\left[z_{1-\alpha/2} - \frac{\beta(x)}{\sigma_g(x)} - \Xi(x)\right] - \Phi\!\left[-z_{1-\alpha/2} - \frac{\beta(x)}{\sigma_g(x)} - \Xi(x)\right]\right)\right| > \eta\right\} = 0.$$

3.3 Coverage Probabilities of Confidence Bands

This section shows that the pointwise and uniform confidence bands constructed in Steps 6 and 7 of Section 2.1 have asymptotic coverage probabilities of at least $1 - \alpha_0$. We use the following notation. Let $\alpha_0 \in (0, 1/2)$. Define $\hat\alpha(x, \alpha_0)$ as in Step 5. Define $T(x, \alpha_0)$ as the solution in $T$ to

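$$\Phi\!\left[T - \frac{\beta(x)}{\sigma_g(x)} - \Xi(x)\right] - \Phi\!\left[-T - \frac{\beta(x)}{\sigma_g(x)} - \Xi(x)\right] = 1 - \alpha_0.$$

Define

$$A(x, \alpha_0) = 2\{1 - \Phi[T(x, \alpha_0)]\}.$$

As in (2.1), define $\alpha(x, \alpha_0)$ to be the solution in $a$ to

$$\Phi\!\left[z_{1-a/2} - \frac{\beta(x)}{\sigma_g(x)}\right] - \Phi\!\left[-z_{1-a/2} - \frac{\beta(x)}{\sigma_g(x)}\right] = 1 - \alpha_0.$$

Let $\alpha_\xi(\alpha_0)$ denote the $\xi$-level quantile of points in the set $\{\alpha(x, \alpha_0) : x \in \mathcal{S}\}$. Define $\mathcal{S}_\xi(\alpha_0)$ as in (2.2). The following theorem gives the asymptotic coverage probability of the pointwise confidence band constructed in Step 6.

Theorem 3.4: Let assumptions 1-6 hold. For all $C > 0$ and $\eta > 0$,

(i) $\displaystyle \lim_{n \to \infty} P\left\{\sup_{x \in \mathcal{S},\ |\Xi(x)| \le C} |\hat\alpha(x, \alpha_0) - A(x, \alpha_0)| > \eta\right\} = 0$;

(ii) $\displaystyle \lim_{n \to \infty} P[\hat\alpha_\xi(\alpha_0) \le \alpha_\xi(\alpha_0)] = 1$;

(iii) $\displaystyle \liminf_{n \to \infty} P\{[x, g(x)] \in \mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]\} \ge 1 - \alpha_0$ for each $x \in \mathcal{S}_\xi(\alpha_0)$.

Theorem 3.4(iii) shows that the pointwise band $\mathcal{B}_n[\hat\alpha_\xi(\alpha_0)]$ covers $g(x)$ with probability at least $1 - \alpha_0$ except for a proportion $\xi$ of points $x \in \mathcal{S}$.

Now consider the uniform confidence band constructed in Step 7 of Section 2.1. It follows from Theorem 3.1 that up to asymptotically negligible terms

$$\frac{(nh)^{1/2}[\hat g(x) - g(x)]}{\sigma_g(x)} = \frac{\beta(x)}{\sigma_g(x)} + W_1\!\left(\frac{x}{h_0}\right)$$

uniformly over $x \in \mathcal{S}$. Recall that $\lambda(x) = \beta(x)/\sigma_g(x)$. If $\lambda(x)$ were known, an asymptotic uniform $1 - \alpha_0$ confidence band for $g$ would be

$$-t_U \le \lambda(x) + W_1\!\left(\frac{x}{h_0}\right) \le t_U,$$

where $t_U$ is the solution in $t$ to

$$P\left[-t - \lambda(x) \le W_1\!\left(\frac{x}{h_0}\right) \le t - \lambda(x)\ \ \forall x \in \mathcal{S}\right] = 1 - \alpha_0.$$

Define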

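Define $\lambda_{\max} = \max_{x \in \mathcal{S}} \lambda(x)$ and $\lambda_{\min} = \min_{x \in \mathcal{S}} \lambda(x)$. Then

$$P\left[-t_U - \lambda_{\min} \le W_1(x/h_0) \le t_U - \lambda_{\max} \ \ \forall x \in \mathcal{S}\right] \le P\left[-t_U - \lambda(x) \le W_1(x/h_0) \le t_U - \lambda(x) \ \ \forall x \in \mathcal{S}\right].$$

Therefore, asymptotically,

(3.6) $$P\left[-t_U \le \frac{(nh)^{1/2}[\hat{g}(x) - g(x)]}{\sigma_g(x)} \le t_U \ \ \forall x \in \mathcal{S}\right] \ge 1 - \alpha_0$$

if $t_U$ is chosen so that

$$P\left[-t_U - \lambda_{\min} \le W_1(x/h_0) \le t_U - \lambda_{\max} \ \ \forall x \in \mathcal{S}\right] = 1 - \alpha_0.$$

The quantities $\lambda_{\max}$ and $\lambda_{\min}$ are unknown in applications. A feasible confidence band can be obtained by replacing them with the bootstrap estimates $\hat{\lambda}_{\max} = \max_{x \in \mathcal{S}} \hat{\lambda}(x)$ and $\hat{\lambda}_{\min} = \min_{x \in \mathcal{S}} \hat{\lambda}(x)$. Similarly, the unknown quantities $\sigma_g(x)$ and $h_0$ can be replaced with $\hat{\sigma}_g(x)$ and $h$, respectively. The critical value $t_U$ is replaced by $\hat{t}_U$, which is the solution in $t$ to

(3.7) $$P\left[-t - \hat{\lambda}_{\min} \le W_1(x/h) \le t - \hat{\lambda}_{\max} \ \ \forall x \in \mathcal{S}\right] = 1 - \alpha_0,$$

where $\hat{\lambda}_{\max}$ and $\hat{\lambda}_{\min}$ are treated as non-stochastic constants, not random variables, when calculating the probability in (3.7). The resulting uniform confidence band is

$$-\hat{t}_U \le \frac{(nh)^{1/2}[\hat{g}(x) - g(x)]}{\hat{\sigma}_g(x)} \le \hat{t}_U; \quad x \in \mathcal{S}.$$

The following theorem establishes the asymptotic coverage probability of this band.

Theorem 3.5: Let assumptions 1-6 hold. Then

$$\liminf_{n \to \infty} P\left[-\hat{t}_U \le \frac{(nh)^{1/2}[\hat{g}(x) - g(x)]}{\hat{\sigma}_g(x)} \le \hat{t}_U; \ x \in \mathcal{S}\right] \ge 1 - \alpha_0.$$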
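The critical value defined by (3.7) has no closed form, but it can be approximated by Monte Carlo once the process $W_1$ is restricted to a finite grid. The sketch below is only an illustration under stated assumptions: the function name `uniform_critical_value`, the grid, and the exponential covariance used in the usage example are hypothetical placeholders, not the estimated covariance function of the paper. It uses the fact that the event in (3.7) holds for a given draw exactly when $t \ge \max\{\max_u W(u) + \hat{\lambda}_{\max},\ -\min_u W(u) - \hat{\lambda}_{\min}\}$, so the critical value is the $(1-\alpha_0)$ quantile of that quantity.

```python
import numpy as np


def uniform_critical_value(cov, lam_min, lam_max, alpha0=0.05,
                           n_draws=20000, seed=0):
    """Approximate the t solving
    P(-t - lam_min <= W(u) <= t - lam_max for all grid points u) = 1 - alpha0,
    where W is a mean-zero Gaussian vector with covariance matrix `cov`.
    `cov` is an assumed input: any positive semidefinite grid covariance.
    """
    rng = np.random.default_rng(seed)
    m = cov.shape[0]
    # Cholesky factor with a tiny jitter for numerical stability.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(m))
    w = rng.standard_normal((n_draws, m)) @ chol.T
    # Smallest t for which both inequalities hold at every grid point.
    t_needed = np.maximum(w.max(axis=1) + lam_max,
                          -w.min(axis=1) - lam_min)
    return float(np.quantile(t_needed, 1.0 - alpha0))


if __name__ == "__main__":
    # Hypothetical grid and placeholder covariance, for illustration only.
    grid = np.linspace(0.0, 5.0, 26)
    cov = np.exp(-np.abs(grid[:, None] - grid[None, :]))
    print(uniform_critical_value(cov, lam_min=-0.2, lam_max=0.3))
```

With a single grid point, unit variance, and $\hat{\lambda}_{\max} = \hat{\lambda}_{\min} = 0$, the function reduces to the usual two-sided normal critical value, which gives a quick sanity check.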

The covariance function of $W_1$ can be estimated consistently; see equation (A13) in the supplementary appendix. The probability in (3.7) can be computed by simulation with arbitrary accuracy by replacing the covariance function of $W_1$ with the consistent estimate and the continuum $\mathcal{S}$ with a grid of equally spaced points.

4 NUMERICAL EXPERIMENTS

This section reports the results of a set of Monte Carlo experiments that illustrate the finite-sample performance of the method described in Section 2.

4.1 Design

Data $\{Y_i, X_i : i = 1, \ldots, n\}$ were generated from the models

$$Y = g_j(X) + \varepsilon; \quad j = 1, 2, 3,$$

where

$$g_1(x) = x + 5\phi(10x),$$

$$g_2(x) = \frac{\sin(1.5\pi x)}{1 + 18x^2[\operatorname{sgn}(x) + 1]},$$

$$g_3(x) = \frac{\sin(1.5\pi x)}{1 + x[\operatorname{sgn}(x) + 1]},$$

$\phi$ is the standard normal probability density function, and $X \sim U[-1, 1]$. The distribution of $\varepsilon$ is $N(\mu_\tau, 1)$ for $\tau = 0.25$, $\tau = 0.50$, and $\tau = 0.75$, respectively, where $\mu_{0.25} = 0.6745$, $\mu_{0.50} = 0$, and $\mu_{0.75} = -0.6745$. Thus, $P(\varepsilon \le 0) = \tau$ for each value of $\tau$.

The functions $g_j$ were used in numerical experiments by HH and other authors. Graphs of these functions are shown in Figure 1. The function $g_1$ has a sharp peak and is the most challenging for our method. The function $g_2$ is less challenging than $g_1$. The function $g_3$ is the smoothest and least challenging.

The sample sizes in the experiments were $n = 100$, 500, and 1000. The kernel function was $K(v) = 0.75(1 - v^2) I(|v| \le 1)$. The bandwidth $h$ for local linear estimation of the $g_j$'s was chosen using the plug-in method of Yu and Jones (1998). Bandwidths for estimating $f_X(x)$ and $f_\varepsilon(0)$ were chosen by Silverman's rule of thumb. To avoid boundary effects, the set $\mathcal{S}$ was chosen so that its boundaries were at least one bandwidth from the boundaries of $[-1, 1]$. This resulted in $\mathcal{S} = [-0.9, 0.9]$ for experiments with $g_1$ and $g_2$ and $\mathcal{S} = [-0.85, 0.85]$ for experiments with $g_3$. $\mathcal{S}$ is narrower for the experiments with $g_3$ because that function is smoother than $g_1$ and $g_2$ and has a larger bandwidth.

We set $\alpha_0 = 0.05$ and $1 - \xi = 0.95$. Pointwise confidence bands were computed using an equally spaced grid of points $x$ with a spacing of 0.05. The grid spacing was 00 for uniform confidence bands. The proportion of points $x \in \mathcal{S}$ at which the coverage probability is at least 0.95 was estimated by the proportion of grid points at which the coverage probability equals or exceeds this value. There were 1000 Monte Carlo replications in each experiment.

We also computed pointwise confidence bands using undersmoothing and the explicit bias correction method of Schucany and Sommers (1977). There are no satisfactory empirical methods for choosing an undersmoothing bandwidth or the auxiliary bandwidth required for explicit bias correction. Therefore, for undersmoothing, we set the bandwidth equal to $\gamma_1 h$, where $h$ is the bandwidth selected by the method of Yu and Jones (1998) and $\gamma_1 \le 1$ is a constant. For explicit bias correction, we set the auxiliary bandwidth equal to $dn^{-\gamma_2/5}$, where $\gamma_2 \le 1$ is a constant and $d$ is as in assumption 6. The values of $\gamma_1$ and $\gamma_2$ were chosen to achieve coverage probabilities of at least 0.95 for as large a proportion of values of $x$ in the grid as possible. This approach cannot be used in applications and gives an advantage to undersmoothing and explicit bias correction. Nonetheless, it will be seen in Section 4.2 that the performance of these methods is poor compared to that of the method of Section 2.1.

4.2 Results of the Experiments

Tables 1-3 show properties of pointwise confidence bands for $\tau = 0.25$, 0.50, and 0.75, respectively. At all quantiles and sample sizes, the bootstrap method described in Section 2.1 has much higher proportions of values of $x$ for which the probability of covering $g(x)$ exceeds 0.95 than do the undersmoothing and explicit bias correction methods. When $n = 100$, the bootstrap method's proportions exceed 0.70 for $j = 1$ and 2, and 0.95 for $j = 3$. When $n = 1000$, the bootstrap method's proportions exceed 0.9
for all values of $j$. By contrast, the proportion of values of $x$ for which undersmoothing achieves a coverage probability of at least 0.95 is below 0.65 for all values of $n$ and $j$. The absolute error in the coverage probability (column 5 of Tables 1-3) is the absolute value of the difference between the actual coverage probability and the nominal probability of 0.95. Thus, the absolute error increases when the actual coverage probability exceeds 0.95 as well as when the actual coverage probability is less than 0.95. The proportion of values of $x$ for which explicit bias correction achieves a coverage probability of at least 0.95 is below 00 for all values of $n$ and $j$. Undersmoothing and explicit bias correction perform poorly even though the undersmoothing bandwidth and the auxiliary bandwidth for explicit bias correction were chosen to optimize the performance of these methods.

Although confidence intervals based on undersmoothing and explicit bias correction rarely achieve the nominal coverage probability of 0.95, intervals based on these methods often have coverage probabilities of at least 0.93 if $n = 1000$ and $\gamma_1$ and $\gamma_2$ are chosen to maximize the proportions of $x$ values at which the coverage probability equals or exceeds this value. For example, with $g_1(x)$, $\tau = 0.50$, and undersmoothing, the proportion of $x$ values with coverage probabilities of at least 0.93 is 0.97. With explicit bias correction, the proportion of $x$ values with coverage probabilities of at least 0.93 is 0.65. Figure 2 shows the coverage probabilities obtained by the three methods as functions of $x$ with $n = 1000$.

The relatively low proportions of points at which the coverage probability of the bootstrap method equals or exceeds 0.95 for $g_1$ are due to the sharp peak of this function in the vicinity of $x = 0$, which causes the bias of $\hat{g}_1$ to be especially large. HH provide a theoretical explanation for why the bootstrap method performs poorly in regions of unusually high bias. The phenomenon is illustrated in Table 4, which shows the proportion of points for which the bootstrap confidence band covers $g_1(x)$ with probability exceeding 0.95 when the interval $[-0.05, 0.05]$ containing the peak is excluded from $\mathcal{S}$. The proportion of points for which the coverage probability equals or exceeds 0.95 is at least 0.94 when $n = 500$ or $n = 1000$, except for $n = 500$ and $\tau = 0.25$, when the proportion is 0.91. The function $g_2$ has peaks and troughs at $x = 0.15$ and $x = -0.35$, but these are not as sharp as the peak of $g_1$. Consequently, they have little effect on the coverage probabilities for $g_2$ when $n \ge 500$.

Table 5 shows the coverage probabilities of uniform confidence bands obtained with the bootstrap method. The coverage probabilities for $g_2$ and $g_3$ at
all quantiles equal or exceed 0.95 if $n \ge 500$ and 0.93 if $n = 100$. The coverage probabilities for $g_1$ with $n \ge 500$ equal or exceed 0.94 except for $n = 500$ and $\tau = 0.25$, when the coverage probability is 0.9.

5 CONCLUSIONS

This paper has described a bootstrap method for constructing pointwise and uniform confidence bands for a conditional quantile function that is estimated nonparametrically. The method is based on local polynomial estimation and uses only a bandwidth that can be selected using standard methods such as cross validation or plug-in. In contrast to other methods for constructing confidence bands, the bootstrap method does not require bandwidths that under- or oversmooth the nonparametric function estimator. This is an important advantage of the bootstrap method, because there are no satisfactory empirical methods for selecting bandwidths that under- or oversmooth a nonparametric estimator. The bootstrap method presented here is an extension of the method of Hall and Horowitz (2013) for conditional mean functions to conditional quantile functions and uniform confidence bands. The results of Monte Carlo experiments have illustrated the good finite-sample performance of the bootstrap method and the poor performance of methods based on under- or oversmoothing.
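The error distribution used in the Monte Carlo design can be checked with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the regression function `g2` assumes the form $\sin(1.5\pi x)/\{1 + 18x^2[\operatorname{sgn}(x) + 1]\}$. It draws $X \sim U[-1, 1]$ and $\varepsilon \sim N(\mu_\tau, 1)$ with $\mu_\tau = -\Phi^{-1}(\tau)$, which makes $P(\varepsilon \le 0) = \tau$.

```python
import math
import random
from statistics import NormalDist


def mu_for_quantile(tau):
    # epsilon ~ N(mu, 1) has P(epsilon <= 0) = Phi(-mu),
    # so P(epsilon <= 0) = tau requires mu = -Phi^{-1}(tau).
    return -NormalDist().inv_cdf(tau)


def g2(x):
    # Assumed form of the second test function (illustration only).
    sgn = (x > 0) - (x < 0)
    return math.sin(1.5 * math.pi * x) / (1.0 + 18.0 * x * x * (sgn + 1.0))


def simulate(n, tau, g, seed=0):
    """Draw n observations (X_i, Y_i) with Y = g(X) + epsilon."""
    rng = random.Random(seed)
    mu = mu_for_quantile(tau)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [g(x) + rng.gauss(mu, 1.0) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate(1000, 0.25, g2)
    share = sum(y <= g2(x) for x, y in zip(xs, ys)) / len(xs)
    print(round(mu_for_quantile(0.25), 4), share)
```

By construction, the share of observations with $Y \le g(X)$ should be close to $\tau$, so $g$ is the conditional $\tau$-quantile function of $Y$ given $X$.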

TABLE 1: SIMULATION RESULTS FOR τ = 0.25

Method                        n     j   Prop. with Cov.   Av. Abs. Error   Av. Width
                                        Prob. ≥ 0.95      of Cov. Prob.
Bootstrap method of Sec. 2.1  100   1   076               0034             176
                                    2   073               00               157
                                    3   097               004              16
                              500   1   089               0049             101
                                    2   10                0033             083
                                    3   10                008              059
                              1000  1   09                0051             080
                                    2   095               003              06
                                    3   094               006              044
Undersmooth                   100   1   0                 0097             14
                                    2   0                 007              118
                                    3   003               004              116
                              500   1   030               0065             068
                                    2   0                 0016             071
                                    3   035               001              066
                              1000  1   03                006              05
                                    2   041               0014             057
                                    3   040               0009             056
Bias Corr.                    100   1   0                 018              138
                                    2   0                 019              137
                                    3   0                 015              131
                              500   1   0                 011              107
                                    2   008               005              091
                                    3   006               0036             0078
                              1000  1   0                 0068             088
                                    2   008               014              081
                                    3   009               0018             0059
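The three summary columns reported in Tables 1-3 can be computed from estimated coverage probabilities on the grid. The following is a minimal sketch, assuming that `coverage` holds the Monte Carlo coverage estimate at each grid point and that `widths` holds the corresponding band widths; both names, and the choice to average widths over grid points, are assumptions made for illustration.

```python
def summarize_coverage(coverage, widths, nominal=0.95):
    """Return (proportion of grid points with coverage >= nominal,
    average absolute coverage error, average band width)."""
    n_points = len(coverage)
    # Proportion of grid points whose coverage equals or exceeds the nominal level.
    prop = sum(c >= nominal for c in coverage) / n_points
    # Absolute error counts overcoverage and undercoverage alike.
    avg_abs_err = sum(abs(c - nominal) for c in coverage) / n_points
    avg_width = sum(widths) / len(widths)
    return prop, avg_abs_err, avg_width


if __name__ == "__main__":
    print(summarize_coverage([0.96, 0.94, 0.95, 0.90], [1.0, 2.0, 3.0, 2.0]))
```

Because the absolute error treats overcoverage and undercoverage symmetrically, a method can have a high proportion of grid points at or above the nominal level and still show a nonzero average absolute error.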

TABLE 2: SIMULATION RESULTS FOR τ = 0.50

Method                        n     j   Prop. with Cov.   Av. Abs. Error   Av. Width
                                        Prob. ≥ 0.95      of Cov. Prob.
Bootstrap method of Sec. 2.1  100   1   073               0034             164
                                    2   076               00               148
                                    3   097               004              117
                              500   1   089               0048             095
                                    2   1                 0035             079
                                    3   095               007              056
                              1000  1   09                0056             074
                                    2   097               0035             058
                                    3   094               009              041
Undersmooth                   100   1   0                 0075             10
                                    2   0                 0053             119
                                    3   006               006              117
                              500   1   07                001              078
                                    2   041               0016             06
                                    3   051               0094             057
                              1000  1   049               005              056
                                    2   051               0011             051
                                    3   063               0099             043
Bias Corr.                    100   1   0                 014              138
                                    2   0                 01               135
                                    3   0                 009              16
                              500   1   0                 0069             101
                                    2   011               008              086
                                    3   017               0019             069
                              1000  1   005               0038             080
                                    2   011               005              063
                                    3   00                0013             051

TABLE 3: SIMULATION RESULTS FOR τ = 0.75

Method                        n     j   Prop. with Cov.   Av. Abs. Error   Av. Width
                                        Prob. ≥ 0.95      of Cov. Prob.
Bootstrap method of Sec. 2.1  100   1   076               003              174
                                    2   089               003              158
                                    3   094               000              10
                              500   1   086               0045             098
                                    2   1                 0033             078
                                    3   091               00               057
                              1000  1   086               0046             076
                                    2   097               0035             06
                                    3   094               006              04
Undersmooth                   100   1   0                 009              10
                                    2   0                 0071             110
                                    3   0                 0045             090
                              500   1   07                0056             068
                                    2   07                005              06
                                    3   0                 0033             045
                              1000  1   035               0055             05
                                    2   03                004              046
                                    3   011               008              034
Bias Corr.                    100   1   0                 0011             107
                                    2   0                 0019             137
                                    3   0                 0015             131
                              500   1   0                 0068             089
                                    2   008               005              091
                                    3   006               0036             078
                              1000  1   0                 0068             089
                                    2   008               0048             081
                                    3   009               0018             059

TABLE 4: THE BOOTSTRAP METHOD'S COVERAGE PROBABILITIES FOR g_1 WHEN THE INTERVAL [-0.05, 0.05] IS REMOVED FROM S

n     τ     Prop. with Cov.   Av. Abs. Error   Av. Width
            Prob. ≥ 0.95      of Cov. Prob.
100   0.25  076               000              171
      0.50  079               00               16
      0.75  085               001              175
500   0.25  091               0033             09
      0.50  10                0039             088
      0.75  094               0035             094
1000  0.25  094               0034             070
      0.50  10                0041             067
      0.75  094               0035             071

TABLE 5: COVERAGE PROBABILITIES OF UNIFORM CONFIDENCE BANDS OBTAINED BY THE BOOTSTRAP METHOD

τ     n     j   Cov. Prob.
0.25  100   1   084
            2   094
            3   095
      500   1   09
            2   099
            3   097
      1000  1   094
            2   1
            3   098
0.50  100   1   085
            2   095
            3   093
      500   1   095
            2   1
            3   097
      1000  1   095
            2   1
            3   098
0.75  100   1   086
            2   096
            3   096
      500   1   094
            2   1
            3   097
      1000  1   095
            2   10
            3   097
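A uniform band covers the function in a given replication only if it contains the true value at every grid point simultaneously. The sketch below shows one way coverage probabilities such as those in Table 5 could be estimated from simulated bands; the function name and the list-of-lists layout (one list of grid values per replication) are assumptions made for illustration.

```python
def uniform_coverage(lower, upper, truth):
    """Share of replications whose band contains the truth at all grid points.

    lower, upper: one list of band limits per replication.
    truth: true function values on the same grid.
    """
    hits = 0
    for lo, hi in zip(lower, upper):
        # A replication counts only if every grid point is covered.
        if all(l <= g <= h for l, g, h in zip(lo, truth, hi)):
            hits += 1
    return hits / len(lower)


if __name__ == "__main__":
    truth = [0.0, 1.0]
    lower = [[-1.0, 0.5], [0.1, 0.5]]
    upper = [[1.0, 1.5], [1.0, 1.5]]
    print(uniform_coverage(lower, upper, truth))
```

A single uncovered grid point makes the whole replication a miss, which is why uniform bands must be wider than pointwise bands at the same nominal level.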

Figure 1: Conditional quantile functions. Solid line is $g_1(x)$. Long dashes are $g_2(x)$. Short dashes are $g_3(x)$. (Horizontal axis: X; vertical axis: g(x).)

Figure 2a: Coverage probabilities of nominal 95% pointwise confidence bands with $g(x) = x + 5\phi(10x)$ and $n = 1000$. Solid line: proposed bootstrap method. Dashes: explicit bias correction method. Dash-dots: undersmoothing method. Dots: 95% line. (Horizontal axis: X.)

Figure 2b: Coverage probabilities of nominal 95% pointwise confidence bands with $g(x) = \sin(1.5\pi x)/\{1 + 18x^2[\operatorname{sgn}(x) + 1]\}$ and $n = 1000$. Solid line: proposed bootstrap method. Dashes: explicit bias correction method. Dash-dots: undersmoothing method. Dots: 95% line. (Horizontal axis: X.)

Figure 2c: Coverage probabilities of nominal 95% pointwise confidence bands with $g(x) = \sin(1.5\pi x)/\{1 + x[\operatorname{sgn}(x) + 1]\}$ and $n = 1000$. Solid line: proposed bootstrap method. Dashes: explicit bias correction method. Dash-dots: undersmoothing method. Dots: 95% line. (Horizontal axis: X.)

REFERENCES

Bjerve, S., K.A. Doksum, and B.S. Yandell (1985). Uniform confidence bounds for regression based on a simple moving average. Scandinavian Journal of Statistics, 12, 159-169.

Boucheron, S., G. Lugosi, and P. Massart (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford, UK: Oxford University Press.

Boyd, S. and L. Vandenberghe (2004). Convex Optimization. Cambridge, UK: Cambridge University Press.

Calonico, S., M.D. Cattaneo, and M.H. Farrell (2016). On the effect of bias estimation on coverage accuracy in nonparametric inference. Working paper, Department of Economics, University of Michigan.

Chen, S.X. (1996). Empirical likelihood confidence intervals for nonparametric density estimation. Biometrika, 83, 329-341.

Chen, S.X., W. Härdle, and M. Li (2003). An empirical likelihood goodness-of-fit test for time series. Journal of the Royal Statistical Society Series B, 65, 663-678.

Claeskens, G. and I. Van Keilegom (2003). Bootstrap confidence bands for regression curves and their derivatives. Annals of Statistics, 31, 1852-1884.

Eubank, R.L. and P.L. Speckman (1993). Confidence regions in nonparametric regression. Journal of the American Statistical Association, 88, 1287-1301.

Fan, J., T.-C. Hu, and Y.K. Truong (1994). Robust nonparametric function estimation. Scandinavian Journal of Statistics, 21, 433-446.

Hall, P. (1992). Effect of bias estimation on coverage accuracy of bootstrap confidence intervals for a probability density. Annals of Statistics, 20, 675-694.

Hall, P. and J.L. Horowitz (2013). A simple bootstrap method for constructing nonparametric confidence bands for functions. Annals of Statistics, 41, 1892-1921.

Hall, P. and A.B. Owen (1993). Empirical likelihood confidence bands in density estimation. Journal of Computational and Graphical Statistics, 2, 273-289.

Härdle, W. and A.W. Bowman (1988). Bootstrapping in nonparametric regression: Local adaptive smoothing and confidence bands. Journal of the American Statistical Association, 83, 102-110.

Härdle, W., S. Huet, and E. Jolivet (1995). Better bootstrap confidence intervals for regression curve estimation. Statistics, 26, 287-306.

Härdle, W., S. Huet, E. Mammen, and S. Sperlich (2004). Bootstrap inference in semiparametric generalized additive models. Econometric Theory, 20, 265-300.

Härdle, W. and J.S. Marron (1991). Bootstrap simultaneous error bars for nonparametric regression. Annals of Statistics, 19, 778-796.

McMurry, T.L. and D.N. Politis (2008). Bootstrap confidence intervals in nonparametric regression with built-in bias correction. Statistics and Probability Letters, 78, 2463-2469.

Neumann, M.H. (1995). Automatic bandwidth choice and confidence intervals in nonparametric regression. Annals of Statistics, 1937-1959.

Neumann, M.H. and J. Polzehl (1998). Simultaneous bootstrap confidence bands in nonparametric regression. Journal of Nonparametric Statistics, 9, 307-333.

Picard, D. and K. Tribouley (2000). Adaptive confidence interval for pointwise curve estimation. Annals of Statistics, 28, 298-335.

Schucany, W.R. and J.P. Sommers (1977). Improvement of kernel type density estimators. Journal of the American Statistical Association, 72, 420-423.

Sun, J. and C.R. Loader (1994). Simultaneous confidence bands for linear regression and smoothing. Annals of Statistics, 22, 1328-1345.

Xia, Y. (1998). Bias-corrected confidence bands in nonparametric regression. Journal of the Royal Statistical Society Series B, 60, 797-811.

Yu, K. and M.C. Jones (1998). Local linear quantile regression. Journal of the American Statistical Association, 228-237.