
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

Liping Zhu, Mian Huang, & Runze Li

The Pennsylvania State University

Technical Report Series #

College of Health and Human Development
The Pennsylvania State University

SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

Liping ZHU, Mian HUANG and Runze LI

Abstract

This paper is concerned with quantile regression for a semiparametric regression model in which both the conditional mean function and the conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates while retaining the flexibility of nonparametric regression. Under mild conditions, we show that simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset.

KEY WORDS: Dimension reduction; heteroscedasticity; linearity condition; local polynomial regression; quantile regression; single-index model.

Liping Zhu (luz15@psu.edu) is Research Associate, The Methodology Center, The Pennsylvania State University at University Park, PA. Mian Huang (mzh136@stat.psu.edu) is Assistant Professor, School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, P. R. China. Runze Li (rli@stat.psu.edu) is Professor, Department of Statistics and The Methodology Center, The Pennsylvania State University at University Park, PA. Zhu was supported by a National Institute on Drug Abuse (NIDA) grant R21-DA. Huang was supported by a National Science Foundation grant DMS, funding through Project 211 Phase 3 of SHUFE, and Shanghai Leading Academic Discipline Project B803. Li was supported by a NIDA grant P50-DA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIDA or the NIH.

1. INTRODUCTION

Since the seminal work of Koenker and Bassett (1978), quantile regression has become an important statistical analytic tool. Koenker (2005) provided a comprehensive review of the topic of quantile regression. Nonparametric quantile regression has been extensively studied in the literature. In situations where the covariate is univariate, Bhattacharya and Gangopadhyay (1990) studied kernel estimation and nearest neighborhood estimation, and Fan, Hu and Truong (1994) and Yu and Jones (1998) proposed local linear quantile regression. Koenker, Ng and Portnoy (1994) proposed regression spline approaches for estimating the conditional quantile function. When the covariate vector x is multivariate, Stone (1977) and Chaudhuri (1991) proposed fully nonparametric quantile regression. Due to the curse of dimensionality, however, fully nonparametric quantile regression may not be very useful in practice. Thus, researchers have considered alternatives that assume a structure on the regression function. De Gooijer and Zerom (2003), Yu and Lu (2004), and Horowitz and Lee (2005) proposed quantile regression for additive models. Wang, Zhu and Zhou (2009) considered quantile regression for partially linear varying coefficient models.

Let Y be a response variable and x = (X_1, ..., X_p)^T be a covariate vector. In this paper, we consider the following heteroscedastic single-index model:

Y = G(x^T β_0) + σ(x^T β_0) ε,    (1.1)

where the error term ε is assumed to be independent of x with E(ε) = 0 and var(ε) = 1. The functions G(·) and σ(·) are unspecified nonparametric smooth functions. The parameter of interest is the direction of β_0. To ensure identifiability, it is assumed throughout this paper that ‖β_0‖ = 1 with its first nonzero element being positive.
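To make the model concrete, here is a minimal sketch (not part of the paper) that draws data from a model of the form (1.1); the particular choices of G(·), σ(·), β_0 and the error distribution are illustrative assumptions only.

```python
# Illustrative sketch: simulate from the heteroscedastic single-index model (1.1).
# G, sigma, beta0 and the Gaussian error below are assumptions of this sketch.
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 10

beta0 = np.zeros(p)
beta0[:2] = 1.0
beta0 /= np.linalg.norm(beta0)            # ||beta0|| = 1, first nonzero entry positive

x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
index = x @ beta0                          # the single index x' beta0

G = lambda t: np.sin(2.0 * t)              # unspecified mean function (illustrative)
sigma = lambda t: 0.5 * np.exp(0.5 * t)    # unspecified positive scale function
eps = rng.standard_normal(n)               # E(eps) = 0, var(eps) = 1, independent of x

y = G(index) + sigma(index) * eps          # response generated from (1.1)
```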

It is easy to see that model (1.1) includes linear quantile regression (Koenker and Bassett, 1978) as a special case. Model (1.1) becomes a nonparametric regression model when p = 1, and the traditional single-index model (Ichimura, 1993) when σ(·) is a positive constant. Here we allow for heteroscedasticity. The single-index structure in (1.1) retains the flexibility of nonparametric regression and allows for the presence of high-dimensional covariates. Chaudhuri, Doksum and Samarov (1997) proposed an estimation procedure for β_0 using the average derivative approach, which takes partial derivatives of the conditional quantile with respect to the covariates X_j's. The average derivative approach requires multivariate kernel regression and is hence less useful in the presence of high-dimensional covariates.

In this paper, we propose a new estimation procedure for model (1.1). The newly proposed estimation procedure consists of two steps. We first estimate β_0 using linear quantile regression, and we show that the resulting estimate is consistent. We then employ local linear regression (Fan and Gijbels, 1996) to estimate the conditional quantile function of Y given x^T β_0.

This paper makes the following major contributions to the literature: (a) Under mild conditions, we show that for any mean function G(·) and any τ ∈ (0, 1), the τ-th linear quantile regression coefficient of Y on x is proportional to β_0 in the single-index model (1.1). (b) We prove that the local linear regression estimate of the conditional quantile function based upon the root-n consistent estimate of β_0 has the same asymptotic bias and variance as the local linear estimate based upon the true value of β_0.

The theoretical result in (a) implies that ordinary linear quantile regression actually yields root-n consistent estimates of the direction of β_0.

Thus, quantile regression provides an effective tool to reduce the dimension of the covariates in the presence of high-dimensional covariates. In particular, variable selection procedures for quantile regression (Zou and Yuan, 2008; Wu and Liu, 2009) may be directly applicable to model (1.1) with high-dimensional covariates when many covariates are not significant. The result in (b) implies that the dimensionality of the covariates does not affect the asymptotic performance of the local linear estimate.

The rest of this article is organized as follows. In Section 2 we propose a two-step procedure to estimate the conditional quantile function; asymptotic properties of the resulting estimates are also rigorously established in that section. We demonstrate the methodologies through comprehensive simulation studies and an application to real data in Section 3. We conclude this paper with a brief discussion in Section 4. All proofs are given in the Appendix.

2. A NEW ESTIMATION PROCEDURE

In this section we study quantile regression estimation for model (1.1). We prove that linear quantile regression yields a consistent estimate of β_0. We further demonstrate that the quantile regression for model (1.1) can be reduced to a univariate quantile regression by replacing β_0 with β̂_0, the estimate resulting from linear quantile regression.

2.1. Estimation of β_0

Let ρ_τ(r) = τ r − r 1(r < 0) be the check loss function, and let L_τ(u, β) = E{ρ_τ(Y − u − β^T x)}. Define

(u_τ, β_τ) = argmin_{u, β} L_τ(u, β).    (2.1)
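As a quick aside (not from the paper), the check loss takes only a couple of lines of code, and a small numerical experiment confirms the familiar fact behind (2.1): minimizing the expected check loss over a constant recovers the τ-th quantile. The sample size and grid below are arbitrary.

```python
# Sketch: the check loss rho_tau(r) = tau*r - r*1(r < 0) and a numerical check
# that argmin_u E rho_tau(Y - u) is the tau-th quantile of Y.
import numpy as np

def check_loss(r, tau):
    """Quantile (check) loss: tau*r for r >= 0 and (tau - 1)*r for r < 0."""
    return r * (tau - (r < 0))

rng = np.random.default_rng(1)
y = rng.standard_normal(10_000)
tau = 0.25

grid = np.linspace(-3.0, 3.0, 601)
risk = np.array([check_loss(y - u, tau).mean() for u in grid])

print(grid[risk.argmin()], np.quantile(y, tau))   # the two values should be close
```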

Theorem 1 below implies that linear quantile regression offers a consistent estimate of β_0 in (1.1) under mild conditions.

Theorem 1. If the covariate vector x in model (1.1) satisfies the linearity condition

E(x | β_0^T x) = var(x) β_0 {β_0^T var(x) β_0}^{-1} β_0^T x,    (2.2)

then β_τ = κ β_0 for some constant κ, where β_τ is defined in (2.1).

The linearity condition (2.2) is widely assumed in the context of sufficient dimension reduction. Li (1991) pointed out that (2.2) is satisfied when x follows an elliptical distribution. Hall and Li (1993) proved that this linearity condition always holds to a good approximation in single-index models of the form (1.1) when the dimension p of the covariates diverges. Thus, the linearity condition is typically regarded as mild, particularly when p is fairly large. It is worth pointing out that similar results can be established for general convex loss functions under the linearity condition.

Theorem 1 implies that existing statistical procedures for linear quantile regression may be applied directly to model (1.1). In particular, existing variable selection procedures for linear quantile regression with high-dimensional covariates (Wu and Liu, 2009; Zou and Yuan, 2008) may be used to select significant variables for model (1.1).
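The following numerical illustration is not from the paper; it is a sketch, under assumed settings, of the message of Theorem 1. With Gaussian covariates, which satisfy (2.2), the slope vector of an ordinary linear quantile fit (computed here with statsmodels' QuantReg) lines up with the direction of β_0 even though the true regression function is nonlinear; the printed cosine similarity should be close to 1, while the proportionality constant κ itself depends on τ, G(·) and σ(·).

```python
# Sketch: linear quantile regression on data from a nonlinear single-index model
# recovers the direction of beta0 (Theorem 1).  The model and tau are assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p, tau = 5000, 6, 0.75

beta0 = np.array([2.0, 1.0, 1.0, 0.0, 0.0, 0.0])
beta0 /= np.linalg.norm(beta0)

x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
index = x @ beta0
y = np.sin(2.0 * index) + 0.5 * np.exp(0.5 * index) * rng.standard_normal(n)

fit = sm.QuantReg(y, sm.add_constant(x)).fit(q=tau)   # misspecified linear fit
slope = np.asarray(fit.params)[1:]                    # drop the intercept
direction = slope / np.linalg.norm(slope)

print(abs(direction @ beta0))                         # cosine with beta0, near 1
```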

Let {(x_i, Y_i), i = 1, ..., n} be a random sample from (x, Y), and let L_n^τ(u, β) = n^{-1} Σ_{i=1}^n ρ_τ(Y_i − u − x_i^T β). Define

(û_τ, β̂_τ) = argmin_{u, β} L_n^τ(u, β).    (2.3)

The asymptotic normality of β̂_τ is established as follows.

Theorem 2. Assume that Λ = E{f(ξ_τ | x) x x^T} is invertible, where f(· | x) denotes the conditional density of Y − x^T β_τ given x, and ξ_τ denotes its τ-th quantile. Then n^{1/2}(β̂_τ − β_τ) is asymptotically normal with mean zero and covariance matrix τ(1 − τ) Λ^{-1} E(x x^T) Λ^{-1}. If the error term Y − x^T β_τ is independent of x, the asymptotic variance matrix of β̂_τ reduces to τ(1 − τ) E(x x^T)^{-1} / {f(ξ_τ)}^2.

2.2. Estimation of the conditional quantile function

By Theorem 1, β_τ is proportional to β_0, and hence the conditional distribution of Y given x^T β_0 is exactly the same as that of Y given x^T β_τ. Consequently, Theorem 1 allows us to estimate the quantile regression of Y given x^T β_0 through the conditional distribution of Y given x^T β_τ. We denote the τ-th quantile function of Y given x^T β_τ by G_τ(·). Using the data points {(x_i^T β̂_τ, Y_i), i = 1, ..., n}, we estimate G_τ(·) by local linear regression. For ease of presentation, let z = x_0^T β_τ, ẑ = x_0^T β̂_τ, Z_i = x_i^T β_τ and Ẑ_i = x_i^T β̂_τ. Local linear regression approximates G_τ(Z) by G_τ(Z) ≈ G_τ(z) + G_τ'(z)(Z − z) for Z in a neighborhood of z. Let

(â, b̂) = argmin_{a, b} n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K{(Ẑ_i − ẑ)/h}.

Then Ĝ_τ(ẑ) = â and Ĝ_τ'(ẑ) = b̂.
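A minimal end-to-end sketch of the two-step procedure is given below. It is not the authors' implementation: the data-generating model, the Epanechnikov kernel, the bandwidth h, and the use of statsmodels and a Nelder-Mead search for the kernel-weighted check loss are all assumptions of the sketch. Step 1 fits a linear quantile regression to obtain the index direction; step 2 minimizes the locally weighted check loss to obtain (â, b̂) at a point.

```python
# Sketch of the two-step estimator: (1) linear quantile regression for the index,
# (2) local linear quantile regression of Y on the estimated index at a point z0.
import numpy as np
from scipy.optimize import minimize
import statsmodels.api as sm

def check_loss(r, tau):
    return r * (tau - (r < 0))

def epanechnikov(u):
    return 0.75 * np.maximum(1.0 - u ** 2, 0.0)       # compact support [-1, 1]

def local_linear_quantile(z0, z, y, tau, h):
    """Minimize sum_i rho_tau(y_i - a - b*(z_i - z0)) * K((z_i - z0)/h) over (a, b)."""
    w = epanechnikov((z - z0) / h)
    def objective(ab):
        a, b = ab
        return np.sum(check_loss(y - a - b * (z - z0), tau) * w)
    start = np.array([np.quantile(y[w > 0], tau), 0.0])
    res = minimize(objective, start, method="Nelder-Mead")
    return res.x                                       # (a_hat, b_hat)

# --- toy usage ---
rng = np.random.default_rng(3)
n, p, tau = 1000, 5, 0.5
beta0 = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
y = np.sin(2.0 * (x @ beta0)) + 0.3 * rng.standard_normal(n)

# Step 1: the slope of a linear quantile fit estimates the direction of beta0.
slope = np.asarray(sm.QuantReg(y, sm.add_constant(x)).fit(q=tau).params)[1:]
beta_hat = slope / np.linalg.norm(slope)
beta_hat *= np.sign(beta_hat[0])                       # identifiability convention

# Step 2: local linear quantile regression on the estimated index.
a_hat, b_hat = local_linear_quantile(0.0, x @ beta_hat, y, tau, h=0.3)
print(a_hat)   # estimate of G_tau(0); close to sin(0) = 0 at tau = 0.5 here
```

In practice the bandwidth h is a tuning parameter; any standard data-driven choice for local linear quantile regression can be plugged in.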

In the sequel we derive the asymptotic properties of the nonparametric estimate of G_τ(z). For any fixed x_0 ∈ R^p, the asymptotic normality of the local linear estimator Ĝ_τ(ẑ) is presented in the following theorem.

Theorem 3. Under the regularity conditions (i)-(iv) in Appendix A, if h → 0 and nh → ∞, then the asymptotic conditional bias and variance of the local linear quantile regression estimator Ĝ_τ(ẑ) are given by

bias{Ĝ_τ(ẑ)} = G_τ''(z) µ_2 h^2 / 2 + o_p(h^2),
var{Ĝ_τ(ẑ)} = (nh)^{-1} Δ {1 + o_p(1)},

where µ_2 = ∫ u^2 K(u) du and Δ = {∫_{-1}^{1} K^2(u) du / f(z)} {τ(1 − τ) / f^2(ξ_τ | z)} σ^2(z). Furthermore, as n → ∞,

(nh)^{1/2} {Ĝ_τ(ẑ) − G_τ(z) − G_τ''(z) µ_2 h^2 / 2}

is asymptotically normal with mean zero and asymptotic variance Δ.

For univariate x, the asymptotic bias and variance of nonparametric quantile regression can be found in Fan, Hu and Truong (1994) and Yu and Jones (1998). We compared the asymptotic bias and variance of the local linear quantile regression estimator built upon {(x_i^T β̂_τ, Y_i), i = 1, ..., n} with those of the estimator built upon {(x_i^T β_τ, Y_i), i = 1, ..., n}, and found that they perform equally well asymptotically. This implies that the resulting estimate of the newly proposed procedure performs as well as an oracle estimate in which β̂_τ is replaced by its true value.
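Although bandwidth selection is not discussed in the paper, the expressions in Theorem 3 suggest one natural choice: balancing the squared bias {G_τ''(z) µ_2 h^2 / 2}^2 against the variance Δ/(nh) and minimizing over h gives h_opt = [Δ / {(G_τ''(z) µ_2)^2 n}]^{1/5}, the usual n^{-1/5} rate. The helper below is a hypothetical plug-in utility (not from the paper) that simply evaluates this formula; all inputs must be supplied or estimated by the user.

```python
# Sketch: AMSE-minimizing bandwidth implied by the bias and variance in Theorem 3.
# d/dh [ (c*h**2)**2 + Delta/(n*h) ] = 0 with c = G''(z)*mu2/2 gives the formula below.
def amse_optimal_bandwidth(n, second_deriv, mu2, delta):
    """h_opt = [Delta / ((G''(z)*mu2)**2 * n)] ** (1/5), i.e. of order n**(-1/5)."""
    return (delta / ((second_deriv * mu2) ** 2 * n)) ** 0.2

# toy call with made-up plug-in constants (mu2 = 0.2 is the Epanechnikov value)
print(amse_optimal_bandwidth(n=500, second_deriv=1.0, mu2=0.2, delta=1.5))
```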

3. SIMULATIONS AND APPLICATION

In this section we conduct simulations to evaluate the performance of the proposed methods and illustrate the proposed methodology with a real data example.

3.1. Simulation

We generated 1000 data sets, each consisting of n = 500 observations, from

Y = sin{2(x^T β_0)} + 2 exp{−16(x^T β_0)^2} + σ(x^T β_0) ε,    (3.1)

where the index parameter β_0 = (2, 2, 1, 1, 0, 0, 0, 0, 0, 0)^T / √10, and the covariate vector x = (X_1, ..., X_10)^T is generated from a multivariate normal population with mean zero and variance-covariance matrix var(x) = (σ_ij) with σ_ij = 0.5^{|i−j|}. The conditional mean function in model (3.1) was designed by Kai, Li and Zou (2010). In our simulations, we considered four error distributions for ε: (i) the standard normal distribution N(0, 1); (ii) a mixture of two normal distributions, 0.8 N(0, 1) + 0.2 N(0, 3^2); (iii) the Laplace distribution; and (iv) the Student t-distribution with 3 degrees of freedom, t(3). The error term ε and the covariates X_j's are mutually independent. We considered two cases for σ(x^T β_0): σ(x^T β_0) = 1 in the homoscedastic case (1), and σ(x^T β_0) = exp(x^T β_0) in the heteroscedastic case (2). We considered case (2) because it is often of interest to investigate the effect of heteroscedastic errors in quantile regression; in case (2) the response values contain outliers with large probability. We examined the performance of our proposed procedures in these scenarios.

Performance in estimating β_0. The index parameter β_0 was estimated via a series of quantile regressions with τ = 0.025, 0.05, 0.5, 0.95 and 0.975, respectively. Table 1 depicts the averages of the mean squared errors (MSE) of the estimate β̂_τ, defined by MSE = (β̂_τ − β_0)^T (β̂_τ − β_0). In Table 1, LSE denotes the ordinary least squares estimate. Li and Duan (1989) proved that the LSE is a consistent estimator of the direction of β_0; thus, it serves naturally as a competitor here.
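The sketch below reproduces a single replication of this design. It is not the authors' code: the heteroscedastic case with standard normal errors is shown, the fitted slope is rescaled to unit norm with a positive first nonzero entry before the MSE is computed, and statsmodels is used for both the quantile and least squares fits; all of these are assumptions of the sketch.

```python
# Sketch: one replication of simulation design (3.1), comparing the quantile
# regression index estimate with the LSE via MSE = ||b_hat - beta0||^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2010)
n, p, tau = 500, 10, 0.5

beta0 = np.array([2, 2, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float) / np.sqrt(10.0)
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
x = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

index = x @ beta0
eps = rng.standard_normal(n)                 # standard normal errors, case (i)
y = np.sin(2 * index) + 2 * np.exp(-16 * index ** 2) + np.exp(index) * eps  # case (2)

def unit_direction(slope):
    d = slope / np.linalg.norm(slope)
    return d * np.sign(d[np.nonzero(d)[0][0]])   # first nonzero entry positive

qr_slope = np.asarray(sm.QuantReg(y, sm.add_constant(x)).fit(q=tau).params)[1:]
ls_slope = np.asarray(sm.OLS(y, sm.add_constant(x)).fit().params)[1:]

print(np.sum((unit_direction(qr_slope) - beta0) ** 2),   # MSE of quantile estimate
      np.sum((unit_direction(ls_slope) - beta0) ** 2))   # MSE of LSE
```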

In the homoscedastic case (1) with standard normal errors, the LSE performs slightly better than the quantile regression in terms of the MSE values. However, the quantile regression is superior to the LSE in all other cases. The improvement of quantile regression is more pronounced in the heteroscedastic case (2) than in the homoscedastic case (1). This phenomenon agrees with our expectation, in that the performance of the LSE is sensitive to the presence of outliers. In contrast, quantile regression offers a more robust estimation in most scenarios.

Performance in coverage probability of prediction intervals. It is often of interest to evaluate the accuracy of quantile regression in providing a prediction interval for Y given x^T β_0. To this end, we can estimate the τ-th and (1 − τ)-th quantiles of the conditional distribution of Y given x^T β_0, which can be used as a prediction interval at the level (1 − 2τ) for τ < 0.5. For example, we choose τ = 0.025, 0.05 and 0.10 to provide prediction intervals at the 95%, 90% and 80% levels. We expect the empirical coverage probability of such an interval to be close to the nominal level (1 − 2τ). The average and the standard deviation of the empirical coverage probabilities are summarized in Table 2. Here the oracle estimation provides the prediction interval using the true index parameter β_0, and serves as a benchmark for comparison purposes; the quantile regression estimation is based on β̂_τ.

Both the homoscedastic and the heteroscedastic cases reported in Table 2 convey very similar messages. The coverage probabilities are very close to the nominal levels for different τ values. The standard deviations of the empirical probabilities are all very small, indicating that the quantile regression offers a very reliable coverage probability, comparable with its oracle version. This phenomenon verifies the theoretical results in Theorem 3.
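As an aside, once lower and upper conditional quantile curves have been estimated at each observation, the empirical coverage reported in Table 2 is a one-line computation. The helper below is a hypothetical illustration with made-up fitted curves, not the paper's code.

```python
# Sketch: empirical coverage of a (1 - 2*tau) prediction interval given fitted
# lower and upper conditional quantile values at each observation.
import numpy as np

def empirical_coverage(y, lower, upper):
    """Fraction of responses falling inside [lower_i, upper_i]."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    return np.mean((y >= lower) & (y <= upper))

# toy usage with made-up quantile curves standing in for Ghat_0.025 and Ghat_0.975
rng = np.random.default_rng(4)
z = rng.standard_normal(500)
y = np.sin(2 * z) + 0.5 * rng.standard_normal(500)
lower = np.sin(2 * z) - 0.5 * 1.96
upper = np.sin(2 * z) + 0.5 * 1.96
print(empirical_coverage(y, lower, upper))    # should be close to 0.95
```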

[Table 1 about here: the averages of MSEs, multiplied by 100, for the τ-th quantile estimates (τ in %) and the LSE, under case (1), σ(x^T β_0) = 1, and case (2), σ(x^T β_0) = exp(x^T β_0), with standard normal, mixture normal, Laplace and Student t(3) error distributions.]

[Table 2 about here: the empirical coverage probabilities (in %), average (aver) and standard deviation (stdev), of the quantile regression and oracle prediction intervals at each nominal level (in %), under cases (1) and (2) and the four error distributions.]

3.2. An application

To illustrate the usefulness of the proposed procedures, we analyze an automobile dataset (Johnson, 2003). It is of interest to know how the manufacturer's suggested retail price (MSRP) of vehicles depends upon different levels of attributes such as miles per gallon and horsepower. Thus the MSRP serves naturally as the response variable.

This is the manufacturer's assessment of a vehicle's worth, and it includes adequate profit for the manufacturer and the dealer. We use the logarithm transformation of the MSRP in U.S. dollars as the response (Y). In addition, there are seven features which possibly affect the resulting price of vehicles: the engine size (X_1), number of cylinders (X_2), horsepower (X_3), average city miles per gallon (MPG, X_4), average highway MPG (X_5), weight in pounds (X_6) and wheel base in inches (X_7). This data set consists of 428 observations, sixteen of which have missing values. We remove the observations with missing values in our subsequent analysis; the remaining data set has a total of eight variables and 412 observations. Both the response variable Y and the covariate vector x = (X_1, ..., X_7)^T are standardized marginally to have zero mean and unit variance in this empirical analysis.

Fully nonparametric regression is not applicable to this particular case because the sample size is small compared with the dimensionality of the covariates. We therefore specify the heteroscedastic single-index model (1.1) for analyzing the new vehicle data. The index parameter is estimated at five different quantiles: τ = 0.025, 0.05, 0.50, 0.95 and 0.975. They yield similar estimates of the index parameter β_τ at the different quantiles, indicating that similar patterns between the MSRP and the different features are revealed at different quantiles. Specifically, the estimated index parameter at τ = 0.5 is β̂_τ = (0.2535, ...)^T. It implies that horsepower (X_3) is perhaps the most important factor affecting the suggested retail price. In contrast, the number of cylinders (X_2) has little impact on the MSRP.

Using the data {(x_i^T β̂_τ, Y_i), i = 1, ..., n}, we first estimate the regression function G(x^T β_0) via the local least squares estimator. The estimated function and its 95% confidence interval are reported in Figure 1(C). We further conduct some exploratory data analysis on this data set. The boxplot of the log-transformed MSRP, the response variable, is depicted in Figure 1(A), which shows that there are four vehicles whose prices are significantly higher than those of the other vehicles.

In addition, the histogram in Figure 1(B) reveals that the distribution of the log-transformed prices is highly skewed. This motivates us to conduct further empirical analysis of this data set using the proposed procedure.

Next we apply the quantile regression to this dataset. Figure 1(D) presents the 2.5% and 97.5% quantiles, which give an approximate 95% prediction interval for Y. This prediction interval covers 95.87% of the data points, which is close to the nominal level. The line in the middle of Figure 1(D) presents the quantile regression at the 50% level. It behaves similarly to the local least squares estimator in Figure 1(C) within the range of x^T β̂_τ. When x^T β̂_τ is near the boundary, however, the local least squares estimate is clearly more sensitive to the outliers than the quantile estimate. This example demonstrates the effectiveness of the proposed procedure in describing the conditional distribution of Y given x.

4. DISCUSSION

To estimate the quantile regression function with high-dimensional covariates, we impose a heteroscedastic single-index structure on the regression function. The heteroscedastic single-index structure allows us to reduce the dimension of the covariates while retaining the flexibility of nonparametric regression. We propose a computationally efficient two-step estimation procedure to estimate the parameters involved in the quantile regression function, and we study the asymptotic properties of the proposed procedures.

It is remarkable here that, if (1.1) reduces to the homoscedastic single-index model (i.e., σ(·) remains constant) and the error ε is symmetric about zero, then a byproduct of this paper is that our estimation procedure offers a robust estimation of the conditional mean function.

[Figure 1 about here. Figure 1: (A) Boxplot of the log-transformed retail price. (B) Histogram of the log-transformed retail price. (C) Plot of Ĝ(·) and its 95% confidence interval using the local linear least squares method. (D) Plot of Ĝ_τ(·) with τ = 2.5%, 50% and 97.5%. The 95% prediction interval of Y given x covers 96% of the points.]

The simulation studies in this article verify this point implicitly. In addition, it is not necessary that the error term ε in model (1.1) be independent of x; in that case we can assume instead that E(x | x^T β_0, ε) is a linear function of x^T β_0. If the parameter of interest is the index parameter β_0 only, the composite quantile regression (Zou and Yuan, 2008) can be adapted to improve the efficiency in estimating β_0. Further research on these issues is of great interest.

APPENDIX: PROOF OF THEOREMS

Appendix A. Technical Conditions

The following technical conditions are imposed. They are not the weakest possible conditions, but they are imposed to facilitate the proofs.

(i) The quantile function G_τ(·) has a continuous and bounded second derivative.

(ii) The density function f(·) of x^T β is positive and uniformly continuous for β in a neighborhood of β_0. Furthermore, the density function of x^T β_0 is continuous and bounded away from zero and infinity on its support.

(iii) The conditional density of Y given x^T β_0, denoted by f_Y(y | x^T β_0), is continuous in x^T β_0 for each y ∈ R. Moreover, there exist positive constants ε and δ and a positive function F(y | x^T β_0) such that

sup_{|x^T β − x^T β_0| ≤ ε} f_Y(y | x^T β) ≤ F(y | x^T β_0).

For any fixed value of x^T β_0, we assume that ∫ F(y | x^T β_0) dy < ∞ and

∫ {ρ_τ(y − t) − ρ_τ(y) + ρ_τ'(y) t}^2 F(y | x^T β_0) dy = o(t^2),  as t → 0.

(iv) The kernel function K(·) is symmetric and has compact support [−1, 1]. It satisfies a first-order Lipschitz condition.

Appendix B. Proof of Theorem 1

Theorem 1 follows from the convexity of the check loss function and the linearity condition (2.2). It suffices to prove that, for any u ∈ R, there exists a constant κ such that L_τ(u, β) ≥ L_τ(u, κβ_0). Specifically,

L_τ(u, β) = E[ E{ρ_τ(Y − u − β^T x) | β_0^T x, ε} ]
          ≥ E[ ρ_τ{E(Y | β_0^T x, ε) − u − E(β^T x | β_0^T x, ε)} ]
          = E[ ρ_τ{Y − u − E(β^T x | β_0^T x)} ]
          = E{ρ_τ(Y − u − κ β_0^T x)},

where κ = β^T var(x) β_0 / {β_0^T var(x) β_0}. The first equality follows from the iterative law of conditional expectation; the inequality follows from Jensen's inequality; the second equality holds in view of model (1.1); and the last equality holds by invoking the linearity condition (2.2) and the independence of the error ε and x. This completes the proof.

Appendix C. Proof of Theorem 2

The proof follows from the quadratic approximation lemma (Hjort and Pollard, 1993). We present the quadratic approximation lemma here to facilitate the proof.

Quadratic Approximation Lemma. Suppose L_n^τ(β) is convex and can be represented as β^T B β/2 + u_n^T β + a_n + R_n(β), where B is symmetric and positive definite, u_n is stochastically bounded, a_n is arbitrary, and R_n(β) goes to zero in probability for each β.

Then β̂, the minimizer of L_n^τ(β), is only o_p(1) away from −B^{-1} u_n. If u_n converges in distribution to u, then β̂ converges in distribution to −B^{-1} u accordingly.

Now we use the quadratic approximation lemma to prove Theorem 2. Let α_n = n^{-1/2} and a = n^{1/2}(β̂_τ − β_τ). Further, let L_n^τ(u, β) = n^{-1} Σ_{i=1}^n ρ_τ(Y_i − u − x_i^T β). By applying the identity (Knight, 1998)

ρ_τ(x − y) − ρ_τ(x) = y {1(x ≤ 0) − τ} + ∫_0^y {1(x ≤ t) − 1(x ≤ 0)} dt,

it follows that

L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ)
= n^{-1} Σ_{i=1}^n ρ_τ(Y_i − ξ_τ − α_n b − x_i^T β_τ − α_n x_i^T a) − n^{-1} Σ_{i=1}^n ρ_τ(Y_i − ξ_τ − x_i^T β_τ)
= n^{-1} Σ_{i=1}^n α_n (x_i^T a + b) {1(Y_i − x_i^T β_τ ≤ ξ_τ) − τ}
  + n^{-1} Σ_{i=1}^n ∫_0^{α_n (x_i^T a + b)} {1(Y_i − x_i^T β_τ ≤ ξ_τ + t) − 1(Y_i − x_i^T β_τ ≤ ξ_τ)} dt.

The objective function L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) is convex in (a, b). With a slight change of notation, we denote F(t | x) = E{1(Y − x^T β_τ ≤ t) | x}. By Taylor's expansion, it follows that

E{L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) | x_1, ..., x_n}
= n^{-1} Σ_{i=1}^n [ α_n (x_i^T a + b) {F(ξ_τ | x_i) − τ} + ∫_{ξ_τ}^{ξ_τ + α_n (x_i^T a + b)} {F(t | x_i) − F(ξ_τ | x_i)} dt ]
= n^{-1} Σ_{i=1}^n [ α_n (x_i^T a + b) {F(ξ_τ | x_i) − τ} + (α_n^2 / 2) f(ξ_τ | x_i) (b + a^T x_i)^2 ] + o_p(α_n^2).    (C.1)

The first two terms are of order O_p(1/n). The last term of (C.1) is a quadratic function of (a, b).

Following standard arguments, we obtain that

R_2 = L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) − E{L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) | x_1, ..., x_n} = o_p(n^{-1}).

This, together with the quadratic approximation lemma of Hjort and Pollard (1993) and (C.1), leads to the asymptotic normality of β̂_τ.

Appendix D. Proof of Theorem 3

Recall the notation z = x_0^T β_τ, ẑ = x_0^T β̂_τ, Z_i = x_i^T β_τ and Ẑ_i = x_i^T β̂_τ. Let a_0 = G_τ(z) = G_τ(x_0^T β_τ), b_0 = G_τ'(z), K̂_i = K{(Ẑ_i − ẑ)/h}, K_i = K{(Z_i − z)/h} and K_i' = K'{(Z_i − z)/h} for i = 1, ..., n. We first show that, for β̂_τ a root-n consistent estimate of β_τ,

n^{-1} Σ_{i=1}^n [ ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K̂_i − ρ_τ{Y_i − a − b(Z_i − z)} K_i ] = O_p(n^{-1/2}).    (D.1)

Define

R_11 = n^{-1} Σ_{i=1}^n K_i [ ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)} ],
R_12 = n^{-1} Σ_{i=1}^n (K̂_i − K_i) ρ_τ{Y_i − a − b(Z_i − z)},
R_13 = n^{-1} Σ_{i=1}^n [ ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)} ] (K̂_i − K_i).

Then the left hand side of (D.1) can be decomposed into R_11 + R_12 + R_13. Using the identity ρ_τ(x − y) − ρ_τ(x) = y{1(x ≤ 0) − τ} + ∫_0^y {1(x ≤ t) − 1(x ≤ 0)} dt, we have |ρ_τ(x − y) − ρ_τ(x)| ≤ 4|y|. Invoking the root-n consistency of β̂_τ, we obtain

|R_11| ≤ 4|b| sup_u K(u) n^{-1} Σ_{i=1}^n |(Ẑ_i − ẑ) − (Z_i − z)| = 4|b| sup_u K(u) n^{-1} Σ_{i=1}^n |(x_i − x_0)^T (β̂_τ − β_τ)| = O_p(n^{-1/2}).

This indicates that R_11 = O_p(n^{-1/2}).

Next we deal with R_12. By a Taylor expansion of the kernel, R_12 can be written as

(β̂_τ − β_τ)^T (nh)^{-1} Σ_{i=1}^n K'{(Z_i − z)/h} (x_i − x_0) ρ_τ{Y_i − a − b(Z_i − z)} {1 + o_p(1)}.

Let z_i = (x_i − x_0)/h. Then

(nh)^{-1} Σ_{i=1}^n K'{(Z_i − z)/h} (x_i − x_0) ρ_τ{Y_i − a − b(Z_i − z)} = n^{-1} Σ_{i=1}^n K'(z_i^T β_τ) z_i ρ_τ(Y_i − a − b h z_i^T β_τ) = O_p(1),

which, together with the root-n consistency of β̂_τ, proves that R_12 = O_p(n^{-1/2}).

It remains to investigate the order of R_13. Following arguments similar to those used for R_11 and R_12, we have R_13 = O_p(n^{-1/2}). Thus, (D.1) follows by combining the results for the R_1k's, which implies that

n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K̂_i = n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Z_i − z)} K_i + o_p{(nh)^{-1/2}}.

That is, the quantile regression estimate of G_τ(·) based on {(x_i^T β̂_τ, Y_i), i = 1, ..., n} is asymptotically as efficient as that based on {(x_i^T β_τ, Y_i), i = 1, ..., n}. The rest of the proof follows literally from Theorem 3 of Fan, Hu and Truong (1994). Based on the auxiliary data points {(x_i^T β_τ, Y_i), i = 1, ..., n}, the technique for establishing the asymptotic normality involves the convexity lemma (Pollard, 1991) and the quadratic approximation lemma (Hjort and Pollard, 1993). We only sketch the outline below. Let

ψ(t | Z) = E[ 1{Y ≤ a_0 + b_0 (Z − z) + t} | Z ].

Therefore,

n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Z_i − z)} K_i − n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a_0 − b_0(Z_i − z)} K_i
= n^{-1} Σ_{i=1}^n {(a − a_0) + (b − b_0)(Z_i − z)} {ψ(0 | Z_i) − τ} K_i
  + n^{-1} Σ_{i=1}^n ψ'(0 | Z_i) {(a − a_0) + (b − b_0)(Z_i − z)}^2 K_i {1/2 + o_p(1)}.

Following arguments similar to those used in proving Theorem 3 of Fan, Hu and Truong (1994), we obtain that

(nh)^{1/2} (â − a_0) = (nh)^{-1/2} Σ_{i=1}^n {ψ(0 | Z_i) − τ} K_i / {ψ'(0 | z) f(z)} + o_p(1),

which completes the proof.

REFERENCES

Bhattacharya, P. K. and Gangopadhyay, A. K. (1990). Kernel and nearest-neighbor estimation of a conditional quantile. Annals of Statistics, 18.

Chaudhuri, P. (1991). Nonparametric estimates of regression quantiles and their local Bahadur representation. Annals of Statistics, 19.

Chaudhuri, P., Doksum, K. and Samarov, A. (1997). On average derivative quantile regression. Annals of Statistics, 25.

De Gooijer, J. G. and Zerom, D. (2003). On additive conditional quantiles with high-dimensional covariates. Journal of the American Statistical Association, 98.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. London: Chapman and Hall.

Fan, J., Hu, T.-C. and Truong, Y. K. (1994). Robust non-parametric function estimation. Scandinavian Journal of Statistics, 21.

Hall, P. and Li, K.-C. (1993). On almost linearity of low dimensional projections from high dimensional data. Annals of Statistics, 21.

Hjort, N. L. and Pollard, D. (1993). Asymptotics for minimisers of convex processes. Preprint, available at pollard/papers/convex.pdf.

Horowitz, J. L. and Lee, S. (2005). Nonparametric estimation of an additive quantile regression model. Journal of the American Statistical Association, 100.

Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58.

Johnson, R. W. (2003). Kiplinger's personal finance. Journal of Statistics Education, 57.

Kai, B., Li, R. and Zou, H. (2010). Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. Journal of the Royal Statistical Society, Series B, 72.

Knight, K. (1998). Limiting distributions for L1 regression estimators under general conditions. Annals of Statistics, 26.

Koenker, R. (2005). Quantile Regression. Cambridge University Press.

Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica, 46.

Koenker, R., Ng, P. and Portnoy, S. (1994). Quantile smoothing splines. Biometrika, 81.

Li, K.-C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86.

Li, K.-C. and Duan, N. (1989). Regression analysis under link violation. Annals of Statistics, 17.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7.

Stone, C. J. (1977). Consistent nonparametric regression (with discussion). Annals of Statistics, 5.

Wang, H., Zhu, Z. and Zhou, J. (2009). Quantile regression in partially linear varying coefficient models. Annals of Statistics, 37.

Wu, Y. and Liu, Y. (2009). Variable selection in quantile regression. Statistica Sinica, 19.

Yu, K. and Jones, M. C. (1998). Local linear quantile regression. Journal of the American Statistical Association, 93.

Yu, K. and Lu, Z. (2004). Local linear additive quantile regression. Scandinavian Journal of Statistics, 31.

Zou, H. and Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. Annals of Statistics, 36.


More information

Introduction to Regression

Introduction to Regression Introduction to Regression p. 1/97 Introduction to Regression Chad Schafer cschafer@stat.cmu.edu Carnegie Mellon University Introduction to Regression p. 1/97 Acknowledgement Larry Wasserman, All of Nonparametric

More information

Nonparametric Econometrics

Nonparametric Econometrics Applied Microeconometrics with Stata Nonparametric Econometrics Spring Term 2011 1 / 37 Contents Introduction The histogram estimator The kernel density estimator Nonparametric regression estimators Semi-

More information

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

More information

Kernel Density Based Linear Regression Estimate

Kernel Density Based Linear Regression Estimate Kernel Density Based Linear Regression Estimate Weixin Yao and Zibiao Zao Abstract For linear regression models wit non-normally distributed errors, te least squares estimate (LSE will lose some efficiency

More information

SUFFICIENT DIMENSION REDUCTION IN REGRESSIONS WITH MISSING PREDICTORS

SUFFICIENT DIMENSION REDUCTION IN REGRESSIONS WITH MISSING PREDICTORS Statistica Sinica 22 (2012), 1611-1637 doi:http://dx.doi.org/10.5705/ss.2009.191 SUFFICIENT DIMENSION REDUCTION IN REGRESSIONS WITH MISSING PREDICTORS Liping Zhu 1, Tao Wang 2 and Lixing Zhu 2 1 Shanghai

More information

Long-Run Covariability

Long-Run Covariability Long-Run Covariability Ulrich K. Müller and Mark W. Watson Princeton University October 2016 Motivation Study the long-run covariability/relationship between economic variables great ratios, long-run Phillips

More information

A lack-of-fit test for quantile regression models with high-dimensional covariates

A lack-of-fit test for quantile regression models with high-dimensional covariates A lack-of-fit test for quantile regression models with high-dimensional covariates Mercedes Conde-Amboage 1, César Sánchez-Sellero 1,2 and Wenceslao González-Manteiga 1 arxiv:1502.05815v1 [stat.me] 20

More information

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY REVSTAT Statistical Journal Volume 7, Number 1, April 2009, 37 54 PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY Authors: Germán Aneiros-Pérez Departamento de Matemáticas,

More information

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O O Combining cross-validation and plug-in methods - for kernel density selection O Carlos Tenreiro CMUC and DMUC, University of Coimbra PhD Program UC UP February 18, 2011 1 Overview The nonparametric problem

More information

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series Willa W. Chen Rohit S. Deo July 6, 009 Abstract. The restricted likelihood ratio test, RLRT, for the autoregressive coefficient

More information

Wrapped Gaussian processes: a short review and some new results

Wrapped Gaussian processes: a short review and some new results Wrapped Gaussian processes: a short review and some new results Giovanna Jona Lasinio 1, Gianluca Mastrantonio 2 and Alan Gelfand 3 1-Università Sapienza di Roma 2- Università RomaTRE 3- Duke University

More information

Testing Equality of Nonparametric Quantile Regression Functions

Testing Equality of Nonparametric Quantile Regression Functions International Journal of Statistics and Probability; Vol. 3, No. 1; 2014 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Testing Equality of Nonparametric Quantile

More information

Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods

Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods arxiv:70.05263v2 [stat.me] 7 Oct 207 Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods Lingzhu Li and Lixing Zhu,2 Department of Mathematics,

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis

More information

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA Presented by Dongjun Chung March 12, 2010 Introduction Definition Oracle Properties Computations Relationship: Nonnegative Garrote Extensions:

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Transformation and Smoothing in Sample Survey Data

Transformation and Smoothing in Sample Survey Data Scandinavian Journal of Statistics, Vol. 37: 496 513, 2010 doi: 10.1111/j.1467-9469.2010.00691.x Published by Blackwell Publishing Ltd. Transformation and Smoothing in Sample Survey Data YANYUAN MA Department

More information

arxiv: v1 [cs.lg] 24 Feb 2014

arxiv: v1 [cs.lg] 24 Feb 2014 Journal of Machine Learning Research 1 (2013) 1-60 Submitted 12/13; Published 00/00 Predictive Interval Models for Non-parametric Regression Mohammad Ghasemi Hamed ENAC, MAIAA, F-31055 Toulouse, France

More information

Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs

Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs Xiao-Hua Zhou, Huazhen Lin, Eric Johnson Journal of Royal Statistical Society Series

More information

Regression Graphics. R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108

Regression Graphics. R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108 Regression Graphics R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108 Abstract This article, which is based on an Interface tutorial, presents an overview of regression

More information