SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

Liping Zhu, Mian Huang, and Runze Li

Technical Report Series #10-104
College of Health and Human Development
The Pennsylvania State University

Abstract

This paper is concerned with quantile regression for a semiparametric regression model in which both the conditional mean function and the conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retain the flexibility of nonparametric regression. Under mild conditions, we show that simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result, because the single-index model is possibly misspecified under linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset.

KEY WORDS: Dimension reduction; heteroscedasticity; linearity condition; local polynomial regression; quantile regression; single-index model.

Liping Zhu (luz15@psu.edu) is Research Associate, The Methodology Center, The Pennsylvania State University, University Park, PA 16802. Mian Huang (mzh136@stat.psu.edu) is Assistant Professor, School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200243, P. R. China. Runze Li (rli@stat.psu.edu) is Professor, Department of Statistics and The Methodology Center, The Pennsylvania State University, University Park, PA 16802. Zhu was supported by a National Institute on Drug Abuse (NIDA) grant, R21-DA024260. Huang was supported by a National Science Foundation grant, DMS 0348869, by funding through Project 211 Phase 3 of SHUFE, and by the Shanghai Leading Academic Discipline Project, B803. Li was supported by a NIDA grant, P50-DA10075. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIDA or the NIH.

1. INTRODUCTION

Since the seminal work of Koenker and Bassett (1978), quantile regression has become an important statistical analytic tool. Koenker (2005) provided a comprehensive review of the topic. Nonparametric quantile regression has been extensively studied in the literature. In situations where the covariate is univariate, Bhattacharya and Gangopadhyay (1990) studied kernel estimation and nearest-neighbor estimation, and Fan, Hu and Truong (1994) and Yu and Jones (1998) proposed local linear polynomial quantile regression. Koenker, Ng and Portnoy (1994) proposed regression spline approaches for estimating the conditional quantile function. When the covariate vector x is multivariate, Stone (1977) and Chaudhuri (1991) proposed fully nonparametric quantile regression. Due to the curse of dimensionality, however, fully nonparametric quantile regression may not be very useful in practice. Researchers have therefore considered alternatives that impose some structure on the regression function. De Gooijer and Zerom (2003), Yu and Lu (2004), and Horowitz and Lee (2005) proposed quantile regression for additive models. Wang, Zhu and Zhou (2009) considered quantile regression for partially linear varying coefficient models.

Let Y be a response variable and x = (X_1, ..., X_p)^T a covariate vector. In this paper, we consider the following heteroscedastic single-index model:

    Y = G(x^T β_0) + σ(x^T β_0) ε,    (1.1)

where the error term ε is assumed to be independent of x, with E(ε) = 0 and var(ε) = 1. The functions G(·) and σ(·) are unspecified, nonparametric smoothing functions. The parameter of interest is the direction of β_0. To ensure identifiability, it is assumed throughout this paper that ‖β_0‖ = 1, with the first nonzero element of β_0 being positive.

It is easy to see that model (1.1) includes linear quantile regression (Koenker and Bassett, 1978) as a special case. Model (1.1) becomes a nonparametric regression model when p = 1, and the traditional single-index model (Ichimura, 1993) when σ(·) is a positive constant. Here we allow for heteroscedasticity. The single-index structure in (1.1) retains the flexibility of nonparametric regression and allows for the presence of high-dimensional covariates. Chaudhuri, Doksum and Samarov (1997) proposed an estimation procedure for β_0 using the average derivative approach, which takes partial derivatives of the conditional quantile with respect to the covariates X_j. The average derivative approach requires multivariate kernel regression and is hence less useful in the presence of high-dimensional covariates.

In this paper, we propose a new estimation procedure for model (1.1). The proposed procedure consists of two steps. We first estimate β_0 using linear quantile regression and show that the resulting estimate is consistent. We then employ local linear regression (Fan and Gijbels, 1996) to estimate the conditional quantile function of Y given x^T β_0. This paper makes the following major contributions to the literature: (a) Under mild conditions, we show that for any mean function G(·) and any τ ∈ (0, 1), the τ-th linear quantile regression coefficient of Y on x is proportional to β_0 in the single-index model (1.1). (b) We prove that the local linear regression estimate of the conditional quantile function based upon the root-n consistent estimate of β_0 has the same asymptotic bias and variance as the local linear estimate based on the true value of β_0.

The theoretical result in (a) implies that ordinary linear quantile regression actually yields root-n consistent estimates of β_0; quantile regression thus provides an effective tool for reducing the dimension of the covariates.

In particular, variable selection procedures for quantile regression (Zou and Yuan, 2008; Wu and Liu, 2009) may be directly applicable to model (1.1) with high-dimensional covariates when many covariates are not significant. The result in (b) implies that the dimensionality of the covariates does not affect the asymptotic performance of the local linear estimate.

The rest of this article is organized as follows. In Section 2 we propose a two-step procedure to estimate the conditional quantile function, and we rigorously establish the asymptotic properties of the resulting estimates. We demonstrate the methodologies through comprehensive simulation studies and an application to real data in Section 3. We conclude with a brief discussion in Section 4. All proofs are given in the Appendix.

2. A NEW ESTIMATION PROCEDURE

In this section we study the estimation of quantile regression for model (1.1). We prove that linear quantile regression yields a consistent estimate of β_0. We further demonstrate that quantile regression for model (1.1) can be reduced to univariate quantile regression by replacing β_0 with its estimate from linear quantile regression.

2.1. Estimation of β_0

Let ρ_τ(r) = τr − r·1(r < 0) be the check loss function, and let L_τ(u, β) = E{ρ_τ(Y − u − β^T x)}. Define

    (u_τ, β_τ) := argmin_{u, β} L_τ(u, β).    (2.1)
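As a quick illustration of the check loss (this sketch is ours, not from the paper; Python with NumPy is assumed), minimizing the empirical analogue of E{ρ_τ(Y − u)} over u recovers the τ-th sample quantile:

```python
# A minimal sketch: the u minimizing the average check loss is the tau-th quantile.
import numpy as np

def check_loss(r, tau):
    # rho_tau(r) = tau*r - r*1(r < 0) = r*(tau - 1(r < 0))
    return r * (tau - (r < 0))

rng = np.random.default_rng(0)
y = rng.standard_normal(10_000)
tau = 0.25
grid = np.linspace(-3.0, 3.0, 2001)
losses = [np.mean(check_loss(y - u, tau)) for u in grid]
u_star = grid[np.argmin(losses)]
print(u_star, np.quantile(y, tau))   # the two values nearly coincide
```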

Theorem 1 below implies that linear quantile regression offers a consistent estimate of β_0 in (1.1) under mild conditions.

Theorem 1. If the covariate vector x in model (1.1) satisfies the linearity condition

    E(x | β_0^T x) = var(x) β_0 {β_0^T var(x) β_0}^{-1} β_0^T x,    (2.2)

then β_τ = κ β_0 for some constant κ, where β_τ is defined in (2.1).

The linearity condition (2.2) is widely assumed in the context of sufficient dimension reduction. Li (1991) pointed out that (2.2) is satisfied when x follows an elliptical distribution. Hall and Li (1993) proved that this linearity condition always holds to a good approximation in single-index models of the form (1.1) when the dimension p of the covariates diverges. Thus, the linearity condition is typically regarded as mild, particularly when p is fairly large. It is worth pointing out that similar results can be established for general convex loss functions under the linearity condition.

Theorem 1 implies that existing statistical procedures for linear quantile regression may be applied directly to model (1.1). In particular, existing variable selection procedures for linear quantile regression with high-dimensional covariates (Wu and Liu, 2009; Zou and Yuan, 2008) may be used to select significant variables for model (1.1).
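To see Theorem 1 in action, the following sketch (ours, not the authors' code; it assumes a multivariate normal design, which is elliptical and hence satisfies (2.2), and statsmodels' QuantReg) fits a plain linear quantile regression to data generated from model (1.1) and checks that the normalized coefficient vector aligns with β_0:

```python
# A sketch checking that linear quantile regression recovers the direction of
# beta_0 even though the linear model is misspecified for (1.1).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, p = 2000, 6
beta0 = np.array([2.0, 1.0, 1.0, 0.0, 0.0, 0.0])
beta0 /= np.linalg.norm(beta0)
x = rng.standard_normal((n, p))                     # normal x satisfies (2.2)
index = x @ beta0
y = np.sin(2 * index) + np.exp(index) * rng.standard_normal(n)  # model (1.1)

fit = sm.QuantReg(y, sm.add_constant(x)).fit(q=0.5)
b = np.asarray(fit.params)[1:]                      # drop the intercept u_tau
b /= np.linalg.norm(b)                              # identifiability: ||beta|| = 1
b *= np.sign(b[np.flatnonzero(b)[0]])               # first nonzero element positive
print(abs(b @ beta0))                               # close to 1: parallel to beta_0
```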

Let {(x_i, Y_i), i = 1, ..., n} be a random sample from (x, Y), and let

    L_n^τ(u, β) = n^{-1} ∑_{i=1}^n ρ_τ(Y_i − u − x_i^T β).

Define

    (û_τ, β̂_τ) := argmin_{u, β} L_n^τ(u, β).    (2.3)

The asymptotic normality of β̂_τ is established as follows.

Theorem 2. Assume that Λ = E{f(ξ_τ | x) x x^T} is invertible, where f(z | x) denotes the conditional density of Y − x^T β_τ given x, and ξ_τ denotes its τ-th quantile. Then n^{1/2}(β̂_τ − β_τ) is asymptotically normal with mean zero and covariance matrix τ(1 − τ) Λ^{-1} E(x x^T) Λ^{-1}. If the error term Y − x^T β_τ is independent of x, the asymptotic variance matrix of β̂_τ reduces to τ(1 − τ) {E(x x^T)}^{-1} / {f(ξ_τ)}^2.

2.2. Estimation of the conditional quantile function

By Theorem 1, β_τ is proportional to β_0, and hence the conditional distribution of Y given x^T β_0 is exactly the same as that of Y given x^T β_τ. Consequently, Theorem 1 allows us to estimate the quantile regression of Y on x^T β_0 through the conditional distribution of Y given x^T β_τ. We denote the τ-th quantile function of Y given x^T β_τ by G_τ(·). Using the data points {(x_i^T β̂_τ, Y_i), i = 1, ..., n}, we estimate G_τ(·) by local linear regression. For ease of presentation, let z = x_0^T β_τ, ẑ = x_0^T β̂_τ, Z_i = x_i^T β_τ, and Ẑ_i = x_i^T β̂_τ. Local linear regression approximates G_τ(Z) by G_τ(Z) ≈ G_τ(z) + G_τ′(z)(Z − z) for Z in a neighborhood of z. Let

    (â, b̂) := argmin_{a, b} n^{-1} ∑_{i=1}^n ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K{(Ẑ_i − ẑ)/h}.

Then Ĝ_τ(ẑ) = â and Ĝ_τ′(ẑ) = b̂.
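This display translates directly into a small kernel-weighted optimization. The sketch below (ours, not the authors' implementation; an Epanechnikov kernel, a user-supplied bandwidth h, and SciPy's derivative-free Nelder-Mead optimizer are all assumptions) estimates G_τ and its derivative at a point ẑ:

```python
# A minimal sketch of the local linear quantile fit at a point z_hat.
import numpy as np
from scipy.optimize import minimize

def check_loss(r, tau):
    return r * (tau - (r < 0))

def local_linear_quantile(Z_hat, y, z_hat, tau, h):
    """Minimize the kernel-weighted check loss over (a, b)."""
    u = (Z_hat - z_hat) / h
    w = np.maximum(0.75 * (1.0 - u ** 2), 0.0)      # Epanechnikov kernel weights

    def objective(theta):
        a, b = theta
        return np.sum(w * check_loss(y - a - b * (Z_hat - z_hat), tau))

    # Nelder-Mead sidesteps the non-differentiability of the check loss at zero
    res = minimize(objective, x0=np.array([np.quantile(y, tau), 0.0]),
                   method="Nelder-Mead")
    return res.x                                     # estimates of (G_tau, G_tau')
```

With β̂_τ from the first step, one would set Z_hat = x @ beta_hat and sweep z_hat over a grid to trace out Ĝ_τ(·).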

In the sequel we derive the asymptotic properties of the nonparametric estimate of G_τ(z). For any fixed x_0 ∈ R^p, the asymptotic normality of the local linear estimator Ĝ_τ(ẑ) is presented in the following theorem.

Theorem 3. Under the regularity conditions (i)-(iv) in Appendix A, if h → 0 and nh → ∞, then the asymptotic conditional bias and variance of the local linear quantile regression estimator Ĝ_τ(ẑ) are given by

    bias{Ĝ_τ(ẑ)} = G_τ″(z) μ_2 h^2 / 2 + o_p(h^2),
    var{Ĝ_τ(ẑ)} = (nh)^{-1} Δ {1 + o_p(1)},

where μ_2 = ∫ u^2 K(u) du and

    Δ = {∫_{−1}^{1} K^2(u) du / f(z)} · {τ(1 − τ) / f^2(ξ_τ | z)} · σ^2(z).

Furthermore, as n → ∞,

    (nh)^{1/2} {Ĝ_τ(ẑ) − G_τ(z) − G_τ″(z) μ_2 h^2 / 2}

is asymptotically normal with mean zero and asymptotic variance Δ.

For univariate x, the asymptotic bias and variance of nonparametric quantile regression can be found in Fan, Hu and Truong (1994) and Yu and Jones (1998). We compared the asymptotic bias and variance of the local linear quantile regression estimator built upon {(x_i^T β̂_τ, Y_i), i = 1, ..., n} with those of the estimator built upon {(x_i^T β_τ, Y_i), i = 1, ..., n}, and found that they perform equally well asymptotically. This implies that the estimate from the newly proposed procedure performs as well as an oracle estimate in which β̂_τ is replaced by its true value.
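As a side remark (ours, not stated in the paper, but a standard consequence of bias/variance expansions of this form), trading the squared bias against the variance in Theorem 3 gives the familiar optimal bandwidth order:

```latex
% Minimizing the asymptotic mean squared error over h:
\mathrm{AMSE}(h)
  = \Bigl\{\tfrac12\,G_\tau''(z)\,\mu_2\,h^{2}\Bigr\}^{2} + \frac{\Delta}{nh}
\quad\Longrightarrow\quad
h_{\mathrm{opt}}
  = \biggl\{\frac{\Delta}{\bigl(G_\tau''(z)\,\mu_2\bigr)^{2}}\biggr\}^{1/5} n^{-1/5},
\qquad\text{so } h_{\mathrm{opt}} \asymp n^{-1/5}.
```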

3. SIMULATIONS AND APPLICATION

In this section we conduct simulations to evaluate the performance of the proposed methods and illustrate the proposed methodology with a real data example.

3.1. Simulation

We generated 1000 data sets, each consisting of n = 500 observations, from

    Y = sin{2(x^T β_0)} + 2 exp{−16(x^T β_0)^2} + σ(x^T β_0) ε,    (3.1)

where the index parameter is β_0 = (2, 2, 1, 1, 0, 0, 0, 0, 0, 0)^T / √10, and the covariate vector x = (X_1, ..., X_10)^T is generated from a multivariate normal population with mean zero and variance-covariance matrix var(x) = (σ_ij)_{10×10}, σ_ij = 0.5^{|i−j|}. The conditional mean function in model (3.1) was designed by Kai, Li and Zou (2010). In our simulations, we considered four error distributions for ε: (i) the standard normal distribution N(0, 1); (ii) a mixture of two normal distributions, 0.8 N(0, 1^2) + 0.2 N(0, 3^2); (iii) the Laplace distribution; and (iv) Student's t-distribution with 3 degrees of freedom, t(3). The error term ε and the covariates X_j are mutually independent. We considered two cases for σ(x^T β_0): σ(x^T β_0) = 1 in the homoscedastic case (1), and σ(x^T β_0) = exp(x^T β_0) in the heteroscedastic case (2). We considered case (2) because it is often of interest to investigate the effect of heteroscedastic errors in quantile regression; in case (2) the response values contain outliers with large probability. We examined the performance of our proposed procedures in these scenarios.
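The design in (3.1) is straightforward to reproduce. The sketch below (ours; NumPy's Generator API is assumed) generates one replicate under either error distribution and either variance setting:

```python
# A sketch of one simulated data set from model (3.1).
import numpy as np

rng = np.random.default_rng(2024)
n, p = 500, 10
beta0 = np.array([2, 2, 1, 1, 0, 0, 0, 0, 0, 0]) / np.sqrt(10)
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # 0.5^|i-j|
x = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

def draw_error(kind, size):
    if kind == "normal":
        return rng.standard_normal(size)
    if kind == "mixture":                        # 0.8 N(0,1) + 0.2 N(0,9)
        scale = np.where(rng.random(size) < 0.8, 1.0, 3.0)
        return scale * rng.standard_normal(size)
    if kind == "laplace":
        return rng.laplace(size=size)
    if kind == "t3":
        return rng.standard_t(df=3, size=size)
    raise ValueError(kind)

index = x @ beta0
eps = draw_error("normal", n)
sigma = np.ones(n)                               # case (1); np.exp(index) for case (2)
y = np.sin(2 * index) + 2 * np.exp(-16 * index ** 2) + sigma * eps
```

The MSE criterion reported in Table 1 is then np.sum((beta_hat - beta0) ** 2), averaged over the 1000 replicates and multiplied by 100.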

Performance in estimating β_0. The index parameter β_0 was estimated via a series of quantile regressions with τ = 0.025, 0.05, 0.5, 0.95 and 0.975, respectively. Table 1 reports the averages of the mean squared errors (MSE) of the estimate β̂_τ, where MSE = (β̂_τ − β_0)^T (β̂_τ − β_0). In Table 1, LSE denotes the ordinary least squares estimate. Li and Duan (1989) proved that the LSE is a consistent estimator of the direction of β_0; it therefore serves naturally as a competitor here. In the homoscedastic case (1) with standard normal errors, the LSE performs slightly better than the quantile regression in terms of MSE values. However, the quantile regression is superior to the LSE in all other cases, and its improvement is more pronounced in the heteroscedastic case (2) than in the homoscedastic case (1). This agrees with our expectation, in that the performance of the LSE is sensitive to the presence of outliers; quantile regression offers a more robust estimate in most scenarios.

Performance in coverage probability of prediction intervals. It is often of interest to evaluate the accuracy of quantile regression in providing prediction intervals for Y given x^T β_0. To this end, we estimate the τ-th and (1 − τ)-th quantiles of the conditional distribution of Y given x^T β_0, which together form a prediction interval at level (1 − 2τ) for τ < 0.5. For example, we choose τ = 0.025, 0.05 and 0.10 to obtain intervals at the 95%, 90% and 80% levels. We expect the empirical coverage probability of such an interval to be close to the nominal level (1 − 2τ). The average and the standard deviation of the empirical coverage probabilities are summarized in Table 2. Here the oracle estimation provides the prediction interval using the true index parameter β_0, which serves as a benchmark; the quantile regression estimation is based on β̂_τ. Both the homoscedastic and the heteroscedastic cases reported in Table 2 convey very similar messages. The coverage probabilities are very close to the nominal levels for the different τ values, and the standard deviations of the empirical probabilities are all very small, indicating that the quantile regression offers very reliable coverage, comparable with its oracle version. This verifies the theoretical results in Theorem 3.
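Under the same assumptions as the earlier sketches (in particular, the hypothetical local_linear_quantile helper defined above; none of this is code from the paper), the empirical coverage entries of Table 2 can be computed along these lines:

```python
# A sketch of the empirical coverage of the (1 - 2*tau) prediction interval.
import numpy as np

def empirical_coverage(Z_hat, y, tau, h, fit_fn):
    """fit_fn(Z_hat, y, z, q, h) -> (a_hat, b_hat), e.g. local_linear_quantile."""
    lo = np.array([fit_fn(Z_hat, y, z, tau, h)[0] for z in Z_hat])
    hi = np.array([fit_fn(Z_hat, y, z, 1.0 - tau, h)[0] for z in Z_hat])
    return np.mean((y >= lo) & (y <= hi))   # compare with the nominal 1 - 2*tau
```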

Table 1: The averages of the MSEs. All values in the table are multiplied by 100.

                                case (1): σ(x^T β_0) = 1             case (2): σ(x^T β_0) = exp(x^T β_0)
                                τ-th quantile (in %)                  τ-th quantile (in %)
                                2.5    5     50    LSE   95    97.5   2.5    5     50    LSE   95    97.5
Standard normal distribution   18.2  17.5  13.7  11.7  27.9  29.1   16.1  15.4  11.1  26.7  22.5  23.6
Mixture normal distribution    24.4  23.3  17.9  20.5  34.4  36.0   23.2  22.1  15.0  49.6  28.8  30.2
Laplace distribution           20.1  19.1  14.5  16.8  29.6  30.8   19.0  18.1  12.8  40.5  25.3  26.5
Student t-distribution t(3)    23.9  22.8  16.6  21.9  32.9  34.4   22.4  21.2  14.2  49.2  27.5  28.8

Table 2: The empirical coverage probabilities. All numbers in the table are in %.

                                      case (1): σ(x^T β_0) = 1            case (2): σ(x^T β_0) = exp(x^T β_0)
                                      quantile regr.     oracle            quantile regr.     oracle
nominal level (in %)                  80    90    95     80    90    95    80    90    95     80    90    95
Standard normal distribution  aver.  79.2  89.4  94.5   79.8  89.8  94.8  80.1  90.1  95.1   81.3  90.9  95.5
                              stdev   0.6   0.5   0.4    0.7   0.5   0.4   0.6   0.5   0.4    0.7   0.5   0.4
Mixture normal distribution   aver.  79.4  89.7  94.7   79.8  89.9  94.7  80.2  90.3  95.0   81.1  90.7  95.3
                              stdev   0.5   0.4   0.3    0.6   0.5   0.3   0.6   0.4   0.4    0.7   0.5   0.4
Laplace distribution          aver.  79.4  89.7  94.7   79.7  89.8  94.8  80.2  90.3  95.1   81.2  90.9  95.5
                              stdev   0.5   0.4   0.3    0.6   0.5   0.4   0.6   0.4   0.4    0.7   0.5   0.4
Student t-distribution t(3)   aver.  79.4  89.7  94.7   79.7  89.8  94.8  80.2  90.3  95.1   81.2  90.8  95.4
                              stdev   0.6   0.4   0.3    0.6   0.5   0.4   0.6   0.4   0.4    0.7   0.5   0.4

3.2. An application

To illustrate the usefulness of the proposed procedures, we analyze an automobile dataset (Johnson, 2003). It is of interest to know how the manufacturer's suggested retail price (MSRP) of a vehicle depends upon different levels of attributes such as miles per gallon and horsepower; the MSRP thus serves naturally as the response variable. The MSRP is the manufacturer's assessment of a vehicle's worth, and includes adequate profit for the manufacturer and the dealer.

We use the logarithm of the MSRP in U.S. dollars as the response Y. In addition, there are seven features that possibly affect the price of a vehicle: engine size (X_1), number of cylinders (X_2), horsepower (X_3), average city miles per gallon (MPG) (X_4), average highway MPG (X_5), weight in pounds (X_6), and wheel base in inches (X_7). The dataset consists of 428 observations, sixteen of which have missing values. We remove the observations with missing values from our subsequent analysis; the remaining dataset has a total of eight variables and 412 observations. Both the response variable Y and the covariate vector x = (X_1, ..., X_7)^T are standardized marginally to have zero mean and unit variance in this empirical analysis.

Fully nonparametric regression is not applicable in this case because the sample size is small relative to the dimensionality of the covariates. We therefore specify the heteroscedastic single-index model (1.1) for these vehicle data. The index parameter is estimated at five different quantiles: τ = 0.025, 0.05, 0.50, 0.95 and 0.975. These yield similar estimates of the index parameter β_τ, indicating that similar patterns between the MSRP and the features are revealed at the different quantiles. Specifically, the estimated index parameter at τ = 0.5 is

    β̂_τ = (0.2535, 0.1830, 0.7361, 0.2312, 0.3244, 0.4012, 0.2021)^T.

This implies that horsepower (X_3) is perhaps the most important factor affecting the suggested retail price, whereas the number of cylinders (X_2) has little impact on the MSRP.

Using the data {(x_i^T β̂_τ, Y_i), i = 1, ..., n}, we first estimate the regression function G(x^T β_0) via the local least squares estimator. The estimated function and its 95% confidence interval are reported in Figure 1(C). We also conducted some exploratory analysis of this dataset. The boxplot of the log-transformed MSRP, the response variable, is depicted in Figure 1(A), which shows that there are four vehicles whose

prices are significantly higher than those of the other vehicles. In addition, the histogram in Figure 1(B) reveals that the distribution of the log-transformed prices is highly skewed. This motivated us to analyze this dataset further using the proposed procedure.

Next we apply quantile regression to this dataset. Figure 1(D) presents the 2.5% and 97.5% quantiles, which together give an approximate 95% prediction interval for Y. This prediction interval covers 95.87% of the data points, close to the nominal level. The curve in the middle of Figure 1(D) presents the quantile regression at the 50% level. It behaves similarly to the local least squares estimator in Figure 1(C) over the central range of x^T β̂_τ. When x^T β̂_τ is near the boundary, however, the local least squares estimate is clearly more sensitive to outliers than the quantile estimate. This example demonstrates the effectiveness of the proposed procedure in describing the conditional distribution of Y given x.

Figure 1: (A) Boxplot of the log-transformed retail price. (B) Histogram of the log-transformed retail price. (C) Plot of Ĝ(·) and its 95% confidence interval, using the local linear least squares method. (D) Plot of Ĝ_τ(·) with τ = 2.5%, 50% and 97.5%; the 95% prediction interval of Y given x covers 96% of the data points.

4. DISCUSSION

To estimate quantile regression functions with high-dimensional covariates, we impose a heteroscedastic single-index structure on the regression function. This structure allows us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. We propose a computationally efficient two-step procedure for estimating the parameters involved in the quantile regression function, and we study the asymptotic properties of the proposed procedures. It is remarkable that, if (1.1) reduces to the homoscedastic single-index model (i.e., σ(·) remains constant) and the error ε is symmetric about zero, then a byproduct of this paper is that our estimation procedure offers a robust estimate of the conditional mean function. The simulation studies in this article verify this point implicitly. In addition, it is not necessary that the error term ε in model (1.1) be independent of x; in that case we can assume instead that E(x | x^T β_0, ε) is a linear function of x^T β_0. If the parameter of interest is only the index parameter β_0, the composite quantile regression (Zou and Yuan, 2008) can be adapted to improve the efficiency in estimating β_0. Further research on these issues is of great interest.

APPENDIX: PROOF OF THEOREMS

Appendix A. Technical Conditions

The following technical conditions are imposed. They are not the weakest possible conditions, but they facilitate the proofs.

(i) The quantile function G_τ(·) has a continuous and bounded second derivative.

(ii) The density function f(·) of x^T β is positive and uniformly continuous for β in a neighborhood of β_0. Furthermore, the density function of x^T β_0 is continuous and bounded away from zero and infinity on its support.

(iii) The conditional density of Y given x^T β_0, denoted by f_Y(y | x^T β_0), is continuous in x^T β_0 for each y ∈ R. Moreover, there exist positive constants ε and δ and a positive function F(y | x^T β_0) such that

    sup_{|x^T β − x^T β_0| ≤ ε} f_Y(y | x^T β) ≤ F(y | x^T β_0),

and, for any fixed value of x^T β_0,

    ∫ F(y | x^T β_0) dy < ∞,    ∫ {ρ_τ(y − t) − ρ_τ(y) − ρ_τ′(y) t}^2 F(y | x^T β_0) dy = o(t^2) as t → 0.

(iv) The kernel function K(·) is symmetric and has compact support [−1, 1]. It satisfies a first-order Lipschitz condition.

Appendix B. Proof of Theorem 1

Theorem 1 follows from the convexity of the check loss function and the linearity condition (2.2). It suffices to prove that, for any u ∈ R, there exists a constant κ such that L_τ(u, β) ≥ L_τ(u, κβ_0). Specifically,

    L_τ(u, β) = E[E{ρ_τ(Y − u − β^T x) | β_0^T x, ε}]
              ≥ E[ρ_τ{E(Y | β_0^T x, ε) − u − E(β^T x | β_0^T x, ε)}]
              = E[ρ_τ{Y − u − E(β^T x | β_0^T x)}]
              = E{ρ_τ(Y − u − κ β_0^T x)},

where κ = β^T var(x) β_0 / {β_0^T var(x) β_0}. The first equality follows from the iterated law of conditional expectation; the inequality follows from Jensen's inequality; the second equality holds in view of model (1.1); and the last equality holds by invoking the linearity condition (2.2) together with the independence of ε and x. This completes the proof.

Appendix C. Proof of Theorem 2

This proof relies on the quadratic approximation lemma (Hjort and Pollard, 1993), which we state here to facilitate the argument.

Quadratic Approximation Lemma. Suppose L_n^τ(β) is convex and can be represented as β^T B β / 2 + u_n^T β + a_n + R_n(β), where B is symmetric and positive definite, u_n is stochastically bounded, a_n is arbitrary, and R_n(β) goes to zero in probability for each β.

Then β̂, the minimizer of L_n^τ(β), is only o_p(1) away from −B^{-1} u_n. If u_n converges in distribution to u, then β̂ converges in distribution to −B^{-1} u accordingly.

Now we use the quadratic approximation lemma to prove Theorem 2. Let α_n = n^{-1/2} and a = n^{1/2}(β̂_τ − β_τ), and recall that L_n^τ(u, β) = n^{-1} ∑_{i=1}^n ρ_τ(Y_i − u − x_i^T β). By applying the identity (Knight, 1998)

    ρ_τ(x − y) − ρ_τ(x) = y{1(x ≤ 0) − τ} + ∫_0^y {1(x ≤ t) − 1(x ≤ 0)} dt,

it follows that

    L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ)
      = n^{-1} ∑_{i=1}^n [ρ_τ(Y_i − ξ_τ − α_n b − x_i^T β_τ − α_n x_i^T a) − ρ_τ(Y_i − ξ_τ − x_i^T β_τ)]
      = n^{-1} ∑_{i=1}^n α_n (x_i^T a + b) {1(Y_i − x_i^T β_τ ≤ ξ_τ) − τ}
        + n^{-1} ∑_{i=1}^n ∫_0^{α_n (x_i^T a + b)} {1(Y_i − x_i^T β_τ ≤ ξ_τ + t) − 1(Y_i − x_i^T β_τ ≤ ξ_τ)} dt.

The objective function L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) is convex in (a, b). With a slight change of notation, write F(t | x) = E{1(Y − x^T β_τ ≤ t) | x}. By a Taylor expansion,

    E{L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) | x_1, ..., x_n}
      = n^{-1} ∑_{i=1}^n [α_n (x_i^T a + b) {F(ξ_τ | x_i) − τ} + ∫_{ξ_τ}^{ξ_τ + α_n(x_i^T a + b)} {F(t | x_i) − F(ξ_τ | x_i)} dt]
      = (α_n / n) ∑_{i=1}^n (x_i^T a + b) {F(ξ_τ | x_i) − τ} + (α_n^2 / 2) n^{-1} ∑_{i=1}^n f(ξ_τ | x_i) (b + a^T x_i)^2 + o_p(α_n^2).    (C.1)

The first two terms are of order O_p(1/n), and the last term of (C.1) is a quadratic function of (a, b). Following standard arguments, we obtain that

    R_2 = L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) − E{L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) | x_1, ..., x_n} = o_p(n^{-1}).

This, together with the quadratic approximation lemma of Hjort and Pollard (1993) and (C.1), leads to the asymptotic normality of β̂_τ.

Appendix D. Proof of Theorem 3

Recall the notation z = x_0^T β_τ, ẑ = x_0^T β̂_τ, Z_i = x_i^T β_τ, and Ẑ_i = x_i^T β̂_τ. Let a_0 = G_τ(z) = G_τ(x_0^T β_τ), b_0 = G_τ′(z), K̂_i = K{(Ẑ_i − ẑ)/h}, K_i = K{(Z_i − z)/h}, and K_i′ = K′{(Z_i − z)/h} for i = 1, ..., n. We first show that, for β̂_τ a root-n consistent estimate of β_τ,

    n^{-1} ∑_{i=1}^n [ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K̂_i − ρ_τ{Y_i − a − b(Z_i − z)} K_i] = O_p(n^{-1/2}).    (D.1)

Define

    R_11 = n^{-1} ∑_{i=1}^n K_i [ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)}],
    R_12 = n^{-1} ∑_{i=1}^n (K̂_i − K_i) ρ_τ{Y_i − a − b(Z_i − z)},
    R_13 = n^{-1} ∑_{i=1}^n [ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)}] (K̂_i − K_i).

Then the left-hand side of (D.1) can be decomposed as R_11 + R_12 + R_13. Using the identity ρ_τ(x − y) − ρ_τ(x) = y{1(x ≤ 0) − τ} + ∫_0^y {1(x ≤ t) − 1(x ≤ 0)} dt, we have |ρ_τ(x − y) − ρ_τ(x)| ≤ 4|y|. Invoking the root-n consistency of β̂_τ, we obtain

    n^{-1} ∑_{i=1}^n |K_i [ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)}]|
      ≤ 4b sup_u K(u) n^{-1} ∑_{i=1}^n |Ẑ_i − ẑ − (Z_i − z)|
      = 4b sup_u K(u) n^{-1} ∑_{i=1}^n |(x_i − x_0)^T (β̂_τ − β_τ)| = O_p(n^{-1/2}).

This indicates that R_11 = O_p(n^{-1/2}).

Next we deal with R_12, which can be decomposed as

    R_12 = (β̂_τ − β_τ)^T (nh)^{-1} ∑_{i=1}^n K′{(Z_i − z)/h} (x_i − x_0) ρ_τ{Y_i − a − b(Z_i − z)} {1 + o_p(1)}.

Let z_i = (x_i − x_0)/h. Then

    (nh)^{-1} ∑_{i=1}^n K′{(Z_i − z)/h} (x_i − x_0) ρ_τ{Y_i − a − b(Z_i − z)}
      = n^{-1} ∑_{i=1}^n K′(z_i^T β_τ) z_i ρ_τ(Y_i − a − b h z_i^T β_τ) = O_p(1),

which, together with the root-n consistency of β̂_τ, proves that R_12 = O_p(n^{-1/2}). It remains to investigate the order of R_13. Following arguments similar to those used for R_11 and R_12, we have R_13 = O_p(n^{-1/2}). Thus (D.1) follows by combining the results for R_11, R_12 and R_13, which implies that

    n^{-1} ∑_{i=1}^n ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K̂_i = n^{-1} ∑_{i=1}^n ρ_τ{Y_i − a − b(Z_i − z)} K_i + o_p{(nh)^{-1/2}}.

That is, the quantile regression estimate of G_τ(·) based on {(x_i^T β̂_τ, Y_i), i = 1, ..., n} is asymptotically as efficient as that based on {(x_i^T β_0, Y_i), i = 1, ..., n}. The rest of the proof follows literally from Theorem 3 of Fan, Hu and Truong (1994). Based on the auxiliary data points {(x_i^T β_0, Y_i), i = 1, ..., n}, the technique for establishing the asymptotic normality involves the convexity lemma (Pollard, 1991) and the quadratic approximation lemma (Hjort and Pollard, 1993); we only sketch the outline below. Let

    ψ(t | Z) = E[1{Y ≤ G_τ(z) + G_τ′(z)(Z − z) + t} | Z].

Therefore,

    n^{-1} ∑_{i=1}^n ρ_τ{Y_i − a − b(Z_i − z)} K_i − n^{-1} ∑_{i=1}^n ρ_τ{Y_i − a_0 − b_0(Z_i − z)} K_i
      = n^{-1} ∑_{i=1}^n {(a − a_0) + (b − b_0)(Z_i − z)} {ψ(0 | Z_i) − τ} K_i
        + n^{-1} ∑_{i=1}^n ψ′(0 | Z_i) {(a − a_0) + (b − b_0)(Z_i − z)}^2 K_i {1/2 + o_p(1)}.

Following arguments similar to those used to prove Theorem 3 of Fan, Hu and Truong (1994), we obtain

    (nh)^{1/2} (â − a_0) = (nh)^{-1/2} ∑_{i=1}^n {ψ(0 | Z_i) − τ} K_i / {ψ′(0 | z) f(z)} + o_p(1),

which completes the proof.

REFERENCES

Bhattacharya, P. K. and Gangopadhyay, A. K. (1990). Kernel and nearest-neighbor estimation of a conditional quantile. Annals of Statistics, 18, 1400-1415.

Chaudhuri, P. (1991). Nonparametric estimates of regression quantiles and their local Bahadur representation. Annals of Statistics, 19, 760-777.

Chaudhuri, P., Doksum, K. and Samarov, A. (1997). On average derivative quantile regression. Annals of Statistics, 25, 715-744.

De Gooijer, J. G. and Zerom, D. (2003). On additive conditional quantiles with high-dimensional covariates. Journal of the American Statistical Association, 98, 135-146.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. London: Chapman and Hall.

Fan, J., Hu, T.-C. and Truong, Y. K. (1994). Robust non-parametric function estimation. Scandinavian Journal of Statistics, 21, 433-446.

Hall, P. and Li, K.-C. (1993). On almost linearity of low dimensional projections from high dimensional data. Annals of Statistics, 21, 867-889.

Hjort, N. L. and Pollard, D. (1993). Asymptotics for minimisers of convex processes. Preprint, available at www.stat.yale.edu/~pollard/papers/convex.pdf.

Horowitz, J. L. and Lee, S. (2005). Nonparametric estimation of an additive quantile regression model. Journal of the American Statistical Association, 100, 1238-1249.

Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58, 71-120.

Johnson, R. W. (2003). Kiplinger's personal finance. Journal of Statistics Education, 57, 104-123.

Kai, B., Li, R. and Zou, H. (2010). Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. Journal of the Royal Statistical Society, Series B, 72, 49-69.

Knight, K. (1998). Limiting distributions for L_1 regression estimators under general conditions. Annals of Statistics, 26, 755-770.

Koenker, R. (2005). Quantile Regression. Cambridge: Cambridge University Press.

Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica, 46, 33-50.

Koenker, R., Ng, P. and Portnoy, S. (1994). Quantile smoothing splines. Biometrika, 81, 673-680.

Li, K.-C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316-327.

Li, K.-C. and Duan, N. (1989). Regression analysis under link violation. Annals of Statistics, 17, 1009-1052.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7, 186-199.

Stone, C. J. (1977). Consistent nonparametric regression (with discussion). Annals of Statistics, 5, 595-645.

Wang, H., Zhu, Z. and Zhou, J. (2009). Quantile regression in partially linear varying coefficient models. Annals of Statistics, 37, 3841-3866.

Wu, Y. and Liu, Y. (2009). Variable selection in quantile regression. Statistica Sinica, 19, 801-817.

Yu, K. and Jones, M. C. (1998). Local linear quantile regression. Journal of the American Statistical Association, 93, 228-237.

Yu, K. and Lu, Z. (2004). Local linear additive quantile regression. Scandinavian Journal of Statistics, 31, 333-346.

Zou, H. and Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. Annals of Statistics, 36, 1108-1126.