
SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

Liping Zhu, Mian Huang, & Runze Li

The Pennsylvania State University

Technical Report Series #

College of Health and Human Development
The Pennsylvania State University

SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

Liping ZHU, Mian HUANG and Runze LI

Abstract

This paper is concerned with quantile regression for a semiparametric regression model in which both the conditional mean function and the conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates while retaining the flexibility of nonparametric regression. Under mild conditions, we show that simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset.

KEY WORDS: Dimension reduction; heteroscedasticity; linearity condition; local polynomial regression; quantile regression; single-index model.

Liping Zhu (luz15@psu.edu) is Research Associate, The Methodology Center, The Pennsylvania State University at University Park, PA. Mian Huang (mzh136@stat.psu.edu) is Assistant Professor, School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, P. R. China. Runze Li (rli@stat.psu.edu) is Professor, Department of Statistics and The Methodology Center, The Pennsylvania State University at University Park, PA. Zhu was supported by a National Institute on Drug Abuse (NIDA) grant R21-DA. Huang was supported by a National Science Foundation grant DMS, funding through Project 211 Phase 3 of SHUFE, and Shanghai Leading Academic Discipline Project B803. Li was supported by a NIDA grant P50-DA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIDA or the NIH.

1. INTRODUCTION

Since the seminal work of Koenker and Bassett (1978), quantile regression has become an important statistical analytic tool. Koenker (2005) provided a comprehensive review of the topic of quantile regression. Nonparametric quantile regression has been extensively studied in the literature. In situations where the covariate is univariate, Bhattacharya and Gangopadhyay (1990) studied kernel estimation and nearest neighborhood estimation, and Fan, Hu and Truong (1994) and Yu and Jones (1998) proposed local linear quantile regression. Koenker, Ng and Portnoy (1994) proposed regression spline approaches for estimating the conditional quantile function. When the covariate vector x is multivariate, Stone (1977) and Chaudhuri (1991) proposed fully nonparametric quantile regression. Due to the curse of dimensionality, however, fully nonparametric quantile regression may not be very useful in practice. Thus, researchers have considered alternatives that assume a structure on the regression function. De Gooijer and Zerom (2003), Yu and Lu (2004), and Horowitz and Lee (2005) proposed quantile regression for additive models. Wang, Zhu and Zhou (2009) considered quantile regression for partially linear varying coefficient models.

Let Y be a response variable and x = (X_1, ..., X_p)^T be a covariate vector. In this paper, we consider the following heteroscedastic single-index model:

Y = G(x^T β_0) + σ(x^T β_0) ε,    (1.1)

where the error term ε is assumed to be independent of x with E(ε) = 0 and var(ε) = 1. The functions G(·) and σ(·) are unspecified nonparametric smooth functions. The parameter of interest is the direction of β_0. To ensure identifiability, it is assumed throughout this paper that ‖β_0‖ = 1 with its first nonzero element being positive.
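To make the model concrete, here is a minimal sketch (not part of the paper) that draws data from a model of the form (1.1); the particular choices of G(·), σ(·), β_0 and the error distribution are illustrative assumptions only.

```python
# Illustrative sketch: simulate from the heteroscedastic single-index model (1.1).
# G, sigma, beta0 and the Gaussian error below are assumptions of this sketch.
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 10

beta0 = np.zeros(p)
beta0[:2] = 1.0
beta0 /= np.linalg.norm(beta0)            # ||beta0|| = 1, first nonzero entry positive

x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
index = x @ beta0                          # the single index x' beta0

G = lambda t: np.sin(2.0 * t)              # unspecified mean function (illustrative)
sigma = lambda t: 0.5 * np.exp(0.5 * t)    # unspecified positive scale function
eps = rng.standard_normal(n)               # E(eps) = 0, var(eps) = 1, independent of x

y = G(index) + sigma(index) * eps          # response generated from (1.1)
```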

It is easy to see that model (1.1) includes linear quantile regression (Koenker and Bassett, 1978) as a special case. Model (1.1) becomes a nonparametric regression model when p = 1, and the traditional single-index model (Ichimura, 1993) when σ(·) is a positive constant. Here we allow for heteroscedasticity. The single-index structure in (1.1) retains the flexibility of nonparametric regression and allows for the presence of high-dimensional covariates. Chaudhuri, Doksum and Samarov (1997) proposed an estimation procedure for β_0 using the average derivative approach, which takes partial derivatives of the conditional quantile with respect to the covariates X_j's. The average derivative approach requires multivariate kernel regression and is hence less useful in the presence of high-dimensional covariates.

In this paper, we propose a new estimation procedure for model (1.1). The newly proposed estimation procedure consists of two steps. We first estimate β_0 using linear quantile regression, and we show that the resulting estimate is consistent. We then employ local linear regression (Fan and Gijbels, 1996) to estimate the conditional quantile function of Y given x^T β_0.

This paper makes the following major contributions to the literature: (a) Under mild conditions, we show that for any mean function G(·) and any τ ∈ (0, 1), the τ-th linear quantile regression coefficient of Y on x is proportional to β_0 in the single-index model (1.1). (b) We prove that the local linear regression estimate of the conditional quantile function based upon the root-n consistent estimate of β_0 has the same asymptotic bias and variance as the local linear estimate based upon the true value of β_0.

The theoretical result in (a) implies that ordinary linear quantile regression actually yields root-n consistent estimates of the direction of β_0.

Thus, quantile regression provides an effective tool to reduce the dimension of the covariates in the presence of high-dimensional covariates. In particular, variable selection procedures for quantile regression (Zou and Yuan, 2008; Wu and Liu, 2009) may be directly applicable to model (1.1) with high-dimensional covariates when many covariates are not significant. The result in (b) implies that the dimensionality of the covariates does not affect the asymptotic performance of the local linear estimate.

The rest of this article is organized as follows. In Section 2 we propose a two-step procedure to estimate the conditional quantile function; asymptotic properties of the resulting estimates are also rigorously established in that section. We demonstrate the methodologies through comprehensive simulation studies and an application to real data in Section 3. We conclude this paper with a brief discussion in Section 4. All proofs are given in the Appendix.

2. A NEW ESTIMATION PROCEDURE

In this section we study quantile regression estimation for model (1.1). We prove that linear quantile regression yields a consistent estimate of β_0. We further demonstrate that the quantile regression for model (1.1) can be reduced to a univariate quantile regression by replacing β_0 with β̂_0, the estimate resulting from linear quantile regression.

2.1. Estimation of β_0

Let ρ_τ(r) = τ r − r 1(r < 0) be the check loss function, and let L_τ(u, β) = E{ρ_τ(Y − u − β^T x)}. Define

(u_τ, β_τ) = argmin_{u, β} L_τ(u, β).    (2.1)
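As a quick aside (not from the paper), the check loss takes only a couple of lines of code, and a small numerical experiment confirms the familiar fact behind (2.1): minimizing the expected check loss over a constant recovers the τ-th quantile. The sample size and grid below are arbitrary.

```python
# Sketch: the check loss rho_tau(r) = tau*r - r*1(r < 0) and a numerical check
# that argmin_u E rho_tau(Y - u) is the tau-th quantile of Y.
import numpy as np

def check_loss(r, tau):
    """Quantile (check) loss: tau*r for r >= 0 and (tau - 1)*r for r < 0."""
    return r * (tau - (r < 0))

rng = np.random.default_rng(1)
y = rng.standard_normal(10_000)
tau = 0.25

grid = np.linspace(-3.0, 3.0, 601)
risk = np.array([check_loss(y - u, tau).mean() for u in grid])

print(grid[risk.argmin()], np.quantile(y, tau))   # the two values should be close
```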

Theorem 1 below implies that linear quantile regression offers a consistent estimate of β_0 in (1.1) under mild conditions.

Theorem 1. If the covariate vector x in model (1.1) satisfies the linearity condition

E(x | β_0^T x) = var(x) β_0 {β_0^T var(x) β_0}^{-1} β_0^T x,    (2.2)

then β_τ = κ β_0 for some constant κ, where β_τ is defined in (2.1).

The linearity condition (2.2) is widely assumed in the context of sufficient dimension reduction. Li (1991) pointed out that (2.2) is satisfied when x follows an elliptical distribution. Hall and Li (1993) proved that this linearity condition always holds to a good approximation in single-index models of the form (1.1) when the dimension p of the covariates diverges. Thus, the linearity condition is typically regarded as mild, particularly when p is fairly large. It is worth pointing out that similar results can be established for general convex loss functions under the linearity condition.

Theorem 1 implies that existing statistical procedures for linear quantile regression may be applied directly to model (1.1). In particular, existing variable selection procedures for linear quantile regression with high-dimensional covariates (Wu and Liu, 2009; Zou and Yuan, 2008) may be used to select significant variables for model (1.1).
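The following numerical illustration is not from the paper; it is a sketch, under assumed settings, of the message of Theorem 1. With Gaussian covariates, which satisfy (2.2), the slope vector of an ordinary linear quantile fit (computed here with statsmodels' QuantReg) lines up with the direction of β_0 even though the true regression function is nonlinear; the printed cosine similarity should be close to 1, while the proportionality constant κ itself depends on τ, G(·) and σ(·).

```python
# Sketch: linear quantile regression on data from a nonlinear single-index model
# recovers the direction of beta0 (Theorem 1).  The model and tau are assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p, tau = 5000, 6, 0.75

beta0 = np.array([2.0, 1.0, 1.0, 0.0, 0.0, 0.0])
beta0 /= np.linalg.norm(beta0)

x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
index = x @ beta0
y = np.sin(2.0 * index) + 0.5 * np.exp(0.5 * index) * rng.standard_normal(n)

fit = sm.QuantReg(y, sm.add_constant(x)).fit(q=tau)   # misspecified linear fit
slope = np.asarray(fit.params)[1:]                    # drop the intercept
direction = slope / np.linalg.norm(slope)

print(abs(direction @ beta0))                         # cosine with beta0, near 1
```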

Let {(x_i, Y_i), i = 1, ..., n} be a random sample from (x, Y), and let L_n^τ(u, β) = n^{-1} Σ_{i=1}^n ρ_τ(Y_i − u − x_i^T β). Define

(û_τ, β̂_τ) = argmin_{u, β} L_n^τ(u, β).    (2.3)

The asymptotic normality of β̂_τ is established as follows.

Theorem 2. Assume that Λ = E{f(ξ_τ | x) x x^T} is invertible, where f(· | x) denotes the conditional density of Y − x^T β_τ given x, and ξ_τ denotes its τ-th quantile. Then n^{1/2}(β̂_τ − β_τ) is asymptotically normal with mean zero and covariance matrix τ(1 − τ) Λ^{-1} E(x x^T) Λ^{-1}. If the error term Y − x^T β_τ is independent of x, the asymptotic variance matrix of β̂_τ reduces to τ(1 − τ) E(x x^T)^{-1} / {f(ξ_τ)}^2.

2.2. Estimation of the conditional quantile function

By Theorem 1, β_τ is proportional to β_0, and hence the conditional distribution of Y given x^T β_0 is exactly the same as that of Y given x^T β_τ. Consequently, Theorem 1 allows us to estimate the quantile regression of Y given x^T β_0 through the conditional distribution of Y given x^T β_τ. We denote the τ-th quantile function of Y given x^T β_τ by G_τ(·). Using the data points {(x_i^T β̂_τ, Y_i), i = 1, ..., n}, we estimate G_τ(·) by local linear regression. For ease of presentation, let z = x_0^T β_τ, ẑ = x_0^T β̂_τ, Z_i = x_i^T β_τ and Ẑ_i = x_i^T β̂_τ. Local linear regression approximates G_τ(Z) by G_τ(Z) ≈ G_τ(z) + G_τ'(z)(Z − z) for Z in a neighborhood of z. Let

(â, b̂) = argmin_{a, b} n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K{(Ẑ_i − ẑ)/h}.

Then Ĝ_τ(ẑ) = â and Ĝ_τ'(ẑ) = b̂.
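A minimal end-to-end sketch of the two-step procedure is given below. It is not the authors' implementation: the data-generating model, the Epanechnikov kernel, the bandwidth h, and the use of statsmodels and a Nelder-Mead search for the kernel-weighted check loss are all assumptions of the sketch. Step 1 fits a linear quantile regression to obtain the index direction; step 2 minimizes the locally weighted check loss to obtain (â, b̂) at a point.

```python
# Sketch of the two-step estimator: (1) linear quantile regression for the index,
# (2) local linear quantile regression of Y on the estimated index at a point z0.
import numpy as np
from scipy.optimize import minimize
import statsmodels.api as sm

def check_loss(r, tau):
    return r * (tau - (r < 0))

def epanechnikov(u):
    return 0.75 * np.maximum(1.0 - u ** 2, 0.0)       # compact support [-1, 1]

def local_linear_quantile(z0, z, y, tau, h):
    """Minimize sum_i rho_tau(y_i - a - b*(z_i - z0)) * K((z_i - z0)/h) over (a, b)."""
    w = epanechnikov((z - z0) / h)
    def objective(ab):
        a, b = ab
        return np.sum(check_loss(y - a - b * (z - z0), tau) * w)
    start = np.array([np.quantile(y[w > 0], tau), 0.0])
    res = minimize(objective, start, method="Nelder-Mead")
    return res.x                                       # (a_hat, b_hat)

# --- toy usage ---
rng = np.random.default_rng(3)
n, p, tau = 1000, 5, 0.5
beta0 = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
x = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
y = np.sin(2.0 * (x @ beta0)) + 0.3 * rng.standard_normal(n)

# Step 1: the slope of a linear quantile fit estimates the direction of beta0.
slope = np.asarray(sm.QuantReg(y, sm.add_constant(x)).fit(q=tau).params)[1:]
beta_hat = slope / np.linalg.norm(slope)
beta_hat *= np.sign(beta_hat[0])                       # identifiability convention

# Step 2: local linear quantile regression on the estimated index.
a_hat, b_hat = local_linear_quantile(0.0, x @ beta_hat, y, tau, h=0.3)
print(a_hat)   # estimate of G_tau(0); close to sin(0) = 0 at tau = 0.5 here
```

In practice the bandwidth h is a tuning parameter; any standard data-driven choice for local linear quantile regression can be plugged in.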

In the sequel we derive the asymptotic properties of the nonparametric estimate of G_τ(z). For any fixed x_0 ∈ R^p, the asymptotic normality of the local linear estimator Ĝ_τ(ẑ) is presented in the following theorem.

Theorem 3. Under the regularity conditions (i)-(iv) in Appendix A, if h → 0 and nh → ∞, then the asymptotic conditional bias and variance of the local linear quantile regression estimator Ĝ_τ(ẑ) are given by

bias{Ĝ_τ(ẑ)} = G_τ''(z) µ_2 h^2 / 2 + o_p(h^2),
var{Ĝ_τ(ẑ)} = (nh)^{-1} Δ {1 + o_p(1)},

where µ_2 = ∫ u^2 K(u) du and Δ = {∫_{-1}^{1} K^2(u) du / f(z)} {τ(1 − τ) / f^2(ξ_τ | z)} σ^2(z). Furthermore, as n → ∞,

(nh)^{1/2} {Ĝ_τ(ẑ) − G_τ(z) − G_τ''(z) µ_2 h^2 / 2}

is asymptotically normal with mean zero and asymptotic variance Δ.

For univariate x, the asymptotic bias and variance of nonparametric quantile regression can be found in Fan, Hu and Truong (1994) and Yu and Jones (1998). We compared the asymptotic bias and variance of the local linear quantile regression estimator built upon {(x_i^T β̂_τ, Y_i), i = 1, ..., n} with those of the estimator built upon {(x_i^T β_τ, Y_i), i = 1, ..., n}, and found that they perform equally well asymptotically. This implies that the resulting estimate of the newly proposed procedure performs as well as an oracle estimate in which β̂_τ is replaced by its true value.
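Although bandwidth selection is not discussed in the paper, the expressions in Theorem 3 suggest one natural choice: balancing the squared bias {G_τ''(z) µ_2 h^2 / 2}^2 against the variance Δ/(nh) and minimizing over h gives h_opt = [Δ / {(G_τ''(z) µ_2)^2 n}]^{1/5}, the usual n^{-1/5} rate. The helper below is a hypothetical plug-in utility (not from the paper) that simply evaluates this formula; all inputs must be supplied or estimated by the user.

```python
# Sketch: AMSE-minimizing bandwidth implied by the bias and variance in Theorem 3.
# d/dh [ (c*h**2)**2 + Delta/(n*h) ] = 0 with c = G''(z)*mu2/2 gives the formula below.
def amse_optimal_bandwidth(n, second_deriv, mu2, delta):
    """h_opt = [Delta / ((G''(z)*mu2)**2 * n)] ** (1/5), i.e. of order n**(-1/5)."""
    return (delta / ((second_deriv * mu2) ** 2 * n)) ** 0.2

# toy call with made-up plug-in constants (mu2 = 0.2 is the Epanechnikov value)
print(amse_optimal_bandwidth(n=500, second_deriv=1.0, mu2=0.2, delta=1.5))
```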

3. SIMULATIONS AND APPLICATION

In this section we conduct simulations to evaluate the performance of the proposed methods and illustrate the proposed methodology with a real data example.

3.1. Simulation

We generated 1000 data sets, each consisting of n = 500 observations, from

Y = sin{2(x^T β_0)} + 2 exp{−16(x^T β_0)^2} + σ(x^T β_0) ε,    (3.1)

where the index parameter β_0 = (2, 2, 1, 1, 0, 0, 0, 0, 0, 0)^T / √10, and the covariate vector x = (X_1, ..., X_10)^T is generated from a multivariate normal population with mean zero and variance-covariance matrix var(x) = (σ_ij) with σ_ij = 0.5^{|i−j|}. The conditional mean function in model (3.1) was designed by Kai, Li and Zou (2010). In our simulations, we considered four error distributions for ε: (i) the standard normal distribution N(0, 1); (ii) a mixture of two normal distributions, 0.8 N(0, 1) + 0.2 N(0, 3^2); (iii) the Laplace distribution; and (iv) the Student t-distribution with 3 degrees of freedom, t(3). The error term ε and the covariates X_j's are mutually independent. We considered two cases for σ(x^T β_0): σ(x^T β_0) = 1 in the homoscedastic case (1), and σ(x^T β_0) = exp(x^T β_0) in the heteroscedastic case (2). We considered case (2) because it is often of interest to investigate the effect of heteroscedastic errors in quantile regression; in case (2) the response values contain outliers with large probability. We examined the performance of our proposed procedures in these scenarios.

Performance in estimating β_0. The index parameter β_0 was estimated via a series of quantile regressions with τ = 0.025, 0.05, 0.5, 0.95 and 0.975, respectively. Table 1 depicts the averages of the mean squared errors (MSE) of the estimate β̂_τ, defined by MSE = (β̂_τ − β_0)^T (β̂_τ − β_0). In Table 1, LSE denotes the ordinary least squares estimate. Li and Duan (1989) proved that the LSE is a consistent estimator of the direction of β_0; thus, it serves naturally as a competitor here.
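The sketch below reproduces a single replication of this design. It is not the authors' code: the heteroscedastic case with standard normal errors is shown, the fitted slope is rescaled to unit norm with a positive first nonzero entry before the MSE is computed, and statsmodels is used for both the quantile and least squares fits; all of these are assumptions of the sketch.

```python
# Sketch: one replication of simulation design (3.1), comparing the quantile
# regression index estimate with the LSE via MSE = ||b_hat - beta0||^2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2010)
n, p, tau = 500, 10, 0.5

beta0 = np.array([2, 2, 1, 1, 0, 0, 0, 0, 0, 0], dtype=float) / np.sqrt(10.0)
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
x = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

index = x @ beta0
eps = rng.standard_normal(n)                 # standard normal errors, case (i)
y = np.sin(2 * index) + 2 * np.exp(-16 * index ** 2) + np.exp(index) * eps  # case (2)

def unit_direction(slope):
    d = slope / np.linalg.norm(slope)
    return d * np.sign(d[np.nonzero(d)[0][0]])   # first nonzero entry positive

qr_slope = np.asarray(sm.QuantReg(y, sm.add_constant(x)).fit(q=tau).params)[1:]
ls_slope = np.asarray(sm.OLS(y, sm.add_constant(x)).fit().params)[1:]

print(np.sum((unit_direction(qr_slope) - beta0) ** 2),   # MSE of quantile estimate
      np.sum((unit_direction(ls_slope) - beta0) ** 2))   # MSE of LSE
```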

In the homoscedastic case (1) with standard normal errors, the LSE performs slightly better than the quantile regression in terms of the MSE values. However, the quantile regression is superior to the LSE in all other cases. The improvement of quantile regression is more pronounced in the heteroscedastic case (2) than in the homoscedastic case (1). This phenomenon agrees with our expectation, in that the performance of the LSE is sensitive to the presence of outliers. In contrast, quantile regression offers a more robust estimation in most scenarios.

Performance in coverage probability of prediction intervals. It is often of interest to evaluate the accuracy of quantile regression in providing a prediction interval for Y given x^T β_0. To this end, we can estimate the τ-th and (1 − τ)-th quantiles of the conditional distribution of Y given x^T β_0, which can be used as a prediction interval at the level (1 − 2τ) for τ < 0.5. For example, we choose τ = 0.025, 0.05 and 0.10 to provide prediction intervals at the 95%, 90% and 80% levels. We expect the empirical coverage probability of such an interval to be close to the nominal level (1 − 2τ). The average and the standard deviation of the empirical coverage probabilities are summarized in Table 2. Here the oracle estimation provides the prediction interval using the true index parameter β_0, and serves as a benchmark for comparison purposes; the quantile regression estimation is based on β̂_τ.

Both the homoscedastic and the heteroscedastic cases reported in Table 2 convey very similar messages. The coverage probabilities are very close to the nominal levels for different τ values. The standard deviations of the empirical probabilities are all very small, indicating that the quantile regression offers a very reliable coverage probability, comparable with its oracle version. This phenomenon verifies the theoretical results in Theorem 3.
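As an aside, once lower and upper conditional quantile curves have been estimated at each observation, the empirical coverage reported in Table 2 is a one-line computation. The helper below is a hypothetical illustration with made-up fitted curves, not the paper's code.

```python
# Sketch: empirical coverage of a (1 - 2*tau) prediction interval given fitted
# lower and upper conditional quantile values at each observation.
import numpy as np

def empirical_coverage(y, lower, upper):
    """Fraction of responses falling inside [lower_i, upper_i]."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    return np.mean((y >= lower) & (y <= upper))

# toy usage with made-up quantile curves standing in for Ghat_0.025 and Ghat_0.975
rng = np.random.default_rng(4)
z = rng.standard_normal(500)
y = np.sin(2 * z) + 0.5 * rng.standard_normal(500)
lower = np.sin(2 * z) - 0.5 * 1.96
upper = np.sin(2 * z) + 0.5 * 1.96
print(empirical_coverage(y, lower, upper))    # should be close to 0.95
```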

[Table 1 about here: the averages of MSEs, multiplied by 100, for the τ-th quantile estimates (τ in %) and the LSE, under case (1), σ(x^T β_0) = 1, and case (2), σ(x^T β_0) = exp(x^T β_0), with standard normal, mixture normal, Laplace and Student t(3) error distributions.]

[Table 2 about here: the empirical coverage probabilities (in %), average (aver) and standard deviation (stdev), of the quantile regression and oracle prediction intervals at each nominal level (in %), under cases (1) and (2) and the four error distributions.]

3.2. An application

To illustrate the usefulness of the proposed procedures, we analyze an automobile dataset (Johnson, 2003). It is of interest to know how the manufacturer's suggested retail price (MSRP) of vehicles depends upon different levels of attributes such as miles per gallon and horsepower. Thus the MSRP serves naturally as the response variable.

This is the manufacturer's assessment of a vehicle's worth, and it includes adequate profit for the manufacturer and the dealer. We use the logarithm transformation of the MSRP in U.S. dollars as the response (Y). In addition, there are seven features which possibly affect the resulting price of vehicles: the engine size (X_1), number of cylinders (X_2), horsepower (X_3), average city miles per gallon (MPG, X_4), average highway MPG (X_5), weight in pounds (X_6) and wheel base in inches (X_7). This data set consists of 428 observations, sixteen of which have missing values. We remove the observations with missing values in our subsequent analysis; the remaining data set has a total of eight variables and 412 observations. Both the response variable Y and the covariate vector x = (X_1, ..., X_7)^T are standardized marginally to have zero mean and unit variance in this empirical analysis.

Fully nonparametric regression is not applicable to this particular case because the sample size is small compared with the dimensionality of the covariates. We therefore specify the heteroscedastic single-index model (1.1) for analyzing the new vehicle data. The index parameter is estimated at five different quantiles: τ = 0.025, 0.05, 0.50, 0.95 and 0.975. They yield similar estimates of the index parameter β_τ at the different quantiles, indicating that similar patterns between the MSRP and the different features are revealed at different quantiles. Specifically, the estimated index parameter at τ = 0.5 is β̂_τ = (0.2535, ...)^T. It implies that horsepower (X_3) is perhaps the most important factor affecting the suggested retail price. In contrast, the number of cylinders (X_2) has little impact on the MSRP.

Using the data {(x_i^T β̂_τ, Y_i), i = 1, ..., n}, we first estimate the regression function G(x^T β_0) via the local least squares estimator. The estimated function and its 95% confidence interval are reported in Figure 1(C). We further conduct some exploratory data analysis on this data set. The boxplot of the log-transformed MSRP, the response variable, is depicted in Figure 1(A), which shows that there are four vehicles whose prices are significantly higher than those of the other vehicles.

In addition, the histogram in Figure 1(B) reveals that the distribution of the log-transformed prices is highly skewed. This motivates us to conduct further empirical analysis of this data set using the proposed procedure.

Next we apply the quantile regression to this dataset. Figure 1(D) presents the 2.5% and 97.5% quantiles, which give an approximate 95% prediction interval for Y. This prediction interval covers 95.87% of the data points, which is close to the nominal level. The line in the middle of Figure 1(D) presents the quantile regression at the 50% level. It behaves similarly to the local least squares estimator in Figure 1(C) within the range of x^T β̂_τ. When x^T β̂_τ is near the boundary, however, the local least squares estimate is clearly more sensitive to the outliers than the quantile estimate. This example demonstrates the effectiveness of the proposed procedure in describing the conditional distribution of Y given x.

4. DISCUSSION

To estimate the quantile regression function with high-dimensional covariates, we impose a heteroscedastic single-index structure on the regression function. The heteroscedastic single-index structure allows us to reduce the dimension of the covariates while retaining the flexibility of nonparametric regression. We propose a computationally efficient two-step estimation procedure to estimate the parameters involved in the quantile regression function, and we study the asymptotic properties of the proposed procedures.

It is remarkable here that, if (1.1) reduces to the homoscedastic single-index model (i.e., σ(·) remains constant) and the error ε is symmetric about zero, then a byproduct of this paper is that our estimation procedure offers a robust estimation of the conditional mean function.

[Figure 1 about here. Figure 1: (A) Boxplot of the log-transformed retail price. (B) Histogram of the log-transformed retail price. (C) Plot of Ĝ(·) and its 95% confidence interval using the local linear least squares method. (D) Plot of Ĝ_τ(·) with τ = 2.5%, 50% and 97.5%. The 95% prediction interval of Y given x covers 96% of the points.]

The simulation studies in this article verify this point implicitly. In addition, it is not necessary that the error term ε in model (1.1) be independent of x; in that case we can assume instead that E(x | x^T β_0, ε) is a linear function of x^T β_0. If the parameter of interest is the index parameter β_0 only, the composite quantile regression (Zou and Yuan, 2008) can be adapted to improve the efficiency in estimating β_0. Further research on these issues is of great interest.

APPENDIX: PROOF OF THEOREMS

Appendix A. Technical Conditions

The following technical conditions are imposed. They are not the weakest possible conditions, but they are imposed to facilitate the proofs.

(i) The quantile function G_τ(·) has a continuous and bounded second derivative.

(ii) The density function f(·) of x^T β is positive and uniformly continuous for β in a neighborhood of β_0. Furthermore, the density function of x^T β_0 is continuous and bounded away from zero and infinity on its support.

(iii) The conditional density of Y given x^T β_0, denoted by f_Y(y | x^T β_0), is continuous in x^T β_0 for each y ∈ R. Moreover, there exist positive constants ε and δ and a positive function F(y | x^T β_0) such that

sup_{|x^T β − x^T β_0| ≤ ε} f_Y(y | x^T β) ≤ F(y | x^T β_0).

For any fixed value of x^T β_0, we assume that ∫ F(y | x^T β_0) dy < ∞ and

∫ {ρ_τ(y − t) − ρ_τ(y) + ρ_τ'(y) t}^2 F(y | x^T β_0) dy = o(t^2),  as t → 0.

(iv) The kernel function K(·) is symmetric and has compact support [−1, 1]. It satisfies a first-order Lipschitz condition.

Appendix B. Proof of Theorem 1

Theorem 1 follows from the convexity of the check loss function and the linearity condition (2.2). It suffices to prove that, for any u ∈ R, there exists a constant κ such that L_τ(u, β) ≥ L_τ(u, κβ_0). Specifically,

L_τ(u, β) = E[ E{ρ_τ(Y − u − β^T x) | β_0^T x, ε} ]
          ≥ E[ ρ_τ{E(Y | β_0^T x, ε) − u − E(β^T x | β_0^T x, ε)} ]
          = E[ ρ_τ{Y − u − E(β^T x | β_0^T x)} ]
          = E{ρ_τ(Y − u − κ β_0^T x)},

where κ = β^T var(x) β_0 / {β_0^T var(x) β_0}. The first equality follows from the iterative law of conditional expectation; the inequality follows from Jensen's inequality; the second equality holds in view of model (1.1); and the last equality holds by invoking the linearity condition (2.2) and the independence of the error ε and x. This completes the proof.

Appendix C. Proof of Theorem 2

The proof follows from the quadratic approximation lemma (Hjort and Pollard, 1993). We present the quadratic approximation lemma here to facilitate the proof.

Quadratic Approximation Lemma. Suppose L_n^τ(β) is convex and can be represented as β^T B β/2 + u_n^T β + a_n + R_n(β), where B is symmetric and positive definite, u_n is stochastically bounded, a_n is arbitrary, and R_n(β) goes to zero in probability for each β.

Then β̂, the minimizer of L_n^τ(β), is only o_p(1) away from −B^{-1} u_n. If u_n converges in distribution to u, then β̂ converges in distribution to −B^{-1} u accordingly.

Now we use the quadratic approximation lemma to prove Theorem 2. Let α_n = n^{-1/2} and a = n^{1/2}(β̂_τ − β_τ). Further, let L_n^τ(u, β) = n^{-1} Σ_{i=1}^n ρ_τ(Y_i − u − x_i^T β). By applying the identity (Knight, 1998)

ρ_τ(x − y) − ρ_τ(x) = y {1(x ≤ 0) − τ} + ∫_0^y {1(x ≤ t) − 1(x ≤ 0)} dt,

it follows that

L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ)
= n^{-1} Σ_{i=1}^n ρ_τ(Y_i − ξ_τ − α_n b − x_i^T β_τ − α_n x_i^T a) − n^{-1} Σ_{i=1}^n ρ_τ(Y_i − ξ_τ − x_i^T β_τ)
= n^{-1} Σ_{i=1}^n α_n (x_i^T a + b) {1(Y_i − x_i^T β_τ ≤ ξ_τ) − τ}
  + n^{-1} Σ_{i=1}^n ∫_0^{α_n (x_i^T a + b)} {1(Y_i − x_i^T β_τ ≤ ξ_τ + t) − 1(Y_i − x_i^T β_τ ≤ ξ_τ)} dt.

The objective function L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) is convex in (a, b). With a slight change of notation, we denote F(t | x) = E{1(Y − x^T β_τ ≤ t) | x}. By Taylor's expansion, it follows that

E{L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) | x_1, ..., x_n}
= n^{-1} Σ_{i=1}^n [ α_n (x_i^T a + b) {F(ξ_τ | x_i) − τ} + ∫_{ξ_τ}^{ξ_τ + α_n (x_i^T a + b)} {F(t | x_i) − F(ξ_τ | x_i)} dt ]
= n^{-1} Σ_{i=1}^n [ α_n (x_i^T a + b) {F(ξ_τ | x_i) − τ} + (α_n^2 / 2) f(ξ_τ | x_i) (b + a^T x_i)^2 ] + o_p(α_n^2).    (C.1)

The first two terms are of order O_p(1/n). The last term of (C.1) is a quadratic function of (a, b).

Following standard arguments, we obtain that

R_2 = L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) − E{L_n^τ(ξ_τ + α_n b, β_τ + α_n a) − L_n^τ(ξ_τ, β_τ) | x_1, ..., x_n} = o_p(n^{-1}).

This, together with the quadratic approximation lemma of Hjort and Pollard (1993) and (C.1), leads to the asymptotic normality of β̂_τ.

Appendix D. Proof of Theorem 3

Recall the notation z = x_0^T β_τ, ẑ = x_0^T β̂_τ, Z_i = x_i^T β_τ and Ẑ_i = x_i^T β̂_τ. Let a_0 = G_τ(z) = G_τ(x_0^T β_τ), b_0 = G_τ'(z), K̂_i = K{(Ẑ_i − ẑ)/h}, K_i = K{(Z_i − z)/h} and K_i' = K'{(Z_i − z)/h} for i = 1, ..., n. We first show that, for β̂_τ a root-n consistent estimate of β_τ,

n^{-1} Σ_{i=1}^n [ ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K̂_i − ρ_τ{Y_i − a − b(Z_i − z)} K_i ] = O_p(n^{-1/2}).    (D.1)

Define

R_11 = n^{-1} Σ_{i=1}^n K_i [ ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)} ],
R_12 = n^{-1} Σ_{i=1}^n (K̂_i − K_i) ρ_τ{Y_i − a − b(Z_i − z)},
R_13 = n^{-1} Σ_{i=1}^n [ ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} − ρ_τ{Y_i − a − b(Z_i − z)} ] (K̂_i − K_i).

Then the left hand side of (D.1) can be decomposed into R_11 + R_12 + R_13. Using the identity ρ_τ(x − y) − ρ_τ(x) = y{1(x ≤ 0) − τ} + ∫_0^y {1(x ≤ t) − 1(x ≤ 0)} dt, we have |ρ_τ(x − y) − ρ_τ(x)| ≤ 4|y|. Invoking the root-n consistency of β̂_τ, we obtain

|R_11| ≤ 4|b| sup_u K(u) n^{-1} Σ_{i=1}^n |(Ẑ_i − ẑ) − (Z_i − z)| = 4|b| sup_u K(u) n^{-1} Σ_{i=1}^n |(x_i − x_0)^T (β̂_τ − β_τ)| = O_p(n^{-1/2}).

This indicates that R_11 = O_p(n^{-1/2}).

Next we deal with R_12. By a Taylor expansion of the kernel, R_12 can be written as

(β̂_τ − β_τ)^T (nh)^{-1} Σ_{i=1}^n K'{(Z_i − z)/h} (x_i − x_0) ρ_τ{Y_i − a − b(Z_i − z)} {1 + o_p(1)}.

Let z_i = (x_i − x_0)/h. Then

(nh)^{-1} Σ_{i=1}^n K'{(Z_i − z)/h} (x_i − x_0) ρ_τ{Y_i − a − b(Z_i − z)} = n^{-1} Σ_{i=1}^n K'(z_i^T β_τ) z_i ρ_τ(Y_i − a − b h z_i^T β_τ) = O_p(1),

which, together with the root-n consistency of β̂_τ, proves that R_12 = O_p(n^{-1/2}).

It remains to investigate the order of R_13. Following arguments similar to those used for R_11 and R_12, we have R_13 = O_p(n^{-1/2}). Thus, (D.1) follows by combining the results for the R_1k's, which implies that

n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Ẑ_i − ẑ)} K̂_i = n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Z_i − z)} K_i + o_p{(nh)^{-1/2}}.

That is, the quantile regression estimate of G_τ(·) based on {(x_i^T β̂_τ, Y_i), i = 1, ..., n} is asymptotically as efficient as that based on {(x_i^T β_τ, Y_i), i = 1, ..., n}. The rest of the proof follows literally from Theorem 3 of Fan, Hu and Truong (1994). Based on the auxiliary data points {(x_i^T β_τ, Y_i), i = 1, ..., n}, the technique for establishing the asymptotic normality involves the convexity lemma (Pollard, 1991) and the quadratic approximation lemma (Hjort and Pollard, 1993). We only sketch the outline below. Let

ψ(t | Z) = E[ 1{Y ≤ a_0 + b_0 (Z − z) + t} | Z ].

Therefore,

n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a − b(Z_i − z)} K_i − n^{-1} Σ_{i=1}^n ρ_τ{Y_i − a_0 − b_0(Z_i − z)} K_i
= n^{-1} Σ_{i=1}^n {(a − a_0) + (b − b_0)(Z_i − z)} {ψ(0 | Z_i) − τ} K_i
  + n^{-1} Σ_{i=1}^n ψ'(0 | Z_i) {(a − a_0) + (b − b_0)(Z_i − z)}^2 K_i {1/2 + o_p(1)}.

Following arguments similar to those used in proving Theorem 3 of Fan, Hu and Truong (1994), we obtain that

(nh)^{1/2} (â − a_0) = (nh)^{-1/2} Σ_{i=1}^n {ψ(0 | Z_i) − τ} K_i / {ψ'(0 | z) f(z)} + o_p(1),

which completes the proof.

REFERENCES

Bhattacharya, P. K. and Gangopadhyay, A. K. (1990). Kernel and nearest-neighbor estimation of a conditional quantile. Annals of Statistics, 18.

Chaudhuri, P. (1991). Nonparametric estimates of regression quantiles and their local Bahadur representation. Annals of Statistics, 19.

Chaudhuri, P., Doksum, K. and Samarov, A. (1997). On average derivative quantile regression. Annals of Statistics, 25.

De Gooijer, J. G. and Zerom, D. (2003). On additive conditional quantiles with high-dimensional covariates. Journal of the American Statistical Association, 98.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. London: Chapman and Hall.

Fan, J., Hu, T.-C. and Truong, Y. K. (1994). Robust non-parametric function estimation. Scandinavian Journal of Statistics, 21.

Hall, P. and Li, K.-C. (1993). On almost linearity of low dimensional projections from high dimensional data. Annals of Statistics, 21.

Hjort, N. L. and Pollard, D. (1993). Asymptotics for minimisers of convex processes. Preprint, available at pollard/papers/convex.pdf.

Horowitz, J. L. and Lee, S. (2005). Nonparametric estimation of an additive quantile regression model. Journal of the American Statistical Association, 100.

Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58.

Johnson, R. W. (2003). Kiplinger's personal finance. Journal of Statistics Education, 57.

Kai, B., Li, R. and Zou, H. (2010). Local composite quantile regression smoothing: an efficient and safe alternative to local polynomial regression. Journal of the Royal Statistical Society, Series B, 72.

Knight, K. (1998). Limiting distributions for L1 regression estimators under general conditions. Annals of Statistics, 26.

Koenker, R. (2005). Quantile Regression. Cambridge University Press.

Koenker, R. and Bassett, G. (1978). Regression quantiles. Econometrica, 46.

Koenker, R., Ng, P. and Portnoy, S. (1994). Quantile smoothing splines. Biometrika, 81.

Li, K.-C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86.

Li, K.-C. and Duan, N. (1989). Regression analysis under link violation. Annals of Statistics, 17.

Pollard, D. (1991). Asymptotics for least absolute deviation regression estimators. Econometric Theory, 7.

Stone, C. J. (1977). Consistent nonparametric regression (with discussion). Annals of Statistics, 5.

Wang, H., Zhu, Z. and Zhou, J. (2009). Quantile regression in partially linear varying coefficient models. Annals of Statistics, 37.

Wu, Y. and Liu, Y. (2009). Variable selection in quantile regression. Statistica Sinica, 19.

Yu, K. and Jones, M. C. (1998). Local linear quantile regression. Journal of the American Statistical Association, 93.

Yu, K. and Lu, Z. (2004). Local linear additive quantile regression. Scandinavian Journal of Statistics, 31.

Zou, H. and Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. Annals of Statistics, 36.


More information

Introduction to Regression

Introduction to Regression Introduction to Regression p. 1/97 Introduction to Regression Chad Schafer cschafer@stat.cmu.edu Carnegie Mellon University Introduction to Regression p. 1/97 Acknowledgement Larry Wasserman, All of Nonparametric

More information

Nonparametric Econometrics

Nonparametric Econometrics Applied Microeconometrics with Stata Nonparametric Econometrics Spring Term 2011 1 / 37 Contents Introduction The histogram estimator The kernel density estimator Nonparametric regression estimators Semi-

More information

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

More information

Kernel Density Based Linear Regression Estimate

Kernel Density Based Linear Regression Estimate Kernel Density Based Linear Regression Estimate Weixin Yao and Zibiao Zao Abstract For linear regression models wit non-normally distributed errors, te least squares estimate (LSE will lose some efficiency

More information

SUFFICIENT DIMENSION REDUCTION IN REGRESSIONS WITH MISSING PREDICTORS

SUFFICIENT DIMENSION REDUCTION IN REGRESSIONS WITH MISSING PREDICTORS Statistica Sinica 22 (2012), 1611-1637 doi:http://dx.doi.org/10.5705/ss.2009.191 SUFFICIENT DIMENSION REDUCTION IN REGRESSIONS WITH MISSING PREDICTORS Liping Zhu 1, Tao Wang 2 and Lixing Zhu 2 1 Shanghai

More information

Long-Run Covariability

Long-Run Covariability Long-Run Covariability Ulrich K. Müller and Mark W. Watson Princeton University October 2016 Motivation Study the long-run covariability/relationship between economic variables great ratios, long-run Phillips

More information

A lack-of-fit test for quantile regression models with high-dimensional covariates

A lack-of-fit test for quantile regression models with high-dimensional covariates A lack-of-fit test for quantile regression models with high-dimensional covariates Mercedes Conde-Amboage 1, César Sánchez-Sellero 1,2 and Wenceslao González-Manteiga 1 arxiv:1502.05815v1 [stat.me] 20

More information

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY REVSTAT Statistical Journal Volume 7, Number 1, April 2009, 37 54 PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY Authors: Germán Aneiros-Pérez Departamento de Matemáticas,

More information

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O

O Combining cross-validation and plug-in methods - for kernel density bandwidth selection O O Combining cross-validation and plug-in methods - for kernel density selection O Carlos Tenreiro CMUC and DMUC, University of Coimbra PhD Program UC UP February 18, 2011 1 Overview The nonparametric problem

More information

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series

The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series The Restricted Likelihood Ratio Test at the Boundary in Autoregressive Series Willa W. Chen Rohit S. Deo July 6, 009 Abstract. The restricted likelihood ratio test, RLRT, for the autoregressive coefficient

More information

Wrapped Gaussian processes: a short review and some new results

Wrapped Gaussian processes: a short review and some new results Wrapped Gaussian processes: a short review and some new results Giovanna Jona Lasinio 1, Gianluca Mastrantonio 2 and Alan Gelfand 3 1-Università Sapienza di Roma 2- Università RomaTRE 3- Duke University

More information

Testing Equality of Nonparametric Quantile Regression Functions

Testing Equality of Nonparametric Quantile Regression Functions International Journal of Statistics and Probability; Vol. 3, No. 1; 2014 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Testing Equality of Nonparametric Quantile

More information

Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods

Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods arxiv:70.05263v2 [stat.me] 7 Oct 207 Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods Lingzhu Li and Lixing Zhu,2 Department of Mathematics,

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis

More information

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA Presented by Dongjun Chung March 12, 2010 Introduction Definition Oracle Properties Computations Relationship: Nonnegative Garrote Extensions:

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Transformation and Smoothing in Sample Survey Data

Transformation and Smoothing in Sample Survey Data Scandinavian Journal of Statistics, Vol. 37: 496 513, 2010 doi: 10.1111/j.1467-9469.2010.00691.x Published by Blackwell Publishing Ltd. Transformation and Smoothing in Sample Survey Data YANYUAN MA Department

More information

arxiv: v1 [cs.lg] 24 Feb 2014

arxiv: v1 [cs.lg] 24 Feb 2014 Journal of Machine Learning Research 1 (2013) 1-60 Submitted 12/13; Published 00/00 Predictive Interval Models for Non-parametric Regression Mohammad Ghasemi Hamed ENAC, MAIAA, F-31055 Toulouse, France

More information

Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs

Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data, with an Application to Health Care Costs Xiao-Hua Zhou, Huazhen Lin, Eric Johnson Journal of Royal Statistical Society Series

More information

Regression Graphics. R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108

Regression Graphics. R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108 Regression Graphics R. D. Cook Department of Applied Statistics University of Minnesota St. Paul, MN 55108 Abstract This article, which is based on an Interface tutorial, presents an overview of regression

More information