SEMIPARAMETRIC ESTIMATION OF CONDITIONAL HETEROSCEDASTICITY VIA SINGLE-INDEX MODELING
Statistica Sinica 23 (2013)

Liping Zhu, Yuexiao Dong and Runze Li

Shanghai University of Finance and Economics, Temple University and Pennsylvania State University

Abstract: We consider a single-index structure to study heteroscedasticity in regression with high-dimensional predictors. A general class of estimating equations is introduced. The resulting estimators remain consistent even when the structure of the variance function is misspecified. The proposed estimators estimate the conditional variance function asymptotically as well as if the conditional mean function were given a priori. Numerical studies confirm our theoretical observations and demonstrate that our proposed estimators have less bias and smaller standard deviation than the existing estimators.

Key words and phrases: Conditional variance, heteroscedasticity, single-index model, volatility.

1. Introduction

Many scientific studies rely on understanding the local variability of the data, frequently characterized through the conditional variance in statistical modeling. The conditional variance plays an important role in a variety of statistical applications, such as measuring the volatility of risk in finance (Anderson and Lund (1997); Xia, Tong, and Li (2002)), monitoring the reliability of nonlinear prediction (Müller and Stadtmüller (1987); Yao and Tong (1994)), identifying homoscedastic transformations in regression (Box and Cox (1964); Carroll and Ruppert (1988)), and so on.

Estimation of the conditional variance function is an important problem in statistics. Let Y ∈ R be the response variable and x = (X₁, ..., Xₚ)ᵀ ∈ Rᵖ be the associated predictor vector. In this paper we study the conditional variance function of Y given x, denoted var(Y | x). We write E(Y | x) for the conditional mean function and ε = Y − E(Y | x) for the random error. Then we have var(Y | x) = E(ε² | x).
Since ε is not observable, it is natural to replace the error with the residual ε̂ = Y − Ê(Y | x), where Ê(Y | x) is an arbitrary consistent estimate of the conditional mean function. It is of interest to quantify the effect of this replacement on estimating the conditional variance function. This problem first received much attention when the predictor is univariate, p = 1. See, for
example, Hall and Carroll (1989), Wang et al. (2008), and references therein. Cai, Levine, and Wang (2009) generalized the results of Wang et al. (2008) to the case of multivariate x. In their generalization, however, nonparametric kernel regression was applied directly to estimate the multivariate regression function, which may not be effective due to the curse of dimensionality. In regressions with univariate predictor and response, Ruppert et al. (1997) and Fan and Yao (1998) applied local linear regression to the squared residuals and demonstrated that such a nonparametric estimate performs asymptotically as well as if the conditional mean were given a priori. Song and Yang (2009) derived asymptotically exact and conservative confidence bands for heteroscedastic variance functions. Yin et al. (2010) extended the results of Fan and Yao (1998) to the case of multivariate response.

To model the conditional variance function when p is fairly large, we assume throughout that there exists a smooth function σ²(·) and a p × 1 vector β₀ such that

var(Y | x) = σ²(β₀ᵀx). (1.1)

With an unspecified link function σ²(·) and a univariate index β₀ᵀx, (1.1) is both flexible and interpretable under the single-index structure. It provides a compromise between parametric models, which are easily interpretable yet often too restrictive, and fully nonparametric models, which are flexible but suffer from the curse of dimensionality. For ease of presentation we assume that, for some unknown function l(·) and a p × 1 vector α₀, the conditional mean function also admits the single-index structure

E(Y | x) = l(α₀ᵀx). (1.2)

The assumption of model (1.2) is not essential, and more general forms of the mean function may be assumed. In this paper we propose a general class of estimating equations to estimate β₀. By correctly specifying E(x | β₀ᵀx), the estimator of β₀ is consistent even when the structure of the variance function σ²(·) is misspecified.
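The residual-based idea of Ruppert et al. (1997) and Fan and Yao (1998) — fit the mean, then smooth the squared residuals — can be illustrated in the univariate case. The following is a minimal sketch; the mean function, variance function, Gaussian kernel, and bandwidths are illustrative choices, not taken from the paper:

```python
import numpy as np

def nw(t0, t, z, h):
    # Nadaraya-Watson smoother of z on t, evaluated at t0 (Gaussian kernel)
    w = np.exp(-0.5 * ((t0[:, None] - t[None, :]) / h) ** 2)
    return (w @ z) / w.sum(axis=1)

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-1, 1, n)
sd_true = 0.5 + 0.4 * X**2                  # illustrative standard-deviation function
Y = np.sin(2 * X) + sd_true * rng.standard_normal(n)

resid = Y - nw(X, X, Y, h=0.1)              # residuals: Y - Ê(Y | X)
grid = np.array([0.0, 0.8])
var_hat = nw(grid, X, resid**2, h=0.15)     # smooth squared residuals -> var(Y | X)
```

At X = 0 the true conditional variance is 0.5² = 0.25; the residual-based smoother recovers it without knowing the mean function.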
On the other hand, if the variance function σ²(·) is correctly specified and estimated consistently, our proposed estimator of β₀ is consistent without correctly specifying E(x | β₀ᵀx). The estimate of β₀ from the estimating equations possesses an adaptive property: the proposed procedure estimates the conditional variance function as efficiently, asymptotically, as if the conditional mean were given a priori. This is achieved by replacing the error ε = Y − E(Y | x) in the estimating equation with its corresponding residual ε̂ = Y − Ê(Y | x).
This paper is organized in the following way. In Section 2 we present the methodology and study its properties at both the population and the sample levels. In Section 3 we examine the finite-sample performance of the proposed procedure through simulations and an application to a data set. The paper concludes with a brief discussion in Section 4. All technicalities are given in the Appendix.

2. A New Procedure and Its Theoretical Properties

2.1. A general class of estimating equations

Consider the estimating equation

E[{ε² − σ²(β₀ᵀx)} σ²′(β₀ᵀx) x] = 0, (2.1)

where ε = Y − E(Y | x), and σ²′(·) stands for the first-order derivative of σ²(·). Here (2.1) corresponds to the classical nonlinear least squares estimation of Ichimura (1993) and Härdle, Hall, and Ichimura (1993). A limitation of (2.1) is that it requires correct specification of σ²(·) to obtain a consistent estimate of β₀, and this may be troublesome in practice. To address this issue we propose a new class of estimating equations,

E[{ε² − σ̃²(β₀ᵀx)}{x − Ẽ(x | β₀ᵀx)}] = 0, (2.2)

where σ̃²(β₀ᵀx) and Ẽ(x | β₀ᵀx) may be different from σ²(β₀ᵀx) and E(x | β₀ᵀx). When σ̃²(β₀ᵀx) = σ²(β₀ᵀx) and Ẽ(x | β₀ᵀx) = E(x | β₀ᵀx), (2.2) takes the form

E[{ε² − σ²(β₀ᵀx)}{x − E(x | β₀ᵀx)}] = 0. (2.3)

It is not difficult to verify that (2.3) produces a consistent estimate of β₀. In other words, β₀ is a solution to E[{ε² − σ²(βᵀx)}{x − E(x | βᵀx)}] = 0. An important virtue of (2.2) is that, as long as one of the functions σ̃²(β₀ᵀx) and Ẽ(x | β₀ᵀx) is correctly specified, (2.2) yields a consistent estimate of β₀. In particular, without knowing the exact form of σ²(·), suppose we specify it as σ̃²(·), which may be different from σ²(·), and impose the working estimating equation

E[{ε² − σ̃²(β₀ᵀx)}{x − E(x | β₀ᵀx)}] = 0. (2.4)

Invoking a conditional expectation, (2.4) is equivalent to E(E[{ε² − σ̃²(β₀ᵀx)}{x − E(x | β₀ᵀx)} | x]) = 0.
To verify that this yields a consistent estimate of the true β₀ at the population level, recall that E(ε² | x) = σ²(β₀ᵀx) under (1.1). Then the left-hand side of (2.4) can be further reduced as

E[{E(ε² | x) − σ̃²(β₀ᵀx)}{x − E(x | β₀ᵀx)}] = E[{σ²(β₀ᵀx) − σ̃²(β₀ᵀx)}{x − E(x | β₀ᵀx)}] = E[{σ²(β₀ᵀx) − σ̃²(β₀ᵀx)} E{x − E(x | β₀ᵀx) | β₀ᵀx}],

where the last term is obviously 0, indicating that (2.4) is able to produce a consistent estimator of β₀. This derivation provides the motivation for (2.2), as it continues to yield a consistent estimate of β₀ even if σ²(·) is misspecified. As a special case, let σ̃²(β₀ᵀx) = 0 in (2.4) and get

E[ε²{x − E(x | β₀ᵀx)}] = 0, (2.5)

which is similar to the estimating equation proposed by Li and Dong (2009) to recover the central solution space. In their context they utilize Y instead of ε² to estimate the mean function. We generalize the idea of the central solution space method to estimate the variance function. When x satisfies the linearity condition (Li (1991)), that E(x | β₀ᵀx) is a linear function of β₀ᵀx, β₀ must be proportional to var(x)⁻¹cov(x, ε²). This property dramatically simplifies solving (2.5) at the sample level. Yet, unlike the response variable Y, the error term ε is not observable. This motivates us to examine the effect that estimating the mean function, to obtain the residuals, has on estimating the variance function. We study this issue in the next section.

Aside from this, we have seen that (2.5) is a specific member of the general class of estimating equations (2.2). On the other hand, correct specification of E(x | β₀ᵀx) may be challenging in some situations. Even if all components of E(x | β₀ᵀx) can be correctly specified, calculating E(x | β₀ᵀx) is rather intensive when x is high dimensional. To simplify the calculation, suppose we estimate E(x | β₀ᵀx) by Ẽ(x | β₀ᵀx). Then (2.2) becomes

E[{ε² − σ²(β₀ᵀx)}{x − Ẽ(x | β₀ᵀx)}] = 0, (2.6)

noting that E(ε² | x) = σ²(β₀ᵀx) and conditioning on x. Thus (2.2)
continues to yield a consistent estimate of β₀ even if E(x | β₀ᵀx) is misspecified. For example, we can set Ẽ(x | β₀ᵀx) = 0, and then (2.6) becomes

E[{ε² − σ²(β₀ᵀx)} x] = 0. (2.7)

We remark here that (2.5) and (2.7) are equivalent at the population level by noting that E{ε² E(x | β₀ᵀx)} = E{E(ε² | β₀ᵀx) x} = E{σ²(β₀ᵀx) x} under (1.1).
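The linearity-condition shortcut for (2.5) can be checked numerically: for Gaussian predictors, the moment estimate var(x)⁻¹cov(x, ε²) should align with the direction β₀. A minimal sketch, in which β₀ and the variance function echo the simulation section but the setup is otherwise illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100000, 4
beta0 = np.array([0.45, 0.0, 0.0, 0.9])   # illustrative single-index direction
x = rng.standard_normal((n, p))           # Gaussian x satisfies the linearity condition
t = x @ beta0
eps = np.sqrt(0.5) * np.abs(t + 2.0) * rng.standard_normal(n)  # E(eps^2|x)=0.5(t+2)^2

# Moment estimate: var(x)^{-1} cov(x, eps^2) should be proportional to beta0.
e2 = eps**2
c = (x - x.mean(0)).T @ (e2 - e2.mean()) / n       # sample cov(x, eps^2)
b = np.linalg.solve(np.cov(x, rowvar=False), c)
cosine = abs(b @ beta0) / (np.linalg.norm(b) * np.linalg.norm(beta0))
```

With this design the cosine between the moment estimate and β₀ is close to one, illustrating why the linearity condition makes solving (2.5) at the sample level so simple.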
We have shown that the estimating equation (2.2) has a desirable robustness property: as long as either σ̃²(·) or Ẽ(x | β₀ᵀx) is correctly specified, the estimator based on the sample version of (2.2) is consistent. Both parametric and nonparametric methods could be used to model σ̃²(·) or Ẽ(x | β₀ᵀx). For example, Ẽ(x | β₀ᵀx) is typically assumed to be a linear function of β₀ᵀx in the dimension-reduction literature (Li (1991)). We propose to estimate σ²(·) and Ẽ(x | β₀ᵀx) via kernel regression. The theoretical properties of the sample estimates based on kernel regression are investigated in the next section.

2.2. Asymptotic properties

Given independent and identically distributed (xᵢ, Yᵢ), i = 1, ..., n, we discuss sample versions of the proposed estimating equations and their asymptotic properties. Ideally, one may estimate β₀ by solving for β̂ that satisfies

n⁻¹/² Σᵢ {εᵢ² − σ̂²(β̂ᵀxᵢ)}{xᵢ − Ê(xᵢ | β̂ᵀxᵢ)} = 0, (2.8)

which is the sample counterpart of (2.3). In (2.8), the quantities σ̂²(β̂ᵀxᵢ) and Ê(xᵢ | β̂ᵀxᵢ) are consistent estimators of σ²(β̂ᵀxᵢ) and E(xᵢ | β̂ᵀxᵢ), respectively, and can be obtained through classical kernel regression. In practice, we have to replace the unobservable error εᵢ with the residual ε̂ᵢ. To guarantee the consistency of β̂, we need consistent estimation of εᵢ, indicating that we have to estimate E(Y | x) consistently. Under (1.2), E(Y | x) = l(α₀ᵀx), we estimate α₀ by solving for α̂ that satisfies

n⁻¹/² Σᵢ {Yᵢ − l̂(α̂ᵀxᵢ)}{xᵢ − Ê(xᵢ | α̂ᵀxᵢ)} = 0, (2.9)

which is parallel to (2.8), with

Ê(xᵢ | α̂ᵀxᵢ) = Σⱼ K_h(α̂ᵀxⱼ − α̂ᵀxᵢ)xⱼ / Σⱼ K_h(α̂ᵀxⱼ − α̂ᵀxᵢ),  l̂(α̂ᵀxᵢ) = Σⱼ K_h₁(α̂ᵀxⱼ − α̂ᵀxᵢ)Yⱼ / Σⱼ K_h₁(α̂ᵀxⱼ − α̂ᵀxᵢ).

Here K_h(·) = K(·/h)/h is the kernel function, and h and h₁ are the bandwidths. Next we calculate the residuals ε̂ᵢ = Yᵢ − l̂(α̂ᵀxᵢ), and get our final estimate of β₀ by solving

n⁻¹/² Σᵢ {ε̂ᵢ² − σ̂²(β̂ᵀxᵢ)}{xᵢ − Ê(xᵢ | β̂ᵀxᵢ)} = 0. (2.10)
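The estimators above are plain Nadaraya-Watson kernel smoothers along the index. A minimal sketch of the conditional-mean smoother Ê(x | βᵀx) with an Epanechnikov kernel (the kernel used later in Section 3); the sample size, bandwidth, and index direction are illustrative:

```python
import numpy as np

def epanechnikov(u):
    # Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1]
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def nw_conditional_mean(t0, t, Z, h):
    """Nadaraya-Watson estimate of E(Z | t = t0); with Z the predictor
    matrix x this gives the Ê(x | β'x) appearing in the estimating equation."""
    W = epanechnikov((t0[:, None] - t[None, :]) / h)
    Wsum = W.sum(axis=1, keepdims=True)
    Wsum[Wsum == 0.0] = 1.0            # guard against empty kernel windows
    return (W @ Z) / Wsum

rng = np.random.default_rng(4)
n, p = 8000, 3
beta = np.array([0.6, 0.0, 0.8])       # unit-length index direction
x = rng.standard_normal((n, p))
t = x @ beta
Ex_given_t = nw_conditional_mean(np.array([1.0]), t, x, h=0.25)
```

For standard Gaussian x the truth is E(x | βᵀx = t) = βt, so at t = 1 the smoother should return approximately β itself.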
In (2.10) we estimate E(xᵢ | β̂ᵀxᵢ) and σ²(β̂ᵀxᵢ) as

Ê(xᵢ | β̂ᵀxᵢ) = Σⱼ K_h(β̂ᵀxⱼ − β̂ᵀxᵢ)xⱼ / Σⱼ K_h(β̂ᵀxⱼ − β̂ᵀxᵢ),  σ̂²(β̂ᵀxᵢ) = Σⱼ K_h₂(β̂ᵀxⱼ − β̂ᵀxᵢ)ε̂ⱼ² / Σⱼ K_h₂(β̂ᵀxⱼ − β̂ᵀxᵢ).

To summarize, we implement the following algorithm to estimate β₀.

1. Solve (2.9) to obtain α̂ through the following Newton-Raphson algorithm.
(a) Start with an initial value α⁽⁰⁾. In our implementation, we choose α⁽⁰⁾ = {Σᵢ(xᵢ − x̄)(xᵢ − x̄)ᵀ}⁻¹ Σᵢ(xᵢ − x̄)(Yᵢ − Ȳ), where x̄ = n⁻¹Σᵢxᵢ and Ȳ = n⁻¹ΣᵢYᵢ.
(b) Write J(α) = n⁻¹/² Σᵢ {Yᵢ − l̂(αᵀxᵢ)}{xᵢ − Ê(xᵢ | αᵀxᵢ)} and J′(α) = −n⁻¹/² Σᵢ l̂′(αᵀxᵢ){xᵢ − Ê(xᵢ | αᵀxᵢ)}xᵢᵀ. The derivative l̂′(αᵀxᵢ) = ∂l̂(αᵀxᵢ)/∂(αᵀxᵢ) is taken directly from the corresponding kernel estimator. Update α⁽ᵏ⁾ with

α⁽ᵏ⁺¹⁾ = α⁽ᵏ⁾ − {J′(α⁽ᵏ⁾)}⁻¹ J(α⁽ᵏ⁾). (2.11)

In case the matrix J′(α⁽ᵏ⁾) is singular or nearly so, we adopt a ridge-regression approach, using (2.11) with J′(α⁽ᵏ⁾) replaced by J′ᵣ(α⁽ᵏ⁾) = J′(α⁽ᵏ⁾) + λₙI_{p×p} for some positive ridge parameter λₙ. Here I_{p×p} denotes the p × p identity matrix.
(c) Iterate (2.11) until α⁽ᵏ⁺¹⁾ fails to change. Denote the resulting estimate by α̂.
2. Obtain l̂(α̂ᵀxᵢ) by kernel regression of Yᵢ onto α̂ᵀxᵢ, for i = 1, ..., n.
3. Solve (2.10) to obtain β̂, where ε̂ᵢ = Yᵢ − l̂(α̂ᵀxᵢ).
(a) Start with an initial value β⁽⁰⁾. In our implementation, we choose β⁽⁰⁾ = {Σᵢ(xᵢ − x̄)(xᵢ − x̄)ᵀ}⁻¹ Σᵢ(xᵢ − x̄)ε̂ᵢ².
(b) Write I(β) = n⁻¹/² Σᵢ {ε̂ᵢ² − σ̂²(βᵀxᵢ)}{xᵢ − Ê(xᵢ | βᵀxᵢ)} and I′(β) = −n⁻¹/² Σᵢ σ̂²′(βᵀxᵢ){xᵢ − Ê(xᵢ | βᵀxᵢ)}xᵢᵀ, where the derivative σ̂²′(βᵀxᵢ) = ∂σ̂²(βᵀxᵢ)/∂(βᵀxᵢ) is taken from the kernel estimator σ̂²(βᵀxᵢ). Update β⁽ᵏ⁾ with

β⁽ᵏ⁺¹⁾ = β⁽ᵏ⁾ − {I′(β⁽ᵏ⁾) + λₙI_{p×p}}⁻¹ I(β⁽ᵏ⁾). (2.12)

(c) Iterate (2.12) until β⁽ᵏ⁺¹⁾ fails to change. Denote the resulting estimate by β̂.
4. Obtain σ̂²(β̂ᵀxᵢ) by kernel regression of ε̂ᵢ² onto β̂ᵀxᵢ, for i = 1, ..., n.
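Steps 1(b)-1(c) are a standard Newton-Raphson iteration with a ridge fallback for a near-singular Jacobian. A generic sketch of that update rule, applied here to a toy quadratic system rather than to the kernel-based equations themselves:

```python
import numpy as np

def newton_with_ridge(J, Jprime, theta0, lam=1e-4, tol=1e-10, max_iter=100):
    """Newton-Raphson for J(theta) = 0.  If the Jacobian is singular,
    fall back to J'(theta) + lam * I, mirroring the ridge trick in (2.11)."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = J(theta)
        H = Jprime(theta)
        try:
            step = np.linalg.solve(H, g)
        except np.linalg.LinAlgError:
            step = np.linalg.solve(H + lam * np.eye(len(theta)), g)
        theta = theta - step
        if np.max(np.abs(step)) < tol:     # iterate until theta fails to change
            break
    return theta

# Toy estimating equation: gradient of ||A theta - b||^2, whose root is A^{-1} b.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 3.0])
root = newton_with_ridge(lambda th: A.T @ (A @ th - b),
                         lambda th: A.T @ A,
                         np.zeros(2))
```

For a quadratic objective the iteration converges in a single step; in the paper's setting J and J′ would instead be assembled from the kernel estimates at each iteration.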
One question arises naturally: how can one quantify the difference between the estimators obtained from (2.8) and (2.10)? This is answered by the following theorem.

Theorem 1. Suppose conditions (C1)-(C5) in the Appendix are satisfied. Let

Q = E{A(x) σ²′(β₀ᵀx) xᵀ} and V = E{var(ε² | x) A(x)Aᵀ(x)},

where A(x) = x − E(x | β₀ᵀx). Then n¹/² V⁻¹/² Q (β̂ − β₀) converges in distribution to the standard multivariate normal distribution as n → ∞.

Theorem 1 provides the asymptotic normality of β̂ obtained from Step 3 of our proposed algorithm, which uses the residuals ε̂ᵢ and relies on (2.10). As detailed in the proof in the Appendix, we obtain exactly the same asymptotic distribution for β̂ based on (2.8) with known errors εᵢ. This means our procedure performs as well as the oracle procedure in terms of estimating β₀: without knowing the conditional mean, we can estimate the conditional variance function asymptotically as well as if the conditional mean were given a priori.

With a consistent estimator β̂, we estimate σ²(·) via kernel regression. The next theorem states the consistency of σ̂²(·) obtained from Step 4 of our proposed algorithm.

Theorem 2. Suppose conditions (C1)-(C5) in the Appendix are satisfied. Let

bias = h₂² μ₂ {σ²″(β₀ᵀx)/2 + σ²′(β₀ᵀx) f′(β₀ᵀx)/f(β₀ᵀx)},

where μ₂ = ∫₋₁¹ u²K(u)du, and σ²′(·), f′(·) and σ²″(·) denote the first- and second-order derivatives. Then (nh₂)¹/² {σ̂²(β̂ᵀx) − σ²(β₀ᵀx) − bias} converges in distribution to a normal distribution with mean zero and variance var(ε² | β₀ᵀx)/f(β₀ᵀx).

Because β̂ has a faster convergence rate than the nonparametric regression, this result is not surprising. Theorem 2 implies that we can estimate σ²(·) based on β̂ᵀx as efficiently as if β₀ᵀx were known a priori. This extends the results of Fan and Yao (1998), as we obtain the adaptive property when x is high-dimensional and the link function is an unknown smooth function.

Remark 1.
In this section, we describe how to estimate β₀ from the estimating equation (2.10) and establish the adaptive property of β̂ and σ̂²(β̂ᵀx) in an asymptotic sense. In practical implementation, one may choose the bandwidths
for the proposed procedure by using cross-validation. The estimating equation (2.10) corresponds to (2.3) at the population level. Similarly, one can derive an estimate of β₀ using the sample version of (2.5) or (2.7). However, it is necessary to undersmooth Ê(x | β₀ᵀx) in (2.5), or σ̂²(β₀ᵀx) in (2.7), in order for the resulting estimate to achieve the adaptive property. We skip the details.

3. Numerical Studies

3.1. Simulations

In this section, we report on simulation studies to compare the performance of the estimation procedures discussed in Section 2.1. Specifically, we consider

Σᵢ {Yᵢ − l̂(α̂ᵀxᵢ)} l̂′(α̂ᵀxᵢ) xᵢ = 0,  Σᵢ {ε̂ᵢ² − σ̂²(β̂ᵀxᵢ)} σ̂²′(β̂ᵀxᵢ) xᵢ = 0. (3.1)

Solving (3.1) yields the classical nonlinear least squares estimators proposed by Härdle, Hall, and Ichimura (1993), and it serves as a benchmark for our comparisons. The estimating equations for the sample level of (2.7) are

Σᵢ {Yᵢ − l̂(α̂ᵀxᵢ)} xᵢ = 0,  Σᵢ {ε̂ᵢ² − σ̂²(β̂ᵀxᵢ)} xᵢ = 0. (3.2)

We recall that the quantity Ẽ(x | β₀ᵀx) is misspecified to be 0 in (2.7). We have seen in Section 2.1 that the estimator of β₀ obtained from (3.2) remains consistent if ε̂ and σ̂²(·) are consistent. The estimator based on (3.2) is included to demonstrate this robustness property. The sample version of the estimating equation (2.3) is

Σᵢ {Yᵢ − l̂(α̂ᵀxᵢ)}{xᵢ − Ê(xᵢ | α̂ᵀxᵢ)} = 0,  Σᵢ {ε̂ᵢ² − σ̂²(β̂ᵀxᵢ)}{xᵢ − Ê(xᵢ | β̂ᵀxᵢ)} = 0. (3.3)

In equations (3.1), (3.2), and (3.3), ε̂ᵢ = Yᵢ − l̂(α̂ᵀxᵢ). As in Section 2.2, we estimate l(αᵀx), σ²(βᵀx), E(x | αᵀx), and E(x | βᵀx) through the corresponding kernel estimates. Both σ²(βᵀx) and E(x | βᵀx) are estimated consistently in (3.3). The Epanechnikov kernel is used in our numerical study, and the bandwidths are selected via cross-validation. We remark here that our estimating procedure is not very sensitive to the choice of bandwidth, which confirms the theoretical finding in Theorem 1 that the asymptotic normality of β̂ holds for a wide range of bandwidths.
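Cross-validated bandwidth selection for a Nadaraya-Watson smoother can be sketched with a leave-one-out score; the data, kernel grid, and candidate bandwidths below are illustrative, not the paper's:

```python
import numpy as np

def epanechnikov(u):
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def loocv_bandwidth(t, z, bandwidths):
    """Pick the bandwidth minimizing the leave-one-out prediction error
    of the Nadaraya-Watson smoother of z on t."""
    scores = []
    for h in bandwidths:
        W = epanechnikov((t[:, None] - t[None, :]) / h)
        np.fill_diagonal(W, 0.0)           # leave the own observation out
        denom = W.sum(axis=1)
        ok = denom > 0                     # skip points with empty windows
        fit = np.zeros_like(z)
        fit[ok] = (W @ z)[ok] / denom[ok]
        scores.append(np.mean((z[ok] - fit[ok]) ** 2))
    return bandwidths[int(np.argmin(scores))]

rng = np.random.default_rng(5)
t = rng.uniform(-2, 2, 500)
z = np.sin(2 * t) + 0.3 * rng.standard_normal(500)
h_best = loocv_bandwidth(t, z, np.array([0.01, 0.25, 3.0]))
```

The tiny bandwidth overfits the noise and the huge one smooths the curvature away, so the middle candidate wins the leave-one-out score.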
In our simulation, we used two schemes to generate x.
Table 1. Simulation results for Case 1. The oracle estimates use the true error term ε. All numbers reported are multiplied by 100. [Rows: bias and std for methods (3.1), (3.2), and (3.3), for the mean estimates α̂₁, α̂₂, α̂₃, α̂₄ and the variance estimates β̂₁, β̂₂, β̂₈ together with their oracle counterparts, under mean functions (a) l(α₀ᵀx) = 5(α₀ᵀx) and (b) l(α₀ᵀx) = exp(α₀ᵀx); the numerical entries were not recovered in this scan.]

Case 1: The predictors x were drawn from a normal population with mean zero and variance-covariance matrix (σᵢⱼ)₈ₓ₈, where σᵢⱼ = 0.5^|i−j|.

Case 2: We generated X₁ as uniform U(0, 12^(1/2)), X₂ from a binomial distribution with success probability 0.5, and X₃ from a Poisson distribution with parameter 2. We kept (X₄, ..., X₈) generated as in Case 1.

Conditional on x, Y was generated as normal with mean function (a) l(α₀ᵀx) = 5(α₀ᵀx) or (b) l(α₀ᵀx) = exp(α₀ᵀx), where α₀ = (0.8, 0.4, 0.4, 0.2, 0, 0, 0, 0)ᵀ in both (a) and (b), and with variance function σ²(β₀ᵀx) = 0.5(β₀ᵀx + 2)², where β₀ = (0.45, 0, ..., 0, 0.9)ᵀ.

Performance in estimating α₀ and β₀. To assess the estimation accuracy for α₀ and β₀, simulations were repeated 1,000 times with sample size n = 600. The bias ("bias") and the standard deviation ("std") of the estimates of typical elements of α₀ and β₀ are reported in Tables 1 and 2 for Cases 1 and 2, respectively. The estimating equations (3.1), (3.2), and (3.3) have similar performance for estimating α₀ in terms of both bias and standard deviation. The estimating equations (3.3) perform best for estimating β₀, and (3.2) performs only slightly worse, confirming the robustness property of (2.2). The estimating equations (3.1) have the largest biases and standard deviations; this may be
Table 2. Simulation results for Case 2. The oracle estimates use the true error term ε. All numbers reported are multiplied by 100. [Same layout as Table 1; the numerical entries were not recovered in this scan.]

Table 3. Simulation results of the average squared errors (ASE). All numbers reported are multiplied by 100. [Rows: methods (3.1), (3.2), and (3.3) under models (a) and (b); columns: ASE(α̂) and ASE(β̂) for Cases 1 and 2; the numerical entries were not recovered in this scan.]

explained by the fact that estimation of the derivatives l′(·) and σ²′(·) is involved. From Tables 1 and 2, all three procedures are close to their corresponding oracle estimates, which adopt the true errors instead of the residuals. This serves to confirm the adaptive property of the proposed estimators.

Performance in estimating l(·) and σ²(·). We use the average squared error criteria

ASE(α̂) = n⁻¹ Σᵢ {l̂(α̂ᵀxᵢ) − l(α₀ᵀxᵢ)}²,  ASE(β̂) = n⁻¹ Σᵢ {σ̂²(β̂ᵀxᵢ) − σ²(β₀ᵀxᵢ)}².
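The Case 1 data-generating process can be sketched as follows, using mean function (b); the variance-function constant follows the text above and should be checked against the original article:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 600, 8
alpha0 = np.array([0.8, 0.4, 0.4, 0.2, 0, 0, 0, 0])
beta0  = np.array([0.45, 0, 0, 0, 0, 0, 0, 0.9])

# Case 1 predictors: N(0, Sigma) with Sigma_ij = 0.5^{|i-j|}
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
x = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

mean = np.exp(x @ alpha0)                    # mean function (b)
sd = np.sqrt(0.5 * (x @ beta0 + 2.0) ** 2)   # sd from sigma^2(t) = 0.5 (t + 2)^2
Y = mean + sd * rng.standard_normal(n)       # Y | x ~ N(mean, sd^2)
```

Each of the estimating-equation methods (3.1)-(3.3) would then be applied to the pairs (xᵢ, Yᵢ) generated this way.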
The results are reported in Table 3. The estimating equations (3.2) and (3.3) offer similar results, and both are superior to the results based on (3.1).

3.2. An application

We consider the horse mussels data collected in the Marlborough Sounds at the northeast of New Zealand's South Island (Camden (1989)). The response variable Y is the shell mass, with three quantitative predictors as related characteristics of mussel shells: the shell length X₁, the shell height X₂, and the shell width X₃, all in mm. The sample size of this data set is 201. All predictors in this analysis are standardized marginally to have zero mean and unit variance. We only report the results obtained from (3.3), as similar results are obtained from (3.1) and (3.2). Assuming (1.2), we estimate α₀ by the first equation of (3.3) and find α̂ = (0.3615, 0.443, 0.830)ᵀ. Based on (α̂ᵀxᵢ, Yᵢ), i = 1, ..., n, we estimate l(α₀ᵀx) by kernel regression. The estimated regression function and its pointwise prediction interval are plotted in Figure 1(A). The prediction interval at the 95% level is calculated by tentatively assuming that the variance function is constant. We can clearly see that the empirical coverage probability is very poor, particularly when α̂ᵀx is large: only 63 among the 201 points lie within this prediction region. Homoscedasticity is not a reasonable assumption for this data set. Next we solve the second of the estimating equations (3.3) and find β̂ = (0.379, 0.374, …)ᵀ. The variance function is then estimated by kernel regression based on the points (β̂ᵀxᵢ, ε̂ᵢ²), i = 1, ..., n. The estimated variance function is plotted in Figure 1(B); it does not appear to be constant. Taking the heteroscedasticity into account, we report the 95% prediction interval of l(α̂ᵀx) in Figure 1(C). We can see that around 94% of the sample (189 points) is covered by this prediction interval.
It is of practical interest to examine the adaptive property of our proposal by using the bootstrap. Based on the original sample, we obtained the estimates of the index parameters, α̂ and β̂, and of the link function l̂(·). We then bootstrapped the original data 1,000 times. Three quantities can be obtained from each bootstrap sample: α̂* denotes the estimator of α₀ based on the bootstrap sample (xᵢ*, Yᵢ*); β̂* denotes the estimator of β₀ based on {xᵢ*, Yᵢ* − l̂*(xᵢ*ᵀα̂*)}; and β̂ₒ* denotes the estimator of β₀ based on {xᵢ*, Yᵢ* − l̂(xᵢ*ᵀα̂)}. We remark here that β̂ₒ* differs from β̂* in that the former uses the true error term, as it adopts α̂ and l̂(·) estimated from the original data, while the latter uses the residuals calculated from the bootstrap sample. The third plot in Figure 1(D) gives the boxplot of the absolute values of the correlation coefficients corr(xᵀβ̂*, xᵀβ̂ₒ*). It implies that β̂* behaves very similarly to β̂ₒ* because the correlation coefficients
Figure 1. Analysis of the horse mussel data. The dashed line in (A) is the kernel estimate l̂(α̂ᵀx); the dot-dash lines are the 95% pointwise confidence intervals obtained under the homoscedasticity assumption. The dashed line in (B) is the kernel estimate σ̂²(β̂ᵀx). The dashed line in (C) is the kernel estimate l̂(α̂ᵀx); the dot-dash lines are the 95% pointwise confidence intervals obtained under the heteroscedasticity assumption. (D) depicts the boxplots of corr(xᵀα̂*, xᵀα̂), corr(xᵀβ̂*, xᵀβ̂), and corr(xᵀβ̂*, xᵀβ̂ₒ*), as calculated from the bootstrap samples.

are all very large. The first and second plots in Figure 1(D) show the respective boxplots of the absolute values of the correlation coefficients corr(xᵀα̂*, xᵀα̂) and corr(xᵀβ̂*, xᵀβ̂). It can be seen that α̂* performs much more stably than β̂* and β̂ₒ*, indicating that estimating the conditional variance function is more difficult than estimating the conditional mean function.

4. Discussion

Estimation of conditional heteroscedasticity remains an important and open problem when the dimension of the predictors is very large (Antoniadis, Grégoire, and McKeague (2004)). Based on a single-index structure,
this paper offers a general class of estimating equations to estimate the conditional heteroscedasticity. The proposed framework allows for flexibility when the structure of the conditional variance function is unknown. The resulting estimator enjoys an adaptive property in that it performs as well as if the true mean function were given. For ease of illustration, we assume that both the conditional mean and the conditional variance functions are of single-index structure. Extension of the proposed methodology to multi-index conditional mean and conditional variance models warrants future investigation.

As pointed out by an anonymous referee, the model considered for ε is of the form ε = σ(β₀ᵀx)ϵ. Thus, one can estimate β₀ by considering |ε|^α = σ^α(β₀ᵀx)E|ϵ|^α + ξ for any α > 0, where ξ = σ^α(β₀ᵀx)(|ϵ|^α − E|ϵ|^α). The model used in this paper corresponds to α = 2. Mercurio and Spokoiny (2004) discussed how to determine α, and suggested α = 1/2 in general. It may be of interest to consider α other than α = 2; this is beyond the scope of this paper.

Acknowledgement

Zhu's research was supported by National Natural Science Foundation of China (NNSFC) grants, the National Institute on Drug Abuse (NIDA) grant R1-DA0460, the Innovation Program of Shanghai Municipal Education Commission 13ZZ055, and the Pujiang Project of the Science and Technology Commission of Shanghai Municipality 1PJ. Dong's research was supported by National Science Foundation Grant DMS. Runze Li is the corresponding author, and his research was supported in part by NNSFC grants, NIDA grant P50-DA10075, and NCI grant R01 CA. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIDA, NCI, or the National Institutes of Health.
Appendix: Technical Conditions and Proofs

Appendix A: Some technical conditions

(C1) The density functions of α₀ᵀx and β₀ᵀx, denoted by f_α₀(α₀ᵀx) and f_β₀(β₀ᵀx), are continuous, bounded away from zero and from above for all x ∈ Rᵖ, and have locally Lipschitz second derivatives.

(C2) The functions l(α₀ᵀx), σ²(β₀ᵀx), l(α₀ᵀx)f_α₀(α₀ᵀx), σ²(β₀ᵀx)f_β₀(β₀ᵀx), E(x | α₀ᵀx)f_α₀(α₀ᵀx), and E(x | β₀ᵀx)f_β₀(β₀ᵀx) are continuous and bounded from above for all x ∈ Rᵖ, and their third derivatives are locally Lipschitz.

(C3) The symmetric kernel function K(·) is twice continuously differentiable with support [−1, 1], and is Lipschitz continuous.
(C4) The bandwidths h, h₁, and h₂ satisfy nh⁴ → ∞ and nh⁸ → 0, and nhᵢ⁴ → ∞ and nhᵢ⁸ → 0 for i = 1 and 2.

(C5) E(Y⁴) < ∞ and E(Xᵢ⁴) < ∞ for i = 1, ..., p. In addition, the conditional variance of Y given x is bounded away from 0 and from above.

Appendix B: Proofs of Theorems 1 and 2

Proof of Theorem 1. With regard to l̂(α̂ᵀx), σ̂²(β̂ᵀx), and Ê(x | β̂ᵀx), we first quantify the extent to which these functions approximate their respective true values. We take l̂(α̂ᵀx) as an example. Note that

l̂(α̂ᵀx) − l(α₀ᵀx) = {l̂(α₀ᵀx) − l(α₀ᵀx)} + {l̂(α̂ᵀx) − l̂(α₀ᵀx)}.

The first term on the right-hand side can be dealt with using standard nonparametric regression theory. Therefore, it remains to quantify the difference l̂(α̂ᵀx) − l̂(α₀ᵀx). Let 𝒞 = {α : ‖α − α₀‖ ≤ Cn⁻¹/²}. Recall that Li, Zhu, and Zhu (2011, Lemma 1) proved that, if nh₁⁴ → ∞,

sup_{x∈Rᵖ} sup_{α∈𝒞} |l̂(α̂ᵀx) − l̂(α₀ᵀx) − E{l̂(α̂ᵀx) − l̂(α₀ᵀx)}| = O_p{(log n/(nh₁))¹/²}.

By invoking the symmetry of the kernel function and the condition that the third derivative of l(·)f(·) is locally Lipschitz, and using arguments similar to those proving Lemma 3.3 of Zhu and Fang (1996), we can show that

E{l̂(α₀ᵀx)} − l(α₀ᵀx) − μ₂h₁²{l″(α₀ᵀx)/2 + l′(α₀ᵀx)f′(α₀ᵀx)/f(α₀ᵀx)} = O(h₁⁴),

where μ₂ = ∫₋₁¹ u²K(u)du. A similar result also holds when α₀ is replaced by α̂. By the mean value theorem, it follows that

sup_{α∈𝒞} |l̂(α̂ᵀx) − l̂(α₀ᵀx) − E{l̂(α̂ᵀx) − l̂(α₀ᵀx)}| = O(h₁²n⁻¹/²) = o(h₁⁴), since nh₁⁴ → ∞.

The above arguments imply that, for any fixed x ∈ Rᵖ and nh₁⁸ → 0,

l̂(α₀ᵀx) − l̂(α̂ᵀx) = l′(α₀ᵀx)xᵀ(α₀ − α̂) + o_p(n⁻¹/²). (B.1)

Following similar arguments, we can obtain

σ̂²(β₀ᵀx) − σ̂²(β̂ᵀx) = σ²′(β₀ᵀx)xᵀ(β₀ − β̂) + o_p(n⁻¹/²), and
Ê(x | β₀ᵀx) − Ê(x | β̂ᵀx) = E′(x | β₀ᵀx)xᵀ(β₀ − β̂) + o_p(n⁻¹/²). (B.2)

(B.1) and (B.2) are used repeatedly in subsequent derivations. We discuss the consistency of β̂ first. Let

I(β) = E[{ε² − σ²(βᵀx)}{x − E(x | βᵀx)}],  Î(β) = n⁻¹ Σᵢ {ε̂ᵢ² − σ̂²(βᵀxᵢ)}{xᵢ − Ê(xᵢ | βᵀxᵢ)}.

Let U(β₀) be any open set containing β₀. To prove the consistency of β̂, we assume that inf_{β∈Θ\U(β₀)} ‖I(β)‖ ≥ 2δ for some positive constant δ. It suffices to show that

Pr{inf_{β∈Θ\U(β₀)} ‖Î(β)‖ ≤ δ} → 0. (B.3)

The condition inf_{β∈Θ\U(β₀)} ‖I(β)‖ ≥ 2δ implies that

Pr{inf_{β∈Θ\U(β₀)} ‖Î(β)‖ ≤ δ} ≤ Pr{sup_{β∈Θ\U(β₀)} ‖Î(β) − I(β)‖ ≥ δ}.

Therefore, it suffices to show that, for any fixed δ, Pr{sup_{β∈Θ} ‖Î(β) − I(β)‖ ≥ δ} → 0.

Recalling the definitions of Î(β) and I(β), write Î(β) − I(β) = Σ_{i=1}⁶ I₁,ᵢ, where

I₁,₁ = n⁻¹ Σᵢ {εᵢ² − σ²(βᵀxᵢ)}{xᵢ − E(xᵢ | βᵀxᵢ)} − I(β),
I₁,₂ = n⁻¹ Σᵢ (ε̂ᵢ² − εᵢ²){xᵢ − E(xᵢ | βᵀxᵢ)},
I₁,₃ = n⁻¹ Σᵢ {σ²(βᵀxᵢ) − σ̂²(βᵀxᵢ)}{xᵢ − E(xᵢ | βᵀxᵢ)},
I₁,₄ = n⁻¹ Σᵢ {εᵢ² − σ²(βᵀxᵢ)}{E(xᵢ | βᵀxᵢ) − Ê(xᵢ | βᵀxᵢ)},
I₁,₅ = n⁻¹ Σᵢ (ε̂ᵢ² − εᵢ²){E(xᵢ | βᵀxᵢ) − Ê(xᵢ | βᵀxᵢ)},
I₁,₆ = n⁻¹ Σᵢ {σ²(βᵀxᵢ) − σ̂²(βᵀxᵢ)}{E(xᵢ | βᵀxᵢ) − Ê(xᵢ | βᵀxᵢ)}.
We note that I₁,₁ is an average of independent and identically distributed random variables with mean zero. Classical empirical process theory shows that sup_{β∈Θ} ‖I₁,₁‖ = o_p(log n/n¹/²) almost surely (Pollard (1984, Thm. 2.37)). Write

I₁,₂ = n⁻¹ Σᵢ 2{l(α₀ᵀxᵢ) − l̂(α̂ᵀxᵢ)}εᵢ{xᵢ − E(xᵢ | βᵀxᵢ)} + n⁻¹ Σᵢ {l(α₀ᵀxᵢ) − l̂(α̂ᵀxᵢ)}²{xᵢ − E(xᵢ | βᵀxᵢ)} = I₁,₂,₁ + I₁,₂,₂.

We first prove that sup_{β∈Θ} ‖I₁,₂,₁‖ = o_p(1). Note that

‖I₁,₂,₁‖ ≤ 2 sup_{x∈Rᵖ} |l̂(α₀ᵀx) − l(α₀ᵀx)| n⁻¹ Σᵢ ‖εᵢ{xᵢ − E(xᵢ | βᵀxᵢ)}‖ + 2n⁻¹ Σᵢ |l̂(α₀ᵀxᵢ) − l̂(α̂ᵀxᵢ)| ‖εᵢ{xᵢ − E(xᵢ | βᵀxᵢ)}‖
≤ 2 sup_{x∈Rᵖ} |l̂(α₀ᵀx) − l(α₀ᵀx)| n⁻¹ Σᵢ ‖εᵢ{xᵢ − E(xᵢ | βᵀxᵢ)}‖ + 2n⁻¹ Σᵢ |εᵢ| |l̂′(α₀ᵀxᵢ)| ‖xᵢ − E(xᵢ | βᵀxᵢ)‖ ‖xᵢ‖ ‖α₀ − α̂‖.

Therefore, sup_{β∈Θ} ‖I₁,₂,₁‖ = o_p(1) follows immediately from the consistency of α̂ and the uniform consistency of l̂(α₀ᵀx). Similarly, sup_{β∈Θ} ‖I₁,₂,₂‖ = o_p(1) can be proved. These two results imply that sup_{β∈Θ} ‖I₁,₂‖ = o_p(1). Using U-statistic theory (Serfling (1980)), we can obtain sup_{β∈Θ} ‖I₁,ᵢ‖ = o_p(1), i = 3, 4. Following arguments similar to those for I₁,₂, we have sup_{β∈Θ} ‖I₁,₅‖ = o_p(1). That sup_{β∈Θ} ‖I₁,₆‖ = o_p(1) follows directly from the Cauchy-Schwarz inequality and standard results on the uniform convergence of Ê(x | βᵀx) and σ̂²(βᵀx) in nonparametric regression. By combining these results, it follows that Pr{sup_{β∈Θ} ‖Î(β) − I(β)‖ ≥ δ} → 0 for any fixed δ, which completes the proof of (B.3).

Next we examine the root-n consistency of β̂. We expand the estimating equation as

0 = n⁻¹/² Σᵢ {xᵢ − Ê(xᵢ | β̂ᵀxᵢ)}{ε̂ᵢ² − σ̂²(β̂ᵀxᵢ)} =: Σ_{i=1}¹² I₂,ᵢ.

The terms I₂,ᵢ are defined explicitly in the sequel.
By the central limit theorem,

I₂,₁ := n⁻¹/² Σᵢ {xᵢ − E(xᵢ | β₀ᵀxᵢ)}{εᵢ² − σ²(β₀ᵀxᵢ)} = O_p(1).

Invoking the root-n consistency of α̂, we can show that

I₂,₂ := n⁻¹/² Σᵢ {xᵢ − E(xᵢ | β₀ᵀxᵢ)}(ε̂ᵢ² − εᵢ²) = o_p(1).

Invoking (B.2), it follows that

I₂,₃ = [E{σ²′(β₀ᵀx)(x − E(x | β₀ᵀx))xᵀ} + o_p(1)] n¹/²(β₀ − β̂).

Because E{xᵢ − E(xᵢ | β₀ᵀxᵢ) | β₀ᵀxᵢ} = 0, U-statistic theory implies

I₂,₄ := n⁻¹/² Σᵢ {xᵢ − E(xᵢ | β₀ᵀxᵢ)}{σ²(β₀ᵀxᵢ) − σ̂²(β₀ᵀxᵢ)} = o_p(1),
I₂,₅ := n⁻¹/² Σᵢ {E(xᵢ | β₀ᵀxᵢ) − Ê(xᵢ | β₀ᵀxᵢ)}{εᵢ² − σ²(β₀ᵀxᵢ)} = o_p(1).

By the uniform convergence of Ê(xᵢ | β₀ᵀxᵢ), and (B.1), it is easy to see that

I₂,₆ := n⁻¹/² Σᵢ {E(xᵢ | β₀ᵀxᵢ) − Ê(xᵢ | β₀ᵀxᵢ)}(ε̂ᵢ² − εᵢ²) = o_p(1),
I₂,₇ := n⁻¹/² Σᵢ {E(xᵢ | β₀ᵀxᵢ) − Ê(xᵢ | β₀ᵀxᵢ)}{σ̂²(β₀ᵀxᵢ) − σ̂²(β̂ᵀxᵢ)} = o_p(1).

Given nh⁴h₂⁴ → 0 and nhh₂ → ∞, the Cauchy-Schwarz inequality implies

I₂,₈ := n⁻¹/² Σᵢ {E(xᵢ | β₀ᵀxᵢ) − Ê(xᵢ | β₀ᵀxᵢ)}{σ²(β₀ᵀxᵢ) − σ̂²(β₀ᵀxᵢ)} = o_p(1).

By using the fact that E[{εᵢ² − σ²(β₀ᵀxᵢ)} | xᵢ] = 0, we can show that

I₂,₉ := n⁻¹/² Σᵢ {Ê(xᵢ | β₀ᵀxᵢ) − Ê(xᵢ | β̂ᵀxᵢ)}{εᵢ² − σ²(β₀ᵀxᵢ)} = o_p(1),
I₂,₁₀ := n⁻¹/² Σᵢ {Ê(xᵢ | β₀ᵀxᵢ) − Ê(xᵢ | β̂ᵀxᵢ)}(ε̂ᵢ² − εᵢ²) = o_p(1) n¹/²(β₀ − β̂).
Invoking (B.2) again, we have
$$I_{2,11} := n^{-1/2}\sum_{i=1}^n \{\hat E(x_i\mid\beta_0^\top x_i) - \hat E(x_i\mid\hat\beta^\top x_i)\}\{\sigma^2(\beta_0^\top x_i) - \sigma^2(\hat\beta^\top x_i)\} = o_p(1)\, n^{1/2}\|\beta_0-\hat\beta\|,$$
$$I_{2,12} := n^{-1/2}\sum_{i=1}^n \{\hat E(x_i\mid\beta_0^\top x_i) - \hat E(x_i\mid\hat\beta^\top x_i)\}\{\sigma^2(\beta_0^\top x_i) - \hat\sigma^2(\beta_0^\top x_i)\} = o_p(1)\, n^{1/2}\|\beta_0-\hat\beta\|.$$
By combining these results, we obtain that $\|\beta_0 - \hat\beta\| = O_p(n^{-1/2})$. The asymptotic normality follows immediately from the root-$n$ consistency described above and the Central Limit Theorem. That is,
$$E\big[\{x - E(x\mid\beta_0^\top x)\}\,\dot\sigma^2(\beta_0^\top x)\,x^\top\big]\, n^{1/2}(\hat\beta-\beta_0) = n^{-1/2}\sum_{i=1}^n \{x_i - E(x_i\mid\beta_0^\top x_i)\}\{\varepsilon_i^2 - \sigma^2(\beta_0^\top x_i)\} + o_p(1),$$
which completes the proof of Theorem 1.

Proof of Theorem 2. It can be proved by following (B.2), the root-$n$ consistency of $\hat\alpha$ and $\hat\beta$, and standard arguments in nonparametric regression. See, for example, Ruppert and Wand (1994). The details are omitted here.

Appendix C: Some preliminary results

We discuss the consistency of $\hat\alpha$, its convergence rate, and its asymptotic normality. First we prove the consistency of $\hat\alpha$. Take
$$J(\alpha) = E\big[\{Y - l(\alpha^\top x)\}\{x - E(x\mid\alpha^\top x)\}\big], \qquad \hat J(\alpha) = n^{-1}\sum_{i=1}^n \{Y_i - \hat l(\alpha^\top x_i)\}\{x_i - \hat E(x_i\mid\alpha^\top x_i)\}.$$
Let $U(\alpha_0)$ be any open set that includes $\alpha_0$. To prove the consistency of $\hat\alpha$, we assume that $\inf_{\alpha\in\Theta\setminus U(\alpha_0)}\|J(\alpha)\|\ge\delta$ for some positive constant $\delta$. It suffices to show that
$$\Pr\Big\{\inf_{\alpha\in\Theta\setminus U(\alpha_0)}\|\hat J(\alpha)\|\le\delta/2\Big\} \to 0. \qquad (C.1)$$
The condition $\inf_{\alpha\in\Theta\setminus U(\alpha_0)}\|J(\alpha)\|\ge\delta$ implies that
$$\Pr\Big\{\inf_{\alpha\in\Theta\setminus U(\alpha_0)}\|\hat J(\alpha)\|\le\delta/2\Big\} \le \Pr\Big\{\sup_{\alpha\in\Theta\setminus U(\alpha_0)}\|\hat J(\alpha)-J(\alpha)\|\ge\delta/2\Big\}.$$
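Computationally, solving the estimating equation $\hat J(\alpha) = 0$ amounts to a Newton-type iteration, which is also the structure exploited by the Taylor expansion (C.2) below. The following generic solver with a finite-difference Jacobian is an illustrative sketch of ours, not the paper's algorithm; `J_hat` stands for any user-supplied estimating function.

```python
import numpy as np

def newton_solve(J_hat, alpha0, tol=1e-10, max_iter=50):
    """Find a root of the vector-valued estimating function J_hat by Newton's
    method, approximating the Jacobian J_hat' by forward differences."""
    alpha = np.array(alpha0, dtype=float)
    for _ in range(max_iter):
        f = J_hat(alpha)
        eps = 1e-7
        # column j of the Jacobian: {J_hat(alpha + eps e_j) - J_hat(alpha)} / eps
        jac = np.column_stack([(J_hat(alpha + eps * e) - f) / eps
                               for e in np.eye(alpha.size)])
        step = np.linalg.solve(jac, f)   # Newton step: solve J_hat'(alpha) step = J_hat(alpha)
        alpha -= step
        if np.linalg.norm(step) < tol:
            break
    return alpha
```

For a linear estimating function the iteration converges in a single step, mirroring the one-step expansion of $\hat J$ around $\alpha_0$.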
Therefore, it suffices to show that
$$\Pr\Big\{\sup_{\alpha\in\Theta}\|\hat J(\alpha)-J(\alpha)\|\ge\delta\Big\} \to 0$$
for any fixed $\delta > 0$. This follows directly from the uniform consistency of $\hat J(\alpha)$ if $h_1 \to 0$ and $nh_1/\log n \to \infty$. See, for example, Ichimura (1993), Zhu and Fang (1996, Lemmas 3.1 and 3.3), and Pollard (1984, Thm. 2.37). (C.1) is proved.

Next we prove the root-$n$ consistency of $\hat\alpha$. With probability close to 1, by a Taylor expansion, for $\tilde\alpha$ between $\alpha_0$ and $\hat\alpha$,
$$0 = \hat J(\hat\alpha) = \hat J(\alpha_0) + \hat J'(\tilde\alpha)(\hat\alpha-\alpha_0) = \hat J(\alpha_0) + \{\hat J'(\tilde\alpha)-\hat J'(\alpha_0)\}(\hat\alpha-\alpha_0) + \hat J'(\alpha_0)(\hat\alpha-\alpha_0). \qquad (C.2)$$
Recall the definition of $\hat J(\hat\alpha)$ and write $\hat J(\alpha_0) \stackrel{\rm def}{=} \sum_{i=1}^4 \hat J_i$, where
$$\hat J_1 = n^{-1}\sum_{i=1}^n \{Y_i - l(\alpha_0^\top x_i)\}\{x_i - E(x_i\mid\alpha_0^\top x_i)\},$$
$$\hat J_2 = n^{-1}\sum_{i=1}^n \{Y_i - l(\alpha_0^\top x_i)\}\{E(x_i\mid\alpha_0^\top x_i) - \hat E(x_i\mid\alpha_0^\top x_i)\},$$
$$\hat J_3 = n^{-1}\sum_{i=1}^n \{l(\alpha_0^\top x_i) - \hat l(\alpha_0^\top x_i)\}\{x_i - E(x_i\mid\alpha_0^\top x_i)\},$$
$$\hat J_4 = n^{-1}\sum_{i=1}^n \{l(\alpha_0^\top x_i) - \hat l(\alpha_0^\top x_i)\}\{E(x_i\mid\alpha_0^\top x_i) - \hat E(x_i\mid\alpha_0^\top x_i)\}.$$
Because $E[\{Y - l(\alpha_0^\top x)\}\{x - E(x\mid\alpha_0^\top x)\}] = 0$, $\hat J_1$ is of the order $O_p(n^{-1/2})$ by the Central Limit Theorem. Both $\hat J_2$ and $\hat J_3$ are of the order $o_p(n^{-1/2})$ according to standard U-statistic theory (Serfling (1980)); by using the Cauchy-Schwarz inequality, $\|\hat J_4\|$ is bounded by
$$\big[\hat E\{l(\alpha_0^\top x) - \hat l(\alpha_0^\top x)\}^2\big]^{1/2}\,\big[E\|E(x\mid\alpha_0^\top x) - \hat E(x\mid\alpha_0^\top x)\|^2\big]^{1/2},$$
which is of order $O_p\{h^2\log n/(nh_1)\}$. If $nh^2 \to \infty$ and $nh_1/\log n \to \infty$, then the last term is also $o_p(n^{-1/2})$. Combining these results, we obtain that $\hat J(\alpha_0) = O_p(n^{-1/2})$.

By using the Weak Law of Large Numbers, we can see that $\hat J'(\alpha_0)$ converges in probability to $J'(\alpha_0) = E[\,l'(\alpha_0^\top x)\{x - E(x\mid\alpha_0^\top x)\}x^\top]$ for $nh_1 \to \infty$ and
$nh_2 \to \infty$. Similarly, $\hat J'(\tilde\alpha) - \hat J'(\alpha_0) = o_p(1)$, in that $\tilde\alpha$ lies between the consistent estimate $\hat\alpha$ and $\alpha_0$. Therefore, $\hat\alpha$ is a root-$n$ consistent estimate of $\alpha_0$.

The asymptotic normality of $\hat\alpha$ follows from its root-$n$ consistency. Specifically, (C.2) implies that
$$n^{1/2}(\hat\alpha-\alpha_0) = \{J'(\alpha_0)\}^{-1}\, n^{-1/2}\sum_{i=1}^n \varepsilon_i\{x_i - E(x_i\mid\alpha_0^\top x_i)\} + o_p(1),$$
where $J'(\alpha_0) = E[\,l'(\alpha_0^\top x)\{x - E(x\mid\alpha_0^\top x)\}x^\top]$ and $\varepsilon_i = Y_i - l(\alpha_0^\top x_i)$.

References

Anderson, T. G. and Lund, J. (1997). Estimating continuous time stochastic volatility models of the short term interest rate. J. Econometrics 77.
Antoniadis, A., Grégoire, G. and McKeague, I. (2004). Bayesian estimation in single-index models. Statist. Sinica 14.
Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations (with discussion). J. Roy. Statist. Soc. Ser. B 26.
Cai, T., Levine, M. and Wang, L. (2009). Variance function estimation in multivariate nonparametric regression with fixed design. J. Multivariate Anal. 100.
Camden, M. (1989). The Data Bundle. New Zealand Statistical Association, Wellington, New Zealand.
Carroll, R. J. and Ruppert, D. (1988). Transformations and Weighting in Regression. Chapman & Hall, London.
Fan, J. and Yao, Q. (1998). Efficient estimation of conditional variance functions in stochastic regression. Biometrika 85.
Hall, P. and Carroll, R. J. (1989). Variance function estimation in regression: the effect of estimating the mean. J. Roy. Statist. Soc. Ser. B 51.
Härdle, W., Hall, P. and Ichimura, H. (1993). Optimal smoothing in single-index models. Ann. Statist. 21.
Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. J. Econometrics 58.
Li, B. and Dong, Y. (2009). Dimension reduction for non-elliptically distributed predictors. Ann. Statist. 37.
Li, K.-C. (1991). Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86.
Li, L. X., Zhu, L. P. and Zhu, L. X. (2011).
Inference on primary parameter of interest with aid of dimension reduction estimation. J. Roy. Statist. Soc. Ser. B 73.
Mercurio, D. and Spokoiny, V. (2004). Statistical inference for time-inhomogeneous volatility models. Ann. Statist. 32.
Müller, H.-G. and Stadtmüller, U. (1987). Estimation of heteroscedasticity in regression analysis. Ann. Statist. 15.
Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
Ruppert, D. and Wand, M. P. (1994). Multivariate locally weighted least squares regression. Ann. Statist. 22.
Ruppert, D., Wand, M., Holst, U. and Hössjer, O. (1997). Local polynomial variance function estimation. Technometrics 39.
Serfling, R. J. (1980). Approximation Theorems of Mathematical Statistics. John Wiley, New York.
Song, Q. and Yang, L. (2009). Spline confidence bands for variance function. J. Nonparametr. Statist. 21.
Wang, L., Brown, L. D., Cai, T. and Levine, M. (2008). Effect of mean on variance function estimation in nonparametric regression. Ann. Statist. 36.
Xia, Y., Tong, H. and Li, W. K. (2002). Single-index volatility models and estimation. Statist. Sinica 12.
Yao, Q. and Tong, H. (1994). Quantifying the influence of initial values on nonlinear prediction. J. Roy. Statist. Soc. Ser. B 56.
Yin, J., Geng, Z., Li, R. and Wang, H. (2010). Nonparametric covariance model. Statist. Sinica 20.
Zhu, L. X. and Fang, K. T. (1996). Asymptotics for kernel estimate of sliced inverse regression. Ann. Statist. 24.

School of Statistics and Management and the Key Laboratory of Mathematical Economics, Ministry of Education, Shanghai University of Finance and Economics, Shanghai 200433, P. R. China.
E-mail: zhu.liping@mail.shufe.edu.cn
Department of Statistics, Temple University, Philadelphia, PA 19122, U.S.A.
E-mail: ydong@temple.edu
Department of Statistics and The Methodology Center, Pennsylvania State University, University Park, PA, U.S.A.
E-mail: rli@stat.psu.edu

(Received March 2012; accepted August 2012)
More informationIdentification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case
Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College Original December 2016, revised July 2017 Abstract Lewbel (2012)
More informationBahadur representations for bootstrap quantiles 1
Bahadur representations for bootstrap quantiles 1 Yijun Zuo Department of Statistics and Probability, Michigan State University East Lansing, MI 48824, USA zuo@msu.edu 1 Research partially supported by
More informationAn Unbiased C p Criterion for Multivariate Ridge Regression
An Unbiased C p Criterion for Multivariate Ridge Regression (Last Modified: March 7, 2008) Hirokazu Yanagihara 1 and Kenichi Satoh 2 1 Department of Mathematics, Graduate School of Science, Hiroshima University
More informationNONPARAMETRIC MIXTURE OF REGRESSION MODELS
NONPARAMETRIC MIXTURE OF REGRESSION MODELS Huang, M., & Li, R. The Pennsylvania State University Technical Report Series #09-93 College of Health and Human Development The Pennsylvania State University
More informationSpecification testing for regressions: an approach bridging between local smoothing and global smoothing methods
arxiv:70.05263v2 [stat.me] 7 Oct 207 Specification testing for regressions: an approach bridging between local smoothing and global smoothing methods Lingzhu Li and Lixing Zhu,2 Department of Mathematics,
More information41903: Introduction to Nonparametrics
41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific
More information