A Difference Based Approach to Semiparametric Partial Linear Model

A Difference Based Approach to Semiparametric Partial Linear Model

Lie Wang, Lawrence D. Brown and T. Tony Cai
Department of Statistics, The Wharton School, University of Pennsylvania

Abstract

In this article, we consider a commonly used semiparametric partial linear model. We propose analyzing this model using a difference based approach. The procedure estimates the linear component based on the differences of the observations and then estimates the nonparametric component by a kernel estimate using the residuals of the linear fit. It is shown that both the estimator of the linear component and the estimator of the nonparametric component asymptotically perform as well as if the other component were known. The estimator of the linear component is asymptotically efficient and the estimator of the nonparametric component is asymptotically rate optimal. A test for linear combinations of the regression coefficients of the linear component is also developed. Both the estimation and the testing procedures are easily implementable. Numerical performance of the procedure is studied using both simulated and real data. In particular, we demonstrate our method in an analysis of an attitude data set as well as a data set from the Framingham Heart Study.

Keywords: Asymptotic efficiency; Difference-based method; Kernel method; Partial linear model; Semiparametric model.

The research of Tony Cai was supported in part by NSF Grant DMS

1 Introduction

Semiparametric models have received considerable attention in statistics and econometrics. They have a wide range of applications, from biomedical studies to economics. In these models, some of the relations are believed to be of certain parametric form while others are not easily parameterized. In this paper, we consider the following semiparametric partial linear model

Y_i = X_i'β + f(U_i) + ε_i,   i = 1, ..., n,   (1)

where X_i ∈ R^p, U_i ∈ R, β is an unknown vector of parameters, f(·) is an unknown function, and the ε_i are independent and identically distributed random noise with mean 0 and variance σ^2, independent of (X_i, U_i).

The semiparametric partial linear model has been extensively studied and several approaches have been developed to construct the estimators. A penalized least-squares method was used in, for example, Wahba (1984), Engle, Granger, Rice and Weiss (1986), and Chen and Shiau (1991). A partial residual method was used, for example, in Cuzick (1992), and a profile likelihood approach was used in Severini and Wong (1992) and Carroll, Fan, Gijbels and Wand (1997). In addition, Fan, Härdle and Mammen (1998) have shown that an additive nonparametric component can be estimated as well as if the other components were known. Similar oracle properties were obtained by Linton (1997) and Mammen, Linton and Nielsen (1999). Several methods for estimating the additive functions have been proposed, including the marginal integration estimation methods of Tjøstheim and Auestad (1994) and Linton and Nielsen (1995), the backfitting algorithms of Buja, Hastie and Tibshirani (1989) and Opsomer and Ruppert (1998), the estimating equation methods of Mammen et al. (1999), and the Fourier series approximation approach of Amato, Antoniadis and De Feis (2002), among others.
The issue of achieving the information bound in this and other non- and semiparametric models has been examined by Ritov and Bickel (1990) and extensively discussed in Bickel, Klassen, Ritov and Wellner (1993). In this article we consider a difference based estimation method for the semiparametric partial linear model (1). The estimation procedure is optimal in the sense that the estimator of the linear component is asymptotically efficient and the estimator of the nonparametric component is asymptotically minimax rate optimal. The method of taking differences to eliminate the effect of the unknown nonparametric component has been used in both nonparametric and semiparametric settings. Rice (1984) introduced a first-order differencing estimator in a nonparametric regression model for estimating the variance of the random errors. Hall, Kay and Titterington (1990) and Munk, Bissantz, Wagner and Freitag (2005)

extended the idea to higher-order differences for efficient estimation of the variance in such a setting. Horowitz and Spokoiny (2001) used differencing for testing between a parametric model and a nonparametric alternative. In the semiparametric regression setting (1), Yatchew (1997) concentrated attention on estimation of the linear component and used differencing to eliminate bias induced by the presence of the nonparametric component. The properties of the resulting estimator of the linear component were considered under the condition that the nonparametric function f has a bounded first derivative. See also Fan and Huang (2001) and Lam and Fan (2007).

In this paper we treat estimation of both the linear and the nonparametric components. We use higher-order differences, built from a special class of difference sequences, for optimal efficiency in estimating the linear part. How the linear component is estimated after taking differences is important. It is interesting to note that, although the differences are correlated, the correlation should be ignored and the linear regression coefficient vector β should be estimated by the ordinary least squares estimator rather than by a generalized least squares estimator that takes into account the correlations among the differences. If the correlation structure is incorporated in the estimation, the resulting generalized least squares estimator will not be optimal.

Our procedure begins with taking differences of the ordered observations (ordered according to the values of U_i). Let d_t, t = 1, 2, ..., m + 1, be an order-m difference sequence that satisfies Σ_t d_t = 0 and Σ_t d_t^2 = 1. For i = 1, 2, ..., n − m, let

D_i = Σ_{t=1}^{m+1} d_t Y_{i+t−1}.   (2)

D_i can be seen as the mth order difference of Y_i. The goal of this step is to eliminate the effect of the nonparametric component f.
As we will show, by taking differences the effect of the parametric component is retained while the effect of the nonparametric component f becomes negligible, provided that the function f is suitably smooth. After this step, the problem reduces to a standard multiple linear regression problem. We then proceed as if there were no correlation among the differences {D_i} and estimate the linear regression coefficient vector β by the ordinary least squares estimator based on the differences. This estimator of β is shown to be asymptotically efficient. In this step it is important to ignore the correlation among the differences. The estimate of the nonparametric function f is then constructed by a kernel estimate based on the residuals of the linear fit. This estimator is shown to attain the optimal rate of convergence. The results show that in the case where the U_i are either fixed, or random and independent of the X_i, both the linear and nonparametric components of the semiparametric model are estimated as well as

if the other component were known. Hence the estimation procedure has a very desirable oracle property. Moreover, we also derive a test for linear combinations of the regression coefficients of the linear component. The test is fully specified, and the test statistic is shown to asymptotically have the usual F distribution under the null hypothesis. Both the estimation and the testing procedures are easily implementable. Numerical performance of the estimation procedure is studied using both simulated and real data. The simulation results are consistent with the theoretical findings: the effect of the nonparametric component on the linear component is negligible, and vice versa. In addition, the simulations show that the effect of the order of the difference sequence on the estimation accuracy is quite small. As real data applications, we demonstrate our estimation and testing methods in an analysis of an attitude data set discussed in Chatterjee and Price (1977) and a data set from the Framingham Heart Study.

The paper is organized as follows. We begin in Section 2 with the simpler case where U_i = i/n is fixed, to illustrate the whole procedure. In Section 3 we treat the general case where the U_i are random and possibly correlated with the X_i, and the main results are given. In Section 4, we introduce the testing problem. A simulation study is carried out in Section 5 to study the numerical performance of the procedure; real data applications are also discussed. The proofs are contained in Section 6.

2 Fixed Design

In this section, we consider the fixed design version of the semiparametric partial linear model (1) where the U_i = i/n are fixed. The covariates X_i in the linear component are assumed to be random. This version is easier to analyze than the case where the U_i are also random; we shall treat the random design version in Section 3. Let X_i = (X_{i1}, X_{i2}, ..., X_{ip})' be p-dimensional independent random vectors with covariance matrix Σ_X.
Define the Lipschitz ball Λ^α(M) in the usual way:

Λ^α(M) = {g : for all 0 ≤ x, y ≤ 1 and k = 0, ..., ⌊α⌋, |g^(k)(x)| ≤ M, and |g^(⌊α⌋)(x) − g^(⌊α⌋)(y)| ≤ M |x − y|^{α−⌊α⌋}}

where ⌊α⌋ is the largest integer less than α. Suppose f ∈ Λ^α(M) for some α > 0. Then the partial linear model (1) can be written as

Y_i = X_i'β + f(U_i) + ε_i = X_{i1}β_1 + X_{i2}β_2 + ... + X_{ip}β_p + f(U_i) + ε_i.   (3)

The goal is to estimate the coefficient vector β and the unknown function f. This will be done through difference based estimation. To achieve asymptotic efficiency, the usual first-order differences are not sufficient and higher-order differences are needed. In fact, the order of differences, m, needs to increase as the sample size n increases.

We begin by defining a suitable difference sequence. Suppose a sequence d_1, d_2, ..., d_{m+1} satisfies Σ_i d_i = 0 and Σ_i d_i^2 = 1. Such a sequence is called an mth order difference sequence. Moreover, for k = 1, 2, ..., m let c_k = Σ_{i=1}^{m+1−k} d_i d_{i+k}. Suppose

Σ_{k=1}^m c_k^2 = O(m^{−1}) as m → ∞.   (4)

One example of a sequence that satisfies these conditions is

d_1 = √(m/(m+1)),   d_2 = d_3 = ... = d_{m+1} = −1/√(m(m+1)).   (5)

Remark 1. The asymptotic results in the theorems to follow require that the order m → ∞ and that (4) be satisfied. However, even the simple choice of m = 2 seems to yield quite satisfactory performance, as attested by the simulations and real data examples in Section 5.

We now consider the difference based estimator of β. Let

D_i = Σ_{t=1}^{m+1} d_t Y_{i+t−1},   i = 1, 2, ..., n − m.   (6)

Then

D_i = Z_i'β + δ_i + w_i,   i = 1, 2, ..., n − m,   (7)

where Z_i = Σ_{t=1}^{m+1} d_t X_{i+t−1}, δ_i = Σ_{t=1}^{m+1} d_t f(U_{i+t−1}), and w_i = Σ_{t=1}^{m+1} d_t ε_{i+t−1}. Written in matrix form, (7) becomes

D = Zβ + δ + w   (8)

where D = (D_1, D_2, ..., D_{n−m})', w = (w_1, w_2, ..., w_{n−m})' and Z is the matrix whose ith row is Z_i'. Let Ψ denote the (n − m) × (n − m) covariance matrix of w.
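The difference sequence (5) and the differencing step (2) are straightforward to implement. The following is a minimal NumPy sketch; the function names are ours, not the paper's:

```python
import numpy as np

def difference_sequence(m):
    """Order-m difference sequence from equation (5):
    d_1 = sqrt(m/(m+1)), d_2 = ... = d_{m+1} = -1/sqrt(m(m+1)).
    It satisfies sum(d) = 0 and sum(d^2) = 1 by construction."""
    d = np.full(m + 1, -1.0 / np.sqrt(m * (m + 1)))
    d[0] = np.sqrt(m / (m + 1.0))
    return d

def apply_differences(y, d):
    """Compute D_i = sum_t d_t * y_{i+t-1} for i = 1, ..., n - m, as in (2)."""
    m = len(d) - 1
    n = len(y)
    return np.array([d @ y[i:i + m + 1] for i in range(n - m)])

d = difference_sequence(4)   # an order-4 sequence
```

Applying `apply_differences` to the ordered responses eliminates a smooth f asymptotically, since the weights sum to zero and neighboring U_i are close.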

The elements of the matrix Ψ are given by

Ψ_{i,j} = 1 for i = j,   Ψ_{i,j} = c_{|i−j|} for 1 ≤ |i − j| ≤ m,   Ψ_{i,j} = 0 otherwise.   (9)

In (7), the δ_i are the deterministic errors related to the nonparametric component f in (1), and the w_i are the random errors, which are correlated and have the covariance matrix Ψ = (Ψ_{i,j}) given by (9). For estimating the linear regression coefficient vector β, we shall ignore both the deterministic errors δ_i and the correlation among the random errors w_i. That is, we estimate β by the ordinary least squares estimate

β̂ = (Z'Z)^{−1} Z'D.   (10)

Although not entirely intuitive, it is important in this step to ignore the correlation among the w_i and use the ordinary least squares estimate. If instead a generalized least squares estimator, which incorporates the correlation structure, is used, the optimality results in Theorem 1 below and Theorem 3 in the next section will not generally be valid.

Theorem 1. Suppose α > 0, m → ∞, m/n → 0, and that Condition (4) is satisfied. Then the estimator β̂ given in (10) is asymptotically efficient, i.e.

√n(β̂ − β) →_L N(0, σ^2 Σ_X^{−1})

where Σ_X = Var(X_1). Hence the coefficient β is estimated as well as if the nonparametric component were not present, and β̂ is asymptotically efficient.

Remark 2. Using the generalized least squares method on the differences is equivalent to applying the ordinary least squares regression of Y on X in the original model (1). This would cause significant bias due to the presence of f. See Section 5 for a numerical comparison.

Remark 3. Our proof shows that √n(β̂ − β) → N(0, σ^2 (1 + 2 Σ_{k=1}^m c_k^2) Σ_X^{−1}). So when condition (4) fails, our results cannot hold. This means condition (4) is necessary for our procedure.

Once we have the estimator β̂ for the regression coefficients, we can plug it back into the original model (1) and remove the effect of the linear component. For i = 1, 2, ..., n the residuals of the linear fit are

r_i = Y_i − X_i'β̂ = f(U_i) + X_i'(β − β̂) + ε_i.
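As a numerical illustration of the estimator (10), the sketch below simulates the fixed design model with p = 1 and a smooth nuisance function of our own choosing (the sinusoid is illustrative, not from the paper), differences the data with the m = 2 sequence from (5), and runs ordinary least squares on the differences:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, beta = 500, 2, 1.5
u = np.arange(1, n + 1) / n                  # fixed design U_i = i/n
x = rng.normal(4 * u, 1.0)                   # X_i with mean depending on U_i
f = np.sin(2 * np.pi * u)                    # illustrative smooth nuisance f
y = x * beta + f + rng.normal(0, 1.0, n)

# order-2 difference sequence from (5): d_1 = sqrt(m/(m+1)), rest = -1/sqrt(m(m+1))
d = np.array([np.sqrt(m / (m + 1)), -1 / np.sqrt(m * (m + 1)), -1 / np.sqrt(m * (m + 1))])

# difference the responses and covariates, as in (6)-(8)
D = np.array([d @ y[i:i + m + 1] for i in range(n - m)])
Z = np.array([d @ x[i:i + m + 1] for i in range(n - m)])

# ordinary least squares on the differences, equation (10) (p = 1, so scalars)
beta_hat = (Z @ D) / (Z @ Z)
```

Because f is smooth and the design points are 1/n apart, the deterministic errors δ_i are negligible and β̂ lands close to the true β even though f is never estimated.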

The nonparametric component f is estimated by a kernel estimate based on the residuals r_i. Let k(x) be a kernel function satisfying ∫k(x)dx = 1 and ∫x^j k(x)dx = 0 for j = 1, 2, ..., ⌊α⌋. Take h = n^{−1/(1+2α)} and let

K_{i,h}(u) = (1/h) ∫_{(U_{i−1}+U_i)/2}^{(U_i+U_{i+1})/2} k((u − v)/h) dv,   i = 1, 2, ..., n.

The estimator f̂ is then given by

f̂(u) = Σ_{i=1}^n K_{i,h}(u) r_i = Σ_{i=1}^n K_{i,h}(u)(Y_i − X_i'β̂).   (11)

We have the following theorem for the estimator f̂.

Theorem 2. For all α > 0, the estimator f̂ given in (11) satisfies

sup_{f ∈ Λ^α(M)} E[∫ (f̂(x) − f(x))^2 dx] ≤ C n^{−2α/(1+2α)}

for some constant C > 0. Moreover, for any x_0 ∈ (0, 1),

sup_{f ∈ Λ^α(M)} E[(f̂(x_0) − f(x_0))^2] ≤ C n^{−2α/(1+2α)}.

It is well known in nonparametric regression that the minimax rate of convergence for estimating f over the Lipschitz ball Λ^α(M) is n^{−2α/(1+2α)} under both the global integrated squared error loss and the pointwise squared error loss. That is,

inf_{f̂} sup_{f ∈ Λ^α(M)} E[∫ (f̂(x) − f(x))^2 dx] ≍ inf_{f̂} sup_{f ∈ Λ^α(M)} E[(f̂(x_0) − f(x_0))^2] ≍ n^{−2α/(1+2α)},

where for two sequences a_n > 0 and b_n > 0, a_n ≍ b_n means that there exist constants c_2 ≥ c_1 > 0 such that c_1 ≤ a_n/b_n ≤ c_2 for all n. Hence Theorem 2 shows that the estimator f̂ given in (11) attains the optimal rate of convergence over the Lipschitz ball Λ^α(M) under both the global and local losses.

Remark 4. Instead of using the kernel method to estimate the nonparametric component f, other methods such as wavelet thresholding can also be used. Wavelet estimators can achieve better spatial adaptivity over more general function spaces than the Lipschitz ball Λ^α(M).
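The residual-smoothing step can be sketched as follows. For simplicity this uses a Nadaraya-Watson smoother with a Gaussian kernel as a stand-in for the bin-integrated kernel weights K_{i,h} in (11), with an illustrative fixed bandwidth of the order n^{−1/(1+2α)}; the data-generating f is again our own choice:

```python
import numpy as np

def kernel_fit(u_eval, u, r, h):
    """Nadaraya-Watson smoother with a Gaussian kernel: a simplified
    stand-in for the kernel weights K_{i,h} in equation (11)."""
    w = np.exp(-0.5 * ((u_eval[:, None] - u[None, :]) / h) ** 2)
    return (w * r).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(1)
n = 500
u = np.sort(rng.uniform(0, 1, n))
f = np.sin(2 * np.pi * u)              # illustrative f
r = f + rng.normal(0, 0.3, n)          # stand-in for residuals Y_i - X_i' beta_hat
h = 0.05                               # illustrative bandwidth of order n^{-1/(1+2a)}
f_hat = kernel_fit(u, u, r, h)
```

Since β̂ is root-n consistent, smoothing the residuals r_i behaves asymptotically like smoothing f(U_i) + ε_i directly, which is what drives the oracle rate in Theorem 2.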

3 Random Design

We now turn to the random design version of the partial linear model (1) where both X_i and U_i are assumed to be random. Again let the X_i be p-dimensional random vectors. Let the U_i be random variables on [0, 1] and suppose that (X_i, U_i), i = 1, ..., n, are independent with an unknown joint density function g(x, u). Assume the ε_i are independent and identically distributed with mean 0 and variance σ^2 and are also independent of (X_i, U_i). Let h(U) = E(X | U) and S(U) = E(XX' | U). Suppose f(u) ∈ Λ^α(M_f) and h(u) ∈ Λ^γ(M_h) for some α > 0 and γ > 0. (When X is a vector, this means each coordinate of h(u) lies in Λ^γ(M_h).) Moreover, suppose the marginal density of U is bounded away from 0, i.e., there exists a constant c_1 > 0 such that ∫g(x, u)dx ≥ c_1 for any u ∈ [0, 1].

Suppose U_(1) ≤ U_(2) ≤ ... ≤ U_(n) are the order statistics of the U_i and the X_(i) are the corresponding X's. Note that the X_(i) are not the order statistics of the X_i, but the X associated with U_(i). Similar to the fixed design case, we take the mth order differences

D_i = Σ_{t=1}^{m+1} d_t Y_(i+t−1) = Z_i'β + δ_i + w_i

where Z_i = Σ_{t=1}^{m+1} d_t X_(i+t−1), δ_i = Σ_{t=1}^{m+1} d_t f(U_(i+t−1)), and w_i = Σ_{t=1}^{m+1} d_t ε_(i+t−1). Again we estimate the linear regression coefficient vector β by the ordinary least squares estimate

β̂ = (Z'Z)^{−1} Z'D.   (12)

In using the ordinary least squares estimator β̂ given in (12) we have ignored the errors δ_i, the correlation between Z_i and δ_i, and the correlation among the w_i. We have the following theorem.

Theorem 3. When α + γ > 1/2 and S(u) > 0 for every u, the estimator β̂ is asymptotically efficient, i.e.

√n(β̂ − β) →_L N(0, σ^2 Σ_{X|U}^{−1})

where Σ_{X|U} = E[(X_1 − E(X_1 | U_1))(X_1 − E(X_1 | U_1))'].

Remark 5. We can see from this theorem that we do not always need α > 1/2 to ensure asymptotic efficiency. We only need one of the two functions f(u) and h(u) = E(X | U = u) to have minimal smoothness. Theorem 1 can be considered a special case where γ is infinity. When the condition α + γ > 1/2 fails, Theorem 3 does not hold. In this case

√n(β̂ − β) may not be asymptotically normal. To see this, consider the following example. Take p = 1 and f(u) = h(u) ∈ Λ^α(M) for 0 < α < 1/4. Suppose X_i and ε_i both follow the standard normal distribution. Set X̂_i = X_i − h(U_i) and τ_i = (X̂_i β + ε_i)/√(1 + β^2). Then the τ_i are iid N(0, 1) and model (1) can be written as

Y_i = h(U_i)(1 + β) + √(1 + β^2) τ_i.   (13)

Treating h(U_i)(1 + β) as an unknown mean function, the problem of estimating the regression coefficient β becomes that of estimating the variance in model (13). It follows from Wang, Brown, Cai and Levine (2006) that the minimax rate for this variance estimation problem is n^{−4α} ∨ n^{−1}. Therefore Theorem 3 fails in this case.

Remark 6. When X and U are independent, E(X | U) = E(X). In this case, the result becomes √n(β̂ − β) →_L N(0, σ^2 Σ_X^{−1}). This is the same as the fixed design case, and the estimator β̂ performs as well as if the nonparametric component were known.

Remark 7. Yatchew (1997) considered the partial linear model (1) under the conditions that both f and h have bounded first derivatives, i.e., α = 1 and γ = 1, and obtained similar results. In this case the condition α + γ > 1/2 of Theorem 3 is easily satisfied.

Once we have an estimate of β, we can use the same procedure to estimate f(u) as in the fixed design case, and we have the same result.

Theorem 4. For all α > 0, the estimator f̂ given in (11) satisfies

sup_{f ∈ Λ^α(M)} E[∫ (f̂(x) − f(x))^2 dx] ≤ C n^{−2α/(1+2α)}

for some constant C > 0. Moreover, for any x_0 ∈ (0, 1),

sup_{f ∈ Λ^α(M)} E[(f̂(x_0) − f(x_0))^2] ≤ C n^{−2α/(1+2α)}.

Hence in this case the estimator f̂ also attains the optimal rate of convergence over the Lipschitz ball Λ^α(M) under both the global and local losses. The proof of Theorem 3 is given in Section 6. The following lemma is one of the main technical tools for the proof, and it is also useful in the development of the test given in Section 4.
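The random design procedure of Theorem 3, order the data by U, difference, then run ordinary least squares, can be sketched as follows. The data-generating design (two covariates with conditional means depending on U, and a cosine nuisance function) is our own illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 800, 2
u = rng.uniform(0, 1, n)
x = np.column_stack([rng.normal(u, 1.0), rng.normal(2 * u, 1.0)])  # X correlated with U
beta = np.array([2.0, -1.0])
y = x @ beta + np.cos(2 * np.pi * u) + rng.normal(0, 1.0, n)       # illustrative f

# order the observations by U, then difference the ordered data
idx = np.argsort(u)
xs, ys = x[idx], y[idx]
d = np.array([np.sqrt(m / (m + 1)), -1 / np.sqrt(m * (m + 1)), -1 / np.sqrt(m * (m + 1))])
D = np.array([d @ ys[i:i + m + 1] for i in range(n - m)])
Z = np.array([d @ xs[i:i + m + 1] for i in range(n - m)])           # rows Z_i

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ D)                        # OLS estimate (12)
```

Here both f and h(u) = E(X | U = u) are smooth, so the condition α + γ > 1/2 of Theorem 3 is comfortably satisfied and the correlation between X and U does not bias β̂.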

Lemma 1. Under the assumptions of Theorem 3, we have

Z'Z/n →_P Σ_{X|U} as n → ∞,

Z'ΨZ/n →_P (1 + 2 Σ_{k=1}^m c_k^2) Σ_{X|U} as n → ∞,

Z'δδ'Z/n = O_p(n^{−2α}) + O_p(n^{1−2α−2γ}) as n → ∞.

4 Testing the Linear Component

We have so far focused on the estimation problem. In many contexts, hypothesis testing is also important under the semiparametric partial linear model (1). For example, it is often of interest to test the hypothesis that a subset of the coefficients in the linear component is equal to zero. In this section, we consider the problem of testing the null hypothesis that the linear regression coefficients satisfy certain linear constraints. That is, we wish to test H_0 : Cβ = 0 against H_1 : Cβ ≠ 0, where C is an r × p matrix with rank(C) = r. A special case is testing the hypothesis H_0 : β_{i_1} = ... = β_{i_r} = 0. In this section, we shall assume the errors ε_i are independent and identically distributed N(0, σ^2) variables.

4.1 Fixed design or independent case

We divide the testing problem into two cases. We first consider the case where U_i = i/n (fixed design) or the U_i are random but independent of the X_i. From the previous sections, we know that asymptotically in this case the estimator β̂ of the linear regression coefficient vector β satisfies √n(β̂ − β) → N(0, σ^2 Σ_X^{−1}). This means that asymptotically √n(Cβ̂ − Cβ) → N(0, σ^2 CΣ_X^{−1}C'). So our test statistic will be based on nσ^{−2} β̂'C'(CΣ_X^{−1}C')^{−1}Cβ̂. Both the covariance matrix Σ_X and the error variance σ^2 are unknown in general and thus need to be estimated. It follows from Lemma 1 that Z'Z/n →_{a.s.} Σ_X in this case. If σ^2 is given, the test statistic

σ^{−2} β̂'C'(C(Z'Z)^{−1}C')^{−1}Cβ̂

has a limiting χ^2 distribution with r degrees of freedom.
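The test statistic is simple to compute from the differenced data. The sketch below simulates a fixed design with p = 2 (the design and the sinusoidal nuisance f are our illustrative choices), estimates σ^2 by the normalized residual sum of squares ‖HD‖^2/(n − m − p) from the difference regression, and forms the F statistic for H_0 : β_2 = 0, which is false here so the statistic should be large:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, p = 600, 2, 2
u = np.arange(1, n + 1) / n
x = rng.normal(size=(n, p))                  # covariates independent of U
beta = np.array([1.0, 1.0])
y = x @ beta + np.sin(2 * np.pi * u) + rng.normal(0, 1.0, n)   # sigma^2 = 1

d = np.array([np.sqrt(m / (m + 1)), -1 / np.sqrt(m * (m + 1)), -1 / np.sqrt(m * (m + 1))])
D = np.array([d @ y[i:i + m + 1] for i in range(n - m)])
Z = np.array([d @ x[i:i + m + 1] for i in range(n - m)])

ZtZ_inv = np.linalg.inv(Z.T @ Z)
beta_hat = ZtZ_inv @ Z.T @ D

# variance estimate ||HD||^2 / (n - m - p), where HD is the OLS residual of D on Z
resid = D - Z @ beta_hat
sigma2_hat = resid @ resid / (n - m - p)

# F statistic for H0: C beta = 0 with C = [0 1], i.e. beta_2 = 0
C = np.array([[0.0, 1.0]])
r = C.shape[0]
Cb = C @ beta_hat
F = (Cb @ np.linalg.solve(C @ ZtZ_inv @ C.T, Cb) / r) / sigma2_hat
```

Under H_0 the statistic would be compared with the 1 − α quantile of the F(r, n − m − p) distribution; here β_2 = 1, so F is far beyond any reasonable critical value.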

To estimate the error variance σ^2, set H = I − Z(Z'Z)^{−1}Z'. It is easy to check that H is a projection of rank n − m − p, i.e., H^2 = H and rank(H) = n − m − p. We shall use σ̂^2 = ‖HD‖_2^2/(n − m − p) to estimate σ^2. Note that

HD = D − Z(Z'Z)^{−1}Z'D = H(δ + w).

Write

‖HD‖_2^2 = (δ + w)'H(δ + w) = w'Hw + 2w'Hδ + δ'Hδ.

Suppose α > 1/2. Then it is easy to see that δ'Hδ →_{a.s.} 0 as n → ∞. Since w'Hδ | δ ~ N(0, σ^2 δ'Hδ), we know that w'Hδ →_{a.s.} 0. Here we also assume that the first term of the difference sequence satisfies d_1^2 = O(m^{−1}) (the sequence given in (5) satisfies this condition). It can be shown that σ^{−2} w'Hw is approximately distributed as chi-squared with n − m − p degrees of freedom. We thus use σ̂^2 = ‖HD‖_2^2/(n − m − p) as an estimator of σ^2. We have the following theorem.

Theorem 5. Suppose α > 1/2 and d_1^2 = O(m^{−1}). For testing H_0 : Cβ = 0 against H_1 : Cβ ≠ 0, where C is an r × p matrix with rank(C) = r, the test statistic

F = [β̂'C'(C(Z'Z)^{−1}C')^{−1}Cβ̂ / r] / σ̂^2

asymptotically follows the F(r, n − m − p) distribution under the null hypothesis. Moreover, the asymptotic power of this test (at local alternatives) is the same as that of the usual F test when f is not present in the model (1).

Hence an approximate level-α test of H_0 : Cβ = 0 against H_1 : Cβ ≠ 0 is to reject the null hypothesis H_0 when the test statistic F ≥ F_{r,n−m−p;α}, where F_{r,n−m−p;α} is the 1 − α quantile of the F(r, n − m − p) distribution.

4.2 General random design case

We now turn to the testing problem in the general random design case where the U_i are random and correlated with the X_i. Again suppose that (X_i, U_i), i = 1, ..., n, are independent with an unknown joint density function g(x, u). We will show that the same F test also works in this case. Notice that in this case, asymptotically

√n(Cβ̂ − Cβ) →_L N(0, σ^2 CΣ_{X|U}^{−1}C'),

where Σ_{X|U} = E[(X_1 − E(X_1 | U_1))(X_1 − E(X_1 | U_1))']. Lemma 1 shows that Z'Z/n converges to Σ_{X|U}. Based on this observation and the discussion given in Section 4.1, we have the following theorem.

Theorem 6. Suppose α > 1/2 and d_1^2 = O(m^{−1}). For testing H_0 : Cβ = 0 against H_1 : Cβ ≠ 0, where C is an r × p matrix with rank(C) = r, the test statistic

F = [β̂'C'(C(Z'Z)^{−1}C')^{−1}Cβ̂ / r] / σ̂^2

asymptotically follows the F(r, n − m − p) distribution under the null hypothesis.

5 Numerical Study

The difference based procedure for estimating the linear coefficients and the unknown function introduced in the previous sections is easily implementable. In this section we investigate the numerical performance of the estimator using both simulations and analysis of real data. In the simulation studies we examine the effect of the nonparametric component f on the estimation accuracy of the linear component, as well as the effect of the presence of the linear component on the estimation accuracy of f. In addition, the effect of the order of the difference sequence is considered. We also apply our estimation and testing procedures to the analysis of the attitude data from Chatterjee and Price (1977) and a data set from the Framingham Heart Study.

5.1 Simulation

We first study the effect of the unknown nonparametric component f on the estimation accuracy of the linear component and then investigate the effect of the order of the difference sequence. In the first simulation study, we take n = 500, U_i iid Uniform(0, 1), and consider the following four test functions:

f_1(x) = 3 − 30x for 0 ≤ x ≤ 0.1,  20x − 1 for 0.1 < x ≤ 0.25,  4 + (4x − 1)^{8/9} for 0.25 < x ≤ 0.725,  (x − 0.725) for 0.725 < x ≤ 0.89,  (x − 0.89)/ for 0.89 < x ≤ 1,

f_2(x) = 1 + 4(e^{−550(x−0.2)^2} + e^{−200(x−0.8)^2} + e^{−950(x−0.8)^2}),

where f_3(x) = Σ_{j=1}^{11} h_j (1 + |x − x_j|/w_j)^{−4}, with

(x_j) = (0.10, 0.13, 0.15, 0.23, 0.25, 0.40, 0.44, 0.65, 0.76, 0.78, 0.81),
(h_j) = (4, 5, 3, 4, 5, 4.2, 2.1, 4.3, 3.1, 5.1, 4.2),
(w_j) = (0.005, 0.005, 0.006, 0.01, 0.01, 0.03, 0.01, 0.01, 0.005, 0.008, 0.005),

and

f_4(x) = √(x(1 − x)) sin(2.1π/(x + 0.05)).

The test functions f_3 and f_4 are the Bumps and Doppler functions given in Donoho and Johnstone (1994). In the simulations these functions are normalized to have unit variance. The four functions are plotted in Figure 1 below.

Figure 1: Plots of the four test functions.

We also consider the case where f ≡ 0 for comparison. The errors ε_i are generated from the standard normal distribution. For X_i and β, we consider two cases:

Case (1). p = 1, X_i ~ N(4U_i, 1), β = 1;

Case (2). p = 3, X_i ~ N((U_i, 2U_i, 4U_i^2)', I_3), β = (2, 2, 4)', where I_3 denotes the 3 × 3 identity matrix.

We first examine the effect of the unknown function f on the estimation of the linear component. In this part, the difference sequence in equation (5) with m = 2 is used. The mean squared errors (MSEs) of the estimator β̂ are calculated over 100 simulation runs. Note that for Case (2) the average squared error (1/3)[(β̂_1 − β_1)^2 + (β̂_2 − β_2)^2 + (β̂_3 − β_3)^2] is computed in each simulation and then averaged over the 100 simulations. We also consider the case where the presence of f is completely ignored and we directly run the least squares regression of Y on X in model (1). This is equivalent to estimating β by the generalized least squares regression of D on Z in (8) after taking the differences. The results are summarized in Table 1; the numbers inside the parentheses are the MSEs of the estimate when the nonparametric component is ignored. By comparing the MSEs in each row, it can easily be seen that after taking the differences the effect of the nonparametric component f on the estimation accuracy of β is fairly small in this study. This means that in many cases we can estimate the linear coefficients nearly as well as if f were known. On the other hand, if f is simply ignored and β is estimated by applying the least squares regression of Y on X directly, the estimator is highly inaccurate: the mean squared errors are from 2 to over 600 times as large as those of the corresponding estimators based on the differences.

           f ≡ 0    f_1 (ignored)    f_2 (ignored)    f_3 (ignored)    f_4 (ignored)
Case (1)            (.942)           (0.054)          (0.04)           (0.0)
Case (2)            (0.70)           (0.025)          (0.009)          (0.007)

Table 1: The MSEs of the estimate β̂ over 100 replications with sample size n = 500. The numbers inside the parentheses are the MSEs of the estimate when the nonparametric component is ignored.

For estimating the nonparametric function f, we use a kernel method with Parzen's kernel.
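The Bumps and Doppler test functions above (Donoho and Johnstone, 1994) and the unit-variance normalization can be coded directly; the function names below are ours:

```python
import numpy as np

def bumps(x):
    """Donoho-Johnstone Bumps: f_3(x) = sum_j h_j (1 + |x - x_j| / w_j)^(-4)."""
    xj = np.array([0.10, 0.13, 0.15, 0.23, 0.25, 0.40, 0.44, 0.65, 0.76, 0.78, 0.81])
    hj = np.array([4.0, 5.0, 3.0, 4.0, 5.0, 4.2, 2.1, 4.3, 3.1, 5.1, 4.2])
    wj = np.array([0.005, 0.005, 0.006, 0.01, 0.01, 0.03, 0.01, 0.01, 0.005, 0.008, 0.005])
    return (hj * (1 + np.abs(x[:, None] - xj) / wj) ** -4).sum(axis=1)

def doppler(x):
    """Donoho-Johnstone Doppler: f_4(x) = sqrt(x(1-x)) sin(2.1*pi/(x + 0.05))."""
    return np.sqrt(x * (1 - x)) * np.sin(2.1 * np.pi / (x + 0.05))

def normalize(v):
    """Rescale function values to unit sample variance, as done in the simulations."""
    return v / v.std()

x = np.linspace(0.0, 1.0, 1000)
g = normalize(doppler(x))
```

These two functions are deliberately spatially inhomogeneous, which makes them demanding test cases for the kernel step while leaving the differencing step essentially unaffected.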
The bandwidth was selected by cross-validation; see, for example, Härdle (1991) and Scott (1992). For comparison, we also carried out the simulation in the case where β = 0, i.e., the linear component is not present. The mean squared errors of the estimated f are summarized in Table 2. It can be seen that the MSEs in each column are close to each other, and hence the performance of our estimator f̂ does not depend sensitively on the structure of X and β.
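A standard way to choose the bandwidth by cross-validation is leave-one-out prediction error. The sketch below uses a Gaussian-kernel Nadaraya-Watson smoother as a stand-in for Parzen's kernel, and an illustrative candidate grid of our own choosing:

```python
import numpy as np

def loocv_bandwidth(u, r, grid):
    """Leave-one-out cross-validation: for each candidate bandwidth, predict
    each r_i from all other observations and return the bandwidth in `grid`
    minimizing the mean squared prediction error."""
    best_h, best_score = grid[0], np.inf
    for h in grid:
        w = np.exp(-0.5 * ((u[:, None] - u[None, :]) / h) ** 2)
        np.fill_diagonal(w, 0.0)               # leave observation i out
        fit = (w * r).sum(axis=1) / w.sum(axis=1)
        score = np.mean((r - fit) ** 2)
        if score < best_score:
            best_h, best_score = h, score
    return best_h

rng = np.random.default_rng(4)
u = np.sort(rng.uniform(0, 1, 200))
r = np.sin(2 * np.pi * u) + rng.normal(0, 0.3, 200)   # residual-like data
grid = [0.02, 0.05, 0.1, 0.2]
h_star = loocv_bandwidth(u, r, grid)
```

Zeroing the diagonal of the weight matrix is what makes each fitted value a genuine out-of-sample prediction, so the criterion does not collapse to interpolation at small bandwidths.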

           f ≡ 0    f_1    f_2    f_3    f_4
β = 0
Case (1)
Case (2)

Table 2: The MSEs of the estimate f̂ over 100 replications with sample size n = 500.

We now consider the effect of the order of the difference sequence m on the estimation accuracy. In this study, different combinations of the function f and the Cases (1) and (2) yield basically the same results. As an illustration, we focus on Case (2) and f = f_2. We compare four different orders m: 2, 4, 8, 16. The difference sequence in equation (5) was used in each case. We summarize in Table 3 the mean and standard deviation of the estimate β̂ and the average MSE of the estimate f̂. By comparing the means and standard deviations in each row we can see that the performance of the estimator does not depend significantly on m. Increasing m may cause more bias in the estimator of β but can decrease the standard deviation slightly if m is not too large.

                    m = 2          m = 4          m = 8          m = 16
Mean (sd) of β̂_1    2.003 (0.057)  2.007 (0.053)  1.995 (0.049)  1.997 (0.05)
Mean (sd) of β̂_2    (0.05)         2.004 (0.05)   2.007 (0.048)  1.999 (0.049)
Mean (sd) of β̂_3    (0.057)        4.002 (0.047)  4.000 (0.043)  3.992 (0.049)
MSE of f̂

Table 3: The mean and standard deviation of the estimate β̂ and the average MSEs of the estimate f̂ over 100 replications with sample size n = 500.

5.2 Application to Attitude Data

We now apply our estimation and testing procedures to the analysis of the attitude data. This data set was first analyzed in Chatterjee and Price (1977) using multiple linear regression and variable selection. The data set came from a study of the performance of supervisors and was collected from a survey of the clerical employees of a large financial organization. The questions concerned the employees' satisfaction with their supervisors. The survey was designed to measure the overall performance of a supervisor, and included questions related to specific characteristics of the supervisor. An exploratory study was

undertaken to explain the relationship between specific supervisor characteristics and overall satisfaction with supervisors as perceived by the employees. The data are aggregated from the questionnaires of approximately 35 employees in each of the 30 departments; the numbers give the percentage of favorable responses to seven questions in each department. Seven variables are considered here: Y (overall rating of the job being done by the supervisor), X_1 (raises based on performance), X_2 (handling of employee complaints), X_3 (does not allow special privileges), X_4 (opportunity to learn new things), X_5 (rate of advancing to better jobs), and U (too critical of poor performance). The goal is to understand the effect of the other variables (X_1, ..., X_5 and U) on the rating Y. Figure 2 plots each explanatory variable against the response Y. From Figure 2 we can see that the effect of U on Y is not linear, while the effects of the other variables are roughly linear. So we employ the following semiparametric partial linear model,

Y = β_1 X_1 + β_2 X_2 + β_3 X_3 + β_4 X_4 + β_5 X_5 + f(U) + ε.   (14)

Figure 2: Plots of the individual explanatory variables (Raises X_1, Complaints X_2, Privileges X_3, Learning X_4, Advance X_5, Critical U) against the response variable Rating Y.

Using the estimation procedure discussed in Section 3 with m = 2, the linear component in the model (14) is estimated. The estimated coefficients, together with the F statistics and p values for testing each coefficient H_{i0} : β_i = 0 against H_{i1} : β_i ≠ 0, are given in Table 4. The p values for β_1, β_3 and β_5 are exceedingly large.

       Estimated coefficient    F statistic    p value
X_1
X_2
X_3
X_4
X_5

Table 4: The estimated coefficients of the linear component and the significance tests.

We thus perform the simultaneous F test of the hypothesis H_0 : β_1 = β_3 = β_5 = 0 against H_1 : at least one of them is nonzero; this test fails to reject H_0. In comparison, the global hypothesis H_0 : β_1 = ... = β_5 = 0 is decisively rejected. We can thus refine the linear component by using only Complaints (X_2) and Learning (X_4) as independent variables; the F test for this reduced linear component is again highly significant.

We can then estimate the nonparametric component, the effect of Critical (U). For this, we run the kernel estimation using the residuals of the linear fits as we did in Section 5.1. Figure 3 shows the nonparametric fits. The left panel plots the estimate of f under the model (14) when the linear component is ignored, the middle panel plots the estimate of f under the model (14) with all linear variables, and the right panel plots f̂ with only the variables X_2 and X_4 in the linear part. The plot in the left panel is quite different from the other two, while the middle and right plots are similar, since including a small number of additional non-significant variables does not have a large effect on the estimates of the remaining parts of the model. Note that in Chatterjee and Price (1977), standard multiple linear regression was used to model the relationship between the response and the explanatory variables.
The linear model failed to detect the relationship between the variable U and Y, and it was concluded that the variable U did not have a significant effect on Y.

Figure 3: Kernel estimates of the nonparametric component f, plotted against Critical (U). The points are the residuals of the respective linear fits.

5.3 A Data Set from the Framingham Heart Study

To further illustrate our method, we apply it to a data set from the Framingham Heart Study. The objective of the Framingham Heart Study was to identify the common factors or characteristics that contribute to cardiovascular disease by following its development over a long period of time in a large group of participants who had not yet developed overt symptoms of cardiovascular disease or suffered a heart attack or stroke. For more details on the study, see for example Dawber, Meadors and Moore (1951). In our study, we use a data set from Liang, Härdle and Carroll (1999). The sample used here consists of n = 65 males. Let Y denote the average blood pressure in a fixed two-year period, X denote the age, and U denote the logarithm of the serum cholesterol level observed at exam 2. Then we have the following semiparametric partial linear model:

Y = βX + f(U) + ε.

We apply our procedure with m = 2 to estimate the linear coefficient and run the significance test. The estimated linear component is β̂X, and the p-value for testing β = 0 is very small, so the linear effect of age is highly significant. The unknown nonparametric function f(U) is then estimated based on the residuals of this linear fit. As in Section 5.1, we use Parzen's kernel and choose the bandwidth through cross-validation. Figure 4 below shows the nonparametric fit.
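The kernel step can be sketched in the same way. The snippet below smooths simulated residuals with a Nadaraya-Watson estimator and a leave-one-out cross-validated bandwidth; for brevity it uses a Gaussian kernel rather than Parzen's kernel, and the data, the true function and all names are our own assumptions, not the Framingham data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
U = np.sort(rng.uniform(0, 1, n))
# Stand-in for the residuals of the linear fit: true f plus noise.
r = np.sin(2 * np.pi * U) + rng.normal(scale=0.3, size=n)

def nw_fit(u0, u, y, h):
    """Nadaraya-Watson estimate at u0 with a Gaussian kernel and bandwidth h."""
    K = np.exp(-0.5 * ((u0 - u) / h) ** 2)
    return K @ y / K.sum()

def loo_cv(h):
    """Leave-one-out cross-validation score for bandwidth h."""
    idx = np.arange(n)
    preds = [nw_fit(U[i], U[idx != i], r[idx != i], h) for i in range(n)]
    return np.mean((r - np.array(preds)) ** 2)

grid = np.linspace(0.01, 0.2, 20)
h_cv = grid[np.argmin([loo_cv(h) for h in grid])]   # cross-validated bandwidth
f_hat = np.array([nw_fit(u, U, r, h_cv) for u in U])
```

Plotting `f_hat` against `U` gives a figure of the same kind as Figures 3 and 4.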

Figure 4: Kernel estimate of the nonparametric component f(U). The points are the residuals of the linear fit (blood pressure), plotted against U (log cholesterol level).

6 Proofs

We shall prove Theorems 1, 2, 3 and 5. The proofs of Theorems 4 and 6 are similar to those of Theorems 2 and 5, respectively.

6.1 Proof of Theorem 1

We first state and prove two lemmas. Theorem 1 then follows directly from these two lemmas.

Lemma 2. Under the assumptions of Theorem 1,

√n (Z′Z)^{-1} Z′w →_L N(0, (1 + O(m^{-1})) σ² Σ_X^{-1}).

Proof. Note that Z′w is a linear combination of the independent errors ε_i. The asymptotic normality of √n (Z′Z)^{-1} Z′w therefore follows from the Central Limit Theorem and the fact that Σ_{k=1}^{m} c_k² = O(m^{-1}). Moreover, E(√n (Z′Z)^{-1} Z′w) = 0. We now turn to the variance. It is easy to see that

Var((Z′Z)^{-1} Z′w ∣ Z) = σ² (Z′Z)^{-1} Z′ΨZ (Z′Z)^{-1}.

Note that

Z′Z = Σ_{i=1}^{n-m} Z_i Z_i′ = Σ_{i=1}^{n-m} (Σ_{t=0}^{m} d_t X_{i+t})(Σ_{t=0}^{m} d_t X_{i+t})′,

so, with m = o(n), (1/n) Z′Z →_{a.s.} Σ_X. Note also that

Z′ΨZ = Z′Z + 2 Σ_{k=1}^{m} c_k [ Σ_{i=1}^{n-m-k} (Σ_{t=0}^{m} d_t X_{i+t})(Σ_{t=0}^{m} d_t X_{i+k+t})′ ].

Hence, with m = o(n), for any k = 1, 2, ..., m,

(1/n) Σ_{i=1}^{n-m-k} (Σ_t d_t X_{i+t})(Σ_t d_t X_{i+k+t})′ →_{a.s.} E[(Σ_t d_t X_{i+t})(Σ_t d_t X_{i+k+t})′] = (Σ_{t=0}^{m-k} d_t d_{t+k}) Σ_X = c_k Σ_X.

This implies

n Var((Z′Z)^{-1} Z′w ∣ Z) →_{a.s.} σ² Σ_X^{-1} Σ_X (1 + 2 Σ_{k=1}^{m} c_k²) Σ_X^{-1} = (1 + O(m^{-1})) σ² Σ_X^{-1}.

This limit does not depend on X or n, so

√n (Z′Z)^{-1} Z′w →_L N(0, (1 + O(m^{-1})) σ² Σ_X^{-1}).

Lemma 3. Under the assumptions of Theorem 1,

n E[((Z′Z)^{-1} Z′δ)((Z′Z)^{-1} Z′δ)′] = O((m/n)^{2(α∧1)}) Σ_X^{-1}.

Proof. Note that E(Σ_{i=1}^{n-m} (Σ_t d_t X_{i+t}) δ_i) = 0. This implies E{(Z′Z)^{-1} Z′δ} = 0. Now

E[(Σ_{i=1}^{n-m} (Σ_t d_t X_{i+t}) δ_i)(Σ_{i=1}^{n-m} (Σ_t d_t X_{i+t}) δ_i)′] = (Σ_{i=1}^{n-m} δ_i² + 2 Σ_{k=1}^{m} c_k Σ_{j=1}^{n-m-k} δ_j δ_{j+k}) Σ_X.

When m = o(n), since f ∈ Λ^α(M), δ_i ≤ M (m/n)^{α∧1}. So

Σ_{i=1}^{n-m} δ_i² + 2 Σ_{k=1}^{m} c_k Σ_{j=1}^{n-m-k} δ_j δ_{j+k} = O(n^{1-2(α∧1)} m^{2(α∧1)}).

Also we know that (1/n) Z′Z →_{a.s.} Σ_X; therefore, as n → ∞,

n Var((Z′Z)^{-1} Z′δ) = E[ n (Z′Z)^{-1} (Σ_{i=1}^{n-m} (Σ_t d_t X_{i+t}) δ_i)(Σ_{i=1}^{n-m} (Σ_t d_t X_{i+t}) δ_i)′ (Z′Z)^{-1} ] = n · O(n^{1-2(α∧1)} m^{2(α∧1)}) · n^{-2} (Σ_X^{-1}) = O((m/n)^{2(α∧1)}) (Σ_X^{-1}).

Theorem 1 now follows from Lemmas 2 and 3. Note that √n (β̂ - β) = √n (Z′Z)^{-1} Z′(w + δ). Lemma 3 implies √n (Z′Z)^{-1} Z′δ →_P 0. This together with Lemma 2 yields Theorem 1.

6.2 Proof of Theorem 2

We shall only prove the convergence rate under the pointwise squared error loss; the rate of convergence under the global mean integrated squared error risk can be derived using a similar line of argument. Let

f̂_1(u) = Σ_{i=1}^{n} K_{i,h}(u)(f(U_i) + ε_i) and f̂_2(u) = Σ_{i=1}^{n} K_{i,h}(u) X_i′(β - β̂);

then f̂(u) = f̂_1(u) + f̂_2(u). In f̂_1 there is no linear component, so from standard nonparametric regression results we know that for any x_0,

sup_{f ∈ Λ^α(M)} E[(f̂_1(x_0) - f(x_0))²] ≤ C n^{-2α/(1+2α)}

for some constant C > 0. Note that Σ_{i=1}^{n} K²_{i,h}(u) = O(1/(nh)) = O(n^{-2α/(1+2α)}) and E(X_i′(β - β̂))² = O(n^{-1}). So

E(f̂_2(x_0)²) = E[(Σ_{i=1}^{n} K_{i,h} X_i′(β - β̂))²] ≤ n Σ_{i=1}^{n} K²_{i,h} E(X_i′(β - β̂))² = O(n^{-2α/(1+2α)}).

Putting these together, we have

sup_{f ∈ Λ^α(M)} E[(f̂(x_0) - f(x_0))²] ≤ C_1 n^{-2α/(1+2α)}

for some constant C_1 > 0. Theorem 4 can be proved by using the same argument.
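The factor 1 + O(m^{-1}) appearing in Lemmas 2 and 3 is driven by Σ_{k=1}^{m} c_k², where c_k = Σ_t d_t d_{t+k}. For the optimal difference sequences of Hall, Kay and Titterington (1990), c_k = -1/(2m) for every k, so Σ_k c_k² = 1/(4m). A quick numerical check for m = 2, using rounded values of the optimal sequence (the specific values are an assumption of this sketch):

```python
import numpy as np

d = np.array([0.8090, -0.5, -0.3090])   # rounded optimal difference sequence, m = 2
m = len(d) - 1
assert abs(d.sum()) < 1e-3              # sum d_t = 0
assert abs((d ** 2).sum() - 1) < 1e-3   # sum d_t^2 = 1

c = np.array([d[:len(d) - k] @ d[k:] for k in range(1, m + 1)])
print(c)                # each c_k is about -1/(2m) = -0.25
print((c ** 2).sum())   # about 1/(4m) = 0.125, the O(1/m) term in Lemma 2
```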

Proof of Lemma 1

Lemma 1 is the main technical tool for the proof of Theorem 3. We need the following lemma to prove Lemma 1.

Lemma 4. Under the assumptions of Theorem 3, Var((1/n) Σ_{i=1}^{n-1} X_{(i)} X′_{(i+1)}) → 0 as n → ∞. Here the variance of a matrix means the variance of each entry, and the limit is also entry-wise.

Proof. First we have

Var((1/n) Σ_i X_{(i)} X′_{(i+1)}) = (1/n²) Σ_i Var(X_{(i)} X′_{(i+1)}) + (2/n²) Σ_i Cov(X_{(i)} X′_{(i+1)}, X_{(i+1)} X′_{(i+2)}) + (2/n²) Σ_{i+1<j} Cov(X_{(i)} X′_{(i+1)}, X_{(j)} X′_{(j+1)}).

Here the covariance of two matrices means the covariance of the corresponding entries. Let η_i = h(U_{(i+1)}) - h(U_{(i)}) and H_i = h(U_{(i)}) h(U_{(i)})′ for i = 1, 2, ..., n. Note that when γ > 0, η_i = O(n^{-γ}). Then for i + 1 < j,

Cov(X_{(i)} X′_{(i+1)}, X_{(j)} X′_{(j+1)}) = E(E(X_{(i)} X′_{(i+1)} X_{(j)} X′_{(j+1)} ∣ U)) - E(E(X_{(i)} X′_{(i+1)} ∣ U)) E(E(X_{(j)} X′_{(j+1)} ∣ U))
= E(h(U_{(i)}) h(U_{(i+1)})′ h(U_{(j)}) h(U_{(j+1)})′) - E(h(U_{(i)}) h(U_{(i+1)})′) E(h(U_{(j)}) h(U_{(j+1)})′)
= E(h(U_{(i)}) (h(U_{(i)}) + η_i)′ h(U_{(j)}) (h(U_{(j)}) + η_j)′) - E(h(U_{(i)}) (h(U_{(i)}) + η_i)′) E(h(U_{(j)}) (h(U_{(j)}) + η_j)′)
= E(H_i H_j) - E(H_i) E(H_j) + O(n^{-γ})
= Cov(H_i, H_j) + O(n^{-γ}),

where all products and covariances are taken entry-wise. Hence,

(2/n²) Σ_{i+1<j} Cov(X_{(i)} X′_{(i+1)}, X_{(j)} X′_{(j+1)}) = (2/n²) Σ_{i+1<j} (Cov(H_i, H_j) + O(n^{-γ}))
= Var((1/n) Σ_{i=1}^{n} H_i) - (2/n²) Σ_i Cov(H_i, H_{i+1}) - (1/n²) Σ_i Var(H_i) + O(n^{-γ})
= Var((1/n) Σ_{i=1}^{n} H_i) + O(n^{-γ}).

Also, it is easy to see that

(1/n²) Σ_i Var(X_{(i)} X′_{(i+1)}) = O(1/n)

and

(2/n²) Σ_i Cov(X_{(i)} X′_{(i+1)}, X_{(i+1)} X′_{(i+2)}) = O(1/n).

Moreover, since the order of summation does not matter, Var((1/n) Σ_i H_i) = Var((1/n) Σ_i h(U_i) h(U_i)′) = O(1/n). Putting these together, the lemma is proved.

Remark 8. By the same calculation, we can in fact prove that for any fixed integer k > 0, Var((1/n) Σ_{i=1}^{n-k} X_{(i)} X′_{(i+k)}) goes to zero as n goes to infinity.

We now prove Lemma 1, stated in Section 3.

Proof of Lemma 1. It follows from Lemma 4 and the fact that m = o(n) that

lim_n Var(Z′Z/n) = 0, lim_n Var(Z′ΨZ/n) = 0, lim_n Var(Z′δδ′Z/n) = 0.

So we only need to check the limits of the expectations. First note that

E(Z_i Z_i′) = E(E((Σ_{t=0}^{m} d_t X_{(i+t)})(Σ_{t=0}^{m} d_t X_{(i+t)})′ ∣ U))
= Σ_{t=0}^{m} d_t² E[Var(X_{(i+t)} ∣ U)] + E{[Σ_{t=0}^{m} d_t h(U_{(i+t)})][Σ_{t=0}^{m} d_t h(U_{(i+t)})]′}.

And by the same calculation,

E(Z_i Z′_{i+j}) = Σ_t d_{t+j} d_t E[Var(X_{(i+j+t)} ∣ U)] + E{[Σ_t d_t h(U_{(i+t)})][Σ_t d_t h(U_{(i+j+t)})]′}.

This implies

lim_n (1/n) E(Z′Z) = lim_n (1/n) Σ_{i=1}^{n-m} E(Z_i Z_i′)
= lim_n (1/n) { Σ_{i=1}^{n-m} E[Var(X_i ∣ U)] - Σ_{k=1}^{m} c_k Σ_{i=1}^{n-k} E[(h(U_{(i)}) - h(U_{(i+k)})) (h(U_{(i)}) - h(U_{(i+k)}))′] }
= lim_n (1/n) Σ_{i=1}^{n-m} E[Var(X_i ∣ U)] = Σ_{X∣U}.

Here we use the fact that, since h is smooth,

lim_n (1/n) Σ_{k=1}^{m} c_k Σ_{i=1}^{n-k} E[(h(U_{(i)}) - h(U_{(i+k)})) (h(U_{(i)}) - h(U_{(i+k)}))′] = 0.

Similarly,

lim_n (1/n) E(Z′ΨZ) = lim_n (1/n) { Σ_{i=1}^{n-m} E(Z_i Z_i′) + Σ_{k=1}^{m} c_k Σ_{i=1}^{n-m-k} E(Z_i Z′_{i+k} + Z_{i+k} Z_i′) } = (1 + 2 Σ_{k=1}^{m} c_k²) Σ_{X∣U}.

Finally, for the third equation, let dh_i = Σ_{t=0}^{m} d_t h(U_{(i+t)}). Then

lim_n (1/n) E(Z′δδ′Z) = lim_n (1/n) E[(Σ_{i=1}^{n-m} Z_i δ_i)(Σ_{i=1}^{n-m} Z_i δ_i)′]
= lim_n (1/n) E{ E(Σ_i δ_i² Z_i Z_i′ ∣ U) + E(Σ_{i≠j} δ_i δ_j Z_i Z_j′ ∣ U) }
= lim_n (1/n) Σ_{i=1}^{n} E[ Var(X_{(i)} ∣ U) (Σ_t d_t² δ_{i-t}² + Σ_{k=1}^{m} Σ_t d_t d_{t+k} δ_{i-t} δ_{i+k-t}) ] + lim_n (1/n) [Σ_i δ_i E(dh_i)] [Σ_i δ_i E(dh_i)]′.

Since Σ_t d_t² δ_{i-t}² + Σ_{k=1}^{m} Σ_t d_t d_{t+k} δ_{i-t} δ_{i+k-t} is of order n^{-2α} for any i, the first part of the above expression is of order n^{-2α}. And δ_i E(dh_i) is of order n^{-α-γ} for any i, so the second part of the above expression is of order n^{1-2α-2γ}. This implies

(1/n) E(Z′δδ′Z) = O(n^{-2α}) + O(n^{1-2α-2γ}).
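The first limit in Lemma 1, Z′Z/n → Σ_{X∣U}, says that differencing the data sorted by U removes the conditional mean h(U) of X and leaves only the conditional variance of X given U. A small simulation illustrates this in one dimension; the choices h(u) = cos(πu) and Σ_{X∣U} = 0.49 are our own assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
U = np.sort(rng.uniform(0, 1, n))
X = np.cos(np.pi * U) + rng.normal(scale=0.7, size=n)   # h(U) plus noise, Var(X|U) = 0.49

d = np.array([0.8090, -0.5, -0.3090])                   # m = 2 difference sequence
m = len(d) - 1
Z = sum(d[t] * X[t:n - m + t] for t in range(m + 1))    # differencing cancels h(U)

szz = Z @ Z / n
print(szz)   # close to 0.49 = Sigma_{X|U}
```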

6.3 Proof of Theorem 3

With Lemmas 1 and 2, the proof of Theorem 3 is now straightforward. Note that √n (β̂ - β) = √n (Z′Z)^{-1} Z′w + √n (Z′Z)^{-1} Z′δ. It follows from the same argument as in the proof of Lemma 2 that √n (Z′Z)^{-1} Z′w is asymptotically normal with mean 0 and variance σ² n (Z′Z)^{-1} Z′ΨZ (Z′Z)^{-1}. The first two equalities of Lemma 1 yield

lim_n σ² n (Z′Z)^{-1} Z′ΨZ (Z′Z)^{-1} = (1 + 2 Σ_{k=1}^{m} c_k²) σ² Σ_{X∣U}^{-1},

and the third equality of Lemma 1 shows that, when α + γ > 1/2, √n (Z′Z)^{-1} Z′δ is small and negligible. Theorem 3 now follows.

6.4 Proof of Theorem 5

From the discussion in Section 4, we know that β̂′C′(C(Z′Z)^{-1}C′)^{-1}Cβ̂/σ² asymptotically follows a χ² distribution with r degrees of freedom. We only need to show that (n - m - p) σ̂²/σ² asymptotically follows a χ² distribution with n - m - p degrees of freedom and is independent of β̂′C′(C(Z′Z)^{-1}C′)^{-1}Cβ̂. The independence is easy to see, because the argument is the same as in the ordinary linear regression case. Since we know that 2w′Hδ + δ′Hδ goes to 0 in probability, we just need to focus on w′Hw. Note that d_t² = O(m^{-1}) for t ≥ 1, and let L be the (n - m) × n matrix given by

L_{i,j} = d_{j-i} for 0 ≤ j - i ≤ m, and L_{i,j} = 0 otherwise. (5)

Moreover, let J be another (n - m) × n matrix given by J_{i,i} = 1 for i = 1, 2, ..., n - m and 0 otherwise. Then w = Lε = Jε + (L - J)ε = w_1 + w_2, where w_1 = Jε and w_2 = (L - J)ε. We can see that w_1 ~ N(0, σ² I_{n-m}) and w_2 ~ N(0, σ² (L - J)(L - J)′). We can also see that the variance of each coordinate of w_2 is of order m^{-1}. Now w′Hw = w_1′Hw_1 + 2w_1′Hw_2 + w_2′Hw_2. It is easy to check that H is a projection of rank n - m - p, i.e., H² = H and rank(H) = n - m - p. Hence w_1′Hw_1 = (Hw_1)′(Hw_1)

follows, after division by σ², a χ² distribution with n - m - p degrees of freedom. Since m = o(n) and the variance of each coordinate of w_2 is of order m^{-1}, we know that both w_1′Hw_2/(n - m - p) and w_2′Hw_2/(n - m - p) converge to 0 in probability. This means that

β̂′C′(C(Z′Z)^{-1}C′)^{-1}Cβ̂ / (r σ̂²)

asymptotically follows the F(r, n - m - p) distribution. By the same argument as in the proof of Theorem 1, we can prove that the asymptotic power of this test (at local alternatives) is the same as that of the usual F test when f is not present in the model (1).

References

[1] Amato, U., Antoniadis, A., and De Feis, I. (2002). Fourier Series Approximation of Separable Models. J. Computational and Applied Mathematics 146.

[2] Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. The Johns Hopkins University Press, Baltimore.

[3] Buja, A., Hastie, T. J., and Tibshirani, R. J. (1989). Linear Smoothers and Additive Models. Ann. Statist. 17.

[4] Carroll, R. J., Fan, J., Gijbels, I., and Wand, M. P. (1997). Generalized partially linear single-index models. J. Amer. Statist. Assoc. 92.

[5] Chatterjee, S. and Price, B. (1977). Regression Analysis by Example. New York: Wiley.

[6] Chen, H. and Shiau, J. H. (1991). A two-stage spline smoothing method for partially linear models. J. Statist. Plann. Inference 27.

[7] Cuzick, J. (1992). Semiparametric additive regression. J. Roy. Statist. Soc. Ser. B 54.

[8] Dawber, T. R., Meadors, G. F., and Moore, F. E. J. (1951). Epidemiological approaches to heart disease: the Framingham Study. Amer. J. Public Health 41.

[9] Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation via wavelet shrinkage. Biometrika 81.

[10] Engle, R. F., Granger, C. W. J., Rice, J. and Weiss, A. (1986). Nonparametric estimates of the relation between weather and electricity sales. J. Amer. Statist. Assoc. 81.

[11] Fan, J., Härdle, W., and Mammen, E. (1998). Direct estimation of additive and linear components for high-dimensional data. Ann. Statist. 26.

[12] Fan, J. and Huang, L. (2001). Goodness-of-Fit Tests for Parametric Regression Models. J. Amer. Statist. Assoc. 96.

[13] Hall, P., Kay, J. and Titterington, D. M. (1990). Asymptotically optimal difference-based estimation of variance in nonparametric regression. Biometrika 77.

[14] Härdle, W. (1991). Smoothing Techniques. Berlin: Springer-Verlag.

[15] Horowitz, J. and Spokoiny, V. (2001). An adaptive, rate-optimal test of a parametric mean-regression model against a nonparametric alternative. Econometrica 69.

[16] Lam, C. and Fan, J. (2007). Profile-Kernel Likelihood Inference With Diverging Number of Parameters. Ann. Statist., to appear.

[17] Liang, H., Härdle, W. and Carroll, R. J. (1999). Estimation in a semiparametric partially linear errors-in-variables model. Ann. Statist. 27.

[18] Linton, O. B. (1997). Efficient Estimation of Additive Nonparametric Regression Models. Biometrika 84.

[19] Linton, O. B. and Nielsen, J. P. (1995). A Kernel Method of Estimating Structured Nonparametric Regression Based on Marginal Integration. Biometrika 82.

[20] Mammen, E., Linton, O., and Nielsen, J. (1999). The Existence and Asymptotic Properties of a Backfitting Projection Algorithm Under Weak Conditions. Ann. Statist. 27.

[21] Munk, A., Bissantz, N., Wagner, T. and Freitag, G. (2005). On difference based variance estimation in nonparametric regression when the covariate is high dimensional. J. Roy. Statist. Soc. B 67.

[22] Opsomer, J.-D. and Ruppert, D. (1998). A Fully Automated Bandwidth Selection Method for Fitting Additive Models. J. Amer. Statist. Assoc. 93.

[23] Ritov, Y. and Bickel, P. J. (1990). Achieving information bounds in semi and non parametric models. Ann. Statist. 18.

[24] Scott, D. W. (1992). Multivariate Density Estimation. New York: Wiley.

[25] Severini, T. A. and Wong, W. H. (1992). Generalized profile likelihood and conditionally parametric models. Ann. Statist. 20.

[26] Tjøstheim, D. and Auestad, B. (1994). Nonparametric Identification of Nonlinear Time Series: Projections. J. Amer. Statist. Assoc. 89.

[27] Wahba, G. (1984). Cross validated spline methods for the estimation of multivariate functions from data on functionals. In Statistics: An Appraisal, Proceedings 50th Anniversary Conference Iowa State Statistical Laboratory (H. A. David, ed.). Iowa State Univ. Press.

[28] Wang, L., Brown, L. D., Cai, T. T. and Levine, M. (2006). Effect of mean on variance function estimation in nonparametric regression. Ann. Statist., to appear.

[29] Yatchew, A. (1997). An elementary estimator of the partial linear model. Economics Letters 57.


More information

A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS

A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS Statistica Sinica 20 2010, 365-378 A PRACTICAL WAY FOR ESTIMATING TAIL DEPENDENCE FUNCTIONS Liang Peng Georgia Institute of Technology Abstract: Estimating tail dependence functions is important for applications

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

On robust and efficient estimation of the center of. Symmetry.

On robust and efficient estimation of the center of. Symmetry. On robust and efficient estimation of the center of symmetry Howard D. Bondell Department of Statistics, North Carolina State University Raleigh, NC 27695-8203, U.S.A (email: bondell@stat.ncsu.edu) Abstract

More information

Semi-Nonparametric Inferences for Massive Data

Semi-Nonparametric Inferences for Massive Data Semi-Nonparametric Inferences for Massive Data Guang Cheng 1 Department of Statistics Purdue University Statistics Seminar at NCSU October, 2015 1 Acknowledge NSF, Simons Foundation and ONR. A Joint Work

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1. By Thomas H. Scheike University of Copenhagen

A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1. By Thomas H. Scheike University of Copenhagen The Annals of Statistics 21, Vol. 29, No. 5, 1344 136 A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1 By Thomas H. Scheike University of Copenhagen We present a non-parametric survival model

More information

Nonparametric Estimation of Distributions in a Large-p, Small-n Setting

Nonparametric Estimation of Distributions in a Large-p, Small-n Setting Nonparametric Estimation of Distributions in a Large-p, Small-n Setting Jeffrey D. Hart Department of Statistics, Texas A&M University Current and Future Trends in Nonparametrics Columbia, South Carolina

More information

New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data

New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data ew Local Estimation Procedure for onparametric Regression Function of Longitudinal Data Weixin Yao and Runze Li The Pennsylvania State University Technical Report Series #0-03 College of Health and Human

More information

An Introduction to Nonstationary Time Series Analysis

An Introduction to Nonstationary Time Series Analysis An Introduction to Analysis Ting Zhang 1 tingz@bu.edu Department of Mathematics and Statistics Boston University August 15, 2016 Boston University/Keio University Workshop 2016 A Presentation Friendly

More information

Estimation and Testing for Varying Coefficients in Additive Models with Marginal Integration

Estimation and Testing for Varying Coefficients in Additive Models with Marginal Integration Estimation and Testing for Varying Coefficients in Additive Models with Marginal Integration Lijian Yang Byeong U. Park Lan Xue Wolfgang Härdle September 6, 2005 Abstract We propose marginal integration

More information

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY

PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY REVSTAT Statistical Journal Volume 7, Number 1, April 2009, 37 54 PREWHITENING-BASED ESTIMATION IN PARTIAL LINEAR REGRESSION MODELS: A COMPARATIVE STUDY Authors: Germán Aneiros-Pérez Departamento de Matemáticas,

More information

Selecting Local Models in Multiple Regression. by Maximizing Power

Selecting Local Models in Multiple Regression. by Maximizing Power Selecting Local Models in Multiple Regression by Maximizing Power Chad M. Schafer and Kjell A. Doksum October 26, 2006 Abstract This paper considers multiple regression procedures for analyzing the relationship

More information

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University The Bootstrap: Theory and Applications Biing-Shen Kuo National Chengchi University Motivation: Poor Asymptotic Approximation Most of statistical inference relies on asymptotic theory. Motivation: Poor

More information

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix

Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Nonparametric Identification of a Binary Random Factor in Cross Section Data - Supplemental Appendix Yingying Dong and Arthur Lewbel California State University Fullerton and Boston College July 2010 Abstract

More information

Specification Test for Instrumental Variables Regression with Many Instruments

Specification Test for Instrumental Variables Regression with Many Instruments Specification Test for Instrumental Variables Regression with Many Instruments Yoonseok Lee and Ryo Okui April 009 Preliminary; comments are welcome Abstract This paper considers specification testing

More information

Empirical Likelihood Tests for High-dimensional Data

Empirical Likelihood Tests for High-dimensional Data Empirical Likelihood Tests for High-dimensional Data Department of Statistics and Actuarial Science University of Waterloo, Canada ICSA - Canada Chapter 2013 Symposium Toronto, August 2-3, 2013 Based on

More information

High-dimensional Ordinary Least-squares Projection for Screening Variables

High-dimensional Ordinary Least-squares Projection for Screening Variables 1 / 38 High-dimensional Ordinary Least-squares Projection for Screening Variables Chenlei Leng Joint with Xiangyu Wang (Duke) Conference on Nonparametric Statistics for Big Data and Celebration to Honor

More information

Concentration behavior of the penalized least squares estimator

Concentration behavior of the penalized least squares estimator Concentration behavior of the penalized least squares estimator Penalized least squares behavior arxiv:1511.08698v2 [math.st] 19 Oct 2016 Alan Muro and Sara van de Geer {muro,geer}@stat.math.ethz.ch Seminar

More information

arxiv: v1 [stat.me] 17 Jan 2008

arxiv: v1 [stat.me] 17 Jan 2008 Some thoughts on the asymptotics of the deconvolution kernel density estimator arxiv:0801.2600v1 [stat.me] 17 Jan 08 Bert van Es Korteweg-de Vries Instituut voor Wiskunde Universiteit van Amsterdam Plantage

More information

Peter Hoff Minimax estimation November 12, Motivation and definition. 2 Least favorable prior 3. 3 Least favorable prior sequence 11

Peter Hoff Minimax estimation November 12, Motivation and definition. 2 Least favorable prior 3. 3 Least favorable prior sequence 11 Contents 1 Motivation and definition 1 2 Least favorable prior 3 3 Least favorable prior sequence 11 4 Nonparametric problems 15 5 Minimax and admissibility 18 6 Superefficiency and sparsity 19 Most of

More information

On Estimation of Partially Linear Transformation. Models

On Estimation of Partially Linear Transformation. Models On Estimation of Partially Linear Transformation Models Wenbin Lu and Hao Helen Zhang Authors Footnote: Wenbin Lu is Associate Professor (E-mail: wlu4@stat.ncsu.edu) and Hao Helen Zhang is Associate Professor

More information

Understanding Regressions with Observations Collected at High Frequency over Long Span

Understanding Regressions with Observations Collected at High Frequency over Long Span Understanding Regressions with Observations Collected at High Frequency over Long Span Yoosoon Chang Department of Economics, Indiana University Joon Y. Park Department of Economics, Indiana University

More information

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we

More information

7 Semiparametric Estimation of Additive Models

7 Semiparametric Estimation of Additive Models 7 Semiparametric Estimation of Additive Models Additive models are very useful for approximating the high-dimensional regression mean functions. They and their extensions have become one of the most widely

More information

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives TR-No. 14-06, Hiroshima Statistical Research Group, 1 11 Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives Mariko Yamamura 1, Keisuke Fukui

More information

SIMULTANEOUS CONFIDENCE INTERVALS FOR SEMIPARAMETRIC LOGISTICS REGRESSION AND CONFIDENCE REGIONS FOR THE MULTI-DIMENSIONAL EFFECTIVE DOSE

SIMULTANEOUS CONFIDENCE INTERVALS FOR SEMIPARAMETRIC LOGISTICS REGRESSION AND CONFIDENCE REGIONS FOR THE MULTI-DIMENSIONAL EFFECTIVE DOSE Statistica Sinica 20 (2010), 637-659 SIMULTANEOUS CONFIDENCE INTERVALS FOR SEMIPARAMETRIC LOGISTICS REGRESSION AND CONFIDENCE REGIONS FOR THE MULTI-DIMENSIONAL EFFECTIVE DOSE Jialiang Li 1, Chunming Zhang

More information

Exact Likelihood Ratio Tests for Penalized Splines

Exact Likelihood Ratio Tests for Penalized Splines Exact Likelihood Ratio Tests for Penalized Splines By CIPRIAN CRAINICEANU, DAVID RUPPERT, GERDA CLAESKENS, M.P. WAND Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe Street, Baltimore,

More information

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak.

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak. Large Sample Theory Large Sample Theory is a name given to the search for approximations to the behaviour of statistical procedures which are derived by computing limits as the sample size, n, tends to

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

A simple bootstrap method for constructing nonparametric confidence bands for functions

A simple bootstrap method for constructing nonparametric confidence bands for functions A simple bootstrap method for constructing nonparametric confidence bands for functions Peter Hall Joel Horowitz The Institute for Fiscal Studies Department of Economics, UCL cemmap working paper CWP4/2

More information

Nonconcave Penalized Likelihood with A Diverging Number of Parameters

Nonconcave Penalized Likelihood with A Diverging Number of Parameters Nonconcave Penalized Likelihood with A Diverging Number of Parameters Jianqing Fan and Heng Peng Presenter: Jiale Xu March 12, 2010 Jianqing Fan and Heng Peng Presenter: JialeNonconcave Xu () Penalized

More information

Some properties of Likelihood Ratio Tests in Linear Mixed Models

Some properties of Likelihood Ratio Tests in Linear Mixed Models Some properties of Likelihood Ratio Tests in Linear Mixed Models Ciprian M. Crainiceanu David Ruppert Timothy J. Vogelsang September 19, 2003 Abstract We calculate the finite sample probability mass-at-zero

More information

VARIANCE ESTIMATION AND KRIGING PREDICTION FOR A CLASS OF NON-STATIONARY SPATIAL MODELS

VARIANCE ESTIMATION AND KRIGING PREDICTION FOR A CLASS OF NON-STATIONARY SPATIAL MODELS Statistica Sinica 25 (2015), 135-149 doi:http://dx.doi.org/10.5705/ss.2013.205w VARIANCE ESTIMATION AND KRIGING PREDICTION FOR A CLASS OF NON-STATIONARY SPATIAL MODELS Shu Yang and Zhengyuan Zhu Iowa State

More information

CURRENT STATUS LINEAR REGRESSION. By Piet Groeneboom and Kim Hendrickx Delft University of Technology and Hasselt University

CURRENT STATUS LINEAR REGRESSION. By Piet Groeneboom and Kim Hendrickx Delft University of Technology and Hasselt University CURRENT STATUS LINEAR REGRESSION By Piet Groeneboom and Kim Hendrickx Delft University of Technology and Hasselt University We construct n-consistent and asymptotically normal estimates for the finite

More information

Wavelet Regression Estimation in Longitudinal Data Analysis

Wavelet Regression Estimation in Longitudinal Data Analysis Wavelet Regression Estimation in Longitudinal Data Analysis ALWELL J. OYET and BRAJENDRA SUTRADHAR Department of Mathematics and Statistics, Memorial University of Newfoundland St. John s, NF Canada, A1C

More information

Restricted Likelihood Ratio Tests in Nonparametric Longitudinal Models

Restricted Likelihood Ratio Tests in Nonparametric Longitudinal Models Restricted Likelihood Ratio Tests in Nonparametric Longitudinal Models Short title: Restricted LR Tests in Longitudinal Models Ciprian M. Crainiceanu David Ruppert May 5, 2004 Abstract We assume that repeated

More information

Spatial Process Estimates as Smoothers: A Review

Spatial Process Estimates as Smoothers: A Review Spatial Process Estimates as Smoothers: A Review Soutir Bandyopadhyay 1 Basic Model The observational model considered here has the form Y i = f(x i ) + ɛ i, for 1 i n. (1.1) where Y i is the observed

More information

Department of Economics, Vanderbilt University While it is known that pseudo-out-of-sample methods are not optimal for

Department of Economics, Vanderbilt University While it is known that pseudo-out-of-sample methods are not optimal for Comment Atsushi Inoue Department of Economics, Vanderbilt University (atsushi.inoue@vanderbilt.edu) While it is known that pseudo-out-of-sample methods are not optimal for comparing models, they are nevertheless

More information

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models

A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models A Bootstrap Lasso + Partial Ridge Method to Construct Confidence Intervals for Parameters in High-dimensional Sparse Linear Models Jingyi Jessica Li Department of Statistics University of California, Los

More information

Local Polynomial Wavelet Regression with Missing at Random

Local Polynomial Wavelet Regression with Missing at Random Applied Mathematical Sciences, Vol. 6, 2012, no. 57, 2805-2819 Local Polynomial Wavelet Regression with Missing at Random Alsaidi M. Altaher School of Mathematical Sciences Universiti Sains Malaysia 11800

More information

Semiparametric Approximation Methods in Multivariate Model Selection

Semiparametric Approximation Methods in Multivariate Model Selection journal of complexity 17, 754 772 (2001) doi:10.1006/jcom.2001.0591, available online at http://www.idealibrary.com on Semiparametric Approximation Methods in Multivariate Model Selection Jiti Gao 1 Department

More information

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017

Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION. September 2017 Supplemental Material for KERNEL-BASED INFERENCE IN TIME-VARYING COEFFICIENT COINTEGRATING REGRESSION By Degui Li, Peter C. B. Phillips, and Jiti Gao September 017 COWLES FOUNDATION DISCUSSION PAPER NO.

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information