New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data
Weixin Yao and Runze Li, The Pennsylvania State University. Technical Report Series #0-03, College of Health and Human Development, The Pennsylvania State University.
New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data

Weixin Yao, Department of Statistics, Kansas State University, Manhattan, Kansas 66506, U.S.A.
Runze Li, Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania, U.S.A.

Abstract

This paper develops a new estimation procedure for the nonparametric regression function of clustered or longitudinal data. We propose to use the Cholesky decomposition and profile least squares techniques to estimate the correlation structure and the regression function simultaneously. We further prove that the proposed estimator is asymptotically as efficient as if the covariance matrix were known. A Monte Carlo simulation study is conducted to examine the finite-sample performance of the proposed procedure and to compare it with existing ones. Based on our empirical studies, the newly proposed procedure works better than naive local linear regression with a working independence error structure, and the efficiency gain can be achieved in moderate-sized samples. A real data application is also provided to illustrate the proposed estimation procedure.

Key Words: Cholesky decomposition; Local polynomial regression; Longitudinal data; Profile least squares.

Li's research is supported by National Institute on Drug Abuse grants R2 DA and P50 DA0075 and a National Science Foundation grant DMS. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIDA or the NIH.
1 Introduction

For clustered or longitudinal data, the data collected from the same subject at different times are correlated, while observations from different subjects are typically independent. Therefore, it is of great interest to estimate the regression function incorporating the within-subject correlation in order to improve estimation efficiency. This issue has been well studied for parametric regression models in the literature; see, for example, the generalized method of moments (Hansen, 1982), the generalized estimating equation (Liang and Zeger, 1986), and the quadratic inference function (Qu, Lindsay, and Li, 2000). Parametric regression generally has a simple, intuitive interpretation and provides a parsimonious description of the relationship between the response variable and its covariates. However, such strong model assumptions may introduce modeling biases and lead to erroneous conclusions under model misspecification. In this article, we focus on the nonparametric regression model for longitudinal data. Suppose that {(x_{ij}, y_{ij}), i = 1,..., n, j = 1,..., J_i} is a random sample from the following nonparametric regression model:

y_{ij} = m(x_{ij}) + ε_{ij},    (1)

where m(·) is a nonparametric smooth function and ε_{ij} is a random error. Here (x_{ij}, y_{ij}) is the jth observation of the ith subject or cluster; thus (x_{ij}, y_{ij}), j = 1,..., J_i, are correlated. There has been substantial research interest in developing nonparametric estimation procedures for m(·) in the clustered/longitudinal data setting. Lin and Carroll (2000) proposed the kernel GEE, an extension of the parametric GEE, for model (1), and showed that, somewhat surprisingly, the kernel GEE works best when it does not incorporate the within-subject correlation. Wang (2003) proposed the marginal kernel method for longitudinal data and proved its efficiency when the true correlation structure is incorporated.
She also demonstrated that the marginal kernel method using the true correlation structure yields a more efficient estimate than Lin and Carroll's (2000) kernel GEE. In this article, we propose a new procedure, based on the Cholesky decomposition and profile least squares techniques, to estimate the correlation structure and the regression function simultaneously.
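The Cholesky step just mentioned can be sketched numerically: given an error covariance matrix Σ, one finds a unit lower-triangular matrix Φ such that ΦΣΦ^T = D is diagonal. Below is a minimal Python sketch; the function name `cholesky_decorrelate` and the example covariance are our own illustrative choices, not the authors' code.

```python
import numpy as np

def cholesky_decorrelate(Sigma):
    """Return (Phi, D): Phi is unit lower-triangular and D is diagonal
    with Phi @ Sigma @ Phi.T = D (modified Cholesky decomposition)."""
    L = np.linalg.cholesky(Sigma)   # Sigma = L @ L.T, L lower-triangular
    d = np.diag(L)                  # d_j: conditional standard deviations
    T = L / d                       # unit lower-triangular: Sigma = T @ D @ T.T
    Phi = np.linalg.inv(T)          # decorrelating transform
    D = np.diag(d ** 2)
    return Phi, D

# Illustrative exchangeable covariance with correlation 0.6
J = 4
Sigma = 0.6 * np.ones((J, J)) + 0.4 * np.eye(J)
Phi, D = cholesky_decorrelate(Sigma)
# Phi applied to an error vector with covariance Sigma yields
# uncorrelated components with variances diag(D)
```

Applying Φ to each subject's error vector ε_i yields the uncorrelated e_i = Φε_i used in the transformed model of Section 2.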
We derive the asymptotic bias and variance, and further establish the asymptotic normality of the resulting estimator. We also conduct some theoretical comparisons. We show that the newly proposed procedure is more efficient than Lin and Carroll's (2000) kernel GEE, and further prove that the proposed estimator is asymptotically as efficient as if the true covariance matrix were known a priori. Compared with the marginal kernel method of Wang (2003), the newly proposed procedure does not require specifying a working correlation structure. This is appealing in practice because the true correlation structure is typically unknown. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedure and to compare it with existing ones. Results from our empirical studies suggest that the newly proposed procedure performs better than naive local linear regression, and that the efficiency gain can be achieved in moderate-sized samples. We illustrate the proposed estimation method with an analysis of a real data set.

The remainder of this paper is organized as follows. In Section 2, we introduce the new estimation procedure based on profile least squares and the Cholesky decomposition, and then provide the asymptotic results for the proposed estimator. In Section 3, we present numerical comparisons and the analysis of a real data example. The proofs and regularity conditions are given in the Appendix.

2 New estimation procedures

For ease of presentation, let us start with balanced longitudinal data. The proposed procedure can easily be adapted to nearly balanced longitudinal data by binning the data. We will discuss how to use the Cholesky decomposition to incorporate the within-subject correlation into the local estimation procedure for unbalanced longitudinal data in Section 2.2. Suppose {(x_{ij}, y_{ij}), i = 1,..., n, j = 1,..., J} is a random sample from model (1).
In this paper we mainly consider univariate x_{ij}. The newly proposed procedures are applicable to multivariate x_{ij}, but are practically less useful due to the curse of dimensionality. Let ε_i = (ε_{i1},..., ε_{iJ})^T and x_i = (x_{i1},..., x_{iJ})^T, and suppose cov(ε_i | x_i) = Σ. Based on the Cholesky
decomposition, there exists a lower triangular matrix Φ with ones on its diagonal such that cov(Φε_i) = ΦΣΦ^T = D, where D is a diagonal matrix. In other words, we have

ε_{i1} = e_{i1},
ε_{ij} = φ_{j,1}ε_{i,1} + ··· + φ_{j,j−1}ε_{i,j−1} + e_{ij}, i = 1,..., n, j = 2,..., J,

where e_i = (e_{i1},..., e_{iJ})^T = Φε_i and φ_{j,l} is the negative of the (j, l) element of Φ. Let D = diag(d_1^2,..., d_J^2). Since D is a diagonal matrix, the e_{ij}'s are uncorrelated and var(e_{ij}) = d_j^2, j = 1,..., J. If {ε_1,..., ε_n} were available, then we could work with the following partially linear model with uncorrelated errors e_{ij}:

y_{i1} = m(x_{i1}) + e_{i1},
y_{ij} = m(x_{ij}) + φ_{j,1}ε_{i,1} + ··· + φ_{j,j−1}ε_{i,j−1} + e_{ij}, i = 1,..., n, j = 2,..., J. (2)

In practice, however, ε_{ij} is not available, but it may be estimated by ε̂_{ij} = y_{ij} − m̂_I(x_{ij}), where m̂_I(·) is the local linear estimate of m(·) based on model (1) pretending that the random errors ε_{ij} are independent. As shown in Lin and Carroll (2000), m̂_I(x) under the working independence structure is a consistent estimate of m(x). Replacing the ε_{ij}'s in (2) with the ε̂_{ij}'s, we have

y_{ij} = m(x_{ij}) + φ_{j,1}ε̂_{i,1} + ··· + φ_{j,j−1}ε̂_{i,j−1} + e_{ij}, i = 1,..., n, j = 2,..., J. (3)

Let Y = (y_{12},..., y_{1J},..., y_{nJ})^T, X = (x_{12},..., x_{1J},..., x_{nJ})^T, φ = (φ_{2,1},..., φ_{J,J−1})^T, e = (e_{12},..., e_{nJ})^T, and F̂_{ij} = {0_{(j−2)(j−1)/2}^T, ε̂_{i,1},..., ε̂_{i,j−1}, 0_{(J−1)J/2−(j−1)j/2}^T}^T, where 0_k is the k-dimensional column vector with all entries 0. Then we can rewrite model (3) in the following
matrix format:

Y = m(X) + F̂_a φ + e, (4)

where m(X) = {m(x_{12}),..., m(x_{1J}),..., m(x_{nJ})}^T and F̂_a = (F̂_{12},..., F̂_{1J},..., F̂_{nJ})^T. Let Y* = Y − F̂_a φ. Then

Y* = m(X) + e, (5)

and note that the e_{ij}'s in e are uncorrelated. Therefore, if Σ, and thus φ, were known, we could use the Cholesky decomposition to transform the correlated-data model (1) into the uncorrelated-data model (5) with the new response Y*. For the partially linear model (4), various estimation methods have been proposed. In this paper, we employ the profile least squares technique (Fan and Li, 2004) to estimate φ and m(·) in (4).

2.1 Profile least squares estimation

Noting that, given φ, (5) is a one-dimensional nonparametric model, one may employ existing linear smoothers, such as local polynomial regression (Fan and Gijbels, 1996) and smoothing splines (Gu, 2002), to estimate m(x). Here we employ local linear regression. Let A_{x_0} be the matrix with rows (1, x_{ij} − x_0), i = 1,..., n, j = 2,..., J, and let

W_{x_0} = diag{K_h(x_{12} − x_0)/d̂_2^2,..., K_h(x_{1J} − x_0)/d̂_J^2,..., K_h(x_{nJ} − x_0)/d̂_J^2},

where K_h(t) = h^{−1}K(t/h), K(·) is a kernel function, h is the bandwidth, and d̂_j is any consistent estimate of d_j, the standard deviation of e_j. Denote by m̂(x_0) the local linear regression estimate of m(x_0). Then

m̂(x_0) = β̂_0 = [1, 0](A_{x_0}^T W_{x_0} A_{x_0})^{−1} A_{x_0}^T W_{x_0} Y*.

Note that m̂(x_0) is a linear function of Y*. Let S_h(x_0) = [1, 0](A_{x_0}^T W_{x_0} A_{x_0})^{−1} A_{x_0}^T W_{x_0}.
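The local linear step above is a kernel-weighted least squares fit and can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel; the function name, simulated data, and bandwidth are our own choices, and the optional weights play the role of the 1/d̂_j^2 factors in W_{x_0}.

```python
import numpy as np

def local_linear(x, y, x0, h, w=None):
    """Local linear estimate of m(x0): intercept of a kernel-weighted
    least squares line fit at x0 (Gaussian kernel assumed)."""
    t = (x - x0) / h
    k = np.exp(-0.5 * t ** 2) / (h * np.sqrt(2 * np.pi))  # K_h(x - x0)
    if w is not None:
        k = k * w                        # e.g. inverse-variance weights 1/d_j^2
    A = np.column_stack([np.ones_like(x), x - x0])  # rows (1, x_ij - x0)
    AtW = A.T * k                                   # A^T W with W = diag(k)
    beta = np.linalg.solve(AtW @ A, AtW @ y)
    return beta[0]                                  # m_hat(x0)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
y = 2 * np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(500)
print(local_linear(x, y, 0.25, h=0.05))  # should be close to 2*sin(pi/2) = 2
```

The same routine, called with inverse-variance weights, yields the weighted fit applied to the decorrelated responses.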
Then m̂(X) can be represented as m̂(X) = S_h(X)Y*, where S_h(X) is a (J−1)n × (J−1)n smoothing matrix that depends only on X and the bandwidth h. Substituting m(X) in (5) by m̂(X), we obtain the linear regression model

{I − S_h(X)}Y = {I − S_h(X)}F̂_a φ + e,

where I is the identity matrix. Let Ĝ = diag(d̂_2^2,..., d̂_J^2,..., d̂_2^2,..., d̂_J^2). Then the profile least squares estimator of φ is

φ̂_p = [F̂_a^T{I − S_h(X)}^T Ĝ^{−1}{I − S_h(X)}F̂_a]^{−1} F̂_a^T{I − S_h(X)}^T Ĝ^{−1}{I − S_h(X)}Y. (6)

Let Ŷ = Y − F̂_a φ̂_p. Then

Ŷ = m(X) + e, (7)

and the e_{ij}'s are uncorrelated. Note that when we estimate the regression function m(x), we can also include the observations from the first time point. Therefore, for simplicity of notation, when estimating m(x) we assume that Ŷ consists of all the observations, with ŷ_{i1} = y_{i1}; similar changes apply to all other notation when estimating m(x) in (7). Since the e_{ij}'s in (7) are uncorrelated, we can use the conventional local linear regression estimator

(β̂_0, β̂_1) = arg min_{β_0, β_1} (Ŷ − A_{x_0}β)^T W_{x_0}(Ŷ − A_{x_0}β),

where β = (β_0, β_1)^T. The local linear estimate of m(x_0) is then m̂(x_0, φ̂_p) = β̂_0.

Bandwidth selection. To implement the newly proposed estimation procedure, we need to specify
bandwidths. We use local linear regression with the working independence correlation matrix to estimate m̂_I(·); the plug-in bandwidth selector (Ruppert, Sheather, and Wand, 1995) is applied for this estimation. We then calculate ε̂_{ij} = y_{ij} − m̂_I(x_{ij}) and compute the difference-based estimate of φ (Fan and Li, 2004), denoted by φ̂_dbe. Using φ̂_dbe in (5), we select a bandwidth for the proposed profile least squares estimator using the plug-in bandwidth selector.

2.2 Theoretical comparison

The following notation is used in the asymptotic results below. Let F_i = (F_{i2},..., F_{iJ})^T, where F_{ij} = {0_{(j−2)(j−1)/2}^T, ε_{i,1},..., ε_{i,j−1}, 0_{(J−1)J/2−(j−1)j/2}^T}^T, and let μ_j = ∫t^j K(t)dt and ν_0 = ∫K^2(t)dt. Denote by f_j(x) the marginal density of X_j. The asymptotic results for the profile least squares estimators φ̂_p and m̂(x_0, φ̂_p) are given in the following theorem, whose proof can be found in the Appendix.

Theorem 2.1. Suppose that the regularity conditions A1–A6 in the Appendix hold and that cov(ε_i | x_i) = Σ. Then:

(a) the asymptotic distribution of φ̂_p in (6) is given by

√n(φ̂_p − φ) →_d N(0, V^{−1}), where V = Σ_{j=2}^J E(F_j F_j^T)/d_j^2

and var(e_j) = d_j^2;
(b) the asymptotic distribution of m̂(x_0, φ̂_p), conditional on {x_{11},..., x_{nJ}}, is given by

√(Nh)[m̂(x_0, φ̂_p) − m(x_0) − (1/2)μ_2 m''(x_0)h^2] →_d N{0, ν_0/τ(x_0)},

where N = nJ and τ(x_0) = J^{−1}Σ_{j=1}^J f_j(x_0)/d_j^2.

Under the same assumptions as in Theorem 2.1, the asymptotic variance of the local linear estimate with the working independence correlation structure (Lin and Carroll, 2000) is (Nh)^{−1}ν_0{J^{−1}Σ_{j=1}^J f_j(x_0)/σ_j^2}^{−1}, where var(ε_j) = σ_j^2. By the properties of the Cholesky decomposition, σ_1^2 = d_1^2 and σ_j^2 ≥ d_j^2, j = 2,..., J, with equality only when cov(ε | x) = Σ is a diagonal matrix. Note that m̂(x_0, φ̂_p) has the same asymptotic bias as the working independence estimate of m(x_0) (Lin and Carroll, 2000). Therefore, if the within-subject observations are correlated (i.e., the covariance matrix Σ is not diagonal), our proposed estimator m̂(x_0, φ̂_p) is asymptotically more efficient than the local linear estimator with the working independence correlation structure.

We next describe how to use the Cholesky decomposition in (7) for unbalanced longitudinal data and investigate the performance of the proposed procedure when a working covariance matrix is used to calculate Ŷ. We shall show that the resulting local linear estimator is consistent for any positive definite working covariance matrix, and that its asymptotic variance is minimized when the covariance structure is correctly specified.

For unbalanced longitudinal data, let ε_i = (ε_{i1},..., ε_{iJ_i})^T and x_i = (x_{i1},..., x_{iJ_i})^T, where J_i is the number of observations for the ith subject or cluster. Denote cov(ε_i | x_i) = Σ_i, a J_i × J_i matrix that may depend on x_i. Based on the Cholesky decomposition, there exists a lower
triangular matrix Φ_i with ones on its diagonal such that cov(Φ_i ε_i) = Φ_i Σ_i Φ_i^T = D_i, where D_i is a diagonal matrix. Let φ_{j,l}^{(i)} be the negative of the (j, l) element of Φ_i. Similar to (3), we have, for i = 1,..., n and j = 2,..., J_i,

y_{i1} = m(x_{i1}) + e_{i1},
y_{ij} = m(x_{ij}) + φ_{j,1}^{(i)}ε̂_{i,1} + ··· + φ_{j,j−1}^{(i)}ε̂_{i,j−1} + e_{ij},

where e_i = (e_{i1},..., e_{iJ_i})^T = Φ_i ε_i. Since D_i is a diagonal matrix, the e_{ij}'s are uncorrelated. Therefore, if Σ_i were known, one could adapt the newly proposed procedure to unbalanced longitudinal data. Since the true covariance matrix is unknown in practice, following the idea of the generalized estimating equation (GEE; Liang and Zeger, 1986), we replace Σ_i with a working covariance matrix, denoted by Σ̃_i. A parametric working covariance matrix can be constructed as in GEE, and a semiparametric working covariance matrix may also be constructed following Fan, Huang, and Li (2007). Let Φ̃_i be the corresponding lower triangular matrix with ones on its diagonal such that Φ̃_i Σ̃_i Φ̃_i^T = D̃_i, where D̃_i is a diagonal matrix, and let φ̃_{j,l}^{(i)} be the negative of the (j, l) element of Φ̃_i. Let ỹ_{i1} = y_{i1} and

ỹ_{ij} = y_{ij} − φ̃_{j,1}^{(i)}ε̂_{i,1} − ··· − φ̃_{j,j−1}^{(i)}ε̂_{i,j−1}.

Then our proposed new local linear estimate m̃(x_0) = β̃_0 is the minimizer of the following weighted least squares problem:

(β̃_0, β̃_1) = arg min_{β_0, β_1} Σ_{i=1}^n Σ_{j=1}^{J_i} K_h(x_{ij} − x_0) d̃_{ij}^{−2} {ỹ_{ij} − β_0 − β_1(x_{ij} − x_0)}^2, (8)

where d̃_{ij}^2 is the jth diagonal element of D̃_i. The asymptotic behavior of m̃(x_0) is given in Theorem 2.2. Similar to Lin and Carroll (2000) and Wang (2003), we assume that J_i = J < ∞ in order to simplify the presentation of the asymptotic
results. Let φ_i = (φ_{2,1}^{(i)},..., φ_{J,J−1}^{(i)})^T and φ̃_i = (φ̃_{2,1}^{(i)},..., φ̃_{J,J−1}^{(i)})^T.

Theorem 2.2. Suppose that the regularity conditions A1–A6 in the Appendix hold and that cov(ε_i | x_i) = Σ_i. Let m̃(x_0) be the solution of (8) using the working covariance matrix Σ̃_i.

(a) The asymptotic bias of m̃(x_0) is given by

bias{m̃(x_0)} = (1/2)μ_2 m''(x_0)h^2 {1 + o_p(1)},

and the asymptotic variance is given by

var{m̃(x_0)} = (Nh)^{−1} ν_0 γ(x_0) τ̃^{−2}(x_0) {1 + o_p(1)},

where τ̃(x_0) = J^{−1}Σ_{j=1}^J f_j(x_0) E(d̃_j^{−2} | X_j = x_0),

γ(x_0) = J^{−1}Σ_{j=1}^J f_j(x_0) E{(c_j^2 + d_j^2) d̃_j^{−4} | X_j = x_0},

and c_j^2 is the jth diagonal element of cov{F(φ̃ − φ) | X}.

(b) The asymptotic variance of m̃(x_0) is minimized only when Σ̃_i = kΣ_i is correctly specified up to a positive constant k, in which case it simplifies to

var{m̃(x_0)} = (Nh)^{−1} ν_0 {J^{−1}Σ_{j=1}^J f_j(x_0) E(d_j^{−2} | X_j = x_0)}^{−1}.

For balanced longitudinal data, if Σ_i = Σ for all i and Σ does not depend on X, then

var{m̃(x_0)} = (Nh)^{−1} ν_0 {J^{−1}Σ_{j=1}^J f_j(x_0)/d_j^2}^{−1}.

Theorem 2.2 (a) implies that the leading term of the asymptotic bias does not depend on the
working covariance matrix. This is expected, since the bias is caused by the approximation error of local linear regression. Theorem 2.2 (a) also implies that the resulting estimate is consistent for any positive definite working covariance matrix. Theorem 2.2 (b) implies that the asymptotic variance of m̃(x_0) in (8) is minimized when the working correlation matrix equals the true correlation matrix. Comparing Theorem 2.1 (b) with Theorem 2.2 (b), one can see that the proposed profile least squares estimate m̂(x_0, φ̂_p) for balanced longitudinal data is asymptotically as efficient as if one knew the true covariance matrix.

It is of great interest to compare the asymptotic variance in Theorem 2.2 (b) with that of the marginal kernel method proposed in Wang (2003). We attempted such a theoretical comparison, but found it very challenging and beyond the scope of this paper. We conjecture that the asymptotic variance of the marginal kernel method may be smaller than the one in Theorem 2.2 (b); however, the newly proposed procedure is easier to implement in practice. Further research is needed.

3 Simulation results and real data application

In this section, we conduct a Monte Carlo simulation to assess the performance of the proposed profile least squares estimator and illustrate the newly proposed procedure with an empirical analysis of a real data example.

Example 1. In this example, data {(x_{ij}, y_{ij}), i = 1,..., n, j = 1,..., 6} were generated from the model

y_{ij} = 2 sin(2πx_{ij}) + ε_{ij},

where x_{ij} ~ U(0, 1) and ε_{ij} ~ N(0, 1). Let ε_i = (ε_{i1},..., ε_{i6})^T and x_i = (x_{i1},..., x_{i6})^T. We considered the following three cases:

Case I: the ε_{ij}'s are independent.
Case II: cov(ε_{ij}, ε_{ik}) = 0.6 when j ≠ k, and 1 otherwise.
Case III: Σ = cov(ε_i | x_i) has an AR(1) correlation structure with ρ = 0.6.
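The three error structures can be generated by multiplying i.i.d. standard normal vectors by a Cholesky factor of Σ. The sketch below is our own illustration of this data-generating process, not the authors' simulation code.

```python
import numpy as np

def make_sigma(case, J=6, rho=0.6):
    """Error covariance matrix for Cases I-III."""
    if case == "I":                                   # independent errors
        return np.eye(J)
    if case == "II":                                  # exchangeable, corr 0.6
        return rho * np.ones((J, J)) + (1 - rho) * np.eye(J)
    j = np.arange(J)                                  # Case III: AR(1), rho = 0.6
    return rho ** np.abs(j[:, None] - j[None, :])

def simulate(n, case, rng):
    """Draw {(x_ij, y_ij)} from y = 2 sin(2 pi x) + eps, eps ~ N(0, Sigma)."""
    J = 6
    L = np.linalg.cholesky(make_sigma(case, J))
    x = rng.uniform(0.0, 1.0, (n, J))
    eps = rng.standard_normal((n, J)) @ L.T           # rows have covariance Sigma
    return x, 2.0 * np.sin(2.0 * np.pi * x) + eps

rng = np.random.default_rng(1)
x, y = simulate(100, "III", rng)
```

Each row of `eps` then has the within-subject covariance Σ of the chosen case, while rows (subjects) are independent.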
To investigate the effect of errors in estimating the covariance matrix, we compared the proposed profile least squares procedure with the oracle estimator that uses the true covariance matrix; the oracle estimator serves as a benchmark. In this example, we also compared the newly proposed procedure with local linear regression using the working independence correlation structure. We compared the performance of the different methods by the root average squared errors (RASE), defined by

RASE{m̂(·)} = [n_grid^{−1} Σ_{j=1}^{n_grid} {m̂(u_j) − m(u_j)}^2]^{1/2},

where the u_j's are n_grid (= 200) grid points equally spaced from 0 to 1. The simulation results are summarized in Table 1. In the table and in the discussion below, "Independence" stands for local linear regression with working independence, "New" for the newly proposed procedure, and "Oracle" for the oracle estimator. The mean and standard deviation of the RASEs over 500 simulations are reported in the left panel of Table 1; the empirical relative efficiency is reported in the right panel. The empirical relative efficiency is defined, for example, by RE(New) = E{RASE^2(Independence)}/E{RASE^2(New)}, where E{RASE^2(Independence)} and E{RASE^2(New)} are estimated by sample averages over the 500 simulations. Table 1 shows that New and Oracle have smaller RASEs than Independence when the data are correlated (Cases II and III), and that the efficiency gain can be achieved even for moderate sample sizes. For independent data (Case I), New loses little efficiency relative to Independence despite estimating the correlation structure. Furthermore, Table 1 shows that when the sample size is large, New performs as well as Oracle, which uses the true correlation structure. The simulation results confirm the theoretical findings in Section 2.

Example 2.
In this example, we illustrate the proposed methodology with an empirical analysis of a data set collected from the website of the Pennsylvania-New Jersey-Maryland Interconnection (PJM), the largest regional transmission organization (RTO) in the U.S. electricity market. The data set includes the hourly electricity price and electricity load in the Allegheny Power Service district
Table 1: Simulation Results of Example 1 — mean (standard deviation) of RASE over 500 simulations for Independence, New, and Oracle, together with the relative efficiencies RE(New) and RE(Oracle), for Cases I–III at several sample sizes n.

from 8:00 am to 8:00 pm during the period from January 1, 2005 to January 31, 2005. We studied the effect of the electricity load on the electricity price. As an illustration, we treated each day as a subject, and took the electricity price as the response variable and the electricity load as the predictor variable. Thus, the sample size n equals 31, and each subject has J = 13 observations. The scatter plot of the observations is depicted in Figure 1(b). We first used local linear regression with the working independence covariance matrix to estimate the regression function. The plug-in bandwidth selector (Ruppert, Sheather, and Wand, 1995) yielded a bandwidth of 277. The dashed lines in Figure 1 show the resulting estimate along with its 95% pointwise confidence interval. Based on this estimate, we further obtained the residuals and estimated the correlation between ε_{i,j} and ε_{i,j+k} for j = 1,..., 12 and k ≤ 13 − j. The plot of the estimated correlations is depicted in Figure 1(a), which shows that the within-subject correlation is moderate. Thus, our proposed method may produce a more accurate estimate than local linear regression that ignores the within-subject correlation. Next, we apply the newly proposed procedure to this data set. The bandwidth selected by the
Figure 1: (a) Plot of the estimated correlation between ε_{i,j} and ε_{i,j+k} versus the lag k. For example, the dots at k = 1 correspond to the correlations between ε_{i,j} and ε_{i,j+1} for j = 1, 2,..., 12. (b) Scatter plot of the observations and the fitted regression curves. The solid curves are the fitted regression by the proposed method and the corresponding 95% pointwise confidence interval. The dash-dot curve is the local linear fit ignoring the within-subject correlation.
plug-in bandwidth selector equals 243. The solid curves in Figure 1(b) are the fitted regression curve and its 95% pointwise confidence interval from the newly proposed procedure. Figure 1(b) shows that the newly proposed procedure provides a narrower confidence interval than the fit that ignores the within-subject correlation, even though the number of subjects is only 31, although the difference between the two estimated regression curves is very small.

4 Concluding remarks

We have developed a new local estimation procedure for the regression function of longitudinal data. The proposed procedure uses the Cholesky decomposition and profile least squares techniques to estimate the correlation structure and the regression function simultaneously. We have demonstrated that the proposed estimator is asymptotically as efficient as an oracle estimator that uses the true covariance matrix to take the within-subject correlation into account. In this paper, we have focused on nonparametric regression models. The proposed methodology can easily be adapted to other regression models, such as additive models and varying coefficient models. Such extensions are of great interest for future research.

APPENDIX: PROOFS

Define B = F̂_a − F_a. Since G can be estimated at a parametric rate, we assume without loss of generality that G is known in our proofs. Our proofs use a strategy similar to that of Fan and Huang (2005). The following conditions, adopted from Fan and Huang (2005), are imposed to facilitate the proofs; they are not the weakest possible conditions.

A1. The random variable x_{ij} has bounded support Ω. Its density function f_j(·) is Lipschitz continuous and bounded away from 0 on its support. The x_{ij}'s are allowed to be correlated across j.

A2. m(·) has a continuous second derivative on x ∈ Ω.
A3. The kernel K(·) is a bounded symmetric density function with bounded support and satisfies the Lipschitz condition.

A4. nh^8 → 0 and nh^2/(log n)^2 → ∞.

A5. There is an s > 2 such that E‖F_j‖^s < ∞ for all j, and n^{2ξ−1}h → ∞ for some ξ < 1 − s^{−1}.

A6. sup_{x∈Ω} |m̂_I(x) − m(x)| = o_p(n^{−1/4}), where m̂_I(x) is obtained by local linear regression pretending that the data are independent and identically distributed (i.i.d.).

The following lemma is taken from Lemma A.1 of Fan and Huang (2005) and will be used repeatedly in our proofs.

Lemma A.1. Let (X_1, Y_1),..., (X_n, Y_n) be independent and identically distributed random vectors, where the Y_i are scalar random variables. Assume further that, for some s > 2 and an interval [a, b], E|Y|^s < ∞ and sup_{x∈[a,b]} ∫|y|^s g(x, y)dy < ∞, where g denotes the joint density of (X, Y). Let K satisfy Condition A3. Then

sup_{x∈[a,b]} |n^{−1}Σ_{i=1}^n [K_h(X_i − x)Y_i − E{K_h(X_i − x)Y_i}]| = O_p[{log(1/h)/(nh)}^{1/2}],

provided that h → 0 and n^{2ξ−1}h → ∞ for some ξ < 1 − s^{−1}.

Lemma A.2. Let (x_{1j}, y_{1j}),..., (x_{nj}, y_{nj}) be independent and identically distributed random vectors for each j = 1,..., J, where the y_{ij} are scalar random variables. Assume further that, for some s > 2 and an interval [a, b], E|Y_j|^s < ∞ and sup_{x∈[a,b]} ∫|y|^s g_j(x, y)dy < ∞, where g_j(·, ·) denotes the joint density of (X_{ij}, Y_{ij}). Let K satisfy Condition A3. Then

sup_{x∈[a,b]} |N^{−1}Σ_{i=1}^n Σ_{j=1}^J [K_h(x_{ij} − x)y_{ij} − E{K_h(x_{ij} − x)y_{ij}}]| = O_p[{log(1/h)/(nh)}^{1/2}],
provided that h → 0 and n^{2ξ−1}h → ∞ for some ξ < 1 − s^{−1}.

Proof: Based on Lemma A.1, we have

sup_{x∈[a,b]} |N^{−1}Σ_{i=1}^n Σ_{j=1}^J [K_h(x_{ij} − x)y_{ij} − E{K_h(x_{ij} − x)y_{ij}}]|
≤ J^{−1}Σ_{j=1}^J sup_{x∈[a,b]} |n^{−1}Σ_{i=1}^n [K_h(x_{ij} − x)y_{ij} − E{K_h(x_{ij} − x)y_{ij}}]|
= J^{−1}Σ_{j=1}^J O_p[{log(1/h)/(nh)}^{1/2}] = O_p[{log(1/h)/(nh)}^{1/2}].

Let A_x be the N × 2 matrix with rows {1, (x_{ij} − x)/h}, and let e = (e_{11},..., e_{nJ})^T and m = (m(x_{11}),..., m(x_{nJ}))^T. For simplicity of notation, we first present the asymptotic results including all the observations, and then consider the change (omitting {(x_{i1}, y_{i1}), i = 1,..., n}) when applying the results to φ. We first prove some lemmas that will be used in the proof of Theorem 2.1.

Lemma A.3. Under Conditions A1–A6, it follows that

N^{−1}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F_a →_P Ṽ,

where Ṽ = J^{−1}Σ_{j=2}^J E(F_j F_j^T)/d_j^2.

Proof: Let W_x = diag{K_h(x_{11} − x)/d_1^2,..., K_h(x_{nJ} − x)/d_J^2}.
Then the smoothing matrix S_h(X) of the local linear regression can be expressed as the matrix whose rows are

[1, 0]{A_{x_{ij}}^T W_{x_{ij}} A_{x_{ij}}}^{−1} A_{x_{ij}}^T W_{x_{ij}}, i = 1,..., n, j = 1,..., J,

and N^{−1}A_x^T W_x A_x is the 2 × 2 matrix with entries S_0, S_1 in the first row and S_1, S_2 in the second row, where

S_k = N^{−1}Σ_{i=1}^n Σ_{j=1}^J {(x_{ij} − x)/h}^k K_h(x_{ij} − x)/d_j^2.

Based on the result X = E(X) + O_p{√var(X)} and Lemma A.2, we can show that, if k is even,

S_k = J^{−1}Σ_{j=1}^J d_j^{−2} ∫t^k K(t) f_j(x + ht)dt + O_p[{log(1/h)/(nh)}^{1/2}]
    = J^{−1}Σ_{j=1}^J d_j^{−2} f_j(x)μ_k + O_p[h^2 + {log(1/h)/(nh)}^{1/2}]

uniformly in x. Let τ(x) = J^{−1}Σ_{j=1}^J f_j(x)/d_j^2. By the symmetry of the kernel function, μ_k = 0 for any odd k, and then S_k = O_p[h + {log(1/h)/(nh)}^{1/2}] uniformly in x. Therefore, N^{−1}A_x^T W_x A_x equals

( τ(x) + O_p[h^2 + {log(1/h)/(nh)}^{1/2}]    O_p[h + {log(1/h)/(nh)}^{1/2}] )
( O_p[h + {log(1/h)/(nh)}^{1/2}]             τ(x)μ_2 + O_p[h^2 + {log(1/h)/(nh)}^{1/2}] )
and

{N^{−1}A_x^T W_x A_x}^{−1} =
( τ(x)^{−1} + O_p[h^2 + {log(1/h)/(nh)}^{1/2}]    O_p[h + {log(1/h)/(nh)}^{1/2}] )
( O_p[h + {log(1/h)/(nh)}^{1/2}]                  {τ(x)μ_2}^{−1} + O_p[h^2 + {log(1/h)/(nh)}^{1/2}] )   (9)

uniformly in x. Similarly, based on Lemma A.2 and the assumed independence between ε_{ij} and x_{ij}, it follows that

N^{−1}A_x^T W_x F_a = ( O_p[{log(1/h)/(nh)}^{1/2}] ; O_p[{log(1/h)/(nh)}^{1/2}] ) 1_{(J−1)J/2}^T,

where 1_k is the k-dimensional column vector with all elements 1. Consequently,

[1, 0]{N^{−1}A_x^T W_x A_x}^{−1}{N^{−1}A_x^T W_x F_a} = o_p(1) 1_{(J−1)J/2}^T.

Substituting this result into the smoothing matrix S_h(X), we have

S_h(X)F_a = o_p(1) 1_{N×(J−1)J/2},

where N = nJ and 1_{a×b} is the a × b matrix with all elements one. Therefore,

{I − S_h(X)}F_a = F_a{1 + o_p(1)}.
Finally, by the weak law of large numbers,

N^{−1}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F_a = N^{−1}Σ_{i=1}^n Σ_{j=2}^J F_{ij}F_{ij}^T d_j^{−2}{1 + o_p(1)}^2 →_P Ṽ,

where Ṽ = J^{−1}Σ_{j=2}^J E(F_j F_j^T)/d_j^2.

Lemma A.4. Under Conditions A1–A6, we have

N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a →_P Ṽ.

Proof: Since B = F̂_a − F_a, the generic element of B is of the form m(x_{ij}) − m̂_I(x_{ij}), which is of order o_p(n^{−1/4}) uniformly in x by Condition A6; thus B = o_p(n^{−1/4}) elementwise. Therefore,

N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a = N^{−1}(F_a + B)^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}(F_a + B).

By arguments similar to those in the proof of Lemma A.3, we can show that

N^{−1}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}B = o_p(1) and N^{−1}B^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}B = o_p(1).

Therefore,

N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a = N^{−1}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F_a + o_p(1),

and the result follows from Lemma A.3.
Lemma A.5. Suppose that Conditions A1–A6 hold. Then

N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m = o_p(1)

and

N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}Bφ = o_p(1).

Proof: We first prove that N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m = o_p(1). Similarly to the argument in the proof of Lemma A.3, we can show that

N^{−1}A_x^T W_x m = ( m(x)τ(x) + O_p[h^2 + {log(1/h)/(nh)}^{1/2}] ; O_p[h + {log(1/h)/(nh)}^{1/2}] ).

Based on this result and result (9), we have

[1, 0]{N^{−1}A_x^T W_x A_x}^{−1}{N^{−1}A_x^T W_x m} = m(x)[1 + O_p{h + (log(1/h)/(nh))^{1/2}}]

uniformly in x ∈ Ω. Therefore,

{I − S_h(X)}m = m − m[1 + O_p{h + (log(1/h)/(nh))^{1/2}}] = o_p(1).

Note that {I − S_h(X)}F_a = F_a{1 + o_p(1)}. Therefore, we have

N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m = N^{−1/2}F_a^T G^{−1}{1 + o_p(1)} o_p(1).

Note that E(F_j) = 0 and that the covariance matrix of {F_2,..., F_J} is finite. Thus, N^{−1/2}F_a^T G^{−1} = O_p(1) elementwise.
We thus have

N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m = o_p(1). (10)

Since F̂_a = F_a + B, we can break N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m into two terms: N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m, which is o_p(1) by (10), and N^{−1/2}B^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m, which is also o_p(1) because B = o_p(n^{−1/4}). Therefore,

N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m = o_p(1).

From the proof of Lemma A.4, we have {I − S_h(X)}Bφ = o_p(1) and N^{−1/2}F̂_a^T{I − S_h(X)} = O_p(1). Therefore,

N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}Bφ = o_p(1).

Lemma A.6. Under Conditions A1–A6, let e = (e_{11},..., e_{nJ})^T and assume that var(e_{ij}) = d_j^2. Then

√N[F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a]^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e →_d N(0, Ṽ^{−1}).

Proof: From the proof of Lemma A.5, we have

N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e = N^{−1/2}Σ_{i=1}^n Σ_{j=1}^J F_{ij}d_j^{−2}[e_{ij} − [1, 0]{A_{x_{ij}}^T W_{x_{ij}} A_{x_{ij}}}^{−1}A_{x_{ij}}^T W_{x_{ij}}e]{1 + o_p(1)}. (11)
By applying Lemma A.2 to {x_{ij}, e_{ij}} and using Lemma A.3, we can show that

[1, 0]{A_x^T W_x A_x}^{−1}A_x^T W_x e = [1, 0]{N^{−1}A_x^T W_x A_x}^{−1}{N^{−1}A_x^T W_x e} = o_p(1), (12)

since, by (9) and the fact that E(e_{ij} | x_{ij}) = 0, each element of N^{−1}A_x^T W_x e is O_p[{log(1/h)/(nh)}^{1/2}]. Then

e_{ij} − [1, 0]{A_{x_{ij}}^T W_{x_{ij}} A_{x_{ij}}}^{−1}A_{x_{ij}}^T W_{x_{ij}}e = e_{ij}{1 + o_p(1)}.

Plugging this into (11), we obtain

N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e = N^{−1/2}Σ_{i=1}^n Σ_{j=1}^J F_{ij}e_{ij}d_j^{−2}{1 + o_p(1)}.

Note that E(F_{ij}e_{ij}) = 0, var(F_{ij}e_{ij}) < ∞, and E(F_{ij}e_{ij}F_{i'j'}^T e_{i'j'}) = 0 when i ≠ i' or j ≠ j', since e_{ij} is independent of F_{ij}. Applying the central limit theorem to Σ_{i=1}^n (F_{i1}e_{i1}/d_1^2 + ··· + F_{iJ}e_{iJ}/d_J^2), we have

N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e →_d N(0, Ṽ).

By Lemma A.3, N^{−1}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F_a →_P Ṽ. Applying Slutsky's theorem, we have

√N[F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F_a]^{−1}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e →_d N(0, Ṽ^{−1}).

Since F̂_a = F_a + B, we may write

N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e = N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e + N^{−1/2}B^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e.
Since B = o_p(n^{−1/4}) by Condition A6, it can be shown that

N^{−1/2}B^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e = o_p(1).

Furthermore, we have shown that N^{−1/2}F_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e →_d N(0, Ṽ). Thus, N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e →_d N(0, Ṽ) as well. The proof is completed by Lemma A.4 and Slutsky's theorem.

Proof of Theorem 2.1: We first show the asymptotic normality of φ̂_p. According to the expression for φ̂_p in (6), we can write √N(φ̂_p − φ) = A − B + C, where

A = [N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a]^{−1} N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m,

B = [N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a]^{−1} N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}Bφ,

C = [N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a]^{−1} N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}e.

Term A is the product of the two factors

[N^{−1}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}F̂_a]^{−1} and N^{−1/2}F̂_a^T{I − S_h(X)}^T G^{−1}{I − S_h(X)}m.
From Lemmas A.4 and A.5, the asymptotic properties of these two factors lead to the conclusion that A = o_p(1). Similarly, applying Lemmas A.4 and A.5 to the two factors of term B yields B = o_p(1) as well. In addition, Lemma A.6 states that term C converges in distribution to N(0, \tilde V^{-1}). Noting that \hat\phi_p does not use the observations from the first time point of each subject, we should replace J by J - 1 for \hat\phi_p. Putting A, B, and C together, we obtain the asymptotic distribution of \hat\phi_p.

Next we derive the asymptotic bias and variance of \hat m(\cdot). Note that
\[ \hat m(x_0, \hat\phi_p) = [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} (m + e + F_a \phi - \hat F_a \hat\phi_p) = [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} (m + e) \{1 + o_p(1)\}. \]
Note that E(e \mid X) = 0. Therefore,
\[ \mathrm{bias}\{\hat m(x_0, \hat\phi_p) \mid X\} = [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} m \{1 + o_p(1)\} - m(x_0) = [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} \{m - A_{x_0} [m(x_0), h m'(x_0)]^T\} \{1 + o_p(1)\}. \]
Similar to the arguments in Fan and Gijbels (1996, Section 3.7), we can prove that the asymptotic bias is \frac{1}{2} m''(x_0) h^2 \mu_2. In addition, note that
\[ [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} = \frac{1}{N \tau(x_0)} [K_h(x_{11} - x_0)/d_1^2, \ldots, K_h(x_{nJ} - x_0)/d_J^2] \{1 + o_p(1)\}. \]
Therefore,
\[ \mathrm{var}\{\hat m(x_0, \hat\phi_p) \mid X\} = [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} \mathrm{cov}(e) W_{x_0} A_{x_0} \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} [1, 0]^T \{1 + o_p(1)\} = \frac{\int K^2(x)\,dx}{N h \tau(x_0)} \{1 + o_p(1)\}. \]
As to the asymptotic normality,
\[ \hat m(x_0, \hat\phi_p) - E\{\hat m(x_0, \hat\phi_p) \mid X\} = [1, 0] \{A_{x_0}^T W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T W_{x_0} e \{1 + o_p(1)\}. \]
Thus, conditioning on X, the asymptotic normality can be established using the CLT, since for each given j the e_{ij}'s are independent and identically distributed with mean zero and variance d_j^2.

Proof of Theorem 2.2: 1.) Let
\[ \tilde Y = (\tilde y_{11}, \ldots, \tilde y_{nJ})^T, \qquad \tilde W_x = \mathrm{diag}\{K_h(x_{11} - x)/\tilde d_{11}^2, \ldots, K_h(x_{nJ} - x)/\tilde d_{nJ}^2\}, \]
where K_h(t) = h^{-1} K(t/h), K(\cdot) is a kernel function, and h is the bandwidth. Note that
\[ \tilde m(x_0) = [1, 0] \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T \tilde W_{x_0} (m + e + H), \]
where H = (F_1 \phi_1 - \hat F_1 \tilde\phi_1, \ldots, F_n \phi_n - \hat F_n \tilde\phi_n)^T and \hat F_i = (\hat F_{i1}, \ldots, \hat F_{iJ})^T. Note that
\[ N^{-1} A_x^T \tilde W_x A_x = \begin{pmatrix} N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{J} K_h(x_{ij} - x)/\tilde d_{ij}^2 & N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{J} \frac{x_{ij} - x}{h} K_h(x_{ij} - x)/\tilde d_{ij}^2 \\ N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{J} \frac{x_{ij} - x}{h} K_h(x_{ij} - x)/\tilde d_{ij}^2 & N^{-1} \sum_{i=1}^{n} \sum_{j=1}^{J} \frac{(x_{ij} - x)^2}{h^2} K_h(x_{ij} - x)/\tilde d_{ij}^2 \end{pmatrix}. \]
Let \tilde\tau(x) = J^{-1} \sum_{j=1}^{J} f_j(x) E(\tilde d_j^{-2} \mid X_j = x). Similar to the proof of Lemma A.3, we have that
\[ N^{-1} A_x^T \tilde W_x A_x = \begin{pmatrix} \tilde\tau(x) + O_p\{h^2 + \sqrt{\log(1/h)/(nh)}\} & O_p\{h + \sqrt{\log(1/h)/(nh)}\} \\ O_p\{h + \sqrt{\log(1/h)/(nh)}\} & \tilde\tau(x)\mu_2 + O_p\{h^2 + \sqrt{\log(1/h)/(nh)}\} \end{pmatrix} \]
and
\[ \{N^{-1} A_x^T \tilde W_x A_x\}^{-1} = \begin{pmatrix} \{\tilde\tau(x)\}^{-1} + O_p\{h^2 + \sqrt{\log(1/h)/(nh)}\} & O_p\{h + \sqrt{\log(1/h)/(nh)}\} \\ O_p\{h + \sqrt{\log(1/h)/(nh)}\} & \{\tilde\tau(x)\mu_2\}^{-1} + O_p\{h^2 + \sqrt{\log(1/h)/(nh)}\} \end{pmatrix} \]
hold uniformly in x. Note that
\[ F_i \phi_i - \hat F_i \tilde\phi_i = (F_i - \hat F_i) \tilde\phi_i + F_i (\phi_i - \tilde\phi_i). \quad (13) \]
The generic element of F_i - \hat F_i is of the form m(x_{ij}) - \hat m_I(x_{ij}), which is of order o_p(n^{-1/4}) uniformly in x by Condition A6; therefore, F_i - \hat F_i = o_p(n^{-1/4}). Note that E(e \mid X) = 0 and E(F_i \mid X) = 0. Thus,
\[ E\{\tilde m(x_0) \mid X\} = [1, 0] \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T \tilde W_{x_0} m \{1 + o_p(1)\}, \]
and
\[ \mathrm{bias}\{\tilde m(x_0) \mid X\} = [1, 0] \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T \tilde W_{x_0} m \{1 + o_p(1)\} - m(x_0) = [1, 0] \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T \tilde W_{x_0} \{m - A_{x_0} [m(x_0), h m'(x_0)]^T\} \{1 + o_p(1)\}. \]
Similar to the arguments in Fan and Gijbels (1996, Section 3.7), we can prove that the asymptotic bias is \frac{1}{2} m''(x_0) h^2 \mu_2. In addition, note that
\[ [1, 0] \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T \tilde W_{x_0} = \frac{1}{N \tilde\tau(x_0)} [K_h(x_{11} - x_0)/\tilde d_{11}^2, \ldots, K_h(x_{nJ} - x_0)/\tilde d_{nJ}^2] \{1 + o_p(1)\}. \]
Let c_j^2 be the jth diagonal element of \mathrm{cov}\{F(\phi - \tilde\phi) \mid X\}. Based on (13), we have
\[ \mathrm{var}\{\tilde m(x_0) \mid X\} = [1, 0] \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} A_{x_0}^T \tilde W_{x_0} \mathrm{cov}(e + H \mid X) \tilde W_{x_0} A_{x_0} \{A_{x_0}^T \tilde W_{x_0} A_{x_0}\}^{-1} [1, 0]^T \{1 + o_p(1)\} = \frac{\nu_0}{N h \tilde\tau^2(x_0)} \gamma(x_0) \{1 + o_p(1)\}, \]
where
\[ \gamma(x_0) = J^{-1} \sum_{j=1}^{J} f_j(x_0) E\{(c_j^2 + d_j^2) \tilde d_j^{-4} \mid X_j = x_0\}. \]
2.) When \tilde\Sigma = \Sigma, we have c_j^2 = 0 and \tilde d_j^2 = d_j^2. Hence
\[ \mathrm{var}\{\tilde m(x_0) \mid X\} \approx \frac{\nu_0}{N h} \left\{ J^{-1} \sum_{j=1}^{J} f_j(x_0) E(d_j^{-2} \mid X_j = x_0) \right\}^{-1}. \]
By noting that
\[ \gamma(x_0) \ge J^{-1} \sum_{j=1}^{J} f_j(x_0) E(d_j^2 \tilde d_j^{-4} \mid X_j = x_0) \quad (14) \]
and
\[ \left\{ \sum_{j=1}^{J} f_j(x_0) E(d_j^2 \tilde d_j^{-4} \mid X_j = x_0) \right\} \left\{ \sum_{j=1}^{J} f_j(x_0) E(d_j^{-2} \mid X_j = x_0) \right\} \ge \left\{ \sum_{j=1}^{J} f_j(x_0) E(\tilde d_j^{-2} \mid X_j = x_0) \right\}^2, \quad (15) \]
we can obtain the result. In (14), equality holds only when \phi = \tilde\phi. In the second inequality (15), based on the Cauchy-Schwarz inequality, equality holds only when the ratios \tilde d_j / d_j are all equal. By the Cholesky decomposition result, \phi = \tilde\phi and the \tilde d_j / d_j are all equal only when \tilde\Sigma = k\Sigma, and thus \tilde\Sigma_i = k\Sigma_i, for some constant k.

References

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall, London.
Fan, J. and Huang, T. (2005). Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli, 11, 1031-1057.

Fan, J., Huang, T., and Li, R. (2007). Analysis of longitudinal data with semiparametric estimation of covariance function. Journal of the American Statistical Association, 102, 632-641.

Fan, J. and Li, R. (2004). New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. Journal of the American Statistical Association, 99, 710-723.

Gu, C. (2002). Smoothing Spline ANOVA Models. Springer-Verlag, New York.

Hansen, L. (1982). Large sample properties of generalized method of moments estimators. Econometrica, 50, 1029-1054.

Liang, K.-Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.

Lin, X. and Carroll, R. J. (2000). Nonparametric function estimation for clustered data when the predictor is measured without/with error. Journal of the American Statistical Association, 95, 520-534.

Qu, A., Lindsay, B. G., and Li, B. (2000). Improving generalised estimating equations using quadratic inference functions. Biometrika, 87, 823-836.

Ruppert, D., Sheather, S. J., and Wand, M. P. (1995). An effective bandwidth selector for local least squares regression. Journal of the American Statistical Association, 90, 1257-1270.

Wang, N. (2003). Marginal nonparametric kernel regression accounting for within-subject correlation. Biometrika, 90, 43-52.
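Numerical illustration 1. The efficiency argument in the proof of Theorem 2.2 rests on the modified Cholesky decomposition of the covariance matrix: writing \Sigma = L D L^T with L unit lower triangular, the rows of L^{-1} carry the autoregressive coefficients \phi and the diagonal of D the innovation variances d_j^2, so the transformed errors are uncorrelated. The sketch below is our illustration, not code from the report; the covariance matrix is an arbitrary AR(1)-type example.

```python
import numpy as np

# Modified Cholesky decomposition Sigma = L D L^T of a working covariance:
# L is unit lower triangular, the sub-diagonal entries of T = L^{-1} (negated)
# are the autoregressive coefficients phi_{jk}, and the diagonal of D holds
# the innovation variances d_j^2.
Sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])   # illustrative AR(1)-type covariance

C = np.linalg.cholesky(Sigma)          # Sigma = C C^T with C lower triangular
L = C / np.diag(C)                     # divide each column by its pivot -> unit diagonal
D = np.diag(np.diag(C) ** 2)           # innovation variances d_j^2
T = np.linalg.inv(L)                   # rows of T decorrelate: e_j - sum_k phi_{jk} e_k
print(np.diag(T @ Sigma @ T.T))        # equals the d_j^2: transformed errors uncorrelated
```

If \tilde\Sigma = k\Sigma, the factor L (hence \phi) is unchanged and D scales by k, so the ratios \tilde d_j / d_j are all equal to \sqrt{k}; this is exactly the equality condition derived in the proof of Theorem 2.2.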
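Numerical illustration 2. The local linear bias \frac{1}{2} m''(x_0) h^2 \mu_2 used in the proofs of both theorems can be checked numerically. This sketch is also our own addition; the grid, kernel, bandwidth, and test function are illustrative choices. It fits a local linear estimator to noiseless data from m(x) = x^2 (so m''(x_0) = 2) with the Epanechnikov kernel, for which \mu_2 = 1/5.

```python
import numpy as np

# Local linear fit at an interior point x0 on noiseless data, so the
# deviation from m(x0) isolates the smoothing bias (1/2) m''(x0) h^2 mu_2.
def local_linear(x, y, x0, h):
    t = (x - x0) / h
    w = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)  # Epanechnikov kernel
    X = np.column_stack([np.ones_like(x), x - x0])
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta[0]                            # intercept estimates m(x0)

x = np.linspace(0.0, 1.0, 4001)               # dense uniform design
x0, h = 0.5, 0.1
bias = local_linear(x, x**2, x0, h) - x0**2   # m(x) = x^2
theory = 0.5 * 2.0 * h**2 * 0.2               # (1/2) m''(x0) h^2 mu_2, with mu_2 = 1/5
print(bias, theory)
```

On this dense uniform grid the empirical deviation from m(x_0) matches the theoretical bias to high accuracy.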
More informationFinite Population Sampling and Inference
Finite Population Sampling and Inference A Prediction Approach RICHARD VALLIANT ALAN H. DORFMAN RICHARD M. ROYALL A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim Brisbane
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationBickel Rosenblatt test
University of Latvia 28.05.2011. A classical Let X 1,..., X n be i.i.d. random variables with a continuous probability density function f. Consider a simple hypothesis H 0 : f = f 0 with a significance
More information