Asymptotic distributions of nonparametric regression estimators for longitudinal or functional data

Size: px
Start display at page:

Download "Asymptotic distributions of nonparametric regression estimators for longitudinal or functional data"

Transcription

1 Journal of Multivariate Analysis 98 (2007) Asymptotic distriutions of nonparametric regression estimators for longitudinal or functional data Fang Yao Department of Statistics, Colorado State University, Fort Collins, CO 80523, USA Received 28 January 2005 Availale online 28 Septemer 2006 Astract The estimation of a regression function y kernel method for longitudinal or functional data is considered. In the context of longitudinal data analysis, a random function typically represents a suject that is often oserved at a small numer of time points, while in the studies of functional data the random realization is usually measured on a dense grid. However, essentially the same methods can e applied to oth sampling plans, as well as in a numer of settings lying etween them. In this paper general results are derived for the asymptotic distriutions of real-valued functions with arguments which are functionals formed y weighted averages of longitudinal or functional data. Asymptotic distriutions for the estimators of the mean and covariance functions otained from noisy oservations with the presence of within-suject correlation are studied. These asymptotic normality results are comparale to those standard rates otained from independent data, which is illustrated in a simulation study. Besides, this paper discusses the conditions associated with sampling plans, which are required for the validity of local properties of kernel-ased estimators for longitudinal or functional data Elsevier Inc. All rights reserved. AMS 99 suject classification: 62G08 Keywords: Asymptotic distriution; Covariance; Functional data; Longitudinal data; Regression; Within-suject correlation. Introduction Modern technology and advanced computing environments have facilitated the collection and analysis of high-dimensional data, or data that are repeatedly measured for a sample of sujects. The repeated measurements are often recorded over a period of time, say on an closed and ounded interval T. It also could e a spacial variale, such as in image or geoscience applications. addresses: fyao@utstat.toronto.edu, fyao@stat.colostate.edu X/$ - see front matter 2006 Elsevier Inc. All rights reserved. doi:0.06/j.jmva

2 F. Yao / Journal of Multivariate Analysis 98 (2007) When the data are recorded densely over time, often y machine, they are typically termed functional or curve data with one oserved curve or function per suject, while in longitudinal studies the repeated measurements usually take place on a few scattered oservational time points for each suject. A significant intrinsic difference etween two settings lies in the perception that functional data are oserved in the continuum without noise [2,3], whereas longitudinal data are oserved at sparsely distriuted time points and are often suject to experimental error [4]. However, in practice functional data are analyzed after smoothing noisy oservations [0], which indicates that the differences etween two data types related to the way in which a prolem is perceived are argualy more conceptual than actual. Therefore in this paper, kernel-ased regression estimators otained from oservations at discrete time points contaminated with measurement errors, rather than oservations in the continuum, are considered for these realistic reasons. In the context of kernelased nonparametric regression, the effects of sampling plans on the statistical estimators are also investigated. A vast literature has een developed in the past decade on the kernel-ased regression for independent and identically distriuted data, for summary, see Fan and Gijels [5]. There has een sustantial recent interest in extending the existing asymptotic results to functional or longitudinal data [8,,4,3,9]. The issues caused y the within-suject correlation are rigorously addressed in this paper. Hart and Wehrly [8] studied the Gasser Müller estimator of the mean function for repeated measurements oserved on a regular grid y assuming stationary correlation structure, and showed that the influence of the within-suject correlation on the asymptotic variance is of smaller order compared to the standard rate otained from independent data and will disappear when the correlation function is differentiale at zero. Our asymptotic distriution result is in fact consistent with that in Hart and Wehrly [8] and applicale for general covariance structure without stationary assumption. This prolem was also discussed y Staniswalis and Lee [2] and Lin and Carroll [9], where they used the heuristic arguments of the local property of local polynomial estimation and intuitively ignored the within-suject correlation while deriving the asymptotic variances. This paper derives appropriate conditions that are required for the validity of the local property of kernel type estimators otained from longitudinal or functional data. These conditions also provide practical guidelines for various sampling procedures. The contriution of this paper is the derivation of general asymptotic distriution results in oth one-dimensional and two-dimensional smoothing context for real-valued functions with arguments which are functionals formed y weighted averages of longitudinal or functional data. These asymptotic normality results are comparale to those otained for identically distriuted and independent data. These results are applied to the kernel-ased estimators of the mean and covariance functions, which yields asymptotic normal distriutions of these estimators. In particular, to the est of our knowledge, no asymptotic distriution results are availale up to date for nonparametric estimation of covariance functions otained from longitudinal or functional data contaminated with measurement error. By comparison, Hall et al. [6,7] investigated asymptotic properties of nonparametric kernel estimators of autocovariance, where the measurements were only oserved from a single stationary stochastic process or random field. Although the asymptotic distriutions are derived for random design in this paper, the arguments can e extended to fixed design and other sampling plans with appropriate modifications, and asymptotic ias and variance terms can also e otained in similar manner. This will provide theoretical asis and practical guidance for the nonparametric analysis of functional or longitudinal data with important potential applications which are ased on the asymptotic distriutions. Typical examples include the construction of asymptotic confidence ands for regression functions and confidence regions

3 42 F. Yao / Journal of Multivariate Analysis 98 (2007) for covariance surfaces, and also fast selection of andwidth for covariance surface estimation ased on asymptotic mean squared error. Other applications in the context of smoothing independent data can e explored for the smoothing of longitudinal or functional data using kernel-ased estimators. The remainder of the paper is organized as follow. In Section 2 we derive the general asymptotic distriutions of one- and two-dimensional smoothers otained from longitudinal or functional data for random design. These general asymptotic results are applied to commonly used kernel-type estimators of the mean curve and covariance surface in Section 3. Extension to fixed design is discussed in Section 4. A simulation study is presented to evaluate the derived asymptotic results for correlated data in Section 5, while discussions, including potential applications of the resulting asymptotic normality, are offered in Section General results of asymptotic distriutions for random design In this section we will define general functionals that are kernel-weighted averages of the data for one-dimensional and two-dimensional smoothing. The introduced general functionals include the most commonly used types of kernel-ased estimators as special cases, such as Gasser Müller estimator, Nadaraya Waston estimator, local polynomial estimator, etc. Since Nadaraya Waston and local polynomial estimators are mostly used in practice, their asymptotic ehaviors in terms of ias and variance for independent data have een thoroughly studied in existing literature. However, for longitudinal or functional data, particularly in regard to covariance surface estimators, the asymptotic ehaviors of ias and variance of these two commonly used estimators are still largely unknown. Therefore in Section 3, the general asymptotic results developed in this section are applied to Nadaraya Waston and local polynomial estimators in oth one-dimensional and two-dimensional smoothing settings. In particular, the lack of asymptotic results for the covariance surface estimators of longitudinal or functional data is an additional motivation for the definition of the two-dimensional general functional that can e applied to develop the asymptotic distriutions for these estimators. We first consider random design while extension to other sampling plans is deferred to Section 4. In classical longitudinal studies, measurements are often intended to e on a regular time grid. However, since individuals may miss scheduled visits, the resulting data usually ecome sparse, where only few oservations are otained for most sujects, with unequal numers of repeated measurements per suject and different measurement times T ij per individual. This sampling plan motivates the following assumptions which are applicale for a numer of longitudinal data encountered in practice. Let X ={X(t),t [0, T ]}, T <, e a continuous-time stochastic process defined on a proaility space (Ω, A,P) with finite variance, E( T 0 X2 (t) dt) <, and X i e independently and identically distriuted (i.i.d.) realizations of X which can e modelled as follows in general. The trajectories X i have the mean function μ(t) and uncorrelated random coefficients ξ ik with mean zero, variances λ k satisfying k= λ k < and corresponding asis functions k (t), where the within-suject covariance function is determined as C(s,t) = cov(x i (s), X i (t)) = k= λ k k (s) k (t). A typical example is the Karhunen Loève representation of stochastic processes, where λ k and k correspond to eigenvalues and eigenfunctions. Suppose that one has data {(T ij,y ij ), i n, j N i }, where Y ij are oservations drawn from the process X i at time T ij, incorporating the uncorrelated measurement errors ε ij that have mean zero and a constant

4 F. Yao / Journal of Multivariate Analysis 98 (2007) variance σ 2, Y ij = X i (T ij ) + ε ij = μ(t ij ) + ξ ik k (T ij ) + ε ij, T ij T, () k= where Eε ij = 0, var(ε ij ) = σ 2, and the numer of oservations, N i (n) depending on the sample size n, are considered random. We make the following assumptions, (A.) The numer of oservations N i (n) made for the ith suject or cluster, i =,...,n,isar.v. with N i (n) i.i.d N(n), where N(n) > 0 is a positive integer-valued random variale with lim sup n (n) 2 /[(n)] 2 < and lim sup n (n) 4 /(n)(n) 3 <. In the sequel the dependence of N i (n) and N(n) on the sample size n is suppressed for simplicity; i.e., N i = N i (n) and N(n) = N. The oservation times and measurements are assumed to e independent of the numer of measurements, i.e., for any suset J i {,...,N i } and for all i =,...,n, (A.2) ({T ij : j J i }, {Y ij : j J i }) is independent of N i. Writing T i = (T i,...,t ini ) T and Y i = (Y i,...,y ini ) T, it is easy to see that the triples {T i, Y i,n i } are i.i.d Asymptotic normality of one-dimensional smoother To assume appropriate regularity conditions that are used to derive asymptotic properties, we define a new type of continuity that differs from those which are commonly used. We say that a real function f(x,y) :R p+q Ris continuous on x A R p uniformly in y R q, provided that for any x A and ε > 0, there exists a neighorhood of x not depending on y, saying U(x) R p, such that f(x,y) f(x,y) < ε for all x U(x) and y R q. For random design, (T ij,y ij ) are assumed to have the identical distriution as (T, Y ) with joint density g(t, y). Assume that the oservation times T ij are i.i.d. with the marginal density f(t), ut dependence is allowed among Y ij and Y ik that are oservations made for the same suject or cluster. Also denote the joint density of (T j,t k,y j,y k ) y g 2 (t,t 2,y,y 2 ), where j = k. Let ν,ke given integers, with 0 ν <k. We assume regularity conditions for the marginal and joint densities, f(t), g(t, y), g 2 (t,t 2,y,y 2 ) and the mean function of the underlying process X(t), i.e., E[X(t)]=μ(t), with respect to a neighorhood of a interior point t T, assuming that there exists a neighorhood U(t) of t such that: d (B.) k f (u) exists and is continuous on u U(t), and f (u) > 0 for u U(t); du k (B.2) g(u, y) is continuous on u U(t)uniformly in y R; dk g(u, y) exists and is continuous du on u U(t) uniformly in y R; k (B.3) g 2 (u, v, y,y 2 ) is continuous on (u, v) U(t) 2 uniformly in (y,y 2 ) R 2 ; d (B.4) k μ(u) exists and is continuous on u U(t). du k Let K ( ) e nonnegative univariate kernel functions in one-dimensional smoothing. The assumptions for kernels K :R Rare as follows. We say that a univariate kernel function K is of order (ν,k),if 0, 0 l <k, l = ν, u l K (u) du = ( ) ν ν!, l = ν, (2) = 0, l = k,

5 44 F. Yao / Journal of Multivariate Analysis 98 (2007) (B2.) K is compactly supported, K 2 = K 2 (u) du < ; (B2.2) K is a kernel function of order (ν,l). Let = (n) e a sequences of andwidths that are used in one-dimensional smoothing. We develop asymptotics as n, and require (B3) 0, n() ν+, () 0, and n() 2k+ d 2 for some d with 0 d <. One could see in the proof of Theorem that the assumptions (B3) comined with (A.) provide the condition such that the local property of kernel-type estimators holds for longitudinal or functional data with the presence of within-suject correlation. Let {ψ λ } λ=,...,l e a collection of real functions ψ λ :R 2 R, which satisfy: (B4.) ψ λ (t, y) are continuous on {t} uniformly in y R; d (B4.2) k ψ dt k λ (t, y) exists for all arguments (t, y) and are continuous on {t} uniformly in y R. Then we define the general weighted averages Ψ λn = n ν+ n N i i= j= ψ λ (T ij,y ij )K ( t Tij ), λ =,...,l. and μ λ = μ λ (t) = dν dt ν ψ λ (t, y)g(t, y) dy, λ =,...,l. Let σ κλ = σ κλ (t) = ψ κ (t, y)ψ λ (t, y)g(t, y) dy K 2, λ, κ l, and H :R l Re a function with continuous first order derivatives. We denote the gradient vector (( H/ x )(v),...,( H/ x l )(v)) T y DH (v) and N = n i= N i /n. Theorem. If the assumptions (A.), (A.2) and (B.) (B4.2) hold, then n N 2ν+ D [H(Ψ n,...,ψ ln ) H(μ,...,μ l )] N (β, [DH(μ,...,μ l )] T Σ[DH(μ,...,μ l )]), (3) where β = ( )k d k! u k K (u) du l λ= H μ λ {(μ,...,μ l ) T } dk ν dt k ν μ λ(t), Σ = (σ κλ ) κ,λ l. Proof. It is seen that N can e replaced with y Slutsky Theorem under (A.). We now show that n() 2ν+ [H(EΨ n,...,eψ ln ) H(μ,...,μ l )] β. (4)

6 F. Yao / Journal of Multivariate Analysis 98 (2007) Since (A.) and (A.2) hold, and K is of order (ν,k), using Taylor expansion to order k, one otains EΨ λn = n n ν+ E N i ( ) t Tij ψ λ (T ij,y ij )K i= N = ν+ E j= j= = { ( )} t T ν+ E ψ λ (T, Y )K = μ λ + ( )k k! [ ( ) t Tj E ψ λ (T j,y j )K N] u k K (u) du dk ν dt k ν μ λ(t) k ν + o( k ν ). (5) Then (4) follows from an l-dimensional Taylor expansion of H of order around (μ,...,μ l ) T, coupled with (5). If we can show n() 2ν+ [(Ψ n,...,ψ ln ) T (EΨ,...,EΨ ln ) T ] D N (0, Σ), (6) in analogy to Bhattacharya and Müller [], and continuity of DH at (μ,...,μ l ) T and applying similar arguments used in (5), we find DH(EΨ n,...,eψ ln ) DH(μ,...,μ l ). Then Carmér Wold device yields n() 2ν+ D [H(Ψ n,...,ψ ln ) H(EΨ,...,EΨ ln )] N (0,DH(μ,...,μ l ) T ΣDH(μ,...,μ l )), (7) comined with (4), leading to (3). It remains to show (6). Oserving (A.) and (A.2), one has n() 2ν+ cov(ψ λn, Ψ κn ) = E N j= E N [ N E I I 2. k= j= ψ λ (T j,y j )K ( t Tj ψ λ (T j,y j )K ( t Tj ( ) ] t Tk ψ κ (T k,y k )K ) [ N ) k= ( ) ] t Tk ψ κ (T k,y k )K

7 46 F. Yao / Journal of Multivariate Analysis 98 (2007) It is ovious that I 2 = O() = o() from the derivation of (5). For I, it can e written as I = E N ( ) t ψ λ (T j,y j )ψ κ (T j,y j )K 2 Tj j= + E j =k N ψ λ (T j,y j )ψ κ (T k,y k )K ( t Tj Q + Q 2. Applying (A.) and (A.2), one has Q = E N [ ( ) t E ψ λ (T j,y j )ψ κ (T j,y j )K 2 Tj N] j= ) ( ) t Yk K = ( )] t Y [ψ E λ (T, Y )ψ κ (T, Y )K 2 = σ λκ + o(). Then (4) will hold, oserving (A.) and the following argument that guarantees the local property of the kernel-ased estimators with the presence of within-suject correlation in longitudinal or functional data, Q 2 = N [ ( ) ( ) t E Tj t Tk E ψ λ (T j,y j )ψ κ (T k,y k )K K N] = = (N ) E (N ) j =k N [ ( t T ψ λ (T,Y )ψ κ (T 2,Y 2 )K )] ( ) t T2 K R 4 ψ λ (t u, y )ψ κ (t v, y 2 )K (u)k 2 (v) g 2 (t u, t v, y,y 2 )dudvdy dy 2 (N ) = ψ λ (t, y )ψ κ (t, y 2 )g 2 (t,t,y,y 2 )dy dy 2 + o() = o(), R 2 i.e., the within-suject correlation can e ignored while deriving the asymptotic variance Asymptotic normality of two-dimensional smoother The general asymptotic result can e extended to two-dimensional smoothing. Let (ν, k) denote the multi-indices ν = (ν, ν 2 ) and k = (k,k 2 ), where ν =ν + ν 2 and k =k + k 2. In two-dimensional smoothing, more regularity assumptions are needed for joint densities. Let f 2 (s, t) e the joint density of (T j,t k ), and g 4 (s,t,s,t,y,y 2,y,y 2 ) the joint density of (T j,t k,t j,t k,y j,y k,y j,y k ) where j = k, (j, k) = (j,k ). Denote the covariance surface y C(s,t) = cov(x(t j ), X(T k ) T j = s, T k = t). The following regularity conditions are assumed, where U(s,t) is some neighorhood of {(s, t)}, (C.) d k du k dv k 2 f 2(u, v) exists and is continuous on (u, v) U(s,t), and f 2 (u, v) > 0 for (u, v) U(s,t);

8 F. Yao / Journal of Multivariate Analysis 98 (2007) (C.2) g 2 (u, v, y,y 2 ) is continuous on (u, v) U(s,t) uniformly in (y,y 2 ) R 2 d ; k du k dv k 2 g 2 (u, v, y,y 2 ) exits and is continuous on (u, v) U(s,t) uniformly in (y,y 2 ) R 2 ; (C.3) g 4 (u, v, u,v,y,y 2,y,y 2 ) is continuous on (u, v, u,v ) U(s,t) 2 uniformly in (y,y 2,y,y 2 ) R4 ; d (C.4) k C(u, v) exists and is continuous on (u, v) U(s,t). du k dv k 2 Let K 2 e nonnegative ivariate kernel functions used in the two-dimensional smoothing. The assumptions for kernels K 2 are as follows, (C2.) K 2 is compacted supported with K 2 2 = R 2 K2 2 (u,v)dudv <, and is symmetric with respect to coordinates u and v. (C2.2) K 2 is a kernel function of order ( ν, k ), i.e., l +l 2 = l 0, 0 l < k, l = ν, u l v l 2 K 2 (u,v)dudv = ( ) ν ν!, l = ν, R 2 = 0, l = k. (8) Let h = h(n) e a sequence of andwidths used in two-dimensional smoothing, while it is possile that the andwidths used for two arguments may e different. Since we will focus on the estimator of the covariance surface that is symmetric aout the diagonal, it is sufficient to consider the identical andwidths for the two arguments. The asymptotics is developed as n as follows: (C3) h 0, n 2 h ν +2, h 3 0, and ne[n(n )]h 2 k +2 e 2 for some 0 e <. Similar to the one-dimensional smoothing case, assumptions (C3) and (A.) guarantee the local property of the ivariate kernel-ased estimators with the presence of within-suject correlation. Let { λ } λ=,...,l e a collection of real functions λ :R 4 R, λ =,...,l, satisfying (C4.) λ (s,t,y,y 2 ) are continuous on {(s, t)} uniformly in (y,y 2 ) R 2 ; d (C4.2) k ds k dt k 2 λ(s,t,y,y 2 ) exit for all arguments (s,t,y,y 2 ) and are continuous on {(s, t)} uniformly in (y,y 2 ) R 2. Then the general weighted averages of two-dimensional smoothing are defined y, for λ l, n Φ λn = Φ λn (t, s) = ne[n(n )]h ν +2 λ (T ij,t ik,y ij,y ik ) i= j =k N i Let K 2 ( s Tij h m λ = m λ (s, t) =, t T ) ik. h ν +ν 2 = ν d ν ds ν dt ν 2 R 2 λ (s,t,y,y 2 )g 2 (s,t,y,y 2 )dy dy 2, λ l,

9 48 F. Yao / Journal of Multivariate Analysis 98 (2007) and ω κλ = ω κλ (s, t) = κ (s,t,y,y 2 ) λ (s,t,y,y 2 )g 2 (s,t,y,y 2 )dy dy 2 K 2 2, R 2 κ, λ l, and H :R l Ris a function with continuous first order derivatives as previously defined. Theorem 2. If assumptions (A.), (A.2) and (C.) (C4.2) hold, then n N( N )h 2 ν +2 [H(Φ n,...,φ ln ) H(m,...,m l )] where D N (γ, [DH(m,...,m l )] T Ω[DH(m,...,m l )]), (9) γ = ( ) k e k! l λ= k +k 2 = k d k u k v k 2 K 2 (u,v)dudv R 2 ds k dt k 2 λ (s,t,y,y 2 )g 2 (s,t,y,y 2 )dy dy 2 R 2 { } H (m,...,m l ) T, m λ Ω = (ω κλ ) κ l. The proof of Theorem 2 essentially follows that of Theorem with appropriate modifications which are required for two-dimensional smoothing. 3. Applications to nonparametric regression estimators for functional or longitudinal data Although various versions of kernel-ased estimators have een introduced in literature, Nadaraya Waston and local polynomial, especially local linear estimators, are the most commonly used non-parametric smoothing techniques in longitudinal or functional data analysis. Due to within-suject correlation, the asymptotic ehavior in terms of ias and variance of these estimators for noisily oserved longitudinal or functional data has yet een as well understood as for i.i.d. data. Especially, asymptotic results for covariance estimators do not exist. Therefore in this section, we apply the asymptotic results developed for general functionals to Nadaraya Waston and local linear estimators of regression function and covariance surface to otain their asymptotic distriutions. 3.. Asymptotic distriutions of mean estimators We apply Theorem to the local asymptotic distriutions of the commonly used Nadaraya Waston kernel estimator ˆμ N (t) and local linear estimator ˆμ L (t) for functional/longitudinal

10 F. Yao / Journal of Multivariate Analysis 98 (2007) data: n ˆμ N (t) = N i i= j= ˆμ L (t) =ˆα 0 (t) =arg min (α 0, α ) K ( t Tij n ) N i i= j= Y ij / n K ( t Tij N i i= j= K ( t Tij ), (0) ) [Y ij (α 0 + α (T ij t))] 2. () Corollary. If assumptions (A.), (A.2), and (B.) (B3) hold with ν = 0 and k = 2, then ( ) D d μ n N[ˆμ (2) (t)f (t)+2μ () (t)f () (t) N (t) μ(t)] N σ 2 K 2 f(t), var(y T =t) K 2, f(t) (2) where d is as in (B3), σ 2 K = u 2 K (u) du and K 2 = K 2 (u) du. One can see that the variance function var(y T = t) can e easily derived from model () as var(y T = t) = k= λ k 2 k (t) + σ2, where λ k and k are the variances of the uncorrelated random coefficients and corresponding asis functions, and σ 2 is the variance of the measurement error. Proof. Choose ν = 0, k = 2, ψ (u, v) = v, ψ 2 (u, v) and H(x,x 2 ) = x /x 2 in Theorem, then ˆμ N (t) = H(Ψ n, Ψ 2n ). To compute β, use DH(μ, μ 2 ) = (/μ 2, μ /μ 2 2 ), and note μ (t) = tg(t,v)dv = f(t)μ(t). It is easy to show that (d 2 /dt 2 )μ (t) =[f (2) μ+2f () μ () +f μ (2) ](t), and (d 2 /dt 2 )μ 2 (t) = f (2) (t), leading to the ias term in (2). For the asymptotic variance, note that σ = K 2 v 2 g(t, v) dv = K 2 E(Y 2 T = t)f(t), σ 2 = σ 2 = K 2 μ(t)f (t), σ 22 = K 2 f(t), and DH(μ, μ 2 ) = (/μ 2, μ /μ 2 2 ), yielding the variance term in (2). Concerning the local linear estimator ˆμ L (t), one otains Corollary 2. If assumptions (A.), (A.2), and (B.) (B3) hold with ν = 0 and k = 2, then ( ) D d n N[ˆμ L (t) μ(t)] N 2 μ(2) (t)σ 2 var(y T = t) K, K 2, (3) f(t) where d is as in (B3), σ 2 K = u 2 K (u) du and K 2 = K 2 (u) du. Proof. The local linear estimator ˆμ L (t) of the mean function μ(t) can e explicitly written as i ˆμ L (t) =ˆα 0 (t) = j w ij Y ij i i j w j w ij (T ij t) ij i j w â (t), (4) ij where ˆα (t) = i j w ij (T ij t)y ij ( i i j w ij (T ij t) 2 ( i j w ij (T ij t) i j w ij (T ij t)) 2 /( i j w ij Y ij )/( i j w ij ) j w ij ). (5)

11 50 F. Yao / Journal of Multivariate Analysis 98 (2007) Here w ij = K ((t T ij )/)/(n), where K is a kernel function of order (0, 2), satisfying (B2.) and (B2.2), and ˆα (t) is an estimator for the first derivative μ (t) of μ at t. Oserving that Corollary implies ˆμ N (t) p μ(t), let f(t) ˆ = i j w ij /N i, it is easy to p p show f(t) ˆ f(t) in analogy to Corollary. We proceed to show â (t) μ (t). Denote σ 2 K = u 2 K (u) du, the kernel function K (t) = tk (t)/σ 2 K, and define Ψ λn, λ 3 y ψ (u, y) = y, ψ 2 (u, y), ψ 3 (u, y) = u t. Oserve that K is of order (, 3), f(t) ˆ p f(t), and define H(x,x 2,x 3 ) = x x 2 ˆμ N (t) x 3 x2 2/ f(t) ˆ σ 2 and H(x,x 2,x 3 ) = x x 2 μ(t). x K 3 Then ˆα (t) = H(Ψ n, Ψ 2n, Ψ 3n ) [ = H(Ψ n, Ψ 2n, Ψ 3n ) + Ψ ] 2n(μ(t) ˆμ N (t)) Ψ 3n Ψ 3n Ψ 3n + 2 Ψ 2 2n / f(t) ˆ σ 2. K Note that μ = (μ f +mf )(t), μ 2 = f (t), and μ 3 = f(t), implying Ψ λn μ λ = O p (/ n N 3 ), for λ =, 2, 3, y Theorem. Using Slutsky s Theorem, H(Ψ n, Ψ 2n, Ψ 3n ) μ (t) = O p (/ n N 3 ) follows. For the asymptotic distriution of ˆμ L, note that ˆμ L (t) = i j w ij Y ij i j w ij (T ij t)â (t) i j w. ij j w ij (T ij t) = n Nσ 2 K 2 Ψ 2n. Since K is of order (, 3), Considering n N i Theorem implies Ψ 2n = f (t)+o p (/ n N 3 ), which yields n Nσ 2 K 2 Ψ 2n = n N 5 σ 2 K f (t) + σ 2 K O p () = o p () y oserving n N 5 d 2 for 0 d <. Since f(t) ˆ p f(t)and ˆα (t) μ (t) =O p (/ n N 3 ) = o p (),wefind lim n N[ˆμ n L (t) μ(t)] = D lim n N n { i j w ij Y ij μ (t) i j w ij T ij + tμ (t) i i j w ij j w ij } μ(t). Using the kernel K of order (0, 2), we re-define Ψ λn, λ 3, through ψ (u, y) = y, ψ 2 (u, y) = u and ψ 3 (u, y), setting ν = 0, k = 2, l = 3 and H(x,x 2,x 3 ) =[x μ (t)x 2 + tμ (t)x 3 ]/x 3. Then (3) follows y applying Theorem Asymptotic distriutions of covariance estimators Note that in model (), cov(y ij,y ik T ij,t ik ) = cov(x(t ij ), X(T ik )) + σ 2 δ jk, where δ jl isif j = k and 0 otherwise. Let C ij k = (Y ij ˆμ(T ij ))(Y ik ˆμ(T ik )) e the raw covariances, where ˆμ(t) is the estimated mean function otained from the previous step, for instance, ˆμ(t) =ˆμ N (t) or ˆμ(t) =ˆμ L (t). It is easy to see that E[C ij k T ij,t ik ] cov(x(t ij ), X(T ik )) + σ 2 δ jk. Therefore,

12 F. Yao / Journal of Multivariate Analysis 98 (2007) the diagonal of the raw covariances should e removed, i.e., only C ij k, j = k, should e included as input data for the covariance surface smoothing step, as previously oserved in Staniswalis and Lee [2] and Yao et al. [5]. Commonly used nonparametric regression estimators of the covariance surface, C(s,t) = E{[X(T ) μ(t )][X(T 2 ) μ(t 2 ) T = s, T 2 = t]}, are the two-dimensional Nadaraya Waston estimator and local linear estimator defined as follows: / n Ĉ N (s, t) = i= j =k K 2 ( s Tij h, t T ik h n ( s Tij K 2 h i= j =k n Ĉ L (s, t) = ˆβ 0 (s, t) =arg min β ) C ij k, t T ) ik, (6) h i= j =k K 2 ( s Tij h, t T ) ik h [C ij k f(β,(s,t),(t ij,t ik ))] 2, (7) where β = (β 0, β, β 2 ) and f(β,(s,t),(t ij,t ik )) = β 0 + β (T ij s) + β 2 (T ik t). Applying Theorem 2 to the aove estimators yields the asymptotic distriutions. Corollary 3. If assumptions (A.), (A.2), and (C.) (C3) hold with ν =0and k =2, then n N( N )h 2 D [Ĉ N (s, t) C(s,t)] N (γ N (s, t), v(s, t) K 2 2 ), f 2 (s, t) (8) where e is as in (C3), γ N (s, t) = e { d 2 C(s,t) 4 σ2 K 2 ds 2 f 2 (s, t) + d2 C(s,t) dt 2 f 2 (s, t) [ df2 (s, t) dc(s, t) +2 + df ]}/ 2(s, t) dc(s, t) f 2 (s, t), dt dt ds ds v(s, t) = var{(y μ(t ))(Y 2 μ(t 2 )) T = s, T 2 = t}, σ 2 K 2 = (u 2 + v 2 )K 2 (u,v)dudv, R 2 K 2 2 = K 2 R 2 2 (u,v)dudv. Proof. For two-dimensional Nadaraya Waston estimator Ĉ N (s, t) of the covariance C(s,t), one uses the raw oservations, C ij k = (Y ij ˆμ(T ij ))(Y ik ˆμ(T ik )). Let C ij k = (Y ij μ(t ij ))(Y ik μ(t ik )). Note that C ij k = C ij k + (Y ij μ(t ij ))(μ(t ik ) ˆμ(T ik )) + (Y ik μ(t ik ))(μ(t ij ) ˆμ(T ij )) +(μ(t ij ) ˆμ(T ij ))(μ(t ik ) ˆμ(T ik )). Applying Lemma in Yao et al. [6], one has the weak uniform convergence rate sup t T ˆμ(t) μ(t) =O p (/( n)) for oth possile choice of ˆμ(t) = ˆμ N (t) and ˆμ(t) = ˆμ L (t). Letting

13 52 F. Yao / Journal of Multivariate Analysis 98 (2007) (t,t 2,y,y 2 ) = (y μ(t ))(y 2 μ(t 2 )), 2 (t,t 2,y,y 2 ) = y μ(t ), and 3 (t,t 2,y,y 2 ), then sup t,s T Φ pn =O p (), for p =, 2, 3, y Lemma of Yao et al. [6]. This implies that sup t,s T Φ 2n O p (/( n)) = O p (/( n)) and sup t,s T Φ 3n O p (/( n)) = O p (/( n)). Since sup t T ˆμ(t) μ(t) 2 = O p (/(n)) are negligile compared to Φ n, the Nadaraya Waston estimator Ĉ N (s, t),ofc(s,t) otained from C ij k is asymptotically equivalent to that otained from C ij k, denoted y C N (t, s). Therefore, it is sufficient to show that the asymptotic distriution of C N (s, t) follows (8). Choose ν = (0, 0), k =2, (s,t,y,y 2 ) = (y μ(s))(y 2 μ(t)), 2 (s,t,y,y 2 ) and H(x,x 2 ) = x /x 2 in Theorem 2, then C N (s, t) = H(Ψ n, Ψ 2n ). To compute γ N (s, t), use DH(m,m 2 ) = (/m 2, m /m 2 2 ), and note m (s, t) = R 2 (y μ(s))(y 2 μ(t))g 2 (s,t,y,y 2 ) dy dy 2 = f 2 (s, t)c(s, t) and m 2 (s, t) = f 2 (s, t). One has (d 2 /dt 2 )m (s, t) =[(d 2 f 2 /dt 2 )C + 2(df 2 /dt)(dc/dt) + f 2 (d 2 C/dt 2 )](s, t), (d 2 /d 2 t)m 2 (s, t) = d 2 f 2 (s, t)/dt 2 and similar derivatives with respect to the argument s leading to the ias term in (2). For the asymptotic variance, note that ω = K 2 2 R 2 (y μ(s)) 2 (y 2 μ(t)) 2 g 2 (s,t,y,y 2 )dy dy 2 = E[(Y μ(t )) 2 (Y 2 μ(t 2 )) 2 T = s, T 2 = t)f 2 (s, t) K 2 2, ω 2 = ω 2 = K 2 2 f 2 (s, t)c(s, t), ω 22 = K 2 2 f 2 (s, t), and DH(m,m 2 ) = (/m 2, m /m 2 2 ), yielding the variance term in (2). Corollary 4. If the assumptions (A.), (A.2), and (C.) (C3) hold with ν =0 and k =2, then n N( N )h 2 [Ĉ L (s, t) C(s,t)] ( D e N 4 σ2 K 2 [d 2 C(s, t)/ds 2 + d 2 C(s, t)/dt 2 ], v(s, t) K 2 2 ), (9) f 2 (s, t) where e is as in (C3), v(s, t) = var{(y μ(t ))(Y 2 μ(t 2 )) T = s, T 2 = t), σ 2 K 2 = R 2 (u 2 + v 2 )K 2 (u,v)dudv, K 2 2 = R 2 K2 2(u,v)dudv. Proof. In analogy to the proof of Corollary 3, the local linear estimator Ĉ L (s, t) otained from C ij k is asymptotically equivalent to that otained from C ij k, denoted y C L (t, s). Also denote the solution to (7), after sustituting C ij k for C ij k,y β(s, t) = ( β 0 (s, t), β (s, t), β 2 (s, t)), and in fact β 0 (s, t) = C L (s, t). For simplicity, let W ij k = K 2 ((s T ij )/h, (t T ik )/h)/(nh 2 ) and i,j =k " is areviation of n i= j =k ". Algera calculations yield that i,j =k C C ij k W ij k β i,j =k L = W ij kt ij + β i,j =k W ij ks β 2 i,j =k W ij kt ik + β 2 i,j =k W ij kt i,j =k W, ij k β = R 00(S 0 S 02 S 0 S ) + R 0 (S 00 S 02 S 0 S 20 ) R 0 (S 00 S S 0 S 02 ) S 00 S 20 S 02 S 00 S 2 S2 0 S 02 + S 0 S 0 S + S 20 S 0 S S 0 S20 2, β 2 = R 00(S 0 S S 0 S 02 ) R 0 (S 00 S S 0 S 20 ) + R 0 (S 00 S 20 S0 2 ) S 00 S 20 S 02 S 00 S 2 S2 0 S 02 + S 0 S 0 S + S 20 S 0 S S 0 S20 2, where R pq = W ij k (T ij s) p (T ik t) q C ij k, i,j =k S pq = W ij k (T ij s) p (T ik t) q. i,j =k

14 F. Yao / Journal of Multivariate Analysis 98 (2007) Note that β and β 2 are local linear estimators of the partial derivatives of C(s,t), dc(s, t)/ds and dc(s, t)/dt, respectively. In analogy to the proof of Corollary 2, it can e shown that β (s, t) dc(s, t)/ds =O p (/ n(n )h 4 ) and β 2 (s, t) dc(s, t)/dt =O p (/ n N( N )h 4 ) y applying Theorem 2. Then one can sustitute dc(s, t)/ds, dc(s, t)/dt for β (s, t), β 2 (s, t) in C L (s, t), and denote the resulting estimator y CL (s, t). It is easy to see that lim n N( N )h 2 [C L (s, t) C(s,t)] = D lim n n n N( N )h 2 [CL (s, t) C(s,t)]. We define Φ λn, λ 4, through (s,t,y,y 2 ) = (y μ(s))(y 2 μ(t)), 2 (s,t,y,y 2 ) = s, 3 (s,t,y,y 2 ) and 4 (s,t,y,y 2 ) = t. Put ν = (0, 0), k =2and H(x,x 2,x 3,x 4 ) = [x x 2 dc(s, t)/ds x 4 dc(s, t)/dt]/x 3 + s dc(s, t)/ds + t dc(s, t)/dt. Then result (9) follows y applying Theorem Extension to fixed design In the studies of functional data, it is often the case that the repeated measurements are recorded densely and regularly over time y machine. For realistic reason, the measurements are assumed to e contaminated with experimental error. The asymptotic distriution results developed for random design that is often encountered in longitudinal data can e extended to fixed equally spaced design for typical functional data with slight modifications. Now the density functions g(t, y) is defined as the density of Y(t) at the fixed time t, g 2 (s,t,y,y 2 ) is the joint density of (Y (s), Y (t)) at (s, t), and g 4 (s,t,s,t,y,y 2,y,y 2 ) is the joint density of (Y (s), Y (t), Y (s ), Y (t )) at (s,t,s,t ), while f(t) for the random oservation time T is not needed. The fixed equally spaced design considered here is as follows: (A ) N i (n) = N(n), T i,j+ T i,j = T i,j + T i,j for j,j N, and T ij = T i j for i, i n and j N. The extension that will e discussed in this section is also applicale to the case that the vector of oservation times T i are independent and identically distriuted with N i = N and T i,j+ T i,j = T i,j + T i,j, which is in fact a case that lies etween random unalanced and fixed equally spaced designs. For the general result in Theorem, the weighted averages Ψ λn should e re-defined y Ψ λn = nn ν+ n N i= j= ψ λ (T ij,y ij )K ( t T ij Assumption (B3) now ecomes (B3 ) 0, nn ν+, N 0 and nn 2k+ d 2 with 0 d <. Then in analogy to the proof of Theorem with appropriate modifications, Theorem is still valid if the assumptions (A ), (B.) (B2.2), (B3 ), and (B4.) (B4.2) hold. For the general result of two-dimensional smoothing in Theorem 2, one modifies assumption (C3) as follows: (C3 ) h 0, nn 2 h ν +2 0, hn 3 0, and nn(n )h 2 k +2 e 2 for 0 e <. Then Theorem 2 holds under assumptions (A ), (C.) (B2.2), (C3 ), and (C4.) (C4.2). Therefore, Theorems and 2 can e applied to otain asymptotic distriutions of kernel-ased estimators for the mean and covariance functions under fixed equally spaced design. For instance, the asymptotic normal distriutions of the local linear estimators ˆμ L (t) and Ĉ L (s, t) are similar to ).

15 54 F. Yao / Journal of Multivariate Analysis 98 (2007) those in Corollaries 3 and 4, with f(t)replaced y / T and f(s,t)replaced y / T 2, where T is the length of the interval. 5. Simulation study A numerical study is conducted to evaluate the derived asymptotic properties. The key finding in this paper is that the asymptotic results for functional or longitudinal are comparale to those otained from independent data, i.e., the influence of within-suject covariance does not play significant role in determining the asymptotic ias and variance. For simplicity, we focus on the local polynomial mean estimators which are often superior to the Nadaraya Waston estimators. We first generated M = 200 samples consisting of n = 50 i.i.d. random trajectories each. Following model (), the simulated process has a mean function μ(t) = (t /2) 2,0 t which has a constant second derivative μ (2) (t) = 2, and a constant within-suject covariance function derived from a random intercept ξ i.i.d. N(0, λ ), where λ = 0.0 and (t) =, 0 t. The measurement error in () was set ε ij i.i.d. N(0, σ 2 ), where σ 2 = 0.0. A random design was used, where the numers of oservations for each suject N i were chosen from {2, 3, 4, 5} with equal likelihood and the locations of the oservations were uniformly distriuted on [0, ], i.e., T ij i.i.d. U[0, ]. For comparison, we generated M = 200 samples of n = 50 i.i.d. random trajectories which have the same structure as in model () ut no within-suject i.i.d. correlation. Letting ξ i = 0 and ε ij N(0, λ + σ 2 ) leads to independent data with the same mean and variance functions. Therefore, the two sets of data have the same asymptotic distriution for the local polynomial mean estimators. We also generated M = 200 correlated and independent samples, respectively, consisting of n = 200 trajectories each for demonstrating the asymptotic ehavior with the increasing sample size n. Here we use the Epanechnikov kernel function, i.e., K (u) = 3/4( u 2 ) [,] (u), where A (u) = ifu A and 0 otherwise for any set A. Note that n() 2k+ d 2 in (B3), μ (2) (t) = 2, var(y T = t) = λ + σ 2 = 0.02, and the design density f(t) =, where k = 2 for local polynomial estimators and is the andwidth used for the mean estimation. From the aove construction, one can calculate the asymptotic variance and ias of the local polynomial mean estimators μ L (t) using Corollary 2 which is in fact applicale for oth correlated and independent data. Since the ias and variance terms are oth constant in our simulation framework, for convenience we compare the asymptotic integrated squared ias and variance with the empirical integrated squared ias and variance otained using Monte Carlo average from M = 200 simulated samples ased on 0 E[{ˆμ L(t) μ(t)} 2 ] dt = 0 {ˆμ L(t) E[ˆμ L (t)]} 2 dt + 0 {E[ˆμ L(t)] μ(t)} 2 dt. The asymptotic integrated squared ias and variance are given y AIBIAS = 2 σ2 K 4, AIVAR = 0.02 K 2, (20) n N and the asymptotic integrated mean squared error AIMSE = AIBIAS + AIVAR, where σ 2 K = u 2 K (u) du, K 2 = K 2(u) du and N = (/n) n i= N i, while the empirical integrated squared ias, variance and mean squared error are denoted y EIBIAS, EIVAR and EIMSE, The asymptotic and empirical quantities, such as the integrated squared ias, variance and mean squared error, are shown in Fig. for the correlated/independent data with sample size n = 50/n = 200, respectively. From Fig., it is ovious that the asymptotic approximation is improved y increasing the sample size. The asymptotic quantities AIBIAS, AIVAR and AIMSE agree with the

16 F. Yao / Journal of Multivariate Analysis 98 (2007) x n=50 Correlated 3 x n=50 Independent log () log () 3 x x n=200 Correlated n=200 Independent log () log () Fig.. Shown are the empirical quantities (solid, including EIBIAS, EIVAR, EIMSE) and asymptotic quantities (dashed, including AIBIAS, AIVAR, AIMSE) versus log() for correlated (left panels) and independent (right panels) data with different sample sizes n = 50 (top panels) and n = 200 (ottom panels), where is the andwidth used in the smoothing. In each panel, the integrated squared ias is the one with increasing pattern, the integrated variance is the one with decreasing pattern, and they cross each other, while the integrated mean squared error, which is larger than oth integrated squared ias and variance for any andwidth, usually decreases first and then increases after reaching a minimum. empirical quantities EIBIAS, EIVAR and EIMSE for oth correlated and independent data. For the simulated data with the same sample size n, such asymptotic approximations for correlated and independent data are well comparale in pattern and magnitude. This provides the evidence that the within-suject correlation indeed does not have ovious influence on the asymptotic ehavior of the local polynomial estimators compared to the standard rate otained from independent data, which is consistent with our theoretical derivations. 6. Discussion In this paper, the asymptotic distriutions of kernel-ased nonparametric regression estimators for functional or longitudinal data are studied. In particular, it derives general results for asymptotic distriutions of real-valued functions with arguments which are functionals formed y weighted averages otained from longitudinal or functional data. These asymptotic distriution results are comparale to those otained from identically distriuted and independent data. The conditions for the validity of local property of kernel-ased estimators are provided in (B3), (C3), (B3 ) and (C3 ) with the presence of within-suject correlation in such data, under random unalanced

17 56 F. Yao / Journal of Multivariate Analysis 98 (2007) design descried in (A.) and (A.2), fixed equally spaced design descried in (A ), and some case lying etween them. The proposed results could also e extended to more complicated cases, such as panel data where oservations for different sujects are otained at a series of common time points during a longitudinal follow-up. If considering random design, the density of the jth oservation time T j could e assumed to e f j (t), then the results are readily applied to this case with appropriate modifications with respect to the different marginal densities. The general asymptotic distriution results in univariate and ivariate smoothing settings are applied to the kernel-ased estimators of the mean and covariance functions, which yields asymptotic normal distriutions of these estimators. To the est of our knowledge, there are no asymptotic distriution results availale in literature for nonparametric estimators of covariance function otained from oserved noisy longitudinal or functional data. This provides theoretical asis and practical guidance for the nonparametric analysis of functional or longitudinal data with important potential applications that are ased on the asymptotic distriutions. For example, asymptotic confidence ands or regions for the regression curve or the covariance surface can e constructed ased on their asymptotic distriutions. Since, due to their heavy computational load, commonly used procedures (such as cross-validation) for andwidth selection in two-dimensional settings are not feasile, one important research prolem is to seek efficient approaches for choosing such smoothing parameters. Also functional principal component analysis, an increasingly popular tool for functional data analysis, is ased on eigen-decomposition of the estimated covariance function. Thus, the influence of the asymptotic properties of covariance estimators on the estimated eigenfunctions is another potential research of interest. References [] P.K. Bhattacharya, H.G. Müller, Asymptotics for nonparametric regression, Sankhyā 55 (993) [2] G. Boente, R. Fraiman, Kernel-ased functional principal components, Statist. Proa. Lett. 48 (999) [3] H. Cardot, F. Ferraty, P. Sarda, Functional linear model, Statist. Proa. Lett. 45 (999) 22. [4] P. Diggle, P. Hergerty, K.Y. Liang, S. Zeger, Analysis of Longitudinal Data, Oxford Unversity Press, Oxford, [5] J. Fan, I. Gijels, Local Polynomial Modelling and Its Applications, Chapman & Hall, London, 996. [6] P. Hall, N.I. Fisher, B. Hoffmann, On the nonparametric estimation of covariance functions, Ann. Statist. 22 (994) [7] P. Hall, P. Patil, Properties of nonparametic estimators of autocovariance for stationary random fields, Proa. Theory Related Fields 99 (994) [8] J.D. Hart, T.E. Wehrly, Kernel regression estimation using repeated measurements data, J. Amer. Statist. Assoc. 8 (986) [9] X. Lin, R.J. Carroll, Nonparametric function estimation for clustered data when the predictor is measured without/with error, J. Amer. Statist. Assoc. 95 (2000) [0] J.O. Ramsay, J.B. Ramsey, Functional data analysis of the dynamics of the monthly index of nondurale goods production, J. Econom. 07 (2002) [] T.A. Sererini, J.G. Staniswalis, Quasi-likelihood estimation in semiparametric models, J. Amer. Statist. Assoc. 89 (994) [2] J.G. Staniswalis, J.J. Lee, Nonparametric regression analysis of longitudinal data, J. Amer. Statist. Assoc. 93 (998) [3] A.P. Veryla, B.R. Cullis, M.G. Kenward, S.J. Welham, The analysis of designed experiments and longitudinal data using smoothing splines (with discussion), Appl. Statist. 48 (999) [4] C.J. Wild, T.W. Yee, Additive extensions to generalized estimating equation methods, J. Roy. Statist. Soc., Ser B 58 (996) [5] F. Yao, H.G. Müller, A.J. Clifford, S.R. Dueker, J. Lin Follett, B.A.Y. Buchholz, J.S. Vogel, Shrinkage estimation for functional principal component scores with application to the population kinetics of plasma folate, Biometrics 59 (2003) [6] F. Yao, H.G. Müller, J.L. Wang, Functional data analysis for sparse longitudinal data, J. Amer. Statist. Assoc. 00 (2005)

Estimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement

Estimating a Finite Population Mean under Random Non-Response in Two Stage Cluster Sampling with Replacement Open Journal of Statistics, 07, 7, 834-848 http://www.scirp.org/journal/ojs ISS Online: 6-798 ISS Print: 6-78X Estimating a Finite Population ean under Random on-response in Two Stage Cluster Sampling

More information

Nonparametric Estimation of Smooth Conditional Distributions

Nonparametric Estimation of Smooth Conditional Distributions Nonparametric Estimation of Smooth Conditional Distriutions Bruce E. Hansen University of Wisconsin www.ssc.wisc.edu/~hansen May 24 Astract This paper considers nonparametric estimation of smooth conditional

More information

REGRESSING LONGITUDINAL RESPONSE TRAJECTORIES ON A COVARIATE

REGRESSING LONGITUDINAL RESPONSE TRAJECTORIES ON A COVARIATE REGRESSING LONGITUDINAL RESPONSE TRAJECTORIES ON A COVARIATE Hans-Georg Müller 1 and Fang Yao 2 1 Department of Statistics, UC Davis, One Shields Ave., Davis, CA 95616 E-mail: mueller@wald.ucdavis.edu

More information

FUNCTIONAL DATA ANALYSIS. Contribution to the. International Handbook (Encyclopedia) of Statistical Sciences. July 28, Hans-Georg Müller 1

FUNCTIONAL DATA ANALYSIS. Contribution to the. International Handbook (Encyclopedia) of Statistical Sciences. July 28, Hans-Georg Müller 1 FUNCTIONAL DATA ANALYSIS Contribution to the International Handbook (Encyclopedia) of Statistical Sciences July 28, 2009 Hans-Georg Müller 1 Department of Statistics University of California, Davis One

More information

Smoothing the Nelson-Aalen Estimtor Biostat 277 presentation Chi-hong Tseng

Smoothing the Nelson-Aalen Estimtor Biostat 277 presentation Chi-hong Tseng Smoothing the Nelson-Aalen Estimtor Biostat 277 presentation Chi-hong seng Reference: 1. Andersen, Borgan, Gill, and Keiding (1993). Statistical Model Based on Counting Processes, Springer-Verlag, p.229-255

More information

Properties of Principal Component Methods for Functional and Longitudinal Data Analysis 1

Properties of Principal Component Methods for Functional and Longitudinal Data Analysis 1 Properties of Principal Component Methods for Functional and Longitudinal Data Analysis 1 Peter Hall 2 Hans-Georg Müller 3,4 Jane-Ling Wang 3,5 Abstract The use of principal components methods to analyse

More information

Functional Data Analysis for Sparse Longitudinal Data

Functional Data Analysis for Sparse Longitudinal Data Fang YAO, Hans-Georg MÜLLER, and Jane-Ling WANG Functional Data Analysis for Sparse Longitudinal Data We propose a nonparametric method to perform functional principal components analysis for the case

More information

Spiking problem in monotone regression : penalized residual sum of squares

Spiking problem in monotone regression : penalized residual sum of squares Spiking prolem in monotone regression : penalized residual sum of squares Jayanta Kumar Pal 12 SAMSI, NC 27606, U.S.A. Astract We consider the estimation of a monotone regression at its end-point, where

More information

New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data

New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data ew Local Estimation Procedure for onparametric Regression Function of Longitudinal Data Weixin Yao and Runze Li The Pennsylvania State University Technical Report Series #0-03 College of Health and Human

More information

Shrinkage Estimation for Functional Principal Component Scores, with Application to the Population Kinetics of Plasma Folate

Shrinkage Estimation for Functional Principal Component Scores, with Application to the Population Kinetics of Plasma Folate Shrinkage Estimation for Functional Principal Component Scores, with Application to the Population Kinetics of Plasma Folate Fang Yao, Hans-Georg Müller,, Andrew J. Clifford, Steven R. Dueker, Jennifer

More information

Functional modeling of longitudinal data

Functional modeling of longitudinal data CHAPTER 1 Functional modeling of longitudinal data 1.1 Introduction Hans-Georg Müller Longitudinal studies are characterized by data records containing repeated measurements per subject, measured at various

More information

Mixture of Gaussian Processes and its Applications

Mixture of Gaussian Processes and its Applications Mixture of Gaussian Processes and its Applications Mian Huang, Runze Li, Hansheng Wang, and Weixin Yao The Pennsylvania State University Technical Report Series #10-102 College of Health and Human Development

More information

Functional Latent Feature Models. With Single-Index Interaction

Functional Latent Feature Models. With Single-Index Interaction Generalized With Single-Index Interaction Department of Statistics Center for Statistical Bioinformatics Institute for Applied Mathematics and Computational Science Texas A&M University Naisyin Wang and

More information

Single Equation Linear GMM with Serially Correlated Moment Conditions

Single Equation Linear GMM with Serially Correlated Moment Conditions Single Equation Linear GMM with Serially Correlated Moment Conditions Eric Zivot October 28, 2009 Univariate Time Series Let {y t } be an ergodic-stationary time series with E[y t ]=μ and var(y t )

More information

Single Equation Linear GMM with Serially Correlated Moment Conditions

Single Equation Linear GMM with Serially Correlated Moment Conditions Single Equation Linear GMM with Serially Correlated Moment Conditions Eric Zivot November 2, 2011 Univariate Time Series Let {y t } be an ergodic-stationary time series with E[y t ]=μ and var(y t )

More information

1 Systems of Differential Equations

1 Systems of Differential Equations March, 20 7- Systems of Differential Equations Let U e an open suset of R n, I e an open interval in R and : I R n R n e a function from I R n to R n The equation ẋ = ft, x is called a first order ordinary

More information

Fractal functional regression for classification of gene expression data by wavelets

Fractal functional regression for classification of gene expression data by wavelets Fractal functional regression for classification of gene expression data by wavelets Margarita María Rincón 1 and María Dolores Ruiz-Medina 2 1 University of Granada Campus Fuente Nueva 18071 Granada,

More information

Modeling Repeated Functional Observations

Modeling Repeated Functional Observations Modeling Repeated Functional Observations Kehui Chen Department of Statistics, University of Pittsburgh, Hans-Georg Müller Department of Statistics, University of California, Davis Supplemental Material

More information

I N S T I T U T D E S T A T I S T I Q U E B I O S T A T I S T I Q U E E T S C I E N C E S A C T U A R I E L L E S (I S B A)

I N S T I T U T D E S T A T I S T I Q U E B I O S T A T I S T I Q U E E T S C I E N C E S A C T U A R I E L L E S (I S B A) I N S T I T U T D E S T A T I S T I Q U E B I O S T A T I S T I Q U E E T S C I E N C E S A C T U A R I E L L E S (I S B A) UNIVERSITÉ CATHOLIQUE DE LOUVAIN D I S C U S S I O N P A P E R 2011/2 ESTIMATION

More information

Function of Longitudinal Data

Function of Longitudinal Data New Local Estimation Procedure for Nonparametric Regression Function of Longitudinal Data Weixin Yao and Runze Li Abstract This paper develops a new estimation of nonparametric regression functions for

More information

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model.

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model. Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model By Michael Levine Purdue University Technical Report #14-03 Department of

More information

FinQuiz Notes

FinQuiz Notes Reading 9 A time series is any series of data that varies over time e.g. the quarterly sales for a company during the past five years or daily returns of a security. When assumptions of the regression

More information

DESIGN-ADAPTIVE MINIMAX LOCAL LINEAR REGRESSION FOR LONGITUDINAL/CLUSTERED DATA

DESIGN-ADAPTIVE MINIMAX LOCAL LINEAR REGRESSION FOR LONGITUDINAL/CLUSTERED DATA Statistica Sinica 18(2008), 515-534 DESIGN-ADAPTIVE MINIMAX LOCAL LINEAR REGRESSION FOR LONGITUDINAL/CLUSTERED DATA Kani Chen 1, Jianqing Fan 2 and Zhezhen Jin 3 1 Hong Kong University of Science and Technology,

More information

Derivative Principal Component Analysis for Representing the Time Dynamics of Longitudinal and Functional Data 1

Derivative Principal Component Analysis for Representing the Time Dynamics of Longitudinal and Functional Data 1 Derivative Principal Component Analysis for Representing the Time Dynamics of Longitudinal and Functional Data 1 Short title: Derivative Principal Component Analysis Xiongtao Dai, Hans-Georg Müller Department

More information

Wavelet Regression Estimation in Longitudinal Data Analysis

Wavelet Regression Estimation in Longitudinal Data Analysis Wavelet Regression Estimation in Longitudinal Data Analysis ALWELL J. OYET and BRAJENDRA SUTRADHAR Department of Mathematics and Statistics, Memorial University of Newfoundland St. John s, NF Canada, A1C

More information

EVALUATIONS OF EXPECTED GENERALIZED ORDER STATISTICS IN VARIOUS SCALE UNITS

EVALUATIONS OF EXPECTED GENERALIZED ORDER STATISTICS IN VARIOUS SCALE UNITS APPLICATIONES MATHEMATICAE 9,3 (), pp. 85 95 Erhard Cramer (Oldenurg) Udo Kamps (Oldenurg) Tomasz Rychlik (Toruń) EVALUATIONS OF EXPECTED GENERALIZED ORDER STATISTICS IN VARIOUS SCALE UNITS Astract. We

More information

Appendices to the paper "Detecting Big Structural Breaks in Large Factor Models" (2013) by Chen, Dolado and Gonzalo.

Appendices to the paper Detecting Big Structural Breaks in Large Factor Models (2013) by Chen, Dolado and Gonzalo. Appendices to the paper "Detecting Big Structural Breaks in Large Factor Models" 203 by Chen, Dolado and Gonzalo. A.: Proof of Propositions and 2 he proof proceeds by showing that the errors, factors and

More information

Weakly dependent functional data. Piotr Kokoszka. Utah State University. Siegfried Hörmann. University of Utah

Weakly dependent functional data. Piotr Kokoszka. Utah State University. Siegfried Hörmann. University of Utah Weakly dependent functional data Piotr Kokoszka Utah State University Joint work with Siegfried Hörmann University of Utah Outline Examples of functional time series L 4 m approximability Convergence of

More information

Fisher information for generalised linear mixed models

Fisher information for generalised linear mixed models Journal of Multivariate Analysis 98 2007 1412 1416 www.elsevier.com/locate/jmva Fisher information for generalised linear mixed models M.P. Wand Department of Statistics, School of Mathematics and Statistics,

More information

FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS

FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS FUNCTIONAL DATA ANALYSIS FOR VOLATILITY PROCESS Rituparna Sen Monday, July 31 10:45am-12:30pm Classroom 228 St-C5 Financial Models Joint work with Hans-Georg Müller and Ulrich Stadtmüller 1. INTRODUCTION

More information

Fixed-b Inference for Testing Structural Change in a Time Series Regression

Fixed-b Inference for Testing Structural Change in a Time Series Regression Fixed- Inference for esting Structural Change in a ime Series Regression Cheol-Keun Cho Michigan State University imothy J. Vogelsang Michigan State University August 29, 24 Astract his paper addresses

More information

Density estimators for the convolution of discrete and continuous random variables

Density estimators for the convolution of discrete and continuous random variables Density estimators for the convolution of discrete and continuous random variables Ursula U Müller Texas A&M University Anton Schick Binghamton University Wolfgang Wefelmeyer Universität zu Köln Abstract

More information

Lecture 4: Numerical solution of ordinary differential equations

Lecture 4: Numerical solution of ordinary differential equations Lecture 4: Numerical solution of ordinary differential equations Department of Mathematics, ETH Zürich General explicit one-step method: Consistency; Stability; Convergence. High-order methods: Taylor

More information

B y t = γ 0 + Γ 1 y t + ε t B(L) y t = γ 0 + ε t ε t iid (0, D) D is diagonal

B y t = γ 0 + Γ 1 y t + ε t B(L) y t = γ 0 + ε t ε t iid (0, D) D is diagonal Structural VAR Modeling for I(1) Data that is Not Cointegrated Assume y t =(y 1t,y 2t ) 0 be I(1) and not cointegrated. That is, y 1t and y 2t are both I(1) and there is no linear combination of y 1t and

More information

Modeling Multi-Way Functional Data With Weak Separability

Modeling Multi-Way Functional Data With Weak Separability Modeling Multi-Way Functional Data With Weak Separability Kehui Chen Department of Statistics University of Pittsburgh, USA @CMStatistics, Seville, Spain December 09, 2016 Outline Introduction. Multi-way

More information

Empirical Dynamics for Longitudinal Data

Empirical Dynamics for Longitudinal Data Empirical Dynamics for Longitudinal Data December 2009 Short title: Empirical Dynamics Hans-Georg Müller 1 Department of Statistics University of California, Davis Davis, CA 95616 U.S.A. Email: mueller@wald.ucdavis.edu

More information

Nonparametric Modal Regression

Nonparametric Modal Regression Nonparametric Modal Regression Summary In this article, we propose a new nonparametric modal regression model, which aims to estimate the mode of the conditional density of Y given predictors X. The nonparametric

More information

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon

Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Discussion of the paper Inference for Semiparametric Models: Some Questions and an Answer by Bickel and Kwon Jianqing Fan Department of Statistics Chinese University of Hong Kong AND Department of Statistics

More information

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model

Luis Manuel Santana Gallego 100 Investigation and simulation of the clock skew in modern integrated circuits. Clock Skew Model Luis Manuel Santana Gallego 100 Appendix 3 Clock Skew Model Xiaohong Jiang and Susumu Horiguchi [JIA-01] 1. Introduction The evolution of VLSI chips toward larger die sizes and faster clock speeds makes

More information

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y

More information

Issues on quantile autoregression

Issues on quantile autoregression Issues on quantile autoregression Jianqing Fan and Yingying Fan We congratulate Koenker and Xiao on their interesting and important contribution to the quantile autoregression (QAR). The paper provides

More information

Modeling Sparse Generalized Longitudinal Observations With Latent Gaussian Processes

Modeling Sparse Generalized Longitudinal Observations With Latent Gaussian Processes Modeling Sparse Generalized Longitudinal Observations With Latent Gaussian Processes Peter Hall,2, Hans-Georg Müller,3 and Fang Yao 4 December 27 SUMMARY. In longitudinal data analysis one frequently encounters

More information

1. Define the following terms (1 point each): alternative hypothesis

1. Define the following terms (1 point each): alternative hypothesis 1 1. Define the following terms (1 point each): alternative hypothesis One of three hypotheses indicating that the parameter is not zero; one states the parameter is not equal to zero, one states the parameter

More information

Minimizing properties of geodesics and convex neighborhoods

Minimizing properties of geodesics and convex neighborhoods Minimizing properties of geodesics and convex neighorhoods Seminar Riemannian Geometry Summer Semester 2015 Prof. Dr. Anna Wienhard, Dr. Gye-Seon Lee Hyukjin Bang July 10, 2015 1. Normal spheres In the

More information

1. Stochastic Processes and Stationarity

1. Stochastic Processes and Stationarity Massachusetts Institute of Technology Department of Economics Time Series 14.384 Guido Kuersteiner Lecture Note 1 - Introduction This course provides the basic tools needed to analyze data that is observed

More information

Dynamic Relations for Sparsely Sampled Gaussian Processes

Dynamic Relations for Sparsely Sampled Gaussian Processes TEST manuscript No. (will be inserted by the editor) Dynamic Relations for Sparsely Sampled Gaussian Processes Hans-Georg Müller Wenjing Yang Received: date / Accepted: date Abstract In longitudinal studies,

More information

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)

More information

Generalized Seasonal Tapered Block Bootstrap

Generalized Seasonal Tapered Block Bootstrap Generalized Seasonal Tapered Block Bootstrap Anna E. Dudek 1, Efstathios Paparoditis 2, Dimitris N. Politis 3 Astract In this paper a new lock ootstrap method for periodic times series called Generalized

More information

arxiv: v3 [math.st] 10 Dec 2018

arxiv: v3 [math.st] 10 Dec 2018 Moving Block and Tapered Block Bootstrap for Functional Time Series with an Application to the K-Sample Mean Prolem arxiv:7.0070v3 [math.st] 0 Dec 208 Dimitrios PILAVAKIS, Efstathios PAPARODITIS and Theofanis

More information

HIGH-DIMENSIONAL GRAPHS AND VARIABLE SELECTION WITH THE LASSO

HIGH-DIMENSIONAL GRAPHS AND VARIABLE SELECTION WITH THE LASSO The Annals of Statistics 2006, Vol. 34, No. 3, 1436 1462 DOI: 10.1214/009053606000000281 Institute of Mathematical Statistics, 2006 HIGH-DIMENSIONAL GRAPHS AND VARIABLE SELECTION WITH THE LASSO BY NICOLAI

More information

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy

Computation Of Asymptotic Distribution. For Semiparametric GMM Estimators. Hidehiko Ichimura. Graduate School of Public Policy Computation Of Asymptotic Distribution For Semiparametric GMM Estimators Hidehiko Ichimura Graduate School of Public Policy and Graduate School of Economics University of Tokyo A Conference in honor of

More information

The choice of weights in kernel regression estimation

The choice of weights in kernel regression estimation Biometrika (1990), 77, 2, pp. 377-81 Printed in Great Britain The choice of weights in kernel regression estimation BY THEO GASSER Zentralinstitut fur Seelische Gesundheit, J5, P.O.B. 5970, 6800 Mannheim

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

Buckling Behavior of Long Symmetrically Laminated Plates Subjected to Shear and Linearly Varying Axial Edge Loads

Buckling Behavior of Long Symmetrically Laminated Plates Subjected to Shear and Linearly Varying Axial Edge Loads NASA Technical Paper 3659 Buckling Behavior of Long Symmetrically Laminated Plates Sujected to Shear and Linearly Varying Axial Edge Loads Michael P. Nemeth Langley Research Center Hampton, Virginia National

More information

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 Rejoinder Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 1 School of Statistics, University of Minnesota 2 LPMC and Department of Statistics, Nankai University, China We thank the editor Professor David

More information

Time Series. Anthony Davison. c

Time Series. Anthony Davison. c Series Anthony Davison c 2008 http://stat.epfl.ch Periodogram 76 Motivation............................................................ 77 Lutenizing hormone data..................................................

More information

Spectrum Opportunity Detection with Weak and Correlated Signals

Spectrum Opportunity Detection with Weak and Correlated Signals Specum Opportunity Detection with Weak and Correlated Signals Yao Xie Department of Elecical and Computer Engineering Duke University orth Carolina 775 Email: yaoxie@dukeedu David Siegmund Department of

More information

AMS 147 Computational Methods and Applications Lecture 13 Copyright by Hongyun Wang, UCSC

AMS 147 Computational Methods and Applications Lecture 13 Copyright by Hongyun Wang, UCSC Lecture 13 Copyright y Hongyun Wang, UCSC Recap: Fitting to exact data *) Data: ( x j, y j ), j = 1,,, N y j = f x j *) Polynomial fitting Gis phenomenon *) Cuic spline Convergence of cuic spline *) Application

More information

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t,

For a stochastic process {Y t : t = 0, ±1, ±2, ±3, }, the mean function is defined by (2.2.1) ± 2..., γ t, CHAPTER 2 FUNDAMENTAL CONCEPTS This chapter describes the fundamental concepts in the theory of time series models. In particular, we introduce the concepts of stochastic processes, mean and covariance

More information

Motivation: Can the equations of physics be derived from information-theoretic principles?

Motivation: Can the equations of physics be derived from information-theoretic principles? 10. Physics from Fisher Information. Motivation: Can the equations of physics e derived from information-theoretic principles? I. Fisher Information. Task: To otain a measure of the accuracy of estimated

More information

Supplementary Material to General Functional Concurrent Model

Supplementary Material to General Functional Concurrent Model Supplementary Material to General Functional Concurrent Model Janet S. Kim Arnab Maity Ana-Maria Staicu June 17, 2016 This Supplementary Material contains six sections. Appendix A discusses modifications

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

A Stickiness Coefficient for Longitudinal Data

A Stickiness Coefficient for Longitudinal Data A Stickiness Coefficient for Longitudinal Data June 2011 Andrea Gottlieb 1 Graduate Group in Biostatistics University of California, Davis 1 Shields Avenue Davis, CA 95616 U.S.A. Phone: 1 (530) 752-2361

More information

Additive Isotonic Regression

Additive Isotonic Regression Additive Isotonic Regression Enno Mammen and Kyusang Yu 11. July 2006 INTRODUCTION: We have i.i.d. random vectors (Y 1, X 1 ),..., (Y n, X n ) with X i = (X1 i,..., X d i ) and we consider the additive

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Version of record first published: 01 Jan 2012

Version of record first published: 01 Jan 2012 This article was downloaded by: [University of California, Los Angeles (UCLA)] On: 27 July 212, At: 11:46 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954

More information

Estimation of a quadratic regression functional using the sinc kernel

Estimation of a quadratic regression functional using the sinc kernel Estimation of a quadratic regression functional using the sinc kernel Nicolai Bissantz Hajo Holzmann Institute for Mathematical Stochastics, Georg-August-University Göttingen, Maschmühlenweg 8 10, D-37073

More information

Nonlinear time series

Nonlinear time series Based on the book by Fan/Yao: Nonlinear Time Series Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna October 27, 2009 Outline Characteristics of

More information

Functional quasi-likelihood regression models with smooth random effects

Functional quasi-likelihood regression models with smooth random effects J. R. Statist. Soc. B (2003) 65, Part 2, pp. 405 423 Functional quasi-likelihood regression models with smooth random effects Jeng-Min Chiou National Health Research Institutes, Taipei, Taiwan and Hans-Georg

More information

Testing equality of autocovariance operators for functional time series

Testing equality of autocovariance operators for functional time series Testing equality of autocovariance operators for functional time series Dimitrios PILAVAKIS, Efstathios PAPARODITIS and Theofanis SAPATIAS Department of Mathematics and Statistics, University of Cyprus,

More information

ON FLATNESS OF NONLINEAR IMPLICIT SYSTEMS

ON FLATNESS OF NONLINEAR IMPLICIT SYSTEMS ON FLATNESS OF NONLINEAR IMPLICIT SYSTEMS Paulo Sergio Pereira da Silva, Simone Batista Escola Politécnica da USP Av. Luciano Gualerto trav. 03, 158 05508-900 Cidade Universitária São Paulo SP BRAZIL Escola

More information

MATH 205C: STATIONARY PHASE LEMMA

MATH 205C: STATIONARY PHASE LEMMA MATH 205C: STATIONARY PHASE LEMMA For ω, consider an integral of the form I(ω) = e iωf(x) u(x) dx, where u Cc (R n ) complex valued, with support in a compact set K, and f C (R n ) real valued. Thus, I(ω)

More information

arxiv: v1 [stat.ml] 10 Apr 2017

arxiv: v1 [stat.ml] 10 Apr 2017 Integral Transforms from Finite Data: An Application of Gaussian Process Regression to Fourier Analysis arxiv:1704.02828v1 [stat.ml] 10 Apr 2017 Luca Amrogioni * and Eric Maris Radoud University, Donders

More information

Diagnostics for Linear Models With Functional Responses

Diagnostics for Linear Models With Functional Responses Diagnostics for Linear Models With Functional Responses Qing Shen Edmunds.com Inc. 2401 Colorado Ave., Suite 250 Santa Monica, CA 90404 (shenqing26@hotmail.com) Hongquan Xu Department of Statistics University

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

CONCEPT OF DENSITY FOR FUNCTIONAL DATA

CONCEPT OF DENSITY FOR FUNCTIONAL DATA CONCEPT OF DENSITY FOR FUNCTIONAL DATA AURORE DELAIGLE U MELBOURNE & U BRISTOL PETER HALL U MELBOURNE & UC DAVIS 1 CONCEPT OF DENSITY IN FUNCTIONAL DATA ANALYSIS The notion of probability density for a

More information

arxiv: v2 [stat.ap] 17 Nov 2008

arxiv: v2 [stat.ap] 17 Nov 2008 The Annals of Applied Statistics 2008, Vol. 2, No. 3, 1056 1077 DOI: 10.1214/08-AOAS172 c Institute of Mathematical Statistics, 2008 arxiv:0805.0463v2 [stat.ap] 17 Nov 2008 DISTANCE-BASED CLUSTERING OF

More information

7 Day 3: Time Varying Parameter Models

7 Day 3: Time Varying Parameter Models 7 Day 3: Time Varying Parameter Models References: 1. Durbin, J. and S.-J. Koopman (2001). Time Series Analysis by State Space Methods. Oxford University Press, Oxford 2. Koopman, S.-J., N. Shephard, and

More information

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004 Estimation in Generalized Linear Models with Heterogeneous Random Effects Woncheol Jang Johan Lim May 19, 2004 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure

More information

Econ 582 Nonparametric Regression

Econ 582 Nonparametric Regression Econ 582 Nonparametric Regression Eric Zivot May 28, 2013 Nonparametric Regression Sofarwehaveonlyconsideredlinearregressionmodels = x 0 β + [ x ]=0 [ x = x] =x 0 β = [ x = x] [ x = x] x = β The assume

More information

Asymptotic efficiency of simple decisions for the compound decision problem

Asymptotic efficiency of simple decisions for the compound decision problem Asymptotic efficiency of simple decisions for the compound decision problem Eitan Greenshtein and Ya acov Ritov Department of Statistical Sciences Duke University Durham, NC 27708-0251, USA e-mail: eitan.greenshtein@gmail.com

More information

Travel Grouping of Evaporating Polydisperse Droplets in Oscillating Flow- Theoretical Analysis

Travel Grouping of Evaporating Polydisperse Droplets in Oscillating Flow- Theoretical Analysis Travel Grouping of Evaporating Polydisperse Droplets in Oscillating Flow- Theoretical Analysis DAVID KATOSHEVSKI Department of Biotechnology and Environmental Engineering Ben-Gurion niversity of the Negev

More information

Additive modelling of functional gradients

Additive modelling of functional gradients Biometrika (21), 97,4,pp. 791 8 C 21 Biometrika Trust Printed in Great Britain doi: 1.193/biomet/asq6 Advance Access publication 1 November 21 Additive modelling of functional gradients BY HANS-GEORG MÜLLER

More information

14 - Gaussian Stochastic Processes

14 - Gaussian Stochastic Processes 14-1 Gaussian Stochastic Processes S. Lall, Stanford 211.2.24.1 14 - Gaussian Stochastic Processes Linear systems driven by IID noise Evolution of mean and covariance Example: mass-spring system Steady-state

More information

A Reduced Rank Kalman Filter Ross Bannister, October/November 2009

A Reduced Rank Kalman Filter Ross Bannister, October/November 2009 , Octoer/Novemer 2009 hese notes give the detailed workings of Fisher's reduced rank Kalman filter (RRKF), which was developed for use in a variational environment (Fisher, 1998). hese notes are my interpretation

More information

The Airy function is a Fredholm determinant

The Airy function is a Fredholm determinant Journal of Dynamics and Differential Equations manuscript No. (will e inserted y the editor) The Airy function is a Fredholm determinant Govind Menon Received: date / Accepted: date Astract Let G e the

More information

Time-Varying Functional Regression for Predicting Remaining Lifetime Distributions from Longitudinal Trajectories

Time-Varying Functional Regression for Predicting Remaining Lifetime Distributions from Longitudinal Trajectories Time-Varying Functional Regression for Predicting Remaining Lifetime Distributions from Longitudinal Trajectories Hans-Georg Müller and Ying Zhang Department of Statistics, University of California, One

More information

Modelling Non-linear and Non-stationary Time Series

Modelling Non-linear and Non-stationary Time Series Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September

More information

Efficient Estimation for the Partially Linear Models with Random Effects

Efficient Estimation for the Partially Linear Models with Random Effects A^VÇÚO 1 33 ò 1 5 Ï 2017 c 10 Chinese Journal of Applied Probability and Statistics Oct., 2017, Vol. 33, No. 5, pp. 529-537 doi: 10.3969/j.issn.1001-4268.2017.05.009 Efficient Estimation for the Partially

More information

Alternatives to Basis Expansions. Kernels in Density Estimation. Kernels and Bandwidth. Idea Behind Kernel Methods

Alternatives to Basis Expansions. Kernels in Density Estimation. Kernels and Bandwidth. Idea Behind Kernel Methods Alternatives to Basis Expansions Basis expansions require either choice of a discrete set of basis or choice of smoothing penalty and smoothing parameter Both of which impose prior beliefs on data. Alternatives

More information

Adaptive Piecewise Polynomial Estimation via Trend Filtering

Adaptive Piecewise Polynomial Estimation via Trend Filtering Adaptive Piecewise Polynomial Estimation via Trend Filtering Liubo Li, ShanShan Tu The Ohio State University li.2201@osu.edu, tu.162@osu.edu October 1, 2015 Liubo Li, ShanShan Tu (OSU) Trend Filtering

More information

A Stickiness Coefficient for Longitudinal Data

A Stickiness Coefficient for Longitudinal Data A Stickiness Coefficient for Longitudinal Data Revised Version November 2011 Andrea Gottlieb 1 Department of Statistics University of California, Davis 1 Shields Avenue Davis, CA 95616 U.S.A. Phone: 1

More information

white noise Time moving average

white noise Time moving average 1.3 Time Series Statistical Models 13 white noise w 3 1 0 1 0 100 00 300 400 500 Time moving average v 1.5 0.5 0.5 1.5 0 100 00 300 400 500 Fig. 1.8. Gaussian white noise series (top) and three-point moving

More information

Asymptotic inference for a nonstationary double ar(1) model

Asymptotic inference for a nonstationary double ar(1) model Asymptotic inference for a nonstationary double ar() model By SHIQING LING and DONG LI Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong maling@ust.hk malidong@ust.hk

More information

6 Pattern Mixture Models

6 Pattern Mixture Models 6 Pattern Mixture Models A common theme underlying the methods we have discussed so far is that interest focuses on making inference on parameters in a parametric or semiparametric model for the full data

More information

Polynomial Chaos and Karhunen-Loeve Expansion

Polynomial Chaos and Karhunen-Loeve Expansion Polynomial Chaos and Karhunen-Loeve Expansion 1) Random Variables Consider a system that is modeled by R = M(x, t, X) where X is a random variable. We are interested in determining the probability of the

More information

Modeling Repeated Functional Observations

Modeling Repeated Functional Observations Modeling Repeated Functional Observations Kehui Chen Department of Statistics, University of Pittsburgh, Hans-Georg Müller Department of Statistics, University of California, Davis ABSTRACT We introduce

More information

A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers

A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers 6th St.Petersburg Workshop on Simulation (2009) 1-3 A Note on Data-Adaptive Bandwidth Selection for Sequential Kernel Smoothers Ansgar Steland 1 Abstract Sequential kernel smoothers form a class of procedures

More information

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives

Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives TR-No. 14-06, Hiroshima Statistical Research Group, 1 11 Illustration of the Varying Coefficient Model for Analyses the Tree Growth from the Age and Space Perspectives Mariko Yamamura 1, Keisuke Fukui

More information

Upper Bounds for Stern s Diatomic Sequence and Related Sequences

Upper Bounds for Stern s Diatomic Sequence and Related Sequences Upper Bounds for Stern s Diatomic Sequence and Related Sequences Colin Defant Department of Mathematics University of Florida, U.S.A. cdefant@ufl.edu Sumitted: Jun 18, 01; Accepted: Oct, 016; Pulished:

More information