Modelling Survival Events with Longitudinal Data Measured with Error

Size: px
Start display at page:

Download "Modelling Survival Events with Longitudinal Data Measured with Error"

Transcription

1 Modelling Survival Events with Longitudinal Data Measured with Error Hongsheng Dai, Jianxin Pan & Yanchun Bao First version: 14 December 29 Research Report No. 16, 29, Probability and Statistics Group School of Mathematics, The University of Manchester

2 Modelling survival events with longitudinal data measured with error Hongsheng Dai, University of Brighton, UK E mail: h.dai@brighton.ac.uk; Jianxin Pan, University of Manchester, UK; Yanchun Bao, Yunnan Normal University, China Abstract In survival analysis, time-dependent covariates are usually available as longitudinal data collected periodically and with error. The longitudinal data can be assumed to follow a linear mixed effect model and Cox regression models are usually used for survival events. The hazard rate of survival times depends on the underlying time-dependent covariate which is described by random effects. Most methods proposed for such models assume a parametric distribution assumption on the random effects and specify a normally distributed error term for the linear mixed effect model. These assumptions may not be valid in practice. This paper proposes a new likelihood method for Cox regression models with error-contaminated time-dependent covariates. The new method does not require any parametric distribution assumption on random effects and random Supported by grant Z from Chun Hui Plan Scientific Research Project of the Ministry of Education of the People s Republic of China. 1

3 errors. Large sample properties for the estimator are derived and simulation results demonstrate that sometimes the new method is more efficient than existing methods. Key words: Censored data; Longitudinal measurements; Partial likelihood; Proportional hazard model. 1 Introduction Cox proportional hazard model (Cox, 1972) is widely used to study the relationship between survival events and time-dependent or time-independent covariates, which assumes that the failure time hazard rate function λ(s) relates to covariates through λ(s) = λ (s) exp(γw (s)). To implement the above Cox model most existing methods require the time dependent covariate process W (s) to be fully observed. In practice, however, W (s) is measured intermittently and with error. In other words, we only observe longitudinal measurements W j at some time points t j, j = 1,, m, where W j = W (t j ) + ɛ j and ɛ j is the error term. Substituting mis-measured values for true covariates in Cox models leads to biased estimates (Prentice, 1982). Recent studies focus on joint modelling of survival events and longitudinal measurements. The latent time-dependent process W (s) is usually assumed to be W (s) = ω + ω 1 s, which may be generalized to more complex polynomial models. Here ω and ω 1 are random effects. Such assumptions can be used to study the effects of potentially mis-measured time-dependent covariates on the failure time. Many existing research 2

4 works (DeGruttola and Tu, 1994; Faucett and Thomas, 1996; Henderson et.al., 2; Wulfson and Tsiatis, 1997) assumed that the random effects and random error follow Gaussian distributions and they used EM algorithms or Bayesian approaches to deal with the unobserved covariate process W (s). However, the normal distribution assumption on random effects and random errors may not be true in practice. Misspecification of the distributions of random effects or random errors will result in biased estimates (Tisiatis and Davidian, 21; Song and Huang, 25; Wang, 26). Hu et.al. (1998) and Song et.al. (22b) relaxed the normal assumption by assuming that the density of underlying covariates belong to a smooth class. These approaches involve intensive EM-algorithm computation. Huang and Wang (2) and Song and Huang (25) proposed a corrected score approach which does not require any distribution assumption on random effects and the error term ɛ. Their models assume that the underlying covariate W is time-independent. In addition the corrected score method requires replicated covariate observations for each subject. In many applications, however, the underlying covariates are time-dependent and replicated covariate observations for each subject may not be available. Recently another interesting method is proposed by Wang (26). With the normal distribution assumption for error terms, Wang method does not put any distribution restriction on W (s). This method is based on the assumption that the hazard rate depends on the time-independent random effects, not the time-dependent underlying process W (s). Another estimator, conditional score estimator, is proposed by Tisiatis and Davidian (21). When the hazard rate λ(s) depends on the time-dependent underlying process and the error term is normally distributed, conditional score (CS) estimator does not require any distribution assumption on W (s). 3

5 In this paper, we that the failure time hazard rate depends on a time-dependent covariate process. With the normal distribution assumption on ɛ, a simple workinglikelihood (SWL) estimator is proposed without any distribution assumption on ω and ω 1. The simple working-likelihood estimator is proved to be consistent and asymptotically normal under some regular conditions. A consistent covariance estimator is also provided. Then we relax the normal distribution assumption on ɛ. A generalized working-likelihood (GWL) estimator is introduced for such cases. Consistency and asymptotic distributions of the GWL estimator are also shown. Simulation studies demonstrate that when the error term ɛ follows a normal distribution, the SWL estimator is as efficient as the CS estimator. Numerical studies also show that the GWL estimator works very well and it is more efficient than SWL and CS estimators when ɛ and (ω, ω 1 ) are both non-normally distributed. This paper is organized as follows. Models and notations are given in Section 2. In Section 3 we propose the simple working-likelihood estimator. The generalized working-likelihood estimator is provided in Section 4. Numerical studies and discussions are given in sections 5 and 6. 2 Models and notations Let Z i (s) be the observed time-dependent covariate and W i (s) be the unobserved timedependent process. Throughout this paper for simplicity we assume that W i (s) is a univariate process, though the proposed methods can be extended to a multivariate unobserved time-dependent process. We also assume that W i (s) = q 1 j= ω ijs j. Here ω i = (ω i,, ω i,q 1 ) is the random effect for the ith subject. We cannot observe W i (s) 4

6 but observe m i longitudinal measurements W i = { W ij, j = 1,, m i } as W ij = W i (t ij ) + ɛ ij, (1) at ordered times t i = (t i1,, t i,mi ) T. We assume that ω i is independent of t i. Let ɛ i = (ɛ i1,, ɛ i,mi ). Throughout this paper we do not put any distribution assumption on ω i. Let T i be the survival time of the ith subject. In practice, we may not observer T i for all subjects. Instead, we only observe T i = min(t i, C i ) and δ i = I(T i C i ), where the censoring variable C i is independent of T i. There is additional censoring at τ, which is the end time of the experiment. We assume that T i and C i are independent of t i and ɛ i. Cox proportional hazard model assumes that the hazard rate is a function of covariates through the following form, λ i (s) = λ (s) exp(γw i (s)+β T Z i (s)) where λ (t) is an arbitrary baseline hazard function. Define counting processes dn i (s) = I[s T i s + ds, δ i = 1, t iq s and at-risk processes Y i (s) = I[ T i s, t iq s. We have E(dN i (s) F i,s ) = λ (s)ds exp(γw i (s) + β T Z i (s))y i (s), (2) where F i,s is the filtration generated from σ-fields σ{ T i u, δ i, ω i, Z i (u), u s} and σ{t iq u, u τ}. The log-partial likelihood function for parameter θ = (γ, β T ) T is l(θ) = n 1 i [γwi (s) + β T Z i (s) log E () (W, θ, s) dn i (s), (3) where E () (W, θ, s) = n 1 i E() i (W i, θ, s) := n 1 i exp[γw i(s) + β T Z i (s)y i (s). If 5

7 W i (s) is fully observed, then we can maximize l(θ) by solving the following equations U (1) (θ, τ) := n 1 i τ [ (Wi ) (s) E(1) (W, θ, s) dn Z i (s) E () i (s) =, (4) (W, θ, s) where E (1) (W, θ, s) = E() (W,θ,s) (γ,β T ) T and E (1) (W, θ, s) = n 1 i E (1) i (W i, θ, s) := n 1 i ( Wi (s), Z i (s) ) T T () E i (W i, θ, s). Using martingale theories and under some regular conditions, it is straightforward to show that the maximum likelihood estimate based on (3) is consistent. When W i (s) is measured with error and periodically, the score functions in (4) are not available. To solve this problem, a naive approach is to estimate W i (s) using the Least Square estimates and then to apply these estimates as the covariates. Another method is to use the regression calibration estimator, which is to replace W i (s) with its conditional expectation given the longitudinal measurements. These approaches, however, introduce a severe bias to the estimate of γ. Detailed discussions and comparisons can be found in Tisiatis and Davidian (21) and Wang (26). Based on a sufficient statistic for W i (s) Tisiatis and Davidian (21) proposed a conditional score estimator, when no distribution assumption is given to the random effect ω i. The conditional score estimator is more efficient than the naive approach and regression calibration. 3 A simple working likelihood estimator Throughout this section, we assume that ɛ i in (1) has normal distribution N(, σ 2 I mi ). We assume that ω i, i = 1,, n are i.i.d. with an unknown common distribution. 6

8 3.1 Unbiased working log-likelihood function Let Ŵi(s) be the ordinary LSE of W i (s) using all the longitudinal observations W i. This requires at least q longitudinal measurements on subject i. Let s = (1, s,, s q 1 ) T and A i = 1 t i,1 t q 1 i,1.. 1 t i,mi t q 1 i,m i... Then we have Var(Ŵi(s) W i (s)) = σ 2 v i (s) where v i (s) := s T (A T i A i ) 1 s. A consistent estimator (Tisiatis and Davidian, 21) for σ 2 is ˆσ 2 = i I[m i>qr i i I[m i>q(m i q), where R i is the residual sum of squares for subject i for the least squares fit to all the m i observations. Let Ê() (θ, σ 2, s) = n 1 i Ê() i (θ, σ 2, s) and [ Ê () i (θ, σ 2, s) := exp γŵi(s) γ2 σ 2 v i (s) + β T Z i (s) Y i (s). 2 We know that given W i (s), the LSE Ŵi(s) is normally distributed. Thus given W i (s) the random variable exp[γŵi(s) has log-normal distribution. Using the well-known results for the expectation of a log-normal distribution, we have E{exp[γŴi(s) W i (s)} = exp[γw i + γ 2 σ 2 v i (s)/2, where E means expectation. Therefore E[Ê() i (θ, σ 2, s) W i (s) = E () i (W i, θ, s) (5) which implies EÊ() i (θ, σ 2, s) = EE () i (W i, θ, s). Under some regular conditions (see Appendix A, C.2) and according to (5) we have 7

9 that, in probability, lim Ê () (θ, σ 2, s) = lim E () (W, θ, s) := e () (θ, s). (6) n n When σ 2 is known we consider the working likelihood function, ˆln (θ, σ 2 ) = n 1 i [ γŵi(s) + β T Z i (s) log Ê() (θ, σ 2, s) dn i (s). (7) From (6) we know that ˆl n (θ, σ 2 ) is a working likelihood function asymptotically unbiased to l(θ) given in (3). 3.2 The maximum likelihood estimator To maximize ˆl n (θ, σ 2 ) given in (7), we solve the following equations, Û (1) γ (θ, σ 2, τ) := n 1 i Û (1) β (θ, σ 2, τ) := n 1 i τ τ [ Ŵ i (s) Ê(1) γ (θ, σ 2, s) dn i (s) =, Ê () (θ, σ 2, s) Ê (1) β (θ, σ Z 2, s) i (s) dn i (s) =, (8) Ê () (θ, σ 2, s) where Ê (1) γ (θ, σ 2, s) = Ê() (θ, σ 2, s) γ Ê (1) β (θ, σ 2, s) = Ê() (θ, σ 2, s) β = n 1 i = n 1 i [Ŵi (s) σ 2 v i (s)γ Z i (s)ê() i (θ, σ 2, s). Ê() i (θ, σ 2, s), 8

10 Let Ê(1) (θ, σ 2, s) = (Ê(1) γ (θ, σ 2, s), Ê(1) β (θ, σ 2, s) T ) T and Û (1) (θ, σ 2, s) = (Û (1) γ (θ, σ 2, s), Û (1) β (θ, σ 2, s) T ) T. Then we can write the estimating equations as Û (1) (θ, σ 2, τ) := n 1 i τ [ ) (Ŵi (s) Ê(1) (θ, σ 2, s) dn i (s) =. (9) Z i (s) Ê () (θ, σ 2, s) Since ˆσ 2 is a consistent estimator for σ 2 and (9) is an estimating equation asymptotically unbiased to (4), we conclude that (9) with σ 2 replaced by ˆσ 2 is still an asymptotically unbiased estimating equation. Theorem 3.1. Let ˆθ be the estimated value by solving Û (1) (θ, ˆσ 2, τ) = and θ be the true parameter. Under some regular conditions given in Appendix A, we have lim n ˆθ = θ in probability. Proof. Under some regular conditions l(θ) is concave. The theorem then follows from the facts that ˆl n (θ, σ 2 ) is asymptotically unbiased to l(θ) and that under some mild regular conditions l(θ) has a unique maximum at θ = θ. Let N(s) = i N i(s)/n. A consistent estimator for λ (s)ds is ˆλ (s)ds = d N(s) Ê () (ˆθ,ˆσ 2,s). 9

11 3.3 Covariance estimation Let Ê(1) Ê (2) (θ, σ (θ, σ 2 2, s), s) := θ = n 1 i [Ŵi(s) σ 2 v i (s)γ 2 σ 2 v i (s) Z i (s)[ŵi(s) σ 2 v i (s)γ [Ŵi(s) σ 2 v i (s)γz i (s) T Z i (s) Z i (s) T Ê() i (θ, σ 2, s) and ˆV (θ, σ 2, s) := Ê(2) (θ, σ 2, s) Ê () (θ, σ 2, s) } {Ê(1) 2 (θ, σ 2, s). Ê () (θ, σ 2, s) For any vector a, the notation a 2 denotes the outer product aa. Under regular conditions C.2 and C.3 in Appendix A, ˆV (θ, σ 2, s) converges uniformly to v(θ, s) given in C.3 in Appendix A and n 1 ˆV (θ, ˆσ 2, s) i dn i(s) converges to v(θ, s)e () (θ, s)λ (s)ds in probability. The asymptotic normality of ˆθ is established by the following theorem. Theorem 3.2. We have n 1/2 (ˆθ θ ) N(, R), with R = I(θ, τ) 1 Σ U (θ, σ 2, τ)i(θ, τ) 1, where Σ U (θ, σ 2, τ) = lim n V ar[ nû (1) (θ, ˆσ 2, τ) and I(θ, τ) = v(θ, s)e () (θ, s)λ (s)ds. A consistent estimator for I(θ, τ) is Î(ˆθ, ˆσ 2, τ) = 1 n i ˆV (ˆθ, ˆσ 2, s)dn i (s). 1

12 Define ˆΦ i (θ, ˆσ 2, t) = t t [ ) (Ŵi (s) Ê(1) (θ, ˆσ 2, s) dn i (s) (1) Z i (s) Ê () (θ, ˆσ 2, s) [ (Ŵi (s) ˆσ 2 ) v i (s)γ Ê(1) (θ, ˆσ 2, s) Ê () Z i (s) Ê () (θ, ˆσ 2 i (θ, ˆσ 2, s)ˆλ (s)ds., s) A consistent estimator for Σ U (θ, σ 2, t) is ˆΣ U (ˆθ, ˆσ 2, t) = 1 n n ˆΦ i (ˆθ, ˆσ 2, t) 2 + i n ˆΦ i (ˆθ, ˆσ 2, t) ˆΦ 1 (ˆθ, ˆσ 2, t). (11) i=2 Thus a consistent estimator for R is ˆR = Î(ˆθ, ˆσ 2, τ) 1 ˆΣU (ˆθ, ˆσ 2, τ)î(ˆθ, ˆσ 2, τ) 1. 4 A general working likelihood estimator Throughout this section, we relax the normal distribution assumption for ɛ i and only assume ɛ ij, j = 1,, m i, i = 1,, n are i.i.d. with mean and finite variance. We also assume that given ω i, ( W ij, t ij ), j = 1,, m i are i.i.d. pairs. 4.1 Unbiased log-likelihood function and unbiased estimating equation Let ξ i (s) = Ŵi(s) W i (s). We have ξ i (s) = s T (A T i A i ) 1 A T i ɛ i. (12) 11

13 Note that ξ i (s), i = 1,, n may not have the same distribution since the number of longitudinal measurements of each subject, m i, may not be the same. Suppose each subject has M extra longitudinal observations i.e. ( W ij,1, t ij,1 ), j = 1,, M, which are i.i.d pairs and are also independent of ( W ij, t ij ), j = 1,, m i. The Least- Squares estimate based on the extra longitudinal observations is denoted by Ŵi,1(s). Let ξ i,1 (s) = Ŵi,1(s) W i (s). Since ξ i,1 (s) has a similar expression as that in (12) and each subject has M replicated longitudinal measurements, we know that ξ i,1 (s), i = 1,, n are i.i.d. random variables. Let ϕ (k) (γ, s) = E[ξ i,1 (s) k exp(γξ i,1 (s)), k =, 1, 2 and Ĕ () i (θ, s) = E () i (Ŵi,1, θ, s). Note that Ĕ() i (θ, s) is E () i (W i,1, θ, s) with W i replaced by Ŵi,1, the Least Squares estimate based on the M extra longitudinal observations. Since ξ i (s) ξ i,1 (s) = Ŵi(s) Ŵi,1(s), we have Ĕ () i (θ, s) ϕ () (γ, s) = E () i (Ŵi, θ, s) exp(γξ i (s) γξ i,1 (s)) = E() i (Ŵi, θ, s) exp(γξ i (s)) exp[γξ i,1 (s) ϕ () (γ, s) 1 ϕ () (γ, s) = E () i (W i, θ, s) exp[γξ i,1(s) ϕ () (γ, s). Therefore E[Ĕ() i (θ, s)/ϕ () (γ, s) W i (s) = E () i (W i, θ, s). If let Ĕ() (θ, s) = n 1 i Ĕ() i (θ, s), then we have lim n Ĕ () (θ, s)/ϕ () (γ, s) = lim n E () (θ, s) = e () (θ, s). Similarly as the results in Section 3, we have an unbiased log-likelihood function as ln (θ) = n 1 i [ γŵi(s) + β T Z i (s) log Ĕ() (θ, s) dn ϕ () i (s). (13) (γ, s) 12

14 Let Ψ = { } ψ 1 (γ, s) = ϕ(1) (γ, s) ϕ () (γ, s), ψ 2(γ, s) = ϕ(2) (γ, s). ϕ () (γ, s) Then unbiased estimating equations are Ŭ (1) (θ, Ψ, τ) := n 1 i τ [ ) (Ŵi (s) Ĕ(1) (θ, Ψ, s) dn i (s) =, (14) Z i (s) Ĕ () (θ, s) where Ĕ (1) (θ, Ψ, s) = Ĕ() (θ, s) θ := n 1 i ) (Ŵi,1 (s) ψ 1 (γ, s) Ĕ () i (θ, s). Z i (s) Let Ĕ(2) (θ, Ψ, s) = Ĕ(1) (θ,ψ,s). We have θ Ĕ (2) (θ, Ψ, s) = n 1 i Ŵ i,1 (s) ψ 1 (γ, s) Z i (s) 2 ψ 2 (γ, s) ψ 1 (γ, s) 2 Ĕ() i (θ, s). Thus the derivative of Ŭ (1) (θ, Ψ, τ) with respect to θ is Ĭ(θ, Ψ, τ) = 1 n i V (θ, Ψ, s)dn i (s) where V (θ, Ψ, s) = Ĕ(2) (θ, Ψ, s) Ĕ () (θ, s) } {Ĕ(1) 2 (θ, Ψ, s). Ĕ () (θ, s) Note that if we replace Ψ = {ψ 1 (γ, s), ψ 2 (γ, s)} in (14) by its consistent estimator, then we can calculate MLE by solving score functions Ŭ (1) (θ, ˆΨ, τ) =. 13

15 4.2 Consistent estimator for Ψ Suppose that for each ( W ij,1, t ij,1 ), j = 1,, M, there are another two replicated data sets ( W ij,r, t ij,r ), j = 1,, M, r = 2, 3. Based on the two replicated data sets we can find the LSEs Ŵi,r(s), r = 2, 3. Let ξ i,r (s) = Ŵi,r(s) W i (s). For r = 2, 3, we can calculate i.i.d. values ξ i,1 (s) ξ i,r (s) = Ŵi(s) Ŵi,r(s), i = 1,, n. We then have the following theorem. Theorem 4.1. Let ˆψ 1 (γ, s) := ˆψ 2 (γ, s) := i (ξ i,1(s) ξ i,2 (s)) exp(γξ i,1 (s) γξ i,3 (s)) i exp(γξ ; i,1(s) γξ i,3 (s)) i (ξ i,1(s) ξ i,2 (s)) 2 exp(γξ i,1 (s) γξ i,3 (s)) i exp(γξ i,1(s) γξ i,3 (s)) i σ2 s T [A T i,2a i,2 1 s exp(γξ i,1 (s) γξ i,3 (s)) i exp(γξ, i,1(s) γξ i,3 (s)) where A i,2 is the regressor matrix for the second replicated data set ( W ij,2, t ij,2 ), j = 1,, M of subject i. We have ˆψ 1 (γ, s) ψ 1 (γ, s) and ˆψ 2 (γ, s) ψ 2 (γ, s) in probability as n. With the definition of ˆΨ = ( ˆψ 1 (γ, s), ˆψ 2 (γ, s)) given in the above theorem, we have the unbiased estimating equations as Ŭ (1) (θ, ˆΨ, τ) := n 1 i τ [ ) (Ŵi (s) Ĕ(1) (θ, ˆΨ, s) dn i (s) =. (15) Z i (s) Ĕ () (θ, s) Similarly as the proof for Theorem 3.1 we can show that the estimated value θ by solving (15) is consistent. 14

16 4.3 Covariance estimation Similarly as the results in Section 3, we can show that n( θ θ) converges weakly to a normal distribution N(, R). Let λ (s)ds = d N(s)/Ĕ() ( θ, s) and Φ i (θ, ˆΨ, t) = t [ ) (Ŵi (s) Ĕ(1) (θ, ˆΨ, s) dn i (s) Z i (s) Ĕ () (θ, s) [ (Ŵi (s) ˆψ ) 1 (γ, s) t Z i (s) Ĕ(1) (θ, ˆΨ, s) Ĕ () (θ, s) Ĕ () i (θ, s) λ (s)ds. A consistent estimate for R is [ R = Ĭ( θ, ˆΨ, τ) 1 1 Φ i ( θ, n ˆΨ, τ) 2 + i n Φ i ( θ, ˆΨ, τ) Φ 1 ( θ, ˆΨ, τ) Ĭ( θ, ˆΨ, τ) 1. i=2 4.4 Constructing the three groups of replicated longitudinal measurements The advantage of the above method is that it makes no distribution assumption on random effects and random errors. But this also leads to its drawback that for each subject it needs three extra longitudinal data sets, {( W ij,r, t ij,r ), j = 1,, M}, r = 1, 2, 3. Note that although in practice the replicated longitudinal observations ( W ij,r, t ij,r ), j = 1,, M, r = 1, 2, 3 are not available directly, we can achieve replicated data sets in the following way. We may choose M = q+2 or M = q+3 and select 3M longitudinal measurements from each individual, if it has no less than 3M longitudinal observations. The 3M longitudinal measurements will be partitioned randomly into three groups as 15

17 ( W ij,r, t ij,r ), j = 1,, M, r = 1, 2, 3, which can be viewed as replicated longitudinal observations. These measurements will be used to calculate the consistent estimator ˆΨ. Then we keep the first group of longitudinal measurements ( W ij,1, t ij,1 ), j = 1,, M unchanged and the rest of longitudinal observations for each subject are denoted as ( W ij, t ij ), j = 1,, m i. The larger value of M, the smaller variance for ξ i,r (s). Thus a larger value of M causes smaller variances of ˆΨ and the estimating equation (15). We will expect that a larger value of M leads to better estimates of θ. On the other hand, using the above method for constructing the three replicated data sets, subjects with less than 3M longitudinal observations have to be ignored when estimating ˆΨ. If we choose a very large value of M, then too many subjects will not be taken into account for the estimation of ˆΨ. This may result in a larger variance of ˆΨ and further lead to a poor estimate for θ. In summary, we need to choose M as large as possible, conditioning on that most subjects have at least 3M longitudinal measurements. Effects on the parameter estimates by choosing different values of M are discussed in the following section through simulation studies. 5 Simulation studies and data analysis 5.1 Simulation studies We consider simulation scenarios in Tisiatis and Davidian (21) where for simplicity there is a single time-dependent covariate W i (s) and no time-independent covariates are involved in the proportional hazard model. We choose a modified version of the 16

18 scenarios in Tisiatis and Davidian (21). We assume W i (s) = ω i + ω i1 s. Two different distributions for (ω i, ω i1 ) are considered. They are (i) (ω i, ω i1 ) is from a bivariate normal distribution with mean (3.173,.13) and covariance matrix D with elements D = (D 11, D 12, D 22 ) = (1.24,.39,.3); (ii) (ω i, ω i1 ) follows a mixture of bivariate normal distribution, with mixing proportion.5 and mixture component N(µ k, D k ), k = 1, 2, where µ 1 = (6.173,.13) T, µ 2 = (2.173,.13) T and D 1 = D 2 = D. The maximum number of longitudinal observations for each subject is 24 and nominal times of observation for W i (s) are t i = {8 + 3j, j =,, 23}. Survival times are generated from model λ i (s) = exp(γw i (s)) with γ = 1. The censoring distribution is exponential with mean 15 and with additional censoring at 8. We also consider two scenarios for the distribution of error terms: (a) a normally distributed error with distribution N(,.5); (b) the error term has a mixture normal distribution,.7n(.7,.1) +.3N(1.633,.1). In each scenario sample sizes are chosen to be n = 2 and 5 Monte Carlo data sets were generated. The parameter γ was estimated using four different methods: (1) using the ideal estimator that could be obtained by fitting by partial likelihood with true values of W i (s); (2) using the conditional score (CS) estimator; (3) using the simple working likelihood (SWL) method; (4) using the generalized working likelihood (GWL) method with M = 4, 5. Other methods such as naive regression or method of last value carried forward are not considered in the simulation studies since they are much less efficient than the conditional score estimator (Tisiatis and Davidian, 21; Huang and Wang, 2). Table 1 is about here. 17

19 When the error term is normally distributed, from Table 1 we can see that the CS estimator and the SWL estimator work as well as the ideal estimator with respect to the bias. The GWL estimators with M = 4, 5 also have very small bias. The GWL estimators have larger standard error estimates than the other two estimators, since the GWL method does not make the normal assumption for random error term. When the error term is not normally distributed, results are summarized in Table 2. Table 2 is about here. We can see that the bias of CS estimate and SWL estimate increase since they are valid only for normal random errors, but the GWL estimates did not change much and they are very stable. We conclude that when random errors are not normally distributed, the GWL estimators with M = 4, 5 have much smaller bias than SWL and CS estimators. We also investigated other choices for σ 2 and found similar results. As the variance of ɛ ij increases, the standard errors of all estimators increase. This is because larger variance of ɛ ij results in larger variance for estimating equations and further lead to larger standard errors for our estimators. Tisiatis and Davidian (21) pointed out that the estimating equation of conditional score method may have multi-roots. We also investigated the multi-roots problem for estimating equations of both methods. The typical score plots are shown in Figure 1. We can see that all methods have a solution close to the truth. The generalized working likelihood estimating equations have a single root. The conditional score method and the simple working likelihood method have multiple solutions. In practice as Tisiatis and Davidian (21) suggested we may choose the naive regression estimate as starting value to locate the correct estimate. 18

20 Figure 1 is about here. We also find out that when using generalized working likelihood method and choosing M = 4, the score function may have an outlier solution. In Figure 2 the solution for M = 4 are outliers while the solution for M = 5 is the truth. The outliers will result in poor estimate for standard error and means. Similarly as Song and Huang (25), when all estimates exist and are non-outliers, general working likelihood estimates are stable and have small bias, regardless of the distributions of random effects and error terms. Figure 2 is about here. 5.2 Data analysis To demonstrate the proposed methods we use the primary biliary cirrhosis (PBC) data set collected by the Mayo Clinic from 1974 to PBC is a chronic disease that can, little by little, destroy some of the bile ducts linking your liver to your gut. When PBC damages your bile ducts, bile can no longer flow through them. Instead it builds up in the liver, damaging the liver cells and causing inflammation and scarring. In the clinical trial, survival status and laboratory results such as serum bilirubin of 312 patients were recorded. In this clinical trail, 158 out of 312 patients took the drug D-penicillamine and the other patients are in the control group. Serum bilirubin are measured at irregularly time points, recorded until death or censoring. Among the 312 patients, the maximum number of repeated measurement is 16. More detailed discussion of the trail and data set can be found in Ding and Wang (28) and Fleming and Harrington (1991). We here consider the biomarker serum bilirubin as the covariate process W i (s) and 19

21 treatment type as the time-independent covariate Z i in the Cox regression model λ i (s) = λ (s) exp(γw i (s) + βz i ) and longitudinal model W i (s) = ω i + ω i1 s. In our analysis we took logarithm transformation on the serum bilirubin values as the longitudinal measurements. Six typical patients serum bilirubin values are plotted in Figure 3. Figure 3 is about here. We fit the model using conditional score method, simple working-likelihood method and generalized working-likelihood method. The maximum number of longitudinal observations for each subject is 16 and there are 32 patients who have more than 12 measurements on serum bilirubin. We choose M = 4 when using the generalized working-likelihood method. This means that 3M = 12 longitudinal measurements are selected and randomly partitioned into three groups as replicated observations to estimate Ψ. The results are shown in Table 3. The three approaches provide similar results. All three methods suggest that the coefficient estimate ˆγ for the latent process of serum bilirubin is non-significant. The coefficient estimate based on baseline serum bilirubin in Fleming and Harrington (1991) is.8, significantly unequal to. This suggests that the baseline serum bilirubin is a risk factor for survival times but serum bilirubin at later times after treatment are not. Fleming and Harrington (1991) also studied the treatment effect of D-penicillamine to PBC and their result is that the treatment is non-significant. From Table 3 we can see that all three methods give the same result that the coefficient estimate ˆβ for treatment Z i is not significantly unequal to. Table 3 is about here. 2

22 6 Discussion We have proposed new methods for joint modelling of survival events and error-contaminated time-dependent covariates. The estimators are easily computed and their large sample properties are shown. We suggest using the generalized working likelihood method in practice, since it does not lose much efficiency when the random error is normally distributed and it results in the least bias if the error term is not normally distributed. When using the general working likelihood method, we need replicated observations. Even if replicated observations do not exists, we still obtain replicates from the longitudinal observations using the method in Section 4. When partitioning the 3M longitudinal measurements into three groups, we need to do it randomly for each individual. Note that we cannot partition the 3M measurements into three groups such that t ij,1 < t ik,2 < t il,3 for j, k, l = 1,, M. This is because doing in such way ( W ij,r, t ij,r, j = 1,, M) will not have the same distribution for different values of r and the large sample properties of the estimator cannot be guaranteed. All the existing parametric or nonparametric correction methods assume that the observation times t ij are non-informative. In practice we may observe error-contaminated longitudinal points collected at informative observation times (Liang et.al., 29). For such problems the existing methods are not valid. This is left as future research work. A Regular conditions C.1 The time τ is such that τ λ (s)ds <. 21

23 C.2 Let E (k) (W, θ, s) = n 1 i E (k) i (W i, θ, s) := n 1 i ( ) k Wi (s) E () i (W i, θ, s), k =, 1, 2. Z i (s) There exists a neighborhood Θ of θ and, respectively, scalar, vector and matrix functions e (), e (1) and e (2) defined on Θ [, τ such that, sup E (k) (W, θ, s) e (k) (θ, s) s [,τ,θ Θ in probability as n. C.3 Let v = e (2) /e () [e (1) /e () 2. Then for all θ Θ and s τ, θ e() (θ, s) = e (1) (θ, s) 2 θ 2 e() (θ, s) = e (2) (θ, s). C.4 The matrix Σ(θ, τ) = v(θ, s)e () (θ, s)λ (s)ds is positive definite. 22

24 B Proof of Theorem 3.2 Proof. If nû (1) (θ, ˆσ 2, τ) converges weakly to a normal distribution, using the firstorder Taylor extension for Û (1) (θ, ˆσ 2, t) we have n(ˆθ θ ) N(, R), where R = I(θ, τ) 1 Σ U (θ, σ 2, τ)i(θ, τ) 1 and Σ U (θ, σ 2, τ) = lim n V ar[ nû (1) (θ, ˆσ 2, τ). So we only need to show the asymptotic property for nû (1) (θ, ˆσ 2, τ). We can write n Û (1) (θ, ˆσ 2, t) = n 1/2 i n 1/2 i t t [ ) (Ŵi (s) Ê(1) (θ, ˆσ 2, s) dn i (s) Z i (s) Ê () (θ, ˆσ 2, s) [ (Ŵi (s) ˆσ 2 v i (s)γ Z i (s) ) Ê(1) (θ, ˆσ 2, s) Ê () (θ, ˆσ 2, s) Ê () i (θ, ˆσ 2, s)λ (s)ds which is equivalent to n Û (1) (θ, ˆσ 2, t) = n 1/2 i n 1/2 i +n 1/2 i t [ ) (Ŵi (s) e(1) (θ, s) Z i (s) e () dn i (s) (θ, s) [ (Ŵi (s) ˆσ 2 ) v i (s)γ e(1) (θ, s) Z i (s) e () (θ, s) ( e (1) (θ, s) e () (θ, s) Ê (1) (θ, ˆσ 2, s) t t := I II + III. Ê () (θ, ˆσ 2, s) Ê () i (θ, ˆσ 2, s)λ (s)ds ) ( ) dn i (s) Ê() i (θ, ˆσ 2, s)λ (s)ds We know that III = o p (1), since under regular conditions E[dN i (s) Ê() i (θ, σ 2, s)λ (s)ds W i (s) = e and sup (1) (θ,s) Ê(1) (θ,s) θ,s e () (θ,s) = o(1). Thus we can write Ê () (θ,s) n Û (1) (θ, ˆσ 2, t) = I II + o p (1) = n 1/2 i Φ i (θ, ˆσ 2, t) + o p (1) 23

25 where Φ i (θ, σ 2, t) is t [ ) (Ŵi (s) e(1) (θ, s) Z i (s) e () dn i (s) (θ, s) t [ (Ŵi (s) σ 2 v i (s)γ Z i (s) ) e(1) (θ, s) e () Ê () i (θ, σ 2, s)λ (s)ds. (θ, s) Since Φ i (θ, σ 2, t), i = 1,, n are i.i.d. random variables, we know that n 1/2 i Φ i(θ, σ 2, t) and n 1/2 i [Φ i(θ, ˆσ 2, t) Φ i (θ, σ 2, t) both converge weakly to a Gaussian process. Therefore we have nû (1) (θ, ˆσ 2, t) converges weakly to a normal distribution with mean and covariance matrix [ Σ U (θ, σ 2 1, t) = lim n n n Φ i (θ, ˆσ 2, t) 2 + i=1 n Φ i (θ, ˆσ 2, t) Φ 1 (θ, ˆσ 2, t) i=2 According to the definition of ˆΦ i (θ, ˆσ 2, t) given in Theorem 3.2, we know that a consistent estimator for Σ U (θ, σ 2, t) is ˆΣ U (ˆθ, ˆσ 2, t) = 1 n n ˆΦ i (ˆθ, ˆσ 2, t) 2 + i n ˆΦ i (θ, ˆσ 2, t) ˆΦ 1 (θ, ˆσ 2, t). i=2 Thus a consistent estimator for R is ˆR = Î(ˆθ, ˆσ 2, τ) 1 ˆΣU (ˆθ, ˆσ 2, τ)î(ˆθ, ˆσ 2, τ) 1. C Proof of Theorem 4.1 The theorem follows from the obvious results ϕ (1) (s) ϕ () (s) = E[ξ i,1(s) exp(γξ i,1 (s)) E[exp(γξ i,1 (s)) = E[(ξ i,1(s) ξ i,2 (s)) exp(γξ i,1 (s) γξ i,3 (s)), E[exp(γξ i,1 (s) γξ i,3 (s)) 24

26 ϕ (2) (s) ϕ () (s) = E[ξ i,1(s) 2 exp(γξ i,1 (s)) E[exp(γξ i,1 (s)) = E[{(ξ i,1(s) ξ i,2 (s)) 2 ξ i,2 (s) 2 } exp(γξ i,1 (s) γξ i,3 (s)) E[exp(γξ i,1 (s) γξ i,3 (s)) and E[ξ i,2 (s) 2 = σ 2 E [ s T [A T i,2a i,2 1 s. References Cox D. R. (1972). Regression models and life tables (with discussion). Journal of the Royal Statistical Society, B, 34: Dafni U. G. and Tsiatis A. A. (1998). Evaluating surrogate markers of clinical outcome measured with error. Biometrics, 54: DeGruttola V. and Tu X. M. (1994). Modeling progression of CD-4 lymphocyte count and its relationship to survival time. Biometrics, 5: Faucett C. J. and Thomas D. C. (1996). Simultaneously modelling censored survival data and repeatedly measured covariates: A Gibbs sampling approach. Statstics in Medicine, 1: Fleming T. R. and Harrington D. P. (1991). Counting Processes and Survival Analysis, John Wiley & Sons, Inc. Henderson R., Diggle P. and Dobson A. (2). Joint modelling of longitudinal measurements and event time data. Biostatistics, 1: Hu P., Tsiatis A. A. and Davidian M. (1998). Estimating the parameters in the Cox model when covariate variables are measured with error. Biometrics, 54:

27 Huang Y. and Wang C. Y. (2). Cox regression with accurate covariates unascertainable: A nonparametric-correction approach. Journal of the American Statistical Association, 95: Liang Yu. and Lu W. and Ying Z. (29). Joint modeling and analysis of longitudinal data with informative observation times. Biometrics, 65: Nakamura T. (199). Corrected score function for errors in variables models: methodology and application to generalized linear models. Biometrika, 77: Prentice R. (1982). Covariate measurement errors and parameter estimates in a failure time regression measurement error models. Biometrika, 74: Song X., Davidian M. and Tsiatis A. A. (22a). An estimator for the proportional hazards model with multiple longitudinal covariates measured with error. Biostatistics, 3: Song X., Davidian M. and Tsiatis A. A. (22b). A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika, 88: Jimin Ding and Jane-Ling Wang. (28). Modeling longitudinal data with nonparametric multiplicative random effects jointly with survival data. Biometrics, 64: Tisiatis A. A. and Davidian M. (21). A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika, 88: Tisiatis A. A. and DeGruttola V. and Wulfsohn M. S. (1995). Modeling the relationship 26

28 of survival to longitudinal data measured with error: Applications to survival and CD4 counts in patients with AIDS. J. Am. Statist. Assoc., 9: Song X. and Huang Y. (25). On corrected score approach for proportional hazards model with covariate measurement error. Biometrics, 61: Wang C. Y. and Hsu L. and Feng Z. D. and Prentice R. L. (1997). Regression Calibration in Failure Time Regression. Biometrics, 53: Wang C. Y. (26). Corrected score estimator for joint modelling of longitudinal and failure time data. Statistica Sinica, 16: Wulfson M. S. and Tsiatis A. A. (1997). A joint model for survival and longitudianl data measured with error. Biometrics, 53:

29 Table 1: Simulation results for two underlying random effect distributions, when error term is N(,.5). I, ideal ; CS, conditional score estimator; SWL, simple working likelihood estimator; GWL, Generalized working likelihood estimator with M = 4, 5; SD, Monte Carlo standard deviation; SE, average of estimated standard errors. n = 2 Normal covariate Mixture covariate Method Mean SD SE Mean SD SE I CS SWL GWL, GWL,

30 Table 2: Simulation results for two underlying random effect distributions, when error term is mixed normal as.7n(.7,.1)+.3n(1.633,.1). I, ideal ; CS, conditional score estimator; SWL, simple working likelihood estimator; GWL, Generalized working likelihood estimator with M = 4, 5; SD, Monte Carlo standard deviation; SE, average of estimated standard errors. n = 2 Normal covariate Mixture covariate Method Mean SD SE Mean SD SE I CS SWL GWL GWL Table 3: Results for the PBC data. ˆγ (sd) ˆβ (sd) CS -.29 (.69).53 (.265) SWL -.33 (.55).54 (.16) GWL -.4 (.99).77 (.248) 29

31 5 5 4 Solid line: I Dash line: CS Dot line: SWL 4 Solid line: I Dot line: GWL,M=4 Dash dot line: GWL,M= U( γ ) U( γ ) γ γ Figure 1: Typical score plots for simulation data sets. I, ideal method; CS, conditional score method; SWL, simple working likelihood method; GWL, generalized working likelihood method Solid line : Dash line : GWL,M=5 GWL,M=4 2 1 U( γ ) γ Figure 2: Typical score plots for a simulation data set. likelihood method. GWL, generalized working 3

32 log(serum bilirubin) Time (in days) Figure 3: Longitudinal observation plot for six patients. 31

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Review Article Analysis of Longitudinal and Survival Data: Joint Modeling, Inference Methods, and Issues

Review Article Analysis of Longitudinal and Survival Data: Joint Modeling, Inference Methods, and Issues Journal of Probability and Statistics Volume 2012, Article ID 640153, 17 pages doi:10.1155/2012/640153 Review Article Analysis of Longitudinal and Survival Data: Joint Modeling, Inference Methods, and

More information

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements [Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements Aasthaa Bansal PhD Pharmaceutical Outcomes Research & Policy Program University of Washington 69 Biomarkers

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL

A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL A TWO-STAGE LINEAR MIXED-EFFECTS/COX MODEL FOR LONGITUDINAL DATA WITH MEASUREMENT ERROR AND SURVIVAL Christopher H. Morrell, Loyola College in Maryland, and Larry J. Brant, NIA Christopher H. Morrell,

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

SEMIPARAMETRIC APPROACHES TO INFERENCE IN JOINT MODELS FOR LONGITUDINAL AND TIME-TO-EVENT DATA

SEMIPARAMETRIC APPROACHES TO INFERENCE IN JOINT MODELS FOR LONGITUDINAL AND TIME-TO-EVENT DATA SEMIPARAMETRIC APPROACHES TO INFERENCE IN JOINT MODELS FOR LONGITUDINAL AND TIME-TO-EVENT DATA Marie Davidian and Anastasios A. Tsiatis http://www.stat.ncsu.edu/ davidian/ http://www.stat.ncsu.edu/ tsiatis/

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited

Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited Biometrics 62, 1037 1043 December 2006 DOI: 10.1111/j.1541-0420.2006.00570.x Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited Fushing Hsieh, 1 Yi-Kuan Tseng, 2 and Jane-Ling

More information

On the covariate-adjusted estimation for an overall treatment difference with data from a randomized comparative clinical trial

On the covariate-adjusted estimation for an overall treatment difference with data from a randomized comparative clinical trial Biostatistics Advance Access published January 30, 2012 Biostatistics (2012), xx, xx, pp. 1 18 doi:10.1093/biostatistics/kxr050 On the covariate-adjusted estimation for an overall treatment difference

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,

More information

Joint Modelling of Accelerated Failure Time and Longitudinal Data

Joint Modelling of Accelerated Failure Time and Longitudinal Data Joint Modelling of Accelerated Failure Time and Longitudinal Data YI-KUAN TSENG, FUSHING HSIEH and JANE-LING WANG yktseng@wald.ucdavis.edu fushing@wald.ucdavis.edu wang@wald.ucdavis.edu Department of Statistics,

More information

Longitudinal + Reliability = Joint Modeling

Longitudinal + Reliability = Joint Modeling Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,

More information

Joint modelling of longitudinal measurements and event time data

Joint modelling of longitudinal measurements and event time data Biostatistics (2000), 1, 4,pp. 465 480 Printed in Great Britain Joint modelling of longitudinal measurements and event time data ROBIN HENDERSON, PETER DIGGLE, ANGELA DOBSON Medical Statistics Unit, Lancaster

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL

ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Statistica Sinica 18(28, 219-234 ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Wenbin Lu and Yu Liang North Carolina State University and SAS Institute Inc.

More information

Package Rsurrogate. October 20, 2016

Package Rsurrogate. October 20, 2016 Type Package Package Rsurrogate October 20, 2016 Title Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information Version 2.0 Date 2016-10-19 Author Layla Parast

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data Jae-Kwang Kim 1 Iowa State University June 28, 2012 1 Joint work with Dr. Ming Zhou (when he was a PhD student at ISU)

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Generalized Linear Models with Functional Predictors

Generalized Linear Models with Functional Predictors Generalized Linear Models with Functional Predictors GARETH M. JAMES Marshall School of Business, University of Southern California Abstract In this paper we present a technique for extending generalized

More information

Joint Longitudinal and Survival-cure Models with Constrained Parameters in Tumour Xenograft Experiments

Joint Longitudinal and Survival-cure Models with Constrained Parameters in Tumour Xenograft Experiments Joint Longitudinal and Survival-cure Models with Constrained Parameters in Tumour Xenograft Experiments Bao, Yanchun and Pan, Jianxin and Dai, Hongsheng and Fang, Hong-Bin 2009 MIMS EPrint: 2009.48 Manchester

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Joint longitudinal and survival-cure models in tumour xenograft experiments

Joint longitudinal and survival-cure models in tumour xenograft experiments Joint longitudinal and survival-cure models in tumour xenograft experiments Jianxin Pan, Yanchun Bao,, Hongsheng Dai,, Hong-Bin Fang, Abstract In tumour xenograft experiments, treatment regimens are administered

More information

Integrated likelihoods in survival models for highlystratified

Integrated likelihoods in survival models for highlystratified Working Paper Series, N. 1, January 2014 Integrated likelihoods in survival models for highlystratified censored data Giuliana Cortese Department of Statistical Sciences University of Padua Italy Nicola

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Goodness-Of-Fit for Cox s Regression Model. Extensions of Cox s Regression Model. Survival Analysis Fall 2004, Copenhagen

Goodness-Of-Fit for Cox s Regression Model. Extensions of Cox s Regression Model. Survival Analysis Fall 2004, Copenhagen Outline Cox s proportional hazards model. Goodness-of-fit tools More flexible models R-package timereg Forthcoming book, Martinussen and Scheike. 2/38 University of Copenhagen http://www.biostat.ku.dk

More information

Joint Modeling of Longitudinal Item Response Data and Survival

Joint Modeling of Longitudinal Item Response Data and Survival Joint Modeling of Longitudinal Item Response Data and Survival Jean-Paul Fox University of Twente Department of Research Methodology, Measurement and Data Analysis Faculty of Behavioural Sciences Enschede,

More information

1 Introduction. 2 Residuals in PH model

1 Introduction. 2 Residuals in PH model Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Some New Methods for Latent Variable Models and Survival Analysis. Latent-Model Robustness in Structural Measurement Error Models.

Some New Methods for Latent Variable Models and Survival Analysis. Latent-Model Robustness in Structural Measurement Error Models. Some New Methods for Latent Variable Models and Survival Analysis Marie Davidian Department of Statistics North Carolina State University 1. Introduction Outline 3. Empirically checking latent-model robustness

More information

On Estimating the Relationship between Longitudinal Measurements and Time-to-Event Data Using a Simple Two-Stage Procedure

On Estimating the Relationship between Longitudinal Measurements and Time-to-Event Data Using a Simple Two-Stage Procedure Biometrics DOI: 10.1111/j.1541-0420.2009.01324.x On Estimating the Relationship between Longitudinal Measurements and Time-to-Event Data Using a Simple Two-Stage Procedure Paul S. Albert 1, and Joanna

More information

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum

More information

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices

Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Article Lasso Maximum Likelihood Estimation of Parametric Models with Singular Information Matrices Fei Jin 1,2 and Lung-fei Lee 3, * 1 School of Economics, Shanghai University of Finance and Economics,

More information

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

More information

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 Biometrika (????),??,?, pp. 1 15 C???? Biometrika Trust Printed in

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics Jiti Gao Department of Statistics School of Mathematics and Statistics The University of Western Australia Crawley

More information

Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models

Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Selection Criteria Based on Monte Carlo Simulation and Cross Validation in Mixed Models Junfeng Shang Bowling Green State University, USA Abstract In the mixed modeling framework, Monte Carlo simulation

More information

Multi-state Models: An Overview

Multi-state Models: An Overview Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed

More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

Joint longitudinal and survival-cure models in tumour xenograft experiments

Joint longitudinal and survival-cure models in tumour xenograft experiments Joint longitudinal and survival-cure models in tumour xenograft experiments Jianxin Pan, Yanchun Bao,, Hongsheng Dai,, Hong-Bin Fang, Abstract In tumour xenograft experiments, treatment regimens are administered

More information

Part III Measures of Classification Accuracy for the Prediction of Survival Times

Part III Measures of Classification Accuracy for the Prediction of Survival Times Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples

More information

Harvard University. Harvard University Biostatistics Working Paper Series

Harvard University. Harvard University Biostatistics Working Paper Series Harvard University Harvard University Biostatistics Working Paper Series Year 2008 Paper 94 The Highest Confidence Density Region and Its Usage for Inferences about the Survival Function with Censored

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

USING TRAJECTORIES FROM A BIVARIATE GROWTH CURVE OF COVARIATES IN A COX MODEL ANALYSIS

USING TRAJECTORIES FROM A BIVARIATE GROWTH CURVE OF COVARIATES IN A COX MODEL ANALYSIS USING TRAJECTORIES FROM A BIVARIATE GROWTH CURVE OF COVARIATES IN A COX MODEL ANALYSIS by Qianyu Dang B.S., Fudan University, 1995 M.S., Kansas State University, 2000 Submitted to the Graduate Faculty

More information

Parameter Estimation

Parameter Estimation Parameter Estimation Consider a sample of observations on a random variable Y. his generates random variables: (y 1, y 2,, y ). A random sample is a sample (y 1, y 2,, y ) where the random variables y

More information

On the Breslow estimator

On the Breslow estimator Lifetime Data Anal (27) 13:471 48 DOI 1.17/s1985-7-948-y On the Breslow estimator D. Y. Lin Received: 5 April 27 / Accepted: 16 July 27 / Published online: 2 September 27 Springer Science+Business Media,

More information

A weighted simulation-based estimator for incomplete longitudinal data models

A weighted simulation-based estimator for incomplete longitudinal data models To appear in Statistics and Probability Letters, 113 (2016), 16-22. doi 10.1016/j.spl.2016.02.004 A weighted simulation-based estimator for incomplete longitudinal data models Daniel H. Li 1 and Liqun

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

Goodness-of-fit test for the Cox Proportional Hazard Model

Goodness-of-fit test for the Cox Proportional Hazard Model Goodness-of-fit test for the Cox Proportional Hazard Model Rui Cui rcui@eco.uc3m.es Department of Economics, UC3M Abstract In this paper, we develop new goodness-of-fit tests for the Cox proportional hazard

More information

Nonresponse weighting adjustment using estimated response probability

Nonresponse weighting adjustment using estimated response probability Nonresponse weighting adjustment using estimated response probability Jae-kwang Kim Yonsei University, Seoul, Korea December 26, 2006 Introduction Nonresponse Unit nonresponse Item nonresponse Basic strategy

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Residuals and model diagnostics

Residuals and model diagnostics Residuals and model diagnostics Patrick Breheny November 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/42 Introduction Residuals Many assumptions go into regression models, and the Cox proportional

More information

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data Efficiency Comparison Between Mean and Log-rank Tests for Recurrent Event Time Data Wenbin Lu Department of Statistics, North Carolina State University, Raleigh, NC 27695 Email: lu@stat.ncsu.edu Summary.

More information

A Study of Joinpoint Models for Longitudinal Data

A Study of Joinpoint Models for Longitudinal Data UNLV Theses, Dissertations, Professional Papers, and Capstones 8-1-2014 A Study of Joinpoint Models for Longitudinal Data Libo Zhou University of Nevada, Las Vegas, zhoul@unlv.nevada.edu Follow this and

More information

Parametric Joint Modelling for Longitudinal and Survival Data

Parametric Joint Modelling for Longitudinal and Survival Data Parametric Joint Modelling for Longitudinal and Survival Data Paraskevi Pericleous Doctor of Philosophy School of Computing Sciences University of East Anglia July 2016 c This copy of the thesis has been

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.

Now consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown. Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models Tihomir Asparouhov 1, Bengt Muthen 2 Muthen & Muthen 1 UCLA 2 Abstract Multilevel analysis often leads to modeling

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

FUNCTIONAL PRINCIPAL COMPONENT ANALYSIS FOR LONGITUDINAL AND SURVIVAL DATA

FUNCTIONAL PRINCIPAL COMPONENT ANALYSIS FOR LONGITUDINAL AND SURVIVAL DATA Statistica Sinica 17(27), 965-983 FUNCTIONAL PRINCIPAL COMPONENT ANALYSIS FOR LONGITUDINAL AND SURVIVAL DATA Fang Yao University of Toronto Abstract: This paper proposes a nonparametric approach for jointly

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

Regression Calibration in Semiparametric Accelerated Failure Time Models

Regression Calibration in Semiparametric Accelerated Failure Time Models Biometrics 66, 405 414 June 2010 DOI: 10.1111/j.1541-0420.2009.01295.x Regression Calibration in Semiparametric Accelerated Failure Time Models Menggang Yu 1, and Bin Nan 2 1 Department of Medicine, Division

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. *

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. * Least Absolute Deviations Estimation for the Accelerated Failure Time Model Jian Huang 1,2, Shuangge Ma 3, and Huiliang Xie 1 1 Department of Statistics and Actuarial Science, and 2 Program in Public Health

More information

Implementing Response-Adaptive Randomization in Multi-Armed Survival Trials

Implementing Response-Adaptive Randomization in Multi-Armed Survival Trials Implementing Response-Adaptive Randomization in Multi-Armed Survival Trials BASS Conference 2009 Alex Sverdlov, Bristol-Myers Squibb A.Sverdlov (B-MS) Response-Adaptive Randomization 1 / 35 Joint work

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

Chapter 4. Parametric Approach. 4.1 Introduction

Chapter 4. Parametric Approach. 4.1 Introduction Chapter 4 Parametric Approach 4.1 Introduction The missing data problem is already a classical problem that has not been yet solved satisfactorily. This problem includes those situations where the dependent

More information

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Statistica Sinica 19 (2009), 997-1011 ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Anthony Gamst, Michael Donohue and Ronghui Xu University

More information

Statistical Methods for Alzheimer s Disease Studies

Statistical Methods for Alzheimer s Disease Studies Statistical Methods for Alzheimer s Disease Studies Rebecca A. Betensky, Ph.D. Department of Biostatistics, Harvard T.H. Chan School of Public Health July 19, 2016 1/37 OUTLINE 1 Statistical collaborations

More information

Test of Association between Two Ordinal Variables while Adjusting for Covariates

Test of Association between Two Ordinal Variables while Adjusting for Covariates Test of Association between Two Ordinal Variables while Adjusting for Covariates Chun Li, Bryan Shepherd Department of Biostatistics Vanderbilt University May 13, 2009 Examples Amblyopia http://www.medindia.net/

More information

Semiparametric Models for Joint Analysis of Longitudinal Data and Counting Processes

Semiparametric Models for Joint Analysis of Longitudinal Data and Counting Processes Semiparametric Models for Joint Analysis of Longitudinal Data and Counting Processes by Se Hee Kim A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial

More information

BAYESIAN METHODS FOR JOINT MODELING OF LONGITUDINAL AND SURVIVAL DATA WITH APPLICATIONS TO CANCER VACCINE TRIALS

BAYESIAN METHODS FOR JOINT MODELING OF LONGITUDINAL AND SURVIVAL DATA WITH APPLICATIONS TO CANCER VACCINE TRIALS Statistica Sinica 14(2004), 863-883 BAYESIAN METHODS FOR JOINT MODELING OF LONGITUDINAL AND SURVIVAL DATA WITH APPLICATIONS TO CANCER VACCINE TRIALS Joseph G. Ibrahim, Ming-Hui Chen and Debajyoti Sinha

More information

SURVIVAL ANALYSIS WITH MULTIPLE DISCRETE INDICATORS OF LATENT CLASSES KLAUS LARSEN, UCLA DRAFT - DO NOT DISTRIBUTE. 1.

SURVIVAL ANALYSIS WITH MULTIPLE DISCRETE INDICATORS OF LATENT CLASSES KLAUS LARSEN, UCLA DRAFT - DO NOT DISTRIBUTE. 1. SURVIVAL ANALYSIS WITH MULTIPLE DISCRETE INDICATORS OF LATENT CLASSES KLAUS LARSEN, UCLA DRAFT - DO NOT DISTRIBUTE Abstract.... 1. Introduction In many studies the outcome of primary interest is the time

More information

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality Communications for Statistical Applications and Methods 014, Vol. 1, No. 3, 13 4 DOI: http://dx.doi.org/10.5351/csam.014.1.3.13 ISSN 87-7843 An Empirical Characteristic Function Approach to Selecting a

More information

Statistical Inference and Methods

Statistical Inference and Methods Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and

More information

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Calibration Estimation of Semiparametric Copula Models with Data Missing at Random Shigeyuki Hamori 1 Kaiji Motegi 1 Zheng Zhang 2 1 Kobe University 2 Renmin University of China Institute of Statistics

More information

JOINT MODELING OF LONGITUDINAL AND TIME-TO-EVENT DATA: AN OVERVIEW

JOINT MODELING OF LONGITUDINAL AND TIME-TO-EVENT DATA: AN OVERVIEW Statistica Sinica 14(2004), 809-834 JOINT MODELING OF LONGITUDINAL AND TIME-TO-EVENT DATA: AN OVERVIEW Anastasios A. Tsiatis and Marie Davidian North Carolina State University Abstract: A common objective

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored

More information

Ensemble estimation and variable selection with semiparametric regression models

Ensemble estimation and variable selection with semiparametric regression models Ensemble estimation and variable selection with semiparametric regression models Sunyoung Shin Department of Mathematical Sciences University of Texas at Dallas Joint work with Jason Fine, Yufeng Liu,

More information

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data 1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions

More information

Weighting in survey analysis under informative sampling

Weighting in survey analysis under informative sampling Jae Kwang Kim and Chris J. Skinner Weighting in survey analysis under informative sampling Article (Accepted version) (Refereed) Original citation: Kim, Jae Kwang and Skinner, Chris J. (2013) Weighting

More information

Maximum Likelihood Estimation. only training data is available to design a classifier

Maximum Likelihood Estimation. only training data is available to design a classifier Introduction to Pattern Recognition [ Part 5 ] Mahdi Vasighi Introduction Bayesian Decision Theory shows that we could design an optimal classifier if we knew: P( i ) : priors p(x i ) : class-conditional

More information