Maximum likelihood estimation in the proportional hazards cure model

Size: px
Start display at page:

Download "Maximum likelihood estimation in the proportional hazards cure model"

Transcription

1 AISM DOI 1.17/s x Maximum likelihood estimation in the proportional hazards cure model Wenbin Lu Received: 12 January 26 / Revised: 16 November 26 The Institute of Statistical Mathematics, Tokyo 27 Abstract The proportional hazards cure model generalizes Cox s proportional hazards model which allows that a proportion of study subjects may never experience the event of interest. Here nonparametric maximum likelihood approach is proposed to estimating the cumulative hazard and the regression parameters. The asymptotic properties of the resulting estimators are established using the modern empirical process theory. And the estimators for the regression parameters are shown to be semiparametric efficient. Keywords Censoring Maximum likelihood estimation Mixture models Proportional hazards cure model Semiparametric efficiency 1 Introduction Survival models with a cure rate have received much attention in recent years (Farewell, 1982, 1986; Kuk and Chen, 1992; Sy and Taylor, 2; Peng and Dear, 2, among others. These models are useful when a proportion of study subjects never experience the event of interest. Applications of cure models can be found in many disciplines, including biomedical sciences, economics, sociology and engineering science. Maller and Zhou (1996 contains a list of such applications. Mixture modelling approach is commonly used to formulate a cure model, which assumes that the underlying population is a mixture of susceptible and nonsusceptible subjects. All susceptible subjects would eventually experience W. Lu (B Department of Statistics, North Carolina State University, 251 Founders Drive, Raleigh, NC 27695, USA wlu4@stat.ncsu.edu

2 W. Lu the event if there were no censoring, while the nonsusceptible ones are immune from the event. Under mixture modelling approach, a decomposition of the event time is given by T = ηt + (1 η, (1 where T < denotes the failure time of a susceptible subject and η indicates, by the value 1 or, whether the sampled subject is susceptible or not. Thus, one can model separately the survival distribution for susceptible individuals and the fraction of nonsusceptible ones. Parametric mixture models were explored earlier by a number of authors. Berkson and Gage (1952 used a mixture of exponential distributions and a constant cure fraction to fit survival data from studies of breast cancer and stomach cancer. This parametric approach was extended by Farewell (1982, 1986 to Weibull regression for survival and logistic regression for the cure fraction. Theoretical and empirical properties of the Weibull extension were fully studied there via the parametric maximum likelihood method. More recent attention has been focused on semiparametric mixture modelling approaches. Kuk and Chen (1992 proposed the so-called proportional hazards cure model in which the proportional hazards regression (Cox 1972 models the survival times of susceptible subjects while the logistic regression models the cure fraction. The model is specified by the following two terms: λ(t Z, X = lim P(t dt T < t + dt T t, Z, X/dt = λ(t exp(β Z, (2 + P (η = 1 X, Z = exp(γ X 1 + exp(γ X, (3 where λ(t Z, X is the hazard function for a susceptible subject with p-dimensional covariates Z and q-dimensional covariates X, and λ(t the completely unspecified baseline hazard function, and β and γ the unknown regression parameter vectors of primary interest. Recall that Z and X may share some common components and X includes 1 so that γ contains the intercept term. Furthermore, we assume that the censoring time C is independent of T and η conditional on Z and X. Define T = min{t, min(c, τ and δ = I{T min(c, τ, where τ denote the total follow-up of the study. Then the observations consist of ( T i, δ i, Z i, X i, i = 1,..., n, which are independent copies of ( T, δ, Z, X. And the observed likelihood function can be written as L n (, θ = n [ { π(γ X i λ( T i e β Z i e ( T i exp(β δ Z i i {1 π(γ X i + π(γ X i e ( T i exp(β Z i 1 δ i, (4

3 Maximum likelihood estimation in the proportional hazards cure model where θ = (β, γ, π(a = exp(a/{1 + exp(a for any real number a, and (t = λ(sds is the baseline cumulative hazard function. As noted by Kuk and Chen (1992, if the cure fraction 1 π(γ X is not equal to zero, the hazard function of T is no longer proportional and the simple form of the partial likelihood function, like that for the usual Cox proportional hazards model, can not be obtained here. To solve this difficulty, they proposed to consider the following complete but unobserved likelihood function: L nc (, θ = n [ { π(γ X i λ( T i e β Z i e ( T i exp(β δ Z i i η i { { 1 π(γ 1 ηi X i {π(γ X i e ( T i exp(β η 1 δi Z i i, where η i s are only partially observed, i.e. when δ i = 1, η i = 1, while when δ i =, η i is unobserved. Based on L nc, they developed a Monte Carlo simulation-based algorithm for approximating a rank-based likelihood function, thereby enabling them to perform maximum marginal likelihood estimation. The proportional hazards cure model was further studied by Peng and Dear (2 and Sy and Taylor (2 among others to obtain alternative methods for computing the joint semiparametric likelihood function. Their approaches largely relied on the semiparametric EM algorithm that computes estimates for both the cumulative baseline hazard and the regression parameters. However, the theoretical properties of the resulting estimators for the proportional hazards cure model remain to be established. Fang et al. (25 studied the large sample properties of the maximum likelihood estimators under the proportional hazards cure model. Recently, Lu and Ying (24 proposed an estimating equations approach for the semiparametric transformation cure models, where the class of linear transformation models are used for the failure times of susceptible subjects and the logistic regression is used for the cure fraction. Their approach was motivated by the work of Chen et al. (22 and used the martingale integral representation to construct unbiased estimating equations. The large sample properties of the resulting estimators were also studied. However, the proposed algorithm for solving the equations may not converge and the resulting estimators for the regression parameters are not efficient, even when the model specifies the proportional hazards cure model, i.e. the error term of the linear transformation models follows the extreme value distribution. In this paper, we propose to estimate the parametric and the nonparametric components in the proportional hazards cure model by using nonparametric maximum likelihood. The joint parametric/nonparametric likelihood approach to semiparametric problems was pioneered by Murphy (1994, 1995 for the frailty model and Scharfsten et al. (1998 for the generalized odds-rate class of regression models, among others. Here, we apply and extend their techniques to the proportional hazards cure model and make use of the modern empirical

4 W. Lu process theory, as elucidated in van der Vaart and Wellner (1996, to drive the large sample properties of the resulting estimators. In addition, We show that the maximum likelihood estimators for the regression parameters are semiparametrically efficient (Bickel et al., The paper is organized in a natural order. We first compute the semiparametric variance bound (Sect. 2. Then, in Sect. 3, we define our estimators and show that they are consistent, asymptotically normal, and achieve the variance bound. Consistent variance estimators are also derived here. Section 4 gives some discussions on generalization of our approach to include time dependent covariates and other semiparametric cure models. Some technical lemmas are put together and proved in the Appendix. 2 Semiparametric variance bound First, note that the log likelihood function for a single observation is l(, θ = δ[log{π(γ Xλ( T+β Z ( T exp(β Z+(1 δ log S( T,, θ, (5 where S(t,, θ = 1 π(γ X + π(γ X exp{ (t exp(β Z. Then the semiparametric variance bound is computed via the semiparametric efficiency theory of Bickel et al. (1993. To be specific, a parametric submodel corresponding to a parameterization of λ(t is considered, say λ(t, α, where λ(t, α = λ(t for some α. So the log likelihood for such parametric submodel is given by l(α, θ = δ[log{π(γ Xλ( T, α+β Z ( T, α exp(β Z +(1 δ log S{ T, (, α, θ, where (t, α = λ(s, αds. The score for θ is l τ θ = W{t, (, α, θdm{t, (, α, θ, (6 where M(t,, θ = N(t Y(sg(s,, θexp(β Zd (s and W(t,, θ = (Z [1 {1 g(t,, θ exp(β Z (t, X {1 g(t,, θ with g(t,, θ = π(γ X exp{ (t exp(β Z 1 π(γ X + π(γ X exp{ (t exp(β Z. By the usual counting process and its associated martingale theory (Fleming and Harrington 1991; Andersen et al. 1993, M(t,, θ is the F t -counting process martingale, where F t is the smallest sigma-algebra generated by {N(s, Y(s, s t. The score for α is

5 Maximum likelihood estimation in the proportional hazards cure model [ l τ α = α λ(t, α t {1 g(t,, θ exp(β Z λ(t, α λ(s, α ds dm{t, (, α, θ. α (7 Let θ and be the true values of θ and, respectively. And denote S θ and S α to be the corresponding scores for θ and α evaluated at the truth. Similar to Scharfsten et al. (1998, we can define the following tangent set in the nonparametric direction, ={f ( T, δ, Z, X : f ( T, δ, Z, X = [a(t {1 g (t exp(β Z a(sd (sdm (t, where a(t is any(p+q dimensional function of t, E[ f ( T, δ, Z, X 2 <. Here g (t = g(t,, θ and M (t = M(t,, θ. Then the efficient score S eff for θ can be defined as S eff = S θ [S θ, where [ is the projection operator. To project S θ onto, we need to find the vector a(t such that ( E [W (t a(t +{1 g (t exp(β Z a(sd (s dm (t [a (t {1 g (t exp(β Z a (sd (sdm (t =, (8 for all a.herew (t = W(t,, θ. By some simple algebra, we can show that the vector a(t which satisfies (8 is a solution to the following Fredholm integral equation of the second kind: where for t, s τ, a(t K(t, sa(sd (s = f (t, t [, τ, (9 K(t, s = E[g (s t{1 g (s t exp(2β ZY(s t E{Y(tg (t exp(β Z τ s t E[g (u{1 g (u 2 exp(3β ZY(ud (u E{Y(tg (t exp(β Z, f (t = E{W (ty(tg (t exp(β Z E{Y(tg (t exp(β Z t E[W (sg (s{1 g (s exp(2β ZY(sd (s E{Y(tg (t exp(β Z.

6 W. Lu If the kernel K(, satisfies sup t [,τ K(t, s d (s <, then according to Kress (1989, there exists a solution to this integral equation. Under mild regularity conditions, which are given in the next section, we can show that the kernel K defined above satisfies the above condition. Denote the solution by a eff (t. Then, the efficient score for θ is S eff = [W (t a eff (t +{1 g (t exp(β Z a eff (sd (sdm (t Therefore, provided that the E(S eff S eff is nonsingular, the semiparametric variance bound,,is{e(s eff S eff 1. Furthermore, we can show that ( E(S eff S eff = E W (t[w (t a eff (t+{1 g (t exp(β Z a eff (sd (s Y(tg (t exp(β Zd (t, since based on (8, ( E [W (t a eff (t +{1 g (t exp(β Z a eff (sd (s [a eff (t {1 g (t exp(β Z a eff (sd (sy(tg (t exp(β Zd (t =. 3 Estimation and asymptotic Nonparametric maximum likelihood method is used to estimate the regression parameters and the baseline cumulative hazard function. For convenience, we assume that there are no tied death times, but our results can be easily tailored to accommodate tied death times. The log likelihood function for observed data is given by n ( δ i [log{π(γ X i +log λ( T i + β Z i ( T i exp(β Z i +(1 δ i log S i ( T i,, θ, (1 where S i is obtained from S by replacing Z and X by Z i and X i, respectively. The maximum of this function does not exists if ( is restricted to be absolutely continuous. Thus, we allow ( to be discrete and replace λ(t in (1 withthe jump size of at time t, denoted by (t. Then we maximize the following modified log likelihood function to obtain our estimates:

7 Maximum likelihood estimation in the proportional hazards cure model nl n (, θ = n ( δ i [log{π(γ X i +log{ ( T i +β Z i ( T i exp(β Z i + (1 δ i log S i ( T i,, θ. (11 We observe that the maximizer ˆ n of (11 must be a step function with jumps at all the death times. Within the class of functions of this form, we can show that the maximum likelihood estimators, ˆ n and ˆθ n,of(11 exist and are finite. Then, we show that they are consistent and asymptotically normal. And the limiting variance of ˆθ n attains the semiparametric efficiency bound,. To establish our claims, we need the following conditions: Condition 1. The function (t is strictly increasing and continuously differentiable, and (τ <. Condition 2. θ lies in the interior of a compact set C and the covariate vectors Z and X are bounded in the sense that P ( Z < m and X < m = 1 for some constant m >. Condition 3. With probability one, there exists a positive constant ε such that P (C T τ Z, X >ε. Condition 4. P {Y(t = 1 Z, X is continuous in t. Remark 1 The conditions 1, 2 and 4 are the usual regularity conditions needed for establishing the large sample results for the maximum partial likelihood estimators under the proportional hazards model. Condition 3 is for the identifiability of the proportional hazards cure model on the interval [, τ, i.e. the follow-up is sufficiently long for identifying (t on the interval [, τ. 3.1 Existence Theorem 1 Assume that conditions 1 2 hold. Then the maximum likelihood estimators of l n (, θ, (, θ = ( ˆ n, ˆθ n exists and is finite. Proof Since l n is a continuous function of θ and the jump sizes of, it is equivalent to show that the jump sizes are finite (Scharfsten et al., To see this, let b 1,..., b k(n denote the jump sizes at the death times T (1 < T (2 < < T (k(n, where k(n is the total number of deaths. Then l n < 1 k(n i log{π(γ X (i +β Z (i +{log b i exp(β Z (i b j n < 1 k(n i log{π(γ X (i +β Z (i +{log b i exp( M b j n where M is a positive constant and Z (i, X (i are the covariate vectors corresponding to the ith death time, T (i. The first inequality above holds since

8 W. Lu < S i ( T i,, θ < 1 and the second inequality is due to condition 2. Therefore, l n diverges to if b j tends to for some j. This implies that the jump sizes of must be finite. Remark 2 The maximization of l n (, θin (11 can be carried out by the Nelder- Mead simplex method. Such an algorithm is available in MATLAB software with the build-in function fminsearch. Remark 3 Since ( ˆ n, ˆθ n exists and is finite, the derivative of l n with respect to the jump sizes of should be zero. This leads to the following equation for ˆ n : ˆ n (t = 1 n d N(s n Y j (s exp( ˆβ Z j {δ j + (1 δ j g j ( T j, ˆ n, ˆθ, < t τ, (12 where N(t = (1/n n N i (t and g i is obtained from g by replacing Z and X by Z i and X i, respectively. 3.2 Consistency We apply the techniques used by Murphy (1994 and Scharfsten et al. (1998 here for proving consistency. Specifically, define n (t = 1 n d N(s n Y j (s exp(β Z j{δ j + (1 δ j g j ( T j,, θ, < t τ, (13 which is a step function with jumps at each of the death times and converges uniformly to (see Lemma 2 of the Appendix. Theorem 2 Assume that conditions 1 4 hold. Then sup ˆ n (t (t a.s. and ˆθ n θ a.s. t [,τ The proof of Theorem 2 is given in Appendix. 3.3 Asymptotic distribution We extend the approach of Murphy (1995 and Scharfsten et al. (1998 to derive the asymptotic distribution of our estimators ( ˆ n, ˆθ n.herewealso work with one-dimensional submodels through the estimators and the difference at the estimators. To be specific, set d (t = {1 + dh 1(sd ˆ n (s and θ d = dh 2 + ˆθ n, where h 1 is a function and h 2 is a (p + q-dimensional vector. Furthermore, write h 2 = (h 21, h 22, where h 21 is the p-dimensional and

9 Maximum likelihood estimation in the proportional hazards cure model h 22 is the q-dimensional vectors corresponding to Z and X, respectively. Let S n ( ˆ n, ˆθ n (h 1, h 2 denote the derivative of l n with respect to d and evaluated at d =. If ( ˆ n, ˆθ n maximizes l n, then S n ( ˆ n, ˆθ n (h 1, h 2 = for all (h 1, h 2.We observe that S n ( ˆ n, ˆθ n (h 1, h 2 = S n1 ( ˆ n, ˆθ n (h 1 + S n2 ( ˆ n, ˆθ n (h 2, where S n1 ( ˆ n, ˆθ n (h 1 = 1 n [ h 1 (t {1 g i (t, ˆ n, ˆθ n exp( ˆβ Z i n { dn i (t Y i (tg i (t, ˆ n, ˆθ n exp( ˆβ Z i d ˆ (t, S n2 ( ˆ n, ˆθ n (h 2 = 1 n n h 2 W i(t, ˆ n, ˆθ n {dn i (t Y i (tg i (t, ˆ n, ˆθ n exp( ˆβ Z i d ˆ (t. h 1 (sd ˆ (s Let BV[, τ denote the space of bounded variation functions defined on [, τ. We assume that the class of h belongs to the space H = BV[, τ R p+q. For h H, we define the norm on H to be h H = h 1 v + h 2 1, where h 1 v is the absolute value of h 1 ( plus the total variation of h 1 on the interval [, τ and h 2 1 is the L 1 -norm of h 2. Define H m ={h H : h H m. Ifm =, then the inequality is strict. In addition, define, θ (h = h 1(td (t + h 2 θ. The, θ indexes the space functionals { =, θ : sup, θ <. h H m Now l (H m, where l (H m is the space of bounded real-valued functions on H m under the supremum norm U = sup h Hm U(h. The score function S n is a random map to l (H m for all finite m. Convergence in probability (denoted by P and weak convergence will be in terms of outer measure. Theorem 3 Assume that conditions 1 4 hold. Then n( ˆ n, n( ˆθ θ G (14 weakly in l (H m, where G is a tight Gaussian process in l (H m with mean zero and covariance process Cov{G(h, G(h = h 1 (tσ 1 (1 (h (td (t + h 2 σ 1 (2 (h, (15

10 W. Lu where σ = (σ 1, σ 2 is a continuous linear operator from H to H with inverse σ 1 = (σ 1 (1, σ 1 (2. The form of σ is given as follows: σ 1 (h(t = E{V(t, ϒ (hy(tg(t, ϒ exp(β [ Z τ E V(s, ϒ (hy(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s, t and [ σ 2 (h = E W(t, ϒ V(t, ϒ (hy(tg(t, ϒ exp(β Zd (t, where ϒ = (, θ and V(t,, θ = h 1 (t {1 g(t,, θ exp(β Z h 1 (sd (s + h 2W(t,, θ. The proof of Theorem 3 is given in the Appendix and follows the following theorem from (van der Vaart and Wellner 1996, Theorem In this theorem, the parameter space is a subset of l (H m and the score function is a random map S n : l (H m.letϒ = (, θ and ˆϒ n = ( ˆ n, ˆθ n. Furthermore, denote the asymptotic version of S n by S, i.e S(ϒ = E{S n (ϒ. Then we have that S n ( ˆϒ n =, S(ϒ = and ˆϒ n ϒ = o P (1 as elements in l (H m. The notation lin before a set denotes the set of all finite linear combinations of the elements of the set. Theorem 4 Assume the following: 1. (Asymptotic distribution of the score function n{s n (ϒ S(ϒ G, where G is a tight Gaussian process on l (H m. 2. (Fréchet differentiability of the asymptotic score n{s( ˆϒ n S(ϒ = nṡ(ϒ ( ˆϒ n ϒ + o P (1 + n ˆϒ n ϒ, where Ṡ(ϒ : lin{ϒ ϒ : ϒ l (H m is a continuous linear operator. 3. (Invertibility Ṡ(ϒ is continuously invertible on its range. 4. (Approximation condition n{(s n S( ˆϒ n (S n S(ϒ = o P (1 + n ˆϒ n ϒ. Then, n( ˆϒ n ϒ Ṡ(ϒ 1 G. Remark 4 From (15 of Theorem 3, we have Var{G(h = h 1 (tσ 1 (1 (h(td (t + h 2 σ 1 (2 (h, (16

11 Maximum likelihood estimation in the proportional hazards cure model which provides the asymptotic variances of many quantities of interest. For example, if we choose h 1 (t = for all t and h 2 = e i,theith unit vector, then we obtain the asymptotic variance of the ith element of ˆθ n. Alternatively, if we choose h 1 (s = I(s t and h 2 =, then we obtain the asymptotic variance of ˆ n (t. Remark 5 A natural estimator of the asymptotic variance of n( ˆθ θ, n( ˆ n (h can be given as h 1g 1 d ˆ n + h 2 g 2, where g = (g 1, g 2 is the solution to h 1 =ˆσ 1 (g and h 2 =ˆσ 2 (g, with ˆσ 1 (g and ˆσ 2 (g defined as ˆσ 1 (g(t = 1 n ˆσ 2 (g = 1 n 1 n n V i (t, ˆϒ n (gy i (tg i (t, ˆϒ n exp( ˆβ n Z i n t n V i (s, ˆϒ n (gy i (sg i (s, ˆϒ n {1 g i (s, ˆϒ n exp(2 ˆβ n Z id ˆ n (s W i (t, ˆϒ n V i (t, ˆϒ n (gy i (tg i (t, ˆϒ n exp( ˆβ n Z id ˆ n (t Here ˆσ 1 (g and ˆσ 2 (g are the empirical versions of σ 1 (g and σ 2 (g with the true parameter values θ and replaced by the maximum likelihood estimators ˆθ n and ˆ n, respectively. In the following theorem, we want to show that the asymptotic variance estimator defined above exists and converges to the true value given in (16. Theorem 5 Assume that conditions 1 4 hold, then for h H m, the solution g = ˆσ 1 (h exists with probability going to one as n increases. Furthermore, h 1g 1 d ˆ n + h 2 g 2 converges to h 1(tσ 1 (1 (h(td (t + h 2 σ 1 (2 (h in probability. The proof of Theorem 5 is given in the Appendix. Finally, we want to show that the asymptotic variance of n( ˆθ n θ achieve the semiparametric efficiency bound,, established in Sect. 2. By the Cramer-Wold device (see Serfling 198, it suffices to demonstrate that the asymptotic variance of a n( ˆθ n θ is equal to a a, where a is any vector in R p+q. To do this, we need to find an h = (h 1, h 2 such that σ 1 (h(t = for all t and σ 2 (h = a. Consider the solution, h 2 = a and h 1 (t = a eff (ta 1 (B Ia, where { B = E W 2 (t, ϒ Y(tg(t, ϒ exp(β Zd (t, ( A = E W(t, ϒ [a eff (t {1 g(t, ϒ exp(β Z a eff (sd (s Y(tg(t, ϒ exp(β Zd (t.

12 W. Lu Note that with the h define above ( σ 2 (h = E [ W(t, ϒ a eff (ta 1 (B Ia +{1 g(t, ϒ exp(β Z a eff (sa 1 (B Iad (s + W (t, ϒ a Y(tg(t, ϒ exp(β Zd (t = AA 1 (B Ia + B a = a and σ 1 (h(t = E ([ a eff (t +{1 g(t, ϒ exp(β Z a eff (sd (s Y(tg(t, ϒ exp(β Z A 1 (B Ia +E{W(t, ϒ Y(tg(t, ϒ exp(β ( Z a τ s E [ a eff (s +{1 g(s, ϒ exp(β Z a eff (ud (u t Y(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s A 1 (B Ia [ E W(s, ϒ Y(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s a t If A 1 (B I =, then σ 1 (h(t = ( E ([ a eff (t +{1 g(t, ϒ exp(β Z a eff (sd (s Y(tg(t, ϒ exp(β Z + E{W(t, ϒ Y(tg(t, ϒ exp(β Z ( s E [ a eff (s +{1 g(s, ϒ exp(β Z a eff (ud (u t Y(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s [ E W(s, ϒ Y(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s a ={a eff (t = t K(t, sa eff (sd (s f (te{y(tg(t, ϒ exp(β Z a

13 Maximum likelihood estimation in the proportional hazards cure model the last equality above holds since a eff (t is the solution to (9. To check A 1 (B I =, it is equivalent to show that B A = 1. This easily follows the definition of and some simple algebra. Therefore, we have proved that the asymptotic variance of n( ˆθ n θ achieve the semiparametric efficiency bound. 4 Concluding remarks We have developed in this paper the nonparametric maximum likelihood estimators and their associated asymptotic properties for statistical inference related to the proportional hazards cure model. And the proposed estimators for the regression parameters have been shown to be semiparametric efficient. Our methods and derivation rely on modern empirical process theory. In this paper, a logistic regression is used for modeling the cure fraction. Other parametric models, such as the probit model, can also be easily accommodated. However, the choice of such parametric models is difficult to test in practice. In addition, the covariate vector Z in the Cox model (2 isassumed to be time independent for convenience. This assumption can be relaxed to include time dependent covariates by adding the following condition on Z( (also see condition 1 of Bilias et al. (1997: Condition 2. There exists a constant B such that Z v B, where Z v is the absolute value of Z( plus the total variation of Z on the interval [, τ. Furthermore, the methods proposed here can also be applied to analysis of other semiparametric cure models. For example, the class of transformation mixture cure models considered by Lu and Ying (24, and the class of semiparametric nonmixture cure models (Tsodikov 1998, 21; Tsodikov et al. 23. In particular, Zeng et al. (26 studied a class of transformation nonmixture cure models using the nonparametric maximum likelihood estimation. The models are specified as P(T > t X = G{exp(γ XF(t where F(t is a completely unspecified distribution function and G(x is a known monotone decreasing transformation function with G( = 1. The cure probability for a subject with covariate vector X is then P(T = X = G{exp(γ X. A main difference between the above nonmixture cure model and the proportional hazards mixture cure model considered in the paper is that the parameter γ affects both the cure fraction and the underlying survival distribution of a susceptible subject in the nonmixture cure model while it only affects the cure fraction in the mixture cure model. The goodness-of-fit tests of different types of semiparametric cure models need to be further investigated and warrant future research. Acknowledgement The author is grateful to Professor Zhiliang Ying for the helpful discussion of the paper. Wenbin Lu s research was partially supported by National Science Foundation Grant DMS

14 W. Lu Appendix Lemma 1 Assume that conditions 1-4 hold, then sup n ˆ n (τ <. Proof We prove this result by contradiction. That is, suppose that sup n ˆ n (τ goes to as n increases. Then note that l n ( ˆ n, ˆθ l n ( n, θ = 1 n δ i log π( ˆγ n X i n π(γ X i + 1 n δ i ( ˆβ n β Z i n { + 1 n δ i log ˆ n ( T i n n ( T i ˆ n ( T i exp( ˆβ n Z i + n ( T i exp(β Z i + 1 n (1 δ i log S( T i, ˆ n, ˆθ n n S( T i, n, θ { = Õ p (1 + 1 n δ i log ˆ n ( T i n n ( T i ˆ n ( T i exp( ˆβ n Z i = Õ p (1 + 1 n log 1 n Y j (t exp(β n n Z j{δ j + (1 δ j g( T j, ϒ dn i (t 1 n log 1 n Y j (t exp( ˆβ n n n Z j{δ j + (1 δ j g( T j, ˆϒ n dn i (t 1 n n δ i ˆ n ( T i exp( ˆβ n Z i = Õ p (1 1 n Õ p (1 1 n n δ i ˆ n ( T i exp( ˆβ n Z i n δ i ˆ n ( T i Y i (τ exp( ˆβ n Z i Õ p (1 ˆ n (τ 1 n n δ i Y i (τ exp( ˆβ n Z i Õ p (1 exp( c ˆ n (τ 1 n n δ i Y i (τ where c is a positive constant and Õ p (1 represents quantities, which are bounded away from positive infinity with probability one as n becomes large. Since the limit of 1 n n δ i Y i (τ is E{π(γ XP(C T τ Z, X, which is bigger than according to conditions 2 and 3. Thus, as n goes to, the right-

15 Maximum likelihood estimation in the proportional hazards cure model hand side of the above inequality diverges to with probability one, which is a contradiction. Lemma 2 Assume that conditions 1 4 hold, then sup t (,τ n (t (t, a.e., and for each ω, (i (ii (iii is absolutely continuous. d ˆ nk sup t (,τ (t γ(t. d nk sup t (,τ ˆ nk (t γ(sd (s. Proof Note that M (t = N(t Y(s exp(β Zg(s, ϒ d (s is mean zero and E{Y(t exp(β Zg(t, ϒ > fort [, τ, we have (t = E{Y(s exp(β Zg(s, ϒ 1 de{n(s where E{N(t =E{ Y(s exp(β Zg(s, ϒ d (s. Then n (t (t = 1 n n Y j (s exp(β Z j{δ j + (1 δ j g j ( T j,, θ E{Y(s exp(β Zg(s, ϒ 1 de{n(s 1 d N(s By the Glivenko-Cantelli theorem, n 1 n Y j (s exp(β Z j{δ j +(1 δ j g j ( T j,, θ uniformly converges to E[Y(s exp(β Z{δ + (1 δg( T,, θ on [, τ. In addition, E{Y(t exp(β Zδ =E[I(C t exp(β ZE{I(t T C Z, X, C = E (I(C t exp(β Zπ(γ X[exp{ (t exp(β Z exp{ (C exp(β Z and E{Y(t exp(β Z(1 δg( T,, θ [ = E I(C t exp(β Zg(C,, θ E{I(T > C Z, X, C = E [I(C t exp(β Zπ(γ X exp{ (C exp(β Z

16 W. Lu Therefore, E[Y(t exp(β Z{δ + (1 δg( T,, θ = E [I(C t exp(β Zπ(γ X exp{ (t exp(β Z [ = E I(C t exp(β Zg(t,, θ E{I(T t Z, X = E{Y(t exp(β Zg(t,, θ Since inf t [,τ E{Y(t exp(β Zg(t,, θ >, by Lemma A.2 of Tsiatis (1981, we have sup t (,τ n (t (t, a.e. For (i, let f be any non-negative, bounded, continuous function. Then = f (td (t + f (td{ (t ˆ nk (t 1 f (t 1 n k Y j (t exp( ˆβ n n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk d N nk (t k 1 f (td{ 1 n k (t ˆ nk (t+m 4 f (t Y j (t d N nk (t n k where N nk (t = n k N i(t and M 4 is a positive constant. The existence of such constant M 4 is because exp( ˆβ n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk exp( c π( ˆγ n k X j exp{ ˆ nk (τ exp(c c 1 exp( c exp{ c 2 exp(c where c 1 and c 2 are two positive constants. By the Helly-Bray Lemma (p. 18 of Loeve 1963, f (td{ (t ˆ nk (t ask. Using Lemmas of A.1 and A.2 of Tsiatis (1981, we know that 1 1 n k f (t Y j (t d N nk (t n k f (t[e{y(t 1 E{Y(t exp(β Zg(t,, θ d (t as k. Since E{Y(t is bounded away from zero for all t [, τ, we know that

17 Maximum likelihood estimation in the proportional hazards cure model f (td (t M 4 f (t[e{y(t 1 E{Y(t exp(β Zg(t,, θ d (t By choosing f appropriately, this inequality implies that must be continuous at the continuity points of. Since is absolutely continuous, then so is. For(ii, since d ˆ nk (t d nk (t = 1 nk n k and we have already shown that t [,τ Y j(t exp(β Z j{δ j + (1 δ j g j ( T j,, θ 1 nk n k Y j(t exp( ˆβ n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk sup 1 n k Y j (t exp(β n Z j{δ j + (1 δ j g j ( T j,, θ k E{Y(t exp(β Zg(t,, θ converges to as k goes to. Then we need to show sup 1 n k Y j (t exp( ˆβ n n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk k E[Y(t exp(β Z{δ + (1 δg( T,, θ t [,τ converges to as k goes to. In fact, the term in above supremum norm is bounded above by 1 n k Y j (t exp( ˆβ n n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk k 1 n k Y j (t exp(β Z j {δ j + (1 δ j g j ( T j, ˆ nk, θ n k + 1 n k Y j (t exp(β Z j {δ j + (1 δ j g j ( T j, ˆ nk, θ n k 1 n k Y j (t exp(β Z j {δ j + (1 δ j g j ( T j,, θ n k

18 + 1 n k Y j (t exp(β Z j {δ j + (1 δ j g j ( T j,, θ n k E[Y(t exp(β Z{δ + (1 δg( T,, θ W. Lu M 5 ˆθ nk θ + M 6 sup ˆ nk (t (t t [,τ + 1 n k Y j (t exp(β Z j {δ j + (1 δ j g j ( T j,, θ n k E[Y(t exp(β Z{δ + (1 δg( T,, θ where M 5, M 6 are two positive constants. The first two terms converge to by the uniform consistency of ˆ nk and ˆθ nk. The third term needs to be handled more carefully. As noted by Scharfsten et al. (1998, the space of absolutely, bounded, increasing functions { (t is separable under the supremum norm. Thus, the space has a countably dense subset. Let { l, l 1, denote this set. Then we have sup 1 n k Y j (t exp(ξ β Z j {δ j + (1 δ j g j ( T j, l t [,τ n, ξ θ k E[Y(t exp(ξ β Z{δ + (1 δg( T, l, ξ θ converges to as k goes to, for each rational ξ θ = (ξ β, ξ γ and l 1. The third term can be bounded above by 1 n k Y j (t exp(β Z j {δ j + (1 δ j g j ( T j,, θ n k 1 n k Y j (t exp(ξ β Z j {δ j + (1 δ j g j ( T j,, ξ θ n k + 1 n k Y j (t exp(ξ β Z j {δ j + (1 δ j g j ( T j,, ξ θ n k 1 n k Y j (t exp(ξ β Z j {δ j + (1 δ j g j ( T j, l n, ξ θ k

19 Maximum likelihood estimation in the proportional hazards cure model + 1 n k Y j (t exp(ξ β Z j {δ j + (1 δ j g j ( T j, l n, ξ θ k E[Y(t exp(ξ β Z{δ + (1 δg( T, l, ξ θ M 7 θ ξ θ + M 8 sup (t l (t t [,τ + 1 n k Y j (t exp(ξ β Z j {δ j + (1 δ j g j ( T j, l n, ξ θ k E[Y(t exp(ξ β Z{δ + (1 δg( T, l, ξ θ where M 7, M 8 are two positive constants. By appropriate choice of ξ and l,the right-hand side of the above inequality converges to in supremum norm. So, (ii holds. For(iii, since ˆ nk (t γ(sd (s [ = 1 n k 1 Y j (s exp( ˆβ n n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk d N (s nk k γ(sd (s [ = 1 n k 1 Y j (s exp( ˆβ n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk d N (s nk n k ( E[Y(s exp(β Z{δ + (1 δg ( T 1dEN(s {[ 1 n k n k + [ sup 1 n k t [,τ 1 Y j (s exp( ˆβ n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk ( E[Y(s exp(β Z{δ + (1 δg ( T 1 d N nk (s ( E[Y(s exp(β Z{δ + (1 δg ( T 1 d{ N nk (s EN(s n k 1 Y j (t exp( ˆβ n k Z j {δ j + (1 δ j g j ( T j, ˆ nk, ˆθ nk

20 ( 1 E[Y(t exp(β Z{δ + (1 δg ( T ( 1d{ + E[Y(s exp(β Z{δ + (1 δg ( T N nk (s EN(s W. Lu the first term was shown to converge to in (ii and the second term converges to zero by the Helly-Bray Lemma. In addition, pointwise convergence can be strengthened to uniform convergence by applying the same monotonicity argument used in the proof of the Glivenko-Cantelli Theorem (p. 96 of Shorack and Wellner Proof of Theorem 2 We first show that ( ˆ n, ˆθ n converges to (, θ a.s. To do this, we need to show the following three things: (i sup n ˆ n (τ < ; (ii there exists a convergent subsequence of ( ˆ n, ˆθ n,say( ˆ nk, ˆθ nk (, θ a.s.; (iii (, θ = (, θ. The proof of (i is given in Lemma 1 of the Appendix. For (ii, since every bounded sequence in R p+q has a convergent subsequence, there exists a θ such that ˆθ mk θ. By Helly s theorem (Ash 1972, there exists a function and a subsequence { ˆ nk of { ˆ mk such that ˆ nk for all t [, τ at which is continuous. Therefore, ( ˆ nk, ˆθ nk must converge to (, θ. For (iii, we show in Lemma 2 of the Appendix that is continuous at the continuity points of. And we know that l nk ( ˆ nk, ˆθ nk l nk ( nk, θ = 1 n k log{χ nk,i(t{dn i (t Y i (tg i (t, nk, θ exp(β n Z id nk (t k where + 1 n k n k [log{χ nk,i(t {χ nk,i(t 1Y i (tg i (t, nk, θ exp(β Z id nk (t, χ nk,i(t = ˆ nk (tg i ( ˆ nk, ˆθ nk exp( ˆβ n k Z i nk (tg i ( nk, θ exp(β Z i. The second term on the right-hand side of the above inequality is less or equal to zero since for x >, log(x (x 1. Using the results and techniques of Lemma 2 of the Appendix, the first term can be shown to converge to zero and the second term converges to ( [ { g (t exp(β Z E log g (t exp(β γ(t Z Y(tg (t exp(β Zd (t { g (t exp(β Z g (t exp(β γ(t 1 Z, (17

21 Maximum likelihood estimation in the proportional hazards cure model where g (t = g(t,, θ and γ(t = E[Y(t exp(β Z{δ + (1 δg ( T E[Y(t exp(β Z{δ + (1 δg ( T. (18 Note that (17 is the negative Kullback-Leibler information, E[l(, θ E[l(, θ. Due to the above inequality, the Kullback-Leibler information must equal zero. Thus, with probability one, we have log{λ (t exp(β Zg (tdn(t Y(t exp(β Zg (td (t = log{λ (t exp(β Zg (tdn(t Y(t exp(β Zg (td (t, where λ (t = d (t/dt. This equality holds for the following two cases: (i Y(τ = 1, N(τ =, and (ii Y(τ = 1, N(t =, and N(τ = 1for t (, τ. The difference between the equalities from these two cases entails that λ (t exp(β Zg (t = λ (t exp(β Zg (t, t (, τ. After integrating form to t on both sides of the above equality, we have which implies log{s(t,, θ =log{s(t,, θ, 1 exp{ (t exp(β z 1 exp{ (t exp(β z = π(γ X π(γ X, t (, τ. Since the right-hand side of the above equality is independent of t, we have that = and β = β. This further implies that γ = γ. Therefore, ( ˆ nk, ˆθ nk must converge to (, θ a.s. By Helly s theorem, we know that ( ˆ n, ˆθ n must converge to (, θ a.s. Furthermore, the point-wise convergence can be strengthened to uniform convergence by applying the same monotonicity argument used in the proof of the Glivenko-Cantelli Theorem (Shorack and Wellner Proof of Theorem 3 To prove Theorem 3, we validate each of the four conditions of Theorem 4. First, note that S = S 1 + S 2, where ( [ S 1 (ϒ = E h 1 (t {1 g(t, ϒ exp(β Z h 1 (sd (s {dn(t Y(tg(t, ϒexp(β Zd (t, [ S 2 (ϒ = E h 2 W(t, ϒ{dN(t Y(tg(t, ϒexp(β Zd (t.

22 W. Lu For ϒ ϒ l (H m, we need the following bounds on ϒ ϒ : m θ θ 1 m ϒ ϒ m θ θ 1 2m Step 1. We want to establish condition 1 for all finite m. To do this, we show that the class of score function S {S (ϒ h : h H m is Donsker, where S (ϒh = S (ϒ(h 1 + h 2 S θ (ϒ,with S (ϒ(h 1 = [ h 1 (t {1 g(t, ϒ exp(β Z {dn(t Y(tg(t, ϒexp(β Zd (t h 1 (sd (s T = δh 1 ( T {δ + (1 δg( T, ϒ exp(β Z h 1 (td (t, { S θ (ϒ = W(t, ϒ dn(t Y(tg(t, ϒexp(β Zd (t Conditions 1 and 2 imply that S θ (ϒ is a uniformly bounded function, which implies that {h 2 S θ (ϒ : h R p+q, h 2 1 m is Donsker (see Example of van der Vaart and Wellner Since the sum of bounded Donsker classes is Donsker, the class {S (ϒ (h 1 : h 1 BV[, τ, h 1 v m is Donsker if the following two classes F 1 ={δh 1 ( T : h 1 BV[, τ, h 1 v m, { T F 2 = {δ+(1 δg( T, ϒ exp(β Z h 1 (td (t : h 1 BV[, τ, h 1 v m are Donsker and uniformly bounded. The class F 1 is uniformly bounded and Donsker since h 1 varies over bounded variation functions (see Example of van der Vaart and Wellner, In addition, F 2 equals a uniformly bounded function times the class { T h 1(td (t : h 1 BV[, τ, h 1 v m, and this class is Donsker because is a monotone function (see Example of van der Vaart and Wellner Also F 2 is uniformly bounded because h 1 varies over bounded variation functions. Thus, we conclude that S is Donsker, so the first condition holds. Step 2. To check condition 2, it suffices to show that S(ϒ S(ϒ + Ṡ(ϒ (ϒ ϒ is o( ϒ ϒ as ϒ ϒ goes to. To do this, write S(ϒ linearly in d( and θ θ. To be specific,

23 Maximum likelihood estimation in the proportional hazards cure model ( S 1 (ϒ(h = (θ θ E W(t, ϒ [h 1 (t {1 g(t, ϒ exp(β Z h 1 (sd (sy(tg(t, ϒ exp(β Zd (t ( +E [h 1 (t {1 g(t, ϒ exp(β Z h 1 (sd (s Y(tg(t, ϒ exp(β Zd{ (t (t ( E [h 1 (t {1 g(t, ϒ exp(β Z h 1 (sd (s { (t (ty(tg(t, ϒ {1 g(t, ϒ exp(2β Zd (t + error 1 (ϒ(h, { S 2 (ϒ(h = (θ θ E W(t, ϒ h 2 W(t, ϒ d (t [ +E h 2 W(t, ϒ Y(tg(t, ϒ exp(β Zd{ (t (t [ E h 2 W(t, ϒ { (t (t Y(tg(t, ϒ {1 g(t, ϒ exp(2β Zd (t + error 2 (ϒ(h. And the error terms can be very easily shown to satisfy sup h H m error i (ϒ(h θ θ 1 as θ θ 1, where i = 1, 2. This follows from the boundedness of Z, X, N, Y, θ and. Furthermore, S(ϒ S(ϒ + Ṡ(ϒ (ϒ ϒ ϒ ϒ error 1(ϒ(h + error 2 (ϒ(h m θ θ 1 m Since m θ θ 1 m as ϒ ϒ, we conclude that S(ϒ S(ϒ + Ṡ(ϒ (ϒ ϒ is o( ϒ ϒ as ϒ ϒ goes to. Note that S(ϒ =, combining S 1 and S 2 we have Ṡ(ϒ ( ˆϒ n ϒ (h = σ 1 (h(td( ˆ n (t + ( ˆθ n θ σ 2 (h. (19 Step 3. To check condition 3, we need to show that Ṡ(ϒ is continuously invertible. Following Scharfsten et al. (1998, we need to show that σ(h is invertible

24 W. Lu everywhere due to the discreteness of ˆ n. To do this, we first show that σ(h is one-to-one map on L 2 (d R p+q, i.e. σ 1 (h(th 1 (td (t + h 2 σ 2(h = (2 implies that h 2 = and h 1 (t = almost everywhere (d. Plug σ 1 (h(t and σ 2 (h into the above equation, we have ( = E t [ h 1 (t V(t, ϒ Y(tg(t, ϒ exp(β Z V(s, ϒ Y(tg(s, ϒ {1 g(s, ϒ exp(2β Zd (s d (t { + h 2 E W(t, ϒ V(t, ϒ Y(tg(t, ϒ exp(β Zd (t [ = E {h 1 (t + h 2 W(t, ϒ V(t, ϒ Y(tg(t, ϒ exp(β Zd (t [ E h 1 (sd (sv(t, ϒ Y(tg(t, ϒ {1 g(t, ϒ exp(2β Zd (t [ = E V 2 (t, ϒY(tg(t, ϒ exp(β Zd (t. Since Y(tg(t, ϒ exp(β Z>a.e.on[, τ, it implies V(t, ϒ = a.e.(d. Therefore, for almost all ω, h 1 (t +{1 g(t, ϒ [h 22 X(ω t exp{β Z(ω {h 21 Z(ω + h 1(sd (s = h 21 Z(ω a.e. (d. From this, h 21 must be zero. It further implies that h 1 (t 1 g(t, ϒ exp{β Z(ω h 1 (sd (s = h 22 X(ω a.e. (d. Similarly, h 22 must be zero. With h 2 = (h 21, h 22 =, we have h 1 (t {1 g(t, ϒ exp(β Z h 1(sd (s = a.e.(d. This is a Volterra integral equation of the first kind. It is easy to show that the solution, h 1 (, to this equation must be zero a.e. (d. Now based on the fact (2, we want to show that σ is one-to-one everywhere, i.e. set σ = and show that h 2 = and h 1 (t = for all t on [, τ. Ifσ 1 (h(t = for all t and σ 2 (h =, from (2 we have h 2 = and h 1 (t = a.e.(d. Then

25 Maximum likelihood estimation in the proportional hazards cure model = σ 1 (h(t = E{V(t, ϒ Y(tg(t, ϒ exp(β [ Z τ E V(s, ϒ g(s, ϒ {1 g(s, ϒ exp(2β Zd (s. t Based on the fact h 1 (td (t = for all t on [, τ, we have σ 1 (h(t = h 1 (te{y(tg(t, ϒ exp(β Z =. Since E{Y(tg(t, ϒ exp(β Z > for all t on [, τ, we conclude that h 1(t = for all t. Therefore, σ is one-to-one everywhere. Then, we want to show that σ, as a continuous linear operator from H to H, has a continuous inverse. Note that H is a Banach space, if σ is invertible, then the inverse will be continuous (see p. 149, Luenberger, To show that σ is invertible, since σ is one-to-one, we only need to show that it can be written as the difference of a bounded, linear operator with a bounded inverse and a compact, linear operator (see Corollary 3.8 and Theorem 3.4 of Kress, To do this, we define the following linear operator (h(t = ( h 1 (te{y(t exp(β Zg(t, ϒ, { h 2 E W 2 (t, ϒ Y(t exp(β Zg(t, ϒ d (t This is a bounded linear operator due to the boundedness of Z, X, θ, Y(t and (t. In addition, under conditions 1 4, E{Y(t exp(β Zg(t, ϒ >ɛon [, τ for some ɛ> and W 2 (t, ϒ is positive definite a.e. on [, τ, which implies { E W 2 (t, ϒ Y(t exp(β Zg(t, ϒ d (t is a positive definite matrix. Hence, the inverse of (h(t exists, which is given by 1 (h(t = ( h 1 (te{y(t exp(β Zg(t, ϒ 1, { 1 h 2 E W 2 (t, ϒ Y(t exp(β Zg(t, ϒ d (t and it is also a bounded linear operator. Now we want to show that (h σ(h is compact. Let {h n be a sequence in H m. Then it only needs to show that there exists a convergent subsequence of (h n σ(h n. Since h 1n is of bounded variation, we can write h 1n as the difference of increasing functions (see Lemma of Ash In addition, both of these increasing functions are bounded in absolute value by 2m.This

26 W. Lu implies that there exists a pointwise convergent subsequence according to Helly s theorem. Let {h nk be the convergent subsequence with limit h.wemust prove that (h nk σ(h nk converges to (h σ(h in h H norm. To do this, note that (h σ(h can be written as ( ([{ E 1 g(t, ϒ exp(β Z [ +E t ( E h 1 (sd (s h 2 W(t, ϒ Y(tg(t, ϒ exp(β Z V(s, ϒ Y(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s, W(t, ϒ [h 1 (t {1 g(t, ϒ exp(β Z Y(tg(t, ϒ exp(β Zd (t h 1 (sd (s Therefore, (h nk σ(h nk (h + σ(h H E ([{1 g(t, ϒ exp(β Z (h 1nk h 1 (sd (s (h 2nk h 2 W(t, ϒ Y(tg(t, ϒ exp(β Z ( s +E [(h 1nk h 1 (s {1 g(s, ϒ exp(β Z (h 1nk h 1 (ud (u +(h 2nk h 2 W(s, ϒ Y(sg(s, ϒ {1 g(s, ϒ exp(2β Zd (s ( + E W(t, ϒ [(h 1nk h 1 (t {1 g(t, ϒ exp(β Z (h 1nk h 1 (sd (sy(tg(t, ϒ exp(β Zd (t Due to the boundedness of Z, X, θ, Y(t and (t and the fact that < g(t, ϒ <1on[, τ, the right-hand side of the above inequality is bounded above by M 1 h 2nk h 2 1+ {M 2 (h 1nk h 1 (t +M 3 v H (h 1nk h 1 (s d (sd (t where M 1, M 2 and M 3 are some positive constants. Then applying the dominated convergence theorem, we can show that this sum converges to zero. Hence, (h σ(h is a compact linear operator from H m onto its range for all finite m. Now following the similar steps (14 and (15 of Scharfsten et al. (1998, we can show that Ṡ(ϒ is continuously invertible on its range.

27 Maximum likelihood estimation in the proportional hazards cure model Step 4. To check condition 4, it suffices to show (by Lemma 1 of van der Vaart 1995 that F ={S (ϒ(h S (ϒ (h : h H m, ϒ ϒ <ɛ is Donsker for some ɛ> and sup h Hm E[{S (ϒ(h S (ϒ (h 2 converges to as ϒ converges to ϒ. We can write F = F 3 + F 4, where { { F 3 = h 2 W(t, ϒdM(t, ϒ F 4 = W(t, ϒ dm(t, ϒ : h 2 R p+q, h 2 1 m, θ [ ɛ, ɛ p+q, nonnegative, increasing with (τ 2 (τ { {δ + (1 δg( T, ϒ exp(β Z T T h 1 (td (t {δ + (1 δg( T, ϒ exp(β Z h 1 (td (t : h 1 BV[, τ, h 1 v m, θ [ ɛ, ɛ p+q, nonnegative, increasing with (τ 2 (τ We want to show that F 3 and F 4 are Donsker with uniformly bounded envelopes. Using the result from empirical process theory, that classes of smooth functions are Donsker (see Theorem of van der Vaart and Wellner 1996, we know that the following two classes { δw(t, ϒ : θ [ ɛ, ɛ p+q, nonnegative, increasing with (τ 2 (τ and { W(t, ϒY(tg(t, ϒexp(β Zd (t : θ [ ɛ, ɛ p+q, nonnegative, increasing with (τ 2 (τ are Donsker. In addition, both classes are uniformly bounded due to the boundedness of Z, X, θ, Y(t and (t. Hence, according to the result, that classes of Lipschitz transformations of Donsker classes with integrable envelope functions are Donsker (see Theorem of van der Vaart and Wellner (1996, F 3 is Donsker with uniformly bounded envelope. Theorem of van der Vaart and Wellner (1996 also implies that the class BV M of all real-valued functions on [, τ, which are uniformly bounded by a constant M and are of variation bounded by M, is Donsker. Since T h 1 (td (t h 1 v <, v

28 W. Lu the class { T h 1 (td (t : h 1 BV[, τ, h 1 v m, nonnegative, increasing with (τ 2 (τ is Donsker and uniformly bounded. In addition, since {δ + (1 δg( T, ϒ exp(β Z is uniformly bounded, F 4 is also Donsker with uniformly bounded envelope (see Example of van der Vaart and Wellner Thus, we have shown that all the four conditions in Theorem 4 hold. Then, we know that for all finite m Ṡ(ϒ n( ˆϒ n ϒ (h = σ 1 (h(td( n ˆ n (t n( ˆθ n θ σ 2 (h = n{s( ˆϒ n S(ϒ +o P (1 = n{s n ( ˆϒ n S n (ϒ +o P (1 = n{s n (ϒ S(ϒ +o P (1 uniformly in h H m, where the last equality in the above equation holds since S n ( ˆϒ n = S(ϒ =. Hence, n( ˆϒ n ϒ Ṡ(ϒ 1 G. Following the similar steps of Scharfsten et al. (1998, we can show that Ṡ(ϒ 1 G = G, where G is defined in Theorem 3. Proof of Theorem 5 Since σ is continuously invertible on its range, ˆσ is continuously invertible on a set of probability going to 1. According to Scharfsten et al. (1998, it suffices to show that sup h H =1 ˆσ(h σ(h H converges in probability to. The most difficult part in the proof is to deal with the terms in ˆσ(h σ(h involving h 1. For example, one such term can be written as 1 n n [ h 1 (t {1 g i (t, ˆϒ n exp( ˆβ n Z i W i (t, ˆϒ n Y i (tg i (t, ˆϒ n exp( ˆβ n Z id ˆ n (t ( E [h 1 (t {1 g(t, ϒ exp(β Z W(t, ϒ Y i (tg(t, ϒ exp(β Zd (t h 1 (sd ˆ n (s h 1 (sd (s

29 Maximum likelihood estimation in the proportional hazards cure model The variation of this term can be bounded by the variation of 1 n n ( [ h 1 (t {1 g i (t, ˆϒ n exp( ˆβ n Z i W i (t, ˆϒ n Y i (tg i (t, ˆϒ n exp( ˆβ n Z id ˆ n (t [h 1 (t {1 g i (t, ϒ exp(β Z W i (t, ϒ Y i (tg i (t, ϒ exp(β Zd (t h 1 (sd ˆ n (s h 1 (sd (s plus the variation of 1 n n { [h 1 (t {1 g i (t, ϒ exp(β Z W i (t, ϒ Y i (tg i (t, ϒ exp(β ( Zd (t τ E [h 1 (t {1 g(t, ϒ exp(β Z W(t, ϒ Y i (tg(t, ϒ exp(β Zd (t h 1 (sd (s h 1 (sd (s Uniform consistency of ˆϒ implies the first term converges to zero. The variation of the second term is bounded by 1 n n W i (t, ϒ Y i (tg i (t, ϒ exp(β Z E{W(t, ϒ Y i (tg(t, ϒ exp(β Z v d (t h 1 v + 1 n n W i (t, ϒ Y i (tg i (t, ϒ {1 g i (t, ϒ exp(2β Z E[W(t, ϒ Y i (tg(t, ϒ {1 g(t, ϒ exp(2β Z v d (t h 1 v (τ The integrand converges to zero by the strong law of large numbers. Thus the term converges to zero by the dominated convergence theorem, since h 1 v 1 and (τ is finite. All other terms in ˆσ(h σ(h can be handled similarly. The details of the proof is omitted here.

30 W. Lu References Andersen, P. K., Borgan, Ø., Gill, R., Keiding, N. (1993. Statistical models based on counting processes. New York: Springer Ash, R.B. (1972. Real analysis and probability. San Diego: Academic Berkson, J., Gage, R. P. (1952. Survival curve for cancer patients following treatment. Journal of the American Statistical Association, 47, Bickel, P. J., Klaassen, C. A. J., Ritov, Y., Wellner, J. A. (1993. Efficient and adaptive estimation for semiparametric models. Baltimore: Johns Hopkins University Press. Bilias, Y., Gu, M., Ying, Z. (1997. Towards a general asymptotic theory for Cox model with staggered entry. Annals of Statistics, 25, Chen, K., Jin, Z., Ying, Z. (22. Semiparametric of transformation models with censored data. Biometrika, 89, Cox, D. R. (1972. Regression models and life tables (with Discussion. Journal of the Royal Statistical Society Series B, 34, Fang, H. B., Li, G., Sun, J. (25. Maximum likelihood estimation in a semiparametric logistic/proportional-hazards mixture model. Scandinavian Journal of Statistics, 32, Farewell, V. T. (1982. The use of mixture models for the analysis of survival data with long-term survivors. Biometrics, 38, Farewell, V. T. (1986. Mixture models in survival analysis: are they worth the risk? Canadian Journal of Statistics, 14, Fleming, T. R., Harrington, D. P. (1991. Counting processes and survival analysis. New York: Wiley. Kress, R. (1989. Linear integral equations. New York: Springer. Kuk, A. Y. C., Chen, C.-H. (1992. A mixture model combining logistic regression with proportional hazards regression. Biometrika, 79, Loeve, M. (1963. Probability theory. Princeton: D. Van Nordstrand Company. Lu, W., Ying, Z. (24. On semiparametric transformation cure models. Biometrika, 91, Luenberger, D. G. (1969. Optimization by vector space methods. New York: Wiley. Maller, R. A., Zhou, S. (1996. Survival analysis with long term survivors. New York: Wiley. Murphy, S. A. (1994. Consistency in a proportional hazards model incorporating a random effect. Annals of Statistics, 22, Murphy, S. A. (1995. Asymptotic theory for the frailty model. Annals of Statistics, 23, Peng, Y., Dear, K. B. G. (2. A nonparametric mixture model for cure rate estimation. Biometrics, 56, Scharfsten, D. O., Tsiatis, A. A., Gilbert, P. B. (1998. Semiparametric efficient estimation in the generalized odds-rate class of regression models for right-censored time-to-event data. Lifetime Data Analysis, 4, Serfling, R. J. (198. Approximation theorems of mathematical statistics. New York: Wiley. Shorack, G. R., Wellner, J. A. (1986. Empirical processes with applications to statistics. New York: Wiley. Sy, J. P., Taylor, J. M. G. (2. Estimation in a Cox proportional hazards cure model. Biometrics, 56, Tsiatis, A. A. (1981. A large sample study of Cox s regression model. Annals of Statistics, 9, Tsodikov, A. D. (1998. A proportional hazards model taking account of long-term survivors. Biometrics, 54, Tsodikov, A. D. (21. Estimation of survival based on proportional hazards when cure is a possibility. Mathematical and Computer Modelling, 33, Tsodikov, A. D., Ibrahim, J. G., Yakovlev, A. Y. (23. Estimating cure rates from survival data: an alternative to two-component mixture models. Journal of the American Statistical Association, 98, van der Vaart, A. W. (1995. Efficiency of infinite dimensional M-estimators. Statistica Neerlandica, 49, 9 3. van der Vaart, A. W., Wellner, J. A. (1996. Weak convergence and empirical Processes. New York: Springer. Zeng, D., Ying, G., Ibrahim, J. G. (26. Semiparametric transformation models for survival data with a cure fraction. Journal of the American Statistical Association, 11,

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling

Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling Estimation and Inference of Quantile Regression for Survival Data under Biased Sampling Supplementary Materials: Proofs of the Main Results S1 Verification of the weight function v i (t) for the lengthbiased

More information

STAT Sample Problem: General Asymptotic Results

STAT Sample Problem: General Asymptotic Results STAT331 1-Sample Problem: General Asymptotic Results In this unit we will consider the 1-sample problem and prove the consistency and asymptotic normality of the Nelson-Aalen estimator of the cumulative

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

A note on L convergence of Neumann series approximation in missing data problems

A note on L convergence of Neumann series approximation in missing data problems A note on L convergence of Neumann series approximation in missing data problems Hua Yun Chen Division of Epidemiology & Biostatistics School of Public Health University of Illinois at Chicago 1603 West

More information

NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA

NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA Statistica Sinica 213): Supplement NETWORK-REGULARIZED HIGH-DIMENSIONAL COX REGRESSION FOR ANALYSIS OF GENOMIC DATA Hokeun Sun 1, Wei Lin 2, Rui Feng 2 and Hongzhe Li 2 1 Columbia University and 2 University

More information

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

Statistical Modeling and Analysis for Survival Data with a Cure Fraction

Statistical Modeling and Analysis for Survival Data with a Cure Fraction Statistical Modeling and Analysis for Survival Data with a Cure Fraction by Jianfeng Xu A thesis submitted to the Department of Mathematics and Statistics in conformity with the requirements for the degree

More information

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data Efficiency Comparison Between Mean and Log-rank Tests for Recurrent Event Time Data Wenbin Lu Department of Statistics, North Carolina State University, Raleigh, NC 27695 Email: lu@stat.ncsu.edu Summary.

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Full likelihood inferences in the Cox model: an empirical likelihood approach

Full likelihood inferences in the Cox model: an empirical likelihood approach Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH Jian-Jian Ren 1 and Mai Zhou 2 University of Central Florida and University of Kentucky Abstract: For the regression parameter

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

Empirical Processes & Survival Analysis. The Functional Delta Method

Empirical Processes & Survival Analysis. The Functional Delta Method STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

On the Breslow estimator

On the Breslow estimator Lifetime Data Anal (27) 13:471 48 DOI 1.17/s1985-7-948-y On the Breslow estimator D. Y. Lin Received: 5 April 27 / Accepted: 16 July 27 / Published online: 2 September 27 Springer Science+Business Media,

More information

1 Local Asymptotic Normality of Ranks and Covariates in Transformation Models

1 Local Asymptotic Normality of Ranks and Covariates in Transformation Models Draft: February 17, 1998 1 Local Asymptotic Normality of Ranks and Covariates in Transformation Models P.J. Bickel 1 and Y. Ritov 2 1.1 Introduction Le Cam and Yang (1988) addressed broadly the following

More information

SEMIPARAMETRIC LIKELIHOOD RATIO INFERENCE. By S. A. Murphy 1 and A. W. van der Vaart Pennsylvania State University and Free University Amsterdam

SEMIPARAMETRIC LIKELIHOOD RATIO INFERENCE. By S. A. Murphy 1 and A. W. van der Vaart Pennsylvania State University and Free University Amsterdam The Annals of Statistics 1997, Vol. 25, No. 4, 1471 159 SEMIPARAMETRIC LIKELIHOOD RATIO INFERENCE By S. A. Murphy 1 and A. W. van der Vaart Pennsylvania State University and Free University Amsterdam Likelihood

More information

3003 Cure. F. P. Treasure

3003 Cure. F. P. Treasure 3003 Cure F. P. reasure November 8, 2000 Peter reasure / November 8, 2000/ Cure / 3003 1 Cure A Simple Cure Model he Concept of Cure A cure model is a survival model where a fraction of the population

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

Comparing Distribution Functions via Empirical Likelihood

Comparing Distribution Functions via Empirical Likelihood Georgia State University ScholarWorks @ Georgia State University Mathematics and Statistics Faculty Publications Department of Mathematics and Statistics 25 Comparing Distribution Functions via Empirical

More information

MAXIMUM LIKELIHOOD METHOD FOR LINEAR TRANSFORMATION MODELS WITH COHORT SAMPLING DATA

MAXIMUM LIKELIHOOD METHOD FOR LINEAR TRANSFORMATION MODELS WITH COHORT SAMPLING DATA Statistica Sinica 25 (215), 1231-1248 doi:http://dx.doi.org/1.575/ss.211.194 MAXIMUM LIKELIHOOD METHOD FOR LINEAR TRANSFORMATION MODELS WITH COHORT SAMPLING DATA Yuan Yao Hong Kong Baptist University Abstract:

More information

University of California San Diego and Stanford University and

University of California San Diego and Stanford University and First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford

More information

Statistical Analysis of Competing Risks With Missing Causes of Failure

Statistical Analysis of Competing Risks With Missing Causes of Failure Proceedings 59th ISI World Statistics Congress, 25-3 August 213, Hong Kong (Session STS9) p.1223 Statistical Analysis of Competing Risks With Missing Causes of Failure Isha Dewan 1,3 and Uttara V. Naik-Nimbalkar

More information

Semiparametric Models for Joint Analysis of Longitudinal Data and Counting Processes

Semiparametric Models for Joint Analysis of Longitudinal Data and Counting Processes Semiparametric Models for Joint Analysis of Longitudinal Data and Counting Processes by Se Hee Kim A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data 1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

arxiv:submit/ [math.st] 6 May 2011

arxiv:submit/ [math.st] 6 May 2011 A Continuous Mapping Theorem for the Smallest Argmax Functional arxiv:submit/0243372 [math.st] 6 May 2011 Emilio Seijo and Bodhisattva Sen Columbia University Abstract This paper introduces a version of

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Exercises. (a) Prove that m(t) =

Exercises. (a) Prove that m(t) = Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest

More information

Goodness-of-fit test for the Cox Proportional Hazard Model

Goodness-of-fit test for the Cox Proportional Hazard Model Goodness-of-fit test for the Cox Proportional Hazard Model Rui Cui rcui@eco.uc3m.es Department of Economics, UC3M Abstract In this paper, we develop new goodness-of-fit tests for the Cox proportional hazard

More information

Empirical Likelihood in Survival Analysis

Empirical Likelihood in Survival Analysis Empirical Likelihood in Survival Analysis Gang Li 1, Runze Li 2, and Mai Zhou 3 1 Department of Biostatistics, University of California, Los Angeles, CA 90095 vli@ucla.edu 2 Department of Statistics, The

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

CURE MODEL WITH CURRENT STATUS DATA

CURE MODEL WITH CURRENT STATUS DATA Statistica Sinica 19 (2009), 233-249 CURE MODEL WITH CURRENT STATUS DATA Shuangge Ma Yale University Abstract: Current status data arise when only random censoring time and event status at censoring are

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

STAT 331. Martingale Central Limit Theorem and Related Results

STAT 331. Martingale Central Limit Theorem and Related Results STAT 331 Martingale Central Limit Theorem and Related Results In this unit we discuss a version of the martingale central limit theorem, which states that under certain conditions, a sum of orthogonal

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

On Estimation of Partially Linear Transformation. Models

On Estimation of Partially Linear Transformation. Models On Estimation of Partially Linear Transformation Models Wenbin Lu and Hao Helen Zhang Authors Footnote: Wenbin Lu is Associate Professor (E-mail: wlu4@stat.ncsu.edu) and Hao Helen Zhang is Associate Professor

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

Modelling Survival Events with Longitudinal Data Measured with Error

Modelling Survival Events with Longitudinal Data Measured with Error Modelling Survival Events with Longitudinal Data Measured with Error Hongsheng Dai, Jianxin Pan & Yanchun Bao First version: 14 December 29 Research Report No. 16, 29, Probability and Statistics Group

More information

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications. Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

More information

1 Introduction. 2 Residuals in PH model

1 Introduction. 2 Residuals in PH model Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical

More information

Linear life expectancy regression with censored data

Linear life expectancy regression with censored data Linear life expectancy regression with censored data By Y. Q. CHEN Program in Biostatistics, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A.

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Goodness-Of-Fit for Cox s Regression Model. Extensions of Cox s Regression Model. Survival Analysis Fall 2004, Copenhagen

Goodness-Of-Fit for Cox s Regression Model. Extensions of Cox s Regression Model. Survival Analysis Fall 2004, Copenhagen Outline Cox s proportional hazards model. Goodness-of-fit tools More flexible models R-package timereg Forthcoming book, Martinussen and Scheike. 2/38 University of Copenhagen http://www.biostat.ku.dk

More information

arxiv: v1 [math.oc] 22 Sep 2016

arxiv: v1 [math.oc] 22 Sep 2016 EUIVALENCE BETWEEN MINIMAL TIME AND MINIMAL NORM CONTROL PROBLEMS FOR THE HEAT EUATION SHULIN IN AND GENGSHENG WANG arxiv:1609.06860v1 [math.oc] 22 Sep 2016 Abstract. This paper presents the equivalence

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

Competing risks data analysis under the accelerated failure time model with missing cause of failure

Competing risks data analysis under the accelerated failure time model with missing cause of failure Ann Inst Stat Math 2016 68:855 876 DOI 10.1007/s10463-015-0516-y Competing risks data analysis under the accelerated failure time model with missing cause of failure Ming Zheng Renxin Lin Wen Yu Received:

More information

5.1 Approximation of convex functions

5.1 Approximation of convex functions Chapter 5 Convexity Very rough draft 5.1 Approximation of convex functions Convexity::approximation bound Expand Convexity simplifies arguments from Chapter 3. Reasons: local minima are global minima and

More information

Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

More information

Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data

Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data Asymptotic Properties of Nonparametric Estimation Based on Partly Interval-Censored Data Jian Huang Department of Statistics and Actuarial Science University of Iowa, Iowa City, IA 52242 Email: jian@stat.uiowa.edu

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1. By Thomas H. Scheike University of Copenhagen

A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1. By Thomas H. Scheike University of Copenhagen The Annals of Statistics 21, Vol. 29, No. 5, 1344 136 A GENERALIZED ADDITIVE REGRESSION MODEL FOR SURVIVAL TIMES 1 By Thomas H. Scheike University of Copenhagen We present a non-parametric survival model

More information

Verifying Regularity Conditions for Logit-Normal GLMM

Verifying Regularity Conditions for Logit-Normal GLMM Verifying Regularity Conditions for Logit-Normal GLMM Yun Ju Sung Charles J. Geyer January 10, 2006 In this note we verify the conditions of the theorems in Sung and Geyer (submitted) for the Logit-Normal

More information

Attributable Risk Function in the Proportional Hazards Model

Attributable Risk Function in the Proportional Hazards Model UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

More information

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Statistica Sinica 19 (2009), 997-1011 ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Anthony Gamst, Michael Donohue and Ronghui Xu University

More information

Analysis of Cure Rate Survival Data Under Proportional Odds Model

Analysis of Cure Rate Survival Data Under Proportional Odds Model Analysis of Cure Rate Survival Data Under Proportional Odds Model Yu Gu 1,, Debajyoti Sinha 1, and Sudipto Banerjee 2, 1 Department of Statistics, Florida State University, Tallahassee, Florida 32310 5608,

More information

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as

More information

An exponential family of distributions is a parametric statistical model having densities with respect to some positive measure λ of the form.

An exponential family of distributions is a parametric statistical model having densities with respect to some positive measure λ of the form. Stat 8112 Lecture Notes Asymptotics of Exponential Families Charles J. Geyer January 23, 2013 1 Exponential Families An exponential family of distributions is a parametric statistical model having densities

More information

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0,

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0, Accelerated failure time model: log T = β T Z + ɛ β estimation: solve where S n ( β) = n i=1 { Zi Z(u; β) } dn i (ue βzi ) = 0, Z(u; β) = j Z j Y j (ue βz j) j Y j (ue βz j) How do we show the asymptotics

More information

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 Z-theorems: Notation and Context Suppose that Θ R k, and that Ψ n : Θ R k, random maps Ψ : Θ R k, deterministic

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

DAGStat Event History Analysis.

DAGStat Event History Analysis. DAGStat 2016 Event History Analysis Robin.Henderson@ncl.ac.uk 1 / 75 Schedule 9.00 Introduction 10.30 Break 11.00 Regression Models, Frailty and Multivariate Survival 12.30 Lunch 13.30 Time-Variation and

More information

Semi-parametric Inference for Cure Rate Models 1

Semi-parametric Inference for Cure Rate Models 1 Semi-parametric Inference for Cure Rate Models 1 Fotios S. Milienos jointly with N. Balakrishnan, M.V. Koutras and S. Pal University of Toronto, 2015 1 This research is supported by a Marie Curie International

More information

A semiparametric generalized proportional hazards model for right-censored data

A semiparametric generalized proportional hazards model for right-censored data Ann Inst Stat Math (216) 68:353 384 DOI 1.17/s1463-14-496-3 A semiparametric generalized proportional hazards model for right-censored data M. L. Avendaño M. C. Pardo Received: 28 March 213 / Revised:

More information

ST745: Survival Analysis: Cox-PH!

ST745: Survival Analysis: Cox-PH! ST745: Survival Analysis: Cox-PH! Eric B. Laber Department of Statistics, North Carolina State University April 20, 2015 Rien n est plus dangereux qu une idee, quand on n a qu une idee. (Nothing is more

More information

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner

More information

COX S REGRESSION MODEL UNDER PARTIALLY INFORMATIVE CENSORING

COX S REGRESSION MODEL UNDER PARTIALLY INFORMATIVE CENSORING COX S REGRESSION MODEL UNDER PARTIALLY INFORMATIVE CENSORING Roel BRAEKERS, Noël VERAVERBEKE Limburgs Universitair Centrum, Belgium ABSTRACT We extend Cox s classical regression model to accomodate partially

More information

ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL

ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Statistica Sinica 18(28, 219-234 ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Wenbin Lu and Yu Liang North Carolina State University and SAS Institute Inc.

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations

Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data Xingqiu Zhao and Ying Zhang The Hong Kong Polytechnic University and Indiana University Abstract:

More information

Lecture 2: Martingale theory for univariate survival analysis

Lecture 2: Martingale theory for univariate survival analysis Lecture 2: Martingale theory for univariate survival analysis In this lecture T is assumed to be a continuous failure time. A core question in this lecture is how to develop asymptotic properties when

More information

Statistical Inference and Methods

Statistical Inference and Methods Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and

More information

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017

Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a

More information

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due ). Show that the open disk x 2 + y 2 < 1 is a countable union of planar elementary sets. Show that the closed disk x 2 + y 2 1 is a countable

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

STAT331. Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model

STAT331. Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model STAT331 Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model Because of Theorem 2.5.1 in Fleming and Harrington, see Unit 11: For counting process martingales with

More information

Models for Multivariate Panel Count Data

Models for Multivariate Panel Count Data Semiparametric Models for Multivariate Panel Count Data KyungMann Kim University of Wisconsin-Madison kmkim@biostat.wisc.edu 2 April 2015 Outline 1 Introduction 2 3 4 Panel Count Data Motivation Previous

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

(Part 1) High-dimensional statistics May / 41

(Part 1) High-dimensional statistics May / 41 Theory for the Lasso Recall the linear model Y i = p j=1 β j X (j) i + ɛ i, i = 1,..., n, or, in matrix notation, Y = Xβ + ɛ, To simplify, we assume that the design X is fixed, and that ɛ is N (0, σ 2

More information