Pseudo full likelihood estimation for prospective survival analysis with a general semiparametric shared frailty model: asymptotic theory


David M. Zucker¹
Department of Statistics, Hebrew University, Mt. Scopus, Jerusalem 91905, Israel
mszucker@mscc.huji.ac.il

Malka Gorfine
Faculty of Industrial Engineering and Management, Technion, Technion City, Haifa 32000, Israel, and Department of Mathematics, Bar-Ilan University, Ramat-Gan 52900, Israel
gorfinm@ie.technion.ac.il

Li Hsu
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
lih@fhcrc.org

August 29, 2007

¹ To whom correspondence should be addressed.

Abstract

In this work we present a simple estimation procedure for a general frailty model for the analysis of prospective correlated failure times. Earlier work showed this method to perform well in a simulation study. Here we provide rigorous large-sample theory for the proposed estimators of both the regression coefficient vector and the dependence parameter, including consistent variance estimators.

Key words: Correlated failure times; EM algorithm; Frailty model; Prospective family study; Survival analysis.

1 Introduction

Many epidemiological studies involve failure times that are clustered into groups, such as families or schools. Unobserved characteristics shared by members of the same cluster (e.g. genetic information or unmeasured shared environmental exposures) could influence time to the studied event. Frailty models express within-cluster dependence through a shared unobservable random effect. Estimation in the frailty model has received much attention under various frailty distributions, including gamma (Gill, 1985, 1989; Nielsen et al., 1992; Klein, 1992, among others), positive stable (Hougaard, 1986; Fine et al., 2003), inverse Gaussian, compound Poisson (Henderson and Oman, 1999) and log-normal (McGilchrist, 1993; Ripatti and Palmgren, 2000; Vaida and Xu, 2000, among others). Hougaard (2000) provides a comprehensive review of the properties of the various frailty distributions.

In a frailty model, the parameters of interest typically are the regression coefficients, the cumulative baseline hazard function, and the dependence parameters in the random effect distribution. Since the frailties are latent covariates, the Expectation-Maximization (EM) algorithm is a natural estimation tool, with the latent covariates estimated in the E-step and the likelihood maximized in the M-step after substituting in the estimated latent quantities. Gill (1985), Nielsen et al. (1992) and Klein (1992) discussed EM-based maximum likelihood estimation for the semiparametric gamma frailty model. One problem with the EM algorithm is that variance estimates for the estimated parameters are not readily available (Louis, 1982; Gill, 1989; Nielsen et al., 1992; Andersen et al., 1997). It has been suggested (Gill, 1989; Nielsen et al., 1992) that a nonparametric information calculation could yield consistent variance estimators. Parner (1998), building on Murphy (1994, 1995), proved the consistency and asymptotic normality of the maximum likelihood estimator in the gamma frailty model.
Parner also presented a consistent estimator of the

limiting covariance matrix of the estimator, based on inverting a discrete observed information matrix. He noted that since the dimension of the observed information matrix grows with the number of observed survival times, inverting the matrix is practically infeasible for a large data set with many distinct failure times. He therefore suggested an alternate approach to estimating the covariance, based on solving a discrete version of a second order Sturm-Liouville equation, along the lines of Bickel (1985). This covariance estimator requires less computational effort, but is still not simple to implement.

We (Gorfine et al., 2006) developed a new method that can handle any parametric frailty distribution with finite moments. Nonconjugate frailty distributions can be handled by a simple univariate numerical integration over the frailty distribution. Our new method possesses a number of desirable properties: a non-iterative procedure for estimating the cumulative hazard function; consistency and asymptotic normality of the parameter estimates; a direct consistent covariance estimator; and easy computation and implementation. The method was found to perform well in a simulation study, with results very similar to those of the EM-based method. Indeed, on a dataset-by-dataset basis, the correlation between our estimator and the EM estimator was found to be 95% for the covariate regression parameter and 98-99% for the within-cluster dependence parameter.

The purpose of the current paper is to present in detail the theoretical justification for the method. Our technical approach resembles that of Bagdonavicius and Nikulin (1999) and Dabrowska (2006a, 2006b). These works, however, dealt with a univariate data context, whereas we deal with a clustered data context. Dabrowska works with a transformation model with unknown transformation. She discusses the univariate gamma frailty model, but assumes that the shape parameter of the frailty distribution is known.
Indeed, as discussed in Dabrowska (2006a), identifiability problems arise in the univariate

gamma frailty model with unknown shape parameter when an unknown transformation is involved. In fact, even when the transformation is known, if there are no covariate effects on the hazard rate (i.e., in the model (1) below, the regression parameter vector $\beta$ is equal to zero), the shape parameter cannot be identified from univariate data (Lancaster and Nickell, 1980). In our setting, there is no unknown transformation, and we have clustered data. In this case, the shape parameter is identifiable irrespective of whether $\beta$ is zero or nonzero. In our work, we are specifically interested in estimating the shape parameter, which expresses the within-cluster dependence. In genetic research and other contexts, this cluster dependence parameter is itself of significant scientific interest, because it provides insight into the impact of genetic and environmental factors on the disease incidence.

Dabrowska (2006b) discusses a one-step method for converting a consistent estimator into a semiparametric efficient estimator. In principle, this approach could be applied to our estimator as well. In our simulations, however, we found that our estimator was comparable in efficiency to the full nonparametric MLE. Thus, although our estimator is not theoretically semiparametric efficient, in practical terms it closely approaches semiparametric efficiency.

The plan of the paper is as follows. Section 2 presents the estimation procedure. Section 3 presents the consistency and asymptotic normality results, along with the covariance estimator for the parameter estimates. Section 4 presents a simulation study. Section 5 presents the technical conditions required for our theoretical results and the proofs of these results. The proofs are patterned after Zucker (2005), but with a number of significant differences, which are described at the beginning of Section 5.

2 The Proposed Approach

Consider $n$ families, with family $i$ containing $m_i$ members, $i = 1, \ldots, n$. Following Parner (1998, p. 187), we regard $m_i$ as a random variable over $\{1, \ldots, m\}$ for some $m$, and build up the remainder of the model conditional on $m_i$. Let $T^0_{ij}$ and $C_{ij}$ denote the failure and censoring times, respectively, for individual $ij$. The observed follow-up time is $T_{ij} = \min(T^0_{ij}, C_{ij})$, and the failure indicator is $\delta_{ij} = I(T^0_{ij} \le C_{ij})$. On each individual, we observe a $p$-vector of covariates $Z_{ij}$. In addition, we associate with family $i$ an unobservable family-level covariate $W_i$, the frailty, which induces dependence among family members. The conditional hazard function for individual $ij$, given the family frailty $W_i$, is taken to be
\[ \lambda_{ij}(t) = W_i \, \lambda_0(t) \exp(\beta^T Z_{ij}), \qquad i = 1, \ldots, n, \quad j = 1, \ldots, m_i, \tag{1} \]
where $\lambda_0$ is an unspecified conditional baseline hazard and $\beta$ is a $p$-vector of unknown regression coefficients. This is an extension of the Cox (1972) proportional hazards model, with the hazard function for an individual in family $i$ multiplied by $W_i$. Conditional on $W_i$, the individuals within a family are assumed independent. We also assume that, given $Z_{ij}$ and $W_i$, the censoring is independent and noninformative for $W_i$ and $(\beta, \Lambda_0)$ (Andersen et al., 1993, Sec. III.2.3). We assume further that the frailty $W_i$ is independent of $Z_{ij}$ and has a density $f(w; \theta)$, where $\theta$ is an unknown parameter. For simplicity we assume that $\theta$ is a scalar, but the development extends readily to the case where $\theta$ is a vector. Finally, we assume that for any given family, there is a positive probability of at least two failures. This condition is necessary to ensure identifiability of the model; see Nielsen et al. (1992, Sec. 4, end).

Let $\tau$ be the end of the observation period. The full likelihood of the data then can

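be written as
\[ L = \prod_{i=1}^{n} \int \prod_{j=1}^{m_i} \{\lambda_{ij}(T_{ij})\}^{\delta_{ij}} S_{ij}(T_{ij}) f(w)\,dw = \prod_{i=1}^{n} \prod_{j=1}^{m_i} \{\lambda_0(T_{ij}) \exp(\beta^T Z_{ij})\}^{\delta_{ij}} \prod_{i=1}^{n} \int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw, \tag{2} \]
where $N_{ij}(t) = \delta_{ij} I(T_{ij} \le t)$, $N_{i.}(t) = \sum_{j=1}^{m_i} N_{ij}(t)$, $H_{ij}(t) = \Lambda_0(T_{ij} \wedge t) \exp(\beta^T Z_{ij})$, $a \wedge b = \min\{a, b\}$, $\Lambda_0(\cdot)$ is the baseline cumulative hazard function, $S_{ij}(\cdot)$ is the conditional survival function of subject $ij$, and $H_{i.}(t) = \sum_{j=1}^{m_i} H_{ij}(t)$. The log-likelihood is given by
\[ l = \sum_{i=1}^{n} \sum_{j=1}^{m_i} \delta_{ij} \log\{\lambda_0(T_{ij}) \exp(\beta^T Z_{ij})\} + \sum_{i=1}^{n} \log\left\{ \int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw \right\}. \]
The normalized scores (log-likelihood derivatives) for $(\beta_1, \ldots, \beta_p)$ are given by
\[ U_r = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \delta_{ij} Z_{ijr} - \frac{1}{n} \sum_{i=1}^{n} \left[ \sum_{j=1}^{m_i} H_{ij}(T_{ij}) Z_{ijr} \right] \frac{\int w^{N_{i.}(\tau)+1} \exp\{-w H_{i.}(\tau)\} f(w)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw} \tag{3} \]
for $r = 1, \ldots, p$. The normalized score for $\theta$ is
\[ U_{p+1} = \frac{1}{n} \sum_{i=1}^{n} \frac{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f'(w)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw}, \]
where $f'(w) = (d/d\theta) f(w)$. Let $\gamma = (\beta^T, \theta)^T$ and $U(\gamma, \Lambda_0) = (U_1, \ldots, U_p, U_{p+1})^T$. To obtain estimators $\hat\beta$ and $\hat\theta$, we propose to substitute an estimator of $\Lambda_0$, denoted by $\hat\Lambda_0$, into the equations $U(\gamma, \Lambda_0) = 0$.

Let $Y_{ij}(t) = I(T_{ij} \ge t)$ and let $\mathcal{F}_t$ denote the entire observed history up to time $t$, that is,
\[ \mathcal{F}_t = \sigma\{N_{ij}(u), Y_{ij}(u), Z_{ij}, \ i = 1, \ldots, n; \ j = 1, \ldots, m_i; \ 0 \le u \le t\}. \]
Then, as discussed by Gill (1992) and Parner (1998), the stochastic intensity process for $N_{ij}(t)$ with respect to $\mathcal{F}_t$ is given by
\[ \lambda_0(t) \exp(\beta^T Z_{ij}) Y_{ij}(t) \psi_i(\gamma, \Lambda_0, t-), \tag{4} \]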

where $\psi_i(\gamma, \Lambda, t) = \mathrm{E}(W_i \mid \mathcal{F}_t)$. Using a Bayes theorem argument and the joint density (2) with observation time restricted to $[0, t)$, we obtain
\[ \psi_i(\gamma, \Lambda, t) = \phi_{2i}(\gamma, \Lambda, t) / \phi_{1i}(\gamma, \Lambda, t), \]
where
\[ \phi_{ki}(\gamma, \Lambda, t) = \int w^{N_{i.}(t)+(k-1)} \exp\{-w H_{i.}(t)\} f(w)\,dw, \qquad k = 1, \ldots, 4. \]
Given the intensity model (4), in which $\exp(\beta^T Z_{ij}) \psi_i(\gamma, \Lambda_0, t-)$ may be regarded as a time dependent covariate effect, a natural estimator of $\Lambda_0$ is a Breslow (1974) type estimator along the lines of Zucker (2005). For given values of $\beta$ and $\theta$ we estimate $\Lambda_0$ as a step function with jumps at the observed failure times $\tau_k$, $k = 1, \ldots, K$, with
\[ \Delta\hat\Lambda_0(\tau_k) = \frac{d_k}{\sum_{i=1}^{n} \psi_i(\gamma, \hat\Lambda_0, \tau_{k-1}) \sum_{j=1}^{m_i} Y_{ij}(\tau_k) \exp(\beta^T Z_{ij})}, \tag{5} \]
where $d_k$ is the number of failures at time $\tau_k$. Note that given the intensity model (4), the estimator of the $k$th jump depends on $\hat\Lambda_0$ up to and including time $\tau_{k-1}$. By this approach, we avoid complicating the iterative optimization process with a further iterative scheme for estimating the cumulative hazard. This feature, however, does not necessarily translate into a computational advantage relative to the EM method, because $\psi_i(\gamma, \hat\Lambda_0, \tau_{k-1})$ has to be computed at every jump. Bagdonavicius and Nikulin (1999) proposed a similar estimator in a univariate survival context, for a model which they called the generalized proportional hazards model, which includes univariate frailty-type models.

3 Asymptotic Properties

Let $\gamma^\circ = (\beta^{\circ T}, \theta^\circ)^T$ with $\beta^\circ$, $\theta^\circ$ and $\Lambda_0^\circ(t)$ denoting the respective true values of $\beta$, $\theta$ and $\Lambda_0(t)$, and let $\hat\gamma = (\hat\beta^T, \hat\theta)^T$. We assume the technical conditions listed in Section 5.2.
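The recursive estimator (5) of Section 2 can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation. It assumes a gamma frailty with mean 1 and variance $\theta$, for which the ratio $\phi_{2i}/\phi_{1i}$ has the standard closed form $(N_{i.}(t) + \theta^{-1}) / (H_{i.}(t) + \theta^{-1})$; for other frailty densities the ratio would be computed by univariate numerical integration. The function name `breslow_type_cumhaz` and its argument names are hypothetical.

```python
import math
from collections import defaultdict


def breslow_type_cumhaz(times, events, families, covariates, beta, theta):
    """Breslow-type cumulative hazard estimate for fixed beta and theta.

    Assumption (not from the paper): gamma frailty with mean 1 and
    variance theta, so that
        psi_i = (N_i.(t) + 1/theta) / (H_i.(t) + 1/theta).
    Returns (sorted distinct failure times, cumulative hazard at each).
    """
    a = 1.0 / theta
    # exp(beta' Z_ij) for every subject
    risk = [math.exp(sum(b * z for b, z in zip(beta, zrow)))
            for zrow in covariates]
    fail_times = sorted({t for t, d in zip(times, events) if d == 1})

    lam_at_t = [0.0] * len(times)  # Lambda-hat(T_ij ^ tau_{k-1}) per subject
    n_events = defaultdict(int)    # N_i.(tau_{k-1}) per family
    cum, path = 0.0, []

    for tau in fail_times:
        # H_i.(tau_{k-1}) = sum_j Lambda-hat(T_ij ^ tau_{k-1}) exp(beta' Z_ij)
        h = defaultdict(float)
        for fam, lam, r in zip(families, lam_at_t, risk):
            h[fam] += lam * r

        # Denominator of (5): sum_i psi_i sum_j Y_ij(tau_k) exp(beta' Z_ij)
        denom = 0.0
        for t, fam, r in zip(times, families, risk):
            if t >= tau:  # subject is at risk at tau_k
                psi = (n_events[fam] + a) / (h[fam] + a)
                denom += psi * r

        d_k = sum(1 for t, d in zip(times, events) if d == 1 and t == tau)
        cum += d_k / denom
        path.append(cum)

        # Update the per-subject and per-family quantities for the next jump
        for idx, (t, d, fam) in enumerate(zip(times, events, families)):
            if t >= tau:
                lam_at_t[idx] = cum
            if d == 1 and t == tau:
                n_events[fam] += 1

    return fail_times, path
```

Each jump uses only quantities available at the previous failure time, so a single pass over the ordered failure times suffices. As $\theta \to 0$ the frailty degenerates to 1, $\psi_i \to 1$, and the sketch reduces to the ordinary Breslow estimator.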

In Section 5.4, we establish the following results.

A. $\hat\Lambda_0(t, \gamma)$ converges almost surely to $\Lambda_0(t, \gamma)$ uniformly in $t$ and $\gamma$.
B. $U(\gamma, \hat\Lambda_0(\cdot, \gamma))$ converges almost surely, uniformly in $\gamma$, to a limit $u(\gamma, \Lambda_0(\cdot, \gamma))$.
C. There exists a unique consistent root to $U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = 0$.

In Section 5.5, we show that $n^{1/2}(\hat\gamma - \gamma^\circ)$ is asymptotically normally distributed. We accomplish this by analyzing in turn each of the terms in the following decomposition:
\[ 0 = U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = U(\gamma^\circ, \Lambda_0^\circ) + [U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)] + [U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) - U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ))]. \tag{6} \]
We show further that the covariance matrix of $\hat\gamma$ can be consistently estimated by a sandwich estimator of the following form:
\[ \widehat{\mathrm{Cov}}(\hat\gamma) = D^{-1}(\hat\gamma) \{\hat V(\hat\gamma) + \hat G(\hat\gamma) + \hat C(\hat\gamma)\} D^{-1}(\hat\gamma)^T. \tag{7} \]
The matrix $D$ consists of the derivatives of the $U_r$'s with respect to the parameters $\gamma$. $V$ is the asymptotic covariance matrix of $U(\gamma^\circ, \Lambda_0^\circ)$, $G$ is the asymptotic covariance matrix of $[U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]$, and $C$ is the asymptotic covariance matrix between $U(\gamma^\circ, \Lambda_0^\circ)$ and $[U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]$. The term $G + C$ reflects the added variance resulting from the need to estimate the cumulative hazard function. All these matrices are defined explicitly in Section 5.5.

4 Simulation Study for the Gamma Frailty Case

In Gorfine et al. (2006), we presented a simulation study comparing our method to the EM method under the gamma frailty distribution with expectation 1 and variance $\theta$. Here

we extend the simulation study by considering larger $\theta$ values and family sizes larger than two. Gorfine et al. describe in detail the steps of the EM-based algorithm, as given in Nielsen et al. (1992), in parallel with the corresponding steps in our procedure. We refer the reader to Gorfine et al. for details.

The setup for the simulation study, which is patterned after Hsu et al. (2004), is as follows. We worked with a sample size of 300 families, with a common family size of 2 or 5. We generated for each family a common frailty value $W$ from the gamma distribution with scale and shape parameters both equal to $\theta^{-1}$, and for each individual a single covariate $Z$ from the standard normal distribution. Conditional on $W$, the survivor function was taken to be $S(t \mid Z, W) = \exp\{-W \exp(\beta Z)(0.01t)^{4.6}\}$. We took the censoring distribution to be $N(60, 15^2)$. The $\beta$ values examined were $\beta = \ln(2)$ and $\beta = \ln(3)$, leading to censoring levels of approximately 85% and 80%, respectively. The censoring distribution was chosen in order to generate an appropriate mean age at onset and age-of-onset distribution, similar to what is often observed for late onset diseases. With censoring distributed according to $N(130, 15^2)$ the respective censoring levels are approximately 35% and 30%. The $\theta$ values examined were $\theta = 2$ and $\theta = 4$.

Tables 1-2 present the simulation results for the two estimation techniques, based on 1, replications. For our method, we compare the mean estimated standard error based on our theoretical formula with the empirical standard error, and provide the empirical coverage rate of the 95% Wald-type confidence interval. For the EM-based method, we report only the empirical standard error. In addition, the empirical correlation between the EM-based estimators and our estimators is presented. The additional simulation

results confirm our earlier findings. Both estimation techniques perform very well in terms of bias. Also, for our method, fairly good agreement was observed between the estimated and the empirical standard error, although some differences were seen in some cases. The high values of the correlations imply similarity between the two estimation techniques not only on an average basis, but actually on a data set by data set basis.

5 Technical Conditions and Proofs

5.1 Introductory Remarks

This section presents the technical conditions we assume for the asymptotic results and the proofs of these results. Some details have been omitted for the sake of brevity. These details are provided in an expanded version of this paper which is available at the Front for the Mathematics ArXiv under Statistics, publication number: math.st/

The general pattern of the argument follows that of Zucker (2005), but with some significant changes. Our estimator for the cumulative hazard is based on the formula
\[ \hat\Lambda_0(t) = \int_0^t \frac{n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s)}{n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \psi_i(\gamma, \hat\Lambda_0, s-) Y_{ij}(s) \exp(\beta^T Z_{ij})}. \]
The quantity $\psi_i(\gamma, \hat\Lambda_0, s-)$ involves terms of the form $\hat\Lambda_0(s \wedge T_{ij})$, i.e. it involves $\hat\Lambda_0$ values at $T_{ij}$ as well as at $s$. By contrast, the corresponding integrand in Zucker's (2005) estimator involves only $\hat\Lambda(s-)$. (The estimator of Bagdonavicius and Nikulin (1999) is similar to that of Zucker (2005) in this respect.) This difference in the structure of the estimators entails the need for substantial extensions to the argument. In particular, Zucker's consistency proof for the cumulative hazard estimate makes use of a result of the form
\[ \sup_{\beta, t, c} |A^*(\beta, t, c) - a^*(\beta, t, c)| \to 0 \quad \text{a.s.}, \]
where $A^*(\beta, t, c)$ is a certain empirical process, $a^*(\beta, t, c)$ is its expectation, and the supremum is over $\beta \in \mathcal{B}$, $t \in [0, \tau]$, and $c \in [0, \Lambda_{\max}]$. In our consistency proof, we need the more complex result given in (20)

below, whose proof requires a sophisticated argument. In the asymptotic normality proof, a number of extra steps are required, relative to Zucker's proof, to deal with the middle term in the decomposition (6). In particular, we need to introduce the decomposition of $\hat\Lambda_0(t, \gamma^\circ) - \Lambda_0^\circ(t)$ given in (25) below, and the interchange of integrals that is carried out right after introducing this decomposition. Furthermore, unlike in Zucker (2005), the first two terms in the decomposition (6) are correlated, so that extra development is needed to deal with the correlation (Step III of the asymptotic normality proof). The structure of the derivative matrix of the score function vector is more complex than in Zucker (2005). Finally, in contrast with Zucker (2005), we use mainly the classical CLT for sums of iid's rather than the martingale CLT.

We take note here that since $\beta$ and $Z_{ij}$ are bounded, there exists a constant $\nu > 0$ such that
\[ \nu^{-1} \le \exp(\beta^T Z_{ij}) \le \nu. \tag{8} \]
This fact is used repeatedly in our proofs.

We also introduce here some basic definitions. Recall that
\[ \psi_i(\gamma, \Lambda, t) = \frac{\int w^{N_i(t)+1} e^{-H_i(t) w} f(w)\,dw}{\int w^{N_i(t)} e^{-H_i(t) w} f(w)\,dw}, \]
with $H_i(t) = H_i(t, \gamma, \Lambda) = \sum_{j=1}^{m_i} \Lambda(T_{ij} \wedge t) \exp(\beta^T Z_{ij})$ (here we define $H_i$ so as to allow dependence on a general $\gamma$ and $\Lambda$, which will often not be explicitly indicated in the notation). We define (for $0 \le r \le m$ and $h \ge 0$)
\[ \psi^*(r, h) = \frac{\int w^{r+1} e^{-hw} f(w)\,dw}{\int w^{r} e^{-hw} f(w)\,dw}. \tag{9} \]
We further define $\psi^*_{\min}(h) = \min_{0 \le r \le m} \psi^*(r, h)$ and $\psi^*_{\max}(h) = \max_{0 \le r \le m} \psi^*(r, h)$. In (9), the numerator and denominator are bounded above since $W$ is assumed to have finite $(m + 2)$-th moment. Also, since $W$ is nondegenerate, the numerator and denominator are

strictly positive. Thus $\psi^*_{\max}(h)$ is finite and $\psi^*_{\min}(h)$ is strictly positive. The following result can be proved by elementary calculus (details in the expanded version).

Lemma 1: The function $\psi^*(r, h)$ is decreasing in $h$. Hence for all $\gamma \in \mathcal{G}$ and all $t$,
\[ \psi_i(\gamma, \Lambda, t) \le \psi^*_{\max}(0), \tag{10} \]
\[ \psi_i(\gamma, \Lambda, t) \ge \psi^*_{\min}(m \nu \Lambda(t)). \tag{11} \]

5.2 Technical Conditions

In deriving the asymptotic properties of $\hat\gamma$ we make the following assumptions:

1. The random vectors $(m_i, T_{i1}, \ldots, T_{i m_i}, C_{i1}, \ldots, C_{i m_i}, Z_{i1}, \ldots, Z_{i m_i}, W_i)$, $i = 1, \ldots, n$, are independent and identically distributed.
2. There is a finite maximum follow-up time $\tau > 0$, with $\mathrm{E}[\sum_{j=1}^{m_i} Y_{ij}(\tau)] = y^* > 0$ for all $i$.
3. (a) Conditional on $Z_{ij}$ and $W_i$, the censoring is independent and noninformative of $W_i$ and $(\beta, \Lambda_0)$. (b) $W_i$ is independent of $Z_{ij}$ and of $m_i$.
4. The frailty random variable $W_i$ has finite moments up to order $(m + 2)$.
5. $Z_{ij}$ is bounded.
6. The parameter $\gamma$ lies in a compact subset $\mathcal{G}$ of $\mathbb{R}^{p+1}$ containing an open neighborhood of $\gamma^\circ$.
7. There exist $B > 0$ and $h^* > 0$ (independent of $\theta$) such that, for all $h \ge h^*$, we have $\psi^*_{\min}(h) \ge B h^{-1}$.

8. The baseline hazard function $\lambda_0^\circ(t)$ is bounded over $[0, \tau]$ by some fixed (but not necessarily known) constant $\lambda_{\max}$.
9. The function $f'(w; \theta) = (d/d\theta) f(w; \theta)$ is absolutely integrable.
10. The censoring distribution has at most finitely many jumps on $[0, \tau]$.
11. For any given family, there is a positive probability of at least two failures.
12. The matrix $[(\partial/\partial\gamma) U(\gamma, \hat\Lambda_0(\cdot, \gamma))]_{\gamma = \gamma^\circ}$ is invertible with probability going to 1 as $n \to \infty$.

Assumption 7 is satisfied if either one of the following two conditions holds.

a. There exist $b(\theta) > 0$ and $C(\theta) > 0$ such that
\[ \sup_\theta \left| \frac{f(w; \theta)}{C(\theta) w^{b(\theta)-1}} - 1 \right| \to 0 \quad \text{as } w \to 0, \]
with $b(\theta)$ bounded from below over $\theta$.
b. We have $\lim_{w \to 0} \sup_\theta f(w; \theta) = 0$, and there exists $a > 0$ independent of $\theta$ such that $f(w; \theta)$ is increasing in $w$ over $w \in [0, a]$.

These conditions cover a wide range of frailty distributions, including popular choices such as the gamma, inverse Gaussian, and lognormal.

5.3 Preliminary Lemmas

Lemma 2: Define $\bar\Lambda = 1.3 e^{m\sigma} h^* / (m\nu)$, with $\sigma = 1.1 m \nu^2 / (B y^*)$, with $h^*$ and $B$ as above. Then, with probability one, there exists $n^*$ such that, for all $t \in [0, \tau]$ and $\gamma \in \mathcal{G}$,
\[ \hat\Lambda_0(t, \gamma) \le \bar\Lambda \quad \text{for } n \ge n^*. \tag{12} \]
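Lemma 1 and Assumption 7 can be checked numerically in the gamma frailty case. The sketch below is an illustration under stated assumptions, not part of the paper's proofs: the frailty is gamma with mean 1 and variance $\theta$, for which (9) has the standard closed form $\psi^*(r, h) = (r + \theta^{-1})/(h + \theta^{-1})$; the integrals in (9) are approximated by a composite Simpson rule on a truncated range; and the constants $B = 1/(\theta_{\max} + 1)$ and $h^* = 1$ are one convenient choice when $\theta \le \theta_{\max}$ on the compact parameter set. All function names are hypothetical.

```python
import math


def phi_integral(power, h, theta, upper=60.0, steps=20000):
    """Simpson approximation of  int_0^upper w^power e^{-hw} f(w; theta) dw
    for the gamma density with shape = rate = 1/theta (mean 1, variance theta).

    Assumes power + 1/theta - 1 > 0, so the integrand vanishes at w = 0,
    and that the tail beyond `upper` is negligible.
    """
    a = 1.0 / theta
    log_norm = a * math.log(a) - math.lgamma(a)

    def g(w):
        if w == 0.0:
            return 0.0
        return math.exp((power + a - 1.0) * math.log(w) - (h + a) * w + log_norm)

    step = upper / steps  # steps must be even for Simpson's rule
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * step)
    return total * step / 3.0


def psi_star_numeric(r, h, theta):
    """psi*(r, h) from (9), by numerical integration."""
    return phi_integral(r + 1, h, theta) / phi_integral(r, h, theta)


def psi_star_gamma(r, h, theta):
    """Closed form of psi*(r, h) for the gamma frailty."""
    a = 1.0 / theta
    return (r + a) / (h + a)


def assumption7_holds(thetas, hs, theta_max, h_star=1.0):
    """Check psi*_min(h) >= B / h on a grid, with B = 1 / (theta_max + 1).

    For the gamma frailty psi* is increasing in r, so psi*_min(h) = psi*(0, h).
    """
    b_const = 1.0 / (theta_max + 1.0)
    return all(
        psi_star_gamma(0, h, th) >= b_const / h
        for th in thetas if th <= theta_max
        for h in hs if h >= h_star
    )
```

For example, `psi_star_numeric(0, 1.0, 0.5)` and `psi_star_gamma(0, 1.0, 0.5)` should agree closely, and `psi_star_gamma(r, h, theta)` decreases as `h` grows, in line with Lemma 1.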

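Lemma 2 shows that $\hat\Lambda_0(t, \gamma)$ is naturally bounded, with no need to impose an upper bound artificially.

Proof: To simplify the writing below, we will suppress the argument $\gamma$ in $\hat\Lambda_0(t, \gamma)$. Recall
\[ \Delta\hat\Lambda_0(\tau_k) = \left[ \sum_{i=1}^{n} \psi_i(\gamma, \hat\Lambda_0, \tau_{k-1}) \sum_{j=1}^{m_i} Y_{ij}(\tau_k) \exp(\beta^T Z_{ij}) \right]^{-1}, \]
where we now take $d_k = 1$ since the survival time distribution is assumed continuous. Using Lemma 1 and (8), we have
\[ \Delta\hat\Lambda_0(\tau_k) \le n^{-1} \nu \, \psi^*_{\min}(m \nu \hat\Lambda_0(\tau_{k-1}))^{-1} \left[ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} Y_{ij}(\tau) \right]^{-1}. \]
By the strong law of large numbers, there exists with probability one some $n^*$ such that
\[ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} Y_{ij}(\tau) \ge 0.999\, y^* \quad \text{for } n \ge n^*. \tag{13} \]
We thus have, for $n \ge n^*$,
\[ \Delta\hat\Lambda_0(\tau_k) \le n^{-1} \left( \frac{1.1 \nu}{y^*} \right) \psi^*_{\min}(m \nu \hat\Lambda_0(\tau_{k-1}))^{-1}. \tag{14} \]
Given this result, the desired conclusion can be obtained via simple technical manipulations, detailed in the expanded version of this paper.

Lemma 3: We have $\sup_{s \in [0, \tau]} |\hat\Lambda_0(s, \gamma) - \hat\Lambda_0(s-, \gamma)| \to 0$ as $n \to \infty$, as an immediate consequence of Lemma 2 and (14).

5.4 Consistency

We now show the almost sure consistency of $\hat\beta$ and $\hat\Lambda_0$. The argument is built on Claims A-C of Section 3, which we prove below. Our argument follows Zucker (2005, Appendix A.3).

Claim A: $\hat\Lambda_0(t, \gamma)$ converges a.s. to some function $\Lambda_0(t, \gamma)$ uniformly in $t$ and $\gamma$.

Proof: Whenever a functional norm is written below, the relevant uniform norm is intended. We define $\Lambda_{\max} = \max(\bar\Lambda, \lambda_{\max} \tau)$, $h_{\max} = m \nu \Lambda_{\max}$, and $\tilde\psi(r, h) = \psi^*(r, h \wedge h_{\max})$. It is easy to see that $\tilde\psi(r, h)$ is Lipschitz continuous in $h$, uniformly in $r$. Recall that $\psi_i(\gamma, \Lambda, t) = \psi^*(N_i(t), H_i(t, \gamma, \Lambda))$. Lemma 2 implies that $H_i(t, \gamma, \hat\Lambda_0(\cdot, \gamma)) \le h_{\max}$ for all $t \in [0, \tau]$ and $\gamma \in \mathcal{G}$. Hence $\psi_i(\gamma, \hat\Lambda_0(\cdot, \gamma), t) = \tilde\psi(N_i(t), H_i(t, \gamma, \hat\Lambda_0(\cdot, \gamma)))$.

Now define, for a general function $\Lambda$,
\[ \Xi_n(t, \gamma, \Lambda) = \int_0^t \frac{n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s)}{n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij})} \]
and
\[ \Xi(t, \gamma, \Lambda) = \int_0^t \frac{\mathrm{E}[\sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma^\circ, \Lambda_0^\circ)) Y_{ij}(s) \exp(\beta^{\circ T} Z_{ij})]}{\mathrm{E}[\sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij})]} \lambda_0^\circ(s)\,ds. \]
By definition, $\hat\Lambda_0(t, \gamma)$ satisfies the equation
\[ \hat\Lambda_0(t, \gamma) = \Xi_n(t, \gamma, \hat\Lambda_0(\cdot, \gamma)). \tag{15} \]
Next, define
\[ q_\gamma(s, \Lambda) = \frac{\mathrm{E}[\sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma^\circ, \Lambda_0^\circ)) Y_{ij}(s) \exp(\beta^{\circ T} Z_{ij})]}{\mathrm{E}[\sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij})]} \lambda_0^\circ(s). \]
This function is uniformly bounded by $B^* = [\psi^*_{\max}(0) / \psi^*_{\min}(h_{\max})] \lambda_{\max}$. Moreover, by the Lipschitz continuity of $\tilde\psi(r, h)$ with respect to $h$, it satisfies a Lipschitz-like condition of the form
\[ |q_\gamma(s, \Lambda_1) - q_\gamma(s, \Lambda_2)| \le K \sup_{0 \le u \le s} |\Lambda_1(u) - \Lambda_2(u)|. \]
Hence, by mimicking the argument of Hartman (1973, Theorem 1.1), we find that the equation $\Lambda(t) = \Xi(t, \gamma, \Lambda)$ has a unique solution, which we denote by $\Lambda_0(t, \gamma)$. The claim then is that $\hat\Lambda_0(t, \gamma)$ converges almost surely (uniformly in $t$ and $\gamma$) to $\Lambda_0(t, \gamma)$.

Define $\tilde\Lambda^{(n)}(t, \gamma)$ to be a modified version of $\hat\Lambda_0(t, \gamma)$ defined by linear interpolation between the jumps. Lemma 3 implies that, with probability one,
\[ \sup_{t, \gamma} |\tilde\Lambda^{(n)}(t, \gamma) - \hat\Lambda_0(t, \gamma)| \to 0, \tag{16} \]
and thus
\[ \sup_{t, \gamma} |\Xi_n(t, \gamma, \tilde\Lambda^{(n)}(\cdot, \gamma)) - \Xi_n(t, \gamma, \hat\Lambda_0(\cdot, \gamma))| \to 0. \tag{17} \]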

Lemma 2 shows that the family $\mathcal{L} = \{\tilde\Lambda^{(n)}(t, \gamma), \ n \ge n^*\}$ is uniformly bounded. We can show further that $\mathcal{L}$ is equicontinuous, using arguments similar to those of Zucker (2005). The first step is to note that, with $\bar N(t) = n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} N_{ij}(t)$, we have $\bar N(t) \to \mathrm{E}[N_i(t)]$ as $n \to \infty$ uniformly in $t$ with probability one. From this we can obtain the following result: with probability one, for any $\epsilon > 0$ there exists $n^*(\epsilon)$ such that for all $t$ and $u$ with $u < t$,
\[ |\hat\Lambda_0(t, \gamma) - \hat\Lambda_0(u, \gamma)| \le B^*(t - u) + \frac{\epsilon}{2} \quad \text{for all } n \ge n^*(\epsilon). \tag{18} \]
Moreover, $\hat\Lambda_0(t, \gamma)$ is Lipschitz continuous in $\gamma$, uniformly in $\gamma$ and $t$. The equicontinuity follows.

Given that $\mathcal{L}$ is a.s. uniformly bounded and equicontinuous, the Arzela-Ascoli theorem implies that it is (almost surely) a relatively compact set in $C([0, \tau] \times \mathcal{G})$.

Next, define
\[ A(\gamma, \Lambda, s) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij}), \]
\[ a(\gamma, \Lambda, s) = \mathrm{E}\left[ \sum_{j=1}^{m_i} \tilde\psi(N_i(s-), H_i(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij}) \right]. \]
For any fixed continuous $\Lambda$, the functional strong law of large numbers of Andersen and Gill (1982, Appendix III) implies that
\[ \sup_{s, \gamma} |A(\gamma, \Lambda, s) - a(\gamma, \Lambda, s)| \to 0 \quad \text{a.s.} \tag{19} \]
Here we need the following more complex result:
\[ \sup_{s, \gamma} |A(\gamma, \tilde\Lambda^{(n)}, s) - a(\gamma, \tilde\Lambda^{(n)}, s)| \to 0 \quad \text{a.s.} \tag{20} \]
The proof of (20) is involved; we give the details in Section 4.5 below. In outline form, the proof involves two steps: (1) showing that, for any given $\epsilon > 0$, we can define an appropriate finite class $\mathcal{L}_\epsilon$ of functions $\Lambda$ such that $\tilde\Lambda^{(n)}$ can be suitably approximated by some member of the class; (2) applying the result (19), which will hold uniformly over the finite class.

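Given (20) and the a.s. uniform convergence of $\bar N(t)$ to $\mathrm{E}[N_i(t)]$, we can infer that
\[ \sup_{t, \gamma} |\Xi_n(t, \gamma, \tilde\Lambda^{(n)}(\cdot, \gamma)) - \Xi(t, \gamma, \tilde\Lambda^{(n)}(\cdot, \gamma))| \to 0 \quad \text{a.s.} \tag{21} \]
This result is obtained by adapting the argument of Aalen (1976, Lemma 6.1). From (15), (16), (17), and (21) it follows that any limit point of $\{\tilde\Lambda^{(n)}(t, \gamma)\}$ must satisfy the equation $\Lambda = \Xi(t, \gamma, \Lambda)$. Since $\Lambda_0(t, \gamma)$ is the unique solution of this equation, it is the unique limit point of $\{\tilde\Lambda^{(n)}(t, \gamma)\}$. Thus $\{\tilde\Lambda^{(n)}(t, \gamma)\}$ is a sequence in a compact set with unique limit point $\Lambda_0(t, \gamma)$. Hence $\tilde\Lambda^{(n)}(t, \gamma)$ converges a.s. uniformly in $t$ and $\gamma$ to $\Lambda_0(t, \gamma)$. In view of (16), the same holds of $\hat\Lambda_0(t, \gamma)$, which is the desired result.

Note that $\Lambda_0(\cdot, \gamma^\circ) = \Lambda_0^\circ(\cdot)$. Indeed, if we plug $\gamma^\circ$ and $\Lambda_0^\circ$ into the expression for $\Xi(t, \gamma, \Lambda)$, the expectation terms cancel, and so we are left with the integral of $\lambda_0^\circ(s)$. Thus, $\Lambda_0^\circ$ is the solution to the equation $\Lambda = \Xi(t, \gamma^\circ, \Lambda)$.

Claim B: With $u(\gamma, \Lambda_0(\cdot, \gamma)) = \mathrm{E}[U(\gamma, \Lambda_0(\cdot, \gamma))]$, we have $U(\gamma, \hat\Lambda_0(\cdot, \gamma)) \to u(\gamma, \Lambda_0(\cdot, \gamma))$ uniformly in $\gamma \in \mathcal{G}$ with probability one.

Proof: As in Zucker (2005).

Claim C: There exists a unique consistent root to $U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = 0$.

Proof: We apply Foutz's (1977) consistency theorem for maximum likelihood type estimators. The following conditions must be established:

F1. $\partial U(\gamma, \hat\Lambda_0(\cdot, \gamma)) / \partial\gamma$ exists and is continuous in an open neighborhood about $\gamma^\circ$.
F2. The convergence of $\partial U(\gamma, \hat\Lambda_0(\cdot, \gamma)) / \partial\gamma$ to its limit is uniform in an open neighborhood of $\gamma^\circ$.
F3. $U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) \to 0$ as $n \to \infty$.
F4. The matrix $[\partial U(\gamma, \hat\Lambda_0(\cdot, \gamma)) / \partial\gamma]_{\gamma = \gamma^\circ}$ is invertible with probability going to 1 as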

$n \to \infty$. (In Foutz's paper, the matrix in question is symmetric, and so he stated the condition in terms of positive definiteness. But his proof, which is based on the inverse function theorem, shows that the basic condition needed is invertibility.)

It is easily seen that Condition F1 holds. Given Assumptions 2, 4, and 5, Condition F2 follows from the previously cited functional law of large numbers. As for Condition F3, Claim B says that $U(\gamma, \hat\Lambda_0(\cdot, \gamma))$ converges a.s. uniformly to $u(\gamma, \Lambda_0(\cdot, \gamma)) = \mathrm{E}[U(\gamma, \Lambda_0(\cdot, \gamma))]$. We noted already that $\Lambda_0(\cdot, \gamma^\circ) = \Lambda_0^\circ(\cdot)$. Thus we need only show that $\mathrm{E}[U(\gamma^\circ, \Lambda_0^\circ)] = 0$. Since $U$ is a score function derived from a classical iid likelihood, this result follows from classical likelihood theory. Condition F4 has been assumed in Assumption 12. With Conditions F1-F4 established, the result follows.

5.5 Asymptotic Normality

To show that $\hat\gamma$ is asymptotically normally distributed, we write
\[ 0 = U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = U(\gamma^\circ, \Lambda_0^\circ) + [U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)] + [U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) - U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ))]. \]
In the following we consider each of the terms on the right-hand side of the equation.

Step I

We can write $U(\gamma^\circ, \Lambda_0^\circ) = n^{-1} \sum_{i=1}^{n} \xi_i$, where $\xi_i$ is a $(p + 1)$-vector with $r$-th element, $r = 1, \ldots, p$, given by
\[ \xi_{ir} = \sum_{j=1}^{m_i} \delta_{ij} Z_{ijr} - \left[ \sum_{j=1}^{m_i} H_{ij}(\tau) Z_{ijr} \right] \frac{\int w^{N_{i.}(\tau)+1} \exp\{-w H_{i.}(\tau)\} f(w; \theta)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w; \theta)\,dw} \]
and $(p + 1)$-th element given by
\[ \xi_{i(p+1)} = \frac{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f'(w; \theta)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w; \theta)\,dw}. \]

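Thus $U(\gamma^\circ, \Lambda_0^\circ)$ is the mean of the iid mean-zero random vectors $\xi_i$. It hence follows from the central limit theorem that $n^{1/2} U(\gamma^\circ, \Lambda_0^\circ)$ is asymptotically mean-zero multivariate normal. To estimate the covariance matrix, let $\hat\xi_i$ be the counterpart of $\xi_i$ with estimates of $\gamma$ and $\Lambda_0$ substituted for the true values. Then an empirical estimator of the covariance matrix is given by
\[ \hat V(\hat\gamma) = n^{-1} \sum_{i=1}^{n} \hat\xi_i \hat\xi_i^T. \]
This is a consistent estimator of the covariance matrix since $\hat\Lambda_0(t, \gamma)$ converges to $\Lambda_0(t, \gamma)$ a.s. uniformly in $t$ and $\gamma$ (Claim A), and $\hat\gamma$ is a consistent estimator of $\gamma^\circ$ (Claim C).

Step II

Let $\hat U_r = U_r(\gamma^\circ, \hat\Lambda_0)$, $r = 1, \ldots, p$, and $\hat U_{p+1} = U_{p+1}(\gamma^\circ, \hat\Lambda_0)$ (in this segment of the proof, when we write $(\gamma^\circ, \hat\Lambda_0)$ the intent is to signify $(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ))$). First order Taylor expansion of $\hat U_r$ about $\Lambda_0^\circ$, $r = 1, \ldots, p + 1$, gives
\[ n^{1/2} \{U_r(\gamma^\circ, \hat\Lambda_0) - U_r(\gamma^\circ, \Lambda_0^\circ)\} = n^{-1/2} \sum_{i=1}^{n} \sum_{j=1}^{m_i} Q_{ijr}(\gamma^\circ, \Lambda_0^\circ, T_{ij}) \{\hat\Lambda_0(T_{ij}, \gamma^\circ) - \Lambda_0^\circ(T_{ij})\} + o_p(1), \tag{22} \]
where, writing $\phi_{ki}$ for $\phi_{ki}(\gamma^\circ, \Lambda_0^\circ, \tau)$,
\[ Q_{ijr}(\gamma^\circ, \Lambda_0^\circ, T_{ij}) = -\frac{\phi_{2i}}{\phi_{1i}} R^*_{ij} Z_{ijr} + \frac{\phi_{3i}}{\phi_{1i}} R^*_{ij} \sum_{l=1}^{m_i} H_{il}(T_{il}) Z_{ilr} - \frac{\phi_{2i}^2}{\phi_{1i}^2} R^*_{ij} \sum_{l=1}^{m_i} H_{il}(T_{il}) Z_{ilr} \]
for $r = 1, \ldots, p$, and
\[ Q_{ij(p+1)}(\gamma^\circ, \Lambda_0^\circ, T_{ij}) = R^*_{ij} \left\{ \frac{\phi_{2i}\, \phi^{(\theta)}_{1i}}{\phi_{1i}^2} - \frac{\phi^{(\theta)}_{2i}}{\phi_{1i}} \right\}, \]
with $R^*_{ij} = \exp(\beta^{\circ T} Z_{ij})$ and
\[ \phi^{(\theta)}_{ki}(\gamma, \Lambda, t) = \int w^{N_{i.}(t)+(k-1)} \exp\{-w H_{i.}(t)\} f'(w)\,dw, \qquad k = 1, 2. \]
The validity of the approximation (22) can be seen by an argument similar to that used in connection with (24) below.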

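Given the intensity process (4), the process
\[ M_{ij}(t) = N_{ij}(t) - \int_0^t \lambda_0^\circ(u) \exp(\beta^{\circ T} Z_{ij}) Y_{ij}(u) \psi_i(\gamma^\circ, \Lambda_0^\circ, u-)\,du \]
is a mean zero martingale with respect to the filtration $\mathcal{F}_t$. Also, by Lemma 3, we have that $\sup_{s \in [0, \tau]} |\hat\Lambda_0(s, \gamma^\circ) - \hat\Lambda_0(s-, \gamma^\circ)|$ converges to zero. Thus, replacing $s-$ by $s$ we obtain the following approximation, uniformly over $t \in [0, \tau]$:
\[ \hat\Lambda_0(t, \gamma^\circ) - \Lambda_0^\circ(t) \approx \frac{1}{n} \int_0^t \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dM_{ij}(s) + \frac{1}{n} \int_0^t \left[ \{\mathcal{Y}(s, \hat\Lambda_0)\}^{-1} - \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} \right] \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s), \tag{23} \]
where
\[ \mathcal{Y}(s, \Lambda) = \frac{1}{n} \sum_{i=1}^{n} \psi_i(\gamma^\circ, \Lambda, s) \sum_{j=1}^{m_i} Y_{ij}(s) \exp(\beta^{\circ T} Z_{ij}). \]
Now let $\mathcal{W}(s, r) = \{\mathcal{Y}(s, \Lambda_0^\circ + r\Delta)\}^{-1}$ with $\Delta = \hat\Lambda_0 - \Lambda_0^\circ$. Define $\dot{\mathcal{W}}$ and $\ddot{\mathcal{W}}$ as the first and second derivative of $\mathcal{W}$ with respect to $r$, respectively. Then, computing the necessary derivatives and carrying out a first order Taylor expansion of $\mathcal{W}(s, r)$ around $r = 0$ evaluated at $r = 1$ with Lagrange remainder (Abramowitz and Stegun, 1972, p. 880), we get
\[ \{\mathcal{Y}(s, \hat\Lambda_0)\}^{-1} - \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} = \dot{\mathcal{W}}(s, 0) + \tfrac{1}{2} \ddot{\mathcal{W}}(s, \tilde r(s)) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \left[ \frac{R_{i.}(s) \eta_{1i}(0, s)}{\{\mathcal{Y}(s, \Lambda_0^\circ)\}^2} + \frac{1}{2} h_i(\tilde r(s), s) \right] \exp(\beta^{\circ T} Z_{ij}) \{\hat\Lambda_0(T_{ij} \wedge s) - \Lambda_0^\circ(T_{ij} \wedge s)\}, \tag{24} \]
where $R_{ij}(u) = \exp(\beta^{\circ T} Z_{ij}) Y_{ij}(u)$, $R_{i.}(u) = \sum_{j=1}^{m_i} R_{ij}(u)$, $\tilde r(s) \in [0, 1]$,
\[ \eta_{1i}(r, s) = \frac{\phi_{3i}(\gamma^\circ, \Lambda_0^\circ + r\Delta, s)}{\phi_{1i}(\gamma^\circ, \Lambda_0^\circ + r\Delta, s)} - \left\{ \frac{\phi_{2i}(\gamma^\circ, \Lambda_0^\circ + r\Delta, s)}{\phi_{1i}(\gamma^\circ, \Lambda_0^\circ + r\Delta, s)} \right\}^2, \]
and $h_i(r, s)$ is as defined in Section 4.6 below, and shown there to be $o(1)$ uniformly in $r$ and $s$.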

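Let $\eta_{1i}(s) = \eta_{1i}(0, s)$. Plugging (24) into (23) we get
\[ \begin{aligned} \hat\Lambda_0(t, \gamma^\circ) - \Lambda_0^\circ(t) \approx{} & n^{-1} \int_0^t \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dM_{ij}(s) \\ & + n^{-2} \int_0^t \sum_{k=1}^{n} \sum_{l=1}^{m_k} \frac{I(T_{kl} > s) R_{k.}(s) \eta_{1k}(s)}{\{\mathcal{Y}(s, \Lambda_0^\circ)\}^2} \exp(\beta^{\circ T} Z_{kl}) \{\hat\Lambda_0(s) - \Lambda_0^\circ(s)\} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s) \\ & + n^{-2} \int_0^t \sum_{k=1}^{n} \sum_{l=1}^{m_k} \frac{I(T_{kl} \le s) R_{k.}(s) \eta_{1k}(s)}{\{\mathcal{Y}(s, \Lambda_0^\circ)\}^2} \exp(\beta^{\circ T} Z_{kl}) \{\hat\Lambda_0(T_{kl}) - \Lambda_0^\circ(T_{kl})\} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s) \\ & + n^{-2} \int_0^t \sum_{k=1}^{n} \sum_{l=1}^{m_k} \frac{1}{2} h_k(\tilde r(s), s) \exp(\beta^{\circ T} Z_{kl}) \{\hat\Lambda_0(T_{kl} \wedge s) - \Lambda_0^\circ(T_{kl} \wedge s)\} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s). \end{aligned} \tag{25} \]
The third term of the above equation can be written, by interchanging the order of integration, as
\[ n^{-2} \sum_{k=1}^{n} \sum_{l=1}^{m_k} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \int_0^t \left[ \frac{R_{k.}(s) \eta_{1k}(s)}{\{\mathcal{Y}(s, \Lambda_0^\circ)\}^2} \exp(\beta^{\circ T} Z_{kl}) \int_0^s \{\hat\Lambda_0(u) - \Lambda_0^\circ(u)\}\,d\tilde N_{kl}(u) \right] dN_{ij}(s) = -\int_0^t \{\hat\Lambda_0(s) - \Lambda_0^\circ(s)\} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \Omega_{ij}(s, t)\,d\tilde N_{ij}(s), \]
where $\tilde N_{ij}(t) = I(T_{ij} \le t)$ and
\[ \Omega_{ij}(s, t) = -n^{-2} \int_s^t \{\mathcal{Y}(u, \Lambda_0^\circ)\}^{-2} R_{i.}(u) \eta_{1i}(u) \exp(\beta^{\circ T} Z_{ij}) \sum_{k=1}^{n} \sum_{l=1}^{m_k} dN_{kl}(u). \]
Hence we get
\[ \hat\Lambda_0(t, \gamma^\circ) - \Lambda_0^\circ(t) = n^{-1} \int_0^t \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dM_{ij}(s) - \int_0^t \{\hat\Lambda_0(s, \gamma^\circ) - \Lambda_0^\circ(s)\} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \{\delta_{ij} \Upsilon(s) + \Omega_{ij}(s, t) + o(n^{-1})\}\,d\tilde N_{ij}(s), \]
where
\[ \Upsilon(s) = -n^{-2} \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-2} \sum_{k=1}^{n} \sum_{l=1}^{m_k} I(T_{kl} > s) R_{k.}(s) \eta_{1k}(s) \exp(\beta^{\circ T} Z_{kl}). \]
The $o(n^{-1})$ is uniform in $t$ (see Section 4.6) and will be dominated by $\Omega$ and $\Upsilon$, which are of order $n^{-1}$. Hence the $o(n^{-1})$ term can be ignored.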

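An argument similar to that of Yang and Prentice (1999) and Zucker (2005) now yields the martingale representation
\[
\hat\Lambda_0(t,\gamma^\circ)-\Lambda_0^\circ(t) \approx \frac{1}{n\,\hat p(t)}\int_0^t\frac{\hat p(s-)}{Y(s,\Lambda_0^\circ)}\sum_{i=1}^n\sum_{j=1}^{m_i} dM_{ij}(s), \qquad (26)
\]
where
\[
\hat p(t) = \prod_{s\le t}\left[1+\sum_{i=1}^n\sum_{j=1}^{m_i}\{\delta_{ij}\Upsilon(s)+\Omega_{ij}(s,t)\}\,d\tilde N_{ij}(s)\right].
\]
Based on (22), we can write
\[
U_r(\gamma^\circ,\hat\Lambda_0)-U_r(\gamma^\circ,\Lambda_0^\circ) \approx n^{-1}\sum_{i=1}^n\sum_{j=1}^{m_i}\int_0^\tau Q_{ijr}(\gamma^\circ,\Lambda_0^\circ,s)\{\hat\Lambda_0(s,\gamma^\circ)-\Lambda_0^\circ(s)\}\,d\tilde N_{ij}(s).
\]
Plugging the martingale representation (26) into the above equation and carrying out some more algebra (again involving an interchange of integrals) gives
\[
U_r(\gamma^\circ,\hat\Lambda_0)-U_r(\gamma^\circ,\Lambda_0^\circ) \approx n^{-1}\int_0^\tau \pi_r(s,\gamma^\circ,\Lambda_0^\circ)\frac{\hat p(s-)}{Y(s,\Lambda_0^\circ)}\sum_{k=1}^n\sum_{l=1}^{m_k} dM_{kl}(s), \qquad (27)
\]
where
\[
\pi_r(s,\gamma^\circ,\Lambda_0^\circ) = n^{-1}\int_s^\tau\frac{\sum_{i=1}^n\sum_{j=1}^{m_i}Q_{ijr}(\gamma^\circ,\Lambda_0^\circ,t)\,d\tilde N_{ij}(t)}{\hat p(t)}.
\]
Therefore, $n^{1/2}[U(\gamma^\circ,\hat\Lambda_0(\cdot,\gamma^\circ))-U(\gamma^\circ,\Lambda_0^\circ)]$ is asymptotically mean-zero multivariate normal with covariance matrix that can be consistently estimated by
\[
\hat G_{rl}(\hat\gamma) = n^{-1}\int_0^\tau \pi_r(s,\hat\gamma,\hat\Lambda_0)\,\pi_l(s,\hat\gamma,\hat\Lambda_0)\,\{\hat p(s-)\}^{2}\,\frac{\sum_{i=1}^n\sum_{j=1}^{m_i} dN_{ij}(s)}{\{Y(s,\hat\Lambda_0)\}^{2}}
\]
for $r,l=1,\ldots,p+1$.

Step III

We now examine the sum of $U(\gamma^\circ,\Lambda_0^\circ)$ and $U(\gamma^\circ,\hat\Lambda_0(\cdot,\gamma^\circ))-U(\gamma^\circ,\Lambda_0^\circ)$. From (27), we have
\[
U_r(\gamma^\circ,\hat\Lambda_0(\cdot,\gamma^\circ))-U_r(\gamma^\circ,\Lambda_0^\circ) \approx n^{-1}\int_0^\tau \alpha_r(s)\sum_{k=1}^n\sum_{l=1}^{m_k} dM_{kl}(s) = \frac{1}{n}\sum_{k=1}^n \mu_{kr},
\]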

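where $\alpha_r(s)$ is the limiting value of $\pi_r(s,\gamma^\circ,\Lambda_0^\circ)\hat p(s-)/Y(s,\Lambda_0^\circ)$ and $\mu_{kr}$ is defined as
\[
\mu_{kr} = \int_0^\tau \alpha_r(s)\sum_{l=1}^{m_k} dM_{kl}(s).
\]
Arguments in Yang and Prentice (1999, Appendix A) can be used to show that $\hat p(s-)$ has a limit. Also, clearly $E[\mu_{kr}]=0$. We thus have
\[
U_r(\gamma^\circ,\Lambda_0^\circ)+[U_r(\gamma^\circ,\hat\Lambda_0(\cdot,\gamma^\circ))-U_r(\gamma^\circ,\Lambda_0^\circ)] \approx \frac{1}{n}\sum_{i=1}^n(\xi_{ir}+\mu_{ir}),
\]
which is a mean of $n$ iid random variables. Hence $n^{1/2}\{U_r(\gamma^\circ,\Lambda_0^\circ)+[U_r(\gamma^\circ,\hat\Lambda_0(\cdot,\gamma^\circ))-U_r(\gamma^\circ,\Lambda_0^\circ)]\}$ is asymptotically normally distributed. The covariance matrix may be estimated by $\hat V(\hat\gamma)+\hat G(\hat\gamma)+\hat C(\hat\gamma)$, where
\[
\hat C_{rl}(\hat\gamma) = \frac{1}{n}\sum_{i=1}^n(\hat\xi_{ir}\hat\mu_{il}+\hat\xi_{il}\hat\mu_{ir}), \qquad r,l=1,\ldots,p+1,
\]
with
\[
\hat\mu_{ir} = \int_0^\tau\frac{\pi_r(s,\hat\gamma,\hat\Lambda_0)\,\hat p(s-)}{Y(s,\hat\Lambda_0)}\sum_{j=1}^{m_i} d\hat M_{ij}(s)
\]
and
\[
\hat M_{ij}(t) = N_{ij}(t)-\int_0^t\exp(\hat\beta^{T}Z_{ij})\,Y_{ij}(u)\,\psi_i(\hat\gamma,\hat\Lambda_0,u-)\,d\hat\Lambda_0(u).
\]

Step IV

First-order Taylor expansion of $U(\hat\gamma,\hat\Lambda_0(\cdot,\hat\gamma))$ about $\gamma^\circ=(\beta^{\circ T},\theta^\circ)^T$ gives
\[
U(\hat\gamma,\hat\Lambda_0(\cdot,\hat\gamma)) = U(\gamma^\circ,\hat\Lambda_0(\cdot,\gamma^\circ))+D(\gamma^\circ)(\hat\gamma-\gamma^\circ)^T+o_p(1),
\]
where $D_{ls}(\gamma)=\partial U_l(\gamma,\hat\Lambda_0(\cdot,\gamma))/\partial\gamma_s$ for $l,s=1,\ldots,p+1$, with $\gamma_{p+1}=\theta$.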
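Returning to the covariance estimate of Step III: the sum of three matrices mirrors the expansion of the second moment of $\xi_i+\mu_i$ into the second moment of $\xi_i$, the second moment of $\mu_i$, and a symmetrised cross term. The sketch below checks that identity on arbitrary numbers. It assumes that the first two matrices are computed as plain empirical second moments of cluster-level terms, which is a simplification (the text estimates $G$ through an integral formula), and the function names are hypothetical.

```python
def outer_mean(u, v):
    """Return the q x q matrix n^{-1} sum_i u_i v_i^T for two lists of q-vectors."""
    n, q = len(u), len(u[0])
    return [[sum(u[i][r] * v[i][l] for i in range(n)) / n for l in range(q)]
            for r in range(q)]


def covariance_pieces(xi, mu):
    """Split the empirical second moment of xi_i + mu_i into V, G and C."""
    V = outer_mean(xi, xi)   # second moment of the xi terms
    G = outer_mean(mu, mu)   # second moment of the mu terms
    xm = outer_mean(xi, mu)  # cross moment n^{-1} sum_i xi_i mu_i^T
    q = len(V)
    # C_rl = n^{-1} sum_i (xi_ir mu_il + xi_il mu_ir)
    C = [[xm[r][l] + xm[l][r] for l in range(q)] for r in range(q)]
    return V, G, C
```

Adding the three pieces entry by entry reproduces `outer_mean(s, s)` for `s[i] = xi[i] + mu[i]`, which is why the cross term cannot be dropped unless the two components are uncorrelated.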

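For $l,s=1,\ldots,p$ we have
\[
D_{ls}(\gamma) = -n^{-1}\sum_{i=1}^n\left\{\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}\sum_{j=1}^{m_i}Z_{ijl}\frac{\partial\hat H_{ij}(T_{ij})}{\partial\beta_s}
-\left[\frac{\phi_{3i}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}-\frac{\phi_{2i}^{2}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau)}\right]\frac{\partial\hat H_{i.}(\tau)}{\partial\beta_s}\sum_{j=1}^{m_i}\hat H_{ij}(T_{ij})Z_{ijl}\right\}, \qquad (28)
\]
where
\[
\frac{\partial\hat H_{ij}(\tau_k)}{\partial\beta_s} = \frac{\partial\hat\Lambda_0(T_{ij}\wedge\tau_k)}{\partial\beta_s}\exp(\beta^{T}Z_{ij})+\hat\Lambda_0(T_{ij}\wedge\tau_k)\exp(\beta^{T}Z_{ij})Z_{ijs}
\]
and
\[
\frac{\partial\Delta\hat\Lambda_0(\tau_k)}{\partial\beta_s} = -d_k\left\{\sum_{i=1}^n\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau_{k-1})}R_{i.}(\tau_k)\right\}^{-2}
\sum_{i=1}^n\left[\left\{\frac{\phi_{2i}^{2}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau_{k-1})}-\frac{\phi_{3i}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau_{k-1})}\right\}\frac{\partial\hat H_{i.}(\tau_{k-1})}{\partial\beta_s}R_{i.}(\tau_k)
+\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau_{k-1})}\sum_{j=1}^{m_i}R_{ij}(\tau_k)Z_{ijs}\right].
\]
For $l=1,\ldots,p$ we have
\[
D_{l(p+1)}(\gamma) = -n^{-1}\sum_{i=1}^n\left\{\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}\sum_{j=1}^{m_i}Z_{ijl}\frac{\partial\hat H_{ij}(T_{ij})}{\partial\theta}
+\left[\frac{\phi_{2i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}-\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau)\phi_{1i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau)}
+\left\{\frac{\phi_{2i}^{2}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau)}-\frac{\phi_{3i}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}\right\}\frac{\partial\hat H_{i.}(\tau)}{\partial\theta}\right]\sum_{j=1}^{m_i}\hat H_{ij}(T_{ij})Z_{ijl}\right\} \qquad (29)
\]
and
\[
D_{(p+1)l}(\gamma) = n^{-1}\sum_{i=1}^n\left[\frac{\phi_{1i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)\phi_{2i}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau)}-\frac{\phi_{2i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}\right]\frac{\partial\hat H_{i.}(\tau)}{\partial\beta_l}. \qquad (30)
\]
Finally,
\[
D_{(p+1)(p+1)}(\gamma) = n^{-1}\sum_{i=1}^n\left[\frac{\phi_{1i}^{(\theta,\theta)}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}-\left\{\frac{\phi_{1i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}\right\}^{2}
+\left\{\frac{\phi_{1i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)\phi_{2i}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau)}-\frac{\phi_{2i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau)}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau)}\right\}\frac{\partial\hat H_{i.}(\tau)}{\partial\theta}\right], \qquad (31)
\]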

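where
\[
\phi_{1i}^{(\theta,\theta)}(\gamma,\hat\Lambda_0,\tau) = \int w^{N_{i.}(\tau)}\exp\{-w\hat H_{i.}(\tau)\}\frac{d^{2}f(w)}{d\theta^{2}}\,dw,
\]
\[
\frac{\partial\hat H_{ij}(\tau_k)}{\partial\theta} = \frac{\partial\hat\Lambda_0(T_{ij}\wedge\tau_k)}{\partial\theta}\exp(\beta^{T}Z_{ij}),
\]
and
\[
\frac{\partial\Delta\hat\Lambda_0(\tau_k)}{\partial\theta} = -d_k\left\{\sum_{i=1}^n\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau_{k-1})}R_{i.}(\tau_k)\right\}^{-2}
\sum_{i=1}^n R_{i.}(\tau_k)\left[\frac{\phi_{2i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau_{k-1})}-\frac{\phi_{2i}(\gamma,\hat\Lambda_0,\tau_{k-1})\phi_{1i}^{(\theta)}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau_{k-1})}
+\frac{\partial\hat H_{i.}(\tau_{k-1})}{\partial\theta}\left\{\frac{\phi_{2i}^{2}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}^{2}(\gamma,\hat\Lambda_0,\tau_{k-1})}-\frac{\phi_{3i}(\gamma,\hat\Lambda_0,\tau_{k-1})}{\phi_{1i}(\gamma,\hat\Lambda_0,\tau_{k-1})}\right\}\right].
\]

Step V

Combining the results above, we get that $n^{1/2}(\hat\gamma-\gamma^\circ)$ is asymptotically zero-mean normally distributed with a covariance matrix that can be consistently estimated by
\[
\hat D^{-1}(\hat\gamma)\{\hat V(\hat\gamma)+\hat G(\hat\gamma)+\hat C(\hat\gamma)\}\hat D^{-1}(\hat\gamma)^{T}.
\]

5.6 Proof of (20)

The goal is to prove that
\[
\sup_{s,\gamma}|A(\gamma,\Lambda^{(n)},s)-a(\gamma,\Lambda^{(n)},s)| \to 0 \quad \text{a.s.} \qquad (32)
\]
This involves several steps.

First, it is easy to see that there exists a constant $\kappa$ (independent of $\gamma$ and $s$) such that
\[
\sup_{s,\gamma}|A(\gamma,\Lambda_1,s)-A(\gamma,\Lambda_2,s)| \le \kappa\|\Lambda_1-\Lambda_2\|, \qquad (33)
\]
\[
\sup_{s,\gamma}|a(\gamma,\Lambda_1,s)-a(\gamma,\Lambda_2,s)| \le \kappa\|\Lambda_1-\Lambda_2\|. \qquad (34)
\]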

27 Next, for any fixed continuous Λ, the functional strong law of large numbers of Andersen and Gill (1982, Appendix III) implies that, with probability one, Now, given ɛ >, define the sets {t (ɛ) j sup A(γ, Λ, s) a(γ, Λ, s). (35) s,γ }, {γ (ɛ) }, and {Λ(ɛ)} to be finite partition k grids of [, τ], G, and [, Λ max ], respectively, with distance of no more than ɛ between grid l points. Define L ɛ to be the set of functions of t and γ defined by linear interpolation through vertices of the form (t (ɛ) j, γ (ɛ) k, Λ(ɛ) l ). Obviously L ɛ is a finite set. Hence, in view of (35), there exists a probability-one set of realizations Ω ɛ for which Define sup A(γ, Λ, s) a(γ, Λ, s). (36) s [,τ],γ G,Λ L ɛ Ω = Ω 1/l l=1 and Ω = Ω Ω, with Ω as defined earlier. Clearly Pr(Ω ) = 1. From now on, we restrict attention to Ω. Now let ɛ > be given. Choose l > ɛ 1. In view of (18) and (36), we can find for any ω Ω a suitable positive integer n(ɛ, ω) such that, whenever n n(ɛ, ω), Λ (n) (n) (t, γ) Λ (u, γ) B (t u) + ɛ 2 t, u, (37) where Next, let Λ (ɛ) jk Λ (n) sup A(γ, Λ, s) a(γ, Λ, s) ɛ. (38) s [,τ],γ G,Λ L 1/l denote the function defined by linear interpolation through (t (ɛ) is the element of {Λ(ɛ)} that is closest to l (n) Λ (t (ɛ) j j, γ (ɛ) k, γ (ɛ) ). It is clear that k, Λ (ɛ) jk ), Λ (n) (t (ɛ) j, γ (ɛ) (n) ) Λ k (t (ɛ) j, γ (ɛ) k ) ɛ j, k. 25

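Using (37) and the Lipschitz continuity of $\Lambda^{(n)}(t,\gamma)$ with respect to $\gamma$ (which follows from the corresponding property of $\hat\Lambda_0(t,\gamma)$), we thus obtain
\[
\sup_{t,\gamma}|\Lambda^{(n)}(t,\gamma)-\bar\Lambda^{(n)}(t,\gamma)| \le B^{**}\epsilon
\]
for a suitable fixed constant $B^{**}$ (depending on $B^{*}$ and $C^{*}$), where $\bar\Lambda^{(n)}$ denotes the interpolated function. Combining this with (38) and (34), we obtain
\[
\sup_{s,\gamma}|A(\gamma,\Lambda^{(n)},s)-a(\gamma,\Lambda^{(n)},s)| \le (2\kappa B^{**}+1)\epsilon
\]
for all $n\ge n(\epsilon,\omega)$. Since $\epsilon$ was arbitrary, the desired conclusion (32) follows, and the proof is thus complete.

5.7 Definition and behavior of $h_i(r,s)$

The quantity $h_i(r,s)$ appearing in (24) is given by
\[
h_i(r,s) = \frac{2R_{i.}(s)\eta_{1i}(r,s)}{\{Y(s,\Lambda_0^\circ+r\Delta)\}^{3}}\,\frac{1}{n}\sum_{l=1}^n\sum_{j=1}^{m_l}R_{l.}(s)\eta_{1l}(r,s)\exp(\beta^{\circ T}Z_{lj})\Delta(T_{lj}\wedge s)
-\frac{R_{i.}(s)\eta_{2i}(r,s)}{\{Y(s,\Lambda_0^\circ+r\Delta)\}^{2}}\sum_{j=1}^{m_i}\exp(\beta^{\circ T}Z_{ij})\Delta(T_{ij}\wedge s),
\]
where $\Delta(T_{ij}\wedge s)=\hat\Lambda_0(T_{ij}\wedge s)-\Lambda_0^\circ(T_{ij}\wedge s)$ and
\[
\eta_{2i}(r,s) = 2\left\{\frac{\phi_{2i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)}{\phi_{1i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)}\right\}^{3}
+\frac{\phi_{4i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)}{\phi_{1i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)}
-\frac{3\phi_{2i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)\phi_{3i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)}{\{\phi_{1i}(\gamma^\circ,\Lambda_0^\circ+r\Delta,s)\}^{2}}.
\]
For all $i=1,\ldots,n$ and $s\in[0,\tau]$, we have $R_{i.}(s)\le m\nu$, where $\nu$ is as in (8). Moreover, for $k=1,\ldots,4$, we have
\[
E[W_i^{r_{\min}+(k-1)}\exp\{-W_i\,m\,e^{\beta^{T}Z}\Lambda_0^\circ(\tau)\}] \le \phi_{ki}(\gamma^\circ,\Lambda,s) \le E[W_i^{r_{\max}+(k-1)}],
\]
where $r_{\max}=\arg\max_{1\le r\le m}E(W_i^{r})$ and $r_{\min}=\arg\min_{1\le r\le m}E(W_i^{r})$. Hence, $\eta_{1i}$ and $\eta_{2i}$ are bounded. In addition, the proof of Lemma 2 shows that $Y(s,\Lambda_0^\circ+r\Delta)$ is uniformly bounded away from zero for $n$ sufficiently large. Finally, in the consistency proof we obtained $\|\Delta\|=o(1)$. Therefore $h_i(r,s)$ is $o(1)$ uniformly in $r$ and $s$.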
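For intuition about the size of $\eta_{1i}$ and $\eta_{2i}$, it can help to look at one concrete frailty law. The sketch below assumes a gamma frailty with mean one and variance $\theta$, which is only one member of the general family treated here. In that case $\phi_k$ has a closed form, and the ratios reduce to $\psi=A/c$, $\eta_1=A/c^{2}$ and $\eta_2=2A/c^{3}$ with $A=N+1/\theta$ and $c=H+1/\theta$, which are the conditional mean, variance and third central moment of the frailty. The function names are hypothetical.

```python
import math


def log_phi(k, n_events, cum_haz, theta):
    """log of phi_k = E[W^(n_events + k - 1) * exp(-W * cum_haz)]
    for W ~ Gamma(shape 1/theta, scale theta), i.e. mean 1 and variance theta."""
    a = 1.0 / theta
    p = n_events + k - 1
    return (math.lgamma(a + p) - math.lgamma(a)
            - a * math.log(theta) - (a + p) * math.log(cum_haz + a))


def eta_terms(n_events, cum_haz, theta):
    """Return (psi, eta1, eta2) built from the ratios phi_k / phi_1."""
    lp = [log_phi(k, n_events, cum_haz, theta) for k in (1, 2, 3, 4)]
    r2 = math.exp(lp[1] - lp[0])  # phi_2 / phi_1
    r3 = math.exp(lp[2] - lp[0])  # phi_3 / phi_1
    r4 = math.exp(lp[3] - lp[0])  # phi_4 / phi_1
    psi = r2
    eta1 = r3 - r2 ** 2
    eta2 = 2.0 * r2 ** 3 + r4 - 3.0 * r2 * r3
    return psi, eta1, eta2
```

Because $c\ge 1/\theta>0$ and $N$ is at most the family size, all three quantities stay bounded in this special case, in line with the general argument above.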

6 Acknowledgements

We thank the referees for their helpful comments, and for calling our attention to the work of Dabrowska (2006a, 2006b).

7 References

Aalen, O. O. (1976). Nonparametric inference in connection with multiple decrement models. Scand. J. Statist. 3.

Abramowitz, M. and Stegun, I. A. (Eds.) (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing. New York: Dover.

Andersen, P. K., Borgan, O., Gill, R. D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Berlin: Springer-Verlag.

Andersen, P. K. and Gill, R. D. (1982). Cox's regression model for counting processes: A large sample study. Ann. Statist. 10.

Andersen, P. K., Klein, J. P., Knudsen, K. M. and Palacios, R. T. (1997). Estimation of variance in Cox's regression model with shared gamma frailty. Biometrics 53.

Bickel, P. (1985). Efficient testing in a class of transformation models. Bull. Int. Statist. Inst. 51, 53-81, Meeting 23, Amsterdam.

Bagdonavicius, V. B. and Nikulin, M. S. (1999). Generalized proportional hazards model based on modified partial likelihood. Lifetime Data Analysis 5.

Breslow, N. (1974). Covariance analysis of censored survival data. Biometrics 30.

Cox, D. R. (1972). Regression models and life tables (with discussion). J. R. Statist. Soc. B 34.

Dabrowska, D. (2006a). Estimation in a class of semi-parametric transformation models. In Optimality: Second Erich L. Lehmann Symposium, Institute of Mathematical Statistics Lecture Notes and Monographs Series Vol. 49 (J. Rojo, ed.). Beachwood, OH: Institute of Mathematical Statistics.

Dabrowska, D. (2006b). Information bounds and efficient estimation in a class of censored transformation models. Technical report. Available at arxiv:math.st/6888.

Fine, J. P., Glidden, D. V. and Lee, K. (2003). A simple estimator for a shared frailty regression model. J. R. Statist. Soc. B 65.

Foutz, R. V. (1977). On the unique consistent solution to the likelihood equation. J. Amer. Statist. Ass. 72.

Gill, R. D. (1985). Discussion of the paper by D. Clayton and J. Cuzick. J. R. Statist. Soc. A 148.

Gill, R. D. (1989). Non- and semi-parametric maximum likelihood estimators and the Von Mises method (Part 1). Scand. J. Statist. 16.

Gill, R. D. (1992). Marginal partial likelihood. Scand. J. Statist. 79.

Gorfine, M., Zucker, D. M. and Hsu, L. (2006). Prospective survival analysis with a general semiparametric shared frailty model - a pseudo full likelihood approach. Biometrika 93.

Hartman, P. (1973). Ordinary Differential Equations, 2nd ed. (reprinted, 1982). Boston: Birkhäuser.

Henderson, R. and Oman, P. (1999). Effect of frailty on marginal regression estimates in survival analysis. J. R. Statist. Soc. B 61.

Hougaard, P. (1986). Survival models for heterogeneous populations derived from stable distributions. Biometrika 73.

Hougaard, P. (2000). Analysis of Multivariate Survival Data. New York: Springer.

Klein, J. P. (1992). Semiparametric estimation of random effects using the Cox model based on the EM algorithm. Biometrics 48.

Lancaster, T. and Nickell, S. J. (1980). The analysis of re-employment probabilities for the unemployed. Journal of the Royal Statistical Society, Series A 143.

Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. J. R. Statist. Soc. B 44.

McGilchrist, C. A. (1993). REML estimation for survival models with frailty. Biometrics 49.

Murphy, S. A. (1994). Consistency in a proportional hazards model incorporating a random effect. Ann. Statist. 22.

Murphy, S. A. (1995). Asymptotic theory for the frailty model. Ann. Statist. 23.

Nielsen, G. G., Gill, R. D., Andersen, P. K. and Sorensen, T. I. (1992). A counting process approach to maximum likelihood estimation of frailty models. Scand. J. Statist. 19.

Parner, E. (1998). Asymptotic theory for the correlated gamma-frailty model. Ann. Statist. 26.

Ripatti, S. and Palmgren, J. (2000). Estimation of multivariate frailty models using penalized partial likelihood. Biometrics 56.

Vaida, F. and Xu, R. H. (2000). Proportional hazards model with random effects. Stat. in Med. 19.

Yang, S. and Prentice, R. L. (1999). Semiparametric inference in the proportional odds regression model. J. Amer. Statist. Ass. 94.

Zucker, D. M. (2005). A pseudo partial likelihood method for semi-parametric survival regression with covariate errors. J. Amer. Statist. Ass. 100.

Table 1: Simulation results for family size 2. A: Empirical mean. B: Empirical standard deviation. C: Estimated standard deviation. D: Coverage rate. E: Correlation.
Columns: θ; β; censoring %; β̂ (our approach, EM algorithm); θ̂ (our approach, EM algorithm).
Row blocks (each with rows A, B, C, D, E):
θ = 2, β = ln(2), censoring 35
β = ln(3), censoring 3
β = ln(2), censoring 5
β = ln(3), censoring 45

Table 2: Simulation results for family size 5. A: Empirical mean. B: Empirical standard deviation. C: Estimated standard deviation. D: Coverage rate. E: Correlation.
Columns: θ; β; censoring %; β̂ (our approach, EM algorithm); θ̂ (our approach, EM algorithm).
Row blocks (each with rows A, B, C, D, E):
θ = 2, β = ln(2), censoring 35
β = ln(3), censoring 3
β = ln(2), censoring 5
β = ln(3), censoring 45


More information

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data Xingqiu Zhao and Ying Zhang The Hong Kong Polytechnic University and Indiana University Abstract:

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

Survival Regression Models

Survival Regression Models Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant

More information

PQL Estimation Biases in Generalized Linear Mixed Models

PQL Estimation Biases in Generalized Linear Mixed Models PQL Estimation Biases in Generalized Linear Mixed Models Woncheol Jang Johan Lim March 18, 2006 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure for the generalized

More information

Invariant HPD credible sets and MAP estimators

Invariant HPD credible sets and MAP estimators Bayesian Analysis (007), Number 4, pp. 681 69 Invariant HPD credible sets and MAP estimators Pierre Druilhet and Jean-Michel Marin Abstract. MAP estimators and HPD credible sets are often criticized in

More information

Exercises. (a) Prove that m(t) =

Exercises. (a) Prove that m(t) = Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for

More information

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes Biometrics 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materials for A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian

More information

A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA

A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA Statistica Sinica 23 (2013), 383-408 doi:http://dx.doi.org/10.5705/ss.2011.151 A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA Chi-Chung Wen and Yi-Hau Chen

More information

Rank Regression Analysis of Multivariate Failure Time Data Based on Marginal Linear Models

Rank Regression Analysis of Multivariate Failure Time Data Based on Marginal Linear Models doi: 10.1111/j.1467-9469.2005.00487.x Published by Blacwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA Vol 33: 1 23, 2006 Ran Regression Analysis

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

1 Introduction. 2 Residuals in PH model

1 Introduction. 2 Residuals in PH model Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical

More information

Composite likelihood and two-stage estimation in family studies

Composite likelihood and two-stage estimation in family studies Biostatistics (2004), 5, 1,pp. 15 30 Printed in Great Britain Composite likelihood and two-stage estimation in family studies ELISABETH WREFORD ANDERSEN The Danish Epidemiology Science Centre, Statens

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

Comparing Distribution Functions via Empirical Likelihood

Comparing Distribution Functions via Empirical Likelihood Georgia State University ScholarWorks @ Georgia State University Mathematics and Statistics Faculty Publications Department of Mathematics and Statistics 25 Comparing Distribution Functions via Empirical

More information

TESTINGGOODNESSOFFITINTHECOX AALEN MODEL

TESTINGGOODNESSOFFITINTHECOX AALEN MODEL ROBUST 24 c JČMF 24 TESTINGGOODNESSOFFITINTHECOX AALEN MODEL David Kraus Keywords: Counting process, Cox Aalen model, goodness-of-fit, martingale, residual, survival analysis. Abstract: The Cox Aalen regression

More information

Continuous Time Survival in Latent Variable Models

Continuous Time Survival in Latent Variable Models Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract

More information

Reduced-rank hazard regression

Reduced-rank hazard regression Chapter 2 Reduced-rank hazard regression Abstract The Cox proportional hazards model is the most common method to analyze survival data. However, the proportional hazards assumption might not hold. The

More information

A note on L convergence of Neumann series approximation in missing data problems

A note on L convergence of Neumann series approximation in missing data problems A note on L convergence of Neumann series approximation in missing data problems Hua Yun Chen Division of Epidemiology & Biostatistics School of Public Health University of Illinois at Chicago 1603 West

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

EM Algorithm II. September 11, 2018

EM Algorithm II. September 11, 2018 EM Algorithm II September 11, 2018 Review EM 1/27 (Y obs, Y mis ) f (y obs, y mis θ), we observe Y obs but not Y mis Complete-data log likelihood: l C (θ Y obs, Y mis ) = log { f (Y obs, Y mis θ) Observed-data

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

and Comparison with NPMLE

and Comparison with NPMLE NONPARAMETRIC BAYES ESTIMATOR OF SURVIVAL FUNCTIONS FOR DOUBLY/INTERVAL CENSORED DATA and Comparison with NPMLE Mai Zhou Department of Statistics, University of Kentucky, Lexington, KY 40506 USA http://ms.uky.edu/

More information

On consistency of Kendall s tau under censoring

On consistency of Kendall s tau under censoring Biometria (28), 95, 4,pp. 997 11 C 28 Biometria Trust Printed in Great Britain doi: 1.193/biomet/asn37 Advance Access publication 17 September 28 On consistency of Kendall s tau under censoring BY DAVID

More information

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

More information

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data Efficiency Comparison Between Mean and Log-rank Tests for Recurrent Event Time Data Wenbin Lu Department of Statistics, North Carolina State University, Raleigh, NC 27695 Email: lu@stat.ncsu.edu Summary.

More information

Goodness-of-fit tests for the cure rate in a mixture cure model

Goodness-of-fit tests for the cure rate in a mixture cure model Biometrika (217), 13, 1, pp. 1 7 Printed in Great Britain Advance Access publication on 31 July 216 Goodness-of-fit tests for the cure rate in a mixture cure model BY U.U. MÜLLER Department of Statistics,

More information

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures David Hunter Pennsylvania State University, USA Joint work with: Tom Hettmansperger, Hoben Thomas, Didier Chauveau, Pierre Vandekerkhove,

More information

Two-level lognormal frailty model and competing risks model with missing cause of failure

Two-level lognormal frailty model and competing risks model with missing cause of failure University of Iowa Iowa Research Online Theses and Dissertations Spring 2012 Two-level lognormal frailty model and competing risks model with missing cause of failure Xiongwen Tang University of Iowa Copyright

More information

6 Pattern Mixture Models

6 Pattern Mixture Models 6 Pattern Mixture Models A common theme underlying the methods we have discussed so far is that interest focuses on making inference on parameters in a parametric or semiparametric model for the full data

More information

Asymptotic statistics using the Functional Delta Method

Asymptotic statistics using the Functional Delta Method Quantiles, Order Statistics and L-Statsitics TU Kaiserslautern 15. Februar 2015 Motivation Functional The delta method introduced in chapter 3 is an useful technique to turn the weak convergence of random

More information

6. Fractional Imputation in Survey Sampling

6. Fractional Imputation in Survey Sampling 6. Fractional Imputation in Survey Sampling 1 Introduction Consider a finite population of N units identified by a set of indices U = {1, 2,, N} with N known. Associated with each unit i in the population

More information

Large sample theory for merged data from multiple sources

Large sample theory for merged data from multiple sources Large sample theory for merged data from multiple sources Takumi Saegusa University of Maryland Division of Statistics August 22 2018 Section 1 Introduction Problem: Data Integration Massive data are collected

More information

Attributable Risk Function in the Proportional Hazards Model

Attributable Risk Function in the Proportional Hazards Model UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

More information

8. Parametric models in survival analysis General accelerated failure time models for parametric regression

8. Parametric models in survival analysis General accelerated failure time models for parametric regression 8. Parametric models in survival analysis 8.1. General accelerated failure time models for parametric regression The accelerated failure time model Let T be the time to event and x be a vector of covariates.

More information