A semiparametric generalized proportional hazards model for right-censored data

Size: px
Start display at page:

Download "A semiparametric generalized proportional hazards model for right-censored data"

Transcription

1 Ann Inst Stat Math (216) 68: DOI 1.17/s A semiparametric generalized proportional hazards model for right-censored data M. L. Avendaño M. C. Pardo Received: 28 March 213 / Revised: 3 September 214 / Published online: 21 November 214 The Institute of Statistical Mathematics, Tokyo 214 Abstract We introduce a flexible family of semiparametric generalized logit-based regression models for survival analysis. Its hazard rates are proportional as the Cox model, but its relative risk related to a covariate is different for the values of the other covariates. The method of partial likelihood approach is applied to estimate its parameters in presence of right censoring and its asymptotic normality is established. We perform a simulation study to evaluate the finite-sample performance of these estimators. This new family of models is illustrated with lung cancer data and compared with Cox model. The importance of the conclusions obtained from the relative risk is pointed out. Keywords Survival analysis Proportional hazards Type-I generalized logistic distribution Semiparametric models Profile likelihood Partial likelihood 1 Introduction Analysis of event times (also referred as survival analysis) deals with data representing the time to a well-defined event. These data arise in engineering, economy, reliability, public health, biomedicine and other areas. Data arising from survival analysis often M. L. Avendaño (B) Faculty of Mathematics, University of Veracruz, Circuito Gonzalo Aguirre Beltrán, s/n, Zona Universitaria, 919 Xalapa, Mexico maravend@ucm.es M. C. Pardo Department of Statistics and Operational Research I, Faculty of Mathematics, Complutense University of Madrid, Plaza de las Ciencias 3, Ciudad Universitaria, 284 Madrid, Spain

2 354 M. L. Avendaño, M. C. Pardo consist of a response variable that measures the duration of time until the occurrence of a specific event and a set of variables (covariates) thought to be associated with the event-time variable, they have some features that present difficulties to traditional statistical methods. The first is that the data are generally asymmetrically distributed, while the second feature is that lifetimes are frequently censored (the end point of interest has not been observed for that individual). Regression models for survival data have traditionally been based on the Cox regression model (Cox 1972). One of the reasons for the popularity of this model is because the unknown parameter can be estimated by partial likelihood without putting a structure on baseline hazard. This model requires that the hazards between any two individuals are proportional across time. There are many works whose aim has been to extend the Cox model in different ways. Some of them appealing the necessity to allow non-proportional hazards such as Aranda-Ordaz (1983), Tibshirani and Ciampi (1983), Etezadi-Amoli and Ciampi (1987), Thomas (1986), Sasieni (1995), Clayton (1978), Hougaard (1984), Younes and Lachin (1997), MacKenzie (1996, 1997) and Devarajan and Ebrahimi (211). However, when the assumption of proportional hazards is tenable, a Cox regression model is usually the preferred model. Therefore, proportional hazards models that maintain the good properties of the Cox model, but give flexibility to the model are welcome. As mentioned above, regression models for survival data have traditionally been based on the proportional hazards model of Cox (1972) which is defined through the hazard function λ(t z) of the form λ(t β, z) = λ (t) exp(β z), (1) where λ ( ) is an arbitrary function of time called baseline hazard function, z = (z 1,...,z d ) is a vector of covariates for the individual at time t, and β = (β 1,...,β d ) is a vector of unknown parameters to be estimated. In case that the baseline hazard function is treated non-parametrically, then this model becomes a semiparametric model. In this paper, we introduce a flexible semiparametric family of proportional hazards models based on replacing the exponential link function in (1) by a generalization of the logistic distribution (see Balakrishnan 1992) called Type-I generalized logistic distribution in Sect. 2. The partial likelihood approach is proposed to estimate the covariate effects of the model considering right-censored data and an application is shown in Sects. 3 and 4, respectively. To prove the asymptotic normality of the partial likelihood estimator in our model, first we establish the equivalence of the partial likelihood estimator and the profile likelihood estimator for our model. Second, the efficiency of the profile likelihood estimator for our semiparametric model is proven. This proof is based on the method of an approximate least favorable submodel used by Murphy and van der Vaart (2) with the Cox regression model and it is described in Sect. 5, but developed in an Appendix. Finally, the small sample performance of the maximum partial likelihood estimators of the regression parameters in threeand four-covariate hazard function models is evaluated through a simulation study in Sect. 6.

3 A semiparametric generalized PH model for right-censoring The generalized proportional hazards model The Cox model given in (1) can be generalized by replacing the exponential function in (1) by a generalization of the logistic distribution called Type-I generalized logistic distribution which is given by ( ) 1 α F(y) = 1 + e y, < y <, α>, where the parameter α is a proportionality constant for the distribution. See Balakrishnan (1992). Therefore, we propose a proportional hazards model defined through the hazard function λ(t θ, z) = λ (t)k (θ, z), (2) with ( ) 1 α K (θ, z) = 1 + exp( β, α >, z) where θ = (β,α), which we call generalized logit-link proportional hazards model. In fact, if α = orβ =, the model reduces to a non-parametric form. The cumulative hazard function for the model (2) is given by (t θ, z) = (t)k (θ, z), where (t) = t λ (u)du is the baseline cumulative hazard function. Thus, the survival function corresponding to the hazard model is S(t θ, z) = exp( (t θ, z)), and Eq. (2) characterizes the model with density given by f (t θ, z) = exp( (t θ, z))λ(t θ, z). Note that, in the particular case α = 1, we obtain the proportional hazards model with a logit-link function exp(β z) λ(t β, z) = λ (t) 1 + exp(β z). For two-covariate profiles z i and z j, the hazard ratio is proportional and the relative risk does not depend on t as

4 356 M. L. Avendaño, M. C. Pardo ρ(t θ, z i, z j ) = λ(t θ, z i) λ(t θ, z j ) ( 1 + exp( β z j ) = 1 + exp( β z i ) = ρ(t β,α,z i, z j ). The interpretation of the hazard ratio of model (2) is similar to the Cox model hazard ratio since if ρ(t θ, z i, z j ) = 1, then the individuals with covariates z i and z j have the same relative risk of death and if ρ(t θ, z i, z j )<1(>1), the individual with covariate z i has lower (higher) relative risk of death than the individual with covariate z j. As particular case, when the covariate z m increases z m +1 and the rest of covariates remain equal the relative risk is equal (lower or higher) if and only if β m = (< or >), respectively, and the parameter α is a proportionality parameter for the model since the relative risk increases or decreases when α increases depending if the relative risk is less or greater than 1. Moreover, this parameter gives flexibility to the model to adapt better to the data. For illustrative purposes, we consider a two-covariate model, the first one a continuous covariate and the second one a 4-level factor covariate. The 4-level covariate is transformed in an indicator 3-dimensional vector. So, we obtain a full model given by ( ) 1 α λ(t θ, z) = λ (t), α >. 1 + exp( (β 1 z 1 + β 2 z 2 + β 3 z 3 + β 4 z 4 )) In this case, if we take the relative risk of individuals with values z 1 = z 1i and z 1 = z 1 j, respectively, at the first covariate and the same level at the second covariate, we have that the relative risk depend on the level of the second covariate, that is: ) α Level I: Level II: Level III: Level IV: ) α ( 1 + exp( z1 j β 1 ) 1 + exp( z 1i β 1 ) ( 1 + exp( z1 j β 1 β 2 ) ) α 1 + exp( z 1i ˆβ 1 β 2 ) ( ) 1 + exp( z1 j β 1 β 3 ) α 1 + exp( z 1i β 1 β 3 ) ( ) 1 + exp( z1 j β 1 β 4 ) α. 1 + exp( z 1i β 1 β 4 ) Note that, in this case, the relative risk under Cox model (1)isgivenby ρ Cox (t β, z i, z j ) = exp(β 1 (z 1i z 1 j )) which does not depend on the second covariate level. Therefore, the proposed model gives more flexibility to the relative risk.

5 A semiparametric generalized PH model for right-censoring 357 Up to now, we have not made any assumption about the baseline hazard function λ ( ), but in the following we assume an unknown functional form for the baseline hazard function of the model, so we have a semiparametric model. 3 Estimation procedure Consider a right-censored data sample of size n where the observed data are the iid triplets (X i,δ i, Z i ) where X i = min(t i, C i ) is the observed time with T i and C i the survival and censoring time, respectively, δ i = 1 {Ti C i } is the indicator variable of censoring and Z i is the vector of covariates for i = 1,...,n. In this case, analogously to Cox (1972), we can construct the partial likelihood function for the sample, and it can be written as L n (θ) = n i=1 K (θ, z i ) K (θ, z j ) j R(x i ) where R(x i ) is the risk set at time x i, and the partial log-likelihood function is given by l n (θ) = ln(l n (θ)) n = ln(k (θ, z i )) ln δ i i=1 j R(x i ) δ i, K (θ, z j ). (3) Then, the maximum partial log-likelihood estimator of the unknown vector of parameters of model (2)isgivenby ˆθ n = ( ˆβ n, ˆα n ) = arg max l n(θ), (4) θ=(β,α>) and the variance of the parameter estimator ˆθ n is var(ˆθ n ) = diag(î 1 n (ˆθ n )), where Î n is the observed information matrix. To obtain (4), we would need to use numerical methods for carrying out the required constrained (α >) optimization. However, the numerical algorithms are high sensitive to initial values of the parameters and they fall at local optimums. To avoid it, we consider {α 1,...,α l } a grid for the value of the parameter α then we maximize the partial log-likelihood function so we obtain l estimations of β, {β 1,...,β l } with partial log-likelihood function values {l n (α 1, β 1 ),...,l n (α l, β l )}. Finally, we consider arg max{l n (α 1, β 1 ),..., l n (α l, β l )} as estimations of the parameters (α, β).

6 l l l 358 M. L. Avendaño, M. C. Pardo beta alpha beta alpha alpha beta Fig. 1 Surface for partial log-likelihood function To check how the method employed works and also if there exists an identifiable problem for β and α, weshowinfig.1 a surface with the values of the partial loglikelihood function for different values of α and β 1 for a data set of size 5 generated from our model with the covariate z 1 a Bernoulli of parameter.45 and z 2 = for α = 2.5 and β 1 = β 2 = 1. It shows clearly that the method employed works very well. Furthermore, we graph the level curves for the partial log-likelihood function in Fig. 2. The true value of the parameters is the red square and the blue circle is that which achieves the maximum value for the partial log-likelihood function. It shows there is non-identifiability issue even if we have only one covariate. 4 Veteran s administration lung cancer data We present results from fitting the generalized proportional hazards model to a dataset from the Veteran s Administration Lung Cancer Study Clinical Trial (Kalbfleisch and Prentice 22). In this clinical trial, males with advanced inoperable lung cancer were randomized to either a standard or test chemotherapy. The primary end point for therapy comparison was time to death. Only nine out of the 137 survival times were censored, so the censoring rate is 6.6 %. The dataset includes six covariates: treatment (1 = standard and 2 = test), age at diagnosis (in years), Karnofsky score (from to 1), diagnosis time, cell type (1 = Squamous, 2 = Small, 3 = Adenocarcinoma and 4 = Large) and prior therapy. The primary purpose of this clinical trial was to

7 A semiparametric generalized PH model for right-censoring 359 Partial log likelihood Partial log likelihood α α β 1 5 levels β 1 1 levels Fig. 2 Level curves for partial log-likelihood function Table 1 Fit of models for veteran s administration lung cancer data Covariate PH GLPH Est SE Est SE Karnofsky Cell type Cell type Cell type α AIC investigate whether the new chemotherapy works better or worse than the standard chemotherapy after adjusting for other covariates. The Veteran data were used by Kalbfleisch and Prentice (22) to illustrate the Cox proportional hazards model and the treatment effect was found to be non-significant. The veteran cancer data can be obtained from the randomsurvivalforest Package of the R statistical software, these data are called veteran. In this example, we consider only two covariates: Karnofsky score and cell type (factor with four levels). We fit the Cox proportional hazards model (PH) and the generalized logit-link proportional hazards model (GLPH). In Table 1, we can see the parameter estimates (Est) and their standard errors (SE) for the two fitted models, for GLPH model we use parametric bootstrap with 5 replicas to obtain the SE of the parameters. We compare the fit of the two models with the the Akaike s entropy-based Information Criterion (AIC). From these results, we conclude that the fit of the two models is similar. On the other hand, the interpretation of regression coefficients is similar in both models. However, the interpretation of their relative risks differs. We consider two individuals with a difference of 1 in the Karnofsky score value with the same cell type. Under PH model the relative risk is

8 36 M. L. Avendaño, M. C. Pardo exp(1 ˆβ 1 ).73. However, under the GLPH model the relative risk depends on the Cell type of the individuals. By way of illustration, we consider individuals with a difference of Karnofsky score of 1, for example, individuals with Karnofsky score of 2 and 1, respectively: ( ) ˆα 1 + exp( 1 ˆβ 1 ) Cell type 1 : exp( 2 ˆβ 1 ) ( ) ˆα 1 + exp( 1 ˆβ 1 ˆβ 2 ) Cell type 2 : exp( 2 ˆβ 1 ˆβ 2 ) ( ) ˆα 1 + exp( 1 ˆβ 1 ˆβ 3 ) Cell type 3 : exp( 2 ˆβ 1 ˆβ 3 ) ( ) ˆα 1 + exp( 1 ˆβ 1 ˆβ 4 ) Cell type 4 : exp( 2 ˆβ 1 ˆβ 4 ) Note that, an individual with a Karnofsky score of 2 compared with an individual with a Karnofsky score of 1 has lower relative risk, but now it is different depending on the cell type. The individuals with cell type 1 have the lowest risk. The individuals with cell type 3 have the highest risk. On other hand, considering individuals with Karnofsky score of 1 and 9, respectively (also a difference of 1 ), the relative risks are: ( ) ˆα 1 + exp( 9 ˆβ 1 ) Cell type 1 : exp( 1 ˆβ 1 ) ( ) ˆα 1 + exp( 9 ˆβ 1 ˆβ 2 ) Cell type 2 : exp( 1 ˆβ 1 ˆβ 2 ) ( ) ˆα 1 + exp( 9 ˆβ 1 ˆβ 3 ) Cell type 3 : exp( 1 ˆβ 1 ˆβ 3 ) ( ) ˆα 1 + exp( 9 ˆβ 1 ˆβ 4 ) Cell type 4 :.7, 1 + exp( 1 ˆβ 1 ˆβ 4 ) thus, an individual with a Karnofsky score of 1 compared with an individual with a Karnofsky score of 9 has lower relative risk to die than when we compare to individuals with Karnofsky scores of 1 and 2, respectively, and this risk is different depending on the cell type. The individuals with cell type 1 have the lowest risk. The individuals with cell type 3 have the highest risk. To sum up, the relative risk under the Cox model is independent from the Cell type and also the concrete Karnofsky score since only it takes into account the difference

9 A semiparametric generalized PH model for right-censoring 361 between the score of the individuals. Therefore, the proposed model provides useful information from its relative risk. 5 Asymptotic normal distribution of the parameter estimators To make inference, we establish the asymptotic normal distribution of the parameter estimator of the generalized proportional hazards model. To do it, we use the main Theorem of Murphy and van der Vaart (2) and its Corollary 1, (see Kosorok 28). Note that, this theorem considers a maximum profile log-likelihood estimator and we propose a maximum partial log-likelihood estimator, then it is necessary to verify that both estimators are equivalent. Consider the semiparametric model (θ, ) P θ,, where θ is the regression parameter and the cumulative hazard function in the model (2) and P θ, is the data distribution function. Recall that an observation with right censoring is given by Y = (X,δ,Z) where X = min(t, C), δ = 1 {T C} and Z R d is the covariate vector. We assume that T is the survival time with cumulative hazard function given Z (t z) = (t)k (θ, z) and C is a censoring time independent of T given Z and uninformative of (θ, ). In this case, the density function of y = (x,δ,z) is given by p θ, (y) =[λ (x)k (θ, z)s T Z (x z)s C Z (x z)] δ [S T Z (x z)p C Z (x z)] 1 δ p Z (z). Thus, the log-likelihood for the observation y is given by l(θ, )(y) = log p θ, (y) = δ[log λ (x) + log K (θ, z)] (x)k (θ, z) + c (5) where c is a constant. Note that, the log-likelihood function depends on the parameters θ and. Furthermore, in Appendix A it is shown that its efficient score function for θ is l θ, (y) = δ( K (θ, z) h (x)) K (θ, z) x ( K (θ, z) h (s))d (s), (6) where with K (θ, z) = ([K 1 (θ, z)], K 2 (θ, z)) (7) αz K 1 (θ, z) = 1 + exp(β z), K 2 (θ, z) = log[k (θ, z)] and h : R R d+1 is the least favorable direction (see Appendix A ).

10 362 M. L. Avendaño, M. C. Pardo We consider the log-likelihood function based on a random sample of n observations y 1,...,y n which is given by l n (θ, ) = n l(θ, )(y i ), i=1 then the profile log-likelihood for θ is defined as pl n (θ) = sup l n (θ, ). (8) Note that, the maximum likelihood estimator for θ, the first component of the pair (ˆθ n, ˆ ) that maximizes l n (θ, ), is the maximizer of the profile loglikelihood function θ pl n (θ). Then, θ n denote the maximum profile log-likelihood estimator. To establish the asymptotic properties of the parameter estimates θ n we assume the following: (i) Exist τ < such that P (C τ) = P (C = τ) >, where P is the true probability measure. (ii) The observed time X [,τ]. (iii) Let H be all monotone increasing continuous functions on [,τ] with () =. We define Ĥ to be the set of all monotone increasing Cadlag functions on [,τ] with () = and Ĥ, where is the true value of the parameter. (iv) The vector Z R d is bounded. (v) Let R d R + be a compact set such as θ, where θ is the true value of the parameter. (vi) The efficient Fisher information matrix Ĩ is invertible. Theorem 1 Under regularity conditions (i) (vi), the maximum profile log-likelihood estimator θ n for the generalized proportional hazards model (2) obtained maximizing (8) is asymptotically normal distributed as n( θ n θ ) d n N (, Ĩ 1 ) where Ĩ = P [( l θ, )( l θ, ) ] with l θ, given in (6). Proof To apply Corollary 1 of the main result of Murphy and van der Vaart (2) we need to verify Conditions (8) (11) of Murphy and van der Vaart (2) and also conditions of its Theorem 1. Let us define the least favorable submodel t t (θ, )

11 A semiparametric generalized PH model for right-censoring 363 with t R d+1, where t (θ, )(s) = t (s) = s (1 + (θ t) h (u))d (u), where h : R R d+1 is the least favorable direction at the true parameter (θ, ). For t small enough t (θ, ) Ĥ. Moreover, we obtain θ (θ, )( ) = θ ( ) = ( ), that is, Condition (8) of Murphy and van der Vaart (2) is satisfied. The least favorable direction h ( ) is given by h (x) = P [K (θ, z)1 {X x} ] P [ K (θ, z)k (θ, z)1 {X x} ], with K (θ, z) as (7). Details about the construction of h ( ) is given in Appendix A at the end of this work. On the other hand, we consider the map t l(t, θ, )(y) defined by l(t, θ, )(y) = log p t, t (y), (9) thus the log-likelihood for the data y under the least favorable submodel is given by l(t, θ, )(y) = δ[log((1 + (θ t) h (x))λ (x)) + log K (t, z)] K (t, z) and its score function for t is given by x (1 + (θ t) h (s))d (s) l(t, θ, )(y) = t l(t, θ, )(y) (1) = δ K (t, z) τ δh (x) (1 + (θ t) h (x)) + τ ( = K (t, z) K (t, z)k (t, z)1 {X s} (1 + (θ t) h (s))d (s) τ h (s) 1 + (θ t) h (s) K (t, z)1 {X s} h (s)d (s) (11) ) dm(s), (12) with dm(s) = d1 {X s} δ K (t, z)1 {X s} (1 + (θ t) h (s))d (s). Thus, we have l(θ, θ, )(y) = l θ, (y), this is, Condition (9) of Murphy and van der Vaart (2) is satisfied.

12 364 M. L. Avendaño, M. C. Pardo In addition, considering the likelihood L(θ, ), the maximizer of L(θ, ) has the form t P n [dn(u)] ˆ θ (t) = P n [1 {X u} K (θ, z)], (13) with P n the empirical distribution function and N(u) = 1 {X u} δ. This estimator is equivalent to the Breslow cumulative baseline hazard estimator in Cox model. Note that, the estimator (13) depends on the parameter θ. In Theorem 5 ( Appendix B ) it P is verified that ˆ θ n is uniformly consistent for for any sequence θ n θ. Thus, Condition (1) of Murphy and van der Vaart (2) is satisfied. The non-bias Condition (11) of Murphy and van der Vaart (2) is verified in Lemma 6 in Appendix B of this work. Finally, standard arguments allow to prove Donsker and Glivenko Cantelli conditions of Theorem 1 of Murphy and van der Vaart (2) (see Appendix B ). Then, we can apply Theorem 1 of Murphy and van der Vaart (2) and we obtain log pl n ( θ n ) = log pl n (θ ) + ( θ n θ ) n l (y i ) i=1 1 2 n( θ n θ ) Ĩ ( θ n θ ) + o P ( n θ n θ +1) 2 where l is the efficient score function for θ at (θ, )givenin(6) and Ĩ is the efficient Fisher matrix for θ at (θ, ). Now, assuming that Ĩ is invertible (Assumption iv)) and Theorem 5 of Appendix B, we can apply Corollary 1 of Murphy and van der Vaart (2) and we obtain n( θ n θ ) d n where, the efficient Fisher matrix is given by N (, Ĩ 1 ), Ĩ = P [( l θ, )( l θ, ) ]. Last theorem establishes the asymptotic distribution of the maximum profile loglikelihood estimator of the unknown parameters of model (2), but we propose to use maximum partial log-likelihood estimator. Nevertheless, next lemma shows equivalence of the partial likelihood and profile likelihood estimators. Lemma 2 The estimator θ n based on the profile log-likelihood for the model (2) under right censoring and the estimator ˆθ n based on the partial log-likelihood function (3) are equivalent.

13 A semiparametric generalized PH model for right-censoring 365 Proof We define the function N i (x) = 1 {Xi x}δ i, thus the partial likelihood function for the sample is given by L n (θ) = n the partial log-likelihood is i=1 x τ ( 1 {Xi x}k (θ, z i ) nj=1 1 {X j x}k (θ, z j ) ) Ni (x), l n (θ) = log(l n (θ)) n τ n = log[1 {Xi x}k (θ, z i )] log 1 {X j x}k (θ, z j ) dn i (x), i=1 and its score function is given by [ ( τ θ l n(θ) = np n K (θ, z) P ) ] n[1 {X x} K (θ, z)k (θ, z)] dn(x). (14) P n [1 {X x} K (θ, z)] On the other hand, if we consider the second term of the score efficient function (6) and the empirical density function we have P n [ K (θ, z) [,x] ( j=1 K (θ, z) P ) ] n[1 {X x} K (θ, z)k (θ, z)] d ˆ (s) P n [1 {X x} K (θ, z)] τ ( = P n [1 {X x} K (θ, z)k (θ, z)] P n [1 {X x} K (θ, z)] P n[1 {X x} K (θ, z)k (θ, z)] P n [1 {X x} K (θ, z)] =, ) d ˆ (s) with ˆ ( ) the estimator of the cumulative baseline hazard function. Taking the first term of (6) wehave ( P n [δ K (θ, z) E )] n[1 {X x} K (θ, z)k (θ, z)] E n [1 {X x} K (θ, z)] [ ( τ = P n K (θ, z) E ) ] n[1 {X x} K (θ, z)k (θ, z)] dn(x). E n [1 {X x} K (θ, z)] Thus, the estimator ˆθ n and the estimator θ n are equivalent.

14 366 M. L. Avendaño, M. C. Pardo 6 Simulation study The aim of this section is to check the normality of the estimators of the covariate effects of model (2) for finite-sample through a simulation study. The survival times t were generated under the model λ(t θ, z) = λ (t)k (θ, z), with ( ) 1 α K (θ, z) = 1 + exp( β, α > fixed z) as t = 1 ( ) log(u), K (θ, z) where u U(, 1) (Bender et al. (25)). We assume that the cumulative baseline hazard function is a Weibull distribution with shape parameter γ = 3.2 and scale parameter μ = 1.1. We consider three different experiment designs: First Design: We consider Z β = Z 1 β 1 + Z 2 β 2 + Z 3 β 3 + Z 4 β 4 with true values of the parameters β = (.5,.2,.3,.1) and α = 8, the covariates Z 1, Z 2 and Z 3 are binary variables that were generated from a factor of 4 levels and each level has a probability of {.37,.19,.3,.14}, respectively, and Z 4 is generated from a U( 1, 1). Second Design: We consider Z β = Z 1 β 1 + Z 2 β 2 + Z 3 β 3 with true values of the parameters β = (.3,.6,.1) and α = 8, the covariates Z 1, Z 2 and Z 3 were generated from a B(.35), U( 1, 1) and N (1.5, 2), respectively. Third Design: We consider Z β = Z 1 β 1 + Z 2 β 2 with true values of the parameters β = (2, 1) and α =.2, the covariates Z 1 and Z 2 were generated from a N (, 1) and U( 1, 1), respectively. Several sample sizes n = 1, 2, 4 and different percent of censoring 15, 45 and 7 % for First and Second designs were considered and only 15 % of censoring for Third design was considered. A total of 5 simulated datasets were generated for each configuration of three designs. In Sect. 5, we proved the asymptotic normality of the estimators of the covariate effects and now we investigate this property for small and moderate sample sizes. Therefore, we fit a generalized logit-based proportional hazard model for each one of the 5 simulate datasets. In Tables 2, 3 and 5, we present the average and the median of the estimators of the covariate effects, their estimated standard errors (SE), empirical standard errors (SEe), the p value for the Shapiro Wilks test for testing normality by marginals and the coverage probabilities (CP) of a 9, 95 and 99 % confidence interval by marginals, respectively. In Table 4, the p values for the Henze Zirkler multivariate normality test are presented for the First and Second Designs for each sample size

15 A semiparametric generalized PH model for right-censoring 367 Table 2 Average and median of the parameter estimates, standard error (SE), empirical standard error (SEe), p value for Shapiro Wilks test and coverage probabilities (CP) for a nominal 9, 95 and 99 %, respectively, for the first design Censoring Sample size Parameter Average Median SE SEe p value CP for.9 CP for.95 CP for % 1 β β β β β β β β β β β β % 1 β β β β β β β β β β

16 368 M. L. Avendaño, M. C. Pardo Table 2 continued Censoring Sample size Parameter Average Median SE SEe p value CP for.9 CP for.95 CP for.99 β β % 1 β β β β β β β β β β β β

17 A semiparametric generalized PH model for right-censoring 369 Table 3 Average and median of the parameter estimates, standard error (SE), empirical standard error (SEe), p value for Shapiro Wilks test and coverage probabilities (CP) for a nominal 9, 95 and 99 %, respectively, for the second design Censoring Sample size Parameter Average Median SE SEe p value CP for.9 CP for.95 CP for % 1 β β β β β β β β β % 1 β β β β β β β β β

18 37 M. L. Avendaño, M. C. Pardo Table 3 continued Censoring Sample size Parameter Average Median SE SEe p value CP for.9 CP for.95 CP for.99 7 % 1 β β β β β β β β β

19 A semiparametric generalized PH model for right-censoring 371 Table 4 p Values for the Henze Zirkler test Censoring Sample size p values Design 1 p values Design 2 15 % % % and each censoring percentage. Moreover, Fig. 3 shows the estimated density plots of the standardized parameter estimates (by marginals) to check the normality of the estimator of β, and Fig. 4 shows the multivariate qq-plots, for sample size 4 for First and Second designs and different censoring. From Tables 2 and 3 it can be seen that the average of the estimates of the parameters is really close to the true value of the parameters for all configurations in both designs. Furthermore, the estimated standard errors are very near to the empirical ones. The empirical coverage probabilities appear to be quite close to the nominal levels. Although the p values of the Shapiro Wilks normality test increase with n for most censoring, some p values are too small for not rejecting the normality of some estimations, even for n = 4. This can be caused by the optimization algorithm used that it favors one estimator over the others. However, the p values of the Henze Zirkler multivariate normality test (Table 4) accept the multivariate normality for n = 4. Moreover, sometimes the results of the formal tests contradict with the impressions of the graphical analysis. In particular for large sample sizes the blind trust on the formal tests for normality can lead to erroneous results. Figure 3 shows that the empirical density plots (by marginals) improve when n increases and these plots are near to a normal density for moderate censoring, even for small sample size and Fig. 4 supports the approach to a multivariate normality distribution based on the qq-plots for n = 4. From Table 5, we can see that the estimation of the parameters is bad, however, it is expected because for the Third design α is near to zero, so the model approaches to a non-parametric model. Therefore, we should prevent to use this model for α less than.5 since the estimations get worse cause the model approaches to non-parametric form. Appendix A: The efficient score function and the least favorable direction To find the efficient score function for the model (2), we follow the steps in the Cox model Example of Murphy and van der Vaart (2).

20 372 M. L. Avendaño, M. C. Pardo Table 5 Average and median of the parameter estimates, standard error (SE), empirical standard error (SEe), p value for Shapiro Wilks test and coverage probabilities (CP) for a nominal 9, 95 and 99 %, respectively, for the third design Censoring Sample size Parameter Average Median SE SEe p value CP for.9 CP for.95 CP for % 1 β β β β β β

21 A semiparametric generalized PH model for right-censoring 373 a First design β^4 Second design n=1 n=1 n=1 n=1 β^ n=2 n=2 n=2 n=2 β^ n=1 n=1 n= n=2 n=2 n= b n=4 n=4 n=4 n=4 β^4 n=4 n=4 n= n=1 n=1 n=1 n=1 β^ n=2 n=2 n=2 n=2 β^ n=1 n=1 n= n=2 n=2 n= c n=4 n=4 n=4 n=4 β^4 n=4 n=4 n= n=1 n=1 n=1 n=1 β^ n=2 n=2 n=2 n=2 β^ n=1 n=1 n= n=2 n=2 n= n=4 n=4 n=4 n=4 n=4 n=4 n=4 Fig. 3 Empirical density plots for the standardized parameter estimates for sample sizes n = 1, 2 and 4 for the First and Second design with a 15 % of censoring, b 45 % of censoring and c 7 % of censoring

22 374 M. L. Avendaño, M. C. Pardo a First design Chi Square Q Q Plot Second design Chi Square Q Q Plot Squared Mahalanobis Distance Squared Mahalanobis Distance Chi Square Quantile Chi Square Quantile b Squared Mahalanobis Distance Chi Square Q Q Plot Squared Mahalanobis Distance Chi Square Q Q Plot Chi Square Quantile Chi Square Quantile c Squared Mahalanobis Distance Chi Square Q Q Plot Squared Mahalanobis Distance Chi Square Q Q Plot Chi Square Quantile Chi Square Quantile Fig. 4 Multivariate qq-plots for sample sizes n = 4 for the First and Second design with a 15 % of censoring, b 45 % of censoring and c 7 % of censoring

23 A semiparametric generalized PH model for right-censoring 375 The efficient score function for θ at (θ, ) is defined as l θ, (y) = l θ, (y) θ, l θ, (y) (15) where l θ, (y) is the score function for θ at (θ, ) and θ, minimizes the squared distance P ( l θ, g)2 over all functions g in the closed linear span of the score functions for. In this case, as in the Cox model, we have θ, l θ, = A θ, h (16) h = (A θ, A θ, ) 1 A l θ, θ, (17) where A θ, h is the score function for at (θ, ), A is the adjoint operator θ, and h is the least favorable direction. Then, the score function for θ of the model (2) is given by From (5) wehave with and l(θ, )(y) = ([ ] β l(θ, )(y), ) α l(θ, )(y). β l(θ, )(y) = δk 1 (θ, z) (x)k 1 (θ, z)k (θ, z), α l(θ, )(y) = δk 2 (θ, z) (x)k 2 (θ, z)k (θ, z), K 1 (θ, z) = αz 1 + exp(β z) K 2 (θ, z) = ln[k (θ, z)]. Note that, the score function for θ can be expressed as l(θ, )(y) = δ K (θ, z) (x) K (θ, z)k (θ, z), (18) with K (θ, z) = (K 1 (θ, z), K 2 (θ, z)).

24 376 M. L. Avendaño, M. C. Pardo To obtain the score function for, we define the path t (x) = x (1 + t h(s))d (s) (19) for t R d+1, ( ) given and a bounded function h :[,τ] R d+1. The path t ( ) Ĥ if t. Replacing the path (19) in the log-likelihood (5) we obtain l(θ, t )(y) = δ[log((1 + t h(x))λ (x)) + log K (θ, z)] K (θ, z) x thus, its derivative with respect to t is given by t l(θ, h(x) t)(y) = δ (1 + t h(x)) and taking t =, we can get t l(θ, t)(y) = δh(x) K (θ, z) t= (1 + t h(s))d (s) + c, x K (θ, z) h(s)d (s), x h(s)d (s) = A θ, h(y). (2) Then, if we consider that h is the least favorable direction and by expressions (15), (16), (18) and (2) the efficient score function for θ is given by l θ, (y) = δ( K (θ, z) h (x)) K (θ, z) x ( K (θ, z) h (s))d (s). In addition, to find the expression of the least favorable direction h we use (17), so it is necessary to obtain the expressions of A θ, l(θ, )(y) and A θ, A θ, (x) where A θ, is the adjoint operator characterized by P θ, [(A θ, h)g] = [h(a θ, g)]. Using the last expression we can get P θ, [(A θ, g)(a θ, h)] = [g(a θ, A θ, h)] (21) and [(A θ, l(θ, ))h] =P θ, [ l(θ, )(A θ, h)]. (22)

25 A semiparametric generalized PH model for right-censoring 377 First, we calculate (A θ, g)(a θ, h) that it is given by x (A θ, g(y)) (A θ, h(y)) = δg(x) h(x) δk (θ, z) [g(x) h(s)d (s) x ] + h(x) g(s)d (s) ( x ) x + (K (θ, z)) 2 g(s)d (s) h(s)d (s). We define x x a(x) = g(x) h(s)d (s) + h(x) g(s)d (s) and integrating partially we have (A θ, g(y)) (A θ, h(y)) x = δg(x) h(x) δk (θ, z)a(x) + (K (θ, z)) 2 a(s)d (s). Furthermore, we can obtain τ P θ, [ δk (θ, z)a(x)] = a(x)e[(k (θ, z)) 2 1 {X x} ]d (x). Therefore, we have P θ, [ δg(x) h(x) δk (θ, z)a(x) + (K (θ, z)) 2 x Finally, by Eq. (21) ] a(s)d (s) = P θ, [δg(x) h(x)] τ [ ] = g(x) h(x) K (θ, z)p[t C > x z]p Z (z)dz d (x) R d τ = g(x) E[K (θ, z)1 {X x} ]h(x)d (x) = [g E[K (θ, z)1 {X x} ]h]. A θ, A θ, h(y) = E[K (θ, z)1 {X x} ]h(y). Analogously, taking the product l(θ, )A θ, h, integrating and using (22)wehave A θ, l(θ, )(y) = E[ K (θ, z)k (θ, z)1 {X x} ].

26 378 M. L. Avendaño, M. C. Pardo Then, we can define the least favorable direction at the true parameters (θ, ) as h (x) = E [ K (θ, z)k (θ, z)1 {X x} ]. E [K (θ, z)1 {X x} ] Appendix B: Technical details for the proof of Theorem 1 We follow the procedure of Murphy (1994) to establish the consistency of the estimators of the generalized proportional hazards model. First, we prove the following lemmas. Lemma 3 The estimator (13) is consistent at θ,thisis, when n. Proof We can obtain that sup ˆ θ (x) a.s. (x), x [,τ] sup ˆ θ (x) (x) sup 1 x [,τ] x [,τ] P n [1 {Xi x}k (θ, z)] 1 P [1 {Xi x}k (θ, z)] [ ] + sup (P N(x) n P ). P [1 {X x} K (θ, z)] On the other hand, the classes x [,τ] {1 {X x} K (θ, z) : x [,τ], θ } and { } N(x) : x [,τ], θ (23) P [1 {X x} K (θ, z)] are Donsker, and as consequence they are Glivenko Cantelli classes. Thus, we have sup (P n P )[1 {Xi x}k (θ, z)] a.s., x [,τ] additionally, as P [1 {Xi x}k (θ, z)] > forx [,τ] sup x [,τ] 1 P n [1 {Xi x}k (θ, z)] 1 P [1 {Xi x}k (θ, z)] a.s.. Moreover, as the class given in (23) is Glivenko Cantelli we obtain sup x [,τ] [ (P n P ) N(x) P [1 {X x} K (θ, z)] ] a.s..

27 A semiparametric generalized PH model for right-censoring 379 Then, the stated lemma follows. Lemma 4 lim sup ˆ θ (τ) < almost surely. Proof Under assumption (iv), there exist < c < such as We define the constant c 2 such as for θ given. Thus, we obtain z < c. c 2 = min K (θ, z) z <c P n [1 {X x} K (θ, z)] c 2 P n [1 {X x} ], and by the law of large numbers, almost surely we have P n [1 {X x} K (θ, z)] c 2 P [1 {X x} ]+o P (1). We assume that P [1 {X x} ] > forx [,τ], thus, P n [1 {X x} K (θ, z)] > almost surely when n. This is, the jumps of ˆ θ in τ are bounded by 1/c 2, thus, ˆ θ (τ) O(1)P n N(τ)/c 2 almost surely when n, with N(x) = 1 {X x} δ. Now, we can prove the following consistency theorem, where θ n is the maximum profile log-likelihood estimator of model (2). Theorem 5 The estimators of the generalized proportional hazards model (2) considering n right-censored data are consistent, this is sup ˆ θ n (x) (x) and θ n θ x [,τ] converge to almost surely when n. Proof By Lemma 4 and Helly s selection Theorem, we have that along of a sequence ˆ θ n (x) (x) for x [,τ], where ( ) is a continuous non-decreasing function and θ n θ with θ.

28 38 M. L. Avendaño, M. C. Pardo On the other hand, because P n [1 {X x} K (θ, z)] >ɛ>, we have that ˆ θ n ( ) is absolutely continuous with respect to ˆ θ ( ) and d ˆ θn (x) converges to a bounded d ˆ θ (x) measurable function ζ(x), thisis (x) = x ζ(s)d (s). Thus, ( ) is absolutely continuous with respect to ( ) and its derivative is denoted by λ ( ), moreover ζ(x) = λ (x) λ (x). In addition, as ( θ n, ˆ θ n ) maximize l n (θ, ) we have 1 n n [l( θ n, ˆ θ n )(y) l(θ, ˆ θ )(y)], i=1 if n, P [l(θ, )(y) l(θ, )(y)]. By the Glivenko Cantelli Theorem and as d ˆ θn (x) d ˆ converges uniformly to λ (x) θ (x) λ (x),we can conclude that the Kullback Leibler information between the density given by the parameters (θ, ) and the density given by the parameters (θ, ) is negative, thus two densities are equal almost surely. This implies that θ = θ and ( ) = ( ) in [,τ]. Then, θ n θ and ˆ θ n (x) (x) almost surely with x [,τ]. Lemma 6 Under the consistency of the estimators θ n and ˆ θ n we have P [ l(θ, θ n, ˆ θ n )]=o P ( θ n θ ). Proof From expression (11) wehave [ τ P [ l(θ, θ n, ˆ θ n )]=P K (θ, z)1 {X s} (h (s) K (θ, z))d ˆ θ n (s) τ Moreover, we can obtain the following equalities K (θ, z)k (θ, z)1 {X s} ( θ n θ ) h (s)d ˆ θ n (s) ] + δ( K (θ, z) h (x)) 1 + ( θ n θ ) h (x) + δ K (θ, z)( θ n θ ) h (x). 1 + ( θ n θ ) h (x)

29 A semiparametric generalized PH model for right-censoring 381 P [K (θ, z)1 {X x} (h (x) K (θ, z))] =, [ ] δ( K (θ, z) h (x)) P = 1 + ( θ n θ ) h (x) and [ ] [ ] δ K (θ, z)( θ n θ ) h (x) δh (x)( θ n θ ) h (x) P = P. 1 + ( θ n θ ) h (x) 1 + ( θ n θ ) h (x) Thus, τ P [ l(θ, θ n, ˆ θ n )]=P [ K (θ, z)k (θ, z)1 {X s} ( θ n θ ) h (s)d ˆ θ n (s) ] + δh (x)( θ n θ ) h (x). (24) 1 + ( θ n θ ) h (x) As s M(s) = N(s) K (θ, z)1 {X s} (1 + (θ θ ) h (s))d (s) is a martingale with media, we have P [ l(θ, θ n, )]= τ = P [ K (θ, z)k (θ, z)1 {X s} ( θ n θ ) h (s)d (s) ] + δ h (x)( θ n θ ) h (x). (25) 1 + ( θ n θ ) h (x) Finally, from expressions (24) and (25) wehave P [ l(θ, θ n, ˆ θ n )]=P [ l(θ, θ n, ˆ θ n )] P [ l(θ, θ n, )] = P [ τ K (θ, z)k (θ, z)1 {X s} ( θ n θ ) h (s)d( ˆ θ n (s) (s)) +2δh (x)( θ n θ ) h (x) 1 + ( θ n θ ) h (x) = o P ( θ n θ ). ]

30 382 M. L. Avendaño, M. C. Pardo Lemma 7 Let V be a neighborhood of (θ, θ, ) where θ and are the true values of the parameter of the generalized proportional hazards model (2) considering right-censored data and l(t, θ, ) the log-likelihood function for the submodel as (9). (i) The class of functions { l(t, θ, ) : (t, θ, ) V } is Donsker with squared-integrable envelope function. (ii) The class of functions { l(t, θ, ) : (t, θ, ) V } is Glivenko Cantelli and bounded in L 1 (P ). Proof Let F be the set of continuous distribution functions. For ρ>, we define C ρ ={P F : P P ρ}, where P is the true distribution function. On the other hand, as the vector z is bounded (assumption iv), the class {β z : β β } is Donsker, with β such as = β α. As the function exp(β z) is differentiable and its derivatives are bounded, the class {exp(β z) : β β } is Donsker. Moreover, as 1 + exp(β z)> we have that { ( K (θ, z) = exp( β z) ) α : θ }, (26) and { } αz K 1 (θ, z) = 1 + exp(β z) : θ are Donsker. In addition, as f (x) = ln(x) with domain in [c, ) where c > is Lipschitz we obtain that the class is Donsker. Thus, {K 2 (θ, z) = ln[k (θ, z)] :θ } { K (θ, z) : θ }

31 A semiparametric generalized PH model for right-censoring 383 is Donsker. As the class {1 {X x} : x [,τ]} is Donsker, we have that the classes and {K (θ, z)1 {X x} : x [,τ], θ } { K (θ, z)k (θ, z)1 {X x} : x [,τ], θ } are Donsker. For P C ρ, as the function P E P [ f ] is Lipschitz, the classes and {E P [K (θ, z)1 {X x} ]:x [,τ], θ, P C ρ } {E P [ K (θ, z)k (θ, z)1 {X x} ]:x [,τ], θ, P C ρ } are Donsker. On the other hand, as z is bounded as θ, there exist m and M such as < m < K (θ, z) <M <. (27) As the function P E P [ f ] is continuous, there exist ρ 1 > such as for each P C ρ we have E P [1 {X x} ] ρ 1 >. (28) From (27) and (28) we obtain <ρ 1 m me P [1 {X x} ] E P [K (θ, z)1 {X x} ] ME P [1 {X x} ] < thus, the class { } 1 E P [K (θ, z)1 {X x} ] : x [,τ], θ, P C ρ is Donsker and in consequence the class { K (θ, z) E } P[ K (θ, z)k (θ, z)1 {X x} ] : x [,τ], θ, P C ρ E P [K (θ, z)1 {X x} ] (29) is it. As the function f d is Lipschitz, the class { [,x] ( K (θ, z) E ) } P[ K (θ, z)k (θ, z)1 {X x} ] d : x [,τ], θ, P C ρ E P [K (θ, z)1 {X x} ] (3)

32 384 M. L. Avendaño, M. C. Pardo is Donsker. Finally, from (26), (29) and (3) we have that { l(θ, ) : x [,τ], θ, P C ρ } is Donsker with squared-integrable envelope function, this proves (i). Analogously, (ii) could be proved. Acknowledgments This work was supported by research grant MTM R and MAEC-AECID. The authors are grateful to the referees for their valuable comments and proposals which have improved the contents of this paper. References Aranda-Ordaz, F. J. (1983). An extension of the proportional-hazards model for grouped data. Biometrics, 39, Balakrishnan, N. (1992). Handbook of the logistic distribution. New York: Marcel Dekker. Bender, R., Augustin, T., Blettner, M. (25). Generating survival times to simulate Cox proportional hazards model. Statistics in Medicine, 24(11), Clayton, D. G. (1978). A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65, Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society Series B, 34(2), Devarajan, K., Ebrahimi, N. (211). A semi-parametric generalization of the cox proportional hazards regression model: inference and applications. Computational Statistics and Data Analysis, 65(1), Etezadi-Amoli, J., Ciampi, A. (1987). Extended hazard regression for censored survival data with covariates: a spline approximation for the background hazard function. Biometrics, 43, Hougaard, P. (1984). Life table methods for heterogeneous populations: distributions describing heterogeneity. Biometrika, 71, Kalbfleisch, J. D., Prentice, R. L. (22). The statistical analysis of failure time data (2nd ed.). New York: Wiley. Kosorok, M. R. (28). Introduction to empirical processes and semiparametric inference. NewYork: Springer. MacKenzie, G. (1996). Regression models for survival data: the generalized time-logistic family. Statistician, 45, MacKenzie, G. (1997). On a non-proportional hazards regression model for repeated medical random counts. Statistics in Medicine, 16, Murphy, S. A. (1994). Consistency in a proportional hazards model incorporating a random effect. Annals of Statistics, 22(2), Murphy, S. A., van der Vaart, W. (2). On profile likelihood. Journal of the American Statistical Association, Theory and Methods, 95(45), Sasieni, P. D. (1995). Efficiently weighted estimating equations with application to proportional excess hazards. Lifetime data analysis, 1, Thomas, D. C. (1986). Use of auxiliary information in fitting non-proportional hazards model. In S. H. Moolgavkar R. L. Prentice (Eds.), Modern statistical methods in chronic disease epidemiology (pp ). New-York: Wiley. Tibshirani, R., Ciampi, A. (1983). A family of additive and proportional hazard models for survival data. Biometrics, 39(1), Younes, N., Lachin, J. (1997). Link-based models for survival data with interval and continuous time censoring. Biometrics, 53,

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Full likelihood inferences in the Cox model: an empirical likelihood approach

Full likelihood inferences in the Cox model: an empirical likelihood approach Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Noninformative Priors for the Ratio of the Scale Parameters in the Inverted Exponential Distributions

Noninformative Priors for the Ratio of the Scale Parameters in the Inverted Exponential Distributions Communications for Statistical Applications and Methods 03, Vol. 0, No. 5, 387 394 DOI: http://dx.doi.org/0.535/csam.03.0.5.387 Noninformative Priors for the Ratio of the Scale Parameters in the Inverted

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

Chapter 2 Inference on Mean Residual Life-Overview

Chapter 2 Inference on Mean Residual Life-Overview Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate

More information

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Verifying Regularity Conditions for Logit-Normal GLMM

Verifying Regularity Conditions for Logit-Normal GLMM Verifying Regularity Conditions for Logit-Normal GLMM Yun Ju Sung Charles J. Geyer January 10, 2006 In this note we verify the conditions of the theorems in Sung and Geyer (submitted) for the Logit-Normal

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Attributable Risk Function in the Proportional Hazards Model

Attributable Risk Function in the Proportional Hazards Model UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH Jian-Jian Ren 1 and Mai Zhou 2 University of Central Florida and University of Kentucky Abstract: For the regression parameter

More information

Multivariate Survival Data With Censoring.

Multivariate Survival Data With Censoring. 1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

More information

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho

Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho Jackknife Euclidean Likelihood-Based Inference for Spearman s Rho M. de Carvalho and F. J. Marques Abstract We discuss jackknife Euclidean likelihood-based inference methods, with a special focus on the

More information

Large sample theory for merged data from multiple sources

Large sample theory for merged data from multiple sources Large sample theory for merged data from multiple sources Takumi Saegusa University of Maryland Division of Statistics August 22 2018 Section 1 Introduction Problem: Data Integration Massive data are collected

More information

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm Mai Zhou 1 University of Kentucky, Lexington, KY 40506 USA Summary. Empirical likelihood ratio method (Thomas and Grunkmier

More information

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis

STAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis STAT 6350 Analysis of Lifetime Data Failure-time Regression Analysis Explanatory Variables for Failure Times Usually explanatory variables explain/predict why some units fail quickly and some units survive

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

Step-Stress Models and Associated Inference

Step-Stress Models and Associated Inference Department of Mathematics & Statistics Indian Institute of Technology Kanpur August 19, 2014 Outline Accelerated Life Test 1 Accelerated Life Test 2 3 4 5 6 7 Outline Accelerated Life Test 1 Accelerated

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Regularization in Cox Frailty Models

Regularization in Cox Frailty Models Regularization in Cox Frailty Models Andreas Groll 1, Trevor Hastie 2, Gerhard Tutz 3 1 Ludwig-Maximilians-Universität Munich, Department of Mathematics, Theresienstraße 39, 80333 Munich, Germany 2 University

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Empirical Processes & Survival Analysis. The Functional Delta Method

Empirical Processes & Survival Analysis. The Functional Delta Method STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

More information

Beyond GLM and likelihood

Beyond GLM and likelihood Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence

More information

Continuous Time Survival in Latent Variable Models

Continuous Time Survival in Latent Variable Models Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract

More information

Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition

Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Bias Correction of Cross-Validation Criterion Based on Kullback-Leibler Information under a General Condition Hirokazu Yanagihara 1, Tetsuji Tonda 2 and Chieko Matsumoto 3 1 Department of Social Systems

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

Maximum likelihood estimation in the proportional hazards cure model

Maximum likelihood estimation in the proportional hazards cure model AISM DOI 1.17/s1463-7-12-x Maximum likelihood estimation in the proportional hazards cure model Wenbin Lu Received: 12 January 26 / Revised: 16 November 26 The Institute of Statistical Mathematics, Tokyo

More information

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates

Analysis of Gamma and Weibull Lifetime Data under a General Censoring Scheme and in the presence of Covariates Communications in Statistics - Theory and Methods ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20 Analysis of Gamma and Weibull Lifetime Data under a

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

Bayesian estimation of the discrepancy with misspecified parametric models

Bayesian estimation of the discrepancy with misspecified parametric models Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

On the Breslow estimator

On the Breslow estimator Lifetime Data Anal (27) 13:471 48 DOI 1.17/s1985-7-948-y On the Breslow estimator D. Y. Lin Received: 5 April 27 / Accepted: 16 July 27 / Published online: 2 September 27 Springer Science+Business Media,

More information

Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

More information

A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA

A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA Statistica Sinica 23 (2013), 383-408 doi:http://dx.doi.org/10.5705/ss.2011.151 A FRAILTY MODEL APPROACH FOR REGRESSION ANALYSIS OF BIVARIATE INTERVAL-CENSORED SURVIVAL DATA Chi-Chung Wen and Yi-Hau Chen

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests Biometrika (2014),,, pp. 1 13 C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University

More information

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin

GROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often

More information

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract

Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), Abstract Hacettepe Journal of Mathematics and Statistics Volume 45 (5) (2016), 1605 1620 Comparing of some estimation methods for parameters of the Marshall-Olkin generalized exponential distribution under progressive

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia

More information

On Estimation of Partially Linear Transformation. Models

On Estimation of Partially Linear Transformation. Models On Estimation of Partially Linear Transformation Models Wenbin Lu and Hao Helen Zhang Authors Footnote: Wenbin Lu is Associate Professor (E-mail: wlu4@stat.ncsu.edu) and Hao Helen Zhang is Associate Professor

More information

Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling

Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling Estimation and Inference of Quantile Regression for Survival Data under Biased Sampling Supplementary Materials: Proofs of the Main Results S1 Verification of the weight function v i (t) for the lengthbiased

More information

Survival Analysis. Stat 526. April 13, 2018

Survival Analysis. Stat 526. April 13, 2018 Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Models for Multivariate Panel Count Data

Models for Multivariate Panel Count Data Semiparametric Models for Multivariate Panel Count Data KyungMann Kim University of Wisconsin-Madison kmkim@biostat.wisc.edu 2 April 2015 Outline 1 Introduction 2 3 4 Panel Count Data Motivation Previous

More information

COX S REGRESSION MODEL UNDER PARTIALLY INFORMATIVE CENSORING

COX S REGRESSION MODEL UNDER PARTIALLY INFORMATIVE CENSORING COX S REGRESSION MODEL UNDER PARTIALLY INFORMATIVE CENSORING Roel BRAEKERS, Noël VERAVERBEKE Limburgs Universitair Centrum, Belgium ABSTRACT We extend Cox s classical regression model to accomodate partially

More information

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. *

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. * Least Absolute Deviations Estimation for the Accelerated Failure Time Model Jian Huang 1,2, Shuangge Ma 3, and Huiliang Xie 1 1 Department of Statistics and Actuarial Science, and 2 Program in Public Health

More information

COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA. Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky

COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA. Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky COMPUTATION OF THE EMPIRICAL LIKELIHOOD RATIO FROM CENSORED DATA Kun Chen and Mai Zhou 1 Bayer Pharmaceuticals and University of Kentucky Summary The empirical likelihood ratio method is a general nonparametric

More information

COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky

COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky COMPUTE CENSORED EMPIRICAL LIKELIHOOD RATIO BY SEQUENTIAL QUADRATIC PROGRAMMING Kun Chen and Mai Zhou University of Kentucky Summary Empirical likelihood ratio method (Thomas and Grunkmier 975, Owen 988,

More information

Part III Measures of Classification Accuracy for the Prediction of Survival Times

Part III Measures of Classification Accuracy for the Prediction of Survival Times Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples

More information

A general mixed model approach for spatio-temporal regression data

A general mixed model approach for spatio-temporal regression data A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression

More information

β j = coefficient of x j in the model; β = ( β1, β2,

β j = coefficient of x j in the model; β = ( β1, β2, Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions

Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions K. Krishnamoorthy 1 and Dan Zhang University of Louisiana at Lafayette, Lafayette, LA 70504, USA SUMMARY

More information

ST495: Survival Analysis: Maximum likelihood

ST495: Survival Analysis: Maximum likelihood ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

SEMIPARAMETRIC LIKELIHOOD RATIO INFERENCE. By S. A. Murphy 1 and A. W. van der Vaart Pennsylvania State University and Free University Amsterdam

SEMIPARAMETRIC LIKELIHOOD RATIO INFERENCE. By S. A. Murphy 1 and A. W. van der Vaart Pennsylvania State University and Free University Amsterdam The Annals of Statistics 1997, Vol. 25, No. 4, 1471 159 SEMIPARAMETRIC LIKELIHOOD RATIO INFERENCE By S. A. Murphy 1 and A. W. van der Vaart Pennsylvania State University and Free University Amsterdam Likelihood

More information

Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial

Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial Model Selection in Bayesian Survival Analysis for a Multi-country Cluster Randomized Trial Jin Kyung Park International Vaccine Institute Min Woo Chae Seoul National University R. Leon Ochiai International

More information

On consistency of Kendall s tau under censoring

On consistency of Kendall s tau under censoring Biometria (28), 95, 4,pp. 997 11 C 28 Biometria Trust Printed in Great Britain doi: 1.193/biomet/asn37 Advance Access publication 17 September 28 On consistency of Kendall s tau under censoring BY DAVID

More information

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL

ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Statistica Sinica 19 (2009), 997-1011 ASYMPTOTIC PROPERTIES AND EMPIRICAL EVALUATION OF THE NPMLE IN THE PROPORTIONAL HAZARDS MIXED-EFFECTS MODEL Anthony Gamst, Michael Donohue and Ronghui Xu University

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Reduced-rank hazard regression

Reduced-rank hazard regression Chapter 2 Reduced-rank hazard regression Abstract The Cox proportional hazards model is the most common method to analyze survival data. However, the proportional hazards assumption might not hold. The

More information

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010

M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 M- and Z- theorems; GMM and Empirical Likelihood Wellner; 5/13/98, 1/26/07, 5/08/09, 6/14/2010 Z-theorems: Notation and Context Suppose that Θ R k, and that Ψ n : Θ R k, random maps Ψ : Θ R k, deterministic

More information

STAT Sample Problem: General Asymptotic Results

STAT Sample Problem: General Asymptotic Results STAT331 1-Sample Problem: General Asymptotic Results In this unit we will consider the 1-sample problem and prove the consistency and asymptotic normality of the Nelson-Aalen estimator of the cumulative

More information

1 Introduction. 2 Residuals in PH model

1 Introduction. 2 Residuals in PH model Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical

More information

Empirical Likelihood in Survival Analysis

Empirical Likelihood in Survival Analysis Empirical Likelihood in Survival Analysis Gang Li 1, Runze Li 2, and Mai Zhou 3 1 Department of Biostatistics, University of California, Los Angeles, CA 90095 vli@ucla.edu 2 Department of Statistics, The

More information