Tests of independence for censored bivariate failure time data


Abstract

Bivariate failure time data are widely used in survival analysis, for example in twin studies. This article presents a class of $\chi^2$-type tests for independence between pairs of failure times after adjusting for covariates. A bivariate accelerated failure time model is proposed for the joint distribution of bivariate failure times, while the dependence structure of the related failure times is left completely unspecified. Theoretical properties of the proposed tests are derived, and variance estimates of the test statistics are obtained using a resampling technique. Simulation studies show that the proposed tests are appropriate for practical use. Two examples, the study of infection in catheters for patients on dialysis and the diabetic retinopathy study, are given to illustrate the methodology.

Keywords: Accelerated failure time model; Bivariate failure time data; Independence test; Resampling

1 Introduction

Multivariate failure time data arise in many medical studies when each study subject can potentially experience multiple failures or when failure times are clustered. In this article we focus on the bivariate case. It is of great interest to test whether the pair of failure times for the same subject, or within the same cluster, are independent (Clayton, 1978). A number of methods have been developed for testing independence between two variables. For example, when there is no censoring, the coefficient of concordance (Kendall, 1962) is widely used for testing independence in a bivariate distribution. Oakes (1982a) generalized Kendall's tau to censored bivariate data. Furthermore, the cross ratio has been extensively studied for testing independence in a $2 \times 2$ table (e.g. Mantel and Haenszel, 1959). Clayton (1978) introduced the cross ratio concept into bivariate failure time data, which was further studied by

Oakes (1989) and by Hsu and Prentice (1996). Based on similar ideas, Ding and Wang (2004) proposed a Mantel-Haenszel type of independence test for bivariate current status data.

However, most independence tests proposed in the literature for bivariate failure time data do not account for the effect of covariates. One exception is Hsu and Prentice (1996), who extended their covariance test to accommodate covariates. In their paper, the marginal distributions of the pair of failure times were assumed to follow the proportional hazards model. In practice it is important to investigate the dependence between the two related failure times conditional on some covariates. For example, in twin studies it is of great interest to know whether familial genetic effects introduce additional association between the two failure times of twins after adjusting for common environmental factors.

The proportional hazards model (Cox, 1972) has been widely used for the analysis of right censored survival data. However, as noted by many authors, the proportional hazards model may not be appropriate for modeling survival times in some medical studies, and alternative models may be more suitable. One alternative is the accelerated failure time (AFT) model, which relates the logarithm of the failure time linearly to the covariates (Kalbfleisch and Prentice, 1980; Cox and Oakes, 1984). The AFT model can be more attractive than the proportional hazards model in many applications because of its direct physical interpretation (Jin et al., 2003). In this article, we propose a bivariate AFT model for the joint analysis of pairs of failure times. It naturally generalizes the conventional AFT model for univariate failure times to the bivariate case and leaves the dependence structure between the related failure times completely unspecified. For such models, we derive a class of $\chi^2$-type independence tests and study both their theoretical and numerical properties.

The remainder of the article is organized as follows. In the next section, we specify the bivariate accelerated failure time model and derive the proposed $\chi^2$-type independence tests as well as their theoretical properties. Numerical studies, including simulation results and two examples, are given in Sections 3 and 4, respectively. Some

concluding remarks are provided in Section 5. Major technical derivations are contained in the Appendix.

2 The Proposed Methodology

2.1 The bivariate accelerated failure time model and parameter estimation

Consider a study involving $n$ independent subjects. Here we focus on data consisting of bivariate events, i.e. each study subject can potentially experience two types of failures. The extension to bivariate clustered failure time data is discussed later. For $i = 1, \ldots, n$ and $k = 1, 2$, let $T_{ik}$ be the time to the $k$th failure of the $i$th subject, let $C_{ik}$ be the corresponding censoring time, and let $Z_{ik}$ be the $p$-dimensional vector of covariates. Throughout the paper, we assume that censoring is noninformative, i.e. $(T_{i1}, T_{i2})$ and $(C_{i1}, C_{i2})$ are independent conditional on $(Z_{i1}, Z_{i2})$. The joint distribution of the two types of failure times is formulated with the following bivariate accelerated failure time model,

$$\log T_{ik} = \beta_k' Z_{ik} + \epsilon_{ik}, \quad i = 1, \ldots, n;\ k = 1, 2, \qquad (1)$$

where $\beta_k$ is a $p$-dimensional vector of regression parameters, and $(\epsilon_{i1}, \epsilon_{i2})$ $(i = 1, \ldots, n)$ are independent random vectors with a common, but completely unspecified, joint distribution. Thus, each of the two types of failure times marginally follows the AFT model, while the dependence structure between the pair of failure times is left completely unspecified.

Define $\tilde T_{ik} = T_{ik} \wedge C_{ik}$ and $\delta_{ik} = I(T_{ik} \le C_{ik})$. Here and in the sequel, $a \wedge b = \min(a, b)$, and $I(\cdot)$ is the indicator function. The observed data then consist of $(\tilde T_{ik}, \delta_{ik}, Z_{ik})$ $(k = 1, 2;\ i = 1, \ldots, n)$, which are $n$ independent copies of $(\tilde T_k, \delta_k, Z_k)$ $(k = 1, 2)$. Furthermore, define $W_{ik}(\beta_k) = \log T_{ik} - \beta_k' Z_{ik}$, $C_{ik}(\beta_k) = \log C_{ik} - \beta_k' Z_{ik}$ and $\tilde W_{ik}(\beta_k) = W_{ik}(\beta_k) \wedge C_{ik}(\beta_k) = \log \tilde T_{ik} - \beta_k' Z_{ik}$. Let $N_{ik}(x, \beta_k) = \delta_{ik} I\{\tilde W_{ik}(\beta_k) \le x\}$ and $Y_{ik}(x, \beta_k) = I\{\tilde W_{ik}(\beta_k) \ge x\}$ denote the counting and at-risk processes, respectively, and let $S_k^{(r)}(x, \beta_k) = n^{-1} \sum_{i=1}^n Z_{ik}^r Y_{ik}(x, \beta_k)$ $(r = 0, 1)$. Then the weighted log-rank

estimating function for $\beta_k$ is given by

$$U_{k,\phi_k}(\beta_k) = \sum_{i=1}^n \delta_{ik}\, \phi_k\{\tilde W_{ik}(\beta_k), \beta_k\} \left[ Z_{ik} - \bar Z_k\{\tilde W_{ik}(\beta_k), \beta_k\} \right], \quad k = 1, 2, \qquad (2)$$

where $\bar Z_k(x, \beta_k) = S_k^{(1)}(x, \beta_k) / S_k^{(0)}(x, \beta_k)$ and $\phi_k$ is a weight function satisfying Condition 5 of Ying (1993). Let $\hat\beta_{k,\phi_k}$ be a solution of $U_{k,\phi_k}(\beta_k) = 0$ and let $\beta_{0k}$ be the true value of $\beta_k$ $(k = 1, 2)$. It has been established by a number of authors that $n^{1/2}(\hat\beta_{k,\phi_k} - \beta_{0k})$ is asymptotically zero-mean normal (e.g. Tsiatis, 1990; Ying, 1993). Furthermore, Lin and Wei (1992) showed that $n^{1/2}(\hat\beta_\phi - \beta_0)$ also converges in distribution to a zero-mean normal vector, where $\hat\beta_\phi = (\hat\beta_{1,\phi_1}', \hat\beta_{2,\phi_2}')'$ and $\beta_0 = (\beta_{01}', \beta_{02}')'$.

In general, $U_{k,\phi_k}(\beta_k)$ is neither continuous nor componentwise monotone in $\beta_k$. It is therefore difficult to obtain the solution $\hat\beta_{k,\phi_k}$, especially when $\beta_k$ is high dimensional. One simplification arises with the choice $\phi_k(x, \beta_k) = S_k^{(0)}(x, \beta_k)$, which corresponds to the Gehan (1965) weight function. In this case, $U_{k,\phi_k}(\beta_k)$ can be expressed as

$$U_{k,G}(\beta_k) = n^{-1} \sum_{i=1}^n \sum_{j=1}^n \delta_{ik} (Z_{ik} - Z_{jk})\, I\{\tilde W_{ik}(\beta_k) \le \tilde W_{jk}(\beta_k)\}, \qquad (3)$$

which is monotone in each component of $\beta_k$ (Fygenson and Ritov, 1994). It is easy to show that the right-hand side of (3) is the gradient of the convex function

$$L_{k,G}(\beta_k) = n^{-1} \sum_{i=1}^n \sum_{j=1}^n \delta_{ik} \{\tilde W_{ik}(\beta_k) - \tilde W_{jk}(\beta_k)\}^-, \qquad (4)$$

where $a^- = |a|\, I(a \le 0)$. A minimizer $\hat\beta_{k,G}$ of $L_{k,G}(\beta_k)$ can be obtained easily by linear programming (Jin et al., 2003).

To account for the potential dependence among multivariate failure times, Jin et al. (2006) proposed a resampling approach for approximating the asymptotic variance-covariance matrix of $\hat\beta_G \equiv (\hat\beta_{1,G}', \hat\beta_{2,G}')'$. That is, define a new loss function

$$L^*_{k,G}(\beta_k) = n^{-1} \sum_{i=1}^n \sum_{j=1}^n \delta_{ik} \{\tilde W_{ik}(\beta_k) - \tilde W_{jk}(\beta_k)\}^- V_i V_j, \qquad (5)$$

where $V_i$ $(i = 1, \ldots, n)$ are independent positive random variables with mean 1 and variance 1 that are independent of the observed data. Let $\hat\beta^*_{k,G}$ be a minimizer of

$L^*_{k,G}(\beta_k)$. Then the asymptotic distribution of $n^{1/2}(\hat\beta_G - \beta_0)$ can be approximated by the conditional distribution of $n^{1/2}(\hat\beta^*_G - \hat\beta_G)$ given the observed data, where $\hat\beta^*_G = (\hat\beta^{*\prime}_{1,G}, \hat\beta^{*\prime}_{2,G})'$.

In addition, given $\beta_k$, the cumulative hazard function $\Lambda_k$ of the error term $\epsilon_k$ $(k = 1, 2)$, i.e. $P(\epsilon_k > x) = \exp\{-\Lambda_k(x)\}$, can be consistently estimated by the Aalen-Breslow-type estimator

$$\hat\Lambda_k(x, \beta_k) = \sum_{i=1}^n \int_{-\infty}^x \frac{dN_{ik}(u, \beta_k)}{n S_k^{(0)}(u, \beta_k)}. \qquad (6)$$

Therefore, a consistent estimator $\hat\Lambda_k$ of $\Lambda_k$ can be obtained by plugging a consistent estimator of $\beta_{0k}$ into (6), for example the Gehan-weight estimator $\hat\beta_{k,G}$, i.e. $\hat\Lambda_k(x) = \hat\Lambda_k(x, \hat\beta_{k,G})$.

2.2 The proposed $\chi^2$-type independence test

Our interest is to test the null hypothesis $H_0$: $T_1$ and $T_2$ are independent given covariates $Z_1$ and $Z_2$, or equivalently $H_0$: $\epsilon_1$ and $\epsilon_2$ are independent. Under the null hypothesis $H_0$, we have

$$P\{W_{i1}(\beta_{01}) \wedge W_{i2}(\beta_{02}) > x \mid Z_{i1}, Z_{i2}\} = \exp[-\{\Lambda_1(x) + \Lambda_2(x)\}]. \qquad (7)$$

Define

$$M_i(x, \beta_1, \beta_2, \Lambda_1, \Lambda_2) = N_i(x, \beta_1, \beta_2) - \int_{-\infty}^x I\{\tilde W_i(\beta_1, \beta_2) \ge u\}\, d\{\Lambda_1(u) + \Lambda_2(u)\},$$

where $\tilde W_i(\beta_1, \beta_2) = \tilde W_{i1}(\beta_1) \wedge \tilde W_{i2}(\beta_2)$, $N_i(x, \beta_1, \beta_2) = \delta_i(\beta_1, \beta_2)\, I\{\tilde W_i(\beta_1, \beta_2) \le x\}$, and $\delta_i(\beta_1, \beta_2) = \delta_{i1}$ if $\tilde W_{i1}(\beta_1) \le \tilde W_{i2}(\beta_2)$ and $\delta_{i2}$ otherwise. Since $(T_{i1}, T_{i2})$ and $(C_{i1}, C_{i2})$ are independent given the covariates $(Z_{i1}, Z_{i2})$, it is easy to show that $M_i(x, \beta_{01}, \beta_{02}, \Lambda_{01}, \Lambda_{02})$ is a zero-mean process under $H_0$, where $\Lambda_{01}$ and $\Lambda_{02}$ are the true values of $\Lambda_1$ and $\Lambda_2$, respectively. Thus, we propose the following test statistic for $H_0$:

$$H_n = n^{-1/2} \sum_{i=1}^n \int_{-\infty}^{\infty} h(x)\, dM_i\{x, \hat\beta_{1,G}, \hat\beta_{2,G}, \hat\Lambda_1(\cdot, \hat\beta_{1,G}), \hat\Lambda_2(\cdot, \hat\beta_{2,G})\}, \qquad (8)$$

where $h(x)$ is a positive deterministic weight function. The simplest choice is $h(x) \equiv 1$. For simplicity, we use the Gehan-weight estimators $\hat\beta_{1,G}$ and $\hat\beta_{2,G}$ of $\beta_1$ and $\beta_2$ in $H_n$. Other weighted log-rank estimators can also be accommodated in $H_n$ using the technique of Jin et al. (2003).
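With $h(x) \equiv 1$, the integral in (8) collapses to $\delta_i - \hat\Lambda_1(\tilde W_i) - \hat\Lambda_2(\tilde W_i)$ for each subject, so (6) and (8) can be evaluated directly from the censored residuals. The following is a minimal Python sketch under that assumption; the function names (`residuals`, `cumulative_hazard`, `h_n`) are illustrative and not from the paper, and the regression coefficients are taken as given rather than obtained by minimizing the Gehan loss (4).

```python
import math

def residuals(times, covariates, beta):
    """Censored residuals log(observed time) - beta'Z for one failure type."""
    return [math.log(t) - sum(b * z for b, z in zip(beta, zs))
            for t, zs in zip(times, covariates)]

def cumulative_hazard(res, delta):
    """Step-function version of estimator (6): each uncensored residual
    contributes a jump of 1 / (number of residuals still at risk)."""
    def at_risk(u):
        return sum(1 for w in res if w >= u)
    jumps = [(w, 1.0 / at_risk(w)) for w, d in zip(res, delta) if d == 1]
    def Lambda(x):
        return sum(j for w, j in jumps if w <= x)
    return Lambda

def h_n(res1, delta1, res2, delta2):
    """Statistic (8) with constant weight h(x) = 1 (an assumed choice)."""
    n = len(res1)
    lam1 = cumulative_hazard(res1, delta1)
    lam2 = cumulative_hazard(res2, delta2)
    total = 0.0
    for w1, d1, w2, d2 in zip(res1, delta1, res2, delta2):
        w = min(w1, w2)                # minimum of the two residuals
        d = d1 if w1 <= w2 else d2     # indicator attached to the minimum
        total += d - lam1(w) - lam2(w)
    return total / math.sqrt(n)
```

For instance, `h_n([0.1, 0.5, 0.9], [1, 1, 1], [0.2, 0.4, 1.0], [1, 1, 1])` sums the three per-subject contributions 2/3, -1/6 and -10/6 and divides by the square root of 3. The quadratic-time loops keep the sketch close to the formulas; a sorted-array implementation would be preferable for large samples.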
It is easy to see that

when the estimators in (8) are replaced by the true values $\beta_{01}$, $\beta_{02}$, $\Lambda_{01}$ and $\Lambda_{02}$, each summand of $H_n$ has mean zero under $H_0$. Thus, by the central limit theorem, this version of $H_n$ converges in distribution to a zero-mean normal random variable under $H_0$. In fact, $H_n$ as defined in (8), with estimated parameters, also converges in distribution to a zero-mean normal random variable under $H_0$. Its theoretical properties are established in the following theorem.

Theorem 1 Under some regularity conditions, as $n$ goes to $\infty$, $H_n$ converges in distribution under $H_0$ to a normal random variable with mean 0 and variance $\sigma^2$. Moreover, $H_n$ can be expressed asymptotically as a sum of independent random variables, i.e. $H_n = n^{-1/2} \sum_{i=1}^n \psi_i + o_p(1)$, where

$$\psi_i = \int h(x)\, dM_i(x, \beta_{01}, \beta_{02}, \Lambda_{01}, \Lambda_{02}) - \sum_{k=1}^2 B_{k,G}' A_{k,G}^{-1} U_{ik,G} - \sum_{k=1}^2 \int h(x) \frac{s^{(0)}(x, \beta_{01}, \beta_{02})}{s_k^{(0)}(x, \beta_{0k})}\, dM_{ik}(x, \beta_{0k}, \Lambda_{0k}),$$

and $M_{ik}(x, \beta, \Lambda) = N_{ik}(x, \beta) - \int_{-\infty}^x Y_{ik}(u, \beta)\, d\Lambda(u)$ $(k = 1, 2)$. The quantities $B_{k,G}$, $A_{k,G}$, $U_{ik,G}$, $s^{(0)}(x, \beta_1, \beta_2)$ and $s_k^{(0)}(x, \beta_k)$ are defined in the Appendix.

The proof of Theorem 1 is given in the Appendix. Under $H_0$, $H_n$ is asymptotically a sum of zero-mean independent random variables and $\sigma^2 = \mathrm{Var}(\psi_1)$. To use Theorem 1 for testing $H_0$, we need a consistent estimator of $\sigma^2$. Like the variances of the weighted log-rank estimators, $\sigma^2$ involves the derivatives of the density functions of the error terms $\epsilon_k$ $(k = 1, 2)$. Estimating $\sigma^2$ directly from the data is very complicated and requires nonparametric smoothing techniques, which usually need a large sample size to give good results. In addition, since the test statistic $H_n$ involves estimates of both finite and infinite dimensional parameters, the classical bootstrap method for estimating the asymptotic variance may not be applicable here. Therefore, we propose a resampling method for computing the estimator $\hat\sigma^2$ of $\sigma^2$.
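Once an estimate of $\sigma^2$ is available, Theorem 1 reduces the test to a comparison of $H_n^2 / \hat\sigma^2$ with a $\chi^2_1$ distribution. The sketch below illustrates only that final step in Python. It assumes the resampled replicates are already available as a plain list (the argument `replicates` is a hypothetical input; producing it requires re-minimizing the perturbed Gehan loss (5), which is not shown), and it uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$ for the tail probability.

```python
import math

def sample_variance(values):
    """Unbiased sample variance of the resampled replicates."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def chi2_1_pvalue(stat):
    """Upper tail of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def independence_test(h_n_value, replicates, alpha=0.05):
    """Return (statistic, p-value, reject) for H0 at level alpha.

    h_n_value  : observed test statistic H_n
    replicates : resampled values used to estimate sigma^2 (assumed given)
    """
    sigma2_hat = sample_variance(replicates)
    stat = h_n_value ** 2 / sigma2_hat
    p = chi2_1_pvalue(stat)
    # p < alpha is equivalent to stat exceeding the upper alpha quantile
    return stat, p, p < alpha
```

Working with the p-value instead of a tabulated quantile avoids hard-coding critical values; the decision is the same as comparing the statistic with the upper $\alpha$ quantile of $\chi^2_1$.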
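To be specific, define

$$H_n^* = n^{-1/2} \sum_{i=1}^n \int h(x)\, dM_i\{x, \hat\beta^*_{1,G}, \hat\beta^*_{2,G}, \hat\Lambda_1(\cdot, \hat\beta^*_{1,G}), \hat\Lambda_2(\cdot, \hat\beta^*_{2,G})\}. \qquad (9)$$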

Then we have

Theorem 2 Under some regularity conditions, as $n$ goes to $\infty$, the conditional distribution of $Q_n^*$ given the observed data converges almost surely to the limiting distribution of $H_n$ under $H_0$, where

$$Q_n^* = H_n^* - H_n + n^{-1/2} \sum_{i=1}^n \left[ \int h(x)\, dM_i\{x, \hat\beta_{1,G}, \hat\beta_{2,G}, \hat\Lambda_1(\cdot, \hat\beta_{1,G}), \hat\Lambda_2(\cdot, \hat\beta_{2,G})\} - \sum_{k=1}^2 \int h(x) \frac{S^{(0)}(x, \hat\beta_{1,G}, \hat\beta_{2,G})}{S_k^{(0)}(x, \hat\beta_{k,G})}\, dM_{ik}\{x, \hat\beta_{k,G}, \hat\Lambda_k(\cdot, \hat\beta_{k,G})\} \right] (V_i - 1),$$

with $S^{(0)}(x, \beta_1, \beta_2) = n^{-1} \sum_{i=1}^n I\{\tilde W_i(\beta_1, \beta_2) \ge x\}$, and $V_i$ $(i = 1, \ldots, n)$ are the same as those used in (5) for computing $\hat\beta^*_{k,G}$.

A similar resampling technique was used by Lin et al. (1994) for constructing confidence bands of survival functions under the proportional hazards model. One major difference between the proposed resampling method and that of Lin et al. (1994) is that the same set of perturbation random variables $\{V_i\}$ is also used for computing the resampling estimators $\hat\beta^*_{k,G}$, $k = 1, 2$. Consequently, the resampling scheme accounts for the sampling variation due to both the rank estimation of $\beta_k$ and the estimation of the infinite dimensional parameters $\Lambda_k(x)$, $k = 1, 2$. The proof of Theorem 2 is given in the Appendix.

To obtain the estimator of $\sigma^2$, we repeatedly generate the random variables $(V_1, \ldots, V_n)$, say $M$ times, and calculate $Q^*_{n,j}$ $(j = 1, \ldots, M)$ for each generated set. Then $\hat\sigma^2$ is the sample variance of $Q^*_{n,j}$ $(j = 1, \ldots, M)$, and the null hypothesis $H_0$ is rejected at level $\alpha$ when $H_n^2 / \hat\sigma^2 > \chi^2_1(\alpha)$, where $\chi^2_1(\alpha)$ is the upper $\alpha$ quantile of the $\chi^2_1$ distribution.

2.3 Extension to bivariate clustered failure time data

Bivariate clustered failure time data are another common type of bivariate failure time data. For example, in twin studies the failure times of a pair of twins are usually correlated. Model (1), proposed in Section 2.1 for bivariate events data, can easily be tailored to accommodate bivariate clustered failure time data. To be specific,

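we set $\beta_1 = \beta_2 = \beta$ and $\Lambda_1 = \Lambda_2 = \Lambda$ in (1). Then, under the working independence assumption, the weighted log-rank estimating function for $\beta$ can be written as

$$U_\phi(\beta) = \sum_{i=1}^n \sum_{k=1}^2 \delta_{ik}\, \phi\{\tilde W_{ik}(\beta), \beta\} \left[ Z_{ik} - \bar Z\{\tilde W_{ik}(\beta), \beta\} \right], \qquad (10)$$

where $\bar Z(x, \beta) = S^{(1)}(x, \beta) / S^{(0)}(x, \beta)$ with $S^{(r)}(x, \beta) = n^{-1} \sum_{i=1}^n \sum_{k=1}^2 Z_{ik}^r Y_{ik}(x, \beta)$ $(r = 0, 1)$, and $\phi$ is a weight function. The Gehan-weight estimating function corresponds to the choice $\phi(x, \beta) = S^{(0)}(x, \beta)$, i.e.

$$U_G(\beta) = n^{-1} \sum_{i=1}^n \sum_{k=1}^2 \sum_{j=1}^n \sum_{l=1}^2 \delta_{ik} (Z_{ik} - Z_{jl})\, I\{\tilde W_{ik}(\beta) \le \tilde W_{jl}(\beta)\}, \qquad (11)$$

which is the gradient of

$$L_G(\beta) \equiv n^{-1} \sum_{i=1}^n \sum_{k=1}^2 \sum_{j=1}^n \sum_{l=1}^2 \delta_{ik} \{\tilde W_{ik}(\beta) - \tilde W_{jl}(\beta)\}^-. \qquad (12)$$

Let $\hat\beta_G$ be a minimizer of $L_G(\beta)$. Jin et al. (2006) showed that $n^{1/2}(\hat\beta_G - \beta_0)$ converges in distribution to a zero-mean normal vector, where $\beta_0$ is the true value of $\beta$. The asymptotic variance-covariance matrix of $\hat\beta_G$ can be estimated by the resampling method. To be specific, define

$$L^*_G(\beta) \equiv n^{-1} \sum_{i=1}^n \sum_{k=1}^2 \sum_{j=1}^n \sum_{l=1}^2 \delta_{ik} \{\tilde W_{ik}(\beta) - \tilde W_{jl}(\beta)\}^- V_i V_j, \qquad (13)$$

where $V_i$ $(i = 1, \ldots, n)$ are defined as before. Let $\hat\beta^*_G$ be a minimizer of $L^*_G(\beta)$. Then the limiting distribution of $n^{1/2}(\hat\beta_G - \beta_0)$ can be approximated by the conditional distribution of $n^{1/2}(\hat\beta^*_G - \hat\beta_G)$ given the observed data. Furthermore, $\Lambda(x)$ can be consistently estimated by $\hat\Lambda(x) = \hat\Lambda(x, \hat\beta_G)$, where

$$\hat\Lambda(x, \beta) = \sum_{i=1}^n \sum_{k=1}^2 \int_{-\infty}^x \frac{dN_{ik}(u, \beta)}{n S^{(0)}(u, \beta)}. \qquad (14)$$

We are now ready to derive independence tests for bivariate clustered failure time data. Define

$$\tilde M_i(x, \beta, \Lambda) = \tilde N_i(x, \beta) - 2 \int_{-\infty}^x I\{\tilde W_i(\beta) \ge u\}\, d\Lambda(u),$$

where $\tilde W_i(\beta) = \tilde W_{i1}(\beta) \wedge \tilde W_{i2}(\beta)$, $\tilde N_i(x, \beta) = \delta_i(\beta)\, I\{\tilde W_i(\beta) \le x\}$, and $\delta_i(\beta) = \delta_{i1}$ if $\tilde W_{i1}(\beta) \le \tilde W_{i2}(\beta)$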

and $\delta_{i2}$ otherwise. In addition, let

$$\tilde H_n = n^{-1/2} \sum_{i=1}^n \int h(x)\, d\tilde M_i\{x, \hat\beta_G, \hat\Lambda(\cdot, \hat\beta_G)\}, \qquad \tilde H_n^* = n^{-1/2} \sum_{i=1}^n \int h(x)\, d\tilde M_i\{x, \hat\beta^*_G, \hat\Lambda(\cdot, \hat\beta^*_G)\}$$

and

$$\tilde Q_n^* = \tilde H_n^* - \tilde H_n + n^{-1/2} \sum_{i=1}^n \left[ \int h(x)\, d\tilde M_i\{x, \hat\beta_G, \hat\Lambda(\cdot, \hat\beta_G)\} - \sum_{k=1}^2 \int h(x) \frac{2 \tilde S^{(0)}(x, \hat\beta_G)}{S^{(0)}(x, \hat\beta_G)}\, dM_{ik}\{x, \hat\beta_G, \hat\Lambda(\cdot, \hat\beta_G)\} \right] (V_i - 1),$$

where $\tilde S^{(0)}(x, \beta) = n^{-1} \sum_{i=1}^n I\{\tilde W_i(\beta) \ge x\}$. Then

Theorem 3 Under some regularity conditions, as $n$ goes to $\infty$, $\tilde H_n$ converges in distribution under $H_0$ to a normal random variable with mean 0 and variance $\tilde\sigma^2$. Moreover, the conditional distribution of $\tilde Q_n^*$ given the observed data converges almost surely to the limiting distribution of $\tilde H_n$ under $H_0$.

To estimate $\tilde\sigma^2$, we repeatedly generate the independent random variables $V_i$ $(i = 1, \ldots, n)$, say $M$ times, and calculate $\tilde Q^*_{n,j}$ for each generated set. Then $\tilde\sigma^2$ can be approximated by the sample variance $\hat{\tilde\sigma}^2$ of $\tilde Q^*_{n,j}$ $(j = 1, \ldots, M)$, and we reject $H_0$ at level $\alpha$ when $\tilde H_n^2 / \hat{\tilde\sigma}^2 > \chi^2_1(\alpha)$.

3 Simulation Study

The performance of the proposed independence tests for bivariate failure time data was assessed in a series of simulation studies. The bivariate accelerated failure time model specified in (1) was used to generate pairs of failure times. Under each setting, two independent covariates were generated, the first following a Bernoulli distribution with success probability 0.5 and the second following a uniform distribution on $(-1, 1)$. For bivariate events data, the pair of failure times shared the same set of covariates, i.e. $Z_{i1} = Z_{i2} = (Z_{i1,1}, Z_{i1,2})'$. For bivariate clustered failure time data, $Z_{i1}$ and $Z_{i2}$ were independent draws.

The first set of simulation studies was conducted for bivariate normal errors, i.e. the two error terms $(\epsilon_{i1}, \epsilon_{i2})$ were generated from a bivariate normal distribution

with mean zero and variance-covariance matrix

$$\begin{pmatrix} \sigma_1^2 & \theta \sigma_1 \sigma_2 \\ \theta \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}.$$

Here we set $\sigma_1 = \sigma_2 = 0.5$ and chose $\theta = -0.5, 0, 0.5$. Note that $\theta = 0$ corresponds to the case in which the failure times $T_{i1}$ and $T_{i2}$ are independent given the covariates $Z_{i1}$ and $Z_{i2}$.

In the second set of simulations, $(\epsilon_{i1}, \epsilon_{i2})$ were generated from a Clayton-Oakes model (Clayton, 1978; Oakes, 1982b, 1986), i.e. the joint survival function of $(\epsilon_{i1}, \epsilon_{i2})$ is

$$P(\epsilon_{i1} > x_1, \epsilon_{i2} > x_2; \theta) = [\exp\{\theta \Lambda_1(x_1)\} + \exp\{\theta \Lambda_2(x_2)\} - 1]^{-\theta^{-1}},$$

where the dependence parameter $\theta \ge 0$, and $\theta = 0$ corresponds to independence between the two error terms. In this setting we set $\Lambda_1(x) = \Lambda_2(x)$ and chose the extreme value and logistic distributions for the marginal distributions of the two error terms.

For bivariate events data, the regression parameters were $\beta_1 = (\beta_{11}, \beta_{12})' = (1, 1)'$ and $\beta_2 = (\beta_{21}, \beta_{22})' = (0.5, 0.5)'$; for bivariate clustered failure time data, the regression parameter was $\beta = (1, 1)'$. Under each setting, the censoring times $C_{ik}$ were generated from a uniform distribution on $(0, c_k)$, where $c_k$ $(k = 1, 2)$ were chosen such that the expected proportion of censoring for each type of failure was 25% or 40%.

(Insert Tables 1-5 here)

Tables 1-3 summarize the results from the first and second sets of simulation studies for bivariate events data with sample size $n = 100$ or 200, while Tables 4 and 5 summarize the results for bivariate clustered failure time data with $n = 100$. Each entry in the tables is based on 500 simulated data sets. For computational convenience, when constructing the $\chi^2$-type independence test statistics proposed in Sections 2.2 and 2.3, we chose the weight $h(x) \equiv 1$. To estimate the variances of the proposed test statistics, we generated $M = 500$ realizations of $n$ independent unit exponential random variables $V_i$, $i = 1, \ldots, n$, under each setting.
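The data-generating mechanism for the bivariate normal setting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the censoring bounds `c` below are placeholder values (the paper tunes $c_k$ to reach 25% or 40% censoring but does not report them), and the function names are hypothetical. Correlated normal errors are built from two independent standard normals, and the perturbation variables are unit exponential as in the text.

```python
import math
import random

def simulate_subject(rng, beta1, beta2, theta, sigma=0.5, c=(10.0, 10.0)):
    """One subject with two event types sharing the same covariates.

    c holds the upper bounds of the uniform censoring times; the values
    used here are placeholders, not those of the paper.
    """
    z = (1.0 if rng.random() < 0.5 else 0.0,   # Bernoulli(0.5)
         rng.uniform(-1.0, 1.0))               # Uniform(-1, 1)
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Errors with variance sigma^2 each and correlation theta
    eps = (sigma * g1,
           sigma * (theta * g1 + math.sqrt(1.0 - theta ** 2) * g2))
    observed = []
    for beta, e, c_k in zip((beta1, beta2), eps, c):
        t = math.exp(sum(b * x for b, x in zip(beta, z)) + e)  # model (1)
        cens = rng.uniform(0.0, c_k)
        observed.append((min(t, cens), 1 if t <= cens else 0))
    return z, observed

def perturbation_weights(rng, n):
    """n independent unit exponential variables (mean 1, variance 1)."""
    return [rng.expovariate(1.0) for _ in range(n)]

rng = random.Random(2024)
sample = [simulate_subject(rng, (1.0, 1.0), (0.5, 0.5), theta=0.5)
          for _ in range(100)]
weights = perturbation_weights(rng, len(sample))
```

Each element of `sample` pairs the covariate vector with two `(observed time, event indicator)` tuples, one per failure type. Setting `theta=0.0` generates data under the null hypothesis of independence.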
From the simulation results, we see that the proposed independence tests give appropriate type I

errors (the $\alpha$-level was chosen as 5%) in all settings in which the pair of failure times are independent given the covariates, and have reasonable power to detect association between the pair of failure times when it exists. The results improve as the sample size increases and the level of censoring decreases.

4 Two Examples

To illustrate the independence tests of Section 2, we considered two examples. For bivariate events data, we studied the data on infection in catheters for patients on dialysis (McGilchrist and Aisbett, 1991). The dataset contains the time (in days) to infection in kidney catheters for 38 patients on dialysis from two observation periods. Three covariates were also included: age at the time of insertion, gender (1 for male and 0 for female) and type of disease (0 = GN, 1 = AN, 2 = PKD, and 3 = other). The bivariate AFT model (1) was used for the pair of event times from the two observation periods.

For bivariate clustered failure time data, we considered the diabetic retinopathy data (Huster et al., 1989). The dataset contains 197 patients, who were a 50% random sample of the patients with high-risk diabetic retinopathy as defined by the Diabetic Retinopathy Study (DRS). Each patient had one eye randomized to laser treatment, and the other eye served as an untreated control. For each eye, the event of interest was the time from initiation of treatment to severe visual loss (called "blindness"). Besides the treatment indicator (1 = treatment; 0 = control), four other covariates were included in the bivariate AFT model: laser type (0 = xenon, 1 = argon), age at diagnosis of diabetes, type of diabetes (0 = juvenile, 1 = adult) and risk group (6-12).

The proposed independence tests were constructed for both datasets using two different weight functions: the constant weight function $h(x) = 1$, and a data-dependent weight function, namely $h(x) = S^{(0)}(x, \hat\beta_{1,G}, \hat\beta_{2,G})$ for the bivariate events data and $h(x) = \tilde S^{(0)}(x, \hat\beta_G)$ for the bivariate clustered failure time data. The corresponding test statistics are denoted by $H_n^1$ for the constant weight function and $H_n^2$ for the data-dependent weight function. The results are summarized as follows:

Columns: Example, Test Statistic, SE, p-value
Rows: Catheter Infection ($H_n^1$); Catheter Infection ($H_n^2$); Diabetic Retinopathy Data ($H_n^1$); Diabetic Retinopathy Data ($H_n^2$)
(table entries not recoverable)

Here SE is the estimated standard error of the corresponding test statistic with resampling size $M = 500$. From the above results, for the catheter infection data the pair of event times are independent given the covariates age, gender and type of disease, while for the diabetic retinopathy data the failure times of the two eyes are correlated even after adjusting for the treatment indicator, laser type, age at diagnosis of diabetes, type of diabetes and risk group.

5 Discussion

We have proposed a class of $\chi^2$-type independence tests for bivariate events data with adjustment for covariates. The pair of failure times is modelled through the bivariate accelerated failure time model, and the dependence structure between the two related failure times is left completely unspecified. The large sample properties of the proposed tests are established, and a resampling technique is used to estimate the asymptotic variances of the proposed test statistics. The proposed independence tests can also be extended to multivariate failure time data by using pairwise comparisons.

In this article, the proposed independence test statistics were introduced in an ad hoc fashion. Only independence along the diagonal line is used for testing the null hypothesis $H_0$. That is, we only use the fact that, under $H_0$, $P\{W_{i1}(\beta_{01}) \wedge W_{i2}(\beta_{02}) > x \mid Z_{i1}, Z_{i2}\} = \exp[-\{\Lambda_1(x) + \Lambda_2(x)\}]$. More comprehensive tests may be constructed, for example along the lines of the covariance test proposed by Hsu and Prentice (1996) for the proportional hazards model. To be specific,

we may consider the following covariance-type test statistic:

$$P_n = n^{-1/2} \sum_{i=1}^n \int\!\!\int h(x_1, x_2)\, dM_{i1}\{x_1, \hat\beta_{1,G}, \hat\Lambda_1(\cdot, \hat\beta_{1,G})\}\, dM_{i2}\{x_2, \hat\beta_{2,G}, \hat\Lambda_2(\cdot, \hat\beta_{2,G})\},$$

where $h(x_1, x_2)$ is a positive weight function. In addition, we considered only deterministic weight functions $h(x)$ for constructing the test statistics in this paper. However, to improve the power of the test, a data-dependent, predictable and bounded weight function $h_n(x)$ may also be used, where $h_n(x)$ converges almost surely to some deterministic function as $n$ goes to $\infty$. Deriving the theoretical properties of the covariance-type tests and of the tests using data-dependent weight functions, and estimating the asymptotic variances of the corresponding test statistics via the resampling method, are more complicated and need further investigation.

Acknowledgement

The author would like to thank the two referees for their insightful and constructive comments. The author is also grateful to Professor Tsiatis for helpful discussions and suggestions. The research was supported in part by NSF grant DMS

Appendix

To avoid delicate technical issues associated with smoothness and tail fluctuation, we assume that the related functions are sufficiently smooth and impose regularity conditions similar to Conditions 1-5 of Ying (1993).

Proof of Theorem 1

It is easy to see that the Gehan-weight estimating function defined in (2) can be rewritten as

$$U_{k,G}(\beta, \Lambda) = \sum_{i=1}^n \int S_k^{(0)}(x, \beta) \{Z_{ik} - \bar Z_k(x, \beta)\}\, dM_{ik}(x, \beta, \Lambda), \quad k = 1, 2.$$

Note that $E\{M_{ik}(x, \beta_{0k}, \Lambda_{0k})\} = 0$. Using some empirical process approximation techniques, we can show that

$$n^{-1/2} U_{k,G}(\beta_{0k}, \Lambda_{0k}) = n^{-1/2} \sum_{i=1}^n U_{ik,G} + o_p(1), \qquad (A.1)$$

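where

$$U_{ik,G} = \int s_k^{(0)}(x, \beta_{0k}) \{Z_{ik} - \bar z_k(x, \beta_{0k})\}\, dM_{ik}(x, \beta_{0k}, \Lambda_{0k}),$$

and $\bar z_k(x, \beta) = s_k^{(1)}(x, \beta) / s_k^{(0)}(x, \beta)$ with $s_k^{(r)}(x, \beta) = \lim_{n \to \infty} E\{S_k^{(r)}(x, \beta)\}$, $r = 0, 1$, $k = 1, 2$. Then it follows from the arguments of Jin et al. (2003) that $\hat\beta_{k,G} \to \beta_{0k}$ almost surely and

$$n^{1/2}(\hat\beta_{k,G} - \beta_{0k}) = -A_{k,G}^{-1}\, n^{-1/2} \sum_{i=1}^n U_{ik,G} + o_p(1), \qquad (A.2)$$

where

$$A_{k,G} = \lim_{n \to \infty} n^{-1} \sum_{i=1}^n E \int s_k^{(0)}(x, \beta_{0k}) \{Z_{ik} - \bar z_k(x, \beta_{0k})\}^{\otimes 2} \{d \log \lambda_k(x) / dx\}\, dN_{ik}(x, \beta_{0k}),$$

$\lambda_k(x) = d\Lambda_k(x)/dx$, and $a^{\otimes 2} = a a'$. In addition, it can be shown (Park and Wei, 2003) that there exists a deterministic vector $C_{k,G}(x, \beta)$ $(k = 1, 2)$ such that

$$n^{1/2}\{\hat\Lambda_k(x, \hat\beta_{k,G}) - \hat\Lambda_k(x, \beta_{0k})\} = C_{k,G}'(x, \beta_{0k})\, n^{1/2}(\hat\beta_{k,G} - \beta_{0k}) + o_p(1). \qquad (A.3)$$

Let $g(\beta_1, \beta_2, \Lambda_1, \Lambda_2) = \lim_{n \to \infty} n^{-1} \sum_{i=1}^n E\{\int h(x)\, dM_i(x, \beta_1, \beta_2, \Lambda_1, \Lambda_2)\}$. Then, applying the techniques of Tsiatis (1990) and Ying (1993) for deriving the linearity of the weighted log-rank estimates, together with some empirical process approximation techniques, we can show that

$$H_n = n^{-1/2} \sum_{i=1}^n \int h(x)\, dM_i\{x, \beta_{01}, \beta_{02}, \hat\Lambda_1(\cdot, \beta_{01}), \hat\Lambda_2(\cdot, \beta_{02})\} + \sum_{k=1}^2 B_{k,G}'\, n^{1/2}(\hat\beta_{k,G} - \beta_{0k}) + o_p(1),$$

where

$$B_{k,G} = \left. \frac{\partial g(\beta_1, \beta_2, \Lambda_1, \Lambda_2)}{\partial \beta_k} \right|_{\beta_1 = \beta_{01},\, \beta_2 = \beta_{02},\, \Lambda_1 = \Lambda_{01},\, \Lambda_2 = \Lambda_{02}} - \int h(x)\, s^{(0)}(x, \beta_{01}, \beta_{02})\, dC_{k,G}(x, \beta_{0k})$$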

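and $s^{(0)}(x, \beta_1, \beta_2) = \lim_{n \to \infty} E\{S^{(0)}(x, \beta_1, \beta_2)\}$. Then it follows that

$$H_n = n^{-1/2} \sum_{i=1}^n \int h(x)\, dM_i(x, \beta_{01}, \beta_{02}, \Lambda_{01}, \Lambda_{02}) - \sum_{k=1}^2 B_{k,G}' A_{k,G}^{-1}\, n^{-1/2} \sum_{i=1}^n U_{ik,G} - n^{-1/2} \sum_{i=1}^n \sum_{k=1}^2 \int h(x)\, I\{\tilde W_i(\beta_{01}, \beta_{02}) \ge x\}\, d\{\hat\Lambda_k(x, \beta_{0k}) - \Lambda_{0k}(x)\} + o_p(1)$$

$$= n^{-1/2} \sum_{i=1}^n \int h(x)\, dM_i(x, \beta_{01}, \beta_{02}, \Lambda_{01}, \Lambda_{02}) - \sum_{k=1}^2 B_{k,G}' A_{k,G}^{-1}\, n^{-1/2} \sum_{i=1}^n U_{ik,G} - n^{-1/2} \sum_{i=1}^n \sum_{k=1}^2 \int h(x) \frac{s^{(0)}(x, \beta_{01}, \beta_{02})}{s_k^{(0)}(x, \beta_{0k})}\, dM_{ik}(x, \beta_{0k}, \Lambda_{0k}) + o_p(1)$$

$$= n^{-1/2} \sum_{i=1}^n \psi_i + o_p(1).$$

Thus, the results established in Theorem 1 hold.

Proof of Theorem 2

From the arguments given in the appendix of Jin et al. (2006), we have that $\hat\beta^*_{k,G} \to \beta_{0k}$ almost surely $(k = 1, 2)$ and

$$n^{1/2}(\hat\beta^*_{k,G} - \hat\beta_{k,G}) = -A_{k,G}^{-1}\, n^{-1/2}\, U^*_{k,G}(\hat\beta_{k,G}) + o_p(1), \qquad (A.4)$$

where

$$U^*_{k,G}(\hat\beta_{k,G}) = \sum_{i=1}^n \int S_k^{(0)}(x, \hat\beta_{k,G}) \{Z_{ik} - \bar Z_k(x, \hat\beta_{k,G})\}\, dM_{ik}\{x, \hat\beta_{k,G}, \hat\Lambda_k(\cdot, \hat\beta_{k,G})\} (V_i - 1).$$

Furthermore, by a derivation similar to that in the proof of Theorem 1, we can show that

$$H_n^* - H_n = \sum_{k=1}^2 B_{k,G}'\, n^{1/2}(\hat\beta^*_{k,G} - \hat\beta_{k,G}) + o_p(1).$$

Then it follows from (A.4) that

$$H_n^* - H_n = -n^{-1/2} \sum_{i=1}^n \left[ \sum_{k=1}^2 B_{k,G}' A_{k,G}^{-1} \int S_k^{(0)}(x, \hat\beta_{k,G}) \{Z_{ik} - \bar Z_k(x, \hat\beta_{k,G})\}\, dM_{ik}\{x, \hat\beta_{k,G}, \hat\Lambda_k(\cdot, \hat\beta_{k,G})\} \right] (V_i - 1) + o_p(1).$$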

Now we have

$$Q_n^* = n^{-1/2} \sum_{i=1}^n \left[ \int h(x)\, dM_i\{x, \hat\beta_{1,G}, \hat\beta_{2,G}, \hat\Lambda_1(\cdot, \hat\beta_{1,G}), \hat\Lambda_2(\cdot, \hat\beta_{2,G})\} - \sum_{k=1}^2 B_{k,G}' A_{k,G}^{-1} \int S_k^{(0)}(x, \hat\beta_{k,G}) \{Z_{ik} - \bar Z_k(x, \hat\beta_{k,G})\}\, dM_{ik}\{x, \hat\beta_{k,G}, \hat\Lambda_k(\cdot, \hat\beta_{k,G})\} - \sum_{k=1}^2 \int h(x) \frac{S^{(0)}(x, \hat\beta_{1,G}, \hat\beta_{2,G})}{S_k^{(0)}(x, \hat\beta_{k,G})}\, dM_{ik}\{x, \hat\beta_{k,G}, \hat\Lambda_k(\cdot, \hat\beta_{k,G})\} \right] (V_i - 1) + o_p(1)$$

$$\equiv n^{-1/2} \sum_{i=1}^n \hat\psi_i (V_i - 1) + o_p(1).$$

Note that, given the observed data, $\hat\psi_i$ is a constant. Thus, as $n$ goes to $\infty$, the conditional distribution of $Q_n^*$ given the observed data converges almost surely to a normal distribution with mean 0 and variance $\sigma_Q^2 \equiv \lim_{n \to \infty} (1/n) \sum_{i=1}^n \hat\psi_i^2$. In addition, under $H_0$, $\sigma_Q^2 = \mathrm{Var}(\psi_1) = \sigma^2$. This completes the proof of Theorem 2.

Proof of Theorem 3

Following steps similar to those used in the proofs of Theorems 1 and 2, we can also show the results established in Theorem 3. The details are omitted here.

References

Clayton DG (1978) A Model for Association in Bivariate Life Tables and Its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence. Biometrika 65:
Cox DR (1972) Regression models and life tables (with Discussion). J Roy Stat Soc Ser B 34:
Cox DR, Oakes D (1984) Analysis of Survival Data. Chapman & Hall, New York, NY.
Ding AA, Wang W (2004) Testing Independence for Bivariate Current Status Data. J Am Stat Assoc 99:
Fygenson M, Ritov Y (1994) Monotone Estimating Equations for Censored Data. Ann Stat 22:

Gehan EA (1965) A Generalized Wilcoxon Test for Comparing Arbitrary Single-Censored Samples. Biometrika 52:
Hsu L, Prentice RL (1996) A Generalization of the Mantel-Haenszel Tests to Bivariate Failure Time Data. Biometrika 83:
Jin Z, Lin DY, Wei LJ, Ying Z (2003) Rank-based Inference for the Accelerated Failure Time Model. Biometrika 90:
Jin Z, Lin DY, Ying Z (2006) Rank Regression Analysis of Multivariate Failure Time Data Based on Marginal Linear Models. Scand J Stat 33:1-23.
Kalbfleisch JD, Prentice RL (1980) The Statistical Analysis of Failure Time Data. Wiley, New York, NY.
Kendall MG (1962) Rank Correlation Methods. Griffin, London.
Lin DY, Fleming TR, Wei LJ (1994) Confidence bands for survival curves under the proportional hazards model. Biometrika 81:
Lin JS, Wei LJ (1992) Linear Regression Analysis for Multivariate Failure Time Observations. J Am Stat Assoc 87:
Mantel N, Haenszel W (1959) Statistical Aspects of the Analysis of Data from Retrospective Studies of Disease. J National Cancer Institute 22:
Oakes D (1982a) A Concordance Test for Independence in the Presence of Bivariate Censoring. Biometrics 38:
Oakes D (1982b) A Model for Bivariate Survival Data. J Roy Stat Soc Ser B 44:
Oakes D (1986) Semiparametric Inference in a Model for Association in Bivariate Survival Data. Biometrika 73:
Oakes D (1989) Bivariate Survival Models Induced by Frailties. J Am Stat Assoc 84:

Park Y, Wei LJ (2003) Estimating Subject-Specific Survival Functions under the Accelerated Failure Time Model. Biometrika 90:
Tsiatis AA (1990) Estimating Regression Parameters Using Linear Rank Tests for Censored Data. Ann Stat 18:
Ying Z (1993) A Large Sample Study of Rank Estimation for Censored Regression Data. Ann Stat 21:

Table 1 Simulation Results for Bivariate Events Data under Bivariate Normal Model
Columns: n, θ, Censoring, Mean, SE, SEE, POW
(table entries not recoverable)
NOTE: Mean and SE are the mean and sample standard error of the test statistic $H_n$. SEE is the mean of the estimated standard errors obtained with the resampling method, and POW is the power of the test, or the type I error under the null hypothesis.

Table 2 Simulation Results for Bivariate Events Data under Clayton-Oakes Model (n = 100)
Columns: Error Distribution (Extreme Value, Logistic), θ, Censoring, Mean, SE, SEE, POW
(table entries not recoverable)

Table 3 Simulation Results for Bivariate Events Data under Clayton-Oakes Model (n = 200)
Columns: Error Distribution (Extreme Value, Logistic), θ, Censoring, Mean, SE, SEE, POW
(table entries not recoverable)

Table 4 Simulation Results for Bivariate Clustered Failure Time Data under Bivariate Normal Model (n = 100)
Columns: θ, Censoring, Mean, SE, SEE, POW
(table entries not recoverable)

Table 5 Simulation Results for Bivariate Clustered Failure Time Data under Clayton-Oakes Model (n = 100)
Columns: Error Distribution (Extreme Value, Logistic), θ, Censoring, Mean, SE, SEE, POW
(table entries not recoverable)

Rank Regression Analysis of Multivariate Failure Time Data Based on Marginal Linear Models

Rank Regression Analysis of Multivariate Failure Time Data Based on Marginal Linear Models doi: 10.1111/j.1467-9469.2005.00487.x Published by Blacwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA Vol 33: 1 23, 2006 Ran Regression Analysis

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS

GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Statistica Sinica 20 (2010), 441-453 GOODNESS-OF-FIT TESTS FOR ARCHIMEDEAN COPULA MODELS Antai Wang Georgetown University Medical Center Abstract: In this paper, we propose two tests for parametric models

More information

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data Efficiency Comparison Between Mean and Log-rank Tests for Recurrent Event Time Data Wenbin Lu Department of Statistics, North Carolina State University, Raleigh, NC 27695 Email: lu@stat.ncsu.edu Summary.

More information

On the Breslow estimator

On the Breslow estimator Lifetime Data Anal (27) 13:471 48 DOI 1.17/s1985-7-948-y On the Breslow estimator D. Y. Lin Received: 5 April 27 / Accepted: 16 July 27 / Published online: 2 September 27 Springer Science+Business Media,

More information

On consistency of Kendall s tau under censoring

On consistency of Kendall s tau under censoring Biometria (28), 95, 4,pp. 997 11 C 28 Biometria Trust Printed in Great Britain doi: 1.193/biomet/asn37 Advance Access publication 17 September 28 On consistency of Kendall s tau under censoring BY DAVID

More information

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

On least-squares regression with censored data

On least-squares regression with censored data Biometrika (2006), 93, 1, pp. 147 161 2006 Biometrika Trust Printed in Great Britain On least-squares regression with censored data BY ZHEZHEN JIN Department of Biostatistics, Columbia University, New

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

Multivariate Survival Data With Censoring.

Multivariate Survival Data With Censoring. 1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data Outline Frailty modelling of Multivariate Survival Data Thomas Scheike ts@biostat.ku.dk Department of Biostatistics University of Copenhagen Marginal versus Frailty models. Two-stage frailty models: copula

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques

Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques Journal of Data Science 7(2009), 365-380 Estimating Bivariate Survival Function by Volterra Estimator Using Dynamic Programming Techniques Jiantian Wang and Pablo Zafra Kean University Abstract: For estimating

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

A Bivariate Weibull Regression Model

A Bivariate Weibull Regression Model c Heldermann Verlag Economic Quality Control ISSN 0940-5151 Vol 20 (2005), No. 1, 1 A Bivariate Weibull Regression Model David D. Hanagal Abstract: In this paper, we propose a new bivariate Weibull regression

More information

Statistical Methods and Computing for Semiparametric Accelerated Failure Time Model with Induced Smoothing

Statistical Methods and Computing for Semiparametric Accelerated Failure Time Model with Induced Smoothing University of Connecticut DigitalCommons@UConn Doctoral Dissertations University of Connecticut Graduate School 5-2-2013 Statistical Methods and Computing for Semiparametric Accelerated Failure Time Model

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models

Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data

Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data Biometrika (28), 95, 4,pp. 947 96 C 28 Biometrika Trust Printed in Great Britain doi: 1.193/biomet/asn49 Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival

More information

1 Introduction. 2 Residuals in PH model

1 Introduction. 2 Residuals in PH model Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical

More information

Package Rsurrogate. October 20, 2016

Package Rsurrogate. October 20, 2016 Type Package Package Rsurrogate October 20, 2016 Title Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information Version 2.0 Date 2016-10-19 Author Layla Parast

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 7 Fall 2012 Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample H 0 : S(t) = S 0 (t), where S 0 ( ) is known survival function,

More information

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data 1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions

More information

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0,

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0, Accelerated failure time model: log T = β T Z + ɛ β estimation: solve where S n ( β) = n i=1 { Zi Z(u; β) } dn i (ue βzi ) = 0, Z(u; β) = j Z j Y j (ue βz j) j Y j (ue βz j) How do we show the asymptotics

More information

A Measure of Association for Bivariate Frailty Distributions

A Measure of Association for Bivariate Frailty Distributions journal of multivariate analysis 56, 6074 (996) article no. 0004 A Measure of Association for Bivariate Frailty Distributions Amita K. Manatunga Emory University and David Oakes University of Rochester

More information

Efficiency of Profile/Partial Likelihood in the Cox Model

Efficiency of Profile/Partial Likelihood in the Cox Model Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows

More information

Quantile Regression for Residual Life and Empirical Likelihood

Quantile Regression for Residual Life and Empirical Likelihood Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan

More information

Unit 14: Nonparametric Statistical Methods

Unit 14: Nonparametric Statistical Methods Unit 14: Nonparametric Statistical Methods Statistics 571: Statistical Methods Ramón V. León 8/8/2003 Unit 14 - Stat 571 - Ramón V. León 1 Introductory Remarks Most methods studied so far have been based

More information

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU

More information

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Competing risks data analysis under the accelerated failure time model with missing cause of failure

Competing risks data analysis under the accelerated failure time model with missing cause of failure Ann Inst Stat Math 2016 68:855 876 DOI 10.1007/s10463-015-0516-y Competing risks data analysis under the accelerated failure time model with missing cause of failure Ming Zheng Renxin Lin Wen Yu Received:

More information

A comparison study of the nonparametric tests based on the empirical distributions

A comparison study of the nonparametric tests based on the empirical distributions 통계연구 (2015), 제 20 권제 3 호, 1-12 A comparison study of the nonparametric tests based on the empirical distributions Hyo-Il Park 1) Abstract In this study, we propose a nonparametric test based on the empirical

More information

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities

Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities Hutson Journal of Statistical Distributions and Applications (26 3:9 DOI.86/s4488-6-47-y RESEARCH Open Access Nonparametric rank based estimation of bivariate densities given censored data conditional

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH

FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH FULL LIKELIHOOD INFERENCES IN THE COX MODEL: AN EMPIRICAL LIKELIHOOD APPROACH Jian-Jian Ren 1 and Mai Zhou 2 University of Central Florida and University of Kentucky Abstract: For the regression parameter

More information

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

More information

Efficient Estimation of Censored Linear Regression Model

Efficient Estimation of Censored Linear Regression Model 2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 2 2 22 23 24 25 26 27 28 29 3 3 32 33 34 35 36 37 38 39 4 4 42 43 44 45 46 47 48 Biometrika (2), xx, x, pp. 4 C 28 Biometrika Trust Printed in Great Britain Efficient Estimation

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

Estimation and Goodness of Fit for Multivariate Survival Models Based on Copulas

Estimation and Goodness of Fit for Multivariate Survival Models Based on Copulas Estimation and Goodness of Fit for Multivariate Survival Models Based on Copulas by Yildiz Elif Yilmaz A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the

More information

EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL

EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL Statistica Sinica 22 (2012), 295-316 doi:http://dx.doi.org/10.5705/ss.2010.190 EMPIRICAL LIKELIHOOD ANALYSIS FOR THE HETEROSCEDASTIC ACCELERATED FAILURE TIME MODEL Mai Zhou 1, Mi-Ok Kim 2, and Arne C.

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

More information

Pairwise dependence diagnostics for clustered failure-time data

Pairwise dependence diagnostics for clustered failure-time data Biometrika Advance Access published May 13, 27 Biometrika (27), pp. 1 15 27 Biometrika Trust Printed in Great Britain doi:1.193/biomet/asm24 Pairwise dependence diagnostics for clustered failure-time data

More information

Simple techniques for comparing survival functions with interval-censored data

Simple techniques for comparing survival functions with interval-censored data Simple techniques for comparing survival functions with interval-censored data Jinheum Kim, joint with Chung Mo Nam jinhkim@suwon.ac.kr Department of Applied Statistics University of Suwon Comparing survival

More information

A Conditional Approach to Modeling Multivariate Extremes

A Conditional Approach to Modeling Multivariate Extremes A Approach to ing Multivariate Extremes By Heffernan & Tawn Department of Statistics Purdue University s April 30, 2014 Outline s s Multivariate Extremes s A central aim of multivariate extremes is trying

More information

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. *

Least Absolute Deviations Estimation for the Accelerated Failure Time Model. University of Iowa. * Least Absolute Deviations Estimation for the Accelerated Failure Time Model Jian Huang 1,2, Shuangge Ma 3, and Huiliang Xie 1 1 Department of Statistics and Actuarial Science, and 2 Program in Public Health

More information

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests

Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests Biometrika (2014),,, pp. 1 13 C 2014 Biometrika Trust Printed in Great Britain Size and Shape of Confidence Regions from Extended Empirical Likelihood Tests BY M. ZHOU Department of Statistics, University

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Quantile Regression for Recurrent Gap Time Data

Quantile Regression for Recurrent Gap Time Data Biometrics 000, 1 21 DOI: 000 000 0000 Quantile Regression for Recurrent Gap Time Data Xianghua Luo 1,, Chiung-Yu Huang 2, and Lan Wang 3 1 Division of Biostatistics, School of Public Health, University

More information

A Goodness-of-fit Test for Semi-parametric Copula Models of Right-Censored Bivariate Survival Times

A Goodness-of-fit Test for Semi-parametric Copula Models of Right-Censored Bivariate Survival Times A Goodness-of-fit Test for Semi-parametric Copula Models of Right-Censored Bivariate Survival Times by Moyan Mei B.Sc. (Honors), Dalhousie University, 2014 Project Submitted in Partial Fulfillment of the

More information

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS

DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS DIAGNOSTICS FOR STRATIFIED CLINICAL TRIALS IN PROPORTIONAL ODDS MODELS Ivy Liu and Dong Q. Wang School of Mathematics, Statistics and Computer Science Victoria University of Wellington New Zealand Corresponding

More information

Conditional Copula Models for Right-Censored Clustered Event Time Data

Conditional Copula Models for Right-Censored Clustered Event Time Data Conditional Copula Models for Right-Censored Clustered Event Time Data arxiv:1606.01385v1 [stat.me] 4 Jun 2016 Candida Geerdens 1, Elif F. Acar 2, and Paul Janssen 1 1 Center for Statistics, Universiteit

More information

Goodness-of-fit test for the Cox Proportional Hazard Model

Goodness-of-fit test for the Cox Proportional Hazard Model Goodness-of-fit test for the Cox Proportional Hazard Model Rui Cui rcui@eco.uc3m.es Department of Economics, UC3M Abstract In this paper, we develop new goodness-of-fit tests for the Cox proportional hazard

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data Outline Frailty modelling of Multivariate Survival Data Thomas Scheike ts@biostat.ku.dk Department of Biostatistics University of Copenhagen Marginal versus Frailty models. Two-stage frailty models: copula

More information

IDENTIFIABILITY OF THE MULTIVARIATE NORMAL BY THE MAXIMUM AND THE MINIMUM

IDENTIFIABILITY OF THE MULTIVARIATE NORMAL BY THE MAXIMUM AND THE MINIMUM Surveys in Mathematics and its Applications ISSN 842-6298 (electronic), 843-7265 (print) Volume 5 (200), 3 320 IDENTIFIABILITY OF THE MULTIVARIATE NORMAL BY THE MAXIMUM AND THE MINIMUM Arunava Mukherjea

More information

Repeated ordinal measurements: a generalised estimating equation approach

Repeated ordinal measurements: a generalised estimating equation approach Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related

More information

Attributable Risk Function in the Proportional Hazards Model

Attributable Risk Function in the Proportional Hazards Model UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

More information

Rank-based inference for the accelerated failure time model

Rank-based inference for the accelerated failure time model Biometrika (2003), 90, 2, pp. 341 353 2003 Biometrika Trust Printed in reat Britain Rank-based inference for the accelerated failure time model BY ZHEZHEN JIN Department of Biostatistics, Columbia University,

More information

Statistical inference based on non-smooth estimating functions

Statistical inference based on non-smooth estimating functions Biometrika (2004), 91, 4, pp. 943 954 2004 Biometrika Trust Printed in Great Britain Statistical inference based on non-smooth estimating functions BY L. TIAN Department of Preventive Medicine, Northwestern

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

Regression Calibration in Semiparametric Accelerated Failure Time Models

Regression Calibration in Semiparametric Accelerated Failure Time Models Biometrics 66, 405 414 June 2010 DOI: 10.1111/j.1541-0420.2009.01295.x Regression Calibration in Semiparametric Accelerated Failure Time Models Menggang Yu 1, and Bin Nan 2 1 Department of Medicine, Division

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

Analysis of transformation models with censored data

Analysis of transformation models with censored data Biometrika (1995), 82,4, pp. 835-45 Printed in Great Britain Analysis of transformation models with censored data BY S. C. CHENG Department of Biomathematics, M. D. Anderson Cancer Center, University of

More information

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004 Estimation in Generalized Linear Models with Heterogeneous Random Effects Woncheol Jang Johan Lim May 19, 2004 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

A Regression Model For Recurrent Events With Distribution Free Correlation Structure A Regression Model For Recurrent Events With Distribution Free Correlation Structure J. Pénichoux(1), A. Latouche(2), T. Moreau(1) (1) INSERM U780 (2) Université de Versailles, EA2506 ISCB - 2009 - Prague

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

Modelling Survival Events with Longitudinal Data Measured with Error

Modelling Survival Events with Longitudinal Data Measured with Error Modelling Survival Events with Longitudinal Data Measured with Error Hongsheng Dai, Jianxin Pan & Yanchun Bao First version: 14 December 29 Research Report No. 16, 29, Probability and Statistics Group

More information

Nuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation

Nuisance parameter elimination for proportional likelihood ratio models with nonignorable missingness and random truncation Biometrika Advance Access published October 24, 202 Biometrika (202), pp. 8 C 202 Biometrika rust Printed in Great Britain doi: 0.093/biomet/ass056 Nuisance parameter elimination for proportional likelihood

More information

Full likelihood inferences in the Cox model: an empirical likelihood approach

Full likelihood inferences in the Cox model: an empirical likelihood approach Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information
