Comparing Distribution Functions via Empirical Likelihood

Georgia State University ScholarWorks @ Georgia State University Mathematics and Statistics Faculty Publications Department of Mathematics and Statistics 25 Comparing Distribution Functions via Empirical Likelihood Yichuan Zhao Georgia State University, yichuan@gsu.edu Ian W. McKeague Follow this and additional works at: https://scholarworks.gsu.edu/math_facpub Part of the Biostatistics Commons, and the Mathematics Commons Recommended Citation McKeague, Ian W. and Zhao, Yichuan 25 "Comparing Distribution Functions via Empirical Likelihood," The International Journal of Biostatistics: Vol. 1 : Iss. 1, Article 5. This Article is brought to you for free and open access by the Department of Mathematics and Statistics at ScholarWorks @ Georgia State University. It has been accepted for inclusion in Mathematics and Statistics Faculty Publications by an authorized administrator of ScholarWorks @ Georgia State University. For more information, please contact scholarworks@gsu.edu.

The International Journal of Biostatistics Volume 1, Issue 1 25 Article 5 Comparing Distribution Functions via Empirical Likelihood Ian W. McKeague Yichuan Zhao Columbia University, im2131@columbia.edu Georgia State University, dz27@gmail.com Copyright c 25 The Berkeley Electronic Press. All rights reserved.

Comparing Distribution Functions via Empirical Likelihood Ian W. McKeague and Yichuan Zhao Abstract This paper develops empirical likelihood based simultaneous confidence bands for differences and ratios of two distribution functions from independent samples of right-censored survival data. The proposed confidence bands provide a flexible way of comparing treatments in biomedical settings, and bring empirical likelihood methods to bear on important target functions for which only Wald-type confidence bands have been available in the literature. The approach is illustrated with a real data example. KEYWORDS: Kaplan Meier estimator, Nelson Aalen estimator, plug-in, right censoring

1 Introduction McKeague and Zhao: Comparing Distribution Functions The purpose of this paper is to develop simultaneous confidence bands for the difference and the ratio of two distribution functions using the empirical likelihood approach. The proposed confidence bands provide an attractive graphical comparison of treatment and control groups in biomedical studies on the basis of independent right-censored survival time data from each group. The graphical comparison of two survival distributions can be done in various ways. Empirical likelihood EL techniques have been used to provide confidence bands for Q- Q plots Einmahl and McKeague, 1999, ratios of survival functions McKeague and Zhao, 22, and P-P plots Claeskens et al., 23; references to the earlier literature may be found in these papers. The simplest and most natural way to carry out such a comparison, however, is to target the difference and the ratio of the two distribution functions, which represent directly interpretable measures of treatment effect and relative risk. The difference is suitable when an absolute measure of treatment effect is needed, the ratio when a relative measure is needed. Parzen et al. 1997 constructed a Waldtype simultaneous confidence band for a difference between two distribution functions, but, as far as we know, there is no EL band available in the literature. We develop our approach in terms of differences and ratios of linear functionals of the cumulative hazard functions: αt = t g 1 s da 1 s t g 2 s da 2 s and t / t βt = g 1 s da 1 s g 2 s da 2 s, where A j t is the cumulative hazard function for group j, and g j t =S j t =1 F j tis the survival function for group j. The difference between the two distribution functions is then seen to be αt =F 1 t F 2 t =S 2 t S 1 t, and the ratio of the two distribution functions is βt =F 1 t/f 2 t. The difference and the ratio of the cumulative hazard functions are obtained by taking g j 1, and our approach extends essentially without change to that case as well. These various ways of comparing the two distributions provide greater flexibility than what is currently available. Published by The Berkeley Electronic Press, 25 1

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 The unknown survival functions g j = S j in the above representations of αt and βt can be seen as nuisance parameters in the EL statistic, so we take the approach of plugging-in the corresponding Kaplan Meier estimates. Plug-in for unknown parameters in estimating equations has been used extensively in conjunction with EL, and typically perturbs the usual chi-squared limit distribution into a more complicated form, see, e.g., Hjort et al. 25. The present case is no exception: we find that the empirical likelihood statistic with plug-in of the Kaplan Meier estimates is not asymptotically distribution free; a bootstrap procedure is thus needed to determine critical values for the EL confidence bands. An EL confidence interval for a linear functional gs das of a cumulative hazard function A, where g is known, has been developed by Pan and Zhou 22. Their approach is based on a Poisson extension of the likelihood cf. Murphy, 1995, but we found that it is not easy to deal with the target functions αt and βt using a likelihood of this form. Instead, our approach is based on the standard nonparametric likelihood for S 1,S 2, cf. McKeague and Zhao 22. The EL function or nonparametric likelihood ratio is constructed by substituting A j t = log S j t in the estimating equation that defines the target function of interest, and we find that this leads to a mathematically tractable formulation. The main results underlying our derivation of the proposed confidence bands are presented in Section 2. In Section 3 we develop the bootstrap procedure needed to construct the bands in practice. In Section 4 we give an illustrative example. Some concluding remarks are given in Section 5. Proofs are contained in the Appendix. 2 Main results 2.1 Preliminaries We consider the standard two-sample framework with independent right censoring. That is, we have two independent samples of i.i.d. observations of the form Z ji,δ ji, where j =1, 2 indexes the sample, i =1,...,n j indexes the observations within each sample, and Z ji = X ji Y ji, δ ji =1 {Xji Y ji }. The distribution functions of X ji and Y ji are http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 2

McKeague and Zhao: Comparing Distribution Functions denoted F j and G j, respectively. The survival functions S j =1 F j are assumed to be continuous. The total sample size is n = n 1 + n 2. We work with independent and non-negative X ji and Y ji. The nonparametric likelihood is given by L S 1, S 2 = n 2 j { S j Z ji S j Z ji } δ ji Sj Z ji 1 δ ji, 2.1 j=1 where S j belongs to Γ, the space of all survival functions on [,. The target functions αt and βt may be written in the general form θt = θt, S 1,S 2,g 1,g 2 by substitution of the cumulative hazard functions A j t = log S j t. The empirical likelihood ratio for θt, with plug-in of estimators ĝ j for the g j, is then given by R θt,t,ĝ 1, ĝ 2 = sup{l S 1, S 2 :θt, S 1, S 2, ĝ 1, ĝ 2 = θt, S 1, S 2 Γ Γ} sup{l S 1, S 2 : S 1, S, 2 Γ Γ} 2.2 which can be expressed more explicitly in the case of θt = αt as follows. The ordered uncensored survival times, i.e., the X ji with corresponding δ ji = 1, are written T j1 T jnj <, and r ji = n j k=1 1 {Z jk T ji } denotes the size of the risk set at T ji, d ji = n j k=1 1 {Z jk =T ji,δ jk =1} denotes the number of deaths occurring at time T ji. Define K j t =#{i : T ji t} and D j = max i:tji td ji r ji /ĝ j T ji 1 {ĝj T ji >}. Using Lagrange s method [cf. Thomas and Grunkemeier 1975 or Li 1995], it can be shown that = 2 2 log Rαt,t,ĝ 1, ĝ 2 2 K 1 t K 2 t d 1i log 1+ λ nĝ 1 T 1i log 1+ λ nĝ 1 T 1i d 1i r 2i d 2i log 1 λ nĝ 2 T 2i r 2i log 1 λ nĝ 2 T 2i r 2i d 2i r 2i where the Lagrange multiplier D 1 <λ n < D 2 satisfies the equation K 1 t log 1 d 1i ĝ 1 T 1i + λ n ĝ 1 T 1i K 2 t log 1 d 2i ĝ 2 T 2i r 2i λ n ĝ 2 T 2i 2.3 = αt. 2.4 Published by The Berkeley Electronic Press, 25 3

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 Here we are suppressing the dependence of the Lagrange multiplier on t. The equation 2.4 has a unique solution λ n provided D j <, ĝ j T ji >, i =1,...,N j, j =1, 2, because as a function of λ n the l.h.s. of 2.4 is continuous, strictly increasing and tends to ± as λ n D 2 or λ n D 1. Similarly, in the case of θt =βt, we have 2 log Rβt,t,ĝ 1, ĝ 2 = 2 2 j=1 K j t r ji log { r ji d ji log 1+ βtj 1 λ n ĝ j T ji r ji d ji } 1+ βtj 1 λ n ĝ j T ji, 2.5 r ji where the Lagrange multiplier D 1 <λ n < D 2 /βt satisfies the equation K 1 t log 1 K 2 t βt d 1i ĝ 1 T 1i + λ n ĝ 1 T 1i log 1 d 2i ĝ 2 T 2i =. 2.6 r 2i βtλ n ĝ 2 T 2i If βt >, the equation 2.6 has a unique solution λ n provided ĝ j T ji >, i = 1,...,K j t, j =1, 2, because, as a function of λ n, the l.h.s. of 2.6 is continuous, strictly increasing and tends to ± as λ n D 2 /βt orλ n D 1. We assume throughout that n j /n p j > asn. The plug-in estimate of g j t =S j t is specified by ĝ j t =S j,nj t, where S j,nj t is the Kaplan Meier estimator of S j t. Define H j s =S j s1 G j s, let τ 1 be such that S j τ 1 < 1, and let τ 2 τ 1 be such that H j τ 2 >, j =1, 2. For future convenience, we define γ j t = t df j s 1 G j s, 2.7 σdiff 2 t =γ 1t/p 1 + γ 2 t/p 2 and σratiot 2 =γ 1 t/p 1 + βt 2 γ 2 t/p 2. These functions appear in the limiting distributions of the likelihood ratio statistics and need to be estimated. It can be shown see Lemma A.3 that ˆσ diff 2 t =n[ˆγ 1t/n 1 +ˆγ 2 t/n 2 ] and ˆσ ratiot 2 =n[ˆγ 1 t/n 1 + ˆβt 2ˆγ 2 t/n 2 ] are uniformly consistent estimators of σdiff 2 t and σratiot 2 over [τ 1,τ 2 ], where ˆγ j t = http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 t df j,nj s 1 G j,nj s, 2.8 4

McKeague and Zhao: Comparing Distribution Functions F j,nj and G j,nj are the Kaplan Meier estimators of F j and G j, and ˆβt = F 1,n1 t/f 2,n2 t. 2.2 Confidence bands We now state our main results and explain how they can be used to construct the proposed simultaneous confidence bands. Theorem 2.1. The process 2 log Rαt,t,ĝ 1, ĝ 2, t [τ 1,τ 2 ], converges in distribution to U1 2 t/σdiff 2 t in D[τ 1,τ 2 ], where U 1 t is a Gaussian process with mean zero and covariance function covu 1 s,u 1 t = S 1 ss 1 tσ1s 2 t/p 1 + S 2 ss 2 tσ2s 2 t/p 2 and σ 2 j t = t df js/s j sh j s. It follows that sup 2 log Rαt,t,ĝ 1, ĝ 2 t [τ 1,τ 2 ] D U1 2 t sup t [τ 1,τ 2 ] σdiff 2 t by the continuous mapping theorem, and we obtain B diff = {t, αt : 2 log R αt,t,ĝ 1, ĝ 2 c α [τ 1,τ 2 ], t [τ 1,τ 2 ]} 2.9 as an asymptotic 11 α% confidence band for αt over[τ 1,τ 2 ], where the critical value c α [τ 1,τ 2 ] is the upper α-quantile of the distribution of sup t [τ1,τ 2 ] U1 2 t/σdiff 2 t. A simulation method is developed in Section 3 to obtain this critical value. Implementation. We now explain how to compute the confidence band B diff. For fixed t, let φλ n denote the r.h.s. of 2.3. Its derivative is φ λ n = + K 1 t K 2 t 2λ n ĝ 1 T 1i d 1i d 1i + λ n ĝ 1 T 1i + λ n ĝ 1 T 1i 2λ n ĝ 2 T 2i d 2i r 2i d 2i λ n ĝ 2 T 2i r 2i λ n ĝ 2 T 2i As in McKeague and Zhao 22, there exist exactly two roots λ L < < λ U for φλ L =φλ U =c α [τ 1,τ 2 ] and {λ n : φλ n c α [τ 1,τ 2 ]} =[λ L,λ U ]. Published by The Berkeley Electronic Press, 25 5

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 The confidence set for αt is a closed interval [ α L, α U ], where α L = K 1 t log 1 d 1i / + λ L ĝ 1 T 1i ĝ 1 T 1i + K2 t log 1 d 2i /r 2i λ L ĝ 2 T 2i ĝ 2 T 2i and α U is the same as α L, but with λ L replaced by λ U. Next we state a parallel result for βt. Theorem 2.2. The process 2 log Rβt,t,ĝ 1, ĝ 2, t [τ 1,τ 2 ], converges in distribution to U2 2 t/σratiot 2 in D[τ 1,τ 2 ], where U 2 t is a Gaussian process with mean zero and covariance function covu 2 s,u 2 t = S 1 ss 1 tσ1s t/p 2 1 +S 2 ss 2 tβsβtσ2s 2 t/p 2. It follows that an asymptotic 11 α% confidence band for βt is given by B ratio = {t, βt : 2 log R βt,t,ĝ 1, ĝ 2 K α [τ 1,τ 2 ], t [τ 1,τ 2 ]}, where K α [τ 1,τ 2 ] is the upper α-quantile of the distribution of sup t [τ1,τ 2 ] U2 2 t/σratiot, 2 which can be obtained by simulation, see Section 3. 3 Bootstrap procedure The limiting distributions of the EL statistics in Section 2 are complicated and include unknown parameters, so we need to develop a Monte Carlo method to determine the critical values. To that end we adapt the Gaussian multiplier bootstrap procedure of Lin et al. 1993 for checking the adequacy of the Cox proportional hazards model. First we define some standard counting process notation: N ji t =1 {Zji t,δ ji =1}, Y ji t =1 {Zji t}, and Y j t = n j Y jit is the size of the risk set at t. The processes M ji s =N ji s s α jsy ji s ds are orthogonal locally square integrable martingales, where α j s is hazard rate corresponding to F j [see Andersen et al. 1993, II.4]. From the proof of Theorem 2.1 in the Appendix, and using the martingale representation of n j S j,nj t S j t, the process U 1 t/σ diff t is seen to be asymptotically equivalent to n 1/2 t n1 S 1,n1 t dm t n2 1is + S 2,n2 t dm 2is. ˆσ diff t Y 1 s Y 2 s http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 6

McKeague and Zhao: Comparing Distribution Functions A version of this process that can be simulated is t U1 t = n1/2 n1 S 1,n1 t G 1idN 1i s t + S 2,n2 t ˆσ diff t Y 1 s n2 G 2idN 2i s Y 2 s where G 11,...,G 1n1,G 21,...,G 2n2 are i.i.d. N, 1 random variables independent of the data. Conditional on the data, the limiting distribution of U1 t coincides with that of U 1 t/σ diff t for almost all data sequences. This can be shown by noting that U1 is a Gaussian process with independent components whose covariance converges to that of U 1 /σ diff with probability 1, and by verifying tightness by applying the argument of Lin et al. 1993, Appendix 1 to each of the terms. The bootstrap resampling method is then used to obtain c α [τ 1,τ 2 ]: generate a large number, say L = 3, independent copies U11,...,U 1L of U 1 and take c α [τ 1,τ 2 ]tobe the upper α-quantile of sup U11 2 t,, sup U1Lt. 2 t [τ 1,τ 2 ] t [τ 1,τ 2 ] A similar method gives the critical value K α [τ 1,τ 2 ]: use U 2 t = n 1/2 t S 1,n1 t ˆσ ratio t n1 G 1i dn 1i s t + S 2,n2 t Y 1 s ˆβt as the bootstrap approximation for U 2 t/σ ratio t. n2, G 2i dn 2i s Y 2 s, 4 Numerical example The data come from a Mayo Clinic trial involving a treatment for primary biliary cirrhosis of the liver, see Fleming and Harrington 1991 for discussion. A total of n = 312 patients participated in the randomized clinical trial, 158 receiving the treatment Dpenicillamine and 154 receiving a placebo. Censoring is heavy 187 of the 312 observations are censored. We use τ 1 = 34 and τ 2 = 4427, respectively. Figure 1 displays the proposed confidence band B diff for the difference of the distribution functions placebo minus treatment. The corresponding difference of the Kaplan Meier curves is also displayed. Note that the simultaneous band contains the horizontal line difference =, so there is no evidence of a difference between treatment Published by The Berkeley Electronic Press, 25 7

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 and placebo. As expected, the confidence band is much narrower in the left tail than in the right. Difference of distribution functions -.4 -.2..2.4 Empirical likelihood confidence band Estimated difference of distribution functions 1 2 3 4 Days after start of treatment Figure 1: Mayo Clinic trial, 95% simultaneous confidence band for the difference of distribution functions placebo treatment. Empirical likelihood confidence band Estimated ratio of distribution functions Ratio of distribution functions 2 4 6 1 2 3 4 Days after start of treatment Figure 2: Mayo Clinic trial, 9% simultaneous confidence band for the ratio of distribution functions placebo/treatment. Figure 2 displays the proposed confidence band B ratio for the ratio of the distribu- http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 8

McKeague and Zhao: Comparing Distribution Functions tion functions placebo over treatment. The corresponding estimate of the ratio of the distribution functions is also displayed. Note that the simultaneous band contains the horizontal line ratio = 1, so there is no evidence of a difference between treatment and placebo on the basis of this analysis. The lower bound of confidence band is greater than zero, which is within the range of parameter βt. As expected, the confidence band is much narrower in the right tail than in the left: the distribution function tends to zero in the left tail, so the variance of the ratio estimate blows up. 5 Discussion This article has developed an empirical likelihood approach for comparing the distributions of survival times from two independent samples of right censored survival data in terms of ratios, differences, and other functionals of the underlying distribution functions. Our methods have potential application in epidemiological cohort studies, and in randomized clinical trials for the comparison of treatment and placebo groups. Standard approaches to making such comparisons have been via pointwise confidence intervals expressed in terms of Kaplan Meier estimates, with the standard errors computed by the bootstrap or Greenwood s formula in conjunction with the delta method; see, e.g., Brenner and Hakulinen 25 on the estimation of relative survival rates of cancer patients. We have shown, on the other hand, that it is possible to obtain simultaneous confidence bands in this setting without relying on the Wald approach of centering the confidence band on a point estimate of the target function cf., Parzen et al., 1997. Our approach gives an added perspective to that obtained from other EL-type confidence bands for the comparison of survival distributions: Q-Q plots Einmahl and McKeague, 1999, ratios of survival functions McKeague and Zhao, 22, and P-P plots Claeskens et al., 23, the latter only being available in the absence of censoring. Our proposed confidence bands are computationally intensive compared with the closed form of Wald-type bands because they require the solution of a nonlinear equation at each uncensored survival time, and rely on the Gaussian multiplier simulation technique. For this reason, a simulation study to assess their performance would be Published by The Berkeley Electronic Press, 25 9

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 time-consuming, and we have not carried out such a study for the present article. Nevertheless, based on a previous simulation study of an EL-type confidence band in a survival analysis setting Hollander et al., 1997, we expect that the present EL bands will have significant advantages over their Wald-type counterparts in terms of coverage accuracy and adaption to skewness in the sampling distribution of the point estimates. Acknowledgements Ian McKeague acknowledges partial support under NSF grant DMS-5521. The authors are grateful to the editor and two referees for their helpful comments. Appendix: Proofs We need the following lemma to prove Theorem 2.1. Lemma A.1. Under the assumptions of Theorem 2.1, the Lagrange multiplier solving 2.4 satisfies λ n = λ n t =O P n 1/2 uniformly over [τ 1,τ 2 ]. Proof. First assume λ n t <. Along similar lines as Li 1995, p. 11 12, it can be shown that K 1 t and K 2 t log 1 log 1 d 1i ĝ 1 T 1i + λ n tĝ 1 T 1i K 1 t d 1i K 2 t d 2i d 2i ĝ 2 T 2i r 2i λ n tĝ 2 T 2i r 2i + K 2 t log Combining the above two inequalities and 2.4 we get n 1 n 1 λ n t ĝ 1 T 1i n 2 n 2 + λ n t ĝ 2 T 2i ĝ 1 T 1i, ĝ 2 T 2i 1 d 2i + d 2i ĝ 2 T 2i. r 2i r 2i http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 1

McKeague and Zhao: Comparing Distribution Functions αt + K 1 t K 2 t K 2 t d 1i ĝ 1 T 1i d 2i ĝ 2 T 2i r 2i n 1 n 1 λ n ĝ 1 T 1i n 2 n 2 + λ n ĝ 2 T 2i K 2 t + d 2i ĝ 2 T 2i r 2i ĝ 2 T 2i log 1 d 2i. A.1 r 2i Denote T =[τ 1,τ 2 ], ˆθ j t = K j t ĝ j T ji d ji /r ji, and θ j t = t g jsda j s = F j t, for t T. Let θ j t = K j t log1 d ji /r ji ĝ j T ji. Using 1/1 + x 1 for x and 1/1 x 1+x for x<1, from A.1 we obtain αt+ θ 2 t ˆθ 1 t K 1 t d 1i ĝ 2 1T 1i λ n n 1. A.2 In the case that λ n t, a similar argument leads to K 1 t αt + K 1 t d 1i ĝ 1 T 1i ĝ 1 T 1i log n 1 n 1 + λ n ĝ 1 T 1i K 1 t + 1 d K 2 t 1i d 2i ĝ 2 T 2i + r 2i From A.3, in a similar fashion to A.2, we obtain d 1i ĝ 1 T 1i n 2. A.3 n 2 λ n ĝ 2 T 2i αt+ θ 1 t ˆθ 2 t K 2 t d 2i ĝ 2 2T 2i r 2i λ n n 2. A.4 Next, in terms of the Nelson Aalen estimator Âj of A j,wehave nj ˆθ j t θ j t = t t n j S j,nj s dâjs S j s da j s = n j S j t S j,nj t, from the Volterra integral equation that relates the Nelson Aalen estimator and the Kaplan Meier estimator see Andersen et al., 1993, p. 92, and hence n j ˆθ j t D θ j t,j =1, 2 S j tu j t,j =1, 2 in D[τ 1,τ 2 ] D[τ 1,τ 2 ], where the U j t are Published by The Berkeley Electronic Press, 25 11

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 independent Gaussian martingales with mean zero and varu j s = σj 2 s Andersen et al., 1993, p. 263. By n j /n p j >, it follows that n{[ˆθ1 t θ 1 t] [ˆθ 2 t θ 2 t]} or in terms of θ 1 t θ 2 t =αt, we have D S 1tU 1 t p1 + S 2tU 2 t p2, nˆθ1 t ˆθ 2 t αt D S 1tU 1 t p1 + S 2tU 2 t p2. A.5 Using log1 x+x x 2 for x<1, we have nj ˆθ j t θ j t = K j t n j = max i K j t max i K j t log 1 d K j t ji d ji ĝ j T ji + r ji d ji nj r ji nj r ji Kj t ˆθ j t d ji ĝ j T ji r ji P ĝ j T ji r ji A.6 uniformly in t T, where in the last equality we use d ji = 1 a.s., which is a consequence of the continuity of S j, and in the final step we used the Glivenko Cantelli theorem and the uniform convergence in probability of ˆθ j t toθ j t ont. Combining A.5 and A.6, we have nαt ˆθ1 t+ θ 2 t D S 1tU 1 t p1 + S 2tU 2 t p2, nαt θ1 t+ˆθ 2 t D S 1tU 1 t p1 + S 2tU 2 t p2. Thus the l.h.s. of A.2 and A.4 are O P n 1/2 uniformly for t T. Applying Lenglart s inequality to the martingale integral t ĝk j s dâj A j s cf. Andersen et al., 1993, p. 19, where k 1, shows that it converges to zero uniformly in probability over t T. Since S j,nj t is a uniformly consistent estimator of S j t, we have that ĝj k t is a uniformly consistent estimator of gj k t. Thus, t ĝk j s gj k s da j s uniformly in probability over t T, and K j t d ji ĝ k j T ji r ji P t g k j sda j s A.7 http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 12

McKeague and Zhao: Comparing Distribution Functions uniformly over t T, for j =1, 2 and any fixed k 1. Using n j /n p j >, A.2, A.4, A.7, we find that λ n = O P n 1/2 uniformly for t T. Proof of Theorem 2.1. Let f j λ n = K j t log1 d ji /r ji +λ n ĝ j T ji ĝ j T ji, and recall θ j t = K j t log1 d ji /r ji ĝ j T ji. Then f j = θ j t, f j = γ j t/n j, where γ j t = n j i:t ji t ĝ 2 j T ji d ji r ji r ji d ji. A.8 is a uniformly consistent estimator of γ j t over t [τ 1,τ 2 ]. This is proved in Lemma A.3. For any λ n = O P n 1/2, by a Taylor expansion we have f j 1 j 1 λ n = θ j t+ γ jt 1 j 1 λ n + f j ξ jn λ 2 n, A.9 n j 2 where ξ jn λ n. By A.7 with k =3,f j ξ jn λ 2 n = O P n 1 j. Using n j /n p j >, 2.4 and A.9 we then obtain αt = θ 1 t+ θ 2 t+ γ 1tλ n n 1 + γ 2tλ n n 2 + O P 1. A.1 It follows from A.1 that λ n = nˆσ 2 diff αt θ 1 t+ θ 2 t+o P n 1. A.11 Since λ n ĝ j T ji /r ji d ji = o P 1 and λ n ĝ j T ji /r ji = o P 1 uniformly in i = 1,...,K j τ 2, using a Taylor expansion for 2.3 and A.11 we obtain K 1 t 2 log Rαt,t,ĝ 1, ĝ 2 = λ 2 ĝ 2 K 2 t 1T 1i d 1i n r 1i d 1i + ĝ2t 2 2i d 2i r 2i r 2i d 2i 2λ3 n 3 K 1 t ĝ1t 3 1i 1 d 1i 1 2 r1i 2 Published by The Berkeley Electronic Press, 25 13

+ 2λ3 K 2 t n 3 + λ4 n 2 + λ4 n 2 K 1 t K 2 t ĝ2t 3 2i ĝ1t 4 1i ĝ2t 4 2i 1 r 2i d 2i 2 1 r 2 2i 1 d 1i 1 3 r1i 3 1 r 2i d 2i 1 3 r2i 3 + o P 1 = nˆσ 2 diff αt θ 1 t+ θ 2 t+o P n 1 2 + o P 1, where in the last equality we use A.7 for k =3, 4. Combining A.5, A.6 and the uniform consistency of ˆσ diff 2 t shows that the above process has the limiting distribution indicated in the theorem. In order to prove Theorem 2.2 we need the following lemma. Lemma A.2. Under the assumptions of Theorem 2.2, the Lagrange multiplier solving 2.5 satisfies λ n = λ n t =O P n 1/2 uniformly over [τ 1,τ 2 ]. Proof. The proof follows a similar pattern to the proof of Lemma A.1. First assume λ n t <. Then, as in Li 1995, p. 11 12, K 1 t log 1 The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 d 1i ĝ 1 T 1i + λ n tĝ 1 T 1i K 1 t d 1i n 1 n 1 λ n t ĝ 1 T 1i ĝ 1 T 1i and K 2 t log 1 d 2i ĝ 2 T 2i r 2i βtλ n tĝ 2 T 2i K 2 t K 2 t log 1 d 2i + d 2i ĝ 2 T 2i r 2i r 2i d 2i r 2i n 2 ĝ 2 T 2i n 2 + βt λ n t ĝ 2 T 2i. Combining the above two inequalities and 2.6, using 1/1 + x 1 for x and 1/1 x 1+x for x<1, we obtain ˆθ 1 t+βt θ 2 t K 1 t d 1i ĝ 2 1T 1i λ n n 1. A.12 http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 14

McKeague and Zhao: Comparing Distribution Functions Second, supposing λ n t, a similar argument leads to K 2 t βt d 2i ĝ 2 T 2i r 2i n 2 n 2 βt λ n ĝ 2 T 2i + + K 1 t K 1 t K 1 t In a similar fashion to A.12, from A.13 we obtain K 2 t θ 1 t βtˆθ 2 t β 2 t By n j /n p j >, we have n{[ˆθ1 t θ 1 t] βt[ˆθ 2 t θ 2 t]} D d 1i ĝ 1 T 1i d 1i ĝ 1 T 1i n 1 n 1 + λ n ĝ 1 T 1i ĝ 1 T 1i log 1 d 1i. A.13 d 2i ĝ 2 2T 2i r 2i λ n n 2. S 1tU 1 t p1 + βts 2tU 2 t p2, A.14 or in terms of θ 1 t/θ 2 t =βt, we have D nˆθ1 t βtˆθ 2 t S 1tU 1 t p1 + βts 2tU 2 t p2. A.15 Combining A.6 and A.15 gives nˆθ1 t βt θ 2 t n θ1 t βtˆθ 2 t D D S 1tU 1 t p1 S 1tU 1 t p1 + βts 2tU 2 t p2, + βts 2tU 2 t p2. Thus the l.h.s. of A.12 and A.14 are O P n 1/2. Using n j /n p j >, βt θ 1 τ 1 /θ 2 τ 2 =F 1 τ 1 /F 2 τ 2 >, A.7, A.12, and A.14, we find that λ n = O P n 1/2 uniformly for t T. Proof of Theorem 2.2. This proof is a variation of the proof of Theorem 2.1. Recall f j = γ j t/n j and note that βt θ 1 τ 2 /θ 2 τ 1, t T. For any λ n = O P n 1/2, by a Taylor expansion we have f j β j 1 tλ n = θ j t+ γ jt βt j 1 λ n + f j βt j 1 ξ jn βt 2j 1 λ 2 n, n j 2 A.16 Published by The Berkeley Electronic Press, 25 15

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 where ξ jn λ n. By A.7 with k =3,f j ξ jn λ 2 n = O P n 1 j, and using n j /n p j >, 2.6 and A.16, we obtain = θ 1 t+βt θ 2 t+ γ 1tλ n n 1 + γ 2tβ 2 tλ n n 2 + O P 1. A.17 It follows from A.17 that λ n = nˆσ 2 ratio θ 1 t βt θ 2 t+o P n 1. A.18 Since λ n ĝ j T ji /r ji d ji = o P 1 and λ n ĝ j T ji /r ji = o P 1 uniformly in i = 1,...,K j τ 2, using a Taylor expansion for 2.5 and A.18 we have K 1 t 2 log Rβt,t,ĝ 1, ĝ 2 = λ 2 ĝ 2 K 2 t 1T 1i d 1i n r 1i d 1i + β 2 tĝ2t 2 2i d 2i r 2i r 2i d 2i 2λ3 n 3 K 1 t + 2β3 tλ 3 n 3 + λ4 n 2 K 1 t + β4 tλ 4 n 2 ĝ1t 3 1i K 2 t ĝ1t 4 1i K 2 t 1 d 1i 1 2 r1i 2 ĝ2t 3 2i 1 r 2i d 2i 1 2 r2i 2 1 d 1i 1 3 r1i 3 ĝ2t 4 2i 1 r 2i d 2i 1 3 r2i 3 + o P 1 = nˆσ 2 ratio θ 1 t βt θ 2 t+o P n 1 2 + o P 1, where in the last equality we use A.7 for k =3, 4. Combining A.6, A.15 and the uniform consistency of ˆσ ratiot 2 completes the proof. Lemma A.3. The estimators ˆγ j and γ j defined in 2.8 and A.8, respectively, converge uniformly in probability to γ j over [τ 1,τ 2 ]. Proof. Note that φf j,g j t = t df js/1 G j s, t [,τ 2 ], is a continuous functional of cdfs F j and G j in supremum norm Andersen et al., 1993, Proposition http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 16

McKeague and Zhao: Comparing Distribution Functions II.8.6., p. 113, so the result for ˆγ j follows from the uniform consistency of the Kaplan Meier estimators of F j and G j on [,τ 2 ]. The result for γ j can be proved by adapting the argument on pages 191 192 of Andersen et al. 1993. References Andersen, P.K., Borgan, O., Gill, R.D., and Keiding, N. 1993 Statistical Models Based on Counting Processes, New York: Springer-Verlag. Brenner, H., and Hakulinen, T. 25 Substantial overestimation of standard errors of relative survival rates of cancer patients. Amer. J. Epidemiol., 161, 781 786. Claeskens, G., Jing, B.-Y., Peng, L., and Zhou, W. 23 Empirical likelihood confidence regions for comparison distributions and ROC curves. Canadian J. Statist., 31, 173 19. Einmahl, J.H., and McKeague, I.W. 1999 Confidence tubes for multiple quantile plots via empirical likelihood. Ann. Statist., 27, 1348 1367. Fleming, T.R., and Harrington, D.P. 1991 Counting Processes and Survival Analysis, New York: Wiley. Hjort, N. L., McKeague, I. W., and Van Keilegom, I. 25 Extending the scope of empirical likelihood. Unpublished manuscript. Hollander, M., McKeague, I. W., and Yang, J. 1997 Likelihood ratio based confidence bands for survival functions. J. Amer. Statist. Assoc., 92, 215 226. Li, G. 1995 On nonparametric likelihood ratio estimation of survival probabilities for censored data. Statist. Probab. Lett., 25, 95 14. Lin, D.Y., Wei, L.J., and Ying, Z. 1993 Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika, 8, 557 572. McKeague, I.W., and Zhao, Y. 22 Simultaneous confidence bands for ratios of survival functions via empirical likelihood. Statist. Probab. Lett., 6, 45 415. Murphy, S. 1995 Likelihood ratio-based confidence intervals in survival analysis. J. Amer. Statist. Assoc., 9, 1399 145. Owen, A. 1988 Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75, 237 249. Owen, A. 199 Empirical likelihood and confidence regions. Ann. Statist., 18, 9 12. Owen, A. 21 Empirical Likelihood, Chapman & Hall: New York. Published by The Berkeley Electronic Press, 25 17

The International Journal of Biostatistics, Vol. 1 [25], Iss. 1, Art. 5 Pan, X. R., and Zhou, M. 22 Empirical likelihood ratio in terms of cumulative hazard function for censored data. J. Multi. Anal., 8, 166 188. Parzen, M.I., Wei, L.J., and Ying, Z. 1997 Simultaneous confidence intervals for the difference of two survival functions. Scand. J. Statist., 24, 39 314. Thomas, D.R., and Grunkemeier, G.L. 1975 Confidence interval estimation for survival probabilities for censored data. J. Amer. Statist. Assoc., 7, 865 871. http://www.bepress.com/ijb/vol1/iss1/5 DOI: 1.222/1557-4679.17 18