Bios 323: Applied Survival Analysis
Qingxia (Cindy) Chen
Chapter 4, Fall 2012

4.2 Estimators of the survival and cumulative hazard functions for RC data

Suppose $X$ is a continuous random failure time with survival function $S(t)$ and cumulative hazard function $H(t)$. $X$ may be subject to noninformative right censoring, hence we observe $\{(T_i, \delta_i),\ i = 1, \ldots, n\}$, where $T_i = \min(X_i, C_i)$ and $\delta_i = I(T_i = X_i)$.

Notation:
  $t_1 < t_2 < \cdots < t_D$: the $D$ unique death times
  $d_j$ = # deaths at $t_j$ = $\sum_{i=1}^{n} I(T_i = t_j, \delta_i = 1)$, $j = 1, \ldots, D$
  $Y_j$ = # at risk/alive at $t_j$ = $\sum_{i=1}^{n} I(T_i \ge t_j)$

Kaplan-Meier estimator for $S(t)$:

$$\hat{S}(t) = \begin{cases} 1, & t < t_1, \\ \prod_{t_j \le t} \left(1 - d_j / Y_j\right), & t_1 \le t. \end{cases} \qquad (4.2.1)$$

Greenwood's variance estimator for $\hat{S}(t)$:

$$\hat{V}[\hat{S}(t)] = \hat{S}(t)^2 \sum_{t_j \le t} \frac{d_j}{Y_j (Y_j - d_j)}. \qquad (4.2.2)$$

Nelson-Aalen estimator for $H(t)$:

$$\tilde{H}(t) = \begin{cases} 0, & t < t_1, \\ \sum_{t_j \le t} d_j / Y_j, & t_1 \le t. \end{cases} \qquad (4.2.3)$$

Aalen's variance estimator for $\tilde{H}(t)$:

$$\sigma_H^2(t) = \sum_{t_j \le t} \frac{d_j}{Y_j^2}. \qquad (4.2.4)$$
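The four formulas (4.2.1)-(4.2.4) can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the helper name `km_na` is hypothetical, and the seven observations are the ones used in the R/Splus example of these notes.

```r
# Hypothetical helper: Kaplan-Meier and Nelson-Aalen estimates at the distinct death times.
km_na <- function(time, status) {
  tj <- sort(unique(time[status == 1]))                        # t_1 < ... < t_D
  d  <- sapply(tj, function(t) sum(time == t & status == 1))   # d_j: deaths at t_j
  Y  <- sapply(tj, function(t) sum(time >= t))                 # Y_j: number at risk at t_j
  S    <- cumprod(1 - d / Y)                                   # (4.2.1)
  varS <- S^2 * cumsum(d / (Y * (Y - d)))                      # (4.2.2); NaN once S reaches 0
  H    <- cumsum(d / Y)                                        # (4.2.3)
  varH <- cumsum(d / Y^2)                                      # (4.2.4)
  data.frame(time = tj, d = d, Y = Y,
             S = S, se_S = sqrt(varS), H = H, se_H = sqrt(varH))
}

# Observations taken from the R/Splus example in these notes (1 = death, 0 = censored).
time   <- c(1.5, 2.5, 1.4, 6.2, 2.8, 5.3, 4.5)
status <- c(1, 0, 0, 1, 1, 1, 0)

est <- km_na(time, status)
print(round(est, 3))
```

The Greenwood standard error is undefined at the last death time here because $Y_j = d_j$ there, so the sketch returns NaN at that point rather than a number.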
An alternative estimator for $H(t)$ is $\hat{H}(t) = -\log \hat{S}(t)$, and an alternative estimator for $S(t)$ is $\tilde{S}(t) = \exp\{-\tilde{H}(t)\}$.

$\hat{S}$ is also called the product-limit estimator. As $n \to \infty$, $D \to \infty$, and the number of terms in the product $\to \infty$ for continuous $X$. For any fixed $n < \infty$, $D < \infty$, but in the limit the product involves the whole support of $X$.

Example: Continuous $X$ is subject to right censoring. Let $T = \min(X, C)$ and $\delta = I(X < C)$. A random sample of size 5 is given in the table.

    T_i       0.5    1    0.75    0.25    0.75
    delta_i   1      1    1       0       0

Question: What are $\hat{S}$, $\tilde{H}$ and $\tilde{S}$?

Remarks
  $\hat{S}$, $\tilde{H}$ and $\hat{H}$ are right continuous, even though $S$ and $H$ may be continuous.
  $\hat{S}$ and $\tilde{S}$ are consistent for $S$, for all $t \le \tau$ such that $P(T \ge \tau) > 0$.
  $\hat{S}$ and $\tilde{S}$ are asymptotically equivalent.

Recall that $\hat{S}$ and $\tilde{H}$ are only defined for $t \le \max(T_i)$. Some methods to extend $\hat{S}$ to $t > \max(T_i) = t_{\max}$:
  Efron (1967): $\hat{S}(t) = 0$, $t > t_{\max}$
  Gill (1980): $\hat{S}(t) = \hat{S}(t_{\max})$, $t > t_{\max}$
  Brown, Hollander and Korwar (1974): $\hat{S}(t) = \exp\{t \log[\hat{S}(t_{\max})] / t_{\max}\}$, $t > t_{\max}$

Example 4.1: We consider the data in Section 1.2 on the time to relapse of patients in a clinical trial of 6-MP against a placebo. The study included 42 children with acute leukemia who had a complete or partial remission of their leukemia induced by treatment with a drug. In the maintenance phase, 6-MP was used to prevent relapse, and its efficacy was of interest. The trial was conducted by matching pairs of patients by remission status and randomizing within each pair to either 6-MP or placebo maintenance therapy. Patients were followed until relapse or until the end of the study. For now we consider only the 6-MP patients and would like to estimate $S(t)$ and $H(t)$.

4.3 Pointwise confidence intervals for the survival function

A $100(1 - \alpha)\%$ linear confidence interval for $S(t)$ is

$$\left( \hat{S}(t) - Z_{1-\alpha/2}\, \sigma_S(t)\, \hat{S}(t),\ \ \hat{S}(t) + Z_{1-\alpha/2}\, \sigma_S(t)\, \hat{S}(t) \right), \qquad (4.3.1)$$

where

$$\sigma_S^2(t) = \hat{V}[\hat{S}(t)] / \hat{S}^2(t) = \sum_{t_j \le t} \frac{d_j}{Y_j (Y_j - d_j)}.$$

Alternative: a CI based on some transformation $g$.
  Consider $g(\hat{S})$, where $g$ is a known, monotone and differentiable function.
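  Construct a $100(1 - \alpha)\%$ CI for $g(S(t))$:
  $$g(\hat{S}(t)) \pm Z_{1-\alpha/2}\, |g'(\hat{S}(t))|\, \hat{S}(t)\, \sigma_S(t)$$
  Retransform to find a $100(1 - \alpha)\%$ CI for $S(t)$:
  $$g^{-1}\left\{ g(\hat{S}(t)) \pm Z_{1-\alpha/2}\, |g'(\hat{S}(t))|\, \hat{S}(t)\, \sigma_S(t) \right\}$$

Why does transforming help?

A $100(1 - \alpha)\%$ arcsine-square root confidence interval for $S(t)$ is

$$\sin^2\left\{ \max\left[ 0,\ \arcsin(\hat{S}(t)^{1/2}) - 0.5\, Z_{1-\alpha/2}\, \sigma_S(t) \left( \frac{\hat{S}(t)}{1 - \hat{S}(t)} \right)^{1/2} \right] \right\} \le S(t) \le \sin^2\left\{ \min\left[ \frac{\pi}{2},\ \arcsin(\hat{S}(t)^{1/2}) + 0.5\, Z_{1-\alpha/2}\, \sigma_S(t) \left( \frac{\hat{S}(t)}{1 - \hat{S}(t)} \right)^{1/2} \right] \right\}. \qquad (4.3.3)$$

A $100(1 - \alpha)\%$ log-transformed (more commonly called log-log transformed) confidence interval for $S(t)$ is

$$\left( \hat{S}(t)^{1/\theta},\ \hat{S}(t)^{\theta} \right), \quad \text{where } \theta = \exp\left\{ \frac{Z_{1-\alpha/2}\, \sigma_S(t)}{\log[\hat{S}(t)]} \right\}. \qquad (4.3.2)$$

Klein and Moeschberger: $\log\{H(t)\} = \log\{-\log(S(t))\}$, so
  $g(u) = \log\{-\log(u)\}$, $g'(u) = \dfrac{1}{u \log(u)}$ and $g^{-1}(u) = \exp\{-\exp(u)\}$.
  95% CI for $g(S(t))$: $\log\{-\log(\hat{S}(t))\} \pm 1.96\, \dfrac{\sigma_S(t)}{\log(\hat{S}(t))}$
  Retransform to get the 95% CI for $S(t)$ in (4.3.2).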
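The retransformation step can be written out explicitly. The short derivation below is a sketch that uses only the definitions of $g$, $g^{-1}$ and $\theta$ given above; the shorthand $c$ is introduced here for convenience.

```latex
% Shorthand (introduced for this sketch): c = Z_{1-\alpha/2}\,\sigma_S(t) / \log\hat{S}(t), so \theta = e^{c}.
\begin{align*}
g^{-1}\bigl\{ g(\hat{S}(t)) \pm c \bigr\}
  &= \exp\bigl\{ -\exp\bigl[ \log\{-\log \hat{S}(t)\} \pm c \bigr] \bigr\} \\
  &= \exp\bigl\{ -\bigl(-\log \hat{S}(t)\bigr)\, e^{\pm c} \bigr\} \\
  &= \exp\bigl\{ e^{\pm c} \log \hat{S}(t) \bigr\}
   = \hat{S}(t)^{\,e^{\pm c}} .
\end{align*}
% With \theta = e^{c} the two endpoints are \hat{S}(t)^{\theta} and \hat{S}(t)^{1/\theta}.
% Since 0 < \hat{S}(t) < 1 gives \log\hat{S}(t) < 0, we have c < 0 and \theta < 1,
% hence \hat{S}(t)^{1/\theta} < \hat{S}(t) < \hat{S}(t)^{\theta}.
```

Both endpoints are powers of $\hat{S}(t)$, so the retransformed interval always stays inside $(0, 1)$, which the linear interval (4.3.1) does not guarantee.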
Exercise: What is a $100(1 - \alpha)\%$ log-transformed CI for $S(t)$? (Take $g(u) = \log(u)$.)

Both the log-log and arcsine-square root transformations give CIs which have better coverage than the linear CI.

R/Splus code:

    library(survival)
    T = c(1.5, 2.5, 1.4, 6.2, 2.8, 5.3, 4.5)   # observed times
    ind = c(1, 0, 0, 1, 1, 1, 0)               # 1 = death, 0 = censored
    # Surv() is capitalized; current versions of survival need a formula with "~ 1"
    fit1 = survfit(Surv(T, ind) ~ 1, type = "kaplan-meier", error = "greenwood",
                   conf.int = 0.95, conf.type = "log")
    summary(fit1)

Output:

    > summary(fit1)
    Call: survfit(formula = Surv(T, ind))

     time n.risk n.event survival std.err lower 95% CI upper 95% CI
      1.5      6       1    0.833   0.152        0.583            1
      2.8      4       1    0.625   0.213        0.320            1
      5.3      2       1    0.313   0.245        0.067            1
      6.2      1       1    0.000      NA           NA           NA
conf.type = "none" (suppress CI), "plain" (linear), "log" (default, g(u) = log(u)), "log-log" (today's recommendation)
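To connect the `conf.type` options with the formulas of Section 4.3, the base-R sketch below computes the four pointwise 95% intervals by hand at $t = 2.8$, using the `n.risk` and `n.event` columns of the output shown above. It assumes that the "log" interval is $\hat{S}(t) \exp\{\pm Z_{1-\alpha/2}\, \sigma_S(t)\}$, which is what the general recipe gives for $g(u) = \log(u)$, and that limits are truncated to $[0, 1]$; the helper `clip` is hypothetical.

```r
z <- qnorm(0.975)

# Deaths and numbers at risk at the death times up to t = 2.8 (from the summary output).
d <- c(1, 1)
Y <- c(6, 4)

S   <- prod(1 - d / Y)                  # Kaplan-Meier estimate at t = 2.8
sig <- sqrt(sum(d / (Y * (Y - d))))     # sigma_S(t), so that se = S * sig

clip <- function(x) pmin(1, pmax(0, x)) # hypothetical helper: truncate limits to [0, 1]

# Linear interval (4.3.1), conf.type = "plain"
ci_plain <- clip(S + c(-1, 1) * z * sig * S)

# Log interval, g(u) = log(u), conf.type = "log"
ci_log <- clip(S * exp(c(-1, 1) * z * sig))

# Log-log interval (4.3.2), conf.type = "log-log"
theta     <- exp(z * sig / log(S))
ci_loglog <- c(S^(1 / theta), S^theta)

# Arcsine-square root interval (4.3.3)
half    <- 0.5 * z * sig * sqrt(S / (1 - S))
ci_asin <- c(sin(max(0,      asin(sqrt(S)) - half))^2,
             sin(min(pi / 2, asin(sqrt(S)) + half))^2)

print(round(rbind(plain = ci_plain, log = ci_log,
                  loglog = ci_loglog, arcsine = ci_asin), 3))
```

The lower limit of the log interval agrees with the 0.320 reported by `summary(fit1)`, and its upper limit is truncated at 1, as in the output.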
4.4 Confidence bands for the survival function

Here we find $L(t)$ and $U(t)$ such that

$$P\{L(t) \le S(t) \le U(t), \text{ for all } t_L \le t \le t_U\} = 1 - \alpha,$$

where $t_L$ is the smallest observed event time and $t_U$ is the largest observed event time. $[L(t), U(t)]$ is then called a $100(1 - \alpha)\%$ confidence band for $S(t)$, $t_L \le t \le t_U$.

Define

$$a_L = \frac{n \sigma_S^2(t_L)}{1 + n \sigma_S^2(t_L)} \quad \text{and} \quad a_U = \frac{n \sigma_S^2(t_U)}{1 + n \sigma_S^2(t_U)},$$

which satisfy $0 < a_L < a_U < 1$.

Equal probability (EP) bands for $S(t)$, $t_L \le t \le t_U$: From Table C.3 in Appendix C, find the confidence coefficient $c_\alpha(a_L, a_U)$ and construct EP bands over the range $[t_L, t_U]$ by replacing $Z_{1-\alpha/2}$ in (4.3.1)-(4.3.3) with $c_\alpha(a_L, a_U)$.

Hall and Wellner (HW) confidence bands for $S(t)$, $t_L \le t \le t_U$: Replace $Z_{1-\alpha/2}\, \sigma_S(t)$ in (4.3.1)-(4.3.3) with $k_\alpha(a_L, a_U)\, [1 + n \sigma_S^2(t)]\, n^{-1/2}$, where $k_\alpha(a_L, a_U)$ can be found in Table C.4 in Appendix C.

Comparisons of different confidence bands for $S(t)$:

    EP         Sample size      HW         Sample size
    Linear     200              Linear     20
    Log        20               Log        20
    Arcsine    20               Arcsine    20
4.5 Point and interval estimates of the mean and median survival time

Mean survival time: $\mu = E(X) = \int_0^\infty S(t)\,dt$. Naturally, $\hat{\mu} = \int_0^\infty \hat{S}(t)\,dt$.

$\hat{S}(t)$ is not defined beyond the last observation if that observation is censored. However, the right tail of $\hat{S}(t)$ can be completed.

First solution: If the last observation is censored and we use Efron's method, we obtain the estimate $\int_0^{t_{\max}} \hat{S}(t)\,dt$.

Second solution: Estimate the mean restricted to a preassigned interval $[0, \tau]$, where $\tau$ can be the largest observation or some value prespecified by the investigators:

$$\hat{\mu}_\tau = \int_0^\tau \hat{S}(t)\,dt. \qquad (4.5.1)$$

The variance of this estimator is

$$\hat{V}(\hat{\mu}_\tau) = \sum_{j=1}^{D} \left[ \int_{t_j}^{\tau} \hat{S}(t)\,dt \right]^2 \frac{d_j}{Y_j (Y_j - d_j)}. \qquad (4.5.2)$$

A $100(1 - \alpha)\%$ CI for the mean is

$$\hat{\mu}_\tau \pm Z_{1-\alpha/2} \sqrt{\hat{V}(\hat{\mu}_\tau)}. \qquad (4.5.3)$$

Example 4.1 continued: Example 4.1 uses the data in Section 1.2 on the time to relapse of patients in a clinical trial of 6-MP against a placebo. We consider estimating the mean survival time and its standard error for the 6-MP patients based on the following product-limit estimator of $S(t)$.
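    t_j    d_j    Y_j    S-hat(t_j)
     6      3     21     0.85714
     7      1     17     0.80672
    10      1     15     0.75294
    13      1     12     0.69020
    16      1     11     0.62745
    22      1      7     0.53782
    23      1      6     0.44818

Recall that the $p$th quantile of $X$ with survival function $S(x)$ is defined by $x_p = \inf\{t : S(t) \le 1 - p\}$. When $p = 0.5$, $x_p$ is the median time to the event of interest. It is estimated by $\hat{x}_p = \inf\{t : \hat{S}(t) \le 1 - p\}$.

A $100(1 - \alpha)\%$ CI for $x_p$, based on the linear CI, is the set of all time points $t$ which satisfy

$$-Z_{1-\alpha/2} \le \frac{\hat{S}(t) - (1 - p)}{\hat{V}^{1/2}[\hat{S}(t)]} \le Z_{1-\alpha/2}. \qquad (4.5.4)$$

A $100(1 - \alpha)\%$ CI for $x_p$, based on the log-transformed CI, is the set of all time points $t$ satisfying

$$-Z_{1-\alpha/2} \le \frac{\left[ \log\{-\log(\hat{S}(t))\} - \log\{-\log(1 - p)\} \right] \hat{S}(t) \log(\hat{S}(t))}{\hat{V}^{1/2}[\hat{S}(t)]} \le Z_{1-\alpha/2}. \qquad (4.5.5)$$

A $100(1 - \alpha)\%$ CI for $x_p$, based on the arcsine-square root CI, is the set of all time points $t$ satisfying

$$-Z_{1-\alpha/2} \le \frac{2 \left\{ \arcsin\left( \sqrt{\hat{S}(t)} \right) - \arcsin\left( \sqrt{1 - p} \right) \right\} \{\hat{S}(t)(1 - \hat{S}(t))\}^{1/2}}{\hat{V}^{1/2}[\hat{S}(t)]} \le Z_{1-\alpha/2}. \qquad (4.5.6)$$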
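As a rough numerical companion to Example 4.1 continued, the base-R sketch below evaluates (4.5.1)-(4.5.3) and the median estimate from the 6-MP table above. The restriction point $\tau = 35$ weeks is an assumption made for this sketch (the notes do not state it); all variable names are illustrative.

```r
# Death times, deaths and numbers at risk from the 6-MP product-limit table.
tj <- c(6, 7, 10, 13, 16, 22, 23)
d  <- c(3, 1, 1, 1, 1, 1, 1)
Y  <- c(21, 17, 15, 12, 11, 7, 6)
tau <- 35                               # ASSUMPTION: restriction point, not given in the notes

S <- cumprod(1 - d / Y)                 # Kaplan-Meier estimate at each t_j

# S-hat is a step function: 1 on [0, t_1), S[j] on [t_j, t_{j+1}), S[D] on [t_D, tau].
knots  <- c(0, tj, tau)
height <- c(1, S)
pieces <- diff(knots) * height          # area of each rectangle

mu_tau <- sum(pieces)                   # restricted mean (4.5.1)

# A[j] = integral of S-hat from t_j to tau = sum of the rectangles to the right of t_j.
A      <- rev(cumsum(rev(pieces[-1])))
var_mu <- sum(A^2 * d / (Y * (Y - d)))  # variance (4.5.2)
se_mu  <- sqrt(var_mu)
ci_mu  <- mu_tau + c(-1, 1) * qnorm(0.975) * se_mu   # 95% CI (4.5.3)

# Median estimate: smallest death time at which S-hat drops to 0.5 or below.
median_hat <- min(tj[S <= 0.5])

print(round(c(mu_tau = mu_tau, se = se_mu, lower = ci_mu[1], upper = ci_mu[2]), 3))
print(median_hat)
```

Under the assumed $\tau = 35$, the restricted mean comes out at about 23.29 weeks with a standard error of about 2.83, and the estimated median is 23 weeks, the first time at which $\hat{S}(t) \le 0.5$.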
Example 4.2: This example uses the data described in Section 1.3 on bone marrow transplantation for leukemia. We shall estimate the median disease-free survival time for the ALL group and construct 95% confidence intervals for it.

4.6 Estimators of the survival function for left-truncated and right-censored data

Right-censored and left-truncated data: $\{(L_i, T_i, \delta_i),\ i = 1, \ldots, n\}$, where $L_i$ is the age at entry into the study and $T_i$ is the death or censoring time.

Let the death times be $t_1 < t_2 < \cdots < t_D$. At time $t_j$, $j = 1, \ldots, D$, let $d_j$ = number of deaths and let

$$Y_j = \sum_{i=1}^{n} I(L_i \le t_j \le T_i).$$

Using this modified $Y_j$ for the LT data, all the estimation procedures defined in Sections 4.2-4.4 are applicable.

Note that we are estimating the conditional survival function $S_L(t) = P(X > t \mid X > L)$. Also, the Nelson-Aalen estimator estimates $H_L(t) = -\ln P(X > t) + \ln P(X > L)$. The conditional hazard rate is $H_L'(t) = h(t)$.

Example 4.3: A survival study of residents of the Channing House retirement center located in California. Here the truncation times are the ages, in months, at which individuals entered the community. We want to estimate the conditional survival function.

First look at the males. The first subject entered the center at age 751 and died at 777, the second entered the study at age 759 and died at 781, and the third subject appeared at age 782. If we used the K-M estimator directly to estimate the conditional survival function, what is $\hat{S}_L(t)$?
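The question can be explored numerically with only the three male subjects described above. The base-R sketch below is a minimal illustration: the exit time of the third subject is not given in the notes, so `Inf` is used as a placeholder, which does not affect the risk sets at ages 777 and 781.

```r
# First three male subjects of Example 4.3 (ages in months).
entry <- c(751, 759, 782)
exit  <- c(777, 781, Inf)     # Inf = placeholder: third subject's exit time is not given
died  <- c(TRUE, TRUE, FALSE)

tj <- sort(unique(exit[died]))                            # death times 777 and 781

# Left-truncated risk set: entered by t_j and still under observation at t_j.
d <- sapply(tj, function(t) sum(exit == t & died))
Y <- sapply(tj, function(t) sum(entry <= t & t <= exit))

S_L <- cumprod(1 - d / Y)                                 # product-limit with modified Y_j

print(data.frame(age = tj, d = d, Y = Y, S_L = S_L))
```

With so few subjects at risk at the earliest ages, the second death empties the risk set and the product-limit estimate is already zero at age 781, so it stays at zero for all later ages no matter how many subjects enter afterwards.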
Instead, we estimate $S_a(t) = P(X > t \mid X \ge a)$, where $t > a$:

$$\hat{S}_a(t) = \prod_{a \le t_j \le t} \left(1 - d_j / Y_j\right), \quad t \ge a,$$

$$\hat{V}[\hat{S}_a(t)] = [\hat{S}_a(t)]^2 \sum_{a \le t_j \le t} \frac{d_j}{Y_j (Y_j - d_j)},$$

$$\hat{H}_a(t) = \sum_{a \le t_j \le t} \frac{d_j}{Y_j}, \quad t > a.$$

4.7 Summary curves for competing risks

Example: Bone marrow transplantation in Section 1.3. Suppose we are interested in the time to treatment failure, which includes death in remission or relapse, whichever comes first. If death in remission is of primary interest, then relapse is the competing risk.

Competing risk data: $\{(T_i, \delta_i),\ i = 1, \ldots, n\}$, where $T_i = \min(X_{1i}, X_{2i}, C_i)$ and

$$\delta_i = \begin{cases} 1 & \text{if } T_i = X_{1i}, \\ 2 & \text{if } T_i = X_{2i}, \\ 0 & \text{if } T_i = C_i. \end{cases}$$

Can we still apply the K-M estimator to competing risk data?

In the competing risks setting, the standard Kaplan-Meier method is commonly applied to type $k$ events in one of two ways:

  KM-C considers only the first event if it is of type $k$ and censors the non-type $k$ events. In the presence of dependence, KM-C results in an inflated estimate of the cause-specific failure probability.

  KM-I considers any event of type $k$ while ignoring all other non-type $k$ events.
  As KM-I analyses are conducted for each cause of failure, the resulting component failure probabilities may exceed the total probability of failure.

For cause $k$, the cumulative incidence function is defined as

$$F_k(t) = P(X \le t, \delta = k) = \int_0^t h_k(u)\, S(u-)\, du,$$

where $S(u) = P(X > u)$ and

$$h_k(u) = \lim_{\Delta u \to 0} \frac{1}{\Delta u} P(u \le X < u + \Delta u,\ \delta = k \mid X \ge u).$$

Let $t_1 < t_2 < \cdots < t_M$ be the distinct times where one of the competing risks occurs. At time $t_j$, let
  $Y_j$ = the number of subjects at risk at $t_j$,
  $r_j$ = the number of subjects with an occurrence of the event of interest at $t_j$,
  $d_j$ = the number of subjects with an occurrence of any of the other events of interest at $t_j$.

$F_k(t)$ can be estimated by

$$\hat{F}_k(t) = \begin{cases} 0 & \text{if } t < t_1, \\ \sum_{t_i \le t} \left[ \prod_{j=1}^{i-1} \left( 1 - \frac{d_j + r_j}{Y_j} \right) \right] \frac{r_i}{Y_i} = \sum_{t_i \le t} \hat{S}(t_i-)\, \frac{r_i}{Y_i} & \text{if } t \ge t_1, \end{cases}$$

where $\hat{S}(t_i-)$ is the K-M estimator evaluated just before $t_i$. The variance of $\hat{F}_k(t)$ is given in (4.7.2) in the textbook.

This estimator and its inference have been implemented in an R package called cmprsk; the specific function name is cuminc. They are also implemented in a SAS macro: http://www.mcw.edu/biostatistics/research/software.htm
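The estimator $\hat{F}_k(t)$ can be sketched directly in base R, without the cmprsk package. Everything below is illustrative: the six observations are hypothetical, and the helper names `cum_inc` and `km_c` are made up for this sketch. It compares the cumulative incidence estimate with the complement of the KM-C curve for the same cause.

```r
# HYPOTHETICAL sample: cause 0 = censored, 1 and 2 = the two competing event types.
time  <- c(1, 2, 3, 4, 5, 6)
cause <- c(1, 2, 0, 1, 2, 1)

# Cumulative incidence estimate for cause k at the distinct event times.
cum_inc <- function(time, cause, k) {
  tj <- sort(unique(time[cause != 0]))                               # t_1 < ... < t_M
  Y  <- sapply(tj, function(t) sum(time >= t))                       # at risk
  r  <- sapply(tj, function(t) sum(time == t & cause == k))          # events of interest
  d  <- sapply(tj, function(t) sum(time == t & cause != 0 & cause != k))  # other events
  S_after  <- cumprod(1 - (d + r) / Y)       # overall K-M just after each t_i
  S_before <- c(1, head(S_after, -1))        # overall K-M just before each t_i
  data.frame(time = tj, cif = cumsum(S_before * r / Y))
}

# KM-C for cause k: other causes are treated as censored; returns 1 - KM.
km_c <- function(time, cause, k) {
  tj <- sort(unique(time[cause == k]))
  Y  <- sapply(tj, function(t) sum(time >= t))
  r  <- sapply(tj, function(t) sum(time == t & cause == k))
  data.frame(time = tj, one_minus_km = 1 - cumprod(1 - r / Y))
}

F1 <- cum_inc(time, cause, 1)
F2 <- cum_inc(time, cause, 2)
print(F1)
print(F2)
print(km_c(time, cause, 1))
```

In this hypothetical sample the two cumulative incidence curves add up to one minus the overall K-M estimate at every event time, whereas the complement of KM-C for cause 1 lies above $\hat{F}_1(t)$ from the second cause-1 event onward.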