Linear life expectancy regression with censored data


By Y. Q. CHEN
Program in Biostatistics, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, U.S.A.

and S. CHENG
Department of Epidemiology & Biostatistics, University of California, San Francisco, California 94143, U.S.A.

Summary

In the statistical literature, life expectancy is usually characterised by the mean residual life function. Regression models are thus needed to study the association between mean residual life functions and their covariates. In this article, we consider a linear mean residual life model and develop inference procedures in the presence of potential censoring. The new model and inference procedures are applied to the Stanford heart transplant data. Semiparametric efficiency calculations and the information bound are also considered.

Some key words: Additive model; Counting process; Estimating equation; Mean residual life; Semiparametric model

1 Introduction

Suppose that failure time $T$ is a nonnegative random variable on a probability space $(\Omega, \mathcal{F}, P)$. In the statistical literature, the life expectancy of $T$ at time $t > 0$, $e(t)$ say, is usually characterised and studied by way of its mean residual life function, $m(t) = E(T - t \mid T > t) = e(t) - t$. When covariates are present, the proportional mean residual life model of Oakes & Dasu (1990) can be used to study the association between $m(t)$ and the covariates:

$$m(t \mid Z) = m_0(t) \exp(\beta^T Z), \tag{1}$$

where $Z$ is a $p$-vector of covariates and $\beta$ is the associated parameter vector. To estimate $\beta$ in (1), Maguluri & Zhang (1994) and Oakes & Dasu (2003) studied estimation procedures when there is no censoring. Recently, Chen & Cheng (2005) developed quasi partial score estimating equations for the case where censoring is present and $m_0(\cdot)$ is unspecified.

As a result of the nature of $e(t)$, the shape of $m(t)$ has an intrinsic constraint, in that $e(t) = m(t) + t$ should be monotonically nondecreasing in $t \ge 0$. In the proportional mean residual life model, however, $m(t \mid Z) + t = m_0(t) \exp(\beta^T Z) + t$ may not satisfy this constraint for certain $\beta \in R^p$, unless $m_0(\cdot)$ itself is monotonically nondecreasing, as pointed out in Oakes & Dasu (1990). A monotonically nondecreasing $m_0(\cdot)$, although plausible mathematically, may not be compatible with the actual underlying process, in the case of human ageing, for example. To cope with this constraint, we consider instead a linear mean residual life model,

$$m(t \mid Z) = m_0(t) + \beta^T Z. \tag{2}$$

The additive structure in model (2) evidently complies with the intrinsic constraint, since $m(t \mid Z) + t$ differs from $m_0(t) + t$ only by a term that is constant in $t$. Moreover, model (2) is equivalent to $e(t \mid Z) = e_0(t) + \beta^T Z$. Thus, in practice, the parameter $\beta$ in model (2) can be interpreted as the average difference in life expectancy per unit change in $Z$. For example, when $Z = 0/1$ is the treatment indicator in a randomised clinical trial, $\beta$ can be considered as the treatment effect measured in terms of life expectancy.

In general, $m_0(t) + \beta^T Z$ is required to be nonnegative.
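The monotonicity constraint discussed above can be checked numerically. The following is a minimal sketch, assuming a hypothetical baseline $m_0(t) = 1 + e^{-t}$ (chosen only because $t + m_0(t)$ is nondecreasing while $m_0$ itself is decreasing); the function names, the multiplicative effect 3 and the additive shift 2 are illustrative and not taken from the article.

```python
import math

def baseline_mrl(t):
    """Hypothetical baseline m_0(t) = 1 + exp(-t); t + m_0(t) is nondecreasing."""
    return 1.0 + math.exp(-t)

def life_expectancy_is_monotone(mrl, grid):
    """Check that e(t) = t + m(t) is nondecreasing along an increasing grid."""
    values = [t + mrl(t) for t in grid]
    return all(b >= a - 1e-12 for a, b in zip(values, values[1:]))

grid = [0.01 * k for k in range(1001)]  # t in [0, 10]
effect = 3.0  # plays the role of exp(beta'Z) in model (1)
shift = 2.0   # plays the role of beta'Z in model (2)

baseline_ok = life_expectancy_is_monotone(baseline_mrl, grid)
proportional_ok = life_expectancy_is_monotone(
    lambda t: baseline_mrl(t) * effect, grid)
linear_ok = life_expectancy_is_monotone(
    lambda t: baseline_mrl(t) + shift, grid)

print(baseline_ok, proportional_ok, linear_ok)
```

With this baseline, the proportional form gives $e(t \mid Z) = t + 3(1 + e^{-t})$, whose derivative $1 - 3e^{-t}$ is negative for $t < \log 3$, whereas the additive form only shifts $e_0(t)$ by a constant.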

To apply the new linear model in real applications, it is desirable that $m_0(\cdot)$ be left unspecified, without restrictive parametric assumptions, and that censoring of the survival outcomes be accommodated. The rest of this article develops and studies inference procedures for the new linear model.

2 Estimation and Inference

2.1 Estimation of baseline function

Suppose that there are $n$ subjects in the dataset, and let $T_i$ and $C_i$ be the failure and censoring times, respectively, for $i = 1, 2, \ldots, n$. The observed dataset thus consists of $n$ independent and identically distributed copies $\{(X_i, \Delta_i, Z_i); i = 1, 2, \ldots, n\}$, where $X_i = \min(T_i, C_i)$, $\Delta_i = I(T_i \le C_i)$ and $Z_i$ are the covariates. Given $Z_i$, $T_i$ and $C_i$ are assumed independent. Where no confusion arises, subscripts are occasionally suppressed. Let $N(t) = I(X \le t, \Delta = 1)$ and $Y(t) = I(X \ge t)$.

Since the baseline mean residual life function in model (2) is unspecified, we first develop an estimator for $m_0(t)$ as if the true $\beta$ were known. Let the true values of $\beta$ and $m_0(\cdot)$ be $\beta_*$ and $m_*(\cdot)$, respectively, and consider the filtration defined by $\mathcal{F}_t = \sigma\{N_i(u), Y_i(u), Z_i; 0 \le u \le t, i = 1, 2, \ldots, n\}$. Then $E\{dN(t) \mid \mathcal{F}_{t-}; \beta_*, m_*\} = Y(t)\,d\Lambda(t \mid Z; \beta_*, m_*)$, where $\Lambda(\cdot)$ denotes the cumulative hazard function. Let

$$dM_i(t; \beta_*, m_*) = dN_i(t) - Y_i(t)\,d\Lambda(t \mid Z_i; \beta_*, m_*), \quad i = 1, 2, \ldots, n,$$

where $\{M_i(t; \beta_*, m_*)\}$ are martingales with respect to $\mathcal{F}_t$. Applying the inversion formula in Oakes & Dasu (1990) to the linear model, we know that the survival function of $T$ given $Z$ is

$$S(t \mid Z; \beta_*, m_*) = \frac{m_*(0) + \beta_*^T Z}{m_*(t) + \beta_*^T Z} \exp\left\{ -\int_0^t \frac{du}{m_*(u) + \beta_*^T Z} \right\}.$$

As a result, $\{m_*(t) + \beta_*^T Z\}\,d\Lambda(t \mid Z; \beta_*, m_*) = d\{t + m_*(t)\}$. Thus, it is reasonable to estimate $m_0(\cdot)$ by the estimating equation

$$\sum_{i=1}^n \left[ \{\hat m_0(t) + \beta_*^T Z_i\}\,dN_i(t) - Y_i(t)\,d\{t + \hat m_0(t)\} \right] = 0, \tag{3}$$

which is in fact

$$-\hat m_0(t) \frac{\sum_i dN_i(t)}{\sum_i Y_i(t)} + d\hat m_0(t) = \frac{\sum_i \{\beta_*^T Z_i\,dN_i(t) - Y_i(t)\,dt\}}{\sum_i Y_i(t)}.$$

This first-order ordinary differential equation yields the closed-form solution

$$\hat m_0(t; \beta_*) = \hat S(t)^{-1} \int_t^\tau \hat S(u)\, \frac{\sum_i \{Y_i(u)\,du - \beta_*^T Z_i\,dN_i(u)\}}{\sum_i Y_i(u)},$$

where $\hat S(t) = \exp\{-\int_0^t \sum_i dN_i(u) / \sum_i Y_i(u)\}$. Here $\tau = \inf\{t : \mathrm{pr}(X > t) = 0\} < \infty$ is assumed, to avoid technical discussion of the tail behaviour of censored data. By the convention of Reid (1981) and James (1986), $\Delta$ would be redefined as 1 for the last observation to ensure a meaningful $\hat m_0(\cdot)$.

Since

$$\sum_i \{m_*(t) + \beta_*^T Z_i\}\,dN_i(t) - \sum_i Y_i(t)\,d\{t + m_*(t)\} = \sum_i \{m_*(t) + \beta_*^T Z_i\}\,dM_i(t; \beta_*, m_*),$$

by subtracting it from (3), we obtain

$$-\sum_i \{\hat m_0(t; \beta_*) - m_*(t)\}\,dN_i(t) + \sum_i Y_i(t)\,d\{\hat m_0(t; \beta_*) - m_*(t)\} = \sum_i \{m_*(t) + \beta_*^T Z_i\}\,dM_i(t; \beta_*, m_*).$$

Solving this equation in the same way leads to

$$\hat m_0(t; \beta_*) - m_*(t) = -\hat S(t)^{-1} \int_t^\tau \hat S(u)\, \frac{\sum_i \{m_*(u) + \beta_*^T Z_i\}\,dM_i(u; \beta_*, m_*)}{\sum_i Y_i(u)}, \tag{4}$$

and hence the consistency and asymptotic normality of $\hat m_0(t; \beta_*)$ hold according to the standard martingale theory for counting processes. In addition, if we let $\hat\mu(t; \beta) = -\partial \hat m_0(t; \beta)/\partial\beta$, it follows that

$$\hat\mu(t; \beta_*) = \hat S(t)^{-1} \int_t^\tau \hat S(u)\, \frac{\sum_j Z_j\,dN_j(u)}{\sum_j Y_j(u)} = \mu_*(t) + o_p(1),$$

where $\mu_*(t) = E\{Z S_*(t \mid Z)\}/E\{S_*(t \mid Z)\}$ and $S_*(t \mid Z) = \mathrm{pr}\{X \ge t \mid Z\}$.

2.2 Extended Buckley-James estimation

The linear mean residual life model (2) implies that $E(T \mid Z) = m_0(0) + \beta^T Z$. It is thus closely related to the standard linear model in Miller (1976) and Buckley & James (1979), which assumes that $T - \gamma_0 - \gamma_1^T Z$ has an identical zero-mean distribution for all $Z$, so that $E(T \mid Z) = \gamma_0 + \gamma_1^T Z$. Since the Buckley-James estimation procedure (Buckley & James, 1979) has been shown to be reliable in linear regression models with censored data (Miller & Halpern, 1982; Lin & Wei, 1992), we consider an extension of this procedure to the estimation of the linear mean residual life model.
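The closed-form baseline estimator of § 2.1 can be sketched in code. This is only an illustration under stated assumptions: the function name `baseline_mrl_estimate` is hypothetical, $\hat S$ is evaluated as a right-continuous step function at the failure times (the article does not specify how jumps are handled), and the last observation is treated as a failure following the convention mentioned in the text.

```python
import math

def baseline_mrl_estimate(t, times, events, covariates, beta):
    """Evaluate the closed-form baseline estimator at time t for a fixed beta.

    times: observed X_i; events: Delta_i (1 = failure); covariates: list of
    covariate vectors Z_i; beta: coefficient vector of the same length.
    """
    n = len(times)
    events = list(events)
    last = max(range(n), key=lambda i: times[i])
    events[last] = 1  # convention: treat the last observation as a failure

    def at_risk(u):
        return sum(1 for x in times if x >= u)

    def surv(u):
        # S_hat(u) = exp(-Nelson-Aalen(u)), right-continuous in u.
        cum = sum(events[i] / at_risk(times[i])
                  for i in range(n) if times[i] <= u)
        return math.exp(-cum)

    # Integral of S_hat(u) du over (t, tau]; S_hat is a step function.
    knots = sorted({x for x in times if x > t})
    integral, left = 0.0, t
    for right in knots:
        integral += surv(left) * (right - left)
        left = right

    # Jump part: failures in (t, tau], each weighted by S_hat(X_i) / Y(X_i).
    jumps = 0.0
    for i in range(n):
        if events[i] == 1 and times[i] > t:
            bz = sum(b * z for b, z in zip(beta, covariates[i]))
            jumps += surv(times[i]) * bz / at_risk(times[i])

    return (integral - jumps) / surv(t)

one_subject = baseline_mrl_estimate(0.0, [2.0], [1], [[0.0]], [0.0])
two_subjects = baseline_mrl_estimate(0.0, [1.0, 3.0], [1, 1],
                                     [[0.0], [0.0]], [0.0])
print(one_subject, two_subjects)
```

With $\beta = 0$ the expression reduces to $\hat S(t)^{-1} \int_t^\tau \hat S(u)\,du$, the usual plug-in estimate of a mean residual life function, which gives a simple sanity check.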

Let $\alpha(T, Z; \beta) = Z\{T - m_0(0) - \beta^T Z\}$, and let

$$\xi(X, \Delta, Z; \beta) = \Delta\,\alpha(X, Z; \beta) + (1 - \Delta)\,E\{\alpha(T, Z; \beta) \mid T > X, Z\}.$$

Then $E(\xi \mid Z) = E(\alpha \mid Z) = 0$, which leads to the unbiased estimating equations

$$g_1(\beta) = n^{-1/2} \sum_{i=1}^n \xi(X_i, \Delta_i, Z_i; \beta) = 0, \tag{5}$$

as in the Buckley-James procedure for the standard linear regression model. However, both $\alpha(T, Z; \beta)$ and $E\{\alpha(T, Z; \beta) \mid T > X, Z\}$ depend on the unknown $m_0(\cdot)$, in addition to $\beta$. In the linear regression model, Buckley & James (1979) required that the residuals $T - \gamma_0 - \gamma_1^T Z$ share the same distribution, which was then estimated by the Kaplan-Meier product-limit estimator. This requirement is usually not satisfied in the linear mean residual life model, so it is less straightforward to use Efron's (1967) self-consistency representation of the Kaplan-Meier estimator to simplify $g_1(\beta)$. Instead, we extend the Buckley-James procedure to model (2) using the baseline estimator proposed in § 2.1.

Consider the natural estimator of $\alpha(T, Z; \beta)$ defined by $\hat\alpha(T, Z; \beta) = Z\{T - \hat m_0(0) - \beta^T Z\}$. In addition, by the inversion formula, $S(t \mid Z; \beta)$ is estimated by

$$\hat S(t \mid Z; \beta) = \frac{\hat m_0(0) + \beta^T Z}{\hat m_0(t) + \beta^T Z} \exp\left[ -\int_0^t \{\hat m_0(u) + \beta^T Z\}^{-1}\,du \right].$$

Therefore, $E\{\alpha(T, Z; \beta) \mid T > X, Z\}$ can be estimated by

$$\hat E\{\alpha(T, Z; \beta) \mid T > X, Z\} = -\hat S(X \mid Z; \beta)^{-1} \int_X^\tau \hat\alpha(u, Z; \beta)\,d\hat S(u \mid Z; \beta).$$

By replacing $\alpha(T, Z; \beta)$ and $E\{\alpha(T, Z; \beta) \mid T > X, Z\}$ with these estimators in $g_1(\beta)$, we obtain the final estimating equations

$$\hat g_1(\beta) = n^{-1/2} \sum_{i=1}^n \left[ \Delta_i\,\hat\alpha(X_i, Z_i; \beta) + (1 - \Delta_i)\,\hat E\{\alpha(T, Z_i; \beta) \mid T > X_i, Z_i\} \right] = 0. \tag{6}$$

Let $D_1(\beta) = \lim_{n \to \infty} n^{-1/2}\,\partial \hat g_1(\beta)/\partial\beta$. When $\beta = \beta_*$, $D_1(\beta_*) = E[\Delta\,\partial\hat\alpha(X, Z; \beta_*)/\partial\beta + (1 - \Delta)\,\partial\hat E\{\alpha(T, Z; \beta_*) \mid T > X, Z\}/\partial\beta]$. In addition, by a multivariate central limit theorem, $\hat g_1(\beta_*)$ is asymptotically zero-mean normal with covariance matrix $V_1 = E\{\Delta\,\hat\alpha(X, Z; \beta_*)^{\otimes 2}\} + E[(1 - \Delta)\,\hat E\{\alpha(T, Z; \beta_*) \mid T > X, Z\}^{\otimes 2}]$, where $a^{\otimes 2} = a a^T$. Denote by $\hat\beta_{BJ}$ the solution to $\hat g_1(\beta) = 0$. Following a Taylor expansion, $\hat\beta_{BJ}$ is asymptotically normal:

$$n^{1/2}(\hat\beta_{BJ} - \beta_*) \to N(0, D_1^{-1} V_1 D_1^{-1}),$$

in distribution, as $n \to \infty$, provided that $D_1$ is nonsingular. Here $V_1$ and $D_1$ can be estimated by their empirical counterparts,

$$\hat V_1 = n^{-1} \sum_i \left[ \Delta_i\,\hat\alpha(X_i, Z_i; \hat\beta_{BJ})^{\otimes 2} + (1 - \Delta_i)\,\hat E\{\alpha(T, Z_i; \hat\beta_{BJ}) \mid T > X_i, Z_i\}^{\otimes 2} \right]$$

and

$$\hat D_1 = n^{-1} \sum_i \left[ \Delta_i\,\frac{\partial \hat\alpha(X_i, Z_i; \beta)}{\partial\beta} + (1 - \Delta_i)\,\frac{\partial \hat E\{\alpha(T, Z_i; \beta) \mid T > X_i, Z_i\}}{\partial\beta} \right]_{\beta = \hat\beta_{BJ}}.$$

Here, $\alpha(\cdot)$ is an ad hoc choice for the estimating functions, and therefore the estimator is not necessarily efficient. In fact, in the original Buckley-James estimation, when $\epsilon$ is normal with known variance in the standard linear regression model $Y = \gamma_0 + \gamma_1^T Z + \epsilon$, $\alpha(T, Z; \gamma) = Z(T - \gamma_0 - \gamma_1^T Z)$ is the score function of the regression parameter $\gamma_1$. Its efficiency for distributions other than the normal is, however, unclear. Later, in James (1986), $\alpha(T, Z; \gamma)$ was naturally generalised to $\partial \log f(T \mid Z; \gamma)/\partial\gamma$, where $f(\cdot)$ is the known density function of $T$. Similarly, for the current linear mean residual life model (2), one possibly more efficient choice for $\alpha$ is

$$\alpha(T, Z; \beta) = \frac{\partial \log f(T \mid Z; \beta)}{\partial\beta} = Z\left[ -\frac{1}{m_0(T) + \beta^T Z} + \int_0^T \frac{\{1 + m_0'(u)\}\,du}{\{m_0(u) + \beta^T Z\}^2} \right], \tag{7}$$

instead of $\alpha(T, Z; \beta) = Z\{T - m_0(0) - \beta^T Z\}$. In particular, when the baseline mean residual life function is constant, $m_0(t) \equiv \alpha_0$, then $\alpha(T, Z; \beta) = Z(\alpha_0 + \beta Z)^{-2}(T - \alpha_0 - \beta Z) = Z\,m(t \mid Z)^{-2}(T - \alpha_0 - \beta Z)$. More discussion of efficient estimation can be found in § 2.4.

2.3 Quasi partial score estimation

An alternative approach is to construct quasi partial score equations similar to those in Chen & Cheng (2005). Note that the estimating functions

$$g_2(\beta_*; m_*) = n^{-1/2} \sum_{i=1}^n \int_0^\tau Z_i \left[ \{m_*(t) + \beta_*^T Z_i\}\,dN_i(t) - Y_i(t)\{1 + m_*'(t)\}\,dt \right]$$

are unbiased. It is therefore natural to replace $m_*(\cdot)$ with its estimator, $\hat m_0(\cdot; \beta)$, and to solve $\hat g_2\{\beta; \hat m_0(\beta)\} = 0$ for $\beta$. Since $d\{t + \hat m_0(t; \beta)\} = \sum_i \{\hat m_0(t; \beta) + \beta^T Z_i\}\,dN_i(t)/\sum_i Y_i(t)$, as in (3), $\hat g_2\{\beta; \hat m_0(\beta)\} = 0$ simplifies to

$$n^{-1/2} \sum_{i=1}^n \int_0^\tau \{Z_i - \bar Z(t)\}\{\hat m_0(t; \beta) + \beta^T Z_i\}\,dN_i(t) = 0. \tag{8}$$

Here, $\bar Z(t) = \sum_i Y_i(t) Z_i / \sum_i Y_i(t)$ has the same limit as $\mu_*(t)$.

Therefore, $n^{-1/2}\,\partial \hat g_2(\beta_*)/\partial\beta$ tends to $D_2 = E[\int_0^\tau \{Z - \mu_*(t)\}^{\otimes 2}\,dN(t)]$ almost surely, as $n \to \infty$. Consider a decomposition

of $\hat g_2\{\beta_*; \hat m_0(\beta_*)\}$ of the form

$$\hat g_2\{\beta_*; \hat m_0(\beta_*)\} = [\hat g_2\{\beta_*; \hat m_0(\beta_*)\} - \hat g_2(\beta_*; m_*)] + \hat g_2(\beta_*; m_*),$$

as in Tsiatis (1981). By the martingale representation of $\hat m_0(t; \beta_*) - m_*(t)$ in (4), the first term in this decomposition equals

$$n^{-1/2} \sum_{i=1}^n \int_0^\tau \{Z_i - \bar Z(t)\} \left[ -\hat S(t)^{-1} \int_t^\tau \frac{\hat S(u)}{\sum_k Y_k(u)} \sum_{j=1}^n \{m_*(u) + \beta_*^T Z_j\}\,dM_j(u) \right] dN_i(t). \tag{9}$$

Let $\tilde Z(t) = \hat S(t) \int_0^t \hat S(u)^{-1} \sum_j \{Z_j - \bar Z(u)\}\,dN_j(u) / \sum_j Y_j(t)$. Then, after exchanging the order of integration, (9) can be written as $-n^{-1/2} \sum_i \int_0^\tau \tilde Z(t)\{m_*(t) + \beta_*^T Z_i\}\,dM_i(t)$. Since the second term, $\hat g_2(\beta_*; m_*)$, in the decomposition equals $n^{-1/2} \sum_i \int_0^\tau \{Z_i - \bar Z(t)\}\{m_*(t) + \beta_*^T Z_i\}\,dM_i(t)$, we obtain

$$\hat g_2(\beta_*) = \hat g_2\{\beta_*; \hat m_0(\beta_*)\} = n^{-1/2} \sum_{i=1}^n \int_0^\tau \{Z_i - \bar Z(t) - \tilde Z(t)\}\{m_*(t) + \beta_*^T Z_i\}\,dM_i(t).$$

As a result, $\hat g_2(\beta_*)$ tends in distribution to a zero-mean normal distribution with asymptotic variance

$$V_2(\beta_*) = E\left[ \int_0^\tau Y(t)\{Z - \mu_*(t) - \tilde\mu_*(t)\}^{\otimes 2}\{1 + m_*'(t)\}\{m_*(t) + \beta_*^T Z\}\,dt \right],$$

where $\tilde\mu_*(t) = \lim_{n \to \infty} \tilde Z(t)$. Let $\hat\beta_{QP}$ be the solution to $\hat g_2(\beta) = 0$. By Taylor expansion, $n^{1/2}(\hat\beta_{QP} - \beta_*) = -\{n^{-1/2}\,\partial \hat g_2(\beta_*)/\partial\beta\}^{-1} \hat g_2(\beta_*) + o_p(1)$. Therefore,

$$n^{1/2}(\hat\beta_{QP} - \beta_*) \to N(0, D_2^{-1} V_2 D_2^{-1})$$

in distribution, where $D_2$ and $V_2$ can be consistently estimated by

$$\hat D_2 = n^{-1} \sum_{i=1}^n \int_0^\tau \{Z_i - \bar Z(t)\}^{\otimes 2}\,dN_i(t)$$

and

$$\hat V_2 = n^{-1} \sum_{i=1}^n \int_0^\tau Y_i(t)\{Z_i - \bar Z(t) - \tilde Z(t)\}^{\otimes 2}\{1 + \hat m_0'(t)\}\{\hat m_0(t) + \hat\beta_{QP}^T Z_i\}\,dt.$$

In addition, user-defined weight functions can also be included in the proposed quasi partial score estimating functions, as in

$$n^{-1/2} \sum_{i=1}^n \int_0^\tau W(t)\{Z_i - \bar Z(t)\}\{\hat m_0(t; \beta) + \beta^T Z_i\}\,dN_i(t) = 0,$$

where the $\mathcal{F}_t$-predictable $W(\cdot)$ converges to a deterministic function $w(\cdot)$. The asymptotic properties of its solution, $\hat\beta_w$ say, can be derived similarly.

2.4 Efficiency considerations

Given the ad hoc nature of the estimating equations for the proposed $\hat\beta_{BJ}$ and $\hat\beta_{QP}$, more efficient estimation procedures may be of interest. To gain some insight into the efficient estimation of the semiparametric linear mean residual life model, we follow the procedure in Lai & Ying (1992) and Lin & Ying (1994) in calculating a semiparametric information

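bound for this model. Let $\beta_0 \in R^p$ be a parameter and let $\eta(\cdot)$ be a fixed function. Consider the parametric submodels of (2), defined by

$$m(t \mid Z) = m_0(t) + \beta_s^T Z_s(t), \tag{10}$$

where $\beta_s = (\beta_0^T, \beta^T)^T$ and $Z_s(\cdot) = (\eta(\cdot)^T, Z^T)^T$. The associated loglikelihood function of $\beta_s$ is

$$l(\beta_s) = \sum_i \int_0^\tau \left( \log\left[ \frac{1 + m_0'(t) + \beta_0^T \eta'(t)}{m_0(t) + \beta_s^T Z_{si}(t)} \right] dN_i(t) - Y_i(t)\,\frac{1 + m_0'(t) + \beta_0^T \eta'(t)}{m_0(t) + \beta_s^T Z_{si}(t)}\,dt \right).$$

Therefore, the score functions at $\beta_{s*} = (0, \beta_*^T)^T$ are

$$l_\beta(\beta_{s*}) = \left. \frac{\partial l(\beta_s)}{\partial\beta} \right|_{\beta_s = \beta_{s*}} = -\sum_i \int_0^\tau \frac{Z_i}{m_0(t) + \beta_*^T Z_i} \left[ dN_i(t) - Y_i(t)\,\frac{1 + m_0'(t)}{m_0(t) + \beta_*^T Z_i}\,dt \right]$$

and

$$l_{\beta_0}(\beta_{s*}) = \left. \frac{\partial l(\beta_s)}{\partial\beta_0} \right|_{\beta_s = \beta_{s*}} = \sum_i \int_0^\tau \left[ \frac{\eta'(t)}{1 + m_0'(t)} - \frac{\eta(t)}{m_0(t) + \beta_*^T Z_i} \right] \left[ dN_i(t) - Y_i(t)\,\frac{1 + m_0'(t)}{m_0(t) + \beta_*^T Z_i}\,dt \right],$$

respectively. We therefore obtain that, in distribution,

$$n^{-1/2} \begin{pmatrix} l_\beta(\beta_{s*}) \\ l_{\beta_0}(\beta_{s*}) \end{pmatrix} \to N\{0, I(\eta)\}, \quad \text{where} \quad I(\eta) = \begin{pmatrix} I_{\beta\beta}(\eta) & I_{\beta\beta_0}(\eta) \\ I_{\beta_0\beta}(\eta) & I_{\beta_0\beta_0}(\eta) \end{pmatrix},$$

under regularity conditions similar to those in Lai & Ying (1992), where $I_{\beta\beta}(\eta)$, $I_{\beta\beta_0}(\eta) = I_{\beta_0\beta}(\eta)^T$ and $I_{\beta_0\beta_0}(\eta)$ are the limits of the corresponding normalised negative expected second derivatives of $l(\beta_s)$ at $\beta_{s*}$. Direct calculation gives

$$I_{\beta\beta}(\eta) = \lim_{n \to \infty} n^{-1} \sum_i \int_0^\tau E\left[ \frac{Y_i(t)\{1 + m_0'(t)\} Z_i^{\otimes 2}}{\{m_0(t) + \beta_*^T Z_i\}^3} \right] dt,$$

$$I_{\beta\beta_0}(\eta) = -\lim_{n \to \infty} n^{-1} \sum_i \int_0^\tau E\left[ \frac{Y_i(t)\{1 + m_0'(t)\} Z_i}{\{m_0(t) + \beta_*^T Z_i\}^2} \left\{ \frac{\eta'(t)}{1 + m_0'(t)} - \frac{\eta(t)}{m_0(t) + \beta_*^T Z_i} \right\}^T \right] dt,$$

$$I_{\beta_0\beta_0}(\eta) = \lim_{n \to \infty} n^{-1} \sum_i \int_0^\tau E\left[ \frac{Y_i(t)\{1 + m_0'(t)\}}{m_0(t) + \beta_*^T Z_i} \left\{ \frac{\eta'(t)}{1 + m_0'(t)} - \frac{\eta(t)}{m_0(t) + \beta_*^T Z_i} \right\}^{\otimes 2} \right] dt.$$

We shall say that matrix $A$ is larger than matrix $B$ if $A - B$ is nonnegative definite. Then, by an application of the Cauchy-Schwarz inequality, $I_\beta^{-1}(\eta) = \{I_{\beta\beta}(\eta) - I_{\beta\beta_0}(\eta) I_{\beta_0\beta_0}^{-1}(\eta) I_{\beta\beta_0}(\eta)^T\}^{-1}$ reaches its maximum at $\eta = \eta_0$. Here, $\eta_0(\cdot)$ satisfies the ordinary differential equation

$$\eta_0'(t)\,E\left[ \frac{Y(t)}{1 + m_0'(t)} \right] - \eta_0(t)\,E\left[ \frac{Y(t)}{m_0(t) + \beta_*^T Z} \right] = -E\left[ \frac{Y(t) Z}{m_0(t) + \beta_*^T Z} \right],$$

with the solution $\eta_0(t) = P(t)^{-1} \int_t^\tau P(u) Q(u)\,du$, where

$$P(t) = \exp\left( -\int_0^t \frac{E[Y(u)/\{m_0(u) + \beta_*^T Z\}]}{E[Y(u)/\{1 + m_0'(u)\}]}\,du \right), \quad Q(t) = \frac{E[Y(t) Z/\{m_0(t) + \beta_*^T Z\}]}{E[Y(t)/\{1 + m_0'(t)\}]}.$$

Therefore, the upper parametric information bound for $\beta$ is

$$I_\beta^{-1}(\eta_0) = \left( \lim_{n \to \infty} n^{-1} \sum_i \int_0^\tau E\left[ \frac{Y_i(t)\{Z_i - \mu_0(t; \beta_*)\}^{\otimes 2}}{\{m_0(t) + \beta_*^T Z_i\}^2} \right] dt \right)^{-1}, \tag{11}$$

where $\mu_0(t; \beta) = \lim_{n \to \infty} E[\sum_i Y_i(t) Z_i/\{m_0(t) + \beta^T Z_i\}] / E[\sum_i Y_i(t)/\{m_0(t) + \beta^T Z_i\}]$.

The sample-splitting technique in Lai & Ying (1992) and Lin & Ying (1994) can be used to construct a regular semiparametric estimator whose asymptotic variance reaches $I_\beta^{-1}(\eta_0)$. The data are randomly divided into two disjoint groups of nearly equal size: one has the first $n_0 = [n/2]$ samples and the other has the remaining $n_1 = n - n_0$ samples. Let $\hat\beta_j$ and $\hat m_{0j}(\cdot)$ be the ad hoc estimators of $\beta$ and $m_0(\cdot)$ obtained from the $(j + 1)$st group, for $j = 0, 1$. For example, they can be obtained from either the extended Buckley-James estimation procedure or the quasi partial score estimation procedure. Consider the following estimating functions for $\beta$:

$$g(\beta) = \sum_{j=0}^{1} \sum_{i=1}^{n_j} \int_0^\tau \frac{Z_i - \hat Z(t; \hat\beta_{1-j}, \hat m_{0,1-j})}{\hat m_{0,1-j}(t) + \hat\beta_{1-j}^T Z_i} \left[ dN_i(t) - \frac{Y_i(t)\,d\{t + \hat m_{0,1-j}(t)\}}{\hat m_{0,1-j}(t) + \beta^T Z_i} \right],$$

where the inner sum runs over the $n_j$ subjects in the $(j + 1)$st group and $\hat Z(t; \beta, m_0) = \sum_i [Y_i(t) Z_i/\{m_0(t) + \beta^T Z_i\}] / \sum_i [Y_i(t)/\{m_0(t) + \beta^T Z_i\}]$. By standard counting process arguments, the asymptotic behaviour of the contribution from one group is not affected by the use of estimates of $\beta$ and $m_0$ from the other group, since these estimates asymptotically approach the true values. Thus the asymptotic variance of the semiparametric estimator $\hat\beta$ such that $g(\hat\beta) = 0$ equals $I_\beta^{-1}(\eta_0)$. According to Bickel et al. (1993), the upper parametric information bound for $\beta$ in model (10) at $\beta_{s*}$ leads to the semiparametric information bound for $\beta$ at $\beta_*$ in model (2) as well.

Therefore the semiparametric information bound of the linear mean residual life model is $I_\beta^{-1}(\eta_0)$; that is, by the Cramér-Rao inequality, the asymptotic variance of any regular semiparametric estimator $\hat\beta$ in the linear mean residual life model is at least as large as $(I_{\beta\beta} - I_{\beta\beta_0} I_{\beta_0\beta_0}^{-1} I_{\beta\beta_0}^T)^{-1}$ for any $\eta$, if $n^{1/2}(\hat\beta - \beta_*)$ converges to a zero-mean normal distribution.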
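The quasi partial score equation (8) of § 2.3 can be illustrated for a single scalar covariate. Because the closed-form $\hat m_0(t; \beta)$ of § 2.1 is affine in $\beta$, the left-hand side of (8) is affine in $\beta$ too, so two evaluations of the score suffice to locate its root. The sketch below relies on that observation; the function names `qp_score` and `qp_fit`, the right-continuous evaluation of $\hat S$ at jumps, and the toy data are assumptions made for illustration, not the authors' implementation, and no variance estimate is computed.

```python
import math

def qp_score(beta, times, events, z):
    """Left-hand side of (8), without the n^{-1/2} factor, for scalar z."""
    n = len(times)
    events = list(events)
    # Convention from Section 2.1: treat the last observation as a failure.
    events[max(range(n), key=lambda i: times[i])] = 1

    def risk_set(u):
        return [j for j in range(n) if times[j] >= u]

    def surv(u):
        # exp(-Nelson-Aalen), right-continuous (an assumption at jumps).
        cum = sum(events[j] / len(risk_set(times[j]))
                  for j in range(n) if times[j] <= u)
        return math.exp(-cum)

    def m0_hat(t):
        # Closed-form baseline estimator at time t for the given beta.
        knots = sorted({x for x in times if x > t})
        area, left = 0.0, t
        for right in knots:
            area += surv(left) * (right - left)
            left = right
        jumps = sum(surv(times[j]) * beta * z[j] / len(risk_set(times[j]))
                    for j in range(n) if events[j] == 1 and times[j] > t)
        return (area - jumps) / surv(t)

    total = 0.0
    for i in range(n):
        if events[i] == 1:
            risk = risk_set(times[i])
            zbar = sum(z[j] for j in risk) / len(risk)
            total += (z[i] - zbar) * (m0_hat(times[i]) + beta * z[i])
    return total

def qp_fit(times, events, z):
    """Root of the affine score, located from its values at beta = 0 and 1."""
    s0 = qp_score(0.0, times, events, z)
    s1 = qp_score(1.0, times, events, z)
    return -s0 / (s1 - s0)

toy_times, toy_events, toy_z = [1.0, 2.0, 3.0, 4.0], [1, 1, 1, 1], [0.0, 1.0, 0.0, 1.0]
beta_qp = qp_fit(toy_times, toy_events, toy_z)
print(beta_qp, qp_score(beta_qp, toy_times, toy_events, toy_z))
```

For a vector covariate the same affine structure gives a linear system in $\beta$; the two-point trick above is just the scalar special case. The fit fails if the score has zero slope, for instance when all covariate values are equal.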

3 Examples

We consider the well-known Stanford heart transplantation data analysed in Miller & Halpern (1982). The time-to-event outcome of this dataset is the lifetime since first heart transplantation between October 1967 and February [year missing]. Two covariates were originally used in the analysis, namely age at the time of first transplant and the T5 mismatch score, which measures the degree of tissue incompatibility between the hearts of the initial donor and recipient with respect to HLA antigens. To provide a comparison with their results, we apply the linear mean residual life model to the lifetimes with these two covariates. The results are tabulated in Table 1, along with the partial likelihood estimates from the Cox proportional hazards model and the Buckley-James estimates from the linear regression model for the base-10 log-transformed lifetimes.

[Table 1 about here]

All the estimates in the linear mean residual life models suggest that both age and the T5 mismatch score are negatively associated with the patients' life expectancy, although none of the estimates for the T5 mismatch score is significant. Age is a significant predictor not only of the patients' hazard and their average survival lifetimes, but also of their life expectancy throughout time. Among the different estimation procedures for the same linear mean residual life model, it is not surprising that the efficient estimation procedure yields the smallest variance.

In addition, as demonstrated in Miller & Halpern (1982), a quadratic relationship with age is plausible for both the Cox proportional hazards model and the linear regression model. We therefore fitted the linear mean residual life model with the T5 mismatch score omitted because of its nonsignificance, as shown in Table 1. Table 2 shows that both age and squared age are significant predictors of life expectancy.

However, their effects are of different signs: age itself is positively associated with life expectancy, while squared age has a negative influence on life expectancy. This demonstrates similar patterns to those shown by the Cox model and the linear regression

model. Again, the efficient estimation procedure yields smaller variances than do the other ad hoc procedures.

[Table 2 about here]

We can assess the fit of the linear mean residual life model using the goodness-of-fit test of Gill & Schumacher (1987). We consider two different weight functions, $W_1(t)$ and $W_2(t)$ say, in the weighted quasi partial score estimating equations $g_{w1}(\beta)$ and $g_{w2}(\beta)$, respectively. Their corresponding estimators are denoted by $\hat\beta_{w1}$ and $\hat\beta_{w2}$. A significant difference $\hat\beta_{w1} - \hat\beta_{w2}$ then suggests lack of fit. To implement this, we adapt the reliable variation suggested in Wei et al. (1990), involving the goodness-of-fit test statistic defined by

$$\kappa(W_1, W_2) = \min_{\beta \in \mathcal{N}(\hat\beta_{w1})} \left\{ (g_{w1}(\beta)^T, g_{w2}(\beta)^T)\,\Sigma_w^{-1}(\hat\beta_{w1})\,(g_{w1}(\beta)^T, g_{w2}(\beta)^T)^T \right\},$$

where $\mathcal{N}(\hat\beta_{w1})$ is a neighbourhood of $\hat\beta_{w1}$. Here $\Sigma_w(\hat\beta_{w1})$ is the covariance matrix of $(g_{w1}(\hat\beta_{w1})^T, g_{w2}(\hat\beta_{w1})^T)^T$, which can be estimated by computer-intensive methods such as the bootstrap. Under the null hypothesis that the linear mean residual life model is adequate, $\kappa(W_1, W_2)$ is approximately distributed as $\chi^2_p$. For the Stanford data, $\kappa(W_1, W_2) = 4.81$ ($p = 0.09$) for the linear mean residual life model based on age and T5, which suggests a mild lack of fit. For the linear mean residual life model based on age and squared age, the test statistic is 1.17 ($p > 0.5$), which does not reject the model.

4 Discussion

Some simulations were conducted to assess the finite-sample properties of the proposed methods. We took the sample size $n$ to be 100 or 200, with two covariates $Z = (Z_1, Z_2)^T$ for each of the $n$ subjects. The covariate $Z_1$ is a Ber(0.5) random variable and $Z_2 \sim \mathrm{Un}[0, 1]$. We chose $m_*(t) = t + 1$, which corresponds to the Pareto distribution with baseline survival function $(1 + t)^{-2}$. Failure times were generated according to the linear mean residual life model. The true parameters were $(0, 0)^T$ or $(1, 1)^T$.

Independent censoring times were generated from the uniform distribution so as to allow about 30% of the observations to be

censored. The simulation results are summarised in Table 3, with each entry based on 1000 simulated datasets. The results show that the estimators of $\beta$ are virtually unbiased and that the nominal 95% confidence intervals have acceptable coverage probabilities. More extensive simulation studies are needed to evaluate the efficiency of the proposed methods.

[Table 3 about here]

There is, however, the critical challenge of the tail behaviour of the underlying distributions of failure times. When the underlying failure times are heavily right-skewed and censored early, as in the case of long-term survivors on cancer treatment or subjects in HIV/AIDS prevention or vaccine trials, it is impossible to estimate the mean residual life function on the whole positive real line without extra assumptions, although some techniques such as those in Gill (1983) can be extended. In general, it is difficult to determine a reasonable upper limit $\tau$ without compromising the robustness of the proposed methodologies. One possible approach is to modify the fully unspecified $m_0(\cdot)$ by including a parametric component in the tail. For instance, let $\tau$ be a prespecified truncation time. Then it is assumed that $m_0(t) = m_0(t) I(t < \tau) + m_r I(t > \tau)$, where $m_r$ is some positive constant. This means that the baseline mean residual life function is unspecified up to the truncation time $\tau$, while the baseline distribution becomes exponential after $\tau$. It is then straightforward to extend the proposed methodologies to the whole positive real line.

Acknowledgement

The authors would like to thank the associate editor and an anonymous reviewer for their constructive comments and suggestions, which led to a significant improvement of this paper. They would also like to thank Professors Zhiliang Ying, Chen-Hsin Chen and Louis Chen for fruitful discussion during the first author's visit to the National University of Singapore, which was partially funded by the Institute for Mathematical Sciences, National University of Singapore.
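The data-generating step of the simulation design described in the Discussion can be sketched as follows. With $m_*(t) = t + 1$, the inversion formula of § 2.1 gives $S(t \mid Z) = \{(1 + \beta^T Z)/(1 + \beta^T Z + t)\}^2$, which can be inverted in closed form. The function names, the seed and the censoring upper limit `censor_max` are hypothetical; in particular, `censor_max` is not calibrated here to the roughly 30% censoring rate used in the article.

```python
import random

def linear_mrl_survival(t, bz):
    """S(t | Z) under m_0(t) = t + 1 and linear predictor bz = beta'Z."""
    return ((1.0 + bz) / (1.0 + bz + t)) ** 2

def linear_mrl_quantile(u, bz):
    """Solve S(t | Z) = u for t, with 0 < u <= 1."""
    return (1.0 + bz) * (u ** -0.5 - 1.0)

def simulate(n, beta, censor_max, rng):
    """Draw (X_i, Delta_i, Z_i) with Z_1 ~ Ber(0.5), Z_2 ~ Un[0, 1]."""
    rows = []
    for _ in range(n):
        z = (float(rng.random() < 0.5), rng.random())
        bz = beta[0] * z[0] + beta[1] * z[1]
        u = 1.0 - rng.random()           # uniform on (0, 1], avoids u = 0
        t = linear_mrl_quantile(u, bz)   # failure time by inverse transform
        c = rng.uniform(0.0, censor_max) # hypothetical uniform censoring time
        rows.append((min(t, c), int(t <= c), z))
    return rows

data = simulate(200, (1.0, 1.0), censor_max=20.0, rng=random.Random(1))
print(sum(1 - delta for _, delta, _ in data) / len(data))  # censoring fraction
```

In practice one would tune `censor_max` until the printed censoring fraction is close to the target rate before running the estimators on the simulated datasets.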

References

Bickel, P. J., Klaassen, C. A. J., Ritov, Y. & Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Baltimore, MD: Johns Hopkins University Press.

Buckley, J. & James, I. (1979). Linear regression with censored data. Biometrika 66.

Chen, Y. Q. & Cheng, S. (2005). Semiparametric regression analysis of mean residual life with censored survival data. Biometrika 92.

Efron, B. (1967). The two sample problem with censored data. Proc. 5th Berkeley Symp. Math. Stat. Prob. 4. Berkeley, CA: UC Berkeley Press.

Gill, R. D. (1983). Large sample behaviour of the product-limit estimator on the whole line. Ann. Statist. 11.

Gill, R. D. & Schumacher, M. (1987). A simple test of the proportional hazards assumption. Biometrika 74.

James, I. R. (1986). On estimating equations with censored data. Biometrika 73.

Lai, T. L. & Ying, Z. (1992). Asymptotically efficient estimation in censored and truncated regression models. Statist. Sinica 2.

Lin, D. Y. & Ying, Z. (1994). Semiparametric analysis of the additive risk model. Biometrika 81.

Lin, J. S. & Wei, L.-J. (1992). Linear regression analysis based on Buckley-James estimating equations. Biometrics 48.

Maguluri, G. & Zhang, C.-H. (1994). Estimation in the mean residual life regression model. J. R. Statist. Soc. B 56.

Miller, R. & Halpern, J. (1982). Regression with censored data. Biometrika 69.

Oakes, D. & Dasu, T. (1990). A note on residual life. Biometrika 77.

Oakes, D. & Dasu, T. (2003). Inference for the proportional mean residual life model. In Crossing Boundaries: Statistical Essays in Honor of Jack Hall, Institute of Mathematical Statistics Lecture Notes Monograph Series, Vol. 43, Ed. J. E. Kolassa and D. Oakes. Hayward, CA: Institute of Mathematical Statistics.

Reid, N. (1981). Influence functions for censored data. Ann. Statist. 9.

Tsiatis, A. A. (1981). A large sample study of Cox's regression model. Ann. Statist. 9.

Wei, L. J., Ying, Z. & Lin, D. Y. (1990). Linear regression analysis of censored survival data based on rank tests. Biometrika 77.

Table 1: Regression estimates with standard errors for 157 Stanford heart transplant patients

Columns: Age ($\hat\beta_1$, s.e.); T5 score ($\hat\beta_2$, s.e.)
Rows: Cox model; Linear regression model; Linear mean residual life model (Buckley-James; Quasi partial score; Efficient estimation)

Cox model, estimates obtained in original time-scale; Linear regression model, Buckley-James estimates obtained in base-10 log-transformed scale; Linear mean residual life model, estimates obtained in original time-scale; s.e., estimated standard error.

Table 2: Regression estimates with standard errors for 152 Stanford heart transplant patients who survived at least 10 days

Columns: Age ($\hat\beta_1$, s.e.); Age$^2$ ($\hat\beta_2$, s.e.)
Rows: Cox model; Linear regression model; Linear mean residual life model (Buckley-James; Quasi partial score; Efficient estimation)

Cox model, estimates obtained in original time-scale; Linear regression model, Buckley-James estimates obtained in base-10 log-transformed scale; Linear mean residual life model, estimates obtained in original time-scale; s.e., estimated standard error.

Table 3: Summary of simulation studies

Columns: $\beta$; $n$; Estimation; $Z_1$ (Cov. Prob., Bias); $Z_2$ (Cov. Prob., Bias)
Rows: (0,0), 100, BJ; (0,0), 100, QP; (0,0), 200, BJ; (0,0), 200, QP; (1,1), 100, BJ; (1,1), 100, QP; (1,1), 200, BJ; (1,1), 200, QP

BJ, the Buckley-James approach; QP, the quasi partial score equation approach; Cov. Prob., coverage probabilities of the 95% nominal confidence intervals; Bias, mean difference between the estimate and the true value.
