STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht z) = e ) H g 0t) 0 t) or S t z) = S 0 t)) gz). Consider instead the model where, for the 2-sample problem, patients in group 0 : T 0 patients in group 1 : T 1 and L T 1 ) = L φt 0 ) φ > 0). That is, survival time in group 1 is distributed as φt 0, where survival time in group 0 is distributed as T 0. Let z = 0, 1 denote group. Then S 1 t) = P T 1 > t) = P φt 0 > t) = P T 0 > t/φ). = S 0 t/ϕ). For simplicity, let ψ = 1/φ. Then the accelerated failure time model for the 2-sample problem can be defined by any of the following 3 equations: S 1 t) = S 0 ψt) or f 1 t) = ψf 0 ψt). 27.1) or h 1 t) = ψh 0 ψt) 1

Let s generalize to a vector of covariates z. Let g z) be some non-negative function such that g 0) = 1. Then we can generalize 27.1) to S t z) = S 0 t g z)) 27.2) e.g. z = scalar and binary and S t z = 0) = S 0 t 1) = S 0 t) S t z = 1) = S 0 t g 1)) = S 0 ψt) where ψ = g 1). This is ust 27.1). Note that 27.2) corresponds to or f t z) = g z) f 0 t g z)) h t z) = g z) h 0 t g z)). Special Case: Suppose g z) = e βz. Then, for example, h t z) = e βz h 0 te βz ). Let s express this in terms of random variables: Suppose T 0 S 0 t) Then if T def = T 0 gz), 27.3) S t z) def = P T > t z) = P [T 0 > t g z)] = S 0 t g z)), which is what 27.2) specifies. This also follows if for given Z = z, T T 0 gz). 27.4) Thus, in terms of r.v. s we can specify an AFT model as 2

ln T = ln T 0 ln g z)) 27.5) or, less stringently, given Z = z, ln T ln T 0 ln g z)) ; 27.6) equivalently, given Z = z, ln T + ln g z)) ln T 0. 27.7) Let µ 0 def = E ln T 0 ) and ϵ def = ln T 0 µ 0. Then 27.5) is equivalent to or ln T = ln T 0 ln g z)) = ln g z)) + µ 0 + ln T 0 µ 0 ln T = µ 0 ln g z) + ϵ, 27.8) where the random variable ϵ has mean 0 and a distibution that s independent of z. 27.6) is equivalent to, for given Z = z, with ϵ = lnt 0 ) µ 0. Now define Then lnt ) µ 0 lngz)) + ϵ ϵ = lnt ) µ 0 + lngz)). 27.9) lnt ) = µ 0 lngz)) + ϵ. Also in this more general case, for given Z = z, ϵ = lnt ) µ 0 + lngz)) lnt 0 ) lngz)) µ 0 + lngz)) = lnt 0 ) µ 0, a mean zero random variable whose distribution does not depend on Z = z. Special Case: - g z) = e βz. Then ln T = µ 0 βz + ϵ, ϵ 0, some variance). 3

Connection with PH Models PH Model: S t z) = S 0 t)) g 1z) for some g 1 ) AFT Model: S t z) = S 0 g 2 z) t) g 2 z). These in general are different models. They coincide iff T 0 Weibull cf: Cox & Oakes, 1984). Parametric Analysis - if no censoring Least Squares: find ln T i = µ βz i + ϵ i µ, β ) such that i = 1, 2,..., n ϵ i iid 0, σ 2) }{{} some distribution n i=1 ln T i µ βz i ) 2 is minimized. no specific parametric assumptions needed efficient if ln T i normal or T i lognormal) not efficient if ln T i is not normal e.g. ARE =.61 if T i Exp see Cox & Oakes, 1984). Parametric Analysis - with censoring see Buckley, J & James, I, 1979, Biometrika, Vol. 66, p 429 36. Semiparametric Inference Suppose ln T = µ 0 β z + ϵ Let ϵ 0, some variance) 4

n patients ϵ = ϵ + µ 0 β = β so that we can also write ln T = βz + ϵ, ϵ S 0 ) { mean not zero i th : ln T i = βz i + ϵ i, ϵ i iid S 0 ). Let C 1, C 2,..., C n be potential censoring times independent and independent of the T i given Z i ). Observe U i, δ i, Z i ) i = 1, 2,..., n U i = min T i, C i ) δ i = 1 T i C i ) } usual setting. Define e i β) = ln U i βz i and note that min ln T i βz i, ln C i βz i ) = min ϵ i, d i ) where d i = ln C i βz i. Also, ϵ i d i ln T i ln C i T i C i. Thus, 1 {ϵi d i } δ i. Thus, consider the data U i, δ i ) i = 1, 2,..., n where Thus U i = min ϵ i, d i ) Note ϵ i d i Z i. U i = min ln T i, ln C i ) βz i = ln min T i, C i )) βz i = ln U i ) βz i = e i β). Thus, we ve created data where the underlying survival time ϵ i doesn t depend on Z i! more later) Recall the PH model and Cox s PL score function n n n 1/2 U β ) = = n 1/2 =1 Z i Y ) u) Z e β Z Y u) e β Z dn i u). i=1 0 5

If there is no covariate effect i.e., β = 0), n 1/2 U β ) reduces to n 1/2 Z i Y ) u) Z i 0 Y dn i u). u) Also, we know n 1/2 U 0) L N 0, some variance) when β = 0. Let s consider this statistic and apply it to the data Ui, δ i). Note: for the true β, the underlying survival times ϵ i have a distribution that does not depend on Z i, and therefore follow a Cox propotional hazards model with parameter β = 0. S β) def = n 1/2 i or Note Thus, in summary 0 S β) = n 1/2 Z i n i=1 δ i Z 1 [e β) u] 1 [e β) u] Z i d1 [e i β) u and δ i = 1] Z 1 [e β) e i β)] 1 [e β) e i β)] e β) e i β) ln U βz ln U i βz i ln U U i ) β Z Z i ).. S β) L N 0, some var). Then use this as an estimating equation for β, i.e. find β ) s.t. S β = 0. Then, it can be shown that β p β ) L n β β N 0, some var). Note: Close connection to rank tests - Tsiatis, 1990, Ann. Stat, Vol 18, p 354 72. 6

General References to AFT Models Wei, Ying, Lin, 1990, JASA, Vol 79, p 649 652. Wei, 1992, Stat in Med, Vol 11, p 1871 1879. Exercises 1. Give an example of an accelerated failure time model involving 2 covariates: Z 1 =treatment group, and Z 2 =age. Use T to denote survival time. 7