
Clustered survival data (additive models). Additive-multiplicative hazards model
Advanced Survival Analysis 2010, Copenhagen
Giuliana Cortese, gco@biostat.ku.dk

Outline
- Clustered survival data
- Marginal additive models
- Additive-multiplicative hazards model
- Cox-Aalen model
- Proportional excess hazards models

Aalen additive model

Recall the Aalen additive hazards model

$$\lambda_i(t) = Y_i(t) X_i^T(t)\beta(t) = Y_i(t) \sum_{j=1}^{p} X_{ij}(t)\beta_j(t), \qquad (1)$$

and the semiparametric version

$$\lambda_i(t) = Y_i(t)\left[ X_i^T(t)\beta(t) + Z_i^T(t)\gamma \right]. \qquad (2)$$

These are models for the intensity of the observed counting process, i.e.

$$\lambda_i(t)\,dt = E\left[ dN_i(t) \mid \sigma(N_i(s), X_i(s), Z_i(s), Y_i(s),\ s \in [0, t)) \right].$$

Clustered survival data

In clustered data, failure times within each cluster may be correlated. Many examples:
- Time to blindness in both eyes
- Twin studies
- Recurrent events

Here, the choice of survival model depends on the aims of the study, such as
1) correlation of failure times within clusters (conditional models)
2) comparison of failure times across clusters (marginal models)

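Clustered survival data

Two basic approaches:
1) Frailty models: given random effects "Z", the survival times are independent with hazards $\lambda_0(t) Z \exp(X^T\beta)$.
2) Marginal models: given covariates, the marginal rate (hazard) is $Y(t)\lambda_0(t)\exp(X^T\beta) = E(dN(t) \mid X, Y(t) = 1)/dt$.

For $k = 1, \ldots, K$, $i = 1, \ldots, n$, let $\tilde T_{ik}$ and $C_{ik}$ be the failure and censoring times for the $i$th individual in the $k$th cluster, and let $X_{ik}(t)$ be a $p$-vector of covariates. Put $\tilde T_k = (\tilde T_{1k}, \ldots, \tilde T_{nk})$, $C_k = (C_{1k}, \ldots, C_{nk})$, $X_k(t) = (X_{1k}(t), \ldots, X_{nk}(t))$.

We assume that $(\tilde T_k, C_k, X_k(\cdot))$, $k = 1, \ldots, K$, are independent and identically distributed and follow the model described in the following. The right-censored failure time is denoted $T_{ik} = \tilde T_{ik} \wedge C_{ik}$ and as usual we let $Y_{ik}(t) = 1(T_{ik} \ge t)$.

Marginal models

Right-censored failure times: $T_{ik} = \tilde T_{ik} \wedge C_{ik}$, $Y_{ik}(t) = 1(T_{ik} \ge t)$, $N_{ik}(t) = 1(T_{ik} \le t,\ T_{ik} = \tilde T_{ik})$.

The marginal (intensity) Cox model is a model for the intensity with respect to the marginal filtration

$$\mathcal{F}_t^{ik} = \sigma\{N_{ik}(s), Y_{ik}(s), X_{ik}(s) : s \le t\}, \qquad (3)$$

$$\lambda_{ik}^{\mathcal{F}^{ik}}(t) = Y_{ik}(t)\lambda_0(t)\exp(X_{ik}^T(t)\beta). \qquad (4)$$

It is important to note that (4) is not the intensity with respect to the observed filtration

$$\mathcal{F}_t = \bigvee_k \mathcal{F}_t^{k}, \qquad (5)$$

where $\mathcal{F}_t^{k} = \sigma\{N_{ik}(s), Y_{ik}(s), X_{ik}(s) : i = 1, \ldots, n,\ s \le t\}$ is the information generated by observing all the individuals in the $k$th cluster.

Marginal additive models

Instead of the Cox model we assume the Aalen additive model for the marginal intensities,

$$\lambda_{ik}^{\mathcal{F}^{ik}}(t) = Y_{ik}(t) X_{ik}^T(t)\beta(t). \qquad (6)$$

Possible interaction terms are included in the covariate vector $X_{ik}^T(t)$.

Let $X_k(t)$ be the $n \times p$ matrix with $i$th row $(Y_{ik}(t)X_{ik,1}(t), \ldots, Y_{ik}(t)X_{ik,p}(t))$, and let

$$N_k(t) = (N_{1k}(t), \ldots, N_{nk}(t))^T, \qquad M_k(t) = N_k(t) - \int_0^t X_k(s)\,dB(s),$$

where $B(t) = \int_0^t \beta(s)\,ds$ is the vector of cumulative regression coefficients.

The $i$th component of $M_k(t)$, $M_{ik}(t)$, is a martingale with respect to $\mathcal{F}_t^{ik}$, but $M_k(t)$ is not a martingale with respect to the observed filtration $\mathcal{F}_t$.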
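The dependence that a shared frailty induces within clusters can be seen in a small simulation. The sketch below is an illustration only: the gamma frailty, the constant baseline hazard `lambda0`, the cluster size of two and all variable names are assumptions made for the example, not part of the lecture material.

```r
# Minimal sketch (assumed setup): K clusters of size 2 sharing a gamma frailty Z_k.
# Given Z_k, the two failure times are independent exponentials with hazard lambda0 * Z_k.
set.seed(42)
K       <- 2000                             # number of clusters (e.g. patients)
lambda0 <- 0.1                              # assumed constant baseline hazard
Z       <- rgamma(K, shape = 1, rate = 1)   # frailty with mean 1 and variance 1
T1      <- rexp(K, rate = lambda0 * Z)      # first member of each cluster
T2      <- rexp(K, rate = lambda0 * Z)      # second member of each cluster

# Conditionally independent given Z, but marginally dependent:
r_shared <- cor(log(T1), log(T2))

# Reference: destroy the cluster structure by permuting the second members.
r_broken <- cor(log(T1), log(sample(T2)))

print(c(shared = r_shared, broken = r_broken))
```

A marginal model leaves this dependence unspecified and only corrects the standard errors for it, whereas a frailty model describes it through the distribution of Z.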

Marginal additive models

The (unweighted) working independence estimator of $B(t)$ is

$$\hat B(t) = \int_0^t \Big[ \sum_{k=1}^{K} X_k^T(s) X_k(s) \Big]^{-1} \sum_{k=1}^{K} X_k^T(s)\,dN_k(s).$$

We have that

$$K^{1/2}(\hat B(t) - B(t)) = K^{-1/2} \sum_{k=1}^{K} \int_0^t \Big[ K^{-1} \sum_{l=1}^{K} X_l^T(s) X_l(s) \Big]^{-1} X_k^T(s)\,dM_k(s) = K^{-1/2} \sum_{k=1}^{K} \epsilon_k(t), \qquad (7)$$

which is essentially a sum of i.i.d. components (replace $K^{-1}\sum_{k} X_k^T(t) X_k(t)$ by its limit in probability).

One may also show that (7) has the same limit distribution as

$$K^{-1/2} \sum_{k=1}^{K} \hat\epsilon_k(t)\,G_k, \qquad (8)$$

where

$$\hat\epsilon_k(t) = \int_0^t \Big[ K^{-1} \sum_{l=1}^{K} X_l^T(s) X_l(s) \Big]^{-1} X_k^T(s)\,d\hat M_k(s), \qquad \hat M_k(t) = N_k(t) - \int_0^t X_k(s)\,d\hat B(s),$$

and $G_1, \ldots, G_K$ are independent standard normal variables.

The asymptotic covariance matrix of $K^{1/2}(\hat B(t) - B(t))$ is estimated consistently by

$$K^{-1} \sum_{k=1}^{K} \hat\epsilon_k^{\otimes 2}(t).$$

Resampling from (8) may be used to construct confidence bands and obtain p-values in tests about $B(t)$.

Example: Diabetic retinopathy data

Subset of 197 patients with diabetic retinopathy. The purpose of the study was to assess the efficacy of laser treatment in delaying onset of blindness. Treatment was assigned randomly to one eye of each patient; the other eye was left untreated.

id time status trteye treat adult
1 5 46.2496727 2 1 2
2 5 46.275538 2 2
3 14 42.568362 1 1 1
4 14 31.3414453 1 1 1
5 16 42.39841 1 1 1
6 16 42.274643 1 1
7 25 2.6443229 2 1 1
8 25 2.635121 2 1
9 29 38.7829463 2 1 1
1 29.3153673 1 2 1

> adult.treat <- (diabetes$adult==2)*(diabetes$treat)
> fit <- aalen(Surv(time,status) ~ adult+treat+adult.treat, diabetes, cluster=diabetes$id)
> plot(fit)
> summary(fit)
Additive Aalen Model

Test for nonparametric terms

Test for non-significant effects
             sup| hat B(t)/SD(t) |  p-value H_0: B(t)=0
(Intercept)  2.9.387
adult        1.86.534
treat        2.49.18
adult.treat  2.99.57

Test for time invariant effects
             sup| B(t) - (t/tau)B(tau)|  p-value H_0: B(t)=b t
(Intercept)  .187.791
adult        .144.765
treat        .11.553
adult.treat  .154.89

             int (B(t)-(t/tau)B(tau))^2dt  p-value H_0: B(t)=b t
(Intercept)  .597.669
adult        .161.94
treat        .849.713
adult.treat  .433.658
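The working independence estimator and the cluster-based variance estimate can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the timereg implementation: the function name `aalen_wi`, its arguments and the simulated toy data are all hypothetical, there is no handling of ties, and estimation simply stops once the at-risk design matrix becomes singular. The matrix `eps` holds $\hat\epsilon_k(t)/K$, so that the variance of $\hat B(t)$ is the sum over clusters of its squared rows.

```r
# Sketch: working independence Aalen estimator with cluster-robust standard errors.
# time = observed times, status = 1 for a failure, X = n x p design matrix
# (including an intercept column), cluster = cluster labels. Names are illustrative.
aalen_wi <- function(time, status, X, cluster) {
  X   <- as.matrix(X)
  ord <- order(time)
  time <- time[ord]; status <- status[ord]
  X <- X[ord, , drop = FALSE]; cluster <- cluster[ord]
  p  <- ncol(X)
  K  <- length(unique(cluster))
  ev <- which(status == 1)
  B    <- matrix(NA_real_, length(ev), p)  # B-hat at each failure time
  se   <- matrix(NA_real_, length(ev), p)  # robust standard errors of B-hat
  eps  <- matrix(0, K, p)                  # running cluster sums, eps-hat_k(t) / K
  Bcur <- numeric(p)
  used <- 0
  for (j in seq_along(ev)) {
    i    <- ev[j]
    Xr   <- X * (time >= time[i])          # rows Y_i(t) X_i(t); zero when not at risk
    A    <- crossprod(Xr)                  # sum over clusters of X_k^T X_k
    if (rcond(A) < 1e-10) break            # stop once the design is singular
    Ainv <- solve(A)
    dN   <- as.numeric(seq_along(time) == i)
    dB   <- Ainv %*% crossprod(Xr, dN)     # increment of B-hat
    dM   <- dN - Xr %*% dB                 # increments of the residuals M-hat
    S    <- rowsum(Xr * as.numeric(dM), cluster)  # K x p, row k = X_k^T dM_k
    eps  <- eps + S %*% Ainv
    Bcur <- Bcur + as.numeric(dB)
    B[j, ]  <- Bcur
    se[j, ] <- sqrt(colSums(eps^2))
    used <- j
  }
  keep <- seq_len(used)
  list(time = time[ev][keep], B = B[keep, , drop = FALSE],
       se = se[keep, , drop = FALSE], eps = eps)
}

# Toy data (assumed): 100 clusters of size 2 with a shared gamma frailty.
set.seed(1)
K <- 100; n <- 2
cluster <- rep(seq_len(K), each = n)
x       <- rbinom(K * n, 1, 0.5)
Z       <- rep(rgamma(K, shape = 2, rate = 2), each = n)
Tstar   <- rexp(K * n, rate = Z * (0.2 + 0.3 * x))
C       <- runif(K * n, 0, 5)
time    <- pmin(Tstar, C)
status  <- as.numeric(Tstar <= C)
fit     <- aalen_wi(time, status, cbind(1, x), cluster)
```

With an intercept-only design the estimator reduces to the Nelson-Aalen estimator, which gives a simple sanity check. Resampling as in (8) would additionally require storing the cluster processes at every failure time and multiplying them by standard normal draws.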

Example: Diabetic retinopathy data

The marginal additive model may be simplified by assuming a time-constant effect for the interaction term.

> fit.s <- aalen(Surv(time,status) ~ adult+treat+const(adult.treat), diabetes, cluster=diabetes$id)
> par(mfrow=c(2,2)); plot(fit, sim.ci=2)

Test for non-significant effects
             Supremum-test of significance  p-value H_0: B(t)=0
(Intercept)  2.51.118
adult        2.1.3
treat        2.48.169

Test for time invariant effects
             Kolmogorov-Smirnov test  p-value H_0:constant effect
(Intercept)  .243.73
adult        .127.3
treat        .689.828

Parametric terms :
                    Coef.  SE  Robust SE  z  P-val
const(adult.treat)  -.921.382.339-2.41.16

Figure: Diabetes data. Cumulative regression estimators (panels: (Intercept), adult, treat, adult.treat) along with 95% confidence intervals (full lines) and uniform bands (broken lines).

What are the conclusions about the interaction? What does its regression coefficient express?

Marginal semiparametric additive models

One can also consider a semiparametric additive model, where the marginal intensities are assumed to be

$$\lambda_{ik}^{\mathcal{F}^{ik}}(t) = Y_{ik}(t)\left[ X_{ik}^T(t)\beta(t) + Z_{ik}^T(t)\gamma \right], \qquad (9)$$

where $\gamma$ is an unknown $q$-dimensional vector of time-constant coefficients.

The working independence estimators of $B(t)$ and $\gamma$ can be found similarly to the nonparametric model. Again, an i.i.d. decomposition for $K^{1/2}(\hat B(t, \hat\gamma) - B(t))$ and for $K^{1/2}(\hat\gamma - \gamma)$ can be obtained and used for resampling.

Multiplicative and additive hazards models

If one compares the two models:

Multiplicative intensity models: $\alpha_i(t) = Y_i(t)\exp\left(\beta(t)^T X_i(t) + \gamma^T Z_i(t)\right)$

Additive models: $\alpha_i(t) = Y_i(t)\left(\beta(t)^T X_i(t) + \gamma^T Z_i(t)\right)$

- The multiplicative model is very attractive, but smoothing is needed!
- The multiplicative model leads to a relative-risk type summary.
- The additive model is very appealing because it has no smoothing parameter, but hazard predictions may be negative.
- The additive model leads to an excess risk interpretation.
- Some effects are additive (competing risks model), and some will be multiplicative (smoking).
- Goodness-of-fit will seldom clearly decide which model is the correct one.
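The relative-risk and excess-risk readings of the two models can be made concrete with time-constant coefficients. The numbers and function names below are assumptions chosen only for illustration; they are not estimates from any data set in these slides.

```r
# Assumed constant coefficients, chosen only for illustration.
beta0      <- 0.10      # baseline hazard (per year) for Z = 0
gamma_mult <- log(2)    # multiplicative model: log relative risk for Z = 1
gamma_add  <- 0.10      # additive model: excess hazard for Z = 1

haz_mult <- function(z, b0 = beta0) exp(log(b0) + gamma_mult * z)
haz_add  <- function(z, b0 = beta0) b0 + gamma_add * z

# Relative risk under the multiplicative model, at two different baselines.
rr <- c(haz_mult(1) / haz_mult(0), haz_mult(1, 0.5) / haz_mult(0, 0.5))

# Excess risk under the additive model, at the same two baselines.
er <- c(haz_add(1) - haz_add(0), haz_add(1, 0.5) - haz_add(0, 0.5))

# A protective additive effect that exceeds the baseline gives a negative
# predicted hazard.
haz_neg <- beta0 + (-0.15) * 1

print(list(relative_risk = rr, excess_risk = er, negative_hazard = haz_neg))
```

In both cases the summary is free of the baseline level: a ratio for the multiplicative model and a difference for the additive model. Only the additive form can produce a negative hazard.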

Survival estimation for proportional/additive models

The survival estimate is

$$S_i(t) = \exp\Big( -\int_0^t \lambda_i(s)\,ds \Big).$$

Therefore estimate $\int_0^t \lambda_i(s)\,ds$:
- Cox: $\int_0^t \exp(\beta^T X_i)\,d\Lambda_0(s)$.
- Multiplicative: $\int_0^t \exp(\beta(s)^T X_i)\,d\Lambda_0(s)$.
- Aalen: $\int_0^t dB(s)^T X_i = B(t)^T X_i$.
- Cox-Aalen: $\int_0^t \big(dB(s)^T X_i\big)\exp(\gamma^T Z_i)$.

In the multiplicative model $S_i(t)$ needs the estimates of $\beta(s)$, and thus it depends on the bandwidth. Asymptotics are more complicated.

In the additive model $S_i(t)$ does not need smoothing; problems with negative hazard predictions can be handled with post-processing.

Multiplicative-additive hazards models

The additive and multiplicative hazards models may be combined as follows.

Cox-Aalen model

$$\lambda_i(t) = Y_i(t)\left[ X_i^T(t)\alpha(t) \right]\exp\{Z_i^T(t)\beta\}.$$

Here there is an additive model for the baseline based on the covariate $X$. The interpretation of relative risks for $Z$ is the same as in the Cox model, while the coefficients $\alpha(t)$ represent the excess baseline due to the presence of $X$.

Proportional excess hazards model

$$\lambda_i(t) = Y_i(t) X_i^T(t)\alpha(t) + \rho_i(t)\lambda_0(t)\exp\{Z_i^T(t)\beta\},$$

where $Y_i(t)$ and $\rho_i(t)$ are at-risk indicators and $\lambda_0(t)$ is the baseline hazard of the excess risk term.

Estimation in Cox-Aalen

$$\lambda_i(t) = Y_i(t)\left[ X_i^T(t)\alpha(t) \right]\exp\{Z_i^T(t)\beta\}.$$

Let $Y(\beta, t)$ denote the $n \times p$ matrix with $i$th row $Y_i(t)\exp(Z_i^T(t)\beta)X_i^T(t)$. The log-likelihood function leads to the score equations for $\beta$ and $dA(t)$,

$$\int_0^\tau Z(t)^T \{ dN(t) - Y(\beta, t)\,dA(t) \} = 0,$$

$$Y(\beta, t)^T W(t) \{ dN(t) - Y(\beta, t)\,dA(t) \} = 0,$$

where $W(t) = \mathrm{diag}(w_i(t))$ with

$$w_i(t) = \frac{Y_i(t)}{\lambda_i(t)} = \frac{Y_i(t)\exp(-Z_i(t)^T\beta)}{X_i(t)^T\alpha(t)}, \qquad i = 1, \ldots, n.$$

For known $\beta$ this leads to the estimator of the cumulative intensity

$$\hat A(\beta, t) = \int_0^t Y^-(\beta, s)\,dN(s), \qquad Y^-(\beta, t) = \big( Y(\beta, t)^T W(t) Y(\beta, t) \big)^{-1} Y(\beta, t)^T W(t).$$

Inserting this estimator into the score equation for $\beta$ gives $U(\beta) = 0$ with

$$U(\beta) = U(\beta, \tau) = \int_0^\tau Z^T(t) G(\beta, t)\,dN(t), \qquad (10)$$

where $G(\beta, t) = I - Y(\beta, t)Y^-(\beta, t)$ is the projection onto the space orthogonal to that spanned by the columns of $Y(\beta, t)$.

It is easy to see that

$$U(\beta, t) = \int_0^t Z^T(s) G(\beta, s)\,dM(s)$$

is a square integrable martingale.
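The martingale property of the score process rests on the identity $G(\beta,t)Y(\beta,t) = 0$: since $dN(t) = Y(\beta,t)\,dA(t) + dM(t)$, applying $G$ removes the compensator and leaves $G\,dN = G\,dM$. The sketch below checks the algebra numerically. The matrices are random placeholders standing in for $Y(\beta,t)$ and $W(t)$ at a fixed time point; they are assumptions for the illustration and do not come from any data set.

```r
set.seed(3)
n <- 8; p <- 2

# Stand-in for Y(beta, t): i-th row is exp(Z_i' beta) X_i' for subjects at risk.
Y <- cbind(1, rnorm(n)) * exp(0.3 * rnorm(n))

# Stand-in for W(t): a diagonal matrix of positive weights.
W <- diag(runif(n, 0.5, 2))

# Weighted generalized inverse Y^-(beta, t) and the matrix G(beta, t).
Yminus <- solve(t(Y) %*% W %*% Y) %*% t(Y) %*% W
G      <- diag(n) - Y %*% Yminus

err_left_inverse <- max(abs(Yminus %*% Y - diag(p)))  # Y^- Y = I
err_annihilate   <- max(abs(G %*% Y))                 # G Y = 0
err_idempotent   <- max(abs(G %*% G - G))             # G G = G

print(c(err_left_inverse, err_annihilate, err_idempotent))
```

All three quantities are zero up to rounding error, for any choice of positive weights.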

Example: PBC data

> fit <- cox.aalen(Surv(time/365, status) ~ prop(age) + edema +
+   prop(log(bili)) + prop(log(alb)) + log(protime), pbc,
+   max.time = 3000/365)
> summary(fit)

Test for time invariant effects
              sup| B(t) - (t/tau)B(tau)|  p-value H_0: B(t)=b t
(Intercept)   4.557.41
edema         .596.334
log(protime)  1.815.424

Proportional Cox terms :
                 Coef.  Std. Error  Robust SE  D2log(L)^-1
prop(age)        .35.7.1.8
prop(log(bili))  .8.78.87.87
prop(log(alb))   -2.451.676.647.675

Score Tests for Proportionality
                 sup| hat U(t) |  p-value H_0
prop(age)        75.46.686
prop(log(bili))  17.358.14
prop(log(alb))   .516.992

Figure: cumulative regression functions; panels (Intercept), Edema, pro.m.

Proportional excess hazard models

The model can be seen as an excess risk model (Martinussen & Scheike, 2002):

$$\lambda_i(t) = Y_i(t) X_i^T(t)\alpha(t) + \rho_i(t)\lambda_0(t)\exp\{Z_i^T(t)\beta\}, \qquad (11)$$

- $\rho_i = 1$ for all $i$ gives a mix of Aalen's and Cox's models.
- The Cox term represents the excess risk for an exposed subject with covariates $Z$.
- $\rho_i$ is an excess indicator, e.g. $I(d_i > 0)$ with $d_i$ the dose for the $i$th subject.

The additive term $X_i^T(t)\alpha(t)$ can be estimated from the study. It can also be replaced by a known function of $X$, $\alpha(t, X_i)$, which represents the background mortality rate of a control population (often in cancer studies) (Sasieni, 1996):

$$\lambda_i(t) = Y_i(t)\left[ \alpha(t, X_i) + \lambda_0(t)\exp\{Z_i^T(t)\beta\} \right]. \qquad (12)$$

Example: Melanoma data

For model (11):

data(melanoma)
lt <- log(melanoma$thick)
excess <- (melanoma$thick>=210)   # excess risk for thick tumors
fit <- prop.excess(Surv(days/365,status==1) ~ sex + ulc + cox(sex) + cox(ulc) + cox(lt),
                   melanoma, excess=excess, n.sim=100)

For model (12):

data(mela.pop)
out <- pe.sasieni(Surv(start,stop,status==1) ~ age + sex, mela.pop,
                  id=1:205, max.time=7, offsets=mela.pop$rate, n.sim=100)
summary(out)

Proportional terms:
     coef  se(coef)  z  p
age  .327.125.261.794
sex  .472.365 1.29.196

Test for Proportionality
     sup| hat U(t) |  p-value H_0
age  56..89
sex  3.77.43

ul <- out$cum[,2]+1.96*out$var.cum[,2]^.5
ll <- out$cum[,2]-1.96*out$var.cum[,2]^.5
plot(out$cum, type="s", ylim=range(ul,ll))
lines(out$cum[,1], ul, type="s"); lines(out$cum[,1], ll, type="s")
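Under model (12) the cumulative hazard is the sum of a background part and an excess part, so the survival function factorizes into a background survival and an excess survival term. A minimal sketch with time-constant rates follows; all numbers and function names are made-up assumptions for the illustration, not estimates from the melanoma data.

```r
# Assumed constant rates (illustration only).
alpha_bg <- 0.02                  # known background mortality rate alpha(t, X), per year
lambda0  <- 0.05                  # excess baseline hazard
beta     <- 0.4                   # log relative excess risk for Z = 1
tt       <- seq(0, 7, by = 0.5)   # time grid in years

surv_total  <- function(tt, z) exp(-(alpha_bg + lambda0 * exp(beta * z)) * tt)
surv_bg     <- function(tt)    exp(-alpha_bg * tt)
surv_excess <- function(tt, z) exp(-lambda0 * exp(beta * z) * tt)

# Factorization: S(t) = S_background(t) * S_excess(t)
gap <- max(abs(surv_total(tt, 1) - surv_bg(tt) * surv_excess(tt, 1)))

# The covariate acts multiplicatively on the excess hazard only.
excess_ratio <- (lambda0 * exp(beta * 1)) / (lambda0 * exp(beta * 0))

print(c(gap = gap, excess_ratio = excess_ratio))
```

The ratio `excess_ratio` equals `exp(beta)`, which is the interpretation of the Cox coefficients in model (12): a relative risk for the excess mortality, not for the total mortality.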

Exercise on Diabetes data

Consider the Diabetes data in the timereg package. Start from the following model:

fit.s <- aalen(Surv(time,status) ~ adult+treat+const(adult.treat), diabetes, cluster=diabetes$id)

Try to simplify the above marginal semiparametric additive model further, using successive GOF tests. What is a possible final model?

Solve Exercise 9.1 of the book by Martinussen & Scheike (2006).

Exercise on TRACE data

The TRACE study group conducted a study aimed at assessing the prognostic importance of various risk factors on mortality for approximately 6000 patients with myocardial infarction. The patients had various risk factors recorded, such as age, sex (male=1), congestive heart failure (chf) (present=1), diabetes (present=1) and ventricular fibrillation (vf) (present=1).

Consider the TRACE data in the timereg package (a subset of patients from the original data) with the above covariates.

fit.tr <- cox.aalen(Surv(time,status==9) ~ chf + age + sex + prop(diabetes) + vf, TRACE)
plot(fit.tr, sim.ci=2, xlab="Time (years)")

Decide which model or models (Cox, additive, Cox-Aalen) fit the TRACE data best, and arrive at a final conclusion by using GOF procedures.

Compare the final model with the results from a Cox model with "vf" as a stratification factor. Can this be obtained from cox.aalen?

Plot survival predictions, given some covariate values, for your final model and for the above stratified Cox model. By comparing the estimated curves, what may we conclude?
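For an additive model, a survival prediction needs only the cumulative coefficients, since $S(t) = \exp(-x^T B(t))$. The rough sketch below assumes the estimates are stored in the layout of `out$cum` used earlier, with time in the first column followed by one column per cumulative coefficient. The numbers are invented, the helper `predict_surv` is hypothetical, and the monotone correction is only one simple way of post-processing negative hazard increments.

```r
# Invented cumulative coefficients: time, then B_j(t) for intercept and treat.
cum <- cbind(time      = c(0, 1, 2, 3, 4),
             intercept = c(0, 0.10, 0.22, 0.30, 0.41),
             treat     = c(0, -0.02, -0.16, -0.20, -0.25))

# Hypothetical helper: survival prediction for covariate vector x.
predict_surv <- function(cum, x) {
  H <- as.numeric(cum[, -1, drop = FALSE] %*% x)  # cumulative hazard x' B(t)
  H <- cummax(pmax(H, 0))   # post-processing: non-negative and non-decreasing
  data.frame(time = cum[, 1], surv = exp(-H))
}

s0 <- predict_surv(cum, c(1, 0))   # untreated
s1 <- predict_surv(cum, c(1, 1))   # treated

plot(s0$time, s0$surv, type = "s", ylim = c(0, 1),
     xlab = "Time (years)", ylab = "Survival")
lines(s1$time, s1$surv, type = "s", lty = 2)
```

For the treated profile the raw cumulative hazard dips between times 1 and 2, and the correction holds it at its running maximum so that the predicted survival curve never increases.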