Outline. Cox's regression model Goodness-of-t methods. Cox's proportional hazards model: Survival analysis

Outline Cox's regression model Goodness-of-t methods Giuliana Cortese gico@biostat.ku.dk Advanced Survival Analysis 21 Copenhagen Cox's proportional hazards model: assumptions Goodness-of-t for Cox's model: graphical procedures and tests. Possible solutions to lack-of-t and extensions of Cox's model. Survival analysis Standard setup for right-censored survival data. T1,..., T i,..., T n are i.i.d. copies of (T, D) where T = T C D = I (T C) with T the true survival time and C the (potential) censoring time. Hazard-function Counting process Martingale 1 α(t) = lim h h P(t T < t + h T t, F ). N i(t) = I (T i t, D i = 1) M i(t) = N i(t) Λ i(t) where Λ i(t) = Y i(s)α(s) ds, Y i(t) = I (t T i), (compensator) (at risk process). Cox's proportional hazards model One studies a covariate vector: X i(t) (p-dimensional). Hazard conditional on covariates: α i(t, X i). Cox's proportional hazards model: α i(t) = α(t) exp (β T X i) where α(t) is unspecied baseline hazard (hazard for X i = ). This model has the nice property that the hazards ratios are constant over time. For a covariate X1 (other covariates being xed) then the relative risk α(t, X1 + 1) = exp (β1) α(t, X1) is not depending on time (key assumption)!

Cox's proportional hazards model Assumptions in Cox's model are proportional hazards (regression coecients are time-constant), functional form of continuous covariates, Examples: X (linear), X 2, log(x ), link function between hazard and linear predictor (log-linearity). If these conditions are not violated, then Cox's model ts our data well. One needs to check goodness-of-t of a regression model! Usually each assumption is tested at a time, while assuming the other two hold true (dicult to test simultaneously!). PBC data (primary biliary cirrhosis): 418 patients are followed until death or censoring. PBC is a fatal chronic liver disease. Important explanatory variables: Age Albumin Bilirubin Edema Prothrombin time Fitting Cox's model in R. Graphical procedure for testing proportional hazards > library(survival) > data(pbc) > attach(pbc) > cbind(time,status,age,edema,bili,protime,alb)[1:5,] time status age edema bili protime alb 4 1 58.7652 1 14.5 12.2 2.6 [1,] [2,] 45 56.4463 1.1 1.6 4.14 [3,] 112 1 7.726 1 1.4 12. 3.48 [4,] 1925 1 54.746 1 1.8 1.3 2.54 [5,] 154 38.154 3.4 1.9 3.53 > sum(status) [1] 161 > fit.pbc<-coxph(surv(time/365, status) ~ age+edema+log(bili)+log(protime)+log(alb)) > fit.pbc Call: coxph(formula = Surv(time/365, status) ~ age +edema +log(bili) +log(protime) +log(alb)) coef exp(coef) se(coef) z p.382 1.39.768 4.97 6.5e-7 age edema.6613 1.937.2595 3.21 1.3e-3 log(bili).8975 2.453.8271 1.85.e+ log(protime) 2.3458 1.442.77425 3.3 2.4e-3 log(alb) -2.4524.86.6577-3.73 1.9e-4 Likelihood ratio test=234 on 5 df, p= n= 418 Stratify according to X1, categorical with k levels, λ(t) = Y (t)λk(t) exp(β2x2(t) + + β px p(t)), with k = 1,..., K denoting the K possible strata. Now, if Cox's model is valid the stratied baselines will be (approximately) proportional λk(t) = λ(t)θ k. Graphical representation of log(ˆλk(t)) for k = 1,..., K! If hazards are proportional, plots of estimated curves versus t should be approximately parallel. It helps to plot dierences log(ˆλk(t)) log(ˆλ1(t)).

Graphical procedure Graphical procedure with condence intervals Edema Alb: 1.quartile Log(Cumulative baseline hazard) 13 12 11 1 9 8 7 2 4 6 8 1 Log(Cumulative baseline hazard) 11 1 9 8 7 6 5 2 4 6 8 1 Difference in log cumulative hazard Difference in log cumulative hazard 2 1 1 2 3 4 2 1 1 2 3 4 2 4 6 8 1 Alb: 2.quartile Difference in log cumulative hazard Difference in log cumulative hazard 2 1 1 2 3 4 2 1 1 2 3 4 2 4 6 8 1 Alb: 3.quartile 2 4 6 8 1 2 4 6 8 1 Graphical procedure with condence bands broken=pointwise, Red = correct uniform band, blue = without variation in Cox estimate Drawbacks of graphical procedure Difference in log cumulative hazard betahat 2 1 1 2 3 4 Edema Graphical methods based on cumulative hazards are useful but also have some limitations: They can suggest type of deviation from proportionality, only if a single covariate is in Cox model. Continuous covariates need to be categorized arbitrarily. How much can estimated curves deviate from being parallel? Dicult to see in practice! Help in plotting dierences log(ˆλk(t)) log(ˆλ1(t)). Useful to plot condence intervals/bands, but hard in practice. It is not simple to work with asymptotic properties of ˆΛk(t), but i.i.d. decomposition can be used to do this by resampling techniques. 2 4 6 8 1

Alternative methods for checking proportionality Cumulative martingale residuals There exists alternative methods based on both graphical procedures and test. Graphical procedures based on Schoenfeld residuals, smoothing. Tests ad hoc where specic deviations from proportionality are veried ( β1(t) = β1 + θlog(t) ), coe. being time-constant if θ = ). Tests and plots based on cumulative martingale residuals. As alternative: Cumulative martingale residuals (Lin et al., 1993). The martingales under the Cox regression model can be written as M i(t) = N i(t) Y i(s) exp(x T β)dλ(s) T ˆM i(t) = N i(t) Y i(s) exp(x ˆβ)d ˆΛ(s) = N i(t) Y i(s) exp(x T i (s) ˆβ) 1 dn (s). S(s, ˆβ) One idea is now to look at dierent groupings of the these residuals and see if they behave as they should under the model. Cumulative martingale residuals Cumulative martingale residuals: iid decomposition The score function, evaluated in the estimate ˆβ and seen as a function of time (observed score process), can be written as U( ˆβ, t) = X i(s)d ˆMi(s) = (X i(s) E(t, ˆβ))d ˆMi(s). where S(t, β) = Yi (t) exp(x T i (t)β), S1(t, β) = Yi (t) exp(x T i (t)β)xi (t), S2(t, β) = Yi (t) exp(x T i (t)β)xi (t) 2 S1(t, β) E(t, β) = S(t, β). and with minus the derivative of the score ( ) S2(s, β) I (t, β) = E(s, β) 2 dn i(s) S(s, β) n 1/2 U( ˆβ, t) is asymptotically equivalent to the process n 1/2 W (t, β) = n (M1(t) 1/2 I (t, ˆβ) I 1 (τ, ˆβ) ) M1(τ), where M1(t) = M1i(t) = (X i(s) e(s, β)) dm i(s) with e(t, β) = lim p E(t, β). M1(t) is a sum of iid terms with mean zero, and n 1/2 W (t, β) converges to a zero-mean Gaussian process.

Cumulative martingale residuals: resampling The asymptotic distribution of n 1/2 W (t, β) can be obtained by resampling procedures. The distribution of the process n 1/2 M1(t) (t [, τ]) is asymptotically equivalent to n 1/2 (Xi (s) E(s, ˆβ))dNi (s)gi where G1,..., Gn are independent standard normals. Since Mi 's are i.i.d. with variance E(Ni ), they are equivalent to processes Ni Gi with known distribution. Alternatively to this resampling approach, it is also shown (Lin et al., 2) that n 1/2 M1(t) is asympt. equivalent to n 1/2 ˆM1i (t)gi, with ˆM1i (t) = (Xi (s) E(s, ˆβ))d ˆMi (s). Therefore it can be shown that n 1/2 U( ˆβ, t) is asymptotically equivalent to ( n 1/2 Ŵ (t, ˆβ) = n 1/2 ˆM1i (t) I (t, ˆβ) I 1 (τ, ˆβ) ˆM1i (τ)) for almost any sequence of the counting processes that determines ˆM and I. Gi, Cumulative martingale residuals: test for proportional hazards A summary test statistic can be sup Uj( ˆβ, t) or t [,τ] sup t [δ,τ δ] Uj( ˆβ, t) var(uj( ˆ ˆβ, t)) = sup t [δ,τ δ] U w j ( ˆβ, t) j = 1..., p, where δ is a small positive number to avoid numerical problems at the edges, and var(u( ˆ ˆβ, t)) = ( ˆM1i (t) I (t, ˆβ)I 1 (τ, ˆβ) ) 2 ˆM1i (τ), is a consistent estimator of the variance of the observed score process (but also other choices). Graphical procedures can suggest type of departure from proportionality: plots of Uj( ˆβ, t) or U w j ( ˆβ, t) versus t, for j = 1,..., p, together with simulated processes Ŵ h(t, ˆβ) or Ŵ h(t, ˆβ)/ var(uj( ˆ ˆβ, t)) (h = 1,..., 1) under the model. > library(timereg) > fit<-cox.aalen(surv(time/365,status==2)~ +prop(edema)+prop(logbili)+ + +, weighted.score=,pbc) Right censored survival times, Cox-Aalen Survival Model, Num Simulations = 5 > summary(ourcox) Score Test for Proportionality sup hat U(t) p-value H_ 12..372 prop(edema) 5.61.22 prop(logbili) 13.6.114 2.28.6 1.25.446 2 1 1 2 6 4 2 2 4 6 prop(edema) 15 5 5 1 15 prop(logbili) 2 4 6 8 1 2 4 6 8 1 2 4 6 8 1 > plot(fitw,score=t,xlab="",ylab="") > fitw<-cox.aalen(surv(time/365,status==2)~ +prop(edema)+prop(logbili)+ + +, weighted.score=1,pbc) > summary(ourcox) Score Test for Proportionality sup hat U(t) p-value H_ 1.84.658 prop(edema) 3.12.44 prop(logbili) 3.17.58 3.96.2 2..548 2 1 1 2 2 1 1 2 > plot(fitw,score=t,xlab="",ylab="") 2 4 6 8 1 2 4 6 8 1

Schoenfeld residuals With 3 2 1 1 2 3 3 2 1 1 2 3 prop(edema) 4 2 2 4 prop(logbili) { } 2 S2(t, β) S1(t, E(t, β) = S1(t, β)/s(t, β); V (t, β) = S(t, β) β), S(t, β) the Schoenfeld residuals (Schoenfeld,1982) are dened as r k(β) = X (k)(t k) E(t k, β), 2 4 6 8 1 2 4 6 8 1 2 4 6 8 1 and the scaled Schoenfeld residuals (Grambsch and Therneau, 1994) as r k = V 1 (t k, β)r k(β) 4 2 2 4 2 4 6 8 1 4 2 2 4 2 4 6 8 1 with X (k) the covariate vector of the subject failing at t k. Given Cox estimates ˆβ, if proportional hazards hold, the estimated residuals ˆr = r ( ˆβ) have E(ˆr ) = k. k k k Plots of ˆr kj versus t k (j = 1,..., p) should be centered around the zero line. Schoenfeld residuals Schoenfeld residuals Assume a Cox model with time-varying coecients of type β j(t) = β j + θ jg j(t), j = 1,..., p, (1) with g j(t) as known functions (predictable). Interest in testing H : θ j = (β j(t) = β j) for all covariates j. Under (1), using a Taylor-series expansion of E(t, β(t)) about β(t) = β, it is E(rkj(β)) θ jg j(t), j. It follows that E(ˆr kj) + ˆβ j β j(t). Plots and smoothing of (r + ˆβ) gives a way of estimating β(t). k ˆθ can be obtained as one-step Newton-Raphson maximum likelihood estimate, based on the constant start β(t) = ˆβ. Under a Cox's model with β(t) = β + θg(t), standard asymptotics are still valid with parameter (β, θ) and covariates (Xi, Xi g(t)). The score test statistic gives many goodness-of-t tests for H : θj =, for dierent choices of g(t). The score function around (ˆβ, ) becomes U( ˆβ, ) = g(t)(xi E( ˆβ, s))dni (s), i therefore if this score is equal to for all g(t), it follows that the cumulative Schoenfeld residuals and the Schoenfeld residuals must be (Xi E( ˆβ, s))dni (s) = t or (Xi E( ˆβ, s))dni (s) =. Under the null, the score statistic is T (g(t)) = U( ˆβ, )J 1 22 U( ˆβ, ), with J22( ˆβ, ) being the block for θ in the observed information matrix. What if H : θj = is not rejected for some choices of g(t)?

> library(survival) > fit.pbc <- coxph(surv(time/365, status==2) ~ age+as.factor(edema) +log(bili)+ + log(protime)+log(albumin)) > stest= cox.zph(fit.pbc,transform="log") > stest > par(mfrow=c(2,3)) > for (i in 1:6) { > plot(stest[i]) > abline(,, lty=3) > } rho chisq p.829 9.15e-5.99237 age as.factor(edema).5 -.135388 2.97e+.8472 as.factor(edema)1 -.36895 2.14e-1.64366 log(bili).14625 1.53e+.21588 log(protime) -.297482 9.51e+.24 log(albumin).34176 1.99e-1.65569 GLOBAL NA 1.57e+1.1537 > plot(stest$x, stest$y[,2]) Add the linear fit of regression line r^*_k2 ~ log(time) > abline(lm(stest$y[,2] ~ stest$x)$coefficients, lty=4, col=3) Beta(t) for log(bili) Beta(t) for age..1 1 1 2 3.1.2 Beta(t) for as.factor(edema).5 4 2 2 4 6.1.5 2. 1..1.5 2. 1..1.5 2. 1. Beta(t) for log(protime) 1 1 2 3 Beta(t) for as.factor(edema)1 5 5 1 Beta(t) for log(albumin) 4 3 2 1 1.1.5 2. 1..1.5 2. 1..1.5 2. 1. for checking other assumptions Lin et.al (1993) suggested a more general cumulated sum over t and z: Mc(t, z) = K T z (s)d ˆM(s) where Kz(t) = f (Xi ) I (Xi (t) z). Some special cases of Mc(t, z) are: Global model assessment. Kz(t) is an n 1 matrix with elements I (Xi1(t) z) for i = 1,..., n focusing here on the continuous covariate X1. Thus one has cumulative residuals versus both time and the covariate values. Checking functional form of covariates. A 1-dimensional process can be considered: τ Mc(z) = K T z (t)d ˆM(t), (2) where Kz(t) has still elements I (Xi1(t) z). Plots of Mc(z) versus z can reveal if the functional form is appropriate. Checking the link function As in (2), but Kz(t) is an n 1 matrix with elements I (X T i (t) ˆβ z). These processes can also be resampled. Results can be used for graphical procedures and for estimating p-values for supremum tests. PBC data: To show if the there are problems for the levels of a continuous covariate over time one may also group the residuals after its level. > fit.pbc <- cox.aalen(surv(time/365, status==2) ~ +prop(edema)+prop(logbili)++, + residuals=1, n.sim=,pbc) > X <- model.matrix( ~ -1+ cut(bili,quantile(bili),include.lowest=t),pbc) > colnames(x) <- c("1.quartile","2.quartile","3.quartile","4.quartile") > resids <- cum.residuals(fit.pbc,pbc,x,n.sim=1,cum.resid=1) > summary(resids) Test for cumulative MG-residuals Grouped Residuals consistent with model sup hat B(t) p-value H_: B(t)= 6.11.27 1. quartile 2. quartile 7.719.246 3. quartile 11.778.76 4. quartile 15.61.16 int ( B(t) )^2 dt p-value H_: B(t)= 123.763.326 1. quartile 2. quartile 24.885.272 3. quartile 873.61.46 4. quartile 1127.889.44 Residual versus covariates consistent with model sup hat B(t) p-value H_: B(t)= 8.98.454 prop(edema) 2.76.39 prop(logbili) 1.816.6 7.613.476 7.391.57

PBC data: levels of Bilirubin PBC data: martingale residuals for functional form 1. quartile 2. quartile prop(logbili) 1 5 5 1 15 5 5 15 2 4 6 8 1 2 4 6 8 1 3 4 5 6 7 8 3. quartile 4. quartile 15 5 5 15 15 5 5 15 15 5 5 1 1 5 1 1 1 2 3 prop(logbili) 1 5 1 1 5 1 2 4 6 8 1 2 4 6 8 1 2.2 2.4 2.6 2.8.8 1. 1.2 1.4 PBC data: martingale residuals for functional form PBC data: levels of Bilirubin > fit.pbc <- cox.aalen(surv(time/365, status==2) ~ +prop(edema)+prop(bili)+ + +, residuals=1, n.sim=,pbc) 1. quartile 2. quartile > resids <- cum.residuals(fit.pbc,pbc,x,n.sim=1,cum.resid=1) > summary(resids) Test for cumulative MG-residuals Grouped Residuals consistent with model sup hat B(t) p-value H_: B(t)= 1. quartile 19.712. 2. quartile 5.557.559 3. quartile 11.363.6 2 1 1 2 2 4 6 8 1 1 5 1 2 4 6 8 1 4. quartile 1.388.138 int ( B(t) )^2 dt p-value H_: B(t)= 1. quartile 223.251. 2. quartile 128.63.496 3. quartile 4. quartile 3. quartile 8.76.43 4. quartile 418.33.216 Residual versus covariates consistent with model sup hat B(t) p-value H_: B(t)= 8.51.461 prop(edema) 3.92.211 2 1 2 1 5 1 prop(bili) 33.989. 2 4 6 8 1 2 4 6 8 1 11.367.72 4.875.967

PBC data: functional form General cumulative residuals process prop(bili) 1 5 1 3 1 1 3 Dene a cumulative residual process by M U(t) = U T (t)d ˆM(s) = U T (t)g( ˆβ, s)dm(s) with G(β, t) = I Y (β, t)y (β, t) 3 4 5 6 7 8 5 1 15 2 25 Y (β, t) = diag(exp(x T i (t)β))y (t), Y (β, t) = (Y T (t)diag(exp(x T i (t)β))y (t)) 1 Y T (t). prop(bili) The residual is asymptotically equivalent to [ U T (t)g(β, s)dm(s) U T (s)g(s)diag { Y (β, s)y (β, s)dn(s) } ] Z(s) ( ˆβ β) 15 5 5 1 15 5 5 1 and the last expression is [ U T (s)g(s)diag { Y (β, s)y (β, s)dn(s) } Z(s) Denote the second integral in the latter display by B U(β, t). ] I 1 (β, τ)u(β, τ) + op(n 1/2 ) 2.2 2.4 2.6 2.8.8 1. 1.2 1.4 processes: estimated variance, IID decomposition The variance of MU(t) can be estimated by the optional variation process [MU] (t) = U T (s)g( ˆβ, s)diag(dn(s))u(s)g( ˆβ, s) + BU( ˆβ, t) [U] (τ)b T U ( ˆβ, t) BU( ˆβ, t) [MU, U] (t) [U, MU] (t)b T U ( ˆβ, t). MU(t) is asymptotically equivalent to { 1 Ui (t) U T (s)y (β, s) Y T (β, s)w (s)y (β, s)} Yi (s)dmi (s) [ U T (s)g(s)diag { Y (β, s)y (β, s)dn(s) } ] I 1 (β, τ)z(s) W1,i + op(n 1/ = Wi (t) + op(n 1/2 ), where Wi (t) are i.i.d. an W1,i is an i.i.d. decomposition of ˆβ β. Resample construction is given as Ŵi (t)gi, where G1,..., Gn are standard normals, and Ŵi (t) is obtained by using ˆMi instead of Mi. This resampled process has the same asymptotic distribution of MU(t).

Extensions of Cox's model First, check if an extended Cox's model with strata or/and time-dependent covariates ts well the data! (and responds to the clinical questions of interest) Cox's model can be extended as λ(t) exp(z T β(t)). It can be tted by Sieves (Murphy & Sen, 1991) or local likelihood methods (Pons, 2; Hastie & Tibshirani, 1993; Cai & Sun, 23; Tian et al., 25) or by smoothing splines (Zucker & Karr, 199). There exists dierent methods to estimate β(t) directly, or to estimate cumulative coecients B(t) = β(s)ds (Martinussen et al., 22). Spline-penalized likelihood approach : β(s) splines on [, τ] with baseline λ τ PL(t, β()) + b β (s) 2 ds, parametric problem. Extension of Cox's model Local Likelihood : β(s) C 2 [, τ] with baseline λ PL(t, β()) = τ (X T i (s)β(s) log( Y k(s)exp(x T k (s)β(s))))dni (s) i k Now, a local linear approximation of β(s) yields (for xed t) β(s) = β(t) + θ(t)(s t) such that a local PL is given by ( ( )) PL l (t, ν) = τ 1 K((s t)/b) b i X T i (s)νt log T Y k(s) exp( X k k with ν = (β(t), β (t)), covariates Xi (s) = (Xi, (s t) Xi ), and K( ) a Kernel function. PL l(t, ν) is concave! Newton-Raphson yields consistent estimators such that n 2/5 ( ˆβ(t) β(t)) N(Bt, Vt). Cumulative estimate B(t) = ˆβ(s)ds based on this method have the same asymptotic properties proved by other approaches (Martinussen et al., 22). dn i (s Resampling exersise GOF test Exercise Consider the PBC data and t a Cox model. Run the code from the slides and repeat the arguments from the lectures. Obtain the observed score process (based on martingale residuals) for checking proportional hazards. By considering the iid decomposition, construct 1 resampled score processes under the model, by using the two dierent approaches by Lin et al. (1993) and Lin et al. (2). Construct the supremum test statistics (weighted and unweighted) and compute p-values from the resampled processes. Compare with results from the program output: > fit.pbc<-cox.aalen(surv(time/365, status==2) ~ +prop(edema)+ + prop(logbili)+ +, weighted.test=, + resample.iid=1, residuals=1, pbc) > summary(fit.pbc) > plot(fit.pbc, score=t,xlab="",ylab="") > fit.pbc.w <-cox.aalen(surv(time/365, status==2) ~ +prop(edema)+ + prop(logbili)+ +, weighted.test=1, + resample.iid=1, residuals=1, pbc) > summary(fit.pbc.w) > par(mfrow=c(2,3)) > plot(fit.pbc.w, score=t,xlab="",ylab="") Fit a Cox model to the PBC data with all continuous covariates in linear form. data(pbc); attach(pbc) time <- time/365; fit<-coxph(surv(time,status==2)~age + as.factor(edema) + bili + protime + alb, pbc) Check proportional hazards assumption by using Schoenfeld residuals and cumulative residuals, compare results. # : time <- time+.1*runif(418) fit<-cox.aalen(surv(time,status==2)~+prop(edema)+ prop(bili)+prop(protime)+prop(alb),pbc,residuals=1) par(mfrow=c(2,3)) plot(fit.pbc, score=t, xlab="",ylab="") plot(fit)

GOF test Exercise GOF test Exercise Check functional form of continuous covariates with cumulative martingale residuals fit.mg <- cum.residuals(fit,pbc,n.sim=2,cum.resid=1) summary(fit.mg) par(mfrow=c(2,2)) plot(fit.mg,score=2) X <- model.matrix( ~ -1+factor(cut(protime,quantile(protime), include.lowest=t)),pbc) colnames(x) <- c("1. quartile","2. quartile","3. quartile","4. quartile") resids <- cum.residuals(fit,pbc,x,n.sim=1,cum.resid=1) summary(resids) plot(resids, score=1) plot(resids, score=2,specific.comps=c(1,3:5)) Compare condence intervals and condence bands of the observed processes of cumulative residuals grouped after the covariate levels Consider the TRACE data on myocardial infarction (data(trace)) Make a goodness-of-t test in a Cox model with covariates "chf", "age", "sex", "diabetes", "vf". Compare with the conclusions from the extended Cox model with time-varying coecients: fit<-timecox(surv(time,status>=7)~ chf+age+sex+diabetes+ vf, TRACE) par(mfrow=c(2,3)) plot(fit) How could this model be simplied? plot(resids,pointwise.c,hw.ci=2,sim.ci=3) What is the conclusion? How should the model be adapted? Does it provide a satisfactory goodness-of-t to the data? References References Cai, Z. and Sun, Y. (23). Local linear estimation for time-dependent coecients in Cox s regression models. Scand. J. Statist. 3, 93112. Grambsch, P.M. and Therneau, T.M. (1994). Proportional hazards tests and diagnostics based on weighted residuals (corr: 95v82 p668). Biometrika 81, 515526. Hastie, T. and Tibshirani, R. (1993). Varying-coecient models. J. Roy. Stat. Soc. Ser. B 55, 757796. Lin, D.Y., Wei, L., Yang, I., and Ying, Z. (2). Semiparametric regression for the mean and rate functions of recurrent events. J. Roy. Stat. Soc. Ser. B 62, 71173. Lin, D.Y., Wei, L.J., and Ying, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 8, 557572. Martinussen, T., Scheike, T.H. and Skovgaard, I.M. (22). Ecient estimation of xed and time-varying covariate eects in multiplicative intensity models. Scand. J. Statistics 29, 5774. Murphy, S. A. and Sen, P. K. (1991). Cox regression model with time dependent covariates. Stochastic Processes and their Applications 39, 15318. Pons, O. (2). Nonparametric estimation in a varying-coecient Cox model. Math. Meth. Statist. 9, 376398. Schoenfeld, D. (1982). Partial residuals for the proportional hazards regression model. Biometrika 69, 239241. Tian, L., Zucker, D., and Wei, L. (25). On the Cox model with time-varying regression coecients. J. Amer. Statist. Assoc. 1, 172183. Zucker, D.M. and Karr, A.F. (199). Nonparametric survival analysis with time-dependent covariate eects: A penalized partial likelihood approach. Ann. Statist. 18, 329353.