Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Steffen Unkel
Department of Medical Statistics
University Medical Center Göttingen, Germany
Winter term 2018/19

Residuals for the Cox regression model

Suppose that the survival times of $n$ individuals are available, where $r$ of these are death times and the remaining $n - r$ are right-censored. We further suppose that a Cox regression model has been fitted to the survival times. The fitted hazard function for the $i$th individual is therefore

$$\hat{h}_i(t) = \hat{h}_0(t) \exp(x_i^\top \hat{\beta}), \quad i = 1, \ldots, n,$$

where $x_i^\top \hat{\beta} = \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \cdots + \hat{\beta}_p x_{pi}$ is the value of the fitted linear predictor of the model for that individual and $\hat{h}_0(t)$ is the estimated baseline hazard function.

Example: Infection in patients on dialysis

Patient  Time  Status  Age  Sex
      1     8       1   28    1
      2    15       1   44    2
      3    22       1   32    1
      4    24       1   16    2
      5    30       1   10    1
      6    54       0   42    2
      7   119       1   22    2
      8   141       1   34    2
      9   185       1   60    2
     10   292       1   43    2
     11   402       1   30    2
     12   447       1   31    2
     13   536       1   17    2

Figure: Times to removal of a catheter following an infection for a group of kidney patients (Collett 2015, p. 140).

Cox-Snell residuals

The Cox-Snell residual for the $i$th individual is given by

$$r_{Ci} = \hat{H}_0(t_i) \exp(x_i^\top \hat{\beta}), \quad i = 1, \ldots, n,$$

where $\hat{H}_0(t)$ is the Breslow estimate of the baseline cumulative hazard function.

If the correct model has been fitted, the $n$ Cox-Snell residuals will behave as $n$ (censored) observations from a unit exponential distribution.

A plot of the Nelson-Aalen estimate $\tilde{H}(r_{Ci})$ of the cumulative hazard, computed from the pairs $(r_{Ci}, \delta_i)$, $i = 1, \ldots, n$, against $r_{Ci}$ should then be a straight line through the origin with unit slope. Here $\delta_i$ is the event indicator, equal to 1 for a death and 0 for a censored time.

The Cox-Snell residuals can be used to assess the overall model adequacy.
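The cumulative hazard check described above can be sketched numerically. The following is a minimal illustration, not the code behind the slides: it takes the Cox-Snell residuals and status values for the dialysis example from the tables in these slides, assumes there are no tied residuals, and uses a helper name (`nelson_aalen`) that is purely illustrative.

```python
import numpy as np

# Cox-Snell residuals r_Ci and event indicators delta_i for the 13 dialysis
# patients, copied from the tables in these slides.
r_c = np.array([0.280, 0.072, 1.214, 0.084, 1.506, 0.265, 0.235,
                0.484, 1.438, 1.212, 1.187, 1.828, 2.195])
delta = np.array([1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1])


def nelson_aalen(values, events):
    """Nelson-Aalen cumulative hazard at each event value (assumes no ties)."""
    order = np.argsort(values)
    values, events = values[order], events[order]
    at_risk = len(values) - np.arange(len(values))  # n, n-1, ..., 1
    increments = events / at_risk                   # zero at censored values
    cum_hazard = np.cumsum(increments)
    keep = events == 1
    return values[keep], cum_hazard[keep]


res, cum_haz = nelson_aalen(r_c, delta)
# Under a well-fitting model the points (res, cum_haz) should scatter
# around the line through the origin with unit slope.
for r, h in zip(res, cum_haz):
    print(f"{r:6.3f}  {h:6.3f}")
```

With only 13 observations the upper tail of such a plot is very noisy, so departures at the largest residuals should be read with caution.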

Cox-Snell residuals: Example

Figure: Cumulative hazard plot of the Cox-Snell residuals obtained from fitting the kidney catheter data (Collett 2015, p. 144). [Plot not reproduced; horizontal axis: Cox-Snell residual.]

Martingale residuals

When the data are right-censored and all the covariates are fixed at the start of the study, the martingale residuals are obtained as

$$r_{Mi} = \delta_i - r_{Ci}, \quad i = 1, \ldots, n.$$

Martingale residuals take values between $-\infty$ and unity.

Properties: $E(r_{Mi}) = 0$; $\sum_{i=1}^{n} r_{Mi} = 0$; $\mathrm{Cov}(r_{Mi}, r_{Mj}) = 0$ for $i \neq j$.

$r_{Mi}$ is the difference between the observed number of deaths for the $i$th individual in $[0, t_i]$ and the corresponding estimated expected number on the basis of the fitted model.
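As a quick numerical check of these properties, the martingale residuals for the dialysis example can be recomputed from the Cox-Snell residuals and status values tabulated in these slides. This is only a sketch using the rounded values from the tables.

```python
import numpy as np

# Cox-Snell residuals and event indicators for the 13 dialysis patients,
# copied from the tables in these slides (rounded to three decimals).
r_c = np.array([0.280, 0.072, 1.214, 0.084, 1.506, 0.265, 0.235,
                0.484, 1.438, 1.212, 1.187, 1.828, 2.195])
delta = np.array([1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1])

# Martingale residual: observed minus expected number of deaths.
r_m = delta - r_c

print(np.round(r_m, 3))           # one residual per patient
print(round(float(r_m.sum()), 3))  # should be (close to) zero
print(bool(r_m.max() < 1))         # residuals are bounded above by unity
```

The censored patient (number 6) necessarily has a negative residual, because $\delta_i = 0$ and $r_{Ci} > 0$.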

Martingale residuals (2)

Martingale residuals can be used to determine the functional form of a covariate. First, obtain martingale residuals from fitting a null model. These residuals are then plotted against the values of each covariate in the model. The functional form required for the covariate can be determined by superimposing a smoothed curve that is fitted to the scatterplot, e.g. using the method LOWESS (locally weighted scatterplot smoothing).

Martingale residuals: Example

Figure: Plot of the martingale residuals for the null model against age, with a smoothed curve superimposed, obtained from fitting the kidney catheter data (Collett 2015, p. 148). [Plot not reproduced; horizontal axis: Age.]

Deviance residuals

The deviance residuals are less skewed than $r_{Mi}$ and are defined as

$$r_{Di} = \operatorname{sgn}(r_{Mi}) \left[ -2 \{ r_{Mi} + \delta_i \ln(\delta_i - r_{Mi}) \} \right]^{1/2}, \quad i = 1, \ldots, n,$$

where $\operatorname{sgn}(\cdot)$ is the sign function.

The original motivation for these residuals is that they are components of the deviance

$$D = -2 \{ \ln \hat{L}_c - \ln \hat{L}_f \},$$

where $\hat{L}_c$ ($\hat{L}_f$) is the maximised partial likelihood under the current model (saturated or full model). The deviance residuals are then such that $D = \sum_i r_{Di}^2$.
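The definition of the deviance residual can be checked against the dialysis example. The sketch below recomputes $r_{Di}$ from the tabulated Cox-Snell residuals and status values and compares the result with the tabulated deviance residuals. Since the inputs are rounded to three decimals, agreement is only expected to about two decimals. The function name is illustrative.

```python
import numpy as np

# Values copied from the tables in these slides (rounded to three decimals).
r_c = np.array([0.280, 0.072, 1.214, 0.084, 1.506, 0.265, 0.235,
                0.484, 1.438, 1.212, 1.187, 1.828, 2.195])
delta = np.array([1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1])
r_d_table = np.array([1.052, 1.843, -0.200, 1.765, -0.439, -0.728, 1.168,
                      0.648, -0.387, -0.199, -0.176, -0.670, -0.904])


def deviance_residuals(r_m, delta):
    """sgn(r_M) * sqrt(-2 * (r_M + delta * ln(delta - r_M)))."""
    # For censored cases (delta = 0) the log term is multiplied by zero,
    # so its argument is replaced by 1 to avoid evaluating log at r_C.
    log_arg = np.where(delta == 1, delta - r_m, 1.0)
    log_term = delta * np.log(log_arg)
    return np.sign(r_m) * np.sqrt(-2.0 * (r_m + log_term))


r_m = delta - r_c
r_d = deviance_residuals(r_m, delta)
print(np.round(r_d, 3))
print(bool(np.allclose(r_d, r_d_table, atol=0.01)))
```

For a death, $\delta_i - r_{Mi}$ equals the Cox-Snell residual $r_{Ci}$, so the logarithm is always defined.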

Deviance residuals (2)

The quantity $\hat{\beta}^\top x_i$ is called the risk score. The risk score provides information about whether an individual might be expected to survive for a short or long time.

A plot of the deviance residuals against the risk score is a helpful diagnostic to identify individuals whose survival times are out of line.

Deviance residuals can be used to identify observations that are not well fitted by the model.

Deviance residuals: Example

Figure: Plot of the deviance residuals against the values of the risk score for the kidney catheter data (Collett 2015, p. 146). [Plot not reproduced; horizontal axis: Risk score.]

Schoenfeld residuals

For each individual there is a set of Schoenfeld residuals, one for each covariate included in the fitted Cox model. The $i$th Schoenfeld residual for the $j$th explanatory variable in the model is given by

$$r_{Sji} = \delta_i (x_{ji} - \hat{a}_{ji}),$$

where $x_{ji}$ is the value of the $j$th explanatory variable ($j = 1, \ldots, p$) for the $i$th individual in the study,

$$\hat{a}_{ji} = \frac{\sum_{l \in R(t_{(i)})} x_{jl} \exp(\hat{\beta}^\top x_l)}{\sum_{l \in R(t_{(i)})} \exp(\hat{\beta}^\top x_l)},$$

and $R(t_{(i)})$ is the risk set of all individuals at risk at time $t_{(i)}$.
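The definition translates fairly directly into code. The following sketch computes Schoenfeld residuals for the dialysis data, assuming no tied event times. The coefficient vector `beta_hypothetical` is a placeholder chosen for illustration only; it is not an estimate quoted in these slides, and the function name is likewise illustrative.

```python
import numpy as np

# Dialysis data as tabulated in these slides.
time = np.array([8, 15, 22, 24, 30, 54, 119, 141, 185, 292, 402, 447, 536],
                dtype=float)
status = np.array([1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1])
age = np.array([28, 44, 32, 16, 10, 42, 22, 34, 60, 43, 30, 31, 17],
               dtype=float)
sex = np.array([1, 2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2], dtype=float)
X = np.column_stack([age, sex])

# Placeholder coefficients (assumption, NOT taken from the slides).
beta_hypothetical = np.array([0.03, -2.7])


def schoenfeld_residuals(time, status, X, beta):
    """Return an n x p array of Schoenfeld residuals (zero rows if censored)."""
    weights = np.exp(X @ beta)
    residuals = np.zeros_like(X)
    for i in range(len(time)):
        if status[i] == 1:
            in_risk_set = time >= time[i]
            w = weights[in_risk_set]
            # Weighted average of each covariate over the risk set.
            a_hat = (w[:, None] * X[in_risk_set]).sum(axis=0) / w.sum()
            residuals[i] = X[i] - a_hat
    return residuals


r_s = schoenfeld_residuals(time, status, X, beta_hypothetical)
print(np.round(r_s, 3))
```

Whatever coefficients are used, the sex residuals for patients 7 to 13 are zero, because everyone still at risk from time 119 onwards has sex = 2. This agrees with the zeros in the residual table for the example.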

Scaled Schoenfeld residuals

It turns out that a scaled or weighted version of the Schoenfeld residuals is more effective in detecting departures from the assumed model.

Let the vector of Schoenfeld residuals for the $i$th individual be denoted $r_{Si} = (r_{S1i}, \ldots, r_{Spi})^\top$.

The scaled Schoenfeld residuals, $r^*_{Sji}$, are then components of the vector

$$r^*_{Si} = d \, \widehat{\mathrm{Cov}}(\hat{\beta}) \, r_{Si},$$

where $d$ is the number of deaths among the $n$ individuals.

Score residuals

Score residuals are modifications of the Schoenfeld residuals. The $i$th score residual for the $j$th explanatory variable in the model is given by

$$r_{Uji} = r_{Sji} + \exp(\hat{\beta}^\top x_i) \sum_{t_{(r)} \le t_{(i)}} \frac{(\hat{a}_{jr} - x_{ji}) \delta_r}{\sum_{l \in R(t_{(r)})} \exp(\hat{\beta}^\top x_l)}.$$

As for the Schoenfeld residuals, the score residuals sum to zero.

The score residuals will not necessarily be zero when an observation is censored.

Example: Infection in patients on dialysis

 i   r_Ci    r_Mi    r_Di    r_S1i    r_S2i   r*_S1i  r*_S2i    r_U1i    r_U2i
 1   0.280   0.720   1.052   -1.085  -0.242    0.033  -3.295   -0.781   -0.174
 2   0.072   0.928   1.843   14.493   0.664    0.005   7.069   13.432    0.614
 3   1.214  -0.214  -0.200    3.129  -0.306    0.079  -4.958   -0.322    0.058
 4   0.084   0.916   1.765  -10.222   0.434   -0.159   8.023   -9.214    0.384
 5   1.506  -0.506  -0.439  -16.588  -0.550   -0.042  -5.064    9.833    0.130
 6   0.265  -0.265  -0.728        .       .        .       .   -3.826   -0.145
 7   0.235   0.765   1.168  -17.829   0.000   -0.147   3.083  -15.401   -0.079
 8   0.484   0.516   0.648   -7.620   0.000   -0.063   1.318   -7.091   -0.114
 9   1.438  -0.438  -0.387   17.091   0.000    0.141  -2.955  -15.811   -0.251
10   1.212  -0.212  -0.199   10.239   0.000    0.085  -1.770    1.564   -0.150
11   1.187  -0.187  -0.176    2.857   0.000    0.024  -0.494    6.575   -0.101
12   1.828  -0.828  -0.670    5.534   0.000    0.046  -0.957    4.797   -0.104
13   2.195  -1.195  -0.904    0.000   0.000    0.000   0.000   16.246   -0.068

Table: Different types of residual after fitting a Cox model (Collett 2015, p. 141).

Use of Schoenfeld residuals

The Schoenfeld residuals are particularly useful in evaluating the assumption of proportional hazards after fitting a Cox regression model. It can be shown that

$$E(r^*_{Sji}) \approx \beta_j(t_i) - \hat{\beta}_j,$$

where $\beta_j(t_i)$ is the value of a time-varying coefficient of $x_j$ at the survival time of the $i$th individual, $t_i$, and $\hat{\beta}_j$ is the estimated value of $\beta_j$ in the fitted Cox model.

A plot of the values of $r^*_{Sji} + \hat{\beta}_j$, or just $r^*_{Sji}$, against the observed survival times should give information about the form of the time-dependent coefficient $\beta_j(t)$.

Schoenfeld residuals: Example

Figure: Plot of scaled Schoenfeld residuals for Age and Sex (Collett 2015, p. 164).

A test for proportional hazards for a particular covariate

A test of the proportional hazards assumption can be based on testing whether there is a linear relationship between $E(r^*_{Sji})$ and some function of time.

For a particular covariate $x_j$, linear dependence of the coefficient of $x_j$ on time can be expressed by taking

$$\beta_j(t_i) = \beta_j + \nu_j (t_i - \bar{t}),$$

where $\nu_j$ is an unknown regression coefficient. This leads to a linear regression model with

$$E(r^*_{Sji}) = \nu_j (t_i - \bar{t}).$$

A test of whether the slope $\nu_j$ is zero leads to a test of whether the coefficient of $x_j$ is time-dependent and hence of proportional hazards with respect to $x_j$.

A test for proportional hazards for a particular covariate (2)

Let $\tau_1, \ldots, \tau_d$ be the $d$ observed death times across all $n$ individuals. An appropriate test statistic is

$$\frac{\left\{ \sum_{i=1}^{d} (\tau_i - \bar{\tau}) \, r^*_{Sji} \right\}^2}{d \, \mathrm{Var}(\hat{\beta}_j) \sum_{i=1}^{d} (\tau_i - \bar{\tau})^2},$$

where $\bar{\tau} = \frac{1}{d} \sum_{i=1}^{d} \tau_i$.

Under the null hypothesis that the slope is zero, this statistic has a $\chi^2$ distribution on 1 d.f.
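To see the statistic at work, the sketch below evaluates it for age and sex in the dialysis example. It uses the 12 event times, the scaled Schoenfeld residuals from the residual table and the coefficient variances quoted in the example in these slides. Because the tabulated residuals are rounded, the results agree with the quoted values only to about two decimals. The function name `ph_test` is illustrative.

```python
import math
import numpy as np

# The 12 observed event times (patient 6 is censored and excluded).
tau = np.array([8, 15, 22, 24, 30, 119, 141, 185, 292, 402, 447, 536],
               dtype=float)
# Scaled Schoenfeld residuals for age and sex, from the residual table.
rs_age = np.array([0.033, 0.005, 0.079, -0.159, -0.042, -0.147,
                   -0.063, 0.141, 0.085, 0.024, 0.046, 0.000])
rs_sex = np.array([-3.295, 7.069, -4.958, 8.023, -5.064, 3.083,
                   1.318, -2.955, -1.770, -0.494, -0.957, 0.000])
# Estimated variances of the coefficients, as quoted in the example.
var_age, var_sex = 0.000688, 1.20099


def ph_test(tau, scaled_resid, var_beta):
    """Test statistic and chi-squared(1) p-value for a zero slope."""
    d = len(tau)
    centred = tau - tau.mean()
    stat = (centred @ scaled_resid) ** 2 / (d * var_beta * (centred @ centred))
    # Upper tail of a chi-squared distribution on 1 d.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


sum_sq = float(((tau - tau.mean()) ** 2).sum())
stat_age, p_age = ph_test(tau, rs_age, var_age)
stat_sex, p_sex = ph_test(tau, rs_sex, var_sex)
print(f"sum of squares of centred times: {sum_sq:.2f}")
print(f"age: statistic {stat_age:.3f}, p-value {p_age:.3f}")
print(f"sex: statistic {stat_sex:.3f}, p-value {p_sex:.3f}")
```

The p-value uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which avoids any dependency beyond the standard library and NumPy.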

A global test for proportional hazards

An overall or global test of the proportional hazards assumption across all the $p$ explanatory variables included in a Cox model is based on the following test statistic:

$$\frac{(\tau - \bar{\tau})^\top S \, \widehat{\mathrm{Cov}}(\hat{\beta}) \, S^\top (\tau - \bar{\tau})}{\sum_{i=1}^{d} (\tau_i - \bar{\tau})^2 / d},$$

where $\tau = (\tau_1, \ldots, \tau_d)^\top$ and $S$ is the $d \times p$ matrix whose $j$th column contains the (unscaled) Schoenfeld residuals for the $j$th explanatory variable.

This test statistic has a $\chi^2$ distribution on $p$ d.f.

This test is sometimes referred to as the zph test.

Example: Infection in patients on dialysis

The estimated variances of the estimated coefficients of the variables age and sex are 0.000688 and 1.20099, respectively, and the sum of squares of the 12 mean-centred event times is 393418.92.

The values of the test statistic are 0.811 (p-value = 0.368) and 0.224 (p-value = 0.636) for age and sex, respectively.

The numerator of the global test statistic is 26578.805, from which the zph test statistic is 0.811. This has a $\chi^2$ distribution on 2 d.f., leading to a p-value of 0.667.
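The arithmetic behind the global statistic can be retraced from the quantities quoted in this example. The sketch below is a check of the reported numbers only; it does not recompute the numerator, which would need the full covariance matrix of the coefficients (not given in these slides).

```python
import math

# Quantities quoted in the example.
numerator = 26578.805   # (tau - tau_bar)' S Cov(beta_hat) S' (tau - tau_bar)
sum_sq = 393418.92      # sum of squares of the mean-centred event times
d = 12                  # number of deaths

stat = numerator / (sum_sq / d)
# Upper tail of a chi-squared distribution on 2 d.f. is exp(-x / 2).
p_value = math.exp(-stat / 2.0)

print(f"global statistic: {stat:.3f}")
print(f"p-value on 2 d.f.: {p_value:.3f}")
```

The closed form $P(\chi^2_2 > x) = e^{-x/2}$ holds only for 2 d.f.; with a different number of covariates a general chi-squared tail function would be needed.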

Adding a time-dependent variable

To examine the assumption of proportional hazards, a time-dependent variable can be added to the model.

Let $x_1$ be a fixed covariate. Define $x_2(t) = x_1 g(t)$, where $g(t)$ is a known function of $t$, e.g. $\ln(t)$.

Consider a survival study in which each patient has been allocated to one of two groups, corresponding to a standard treatment and a new treatment.

Adding a time-dependent variable (2)

The hazard function for the $i$th individual in the study is then

$$h_i(t) = h_0(t) \exp(\beta_1 x_{1i}),$$

where $x_{1i}$ is the value of an indicator variable $x_1$ that is zero for the standard treatment and unity for the new treatment.

Let $x_{2i} = x_{1i} t$ be the value of $x_2 = x_1 t$ for the $i$th individual. The hazard of death at time $t$ for the $i$th individual becomes

$$h_i(t) = h_0(t) \exp(\beta_1 x_{1i} + \beta_2 x_{2i}).$$

The relative hazard at time $t$ is now $\exp(\beta_1 + \beta_2 t)$.

A test of the hypothesis that $\beta_2 = 0$ is a test of the assumption of proportional hazards.
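A small numerical illustration may help to see how $\beta_2$ controls the shape of the relative hazard. The coefficient values and the time grid below are hypothetical, chosen only to show a decreasing, a constant and an increasing relative hazard; they are not estimates from any data in these slides.

```python
import numpy as np


def relative_hazard(t, beta1, beta2):
    """Relative hazard exp(beta1 + beta2 * t), new versus standard treatment."""
    return np.exp(beta1 + beta2 * np.asarray(t, dtype=float))


t = np.linspace(0.0, 10.0, 6)   # hypothetical time grid
beta1 = -0.5                    # hypothetical treatment effect at t = 0

for beta2 in (-0.1, 0.0, 0.1):  # hypothetical slopes
    rh = relative_hazard(t, beta1, beta2)
    print(f"beta2 = {beta2:+.1f}:", np.round(rh, 3))
```

Only for $\beta_2 = 0$ is the relative hazard constant in $t$, which is exactly the proportional hazards case.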

Plot of the relative hazard

Figure: Plot of the relative hazard, $\exp(\beta_1 + \beta_2 t)$, against $t$, for different values of $\beta_2$ (Collett 2015, p. 167). [Plot not reproduced; horizontal axis: Time.]

Example: Infection in patients on dialysis

Fitting a Cox regression model containing just age and sex leads to a value of $-2 \ln \hat{L}$ of 34.468.

Define terms that are the products of these variables with time, namely age * t and sex * t.

When the variable age * t is added to the model that has age and sex, the value of $-2 \ln \hat{L}$ reduces to 32.006, a reduction of 2.462 on 1 d.f., but this reduction is not significant at the 5% level (p-value = 0.117).

The reduction in $-2 \ln \hat{L}$ is only 0.364 (p-value = 0.546) when the variable sex * t is added to the model that has age and sex.
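The two p-values follow from referring each reduction in $-2 \ln \hat{L}$ to a $\chi^2$ distribution on 1 d.f., since one term is added in each case. A minimal check of the quoted numbers, with an illustrative function name, might look like this:

```python
import math


def lr_p_value_1df(reduction):
    """Upper tail of a chi-squared distribution on 1 d.f. at `reduction`."""
    return math.erfc(math.sqrt(reduction / 2.0))


# Values of -2 ln L quoted in the example.
base = 34.468        # model with age and sex
with_age_t = 32.006  # model with age, sex and age * t

reduction_age_t = base - with_age_t
reduction_sex_t = 0.364  # reduction quoted for adding sex * t

p_age_t = lr_p_value_1df(reduction_age_t)
p_sex_t = lr_p_value_1df(reduction_sex_t)
print(f"age * t: reduction {reduction_age_t:.3f}, p-value {p_age_t:.3f}")
print(f"sex * t: reduction {reduction_sex_t:.3f}, p-value {p_sex_t:.3f}")
```

Neither reduction is significant at the 5% level, so this check gives no evidence against proportional hazards for age or sex.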