REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

Department of Statistics, North Carolina State University
Presented by: Butch Tsiatis, Department of Statistics, NCSU

Outline

- Why standard regression models aren't used with censored survival data
- Modeling the hazard rate as a function of covariates
- The Proportional Hazards Model
- Interpreting the regression coefficients

Regression Models with Survival Data

- A major focus in any study is to characterize the relationship between a response $Y$ and covariates (prognostic factors) $X_1, \ldots, X_k$
- For a continuous response $Y$ the most popular model is the multiple linear regression model

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + e$$

- The regression coefficients $\beta_0, \beta_1, \ldots, \beta_k$ in such a model have a nice interpretation: they describe the direction and strength of the relationship between each prognostic factor and $Y$.
- Reminder: If the $j$-th variable $X_j$ is increased by one unit and all other variables are kept the same, the mean response increases by $\beta_j$ units.

Multiple Linear Regression

- Such models accommodate
  - continuous variables
  - discrete variables (dummy variables)
  - interactions
  - polynomial regression
- Easy to estimate the parameters using least squares
- Properties have been studied extensively

Multiple Linear Regression

- In chronic disease clinical trials we argued that the primary endpoint (response variable) is survival time $T$
- Since $T$ is a continuous positive random variable, it would seem natural to model the relationship between survival time (or possibly some simple transformation of survival time) and covariates using multiple linear regression:

$$\log T = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + e$$

- This may be a reasonable strategy if the survival time $T$ were observed (uncensored) for everyone in the study

Difficulties with Linear Regression Models

- Survival times are most often right censored
  - This creates difficulties in estimating the parameters $\beta_0, \beta_1, \ldots, \beta_k$
  - Least squares no longer provides good estimates
- Linear regression does not accommodate time-dependent covariates (i.e., covariates that change over time). For example:
  - cumulative exposure to a risk factor
  - blood pressure
  - heart transplant status
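To see why least squares breaks down under right censoring, here is a minimal simulated sketch in R. Everything in it is hypothetical (the sample size, the true coefficients, the fixed censoring time). It compares a fit to the true log survival times with a naive fit that treats censored times as if they were observed deaths.

```r
# Hypothetical simulation: log T = 1 + 1 * x + e, with a binary covariate x
set.seed(520)
n <- 2000
x <- rbinom(n, 1, 0.5)
log_t <- 1 + 1 * x + rnorm(n, sd = 0.5)   # true (uncensored) log survival times

# Right censoring at a fixed (assumed) log time of 1.5
log_c <- 1.5
log_u <- pmin(log_t, log_c)               # observed log time on study
delta <- as.numeric(log_t <= log_c)       # 1 = death observed, 0 = censored

# Least squares on the true times versus on the observed (censored) times
full_fit  <- lm(log_t ~ x)
naive_fit <- lm(log_u ~ x)

slope_full  <- unname(coef(full_fit)["x"])
slope_naive <- unname(coef(naive_fit)["x"])
print(c(full = slope_full, naive = slope_naive, prop_censored = mean(delta == 0)))
```

In this setup the naive slope is pulled toward zero, because the longest survival times, which are concentrated in the x = 1 group, are cut off at the censoring time.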

Modeling the Hazard Function

- Data from a clinical trial with a survival endpoint can be summarized as $(U_i, \Delta_i, X_{1i}, \ldots, X_{ki})$, $i = 1, \ldots, n$, where for patient $i$ among a sample of $n$ patients
  - $U_i$ denotes time on study
  - $\Delta_i$ denotes the failure indicator (1 = death, 0 = censored)
  - $X_{1i}, \ldots, X_{ki}$ denote the values of the $k$ covariates
- With such data it turns out to be more convenient to model the hazard of dying rather than the survival time itself

The hazard rate

- The hazard rate or hazard function is defined as

$$\lambda(t) = \lim_{h \to 0} \left\{ \frac{P(t \le T < t+h \mid T \ge t)}{h} \right\}$$

- Models for the hazard rate are given by considering $\lambda(t \mid X_1, \ldots, X_k)$, where $\lambda(t \mid X_1 = x_1, \ldots, X_k = x_k)$ means the hazard rate (mortality rate) at time $t$ for individuals in the population whose $X_1$ value equals $x_1$, ..., and whose $X_k$ value equals $x_k$.
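As a numerical check of this definition, here is a small R sketch using an exponential survival time, chosen only because its hazard is known to be constant and equal to its rate. The rate 0.1 and the time point 5 are illustrative assumptions, not values from the slides.

```r
rate <- 0.1   # assumed exponential rate, so the true hazard is 0.1 at every t
t0   <- 5     # assumed time point

# P(t <= T < t + h | T >= t) / h for a given h
hazard_approx <- function(t, h) {
  p_interval <- pexp(t + h, rate) - pexp(t, rate)   # P(t <= T < t + h)
  p_at_risk  <- 1 - pexp(t, rate)                   # P(T >= t)
  (p_interval / p_at_risk) / h
}

h_values <- c(1, 0.1, 0.01, 0.0001)
approx <- sapply(h_values, function(h) hazard_approx(t0, h))
print(data.frame(h = h_values, approx = approx))
```

As h shrinks, the ratio approaches the rate 0.1, which is the hazard of this exponential distribution.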

Hazard rate

- So, for example, suppose survival time is measured as length of life after treatment for leukemia, $X_1$ denotes age at time of treatment, and $X_2$ denotes gender (0 = male, 1 = female). Then, roughly speaking, $\lambda(5 \mid X_1 = 55, X_2 = 1) = .10$ means that the hazard of failing at five years is .10; i.e., given that a woman who was 55 years of age when starting treatment is still alive 5 years after treatment, the probability of dying in the next year is approximately .10.
- Notice that the time $t$ at which the hazard is measured, as well as the covariates $X_1, \ldots, X_k$, are important in this relationship
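The "roughly speaking" can be made concrete with one extra assumption that is not in the slides: suppose the hazard stays constant at .10 per year throughout the sixth year. Then the conditional probability of dying within that year is

$$P(5 \le T < 6 \mid T \ge 5) = 1 - \exp\left\{-\int_5^6 \lambda(u)\,du\right\} = 1 - e^{-0.10} \approx 0.095,$$

which is close to, but slightly smaller than, the hazard value .10. The agreement improves as the time interval or the hazard becomes smaller.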

Modeling the hazard rate

- Models explore the relationship of the hazard rate to time and covariates
- The most popular model is the proportional hazards regression model introduced by D. R. Cox (1972), often referred to as the Cox regression model.
- In this model it is assumed that

$$\lambda(t \mid X_1, \ldots, X_k) = \lambda_0(t) \exp(\beta_1 X_1 + \cdots + \beta_k X_k)$$

How to interpret the Cox Model

$$\lambda(t \mid X) = \lambda_0(t) \exp(\beta X)$$

- First let us consider only one covariate
- If $X = 0$, then the hazard at time $t$ is $\lambda_0(t)$
- $\lambda_0(t)$ is referred to as the baseline hazard function
- No assumption is made regarding the shape of this function over time $t$
- For this reason the model is called semiparametric

How to interpret the Cox Model

- This model can also be written as

$$\frac{\lambda(t \mid X = x)}{\lambda_0(t)} = \exp(\beta x),$$

  where $\lambda(t \mid X = x)/\lambda_0(t)$ is the ratio of the hazard rate at time $t$ for an individual whose covariate value is $X = x$ to the hazard rate at time $t$ for an individual whose covariate value is $X = 0$
- More generally, for two covariate values $x_1$ and $x_0$,

$$\frac{\lambda(t \mid X = x_1)}{\lambda(t \mid X = x_0)} = \frac{\lambda_0(t)\exp(\beta x_1)}{\lambda_0(t)\exp(\beta x_0)} = \exp\{\beta(x_1 - x_0)\}$$

- This ratio of hazard rates is sometimes referred to as the relative risk
- The Cox model implicitly assumes that the relative risk is constant over time; this is the so-called proportional hazards assumption
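The cancellation of the baseline hazard can be checked numerically. The R sketch below uses a made-up baseline hazard `lambda0` and made-up values of β and the covariate, all purely illustrative; the point is only that the hazard ratio comes out the same at every time t.

```r
lambda0 <- function(t) 0.02 * t   # hypothetical baseline hazard, increasing in t
beta <- 0.5                       # hypothetical regression coefficient
x1 <- 3                           # hypothetical covariate values
x0 <- 1

# Hazard under the proportional hazards model
hazard <- function(t, x) lambda0(t) * exp(beta * x)

times <- c(0.5, 1, 2, 5, 10)
ratio <- hazard(times, x1) / hazard(times, x0)
print(cbind(time = times, hazard_ratio = ratio))
```

Every row shows the same ratio, exp{β(x1 − x0)}, even though the hazards themselves change with t. Swapping in a different `lambda0` leaves the ratio unchanged.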

How to interpret the Cox Model

Suppose $x_1 > x_0$.

- If $\beta > 0$, then $\exp\{\beta(x_1 - x_0)\} > 1$, implying that the hazard rate is higher for individuals whose $X = x_1$ than for those whose $X = x_0$. Moreover, because of proportional hazards, this higher hazard rate holds at all times. Consequently, the greater the value of $X$, the higher the hazard of dying, resulting in shorter survival times on average.
- If $\beta = 0$, then the hazard rate doesn't change with the value of $X$. This corresponds to the null hypothesis that $X$ has no effect on survival. Everyone in the population, regardless of their value of $X$, has the same hazard rate $\lambda_0(t)$.
- If $\beta < 0$, then the hazard rate decreases with increasing $X$, resulting in longer survival times.

Example

- Let $X$ be a binary indicator. For example, let $X = 1$ denote women with Stage II, node-positive breast cancer receiving high intensity CAF therapy and $X = 0$ those receiving low intensity therapy
- $\beta = -.33$; i.e.,

$$\frac{\lambda(t \mid X = 1)}{\lambda(t \mid X = 0)} = \exp(-.33) = .72$$

- This means that the hazard of dying for women receiving high intensity therapy is .72 times that of women receiving low intensity therapy; i.e., high-dose therapy increased longevity

Example

- Let $X$ denote the number of involved nodes at the time of treatment, with $\beta = .06$

$$\frac{\lambda(t \mid X = x_1)}{\lambda(t \mid X = x_0)} = \exp\{\beta(x_1 - x_0)\}$$

- So, for example, to derive the relative risk of a woman with 11 involved nodes at time of treatment compared to a woman with 2 involved nodes at time of treatment, we take $x_1 = 11$ and $x_0 = 2$ to obtain a relative risk of

$$\exp\{.06(11 - 2)\} = 1.72$$

- The woman with 11 involved nodes has about 1.7 times the risk of death of the woman with 2 involved nodes.
- Because $\beta$ is positive, the risk of death increases with the number of involved nodes.
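The arithmetic in the two examples can be reproduced directly. The short R sketch below uses the coefficient values quoted on the slides, taking β = −.33 for the therapy example (the value for which exp(β) = .72). The helper name `relative_risk` is introduced here for illustration only.

```r
# Relative risk (hazard ratio) between covariate values x1 and x0
relative_risk <- function(beta, x1, x0) exp(beta * (x1 - x0))

rr_therapy <- relative_risk(beta = -0.33, x1 = 1, x0 = 0)   # high vs. low intensity
rr_nodes   <- relative_risk(beta = 0.06, x1 = 11, x0 = 2)   # 11 vs. 2 involved nodes

print(round(c(therapy = rr_therapy, nodes = rr_nodes), 2))
```

Rounded to two decimals these are .72 and 1.72, matching the values on the slides.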

Multiple Covariates

- The model implies that

$$\lambda(t \mid X_1, \ldots, X_k) = \lambda_0(t) \exp(\beta_1 X_1 + \cdots + \beta_k X_k)$$

$$\frac{\lambda(t \mid X_1, \ldots, X_k)}{\lambda_0(t)} = \exp(\beta_1 X_1 + \cdots + \beta_k X_k)$$

- More importantly, the regression coefficient $\beta_j$ associated with the covariate $X_j$ indicates the direction and strength of the relationship that $X_j$ has with the risk of dying, adjusting for the effect of the other covariates.

Multiple Covariates

- For example, suppose we are jointly considering the relationship of $X_1, \ldots, X_k$ to the risk of dying using a proportional hazards model.
- The effect of increasing $X_j$ by one unit on the risk of dying, keeping all other variables the same, is

$$\frac{\lambda(t \mid X_1 = x_1, \ldots, X_{j-1} = x_{j-1}, X_j = x_j + 1, X_{j+1} = x_{j+1}, \ldots, X_k = x_k)}{\lambda(t \mid X_1 = x_1, \ldots, X_{j-1} = x_{j-1}, X_j = x_j, X_{j+1} = x_{j+1}, \ldots, X_k = x_k)}$$

$$= \frac{\lambda_0(t)\exp(\beta_1 x_1 + \cdots + \beta_{j-1} x_{j-1} + \beta_j (x_j + 1) + \beta_{j+1} x_{j+1} + \cdots + \beta_k x_k)}{\lambda_0(t)\exp(\beta_1 x_1 + \cdots + \beta_{j-1} x_{j-1} + \beta_j x_j + \beta_{j+1} x_{j+1} + \cdots + \beta_k x_k)} = \exp(\beta_j)$$
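A quick numerical check of this result in R, using invented coefficients and covariate values (none of them come from the slides). The baseline hazard is left out because it cancels in the ratio.

```r
beta <- c(0.4, -0.2, 0.1)   # hypothetical coefficients beta_1, beta_2, beta_3
j <- 2                      # covariate to increase by one unit

# exp(beta_1 x_1 + ... + beta_k x_k); lambda_0(t) cancels in any hazard ratio
risk_score <- function(x) exp(sum(beta * x))

# Two individuals with different values of the other covariates
x_a <- c(1.0, 5, 0)
x_b <- c(-2.5, 0, 7)

bump <- function(x) { x[j] <- x[j] + 1; x }   # increase X_j by one unit

ratio_a <- risk_score(bump(x_a)) / risk_score(x_a)
ratio_b <- risk_score(bump(x_b)) / risk_score(x_b)
print(c(ratio_a = ratio_a, ratio_b = ratio_b, exp_beta_j = exp(beta[j])))
```

Both ratios equal exp(β_j), whatever the values of the other covariates, which is what "adjusting for the other covariates" means in this model.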

Statistical inference

- As in any statistical problem, we don't get to see the true population relationship between the hazard rate and the covariates
- That is, we never know the true values of the regression coefficients $\beta$
- Instead, these must be estimated from a sample of data $(U_i, \Delta_i, X_{1i}, \ldots, X_{ki})$, $i = 1, \ldots, n$.
- Consequently, we obtain estimators of $\beta_1, \ldots, \beta_k$, which are denoted by $\hat\beta_1, \ldots, \hat\beta_k$.
- Estimators for the $\beta$'s are obtained by maximizing the partial likelihood (an incredibly clever idea that Cox developed)
- This methodology also provides standard errors for the estimators of $\beta$
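The slides do not write out the partial likelihood, so the following R sketch is only an illustration under stated assumptions: a single covariate, no tied death times, and the standard form in which each observed death contributes exp(βx_i) divided by the sum of exp(βx_j) over everyone still at risk at that time. The data are simulated, and all numerical settings (sample size, true β, baseline and censoring rates) are hypothetical.

```r
# Simulated data with one binary covariate and true beta = 0.7 (assumed)
set.seed(1)
n <- 500
beta_true <- 0.7
x <- rbinom(n, 1, 0.5)
t_event <- rexp(n, rate = 0.1 * exp(beta_true * x))   # hazard = 0.1 * exp(beta * x)
t_cens  <- rexp(n, rate = 0.05)                       # independent censoring times
u     <- pmin(t_event, t_cens)                        # time on study
delta <- as.numeric(t_event <= t_cens)                # 1 = death, 0 = censored

# Negative log partial likelihood for a single coefficient (no ties assumed)
neg_log_pl <- function(beta) {
  eta <- beta * x
  total <- 0
  for (i in which(delta == 1)) {
    at_risk <- u >= u[i]                              # risk set at the i-th death time
    total <- total + eta[i] - log(sum(exp(eta[at_risk])))
  }
  -total
}

# Maximize the partial likelihood by minimizing its negative log
fit <- optimize(neg_log_pl, interval = c(-5, 5))
beta_hat <- fit$minimum
print(beta_hat)
```

Note that the baseline hazard never appears in `neg_log_pl`, which is why β can be estimated without specifying λ0(t). In practice one would call `coxph` from the survival package rather than code this by hand.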

Statistical Inference

- Therefore, we can get a good idea of the range in which the coefficient $\beta_j$ lies by considering the 95% confidence interval

$$\hat\beta_j \pm 1.96\, se(\hat\beta_j),$$

  where $se(\hat\beta_j)$ is the standard error of the estimator $\hat\beta_j$
- If the value $\beta_j = 0$ (the null hypothesis that variable $X_j$ has no effect on survival) is not contained in the confidence interval, this can be used as evidence that $X_j$ has a significant effect on survival (the direction of the effect depends on the sign of $\hat\beta_j$)
- You can also assess the strength of the evidence by computing a p-value; that is, how far out in the tail of a standard normal distribution the standardized test statistic lies:

$$\frac{\hat\beta_j}{se(\hat\beta_j)}$$
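As a worked illustration, the R sketch below applies these formulas to the estimate and standard error that appear in the R output later in these slides (coef = 0.329, se(coef) = 0.105). The variable names are illustrative.

```r
beta_hat <- 0.329   # estimated coefficient from the R output in these slides
se_beta  <- 0.105   # its standard error

# 95% confidence interval for beta_j and, by exponentiating, for the hazard ratio
ci_beta <- beta_hat + c(-1, 1) * 1.96 * se_beta
ci_hr   <- exp(ci_beta)

# Standardized test statistic and two-sided p-value from the standard normal
z <- beta_hat / se_beta
p_value <- 2 * pnorm(-abs(z))

print(round(c(lower = ci_beta[1], upper = ci_beta[2]), 3))
print(round(c(hr_lower = ci_hr[1], hr_upper = ci_hr[2]), 2))
print(c(z = round(z, 2), p = signif(p_value, 2)))
```

The interval for β_j excludes 0, and z agrees with the 3.13 reported in the output. The p-value computed here may differ slightly from the reported one because the inputs are rounded to three decimals.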

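Example Using R

    library(survival)
    # Columns (per the SAS input statement): days, cens, trt, meno, tsize, nodes, er
    data=read.table("cal8541.dat")
    time=data[,1]
    status=data[,2]
    trt=data[,3]
    # Drop treatment arm 3, keeping arms 1 and 2
    newdata=subset(data,trt!=3)
    newtime=newdata[,1]
    newstatus=newdata[,2]
    newtrt=newdata[,3]-1    # recode treatment 1/2 as 0/1
    # Fit the proportional hazards model with treatment as the only covariate
    ph=coxph(Surv(newtime,newstatus)~newtrt)
    ph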

Example Using R (Results)

    Call:
    coxph(formula = Surv(newtime, newstatus) ~ newtr

            coef exp(coef) se(coef)    z      p
    newtrt 0.329      1.39    0.105 3.13 0.0018

    Likelihood ratio test=9.86  on 1 df, p=0.00169
    n= 987, number of events= 367

One can also fit a multiple regression, for example:

    meno=newdata[,4]
    ph=coxph(Surv(newtime,newstatus)~newtrt + meno)

Example Using SAS

    options ps=59 ls=80;

    data bcancer;
      infile 'tsiatis/butch/cal8541.dat';
      input days cens trt meno tsize nodes er;
      years=days/365.25;

    data bcancer1;
      set bcancer;
      if trt=1 or trt=2;

    proc phreg data=bcancer1;
      model years*cens(0)=trt meno;
    run;