Introduction Mathematical theory Practice Discussion Philosophy and Features of the mstate package Liesbeth de Wreede, Hein Putter Department of Medical Statistics and Bioinformatics Leiden University Medical Center 19 September 2008 Gent, IAP workshop
Outline
1 Introduction: Example; The model
2 Mathematical theory: Notation and basic functions; Estimation of hazards; Estimation of state probabilities
3
4 Practice: Data; Covariates; Preparation of the data; Estimation
5 Discussion
Example: survival after bone-marrow transplantation
Data provided by the European Registry for Blood and Marrow Transplantation (EBMT): leukemia patients who received a bone-marrow transplant.
1977 patients.
Events recorded after transplant: relapse, death.
Two covariates considered here: risk score (categorical) and age at transplant (continuous).
The simplest multi-state model: the illness-death model
State 1 (transplant) -> State 2 (relapse)
State 1 (transplant) -> State 3 (death)
State 2 (relapse) -> State 3 (death)
Figure: The illness-death model
The software works for more complicated models as well.
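The structure of the illness-death model can be written down as a transition matrix. The following is a minimal base-R sketch: the state labels Tx, Rel and Death are taken from the console output shown later in the talk, and the encoding (allowed transitions numbered 1, 2, 3; impossible transitions set to NA) is an assumption about how such a matrix is typically stored. The mstate function transMat builds a matrix of this kind.

```r
# Illness-death model: three states, three allowed transitions.
states <- c("Tx", "Rel", "Death")

# Start with "no transition possible" everywhere (assumed NA convention).
tmat <- matrix(NA_integer_, nrow = 3, ncol = 3,
               dimnames = list(from = states, to = states))

# Number the allowed transitions in the order used in the talk.
tmat["Tx",  "Rel"]   <- 1L  # transplant -> relapse
tmat["Tx",  "Death"] <- 2L  # transplant -> death
tmat["Rel", "Death"] <- 3L  # relapse    -> death

print(tmat)
```

Death has no outgoing transitions, so its row stays NA throughout; that is what makes it an absorbing state.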
Characteristics of the models:
One or more starting states, one or more intermediate states, one or more absorbing states.
Baseline hazards are non-parametric; the estimators change only at event times.
Covariates at baseline can be included through a Cox model: the individual hazard is proportional to the baseline hazard.
Possible assumption: the baseline hazards of several transitions are proportional to each other, usually for transitions into the same state (proportional hazards model).
Dynamic prediction
Clinical question: how to give a prognosis for a given patient with known covariate values and post-transplant history?
Dynamic: the prediction is updated by incorporating information about intermediate events.
Standard errors of the estimators are needed.
Notation and basic functions
Transition probability matrix $P(s,t)$ with entries $P_{hj}(s,t) = P(X(t) = j \mid X(s) = h)$.
Markov models: the future depends on the history only through the present. (Clock-forward models: time $t$ refers to the time since the patient entered the initial state.)
Dynamic prediction: $P(s,t)$ with either $s$ (backward) or $t$ (forward) varying in time.
Hazard rate $\alpha_{hj}(t)$, cumulative hazard $A_{hj}(t) = \int_0^t \alpha_{hj}(u)\,du$; $A = \{A_{hj}\}$.
Semi-parametric models (Cox): $\alpha_{hj}(t \mid Z) = \alpha_{hj,0}(t)\exp(\beta^\top Z_{hj})$.
Estimation of hazards
Estimation of baseline cumulative hazards: counting process approach, asymptotic theory and a stratified model (Andersen, Borgan, Gill and Keiding, Statistical Models Based on Counting Processes).
Regression coefficients β are estimated by maximum partial likelihood.
Calculate variances (by means of martingales and the functional delta method). Two types:
Aalen: cumulative hazards are continuous; extendable to models with covariates.
Greenwood: cumulative hazards can have jumps; exact multinomial standard errors if there is no censoring.
Two sources of error in the Cox model:
1 Estimation of the coefficients β,
2 Estimation of the baseline cumulative hazard.
Extension to the proportional hazards model: consider strata instead of transitions. Baseline hazards within one stratum are proportional.
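Estimation of state probabilities
The transition probability matrix is the product integral of the cumulative hazards:
$$P(s,t) = \prod_{u \in (s,t]} \bigl(I + dA(u)\bigr).$$
Estimate the transition matrix by plugging in the estimated cumulative hazards (Aalen-Johansen-type estimator):
$$\hat{P}(s,t \mid Z) = \prod_{u \in (s,t]} \bigl(I + d\hat{A}(u \mid Z)\bigr).$$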
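Because the estimated cumulative hazards are step functions, the product integral reduces to a finite matrix product over the event times. The following base-R sketch illustrates this for the illness-death model. The hazard increments are made-up numbers, the function name aalen_johansen is hypothetical, and the convention that each row of dA sums to zero (diagonal equal to minus the sum of the off-diagonal entries) is the usual one but is assumed here rather than stated on the slide. It is not the implementation used by probtrans.

```r
# Made-up hazard increments dA_hj(u) at three successive event times.
dA12 <- c(0.10, 0.00, 0.05)  # transplant -> relapse
dA13 <- c(0.02, 0.04, 0.00)  # transplant -> death
dA23 <- c(0.00, 0.20, 0.10)  # relapse    -> death

# Hypothetical helper: multiply (I + dA(u)) over all event times.
aalen_johansen <- function(dA12, dA13, dA23) {
  P <- diag(3)                      # P(s, s) = I
  for (k in seq_along(dA12)) {
    dA <- matrix(0, 3, 3)
    dA[1, 2] <- dA12[k]
    dA[1, 3] <- dA13[k]
    dA[2, 3] <- dA23[k]
    diag(dA) <- -rowSums(dA)        # rows of dA sum to zero
    P <- P %*% (diag(3) + dA)       # one factor of the product integral
  }
  P
}

P <- aalen_johansen(dA12, dA13, dA23)
print(round(P, 4))
```

Each row of the result is a probability distribution over the three states, and the last row stays (0, 0, 1) because death is absorbing.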
Standard errors:
We calculate the standard errors of the estimated transition probabilities as a function of the (co-)variances of the hazards.
Advantage: for non-standard models it suffices to adjust the calculation of the latter.
Direct or through a recursion formula (backward, forward; Aalen(-type), Greenwood):
$$\operatorname{Var}\bigl(\hat{P}(s,t)\bigr) = \bigl\{(I + d\hat{A}(t \mid Z))^\top \otimes I\bigr\}\,\operatorname{Var}\bigl(\hat{P}(s,t-)\bigr)\,\bigl\{(I + d\hat{A}(t \mid Z)) \otimes I\bigr\} + \bigl\{I \otimes \hat{P}(s,t-)\bigr\}\,\operatorname{Var}\bigl(d\hat{A}(t \mid Z)\bigr)\,\bigl\{I \otimes \hat{P}(s,t-)^\top\bigr\}$$
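One way to read the recursion is as the delta method applied to the update $\hat{P}(s,t) = \hat{P}(s,t-)\,(I + d\hat{A}(t))$, with matrices stacked column-wise into vectors and Kronecker products appearing as the two linear maps. The base-R sketch below checks those two vectorisation identities numerically. The matrices are random stand-ins, not estimates from the EBMT data, and the Kronecker-product reading of the slide formula is itself an assumption.

```r
set.seed(1)
I3 <- diag(3)

# Random stand-ins for P(s, t-) and for I + dA(t).
P_prev <- matrix(runif(9), 3, 3)
B      <- I3 + matrix(rnorm(9, sd = 0.01), 3, 3)

# Column-stacked version of the updated matrix P(s, t-) %*% (I + dA(t)).
lhs <- as.vector(P_prev %*% B)

# Linear in P(s, t-): vec(P B) = (B^T kronecker I) vec(P).
rhs_P <- as.vector(kronecker(t(B), I3) %*% as.vector(P_prev))

# Linear in I + dA(t): vec(P B) = (I kronecker P) vec(B).
rhs_B <- as.vector(kronecker(I3, P_prev) %*% as.vector(B))

print(max(abs(lhs - rhs_P)))
print(max(abs(lhs - rhs_B)))
```

Both differences are zero up to rounding, which is why the variance of the update splits into one term driven by the variance of $\hat{P}(s,t-)$ and one driven by the variance of $d\hat{A}(t)$.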
mstate: R package
can be combined with other packages,
written by Hein Putter, Marta Fiocco and Liesbeth de Wreede,
covers data preparation and calculation,
first package to cover a broad range of models, estimators and their standard errors.
Philosophy: flexibility
  different multi-state models,
  different covariate effects (any mixture of identical or differential effects across transitions),
  users can apply a subset of the functions to non-standard models (e.g. calculate their own hazards and standard errors and then apply probtrans),
  models without covariates: Greenwood and Aalen estimators of the variances,
  exact in case of standard models; simulation/bootstrap procedures in other cases (mssample/msboot).
Data
score: EBMT score (0 = low risk, 1 = medium risk, 2 = high risk), highly predictive for relapse and survival.
age: continuous (mean 35).
1977 subjects, 872 deaths, 456 relapses.
Transition-/stratum-specific covariates

age  score  sx  Tr  Stra  score.1  score.2  sx.1  sx.2  sx.3  RL(t)
age  score  sx  1   1     score    0        sx    0     0     0
age  score  sx  2   2     0        score    0     sx    0     0
age  score  sx  3   2     0        score    0     0     sx    1

Table: Covariates at baseline and expanded covariates for one person

The time-dependent covariate RL(t) distinguishes between different transitions in the same stratum.
Long format
msprep arranges the data in long format instead of wide format; expand.covs makes the covariates of an individual transition-specific.

> msebmt <- msprep(ebmt,tmat,
+   time=c("dumtime","rel","srv"),
+   status=c("dumstat","relstat","srvstat"),
+   keep=covs)
> msebmt <- expand.covs(msebmt,tmat,covs)
  id from to trans       transn Tstart Tstop time status score age score.1
1  1    1  2     1    Tx -> Rel      0  1610 1610      0     1  28       1
2  1    1  3     2  Tx -> Death      0  1610 1610      0     1  28       0
3  2    1  2     1    Tx -> Rel      0   165  165      1     1  33       1
4  2    1  3     2  Tx -> Death      0   165  165      0     1  33       0
5  2    2  3     3 Rel -> Death    165   961  796      1     1  33       0
6  3    1  2     1    Tx -> Rel      0  1508 1508      0     1  38       1
  score.2 score.3 age.1 age.2 age.3
1       0       0    28     0     0
2       1       0     0    28     0
3       0       0    33     0     0
4       1       0     0    33     0
5       0       1     0     0    33
6       0       0    38     0     0
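The idea behind transition-specific covariates can be illustrated without the package. The base-R sketch below rebuilds the age.1, age.2 and age.3 columns for the three rows of patient 2 from the output above: each expanded column equals the covariate on rows of its own transition and zero elsewhere. The data frame name long is hypothetical, and the sketch only mimics the result of expand.covs for one numeric covariate; it is not how the package implements it.

```r
# Long-format rows of patient 2 (id, transition number, age) from the output.
long <- data.frame(id = c(2, 2, 2), trans = c(1, 2, 3), age = c(33, 33, 33))

# One column per transition: the covariate on "its" rows, 0 on all others.
for (k in sort(unique(long$trans))) {
  long[[paste0("age.", k)]] <- ifelse(long$trans == k, long$age, 0)
}

print(long)
```

With these columns in a stratified Cox model, the coefficient of age.k describes the effect of age on transition k only.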
Estimation of coefficients
Use the standard Cox function coxph to estimate the parameters (with transitions as strata) and to select relevant covariates.
Model I: all baseline hazards independent, all covariates transition-specific.
Model II: baseline hazards into the death state proportional, each covariate has the same effect on all transitions.

> library(survival)
> c1 <- coxph(Surv(Tstart,Tstop,status) ~ score.1 + score.2 + score.3 +
+   age.1 + age.2 + age.3 + strata(trans), data=msebmt)
> c2 <- coxph(Surv(Tstart,Tstop,status) ~ score + age + rel.srv +
+   strata(to), data=msebmt)
Estimation of hazards
Use msfit to build the matrix of cumulative hazards for the individual for whom the prediction is made:

> HvH1 <- msfit(c1,newdata=ndata1)
  time         Haz trans
1  0.5 0.001939696     1
2  1.0 0.002911486     1
3  2.0 0.002911486     1
4  5.0 0.002911486     1
5  7.0 0.002911486     1
6  8.0 0.002911486     1
     time    varhaz trans1 trans2
3601 2292 0.4380339      3      3
3602 2322 0.4380339      3      3
3603 2344 0.4637127      3      3
3604 2537 0.5211256      3      3
3605 2596 0.6213435      3      3
3606 2597 0.6213435      3      3
Estimation of transition probabilities
Use probtrans on the msfit result:

> pt1 <- probtrans(HvH1,tmat,predt=0,direction="forward")
  time   pstate1     pstate2     pstate3         se1         se2         se3
1  0.0 1.0000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
2  0.5 0.9962635 0.001939696 0.001796783 0.001887622 0.001388488 0.001278757
3  1.0 0.9952954 0.002907855 0.001796783 0.002137196 0.001713350 0.001278757
4  2.0 0.9943988 0.002907855 0.002693317 0.002324428 0.001713350 0.001573487
5  5.0 0.9917109 0.002907855 0.005381252 0.002826827 0.001713350 0.002253818
6  7.0 0.9899210 0.002907855 0.007171193 0.003128568 0.001713350 0.002624354
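The standard errors in this output can be turned into pointwise confidence intervals. The base-R sketch below does this for the row at time 7.0, copying the estimates and standard errors from the output above. The choice of a plain Wald-type 95% interval on the probability scale, truncated to [0, 1], is an assumption made for illustration; the talk does not say which interval the authors would use.

```r
# Estimates and standard errors at time 7.0, copied from the probtrans output.
p  <- c(pstate1 = 0.9899210, pstate2 = 0.002907855, pstate3 = 0.007171193)
se <- c(0.003128568, 0.001713350, 0.002624354)

# Assumed: Wald-type 95% interval, truncated to the unit interval.
z  <- qnorm(0.975)
ci <- cbind(estimate = p,
            lower    = pmax(0, p - z * se),
            upper    = pmin(1, p + z * se))

print(round(ci, 4))
```

For probabilities this close to 0 or 1 the truncation is active, which suggests that an interval built on a transformed scale (for example log or complementary log-log) may behave better.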
Discussion:
It is possible to experiment with different covariate effects.
Proportional baseline hazards can be considered.
Dynamic prediction is possible.
Standard errors of all estimated quantities are calculated.
Extensions and modifications of the model:
1 Reduced rank,
2 Time-dependent covariates,
3 Semi-Markov, non-Markov.