Multi-state Models: An Overview

Similar documents
Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models

Philosophy and Features of the mstate package

Multi-state models: prediction

Multistate models in survival and event history analysis

Multistate Modeling and Applications

Multi-state modelling with R: the msm package

A multi-state model for the prognosis of non-mild acute pancreatitis

Multistate models and recurrent event models

Lecture 7 Time-dependent Covariates in Cox Regression

Multistate Modelling Vertical Transmission and Determination of R 0 Using Transition Intensities

Multistate models and recurrent event models

Lecture 9. Statistics Survival Analysis. Presented February 23, Dan Gillen Department of Statistics University of California, Irvine

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

STAT331. Cox s Proportional Hazards Model

Package SimSCRPiecewise

A multistate additive relative survival semi-markov model

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016

A multi-state model for the prognosis of non-mild acute pancreatitis

Survival Analysis. Stat 526. April 13, 2018

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Survival Analysis I (CHL5209H)

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n

Cox s proportional hazards model and Cox s partial likelihood

Power and Sample Size Calculations with the Additive Hazards Model

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Joint Modeling of Longitudinal Item Response Data and Survival

Introduction to cthmm (Continuous-time hidden Markov models) package

Estimating transition probabilities for the illness-death model The Aalen-Johansen estimator under violation of the Markov assumption Torunn Heggland

Package threg. August 10, 2015

An Overview of Methods for Applying Semi-Markov Processes in Biostatistics.

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

Lecture 22 Survival Analysis: An Introduction

Journal of Statistical Software

Package crrsc. R topics documented: February 19, 2015

Lecture 5 Models and methods for recurrent event data

Survival Regression Models

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 8 Stat D. Gillen

Longitudinal + Reliability = Joint Modeling

On Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data

First Aid Kit for Survival. Hypoxia cohort. Goal. DFS=Clinical + Marker 1/21/2015. Two analyses to exemplify some concepts of survival techniques

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline:

Survival Analysis. 732G34 Statistisk analys av komplexa data. Krzysztof Bartoszek

Estimation for Modified Data

Sieve estimation in a Markov illness-death process under dual censoring

Modelling geoadditive survival data

Extensions of Cox Model for Non-Proportional Hazards Purpose

Building a Prognostic Biomarker

A Multistate Model for Bivariate Interval Censored Failure Time Data

Subject CT4 Models. October 2015 Examination INDICATIVE SOLUTION

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Abstract. 1. Introduction

Statistical Inference and Methods

Parametric multistate survival analysis: New developments

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

DAGStat Event History Analysis.

MAS3301 / MAS8311 Biostatistics Part II: Survival

for Time-to-event Data Mei-Ling Ting Lee University of Maryland, College Park

Survival Distributions, Hazard Functions, Cumulative Hazards

Lecture 3. Truncation, length-bias and prevalence sampling

Multi-state models: a flexible approach for modelling complex event histories in epidemiology

e 4β e 4β + e β ˆβ =0.765

Survival Analysis Math 434 Fall 2011

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Balancing Covariates via Propensity Score Weighting

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

Simple techniques for comparing survival functions with interval-censored data

Statistical Issues in the Use of Composite Endpoints in Clinical Trials

Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Lecture 7. Proportional Hazards Model - Handling Ties and Survival Estimation Statistics Survival Analysis. Presented February 4, 2016

Residuals and model diagnostics

Meei Pyng Ng 1 and Ray Watson 1

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials

Concepts and Tests for Trend in Recurrent Event Processes

PhD course: Statistical evaluation of diagnostic and predictive models

Statistical Methods for Alzheimer s Disease Studies

Models for Multivariate Panel Count Data

Continuous Time Survival in Latent Variable Models

The STS Surgeon Composite Technical Appendix

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

3003 Cure. F. P. Treasure

Frailty Models and Copulas: Similarities and Differences

Survival models and health sequences

CONTINUOUS TIME MULTI-STATE MODELS FOR SURVIVAL ANALYSIS. Erika Lynn Gibson

Survival Prediction Under Dependent Censoring: A Copula-based Approach

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Regression with incomplete covariates and left-truncated time-to-event data

Longitudinal breast density as a marker of breast cancer risk

A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis

Transcription:

Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016

Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed processes State Misclassification

Multi-state models Models for event history analysis. Times of occurrence of events Types of event that occurred Joint modelling of survival and important (categorical) time dependent covariates Model the transition intensities between states of a process E.g. healthy to diseased E.g. diseased to death Modelling onset and progression of chronic diseases Process usually modelled in continuous time

Example: Simple survival model Alive h(t) Dead

Example: Competing risks model h 1 (t) Cause 1 Alive h 2 (t) Cause 2 h 3 (t) Cause 3

Example: Progressive Illness-death model Healthy λ 01 (t) Ill λ 02 (t) λ 12 (t) Dead

Example: Outcome prediction in bone marrow transplantation R Keiding et al (2001) O: Initally transplanted C 0 A AC R: Relapse D: Die in Remission A: Acute graft-vs-host C: Chronic graft-vs-host D AC: Acute+Chronic gvhd

Modelling interdependence Standard survival data, and also competing risks data, involve patients having at most one event of interest Once each subject can experience more than one event, assumptions need to be made about dependencies between events Most commonly a Markov assumption is adopted, where only the current state and time govern the trajectory of the process. Potentially other time scales and summaries of past history may be important

Mathematical framework Models defined by transition intensities between states λ rs (t, F t ) = lim δt 0 P(X(t + δt) = s X(t) = r, F t ) δt where F t is the history, or filtration, of the process up to time t. Common assumptions Markov: λ rs (t, F t ) = λ rs (t) (Homogeneous) Semi-Markov: λ rs (t, F t ) = λ rs (t t r ) where t r is the time of entry into current state r.

Application: Cancer progression and survival Overall survival (OS) after diagnosis of cancer is heavily influenced by an intermediate event of cancer progression Progression-free survival (PFS) is commonly reported in addition to overall survival in cancer trials and used as the primary endpoint in some cases Defined as the minimum of time of death and time of progression Process of progression and survival can be represented through an illness-death multi-state model

Application: Cancer progression and survival Pre- Progression λ 01 (t) Post- Progression λ 02 (t) λ 12 (t, u) Dead

Application: Cancer progression and survival Appropriateness of PFS as a surrogate endpoint depends on association with OS Can be characterized through the transition intensities Often reasonable to assume λ 12 (t, u) = λ 12 (u) e.g. clock-reset at time of For PFS to be a good surrogate need λ 12 (t) λ 02 (t) progression Pre- Progression λ 01 (t) Dead Post- Progression λ 02 (t) λ 12 (t, u) Also need primary effect of treatments to be on λ 01 (t)

Model fitting Estimation for multi-state models with continuous observation (up to right-censoring) is quite straightforward Under a Markov or semi-markov assumption, the likelihood factorizes into periods of time at risk for each transition intensity Same formulation as for left-truncated (i.e. delayed entry) survival data A patient becomes at risk of transitions out of state i from their time of entry into state i Models for each transition intensity can be maximized independently, provided there are no shared parameters Can be fitted using the survival package in R Determining quantities of interest, such as transition probabilities, from the models is more challenging.

Example: Illness-death model Patient 1 enters illness state at 968 days. Dies at 1521 days In counting process format would represent data as follows: id entry exit from to event transtype 1 0 968 0 1 1 1 1 0 968 0 2 0 2 1 968 1521 1 2 1 3 Model without covariates in R: fit0 <- survfit(surv(entry,exit,event)~strata(transtype))

Transition Probabilities The transition probabilities are defined as P rs (t 0, t 1 ) = P(X(t 1 ) = s X(t 0 ) = r) Complicated function of the transition intensities In progressive models the transition probabilities can be written as an integral of the transition intensities with respect to the possible times of intermediate events. For more general Markov models, they are given by solving Kolmogorov Forward Equations (a system of differential equations)

Aalen-Johansen estimator Under a Markov assumption, the matrix of transition probabilities of an R state multi-state model can be estimated non-parametrically using the Aalen-Johansen estimator. ˆP(t 0, t 1 ) = k:t 0 t k t 1 (I + d ˆΛ k ) where d ˆΛ k is an R R matrix with (i, j) entry d ˆΛ ijk = d ijk r ik for i j, d ˆΛ iik = j i d ˆΛ ijk where d ijk : number of i j transitions at t k, r ik : number of subjects under observation in state i at t k

Covariate models Standard approach to incorporating covariates is to assume proportional intensities (Cox-Markov model) λ rs (t; z) = λ rs0 (t) exp(z β rs ) Extension of the Cox model, where the baseline intensities, λ rs0 (t), are non-parametric Potentially allow different β rs for each transition intensity and hence fit separate models to each.

Model building In practice, may have insufficient numbers of transitions between particular states to reliably fit independent models Constraints possible Allow common regression coefficients across transitions out of a state β rs = β r, for s = 1,..., R Allow proportionality between transition intensities λ rs0 (t) = λ r10 (t) exp(θ s ), s 1

Interpretation issues Having fitted a model for the effect of covariates on transition intensities, often want to also determine the transition probabilities for different covariate patterns This is reasonably straightforward to implement 1. Calculate the estimates of the transition intensities for a given patient with covariates z (using the Nelson-Aalen-Breslow estimator) d ˆΛ ijk (z) = d ˆΛ ijk (0) exp(z ˆβ) 2. Plug these into the Aalen-Johansen formula Procedure performed by the mstate package in R But no guarantee of a simple relationship that explains the covariate effect c.f. regression of cause-specific hazards in competing risks analysis

Example: Cancer Overall survival S(t) 0.0 0.2 0.4 0.6 0.8 1.0 Estimated survival 0 500 1000 1500 2000 2500 3000 Time (days) Patients with different covariates may have crossing survival curves e.g. if a covariate is associated with faster progression but improved survival post-progression Direct regression methods possible via pseudo-observations

Application: Health Economics In Health Economics, cost-effectiveness usually determined by Incremental Cost-Effectiveness Ratio (ICER): ICER = Cost QALY Quality Adjusted Life Years (QALY) is an estimate of life expectancy weighted by health state A multi-state model with states corresponding to the distinct health states can be built QALY depends on state occupancy probabilities, P 1r (0, t) and the health utilities, u r, in each state r. QALY = R r=1 τ 0 u r P 1r (0, t)dt

Intermittently observed processes It is not always possible to assume that the exact time of transitions between states can be observed (up to right censoring) Disease status may only be diagnosable at clinic visits, leading to interval censoring. In addition the precise transition(s) that took place may not be known. Time of entry into the absorbing state (e.g. death) may still be known up to right-censoring

Example: Cancer progression and survival While death will be known to the nearest day, progression is typically determined at scheduled follow-up visits Leads to dual censored data Almost always ignored in standard analysis, but does lead to bias (Zeng et al; 2015) S(t) 0.0 0.2 0.4 0.6 0.8 1.0 0 8 16 24 32 40 48 56 64 Time (weeks)

Intermittently observed processes Interval censoring makes estimation more difficult, particularly non- or semi-parametric approaches Usually need to assume that the observation times are fixed or, if random, non-informative. e.g. next clinic visit time determined at current visit and not influenced by disease progression in between. In the Markov case the likelihood can be expressed as a product of transition probabilities. Subject in states x 0, x 1,..., x m at times t 0, t 1,..., t m has likelihood m P xi 1 x i (t i 1, t i ; θ) i=1 Modifications needed to allow for exact times of death and when state occupied at end of follow-up is unknown.

Estimation: Markov case For time homogeneous models the transition probabilities can be expressed as a matrix exponential of the generator matrix of transition intensities P(t 0, t 1 ) = exp(λ(t 1 t 0 )) where Λ is the R R matrix with (r, s) entry λ rs for r s and λ rr = j λ rj. Piecewise constant transition intensities can also be accommodated straightforwardly. Kalbfleisch & Lawless (1985) developed methods for maximum likelihood estimation Implemented in the msm package in R (Jackson, 2011).

State Misclassification Often the underlying disease of interest is known (or assumed) to be progressive - but this is contradicted by individual patients sequences of states Cognitive decline Development of progressive chronic diseases Assume that the observed state at a clinic visit is subject to classification error e.g. diagnosis of dementia via memory tests e.g. patient undergoes a diagnostic test with < 100% sensitivity and specificity

Example: CAV in post heart-transplantation patients 4 state progressive illness-death model for development of cardiac allograft vasculopathy in post-heart transplant patients Diagnosis of disease status via angiogram at discrete clinic visits. 1972 clinic visits from 596 patients Angiogram potentially under diagnoses disease Underlying model assumed to be homogeneous Markov State 1 State 2 λ12 Disease Free Mild/Moderate CAV λ24 λ14 State 4 Death λ23 λ34 State 3 Severe CAV

Example: CAV in post heart-transplantation patients To From 1 2 3 4 Cen 1 1366 204 44 122 276 2 46 134 54 48 69 3 4 13 107 55 26 Observed transitions include some cases that contradict the assumption of a progressive disease Ad hoc approach: Assume true state is maximum observed up to that time Instead assume angiogram may misspecify disease to adjacent states e rs = P(O k = s X k = r)

State Misclassification A multi-state Markov model observed with misclassification can be represented by a hidden Markov model (HMM) HMM requires important assumption that conditional on the underlying states x 1, x 2,..., x m, the observed states o 1, o 2,..., o m are independent. Plausibility depends on method of diagnosis and regularity of tests Likelihood can be calculated iteratively through a forward recursion P(O 1,..., O m, X m) = P(O m X m) Xm 1 P(O 1,..., O m 1, X m 1 )P(X m X m 1 ) = e XmO m X m 1 P(O 1,..., O m 1, X m 1 )P Xm 1 X m (t m 1, t m)

Example: CAV in post heart-transplantation patients Parameter Naive Markov Hidden Markov λ 12 0.039 (0.027,0.057) 0.033 (0.021,0.050) λ 14 0.022 (0.017,0.030) 0.021 (0.015,0.029) λ 23 0.199 (0.162,0.246) 0.190 (0.143,0.252) λ 24 0.041 (0.023,0.075) 0.053 (0.029,0.099) λ 34 0.146 (0.116,0.184) 0.155 (0.120,0.201) β (IHD) 12 0.446 (0.185,0.706) 0.520 (0.234,0.807) β (dage) 12 0.022 (0.011,0.033) 0.025 (0.013,0.037) e 12 0.025 (0.015,0.042) e 21 0.186 (0.123,0.272) e 23 0.065 (0.038,0.108) e 32 0.102 (0.051,0.194)

State Misclassification Can also assume misclassification for processes with backwards transitions e.g. (discretization of) CD4 count in patients with HIV Becomes more difficult to distinguish between different parts of the underlying and observation process Markov assumption (vs. say semi-markov dependence) Assumption of conditional independence of observations given true states Assumption of non-informative observation times

Further issues In some settings may have clustered multi-state processes, e.g. multiple processes from the same person Either account for via shared random effects or via robust marginal analyses Data may have complicated sampling schemes e.g. Truncation because patients only including in a study if in a given state - requires modified likelihoods to estimate population-level quantities Patient initiated visit times - observation times no longer non-informative: limited methods to deal with this currently. Goodness-of-fit/Model diagnostics Models, particularly those for intermittently observed data, often make strong assumptions. Important to assess model fit.

Conclusions Multi-state models are particularly useful if we want to build a model for an overall process of survival Potential to obtain dynamic predictions, not just marginal quantities Able to accommodate observational data with complex sampling, e.g. intermittent patient-specific, unequally spaced observation times.

References Kalbfleisch, J. D., & Lawless, J. F. (1985). The analysis of panel data under a Markov assumption. Journal of the American Statistical Association, 80, 863-871. Keiding, N., Klein, J. P., & Horowitz, M. M. (2001). Multistate models and outcome prediction in bone marrow transplantation. Statistics in Medicine, 20, 1871-1885. Jackson, C. H. (2011). Multi-state models for panel data: the msm package for R. Journal of Statistical Software, 38(8), 1-29. Zeng, L., Cook, R. J., Wen, L., & Boruvka, A. (2015). Bias in progressionfree survival analysis due to intermittent assessment of progression. Statistics in Medicine, 34, 3181-3193.