Survival models and health sequences

Similar documents
Survival models and health sequences

Lecture 7 Time-dependent Covariates in Cox Regression

Joint Modeling of Longitudinal Item Response Data and Survival

Robust estimates of state occupancy and transition probabilities for Non-Markov multi-state models

STAT 526 Spring Final Exam. Thursday May 5, 2011

Simple techniques for comparing survival functions with interval-censored data

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Sample size re-estimation in clinical trials. Dealing with those unknowns. Chris Jennison. University of Kyoto, January 2018

Multi-state Models: An Overview

Ph.D. course: Regression models. Introduction. 19 April 2012

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status

Nonparametric two-sample tests of longitudinal data in the presence of a terminal event

BIOS 2083: Linear Models

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

STAT 525 Fall Final exam. Tuesday December 14, 2010

Multistate models and recurrent event models

BIOS 2083 Linear Models c Abdus S. Wahed

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

The Ef ciency of Simple and Countermatched Nested Case-control Sampling

1.4 Properties of the autocovariance for stationary time-series

On prediction and density estimation Peter McCullagh University of Chicago December 2004

Multistate models and recurrent event models

Lecture 9. Statistics Survival Analysis. Presented February 23, Dan Gillen Department of Statistics University of California, Irvine

Written Exam (2 hours)

β j = coefficient of x j in the model; β = ( β1, β2,

Log-linearity for Cox s regression model. Thesis for the Degree Master of Science

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

The coxvc_1-1-1 package

Local Likelihood Bayesian Cluster Modeling for small area health data. Andrew Lawson Arnold School of Public Health University of South Carolina

Extensions of Cox Model for Non-Proportional Hazards Purpose

ST 732, Midterm Solutions Spring 2019

Lecture 3. Truncation, length-bias and prevalence sampling

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

General Linear Model: Statistical Inference

An application of the GAM-PCA-VAR model to respiratory disease and air pollution data

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

BIOSTATISTICAL METHODS

MAS3301 / MAS8311 Biostatistics Part II: Survival

Analyzing Pilot Studies with Missing Observations

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

TMA 4275 Lifetime Analysis June 2004 Solution

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic.

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

CHL 5225 H Crossover Trials. CHL 5225 H Crossover Trials

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Censoring mechanisms

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

UNIVERSITY OF CALIFORNIA, SAN DIEGO

DAGStat Event History Analysis.

A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Chapter 7: Hypothesis testing

A Framework for Modeling Positive Class Expansion with Single Snapshot

Causal Inference. Miguel A. Hernán, James M. Robins. May 19, 2017

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis

A. Motivation To motivate the analysis of variance framework, we consider the following example.

Residuals and model diagnostics

ANOVA TESTING 4STEPS. 1. State the hypothesis. : H 0 : µ 1 =

Survival Regression Models

Lecture 5 Models and methods for recurrent event data

Goodness-Of-Fit for Cox s Regression Model. Extensions of Cox s Regression Model. Survival Analysis Fall 2004, Copenhagen

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

Econometrics Midterm Examination Answers

Quantitative Techniques - Lecture 8: Estimation

4. Comparison of Two (K) Samples

Akaike criterion: Kullback-Leibler discrepancy

Mixture modelling of recurrent event times with long-term survivors: Analysis of Hutterite birth intervals. John W. Mac McDonald & Alessandro Rosina

Multistate Modeling and Applications

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work

Lecture 6: Single-classification multivariate ANOVA (k-group( MANOVA)

Published online: 10 Apr 2012.

VIII. ANCOVA. A. Introduction

Frailty Modeling for clustered survival data: a simulation study

DESAIN EKSPERIMEN BLOCKING FACTORS. Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya

18 Bivariate normal distribution I

We begin by thinking about population relationships.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

MIXED MODELS THE GENERAL MIXED MODEL

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

Nonparametric Model Construction

Part 6: Multivariate Normal and Linear Models

Estimation for Modified Data

Generalizing the MCPMod methodology beyond normal, independent data

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

Survival Analysis Math 434 Fall 2011

Exercises. (a) Prove that m(t) =

What is Experimental Design?

STAT331. Cox s Proportional Hazards Model

Multivariate Survival Analysis

Part III Measures of Classification Accuracy for the Prediction of Survival Times

Ph.D. Preliminary Examination Statistics June 2, 2014

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017

Statistics and Probability Letters. Using randomization tests to preserve type I error with response adaptive and covariate adaptive randomization

Transcription:

Survival models and health sequences Walter Dempsey University of Michigan July 27, 2015

Survival Data Problem Description Survival data is commonplace in medical studies, consisting of failure time information for each patient i, T i. Many studies require patient monitoring, generating a series of measurements of a health process, Y i. Acompleteuncensoredobservationforonepatientconsistsofthefollowingtriple: (Y, T, t) Y health process (ex. CD4 cell count, Prothrombin ndex, Quality of Life ndex) T failure time as measured from recruitment t appointment schedule Joint modeling of the repeated measurements and survival data is necessary in order to gauge the e ect of covariates on the response. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 2 / 24

Survival Process Problem Description Survival Process AstochasticprocessY i (t) :R!Rwhere R is the state space. Typically Y i (t) is state of health or quality of life of patient i at time t, typically measured since recruitment ex. Simple Survival Process, state space is R = {0, 1} is su cient Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 3 / 24

Survival Process Problem Description Survival Process AstochasticprocessY i (t) :R!Rwhere R is the state space. Typically Y i (t) is state of health or quality of life of patient i at time t, typically measured since recruitment ex. Simple Survival Process, state space is R = {0, 1} is su cient Flatlining An absorbing state [ 2Rsuch that Y (t) =[ ) Y (t 0 )=[ for all t 0 > t. The survival time, T, is the time to failure: T i =sup{t : Y i (t) 6= [} Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 3 / 24

Appointment Schedule Problem Description The appointment schedule is a random subset t [0, T ), which is informative for survival regardless of health trajectory Sequential Conditional ndependence Consider patient with k appointments t (k) =(t 0 <...<t k 1 ). The sequence Y [t (k) ] may a ect the scheduled appointment date t k.thesequential conditional independence assumption states t k? Y (T, t (k), Y [t (k) ]). (1) (Example) Half of patients had appointment within the last three weeks of life! unclear if assumption is violated by patient-initiated appointments (71 within 10 days prior to death). Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 4 / 24

Problem Description Motivating Example : Prednisone Case Study From 1962 to 1969, 488 patients with histologically verified liver cirrhosis at hospitals in Copenhagen were randomly assigned to the two treatment arms. 251 and 237 patients in the prednisone and placebo treatment groups respectively 292 uncensored records ) 40% patients have censored records (45% of total measurements correspond to censored patient records) The purpose of the trial was to ascertain whether prednisone prolonged survival for patients with cirrhosis. Prothrombin ndex 0 20 40 60 80 100 120 140 0 1 2 3 4 5 Time Since Recruitment Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 5 / 24

Problem Description Motivating Example : Prednisone Case Study Table 1 shows average Y -values by T and t. Thecellaverages8 266non-independent, highly variable measurements. Table 1 : Average prothrombin levels indexed by T and t Survival Time t after recruitment (yrs) time (T ) 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8+ 0 1 58.0 1 2 72.5 66.4 2 3 72.6 73.2 66.0 3 4 69.8 71.2 68.5 54.2 4 5 68.5 75.7 72.5 74.6 57.7 5 6 70.5 77.3 73.5 57.1 64.5 60.9 6 7 81.8 73.6 81.1 80.6 79.4 75.5 75.8 7 8 84.4 88.8 88.1 92.1 85.2 81.2 84.3 88.1 8+ 77.3 73.6 87.0 74.1 92.0 80.3 89.2 79.4 84.7 Suggests reverse-time is a more e ective way of organizing the data to display main trends in the mean response. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 6 / 24

Revival Process Section 2 Revival Process Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 7 / 24

The Revival Process Revival Process Revival Model Assuming the survival time is finite with probability one, wecandefinetherevival Process Z i (s) =Y i (T i s) The revival process is the re-aligned health process. This is well-defined conditional on the survival time and provides an invertible mapping from the survival process to the joint survival time and revival process. Y $ (Z, T ) We consider this joint process in lieu of the survival process as we hope it provides e ective alignment of patient records for comparison and signal extraction. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 8 / 24

The Revival Process Revival Process Revival Model Table 2 confirms there is considerable excess variation associated with rows (116.8) and with the reverse-time factor (77.8) but not so much with columns. Table 2 : ANOVA decomposition for Table 1 Source U/V kp UY k 2 kp VY k 2 d.f. M.S. Diagonal (R + C + D)/(R + C) 544.3 7 77.8 Column (R + C + D)/(R + D) 237.9 7 34.0 Row (R + C + D)/(C + D) 817.3 7 116.8 Residual RC/(R + C + D) 497.2 21 23.7 ) Time reversal yields e ective alignment of patient records for comparison and signal extraction. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 9 / 24

Covariates Revival Process Revival Model Definition (Temporal and time-evolving variable) A temporal variable x is a function defined for every t 0. t is a covariate if is a function on the units (entire function is determined at baseline) Typically implies x constant, but there are exceptions (eg. patient s age) A time-evolving variable x 0 is a temporal variable that is not a function on the units (eg. marital status, quality of life, and air quality) Every time-evolving variable is necessarily part of the response process The joint distribution of time-evolving variables and survival time may be used to predict survival time beyond t whose status is known at times t prior to t. Probabilistic prediction is not possible without the requisite mathematical structure of -fields. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 10 / 24

Revival Process Revival Model Treatment e ect: definition and estimation For patient i, leta i (t) be the treatment arm scheduled for patient i at time t. Null level required for times at and before recruitment, t apple 0. Entire temporal trajectory, a i (t), for t 0isspecifiedatbaselineanddeterminedby randomization (ie. it is a time-dependent covariate). Let ā i (s) =a i (T s) bethetreatmentarmexpressedinrevivaltime. Z? T ā because T is a function of ā Lack of interference () :thetreatmentassignedtooneindividualhasnoe ecton the response distribution for other individuals. Lack of interference () :thetreatmentprotocolatonetimehasnoe ectonthe response distribution at other times. Z[s]? ā ā[s] Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 11 / 24

Revival Process Revival Model Treatment e ect: definition and estimation Consider two patients, one in each treatment arm, a i (t) =ā i (T i t)=1, a j (t) =ā j (T j t)=0 such that x i = x j. The conventional treatment definitions are 10(t) =E(Y i (t) E(Y j (t)) or 0 10(t) =E(Y i (t) E(Y j (t) T i, T j > t) Supposing the dependence on T is additive, the di erence of means at revival time s E(Z i (s) T ) E(Z j (s) T )= 10(s)+ (T i ) (T j ) contains both a treatment e ect and an e ect due to the di erence in survival times. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 12 / 24

Conditional Distribution Survival Prediction The joint density of (T, t (k), Y [t (k) ]at(t, t (k), y) is a product of three factors f (t) Y j<k p(t j, y j H j, T = t) = f (t) Y j<k p(y j H j, T = t) Y j<k p(t j H j, T = t) = f (t) g k (y; t t (k) t) Y j<k p(t j H j, T = t), (2) where f = F 0 is the survival density, and H j is the observed history (t (j), Y [t (j) ]) at time t j 1. We assume the appointment schedule is uninformative for prediction in the sense that for t k 1 < t k < t. p(t k H k, T = t) =p(t k H k, T = 1) (3) Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 13 / 24

Parameter Estimation Likelihood Factorization : Survival distribution specification Re-alignment by the survival time requires the survival time to be finite with probability one. pr(t < 1) =1 Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 14 / 24

Parameter Estimation Likelihood Factorization : Survival distribution specification Re-alignment by the survival time requires the survival time to be finite with probability one. pr(t < 1) =1 Harmonic Process The harmonic process is an exchangeable survival process defined by two non-negative parameters (, ). T i Exp ( ( (1 + ) ( ))), where ( ) is the derivative of the log-gamma function. Given unique survival times T 1 <...<T k the conditional hazard is the product of a continuous and discrete component. The continuous component is H(t) = X i:t i applet T i T i 1 R ] (T i 1 )+ + t T j R ] (T j )+, where R ] (t) =#{i : T i < t}. Thediscretecomponentis Y j:t j applet r j + r j + d j + Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 14 / 24

ncomplete Records Parameter Estimation On the assumption that censoring is uninformative, the joint density is Z f (t)p(t c t)g(y; t t t)dt t c where t c = t \ [0, c] andy [t c]=y. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 15 / 24

ncomplete Records Parameter Estimation On the assumption that censoring is uninformative, the joint density is Z f (t)p(t c t)g(y; t t t)dt where t c = t \ [0, c] andy [t c]=y. mputation t c mpute survival times T 0 using the conditional survival distribution f T Y [t c], t c, T > c; ˆu, ˆ / f (T ; ˆ )g(y [t c]; T t c; ˆu)1[T > c] The log-likelihood component associated with the imputed, uncensored record is given by log g(y; T 0 t c; )+logf (T 0 ; ) so parameter estimation after imputation is straightforward. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 15 / 24

A worked example : cirrhosis study Section 5 A worked example : cirrhosis study Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 16 / 24

A worked example : cirrhosis study Motivating Example: Continued Figure 1 : Smoothed Mean Curve for Prothrombin ndex Aligned by Time until Failure Control Prednisone 50 60 70 80 90 Prothrombin ndex -5-4 -3-2 -1 0 Time Until Failure Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 17 / 24

A worked example : cirrhosis study Motivating Example: Continued We suggest the following model based on Figure 1, E(Z i (s) T )= + āi (s) + 0T i + 1s + s log(s + ) cov(z i (s), Z j (s 0 ) T )= 1 2 ijk 1(s, s 0 ; )+ 2 2 ij + 3 2 ij ss 0 s s K 1(s, s 0 0 ; )=exp Table 3 : Coe cients for revival model Censored records Uncensored records Covariate Coef. S.E. Ratio Coef. S.E. Ratio T? Null Treatment 0.00 - - 0.00 - - - Control 4.13 1.84 2.3 2.41 1.43 1.7 0.94 Prednizone 11.56 1.75 6.6 13.55 1.47 9.2 1.14 Survival (T ) 2.65 0.39 6.9 1.75 0.47 3.7 2.79 Revival (s) 2.78 0.49 5.7 2.11 0.47 4.5 1.72 log(s + ) 3.74 2.68 1.4 4.66 0.41 11.5 0.46 0.164 Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 18 / 24

A worked example : cirrhosis study Motivating Example: Continued Table 4 : Variance components Censored records Uncensored records Coe cient S.E. Coe cient S.E. AR1 2 1 166.27 29.79 209.95 29.54 Patient 2 2 155.84 31.02 206.82 34.48 White Noise 2 3 223.69 17.30 179.59 12.90 Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 19 / 24

A worked example : cirrhosis study E ect of prothrombin on prognosis Over a period of 5 years and one month following recruitment, patient u had eight appointments with prothrombin values as follows: t u (days) 0 126 226 392 770 1127 1631 1855 Y u[t u] 49 93 122 120 110 100 72 59 This is the record for patient 402 who was assigned to prednisone and was subsequently censored at 2661 days. The log density at y u is a quadratic form depending on t only through µ. h(t, y u)=const (y u µ) 0 1 (y u µ)/2 Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 20 / 24

A worked example : cirrhosis study E ect of prothrombin on prognosis This estimated factor is shown in Fig. 2a for three versions of the record in which the final prothrombin value is 59, 69 or 79. Figure 2 : Three versions of the record for patient 402: log modification factors for the predictive survival density (left panel) and hazard functions (right panel). Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 21 / 24

Summary A worked example : cirrhosis study The health sequence is regarded as a random process in its own right, not as a time-dependent covariate governing survival. To a substantial extent, the model for survival time is decoupled from the revival model for the behaviour of the health sequence in reverse time. Realignment implies that value Y i (0) at recruitment must not be treated as a covariate, but as an integral part of the response sequence. f they were available, values prior to recruitment could also be used. The definition of a treatment e ect is not the usual one because the natural way to compare the records for two individuals is not at a fixed time following recruitment, but at a fixed revival time. The treatment value need not be constant in revival time. The predictive value of a partial health sequence for subsequent survival emerges naturally from the joint survival-revival distribution. n particular, the conditional hazard given the finite sequence of earlier values is typically not constant during the subsequent inter-appointment period. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 22 / 24

Summary A worked example : cirrhosis study Records cannot be aligned until the patient dies, which means that the revival process is not observable component-wise until T is known. As a result, the likelihood analysis for incomplete records is technically more complicated. The omission of incomplete records from the revival likelihood does not lead to bias in estimation, but it does lead to ine ciency, which could be substantial if the majority of records are incomplete. The principal assumption, that appointment dates be uninformative for subsequent survival, does not a ect likelihood calculations, but it does a ect prognosis calculations for individual patients. For that reason, it is advisable to label all appointments as scheduled or unscheduled. Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 23 / 24

A worked example : cirrhosis study Thank you Walter Dempsey (University of Michigan) Survival models and health sequences July 27, 2015 24 / 24