Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Nicholas C. Henderson, Thomas A. Louis, Gary Rosner, Ravi Varadhan
Johns Hopkins University
July 31, 2018

Heterogeneity of Treatment Effect (HTE)

- Heterogeneity of Treatment Effect (HTE) refers to variability in treatment effect that is attributable to observable differences in patient characteristics.
- Accurate evaluation of HTE offers many potential benefits, including informing patient decision-making and appropriately targeting existing therapies.
- Often, HTE is analyzed mainly to examine the consistency of the treatment effect across key patient subgroups.
- This is typically done through subgroup analyses or tests for treatment-covariate interactions.

HTE Goals/Questions

Characterizing and utilizing HTE encompasses a wide range of related goals and questions. Many of these key questions go beyond what can be addressed through conventional subgroup analysis.

Key questions of interest include:
- Quantifying overall heterogeneity in treatment response.
- Estimating the proportion of patients that benefit from treatment.
- Detecting cross-over (qualitative) interactions.
- Estimating individualized treatment effects.

Modeling HTE

- In contrast to subgroup analysis, many important HTE questions could be directly addressed if a sufficiently rich model describing patient outcomes were available.
- Bayesian nonparametric methods are well-suited to provide this individual-level view of HTE.
- Bayesian nonparametrics allow construction of flexible models for patient outcomes, coupled with probability modeling of all unknown quantities.
- Motivation of this work: develop a flexible, nonparametric approach that can address many of the previously highlighted HTE goals.

Why Bayes?

- Emphasizes estimation of treatment effect heterogeneity rather than hypothesis testing.
- Well-suited to estimation with many parameters and small subgroups; estimates tend to be shrunk when data are sparse.
- Direct probability statements for questions of interest: e.g., what is the probability that a given individual will benefit from the treatment?
- Customized treatment recommendations: one can use the posterior for each individual and directly weigh efficacy against safety.

Time-to-Event Data and Notation

Our focus here is on cases where patient outcomes are times to an event: $T_1, \ldots, T_n$.

For the $i$-th patient, we observe:
- $Y_i$ = duration of follow-up, $Y_i = \min\{T_i, C_i\}$, where $C_i$ is the censoring time.
- $\delta_i = 1$ if the failure time is observed, and $\delta_i = 0$ if the outcome is right-censored.
- $A_i$ = treatment assignment, $A_i = 1$ or $A_i = 0$.
- $x_i$ = a collection of baseline covariates.
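As a small illustration of this notation, the sketch below builds the observed pair $(Y_i, \delta_i)$ from latent failure and censoring times. The function name `observe`, the simulated distributions, and the convention that a tie $T_i = C_i$ counts as an observed event are assumptions made for the example, not part of the slides.

```python
import random

def observe(failure_times, censoring_times):
    """Return (Y, delta): follow-up times and event indicators under right-censoring."""
    y = [min(t, c) for t, c in zip(failure_times, censoring_times)]
    # delta = 1 when the failure time is observed (ties treated as events: an assumption)
    delta = [1 if t <= c else 0 for t, c in zip(failure_times, censoring_times)]
    return y, delta

rng = random.Random(0)
T = [rng.expovariate(1 / 500) for _ in range(5)]  # hypothetical failure times (days)
C = [rng.uniform(200, 1000) for _ in range(5)]    # hypothetical censoring times (days)
Y, delta = observe(T, C)
```

In real data only `Y` and `delta` are available; `T` and `C` appear here only because the example simulates them.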

Accelerated Failure Time (AFT) Models and Individualized Treatment Effects

We assume patients are randomly assigned to one of two treatments, $A = 1$ or $A = 0$. To explore HTE in this trial, we consider the AFT model for the log-failure time $\log T_i$:

$$\log T_i = \underbrace{m(A_i, x_i)}_{\text{Regression Function}} + \underbrace{W_i}_{\text{Error Term}}$$

The error term is assumed to satisfy the mean-zero constraint $E(W_i) = 0$.

Accelerated Failure Time (AFT) Models

$$\log T_i = \underbrace{m(A_i, x_i)}_{\text{Regression Function}} + \underbrace{W_i}_{\text{Error Term}}$$

- In contrast to Cox PH models, AFT models provide a direct, generative model linking survival times and patient covariates.
- AFT models have a natural interpretation as a regression with log-time as the response.
- They provide interpretable measures of treatment effect, i.e., differences in expected log-survival time or ratios of expected survival times.

Accelerated Failure Time (AFT) Models and Individualized Treatment Effects

The Individualized Treatment Effect (ITE) $\theta(x_i)$ for the $i$-th patient is the difference between expected log-failure times:

$$\theta(x_i) = E[\log T_i \mid A_i = 1, x_i] - E[\log T_i \mid A_i = 0, x_i] = m(1, x_i) - m(0, x_i).$$

The ITE $\theta(x_i)$ represents the expected treatment effect for a patient with covariate vector $x_i$.

The ratio of expected failure times offers a more interpretable measure of treatment effect:

$$\xi(x_i) = \frac{E[T_i \mid A_i = 1, x_i]}{E[T_i \mid A_i = 0, x_i]} = \exp\{\theta(x_i)\}.$$
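The identity $\xi(x_i) = \exp\{\theta(x_i)\}$ can be checked in two lines. The derivation below is a sketch that assumes the error $W_i$ has the same distribution for every value of $(A_i, x_i)$ and that $E[e^{W_i}]$ is finite; these conditions are not stated explicitly on the slide.

```latex
\begin{align*}
E[T_i \mid A_i = a, x_i]
  &= E\big[\exp\{m(a, x_i) + W_i\}\big]
   = \exp\{m(a, x_i)\}\, E\big[e^{W_i}\big], \\
\xi(x_i)
  &= \frac{\exp\{m(1, x_i)\}\, E\big[e^{W_i}\big]}{\exp\{m(0, x_i)\}\, E\big[e^{W_i}\big]}
   = \exp\{m(1, x_i) - m(0, x_i)\}
   = \exp\{\theta(x_i)\}.
\end{align*}
```

The common factor $E[e^{W_i}]$ cancels, so the ratio depends on the covariates only through the regression function.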

Modeling the Regression Function

Our flexible approach to modeling the regression function $m(A_i, x_i)$ builds on Bayesian additive regression trees (BART).

Advantages of BART for ITE estimation:
- Good at handling interactions and non-linearities.
- Very effective as an off-the-shelf method: it often works well with default hyperparameters.
- Shown to be effective in causal inference settings.
- Seamlessly incorporates both discrete and continuous predictors.
- Automatically provides measures of uncertainty despite the complex nature of the model.

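A Fully Nonparametric AFT model

$$\log T_i = \underbrace{m(A_i, x_i)}_{\text{Regression Function}} + \underbrace{W_i}_{\text{Error Term}}$$

Choosing a parametric form for the distribution of $W_i$ may be too restrictive on the form of the baseline hazard function. Instead, assume the density $f_W$ of $W_i$ takes the form of a location mixture of normals, where $\phi$ is the standard normal density:

$$f_W(w \mid G, \sigma) = \frac{1}{\sigma} \int \phi\!\left(\frac{w - \tau}{\sigma}\right) dG(\tau)$$

We place a centered Dirichlet process (CDP) prior on $G$:

$$G \sim \text{CDP}(G_0, M), \qquad M \sim \text{Gamma}(\psi_1, \psi_2), \qquad G_0 = \text{Normal}(0, \sigma_\tau^2)$$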
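To make the mixture form concrete, the sketch below evaluates a location mixture of normals with the usual $1/\sigma$ normalization. A finite set of hypothetical weights and locations stands in for a single draw of $G$, and the `center` helper mimics the mean-zero constraint by shifting the locations. This is an illustration only and not the sampler used by the authors.

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def center(locations, weights):
    """Shift locations so the mixture has mean zero (finite stand-in for a centered G)."""
    mean = sum(p * t for p, t in zip(weights, locations))
    return [t - mean for t in locations]

def mixture_density(w, locations, weights, sigma):
    """f_W(w) = (1/sigma) * sum_k pi_k * phi((w - tau_k) / sigma)."""
    return sum(p * normal_pdf((w - t) / sigma)
               for p, t in zip(weights, locations)) / sigma

weights = [0.5, 0.3, 0.2]                       # hypothetical mixture weights
locations = center([-1.0, 0.5, 2.0], weights)   # hypothetical locations tau_k, centered
sigma = 0.4                                     # hypothetical kernel scale
density_at_zero = mixture_density(0.0, locations, weights, sigma)
```

Because the locations are centered, the resulting error distribution has mean zero, which matches the constraint $E(W_i) = 0$.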

A Fully Nonparametric AFT model

While very flexible, our AFT model does entail certain assumptions about patient survival.

Survival function:

$$S(t \mid A_i, x_i) = 1 - F_W\big(\log t - m(A_i, x_i)\big)$$

Because $F_W$ is common to all patients, survival curves across individuals cannot cross.
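The non-crossing property follows because $F_W$ is non-decreasing: a patient with a larger value of $m(A_i, x_i)$ has survival at least as high at every time $t$. The sketch below checks this numerically, taking $W$ to be normal purely for illustration (the model itself leaves $F_W$ nonparametric) and using two hypothetical values of $m$.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def survival(t, m, sigma=1.0):
    """S(t | A, x) = 1 - F_W(log t - m), with W ~ Normal(0, sigma^2) for illustration only."""
    return 1.0 - normal_cdf((math.log(t) - m) / sigma)

times = [10, 50, 100, 500, 1000, 1500]              # days
curve_low = [survival(t, m=5.5) for t in times]     # hypothetical patient with m = 5.5
curve_high = [survival(t, m=6.0) for t in times]    # hypothetical patient with m = 6.0

# The curve with the larger m lies above the other at every time point
never_cross = all(h > l for h, l in zip(curve_high, curve_low))
```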

Using the Nonparametric AFT model to assess HTE

The posterior distribution of all unknowns in the AFT model can be used to assess a variety of questions. For example:
- Point estimates of covariate-specific treatment effects.
- The distribution of treatment effects.
- The proportion of patients expected to benefit from treatment.
- Qualitative interactions.

The posterior can potentially be utilized in an individualized decision analysis.

Application: The SOLVD trials

- Two large placebo-controlled trials studying the efficacy of the drug enalapril in chronic heart failure patients.
- 2,569 patients were enrolled in the treatment trial and 4,228 in the prevention trial.
- We utilized 18 patient covariates common to both trials (e.g., age, gender, ejection fraction).
- Primary outcome: time to death or first hospitalization.

The SOLVD trials: basic summary statistics

- Enalapril was found to be effective in both trials.
- In the treatment trial, there were 510 primary events in the enalapril treatment arm and 452 primary events in the control arm.
- Hazard ratio in the treatment trial: 0.73, [0.64, 0.82].

[Figure: Individualized treatment effect estimates for all patients in the SOLVD trial. x-axis: difference in expected log failure time; y-axis: patient index.]

[Figure: Distribution of treatment effects in the SOLVD trial, shown separately for the Treatment Trial and the Prevention Trial. x-axis: treatment effect (ratio in expected survival); y-axis: density.]

Proportion Benefiting

The proportion of patients benefiting is the proportion of individuals with a positive treatment effect (i.e., $\theta(x_i) > 0$):

$$Q = \frac{1}{n} \sum_{i=1}^{n} 1\{\theta(x_i) > 0\} = \frac{1}{n} \sum_{i=1}^{n} 1\{\xi(x_i) > 1\}$$

Alternatively, one could define the proportion benefiting relative to a clinically relevant threshold $\varepsilon > 0$, i.e.,

$$Q_\varepsilon = \frac{1}{n} \sum_{i=1}^{n} 1\{\theta(x_i) > \varepsilon\}.$$

An estimate of $Q$ is obtained by taking the area under the curve to the right of 1 in the graph of the treatment effect distribution (shown on the previous slide).

The estimated percentage of patients benefiting was 96% in the treatment trial and 89% in the prevention trial.
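Given posterior draws of $\theta(x_i)$, one natural way to obtain a posterior for $Q$ is to evaluate the formula once per draw. The sketch below does this with a tiny hypothetical set of draws; the array layout (`theta_draws[s][i]` for draw `s` and patient `i`) is an assumption for the example and not the output format of any particular package.

```python
def proportion_benefiting(theta_draws, eps=0.0):
    """Return one draw of Q (or Q_eps) per posterior draw of the ITEs.

    theta_draws[s][i] is posterior draw s of theta(x_i).
    """
    return [sum(1 for th in draw if th > eps) / len(draw) for draw in theta_draws]

# Hypothetical posterior draws: 3 draws for 4 patients
theta_draws = [
    [0.40, 0.10, -0.05, 0.30],
    [0.35, -0.02, 0.01, 0.25],
    [0.50, 0.20, 0.10, 0.15],
]
q_draws = proportion_benefiting(theta_draws)              # posterior draws of Q
q_hat = sum(q_draws) / len(q_draws)                       # posterior mean of Q
q_eps_draws = proportion_benefiting(theta_draws, eps=0.2) # threshold version Q_eps
```

A histogram of `q_draws` from a real fit would play the role of the posterior density of the proportion benefiting.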

[Figure: Posterior of the proportion benefiting in the SOLVD trials, shown separately for the Treatment Trial and the Prevention Trial. x-axis: proportion benefiting; y-axis: density.]

Local Evidence of Benefit

Posterior probabilities of treatment benefit, $P\{\xi(x_i) > 1 \mid y, \delta\}$ (percentage of patients in each range):

    Range of P{ξ(x_i) > 1 | y, δ}    Treatment Trial    Prev. Trial
    (0.99, 1]                             51.38             20.47
    (0.95, 0.99]                          24.69             23.71
    (0.75, 0.95]                          20.08             41.98
    [0, 0.75]                              3.85             13.84
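A table of this kind can be produced by estimating each patient's posterior probability of benefit from the posterior draws and then counting patients per band. The sketch below uses the same four bands; the function names and the example probabilities are hypothetical.

```python
def benefit_probability(theta_draws_i):
    """Monte Carlo estimate of P{xi(x_i) > 1 | y, delta} = P{theta(x_i) > 0 | y, delta}."""
    return sum(1 for th in theta_draws_i if th > 0) / len(theta_draws_i)

def evidence_table(probs):
    """Percentage of patients whose benefit probability falls in each band."""
    bands = {"(0.99, 1]": 0, "(0.95, 0.99]": 0, "(0.75, 0.95]": 0, "[0, 0.75]": 0}
    for p in probs:
        if p > 0.99:
            bands["(0.99, 1]"] += 1
        elif p > 0.95:
            bands["(0.95, 0.99]"] += 1
        elif p > 0.75:
            bands["(0.75, 0.95]"] += 1
        else:
            bands["[0, 0.75]"] += 1
    return {band: 100.0 * count / len(probs) for band, count in bands.items()}

probs = [0.999, 0.97, 0.80, 0.60, 0.995]   # hypothetical per-patient probabilities
table = evidence_table(probs)
```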

Evidence of Differential Treatment Effect

For patient $i$, the posterior probability of a greater-than-average treatment effect may be defined as

$$D_i = P\{\theta(x_i) \geq \bar{\theta} \mid \text{data}\}, \qquad \bar{\theta} = \frac{1}{n} \sum_{i=1}^{n} \theta(x_i),$$

and the posterior probability of a differential treatment effect is

$$D_i^* = \max\{1 - 2D_i,\; 2D_i - 1\}.$$

Note that $D_i^*$ will be close to 1 whenever $D_i$ is either close to 1 or close to 0.

                                         Trt. Trial    Prev. Trial
    Strong Evidence:   D_i^* > 0.95        19.4%          7.3%
    Moderate Evidence: D_i^* > 0.80        41.9%         31.6%
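These quantities are straightforward to approximate from posterior draws. The sketch below assumes that the average effect is recomputed within each posterior draw, which is one reasonable reading of the definition; the function name and the example draws are hypothetical.

```python
def differential_effect(theta_draws):
    """Return (D, D_star) per patient; theta_draws[s][i] is draw s of theta(x_i)."""
    n_draws = len(theta_draws)
    n_patients = len(theta_draws[0])
    above = [0] * n_patients
    for draw in theta_draws:
        avg = sum(draw) / n_patients           # average effect within this draw
        for i, th in enumerate(draw):
            if th >= avg:
                above[i] += 1
    d = [a / n_draws for a in above]                   # D_i
    d_star = [max(1 - 2 * p, 2 * p - 1) for p in d]    # D_i^*
    return d, d_star

# Hypothetical posterior draws: 4 draws for 3 patients
theta_draws = [
    [1.0, 0.0, 0.6],
    [2.0, 0.0, 0.1],
    [1.0, 0.2, 0.9],
    [0.5, 0.1, 0.0],
]
D, D_star = differential_effect(theta_draws)
```

In this toy example the first patient is always above average and the second always below, so both have $D_i^* = 1$, while the third is above average half the time and has $D_i^* = 0$.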

[Figure: Individual-specific posterior survival curves in SOLVD, for enalapril and placebo, overlaid with the enalapril and placebo Kaplan-Meier (KM) estimates. x-axis: time; y-axis: survival probability.]

Examining Qualitative Interactions

- Beyond quantitative heterogeneity, examination of qualitative interactions is often of key interest.
- Qualitative interaction: occurs when the treatment effect in one subgroup has a different sign than in another subgroup.
- The presence of qualitative interactions can be examined by looking at the posterior histogram.
- For pre-specified subgroups of interest, such as male vs. female, we can look at the posteriors of the subgroup-level treatment effects:

$$\theta_{male} = \frac{1}{N_{male}} \sum_{i \in male} \theta(x_i), \qquad \theta_{female} = \frac{1}{N_{female}} \sum_{i \in female} \theta(x_i)$$

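SOLVD Treatment: posterior of $\theta_{male}$ and $\theta_{female}$

$$P\{\operatorname{sign}(\theta_{male}) \neq \operatorname{sign}(\theta_{female}) \mid \text{data}\} = 0.13$$

[Figure: posterior densities of the subgroup-level treatment effects for males and females. x-axis: difference in log survival time (days); y-axis: density.]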
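The subgroup-level posteriors and the probability of a sign disagreement can both be approximated from posterior draws of the individual effects. The sketch below is a minimal version with hypothetical draws and a hypothetical `is_male` indicator; it is not the code used to produce the SOLVD result.

```python
def subgroup_effects(theta_draws, is_male):
    """Per-draw subgroup averages of theta(x_i) for males and for females."""
    male_idx = [i for i, m in enumerate(is_male) if m]
    female_idx = [i for i, m in enumerate(is_male) if not m]
    males = [sum(d[i] for i in male_idx) / len(male_idx) for d in theta_draws]
    females = [sum(d[i] for i in female_idx) / len(female_idx) for d in theta_draws]
    return males, females

def prob_opposite_sign(a_draws, b_draws):
    """Fraction of posterior draws in which the two subgroup effects differ in sign."""
    return sum(1 for a, b in zip(a_draws, b_draws) if a * b < 0) / len(a_draws)

# Hypothetical posterior draws: 4 draws for 4 patients (2 male, 2 female)
theta_draws = [
    [0.4, 0.2, -0.1, -0.3],
    [0.5, 0.1, 0.2, 0.0],
    [0.3, 0.3, 0.1, 0.2],
    [0.2, 0.0, -0.2, 0.1],
]
is_male = [True, True, False, False]
theta_male, theta_female = subgroup_effects(theta_draws, is_male)
p_qualitative = prob_opposite_sign(theta_male, theta_female)
```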

Variable Importance: Partial Dependence Plots

[Figure: two partial dependence plots of the difference in log survival, one against age and one against ejection fraction.]

Variable Importance for Treatment-Covariate Interactions

- Which covariates are important in driving differences in treatment effect? (prognostic vs. predictive)
- If there are no treatment-covariate interactions, $m(1, x) - m(0, x)$ should not depend on the value of $x$.
- The treatment effect $\theta(x) = m(1, x) - m(0, x)$ should only depend on predictive covariates.

One approach: run some form of regression with $\hat{m}(1, x_i) - \hat{m}(0, x_i)$ as the responses. For example:
- Run CART to find patient subgroups.
- Perform a variable selection procedure to find a parsimonious model.

Variable Importance

Regression with $\hat{m}(1, x_i) - \hat{m}(0, x_i)$ as the responses. Variables are sorted by the absolute value of the associated t-statistic.

    Variable                      Estimate    t value
    ejection fraction             -0.0211     -96.67
    gender                         0.0983      25.62
    diabetic                       0.0667      19.62
    chronic pulmonary disease     -0.0401      -9.086
    creatinine                     0.0339       8.54
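The idea of regressing the estimated treatment effects on covariates can be sketched with a single covariate. The example below fits a least-squares line of hypothetical ITE estimates on hypothetical ejection fraction values and reports the slope and its t-statistic; the analysis on the slide uses several covariates at once, so this univariate version is a simplification.

```python
import math

def slope_t_statistic(x, y):
    """Least-squares slope of y on x (with intercept) and its t-statistic."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    return slope, slope / se

ejection_fraction = [15, 20, 25, 30, 35]       # hypothetical covariate values
theta_hat = [0.85, 0.76, 0.64, 0.55, 0.45]     # hypothetical ITE estimates
slope, t_value = slope_t_statistic(ejection_fraction, theta_hat)
```

A negative slope here would indicate that the estimated benefit shrinks as ejection fraction increases, which is the direction reported for ejection fraction in the table.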

The AFTrees package

The methods discussed here are implemented in the AFTrees package.

    ## An example:
    library(AFTrees)
    solvd.fit <- AFTrees(X, y, status, ndpost = 2000)
    ## X      - design matrix
    ## y      - follow-up time
    ## status - event indicator (1 if event, 0 otherwise)
    ## ndpost - number of posterior draws

The AFTrees package is available for download at https://github.com/nchenderson/aftrees

Thanks

Acknowledgements: This work was supported through a Patient-Centered Outcomes Research Institute (PCORI) Award (ME-1303-5896).