
Estimation of Optimal Treatment Regimes Via Machine Learning

Marie Davidian
Department of Statistics, North Carolina State University
Triangle Machine Learning Day, April 3, 2018

Precision medicine

The right treatment for the right patient at the right time

Precision medicine

Patient heterogeneity:
- Genetic/genomic profiles
- Demographic, physiological characteristics
- Medical history, concomitant conditions
- Environment, lifestyle factors
- Adverse reactions, adherence to prior treatment...

Fundamental premise: A patient's characteristics are implicated in which treatment options s/he should receive

Clinical decision-making

Clinical practice: Clinicians make a series of treatment decisions over the course of a patient's disease or disorder
- Key decision points in the disease/disorder process
- Fixed schedule, milestones, events necessitating a decision
- Multiple treatment options at each decision point
- Synthesize all information on the patient to decide on an option

Goal: Make the best decisions leading to the most beneficial expected clinical outcome for this patient given his/her characteristics

Example: Acute leukemia

Two decision points:
- Decision 1: Induction chemotherapy (2 options: $C_1$, $C_2$)
- Decision 2: Maintenance treatment for patients who respond (2 options: $M_1$, $M_2$); salvage chemotherapy for those who don't respond (2 options: $S_1$, $S_2$)

Clinical outcome: Progression-free or overall survival time

Treatment regime

Precision medicine: Formalize clinical decision-making and make it evidence-based

At each decision point, we would like a formal rule that takes as input all available information on the patient to that point and outputs a recommended treatment action from among the possible, feasible options

Treatment regime: A set of decision rules, each corresponding to a decision point
- Also known as: dynamic treatment regime, adaptive treatment strategy, adaptive intervention, treatment policy

Two-decision regime: Acute leukemia

At baseline: Information $x_1$; accrued information $h_1 = x_1 \in \mathcal{H}_1$

Decision 1: Set of options $\mathcal{A}_1 = \{C_1, C_2\}$; rule 1: $d_1(h_1): \mathcal{H}_1 \to \mathcal{A}_1$

Between Decisions 1 and 2: Collect additional information $x_2$, including responder status; accrued information $h_2 = (x_1, \text{chemotherapy at decision 1}, x_2) \in \mathcal{H}_2$

Decision 2: Set of options $\mathcal{A}_2 = \{M_1, M_2, S_1, S_2\}$; rule 2: $d_2(h_2): \mathcal{H}_2 \to \{M_1, M_2\}$ (responder), $d_2(h_2): \mathcal{H}_2 \to \{S_1, S_2\}$ (nonresponder)

Treatment regime: $d = \{d_1(h_1), d_2(h_2)\} = (d_1, d_2)$

In general

Treatment regime with $K$ decision points:
- Baseline information $x_1 \in \mathcal{X}_1$; intermediate information $x_k \in \mathcal{X}_k$ between Decisions $k-1$ and $k$, $k = 2, \ldots, K$
- Set of treatment options $\mathcal{A}_k$ at Decision $k$, elements $a_k \in \mathcal{A}_k$
- Accrued information or history: $h_1 = x_1 \in \mathcal{H}_1$; $h_k = (x_1, a_1, \ldots, x_{k-1}, a_{k-1}, x_k) \in \mathcal{H}_k$, $k = 2, \ldots, K$
- Decision rules $d_1(h_1), d_2(h_2), \ldots, d_K(h_K)$, with $d_k: \mathcal{H}_k \to \mathcal{A}_k$
- Treatment regime $d = \{d_1(h_1), \ldots, d_K(h_K)\} = (d_1, d_2, \ldots, d_K)$

Class of all possible $K$-decision regimes: $\mathcal{D}$

Optimal treatment regime

Goal: Find the best or optimal regime in $\mathcal{D}$,
$$d^{opt} = (d_1^{opt}, \ldots, d_K^{opt})$$

Assume: There is a clinical outcome by which treatment benefit can be assessed
- Survival time, CD4 count, ...
- Coded so that larger is better

Causal inference perspective...

Optimal treatment regime

Potential outcomes: For any regime $d \in \mathcal{D}$, let $Y^*(d)$ be the outcome a patient would achieve if s/he were to receive treatment according to the rules in $d$.

Value of $d$:
$$V(d) = E\{Y^*(d)\},$$
the population average outcome if all patients in the population were to receive treatment options according to $d$

Optimal regime:
$$d^{opt} = \arg\max_{d \in \mathcal{D}} V(d),$$
i.e., $E\{Y^*(d)\} \leq E\{Y^*(d^{opt})\}$ for all $d \in \mathcal{D}$

Optimal treatment regime

Challenge: Can we estimate $d^{opt} = \arg\max_{d \in \mathcal{D}} V(d)$ from data?
- From a randomized clinical trial or observational database
- $d^{opt}$ is defined in terms of potential outcomes
- Must be able to express the definition of $d^{opt}$ equivalently in terms of the observed data

Statistical framework

Simplest setting: A single decision with two treatment options, $\mathcal{A}_1 = \{0, 1\}$

Treatment regime: $d \in \mathcal{D}$ comprises a single rule $d_1$, $d = \{d_1(h_1)\}$

Data: Independent and identically distributed (iid) $(X_{1i}, A_{1i}, Y_i)$, $i = 1, \ldots, n$, on $n$ subjects indexed by $i$:
- $X_{1i}$ = baseline information observed on subject $i$
- $A_{1i}$ = treatment option in $\mathcal{A}_1$ actually received by subject $i$
- $Y_i$ = observed outcome for subject $i$

History for subject $i$: $H_{1i} = X_{1i}$
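
For concreteness, the sketch below simulates one toy instance of this data structure, assuming a randomized trial with $\mathrm{pr}(A_1 = 1) = 0.5$; the outcome model and all variable names are illustrative assumptions, not from the talk. The later sketches reuse this same setup.

```python
# A minimal simulated dataset (X1, A1, Y), assuming a randomized trial with
# pr(A1 = 1) = 0.5; the outcome model below is purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X1 = rng.normal(size=n)            # baseline information; history H1 = X1
A1 = rng.binomial(1, 0.5, size=n)  # treatment actually received, in {0, 1}
# True contrast C1(h1) = 0.8 - 1.2*h1, so treating is beneficial when X1 <= 2/3
Y = 1.0 + 0.5 * X1 + A1 * (0.8 - 1.2 * X1) + rng.normal(size=n)
```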

Assumptions

Consistency: $Y = Y^*(1)I(A_1 = 1) + Y^*(0)I(A_1 = 0)$

Positivity: $\mathrm{pr}(A_1 = a_1 \mid H_1 = h_1) > 0$ for all $h_1 \in \mathcal{H}_1$, $a_1 = 0, 1$

No unmeasured confounders: $\{Y^*(1), Y^*(0)\} \perp A_1 \mid H_1$
- The history $H_1$ contains all information used to assign treatments in the observed data
- Automatically satisfied for data from a randomized trial
- Standard but unverifiable assumption for observational studies

Value of a regime

Under these assumptions, it can be shown that
$$
V(d) = E\{Y^*(d)\}
= E\big[E\{Y^*(1) \mid H_1\} I\{d_1(H_1) = 1\} + E\{Y^*(0) \mid H_1\} I\{d_1(H_1) = 0\}\big]
$$
$$
= E\big[E(Y \mid H_1, A_1 = 1) I\{d_1(H_1) = 1\} + E(Y \mid H_1, A_1 = 0) I\{d_1(H_1) = 0\}\big]
$$
$$
= E\big[Q_1(H_1, 1) I\{d_1(H_1) = 1\} + Q_1(H_1, 0) I\{d_1(H_1) = 0\}\big],
$$
where $Q_1(h_1, a_1) = E(Y \mid H_1 = h_1, A_1 = a_1)$.

Implies: Optimal regime
$$
d_1^{opt}(h_1) = I\{Q_1(h_1, 1) \geq Q_1(h_1, 0)\} = I\{Q_1(h_1, 1) - Q_1(h_1, 0) \geq 0\} = I\{C_1(h_1) \geq 0\},
$$
where $C_1(h_1) = Q_1(h_1, 1) - Q_1(h_1, 0)$ is the contrast function.

Regression estimator for optimal regime

Regression model: $Q_1(h_1, a_1; \beta_1)$; fitted model $Q_1(h_1, a_1; \hat{\beta}_1)$

Estimated optimal regime:
$$
\hat{d}_1^{opt}(h_1) = I\{Q_1(h_1, 1; \hat{\beta}_1) - Q_1(h_1, 0; \hat{\beta}_1) \geq 0\}
$$

This is the simplest form of Q-learning.

Concern: Misspecification of the regression model
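
A minimal Python sketch of this simplest form of Q-learning, under the toy simulated trial above; the posited linear Q-model with a treatment-covariate interaction is an illustrative assumption, and misspecifying it is exactly the concern noted on the slide.

```python
# Sketch of single-decision Q-learning: posit Q1(h1, a1; beta1), fit it,
# and treat when the fitted contrast is nonnegative. Data and the linear
# model are illustrative assumptions, not from the talk.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
X1 = rng.normal(size=n)
A1 = rng.binomial(1, 0.5, size=n)
Y = 1.0 + 0.5 * X1 + A1 * (0.8 - 1.2 * X1) + rng.normal(size=n)

# Posited model: Q1(h1, a1; beta1) = b0 + b1*h1 + a1*(b2 + b3*h1)
design = np.column_stack([X1, A1, A1 * X1])
q_fit = LinearRegression().fit(design, Y)

def q_hat(h1, a1):
    """Fitted Q1(h1, a1; beta1-hat) for arrays h1 and a1."""
    return q_fit.predict(np.column_stack([h1, a1, a1 * h1]))

# Estimated optimal rule: d1(h1) = I{Q1(h1, 1) - Q1(h1, 0) >= 0}
contrast_hat = q_hat(X1, np.ones(n)) - q_hat(X1, np.zeros(n))
d1_opt_hat = (contrast_hat >= 0).astype(int)
```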

Direct/policy search estimator for optimal regime

Restricted class of regimes $\mathcal{D}_\eta$: Indexed by $\eta_1$, $d_\eta = \{d_1(h_1; \eta_1)\}$, $\eta = \eta_1$
- Motivated by a regression model; e.g., for $h_1 = (x_{11}, x_{12})$, $d_1(h_1; \eta_1) = I(\eta_{11} + \eta_{12} x_{11} + \eta_{13} x_{12} \geq 0)$, $\eta_1 = (\eta_{11}, \eta_{12}, \eta_{13})^T$
- Based on cost, feasibility in practice, interpretability; e.g., $d_1(h_1; \eta_1) = I(x_{11} < \eta_{11}, x_{12} < \eta_{12})$, $\eta_1 = (\eta_{11}, \eta_{12})^T$
- Or $d_1(h_1; \eta_1)$ in the form of a list (if-then-else clauses)

Optimal restricted regime: $d_\eta^{opt} = \{d_1(h_1; \eta_1^{opt})\}$, where
$$
\eta_1^{opt} = \arg\max_{\eta_1} V(d_\eta)
$$

Direct/policy search estimator for optimal regime

Optimal restricted regime: $d_\eta^{opt} = \{d_1(h_1; \eta_1^{opt})\}$, where $\eta_1^{opt} = \arg\max_{\eta_1} V(d_\eta)$

Suggests:
- Obtain an estimator $\hat{V}(d_\eta)$ of $V(d_\eta)$ for any fixed $\eta_1$
- Treat $\hat{V}(d_\eta)$ as a function of $\eta_1$ and maximize in $\eta_1$

That is, estimate $\eta_1^{opt}$ by
$$
\hat{\eta}_1^{opt} = \arg\max_{\eta_1} \hat{V}(d_\eta) \;\implies\; \hat{d}_\eta^{opt} = \{d_1(h_1; \hat{\eta}_1^{opt})\}
$$

Inverse probability weighted value estimators

Define: Consistency indicator $\mathcal{C}_{d_\eta} = I\{A_1 = d_1(H_1; \eta_1)\}$

Propensity of treatment consistent with $d_\eta$:
$$
\pi_{d_\eta,1}(H_1; \eta_1) = \mathrm{pr}(\mathcal{C}_{d_\eta} = 1 \mid H_1)
= \pi_1(H_1) I\{d_1(H_1; \eta_1) = 1\} + \{1 - \pi_1(H_1)\} I\{d_1(H_1; \eta_1) = 0\},
$$
where $\pi_1(h_1) = \mathrm{pr}(A_1 = 1 \mid H_1 = h_1)$ is the propensity score.

$\pi_1(h_1)$ is known in a randomized trial; in an observational study, one can posit a model $\pi_1(h_1; \gamma_1)$ and obtain $\pi_{d_\eta,1}(h_1; \eta_1, \hat{\gamma}_1)$

Semiparametric theory for missing data yields...

Inverse probability weighted value estimators

Inverse probability weighted estimator for $V(d_\eta)$: For fixed $\eta_1$,
$$
\hat{V}_{IPW}(d_\eta) = n^{-1} \sum_{i=1}^n \frac{\mathcal{C}_{d_\eta,i} Y_i}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat{\gamma}_1)}
$$

Doubly robust augmented inverse probability weighted estimator (more efficient and stable):
$$
\hat{V}_{AIPW}(d_\eta) = n^{-1} \sum_{i=1}^n \left[ \frac{\mathcal{C}_{d_\eta,i} Y_i}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat{\gamma}_1)} - \frac{\mathcal{C}_{d_\eta,i} - \pi_{d_\eta,1}(H_{1i}; \eta_1, \hat{\gamma}_1)}{\pi_{d_\eta,1}(H_{1i}; \eta_1, \hat{\gamma}_1)} \, Q_{d_\eta,1}(H_{1i}; \eta_1, \hat{\beta}_1) \right],
$$
where
$$
Q_{d_\eta,1}(h_1; \eta_1, \beta_1) = Q_1(h_1, 1; \beta_1) I\{d_1(h_1; \eta_1) = 1\} + Q_1(h_1, 0; \beta_1) I\{d_1(h_1; \eta_1) = 0\}
$$
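
A minimal sketch of both value estimators for one fixed $\eta_1$, again assuming the toy randomized trial above (so $\pi_1$ is known) and an illustrative linear Q-model for the augmentation term; the particular $\eta_1$ is arbitrary.

```python
# Sketch of the IPW and AIPW value estimators for one fixed linear rule
# d1(h1; eta1) = I(eta11 + eta12*h1 >= 0). The simulated trial data, the
# Q-model, and the choice of eta1 are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
X1 = rng.normal(size=n)
A1 = rng.binomial(1, 0.5, size=n)
Y = 1.0 + 0.5 * X1 + A1 * (0.8 - 1.2 * X1) + rng.normal(size=n)

pi1 = np.full(n, 0.5)                            # propensity, known in a trial
eta1 = np.array([0.8, -1.2])                     # fixed eta1 = (eta11, eta12)
d1 = (eta1[0] + eta1[1] * X1 >= 0).astype(int)   # d1(H1; eta1)

C_d = (A1 == d1).astype(float)                   # consistency indicator
pi_d = pi1 * d1 + (1 - pi1) * (1 - d1)           # pr(C_d = 1 | H1)

V_ipw = np.mean(C_d * Y / pi_d)

# AIPW augmentation: fitted Q-model evaluated at the recommended option
q_fit = LinearRegression().fit(np.column_stack([X1, A1, A1 * X1]), Y)
q_d = q_fit.predict(np.column_stack([X1, d1, d1 * X1]))  # Q_{d_eta,1}(H1; eta1)
V_aipw = np.mean(C_d * Y / pi_d - (C_d - pi_d) / pi_d * q_d)
```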

Direct/policy search estimators for optimal regime

Result: Estimators for $\eta_1^{opt}$ obtained by maximizing $\hat{V}_{IPW}(d_\eta)$ or $\hat{V}_{AIPW}(d_\eta)$ in $\eta_1$, yielding estimators for the optimal restricted regime $d_\eta^{opt} = \{d_1(h_1; \eta_1^{opt})\}$:
$$
\hat{d}_{\eta,IPW}^{opt} = \{d_1(h_1; \hat{\eta}_{1,IPW}^{opt})\} \quad \text{and} \quad \hat{d}_{\eta,AIPW}^{opt} = \{d_1(h_1; \hat{\eta}_{1,AIPW}^{opt})\}
$$

Challenge: These estimated values are nonsmooth functions of $\eta_1$, so this is a nonstandard optimization problem
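
Because the estimated value is a step function of $\eta_1$, gradient-based optimizers do not apply directly; derivative-free methods are one option. The sketch below uses crude random search over a linear rule class, purely as an illustration of how this nonsmooth maximization might be carried out, under the same toy-trial assumptions as before.

```python
# Illustrative derivative-free maximization of V_IPW(d_eta) over a linear
# rule class via random search; the data and search strategy are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
X1 = rng.normal(size=n)
A1 = rng.binomial(1, 0.5, size=n)
Y = 1.0 + 0.5 * X1 + A1 * (0.8 - 1.2 * X1) + rng.normal(size=n)
pi1 = np.full(n, 0.5)

def v_ipw(eta1):
    """IPW value estimate for the rule d1(h1; eta1) = I(eta11 + eta12*h1 >= 0)."""
    d1 = (eta1[0] + eta1[1] * X1 >= 0).astype(int)
    c_d = (A1 == d1).astype(float)
    pi_d = pi1 * d1 + (1 - pi1) * (1 - d1)
    return np.mean(c_d * Y / pi_d)

# The rule is invariant to positive rescaling of eta1, so it suffices to
# search the unit circle; keep the candidate with the largest estimated value.
angles = rng.uniform(0.0, 2.0 * np.pi, size=5000)
candidates = np.column_stack([np.cos(angles), np.sin(angles)])
values = np.array([v_ipw(eta) for eta in candidates])
eta1_hat = candidates[np.argmax(values)]
```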

Classification analogy

So what is the connection to machine learning? Define
$$
\psi_1(H_1, A_1, Y) = \frac{A_1 Y}{\pi_1(H_1)} - \frac{A_1 - \pi_1(H_1)}{\pi_1(H_1)} \, Q_1(H_1, 1),
$$
$$
\psi_0(H_1, A_1, Y) = \frac{(1 - A_1) Y}{1 - \pi_1(H_1)} + \frac{A_1 - \pi_1(H_1)}{1 - \pi_1(H_1)} \, Q_1(H_1, 0).
$$

Then
$$
E\{\psi_1(H_1, A_1, Y) - \psi_0(H_1, A_1, Y) \mid H_1\} = Q_1(H_1, 1) - Q_1(H_1, 0) = C_1(H_1),
$$
the contrast function.

Predictor of the contrast function:
$$
\hat{C}_1(H_{1i}, A_{1i}, Y_i) = \psi_1(H_{1i}, A_{1i}, Y_i) - \psi_0(H_{1i}, A_{1i}, Y_i),
$$
with fitted models $Q_1(h_1, a_1; \hat{\beta}_1)$ and $\pi_1(h_1; \hat{\gamma}_1)$ substituted.
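
A short sketch of this contrast predictor under the same toy trial; the fitted Q-model substituted into $\psi_1$ and $\psi_0$ is an illustrative assumption, and $\pi_1$ is known by randomization.

```python
# Sketch of the subject-level contrast predictor C1-hat = psi1 - psi0, with
# a fitted Q-model and the known trial propensity substituted; the data and
# Q-model are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
X1 = rng.normal(size=n)
A1 = rng.binomial(1, 0.5, size=n)
Y = 1.0 + 0.5 * X1 + A1 * (0.8 - 1.2 * X1) + rng.normal(size=n)
pi1 = np.full(n, 0.5)

q_fit = LinearRegression().fit(np.column_stack([X1, A1, A1 * X1]), Y)
q1 = q_fit.predict(np.column_stack([X1, np.ones(n), X1]))             # Q1(H1, 1)
q0 = q_fit.predict(np.column_stack([X1, np.zeros(n), np.zeros(n)]))   # Q1(H1, 0)

psi1 = A1 * Y / pi1 - (A1 - pi1) / pi1 * q1
psi0 = (1 - A1) * Y / (1 - pi1) + (A1 - pi1) / (1 - pi1) * q0
C1_hat = psi1 - psi0   # noisy, subject-level predictor of the contrast C1(H1)
```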

Classification analogy

Lots of algebra: Maximizing $\hat{V}_{AIPW}(d_\eta)$ in $\eta_1$ is equivalent to minimizing
$$
n^{-1} \sum_{i=1}^n \big| \hat{C}_1(H_{1i}, A_{1i}, Y_i) \big| \, I\big[ I\{\hat{C}_1(H_{1i}, A_{1i}, Y_i) \geq 0\} \neq d_1(H_{1i}; \eta_1) \big],
$$
a weighted classification error with
- Label: $I\{\hat{C}_1(H_{1i}, A_{1i}, Y_i) \geq 0\}$
- Weight: $|\hat{C}_1(H_{1i}, A_{1i}, Y_i)|$
- Classifier: $d_1(H_{1i}; \eta_1)$

And similarly for $\hat{V}_{IPW}(d_\eta)$, with
$$
\psi_1(H_1, A_1, Y) = \frac{A_1 Y}{\pi_1(H_1)}, \qquad \psi_0(H_1, A_1, Y) = \frac{(1 - A_1) Y}{1 - \pi_1(H_1)}
$$

Classification analogy

Result: Direct/policy search estimation of $d_\eta^{opt}$ by maximizing $\hat{V}_{IPW}(d_\eta)$ or $\hat{V}_{AIPW}(d_\eta)$ is equivalent to minimizing a weighted classification error
- The choice of classification approach dictates the restricted class $\mathcal{D}_\eta$: e.g., linear or nonlinear SVM, CART, random forests, etc.
- Can add a penalty to achieve a parsimonious representation
- Outcome weighted learning (O-learning) uses $\hat{V}_{IPW}(d_\eta)$ with a nonlinear SVM
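
A minimal sketch of the resulting weighted classification step, using CART via scikit-learn's sample weights purely as one example classifier; the labels and weights follow the slide, while the data and Q-model are the same illustrative assumptions as above.

```python
# Weighted classification view: label I{C1-hat >= 0}, weight |C1-hat|, and
# any classifier accepting sample weights plays the role of d1(.; eta1).
# CART (a depth-2 tree) is used only as an example; all data are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 1000
X1 = rng.normal(size=n)
A1 = rng.binomial(1, 0.5, size=n)
Y = 1.0 + 0.5 * X1 + A1 * (0.8 - 1.2 * X1) + rng.normal(size=n)
pi1 = np.full(n, 0.5)

# AIPW contrast predictor as on the previous slide
q_fit = LinearRegression().fit(np.column_stack([X1, A1, A1 * X1]), Y)
q1 = q_fit.predict(np.column_stack([X1, np.ones(n), X1]))
q0 = q_fit.predict(np.column_stack([X1, np.zeros(n), np.zeros(n)]))
C1_hat = (A1 * Y / pi1 - (A1 - pi1) / pi1 * q1) \
         - ((1 - A1) * Y / (1 - pi1) + (A1 - pi1) / (1 - pi1) * q0)

labels = (C1_hat >= 0).astype(int)   # classification labels
weights = np.abs(C1_hat)             # misclassification weights

clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X1.reshape(-1, 1), labels, sample_weight=weights)
d1_hat = clf.predict(X1.reshape(-1, 1))   # estimated regime: treat where 1
```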

K > 1 decisions

Extensions to $d^{opt} = (d_1^{opt}, \ldots, d_K^{opt})$:
- Q-learning
- Direct/policy search estimation within a restricted class $\mathcal{D}_\eta$
- Backward induction implementation with a classification representation at each step

Discussion

Summary:
- Direct/policy search estimation of an optimal treatment regime can be cast as a weighted classification problem
- Existing machine learning techniques can be exploited to estimate an optimal treatment regime

Acknowledgement

IMPACT: Innovative Methods Program for Advancing Clinical Trials
- A joint venture of Duke, UNC-Chapel Hill, and NC State
- Supported by NCI Program Project P01 CA142538 (2010-2020)
- http://impact.unc.edu
- Statistical methods for precision cancer medicine

Upcoming

SAMSI: Statistical and Applied Mathematical Sciences Institute, https://www.samsi.info/
- 2018-2019 Program on Statistical, Mathematical, and Computational Methods for Precision Medicine
- Opening Workshop: August 13-17, 2018

Some references

Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2012). A robust method for estimating optimal treatment regimes. Biometrics 68, 1010-1018.

Zhang, B., Tsiatis, A. A., Davidian, M., Zhang, M., and Laber, E. B. (2012). Estimating optimal treatment regimes from a classification perspective. Stat 1, 103-114.

Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2013). Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions. Biometrika 100, 681-694.

Zhang, Y., Laber, E. B., Tsiatis, A. A., and Davidian, M. (2015). Using decision lists to construct interpretable and parsimonious treatment regimes. Biometrics 71, 895-904.

Zhang, Y., Laber, E. B., Davidian, M., and Tsiatis, A. A. (2018). Estimation of optimal treatment regimes using lists. Journal of the American Statistical Association, in press.

Zhao, Y., Zeng, D., Rush, A. J., and Kosorok, M. R. (2012). Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association 107, 1106-1118.

Zhao, Y. Q., Zeng, D., Laber, E. B., and Kosorok, M. R. (2015). New statistical learning methods for estimating optimal dynamic treatment regimes. Journal of the American Statistical Association 110, 583-598.