Set-valued dynamic treatment regimes for competing outcomes


1 Set-valued dynamic treatment regimes for competing outcomes Eric B. Laber Department of Statistics, North Carolina State University JSM, Montreal, QC, August 5, 2013

2 Acknowledgments Zhiqiang Tan Jamie Robins Dan Lizotte Brad Ferguson NCSU and UNC DTR Groups

3 Dynamic treatment regimes Motivation: treatment of chronic illness Ex. ADHD, HIV/AIDS, cancer, depression, schizophrenia, drug and alcohol addiction, bipolar disorder, etc. Treatments adapt to evolving health status of patient Requires balancing immediate and long-term outcomes Dynamic treatment regimes (DTRs) Operationalize decision process as sequence of decision rules One decision rule for each intervention time Intervention times often outcome-dependent Decision rule maps up-to-date patient history to a recommended treatment Maximize expectation of a single clinical outcome

4 Estimation of DTRs Many options for estimating DTRs from observational or randomized studies Q- and A-learning (Murphy 2003, 2005; Robins 2004) Regret regression (Henderson et al., 2010) Dynamical systems models (Rivera et al., 2007; Navarro-Barrientos et al., 2010, 2011) Policy search Augmented value maximization (Zhang 2012, 2013ab) Outcome weighted learning (Zhao et al., 2012, 2013ab) Marginal structural mean models (Orellana et al., 2010) All aim to optimize the mean of a single scalar outcome by which the goodness of competing DTRs is measured

5 Competing outcomes Specifying a single scalar outcome sometimes oversimplifies the goal of clinical decision making Need to balance several potentially competing outcomes, e.g., symptom relief and side-effect burden Patients may have heterogeneous and evolving preferences across competing outcomes In some populations, patients do not know or cannot communicate their preferences

6 Motivating example CATIE Schizophrenia study (Stroup et al., 2003) 18-month SMART study with two main phases Aim to compare typical and atypical antipsychotics Trade-off between efficacy and side-effect burden Drugs known to be efficacious are also known to have severe negative side-effects (e.g., olanzapine; Brier et al., 2005) Separate arms for efficacy and tolerability at the second stage Sequential treatment of severe schizophrenia Preferences vary widely across patients (Kinter, 2009) Ex. Joe may be willing to tolerate severe weight gain, lethargy, and reduced social functioning to gain symptom relief. Joan cannot function at work if she is lethargic or experiences clouded thinking and is thus willing to take a less effective treatment if it has low side-effect burden. Preference elicitation is difficult in this population Preferences may evolve over time (Strauss et al., 2011)

7 Composite outcomes A natural approach is to combine competing outcomes into a single composite outcome A single composite outcome for all patients (e.g., Wang et al., 2012) Elicited by a panel of experts Requires patient preference homogeneity (e.g., every patient is indifferent about trading one side-effect unit for three units of symptom relief) Requires that preferences do not change over time Alternatively, estimate the optimal DTR across all convex combinations of competing outcomes simultaneously (e.g., Lizotte et al., 2012) True preferences must be expressible as convex combinations Patient preferences cannot change over time

8 Setup: Two-stage binary treatments For simplicity, we consider the two-stage binary treatment setting; this is not essential (see Laber et al., 2013) Observe {(X_1i, A_1i, X_2i, A_2i, E_i, S_i)}, i = 1, ..., n, comprising n iid patient trajectories: X_t ∈ R^{p_t}: subject covariate vector prior to the t-th treatment A_t ∈ A_t = {-1, 1}: treatment received at time t E ∈ R: first outcome, coded so that higher is better S ∈ R: second outcome, coded so that higher is better Define H_1 = X_1 and H_2 = (X_1, A_1, X_2); let H_t denote the domain of H_t Focus on estimands

9 Setup: DTRs vs SVDTRs A DTR is a pair of functions π = (π_1, π_2) where π_t : H_t -> A_t Under DTR π, a patient presenting with H_t = h_t at time t is recommended treatment π_t(h_t) For a scalar summary Y = Y(E, S) of (E, S), the optimal DTR π^opt satisfies E_{π^opt} Y ≥ E_π Y for all π, where E_π denotes expectation under treatment assignment via π An SVDTR is a pair of functions π = (π_1, π_2) where π_t : H_t -> 2^{A_t} Under SVDTR π, a patient presenting with H_t = h_t at time t is offered the treatment choices π_t(h_t) ⊆ A_t Difficult to define optimality without assumptions about patient utility; we will define ideal decision rules View SVDTRs as a communication tool, typically paired with graphical displays (more on this later)

10 Review: dynamic programming for the optimal DTR When the underlying generative distribution is known, the optimal DTR π^opt can be found via dynamic programming:
1. Define Q_2Y(h_2, a_2) = E(Y | H_2 = h_2, A_2 = a_2)
2. Define Ỹ = max_{a_2} Q_2Y(H_2, a_2) = Q_2Y(H_2, π^opt_2(H_2))
3. Define Q_1Y(h_1, a_1) = E(Ỹ | H_1 = h_1, A_1 = a_1)
Then π^opt_t(h_t) = argmax_{a_t ∈ A_t} Q_tY(h_t, a_t) (Bellman, 1957). Q-learning mimics dynamic programming but uses regression models in place of the requisite conditional expectations. Note: the subscript Y denotes the outcome used to define the Q-function.
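The three steps above, with least-squares regression standing in for the conditional expectations, can be sketched as follows. The simulated data, linear working models, and coefficient values are illustrative assumptions, not taken from CATIE or the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical two-stage trial (illustrative only): covariates X1, X2,
# randomized treatments A1, A2 in {-1, 1}, and one scalar outcome Y.
X1 = rng.normal(size=n)
A1 = rng.choice([-1, 1], size=n)
X2 = 0.5 * X1 + rng.normal(size=n)
A2 = rng.choice([-1, 1], size=n)
Y = X2 + 0.7 * A2 * X2 + 0.2 * A1 + rng.normal(size=n)

def lsfit(Z, y):
    # Ordinary least squares in place of the conditional expectations.
    return np.linalg.lstsq(Z, y, rcond=None)[0]

# Step 1: working model for Q_2Y(h2, a2) = E(Y | H2 = h2, A2 = a2).
Z2 = np.column_stack([np.ones(n), X2, A2, A2 * X2])
b2 = lsfit(Z2, Y)
Q2 = lambda x2, a2: b2[0] + b2[1] * x2 + a2 * (b2[2] + b2[3] * x2)

# Step 2: pseudo-outcome Ytilde = max over a2 of the fitted Q-function.
Ytilde = np.maximum(Q2(X2, 1), Q2(X2, -1))

# Step 3: working model for Q_1Y(h1, a1) = E(Ytilde | H1 = h1, A1 = a1).
Z1 = np.column_stack([np.ones(n), X1, A1, A1 * X1])
b1 = lsfit(Z1, Ytilde)

# Estimated optimal rules: the sign of each stage's treatment contrast.
pi2_opt = lambda x2: np.sign(b2[2] + b2[3] * x2)
pi1_opt = lambda x1: np.sign(b1[2] + b1[3] * x1)
```

The backup step is what forces a single fixed second-stage rule, which is exactly the obstacle the set-valued construction must work around later.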

11 SVDTRs Incorporate clinically significant differences Δ_E, Δ_S > 0; a difference in E (in S) of less than Δ_E (Δ_S) is not considered clinically significant (e.g., Friedman et al., 2010). Define contrast functions
r_2E(h_2) = E(E | H_2 = h_2, A_2 = 1) - E(E | H_2 = h_2, A_2 = -1)
r_2S(h_2) = E(S | H_2 = h_2, A_2 = 1) - E(S | H_2 = h_2, A_2 = -1)
Ideal set-valued second-stage decision rule:
π^Ideal_2(h_2) =
  {sgn(r_2E(h_2))}, if |r_2E(h_2)| ≥ Δ_E and sgn(r_2E(h_2)) r_2S(h_2) > -Δ_S,
  {sgn(r_2S(h_2))}, if |r_2S(h_2)| ≥ Δ_S and sgn(r_2S(h_2)) r_2E(h_2) > -Δ_E,
  {-1, 1}, otherwise.
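As a concrete reading of the piecewise rule above, here is a minimal sketch of π^Ideal_2 as a function of the two contrast values; the convention sgn(0) = 1 is an assumption the slide does not specify:

```python
def ideal_stage2(r2E, r2S, dE, dS):
    """Ideal set-valued second-stage rule from the contrasts r_2E(h2) and
    r_2S(h2), with clinically significant differences dE = Delta_E and
    dS = Delta_S. Returns a subset of the treatment set {-1, 1}."""
    sgn = lambda v: 1 if v >= 0 else -1  # assumed convention: sgn(0) = 1
    if abs(r2E) >= dE and sgn(r2E) * r2S > -dS:
        return {sgn(r2E)}   # E singles out a treatment and S does not object
    if abs(r2S) >= dS and sgn(r2S) * r2E > -dE:
        return {sgn(r2S)}   # S singles out a treatment and E does not object
    return {-1, 1}          # genuine trade-off or no significant difference

win_on_E = ideal_stage2(2.0, 0.0, 1.0, 1.0)    # clear win on E -> {1}
tradeoff = ideal_stage2(2.0, -3.0, 1.0, 1.0)   # gain on E, loss on S -> {-1, 1}
```

The third branch is the defining feature: when the outcomes genuinely conflict, the rule declines to choose and offers both treatments.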

12 SVDTRs: schematic for π^Ideal_2 [Figure: the (r_2E(h_2), r_2S(h_2)) plane partitioned by the thresholds ±Δ_E and ±Δ_S into regions where π^Ideal_2(h_2) equals {1}, {-1}, or {-1, 1}.]

13 SVDTRs: defining the ideal first-stage rule Q-learning requires a fixed second-stage rule for the backup step. Problem: π^Ideal_2 is not a decision rule. Idea:
1. Define a set of sensible second-stage rules S(π^Ideal_2)
2. For each τ_2 ∈ S(π^Ideal_2), derive the ideal set-valued rule π^Ideal_1(·, τ_2) assuming τ_2 will be followed at the second stage
3. A patient presenting with H_1 = h_1 is offered the treatments π^Ideal_1(h_1) = ∪_{τ_2 ∈ S(π^Ideal_2)} π^Ideal_1(h_1, τ_2)

14 SVDTRs: defining a sensible set A decision rule τ_2 is compatible with a set-valued rule π_2 if τ_2(h_2) ∈ π_2(h_2) for all h_2 ∈ H_2. Let C(π_2) denote the set of all policies compatible with π_2. Minimally require S(π^Ideal_2) ⊆ C(π^Ideal_2). But C(π^Ideal_2) is large and contains many unrealistic rules. The contrast models suggest a functional form for decision rules:
F = {τ_2 : there exists ρ ∈ R^{p_2} such that τ_2(h_2) = sgn(h_22' ρ)}
Define the set of sensible second-stage rules as S(π^Ideal_2) = F ∩ C(π^Ideal_2).
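A sample-based version of the compatibility condition, together with the linear class F, might look like the sketch below; the rule parameterization and the finite-sample check are illustrative assumptions (the definition quantifies over all of H_2):

```python
import numpy as np

def linear_rule(rho):
    """A rule in the class F: tau_2(h2) = sgn(h2' rho), with sgn(0) taken as 1."""
    return lambda h2: 1 if np.dot(h2, rho) >= 0 else -1

def compatible(tau2, pi2, H2_sample):
    """Check tau_2(h2) in pi_2(h2) over a finite sample of histories;
    the slide's definition requires this for every h2 in H_2."""
    return all(tau2(h2) in pi2(h2) for h2 in H2_sample)
```

Intersecting F with C(π^Ideal_2) then amounts to searching over ρ for rules that pass this check everywhere, which is what motivates the mixed-integer-programming formulation mentioned later in the talk.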

15 SVDTRs: ideal set-valued first-stage rule For a fixed second-stage rule τ_2 define
Q_1E(h_1, a_1, τ_2) = E{Q_2E(H_2, τ_2(H_2)) | H_1 = h_1, A_1 = a_1}
r_1E(h_1, τ_2) = Q_1E(h_1, 1, τ_2) - Q_1E(h_1, -1, τ_2)
with Q_1S and r_1S defined analogously. Ideal set-valued first-stage rule under τ_2 at the second stage:
π^Ideal_1(h_1, τ_2) =
  {sgn(r_1E(h_1, τ_2))}, if |r_1E(h_1, τ_2)| ≥ Δ_E and sgn(r_1E(h_1, τ_2)) r_1S(h_1, τ_2) > -Δ_S,
  {sgn(r_1S(h_1, τ_2))}, if |r_1S(h_1, τ_2)| ≥ Δ_S and sgn(r_1S(h_1, τ_2)) r_1E(h_1, τ_2) > -Δ_E,
  {-1, 1}, otherwise.

16 SVDTRs: Defining π^Ideal_1 The rule π^Ideal_1(·, τ_2) assigns a single treatment if that treatment is expected to yield a clinically significant improvement in one or both outcomes while not causing a clinically significant loss in either outcome, assuming the clinician will follow τ_2 at the second decision point. Recall: the ideal decision rule at the first stage is π^Ideal_1(h_1) = ∪_{τ_2 ∈ S(π^Ideal_2)} π^Ideal_1(h_1, τ_2)

17 SVDTR estimation Linear models; H_t1, H_t2 summaries of H_t.
1. Regress E and S on (H_21, H_22 A_2) to obtain
Q̂_2E(H_2, A_2) = β̂_21E' H_21 + β̂_22E' H_22 A_2
Q̂_2S(H_2, A_2) = β̂_21S' H_21 + β̂_22S' H_22 A_2
2. Define
r̂_2E(h_2) = Q̂_2E(h_2, 1) - Q̂_2E(h_2, -1)
r̂_2S(h_2) = Q̂_2S(h_2, 1) - Q̂_2S(h_2, -1)
3. Use the plug-in estimator of π^Ideal_2 to obtain
π̂_2(h_2) =
  {sgn(r̂_2E(h_2))}, if |r̂_2E(h_2)| ≥ Δ_E and sgn(r̂_2E(h_2)) r̂_2S(h_2) > -Δ_S,
  {sgn(r̂_2S(h_2))}, if |r̂_2S(h_2)| ≥ Δ_S and sgn(r̂_2S(h_2)) r̂_2E(h_2) > -Δ_E,
  {-1, 1}, otherwise.
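Steps 1–3 can be sketched with ordinary least squares. The simulated data, the simplifying choice H_21 = H_22 = (1, X_2), and the threshold values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
dE, dS = 0.3, 0.3  # assumed clinically significant differences

# Hypothetical second-stage data; take H_21 = H_22 = (1, X2) for simplicity.
X2 = rng.normal(size=n)
A2 = rng.choice([-1, 1], size=n)
E = A2 * X2 + rng.normal(size=n)        # efficacy: A2 = 1 helps when X2 > 0
S = -0.5 * A2 + rng.normal(size=n)      # side effects: A2 = 1 always hurts

H = np.column_stack([np.ones(n), X2])
Z = np.column_stack([H, H * A2[:, None]])  # main effects + A2 interactions

# Step 1: fitted linear Q-functions for each outcome.
bE = np.linalg.lstsq(Z, E, rcond=None)[0]
bS = np.linalg.lstsq(Z, S, rcond=None)[0]

# Step 2: estimated contrasts r_hat(h2) = Q_hat(h2, 1) - Q_hat(h2, -1),
# which equal twice the interaction part of the fitted model.
contrast = lambda b, h: 2.0 * np.dot(h, b[2:])

# Step 3: plug-in estimator of the ideal set-valued rule.
def pi2_hat(h):
    rE, rS = contrast(bE, h), contrast(bS, h)
    sgn = lambda v: 1 if v >= 0 else -1
    if abs(rE) >= dE and sgn(rE) * rS > -dS:
        return {sgn(rE)}
    if abs(rS) >= dS and sgn(rS) * rE > -dE:
        return {sgn(rS)}
    return {-1, 1}
```

Under this toy generative model, at h_2 = (1, -2) treatment -1 wins on both outcomes and π̂_2 is the singleton {-1}, while at h_2 = (1, 2) the efficacy gain from A_2 = 1 conflicts with its side-effect cost, so both treatments are offered.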

18 SVDTR estimation cont'd
4.* Construct the set S(π̂_2)
5. For each τ_2 in S(π̂_2), regress Q̂_2E(H_2, τ_2(H_2)) and Q̂_2S(H_2, τ_2(H_2)) on (H_11, H_12 A_1) to obtain
Q̂_1E(H_1, A_1; τ_2) = β̂_11E(τ_2)' H_11 + β̂_12E(τ_2)' H_12 A_1
Q̂_1S(H_1, A_1; τ_2) = β̂_11S(τ_2)' H_11 + β̂_12S(τ_2)' H_12 A_1
6. Define
r̂_1E(h_1, τ_2) = Q̂_1E(h_1, 1, τ_2) - Q̂_1E(h_1, -1, τ_2)
r̂_1S(h_1, τ_2) = Q̂_1S(h_1, 1, τ_2) - Q̂_1S(h_1, -1, τ_2)
7. Use the plug-in estimator of π^Ideal_1 to obtain π̂_1(h_1, τ_2) (similar to step 3)
8. Define π̂_1(h_1) = ∪_{τ_2 ∈ S(π̂_2)} π̂_1(h_1, τ_2)
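Steps 5–8 run a Q-learning backup once per candidate second-stage rule and union the resulting first-stage sets. Everything below is an illustrative assumption, in particular the two-element stand-in for S(π̂_2):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
dE = dS = 0.3  # assumed clinically significant differences

# Hypothetical two-stage data (illustrative only).
X1 = rng.normal(size=n)
A1 = rng.choice([-1, 1], size=n)
X2 = X1 + rng.normal(size=n)
A2 = rng.choice([-1, 1], size=n)
E = A2 * X2 + 0.4 * A1 + rng.normal(size=n)
S = -A2 + rng.normal(size=n)

# Stage-2 fits (step 1 of the previous slide).
Z2 = np.column_stack([np.ones(n), X2, A2, A2 * X2])
bE = np.linalg.lstsq(Z2, E, rcond=None)[0]
bS = np.linalg.lstsq(Z2, S, rcond=None)[0]
Q2 = lambda b, x2, a2: b[0] + b[1] * x2 + a2 * (b[2] + b[3] * x2)

# Stand-in for the set S(pi2_hat): two fixed candidate rules tau_2.
S_rules = [lambda x2: np.sign(x2), lambda x2: -np.ones_like(x2)]

def pi1_hat(x1):
    """Union over tau_2 of the plug-in first-stage set-valued rule (step 8)."""
    out = set()
    Z1 = np.column_stack([np.ones(n), X1, A1, A1 * X1])
    for tau2 in S_rules:
        r = {}
        for lbl, b in (("E", bE), ("S", bS)):
            # Steps 5-6: regress the pseudo-outcome Q2(H2, tau2(H2)) on
            # stage-1 history, then form the stage-1 contrast at x1.
            b1 = np.linalg.lstsq(Z1, Q2(b, X2, tau2(X2)), rcond=None)[0]
            r[lbl] = 2.0 * (b1[2] + b1[3] * x1)
        # Step 7: plug-in ideal first-stage rule under this tau_2.
        sgn = lambda v: 1 if v >= 0 else -1
        if abs(r["E"]) >= dE and sgn(r["E"]) * r["S"] > -dS:
            out |= {sgn(r["E"])}
        elif abs(r["S"]) >= dS and sgn(r["S"]) * r["E"] > -dE:
            out |= {sgn(r["S"])}
        else:
            out |= {-1, 1}
    return out
```

The next slide notes that the real S(π̂_2) is uncountable, so in practice this loop runs over a provably sufficient finite subset rather than a hand-picked list like `S_rules`.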

19 SVDTR computation Constructing the set S(π̂_2) is a seemingly difficult enumeration problem. While S(π̂_2) is uncountably infinite, one can prove there exists a finite set S*(π̂_2) so that for all h_1:
∪_{τ_2 ∈ S*(π̂_2)} π̂_1(h_1, τ_2) = ∪_{τ_2 ∈ S(π̂_2)} π̂_1(h_1, τ_2)
We cast construction of S*(π̂_2) as a linear Mixed Integer Program (MIP), computed efficiently using CPLEX. With the CATIE data (n ≈ 1100, p ≈ 20) the runtime is less than 1 minute on a dual-core laptop (2.8 GHz, 8 GiB).

20 Communicating the SVDTR A patient presenting at the second stage with H_2 = h_2 is given:
1. π̂_2(h_2)
2. The estimates (Q̂_2E(h_2, 1), Q̂_2S(h_2, 1)) and (Q̂_2E(h_2, -1), Q̂_2S(h_2, -1))
A patient presenting at the first stage with H_1 = h_1 is given:
1. π̂_1(h_1)
2. A plot of Q̂_1E(h_1, a_1, τ_2) against Q̂_1S(h_1, a_1, τ_2) across τ_2 ∈ S(π̂_2), with separate plotting symbols/colors for a_1 = ±1

21 First-stage plot for the mean CATIE subject Estimated SVDTR using data from the CATIE study; outcomes E = PANSS and S = BMI (details in Laber et al., 2013). Plot constructed for a subject with h_1 = n^{-1} Σ_i H_1i. [Figure: estimated first-stage outcomes on the BMI and PANSS axes, with separate symbols for PERP and Not PERP.]

22 Discussion Introduced set-valued dynamic treatment regimes (SVDTRs) One approach to dealing with competing outcomes under: patient preference heterogeneity; difficult-to-elicit preferences; evolving patient preferences Presented the two-stage binary treatment case; the method extends to an arbitrary number of treatments and stages (see Laber et al., 2013) Suggests a new framework for decision problems with both a decision maker and a treatment screener. This framework generalizes MDPs and may facilitate minimax and other definitions of optimality for SVDTRs.

23 Thank you.


More information

Support Vector Machines

Support Vector Machines EE 17/7AT: Optimization Models in Engineering Section 11/1 - April 014 Support Vector Machines Lecturer: Arturo Fernandez Scribe: Arturo Fernandez 1 Support Vector Machines Revisited 1.1 Strictly) Separable

More information

Comparative effectiveness of dynamic treatment regimes

Comparative effectiveness of dynamic treatment regimes Comparative effectiveness of dynamic treatment regimes An application of the parametric g- formula Miguel Hernán Departments of Epidemiology and Biostatistics Harvard School of Public Health www.hsph.harvard.edu/causal

More information

Online Learning and Sequential Decision Making

Online Learning and Sequential Decision Making Online Learning and Sequential Decision Making Emilie Kaufmann CNRS & CRIStAL, Inria SequeL, emilie.kaufmann@univ-lille.fr Research School, ENS Lyon, Novembre 12-13th 2018 Emilie Kaufmann Sequential Decision

More information

Potential Outcomes Model (POM)

Potential Outcomes Model (POM) Potential Outcomes Model (POM) Relationship Between Counterfactual States Causality Empirical Strategies in Labor Economics, Angrist Krueger (1999): The most challenging empirical questions in economics

More information

Coordinate Systems. Recall that a basis for a vector space, V, is a set of vectors in V that is linearly independent and spans V.

Coordinate Systems. Recall that a basis for a vector space, V, is a set of vectors in V that is linearly independent and spans V. These notes closely follow the presentation of the material given in David C. Lay s textbook Linear Algebra and its Applications (rd edition). These notes are intended primarily for in-class presentation

More information

The Lady Tasting Tea. How to deal with multiple testing. Need to explore many models. More Predictive Modeling

The Lady Tasting Tea. How to deal with multiple testing. Need to explore many models. More Predictive Modeling The Lady Tasting Tea More Predictive Modeling R. A. Fisher & the Lady B. Muriel Bristol claimed she prefers tea added to milk rather than milk added to tea Fisher was skeptical that she could distinguish

More information

Real Time Value Iteration and the State-Action Value Function

Real Time Value Iteration and the State-Action Value Function MS&E338 Reinforcement Learning Lecture 3-4/9/18 Real Time Value Iteration and the State-Action Value Function Lecturer: Ben Van Roy Scribe: Apoorva Sharma and Tong Mu 1 Review Last time we left off discussing

More information

Practicable Robust Markov Decision Processes

Practicable Robust Markov Decision Processes Practicable Robust Markov Decision Processes Huan Xu Department of Mechanical Engineering National University of Singapore Joint work with Shiau-Hong Lim (IBM), Shie Mannor (Techion), Ofir Mebel (Apple)

More information

arxiv: v1 [stat.ap] 17 Mar 2018

arxiv: v1 [stat.ap] 17 Mar 2018 Power Analysis in a SMART Design: Sample Size Estimation for Determining the Best Dynamic Treatment Regime arxiv:1804.04587v1 [stat.ap] 17 Mar 2018 William J. Artman Department of Biostatistics and Computational

More information

Evidence synthesis for a single randomized controlled trial and observational data in small populations

Evidence synthesis for a single randomized controlled trial and observational data in small populations Evidence synthesis for a single randomized controlled trial and observational data in small populations Steffen Unkel, Christian Röver and Tim Friede Department of Medical Statistics University Medical

More information

Weighted MCID: Estimation and Statistical Inference

Weighted MCID: Estimation and Statistical Inference Weighted MCID: Estimation and Statistical Inference Jiwei Zhao, PhD Department of Biostatistics SUNY Buffalo JSM 2018 Vancouver, Canada July 28 August 2, 2018 Joint Work with Zehua Zhou and Leslie Bisson

More information

BIOS 6649: Handout Exercise Solution

BIOS 6649: Handout Exercise Solution BIOS 6649: Handout Exercise Solution NOTE: I encourage you to work together, but the work you submit must be your own. Any plagiarism will result in loss of all marks. This assignment is based on weight-loss

More information

CS 7180: Behavioral Modeling and Decisionmaking

CS 7180: Behavioral Modeling and Decisionmaking CS 7180: Behavioral Modeling and Decisionmaking in AI Markov Decision Processes for Complex Decisionmaking Prof. Amy Sliva October 17, 2012 Decisions are nondeterministic In many situations, behavior and

More information

Partially Observable Markov Decision Processes (POMDPs)

Partially Observable Markov Decision Processes (POMDPs) Partially Observable Markov Decision Processes (POMDPs) Sachin Patil Guest Lecture: CS287 Advanced Robotics Slides adapted from Pieter Abbeel, Alex Lee Outline Introduction to POMDPs Locally Optimal Solutions

More information

A Robust Method for Estimating Optimal Treatment Regimes

A Robust Method for Estimating Optimal Treatment Regimes Biometrics 63, 1 20 December 2007 DOI: 10.1111/j.1541-0420.2005.00454.x A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian Department

More information

Chapter 3: The Reinforcement Learning Problem

Chapter 3: The Reinforcement Learning Problem Chapter 3: The Reinforcement Learning Problem Objectives of this chapter: describe the RL problem we will be studying for the remainder of the course present idealized form of the RL problem for which

More information

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University July 31, 2018

More information

High Dimensional Propensity Score Estimation via Covariate Balancing

High Dimensional Propensity Score Estimation via Covariate Balancing High Dimensional Propensity Score Estimation via Covariate Balancing Kosuke Imai Princeton University Talk at Columbia University May 13, 2017 Joint work with Yang Ning and Sida Peng Kosuke Imai (Princeton)

More information

Multi-armed bandit models: a tutorial

Multi-armed bandit models: a tutorial Multi-armed bandit models: a tutorial CERMICS seminar, March 30th, 2016 Multi-Armed Bandit model: general setting K arms: for a {1,..., K}, (X a,t ) t N is a stochastic process. (unknown distributions)

More information

ST 732, Midterm Solutions Spring 2019

ST 732, Midterm Solutions Spring 2019 ST 732, Midterm Solutions Spring 2019 Please sign the following pledge certifying that the work on this test is your own: I have neither given nor received aid on this test. Signature: Printed Name: There

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information

Magical Policy Search: Data Efficient Reinforcement Learning with Guarantees of Global Optimality

Magical Policy Search: Data Efficient Reinforcement Learning with Guarantees of Global Optimality Magical Policy Search Magical Policy Search: Data Efficient Reinforcement Learning with Guarantees of Global Optimality Philip S. Thomas Carnegie Mellon University Emma Brunskill Carnegie Mellon University

More information

Lecture 3. Linear Regression II Bastian Leibe RWTH Aachen

Lecture 3. Linear Regression II Bastian Leibe RWTH Aachen Advanced Machine Learning Lecture 3 Linear Regression II 02.11.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de This Lecture: Advanced Machine Learning Regression

More information

Logistic Regression. Advanced Methods for Data Analysis (36-402/36-608) Spring 2014

Logistic Regression. Advanced Methods for Data Analysis (36-402/36-608) Spring 2014 Logistic Regression Advanced Methods for Data Analysis (36-402/36-608 Spring 204 Classification. Introduction to classification Classification, like regression, is a predictive task, but one in which the

More information

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks Y. Xu, D. Scharfstein, P. Mueller, M. Daniels Johns Hopkins, Johns Hopkins, UT-Austin, UF JSM 2018, Vancouver 1 What are semi-competing

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

Randomization-Based Inference With Complex Data Need Not Be Complex!

Randomization-Based Inference With Complex Data Need Not Be Complex! Randomization-Based Inference With Complex Data Need Not Be Complex! JITAIs JITAIs Susan Murphy 07.18.17 HeartSteps JITAI JITAIs Sequential Decision Making Use data to inform science and construct decision

More information

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum

More information

An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss

An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss arxiv:1811.04545v1 [stat.co] 12 Nov 2018 Cheng Wang School of Mathematical Sciences, Shanghai Jiao

More information

Stochastic process for macro

Stochastic process for macro Stochastic process for macro Tianxiao Zheng SAIF 1. Stochastic process The state of a system {X t } evolves probabilistically in time. The joint probability distribution is given by Pr(X t1, t 1 ; X t2,

More information

CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm

CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm NAME: SID#: Login: Sec: 1 CS 188 Introduction to Fall 2007 Artificial Intelligence Midterm You have 80 minutes. The exam is closed book, closed notes except a one-page crib sheet, basic calculators only.

More information

1 MDP Value Iteration Algorithm

1 MDP Value Iteration Algorithm CS 0. - Active Learning Problem Set Handed out: 4 Jan 009 Due: 9 Jan 009 MDP Value Iteration Algorithm. Implement the value iteration algorithm given in the lecture. That is, solve Bellman s equation using

More information

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial William R. Gillespie Pharsight Corporation Cary, North Carolina, USA PAGE 2003 Verona,

More information

2534 Lecture 4: Sequential Decisions and Markov Decision Processes

2534 Lecture 4: Sequential Decisions and Markov Decision Processes 2534 Lecture 4: Sequential Decisions and Markov Decision Processes Briefly: preference elicitation (last week s readings) Utility Elicitation as a Classification Problem. Chajewska, U., L. Getoor, J. Norman,Y.

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

Introduction to Reinforcement Learning. CMPT 882 Mar. 18

Introduction to Reinforcement Learning. CMPT 882 Mar. 18 Introduction to Reinforcement Learning CMPT 882 Mar. 18 Outline for the week Basic ideas in RL Value functions and value iteration Policy evaluation and policy improvement Model-free RL Monte-Carlo and

More information