Implementing Precision Medicine: Optimal Treatment Regimes and SMARTs. Anastasios (Butch) Tsiatis and Marie Davidian

Size: px
Start display at page:

Download "Implementing Precision Medicine: Optimal Treatment Regimes and SMARTs. Anastasios (Butch) Tsiatis and Marie Davidian"

Transcription

1 Implementing Precision Medicine: Optimal Treatment Regimes and SMARTs Anastasios (Butch) Tsiatis and Marie Davidian Department of Statistics North Carolina State University 1/65 Treatment Regimes and SMARTs

2 Precision medicine Patent heterogeneity: Demographic characteristics Physiological characteristics Medical history, concomitant conditions Genetic/genomic characteristics Precision medicine: Multiple treatment options Multiple decision points A patient s characteristics are implicated in which treatment option s/he should receive and when 2/65 Treatment Regimes and SMARTs

3 Example: Acute leukemias Two decision points: Decision 1 : Induction chemotherapy (C, options c 1, c 2 ) Decision 2 : Maintenance treatment for patients who respond (M, options m 1, m 2 ) Salvage chemotherapy for those who don t (S, options s 1, s 2 ) 3/65 Treatment Regimes and SMARTs

4 Clinical decision-making How are these decisions made? Clinical judgment, practice guidelines Synthesis of all information on a patient up to the point of a decision to determine next treatment action from among the possible options Goal: Make these decisions so as to lead to the best outcome for the patient Goal: Formalize clinical decision-making and make it evidence-based 4/65 Treatment Regimes and SMARTs

5 Treatment regime Formalizing precision medicine: At any decision Construct a rule that takes as input the accrued information on the patient to that point and outputs the next treatment from among the possible options Decision 1: If age < 50 years and WBC < /µl, give chemotherapy c 2 ; otherwise, give c 1 (Dynamic) treatment regime(n): Aka adaptive treatment strategy A set of such rules, each corresponding to a decision point An algorithm for treating an individual patient 5/65 Treatment Regimes and SMARTs

6 Single decision point First: Focus on single decision point x = all available information on the patient Set of treatment options, e.g., {c 1,c 2 } = {0, 1} Treatment regime: A single rule of the form d(x) such that d(x) = 0 or 1 depending on x Example rules/regimes: x = {age, WBC, gender} Rule involving cut-offs (rectangular region ) d(x) = I(age < 50 and WBC < 10) Rule involving a linear combination (hyperplane ) d(x) = I(age log(wbc) 60 > 0) 6/65 Treatment Regimes and SMARTs

7 Best treatment regime? Challenge: There is an infinitude of possible regimes d D = class of all possible treatment regimes Can we find the best set of rules; i.e., the best treatment regime in D? How to define best? Approach: Formalize best using ideas from causal inference 7/65 Treatment Regimes and SMARTs

8 Clinical outcome Outcome: There is a clinical outcome by which treatment benefit can be assessed Survival time, CD4 count, indicator of no myocardial infarction within 30 days,... Larger outcomes are better 8/65 Treatment Regimes and SMARTs

9 Defining best An optimal regime d opt : Should satisfy If an individual patient were to receive treatment according to d opt, his/her expected outcome would be as large as possible given the information available on him/her If all patients in the population were to receive treatment according to d opt, the expected (average) outcome for the population would be as large as possible Formalize this... 9/65 Treatment Regimes and SMARTs

10 Optimal regime for single decision For simplicity: Consider regimes involving a single decision with two treatment options (coded as 0 and 1) A = {0, 1} Baseline information x Treatment regime: A single rule d(x) d : x {0, 1} d D, the class of all regimes 10/65 Treatment Regimes and SMARTs

11 Potential outcomes Treatments 0 and 1: We can hypothesize potential outcomes Y (1) = outcome that would be achieved if a patient were to receive treatment 1; Y (0) similarly E{Y (1)} is the expected outcome if all patients in the population were to receive 1; E{Y (0)} similarly Potential outcome for a regime: For any d D Y (d) = potential outcome for a patient with baseline information X if s/he were to receive treatment in accordance with regime d Y (d) = Y (1)d(X) + Y (0){1 d(x)} Potential outcome for regime d for a patient with information X is the potential outcome for the treatment option dictated by d 11/65 Treatment Regimes and SMARTs

12 Optimal treatment regime Thus: E{Y (d) X = x} is the expected outcome for a patient with information x if s/he were to receive treatment according to regime d D E{Y (d)} = E[ E{Y (d) X} ] is the expected (average) outcome for the population if all patients were to receive treatment according to regime d D Optimal regime: d opt is a regime in D such that E{Y (d) X = x} E{Y (d opt ) X = x} for all x and d D E{Y (d)} E{Y (d opt )} for all d D 12/65 Treatment Regimes and SMARTs

13 Value of a treatment regime Definition: E{Y (d)} is referred to as the value of regime d Write V (d) = E{Y (d)} Thus, d opt is a regime in D that maximizes the value V (d) among all d D 13/65 Treatment Regimes and SMARTs

14 Optimal treatment regime Simplest case: A = {0, 1} Y (d) = Y (1)d(X) + Y (0){1 d(x)} Thus, the value V (d) of d is [ V (d) = E{Y (d)} = E = E It follows that d opt (x) = I ] E{Y (d) X} [ E{Y (1) X}d(X) + E{Y (0) X}{1 d(x)} [ ] E{Y (1) X = x} > E{Y (0) X = x} ] 14/65 Treatment Regimes and SMARTs

15 Statistical problem Goal: Given data from a clinical trial or observational study, estimate an optimal regime d opt satisfying this definition Observed data: Single decision, K = 1, A = {0, 1}. (X i, A i, Y i ), i = 1,..., n iid X = baseline information A = treatment actually received Y = observed outcome Required: Optimal regime is defined in terms of potential outcomes Must express equivalently in terms of observed data 15/65 Treatment Regimes and SMARTs

16 Assumptions Consistency assumption: Y = Y (1)A + Y (0)(1 A) No unmeasured confounders assumption: Y (0), Y (1) A X X contains all information used to assign treatments Automatically satisfied in a randomized trial Unverifiable but necessary in an observational study 16/65 Treatment Regimes and SMARTs

17 Under these assumptions E{Y (1)} = E[E{Y (1) X}] = E[E{Y (1) X, A = 1}] and similarly for E{Y (0)} = E{ E(Y X, A = 1) } Value V (d) of d: By similar calculations V (d) = E{Y (d)} = E[E{Y (d) X}] [ ] = E E{Y (1) X}d(X) + E{Y (0) X}{1 d(x)} [ ] = E E(Y X, A = 1)d(X) + E(Y X, A = 0){1 d(x)} 17/65 Treatment Regimes and SMARTs

18 In terms of obesrved data [ ] V (d) = E E(Y X, A = 1)d(X) + E(Y X, A = 0){1 d(x)} [ ] d opt (x) = I E{Y X = x, A = 1} > E{Y X = x, A = 0} Result: Can express V (d) and d opt in terms the observed data Q(x, a) = E(Y X = x, A = a) is the regression of outcome on X and treatment actually received A d opt chooses the treatment option that makes this regression the largest d opt (x) = I{Q(x, 1) > Q(x, 0)} 18/65 Treatment Regimes and SMARTs

19 Estimating V (d) and d opt Q(x, a) = E(Y X = x, A = a) [ ] V (d) = E Q(X, 1)d(X) + Q(X, 0){1 d(x)} d opt (x) = I{Q(x, 1) > Q(x, 0)} To use these results: The regression Q(x, a) = E(Y X = x, A = a) is not known Posit a model Q(x, a; β) (e.g., linear or logistic regression) Q(x, a; β) = β 0 + β 1 x + β 2 a + β 3 ax Estimate β by β (e.g., least squares, ML ) 19/65 Treatment Regimes and SMARTs

20 Estimating V (d) Estimator for the value of d: For any d V (d) = n 1 n i=1 [ Q(X i, 1; β)d(x i ) + Q(X i, 0; β){1 ] d(x i )} 20/65 Treatment Regimes and SMARTs

21 Regression estimator for d opt Regression estimator for d opt : d opt Q (x) = I{ Q(x, 1; β) > Q(x, 0; β) } Regression estimator for the value of d opt : V (d opt ) V Q ( d opt ) = n 1 n i=1 [ Q(X i, 1; β) d opt (X i ) + Q(X i, 0; β){1 d ] opt (X i )} Concern: Q(x, a; β) may be misspecified d opt could be far from d opt 21/65 Treatment Regimes and SMARTs

22 Alternative approach Class of regimes: Q(X, A; β) defines a class of regimes d(x, β) = I{Q(x, 1; β) > Q(x, 0; β)}, indexed by β, that may or may not contain d opt E.g., if we posit Q(X, A; β) = β 0 + β 1 X 1 + β 2 X 2 + A(β 3 + β 4 X 1 + β 5 X 2 ) Implies regimes of the form d(x, β) = I(β 3 + β 4 x 1 + β 5 x 2 0) 22/65 Treatment Regimes and SMARTs

23 Restricted class of regimes Suggests: Consider directly a restricted class of regimes D η d η (x) = d(x, η), indexed by η May be motivated by a regression model or based on cost, feasibility in practice, interpretability; e.g., d(x, η) = I(x 1 < η 0, x 2 < η 1 ) D η may or may not contain d opt, but still of interest Optimal restricted regime d opt η (x) = d(x, η opt ), η opt maximizes V (d η ) = E{Y (d η )} Estimate the optimal restricted regime by estimating η opt 23/65 Treatment Regimes and SMARTs

24 Estimating an optimal restricted regime Idea: Maximize a good estimator for the value V (d η ) Missing data analogy: For fixed η, define C η = I{A = d(x, η)} = Ad(X, η) + (1 A){1 d(x, η)} C η = 1 if the treatment received is consistent with having following d η and = 0 otherwise Only subjects with C η = 1 have observed outcomes Y consistent with following d η For the rest, such outcomes are missing 24/65 Treatment Regimes and SMARTs

25 Estimating an optimal restricted regime Propensity score: Propensity for treatment 1 π(x) = pr(a = 1 X = x) Randomized trial: π(x) is known Observational study: Posit a model π(x; γ) (e.g., logistic regression) and obtain γ using (A i, X i ), i = 1,..., n Propensity of receiving treatment consistent with d η π c (X; η) = pr(c η = 1 X) Write π c (x; η, γ) with π(x; γ) = E[Ad(X, η) + (1 A){1 d(x, η)} X] = π(x)d(x, η) + {1 π(x)}{1 d(x, η)} 25/65 Treatment Regimes and SMARTs

26 Estimating an optimal restricted regime Inverse probability weighted estimator for V (d η ): V IPWE (d η ) = n 1 n i=1 C η,i Y i π c (X i ; η, γ). Consistent for V (d η ) if π(x; γ) (thus π c (x; η, γ)) is correctly specified But uses data only from subjects with C η = 1 26/65 Treatment Regimes and SMARTs

27 Estimating an optimal restricted regime Doubly robust augmented inverse probability weighted estimator: n { V AIPWE (d η ) = n 1 Cη,i Y i π c (X i ; η, γ) C } η,i π c (X i ; η, γ) m(x i ; η, π c (X i ; η, γ) β) i=1 m(x; η, β) = E{Y (d η ) X} = Q(X, 1; β)d(x, η)+q(x, 0; β){1 d(x, η)} Q(x, a; β) is a model for E(Y X = x, A = a) Consistent if either π(x, γ) or Q(x, a; β) is correct (doubly robust ) Attempts to gain efficiency by using data from all subjects 27/65 Treatment Regimes and SMARTs

28 Value search estimators Result: Estimators η opt by maximizing V IPWE (d η ) or V AIPWE (d η ) in η Estimated optimal restricted regime d opt η (x) = d(x, η opt ) Non-smooth in η; need suitable optimization techniques Estimators for V (dη opt ) = E{Y (dη opt )} V IPWE ( d opt η,ipwe ) or VAIPWE ( d opt η,aipwe ) Semiparametric theory : AIPWE is more efficient than IPWE for estimating V (d η ) = E{Y (d η )} Estimating regimes based on AIPWE should be better Zhang et al. (2012), Biometrics 28/65 Treatment Regimes and SMARTs

29 Classification perspective Algebra: Maximizing V IPWE (d η ) or V AIPWE (d η ) in η can be cast as a weighted classification problem D η determined by choice of classifier, e.g., SVM, CART Can exploit available algorithms and software Zhang et al. (2012), Stat ; Zhao et al. (2012), JASA (IPWE only, Outcome weighted learning, OWL) 29/65 Treatment Regimes and SMARTs

30 Multiple decision points At baseline: Information x 1 (accrued information h 1 = x 1 ) Decision 1: Two options {c 1, c 2 }; rule 1: d 1 (x 1 ) x 1 {c 1, c 2 } Between decisions 1 and 2: Collect additional information x 2, including responder status Accrued information h 2 = {x 1, chemotherapy at decision 1, x 2 } Decision 2: Four options {m 1, m 2, s 1, s 2 }; rule 2: d 2 (h 2 ) h 2 {m 1, m 2 } (responder), h 2 {s 1, s 2 } (nonresponder) Regime : d = (d 1, d 2 ) 30/65 Treatment Regimes and SMARTs

31 Multiple decision points For example: x 1 ={age, WBC, gender, %CD56 expression,... } a 1 = chemotherapy (c 1 or c 2 ) x 2 = {grade 3+ hematologic adverse event, post-induction ECOG status, WBC, platelets,..., responder status} Accrued information h 1 = x 1, h 2 = (x 1, a 1, x 2 ) Rules d 1 (h 1 ), d 2 (h 2 ) Regime d = {d 1 (h 1 ), d 2 (h 2 )} 31/65 Treatment Regimes and SMARTs

32 Multiple decision points In general: K decision points Baseline information x 1, intermediate information x k between decisions k 1 and k, k = 2,..., K Set of treatment options at decision k a k A k Accrued information h 1 = x 1 h k = {x 1, a 1, x 2, a 2,..., x k 1, a k 1, x k } k = 2,..., K Decision rules d 1 (h 1 ), d 2 (h 2 ),..., d K (h K ), d k : h k A k Dynamic treatment regime d = (d 1, d 2,..., d K ) D is the set of all possible K -decision regimes 32/65 Treatment Regimes and SMARTs

33 Optimal regime for multiple decisions An optimal regime d opt D: Should satisfy: If an individual patient with baseline information x 1 were to receive all K treatments according to d opt, his/her expected outcome would be as large as possible If all patients in the population were to receive all K treatments according to d opt, the expected (average) outcome for the population would be as large as possible 33/65 Treatment Regimes and SMARTs

34 Potential outcomes Potential outcomes for regime d = (d 1,..., d K ) D: Baseline information X 1, potential outcomes X 2 (d),..., X K (d), Y (d) Optimal regime: d opt is a regime in D such that E{Y (d)} E{Y (d opt )} for all d D E{Y (d) X 1 = x 1 } E{Y (d opt ) X 1 = x 1 } for all x 1, d D Excruciating details: Schulte et al. (2014), Statistical Science 34/65 Treatment Regimes and SMARTs

35 Statistical problem Goal: Given data from an appropriate clinical trial or observational study, estimate an optimal regime d opt satisfying this definition Required observed data: K decisions (X 1i, A 1i, X 2i, A 2i,..., X (K 1)i, A (K 1)i, X Ki, A Ki, Y i ), i = 1,..., n, iid X 1 = baseline information A k, k = 1,..., K = treatment received at decision k X k, k = 2,..., K = intermediate information observed between decisions k 1 and k H k = (X 1, A 1, X 2,..., A k 1, X k ), k = 2,..., K = accrued information to decision k Y = observed outcome, ascertained after decision K or can be a function of X 2,..., X K 35/65 Treatment Regimes and SMARTs

36 Assumptions Consistency assumption: X k, k = 2,..., K, and Y are equal to the potential outcomes under the treatments A 1,..., A K actually received Critical assumption: Sequential randomization Generalization of no unmeasured confounders Treatment received at each decision k = 1,..., K is statistically independent of potential outcomes that would be achieved under all treatment options at all decisions given the accrued information H k 36/65 Treatment Regimes and SMARTs

37 Data sources Studies: Longitudinal observational Sequential, Multiple Assignment, Randomized Trial (SMART) 37/65 Treatment Regimes and SMARTs

38 SMART Leukemia example: Randomization at s M 1 Response M 2 C 1 No Response S 1 S 2 Cancer Response M 1 C 2 M 2 No Response S 1 S 2 Sequential randomization automatically holds 38/65 Treatment Regimes and SMARTs

39 Characterizing an optimal regime For definiteness: Take K = 2 and A k = {0, 1}, k = 1, 2 Accrued information H 1 = X 1, H 2 = (X 1, A 1, X 2 ) Optimal regime d opt : Follows from backward induction Formally in terms of potential outcomes Sequential randomization assumption allows equivalent expressions in terms of observed data (X 1, A 1, X 2, A 2, Y ) (as for single decision and no unmeasured confounders) 39/65 Treatment Regimes and SMARTs

40 Characterizing an optimal regime Optimal regime d opt : Backward induction Decision 2: Q 2 (h 2, a 2 ) = E(Y H 2 = h 2, A 2 = a 2 ) d opt 2 (h 2 ) = I{Q 2 (h 2, 1) > Q 2 (h 2, 0)} Ỹ 2 (h 2 ) = max{q 2 (h 2, 0), Q 2 (h 2, 1)} Decision 1: Q 1 (h 1, a 1 ) = E{Ỹ2(H 2 ) H 1 = h 1, A 1 = a 1 } d opt = (d opt 1, d opt 2 ) d opt 1 (h 1 ) = I{Q 1 (h 1, 1) > Q 1 (h 1, 0)]} Ỹ 1 (h 1 ) = max{q 1 (h 1, 0), Q 1 (h 1, 1)} The value of d opt is V (d opt ) = E{Ỹ1(H 1 )} 40/65 Treatment Regimes and SMARTs

41 Q-Learning (sequential regression) Estimation of d opt : Decision 2: Posit and fit a model Q 2 (h 2, a 2 ; β 2 ) by regressing Y on H 2, A 2 (e.g., least squares) and estimate d opt Q,2 (h 2) = I{Q 2 (h 2, 1; β 2 ) > Q 2 (h 2, 0; β 2 )} For each i, form predicted value Ỹ 2i = Ỹ2i(H 2i ; β 2 ) = max{q 2 (H 2i, 0; β 2 ), Q 2 (H 2i, 1; β 2 )} Decision 1: Posit and fit a model Q 1 (h 1, a 1 ; β 1 ) by regressing Ỹ 2 on H 1, A 1 (e.g., least squares) and estimate d opt Q,1 (h 1) = I{Q 1 (h 1, 1; β 1 ) > Q 1 (h 1, 0; β 1 )} Estimated regime d opt Q opt opt = ( d, d Q,1 Q,2 ) 41/65 Treatment Regimes and SMARTs

42 Other estimators Issue for Q-learning: Model misspecification Inverse weighted methods: Analogy to a dropout problem Restricted class D η Generalization of IPWE, AIPWE, Zhang et al. (2013), Biometrika Subjects are included as long as observed sequential treatments are consistent with following d η Zhao et al. (2015), JASA (IPWE only, simultaneous or backward outcome weighted learning, S/BOWL) 42/65 Treatment Regimes and SMARTs

43 SMART Leukemia example: Randomization at s M 1 Response M 2 C 1 No Response S 1 S 2 Cancer Response M 1 C 2 M 2 No Response S 1 S 2 Sequential randomization automatically holds 43/65 Treatment Regimes and SMARTs

44 SMART Embedded regimes: This SMART embeds 8 simple regimes for j, l, p = 1, 2 of the form Give c j followed by m l if response, else s p if nonresponse E.g., in symbols d 1 (h 1 ) = c j for all h 1 d 2 (h 2 ) = m l if response = s p if nonresponse These embedded regimes use no baseline information at decision 1 and response status only at decision 2 Randomization guarantees subjects whose actual treatments are consistent with following at least 1 of the 8 regimes 44/65 Treatment Regimes and SMARTs

45 SMART Multi-purpose: Compare the embedded regimes ( treatment sequences ) Collect extensive, detailed information at baseline and intermediate to decisions 1 and 2 to inform estimation of an optimal regime Experience: Numerous SMARTs conducted in behavioral sciences research Stepped care interventions Also in cancer (CALBG/Alliance), but not focused on regimes 45/65 Treatment Regimes and SMARTs

46 CALGB 8923 (1995) Non- Response Follow-up Chemo + Placebo Intensification I Response Intensification II AML Non- Response Follow-up Chemo + GM-CSF Intensification I Response But not for the purpose of inference on regimes... Intensification II 46/65 Treatment Regimes and SMARTs

47 SMART design principles Current paradigm: Embed simple regimes Base sample size on conventional comparisons, e.g., decision 1 treatments Base sample size on embedded regimes (how?) Base sample size on comparing most and least intensive embedded regimes Open problem 47/65 Treatment Regimes and SMARTs

48 Optimizing behavioral cancer pain intervention PI: Tamara Somers, Duke University 48/65 Treatment Regimes and SMARTs

49 Behavioral interventions to prevent HIV PI: Brian Mustanski, Northwestern University 49/65 Treatment Regimes and SMARTs

50 Methodological challenges High-dimensional information? Regression model selection? Black box vs. restricted class of regimes? Assessment of uncertainty nonstandard asymptotic theory Censored survival outcomes Multiple outcomes (e.g., efficacy and toxicity)? Design considerations for SMARTs? Real time regimes (mhealth )... Statistical research is way ahead of what is actually being done in practice 50/65 Treatment Regimes and SMARTs

51 Software IMPACT Innovative Methods Program for Advancing Clinical Trials A joint venture of Duke, UNC-Chapel Hill, NC State Supported by NCI Program Project P01 CA ( ) R package: DynTxRegime Implements all this and more Also available on CRAN 51/65 Treatment Regimes and SMARTs

52 Shameless promotion Coming in 2018: Tsiatis, A. A., Davidian, M., Laber, E. B., and Holloway, S. T. (2017). An Introduction to Dynamic Treatment Regimes: Statistical Methods for Precision Medicine. Chapman & Hall/CRC Press. 52/65 Treatment Regimes and SMARTs

53 References Schulte, P. J., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2014). Q- and A-learning methods for estimating optimal dynamic treatment regimes. Statistical Science, 29, Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2012). A robust method for estimating optimal treatment regimes. Biometrics, 68, Zhang, B., Tsiatis, A.A., Davidian, M., Zhang, M., and Laber, E.B., (2012). Estimating optimal treatment regimes from a classification perspective. Stat, 1, Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2013). Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions. Biometrika, 100, Zhao, Y., Zeng, D., Laber, E. B., and Kosorok, M. R. (2015). New statistical learning methods for estimating optimal dynamic treatment regimes. Journal of the American Statistical Association, 110, Zhao, Y., Zeng, D., Rush, A. J., and Kosorok, M. R. (2012). Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107, /65 Treatment Regimes and SMARTs

54 Additional references Bai, X., Tsiatis, A. A., Lu, W., and Song, R. (2016). Optimal treatment regimes for survival endpoints using a locally-efficient doubly-robust estimator from a classification perspective. Lifetime Data Analysis, in press. Davidian, M., Tsiatis, A. A., and Laber, E. B. (2016). Dynamic treatment regimes. In Cancer Clinical Trials: Current and Controversial Issues in Design and Analysis, George, S. L., Wang, X. and Pang, H. (eds). Boca Raton: Chapman & Hall/CRC Press, ch. 13, pp Jiang, R., Lu, W., Song, R., and Davidian, M. (2016). On estimation of optimal treatment regimes for maximizing t-year survival probability. Journal of the Royal Statistical Society, Series B, in press. Laber, E. B., Zhao, Y., Regh, T., Davidian, M., Tsiatis, A. A., Stanford, J., Zeng, D., Song, R., and Kosorok, M. R. (2016). Using pilot data to size a two-arm randomized trial to find a nearly optimal personalized treatment strategy. Statistics in Medicine, 35, Goldberg, Y. and Kosorok, M. R. (2012). Q-learning with censored data, Annals of Statistics, 40, /65 Treatment Regimes and SMARTs

55 Additional references Zhang, Y., Laber, E. B., Tsiatis, A. A., and Davidian, M. (2015). Using decision lists to construct interpretable and parsimonious treatment regimes. Biometrics, 71, Zhao, Y. Q., Zeng, D., Laber, E. B., Song, R., Yuan, M. and Kosorok, M. R. (2015), Doubly robust learning for estimating individualized treatment with censored data, Biometrika, 102, /65 Treatment Regimes and SMARTs

56 Important philosophical point Distinguish between: The best treatment for a patient The best treatment decision for a patient given the information available on the patient Best treatment for a patient: Is 0 or 1 according to Y (0) > Y (1) or Y (0) < Y (1) Best treatment decision given the information available: We cannot hope to determine this because we can never see all potential outcomes on a given patient We can hope to make the optimal decision given the information available, i.e., find d opt that makes E{Y (d) X = x} and V (d) = E{Y (d)} as large as possible 56/65 Treatment Regimes and SMARTs

57 Classification perspective Generic classification situation: Z = class, label ; here, Z = {0, 1} (binary ) X = features (covariates) d is a classifier: d : x {0, 1} D is a family of classifiers, e.g., Hyperplanes of the form Rectangular regions of the form I(η 0 + η 1 X 1 + X 2 > 0) I(X 1 < η 1 ) + I(X 1 η 1, X 2 < η 2 ) 57/65 Treatment Regimes and SMARTs

58 Classification perspective Generic classification problem: Training set: (X i, Z i ), i = 1,..., n Find classifier d D that minimizes n Classification error {Z i d(x i )} 2 i=1 Weighted classification error n w i {Z i d(x i )} 2 Machine learning (supervised learning) Support vector machines: Hyperplanes Recursive partitioning (CART ): Rectangular regions i=1 58/65 Treatment Regimes and SMARTs

59 Value search estimator, revisited Recall: Estimation of d η restricted class D η η opt maximizes V (d η ) = E{Y (d η )} AIPWE V AIPWE (d η ) = n 1 n { Cη,i Y i π c (X i ; η, γ) C } η,i π c (X i ; η, γ) m(x i ; η, π c (X i ; η, γ) β) i=1 η opt maximizes V AIPWE (d η ) = estimator d opt η,aipwe (x) 59/65 Treatment Regimes and SMARTs

60 Value search estimator, revisited Algebra: V AIPWE (d η ) can be rewritten as n 1 n Ĉ(X i) [I{Ĉ(X i) > 0} d(x i ; η)] 2 + terms not involving d i=1 Ĉ(X i ) = { Ai Y i π(x i, γ) A } i π(x i, γ) Q(X i, 1; π(x i ; γ) β) { (1 Ai )Y i 1 π(x i ; γ) + A } i π(x i ; γ) 1 π(x i ; γ) Q(X i, 0; β), E{Ĉ(X i) X i } C(X i ) = Q(X i, 1) Q(X i, 0), C(X) is the contrast function 60/65 Treatment Regimes and SMARTs

61 Classification formulation Result: Estimate η opt and thus obtain minimizing in η n 1 d opt η,aipwe (x) = d(x, ηopt ) by n Ĉ(X i) [I{Ĉ(X i) > 0} d(x i ; η)] 2 i=1 Can be viewed: As a weighted classification problem Label I{Ĉ(X i) > 0} Classifier d(x i ; η) Weight Ĉ(X i) Zhang et al. (2012), Stat ; Zhao et al. (2012), JASA (IPWE only, Outcome weighted learning, OWL) 61/65 Treatment Regimes and SMARTs

62 More on Q-Learning Estimation of V (d opt ): For each i, form Ỹ 1i = Ỹ1i(H 1i ; β 1 ) = max{q 1 (H 1i, 0; β 1 ), Q 1 (H 1i, 1; β 1 )} Estimate the value V (d opt ) by V Q (d opt ) = n 1 n Ỹ 1i i=1 62/65 Treatment Regimes and SMARTs

63 More on Q-Learning Decision 2: Positing/fitting Q 2 (H 2, A 2 ; β 2 ) is a standard regression problem Data (H 2i, A 2i, Y i ), i = 1,..., n For continuous Y, linear models are often used, e.g. Q 2 (h 2, a 2 ; β 2 ) = h T 2 β 21 + a 2 ( h T 2 β 22), h2 = (1, h T 2 )T Standard model selection methods, diagnostics, etc Under the linear model d opt 2 (h 2 ) = I( h T 2 β 22 > 0) Ỹ 2 (H 2 ; β 2 ) = H T 2 β 21 + ( H T 2 β 22)I( H T 2 β 22 > 0) 63/65 Treatment Regimes and SMARTs

64 More on Q-Learning Ỹ 2 (H 2 ; β 2 ) = H T 2 β 21 + ( H T 2 β 22)I( H T 2 β 22 > 0) Decision 1: It is common to assume a linear model for Q 1 (H 1, A 1 ) = E{Ỹ2(H 2 ; β 2 ) H 1, A 1 } Q 1 (h 1, a 1 ; β 1 ) = h T 1 β 11 + a 2 ( h T 1 β 12), h1 = (1, h T 1 )T Data (H 1i, A 1i, Ỹ 2i ), i = 1..., n This is clearly a nonstandard regression problem; more in a moment Under the linear model (if it were correct ) d opt 1 (h 1 ) = I( h T 1 β 12 > 0) 64/65 Treatment Regimes and SMARTs

65 More on Q-Learning Decision 1: Even if the Decision 2 linear model is correctly specified, a linear model at Decision 1 is almost certainly incorrect In truth Q 2 (h 2, a 2 ) = x 1 + a 2 (0.5a 1 + x 2 ) and May show that X 2 X 1 = x 1, A 1 = a 1 N (ξx 1, σ 2 ) Q 1 (h 1, a 1 ) = x 1 + Not linear in general σ 2π exp{ (0.5a 1 + ξx 1 ) 2 /(2σ 2 )} +(0.5a 1 + ξx 1 )Φ{(0.5a 1 + ξx 1 )/σ} 65/65 Treatment Regimes and SMARTs

Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian

Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian Estimation of Optimal Treatment Regimes Via Machine Learning Marie Davidian Department of Statistics North Carolina State University Triangle Machine Learning Day April 3, 2018 1/28 Optimal DTRs Via ML

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

More information

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes Biometrics 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materials for A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian

More information

Reader Reaction to A Robust Method for Estimating Optimal Treatment Regimes by Zhang et al. (2012)

Reader Reaction to A Robust Method for Estimating Optimal Treatment Regimes by Zhang et al. (2012) Biometrics 71, 267 273 March 2015 DOI: 10.1111/biom.12228 READER REACTION Reader Reaction to A Robust Method for Estimating Optimal Treatment Regimes by Zhang et al. (2012) Jeremy M. G. Taylor,* Wenting

More information

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates Anastasios (Butch) Tsiatis Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

A Robust Method for Estimating Optimal Treatment Regimes

A Robust Method for Estimating Optimal Treatment Regimes Biometrics 63, 1 20 December 2007 DOI: 10.1111/j.1541-0420.2005.00454.x A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian Department

More information

Methods for Interpretable Treatment Regimes

Methods for Interpretable Treatment Regimes Methods for Interpretable Treatment Regimes John Sperger University of North Carolina at Chapel Hill May 4 th 2018 Sperger (UNC) Interpretable Treatment Regimes May 4 th 2018 1 / 16 Why Interpretable Methods?

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

Robustifying Trial-Derived Treatment Rules to a Target Population

Robustifying Trial-Derived Treatment Rules to a Target Population 1/ 39 Robustifying Trial-Derived Treatment Rules to a Target Population Yingqi Zhao Public Health Sciences Division Fred Hutchinson Cancer Research Center Workshop on Perspectives and Analysis for Personalized

More information

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007) Double Robustness Bang and Robins (2005) Kang and Schafer (2007) Set-Up Assume throughout that treatment assignment is ignorable given covariates (similar to assumption that data are missing at random

More information

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai Weighting Methods Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Weighting Methods Stat186/Gov2002 Fall 2018 1 / 13 Motivation Matching methods for improving

More information

Covariate Balancing Propensity Score for General Treatment Regimes

Covariate Balancing Propensity Score for General Treatment Regimes Covariate Balancing Propensity Score for General Treatment Regimes Kosuke Imai Princeton University October 14, 2014 Talk at the Department of Psychiatry, Columbia University Joint work with Christian

More information

Recent Advances in Outcome Weighted Learning for Precision Medicine

Recent Advances in Outcome Weighted Learning for Precision Medicine Recent Advances in Outcome Weighted Learning for Precision Medicine Michael R. Kosorok Department of Biostatistics Department of Statistics and Operations Research University of North Carolina at Chapel

More information

SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN

SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN Ying Liu Division of Biostatistics, Medical College of Wisconsin Yuanjia Wang Department of Biostatistics & Psychiatry,

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Lecture 2: Constant Treatment Strategies. Donglin Zeng, Department of Biostatistics, University of North Carolina

Lecture 2: Constant Treatment Strategies. Donglin Zeng, Department of Biostatistics, University of North Carolina Lecture 2: Constant Treatment Strategies Introduction Motivation We will focus on evaluating constant treatment strategies in this lecture. We will discuss using randomized or observational study for these

More information

Sample size considerations for precision medicine

Sample size considerations for precision medicine Sample size considerations for precision medicine Eric B. Laber Department of Statistics, North Carolina State University November 28 Acknowledgments Thanks to Lan Wang NCSU/UNC A and Precision Medicine

More information

TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES. University of Michigan, Ann Arbor

TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES. University of Michigan, Ann Arbor Submitted to the Annals of Applied Statistics TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES BY YEBIN TAO, LU WANG AND DANIEL ALMIRALL University of Michigan, Ann Arbor

More information

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal

More information

Adaptive Trial Designs

Adaptive Trial Designs Adaptive Trial Designs Wenjing Zheng, Ph.D. Methods Core Seminar Center for AIDS Prevention Studies University of California, San Francisco Nov. 17 th, 2015 Trial Design! Ethical:!eg.! Safety!! Efficacy!

More information

Lecture 9: Learning Optimal Dynamic Treatment Regimes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Lecture 9: Learning Optimal Dynamic Treatment Regimes. Donglin Zeng, Department of Biostatistics, University of North Carolina Lecture 9: Learning Optimal Dynamic Treatment Regimes Introduction Refresh: Dynamic Treatment Regimes (DTRs) DTRs: sequential decision rules, tailored at each stage by patients time-varying features and

More information

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,

More information

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Modification and Improvement of Empirical Likelihood for Missing Response Problem UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Set-valued dynamic treatment regimes for competing outcomes

Set-valued dynamic treatment regimes for competing outcomes Set-valued dynamic treatment regimes for competing outcomes Eric B. Laber Department of Statistics, North Carolina State University JSM, Montreal, QC, August 5, 2013 Acknowledgments Zhiqiang Tan Jamie

More information

Combining multiple observational data sources to estimate causal eects

Combining multiple observational data sources to estimate causal eects Department of Statistics, North Carolina State University Combining multiple observational data sources to estimate causal eects Shu Yang* syang24@ncsuedu Joint work with Peng Ding UC Berkeley May 23,

More information

High Dimensional Propensity Score Estimation via Covariate Balancing

High Dimensional Propensity Score Estimation via Covariate Balancing High Dimensional Propensity Score Estimation via Covariate Balancing Kosuke Imai Princeton University Talk at Columbia University May 13, 2017 Joint work with Yang Ning and Sida Peng Kosuke Imai (Princeton)

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

Supplement to Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes

Supplement to Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes Submitted to the Statistical Science Supplement to - and -learning Methods for Estimating Optimal Dynamic Treatment Regimes Phillip J. Schulte, nastasios. Tsiatis, Eric B. Laber, and Marie Davidian bstract.

More information

Group Sequential Designs: Theory, Computation and Optimisation

Group Sequential Designs: Theory, Computation and Optimisation Group Sequential Designs: Theory, Computation and Optimisation Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 8th International Conference

More information

Semiparametric Regression and Machine Learning Methods for Estimating Optimal Dynamic Treatment Regimes

Semiparametric Regression and Machine Learning Methods for Estimating Optimal Dynamic Treatment Regimes Semiparametric Regression and Machine Learning Methods for Estimating Optimal Dynamic Treatment Regimes by Yebin Tao A dissertation submitted in partial fulfillment of the requirements for the degree of

More information

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3 University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.

More information

Estimating Optimal Dynamic Treatment Regimes from Clustered Data

Estimating Optimal Dynamic Treatment Regimes from Clustered Data Estimating Optimal Dynamic Treatment Regimes from Clustered Data Bibhas Chakraborty Department of Biostatistics, Columbia University bc2425@columbia.edu Society for Clinical Trials Annual Meetings Boston,

More information

Comparative effectiveness of dynamic treatment regimes

Comparative effectiveness of dynamic treatment regimes Comparative effectiveness of dynamic treatment regimes An application of the parametric g- formula Miguel Hernán Departments of Epidemiology and Biostatistics Harvard School of Public Health www.hsph.harvard.edu/causal

More information

arxiv: v1 [stat.me] 15 May 2011

arxiv: v1 [stat.me] 15 May 2011 Working Paper Propensity Score Analysis with Matching Weights Liang Li, Ph.D. arxiv:1105.2917v1 [stat.me] 15 May 2011 Associate Staff of Biostatistics Department of Quantitative Health Sciences, Cleveland

More information

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials

Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Progress, Updates, Problems William Jen Hoe Koh May 9, 2013 Overview Marginal vs Conditional What is TMLE? Key Estimation

More information

Q learning. A data analysis method for constructing adaptive interventions

Q learning. A data analysis method for constructing adaptive interventions Q learning A data analysis method for constructing adaptive interventions SMART First stage intervention options coded as 1(M) and 1(B) Second stage intervention options coded as 1(M) and 1(B) O1 A1 O2

More information

5 Methods Based on Inverse Probability Weighting Under MAR

5 Methods Based on Inverse Probability Weighting Under MAR 5 Methods Based on Inverse Probability Weighting Under MAR The likelihood-based and multiple imputation methods we considered for inference under MAR in Chapters 3 and 4 are based, either directly or indirectly,

More information

Causal Inference Basics

Causal Inference Basics Causal Inference Basics Sam Lendle October 09, 2013 Observed data, question, counterfactuals Observed data: n i.i.d copies of baseline covariates W, treatment A {0, 1}, and outcome Y. O i = (W i, A i,

More information

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum

More information

Integrated approaches for analysis of cluster randomised trials

Integrated approaches for analysis of cluster randomised trials Integrated approaches for analysis of cluster randomised trials Invited Session 4.1 - Recent developments in CRTs Joint work with L. Turner, F. Li, J. Gallis and D. Murray Mélanie PRAGUE - SCT 2017 - Liverpool

More information

arxiv: v1 [math.st] 29 Jul 2014

arxiv: v1 [math.st] 29 Jul 2014 On Estimation of Optimal Treatment Regimes For Maximizing t-year Survival Probability arxiv:1407.7820v1 [math.st] 29 Jul 2014 Runchao Jiang 1, Wenbin Lu, Rui Song, and Marie Davidian North Carolina State

More information

Discussion of Papers on the Extensions of Propensity Score

Discussion of Papers on the Extensions of Propensity Score Discussion of Papers on the Extensions of Propensity Score Kosuke Imai Princeton University August 3, 2010 Kosuke Imai (Princeton) Generalized Propensity Score 2010 JSM (Vancouver) 1 / 11 The Theme and

More information

Improved Doubly Robust Estimation when Data are Monotonely Coarsened, with Application to Longitudinal Studies with Dropout

Improved Doubly Robust Estimation when Data are Monotonely Coarsened, with Application to Longitudinal Studies with Dropout Biometrics 64, 1 23 December 2009 DOI: 10.1111/j.1541-0420.2005.00454.x Improved Doubly Robust Estimation when Data are Monotonely Coarsened, with Application to Longitudinal Studies with Dropout Anastasios

More information

Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders

Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders Daniel Almirall, Xi Lu, Connie Kasari, Inbal N-Shani, Univ. of Michigan, Univ. of

More information

Bootstrapping Sensitivity Analysis

Bootstrapping Sensitivity Analysis Bootstrapping Sensitivity Analysis Qingyuan Zhao Department of Statistics, The Wharton School University of Pennsylvania May 23, 2018 @ ACIC Based on: Qingyuan Zhao, Dylan S. Small, and Bhaswar B. Bhattacharya.

More information

A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes

A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes Thomas A. Murray, (tamurray@mdanderson.org), Ying Yuan, (yyuan@mdanderson.org), and Peter F. Thall (rex@mdanderson.org) Department

More information

SIMPLE EXAMPLES OF ESTIMATING CAUSAL EFFECTS USING TARGETED MAXIMUM LIKELIHOOD ESTIMATION

SIMPLE EXAMPLES OF ESTIMATING CAUSAL EFFECTS USING TARGETED MAXIMUM LIKELIHOOD ESTIMATION Johns Hopkins University, Dept. of Biostatistics Working Papers 3-3-2011 SIMPLE EXAMPLES OF ESTIMATING CAUSAL EFFECTS USING TARGETED MAXIMUM LIKELIHOOD ESTIMATION Michael Rosenblum Johns Hopkins Bloomberg

More information

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity P. Richard Hahn, Jared Murray, and Carlos Carvalho June 22, 2017 The problem setting We want to estimate

More information

Deductive Derivation and Computerization of Semiparametric Efficient Estimation

Deductive Derivation and Computerization of Semiparametric Efficient Estimation Deductive Derivation and Computerization of Semiparametric Efficient Estimation Constantine Frangakis, Tianchen Qian, Zhenke Wu, and Ivan Diaz Department of Biostatistics Johns Hopkins Bloomberg School

More information

Targeted Maximum Likelihood Estimation in Safety Analysis

Targeted Maximum Likelihood Estimation in Safety Analysis Targeted Maximum Likelihood Estimation in Safety Analysis Sam Lendle 1 Bruce Fireman 2 Mark van der Laan 1 1 UC Berkeley 2 Kaiser Permanente ISPE Advanced Topics Session, Barcelona, August 2012 1 / 35

More information

Package Rsurrogate. October 20, 2016

Package Rsurrogate. October 20, 2016 Type Package Package Rsurrogate October 20, 2016 Title Robust Estimation of the Proportion of Treatment Effect Explained by Surrogate Marker Information Version 2.0 Date 2016-10-19 Author Layla Parast

More information

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative

More information

This is the submitted version of the following book chapter: stat08068: Double robustness, which will be

This is the submitted version of the following book chapter: stat08068: Double robustness, which will be This is the submitted version of the following book chapter: stat08068: Double robustness, which will be published in its final form in Wiley StatsRef: Statistics Reference Online (http://onlinelibrary.wiley.com/book/10.1002/9781118445112)

More information

Longitudinal + Reliability = Joint Modeling

Longitudinal + Reliability = Joint Modeling Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,

More information

7 Sensitivity Analysis

7 Sensitivity Analysis 7 Sensitivity Analysis A recurrent theme underlying methodology for analysis in the presence of missing data is the need to make assumptions that cannot be verified based on the observed data. If the assumption

More information

e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls

e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls under the restrictions of the copyright, in particular

More information

The Supervised Learning Approach To Estimating Heterogeneous Causal Regime Effects

The Supervised Learning Approach To Estimating Heterogeneous Causal Regime Effects The Supervised Learning Approach To Estimating Heterogeneous Causal Regime Effects Thai T. Pham Stanford Graduate School of Business thaipham@stanford.edu May, 2016 Introduction Observations Many sequential

More information

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs Michael J. Daniels and Chenguang Wang Jan. 18, 2009 First, we would like to thank Joe and Geert for a carefully

More information

A Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs

A Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs 1 / 32 A Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs Tony Zhong, DrPH Icahn School of Medicine at Mount Sinai (Feb 20, 2019) Workshop on Design of mhealth Intervention

More information

Quantile-Optimal Treatment Regimes

Quantile-Optimal Treatment Regimes Quantile-Optimal Treatment Regimes Lan Wang, Yu Zhou, Rui Song and Ben Sherwood Abstract Finding the optimal treatment regime (or a series of sequential treatment regimes) based on individual characteristics

More information

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU

More information

Achieving Optimal Covariate Balance Under General Treatment Regimes

Achieving Optimal Covariate Balance Under General Treatment Regimes Achieving Under General Treatment Regimes Marc Ratkovic Princeton University May 24, 2012 Motivation For many questions of interest in the social sciences, experiments are not possible Possible bias in

More information

Propensity Score Weighting with Multilevel Data

Propensity Score Weighting with Multilevel Data Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative

More information

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

arxiv: v2 [stat.me] 17 Jan 2017

arxiv: v2 [stat.me] 17 Jan 2017 Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable arxiv:1607.03197v2 [stat.me] 17 Jan 2017 BaoLuo Sun 1, Lan Liu 1, Wang Miao 1,4, Kathleen Wirth 2,3, James Robins

More information

Personalized Treatment Selection Based on Randomized Clinical Trials. Tianxi Cai Department of Biostatistics Harvard School of Public Health

Personalized Treatment Selection Based on Randomized Clinical Trials. Tianxi Cai Department of Biostatistics Harvard School of Public Health Personalized Treatment Selection Based on Randomized Clinical Trials Tianxi Cai Department of Biostatistics Harvard School of Public Health Outline Motivation A systematic approach to separating subpopulations

More information

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Linear Classifiers. Blaine Nelson, Tobias Scheffer

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Linear Classifiers. Blaine Nelson, Tobias Scheffer Universität Potsdam Institut für Informatik Lehrstuhl Linear Classifiers Blaine Nelson, Tobias Scheffer Contents Classification Problem Bayesian Classifier Decision Linear Classifiers, MAP Models Logistic

More information

Building a Prognostic Biomarker

Building a Prognostic Biomarker Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University July 31, 2018

More information

ECO Class 6 Nonparametric Econometrics

ECO Class 6 Nonparametric Econometrics ECO 523 - Class 6 Nonparametric Econometrics Carolina Caetano Contents 1 Nonparametric instrumental variable regression 1 2 Nonparametric Estimation of Average Treatment Effects 3 2.1 Asymptotic results................................

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Targeted Group Sequential Adaptive Designs

Targeted Group Sequential Adaptive Designs Targeted Group Sequential Adaptive Designs Mark van der Laan Department of Biostatistics, University of California, Berkeley School of Public Health Liver Forum, May 10, 2017 Targeted Group Sequential

More information

Causal Inference for Mediation Effects

Causal Inference for Mediation Effects Causal Inference for Mediation Effects by Jing Zhang B.S., University of Science and Technology of China, 2006 M.S., Brown University, 2008 A Dissertation Submitted in Partial Fulfillment of the Requirements

More information

Causal modelling in Medical Research

Causal modelling in Medical Research Causal modelling in Medical Research Debashis Ghosh Department of Biostatistics and Informatics, Colorado School of Public Health Biostatistics Workshop Series Goals for today Introduction to Potential

More information

Multi-state Models: An Overview

Multi-state Models: An Overview Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed

More information

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Probabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier

More information

Adaptive Designs: Why, How and When?

Adaptive Designs: Why, How and When? Adaptive Designs: Why, How and When? Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj ISBS Conference Shanghai, July 2008 1 Adaptive designs:

More information

2 Naïve Methods. 2.1 Complete or available case analysis

2 Naïve Methods. 2.1 Complete or available case analysis 2 Naïve Methods Before discussing methods for taking account of missingness when the missingness pattern can be assumed to be MAR in the next three chapters, we review some simple methods for handling

More information

Stat 642, Lecture notes for 04/12/05 96

Stat 642, Lecture notes for 04/12/05 96 Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal

More information

Topics and Papers for Spring 14 RIT

Topics and Papers for Spring 14 RIT Eric Slud Feb. 3, 204 Topics and Papers for Spring 4 RIT The general topic of the RIT is inference for parameters of interest, such as population means or nonlinearregression coefficients, in the presence

More information

A short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie

A short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie A short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie Computational Biology Program Memorial Sloan-Kettering Cancer Center http://cbio.mskcc.org/leslielab

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Class 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio

Class 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant

More information

An Introduction to Causal Analysis on Observational Data using Propensity Scores

An Introduction to Causal Analysis on Observational Data using Propensity Scores An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Variable selection and machine learning methods in causal inference

Variable selection and machine learning methods in causal inference Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of

More information

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect

More information

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping : Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj InSPiRe Conference on Methodology

More information

DATA-ADAPTIVE VARIABLE SELECTION FOR

DATA-ADAPTIVE VARIABLE SELECTION FOR DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

Classification. Chapter Introduction. 6.2 The Bayes classifier

Classification. Chapter Introduction. 6.2 The Bayes classifier Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode

More information

Propensity Score Methods for Causal Inference

Propensity Score Methods for Causal Inference John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good

More information

Global Sensitivity Analysis for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach

Global Sensitivity Analysis for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach Global for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach Daniel Aidan McDermott Ivan Diaz Johns Hopkins University Ibrahim Turkoz Janssen Research and Development September

More information