Lecture 9: Learning Optimal Dynamic Treatment Regimes. Donglin Zeng, Department of Biostatistics, University of North Carolina
1 Lecture 9: Learning Optimal Dynamic Treatment Regimes
2 Introduction
3 Refresh: Dynamic Treatment Regimes (DTRs) DTRs: sequential decision rules, tailored at each stage by patients' time-varying features and intermediate outcomes in previous stages (Lavori & Dawson 1998, Lavori et al. 2000, Murphy et al. 2001). Used in cancer, psychiatry, and substance abuse research. Examples of DTRs: Adaptive Pharmacological and Behavioral Treatments for Children with Attention Deficit Hyperactivity Disorder (ADHD, Pelham 2002). DTR1: Prescribe medication (MED) as initial treatment; if a child responds then continue; if a child does not respond then augment with behavioral modification (BMOD). DTR2: Prescribe BMOD as initial treatment; if a child responds then continue; if a child does not respond then augment with MED.
6 Existing Methods
7 General Multi-Stage DTR notation $A_k$: treatment at stage $k$, taking values in $\{-1, 1\}$. $H_k$: historical information at stage $k$. $R_k$: reward at stage $k$. DTR: a sequence of decision functions $D = (D_1, D_2, \ldots, D_K)$ that maps the historical information domain $(H_1, H_2, \ldots, H_K)$ to the treatment domain, a vector in $\{-1, 1\}^K$.
8 Value Function and Optimal DTR The value function associated with $D$ is the expected total reward if $D$ is actually implemented: $V(D) = E^D(R_1 + \cdots + R_K)$. Optimal DTR: $D^* = \arg\max_D V(D)$. Key relationship based on SMART designs: $V(D)$ is $$E\left[\frac{I(A_1 = D_1(H_1), A_2 = D_2(H_2), \ldots)\,(R_1 + R_2 + \cdots)}{P(A_1 \mid H_1)\,P(A_2 \mid H_2)\cdots}\right].$$
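The inverse-probability-weighted identity for the value function can be checked numerically. The following is a minimal sketch, not the lecture's code: the two-stage data-generating model, the regime being evaluated, and the function name `ipw_value` are all assumptions made for illustration, with both treatments randomized with probability 1/2 as in a SMART.

```python
import numpy as np

def ipw_value(total_reward, followed, propensity):
    """IPW estimate of V(D): mean of R * I(A_k = D_k(H_k) for all k) / prod_k P(A_k | H_k)."""
    total_reward = np.asarray(total_reward, dtype=float)
    followed = np.asarray(followed, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    return float(np.mean(total_reward * followed / propensity))

# Hypothetical two-stage SMART: both treatments randomized with probability 1/2.
rng = np.random.default_rng(0)
n = 20000
x1 = rng.normal(size=n)
a1 = rng.choice([-1, 1], size=n)
r1 = x1 * a1 + rng.normal(size=n)
x2 = rng.normal(size=n)
a2 = rng.choice([-1, 1], size=n)
r2 = x2 * a2 + rng.normal(size=n)

# Regime under evaluation: treat according to the sign of the current covariate.
d1 = np.where(x1 > 0, 1, -1)
d2 = np.where(x2 > 0, 1, -1)
followed = (a1 == d1) & (a2 == d2)
propensity = np.full(n, 0.5 * 0.5)  # product of the two randomization probabilities

value_hat = ipw_value(r1 + r2, followed, propensity)
true_value = 2 * np.sqrt(2 / np.pi)  # E|X1| + E|X2| for standard normals
print(f"IPW estimate: {value_hat:.3f}, true value: {true_value:.3f}")
```

Only about a quarter of the simulated subjects follow the regime at both stages, which is why the weights are large and the estimate is noisy even in this simple setting.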
9 Existing Methods for Learning DTRs Dynamic modelling of clinical outcomes: G-computation, Monte Carlo simulation, Bayesian approaches (Lavori & Dawson 2004; Wathen & Thall 2008). Sequential modelling of Q-functions (expected individual outcome given the best treatment in prospect): Q-learning, A-learning or doubly robust regression models (Murphy et al. 2006; Robins 2004). Sequential maximization of value functions (maximal expected benefit for a given treatment strategy): O-learning (Zhao et al. 2012, 2014). We focus on the last two methods.
10 Q-learning: two-stage example Data: $(H_1, A_1, R_1, H_2, A_2, R_2)$, where H: state; A: treatment; R: reward. Goal: maximize $R_1 + R_2$ by estimating the best treatment at each stage. Q-learning uses backward induction: Fit a regression model for the second-stage outcome and compare its expected outcome under the two treatments. Pick the treatment with the larger expected outcome. Imputation: create a pseudo second-stage outcome $\tilde R_2$ as the maximum across the two treatments from the above regression model. Fit a regression model whose response is $R_1 + \tilde R_2$. Pick the stage-1 treatment maximizing the regression expected value for a given set of baseline variables.
11 Q-learning: algorithm At stage 2 (no future), we fit $Q_2(H_2, A_2) = E[R_2 \mid H_2, A_2]$, then estimate $\hat D_2 = \arg\max_{a \in \{-1,1\}} Q_2(H_2, a)$. At stage 1, we obtain the individual optimal future reward as $\tilde R_1 = R_1 + \max_{a \in \{-1,1\}} Q_2(H_2, a)$, so we estimate $Q_1(H_1, A_1)$ as $E[\tilde R_1 \mid H_1, A_1]$. We obtain $\hat D_1 = \arg\max_{a \in \{-1,1\}} Q_1(H_1, a)$.
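The backward-induction steps can be sketched with ordinary least squares as the working regression. This is an illustrative sketch under stated assumptions: the linear Q-models with treatment interactions, the simulated data, and the helper names `design`, `fit_q` and `predict_q` are hypothetical choices, not taken from the lecture.

```python
import numpy as np

def design(history, treatment):
    """Columns [1, h, a, a*h]: main effects plus treatment interactions."""
    base = np.column_stack([np.ones(len(history)), history])
    return np.hstack([base, treatment[:, None] * base])

def fit_q(history, treatment, outcome):
    """Least-squares fit of a linear working model for Q(h, a)."""
    coef, *_ = np.linalg.lstsq(design(history, treatment), outcome, rcond=None)
    return coef

def predict_q(coef, history, action):
    """Predicted Q(h, a) when every subject receives the same action a."""
    treatment = np.full(len(history), action)
    return design(history, treatment) @ coef

# Hypothetical two-stage data with randomized treatments in {-1, 1}.
rng = np.random.default_rng(1)
n = 2000
x1 = rng.normal(size=(n, 1))
a1 = rng.choice([-1, 1], size=n)
r1 = x1[:, 0] * a1 + rng.normal(size=n)
x2 = rng.normal(size=(n, 1))
a2 = rng.choice([-1, 1], size=n)
r2 = x2[:, 0] * a2 + rng.normal(size=n)
h1 = x1
h2 = np.column_stack([x1, a1, r1, x2])  # stage-2 history includes stage-1 information

# Stage 2: regress R2 on (H2, A2) and compare the two treatments.
coef2 = fit_q(h2, a2, r2)
q2_plus, q2_minus = predict_q(coef2, h2, 1), predict_q(coef2, h2, -1)
d2_hat = np.where(q2_plus > q2_minus, 1, -1)

# Stage 1: pseudo-outcome = R1 + max_a Q2(H2, a), then regress on (H1, A1).
pseudo = r1 + np.maximum(q2_plus, q2_minus)
coef1 = fit_q(h1, a1, pseudo)
d1_hat = np.where(predict_q(coef1, h1, 1) > predict_q(coef1, h1, -1), 1, -1)

# In this simulation the optimal rules are sign(x1) and sign(x2).
agree1 = np.mean(d1_hat == np.sign(x1[:, 0]))
agree2 = np.mean(d2_hat == np.sign(x2[:, 0]))
print(f"stage-1 agreement: {agree1:.3f}, stage-2 agreement: {agree2:.3f}")
```

The working models happen to be correctly specified here, which is the favorable case for Q-learning; with a misspecified regression the estimated rules need not maximize the value.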
12 Pros and Cons Pros: Each step is a regression analysis. Makes use of all the subjects. Cons: Regression models may be misspecified. The objective function is for model fitting, not directly for value maximization.
13 Extension Single-Stage O-Learning
14 O-learning: single stage Directly maximize the value function (Zhao et al. 2012): $$V(D) = E^D(R) = E\left[\frac{R\, I(A = D(H))}{P(A \mid H)}\right].$$ Interpretation: for subjects with high rewards, we want $D(H)$ to be the same as the assigned treatment; for subjects with low rewards, we may want $D(H)$ to be the opposite of the assigned treatment. O-learning is a weighted classification problem with the outcome as weights (classification tree, SVM).
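Maximizing the weighted value is equivalent to minimizing the reward-weighted misclassification of the assigned treatment. The sketch below illustrates that with the simplest possible classifier, a one-variable threshold rule searched over a grid; the stump class, the simulated rewards (kept positive so they can serve as weights), and the function names are assumptions for illustration, whereas the slide mentions classification trees and SVMs.

```python
import numpy as np

def weighted_misclassification(reward, treatment, rule, propensity):
    """Empirical E[ R * I(A != D(H)) / P(A | H) ]; minimizing it maximizes the IPW value."""
    return float(np.mean(reward * (treatment != rule) / propensity))

def fit_stump_rule(h, treatment, reward, propensity, thresholds):
    """Search rules D(h) = s * sign(h - c) over thresholds c and orientations s."""
    best = None
    for c in thresholds:
        for s in (1, -1):
            rule = s * np.where(h > c, 1, -1)
            loss = weighted_misclassification(reward, treatment, rule, propensity)
            if best is None or loss < best[0]:
                best = (loss, float(c), s)
    return best[1], best[2]

# Hypothetical single-stage trial with equal randomization.
rng = np.random.default_rng(2)
n = 5000
h = rng.uniform(-1, 1, size=n)
a = rng.choice([-1, 1], size=n)
# Treatment 1 helps when h > 0.3 and hurts otherwise; rewards stay positive.
r = 1.0 + 0.5 * a * np.where(h > 0.3, 1, -1) + rng.uniform(-0.2, 0.2, size=n)

c_hat, s_hat = fit_stump_rule(h, a, r, 0.5, np.linspace(-0.9, 0.9, 37))
print(f"estimated rule: D(h) = {s_hat} * sign(h - {c_hat:.2f})")
```

Subjects with large rewards pull the rule toward their assigned treatment, while subjects with small rewards carry little weight, which is the interpretation given on the slide.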
15 O-learning: multiple stages A backward algorithm (Zhao et al. 2014): At stage 2, apply single-stage O-learning to estimate $\hat D_2$. For stage 1, only keep the subjects whose observed treatment is the same as the optimal one, $A_2 = \hat D_2(H_2)$. For this subgroup of patients, apply single-stage O-learning to estimate $\hat D_1$.
16 Pros and Cons Pros: Directly maximizes the value function for optimal treatments. Uses only the subjects who actually follow the optimal regime in future stages, so it is robust. Cons: Need to handle negative weights (Zhao et al. recommend subtracting a small constant). Highly variable weights may affect performance. Discards a significant proportion of subjects in the backward procedure.
17 Improve O-Learning via Augmentation
18 New Approach: AMOL Augmented Multistage O-Learning (AMOL): based on a backward O-learning but with three novel improvements. Improvement 1: we aim to reduce the variability of weights. Improvement 2: we can handle negative weights. Improvement 3: we utilize all the subjects including those who may not take optimal treatments in future stages.
19 Improvement 1: Fitting residuals to reduce variability Fit a regression model $s(H)$ of $R_i$ on $H_i$. Change the weights from the observed outcome $R_i$ to the residual $R_i - s(H_i)$. $$V(D) = E\left[\frac{R\, I(A = D(H))}{P(A \mid H)}\right] = E\left[\frac{(R - s(H))\, I(A = D(H))}{P(A \mid H)}\right] + E[s(H)].$$
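The reason the residual weights leave the optimal rule unchanged is that the term removed does not depend on $D$. A short derivation, assuming only that $A$ takes values in $\{-1, 1\}$ and that $P(A = a \mid H) > 0$ for both values:

```latex
\begin{align*}
E\left[\frac{s(H)\, I(A = D(H))}{P(A \mid H)}\right]
  &= E\left[ s(H) \sum_{a \in \{-1, 1\}} P(A = a \mid H)\,
       \frac{I(a = D(H))}{P(A = a \mid H)} \right] \\
  &= E\left[ s(H) \sum_{a \in \{-1, 1\}} I(a = D(H)) \right]
   = E[s(H)].
\end{align*}
```

The first equality conditions on $H$; the last holds because exactly one of the two treatments equals $D(H)$. Any choice of $s$ therefore gives the same maximizer, and a well-fitting $s$ simply makes the weights smaller and less variable.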
20 Improvement 2: Accommodate negative weights Note $$\arg\max_D E\left[\frac{R\, I(A = D(H))}{P(A \mid H)}\right] = \arg\max_D E\left[\frac{|R|\, I(A\,\mathrm{sign}(R) = D(H))}{P(A \mid H)}\right].$$ When $R_i > 0$, the desirable rule for large $R_i$ is $D(H_i) = A_i$. When $R_i < 0$, the desirable rule for large $|R_i|$ is $D(H_i) = -A_i$.
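The equality of the two maximizers follows from a pointwise identity. As a sketch, write $R^{+} = \max(R, 0)$ and $R^{-} = \max(-R, 0)$ (notation introduced here, not on the slide), and use the fact that for $A, D(H) \in \{-1, 1\}$ exactly one of $A = D(H)$ and $-A = D(H)$ holds:

```latex
\begin{align*}
R\, I(A = D(H))
  &= R^{+} I(A = D(H)) - R^{-} I(A = D(H)) \\
  &= R^{+} I(A = D(H)) - R^{-}\bigl(1 - I(-A = D(H))\bigr) \\
  &= |R|\, I\bigl(A\,\mathrm{sign}(R) = D(H)\bigr) - R^{-}.
\end{align*}
```

After dividing by $P(A \mid H)$ and taking expectations, the term involving $R^{-}$ does not depend on $D$, so both objectives are maximized by the same rule while the new weights $|R|$ are nonnegative.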
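21 Improved O-learning using surrogate loss: Weighted classification problem: $$\hat f = \arg\min_f\; n^{-1} \sum_{i=1}^n \bigl(1 - \mathrm{sign}(r_i) A_i f(H_i)\bigr)_+ \frac{|r_i|}{\pi_i} + \lambda \|f\|^2,$$ where $\hat D(h) = \mathrm{sign}(\hat f(h))$, $f(x) = \beta^T x + \beta_0$, and $\|f\|^2 = \|\beta\|^2$.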
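The weighted hinge-loss problem with a linear decision function can be solved in many ways. The sketch below uses plain subgradient descent purely for illustration; the optimizer, step-size schedule, simulated data, and the name `fit_weighted_hinge` are assumptions, and a production analysis would more likely use a dedicated SVM solver.

```python
import numpy as np

def fit_weighted_hinge(h, a, r, propensity, lam=0.01, n_iter=500, step=0.5):
    """Subgradient descent on
    mean((1 - sign(r) * a * f(h))_+ * |r| / propensity) + lam * ||beta||^2,
    with f(h) = h @ beta + beta0."""
    n, p = h.shape
    label = np.sign(r) * a            # flip the label when the weight is negative
    weight = np.abs(r) / propensity
    beta, beta0 = np.zeros(p), 0.0
    for t in range(n_iter):
        margin = label * (h @ beta + beta0)
        active = (margin < 1).astype(float)  # points contributing to the hinge
        g = -(weight * label * active)
        grad_beta = h.T @ g / n + 2 * lam * beta
        grad_beta0 = g.mean()
        lr = step / np.sqrt(t + 1)           # decaying step size
        beta -= lr * grad_beta
        beta0 -= lr * grad_beta0
    return beta, beta0

# Hypothetical data: the outcome takes both signs, as a residual would.
rng = np.random.default_rng(3)
n = 2000
h = rng.normal(size=(n, 2))
a = rng.choice([-1, 1], size=n)
r = a * h[:, 0] + 0.5 * rng.normal(size=n)

beta, beta0 = fit_weighted_hinge(h, a, r, propensity=0.5)
rule = np.where(h @ beta + beta0 > 0, 1, -1)
agreement = np.mean(rule == np.sign(h[:, 0]))  # optimal rule here is sign(h[:, 0])
print(f"beta = {beta.round(2)}, beta0 = {beta0:.2f}, agreement = {agreement:.3f}")
```

Flipping the label by the sign of the weight and using its absolute value keeps the problem convex, which is what makes the negative weights harmless.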
22 Improvement 3: Use all subjects at each stage Ideas: At each stage, O-learning requires knowing the incremental reward for each subject, i.e., the future reward if they are treated optimally. For subjects who actually take non-optimal treatments, their future value increment is missing. Augmentation techniques for missing data can be used. The augmentation needs the imputation of the incremental reward for these subjects: the models in Q-learning provide a natural imputation. Therefore, this approach integrates O- and Q-learning.
23 Augmented Inverse Probability Weighted Estimation AIPW in the missing data literature (Robins et al. 1994, Robins 1999): Estimate $\mu$, the mean of the $Y_i$. Some $Y_i$ are missing; $Z_i = I(Y_i \text{ is observed})$. The $H_i$ are predictors. If either of the two parametric models $\mu(H, \gamma_1) = E(Y \mid H)$, $\pi(H, \gamma_3) = P(Z = 1 \mid H)$ is correctly specified, the estimator $\hat\mu$ is consistent. If both are correct, it is most efficient. $$\hat\mu = n^{-1} \sum_{i=1}^n \left[ \frac{Z_i Y_i}{\pi(H_i, \hat\gamma_3)} - \frac{Z_i - \pi(H_i, \hat\gamma_3)}{\pi(H_i, \hat\gamma_3)}\, \mu(H_i, \hat\gamma_1) \right].$$
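The double-robustness property can be seen in a small simulation. In this sketch the data-generating model and the name `aipw_mean` are assumptions, and for brevity the "fitted" models are plugged in directly rather than estimated: once with the true missingness probability and a deliberately wrong outcome model, and once with a wrong missingness probability and the true outcome model.

```python
import numpy as np

def aipw_mean(y, z, pi_hat, mu_hat):
    """AIPW estimate of E[Y]: mean( Z*Y/pi - (Z - pi)/pi * mu ).
    Entries of y with z == 0 are ignored, so they may be NaN."""
    y_obs = np.where(z == 1, y, 0.0)
    return float(np.mean(z * y_obs / pi_hat - (z - pi_hat) / pi_hat * mu_hat))

rng = np.random.default_rng(4)
n = 20000
h = rng.normal(size=n)
y = 1.0 + 2.0 * h + rng.normal(size=n)      # true mean of Y is 1
pi_true = 1.0 / (1.0 + np.exp(-(0.5 + h)))  # P(Z = 1 | H)
z = (rng.random(n) < pi_true).astype(int)
y = np.where(z == 1, y, np.nan)             # unobserved outcomes are masked

naive = float(np.nanmean(y))                # complete-case mean, biased upward here
right_pi_wrong_mu = aipw_mean(y, z, pi_true, np.zeros(n))
wrong_pi_right_mu = aipw_mean(y, z, np.full(n, 0.5), 1.0 + 2.0 * h)
print(f"naive: {naive:.3f}")
print(f"correct pi, wrong mu: {right_pi_wrong_mu:.3f}")
print(f"wrong pi, correct mu: {wrong_pi_right_mu:.3f}")
```

Both augmented estimates should land near the true mean of 1, while the complete-case mean does not, because subjects with larger $H$ are both more likely to be observed and have larger outcomes.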
24 Augmented Multistage O-learning Algorithm AMOL Complete Algorithm At stage 2, with $r_2 = R_2 - s_2(H_2)$, we minimize $$E\left[\frac{|r_2|\, I(\mathrm{sign}(r_2) A_2 \ne D_2(H_2))}{P(A_2 \mid H_2)}\right]$$ using O-learning to obtain $\hat D_2$. At stage 2, fit a Q-learning model to compute $\tilde R_2(h) = \max_{a \in \{-1,1\}} E[R_2 \mid A_2 = a, H_2 = h]$.
25 Augmented Multistage O-learning Algorithm AMOL Complete Algorithm (continued) Compute the augmented incremental reward at stage 2: $$\tilde Q_2 = \frac{I(A_2 = \hat D_2(H_2))}{P(A_2 \mid H_2)}\, R_2 - \frac{I(A_2 = \hat D_2(H_2)) - P(A_2 \mid H_2)}{P(A_2 \mid H_2)}\, \tilde R_2(H_2).$$ At stage 1, calculate $r_1 = R_1 + \tilde Q_2 - s_1(H_1)$, then minimize $$E\left[\frac{|r_1|\, I(\mathrm{sign}(r_1) A_1 \ne D_1(H_1))}{P(A_1 \mid H_1)}\right].$$
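The augmented stage-2 reward has the same form as the AIPW estimator, with "followed the estimated optimal stage-2 rule" playing the role of "observed". The sketch below computes it on simulated data and compares it with the unaugmented inverse-probability-weighted version; the data, the stage-2 rule, the Q-model prediction, and the function name are all assumptions chosen so that the target value is known.

```python
import numpy as np

def augmented_stage2_reward(r2, a2, d2_hat, propensity, r2_opt_pred):
    """I(A2 = D2)/P * R2  -  (I(A2 = D2) - P)/P * predicted optimal stage-2 reward."""
    followed = (a2 == d2_hat).astype(float)
    return followed / propensity * r2 - (followed - propensity) / propensity * r2_opt_pred

rng = np.random.default_rng(5)
n = 20000
x2 = rng.normal(size=n)
a2 = rng.choice([-1, 1], size=n)
r2 = x2 * a2 + rng.normal(size=n)

d2_hat = np.where(x2 > 0, 1, -1)  # assumed stage-2 rule (here the true optimum)
r2_opt_pred = np.abs(x2)          # assumed Q-model prediction of max_a E[R2 | a, h]

augmented = augmented_stage2_reward(r2, a2, d2_hat, 0.5, r2_opt_pred)
ipw_only = augmented_stage2_reward(r2, a2, d2_hat, 0.5, np.zeros(n))

target = np.sqrt(2 / np.pi)       # E|X2|, the mean optimal stage-2 reward
print(f"target {target:.3f}, augmented mean {augmented.mean():.3f}, "
      f"IPW-only mean {ipw_only.mean():.3f}")
print(f"variance: augmented {augmented.var():.2f}, IPW-only {ipw_only.var():.2f}")
```

Subjects who did not follow the rule contribute the imputed value instead of zero, so every subject carries information into the stage-1 fit and the pseudo-outcome is less variable than the weighted reward alone.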
26 AMOL Theoretical Results Theorem 1: Consistency of AMOL optimal treatment rule For any function µ(h) which maps from the history information H k to the outcome domain, R k 1 + K j=k I(A j = D j (H j)( K j=k R j) K j=k π j(a j, H j ) K j=k I(A j = Dj (H j)) j k π j(a j, H j ) j k π µ(h j(a j, H j ) k ) is always unbiased for E[R k 1 + R k H k], and for pure randomization, its conditional variance is minimized if µ(h k ) = R 2 (H k).
27 AMOL Theoretical Results Theorem 2: Convergence rate of the value function Under some regularity conditions including geometric noise conditions for boundary, convergence rates of Q-learning models and the rate of Gaussian kernel bandwidth for RKHS, we obtain K P V k( f k,..., f K ) V k (fk,..., f K ) c (K j) 0 ɛ nj (τ) where ɛ nk (τ) = c [ j=k 1 (K j + 1)e τ, λ 2 + (2 v k )(1+δ k ) 2+v k (2+v k )(1+q k ) nk n 2 2+v k + τ + τ nλ nk n β k ] qk + λ q k +1 nk.
28 Comparing Performance in Simulation Study
29 Simulation Set-up Two-stage and four-stage settings. 500 replicates and an independent test set of 10,000. 50 baseline covariates: $X_1, X_2, \ldots, X_{50}$ from $N(0, 1)$. Treatments A are randomly assigned to $\{-1, 1\}$ with equal probability. Scenario 1: R 1 = X 1 A 1 + N (0, 1); R 2 = (R 1 + X2 2 + X ) A 2 + N (0, 1). Scenario 2 extends Scenario 1 to four stages: R 1 = X 3 A 1 + N (0, 1); R 2 = (R 1 + X X ) A 2 + N (0, 1); R 3 = 2 (R 2 + X 3 ) A 3 + 3X 4 4X 5 + N (0, 1); R 4 = 3 (R 3 + X 6 ) A 4 4X 3 + N (0, 1).
30 Scenario 1 Simulation Results Figure: Scenario 1: two-stage trial with 500 replicates and optimal value 2.86. Empirical value/std from optimal value plotted against n for Q-learning, O-learning, O-learning Residual, and AMOL.
31 Scenario 2 Simulation Results Figure: Scenario 2: four-stage trial with 500 replicates and optimal value 25.6. Empirical value/std from optimal value plotted against n for Q-learning, O-learning, O-learning Residual, and AMOL.
32 Analysis of ADHD Study
33 ADHD Data Analysis Interventions include different doses of methylphenidate (MED) and different intensities of behavioral modification (BMOD). The first stage lasted 2 months, and an impairment rating scale and an individualized list of target behaviors were used to assess response. Children who did not respond were re-randomized to either intensified or switched treatment. The primary outcome is a school performance score measured from 1 to 5.
34 Additional Information on ADHD Data A total of 150 subjects at the initial stage. Four baseline covariates: prior medication history, ADHD impairment score, ODD diagnosis, and race. Two time-varying covariates (adherence to treatment, months to remission) for stage 2. Participants who did not respond to the first-stage intervention were re-randomized in the second stage.
35 ADHD Data Analysis Mean: Q-learning 3.601 (0.0284); O-learning 3.097 (0.0387); AMOL 3.660 (0.0268). Figure: Predicted Values based on fold CV, shown for Q-learning, O-learning, AMOL, and AMOL Sparse.
36 ADHD Coefficient for stage 2 Columns: Q-L, O-L ( 10 2 ), AMOL. Rows: Intercept; ODD Diagnosis; ADHD score (cont.); Medication prior; Race (white=1); trt1 (1 for bmod; -1 for med); ODD Diagnosis*trt; ADHD*trt; Prior med*trt; race*trt; months to non-response; Adherence to trt; months to non-response*trt; Adherence to trt1*trt; Adherence to trt1*trt. Table: Coefficients for ADHD stage 2. Q-learning also includes other interaction terms with trt2, which are omitted in the table.
37 ADHD coefficients for stage 1 Columns: Q-L, O-L ( 10 3 ), AMOL ( 10 3 ). Rows: Intercept; ODD Diagnosis; ADHD score (cont.); Medication prior; Race (white=1); trt1 (1 for bmod; -1 for med); ODD Diagnosis*trt; ADHD*trt; Prior med*trt; race*trt. Table: Coefficients for ADHD stage 1.
38 Interpretations of the Coefficients Sparse Optimal Rules from AMOL: For the first stage, children with Prior Med = 1: MED; otherwise, BMOD. For the second stage, children who adhere to the initial treatment: INTENSIFY; otherwise, ADD the other treatment. DTRs and observed value: BMOD then ADD MED; BMOD then INTENSIFY BMOD; MED then ADD BMOD; MED then INTENSIFY MED 2.789
39 Future Consideration
40 Additional Issues To find tailoring variables for future studies: ranking the importance of feature variables and feature selection for DTRs. More interpretable rules: incorporate tree models. Exploration of other classifiers. Identifying high-benefit subgroups. Other types of outcomes. Multi-dimensional outcomes and value functions (benefit-risk).
SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN
SEQUENTIAL MULTIPLE ASSIGNMENT RANDOMIZATION TRIALS WITH ENRICHMENT (SMARTER) DESIGN Ying Liu Division of Biostatistics, Medical College of Wisconsin Yuanjia Wang Department of Biostatistics & Psychiatry,
More informationQ learning. A data analysis method for constructing adaptive interventions
Q learning A data analysis method for constructing adaptive interventions SMART First stage intervention options coded as 1(M) and 1(B) Second stage intervention options coded as 1(M) and 1(B) O1 A1 O2
More informationEstimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian
Estimation of Optimal Treatment Regimes Via Machine Learning Marie Davidian Department of Statistics North Carolina State University Triangle Machine Learning Day April 3, 2018 1/28 Optimal DTRs Via ML
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work
More informationTREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES. University of Michigan, Ann Arbor
Submitted to the Annals of Applied Statistics TREE-BASED REINFORCEMENT LEARNING FOR ESTIMATING OPTIMAL DYNAMIC TREATMENT REGIMES BY YEBIN TAO, LU WANG AND DANIEL ALMIRALL University of Michigan, Ann Arbor
More informationA Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes
A Bayesian Machine Learning Approach for Optimizing Dynamic Treatment Regimes Thomas A. Murray, (tamurray@mdanderson.org), Ying Yuan, (yyuan@mdanderson.org), and Peter F. Thall (rex@mdanderson.org) Department
More informationSTAT 5500/6500 Conditional Logistic Regression for Matched Pairs
STAT 5500/6500 Conditional Logistic Regression for Matched Pairs Motivating Example: The data we will be using comes from a subset of data taken from the Los Angeles Study of the Endometrial Cancer Data
More informationBIOS 2083: Linear Models
BIOS 2083: Linear Models Abdus S Wahed September 2, 2009 Chapter 0 2 Chapter 1 Introduction to linear models 1.1 Linear Models: Definition and Examples Example 1.1.1. Estimating the mean of a N(μ, σ 2
More informationComparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders
Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders Daniel Almirall, Xi Lu, Connie Kasari, Inbal N-Shani, Univ. of Michigan, Univ. of
More informationEstimating Optimal Dynamic Treatment Regimes from Clustered Data
Estimating Optimal Dynamic Treatment Regimes from Clustered Data Bibhas Chakraborty Department of Biostatistics, Columbia University bc2425@columbia.edu Society for Clinical Trials Annual Meetings Boston,
More informationSupport Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina
Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,
More informationRobustifying Trial-Derived Treatment Rules to a Target Population
1/ 39 Robustifying Trial-Derived Treatment Rules to a Target Population Yingqi Zhao Public Health Sciences Division Fred Hutchinson Cancer Research Center Workshop on Perspectives and Analysis for Personalized
More informationRobustness to Parametric Assumptions in Missing Data Models
Robustness to Parametric Assumptions in Missing Data Models Bryan Graham NYU Keisuke Hirano University of Arizona April 2011 Motivation Motivation We consider the classic missing data problem. In practice
More informationA Sampling of IMPACT Research:
A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/
More informationExtending causal inferences from a randomized trial to a target population
Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationSet-valued dynamic treatment regimes for competing outcomes
Set-valued dynamic treatment regimes for competing outcomes Eric B. Laber Department of Statistics, North Carolina State University JSM, Montreal, QC, August 5, 2013 Acknowledgments Zhiqiang Tan Jamie
More informationOn Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data
On Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data Yehua Li Department of Statistics University of Georgia Yongtao
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationDouble Robustness. Bang and Robins (2005) Kang and Schafer (2007)
Double Robustness Bang and Robins (2005) Kang and Schafer (2007) Set-Up Assume throughout that treatment assignment is ignorable given covariates (similar to assumption that data are missing at random
More informationWeb-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes
Biometrics 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materials for A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2010 Paper 259 Targeted Maximum Likelihood Based Causal Inference Mark J. van der Laan University of
More informationPersonalized Treatment Selection Based on Randomized Clinical Trials. Tianxi Cai Department of Biostatistics Harvard School of Public Health
Personalized Treatment Selection Based on Randomized Clinical Trials Tianxi Cai Department of Biostatistics Harvard School of Public Health Outline Motivation A systematic approach to separating subpopulations
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More informationModification and Improvement of Empirical Likelihood for Missing Response Problem
UW Biostatistics Working Paper Series 12-30-2010 Modification and Improvement of Empirical Likelihood for Missing Response Problem Kwun Chuen Gary Chan University of Washington - Seattle Campus, kcgchan@u.washington.edu
More informationECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam
ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The
More informationHigh Dimensional Propensity Score Estimation via Covariate Balancing
High Dimensional Propensity Score Estimation via Covariate Balancing Kosuke Imai Princeton University Talk at Columbia University May 13, 2017 Joint work with Yang Ning and Sida Peng Kosuke Imai (Princeton)
More informationA Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs
1 / 32 A Gate-keeping Approach for Selecting Adaptive Interventions under General SMART Designs Tony Zhong, DrPH Icahn School of Medicine at Mount Sinai (Feb 20, 2019) Workshop on Design of mhealth Intervention
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Duke University Medical, Dept. of Biostatistics Joint work
More informationDoubly Robust Policy Evaluation and Learning
Doubly Robust Policy Evaluation and Learning Miroslav Dudik, John Langford and Lihong Li Yahoo! Research Discussed by Miao Liu October 9, 2011 October 9, 2011 1 / 17 1 Introduction 2 Problem Definition
More informationReader Reaction to A Robust Method for Estimating Optimal Treatment Regimes by Zhang et al. (2012)
Biometrics 71, 267 273 March 2015 DOI: 10.1111/biom.12228 READER REACTION Reader Reaction to A Robust Method for Estimating Optimal Treatment Regimes by Zhang et al. (2012) Jeremy M. G. Taylor,* Wenting
More informationWeighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai
Weighting Methods Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL INFERENCE Fall 2018 Kosuke Imai (Harvard) Weighting Methods Stat186/Gov2002 Fall 2018 1 / 13 Motivation Matching methods for improving
More informationSemiparametric Regression and Machine Learning Methods for Estimating Optimal Dynamic Treatment Regimes
Semiparametric Regression and Machine Learning Methods for Estimating Optimal Dynamic Treatment Regimes by Yebin Tao A dissertation submitted in partial fulfillment of the requirements for the degree of
More informationImplementing Precision Medicine: Optimal Treatment Regimes and SMARTs. Anastasios (Butch) Tsiatis and Marie Davidian
Implementing Precision Medicine: Optimal Treatment Regimes and SMARTs Anastasios (Butch) Tsiatis and Marie Davidian Department of Statistics North Carolina State University http://www4.stat.ncsu.edu/~davidian
More informationTargeted Group Sequential Adaptive Designs
Targeted Group Sequential Adaptive Designs Mark van der Laan Department of Biostatistics, University of California, Berkeley School of Public Health Liver Forum, May 10, 2017 Targeted Group Sequential
More informationLecture 9: Bayesian Learning
Lecture 9: Bayesian Learning Cognitive Systems II - Machine Learning Part II: Special Aspects of Concept Learning Bayes Theorem, MAL / ML hypotheses, Brute-force MAP LEARNING, MDL principle, Bayes Optimal
More informationTargeted Maximum Likelihood Estimation for Dynamic Treatment Regimes in Sequential Randomized Controlled Trials
From the SelectedWorks of Paul H. Chaffee June 22, 2012 Targeted Maximum Likelihood Estimation for Dynamic Treatment Regimes in Sequential Randomized Controlled Trials Paul Chaffee Mark J. van der Laan
More informationNonparameteric Regression:
Nonparameteric Regression: Nadaraya-Watson Kernel Regression & Gaussian Process Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro,
More informationAdaptive Trial Designs
Adaptive Trial Designs Wenjing Zheng, Ph.D. Methods Core Seminar Center for AIDS Prevention Studies University of California, San Francisco Nov. 17 th, 2015 Trial Design! Ethical:!eg.! Safety!! Efficacy!
More informationVariable selection and machine learning methods in causal inference
Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of
More informationGlobal Sensitivity Analysis for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach
Global for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach Daniel Aidan McDermott Ivan Diaz Johns Hopkins University Ibrahim Turkoz Janssen Research and Development September
More informationMachine Learning. Lecture 9: Learning Theory. Feng Li.
Machine Learning Lecture 9: Learning Theory Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Why Learning Theory How can we tell
More informationRandomization-Based Inference With Complex Data Need Not Be Complex!
Randomization-Based Inference With Complex Data Need Not Be Complex! JITAIs JITAIs Susan Murphy 07.18.17 HeartSteps JITAI JITAIs Sequential Decision Making Use data to inform science and construct decision
More informationPropensity Score Weighting with Multilevel Data
Propensity Score Weighting with Multilevel Data Fan Li Department of Statistical Science Duke University October 25, 2012 Joint work with Alan Zaslavsky and Mary Beth Landrum Introduction In comparative
More informationClassification 1: Linear regression of indicators, linear discriminant analysis
Classification 1: Linear regression of indicators, linear discriminant analysis Ryan Tibshirani Data Mining: 36-462/36-662 April 2 2013 Optional reading: ISL 4.1, 4.2, 4.4, ESL 4.1 4.3 1 Classification
More informationComparative effectiveness of dynamic treatment regimes
Comparative effectiveness of dynamic treatment regimes An application of the parametric g- formula Miguel Hernán Departments of Epidemiology and Biostatistics Harvard School of Public Health www.hsph.harvard.edu/causal
More informationEmpirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design
1 / 32 Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design Changbao Wu Department of Statistics and Actuarial Science University of Waterloo (Joint work with Min Chen and Mary
More informationRecent Advances in Outcome Weighted Learning for Precision Medicine
Recent Advances in Outcome Weighted Learning for Precision Medicine Michael R. Kosorok Department of Biostatistics Department of Statistics and Operations Research University of North Carolina at Chapel
More informationarxiv: v1 [stat.ap] 17 Mar 2018
Power Analysis in a SMART Design: Sample Size Estimation for Determining the Best Dynamic Treatment Regime arxiv:1804.04587v1 [stat.ap] 17 Mar 2018 William J. Artman Department of Biostatistics and Computational
More informationInteractive Q-learning for Probabilities and Quantiles
Interactive Q-learning for Probabilities and Quantiles Kristin A. Linn 1, Eric B. Laber 2, Leonard A. Stefanski 2 arxiv:1407.3414v2 [stat.me] 21 May 2015 1 Department of Biostatistics and Epidemiology
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationSemi-Nonparametric Inferences for Massive Data
Semi-Nonparametric Inferences for Massive Data Guang Cheng 1 Department of Statistics Purdue University Statistics Seminar at NCSU October, 2015 1 Acknowledge NSF, Simons Foundation and ONR. A Joint Work
More informationEstimating direct effects in cohort and case-control studies
Estimating direct effects in cohort and case-control studies, Ghent University Direct effects Introduction Motivation The problem of standard approaches Controlled direct effect models In many research
More informationLecture 2: Constant Treatment Strategies. Donglin Zeng, Department of Biostatistics, University of North Carolina
Lecture 2: Constant Treatment Strategies Introduction Motivation We will focus on evaluating constant treatment strategies in this lecture. We will discuss using randomized or observational study for these
More informationMarginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal
Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal Overview In observational and experimental studies, the goal may be to estimate the effect
More informationEcon 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines
Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the
More informationMODULE -4 BAYEIAN LEARNING
MODULE -4 BAYEIAN LEARNING CONTENT Introduction Bayes theorem Bayes theorem and concept learning Maximum likelihood and Least Squared Error Hypothesis Maximum likelihood Hypotheses for predicting probabilities
More informationShu Yang and Jae Kwang Kim. Harvard University and Iowa State University
Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND
More informationThe Supervised Learning Approach To Estimating Heterogeneous Causal Regime Effects
The Supervised Learning Approach To Estimating Heterogeneous Causal Regime Effects Thai T. Pham Stanford Graduate School of Business thaipham@stanford.edu May, 2016 Introduction Observations Many sequential
More informationData Mining Stat 588
Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic
More informationIndividualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models
Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University July 31, 2018
More informationMcGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination
McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please
More informationSensitivity study of dose-finding methods
to of dose-finding methods Sarah Zohar 1 John O Quigley 2 1. Inserm, UMRS 717,Biostatistic Department, Hôpital Saint-Louis, Paris, France 2. Inserm, Université Paris VI, Paris, France. NY 2009 1 / 21 to
More informationAdaptive Crowdsourcing via EM with Prior
Adaptive Crowdsourcing via EM with Prior Peter Maginnis and Tanmay Gupta May, 205 In this work, we make two primary contributions: derivation of the EM update for the shifted and rescaled beta prior and
More informationGroup Sequential Designs: Theory, Computation and Optimisation
Group Sequential Designs: Theory, Computation and Optimisation Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 8th International Conference
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 8 1 Chapter 8. Classification: Basic Concepts Classification: Basic Concepts Decision Tree Induction Bayes Classification Methods Rule-Based Classification
More informationSupport Vector Machines (SVM) in bioinformatics. Day 1: Introduction to SVM
1 Support Vector Machines (SVM) in bioinformatics Day 1: Introduction to SVM Jean-Philippe Vert Bioinformatics Center, Kyoto University, Japan Jean-Philippe.Vert@mines.org Human Genome Center, University
More informationRelevance Vector Machines
LUT February 21, 2011 Support Vector Machines Model / Regression Marginal Likelihood Regression Relevance vector machines Exercise Support Vector Machines The relevance vector machine (RVM) is a Bayesian
More informationMark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.
More informationCounterfactual Model for Learning Systems
Counterfactual Model for Learning Systems CS 7792 - Fall 28 Thorsten Joachims Department of Computer Science & Department of Information Science Cornell University Imbens, Rubin, Causal Inference for Statistical
More informationRobustness of the Contextual Bandit Algorithm to A Physical Activity Motivation Effect
Robustness of the Contextual Bandit Algorithm to A Physical Activity Motivation Effect Xige Zhang April 10, 2016 1 Introduction Technological advances in mobile devices have seen a growing popularity in
More information1 Machine Learning Concepts (16 points)
CSCI 567 Fall 2018 Midterm Exam DO NOT OPEN EXAM UNTIL INSTRUCTED TO DO SO PLEASE TURN OFF ALL CELL PHONES Problem 1 2 3 4 5 6 Total Max 16 10 16 42 24 12 120 Points Please read the following instructions
More informationSupplement to Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes
Submitted to the Statistical Science Supplement to Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes Phillip J. Schulte, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian Abstract.
More informationCOMS 4771 Regression. Nakul Verma
COMS 4771 Regression Nakul Verma Last time Support Vector Machines Maximum Margin formulation Constrained Optimization Lagrange Duality Theory Convex Optimization SVM dual and Interpretation How get the
More informationECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam
ECLT 5810 Linear Regression and Logistic Regression for Classification Prof. Wai Lam Linear Regression Models Least Squares Input vectors is an attribute / feature / predictor (independent variable) The
More informationHealth utilities' affect you are reported alongside underestimates of uncertainty
Dr. Kelvin Chan, Medical Oncologist, Associate Scientist, Odette Cancer Centre, Sunnybrook Health Sciences Centre and Dr. Eleanor Pullenayegum, Senior Scientist, Hospital for Sick Children Title: Underestimation
More informationRelating Latent Class Analysis Results to Variables not Included in the Analysis
Relating LCA Results 1 Running Head: Relating LCA Results Relating Latent Class Analysis Results to Variables not Included in the Analysis Shaunna L. Clark & Bengt Muthén University of California, Los
More informationCausal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions
Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions Joe Schafer Office of the Associate Director for Research and Methodology U.S. Census
More informationFACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION
SunLab Enlighten the World FACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION Ioakeim (Kimis) Perros and Jimeng Sun perros@gatech.edu, jsun@cc.gatech.edu COMPUTATIONAL
More informationASSESSING THE EFFECT OF TREATMENT REGIMES ON LONGITUDINAL OUTCOME DATA: APPLICATION TO REVAMP STUDY OF DEPRESSION
Journal of Statistical Research 2012, Vol. 46, No. 2, pp. 233-254 ISSN 0256-422 X ASSESSING THE EFFECT OF TREATMENT REGIMES ON LONGITUDINAL OUTCOME DATA: APPLICATION TO REVAMP STUDY OF DEPRESSION SACHIKO
More informationBIOSTATISTICAL METHODS
BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH Cross-over Designs #: DESIGNING CLINICAL RESEARCH The subtraction of measurements from the same subject will mostly cancel or minimize effects
More informationMath for Machine Learning Open Doors to Data Science and Artificial Intelligence. Richard Han
Math for Machine Learning Open Doors to Data Science and Artificial Intelligence Richard Han Copyright 05 Richard Han All rights reserved. CONTENTS PREFACE... - INTRODUCTION... LINEAR REGRESSION... 4 LINEAR
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com
More informationMulti-state Models: An Overview
Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed
More informationMatching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14
STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University Quiz 2 [Figure: Histogram of Quiz2; x-axis: Quiz2, y-axis: Frequency]
More informationThe author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls
The author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls under the restrictions of the copyright, in particular
More informationLinear Classification
Linear Classification Lili MOU moull12@sei.pku.edu.cn http://sei.pku.edu.cn/ moull12 23 April 2015 Outline Introduction Discriminant Functions Probabilistic Generative Models Probabilistic Discriminative
More informationChap 2. Linear Classifiers (FTH, ) Yongdai Kim Seoul National University
Chap 2. Linear Classifiers (FTH, 4.1-4.4) Yongdai Kim Seoul National University Linear methods for classification 1. Linear classifiers For simplicity, we only consider two-class classification problems
More information9.520: Class 20. Bayesian Interpretations. Tomaso Poggio and Sayan Mukherjee
9.520: Class 20 Bayesian Interpretations Tomaso Poggio and Sayan Mukherjee Plan Bayesian interpretation of Regularization Bayesian interpretation of the regularizer Bayesian interpretation of quadratic
More informationAnalysing data: regression and correlation S6 and S7
Basic medical statistics for clinical and experimental research Analysing data: regression and correlation S6 and S7 K. Jozwiak k.jozwiak@nki.nl 2 / 49 Correlation So far we have looked at the association
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationSTAT 526 Spring Final Exam. Thursday May 5, 2011
STAT 526 Spring 2011 Final Exam Thursday May 5, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will
More informationLecture 1 Introduction to Multi-level Models
Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course
More informationAnalysing longitudinal data when the visit times are informative
Analysing longitudinal data when the visit times are informative Eleanor Pullenayegum, PhD Scientist, Hospital for Sick Children Associate Professor, University of Toronto eleanor.pullenayegum@sickkids.ca
More informationBayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects
Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects Nicholas C. Henderson Thomas A. Louis Gary Rosner Ravi Varadhan Johns Hopkins University September 28,
More information6. Regularized linear regression
Foundations of Machine Learning École Centrale Paris Fall 2015 6. Regularized linear regression Chloé-Agathe Azencott Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr
More informationRegression I: Mean Squared Error and Measuring Quality of Fit
Regression I: Mean Squared Error and Measuring Quality of Fit -Applied Multivariate Analysis- Lecturer: Darren Homrighausen, PhD 1 The Setup Suppose there is a scientific problem we are interested in solving
More informationCounterfactual Evaluation and Learning
SIGIR 26 Tutorial Counterfactual Evaluation and Learning Adith Swaminathan, Thorsten Joachims Department of Computer Science & Department of Information Science Cornell University Website: http://www.cs.cornell.edu/~adith/cfactsigir26/
More informationEvaluation of Viable Dynamic Treatment Regimes in a Sequentially Randomized Trial of Advanced Prostate Cancer
Evaluation of Viable Dynamic Treatment Regimes in a Sequentially Randomized Trial of Advanced Prostate Cancer Lu Wang, Andrea Rotnitzky, Xihong Lin, Randall E. Millikan, and Peter F. Thall Abstract We
More informationFor more information about how to cite these materials visit
Author(s): Kerby Shedden, Ph.D., 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Share Alike 3.0 License: http://creativecommons.org/licenses/by-sa/3.0/
More information