Targeted Maximum Likelihood Estimation for Adaptive Designs: Adaptive Randomization in Community Randomized Trial

1 Targeted Maximum Likelihood Estimation for Adaptive Designs: Adaptive Randomization in Community Randomized Trial
Mark J. van der Laan, University of California, Berkeley, School of Public Health
laan@berkeley.edu, mayaliv@berkeley.edu
stat.berkeley.edu/~laan/, works.bepress.com/maya_petersen/, targetedlearningbook.com
Based on joint work with Maya Petersen, Antoine Chambaz, Laura Balzer
May 24, 2012
Mark van der Laan (UC Berkeley), Adaptive Design in Community RCT, May 24, 2012

2 General Adaptive Designs
Full data: $X = (X_1, \ldots, X_n)$, with $n$ i.i.d. full-data observations $X_i \sim P_{X,0} \in \mathcal{M}^F$.
Target quantity: $\psi_0^F = \Psi^F(P_{X,0})$ for some $\Psi^F : \mathcal{M}^F \to \mathbb{R}$.
Observed data: $O = (O_i = \Phi(X_i, A_i) : i = 1, \ldots, n) \sim P_{P_{X,0}, g_0}$, for some censored data structure $\Phi(X_i, A_i)$, where the conditional distribution $g_0$ of $A = (A_1, \ldots, A_n)$, given $X$, is known (or, at least, can be well estimated) and satisfies coarsening at random w.r.t. $X$: $g_0(A \mid X) = h_0(O)$.
The likelihood of $O$ is given by
\[ P_{P_{X,0}, g_0}(O) = \left\{ \prod_{i=1}^n P_{X,0}(C(O_i)) \right\} g_0(A \mid X), \]
where $C(O_i)$ is the coarsening of $X_i$ implied by $O_i$.

3 Under the positivity assumption, the full-data parameter is thus identified from the observed data distribution, giving us the statistical target parameter $\Psi : \mathcal{M} \to \mathbb{R}$, with $\mathcal{M} = \{P_{P_X, g_0} : P_X \in \mathcal{M}^F\}$ the observed data model. This defines the estimation problem.

4 Group Sequential Adaptive Randomized Trials
Suppose the individuals are ordered $i = 1, \ldots, n$, and the treatment/action assignment $A_i$ for subject $i$ can also depend on the data $O_1, \ldots, O_{i-1}$, beyond its usual allowed dependence on $O_i$:
\[ g_i = P(A_i \mid X, A(i-1)) = h(A_i \mid O_1, \ldots, O_{i-1}, O_i). \]
The adaptive design $g_i$ can be chosen so that it learns/adapts (for $i \to \infty$) an optimal fixed design $g^{opt}(A \mid X)$.
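The following is a minimal sketch of such an adaptive design, not the construction used in the talk. It assumes, purely for illustration, that the optimal fixed design is the Neyman allocation (treatment probability proportional to the outcome standard deviation in each arm), that there are no covariates, and that outcomes come from a hypothetical Gaussian model. The names `neyman_probability`, `run_adaptive_trial` and the `floor` bound are all hypothetical.

```python
import math
import random


def neyman_probability(y1, y0, floor=0.1):
    """Treatment probability for the next subject, using past outcomes only.

    y1, y0: outcomes observed so far in the treated and control arms.
    floor:  hypothetical bound keeping the probability inside [floor, 1 - floor],
            so that positivity holds at every step.
    """
    if len(y1) < 2 or len(y0) < 2:
        return 0.5  # balanced randomization until both arms have some data

    def sd(values):
        mean = sum(values) / len(values)
        return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

    s1, s0 = sd(y1), sd(y0)
    if s1 + s0 == 0:
        return 0.5
    p = s1 / (s1 + s0)  # assumed target: Neyman allocation
    return min(max(p, floor), 1 - floor)


def run_adaptive_trial(n, seed=0):
    """Simulate n subjects; returns the sequence of treatment probabilities g_i(1)."""
    rng = random.Random(seed)
    y1, y0, probs = [], [], []
    for _ in range(n):
        p = neyman_probability(y1, y0)  # g_i depends on O_1, ..., O_{i-1}
        a = 1 if rng.random() < p else 0
        # Hypothetical outcome model: the treated arm is noisier than the control arm.
        y = rng.gauss(1.0, 2.0) if a == 1 else rng.gauss(0.0, 1.0)
        (y1 if a == 1 else y0).append(y)
        probs.append(p)
    return probs
```

Calling `run_adaptive_trial(500)` returns one probability per subject; under this assumed outcome model the later entries should drift toward the assumed target allocation, which is the sense in which the design "learns" a fixed design.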

5 In van der Laan (2008) and Chambaz and van der Laan (2010a,b), we develop targeted maximum likelihood estimators (TMLE) and statistical inference. The statistical inference is based on martingale CLTs and tail probability bounds for martingale sum processes. The limit distribution of the TMLE corresponds with the limit distribution of the TMLE under i.i.d. sampling from $P_{P_{X,0}, g}$ under a fixed unknown design $g$ identified by the limit of the adaptive design $g_i$.

6 Adaptive Randomization in Community RCT
Full data: Let $X_i = (W_i, Y_i^0, Y_i^1)$ be i.i.d. with $X_i \sim P_{X,0} \in \mathcal{M}^F$ for a nonparametric full-data model $\mathcal{M}^F$, and let $\psi_0^F = E_0\{Y^1 - Y^0\}$ be the additive causal effect of a binary treatment.
Observed data and CAR: Let $O_i = (W_i, A_i = (A_i^t, \Delta_i), Y_i = Y_i(A_i^t, \Delta_i) = \Delta_i Y_i^{A_i^t})$, where $A_i^t$ denotes treatment and $1 - \Delta_i$ denotes a missingness indicator. Let $g_0$ be the conditional distribution of $\mathbf{A}$, given $X$, which is assumed to be a function of $\mathbf{W} = (W_1, \ldots, W_n)$ only.

7 Likelihood:
\[ P(O_1, \ldots, O_n) = \left\{ \prod_{i=1}^n Q_W(W_i) \, Q_Y(Y_i \mid W_i, A_i) \right\} g_0(\mathbf{A} \mid \mathbf{W}), \]
where $Q_W$ and $Q_Y$ denote the common marginal distribution of $W$ and the common conditional distribution of $Y$, given $A, W$, respectively. We assume
\[ g_0(\mathbf{A} \mid \mathbf{W}) = \prod_{j=1}^J g_{0,j}\big(A(i) : i \in C_j(\mathbf{W}) \mid \mathbf{W}\big), \qquad (1) \]
where $C_1(\mathbf{W}), \ldots, C_J(\mathbf{W})$ is a partitioning of the sample $\{1, \ldots, n\}$ into $J$ clusters deterministically implied by $\mathbf{W}$.
Statistical target: The causal quantity is identified by
\[ \Psi(P_0) = E_{Q_{W,0}}\{\bar{Q}_0(1, W) - \bar{Q}_0(0, W)\}, \]
where $\bar{Q}_0(a^t, w) = E_0(Y_i \mid A_i^t = a^t, \Delta_i = 1, W_i = w)$ (which is constant in $i$).

8 Canonical gradient/efficient influence curve: The statistical parameter $\Psi$ is pathwise differentiable and its canonical gradient at $P$ is given by
\[ D(P)(O) = \frac{1}{n} \sum_{i=1}^n D(Q, \bar{g})(O_i), \]
where $D(Q, g)$ is the efficient influence curve at $P_{Q,g}$ for the individual-level data structure:
\[ D(Q, \bar{g})(O_i) = D_W(Q)(W_i) + D_Y(Q, \bar{g})(O_i), \]
and
\[ D_W(Q)(W_i) = \bar{Q}(1, W_i) - \bar{Q}(0, W_i) - \Psi(Q), \]
\[ D_Y(Q, \bar{g})(O_i) = \frac{\Delta_i (2A_i^t - 1)}{\bar{g}(A_i \mid W_i)} \big(Y_i - \bar{Q}(A_i^t, W_i)\big), \]

9 where
\[ \bar{g}(a \mid W_i) = \frac{1}{n} \sum_{j=1}^n g_j(a \mid W_j = w) \Big|_{w = W_i}, \]
and $g_{0,j}$ is the conditional distribution of $A_j$, given $W_j$, defined by
\[ g_{0,j}(a \mid W_j = w) = \sum_{(w_l : l \neq j)} g_{0,j}\big(a \mid (w_l : l \neq j), W_j = w\big) \prod_{l \neq j} Q_{W,0}(w_l). \qquad (2) \]
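To make the two components of the efficient influence curve concrete, here is a small sketch that evaluates $D(Q, \bar{g})(O_i)$ for each observation. It assumes each observation is stored as a tuple `(w, a, delta, y)`, that `Qbar(a, w)` and `gbar(a, w)` are user-supplied functions, and that `gbar(a, w)` is read as the probability of receiving treatment `a` and being observed ($\Delta = 1$) given $W = w$. These names and the data layout are illustrative assumptions, not part of the slides.

```python
def efficient_influence_curve(obs, Qbar, gbar, psi):
    """Evaluate D_W + D_Y at each observation.

    obs:  list of tuples (w, a, delta, y), with y = 0 when delta = 0
    Qbar: function (a, w) -> conditional mean of Y among observed units
    gbar: function (a, w) -> assumed P(A^t = a, Delta = 1 | W = w), must be > 0
    psi:  value of the target parameter at which D is evaluated
    """
    values = []
    for w, a, delta, y in obs:
        d_w = Qbar(1, w) - Qbar(0, w) - psi
        # Inverse-probability-weighted residual; zero for units with missing outcome.
        d_y = delta * (2 * a - 1) / gbar(a, w) * (y - Qbar(a, w))
        values.append(d_w + d_y)
    return values
```

The mean of the returned values is the empirical counterpart of the canonical gradient formula $D(P)(O) = \frac{1}{n} \sum_i D(Q, \bar{g})(O_i)$.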

10 Double robustness: We have
\[ E_0 D(\bar{Q}, \bar{g}, \psi) = \psi_0 - \psi \quad \text{if } \bar{Q} = \bar{Q}_0 \text{ or } \bar{g} = \bar{g}_0, \qquad (3) \]
assuming that for all $i$, $0 < \bar{g}_i(1 \mid W) < 1$ a.e.
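A short sketch of why (3) holds may help; it is a reconstruction from the definitions of $D_W$, $D_Y$ and $\bar{g}$ on the preceding slides, not a derivation taken from the talk. It assumes that $\bar{g}(a^t, 1 \mid W)$ denotes the averaged probability of receiving treatment $a^t$ with $\Delta = 1$, and that $\bar{g}_0 = \frac{1}{n} \sum_i g_{0,i}$.

```latex
% Conditioning on W_i and averaging over i = 1, ..., n (the W_i are i.i.d.):
\begin{align*}
E_0 D(\bar{Q}, \bar{g}, \psi)
  &= E_{Q_{W,0}}\{\bar{Q}(1, W) - \bar{Q}(0, W)\} - \psi \\
  &\quad + E_{Q_{W,0}} \sum_{a^t \in \{0,1\}} (2a^t - 1)\,
     \frac{\bar{g}_0(a^t, 1 \mid W)}{\bar{g}(a^t, 1 \mid W)}\,
     (\bar{Q}_0 - \bar{Q})(a^t, W).
\end{align*}
% Case \bar{Q} = \bar{Q}_0: the second line vanishes and the first equals \psi_0 - \psi.
% Case \bar{g} = \bar{g}_0: the ratio equals 1, so the second line equals
%   E_{Q_{W,0}}\{(\bar{Q}_0 - \bar{Q})(1, W) - (\bar{Q}_0 - \bar{Q})(0, W)\},
% and adding the first line again gives \psi_0 - \psi.
```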

11 Targeted MLE
Let $Q_{W,n}$ be the empirical distribution of $W_1, \ldots, W_n$, and $Q_0 = (Q_{W,0}, \bar{Q}_0)$.
Let $Y \in \{0, 1\}$ be binary or continuous in $(0, 1)$. Let $\bar{Q}_n^0$ be an initial estimator of $\bar{Q}_0$, which can be based on the loss function
\[ L_i(\bar{Q})(O_i) = -\Delta_i \big\{ Y_i \log \bar{Q}(W_i, A_i) + (1 - Y_i) \log(1 - \bar{Q}(W_i, A_i)) \big\}. \]
Cross-validation that treats the sample as i.i.d., based on this loss function, is appropriate.
Let $\bar{g}_n = \frac{1}{n} \sum_i g_{i,n}$ be an estimator of $\bar{g}_0$, which could be based on the loss $L(\bar{g})(\mathbf{W}, \mathbf{A}) = -\sum_i \log \bar{g}(A_i \mid W_i)$. Cross-validation now has to treat the clusters $C_j(\mathbf{W})$ as the unit. If $g_0$ is known by design, one may/should use the plug-in estimator, plugging in the empirical distribution $Q_{W,n}$.
As least favorable submodel for fluctuating $\bar{Q}_n^0$ we use logistic regression:
\[ \operatorname{logit} \bar{Q}_n^0(\epsilon) = \operatorname{logit} \bar{Q}_n^0 + \epsilon H_{\bar{g}_n}, \]
where $H_{\bar{g}_n}(A, W) = (2A - 1)/\bar{g}_n(A \mid W)$.

12 The amount of fluctuation $\epsilon_n$ is estimated as
\[ \epsilon_n = \arg\min_{\epsilon} \sum_{i=1}^n L_i(\bar{Q}_n^0(\epsilon))(O_i). \]
This provides the update $\bar{Q}_n^* = \bar{Q}_n^0(\epsilon_n)$. Let $Q_n^* = (Q_{W,n}, \bar{Q}_n^*)$. The TMLE of $\psi_0$ is the corresponding plug-in estimator
\[ \Psi(Q_n^*) = \frac{1}{n} \sum_{i=1}^n \{\bar{Q}_n^*(1, W_i) - \bar{Q}_n^*(0, W_i)\}. \]
The TMLE solves
\[ 0 = D(\bar{Q}_n^*, \bar{g}_n, \psi_n^*) = \frac{1}{n} \sum_{i=1}^n D(\bar{Q}_n^*, \bar{g}_n, \psi_n^*)(O_i). \]
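The targeting step above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the authors' software: observations are tuples `(w, a, delta, y)` with `y` in $[0, 1]$, the initial fit `Qbar0(a, w)` takes values strictly inside $(0, 1)$, `gbar(a, w)` is read as the probability of treatment `a` with $\Delta = 1$ given `w`, and $\epsilon_n$ is found by a one-dimensional Newton iteration on the logistic loss. The function name `tmle_ate` and its arguments are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def tmle_ate(obs, Qbar0, gbar, max_iter=50, tol=1e-10):
    """One-step TMLE of the additive effect via a logistic fluctuation.

    Returns (psi, eps, Qstar) where Qstar(a, w) is the updated fit.
    """
    # Offset logit(Qbar0) and covariate H = (2a - 1) / gbar on observed units only,
    # since the loss L_i carries the factor Delta_i.
    rows = [(logit(Qbar0(a, w)), (2 * a - 1) / gbar(a, w), y)
            for w, a, delta, y in obs if delta == 1]

    eps = 0.0
    for _ in range(max_iter):
        fitted = [expit(off + eps * h) for off, h, _ in rows]
        score = sum(h * (y - p) for (_, h, y), p in zip(rows, fitted))
        info = sum(h * h * p * (1.0 - p) for (_, h, _), p in zip(rows, fitted))
        if info == 0.0:
            break
        step = score / info  # Newton step for the one-dimensional logistic fit
        eps += step
        if abs(step) < tol:
            break

    def Qstar(a, w):
        return expit(logit(Qbar0(a, w)) + eps * (2 * a - 1) / gbar(a, w))

    # Plug-in estimator: average over the empirical distribution of W.
    psi = sum(Qstar(1, w) - Qstar(0, w) for w, _, _, _ in obs) / len(obs)
    return psi, eps, Qstar
```

At convergence the score of the fluctuation model is zero, which is the $D_Y$ part of the efficient influence curve equation; the $D_W$ part averages to zero automatically because `psi` is the plug-in value.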

13 Asymptotics of TMLE for Community RCTs
Theorem: Suppose that for $j = 1, \ldots, J-1$, $C_j(\mathbf{W})$ is a pair whose observations will be denoted by $(O_{1,j}, O_{2,j})$, and one cluster $C_J$ is defined by the units with missingness. Suppose $g_{0,i}(A_i \mid \mathbf{W}) = g_0(A_i \mid W_i)$, and it is known, so that $\bar{g}_0 = g_0$ is known as well. Let $\bar{Q}_n^*$ converge to some $\bar{Q}^*$.

14 The asymptotic variance $\sigma^2$ of the standardized TMLE $\sqrt{n}(\psi_n^* - \psi_0)$ can be represented as follows:
\[ \sigma^2 = P_{Q_0, g_0}\{D(Q^*, g_0)\}^2
 - \lim_{n \to \infty} E_0 \frac{1}{J} \sum_{j=1}^J (\bar{Q}_0 - \bar{Q}^*)(1, W_{1j}) (\bar{Q}_0 - \bar{Q}^*)(1, W_{2j})
 - \lim_{n \to \infty} E_0 \frac{1}{J} \sum_{j=1}^J (\bar{Q}_0 - \bar{Q}^*)(0, W_{1j}) (\bar{Q}_0 - \bar{Q}^*)(0, W_{2j}). \]
Under the assumption that the covariance terms are positive, dropping them can only increase the value, so a conservative estimate of $\sigma^2$ is given by
\[ \sigma_n^2 = \frac{1}{n} \sum_{i=1}^n \{D(Q_n^*, g_0)(O_i)\}^2. \]
Note: Consistent estimation of the true asymptotic variance relies on consistent estimation of $\bar{Q}_0$!
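As a small illustration of how the conservative estimate $\sigma_n^2$ would be used, the sketch below turns a list of estimated influence curve values into a variance estimate and a Wald-type interval. The function name, the input layout and the default normal quantile are illustrative assumptions; the interval is conservative only under the positive-covariance assumption stated above.

```python
import math


def conservative_wald_ci(eic_values, psi, z=1.959964):
    """Conservative variance estimate and Wald-type interval for the TMLE.

    eic_values: D(Q_n^*, g_0)(O_i) evaluated at each observation
    psi:        the TMLE point estimate
    z:          standard normal quantile (default corresponds to a 95% interval)
    """
    n = len(eic_values)
    sigma2 = sum(d * d for d in eic_values) / n  # sigma_n^2 from the slide
    half_width = z * math.sqrt(sigma2 / n)
    return sigma2, (psi - half_width, psi + half_width)
```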

15 Concluding Remark: Semiparametric estimation and inference for sample size one problems is possible and represents an exciting and important area of research.

16 Further Materials
Targeted Learning Book: Springer Series in Statistics, van der Laan & Rose, targetedlearningbook.com
2012 JSM Short Course: Instructors: Petersen, Rose & van der Laan


More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Lecture 14 October 13

Lecture 14 October 13 STAT 383C: Statistical Modeling I Fall 2015 Lecture 14 October 13 Lecturer: Purnamrita Sarkar Scribe: Some one Disclaimer: These scribe notes have been slightly proofread and may have typos etc. Note:

More information

Estimation of the Bivariate and Marginal Distributions with Censored Data

Estimation of the Bivariate and Marginal Distributions with Censored Data Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2003 Paper 130 Unified Cross-Validation Methodology For Selection Among Estimators and a General Cross-Validated

More information

Bayesian Machine Learning

Bayesian Machine Learning Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 2: Bayesian Basics https://people.orie.cornell.edu/andrew/orie6741 Cornell University August 25, 2016 1 / 17 Canonical Machine Learning

More information

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing Primal-dual Covariate Balance and Minimal Double Robustness via (Joint work with Daniel Percival) Department of Statistics, Stanford University JSM, August 9, 2015 Outline 1 2 3 1/18 Setting Rubin s causal

More information

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017 Department of Statistical Science Duke University FIRST YEAR EXAM - SPRING 017 Monday May 8th 017, 9:00 AM 1:00 PM NOTES: PLEASE READ CAREFULLY BEFORE BEGINNING EXAM! 1. Do not write solutions on the exam;

More information

Bayesian estimation of the discrepancy with misspecified parametric models

Bayesian estimation of the discrepancy with misspecified parametric models Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012

More information

Monte Carlo Integration I [RC] Chapter 3

Monte Carlo Integration I [RC] Chapter 3 Aula 3. Monte Carlo Integration I 0 Monte Carlo Integration I [RC] Chapter 3 Anatoli Iambartsev IME-USP Aula 3. Monte Carlo Integration I 1 There is no exact definition of the Monte Carlo methods. In the

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

Figure 36: Respiratory infection versus time for the first 49 children.

Figure 36: Respiratory infection versus time for the first 49 children. y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects

More information

Pattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM

Pattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM Pattern Recognition and Machine Learning Chapter 9: Mixture Models and EM Thomas Mensink Jakob Verbeek October 11, 27 Le Menu 9.1 K-means clustering Getting the idea with a simple example 9.2 Mixtures

More information

Bayesian Estimation Under Informative Sampling with Unattenuated Dependence

Bayesian Estimation Under Informative Sampling with Unattenuated Dependence Bayesian Estimation Under Informative Sampling with Unattenuated Dependence Matt Williams 1 Terrance Savitsky 2 1 Substance Abuse and Mental Health Services Administration Matthew.Williams@samhsa.hhs.gov

More information

Information Theory and Statistics, Part I

Information Theory and Statistics, Part I Information Theory and Statistics, Part I Information Theory 2013 Lecture 6 George Mathai May 16, 2013 Outline This lecture will cover Method of Types. Law of Large Numbers. Universal Source Coding. Large

More information

arxiv: v1 [math.st] 28 Feb 2017

arxiv: v1 [math.st] 28 Feb 2017 Bridging Finite and Super Population Causal Inference arxiv:1702.08615v1 [math.st] 28 Feb 2017 Peng Ding, Xinran Li, and Luke W. Miratrix Abstract There are two general views in causal analysis of experimental

More information

An Introduction to Statistical and Probabilistic Linear Models

An Introduction to Statistical and Probabilistic Linear Models An Introduction to Statistical and Probabilistic Linear Models Maximilian Mozes Proseminar Data Mining Fakultät für Informatik Technische Universität München June 07, 2017 Introduction In statistical learning

More information

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Winter 2014 Instructor: Victor guirregabiria SOLUTION TO FINL EXM Monday, pril 14, 2014. From 9:00am-12:00pm (3 hours) INSTRUCTIONS:

More information

Randomization-Based Inference With Complex Data Need Not Be Complex!

Randomization-Based Inference With Complex Data Need Not Be Complex! Randomization-Based Inference With Complex Data Need Not Be Complex! JITAIs JITAIs Susan Murphy 07.18.17 HeartSteps JITAI JITAIs Sequential Decision Making Use data to inform science and construct decision

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas 0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity

More information

Diagnosing and responding to violations in the positivity assumption.

Diagnosing and responding to violations in the positivity assumption. University of California, Berkeley From the SelectedWorks of Maya Petersen 2012 Diagnosing and responding to violations in the positivity assumption. Maya Petersen, University of California, Berkeley K

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information