Targeted Maximum Likelihood Estimation in Safety Analysis

Sam Lendle (1), Bruce Fireman (2), Mark van der Laan (1)
(1) UC Berkeley; (2) Kaiser Permanente
ISPE Advanced Topics Session, Barcelona, August

Outline
1. Introduction
2. Super learning
3. TMLE and collaborative TMLE
4. Kaiser Permanente data example
5. Simulations based on KP data

Traditional approach in epidemiology and clinical medicine
- Fit several parametric logistic regression models and select a favorite one.
- Report the point estimate of the treatment coefficient, with confidence intervals and p-value, as if this parametric model had been specified a priori.
Problems
- The parametric model is misspecified, but parameter estimates are interpreted as if the model were correct.
- Variance estimates do not account for model selection, so confidence intervals and p-values are wrong even if the final model is somehow correct.

The statistical estimation problem
- Observed data: realizations of random variables with a probability distribution.
- Statistical model: the set of possible distributions for the data-generating distribution, defined by actual knowledge about the data. For example, in an RCT we know the probability of each subject receiving treatment.
- Statistical target parameter: a function of the data-generating distribution that we wish to learn from the data.
- Estimator: an a priori specified algorithm that takes the observed data and returns an estimate of the target parameter. It is benchmarked by a dissimilarity measure (e.g., MSE) with respect to the target parameter.

Causal inference
- Requires non-testable assumptions in addition to the assumptions defining the statistical model (e.g., the no unmeasured confounders assumption).
- These assumptions allow a causal interpretation of statistical parameter estimates.
- Even if we don't believe the non-testable causal assumptions, the statistical estimation problem is still the same, and estimates still have valid statistical interpretations.

Targeted learning
- Define true statistical models and interesting target parameters.
- Avoid reliance on human art and unrealistic parametric models.
- Target the fit of the data-generating distribution to the parameter of interest.
- Provide statistical inference.
- Has been applied to: static or dynamic treatments, direct and indirect effects, parameters of MSMs, variable importance analysis, longitudinal/repeated measures data with time-dependent confounding, censoring/missingness, case-control studies, and RCTs.

Two-stage estimation methodology
Super learning (SL) (van der Laan et al. 2007)
- Uses a library of candidate estimators (e.g., multiple parametric models and machine learning algorithms such as neural networks, random forests, etc.).
- Builds a data-adaptive weighted combination of the estimators using cross-validation.
Targeted maximum likelihood estimation (TMLE) (van der Laan and Rubin 2006)
- Updates an initial estimate, often a super learner, to remove bias for the parameter of interest.
- Calculates the final parameter estimate from the updated fit of the data-generating distribution.

Super learning
- No need to choose a priori a particular parametric model or machine learning algorithm for a particular problem.
- Allows one to combine many data-adaptive estimators into one improved estimator.
- Grounded in oracle results for loss-function-based cross-validation (van der Laan and Dudoit 2003). The loss function needs to be bounded.
- Performs asymptotically as well as the best (oracle) weighted combination, or achieves the parametric rate of convergence.
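
The idea of a cross-validated weighted combination can be sketched in a few lines. This is a minimal illustration, not the SuperLearner software: the three toy learners (`fit_mean`, `fit_linear`, `fit_bins`), the simulated data, and the grid search over convex weights are all assumptions made for the example. Real implementations usually solve for the weights with a constrained regression rather than a grid.

```python
import math
import random

# Three toy candidate learners; each takes training data and returns a predictor.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def fit_bins(xs, ys, n_bins=8):
    # Histogram regression: average outcome within equal-width bins of x.
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0
    overall = sum(ys) / len(ys)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    def idx(x):
        return min(n_bins - 1, max(0, int((x - lo) / width)))
    for x, y in zip(xs, ys):
        sums[idx(x)] += y
        counts[idx(x)] += 1
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[idx(x)]

LIBRARY = [fit_mean, fit_linear, fit_bins]

def cv_predictions(xs, ys, library, n_folds=5):
    # Out-of-fold predictions: each point is predicted by fits that never saw it.
    n = len(xs)
    preds = [[0.0] * n for _ in library]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        for k, fit in enumerate(library):
            f = fit(tx, ty)
            for i in test:
                preds[k][i] = f(xs[i])
    return preds

def cv_mse(weights, preds, ys):
    n = len(ys)
    return sum((ys[i] - sum(w * p[i] for w, p in zip(weights, preds))) ** 2
               for i in range(n)) / n

def simplex_grid(k, steps):
    # All integer vectors of length k that sum to `steps`.
    if k == 1:
        yield [steps]
        return
    for first in range(steps + 1):
        for rest in simplex_grid(k - 1, steps - first):
            yield [first] + rest

def super_learner(xs, ys, library=LIBRARY, steps=20):
    preds = cv_predictions(xs, ys, library)
    candidates = ([c / steps for c in counts]
                  for counts in simplex_grid(len(library), steps))
    best_w = min(candidates, key=lambda w: cv_mse(w, preds, ys))
    fits = [fit(xs, ys) for fit in library]  # refit every learner on all data
    return (lambda x: sum(w * f(x) for w, f in zip(best_w, fits))), best_w, preds

rng = random.Random(0)
xs = [rng.random() for _ in range(300)]
ys = [math.sin(6 * x) + rng.gauss(0, 0.3) for x in xs]
predict, weights, preds = super_learner(xs, ys)
sl_cv_mse = cv_mse(weights, preds, ys)
single_cv_mse = [cv_mse([float(j == k) for j in range(len(LIBRARY))], preds, ys)
                 for k in range(len(LIBRARY))]
```

Because the weight grid includes each single learner as a corner, the combined cross-validated MSE can never be worse than the best single learner's. An honest performance estimate of the combination itself would need an additional outer layer of cross-validation.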

Super learning
[Figure: relative cross-validated mean squared error, compared to main terms least squares regression]

Targeted MLE
1. Identify the least favorable parametric model for fluctuating the initial estimate $\hat{P}$: a small fluctuation yields the maximum change in the target.
2. Identify the optimal amount of fluctuation by MLE.
3. Apply the optimal fluctuation to $\hat{P}$, giving the first-step targeted maximum likelihood estimator.
4. Repeat until the incremental fluctuation is zero. In some important cases, one step suffices for convergence.
5. The final probability distribution solves the efficient score equation for the target parameter.
TMLE is a double robust and locally efficient plug-in estimator.
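
The steps above can be sketched for one common case: the average treatment effect with a binary outcome, where a single logistic fluctuation step is enough. Everything specific here is an assumption made for illustration: the simulated data (true effect 0.2), the single binary covariate, the deliberately misspecified initial outcome fit that ignores W, and the one-dimensional Newton solver for the fluctuation parameter `eps`.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def simulate(n, seed=1):
    # Hypothetical data: W confounds A and Y; the true ATE is 0.2.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = int(rng.random() < 0.5)
        a = int(rng.random() < 0.2 + 0.6 * w)
        y = int(rng.random() < 0.1 + 0.2 * a + 0.4 * w)
        data.append((w, a, y))
    return data

def tmle_ate(data):
    # Initial outcome fit, deliberately misspecified: it ignores W.
    q_init = {a: mean(y for _, a_i, y in data if a_i == a) for a in (0, 1)}
    # Treatment mechanism estimated by cell proportions (correct here).
    g = {w: mean(a for w_i, a, _ in data if w_i == w) for w in (0, 1)}

    def clever(a, w):
        # Covariate defining the least favorable fluctuation for the ATE.
        return a / g[w] - (1 - a) / (1 - g[w])

    # MLE of the fluctuation size eps in
    # logit Q*(A, W) = logit Q_init(A) + eps * clever(A, W), by Newton steps.
    eps = 0.0
    for _ in range(50):
        score = info = 0.0
        for w, a, y in data:
            h = clever(a, w)
            p = expit(logit(q_init[a]) + eps * h)
            score += h * (y - p)
            info += h * h * p * (1 - p)
        step = score / info
        eps += step
        if abs(step) < 1e-10:
            break

    def q_star(a, w):
        return expit(logit(q_init[a]) + eps * clever(a, w))

    # Plug-in estimate from the updated (targeted) outcome fit.
    psi = mean(q_star(1, w) - q_star(0, w) for w, _, _ in data)
    unadjusted = q_init[1] - q_init[0]
    return psi, unadjusted, eps

tmle_estimate, unadjusted_estimate, eps_hat = tmle_ate(simulate(20000))
```

In this sketch the unadjusted contrast is confounded by W, while the targeted update should pull the plug-in estimate back toward the true value of 0.2 because the treatment mechanism is estimated correctly.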

Collaborative TMLE (CTMLE) algorithm
- Like TMLE, but chooses an estimate $\hat{g}$ of the treatment mechanism/propensity score based on how well it helps estimate $\Psi(Q_0)$ instead of how well it estimates the true $g_0$.
- Builds the estimate of $g_0$ in a stepwise fashion.
- Strongest confounders are adjusted for first.
- Instrumental variables and weak confounders tend to be excluded.
- The order of terms added to $\hat{g}$ is chosen via a penalized log likelihood, and the number of terms is chosen via cross-validation.

Kang and Schafer (2007) simulations
- Continuous outcome Y subject to missingness, and 4 covariates $W_1, W_2, W_3, W_4$.
- The true population mean (target parameter) is 210; the mean among the non-missing is 200.
- Positivity violations: $g_0(\Delta = 1 \mid W)$ as small as 0.01, where $\Delta = 1$ indicates that Y is observed.
- Modification 1: stronger positivity violations, with $g_0(\Delta = 1 \mid W)$ as small as [value missing].
- Modification 2: same as 1, but one covariate no longer affects Y, so it is an instrumental variable.
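
For readers who want to reproduce the setting, here is a minimal sketch of the base data-generating process. The coefficients are quoted from memory of Kang and Schafer (2007) and are an assumption that should be checked against the paper; the function name `simulate_kang_schafer` is made up for this example. The two modifications are not sketched because the slides do not give their details.

```python
import math
import random

def simulate_kang_schafer(n, seed=0):
    # Assumed design (verify against Kang and Schafer 2007):
    #   W1..W4 independent standard normal,
    #   Y = 210 + 27.4*W1 + 13.7*(W2 + W3 + W4) + N(0, 1),
    #   P(Y observed | W) = expit(-W1 + 0.5*W2 - 0.25*W3 - 0.1*W4).
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w1, w2, w3, w4 = (rng.gauss(0, 1) for _ in range(4))
        y = 210 + 27.4 * w1 + 13.7 * (w2 + w3 + w4) + rng.gauss(0, 1)
        g = 1 / (1 + math.exp(-(-w1 + 0.5 * w2 - 0.25 * w3 - 0.1 * w4)))
        observed = rng.random() < g
        rows.append((y, observed, g))
    return rows

rows = simulate_kang_schafer(50000)
mean_all = sum(y for y, _, _ in rows) / len(rows)
observed_ys = [y for y, obs, _ in rows if obs]
mean_observed = sum(observed_ys) / len(observed_ys)
smallest_g = min(g for _, _, g in rows)
```

Under this assumed design, the mean over everyone should sit near 210 and the mean among observed outcomes near 200, which is the gap that the estimators on the following slides try to close.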

Kang and Schafer (2007) simulations
[Figure: Kang and Schafer simulation; estimators compared: OLS, WLS, A-IPCW, TMLE, C-TMLE]
[Figure: Modification 1 to Kang and Schafer simulation; estimators compared: OLS, WLS, A-IPCW, TMLE, C-TMLE]
[Figure: Modification 2 to Kang and Schafer simulation; estimators compared: OLS, WLS, A-IPCW, TMLE, C-TMLE]

Description of dataset
- A subset of data from Kaiser Permanente, part of which is used in FDA's Mini-Sentinel drug safety surveillance.
- Population: diabetic patients without prior cardiovascular disease who are new users of pioglitazone or a sulfonylurea (two anti-diabetic drugs) and who are followed up for at least 6 months without also starting the other drug. [1]
- Treatment arm (in this example): pioglitazone (treatment variable A = 1).
- Comparator: sulfonylurea (A = 0).
- Outcome (Y): acute myocardial infarction (AMI) in the first 6 months of new anti-diabetic drug use.
- Baseline covariates (W): fifty covariates including demographics, comorbidities, and other drug use.
[1] We found that adjusting for missing outcomes had no effect on the results in this case, so we suppress those results and ignore missingness in this example.

Causal model, counterfactual outcomes, and parameter of interest
- Non-parametric structural equation model: each variable is an unknown deterministic function of the past and an error.
  $W = f_W(U_W)$
  $A = f_A(W, U_A)$
  $Y = f_Y(A, W, U_Y)$
- Counterfactual outcomes: substitute a fixed treatment for A in $f_Y$: $Y_a = f_Y(a, W, U_Y)$ for $a \in \{0, 1\}$.
- Causal parameter of interest: the average treatment effect (ATE), $E(Y_1 - Y_0)$.
- Statistical parameter of interest: $\Psi(P_0) = E[E(Y \mid A = 1, W) - E(Y \mid A = 0, W)]$, which equals $E(Y_1 - Y_0)$ under the randomization assumption ("no unmeasured confounders") and the positivity assumption.
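
The link between the causal and statistical parameters can be checked numerically in a toy model. The structural functions `f_w`, `f_a`, and `f_y` below are hypothetical choices (one binary covariate, independent uniform errors), picked so that the randomization and positivity assumptions hold by construction. The sketch computes the counterfactual contrast directly, which is only possible in a simulation, and compares it with a plug-in estimate of the statistical parameter.

```python
import random

# Hypothetical structural equations; errors U are independent uniforms.
def f_w(u_w):
    return int(u_w < 0.5)

def f_a(w, u_a):
    return int(u_a < 0.2 + 0.6 * w)

def f_y(a, w, u_y):
    return int(u_y < 0.1 + 0.2 * a + 0.4 * w)

rng = random.Random(2)
n = 20000
observed, effects = [], []
for _ in range(n):
    u_w, u_a, u_y = rng.random(), rng.random(), rng.random()
    w = f_w(u_w)
    a = f_a(w, u_a)
    y = f_y(a, w, u_y)
    observed.append((w, a, y))
    # Counterfactuals: same W and U_Y, treatment set to 1 and to 0.
    effects.append(f_y(1, w, u_y) - f_y(0, w, u_y))

causal_ate = sum(effects) / n  # E(Y_1 - Y_0), visible only in simulation

def cell_mean(a, w):
    ys = [y for w_i, a_i, y in observed if a_i == a and w_i == w]
    return sum(ys) / len(ys)

# Plug-in estimate of E[E(Y | A=1, W) - E(Y | A=0, W)] from observed data only.
q = {(a, w): cell_mean(a, w) for a in (0, 1) for w in (0, 1)}
psi_hat = sum(q[(1, w)] - q[(0, w)] for w, _, _ in observed) / n
```

The two numbers agree up to sampling error here because the treatment depends on nothing but W and an independent error. If `f_a` also used `u_y`, the plug-in would still estimate the statistical parameter, but that parameter would no longer equal the causal effect.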

Analysis results
Summary of outcome by treatment:
          Treatment     Comparator      Total
  Total
  AMI     5 (0.233%)    86 (0.3437%)    91 (0.335%)
Estimates (estimate and p-value; [values missing]): Unadjusted, G-comp, PS matching, IPTW, AIPTW, TMLE.
Though the sample size is large, there are so few AMIs in this subset of data from Kaiser Permanente that it is hard to tell whether adjustment for potential confounders is important.

Strategy
- Simulate datasets based on real study data, where the true effect is known, to highlight properties of the estimators.
- Start with the KP data set, including additional new users of three other anti-diabetic drugs.
- Sample W with replacement from the empirical distribution of baseline covariates.
- Simulate treatment assignments A based on a known function of baseline covariates.
- Simulate the outcome Y based on a function of W, adjusted so that Y is not too rare.
- Because Y is simulated as a function of baseline covariates only, and not of the treatment, the true average treatment effect is known to be zero.
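
The strategy can be sketched as follows. The small table `EMPIRICAL_W` is a hypothetical stand-in for the real baseline covariates (here two binary covariates), and the functions `g0` and `outcome_prob` are made-up choices for the known treatment and outcome mechanisms. None of these come from the KP data.

```python
import random

# Hypothetical stand-in for the empirical distribution of baseline covariates.
EMPIRICAL_W = [(0, 0), (0, 0), (0, 0), (0, 1), (0, 1),
               (1, 0), (1, 0), (1, 1), (1, 1), (1, 1)]

def g0(w):
    # Known treatment mechanism P(A = 1 | W).
    return 0.2 + 0.3 * w[0] + 0.3 * w[1]

def outcome_prob(w):
    # P(Y = 1 | W): depends on W only, never on A, so the true ATE is zero.
    return 0.05 + 0.15 * w[0] + 0.20 * w[1]

def simulate_dataset(n, rng):
    rows = []
    for w in rng.choices(EMPIRICAL_W, k=n):  # sample W with replacement
        a = int(rng.random() < g0(w))
        y = int(rng.random() < outcome_prob(w))
        rows.append((w, a, y))
    return rows

def risk(rows, a, w=None):
    ys = [y for w_i, a_i, y in rows if a_i == a and (w is None or w_i == w)]
    return sum(ys) / len(ys)

def unadjusted(rows):
    return risk(rows, 1) - risk(rows, 0)

def stratified(rows):
    # Average the within-stratum risk differences over the distribution of W.
    return sum(risk(rows, 1, w) - risk(rows, 0, w) for w, _, _ in rows) / len(rows)

rows = simulate_dataset(20000, random.Random(5))
unadjusted_estimate = unadjusted(rows)
adjusted_estimate = stratified(rows)
```

Since the truth is zero by construction, any systematic departure from zero is bias. In this sketch the unadjusted contrast is clearly positive because the covariates that raise the chance of treatment also raise the risk of the outcome, while the stratified estimate should be close to zero.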

Simulation 1
- The treatment mechanism is a function of 12 covariates that are strongly predictive of the outcome.
- The outcome and propensity score models are known and can be correctly specified.
- In the misspecified versions, the outcome and propensity score models leave out half of the important confounders.
- Results demonstrate the double robustness of TMLE and AIPTW: when either the outcome regression or the PS model is specified correctly, the parameter estimate is consistent. This is not the case for the G-computation estimator, which relies on the outcome regression alone, or for IPTW, which relies on the PS alone.
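
Double robustness can be seen in a much smaller toy version of this kind of experiment. The sketch below is not the simulation from the slides: it assumes a single binary confounder, a hypothetical data-generating process with true effect 0.2, and "misspecified" nuisance fits that simply ignore W. It applies the AIPTW estimating equation with each combination of correct and incorrect fits.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def simulate(n, seed=3):
    # Hypothetical data: W confounds A and Y; the true ATE is 0.2.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = int(rng.random() < 0.5)
        a = int(rng.random() < 0.2 + 0.6 * w)
        y = int(rng.random() < 0.1 + 0.2 * a + 0.4 * w)
        data.append((w, a, y))
    return data

def aiptw(data, q, g):
    # q(a, w): outcome regression; g(w): propensity score P(A = 1 | W = w).
    total = 0.0
    for w, a, y in data:
        h = a / g(w) - (1 - a) / (1 - g(w))
        total += h * (y - q(a, w)) + q(1, w) - q(0, w)
    return total / len(data)

data = simulate(20000)

# Correctly specified fits use W; misspecified fits ignore it.
q_cells = {(a, w): mean(y for w_i, a_i, y in data if a_i == a and w_i == w)
           for a in (0, 1) for w in (0, 1)}
q_marginal = {a: mean(y for _, a_i, y in data if a_i == a) for a in (0, 1)}
g_cells = {w: mean(a for w_i, a, _ in data if w_i == w) for w in (0, 1)}
g_marginal = mean(a for _, a, _ in data)

def q_right(a, w):
    return q_cells[(a, w)]

def q_wrong(a, w):
    return q_marginal[a]

def g_right(w):
    return g_cells[w]

def g_wrong(w):
    return g_marginal

estimates = {
    "both correct": aiptw(data, q_right, g_right),
    "outcome misspecified": aiptw(data, q_wrong, g_right),
    "PS misspecified": aiptw(data, q_right, g_wrong),
    "both misspecified": aiptw(data, q_wrong, g_wrong),
}
```

In this sketch the first three estimates should land near 0.2, and only the estimate with both fits misspecified should stay near the confounded unadjusted contrast.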

Simulation 1
[Table: bias and MSE at n = 1000 and n = 5000; values missing. Rows: Unadjusted; G-comp; PSM; IPTW; AIPTW; TMLE; G-comp, misspecified; PSM, misspecified; IPTW, misspecified; AIPTW, outcome misspecified; AIPTW, PS misspecified; TMLE, outcome misspecified; TMLE, PS misspecified]

Simulation 2
- The treatment mechanism now depends on a covariate that is very predictive of treatment, resulting in positivity violations, but that is not a confounder.
- Results illustrate that IPTW has much higher variance than the other estimators, particularly in small samples, and that CTMLE is very robust to violations of the positivity assumption, particularly in small samples.
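
The variance problem is easy to reproduce in a toy setting. The sketch below is an assumption-laden illustration rather than the simulation from the slides: a single binary covariate drives treatment probabilities of 0.02 and 0.98 but has no effect on the outcome, the true risk difference is 0.2, and the IPTW estimator is given the true propensity score so that only the weights, not estimation error in the PS, are responsible for the behavior.

```python
import random

TRUTH = 0.2  # true risk difference in this hypothetical setup

def one_dataset(n, rng):
    rows = []
    for _ in range(n):
        w = int(rng.random() < 0.5)
        g = 0.98 if w else 0.02           # near-violation of positivity
        a = int(rng.random() < g)
        y = int(rng.random() < 0.3 + 0.2 * a)  # W does not affect Y
        rows.append((g, a, y))
    return rows

def iptw(rows):
    # Unstabilized inverse probability of treatment weighting with the true g.
    return sum(a * y / g - (1 - a) * y / (1 - g) for g, a, y in rows) / len(rows)

def unadjusted(rows):
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def mse(estimates):
    return sum((e - TRUTH) ** 2 for e in estimates) / len(estimates)

rng = random.Random(4)
iptw_estimates, unadjusted_estimates = [], []
for _ in range(500):
    rows = one_dataset(100, rng)
    iptw_estimates.append(iptw(rows))
    unadjusted_estimates.append(unadjusted(rows))

mse_iptw = mse(iptw_estimates)
mse_unadjusted = mse(unadjusted_estimates)
# A risk difference must lie in [-1, 1]; count IPTW estimates that do not.
n_outside = sum(abs(e) > 1 for e in iptw_estimates)
```

With n = 100, a single treated subject from the rarely treated stratum carries a weight of 50, which moves the estimate by 0.5 on its own. That is why the IPTW mean squared error is far larger than that of the unadjusted contrast, which is unbiased here because the covariate is not a confounder.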

Simulation 2
[Table: bias and MSE at n = 100 and n = 500; values missing. Rows: Unadjusted; G-comp; PSM; IPTW; AIPTW; TMLE; CTMLE]
Some estimates are outside the parameter space (> 1) because of very large weights, which results in the high variance.

Simulation 3
- The treatment mechanism depends on interactions between binary covariates.
- Main-terms logistic regression for the PS is not sufficient to account for all confounding.
- Results demonstrate that data-adaptive super learning is necessary to estimate the PS well enough to adjust for confounding.

Simulation 3
[Table: bias and MSE at n = 1000 and n = 5000; values missing. Rows: Unadjusted; PSM, PS main terms only; IPTW, PS main terms only; AIPTW, PS main terms only; TMLE, PS main terms only; PSM, PS SuperLearner; IPTW, PS SuperLearner; AIPTW, PS SuperLearner; TMLE, PS SuperLearner]
Here the outcome regression in TMLE and AIPTW is unadjusted, to emphasize the benefits of super learning for the PS.

Further materials
Targeted Learning book, Springer Series in Statistics, van der Laan & Rose: targetedlearningbook.com

References
- J. Kang and J. Schafer (2007). Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science, 22(4).
- M. van der Laan and S. Dudoit (2003). Unified cross-validation methodology for selection among estimators and a general cross-validated adaptive epsilon-net estimator: finite sample oracle inequalities and examples. UC Berkeley Division of Biostatistics Working Paper Series, page 130.
- M. J. van der Laan and S. Rose. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer, New York.
- M. J. van der Laan and D. Rubin (2006). Targeted maximum likelihood learning. The International Journal of Biostatistics, 2(1).
- M. J. van der Laan, E. C. Polley, and A. E. Hubbard (2007). Super learner. Statistical Applications in Genetics and Molecular Biology, 6(1).
Causal Inference Prediction and causation are very different. Typical questions are: Prediction: Predict Y after observing X = x Causation: Predict Y after setting X = x. Causation involves predicting
More informationdiluted treatment effect estimation for trigger analysis in online controlled experiments
diluted treatment effect estimation for trigger analysis in online controlled experiments Alex Deng and Victor Hu February 2, 2015 Microsoft outline Trigger Analysis and The Dilution Problem Traditional
More informationEstimating the Marginal Odds Ratio in Observational Studies
Estimating the Marginal Odds Ratio in Observational Studies Travis Loux Christiana Drake Department of Statistics University of California, Davis June 20, 2011 Outline The Counterfactual Model Odds Ratios
More informationSince the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai
Chapter 1 Propensity Score Analysis Concepts and Issues Wei Pan Haiyan Bai Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity score analysis, research using propensity score analysis
More informationInstrumental variables estimation in the Cox Proportional Hazard regression model
Instrumental variables estimation in the Cox Proportional Hazard regression model James O Malley, Ph.D. Department of Biomedical Data Science The Dartmouth Institute for Health Policy and Clinical Practice
More informationBayesian regression tree models for causal inference: regularization, confounding and heterogeneity
Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity P. Richard Hahn, Jared Murray, and Carlos Carvalho June 22, 2017 The problem setting We want to estimate
More informationA Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,
A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type
More informationTargeted Learning. Sherri Rose. April 24, Associate Professor Department of Health Care Policy Harvard Medical School
Targeted Learning Sherri Rose Associate Professor Department of Health Care Policy Harvard Medical School Slides: drsherrirosecom/short-courses Code: githubcom/sherrirose/cncshortcourse April 24, 2017
More informationCausal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD
Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable
More informationCausal inference in epidemiological practice
Causal inference in epidemiological practice Willem van der Wal Biostatistics, Julius Center UMC Utrecht June 5, 2 Overview Introduction to causal inference Marginal causal effects Estimating marginal
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Duke University Medical, Dept. of Biostatistics Joint work
More informationAn Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016
An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions
More informationGov 2002: 5. Matching
Gov 2002: 5. Matching Matthew Blackwell October 1, 2015 Where are we? Where are we going? Discussed randomized experiments, started talking about observational data. Last week: no unmeasured confounders
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2005 Paper 191 Population Intervention Models in Causal Inference Alan E. Hubbard Mark J. van der Laan
More informationEconometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017
Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)
More informationDiagnosing and responding to violations in the positivity assumption.
University of California, Berkeley From the SelectedWorks of Maya Petersen 2012 Diagnosing and responding to violations in the positivity assumption. Maya Petersen, University of California, Berkeley K
More informationSemi-Parametric Estimation in Network Data and Tools for Conducting Complex Simulation Studies in Causal Inference.
Semi-Parametric Estimation in Network Data and Tools for Conducting Complex Simulation Studies in Causal Inference by Oleg A Sofrygin A dissertation submitted in partial satisfaction of the requirements
More informationarxiv: v1 [stat.me] 15 May 2011
Working Paper Propensity Score Analysis with Matching Weights Liang Li, Ph.D. arxiv:1105.2917v1 [stat.me] 15 May 2011 Associate Staff of Biostatistics Department of Quantitative Health Sciences, Cleveland
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationGlobal Sensitivity Analysis for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach
Global for Repeated Measures Studies with Informative Drop-out: A Semi-Parametric Approach Daniel Aidan McDermott Ivan Diaz Johns Hopkins University Ibrahim Turkoz Janssen Research and Development September
More informationFlexible Estimation of Treatment Effect Parameters
Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both
More informationData Integration for Big Data Analysis for finite population inference
for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation
More informationPrerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3
University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.
More information