Integrated approaches for analysis of cluster randomised trials


1 Integrated approaches for analysis of cluster randomised trials
Invited Session: Recent developments in CRTs
Joint work with L. Turner, F. Li, J. Gallis and D. Murray
Mélanie PRAGUE - SCT Liverpool, May 9, 2017

2 Collaborators M. Prague - Marginal methods for CRTs May 9,

3 Outline
Conditional vs. Marginal models
Marginal Models Estimation
Improved Marginal models estimation (Doubly robust)
Conclusion / Discussion

4 1 Background: Conditional vs. Marginal models

5 Notations
In cross-sectional CRTs:
$A_i$: intervention group for cluster $i$
$X_{ij}$: baseline covariates of individual $j$ in cluster $i$
$Y_{ij}$: outcome at the time of interest for individual $j$ in cluster $i$
Conditional regression (mixed effect models): $Y_{ij} = g(\beta_0 + \beta_A^{COND} A_i + u_i)$, $u_i \sim N(0, \sigma)$, where $\beta_A^{COND}$ is the conditional effect of intervention.
Marginal regression (estimating equation-based models): $\mu_{ij} = E(Y_{ij} \mid A_i) = g(\beta_0 + \beta_A^{MAR} A_i)$, where $\beta_A^{MAR}$ is the marginal effect of intervention.

6 Conditional vs. Marginal methods
It is essential to understand the underlying assumptions of each method:
Conditional models rely on correct specification of untestable aspects of the data distribution ($\beta_A^{COND}$).
Marginal models rely on a correct definition of the population of interest, which can make it difficult to generalise results to other populations ($\beta_A^{MAR}$).
Definition of the parameter of interest (intervention effect):
Conditional mean, $\beta_A^{COND}$: effect given other responses in the cluster(s) and unobserved random effects.
Marginal mean, $\beta_A^{MAR}$: effect on the average response across the population.
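
To make the difference between the two parameters concrete, here is a minimal Python sketch. It assumes a logistic random-intercept model with a normal random effect whose standard deviation is `sigma`; the function names and the numeric values are illustrative and are not taken from the talk. It integrates the random effect out by Gauss-Hermite quadrature and returns the population-averaged log odds ratio implied by a given conditional effect.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def logit(p):
    """Log odds."""
    return np.log(p / (1.0 - p))


def marginal_log_odds_ratio(beta0, beta_cond, sigma, n_nodes=80):
    """Marginal log odds ratio implied by a logistic random-intercept model.

    Assumed model: logit P(Y=1 | A, u) = beta0 + beta_cond * A + u,
    with u ~ N(0, sigma^2), sigma being a standard deviation.
    """
    nodes, weights = hermegauss(n_nodes)      # quadrature for weight exp(-x^2 / 2)
    weights = weights / weights.sum()         # normalise to N(0, 1) probabilities
    p_control = np.sum(weights * expit(beta0 + sigma * nodes))
    p_treated = np.sum(weights * expit(beta0 + beta_cond + sigma * nodes))
    return logit(p_treated) - logit(p_control)


# Illustrative values (hypothetical): conditional effect 1, random-effect SD 2.
print(marginal_log_odds_ratio(beta0=-1.0, beta_cond=1.0, sigma=2.0))
```

Under these assumptions the marginal effect is attenuated towards zero relative to the conditional one, and the two coincide when `sigma` is 0.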

7 How to make a decision? Pros and cons [Hubbard et al. 2010]

8 More...
Turner E., Li F., Gallis J., Prague M. and Murray D. Review of recent methodological developments in group randomized trials: Part 1 - Design. (2017) American Journal of Public Health. In press.
Turner E., Prague M., Gallis J., Li F. and Murray D. Review of recent methodological developments in group randomized trials: Part 2 - Analysis. (2017) American Journal of Public Health. In press.

9 2 Marginal Models: Adjustment for missing data

10 GEE Principle [Liang and Zeger, 1986]
$\sum_{i=1}^{M} m(Y_i, A_i, \beta) = \sum_{i=1}^{M} \left(\frac{\partial \mu_i}{\partial \beta}\right)^T V_i^{-1} (Y_i - \mu_i) = 0$
First, a naive regression is fitted, assuming the observations within a cluster are independent.
Then residuals (observed minus predicted) are computed from the naive model, and a working correlation matrix is estimated from them.
Then the regression coefficients are refitted, correcting for the correlation. The last two steps are iterated until convergence.
The within-cluster correlation structure is treated as a nuisance parameter: it is estimated, but it is not itself of interest.
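
The iterative procedure can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any particular package: it assumes an identity link, an exchangeable working correlation estimated by a plain moment estimator, and simulated data. The function names `fit_gee_exchangeable` and `simulate_crt` are hypothetical.

```python
import numpy as np


def fit_gee_exchangeable(y, X, cluster, max_iter=50, tol=1e-10):
    """GEE point estimate for an identity link with exchangeable working correlation."""
    ids = np.unique(cluster)
    p = X.shape[1]
    # Step 1: naive fit that ignores the clustering (ordinary least squares).
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    alpha = 0.0
    for _ in range(max_iter):
        # Step 2: residuals, then a moment estimate of the common correlation.
        resid = y - X @ beta
        phi = np.mean(resid ** 2)
        pair_sum, n_pairs = 0.0, 0
        for c in ids:
            r = resid[cluster == c]
            pair_sum += (r.sum() ** 2 - np.sum(r ** 2)) / 2.0  # sum of r_j * r_k, j < k
            n_pairs += r.size * (r.size - 1) // 2
        alpha = pair_sum / (n_pairs * phi)
        # Step 3: solve sum_i X_i' V_i^{-1} (Y_i - X_i beta) = 0 for beta.
        lhs = np.zeros((p, p))
        rhs = np.zeros(p)
        for c in ids:
            rows = cluster == c
            Xi, yi = X[rows], y[rows]
            n = yi.size
            Vi = phi * ((1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n)))
            Vi_inv = np.linalg.inv(Vi)
            lhs += Xi.T @ Vi_inv @ Xi
            rhs += Xi.T @ Vi_inv @ yi
        beta_new = np.linalg.solve(lhs, rhs)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, alpha


def simulate_crt(n_clusters=200, cluster_size=10, effect=2.0, seed=0):
    """Simulated continuous-outcome CRT with a random cluster intercept (ICC = 0.5)."""
    rng = np.random.default_rng(seed)
    arm = np.arange(n_clusters) % 2                    # alternate arms across clusters
    u = rng.normal(0.0, 1.0, n_clusters)               # cluster random effect
    cluster = np.repeat(np.arange(n_clusters), cluster_size)
    a = arm[cluster]
    y = 1.0 + effect * a + u[cluster] + rng.normal(0.0, 1.0, cluster.size)
    X = np.column_stack([np.ones(cluster.size), a])
    return y, X, cluster


y, X, cluster = simulate_crt()
beta_hat, alpha_hat = fit_gee_exchangeable(y, X, cluster)
print(beta_hat, alpha_hat)
```

The sketch returns point estimates only; a real analysis would also need the sandwich variance estimator, which is omitted here.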

11 Quadratic Inference Function
Limitation of GEE: the working correlation matrix can be difficult to specify. GEE remains consistent, but loses efficiency if $V_i^{-1}$ is misspecified.
QIF: $V_i^{-1} = a_0 M_0 + a_1 M_1 + a_2 M_2 + \dots + a_n M_n$, where $(a_0, \dots, a_n)$ is estimated and $(M_0, \dots, M_n)$ is a basis of known matrices.
Facts: QIF is more efficient than GEE [Odueyungbo et al. 2008]. No implementation yet in R, SAS or STATA.
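
As an illustration of the basis expansion, the following Python sketch checks numerically that the inverse of an exchangeable correlation matrix is an exact linear combination of two known matrices: the identity and the matrix with 0 on the diagonal and 1 elsewhere. The choice of basis, the dimension and the correlation value are assumptions made for the example; this is not a QIF fit.

```python
import numpy as np


def exchangeable(n, rho):
    """Exchangeable correlation matrix of dimension n with common correlation rho."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))


def basis_coefficients(target, basis):
    """Least-squares coefficients a_k such that target is close to sum_k a_k * M_k."""
    design = np.column_stack([M.ravel() for M in basis])
    coef, *_ = np.linalg.lstsq(design, target.ravel(), rcond=None)
    return coef


n, rho = 5, 0.3                                  # illustrative values
M0 = np.eye(n)                                   # identity
M1 = np.ones((n, n)) - np.eye(n)                 # 0 on the diagonal, 1 elsewhere
R_inv = np.linalg.inv(exchangeable(n, rho))
a = basis_coefficients(R_inv, [M0, M1])
reconstruction = a[0] * M0 + a[1] * M1
print(a, np.allclose(reconstruction, R_inv))
```

For other correlation structures the basis would contain different matrices; the point is only that the inverse is expanded on known matrices so that the coefficients need not be specified in advance.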

12 The missing data problem
MCAR (missing completely at random): $P(R_{ij} \mid Y_i^{obs}; Y_i^{miss}; X_i; A_i) = P(R_{ij})$
MAR (missing at random): $P(R_{ij} \mid Y_i^{obs}; Y_i^{miss}; X_i; A_i) = P(R_{ij} \mid X_i; A_i, Y_i^{obs})$
CDM (covariate-dependent missingness): $P(R_{ij} \mid Y_i^{obs}; Y_i^{miss}; X_i; A_i) = P(R_{ij} \mid X_i; A_i)$
MNAR (missing not at random): $P(R_{ij} \mid Y_i^{obs}; Y_i^{miss}; X_i; A_i) = P(R_{ij} \mid X_i; A_i, Y_i^{obs}, Y_i^{miss})$

13 Adjusting for missing data
Idea 1 (Multiple imputation): for every missing value, impute what the value could have been.
Disadvantage: how to impute? Find $f(Y \mid X, A)$.
Idea 2 (Inverse-probability weighting): suppose four individuals are identical according to covariates and three of them are missing. The observed individual gets a weight of 4, the inverse of the observation probability 1/4 = 0.25. As a result, data from this individual count once for themselves and three times for the missing individuals.
Disadvantage: how to define "identical"? The model linking covariates to the probability of observation must be correctly specified to obtain correct weights. Find $P(R = 1 \mid X, A)$.
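
The weighting idea can be tried on simulated data. The Python sketch below assumes a single binary covariate that drives both the outcome and the probability of being observed (0.25 in one stratum, as in the example above, and 0.9 in the other); all names and numeric values are illustrative assumptions, not quantities from the talk.

```python
import numpy as np


def ipw_mean(y, observed, pi):
    """Weighted mean of the observed outcomes, with weights 1 / pi."""
    w = observed / pi                               # 0 for missing individuals
    y_filled = np.where(observed == 1, y, 0.0)      # missing outcomes are never used
    return np.sum(w * y_filled) / np.sum(w)


def toy_example(n=20000, seed=1):
    """Return (full-data mean, complete-case mean, IPW mean) on simulated data."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)                  # binary covariate
    y = 1.0 + 2.0 * x + rng.normal(size=n)          # outcome depends on x
    p_obs = np.where(x == 1, 0.25, 0.9)             # observation probability depends on x
    observed = (rng.random(n) < p_obs).astype(int)
    # Estimate P(R = 1 | X) by the observed fraction within each stratum of x.
    pi_hat = np.where(x == 1, observed[x == 1].mean(), observed[x == 0].mean())
    full = y.mean()
    complete_case = y[observed == 1].mean()
    return full, complete_case, ipw_mean(y, observed, pi_hat)


full, complete_case, ipw = toy_example()
print(full, complete_case, ipw)
```

In this setting the complete-case mean is pulled towards the stratum that is observed more often, while the weighted mean stays close to the full-data mean because each observed individual in the sparse stratum stands in for about four.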


15 Inverse Probability Weighted (IPW) GEE
Solve: $\sum_{i=1}^{M} m(Y_i, A_i, X_i, \beta) = \sum_{i=1}^{M} \left(\frac{\partial \mu_i}{\partial \beta}\right)^T V_i^{-1} W_i (Y_i - \mu_i) = 0$
Properties:
$W_i = \mathrm{Diag}[1/\pi_{ij}]_{j=1,\dots,n_i}$ is the weighting matrix.
$\pi_{ij} = P(R_{ij} = 1 \mid A_i, X_{ij})$ is the propensity score (PS).
The PS has to be correctly specified to ensure consistency and asymptotic normality (CAN).

16 The Caveat
A wrong formula is often implemented in software (1):
$\sum_{i=1}^{M} \left(\frac{\partial \mu_i}{\partial \beta}\right)^T W_i^{1/2} V_i^{-1} W_i^{1/2} (Y_i - \mu_i) = 0$
Solution [Pepe et al. 1992, for longitudinal data]: weights need to be cluster-specific, or the working correlation matrix should be the identity.
(1) R (geepack), SAS (GENMOD needs to be used with observation-specific weights), ... In any case, check the manual.
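
A small numerical check makes the caveat tangible. The Python sketch below compares the matrix that multiplies the residuals in the two estimating equations, $V_i^{-1} W_i$ and $W_i^{1/2} V_i^{-1} W_i^{1/2}$, for one cluster. The exchangeable working correlation and the weights are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np


def exchangeable(n, rho):
    """Exchangeable working correlation matrix."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))


def ipw_gee_weighting(V, w):
    """Matrix multiplying the residuals in the IPW-GEE: V^{-1} W."""
    return np.linalg.inv(V) @ np.diag(w)


def symmetric_weighting(V, w):
    """Matrix multiplying the residuals in the other formula: W^{1/2} V^{-1} W^{1/2}."""
    s = np.diag(np.sqrt(w))
    return s @ np.linalg.inv(V) @ s


n = 4
V_exch = exchangeable(n, 0.3)
w_varying = np.array([1.0, 4.0, 2.0, 1.5])   # observation-specific weights
w_constant = np.full(n, 2.0)                 # one weight for the whole cluster

# Observation-specific weights with a non-identity correlation: the formulas differ.
print(np.allclose(ipw_gee_weighting(V_exch, w_varying),
                  symmetric_weighting(V_exch, w_varying)))
# Cluster-specific weights: the formulas agree.
print(np.allclose(ipw_gee_weighting(V_exch, w_constant),
                  symmetric_weighting(V_exch, w_constant)))
# Identity working correlation: the formulas agree.
print(np.allclose(ipw_gee_weighting(np.eye(n), w_varying),
                  symmetric_weighting(np.eye(n), w_varying)))
```

The two formulas coincide exactly in the two situations named in the solution, which is why either choice avoids the problem.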

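17 Simulation - Toy example
Settings: $Age_{ij} \sim N(30, 10)$; $M = 100$ and $n_i = 100$; $u_i$ such that ICC = 0.05; $R = 1000$ replicates.
Generation (numeric coefficients other than 0.1 not preserved):
$\mathrm{logit}\, P(Y_{ij} = 1)$: linear in $A_i$, $Age_{ij}$ and $Age_{ij} A_i$, plus $u_i$.
$\mathrm{logit}\, P(R_{ij} = \text{missing})$: linear in $A_i$, $0.1\, Age_{ij}$ and $Age_{ij} A_i$.
Results table (values not preserved): Bias, SE and Coverage under Independence and Exchangeable working correlations, for the R packages CRTgeeDR, geepack and geem.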

18 More...
Turner E. and Prague M. GEE Analysis of Cluster Randomized Controlled Trial Data with Missing Outcomes: A tutorial in inverse probability weighting methods. Submitted, International Journal of Epidemiology.
Liang and Zeger (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13-22.
Qu et al. (2000) Improving generalized estimating equations using inference functions. Biometrika 87.

19 3 Going further with Marginal Models: Doubly Robust, TMLE, ...

20 The imbalance of baseline covariates problem
A pronounced baseline imbalance is not expected a priori in a CRT: if the randomisation process has worked correctly, any observed imbalance is necessarily a chance phenomenon.
It impacts efficiency but not bias.

21 Doubly Robust GEE Estimator (implemented in the R package CRTgeeDR)
Outcome Model (OM): $B_{ij}(X_i, A_i) = E(Y_{ij} \mid A_i, X_i)$
Propensity Score (PS): $[W_i]_{jj} = R_{ij} / P(R_{ij} = 1 \mid X_i, A_i)$
Unbiased if the OM or the PS corresponds to the true data generation process.
GEE doubly robust estimator:
$\sum_i \left\{ D_i^T V_i^{-1} W_i [Y_i - B_i(X_i, A_i)] + \sum_{a=0,1} D_i(a)^T V_i^{-1} [B_i(X_i, a) - \mu(\beta, a)] \right\} = 0$
$W_i$ (missing data): more weight to individuals unlikely to be observed.
Second term: augmentation for unbalanced covariates.
Bracketed terms: distance between the data ($Y$) and the models ($\mu$, $B$).
$\mu$: model of interest. $V_i$: correlations. $D_i$: design matrix.
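
The "OM or PS" property can be illustrated with a deliberately simplified Python sketch. It is not the CRTgeeDR estimator: it assumes independent individuals, a single arm, and the standard augmented inverse-probability-weighted mean $\frac{1}{n}\sum_j \{ B_j + R_j (Y_j - B_j) / \pi_j \}$. The "correct" models are taken to be the true simulation functions rather than fitted models, and all names and numeric values are hypothetical.

```python
import numpy as np


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def aipw_mean(y, r, pi, b):
    """Augmented IPW mean: average of b + r / pi * (y - b)."""
    y_filled = np.where(r == 1, y, 0.0)      # missing outcomes never enter (r = 0)
    return np.mean(b + r / pi * (y_filled - b))


def double_robust_demo(n=50000, seed=2):
    """Compare estimators of E(Y) = 1 when outcomes are missing depending on x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)   # true mean of Y is 1
    pi_true = expit(0.5 + x)                 # true propensity of being observed
    r = (rng.random(n) < pi_true).astype(int)
    b_true = 1.0 + 2.0 * x                   # correct outcome model
    pi_wrong = np.full(n, r.mean())          # misspecified PS: ignores x
    b_wrong = np.zeros(n)                    # misspecified OM: predicts 0 for everyone
    return {
        "complete_case": y[r == 1].mean(),
        "ps_right_om_wrong": aipw_mean(y, r, pi_true, b_wrong),
        "ps_wrong_om_right": aipw_mean(y, r, pi_wrong, b_true),
        "both_wrong": aipw_mean(y, r, pi_wrong, b_wrong),
    }


results = double_robust_demo()
for name, value in results.items():
    print(name, round(value, 3))
```

With either model correct the estimate stays near the true mean, whereas with both misspecified it collapses to the biased complete-case mean in this particular setup.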

22 tmle (soon in R ctmle)

23 South African Man Study [Jemmott et al. (2014)]
Population: men [...] y.o., sexually active, consent / completed the baseline survey.
Intervention - HIV reduction: strengthen behavioral beliefs that support condom use; increase skill and self-efficacy to use condoms; increase HIV/STI risk-reduction knowledge.
Control - Health promotion: adhere to physical-activity guidelines; have a diet with 5-a-Day fruit-and-vegetable consumption; limit fat and alcohol intake.

24 South African Man Study [Jemmott et al. (2014)]
Outcome: frequency of protected intercourse
Missing data:
                                  HIV/STI group       Control group
  Y                               64% [26%; 100%]     60% [22%; 100%]
  R_Y                             20.8%               17.5%
Imbalance in baseline covariates:
                                  HIV/STI group       Control group
  % Married in the neighbourhood  19.1%               19.2%
  % Married in the sample         4.4%                7.2%

25 South African Man Study [Jemmott et al. (2014)]
Primary outcome: frequency of protected intercourse
Table columns: HIV/STI intervention effect, SD, p-value. Rows: GEE (biased), IPW-GEE, DR-GEE. (Numeric values not preserved.)
Secondary outcome: frequency of protected intercourse with casual partner
Table columns: HIV/STI intervention effect, SD, p-value. Rows: GEE (biased), IPW-GEE, DR-GEE. (Numeric values not preserved.)

26 More...
M. Prague, R. Wang, E. Tchetgen Tchetgen and V. De Gruttola. Accounting for Interactions and Complex Inter-Subject Dependency in Estimating Treatment Effect in Cluster Randomized Trials With Missing Outcomes. (2016) Biometrics 72(4).
M. Prague, R. Wang and V. De Gruttola. CRTgeeDR: An R Package for Doubly Robust Generalized Estimating Equations Estimations in Cluster Randomised Trials with Missing Data. In revision, R Journal.
John B. Jemmott III et al. Cluster-Randomized Controlled Trial of an HIV/Sexually Transmitted Infection Risk-Reduction Intervention for South African Men. (2013) American Journal of Public Health 104(3).
van der Laan and Robins. Unified Methods for Censored Longitudinal Data and Causality. Springer.

27 6 Conclusion: Discussion and future works

28 Going further
Take-home messages:
Marginal models are most suited to estimate the intervention effect in two-level CRTs.
If using marginal estimation, you must adjust for missing data.
IPW for CRTs in standard software may be biased due to implementation.
Using a doubly robust or TMLE approach may improve efficiency (e.g. R packages CRTgeeDR, tmle).

29 Acknowledgement and Funding
Main sources of funding: R37 AI (PI: V. De Gruttola), R01MH (PI: S. Little)
Other acknowledgements: R01 HD (PI: J. Jemmott), NCRR 1S10RR (Cluster HMS), R01 AI24643 (PI: Rui Wang)
And my new affiliation:

30 Thanks and happy to take questions! SISTM Inria, Bordeaux, Sud-ouest, France


More information

Conceptual overview: Techniques for establishing causal pathways in programs and policies

Conceptual overview: Techniques for establishing causal pathways in programs and policies Conceptual overview: Techniques for establishing causal pathways in programs and policies Antonio A. Morgan-Lopez, Ph.D. OPRE/ACF Meeting on Unpacking the Black Box of Programs and Policies 4 September

More information

Journal of Biostatistics and Epidemiology

Journal of Biostatistics and Epidemiology Journal of Biostatistics and Epidemiology Methodology Marginal versus conditional causal effects Kazem Mohammad 1, Seyed Saeed Hashemi-Nazari 2, Nasrin Mansournia 3, Mohammad Ali Mansournia 1* 1 Department

More information

Some methods for handling missing values in outcome variables. Roderick J. Little

Some methods for handling missing values in outcome variables. Roderick J. Little Some methods for handling missing values in outcome variables Roderick J. Little Missing data principles Likelihood methods Outline ML, Bayes, Multiple Imputation (MI) Robust MAR methods Predictive mean

More information

Longitudinal Modeling with Logistic Regression

Longitudinal Modeling with Logistic Regression Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to

More information

ST 790, Homework 1 Spring 2017

ST 790, Homework 1 Spring 2017 ST 790, Homework 1 Spring 2017 1. In EXAMPLE 1 of Chapter 1 of the notes, it is shown at the bottom of page 22 that the complete case estimator for the mean µ of an outcome Y given in (1.18) under MNAR

More information

Analysis of propensity score approaches in difference-in-differences designs

Analysis of propensity score approaches in difference-in-differences designs Author: Diego A. Luna Bazaldua Institution: Lynch School of Education, Boston College Contact email: diego.lunabazaldua@bc.edu Conference section: Research methods Analysis of propensity score approaches

More information

Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings

Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings Vector-Based Kernel Weighting: A Simple Estimator for Improving Precision and Bias of Average Treatment Effects in Multiple Treatment Settings Jessica Lum, MA 1 Steven Pizer, PhD 1, 2 Melissa Garrido,

More information

Conditional Inference Functions for Mixed-Effects Models with Unspecified Random-Effects Distribution

Conditional Inference Functions for Mixed-Effects Models with Unspecified Random-Effects Distribution Conditional Inference Functions for Mixed-Effects Models with Unspecified Random-Effects Distribution Peng WANG, Guei-feng TSAI and Annie QU 1 Abstract In longitudinal studies, mixed-effects models are

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

A Decision Theoretic Approach to Causality

A Decision Theoretic Approach to Causality A Decision Theoretic Approach to Causality Vanessa Didelez School of Mathematics University of Bristol (based on joint work with Philip Dawid) Bordeaux, June 2011 Based on: Dawid & Didelez (2010). Identifying

More information

Propensity Score Analysis with Hierarchical Data

Propensity Score Analysis with Hierarchical Data Propensity Score Analysis with Hierarchical Data Fan Li Alan Zaslavsky Mary Beth Landrum Department of Health Care Policy Harvard Medical School May 19, 2008 Introduction Population-based observational

More information

Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders

Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders Comparing Adaptive Interventions Using Data Arising from a SMART: With Application to Autism, ADHD, and Mood Disorders Daniel Almirall, Xi Lu, Connie Kasari, Inbal N-Shani, Univ. of Michigan, Univ. of

More information

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes

Web-based Supplementary Materials for A Robust Method for Estimating. Optimal Treatment Regimes Biometrics 000, 000 000 DOI: 000 000 0000 Web-based Supplementary Materials for A Robust Method for Estimating Optimal Treatment Regimes Baqun Zhang, Anastasios A. Tsiatis, Eric B. Laber, and Marie Davidian

More information

Analyzing Pilot Studies with Missing Observations

Analyzing Pilot Studies with Missing Observations Analyzing Pilot Studies with Missing Observations Monnie McGee mmcgee@smu.edu. Department of Statistical Science Southern Methodist University, Dallas, Texas Co-authored with N. Bergasa (SUNY Downstate

More information

Dan Graham Professor of Statistical Modelling. Centre for Transport Studies

Dan Graham Professor of Statistical Modelling. Centre for Transport Studies Quantifying the effects of urban road capacity expansions on network performance and productivity via a mixed model generalised propensity score estimator Dan Graham Professor of Statistical Modelling

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2009 Paper 248 Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials Kelly

More information

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall 1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Duke University Medical, Dept. of Biostatistics Joint work

More information

Discussing Effects of Different MAR-Settings

Discussing Effects of Different MAR-Settings Discussing Effects of Different MAR-Settings Research Seminar, Department of Statistics, LMU Munich Munich, 11.07.2014 Matthias Speidel Jörg Drechsler Joseph Sakshaug Outline What we basically want to

More information

TECHNICAL REPORT Fixed effects models for longitudinal binary data with drop-outs missing at random

TECHNICAL REPORT Fixed effects models for longitudinal binary data with drop-outs missing at random TECHNICAL REPORT Fixed effects models for longitudinal binary data with drop-outs missing at random Paul J. Rathouz University of Chicago Abstract. We consider the problem of attrition under a logistic

More information

Data Integration for Big Data Analysis for finite population inference

Data Integration for Big Data Analysis for finite population inference for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation

More information

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9. Section 7 Model Assessment This section is based on Stock and Watson s Chapter 9. Internal vs. external validity Internal validity refers to whether the analysis is valid for the population and sample

More information

A Copula-Based Method for Analyzing Bivariate Binary Longitudinal Data

A Copula-Based Method for Analyzing Bivariate Binary Longitudinal Data University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations Fall 12-22-2010 A Copula-Based Method for Analyzing Bivariate Binary Longitudinal Data Seunghee Baek University of Pennsylvania,

More information

MISSING or INCOMPLETE DATA

MISSING or INCOMPLETE DATA MISSING or INCOMPLETE DATA A (fairly) complete review of basic practice Don McLeish and Cyntha Struthers University of Waterloo Dec 5, 2015 Structure of the Workshop Session 1 Common methods for dealing

More information

Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model

Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model Pricing and Risk Analysis of a Long-Term Care Insurance Contract in a non-markov Multi-State Model Quentin Guibert Univ Lyon, Université Claude Bernard Lyon 1, ISFA, Laboratoire SAF EA2429, F-69366, Lyon,

More information

7 Sensitivity Analysis

7 Sensitivity Analysis 7 Sensitivity Analysis A recurrent theme underlying methodology for analysis in the presence of missing data is the need to make assumptions that cannot be verified based on the observed data. If the assumption

More information

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011 INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA Belfast 9 th June to 10 th June, 2011 Dr James J Brown Southampton Statistical Sciences Research Institute (UoS) ADMIN Research Centre (IoE

More information

Variable selection and machine learning methods in causal inference

Variable selection and machine learning methods in causal inference Variable selection and machine learning methods in causal inference Debashis Ghosh Department of Biostatistics and Informatics Colorado School of Public Health Joint work with Yeying Zhu, University of

More information

CHL 5225 H Crossover Trials. CHL 5225 H Crossover Trials

CHL 5225 H Crossover Trials. CHL 5225 H Crossover Trials CHL 55 H Crossover Trials The Two-sequence, Two-Treatment, Two-period Crossover Trial Definition A trial in which patients are randomly allocated to one of two sequences of treatments (either 1 then, or

More information

Module 6 Case Studies in Longitudinal Data Analysis

Module 6 Case Studies in Longitudinal Data Analysis Module 6 Case Studies in Longitudinal Data Analysis Benjamin French, PhD Radiation Effects Research Foundation SISCR 2018 July 24, 2018 Learning objectives This module will focus on the design of longitudinal

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2011 Paper 290 Targeted Minimum Loss Based Estimation of an Intervention Specific Mean Outcome Mark

More information

Peng Li * and David T Redden

Peng Li * and David T Redden Li and Redden BMC Medical Research Methodology (2015) 15:38 DOI 10.1186/s12874-015-0026-x RESEARCH ARTICLE Open Access Comparing denominator degrees of freedom approximations for the generalized linear

More information

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i. Weighting Unconfounded Homework 2 Describe imbalance direction matters STA 320 Design and Analysis of Causal Studies Dr. Kari Lock Morgan and Dr. Fan Li Department of Statistical Science Duke University

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2008 Paper 241 A Note on Risk Prediction for Case-Control Studies Sherri Rose Mark J. van der Laan Division

More information

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i, A Course in Applied Econometrics Lecture 18: Missing Data Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. When Can Missing Data be Ignored? 2. Inverse Probability Weighting 3. Imputation 4. Heckman-Type

More information

Covariate selection and propensity score specification in causal inference

Covariate selection and propensity score specification in causal inference Covariate selection and propensity score specification in causal inference Ingeborg Waernbaum Doctoral Dissertation Department of Statistics Umeå University SE-901 87 Umeå, Sweden Copyright c 2008 by Ingeborg

More information