Combining multiple observational data sources to estimate causal effects
Transcription
1 Combining multiple observational data sources to estimate causal effects
Shu Yang*, Department of Statistics, North Carolina State University
Joint work with Peng Ding, UC Berkeley
May 23, 2018, Atlantic Causal Inference Conference
2 A textbook setup: one observational study
Causal inference is a central goal of research in the social, health, and economic sciences.
One main statistical approach is the potential outcomes framework (Rubin, 1974):
- binary treatment $A$, potential outcomes $Y(1)$ and $Y(0)$, observed outcome $Y = Y(A)$
- treatment effects: $Y(1) - Y(0)$ and $E\{Y(1) - Y(0)\}$
A key assumption is unconfoundedness, or ignorability: $A \perp \{Y(1), Y(0)\} \mid \text{observed covariates}$.
Causal effects can be identified and estimated using regression imputation, (augmented) inverse probability weighting, and matching.
3 A modern setup: multiple data sources
Multiple data sources are increasingly available.
- New opportunities: more information
- New challenges: e.g., varying levels of confounder information, different sampling schemes and data structures, various modes of output, and so on
Motivating example: the effect of chronic obstructive pulmonary disease on herpes zoster
- Yang et al. (2011): positive association; 2005 Longitudinal Health Insurance Database in Taiwan (big main data); missing confounders: cigarette smoking and alcohol consumption
- Lin and Chen (2014): causal analysis with more confounders; 2005 National Health Interview Survey in Taiwan (validation data); regression analysis
Goal: a general framework to estimate causal effects, combining
- big main data with unmeasured confounders
- smaller validation data with supplementary information on these confounders
4 Two common types of studies
Classical two-phase samples (Neyman, 1938; Cochran, 2007)
- some variables (e.g., $A$, $X$, and $Y$) may be cheaper to measure
- some variables (e.g., $U$) are more expensive
- first phase: easy-to-obtain variables measured for all units
- second phase: expensive variables measured for a validation subsample
- large literature, e.g., Breslow et al. (1988, 1997ab), Schill et al. (1997), and Breslow et al. (2003) for (logistic) regression
Combining multiple big data sources
- validation data with full information on $(A, X, U, Y)$
- external big data with only $(A, X, Y)$ (e.g., from electronic health records, claims databases, disease data registries, and census data)
- Chatterjee et al. (2016, 2017) for parametric regression analyses
5 Notation: two data sources
Notation
- binary treatment $A \in \{0, 1\}$, potential outcomes $Y(0)$ and $Y(1)$, observed outcome $Y = Y(A)$
- pretreatment covariates $(X, U)$, where $X$ is fully observed, but $U$ may not be fully observed
Two observed data sources with nested two-phase structure (for illustration)
- main data $O_1 = \{(A_i, X_i, Y_i) : i \in S_1\}$ with size $n_1 = |S_1|$
- validation data $O_2 = \{(A_j, X_j, U_j, Y_j) : j \in S_2\}$ with size $n_2 = |S_2|$
Observed-data structure (✓ = observed, ? = not observed):
Unit       X   U   A   Y(1)  Y(0)  Y   Source
1          ✓   ✓   1   ✓     ?     ✓   validation data ($O_2$)
...
$n_2$      ✓   ✓   0   ?     ✓     ✓   validation data ($O_2$)
$n_2 + 1$  ✓   ?   1   ✓     ?     ✓   main data ($O_1$)
...
$n_1$      ✓   ?   0   ?     ✓     ✓   main data ($O_1$)
6 Sampling and parameter of interest
$S_1$ for the main data is a random sample from a super-population
- $\{A_i, X_i, U_i, Y_i(0), Y_i(1) : i \in S_1\}$ are IID
$S_2$ for the validation data is a simple random sample from $S_1$
- $\{A_j, X_j, U_j, Y_j(0), Y_j(1) : j \in S_2\}$ are also IID
- later relax $S_2$ to be a general probability sample
Estimand of interest is, e.g., the average causal effect (ACE) $\tau = E\{Y(1) - Y(0)\}$
- drop the indices $i$ and $j$ in the expectation because of IID
7 Assumptions for identification
Assumption (Ignorability): $Y(a) \perp A \mid (X, U)$ for $a = 0$ and $1$.
It implies that, for $a = 0, 1$,
$$P\{A = 1 \mid X, U, Y(a)\} = P(A = 1 \mid X, U) = e(X, U)$$
and
$$E\{Y(a) \mid X, U\} = E\{Y(a) \mid A = a, X, U\} = E(Y \mid A = a, X, U) = \mu_a(X, U).$$
Assumption (Overlap): There exist constants $c_1$ and $c_2$ such that, with probability 1, the propensity score is bounded, i.e., $0 < c_1 \le e(X, U) \le c_2 < 1$.
8 Assumptions for estimation
The outcome distribution and the propensity score are often unknown.
Assumption (Outcome model): The parametric model $\mu_a(X, U; \beta_a)$ is a correct specification for $\mu_a(X, U)$, for $a = 0, 1$; i.e., $\mu_a(X, U) = \mu_a(X, U; \beta_a^*)$, where $\beta_a^*$ is the true model parameter, for $a = 0, 1$.
Assumption (Propensity score model): The parametric model $e(X, U; \alpha)$ is a correct specification for $e(X, U)$; i.e., $e(X, U) = e(X, U; \alpha^*)$, where $\alpha^*$ is the true model parameter.
Consistency of different estimators requires different assumptions.
9 Commonly-used estimators: regression, weighting (based on validation data only)
Example (Regression imputation)
$$\hat\tau_{\mathrm{reg},2} = \frac{1}{n_2} \sum_{j \in S_2} \left\{ \mu_1(X_j, U_j; \hat\beta_1) - \mu_0(X_j, U_j; \hat\beta_0) \right\}$$
- consistent if the outcome model is correctly specified
Example (Inverse probability weighting, IPW)
$$\hat\tau_{\mathrm{IPW},2} = \frac{1}{n_2} \sum_{j \in S_2} \frac{A_j Y_j}{e(X_j, U_j; \hat\alpha)} - \frac{1}{n_2} \sum_{j \in S_2} \frac{(1 - A_j) Y_j}{1 - e(X_j, U_j; \hat\alpha)}$$
- consistent if the propensity score model is correctly specified
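The two validation-data estimators above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the outcome models are taken to be linear in $(X, U)$ within each arm, the propensity score is taken to be logistic, and the helper names (`fit_linear`, `fit_logistic`, `tau_reg`, `tau_ipw`) are hypothetical and not from the talk.

```python
import numpy as np

def _design(Z):
    """Prepend an intercept column to the covariate matrix Z = (X, U)."""
    return np.column_stack([np.ones(Z.shape[0]), Z])

def fit_linear(Z, y):
    """Least-squares fit of a linear outcome model (an assumed model form)."""
    beta, *_ = np.linalg.lstsq(_design(Z), y, rcond=None)
    return beta

def fit_logistic(Z, a, n_iter=25):
    """Logistic propensity model fitted by Newton-Raphson (an assumed model form)."""
    D = _design(Z)
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-D @ alpha))
        grad = D.T @ (a - p)
        hess = (D * (p * (1.0 - p))[:, None]).T @ D
        alpha = alpha + np.linalg.solve(hess, grad)
    return alpha

def tau_reg(Z, a, y):
    """Regression imputation: average of mu_1 - mu_0 over all validation units."""
    b1 = fit_linear(Z[a == 1], y[a == 1])
    b0 = fit_linear(Z[a == 0], y[a == 0])
    D = _design(Z)
    return float(np.mean(D @ b1 - D @ b0))

def tau_ipw(Z, a, y):
    """IPW: difference of inverse-propensity-weighted outcome means."""
    e = 1.0 / (1.0 + np.exp(-_design(Z) @ fit_logistic(Z, a)))
    return float(np.mean(a * y / e) - np.mean((1 - a) * y / (1 - e)))
```

Both functions take the stacked covariates `Z`, the binary treatment `a`, and the outcome `y` of the validation sample as NumPy arrays; standard errors are deliberately left out of this sketch.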
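10 AIPW (based on validation data only)
Example (Augmented inverse probability weighting)
The AIPW estimator is $\hat\tau_{\mathrm{AIPW},2} = n_2^{-1} \sum_{j \in S_2} \hat\tau_{\mathrm{AIPW},2,j}$, where, writing $\hat e_j = e(X_j, U_j; \hat\alpha)$ and $\hat\mu_{a,j} = \mu_a(X_j, U_j; \hat\beta_a)$,
$$\hat\tau_{\mathrm{AIPW},2,j} = \left\{ \frac{A_j Y_j}{\hat e_j} - \frac{A_j - \hat e_j}{\hat e_j}\, \hat\mu_{1,j} \right\} - \left\{ \frac{(1 - A_j) Y_j}{1 - \hat e_j} + \frac{A_j - \hat e_j}{1 - \hat e_j}\, \hat\mu_{0,j} \right\}$$
- doubly robust: consistent if either the outcome model or the propensity score model is correct
- locally efficient if both the outcome model and the propensity score model are correct
$\hat\tau_{\mathrm{reg},2}$, $\hat\tau_{\mathrm{IPW},2}$, and $\hat\tau_{\mathrm{AIPW},2}$ are regular asymptotically linear (RAL):
$$\hat\tau_2 - \tau = n_2^{-1} \sum_{j \in S_2} \psi(A_j, X_j, U_j, Y_j) + o_P(n_2^{-1/2})$$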
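A minimal sketch of the AIPW average, written in the standard doubly robust form and assuming the fitted propensity scores and fitted outcome means are already available as arrays. The function names and the plug-in standard error are illustrative assumptions, not part of the talk; the standard error ignores the uncertainty from estimating the nuisance models.

```python
import numpy as np

def aipw_terms(a, y, e, mu1, mu0):
    """Per-unit AIPW contributions; their sample mean is the AIPW estimate.

    a, y     : treatment indicator and observed outcome
    e        : fitted propensity scores e(X, U; alpha_hat)
    mu1, mu0 : fitted outcome means under treatment and control
    """
    treated = a * y / e - (a - e) / e * mu1
    control = (1 - a) * y / (1 - e) + (a - e) / (1 - e) * mu0
    return treated - control

def aipw(a, y, e, mu1, mu0):
    """Return the AIPW estimate and a rough plug-in standard error."""
    terms = aipw_terms(a, y, e, mu1, mu0)
    est = float(terms.mean())
    # Rough SE: treats the fitted nuisance functions as if they were known.
    se = float(terms.std(ddof=1) / np.sqrt(len(terms)))
    return est, se
```

Passing a deliberately wrong outcome model together with the correct propensity score (or the reverse) is a quick way to see the double robustness property numerically.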
11 Matching (based on validation data only)
Imputing counterfactual outcomes of unit $j$ via matching
- matching based on $V$ (e.g., $(X, U)$), fixed $M$, with replacement
Example (Matching)
$$\hat\tau^{(0)}_{\mathrm{mat},2} = n_2^{-1} \sum_{j \in S_2} \{\hat Y_j(1) - \hat Y_j(0)\}$$
Abadie and Imbens (2006):
- biased if matching on a $p$-dimensional variable ($p \ge 2$)
- bias-corrected estimator $\hat\tau_{\mathrm{mat},2}$
- asymptotically linear and Normal by martingale theory: $\hat\tau_{\mathrm{mat},2} - \tau = n_2^{-1} \sum_{j \in S_2} \psi_{\mathrm{mat},j}$
- not regular (functional forms are not smooth for fixed numbers of matches)
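The simple (not bias-corrected) matching estimator $\hat\tau^{(0)}_{\mathrm{mat},2}$ can be sketched as follows. The Euclidean distance and the function name `matching_estimate` are assumptions made for illustration; the bias correction of Abadie and Imbens (2006) is not implemented here.

```python
import numpy as np

def matching_estimate(V, a, y, M=1):
    """Nearest-neighbour matching with replacement on V, using M matches.

    The missing potential outcome of each unit is imputed by the average
    outcome of its M closest units in the opposite treatment arm.
    """
    V = np.asarray(V, dtype=float)
    n = len(y)
    # Pairwise Euclidean distances (an assumed metric); O(n^2) memory.
    dist = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)
    y1_hat = np.empty(n)
    y0_hat = np.empty(n)
    for j in range(n):
        opposite = np.flatnonzero(a != a[j])
        nearest = opposite[np.argsort(dist[j, opposite])[:M]]
        imputed = y[nearest].mean()
        if a[j] == 1:
            y1_hat[j], y0_hat[j] = y[j], imputed
        else:
            y1_hat[j], y0_hat[j] = imputed, y[j]
    return float(np.mean(y1_hat - y0_hat))
```

Because the full distance matrix is formed, this version is only suitable for moderate sample sizes such as a validation sample.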
12 A general strategy for efficient estimation combining main and validation data
Validation data (credible): consistent but inefficient estimators
Main big data (powerful): large sample size but error-prone
- large sample size, some information
- naively applying existing estimators is error-prone
How to leverage both data sources to improve efficiency? A simple idea:
$$n_2^{1/2} \begin{pmatrix} \hat\tau_2 - \tau \\ \hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}} \end{pmatrix} \xrightarrow{d} N\left\{ 0_{L+1}, \begin{pmatrix} v_2 & \Gamma^T \\ \Gamma & V \end{pmatrix} \right\}$$
- $\hat\tau_2$: a consistent estimator from the validation data
- $\hat\tau_{2,\mathrm{ep}}$ and $\hat\tau_{1,\mathrm{ep}}$: two error-prone estimators with the same bias
- asymptotically Normal: applies to all the estimators reviewed before
- consistent variance estimators $\hat v_2$, $\hat\Gamma$, and $\hat V$
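13 Strategy: eliminate conditional bias and improve efficiency
If the sampling distribution holds exactly, then
$$n_2^{1/2}(\hat\tau_2 - \tau) \,\big|\, n_2^{1/2}(\hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}}) \sim N\Big\{ \underbrace{n_2^{1/2}\, \Gamma^T V^{-1} (\hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}})}_{\text{conditional bias}},\ \underbrace{v_2 - \Gamma^T V^{-1} \Gamma}_{\text{conditional variance}} \Big\}$$
Correction for conditional bias:
$$\hat\tau = \hat\tau_2 - \hat\Gamma^T \hat V^{-1} (\hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}})$$
More efficient estimator achieving the conditional variance:
$$n_2^{1/2}(\hat\tau - \tau) \xrightarrow{d} N(0,\ v_2 - \Gamma^T V^{-1} \Gamma)$$
Asymptotic variance estimator: $\hat v = (\hat v_2 - \hat\Gamma^T \hat V^{-1} \hat\Gamma)/n_2$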
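The bias correction and its variance estimator are plain linear algebra once $\hat v_2$, $\hat\Gamma$, and $\hat V$ are available. The sketch below assumes those quantities are supplied by the user; the helper `moments_from_replicates`, which derives them from resampling replicates of $(\hat\tau_2, \hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}})$, is a hypothetical convenience and not the bootstrap procedure of the talk.

```python
import numpy as np

def combine(tau2, tau2_ep, tau1_ep, v2, Gamma, V, n2):
    """Return (tau_hat, v_hat) for the combined estimator.

    tau_hat = tau2 - Gamma^T V^{-1} (tau2_ep - tau1_ep)
    v_hat   = (v2 - Gamma^T V^{-1} Gamma) / n2
    """
    Gamma = np.atleast_1d(np.asarray(Gamma, dtype=float))
    V = np.atleast_2d(np.asarray(V, dtype=float))
    diff = np.atleast_1d(np.asarray(tau2_ep, dtype=float)
                         - np.asarray(tau1_ep, dtype=float))
    coef = np.linalg.solve(V, Gamma)  # V^{-1} Gamma (V is symmetric)
    tau_hat = tau2 - coef @ diff
    v_hat = (v2 - Gamma @ coef) / n2
    return float(tau_hat), float(v_hat)

def moments_from_replicates(rep_tau2, rep_diff, n2):
    """Hypothetical helper: estimate (v2, Gamma, V) from B replicates.

    rep_tau2 : shape (B,), replicates of tau2_hat
    rep_diff : shape (B,) or (B, L), replicates of tau2_ep - tau1_ep
    The sample covariance is scaled by n2 to match the n2^{1/2} normalization.
    """
    stacked = np.column_stack([rep_tau2, rep_diff])
    cov = n2 * np.cov(stacked, rowvar=False)
    return cov[0, 0], cov[1:, 0], cov[1:, 1:]
```

For example, with a scalar error-prone estimator, `combine(0.02, [0.03], [0.025], v2=4.0, Gamma=[1.5], V=[[1.0]], n2=100)` subtracts `1.5 * 0.005` from `0.02` and reduces the variance from `4.0 / 100` to `(4.0 - 2.25) / 100`.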
14 Some remarks on our strategy
$\hat\tau$ is asymptotically optimal among
- linear combinations $\{\hat\tau_2 + \lambda^T (\hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}}) : \lambda \in R^L\}$ (e.g., Fuller 2009)
- the class of estimators $\{\hat\tau = f(\hat\tau_2, \hat\tau_{1,\mathrm{ep}}, \hat\tau_{2,\mathrm{ep}}) : f(x, y, z) \text{ is smooth and } \hat\tau \text{ is consistent for } \tau\}$
Error-prone estimators do not need to be consistent for $\tau$
- $\hat\tau_{d,\mathrm{ep}}$ ($d = 1, 2$) can be a vector in $R^L$
- the only requirement is $\hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}} \xrightarrow{P} 0$
Choice of $\hat\tau_{d,\mathrm{ep}}$ ($d = 1, 2$)
- increasing the dimension will increase the asymptotic efficiency of $\hat\tau$
- increasing the dimension may harm the finite sample properties
- suggestion: $\hat\tau_{d,\mathrm{ep}}$ is of the same type as $\hat\tau_2$
15 Compare to the literature
- Survey calibration weighting (e.g., Deville and Särndal, 1992; Fuller, 2009)
- Generalized method of moments (e.g., Imbens and Lancaster 1994)
- Constrained empirical likelihood (e.g., Chen and Sitter 1999)
- Regression analyses of two-phase sampling (e.g., Chatterjee et al. 2016)
- Optimality issues (e.g., Deville and Särndal 1992)
Advantages of our strategy
- simple, requires only standard software for existing methods
- can deal with estimators not derived from moment conditions, e.g., matching
- does not require a correct model specification of $U$ given $(A, X, Y)$
- coupled with a unified wild bootstrap procedure for inference
16 An application
Exposure $A$: chronic obstructive pulmonary disease (COPD)
- causes systemic inflammation
- dysregulates a patient's immune function
Outcome $Y$: development of herpes zoster (HZ)
Main data used by Yang et al. (2011): without $U$ (smoking and alcohol consumption)
- 2005 Longitudinal Health Insurance Database in Taiwan
- 8,486 subjects having COPD ($A = 1$) and 33,944 subjects not ($A = 0$)
Validation data used by Lin and Chen (2014)
- 2005 National Health Interview Survey in Taiwan
- comparable to the main study sample
- 244 subjects diagnosed with COPD and 904 subjects not
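17 Average causal effect estimation
Estimator and 95% CI (the Est and SE values are missing from this transcript):
$\hat\tau_{\mathrm{reg},2}$               (-0.0047, 0.0402)
$\hat\tau_{\mathrm{reg},2\&\mathrm{reg}}$         (0.0109, 0.0200)
$\hat\tau_{\mathrm{AIPW},2}$              (-0.0044, 0.0402)
$\hat\tau_{\mathrm{AIPW},2\&\mathrm{AIPW}}$       (0.0109, 0.0203)
$\hat\tau_{\mathrm{IPW},2}$               (-0.0048, 0.0398)
$\hat\tau_{\mathrm{IPW},2\&\mathrm{IPW}}$         (0.0108, 0.0202)
$\hat\tau_{\mathrm{mat},2}$               (-0.0101, 0.0273)
$\hat\tau_{\mathrm{mat},2\&\mathrm{mat}}$         (-0.0011, 0.0183)
Conclusions
- combining the main and validation data improves efficiency
- the matching estimator has the least improvement from using two-phase sampling (similar phenomenon in simulation)
- on average, COPD increases the probability of HZ by 1.55%
Caveat: the causal interpretation relies on the assumption that all confounders are measured in the validation data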
18 More general two-phase sampling
Let $I_i$ be the indicator of selecting unit $i$ into the validation data
- $I_i$ is the missing data indicator of $U_i$
We have assumed that $S_2$ is a simple random sample from $S_1$
- i.e., $I \perp (A, X, U, Y)$, or $U$ is missing completely at random (MCAR)
We now relax this to allow more general sampling.
Assumption: $\{(I_i, A_i, X_i, U_i, Y_i) : i \in S_1\}$ are IID; $S_2$ is selected from $S_1$ with a known inclusion probability $\pi = P(I = 1 \mid A, X, U, Y) > c$ for some positive constant $c$.
We can allow $\pi$ to depend on unknown parameters.
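19 More general two-phase sampling
Obtain $\hat\tau_{1,\mathrm{ep}}$ as before from $S_1$.
Obtain $\hat\tau_2$ and $\hat\tau_{2,\mathrm{ep}}$ using the weighted procedures with sampling weight $\pi_j^{-1}$ for unit $j$ in $S_2$.
Theorem: Under certain regularity conditions, the joint asymptotic normality holds for the sampling weighted estimator:
$$n_2^{1/2} \begin{pmatrix} \hat\tau_2 - \tau \\ \hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}} \end{pmatrix} \xrightarrow{d} N\left\{ 0_{L+1}, \begin{pmatrix} v_2 & \Gamma^T \\ \Gamma & V \end{pmatrix} \right\}$$
The proposed estimator is $\hat\tau = \hat\tau_2 - \hat\Gamma^T \hat V^{-1} (\hat\tau_{2,\mathrm{ep}} - \hat\tau_{1,\mathrm{ep}})$.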
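One possible reading of "weighted procedures" is sketched below for regression imputation: the outcome models are fitted by weighted least squares with weights $\pi_j^{-1}$, and the imputed effects are averaged with the same normalized weights. The linear outcome model, the normalization of the weights, and the function names are assumptions for illustration and may differ from the procedures used in the talk.

```python
import numpy as np

def _design(Z):
    """Prepend an intercept column to the covariate matrix."""
    return np.column_stack([np.ones(Z.shape[0]), Z])

def fit_wls(Z, y, w):
    """Weighted least squares via rescaling rows by sqrt(w)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(_design(Z) * sw[:, None], y * sw, rcond=None)
    return beta

def weighted_tau_reg(Z, a, y, pi):
    """Sampling-weighted regression imputation on the validation sample.

    Z, a, y : covariates (X, U), treatment, outcome for units in S_2
    pi      : known inclusion probabilities of those units
    """
    w = 1.0 / np.asarray(pi, dtype=float)
    b1 = fit_wls(Z[a == 1], y[a == 1], w[a == 1])
    b0 = fit_wls(Z[a == 0], y[a == 0], w[a == 0])
    D = _design(Z)
    effect = D @ b1 - D @ b0
    # Normalized (Hajek-type) weighted average; the normalization is an assumption.
    return float(np.sum(w * effect) / np.sum(w))
```

Setting `pi` to a constant recovers the unweighted estimator, which makes it easy to check how much the sampling weights matter when selection into the validation sample depends on the covariates.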
20 Connection with the missing data literature
Proposition: $\hat\tau$ has an asymptotic linear form
$$n_1^{1/2}(\hat\tau - \tau) = n_1^{-1/2} \sum_{i \in S_1} \left\{ \frac{I_i}{\pi_i}\, s(A_i, X_i, U_i, Y_i) - \left( \frac{I_i}{\pi_i} - 1 \right) \kappa(A_i, X_i, Y_i) \right\} + o_P(1),$$
where $s(A_i, X_i, U_i, Y_i)$ is $\psi(A_i, X_i, U_i, Y_i)$ for RAL estimators and $\psi_{\mathrm{mat},i}$ for the matching estimator, and a similar definition applies to the $\phi_i$ term in $\kappa(A_i, X_i, Y_i) = \Gamma^T V^{-1} \phi_i$.
Robins et al. (1994) discussed optimality: $\kappa_{\mathrm{opt}}(A, X, Y) = E\{s(A, X, U, Y) \mid A, X, Y\}$.
To obtain the optimal estimator, we need a model for $P(U \mid A, X, Y)$.
21 Summary
Combining big data to improve the efficiency of estimating the population ACE, based on gold-standard validation data where:
- treatment assignment is ignorable
- sampling selection is ignorable
More data fusion problems for causal inference
- covariate shift and misalignment
- sampling selection bias
- versions of treatment
- unmeasured confounding
- complex data structure
22 Thank you!
Graybill Conference Poster Session Introductions 2013 Graybill Conference in Modern Survey Statistics Colorado State University Fort Collins, CO June 10, 2013 Small Area Estimation with Incomplete Auxiliary
More informationMethods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures
Methods for inferring short- and long-term effects of exposures on outcomes, using longitudinal data on both measures Ruth Keogh, Stijn Vansteelandt, Rhian Daniel Department of Medical Statistics London
More informationComparison of Three Approaches to Causal Mediation Analysis. Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh
Comparison of Three Approaches to Causal Mediation Analysis Donna L. Coffman David P. MacKinnon Yeying Zhu Debashis Ghosh Introduction Mediation defined using the potential outcomes framework natural effects
More informationCovariate selection and propensity score specification in causal inference
Covariate selection and propensity score specification in causal inference Ingeborg Waernbaum Doctoral Dissertation Department of Statistics Umeå University SE-901 87 Umeå, Sweden Copyright c 2008 by Ingeborg
More informationAn Introduction to Causal Analysis on Observational Data using Propensity Scores
An Introduction to Causal Analysis on Observational Data using Propensity Scores Margie Rosenberg*, PhD, FSA Brian Hartman**, PhD, ASA Shannon Lane* *University of Wisconsin Madison **University of Connecticut
More informationPropensity Score Methods for Estimating Causal Effects from Complex Survey Data
Propensity Score Methods for Estimating Causal Effects from Complex Survey Data Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School
More informationTopics and Papers for Spring 14 RIT
Eric Slud Feb. 3, 204 Topics and Papers for Spring 4 RIT The general topic of the RIT is inference for parameters of interest, such as population means or nonlinearregression coefficients, in the presence
More informationCausal inference in epidemiological practice
Causal inference in epidemiological practice Willem van der Wal Biostatistics, Julius Center UMC Utrecht June 5, 2 Overview Introduction to causal inference Marginal causal effects Estimating marginal
More informationComment: Understanding OR, PS and DR
Statistical Science 2007, Vol. 22, No. 4, 560 568 DOI: 10.1214/07-STS227A Main article DOI: 10.1214/07-STS227 c Institute of Mathematical Statistics, 2007 Comment: Understanding OR, PS and DR Zhiqiang
More informationPropensity Score Methods for Causal Inference
John Pura BIOS790 October 2, 2015 Causal inference Philosophical problem, statistical solution Important in various disciplines (e.g. Koch s postulates, Bradford Hill criteria, Granger causality) Good
More informationIP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS IPW and MSM
IP WEIGHTING AND MARGINAL STRUCTURAL MODELS (CHAPTER 12) BIOS 776 1 12 IPW and MSM IP weighting and marginal structural models ( 12) Outline 12.1 The causal question 12.2 Estimating IP weights via modeling
More informationData Integration for Big Data Analysis for finite population inference
for Big Data Analysis for finite population inference Jae-kwang Kim ISU January 23, 2018 1 / 36 What is big data? 2 / 36 Data do not speak for themselves Knowledge Reproducibility Information Intepretation
More informationStructural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall
1 Structural Nested Mean Models for Assessing Time-Varying Effect Moderation Daniel Almirall Center for Health Services Research, Durham VAMC & Dept. of Biostatistics, Duke University Medical Joint work
More informationAn Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016
An Introduction to Causal Mediation Analysis Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016 1 Causality In the applications of statistics, many central questions
More informationMarginal, crude and conditional odds ratios
Marginal, crude and conditional odds ratios Denitions and estimation Travis Loux Gradute student, UC Davis Department of Statistics March 31, 2010 Parameter Denitions When measuring the eect of a binary
More informationAN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.
AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS by Stanley A. Lubanski and Peter M. Steiner UNIVERSITY OF WISCONSIN-MADISON 018 Background To make
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2010 Paper 269 Diagnosing and Responding to Violations in the Positivity Assumption Maya L. Petersen
More informationDATA-ADAPTIVE VARIABLE SELECTION FOR
DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department
More informationENTROPY BALANCING IS DOUBLY ROBUST QINGYUAN ZHAO. Department of Statistics, Stanford University DANIEL PERCIVAL. Google Inc.
ENTROPY BALANCING IS DOUBLY ROBUST QINGYUAN ZHAO Department of Statistics, Stanford University DANIEL PERCIVAL Google Inc. Abstract. Covariate balance is a conventional key diagnostic for methods used
More informationCausal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD
Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable
More informationWhen Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?
When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data? Kosuke Imai Department of Politics Center for Statistics and Machine Learning Princeton University Joint
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More information