The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates

Eun Sook Kim, Patricia Rodríguez de Gil, Jeffrey D. Kromrey, Rheta E. Lanehart, Aarti Bellara, Reginald S. Lee
Modern Modeling Methods Conference, May 21, 2013, Windsor Locks, CT

Presentation Outline
- Introduction
- Background
- Purpose
- Method
- Results: Common Support, Balance, Bias, RMSE, Type I error, CI coverage and width
- Conclusions
- Further research

Introduction

Rubin's Causal Model (RCM)

\[ T_i = Y_{1i} - Y_{0i} \]

- T_i = treatment effect for individual i
- Y_{1i} = potential outcome for treatment
- Y_{0i} = potential outcome for control

Fundamental Problem of Causal Inference

Solution to estimate causality:

\[ T = E(Y_{1i} \mid Z_i = 1) - E(Y_{0i} \mid Z_i = 0), \quad \text{where } (Y_1, Y_0) \perp Z \]

Assumptions
- Strongly ignorable treatment assignment
- Stable unit treatment value assumption (SUTVA)
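The identification argument above can be illustrated with a small simulation. This is a minimal sketch, not part of the original study: the constant effect of 0.5, the sample size, and the seed are assumptions chosen only for illustration. Because treatment Z is assigned at random, it is independent of the potential outcomes, so the difference in observed group means should recover the average treatment effect even though each unit reveals only one potential outcome.

```python
import random

random.seed(2013)  # arbitrary seed so the sketch is reproducible

n = 20000
true_effect = 0.5  # hypothetical constant treatment effect

# In a simulation both potential outcomes exist for every unit,
# which is never the case with real data.
y0 = [random.gauss(0.0, 1.0) for _ in range(n)]
y1 = [y + true_effect for y in y0]

# Random assignment makes Z independent of (Y1, Y0).
z = [1 if random.random() < 0.5 else 0 for _ in range(n)]

# Only one potential outcome is observed per unit.
y_obs = [y1[i] if z[i] == 1 else y0[i] for i in range(n)]


def mean(values):
    return sum(values) / len(values)


# Average of the individual effects T_i = Y_1i - Y_0i.
ate = mean([y1[i] - y0[i] for i in range(n)])

# E(Y_1i | Z_i = 1) - E(Y_0i | Z_i = 0), estimated from observed data.
treated = [y_obs[i] for i in range(n) if z[i] == 1]
control = [y_obs[i] for i in range(n) if z[i] == 0]
diff_in_means = mean(treated) - mean(control)

print(f"true ATE = {ate:.3f}, difference in observed means = {diff_in_means:.3f}")
```

In observational data Z is not randomized, which is why the independence condition has to be argued for conditionally on covariates rather than assumed.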

Propensity Score Methods (PSM)

Propensity Score (PS): estimate of an individual's probability of being assigned to the treatment group

\[ \operatorname{logit} P(Z = 1) = \log\frac{\hat{\pi}}{1 - \hat{\pi}} = \hat{\beta}_0 + \sum_{i=1}^{p} \hat{\beta}_i x_i \]

where p is the number of predictors.

Use the estimated propensity score to condition treatment and control groups:
- Caliper Matching
- Matching without caliper
- Stratification
- Covariance Adjustment
- PS Weighting
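As a rough illustration of estimating a propensity score with a logistic model, the sketch below fits the coefficients by plain gradient ascent using only the Python standard library. It is not the authors' SAS IML code; the function names, learning rate, iteration count, and the simulated data (two standard-normal covariates with hypothetical coefficients 0.8 and -0.5) are all assumptions made for the example.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def linear_predictor(beta, x):
    # beta[0] is the intercept; beta[1:] pair up with the covariates in x.
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def fit_logistic(X, z, lr=0.5, n_iter=300):
    """Fit logit P(Z=1 | x) = b0 + sum_j b_j x_j by gradient ascent
    on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            resid = zi - sigmoid(linear_predictor(beta, xi))
            grad[0] += resid
            for j in range(p):
                grad[j + 1] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    # Estimated probability of treatment for each unit.
    return [sigmoid(linear_predictor(beta, xi)) for xi in X]


# Hypothetical data-generating model used only for this illustration.
random.seed(21)
true_beta = [0.0, 0.8, -0.5]
X = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(1000)]
z = [1 if random.random() < sigmoid(linear_predictor(true_beta, x)) else 0 for x in X]

beta_hat = fit_logistic(X, z)
ps = propensity_scores(X, beta_hat)
print("estimated coefficients:", [round(b, 2) for b in beta_hat])
```

In practice a statistical package would fit this model by iteratively reweighted least squares; gradient ascent is used here only to keep the sketch dependency-free.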

Researcher's Decisions
- Covariate Selection
- PS Estimation
- Evaluate Common Support
- Trimming Samples
- Conditioning Methods
- Balance Properties
- Outcome Model

Covariate Selection
- Model specification error
  - Relation of covariates to outcome
  - Relation of covariates to treatment assignment
  - Brookhart et al., 2006; Rubin & Thomas, 1996; Rubin, 1997
- Measurement errors in covariates
  - Deleterious effect of measurement error on PS analysis
  - Bellara et al., 2013; Steiner, Cook, & Shadish, 2011

Background: Previous Research

Figure: Bias in Point Estimates by Covariate Reliability

Figure: Type I Error Rates by Covariate Reliability

Figure: CI Coverage by Covariate Reliability

Purpose of the Study
- To investigate the effect of covariate selection based on measurement quality on the balance and the estimation of the treatment effect in PS analysis
  - When the covariates in a sample have various levels of measurement error
  - Select a full set or a subset of reliable covariates
- To provide guidelines for selecting covariates that could reduce selection bias more efficiently in the presence of measurement errors

Method
- Simulation study
- Fully crossed factorial mixed design with 6 between-subjects factors and 3 within-subjects factors
  - Between: number of covariates, population treatment effect, covariate relationship to treatment, covariate relationship to outcome, correlation among covariates, sample size
  - Within: PS conditioning methods, covariate selection, trimming
- 2160 conditions x 7 conditioning methods x 3 covariate sets x 2 trimming
- 5000 replications
- SAS IML procedure

Method: Design factors in data generation

Between-subjects factors:
- Number of covariates: 9, 18, 27
- Population treatment effect: 0.0, 0.2, 0.5, 0.8
- Covariate relationship to treatment assignment: 0.025, 0.050, 0.100
- Covariate relationship to outcome: 0.025, 0.050, 0.100
- Correlation among covariates: 0, .2, .5
- Sample size: 50, 100, 250, 500, 1000
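A fully crossed design of this kind can be enumerated by taking the Cartesian product of the factor levels. The sketch below is only an illustration in Python (the study itself used SAS IML), and the dictionary keys are hypothetical names. Note that the levels as transcribed here cross to 1,620 cells, whereas the design slide reports 2,160 conditions, so a factor level may be missing from this transcription.

```python
from itertools import product

# Between-subjects factor levels as listed on the slide.
# The key names are hypothetical labels chosen for this sketch.
factors = {
    "n_covariates": [9, 18, 27],
    "treatment_effect": [0.0, 0.2, 0.5, 0.8],
    "cov_to_treatment": [0.025, 0.050, 0.100],
    "cov_to_outcome": [0.025, 0.050, 0.100],
    "cov_correlation": [0.0, 0.2, 0.5],
    "sample_size": [50, 100, 250, 500, 1000],
}

# One dictionary per cell of the fully crossed design.
conditions = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(len(conditions))
print(conditions[0])
```

Each cell would then be replicated (5000 times in the study) and analyzed under every within-subjects combination.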

Method: Within-Subjects Factor

PS conditioning methods:
- Ignoring covariates
- Matching without caliper (one-to-one matching)
- Matching with caliper (caliper width = .25 SD of PS)
- ANCOVA
- PS ANCOVA
- PS weighting (inverse probability of treatment weights)
- Stratification (quintile)
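Three of the conditioning methods listed above can be sketched in a few lines. This is a hedged illustration rather than the study's implementation: the function names are hypothetical, the matching is assumed to be greedy nearest-neighbor without replacement, the weights are assumed to be of the average-treatment-effect type, and the ten propensity scores are made-up values.

```python
import statistics


def iptw_weights(z, ps):
    # Inverse probability of treatment weights:
    # 1/e for treated units, 1/(1 - e) for control units.
    return [1.0 / e if zi == 1 else 1.0 / (1.0 - e) for zi, e in zip(z, ps)]


def caliper_match(z, ps, caliper_sd=0.25):
    """Greedy one-to-one nearest-neighbor matching without replacement.
    A pair is kept only if the PS distance is within caliper_sd * SD(PS)."""
    caliper = caliper_sd * statistics.pstdev(ps)
    controls = [i for i, zi in enumerate(z) if zi == 0]
    pairs = []
    for t in [i for i, zi in enumerate(z) if zi == 1]:
        if not controls:
            break
        best = min(controls, key=lambda c: abs(ps[c] - ps[t]))
        if abs(ps[best] - ps[t]) <= caliper:
            pairs.append((t, best))
            controls.remove(best)  # each control is used at most once
    return pairs


def quintile_strata(ps):
    # Assign each unit to one of five strata (0-4) by propensity score rank.
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    strata = [0] * len(ps)
    for rank, i in enumerate(order):
        strata[i] = min(4, rank * 5 // len(ps))
    return strata


# Made-up example: three treated units followed by seven controls.
z = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
ps = [0.95, 0.55, 0.30, 0.68, 0.50, 0.32, 0.20, 0.15, 0.10, 0.05]

weights = iptw_weights(z, ps)
pairs = caliper_match(z, ps)
strata = quintile_strata(ps)
print(pairs)   # treated unit 0 has no control within the caliper
print(strata)
```

The unmatched treated unit shows how a caliper trades sample size for closer matches; matching without a caliper would instead pair it with its nearest control regardless of distance.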

Method: Within-Subjects Factor: Covariate selection
- Three levels of reliability in equal proportions in a single sample: .6, .8, 1.0
- Covariate selection:
  - A full set
  - Covariates with high reliability (.8)
  - Covariates with perfect reliability only (1.0 only)
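Under classical test theory, the reliability of an observed covariate is the share of its variance that comes from the true score, which equals the squared correlation between observed and true scores. The sketch below generates fallible covariates at the three reliability levels named above. How the study actually generated measurement error is not stated on the slide, so the mixing formula, function names, sample size, and seed are assumptions.

```python
import math
import random

random.seed(7)  # arbitrary seed for reproducibility


def fallible(true_scores, reliability):
    # X = sqrt(rel) * T + sqrt(1 - rel) * E keeps Var(X) = 1
    # when Var(T) = Var(E) = 1 and T is independent of E.
    a = math.sqrt(reliability)
    b = math.sqrt(1.0 - reliability)
    return [a * t + b * random.gauss(0.0, 1.0) for t in true_scores]


def corr(u, v):
    # Pearson correlation computed from centered sums of products.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


true_scores = [random.gauss(0.0, 1.0) for _ in range(20000)]

# Squared correlation with the true score should sit close to the reliability.
squared_corr = {}
for rel in (0.6, 0.8, 1.0):
    observed = fallible(true_scores, rel)
    squared_corr[rel] = corr(true_scores, observed) ** 2
    print(f"reliability {rel:.1f}: squared correlation = {squared_corr[rel]:.3f}")
```

A covariate with reliability .6 therefore carries only about 60% true-score variance, which is why conditioning on it removes less confounding than conditioning on the error-free version.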

Results
- Common Support
- Balance
- Bias
- RMSE
- Type I error
- CI Coverage
- CI Width
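Several of these outcomes are summary statistics over replications. The sketch below shows how bias, RMSE, CI coverage, CI width, and a rejection rate are commonly computed in simulation studies; the exact definitions used by the authors are not given on the slide, and the function name and the four toy replications are hypothetical.

```python
import math


def summarize(estimates, intervals, true_effect):
    """Summarize replications of a treatment effect estimator.
    intervals holds one (lower, upper) confidence interval per replication."""
    n = len(estimates)
    errors = [est - true_effect for est in estimates]
    return {
        "bias": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "ci_coverage": sum(lo <= true_effect <= hi for lo, hi in intervals) / n,
        "ci_width": sum(hi - lo for lo, hi in intervals) / n,
        # Share of replications whose CI excludes zero. This is a
        # Type I error rate only when the true effect is zero.
        "rejection_rate": sum(not (lo <= 0.0 <= hi) for lo, hi in intervals) / n,
    }


# Toy replications, invented purely to exercise the function.
estimates = [0.4, 0.6, 0.5, 0.7]
intervals = [(0.2, 0.6), (0.4, 0.8), (0.3, 0.7), (0.55, 0.85)]

summary = summarize(estimates, intervals, true_effect=0.5)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```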

Figure: Common Support Coverage by Sample Size and Covariate Set

Figure: Common Support Coverage by Correlation among Covariates and Covariate Set

Figure: Common Support Coverage by Covariate Relation to Outcome and Covariate Set

Figure: Common Support Coverage by Covariate Relation to Treatment Assignment and Covariate Set

Figure: Common Support Coverage by Number of Covariates and Covariate Set

Figure: Distribution of Balance of Binary Covariate

Figure: Distribution of Balance of Binary Covariate When N > 100

Figure: Distribution of Balance of Continuous Covariate When N > 100

Figure: Distribution of Balance of Binary Covariates by Covariate Set When N > 100

Figure: Balance of Binary Covariates by Covariate Set and Conditioning Method with No Correlation Among Covariates (r = 0.0)

Figure: Balance of Binary Covariates by Covariate Set and Conditioning Method with High Correlation Among Covariates (r = 0.5)

Figure: Distribution of Bias by Conditioning Method (Cmatch, NoCmatch, Ignore, Ancova, PS_Ancova, Weighting, Stratify)

Figure: Distribution of Bias when N > 100 by Conditioning Method (Caliper Match, NoCaliper Match, Ignore, Ancova, PS_Ancova, Weighting, Stratify)

Figure: Distribution of Bias by Covariate Sets when N > 100

Figure: RMSE Mean Distribution by Method

Conclusions
- Model specification error had deleterious effects on propensity score analysis
  - Consistent across conditioning methods
  - Observed in most outcome variables (e.g., bias, Type I error)
  - More serious as more covariates are omitted

Conclusions
- When covariates with different levels of reliability are present in a single sample, omitting covariates with poor measurement quality is not recommended
  - More caution is needed when the covariates are highly related to the outcome
  - More caution is needed when the covariates are highly related to treatment assignment
  - More caution is needed when the sample size is large
- The degree of impact depends on the conditioning method (e.g., less impact on PS ANCOVA) and on the simulation outcome examined (e.g., negligible effect on balance)

Further Research
- Errors-in-variables model
  - Explicitly model measurement errors in propensity score estimation using the errors-in-variables logistic model rather than omitting covariates with measurement error
- Propensity score analysis with binary outcome
  - The impact of measurement error and model specification error on the estimation of binary outcomes
- The effect of misspecification of functional forms in propensity score estimation

Contact Information

Your comments and questions are valued and encouraged. Contact the author at:

Eun Sook Kim, Ph.D.
Department of Educational Measurement and Research
University of South Florida
4202 E. Fowler Ave. EDU 105
Tampa, FL 33620
Office: EDU 369
Phone: (813) 974-7692
ekim3@usf.edu