The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates
Eun Sook Kim, Patricia Rodríguez de Gil, Jeffrey D. Kromrey, Rheta E. Lanehart, Aarti Bellara, Reginald S. Lee
Modern Modeling Methods Conference, May 21, 2013, Windsor Locks, CT
Presentation Outline
- Introduction
- Background
- Purpose
- Method
- Results
  - Common Support
  - Balance
  - Bias
  - RMSE
  - Type I error
  - CI coverage and width
- Conclusions
- Further research
Introduction: Rubin's Causal Model (RCM)
- Individual treatment effect: $T_i = Y_{1i} - Y_{0i}$
  - $T_i$ = treatment effect for individual $i$
  - $Y_{1i}$ = potential outcome under treatment
  - $Y_{0i}$ = potential outcome under control
- Fundamental Problem of Causal Inference: only one of the two potential outcomes is observed for any individual
- Solution for estimating the causal effect: $T = E(Y_{1i} \mid Z_i = 1) - E(Y_{0i} \mid Z_i = 0)$, where $(Y_1, Y_0) \perp Z$
- Assumptions
  - Strongly ignorable treatment assignment
  - Stable unit treatment value assumption (SUTVA)
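A small simulation may help make the estimator on this slide concrete. The sketch below is illustrative only: the sample size, the constant effect of 0.5, and the fair-coin assignment are assumptions, not values from the study. Because assignment is randomized, the potential outcomes are independent of Z, and the observable difference in group means should land close to the average of the unobservable unit-level effects.

```python
import random

random.seed(2013)

def simulate_units(n, true_effect=0.5):
    """Return (y0, y1, z) triples; z is assigned by a fair coin flip (assumed design)."""
    units = []
    for _ in range(n):
        y0 = random.gauss(0.0, 1.0)             # potential outcome under control
        y1 = y0 + true_effect                   # potential outcome under treatment
        z = 1 if random.random() < 0.5 else 0   # randomized, so (Y1, Y0) is independent of Z
        units.append((y0, y1, z))
    return units

def mean(values):
    return sum(values) / len(values)

units = simulate_units(20000)

# Unit-level effects need both potential outcomes, which real data never provide.
true_ate = mean([y1 - y0 for y0, y1, _ in units])

# Observable estimator: E(Y1 | Z = 1) - E(Y0 | Z = 0).
observed_diff = (mean([y1 for _, y1, z in units if z == 1])
                 - mean([y0 for y0, _, z in units if z == 0]))

print(round(true_ate, 3), round(observed_diff, 3))
```

When assignment depends on covariates that also affect the outcome, the two printed values would generally diverge, which is the situation propensity score methods are meant to address.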
Propensity Score Methods (PSM)
- Propensity score (PS): estimate of an individual's probability of being assigned to the treatment group
  $\operatorname{logit}\,\hat{P}(Z = 1) = \log\frac{\hat{P}(Z = 1)}{1 - \hat{P}(Z = 1)} = \hat{\beta}_0 + \sum_{i=1}^{p} \hat{\beta}_i x_i$
  where $p$ is the number of predictors
- Use the estimated propensity score to condition treatment and control groups
  - Caliper matching
  - Matching without caliper
  - Stratification
  - Covariance adjustment
  - PS weighting
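The logistic model on this slide can be sketched in a few lines. This is a minimal illustration, not the study's SAS/IML code: the data, the coefficients in `true_beta`, and the plain gradient-ascent fitter `fit_logistic` are all hypothetical stand-ins for whatever logistic regression routine is actually used.

```python
import math
import random

random.seed(7)

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(X, z, steps=300, lr=1.0):
    """Fit logit P(Z=1) = b0 + sum_i b_i x_i by gradient ascent on the mean log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)                      # beta[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, zi in zip(X, z):
            resid = zi - sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            grad[0] += resid
            for j, x in enumerate(row):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, row):
    """Predicted probability of treatment for one covariate row."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))

# Hypothetical data: two standard-normal covariates and made-up true coefficients.
true_beta = [-0.3, 0.8, -0.5]
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(1000)]
z = [1 if random.random() < propensity(true_beta, row) else 0 for row in X]

beta_hat = fit_logistic(X, z)
ps = [propensity(beta_hat, row) for row in X]   # estimated propensity scores
print([round(b, 2) for b in beta_hat])
```

The list `ps` is what the conditioning methods (matching, stratification, covariance adjustment, weighting) would then take as input.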
Researcher's Decisions
- Covariate selection
- PS estimation
- Evaluate common support
- Trimming samples
- Conditioning methods
- Balance properties
- Outcome model
Covariate Selection
- Model specification error (Brookhart et al., 2006; Rubin & Thomas, 1996; Rubin, 1997)
  - Relation of covariates to outcome
  - Relation of covariates to treatment assignment
- Measurement errors in covariates (Bellara et al., 2013; Steiner, Cook, & Shadish, 2011)
  - Deleterious effect of measurement error on PS analysis
Background: Previous Research
Figure: Bias in Point Estimates by Covariate Reliability
Figure: Type I Error Rates by Covariate Reliability
Figure: CI Coverage by Covariate Reliability
Purpose of the Study
- To investigate the effect of covariate selection based on measurement quality on balance and on the estimation of the treatment effect in PS analysis
  - When the covariates in a sample have various levels of measurement error
  - Select a full set or a subset of reliable covariates
- To provide guidelines for selecting covariates that could reduce selection bias more efficiently in the presence of measurement errors
Method
- Simulation study
- Fully crossed factorial mixed design with 6 between-subjects factors and 3 within-subjects factors
  - Between: number of covariates, population treatment effect, covariate relationship to treatment, covariate relationship to outcome, correlation among covariates, sample size
  - Within: PS conditioning methods, covariate selection, trimming
- 2160 conditions × 7 conditioning methods × 3 covariate sets × 2 trimming
- 5000 replications
- SAS IML procedure
Method: Design Factors in Data Generation
Between-subjects factors:
- Number of covariates: 9, 18, 27
- Population treatment effect: 0.0, 0.2, 0.5, 0.8
- Covariate relationship to treatment assignment: 0.025, 0.050, 0.100
- Covariate relationship to outcome: 0.025, 0.050, 0.100
- Correlation among covariates: 0, .2, .5
- Sample size: 50, 100, 250, 500, 1000
Method: Within-Subjects Factor
PS conditioning methods:
- Ignoring covariates
- Matching without caliper: one-to-one matching
- Matching with caliper: caliper width = .25 SD of PS
- ANCOVA
- PS ANCOVA
- PS weighting: inverse probability of treatment weights
- Stratification: quintiles
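Several of the conditioning methods listed here can be sketched on toy data. Everything below is a hedged illustration rather than the study's implementation: the data-generating values, the use of a known propensity in place of an estimated one, the greedy matching order, the normalized weights, and the function names are all assumptions made for brevity. The caliper is taken as .25 SD of the PS itself, as the slide states.

```python
import math
import random
import statistics

random.seed(11)

# Hypothetical data: one confounder x drives both assignment and outcome.
TRUE_EFFECT = 0.5
rows = []
for _ in range(2000):
    x = random.gauss(0, 1)
    ps = 1.0 / (1.0 + math.exp(-0.8 * x))       # known propensity, standing in for an estimate
    z = 1 if random.random() < ps else 0
    y = TRUE_EFFECT * z + 1.0 * x + random.gauss(0, 1)
    rows.append((ps, z, y))

def naive_difference(rows):
    """Ignoring covariates: raw difference in group means."""
    t = [y for _, z, y in rows if z == 1]
    c = [y for _, z, y in rows if z == 0]
    return statistics.mean(t) - statistics.mean(c)

def caliper_match(rows, caliper_sd=0.25):
    """Greedy one-to-one nearest-neighbor matching without replacement, within a caliper."""
    caliper = caliper_sd * statistics.stdev([ps for ps, _, _ in rows])
    controls = [(ps, y) for ps, z, y in rows if z == 0]
    diffs = []
    for ps_t, z, y_t in rows:
        if z != 1 or not controls:
            continue
        k = min(range(len(controls)), key=lambda i: abs(controls[i][0] - ps_t))
        if abs(controls[k][0] - ps_t) <= caliper:
            diffs.append(y_t - controls.pop(k)[1])   # matched control is removed from the pool
    return statistics.mean(diffs), len(diffs)

def iptw(rows):
    """Inverse probability of treatment weighting with normalized weights."""
    wt = [(1 / ps, y) for ps, z, y in rows if z == 1]
    wc = [(1 / (1 - ps), y) for ps, z, y in rows if z == 0]
    mean_t = sum(w * y for w, y in wt) / sum(w for w, _ in wt)
    mean_c = sum(w * y for w, y in wc) / sum(w for w, _ in wc)
    return mean_t - mean_c

def stratify(rows, n_strata=5):
    """Quintile stratification: size-weighted average of within-stratum mean differences."""
    ordered = sorted(rows)                       # tuples sort by propensity score first
    size = len(ordered) // n_strata
    total = 0.0
    for s in range(n_strata):
        stratum = ordered[s * size:(s + 1) * size] if s < n_strata - 1 else ordered[s * size:]
        t = [y for _, z, y in stratum if z == 1]
        c = [y for _, z, y in stratum if z == 0]
        total += len(stratum) * (statistics.mean(t) - statistics.mean(c))
    return total / len(ordered)

estimates = {
    "ignore": naive_difference(rows),
    "caliper_match": caliper_match(rows)[0],
    "iptw": iptw(rows),
    "stratify": stratify(rows),
}
for name, value in estimates.items():
    print(name, round(value, 3))
```

In this toy setup the estimate that ignores the covariate should sit well above the assumed effect of 0.5, while the three PS-based estimates should fall much closer to it.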
Method: Within-Subjects Factor
Covariate selection:
- Three levels of reliability in equal proportions in a single sample: .6, .8, 1.0
- Covariate sets
  - A full set
  - Covariates with high reliability (≥ .8)
  - Covariates with perfect reliability only (1.0 only)
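The slides do not state how the fallible covariates were generated. One common way to produce an observed score with a target reliability, under classical test theory, is to mix a unit-variance true score with independent unit-variance error so that the true-score share of the observed variance equals the reliability. The sketch below assumes that model; the function names are hypothetical.

```python
import math
import random

random.seed(42)

def add_measurement_error(true_scores, reliability):
    """Return unit-variance observed scores whose reliability is `reliability`.

    Assumes classical test theory: observed = sqrt(rel) * true + sqrt(1 - rel) * error,
    with true and error independent and both of unit variance.
    """
    a = math.sqrt(reliability)
    b = math.sqrt(1.0 - reliability)
    return [a * t + b * random.gauss(0, 1) for t in true_scores]

def correlation(u, v):
    """Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

true_scores = [random.gauss(0, 1) for _ in range(20000)]
empirical = {}
for rel in (0.6, 0.8, 1.0):
    observed = add_measurement_error(true_scores, rel)
    # The squared true-observed correlation estimates the reliability.
    empirical[rel] = correlation(true_scores, observed) ** 2
    print(rel, round(empirical[rel], 3))
```

Under this model, a covariate with reliability .6 carries 40% error variance, which is the kind of covariate the reduced covariate sets would drop.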
Results
- Common Support
- Balance
- Bias
- RMSE
- Type I error
- CI coverage
- CI width
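Most of the outcome measures in this list are summaries over replications. The sketch below shows one plausible way to compute them. The toy replication (a randomized two-group comparison), the normal critical value of 1.96, and the function names are assumptions made for illustration, not details taken from the study.

```python
import math
import random
import statistics

random.seed(5)

def one_replication(n_per_group, true_effect):
    """Hypothetical replication: randomized two-group data, mean difference and its SE."""
    t = [true_effect + random.gauss(0, 1) for _ in range(n_per_group)]
    c = [random.gauss(0, 1) for _ in range(n_per_group)]
    est = statistics.mean(t) - statistics.mean(c)
    se = math.sqrt(statistics.variance(t) / n_per_group + statistics.variance(c) / n_per_group)
    return est, se

def summarize(results, true_effect, crit=1.96):
    """Bias, RMSE, rejection rate of H0: effect = 0, and CI coverage and mean width."""
    n = len(results)
    bias = sum(est - true_effect for est, _ in results) / n
    rmse = math.sqrt(sum((est - true_effect) ** 2 for est, _ in results) / n)
    reject = sum(abs(est / se) > crit for est, se in results) / n
    cover = sum(est - crit * se <= true_effect <= est + crit * se for est, se in results) / n
    width = sum(2 * crit * se for _, se in results) / n
    return {"bias": bias, "rmse": rmse, "rejection_rate": reject,
            "coverage": cover, "ci_width": width}

# With a true effect of 0, the rejection rate is the empirical Type I error rate.
results = [one_replication(50, 0.0) for _ in range(2000)]
summary = summarize(results, 0.0)
for name, value in summary.items():
    print(name, round(value, 3))
```

For nonzero population effects the same rejection rate would be read as power rather than Type I error.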
Figure: Common Support Coverage by Sample Size and Covariate Set
Figure: Common Support Coverage by Correlation among Covariates and Covariate Set
Figure: Common Support Coverage by Covariate Relation to Outcome and Covariate Set
Figure: Common Support Coverage by Covariate Relation to Treatment Assignment and Covariate Set
Figure: Common Support Coverage by Number of Covariates and Covariate Set
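The slides do not define common support coverage or the trimming rule. One frequently used convention, assumed here, takes the region of common support as the overlap of the two groups' PS ranges, reports the share of cases inside it, and trims the cases outside it. The propensity scores below are made-up values.

```python
def common_support(ps_treated, ps_control):
    """Overlap of the two groups' propensity score ranges (assumed definition)."""
    lower = max(min(ps_treated), min(ps_control))
    upper = min(max(ps_treated), max(ps_control))
    return lower, upper

def trim(ps_values, lower, upper):
    """Keep only cases whose propensity score lies inside the overlap region."""
    return [p for p in ps_values if lower <= p <= upper]

# Hypothetical estimated propensity scores for six treated and six control cases.
ps_treated = [0.35, 0.50, 0.62, 0.71, 0.88, 0.95]
ps_control = [0.05, 0.12, 0.30, 0.41, 0.55, 0.80]

lower, upper = common_support(ps_treated, ps_control)
kept_treated = trim(ps_treated, lower, upper)
kept_control = trim(ps_control, lower, upper)
coverage = (len(kept_treated) + len(kept_control)) / (len(ps_treated) + len(ps_control))
print(lower, upper, len(kept_treated), len(kept_control), coverage)
```

Here 7 of the 12 cases fall inside the overlap region from 0.35 to 0.80, so trimming would discard the remaining 5.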
Figure: Distribution of Balance of Binary Covariate
Figure: Distribution of Balance of Binary Covariate when N > 100
Figure: Distribution of Balance of Continuous Covariate when N > 100
Figure: Distribution of Balance of Binary Covariates by Covariate Set when N > 100
Figure: Balance of Binary Covariates by Covariate Set and Conditioning Method with No Correlation among Covariates (r = 0.0)
Figure: Balance of Binary Covariates by Covariate Set and Conditioning Method with High Correlation among Covariates (r = 0.5)
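The balance statistic behind these figures is not named on the slides. A widely used choice is the standardized mean difference, sketched below for binary and continuous covariates. Treat this as an assumption about the metric, with hypothetical function names and toy data.

```python
import math
import statistics

def smd_binary(x_treated, x_control):
    """Standardized difference in proportions for a 0/1 covariate."""
    p1 = sum(x_treated) / len(x_treated)
    p0 = sum(x_control) / len(x_control)
    pooled_sd = math.sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    return (p1 - p0) / pooled_sd

def smd_continuous(x_treated, x_control):
    """Standardized difference in means using the average of the two group variances."""
    pooled_sd = math.sqrt((statistics.variance(x_treated) + statistics.variance(x_control)) / 2)
    return (statistics.mean(x_treated) - statistics.mean(x_control)) / pooled_sd

# Toy data: 6 of 10 treated cases and 4 of 10 control cases have the binary trait.
binary_t = [1] * 6 + [0] * 4
binary_c = [1] * 4 + [0] * 6
cont_t = [1.0, 2.0, 3.0, 4.0, 5.0]
cont_c = [2.0, 3.0, 4.0, 5.0, 6.0]

print(round(smd_binary(binary_t, binary_c), 3))
print(round(smd_continuous(cont_t, cont_c), 3))
```

Values near zero indicate good balance. After conditioning, the same functions would be applied to the matched, weighted, or stratified samples.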
Bias
Figure: Distribution of Bias by Conditioning Method (x-axis: Cmatch, NoCmatch, Ignore, Ancova, PS_Ancova, Weighting, Stratify)
Figure: Distribution of Bias by Conditioning Method when N > 100 (x-axis: Caliper Match, NoCaliper Match, Ignore, Ancova, PS_Ancova, Weighting, Stratify)
Figure: Distribution of Bias by Covariate Sets when N > 100
RMSE
Figure: RMSE Mean Distribution by Method
Conclusions
- Model specification error had deleterious effects on propensity score analysis
  - Consistent across conditioning methods
  - Observed in most outcome measures (e.g., bias, Type I error)
  - More serious as more covariates are omitted
Conclusions
- When covariates in a single sample have different levels of reliability, omitting the covariates with poor measurement quality is not recommended
  - Greater caution is needed when the covariates are highly related to the outcome
  - Greater caution is needed when the covariates are highly related to treatment assignment
  - Greater caution is needed when the sample size is large
- The size of the effect depends on the conditioning method (e.g., less impact on PS ANCOVA) and on the outcome measure examined (e.g., negligible effect on balance)
Further Research
- Errors-in-variables model
  - Explicitly model measurement errors in propensity score estimation using the errors-in-variables logistic model rather than omitting covariates with measurement error
- Propensity score analysis with binary outcome
  - The impact of measurement error and model specification error on the estimation of binary outcomes
- The effect of misspecification of functional forms in propensity score estimation
Contact Information
Your comments and questions are valued and encouraged. Contact the author at:
Eun Sook Kim, Ph.D.
Department of Educational Measurement and Research
University of South Florida
4202 E. Fowler Ave., EDU 105
Tampa, FL 33620
Office: EDU 369
Phone: (813) 974-7692
ekim3@usf.edu