Balancing Covariates via Propensity Score Weighting

Similar documents
Balancing Covariates via Propensity Score Weighting: The Overlap Weights

Overlap Propensity Score Weighting to Balance Covariates

Balancing Covariates via Propensity Score Weighting

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Propensity Score Weighting with Multilevel Data

Matching. Quiz 2. Matching. Quiz 2. Exact Matching. Estimand 2/25/14

Propensity Score Analysis with Hierarchical Data

An Introduction to Causal Analysis on Observational Data using Propensity Scores

Gov 2002: 5. Matching

Propensity Score Methods for Causal Inference

Selection on Observables: Propensity Score Matching.

Estimating the Marginal Odds Ratio in Observational Studies

Achieving Optimal Covariate Balance Under General Treatment Regimes

What s New in Econometrics. Lecture 1

Variable selection and machine learning methods in causal inference

Bayesian causal forests: dealing with regularization induced confounding and shrinking towards homogeneous effects

Observational Studies and Propensity Scores

Implementing Matching Estimators for. Average Treatment Effects in STATA

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Propensity Score Matching

Difference-in-Differences Methods

Combining Difference-in-difference and Matching for Panel Data Analysis

Combining multiple observational data sources to estimate causal eects

PSC 504: Dynamic Causal Inference

Subgroup Balancing Propensity Score

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Implementing Matching Estimators for. Average Treatment Effects in STATA. Guido W. Imbens - Harvard University Stata User Group Meeting, Boston

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Extending causal inferences from a randomized trial to a target population

Moving the Goalposts: Addressing Limited Overlap in Estimation of Average Treatment Effects by Changing the Estimand

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Covariate Balancing Propensity Score for General Treatment Regimes

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Rerandomization to Balance Covariates

Selective Inference for Effect Modification

Flexible Estimation of Treatment Effect Parameters

arxiv: v1 [stat.me] 15 May 2011

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

Deductive Derivation and Computerization of Semiparametric Efficient Estimation

Notes on causal effects

Multi-state Models: An Overview

Marginal versus conditional effects: does it make a difference? Mireille Schnitzer, PhD Université de Montréal

Section 10: Inverse propensity score weighting (IPSW)

Propensity score modelling in observational studies using dimension reduction methods

Propensity Score Matching and Analysis TEXAS EVALUATION NETWORK INSTITUTE AUSTIN, TX NOVEMBER 9, 2018

Gov 2002: 13. Dynamic Causal Inference

Bootstrapping Sensitivity Analysis

Dealing with Limited Overlap in Estimation of Average Treatment Effects

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design

Experiments and Quasi-Experiments

Stratified Randomized Experiments

Combining Experimental and Non-Experimental Design in Causal Inference

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

The Role of the Propensity Score in Observational Studies with Complex Data Structures

Introduction to Econometrics

Counterfactual Model for Learning Systems

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Estimation of the Conditional Variance in Paired Experiments

Causal modelling in Medical Research

A Sampling of IMPACT Research:

Statistical Analysis of Causal Mechanisms

Optimal Blocking by Minimizing the Maximum Within-Block Distance

Estimating Average Treatment Effects: Supplementary Analyses and Remaining Challenges

The problem of causality in microeconometrics.

The Impact of Measurement Error on Propensity Score Analysis: An Empirical Investigation of Fallible Covariates

Web-based Supplementary Material for. Dependence Calibration in Conditional Copulas: A Nonparametric Approach

Statistical Models for Causal Analysis

Summary and discussion of The central role of the propensity score in observational studies for causal effects

Assess Assumptions and Sensitivity Analysis. Fan Li March 26, 2014

Propensity Score Methods, Models and Adjustment

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

NISS. Technical Report Number 167 June 2007

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

AGEC 661 Note Fourteen

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

AAEC/ECON 5126 FINAL EXAM: SOLUTIONS

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity

Ratio of Mediator Probability Weighting for Estimating Natural Direct and Indirect Effects

The Evaluation of Social Programs: Some Practical Advice

Section 7: Local linear regression (loess) and regression discontinuity designs

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

arxiv: v1 [stat.me] 4 Feb 2017

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Imbens, Lecture Notes 1, Unconfounded Treatment Assignment, IEN, Miami, Oct 10 1

Optimal bandwidth selection for the fuzzy regression discontinuity estimator

Comparative effectiveness of dynamic treatment regimes

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Comments on Best Quasi- Experimental Practice

Causal Inference with Big Data Sets

Inference for Average Treatment Effects

Lecture 3: Causal inference

Robustness to Parametric Assumptions in Missing Data Models

THE DESIGN (VERSUS THE ANALYSIS) OF EVALUATIONS FROM OBSERVATIONAL STUDIES: PARALLELS WITH THE DESIGN OF RANDOMIZED EXPERIMENTS DONALD B.

ECO Class 6 Nonparametric Econometrics

Telescope Matching: A Flexible Approach to Estimating Direct Effects

Targeted Maximum Likelihood Estimation in Safety Analysis

Transcription:

Balancing Covariates via Propensity Score Weighting Kari Lock Morgan Department of Statistics Penn State University klm47@psu.edu Stochastic Modeling and Computational Statistics Seminar October 17, 2014 Joint work with Fan Li (Duke) and Alan Zaslavsky (Harvard) Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 1 / 25

Outline 1 Causal inference in observational studies - a brief overview 2 Introduce a general class of balancing weights 3 Propose new overlap weights and show some optimality properties 4 Illustrate with examples Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 2 / 25

Causal Inference in Observational Studies Ideal goal: estimate the causal effect of a treatment using observational data Problem: Without randomization to treatment groups, severe covariate imbalance is likely Realistic goal: Balance observed covariates between treatment groups Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 3 / 25

Example: Right Heart Catheterization Right heart catheterization (RHC) is an invasive diagnostic procedure to assess cardiac function What is the causal effect of right heart catheterization on survival? 2184 treatment (RHC), 3551 control (no RHC) Observational data (Murphy and Cluff, 1990) Covariate imbalance Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 4 / 25

Example: Right Heart Catheterization Figure: Imbalance in APACHE, Acute Physiology and Chronic Health Evaluation Score, measured before procedure. Treatment Control 0 20 40 60 80 100 120 140 APACHE Score Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 5 / 25

Example: Right Heart Catheterization APACHE scores for a random subset of 20 patients: Treatment: 34, 41, 42, 43, 49, 79, 80, 84 Control: 9, 19, 35, 40, 44, 45, 48, 50, 51, 53, 53, 60 Get rid of units with no comparable units in other group? Treatment: 34, 41, 42, 43, 49 Control: 35, 40, 44, 45, 48, 50, 51, 53, 53 Options for covariate balance: Matching - match each unit in treatment group to similar unit(s) in control group, discard the rest (34, 35), (41, 40), (42, 44), (43, 45), (49, 48) Subclassification - compare units within similar subclasses, then take a (weighted) average across subclasses (34, 35), (40, 41, 42, 43, 44, 45), (48, 49, 50, 51, 53, 53) Weighting - weight each unit so the weighted covariate distributions are similar Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 6 / 25

Propensity Scores This gets much more complicated with many covariates. Wouldn t it be nice if we could just balance on a single number summarizing all covariates... The propensity score allows us to do this! The propensity score, e(x), is the probability a unit belongs to the treatment group, based on observed covariates: e(x) = Pr(Z i = 1 X i = x), where Z i indicates treatment group (Z i = 1 for treatment and Z i = 0 for control) and X i denotes the covariates. Amazing fact: balancing just the propensity score yields balance on all covariates included in the propensity score model! Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 7 / 25

Covariate Distributions Population density of the covariates X is f (x) Density for group Z = z is f z (x) = P(X = x Z = z) Then f z (x) = P(X = x Z = z) = P(Z = z X = x)p(x = x), P(Z = z) So f 1 (x) f (x)e(x) and f 0 (x) f (x)(1 e(x)) GOAL: make f 1 (x) f 0 (x) One solution: Use weights, w z (x), such that f 1 (x)w 1 (x) f 0 (x)w 0 (x) Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 8 / 25

Balancing weights We propose the following class of balancing weights: w 1 (x) h(x) e(x), w 0 (x) h(x) 1 e(x), where h( ) is a pre-specified function. The weighted covariate distributions in the two groups have the same target density f (x)h(x): f 1 (x)w 1 (x) f (x)e(x) h(x) e(x) = f (x)h(x), f 0 (x)w 0 (x) f (x) (1 e(x)) h(x) 1 e(x) = f (x)h(x). Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 9 / 25

Examples of target population and balancing weights target population h(x) estimand ( weight (w 1 ), w 0 ) 1 combined 1 ATE e(x), 1 1 e(x) [HT] treated e(x) ATT ( ) e(x) 1, 1 e(x) control 1 e(x) ATC ( ) 1 e(x) e(x), 1 truncated 1(α < e(x) < 1 α) ATTrunc combined ( 1(α<e(x)<1 α) e(x), 1(α<e(x)<1 α) 1 e(x) ) overlap e(x)(1 e(x)) ATO (1 e(x), e(x)) Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 10 / 25

Estimands and Estimators Potential outcome framework: Y i (1), Y i (0) Conditional average treatment effect (ATE) τ(x) E(Y (1) X = x) E(Y (0) X = x). Estimand is average (ATE) over a target population with density f (x)h(x): τ h τ(dx)f (x)h(x)µ(dx) f (x)h(x)µ(dx). τ h can be estimated by weighted averages: ˆτ h w = i:z i =1 Y iw 1 (x i ) i:z i:z i =1 w i =0 Y iw 0 (x i ) 1(x i ) i:z i =0 w 0(x i ). Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 11 / 25

Asymptotic Variance of ˆτ h Theorem Given the normalizing constraint f (x)h(x)µ(dx) = 1, the large-sample variance of the estimator ˆτ h is: [ V[ˆτ h ] = f (x)h(x) 2 v1 (x) e(x) + v ] 0(x) µ(dx)/n, 1 e(x) where v z (x) is the variance of Y in a neighborhood dx of x in the Z = z group. Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 12 / 25

Minimizing Asymptotic Variance of ˆτ h Theorem Assuming v 0 (x) v 1 (x) v, the function h(x) = e(x)(1 e(x)) gives the smallest asymptotic variance for the weighted estimator ˆτ h, and min{v[ˆτ h ]} = v f (x)e(x)(1 e(x))µ(dx). N Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 13 / 25

Overlap weights We propose a new weight by letting h(x) = e(x)(1 e(x)), leading to the overlap weights: w 1 (x) 1 e(x), w 0 (x) e(x). Target population f (x)e(x)(1 e(x)) Marginal units who may or may not receive the treatment (Rosenbaum, 2012). Defined by overlap of covariates... Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 14 / 25

Overlap Weights Figure: Densities for the treatment group, f 1 (x), control group, f 0 (x), and overlap population, f (x)h(x). Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 15 / 25

Exact Balance Theorem When the propensity scores are estimated from a logistic regression model with main effects, logit{e(x i )} = β 0 + β x i, the overlap weights lead to exact balance in any included covariate between treatment and control groups. That is, i x i,kz i (1 ê i ) i i Z = x i,k(1 Z i )ê i i(1 ê i ) i (1 Z. i)ê i Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 16 / 25

Simulated Example n 0 = n 1 = 1000 units, with X i N(0, 1) + 2Z i. Figure: Original covariate distributions within each treatment group, and weighted covariate distributions with overlap, HT, ATT weights. Unweighted Overlap HT ATT Z=0 Z=1 2 0 2 4 2 0 2 4 2 0 2 4 2 0 2 4 X X X X Unweighted Overlap HT ATT x 1 1.98 1.01 0.74 1.98 x 0 0.03 1.01 1.19 2.22 Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 17 / 25

Simulated Example A single covariate: X i N(0, 1) + 2Z i. Outcome model with additive treatment effect: Y i X i + τz i + N(0, 1), with τ = 1. Use the nonparametric estimator ˆτ w h with different weights: Unweighted Overlap HT ATT ˆτ 2.945 1.000 0.581 0.640 SE(ˆτ) 0.054 0.038 0.386 0.402 Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 18 / 25

Racial Disparity in Medical Expenditure Goal: estimate racial disparity in medical expenditures after balancing covariates (Le Cook et al., 2010) Race is not manipulable so comparisons are descriptive, not causal Data: 2009 Medical Expenditure Panel Survey: 9830 non-hispanic Whites, 4020 Blacks, 1446 Asians, 5280 Hispanics Three independent comparisons; comparing non-hispanic Whites to each minority group Logistic regression to estimate propensity scores, 31 covariates (5 continuous, 26 binary) Ignore survey weights here, but weighting allows easy incorporation of survey weights Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 19 / 25

Racial Disparity in Medical Expenditure Figure: Covariate balance (absolute standardized bias) with no weights, overlap weights, and HT weights. White Black White Asian White Hispanic Absolute Standardized Bias 0 20 40 60 0 20 40 60 0 20 40 60 Unweighted Overlap HT Unweighted Overlap HT Unweighted Overlap HT Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 20 / 25

Racial Disparity in Medical Expenditure One Asian woman has over 30% of the weight! (out of 1446 Asians) 78 year old Asian lady with a BMI of 55.4: e(x) = 0.9998 Common practice: Eliminate cases with propensity scores close to 0 or 1 Truncate propensity scores or weights Can lead to ad hoc changes to target population Results can be very sensitive to truncation choice The overlap weights avoid these extreme weights and avoid an abrupt threshold for elimination or truncation Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 21 / 25

Racial Disparity in Medical Expenditure Table: Unweighted, overlap weights, and HT weighting estimates (SE) for difference in yearly medical expenditure. Unweighted Overlap HT White - Black $786 (222) $824 (185) $856 (200) White - Asian $2764 (209) $1227 (205) $2167 (640) White - Hispanic $2599 (174) $1212 (171) $596 (323) Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 22 / 25

Right heart catheterization (RHC) RHC vs. non RHC Absolute Standardized Bias 0 5 10 15 Unweighted Overlap HT ATT Table: Estimated treatment effect (in %) with different weights unweighted overlap HT ATT ˆτ h 7.36 6.54 5.93 5.81 SE(ˆτ h ) 1.27 1.32 2.46 2.67 Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 23 / 25

Advantages of the overlap weight Statistical advantages Minimizes asymptotic variance of the weighted average estimator among all balancing weights. Perfect (exact small-sample) balance for means of included covariates in logistic propensity score model. Weights are bounded (unlike HT, etc.). Avoids artificially truncating weights or eliminating cases. Scientific advantages Clinical equipoise. The marginal units are likely the group who are responsive to policy intervention. Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 24 / 25

Summary Unified framework for use of weighting to balance covariates for any target population. The general class of balancing weights balance covariates and include many of the existing weights. A new weighting method, the overlap weights, have desirable properties Arxiv: http://arxiv.org/abs/1404.1785 Kari Lock Morgan (Penn State) SMAC Seminar October 17, 2014 25 / 25