SIMULATION-BASED SENSITIVITY ANALYSIS FOR MATCHING ESTIMATORS
Tommaso Nannicini, Universidad Carlos III de Madrid
UK Stata Users Group Meeting, London, September 10, 2007

CONTENT
Presentation of a Stata program, sensatt, that implements the sensitivity analysis for propensity-score matching estimators proposed by:
Ichino A., F. Mealli, and T. Nannicini (2007), "From Temporary Help Jobs to Permanent Employment: What Can We Learn from Matching Estimators and their Sensitivity?", Journal of Applied Econometrics, forthcoming.
In this paper, we present an econometric tool that builds on Rosenbaum and Rubin (1983a) and Rosenbaum (1987a), and aims at assessing the robustness of average treatment effects estimated with matching methods.

MOTIVATION
Increasing use of matching estimators in evaluation studies for which a convincing source of exogenous variation in treatment assignment does not exist.
In Stata, one can use the following programs: nnmatch; psmatch2; attnd, attnw, attr, attk.
Matching estimators are now easy to use, and perhaps too many users adopt them without checking either the conditions for their application or the sensitivity of the results to possible deviations from these conditions.
We propose a simulation-based sensitivity analysis for matching estimators, aimed at assessing their robustness to specific failures of the Conditional Independence Assumption (CIA) in the evaluation problem at hand.

INTUITION FOR THE SENSITIVITY ANALYSIS
Matching estimators rely crucially on the CIA to identify treatment effects.
Suppose that this condition is not satisfied given the observables, but would be satisfied if we could observe one additional variable. This (binary) variable can be simulated in the data and used as an additional matching factor in combination with the preferred matching estimator.
A comparison of the estimates obtained with and without matching on the simulated binary variable tells us to what extent the estimator is robust to this specific source of failure of the CIA.
The simulated values of the binary variable can be constructed to capture different hypotheses about the nature of potential confounding factors.

FRAMEWORK
Consider Rubin's (1974) potential-outcome framework for causal inference. Our goal is to estimate the Average Treatment effect on the Treated (ATT):
E(Y_1 - Y_0 | T = 1). (1)
One possible identification strategy is to impose the CIA:
Y_0 ⊥ T | W. (2)
A further requirement for identification is the overlap condition:
Pr(T = 1 | W) < 1. (3)
Under assumptions (2) and (3), within each cell defined by W, treatment assignment is random, and the outcome of the control subjects can be used to estimate the counterfactual outcome of the treated in the case of no treatment.
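To spell out the identification step behind the last sentence (a standard derivation, stated here for completeness): under (2),
E(Y_0 | T = 1, W) = E(Y_0 | T = 0, W) = E(Y | T = 0, W),
so that
ATT = E[ E(Y | T = 1, W) - E(Y | T = 0, W) | T = 1 ],
where the outer expectation is over the distribution of W among the treated, and (3) guarantees that comparable controls exist at every value of W in the treated population.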

PROPENSITY-SCORE MATCHING
The propensity score is the individual probability of receiving the treatment given the observed covariates: p(W) = Pr(T = 1 | W).
Under the CIA, Y_0 is independent of T given the propensity score (Rosenbaum and Rubin, 1983b). If p(W) is known, the ATT can be consistently estimated as:
E(Y_1 - Y_0 | T = 1) = E_{p(W)|T=1} [ E(Y_1 | p(W), T = 1) - E(Y_0 | p(W), T = 0) ], (4)
where the outer expectation is taken over the distribution of p(W) among the treated.
In practice, p(W) has to be estimated, and an algorithm has to be used to match treated and control units on the basis of their estimated score. The program sensatt makes use of three algorithms: nearest neighbor, radius, and kernel.
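For instance, a baseline ATT estimate with nearest-neighbor matching could be obtained with the Becker-Ichino command attnd; the variable names y (outcome), t (treatment), and w1-w3 (covariates) below are placeholders:

    * baseline ATT: logit propensity score, matching restricted to the common support
    attnd y t w1 w2 w3, logit comsup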

FIVE STEPS FOR A CORRECT MATCHING
First step. Use data where the treated and control units come from the same local area and are asked the same set of questions (Heckman, Ichimura, and Todd, 1997).
Second step. Discuss why the CIA should hold in the specific context of the evaluation question at hand.
Third step. Test (indirectly) whether the available empirical evidence casts doubt on the plausibility of the CIA (Rosenbaum, 1987b; Imbens, 2004).
Fourth step. Inspect how the observations are distributed across the common support and how sensitive the estimates are to the utilization of observations in the tails (Black and Smith, 2004).
Fifth step. Assess whether (and to what extent) the estimated average treatment effects are robust to possible deviations from the CIA.

SENSITIVITY ANALYSIS
Identification of the ATT relies crucially on the validity of the CIA.
The CIA is untestable, since the data are completely uninformative about the distribution of Y_0 for treated subjects, but its credibility can be supported or rejected by theoretical reasoning and additional evidence.
Moreover, one should try to assess whether (and to what extent) the estimated average treatment effects are robust to possible deviations from the CIA.
The sensitivity analysis proposed by Ichino, Mealli, and Nannicini (2007) allows applied researchers who use matching estimators to tackle this task.

CENTRAL ASSUMPTION
Assume that the CIA does not hold given the set of observable covariates W, but it holds given W and an unobserved binary variable U:
Y_0 ⊥ T | (W, U). (5)
Using Rosenbaum's (1987b) terminology, we are moving from the (Y_0, W)-adjustable treatment assignment of condition (2) to the (Y_0, W, U)-adjustable treatment assignment of condition (5).
A similar assumption appears in: Rosenbaum and Rubin (1983b); Rosenbaum (1987a); Imbens (2003); Altonji et al. (2005).

CHARACTERIZING THE DISTRIBUTION OF U
For simplicity, consider binary potential outcomes: Y_0, Y_1 ∈ {0, 1}. The observed outcome is given by: Y = T·Y_1 + (1 - T)·Y_0.
The distribution of the binary confounding factor U is fully characterized by the choice of four parameters:
p_ij ≡ Pr(U = 1 | T = i, Y = j) = Pr(U = 1 | T = i, Y = j, W), (6)
with i, j ∈ {0, 1}, which give the probability that U = 1 in each of the four groups defined by the treatment status and the outcome value.
Given the p_ij and the observed probabilities Pr(Y = j | T = i), we can compute:
p_i. ≡ Pr(U = 1 | T = i) = Σ_{j=0}^{1} p_ij · Pr(Y = j | T = i). (7)
Two simplifying assumptions are made: a) U is binary; b) U is conditionally independent of W. A Monte Carlo exercise shows that they are innocuous.
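A purely illustrative calculation of (7), with made-up values (not from the paper): suppose p_11 = 0.7, p_10 = 0.4, and Pr(Y = 1 | T = 1) = 0.5. Then
p_1. = p_10 · Pr(Y = 0 | T = 1) + p_11 · Pr(Y = 1 | T = 1) = 0.4 · 0.5 + 0.7 · 0.5 = 0.55.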

SIMULATION
Given arbitrary (but meaningful) values of the parameters p_ij, we attribute a value of U to each subject, according to her treatment status and outcome.
We then include U in the set of matching variables used to estimate the propensity score and to compute the matching estimate of the ATT.
For each set of values of the sensitivity parameters, we repeat the matching estimation many times (e.g., 1,000) in order to obtain an estimate of the ATT, which is an average of the ATTs over the distribution of U.
These ATT estimates are identified under our extended CIA, i.e., the assumption that treatment assignment is unconfounded conditional on W and a confounder U that behaves according to the chosen configuration of the p_ij.
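A minimal sketch of one such iteration in Stata (this is what sensatt automates; the variable names y, t, w1-w3 and the p_ij values are illustrative, not the program's internal code):

    * draw U given treatment status and outcome, with p11=0.6, p10=0.4, p01=0.5, p00=0.3
    generate byte u = .
    replace u = runiform() < 0.6 if t == 1 & y == 1
    replace u = runiform() < 0.4 if t == 1 & y == 0
    replace u = runiform() < 0.5 if t == 0 & y == 1
    replace u = runiform() < 0.3 if t == 0 & y == 0
    * re-estimate the ATT with U added to the matching variables
    attnd y t w1 w2 w3 u, logit comsup

Repeating the draw and the re-estimation many times and averaging the resulting point estimates yields the simulated ATT.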

STANDARD ERRORS
A solution is to view the simulated confounder as a missing-data problem that can be handled by multiply imputing the missing values of U.
Let m be the number of imputations of the missing U, and let ATT_i and se^2_i be the point estimate and the estimated variance of the ATT estimator in the i-th imputed data set.
The within-imputation and the between-imputation variances are defined as, respectively:
se^2_W = (1/m) Σ_{i=1}^{m} se^2_i, (8)
se^2_B = (1/(m-1)) Σ_{i=1}^{m} (ATT_i - ATT*)^2, (9)
where ATT* = (1/m) Σ_{i=1}^{m} ATT_i is the average of the m point estimates.
The total variance associated with the simulated ATT is given by:
T = se^2_W + (1 + 1/m) se^2_B. (10)
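A purely illustrative calculation with made-up numbers: with m = 3 imputations, point estimates 0.10, 0.12, 0.08 and se^2_i = 0.0004 for every i,
se^2_W = 0.0004, ATT* = 0.10,
se^2_B = (1/2) [ 0^2 + 0.02^2 + (-0.02)^2 ] = 0.0004,
T = 0.0004 + (1 + 1/3) · 0.0004 ≈ 0.00093, so the reported standard error is √T ≈ 0.031.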

MULTIVALUED OR CONTINUOUS OUTCOMES
With multivalued or continuous outcomes, the same sensitivity analysis can be applied by defining:
p_ij ≡ Pr(U = 1 | T = i, I(Y > y*) = j), (11)
where I(·) is the indicator function and y* is a chosen typical value (e.g., the median) of the distribution of Y.
Of course, the ATT is estimated for the multivalued or continuous outcome Y.
The program sensatt allows four choices of y*: the mean, the median, the 25th centile, and the 75th centile.
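For intuition, the binary transformation behind the median choice could be reproduced by hand as follows (y is a hypothetical continuous outcome; sensatt performs this step internally):

    * dichotomize the outcome at its median, i.e., compute I(Y > y*) with y* = median(Y)
    summarize y, detail
    generate byte ybin = y > r(p50) if !missing(y)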

GUIDELINES FOR THE SIMULATIONS
Note that the confounder U would be dangerous if we had both:
Pr(Y_0 = 1 | T, W, U) ≠ Pr(Y_0 = 1 | T, W), (12)
Pr(T = 1 | W, U) ≠ Pr(T = 1 | W). (13)
These expressions, unlike the parameters p_ij, both include W and refer to the potential (not observed) outcome in the case of no treatment. However, Ichino, Mealli, and Nannicini (2007) show that:
p_01 > p_00 ⇒ Pr(Y_0 = 1 | T = 0, U = 1, W) > Pr(Y_0 = 1 | T = 0, U = 0, W),
p_1. > p_0. ⇒ Pr(T = 1 | U = 1, W) > Pr(T = 1 | U = 0, W).
Hence, by simply assuming that d = p_01 - p_00 > 0, one can simulate a confounder that has a positive effect on Y_0 (conditional on W). And, by setting s = p_1. - p_0. > 0, one can simulate a confounder with a positive effect on T.
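As a purely illustrative choice of the sensitivity parameters (numbers not from the paper): setting p_01 = 0.6 and p_00 = 0.4, and picking p_11 and p_10 so that the implied p_1. exceeds p_0. by 0.2, gives
d = p_01 - p_00 = 0.6 - 0.4 = 0.2 > 0 (U has a positive effect on Y_0 given W),
s = p_1. - p_0. = 0.2 > 0 (U has a positive effect on T given W).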

THE OUTCOME AND SELECTION EFFECTS
However, by setting the quantities d and s, we control the sign but not the magnitude of the conditional association of U with Y_0 and T.
To sidestep this problem, the program sensatt estimates a logit model of Pr(Y = 1 | T = 0, U, W) at every iteration. The average odds ratio of U (denoted Γ) is reported as the outcome effect of the simulated confounder.
Similarly, the logit model of Pr(T = 1 | U, W) is estimated at every iteration, and the average odds ratio of U (denoted Λ) is reported as the selection effect.
If d > 0 and s > 0, both the outcome and selection effects are positive (i.e., Γ > 1 and Λ > 1). By displaying the associated Γ and Λ, we can assess the magnitude of these effects, which end up characterizing U.
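The two auxiliary models could be fit along these lines (a sketch with hypothetical variable names, not sensatt's internal code); the or option makes logit report odds ratios, so the coefficient on u corresponds to the Γ and Λ of the current iteration:

    * outcome effect: Pr(Y = 1 | T = 0, U, W), odds ratio of u
    logit y u w1 w2 w3 if t == 0, or
    * selection effect: Pr(T = 1 | U, W), odds ratio of u
    logit t u w1 w2 w3, or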

TWO SIMULATION EXERCISES
a) Calibrated confounders. Pick the parameters p_ij so that the distribution of U is similar to the empirical distribution of important binary covariates. This simulation exercise reveals the extent to which the baseline estimates are robust to deviations from the CIA induced by the impossibility of observing factors similar to the observed covariates.
b) Killer confounders. Search for a configuration of the p_ij such that, if U were observed, the estimated ATT would be driven to zero, and then assess the plausibility of this particular configuration. One can build a table of simulated ATTs in which d increases by 0.1 across columns and s increases by 0.1 across rows, looking for the configurations that kill the ATT.

THE SYNTAX OF THE STATA PROGRAM
sensatt outcome treatment [varlist] [weight] [if exp] [in range] [, alg(att*) reps(#) p(varname) p11(#) p10(#) p01(#) p00(#) se(se_type) ycent(#) pscore(scorevar) logit index comsup bootstrap ]
The following remarks should be taken into account:
The program makes use of the commands for the propensity-score matching estimation of average treatment effects written by Becker and Ichino (2002): attnd, attnw, attk, attr.
The treatment must be binary.

OPTIONS
alg(att*) specifies the name of the command (i.e., of the matching algorithm) used in the ATT estimation. One of the following commands can be specified: attnd, attnw, attk, attr. The default is attnd.
p(varname) indicates the binary variable used to simulate the confounder. The parameters p_ij used to simulate U are set equal to the ones observed for varname.
p11(#), p10(#), p01(#), and p00(#) jointly specify the parameters p_ij used to simulate U in the data. Since they are probabilities, they must lie between zero and one. For each parameter, the default is zero.
reps(#) specifies the number of iterations, i.e., how many times the simulation of U and the ATT estimation are replicated. The default is 1,000.

OPTIONS (cont.)
se(se_type) allows the user to decide which standard error is displayed with the simulated ATT. Three se_types are possible: set uses the total variance in a multiple-imputation setting; sew uses the within-imputation variance; seb uses the between-imputation variance. The default is set.
ycent(#) is relevant only with continuous outcomes. It means that U is simulated on the basis of the binary transformation of the outcome, I(Y > y*), where y* is the #th centile of the distribution of Y. Three centiles are allowed: 25, 50, 75. If ycent(#) is not specified by the user, y* is the mean of Y.
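Putting the pieces together, two hypothetical calls corresponding to the simulation exercises described earlier (the variables y, t, w1-w3 and all parameter values are illustrative):

    * calibrated confounder: U mimics the empirical distribution of the observed binary covariate w1
    sensatt y t w1 w2 w3, alg(attnd) p(w1) reps(500) logit comsup

    * "killer" confounder search: one cell of the table, with hand-picked p_ij and kernel matching
    sensatt y t w1 w2 w3, alg(attk) p11(0.6) p10(0.4) p01(0.6) p00(0.4) reps(500) se(set) logit comsup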

EXAMPLES
For concrete examples of the utilization of the sensitivity analysis performed by the program sensatt, see:
Ichino A., F. Mealli, and T. Nannicini (2007), "From Temporary Help Jobs to Permanent Employment: What Can We Learn from Matching Estimators and their Sensitivity?", Journal of Applied Econometrics, forthcoming.
Nannicini T. (2007), "Simulation-Based Sensitivity Analysis for Matching Estimators", The Stata Journal, Issue 3, forthcoming.