MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank


These slides were developed by Christel Vermeersch and modified by Emanuela Galasso for the purpose of this workshop.

When can we use matching?
What if the assignment to the treatment is done not randomly, but on the basis of observables? This is when matching methods come in!
Matching methods allow you to construct comparison groups when the assignment to the treatment is done on the basis of observable variables.

When can we use matching? (continued)
Intuition: the comparison group needs to be as similar as possible to the treatment group, in terms of the observables before the start of the treatment.
The method assumes there are no remaining unobservable differences between treatment and comparison groups.

Key Question
What is the effect of treatment on the treated when the assignment to the treatment is based on observable variables?

Unconfoundedness & Selection on observables
Let X denote a matrix in which each row is a vector of pre-treatment observable variables for individual i.
Unconfoundedness: assignment to treatment is unconfounded given pre-treatment variables X if
$(Y_1, Y_0) \perp D \mid X$
Unconfoundedness is equivalent to saying that:
(1) within each cell defined by X, treatment is random;
(2) the selection into treatment depends only on the observables X.

Average effects of treatment on the treated, assuming unconfoundedness given X
Intuition:
- Estimate the treatment effect within each cell defined by X.
- Take the average over the different cells.
Math: in your handouts, Annex 1.

Strategy for estimating the average effect of treatment on the treated: selection on observables
Unconfoundedness suggests the following strategy for the estimation of the average treatment effect δ:
- Stratify the data into cells defined by each particular value of X.
- Within each cell (i.e. conditioning on X), compute the difference between the average outcomes of the treated and the controls.
- Average these differences with respect to the distribution of X in the population of treated units.
Is this strategy feasible?

Is our strategy feasible? The Dimensionality Problem
This may not be feasible when:
- The sample is small.
- The set of covariates is large.
- Many of the covariates have many values or are continuous.
This is what we call the dimensionality problem.
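The three stratification steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `att_by_cells`, the tuple layout `(x, d, y)` and the toy data are hypothetical and not part of the slides. Cells that contain treated units but no controls are reported rather than silently averaged, since they are exactly where the strategy stops being feasible.

```python
from collections import defaultdict


def att_by_cells(rows):
    """Exact-cell estimate of the average treatment effect on the treated.

    rows: iterable of (x, d, y) where x is a hashable cell label,
    d is 1 for treated and 0 for control, and y is the outcome.
    Returns (att, dropped), where dropped lists cells with treated
    units but no controls (no comparison is possible there).
    """
    cells = defaultdict(lambda: {0: [], 1: []})
    for x, d, y in rows:
        cells[x][d].append(y)

    weighted_sum = 0.0
    n_treated_used = 0
    dropped = []
    for x, groups in cells.items():
        treated, controls = groups[1], groups[0]
        if not treated:
            continue  # no treated units: zero weight in the ATT
        if not controls:
            dropped.append(x)  # treated units without any comparison
            continue
        diff = sum(treated) / len(treated) - sum(controls) / len(controls)
        # Weight each cell by its number of treated units, i.e. by the
        # distribution of X among the treated.
        weighted_sum += diff * len(treated)
        n_treated_used += len(treated)

    if n_treated_used == 0:
        raise ValueError("no cell contains both treated and control units")
    return weighted_sum / n_treated_used, dropped


# Hypothetical toy data: (cell, treatment indicator, outcome).
data = [
    (("urban", "female"), 1, 12.0),
    (("urban", "female"), 1, 14.0),
    (("urban", "female"), 0, 10.0),
    (("rural", "male"), 1, 9.0),
    (("rural", "male"), 0, 8.0),
    (("rural", "male"), 0, 6.0),
    (("rural", "female"), 1, 7.0),  # no control in this cell
]
att, dropped = att_by_cells(data)
print(att, dropped)
```

In this toy example one of the three cells with treated units has no controls, which is the kind of failure the dimensionality problem produces as the number of cells grows.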

The Dimensionality Problem: Examples
- How many cells do we have with 2 binary X variables?
- And with 3 binary X variables?
- And with K binary X variables?
- How about if we have 2 variables that take on 7 values each?
As the number of cells grows, we'll get lack of common support:
- cells containing only treated observations
- cells containing only controls

An alternative to solve the Dimensionality Problem
The propensity score makes it possible to convert the multidimensional setup of matching into a one-dimensional setup. In that way, it reduces the dimensionality problem.
Rosenbaum and Rubin (1983) propose an equivalent and feasible estimation strategy based on the concept of the Propensity Score.
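For the cell-count questions above, the number of cells is the product of the number of values each covariate can take. A small sketch follows; the helper name `n_cells` is hypothetical, and K = 10 is just an illustrative choice.

```python
from math import prod


def n_cells(values_per_covariate):
    """Number of cells when covariate k takes values_per_covariate[k] values."""
    return prod(values_per_covariate)


print(n_cells([2, 2]))     # 2 binary covariates -> 4
print(n_cells([2, 2, 2]))  # 3 binary covariates -> 8
print(n_cells([2] * 10))   # K = 10 binary covariates -> 2**10 = 1024
print(n_cells([7, 7]))     # 2 covariates with 7 values each -> 49
```

With K binary covariates the count is 2^K, so it doubles with every covariate added while the sample size stays fixed.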

Matching based on the Propensity Score
Definition: the propensity score is the conditional probability of receiving the treatment given the pre-treatment variables:
$p(X) = \Pr\{D = 1 \mid X\} = E_X\{D \mid X\}$
Lemma 1: If p(X) is the propensity score, then
$D \perp X \mid p(X)$
Given the propensity score, the pre-treatment variables are balanced between beneficiaries and non-beneficiaries.
Lemma 2:
$(Y_1, Y_0) \perp D \mid X \;\Rightarrow\; (Y_1, Y_0) \perp D \mid p(X)$
Suppose that assignment to treatment is unconfounded given the pre-treatment variables X. Then assignment to treatment is unconfounded given the propensity score p(X).

Does the propensity score approach solve the dimensionality problem?
YES! The balancing property of the propensity score (Lemma 1) ensures that:
- observations with the same propensity score have the same distribution of observable covariates independently of treatment status; and
- for a given propensity score, assignment to treatment is random, and therefore treatment and control units are observationally identical on average.

Implementation of the estimation strategy
This suggests the following strategy for the estimation of the average treatment effect δ.
Step 1: Estimate a logit (or probit) model of program participation. Predicted values are the propensity scores. E.g. with a logit function, see Annex 3.
This step is necessary because the true propensity score is unknown and therefore has to be estimated.

When is propensity score matching appropriate?
Idea behind propensity score matching: estimation of treatment effects requires a careful matching of treated and controls.
If treated and controls are very different in terms of observables, this matching is not sufficiently close and reliable, or it may even be impossible.
The comparison of the estimated propensity scores across treated and controls provides a useful diagnostic tool to evaluate how similar the treated and controls are, and therefore how reliable the estimation strategy is.
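Step 1 can be illustrated with a minimal logit fitted by gradient ascent, using only the Python standard library. This is a sketch under assumptions, not the procedure used in the slides: the function names, the step size, the number of iterations and the eight-observation data set are all hypothetical. In practice a statistical package would be used for this step.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, d, steps=20000, lr=0.1):
    """Fit Pr(D = 1 | X) = sigmoid(b0 + b.x) by gradient ascent
    on the mean log-likelihood. Returns [b0, b1, ..., bk]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            err = di - p  # score contribution of this observation
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Predicted participation probabilities, i.e. estimated scores."""
    return [
        sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
        for xi in X
    ]


# Hypothetical data: one pre-treatment covariate, participation dummy d.
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
d = [0, 0, 1, 0, 1, 0, 1, 1]

beta = fit_logit(X, d)
scores = propensity_scores(X, beta)
print([round(s, 3) for s in scores])
```

The predicted values are the estimated propensity scores; comparing their distribution across treated and controls is the diagnostic described above.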

So you want the propensity score to be the same for treated and controls
The range of variation of propensity scores should be the same for treated and controls.
- Count how many controls have a propensity score lower than the minimum or higher than the maximum of the propensity scores of the treated, and vice versa.
The frequency of propensity scores should be the same for treated and controls.
- Draw histograms of the estimated propensity scores for the treated and controls.
- The bins correspond to the blocks constructed for the estimation of propensity scores.

The issue of common support
[Figure: density of propensity scores for non-participants and for participants. Horizontal axis: propensity score, from 0 to 1, with the region of common support marked. Vertical axis: density.]
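The range check described above (counting controls outside the range of the treated, and vice versa) can be sketched as follows. The function name `common_support` and the scores are hypothetical; the min/max rule shown is only one simple way to define the region of common support.

```python
def common_support(scores, treated):
    """Min/max common-support check on estimated propensity scores.

    Returns ((lo, hi), keep, controls_off, treated_off) where
    (lo, hi) is the overlap of the two score ranges, keep flags the
    observations inside it, and the two counts are the units lying
    outside the range of the other group.
    """
    t = [s for s, d in zip(scores, treated) if d == 1]
    c = [s for s, d in zip(scores, treated) if d == 0]
    if not t or not c:
        raise ValueError("need at least one treated and one control unit")

    lo = max(min(t), min(c))
    hi = min(max(t), max(c))
    keep = [lo <= s <= hi for s in scores]
    controls_off = sum(1 for s in c if s < min(t) or s > max(t))
    treated_off = sum(1 for s in t if s < min(c) or s > max(c))
    return (lo, hi), keep, controls_off, treated_off


# Hypothetical estimated scores: five controls followed by five treated.
scores = [0.05, 0.10, 0.30, 0.45, 0.60, 0.35, 0.55, 0.70, 0.90, 0.95]
treated = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

(lo, hi), keep, controls_off, treated_off = common_support(scores, treated)
print(lo, hi, controls_off, treated_off, sum(keep))
```

Here three controls score below the lowest treated unit and three treated units score above the highest control, so only four observations fall in the overlap.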

Example: Common support issues
[Figure A1: Propensity scores for EiC Phase 1 and non-EiC schools. Two panels by treatment status (0 and 1); horizontal axis: linear prediction; vertical axis: density.]
Source: Machin, McNally, Meghir, Excellence in Cities: Evaluation of an education policy in disadvantaged areas.

Implementation of the estimation strategy
Remember we're discussing a strategy for the estimation of the average treatment effect on the treated, called δ.
Step 1: Estimate the propensity score (see Annex 3).
Step 2: Restrict the analysis to the region of common support (lack of common support is a key source of bias in observational studies).

Step 3: Estimate the average treatment effect given the propensity score
- For each participant, find a sample of non-participants that have similar propensity scores.
- Compare the outcome indicator for each participant and its comparison group.
- Calculate the mean of these individual gains to obtain the average overall gain:
$ATT = \sum_{j=1}^{P} \left( Y_{j1} - \sum_{i=1}^{NP} W_{ij} Y_{ij0} \right) / P$
where P is the number of participants, NP is the number of non-participants, and $W_{ij}$ is the weight given to non-participant i in the comparison group of participant j.

Step 3: Estimate the average treatment effect given the propensity score (continued)
"Similar" can be defined in many ways. These different weights correspond to different ways of doing matching:
- Stratification on the Score
- Nearest neighbor matching on the Score
- Radius matching on the Score
- Kernel matching on the Score
- Weighting on the basis of the Score
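As one concrete instance of Step 3, the sketch below uses nearest neighbor matching with replacement: each participant is compared with the single non-participant whose score is closest, which amounts to a weight of 1 on that non-participant and 0 on all others. The function name and the data are hypothetical, and ties are broken by list order purely for simplicity.

```python
def att_nearest_neighbor(scores, treated, outcomes):
    """ATT by one-to-one nearest neighbor matching (with replacement)
    on the estimated propensity score."""
    controls = [
        (s, y) for s, d, y in zip(scores, treated, outcomes) if d == 0
    ]
    if not controls:
        raise ValueError("no non-participants to match against")

    gains = []
    for s, d, y in zip(scores, treated, outcomes):
        if d != 1:
            continue
        # Closest non-participant in terms of the score; min() keeps
        # the first one found when distances are tied.
        _, y0 = min(controls, key=lambda c: abs(c[0] - s))
        gains.append(y - y0)  # individual gain for this participant

    if not gains:
        raise ValueError("no participants in the sample")
    return sum(gains) / len(gains)


# Hypothetical data: three participants, four non-participants.
scores = [0.30, 0.50, 0.70, 0.28, 0.55, 0.65, 0.90]
treated = [1, 1, 1, 0, 0, 0, 0]
outcomes = [10.0, 12.0, 15.0, 8.0, 11.0, 12.0, 20.0]

att = att_nearest_neighbor(scores, treated, outcomes)
print(att)  # individual gains are 2, 1 and 3, so the mean is 2.0
```

Radius and kernel matching differ only in how the weights are spread over non-participants with nearby scores.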

To summarize:
- Matching is the observational analogue of an experiment in which placement is independent of outcomes.
- The key difference is that a pure experiment does not require the untestable assumption of independence conditional on observables.
- PSM requires good data.
- It is often combined with difference-in-difference methods (to control for selection based on time-invariant unobserved characteristics).

References
Dehejia, R.H. and S. Wahba (1999), Causal Effects in Non-experimental Studies: Reevaluating the Evaluation of Training Programs, Journal of the American Statistical Association, 94, 448, 1053-1062.
Dehejia, R.H. and S. Wahba (1996), Causal Effects in Non-experimental Studies: Reevaluating the Evaluation of Training Programs, Harvard University, Mimeo.
Hahn, Jinyong (1998), On the role of the propensity score in efficient semiparametric estimation of average treatment effects, Econometrica, 66, 2, 315-331.
Heckman, James J., H. Ichimura, and P. Todd (1998), Matching as an econometric evaluation estimator, Review of Economic Studies, 65, 261-294.
Hirano, K., G.W. Imbens and G. Ridder (2000), Efficient Estimation of Average Treatment Effects using the Estimated Propensity Score, mimeo.
Rosenbaum, P.R. and D.B. Rubin (1983), The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika, 70, 1, 41-55.
Vinha, K. (2006), A primer on Propensity Score Matching Estimators, Documento CEDE 2006-13, Universidad de los Andes.

Thank You
Q & A

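Annex 1: Average effects of treatment on the treated assuming unconfoundedness given X
If we are willing to assume unconfoundedness:
$$E_i\{Y_0(u_i) \mid D_i = 0, X\} = E_i\{Y_0(u_i) \mid D_i = 1, X\} = E_i\{Y_0(u_i) \mid X\}$$
$$E_i\{Y_1(u_i) \mid D_i = 0, X\} = E_i\{Y_1(u_i) \mid D_i = 1, X\} = E_i\{Y_1(u_i) \mid X\}$$
Using these expressions, we can define, for each cell defined by X, $\delta_X$ = average treatment effect on the treated in the cell defined by X:
$$\begin{aligned}
\delta_X &= E_i\{\delta_i \mid D_i = 1, X\} \\
&= E_i\{Y_1(u_i) - Y_0(u_i) \mid D_i = 1, X\} \\
&= E_i\{Y_1(u_i) \mid D_i = 1, X\} - E_i\{Y_0(u_i) \mid D_i = 1, X\} \\
&= E_i\{Y_1(u_i) \mid D_i = 1, X\} - E_i\{Y_0(u_i) \mid D_i = 0, X\}
\end{aligned}$$
In the third line, we can measure the sample analog of the first term, but we can NOT measure the sample analog of the second term. In the last line, which follows from unconfoundedness, we can measure the sample analog of both terms.

Annex 1: Average effects of treatment on the treated assuming unconfoundedness given X (continued)
Now what is the relation between "average treatment effect on the treated" and "average treatment effect on the treated within cell defined by X"?
$$\begin{aligned}
\delta &= \text{average treatment effect on the treated} \\
&= E_i\{\delta_i \mid D_i = 1\} \\
&= E_X\{E_i\{\delta_i \mid D_i = 1, X\} \mid D_i = 1\} \quad \text{by the law of iterated expectations} \\
&= E_X\{\delta_X \mid D_i = 1\} \\
&= E_X\{\text{average treatment effect on the treated within cell defined by } X\}
\end{aligned}$$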

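Annex 2: Average effects of treatment and the propensity score
So let's match treatments and controls on the basis of the propensity score p(X) instead of X.
$$E_i\{Y_0(u_i) \mid D_i = 0, p(X_i)\} = E_i\{Y_0(u_i) \mid D_i = 1, p(X_i)\} = E_i\{Y_0(u_i) \mid p(X_i)\}$$
$$E_i\{Y_1(u_i) \mid D_i = 0, p(X_i)\} = E_i\{Y_1(u_i) \mid D_i = 1, p(X_i)\} = E_i\{Y_1(u_i) \mid p(X_i)\}$$
Using these expressions, we can define, for each cell defined by p(X), $\delta_{p(X)}$ = average treatment effect on the treated in the cell defined by p(X):
$$\begin{aligned}
\delta_{p(X)} &= E_i\{\delta_i \mid D_i = 1, p(X_i)\} \\
&= E_i\{Y_1(u_i) - Y_0(u_i) \mid D_i = 1, p(X_i)\} \\
&= E_i\{Y_1(u_i) \mid D_i = 1, p(X_i)\} - E_i\{Y_0(u_i) \mid D_i = 1, p(X_i)\} \\
&= E_i\{Y_1(u_i) \mid D_i = 1, p(X_i)\} - E_i\{Y_0(u_i) \mid D_i = 0, p(X_i)\}
\end{aligned}$$
In the third line, we can measure the sample analog of the first term, but we can NOT measure the sample analog of the second term. In the last line, we can measure the sample analog of both terms.

Annex 2: Average effects of treatment and the propensity score (continued)
Now what is the relation between "average treatment effect on the treated" and "average treatment effect on the treated within cell defined by p(X)"?
$$\begin{aligned}
\delta &= \text{average treatment effect on the treated} \\
&= E_i\{\delta_i \mid D_i = 1\} \\
&= E_{p(X)}\{E_i\{\delta_i \mid D_i = 1, p(X_i)\} \mid D_i = 1\} \quad \text{by the law of iterated expectations} \\
&= E_{p(X)}\{\delta_{p(X)} \mid D_i = 1\} \\
&= E_{p(X)}\{\text{treatment effect on the treated within cell defined by } p(X)\}
\end{aligned}$$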

Annex 3: Estimation of the propensity score
Any standard probability model can be used to estimate the propensity score, e.g. a logit model:
$$\Pr\{D_i = 1 \mid X_i\} = \frac{e^{\lambda h(X_i)}}{1 + e^{\lambda h(X_i)}}$$
where $h(X_i)$ is a function of covariates with linear and higher order terms, and $\lambda$ is the vector of coefficients to be estimated.

Estimation of the propensity score
Which higher order terms do you include in $h(X_i)$? This is determined solely by the need to obtain an estimate of the propensity score that satisfies the balancing property.
The specification of $h(X_i)$ is:
(1) more parsimonious than the full set of interactions between observables X;
(2) though not too parsimonious: it still needs to satisfy the balancing property.
Note: the estimation of the propensity scores does not need a behavioral interpretation.

An algorithm for estimating the propensity score
1. Start with a parsimonious logit or probit function to estimate the score.
2. Sort the data according to the estimated propensity score (from lowest to highest).
3. Stratify all observations in blocks such that in each block the estimated propensity scores for the treated and the controls are not statistically different:
a) start with five blocks of equal score range {0-0.2, ..., 0.8-1};
b) test whether the means of the scores for the treated and the controls are statistically different in each block;
c) if yes, increase the number of blocks and test again;
d) if no, go to the next step.

An algorithm for estimating the propensity score (continued)
4. Test that the balancing property holds in all blocks for all covariates:
a) for each covariate, test whether the means (and possibly higher order moments) for the treated and for the controls are statistically different in all blocks;
b) if one covariate is not balanced in one block, split the block and test again within each finer block;
c) if one covariate is not balanced in all blocks, modify the logit estimation of the propensity score, adding more interaction and higher order terms, and then test again.
Note: in all this procedure the outcome has no role.
Use the STATA programs pscore.ado, psmatch2.ado, match.ado (from STATA, type findit followed by the name of the ado).
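The within-block tests in steps 3b and 4a can be sketched as below. This is an illustration only and not what pscore.ado does internally: the function names are hypothetical, blocks are equal-width score ranges, the test is a Welch t statistic, and the cutoff |t| <= 2 is a rough stand-in (an assumption) for a proper test at a chosen significance level.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for the difference in means of a and b.
    Returns None when either sample has fewer than two observations."""
    if len(a) < 2 or len(b) < 2:
        return None
    se2 = variance(a) / len(a) + variance(b) / len(b)
    if se2 == 0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / se2 ** 0.5


def block_index(score, n_blocks):
    """Equal-width block of a score in [0, 1]; 1.0 goes to the last block."""
    return min(int(score * n_blocks), n_blocks - 1)


def balance_by_block(scores, treated, values, n_blocks=5, cutoff=2.0):
    """For each non-empty block return (t_stat, balanced).

    Pass the scores themselves as `values` for step 3b, or a covariate
    for step 4a. balanced is None when the block is too small to test.
    """
    blocks = {}
    for s, d, v in zip(scores, treated, values):
        groups = blocks.setdefault(block_index(s, n_blocks), {0: [], 1: []})
        groups[d].append(v)

    report = {}
    for b in sorted(blocks):
        t = welch_t(blocks[b][1], blocks[b][0])
        report[b] = (t, None if t is None else abs(t) <= cutoff)
    return report


# Hypothetical estimated scores and treatment indicators.
scores = [0.05, 0.10, 0.15, 0.12, 0.08, 0.18,
          0.45, 0.50, 0.55, 0.48, 0.52, 0.58]
treated = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]

report = balance_by_block(scores, treated, scores)
for block, (t_stat, balanced) in report.items():
    print(block, round(t_stat, 2), balanced)
```

If a block came back unbalanced, the algorithm above would call for more blocks (step 3c) or a richer specification of the score (step 4c). Note that the outcome never enters this check.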