Randomized trials for policy


A review of the external validity of treatment effects
Development Policy Research Unit, University of Cape Town
Annual Bank Conference on Development Economics, 2-3 June 2014

Table of contents
1 Introduction: Paper overview; Background
2 Simple external validity: Class size example; Resolution by sampling?
3 Empirical challenges: Inconsistencies in method
4

Paper overview
1 Review of critiques of RCTs and responses
2 Review of literature(s) on external validity: programme evaluation; experimental economics; medicine; philosophy; structural econometrics; time-series econometrics
3 External validity as a problem of interaction: framework, possible solutions and implications

Background
RCTs have become very important in (development) economics, but also controversial:
Internal validity: (when) do RCTs identify a causal effect of interest?
External validity: what can we infer about causal relationships in non-experimental populations from RCT results?
Can RCTs address the big questions of development?
Overlap among these questions obscures the fundamental problem of external validity, so I focus on extrapolation from an ideal experiment.

Interaction: the fundamental challenge to external validity

Definition (simple external validity):
$$E[Y_i(1) - Y_i(0) \mid D_i = 1] = E[Y_i(1) - Y_i(0) \mid D_i = 0] \tag{1}$$

"all ... threats to external validity [can be described] in terms of statistical interaction effects" (Cook and Campbell, 1979)

Straightforward to show that if the treatment variable interacts with some covariate(s) $W$, then simple external validity fails where $E[W \mid D = 1] \neq E[W \mid D = 0]$.
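The failure of simple external validity under interaction can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: a single binary covariate W, an individual effect of the form `base_effect + interaction * W`, and the reading that D = 0 is the experimental population and D = 1 the population of policy interest. The function name and all parameter values are hypothetical.

```python
import random
from statistics import mean

def simulate_population(n, p_w, rng, base_effect=1.0, interaction=2.0):
    """Return individual treatment effects Y(1) - Y(0) for n units.

    Each unit has a binary covariate W with Pr(W = 1) = p_w, and its
    individual effect is base_effect + interaction * W (an assumed form).
    """
    effects = []
    for _ in range(n):
        w = 1 if rng.random() < p_w else 0
        effects.append(base_effect + interaction * w)
    return effects

rng = random.Random(0)

# With interaction: the populations differ in E[W | D], so the
# average effects differ and equation (1) fails.
ate_experiment = mean(simulate_population(20000, p_w=0.2, rng=rng))  # D = 0
ate_target = mean(simulate_population(20000, p_w=0.8, rng=rng))      # D = 1

# Without interaction: the averages coincide whatever E[W | D] is.
ate_experiment_additive = mean(simulate_population(20000, 0.2, rng, interaction=0.0))
ate_target_additive = mean(simulate_population(20000, 0.8, rng, interaction=0.0))

print(ate_experiment, ate_target)
print(ate_experiment_additive, ate_target_additive)
```

With these illustrative values the two averages under interaction should be close to 1.4 and 2.6, while the additive case gives the same average effect in both populations.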

Illustrative example: class size and test scores
Class size has been an important example in external validity (EV) debates (Angrist and Pischke, 2010), but empirical studies are based on an additive educational production function.
Alternative (simple) theory: class size matters because of what happens in the classroom.
Formally:
$$A_{ijgk} = \alpha_{0ig} + \alpha_1 H_{ig} + \beta (1 - \delta c_{gj}) f(q_{gj}, R_{gj}, \alpha_{0jg}) + \alpha_2 G_{gk} + \epsilon_{igjk}$$
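To see why this specification implies interaction, a short derivation may help. It is a sketch under two assumptions not stated on the slide: that the class-size term enters as $(1 - \delta c_{gj})$, and that the arguments of $f$ do not themselves depend on $c_{gj}$.

```latex
\frac{\partial A_{ijgk}}{\partial c_{gj}}
  = -\beta \, \delta \, f(q_{gj}, R_{gj}, \alpha_{0jg})
```

Under those assumptions the effect of a change in class size is not a constant: it scales with $f$. Two populations with different distributions of the arguments of $f$ would then have different average effects, which is the situation in which (1) fails.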

Sampling and replication
Cook and Campbell (1979) frame the problem in terms of interaction and the solution in terms of sampling or replication. (More recently, see Allcott and Mullainathan (2012).)
First option: random sampling for representativeness.
Second option: deliberate sampling for heterogeneity (to meet the overlapping support condition of Hotz et al. (2005)).

Definition:
$$E[Y_i(1) - Y_i(0) \mid D_i = 1] = E_W\big[\, E[Y_i \mid T_i = 1, D_i = 0, W_i] - E[Y_i \mid T_i = 0, D_i = 0, W_i] \,\big|\, D_i = 1 \big]$$

Hotz et al. (2005) show three conditions are sufficient. Successful randomization plus:

Location independence:
$$D_i \perp (Y_i(0), Y_i(1)) \mid W_i \tag{2}$$

Overlapping support:
$$\delta < \Pr(D_i = 1 \mid W_i = w) < 1 - \delta \tag{3}$$
for some $\delta > 0$ and for all $w \in \mathbb{W}$.
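The right-hand side of the definition can be sketched as a simple stratify-and-reweight calculation: estimate the treatment effect within each value of W in the experimental population (D = 0), then average those estimates over the distribution of W in the target population (D = 1). The code below is an illustration only. It assumes a single discrete W, and the function name, data-generating process and parameter values are all hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean

def reweighted_effect(experiment, target_w):
    """Average stratum-level effects from D = 0 over W observed in D = 1.

    experiment: list of (w, t, y) tuples from the experimental population.
    target_w:   list of w values observed in the target population.
    """
    outcomes = defaultdict(lambda: {0: [], 1: []})
    for w, t, y in experiment:
        outcomes[w][t].append(y)
    # Conditional average treatment effect within each stratum of W.
    cate = {}
    for w, arms in outcomes.items():
        if arms[0] and arms[1]:
            cate[w] = mean(arms[1]) - mean(arms[0])
    # Overlapping support: every target value of W needs an estimate.
    missing = set(target_w) - set(cate)
    if missing:
        raise ValueError(f"no overlap for W values: {sorted(missing)}")
    return mean(cate[w] for w in target_w)

rng = random.Random(1)

def draw_experiment(n, p_w):
    """Simulated experiment with randomized T and an effect of 1 + 2W."""
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < p_w else 0
        t = rng.randint(0, 1)
        y = 0.5 * w + t * (1.0 + 2.0 * w) + rng.gauss(0, 1)
        rows.append((w, t, y))
    return rows

experiment = draw_experiment(40000, p_w=0.2)
target_w = [1 if rng.random() < 0.8 else 0 for _ in range(40000)]

treated = [y for _, t, y in experiment if t == 1]
control = [y for _, t, y in experiment if t == 0]
naive = mean(treated) - mean(control)        # effect in the experiment
adjusted = reweighted_effect(experiment, target_w)
print(naive, adjusted)
```

In this simulated setting the naive experimental estimate should be near 1.4 and the reweighted estimate near 2.6. The calculation works here only because W is known, observed in both populations, and has overlapping support.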

Problem 1: Empirical requirements
Table: Empirical requirements for external validity (assuming an ideal experiment, no specification of functional form)
R1: The interacting factors ($W$) must be known ex ante.
R2: All elements of $W$ must be observed in both populations.
R3.1: Empirical measures of elements of $W$ must be comparable across populations.
R4.1: The researcher must be able to obtain unbiased estimates of the conditional average treatment effect ($E[Y(1) - Y(0) \mid D = 0, W]$) for all values of $W$.

Problem 2: Inconsistency
Manski (2013a,b) has noted an asymmetry in dealing with internal and external validity. The above framework elucidates one simple aspect of this.
Assumptions required for non-experimental matching methods?
1 Unconfoundedness/selection on observables: $T_i \perp (Y_i(0), Y_i(1)) \mid X$
2 Overlapping support (across $T = 0$ and $T = 1$)
But these are equivalent in form to the requirements for conditional external validity, using $X$ and $T$ instead of $W$ and $D$.
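The formal equivalence can be made concrete with a sketch in which one and the same overlap check is applied first to (T, X) and then to (D, W). The function name, the threshold `delta` and the toy data are hypothetical and serve only to illustrate the symmetry.

```python
from collections import Counter

def overlap_violations(indicator, covariate, delta=0.05):
    """Return covariate values whose estimated Pr(indicator = 1 | covariate)
    falls outside the open interval (delta, 1 - delta)."""
    totals = Counter(covariate)
    ones = Counter(c for i, c in zip(indicator, covariate) if i == 1)
    violations = {}
    for value, n in totals.items():
        share = ones[value] / n
        if not (delta < share < 1 - delta):
            violations[value] = share
    return violations

# Internal validity: treatment T against observed covariates X.
T = [1, 0, 1, 0, 1, 1]
X = ["a", "a", "b", "b", "c", "c"]
matching_check = overlap_violations(T, X)

# External validity: population indicator D against interacting factors W.
D = [0, 0, 0, 1, 1, 1]
W = ["low", "low", "high", "high", "high", "low"]
extrapolation_check = overlap_violations(D, W)

print(matching_check)
print(extrapolation_check)
```

In the toy data, stratum "c" contains only treated units and so is flagged in the first check, while the second check finds no violation. The point is only that the same code serves both purposes once the variable names are swapped.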

External validity problem is currently unresolved
Therefore:
1 Either more caution in claiming policy relevance of randomised evaluations;
2 Or acceptance that qualitative (subjective?) assessment of external validity is inconsistent with insisting on randomization for internal validity.
3 Replication (maybe also random sampling) cannot answer the external validity question without information on interacting variables.
Theory may help by providing guidance on what the interacting factors might be, but empirical obstacles remain impressive and may be insurmountable in some (many?) cases.

Allcott, H. and S. Mullainathan (2012). External validity and partner selection bias. NBER Working Paper (18373).
Angrist, J. D. and J.-S. Pischke (2010). The credibility revolution in empirical economics: How better research design is taking the con out of econometrics. Journal of Economic Perspectives 24(2), 3-30.
Cook, T. D. and D. T. Campbell (1979). Quasi-Experimentation: Design and Analysis Issues for Field Settings. Wadsworth.
Hotz, V. J., G. W. Imbens, and J. H. Mortimer (2005). Predicting the efficacy of future training programs using past experiences at other locations. Journal of Econometrics 125, 241-270.
Manski, C. F. (2013a). Public policy in an uncertain world: analysis and decisions. Cambridge (MA): Harvard University Press.
Manski, C. F. (2013b). Response to the review of public policy in an uncertain world. Economic Journal 123, F412-F415.