Regression Discontinuity Designs in Stata


Regression Discontinuity Designs in Stata
Matias D. Cattaneo, University of Michigan
July 30, 2015

Overview

Main goal: learn about the treatment effect of a policy or intervention.
- If treatment randomization is available, treatment effects are easy to estimate.
- If treatment randomization is not available, turn to observational studies: instrumental variables, selection on observables, regression discontinuity (RD) designs.
RD designs are simple and objective, and require little information when the design is available. They might be viewed as a local randomized trial: easy to falsify, easy to interpret. Careful: very local!

Overview of RD Packages (https://sites.google.com/site/rdpackages)

rdrobust package: estimation, inference and graphical presentation using local polynomials, partitioning, and spacings estimators.
- rdrobust: RD inference (point estimation and CIs; classic, bias-corrected, robust).
- rdbwselect: bandwidth or window selection (IK, CV, CCT).
- rdplot: plots the data (with optimal block length).
rddensity package: test for a discontinuity in the density at the cutoff (a.k.a. manipulation testing) using a novel local polynomial density estimator.
- rddensity: manipulation testing using local polynomial density estimation.
- rdbwdensity: bandwidth or window selection.
rdlocrand package: covariate balance, binomial tests, randomization inference methods (window selection and inference).
- rdrandinf: inference using randomization-inference methods.
- rdwinselect: falsification testing and window selection.
- rdsensitivity: treatment effect models over a grid of windows, CI inversion.
- rdrbounds: Rosenbaum bounds.
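As a point of reference, a minimal sketch of how these commands are typically invoked follows; the outcome y, running variable x, and cutoff value 0 are hypothetical placeholders, and installation instructions are available at the site above.

    * Minimal calls to the packages listed above (y, x and the cutoff 0 are placeholders;
    * see https://sites.google.com/site/rdpackages for installation instructions).
    rdrobust y x, c(0)        // RD point estimate with robust bias-corrected inference
    rdbwselect y x, c(0)      // data-driven bandwidth selection
    rdplot y x, c(0)          // RD plot: global fit plus binned local means
    rddensity x, c(0)         // manipulation test for the running variable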

Randomized Control Trials

Notation: (Y_i(0), Y_i(1), X_i), i = 1, 2, ..., n.
Treatment: T_i ∈ {0, 1}, with T_i independent of (Y_i(0), Y_i(1), X_i).
Data: (Y_i, T_i, X_i), i = 1, 2, ..., n, with
    Y_i = Y_i(0) if T_i = 0,  and  Y_i = Y_i(1) if T_i = 1.
Average Treatment Effect:
    τ_ATE = E[Y_i(1) − Y_i(0)] = E[Y_i | T_i = 1] − E[Y_i | T_i = 0].
Experimental Design.
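Under randomization, τ_ATE can be estimated by a simple difference in means; a minimal sketch, with hypothetical variable names y (observed outcome) and t (treatment indicator):

    * Difference-in-means estimate of the ATE under randomization
    * (y and t are hypothetical variable names).
    regress y t, vce(robust)   // the coefficient on t estimates E[Y|T=1] - E[Y|T=0]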

Sharp RD Design

Notation: (Y_i(0), Y_i(1), X_i), i = 1, 2, ..., n, with X_i continuous.
Treatment: T_i ∈ {0, 1}, T_i = 1(X_i ≥ x̄).
Data: (Y_i, T_i, X_i), i = 1, 2, ..., n, with
    Y_i = Y_i(0) if T_i = 0,  and  Y_i = Y_i(1) if T_i = 1.
Average Treatment Effect at the cutoff:
    τ_SRD = E[Y_i(1) − Y_i(0) | X_i = x̄] = lim_{x↓x̄} E[Y_i | X_i = x] − lim_{x↑x̄} E[Y_i | X_i = x].
Quasi-Experimental Design: local randomization (more later).

[Figures: simulated RD plots of the outcome variable (Y) against the assignment variable (R), showing the regression functions on either side of the cutoff at 0 and the treatment effect τ; the final panel illustrates the local random assignment interpretation near the cutoff.]

Empirical Illustration: Cattaneo, Frandsen & Titiunik (2015, JCI)

Problem: incumbency advantage (U.S. Senate).
Data: Y_i = election outcome; T_i = whether incumbent; X_i = vote share in the previous election (cutoff x̄ = 0); Z_i = covariates (demvoteshlag1, demvoteshlag2, dopen, etc.).
Potential outcomes: Y_i(0) = election outcome had the party not been the incumbent; Y_i(1) = election outcome had it been the incumbent.
Causal inference: we observe Y_i = Y_i(0) when T_i = 0 and Y_i = Y_i(1) when T_i = 1.
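A sketch of setting up this illustration in Stata, assuming a Senate dataset in the spirit of the companion files distributed with the packages; the file and variable names below are assumptions and should be adjusted to the data at hand.

    * Hypothetical data setup (file and variable names are assumptions).
    use rdrobust_senate.dta, clear
    rename vote y       // election outcome, Y_i
    rename margin x     // vote share in previous election, centered so the cutoff is 0
    summarize y x demvoteshlag1 demvoteshlag2 dopen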

Graphical and Falsification Methods

Always plot the data: this is a main advantage of RD designs!
- Plot the regression functions to assess the treatment effect and the validity of the design.
- Plot the density of X_i to assess validity; test for continuity at the cutoff and elsewhere.
- Important: also use estimators that do not smooth out the data.
RD Plots (Calonico, Cattaneo & Titiunik, 2015, JASA):
- Two ingredients: (i) smoothed global polynomial fit, and (ii) binned discontinuous local-means fit.
- Two goals: (i) detection of discontinuities, and (ii) representation of variability.
- Two tuning parameters: global polynomial degree (k_n), and location (ES or QS) and number of bins (J_n).
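A sketch of the two bin-location choices with rdplot, continuing with the y and x variables from the illustration; p() sets the global polynomial degree and binselect() chooses evenly-spaced or quantile-spaced bins, per the package documentation.

    * RD plots: global polynomial fit plus binned, discontinuous local means.
    rdplot y x, c(0) p(4) binselect(es)   // evenly-spaced (ES) bins
    rdplot y x, c(0) p(4) binselect(qs)   // quantile-spaced (QS) bins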

Manipulation Tests & Covariate Balance and Placebo Tests

Density tests near the cutoff:
- Idea: the distribution of the running variable should be similar on either side of the cutoff.
- Method 1: histograms and a binomial count test.
- Method 2: density estimator at the boundary. Pre-binned local polynomial method: McCrary (2008). New tuning-parameter-free method: Cattaneo, Jansson and Ma (2015).
Placebo tests on pre-determined/exogenous covariates:
- Idea: zero RD treatment effect for pre-determined/exogenous covariates.
- Methods: global polynomial, local polynomial, randomization-based.
Placebo tests on outcomes:
- Idea: zero RD treatment effect for the outcome at values other than the cutoff.
- Methods: global polynomial, local polynomial, randomization-based.
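A sketch of the corresponding falsification commands, continuing with the variables above; demvoteshlag1 is one of the pre-determined covariates listed earlier.

    * Manipulation test: continuity of the running variable's density at the cutoff.
    rddensity x, c(0)
    * Placebo test: a pre-determined covariate should show no RD effect at the cutoff.
    rdrobust demvoteshlag1 x, c(0)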

Estimation and Inference Methods

- Global polynomial approach (not recommended).
- Robust local polynomial inference methods: bandwidth selection, bias correction, confidence intervals.
- Local randomization and randomization inference methods: window selection, estimation and inference methods.
- Falsification, sensitivity and related methods.

Conventional Local-Polynomial Approach

Idea: approximate the regression functions for control and treatment units locally.
Local-linear estimator (with kernel weights K(·)):
    for x̄ − h_n ≤ X_i < x̄:  Y_i = α_− + (X_i − x̄)β_− + ε_{−,i}
    for x̄ ≤ X_i ≤ x̄ + h_n:  Y_i = α_+ + (X_i − x̄)β_+ + ε_{+,i}
Treatment effect (at the cutoff): ˆτ_SRD = ˆα_+ − ˆα_−.
Equivalently, it can be estimated with a single weighted linear model (weights K(·)):
    Y_i = α + τ_SRD·T_i + (X_i − x̄)β_1 + T_i(X_i − x̄)γ_1 + ε_i,   x̄ − h_n ≤ X_i ≤ x̄ + h_n.
Once h_n is chosen, inference is "standard": weighted linear models. Details coming up next.
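Once h_n is chosen, the single weighted linear model above can be run directly; a minimal sketch with a uniform kernel and a hypothetical bandwidth of 0.25:

    * Local-linear RD estimate by least squares within a chosen bandwidth
    * (h = 0.25 is a hypothetical value; uniform kernel weights).
    local h = 0.25
    generate treat   = (x >= 0)      // treatment indicator T_i = 1(X_i >= cutoff)
    generate treat_x = treat * x     // interaction term (cutoff already at 0)
    regress y treat x treat_x if abs(x) <= `h', vce(robust)
    * The coefficient on treat is the estimated RD treatment effect at the cutoff.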

Conventional Local-Polynomial Approach

How to choose h_n?
- Imbens & Kalyanaraman (2012, ReStud): optimal plug-in, ĥ_IK = Ĉ_IK · n^(−1/5).
- Calonico, Cattaneo & Titiunik (2014, ECMA): refinement of IK, ĥ_CCT = Ĉ_CCT · n^(−1/5).
- Ludwig & Miller (2007, QJE): cross-validation,
    ĥ_CV = argmin_h Σ_{i=1}^n w(X_i)(Y_i − ˆµ_1(X_i, h))².
Key idea: trade off the bias and variance of ˆτ_SRD(h_n). Heuristically, the bias of ˆτ_SRD increases with h_n while its variance decreases with h_n.

Local-Polynomial Methods: Bandwidth Selection

Two main methods: plug-in and cross-validation. Both are MSE-optimal in some sense.
Imbens & Kalyanaraman (2012, ReStud) propose the MSE-optimal rule
    h_MSE = C_MSE^(1/5) · n^(−1/5),   C_MSE = C(K) · Var(ˆτ_SRD) / Bias(ˆτ_SRD)².
- IK implementation: first-generation plug-in rule.
- CCT implementation: second-generation plug-in rule.
- They differ in the way Var(ˆτ_SRD) and Bias(ˆτ_SRD) are estimated.
Imbens & Kalyanaraman (2012, ReStud) also discuss a cross-validation approach:
    ĥ_CV = argmin_{h>0} CV_δ(h),   CV_δ(h) = Σ_{i=1}^n 1(X_{−,[δ]} ≤ X_i ≤ X_{+,[δ]}) (Y_i − ˆµ(X_i; h))²,
where ˆµ_{+,p}(x; h) and ˆµ_{−,p}(x; h) are local polynomial estimates, δ ∈ (0, 1), and X_{−,[δ]} and X_{+,[δ]} denote the δ-th quantile of {X_i : X_i < x̄} and {X_i : X_i ≥ x̄}, respectively.
Our implementation uses δ = 0.5; but this is a tuning parameter!
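A sketch of these bandwidth selectors via rdbwselect, with option values following the 2014/2015 release of the package (later releases renamed the selectors):

    * Data-driven bandwidth selection (selector names per the 2014/2015 package release).
    rdbwselect y x, c(0) bwselect(IK)    // Imbens-Kalyanaraman plug-in
    rdbwselect y x, c(0) bwselect(CCT)   // Calonico-Cattaneo-Titiunik plug-in
    rdbwselect y x, c(0) bwselect(CV)    // cross-validation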

Conventional Approach to RD

Local-linear estimator (with kernel weights K(·)):
    for x̄ − h_n ≤ X_i < x̄:  Y_i = α_− + (X_i − x̄)β_− + ε_{−,i}
    for x̄ ≤ X_i ≤ x̄ + h_n:  Y_i = α_+ + (X_i − x̄)β_+ + ε_{+,i}
Treatment effect (at the cutoff): ˆτ_SRD = ˆα_+ − ˆα_−.
Construct the usual t-test. For H_0: τ_SRD = 0,
    ˆT(h_n) = ˆτ_SRD / √ˆV_n = (ˆα_+ − ˆα_−) / √(ˆV_{+,n} + ˆV_{−,n}) →d N(0, 1).
95% confidence interval:
    Î(h_n) = [ ˆτ_SRD ± 1.96·√ˆV_n ].

Bias-Correction Approach to RD

Note well: for the usual t-test,
    ˆT(h_MSE) = ˆτ_SRD / √ˆV_n →d N(B, 1) ≠ N(0, 1),   B ≠ 0.
The bias B in the RD estimator captures the curvature of the regression functions.
Undersmoothing / small-bias approach: choose a smaller h_n... perhaps ĥ_n = 0.5·ĥ_IK? ⇒ No clear guidance, and a power loss!
Bias-correction approach:
    ˆT_bc(h_n, b_n) = (ˆτ_SRD − ˆB_n) / √ˆV_n →d N(0, 1)
    ⇒ 95% confidence interval: Î_bc(h_n, b_n) = [ (ˆτ_SRD − ˆB_n) ± 1.96·√ˆV_n ].
How to choose b_n? Same ideas as before... ˆb_n = Ĉ · n^(−1/7).

Robust Bias-Correction Approach to RD

Recall:
    ˆT(h_n) = ˆτ_SRD / √ˆV_n →d N(0, 1)   and   ˆT_bc(h_n, b_n) = (ˆτ_SRD − ˆB_n) / √ˆV_n →d N(0, 1),
where ˆB_n is constructed to estimate the leading bias B.
Robust approach: decompose
    ˆT_bc(h_n, b_n) = (ˆτ_SRD − ˆB_n) / √ˆV_n = (ˆτ_SRD − B_n) / √ˆV_n + (B_n − ˆB_n) / √ˆV_n,
where the first term →d N(0, 1) and the second term →d N(0, γ).
Robust bias-corrected t-test:
    ˆT_rbc(h_n, b_n) = (ˆτ_SRD − ˆB_n) / √(ˆV_n + Ŵ_n) = (ˆτ_SRD − ˆB_n) / √ˆV_n^bc →d N(0, 1)
    ⇒ 95% confidence interval: Î_rbc(h_n, b_n) = [ (ˆτ_SRD − ˆB_n) ± 1.96·√ˆV_n^bc ],   ˆV_n^bc = ˆV_n + Ŵ_n.

Local-Polynomial Methods: Robust Inference

Approach 1: undersmoothing / small bias.
    Î(h_n) = [ ˆτ_SRD ± 1.96·√ˆV_n ]
Approach 2: bias correction (not recommended).
    Î_bc(h_n, b_n) = [ (ˆτ_SRD − ˆB_n) ± 1.96·√ˆV_n ]
Approach 3: robust bias correction.
    Î_rbc(h_n, b_n) = [ (ˆτ_SRD − ˆB_n) ± 1.96·√(ˆV_n + Ŵ_n) ]
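In rdrobust, the three flavors of inference can be reported together with the all option; a sketch, with hand-picked bandwidths shown only as hypothetical values:

    * Conventional, bias-corrected, and robust bias-corrected inference in one call.
    rdrobust y x, c(0) all
    * The bandwidths h and b can also be set manually (values below are hypothetical).
    rdrobust y x, c(0) h(0.25) b(0.40) all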

Local-Randomization Approach and Finite-Sample Inference

Popular approach: local-polynomial methods.
- Approximates the regression functions and relies on continuity assumptions.
- Requires choosing weights, a bandwidth and a polynomial order.
Alternative approach: local randomization + randomization inference.
- Gives an alternative that can be used as a robustness check.
Key assumption: there exists a window W = [x̄ − h_n, x̄ + h_n] around the cutoff x̄ where T_i is independent of (Y_i(0), Y_i(1)) for all X_i ∈ W.
In words: treatment is randomly assigned within W.
Good news: if plausible, then RCT ideas/methods apply.
Not-so-good news: most plausible for very small windows (very few observations).
One solution: employ a small window but use randomization-inference methods.
Requires choosing a randomization rule, a window and a statistic.

Local-Randomization Approach and Finite-Sample Inference

Recall the key assumption: there exists a window W = [x̄ − h_n, x̄ + h_n] around the cutoff where T_i is independent of (Y_i(0), Y_i(1)) for all X_i ∈ W.
How to choose the window? Use balance tests on pre-determined/exogenous covariates. Very intuitive, easy to implement.
How to conduct inference? Use randomization-inference methods (see the sketch below):
1. Choose the statistic of interest, e.g., the t-statistic for a difference in means.
2. Choose the randomization rule, e.g., the numbers of treated and control units are taken as given.
3. Compute the finite-sample distribution of the statistic by permuting treatment assignments.
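A sketch of the window-selection and randomization-inference steps with the rdlocrand commands; the covariates are those listed in the illustration, the window bounds are hypothetical, and the cutoff is taken to be 0 since the running variable is already centered.

    * Window selection based on covariate balance (running variable centered at 0).
    rdwinselect x demvoteshlag1 demvoteshlag2 dopen
    * Randomization inference within a chosen window (bounds are hypothetical).
    rdrandinf y x, wl(-0.75) wr(0.75)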

Local-Randomization Approach and Finite-Sample Inference

Do not forget to validate and falsify the empirical strategy:
1. Plot the data to make sure local randomization is plausible.
2. Conduct placebo tests (e.g., use pre-intervention outcomes or other covariates not used to select W).
3. Do a sensitivity analysis, as in the sketch below.
See Cattaneo, Frandsen and Titiunik (2015) for an introduction.
See Cattaneo, Titiunik and Vazquez-Bare (2015) for further results and implementation details.
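For the sensitivity analysis, one simple check is to repeat the randomization inference over several nested windows; a sketch with hypothetical window values:

    * Sensitivity of the randomization-inference result to the chosen window.
    foreach w of numlist 0.50 0.75 1.00 {
        rdrandinf y x, wl(-`w') wr(`w')
    }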