Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity.

Similar documents
Identification Analysis for Randomized Experiments with Noncompliance and Truncation-by-Death

Fuzzy Changes-in-Changes

Identi cation of Positive Treatment E ects in. Randomized Experiments with Non-Compliance

Testing instrument validity for LATE identification based on inequality moment constraints

University of Toronto Department of Economics. Testing Local Average Treatment Effect Assumptions

Fuzzy Differences-in-Differences

An Alternative Assumption to Identify LATE in Regression Discontinuity Designs

treatment effect models

Testing instrument validity for LATE identification based on inequality moment constraints

IsoLATEing: Identifying Heterogeneous Effects of Multiple Treatments

Sharp IV bounds on average treatment effects on the treated and other populations under endogeneity and noncompliance

Partial Identification of Average Treatment Effects in Program Evaluation: Theory and Applications

Going Beyond LATE: Bounding Average Treatment Effects of Job Corps Training

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Sensitivity checks for the local average treatment effect

An Alternative Assumption to Identify LATE in Regression Discontinuity Design

TESTING LOCAL AVERAGE TREATMENT EFFECT ASSUMPTIONS. This version: Thursday 19 th November, 2015 First version: December 7 th, 2013

Trimming for Bounds on Treatment Effects with Missing Outcomes *

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Bounds on Population Average Treatment E ects with an Instrumental Variable

Flexible Estimation of Treatment Effect Parameters

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Using Instrumental Variables for Inference about Policy Relevant Treatment Parameters

Using Instrumental Variables for Inference about Policy Relevant Treatment Parameters

On Variance Estimation for 2SLS When Instruments Identify Different LATEs

arxiv: v6 [stat.me] 22 Oct 2018

Sharp Bounds on Causal Effects under Sample Selection*

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Identification with Latent Choice Sets: The Case of the Head Start Impact Study

Regression Discontinuity Designs

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

AGEC 661 Note Fourteen

ted: a Stata Command for Testing Stability of Regression Discontinuity Models

A Note on Adapting Propensity Score Matching and Selection Models to Choice Based Samples

Groupe de lecture. Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings. Abadie, Angrist, Imbens

NBER WORKING PAPER SERIES USING INSTRUMENTAL VARIABLES FOR INFERENCE ABOUT POLICY RELEVANT TREATMENT EFFECTS

Two-way fixed effects estimators with heterogeneous treatment effects

Supplemental Appendix to "Alternative Assumptions to Identify LATE in Fuzzy Regression Discontinuity Designs"

Principles Underlying Evaluation Estimators

Two-way fixed effects estimators with heterogeneous treatment effects

What do instrumental variable models deliver with discrete dependent variables?

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh)

TESTING IDENTIFYING ASSUMPTIONS IN FUZZY REGRESSION DISCONTINUITY DESIGN 1. INTRODUCTION

The Econometric Evaluation of Policy Design: Part I: Heterogeneity in Program Impacts, Modeling Self-Selection, and Parameters of Interest

ECO 2403 TOPICS IN ECONOMETRICS

The problem of causality in microeconometrics.

Bounds on Average and Quantile Treatment Effects of Job Corps Training on Wages*

The problem of causality in microeconometrics.

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects.

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

It s never too LATE: A new look at local average treatment effects with or without defiers

Testing the validity of the sibling sex ratio instrument

Instrumental Variables

Two-way fixed effects estimators with heterogeneous treatment effects

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL

Weak Stochastic Increasingness, Rank Exchangeability, and Partial Identification of The Distribution of Treatment Effects

Exact Nonparametric Inference for a Binary. Endogenous Regressor

What s New in Econometrics. Lecture 1

Bounds on Average and Quantile Treatment E ects of Job Corps Training on Participants Wages

Recitation Notes 6. Konrad Menzel. October 22, 2006

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Nonadditive Models with Endogenous Regressors

Supplement to Fuzzy Differences-in-Differences

arxiv: v2 [stat.me] 21 Nov 2016

Identifying Effects of Multivalued Treatments

Course Description. Course Requirements

Recitation Notes 5. Konrad Menzel. October 13, 2006

Inference on Breakdown Frontiers

Identifying Treatment Effects in the Presence of Confounded Types

ECONOMETRICS II (ECO 2401) Victor Aguirregabiria. Spring 2018 TOPIC 4: INTRODUCTION TO THE EVALUATION OF TREATMENT EFFECTS

Why high-order polynomials should not be used in regression discontinuity designs

Regression Discontinuity Designs.

Identification of Regression Models with Misclassified and Endogenous Binary Regressor

Introduction to Causal Inference. Solutions to Quiz 4

150C Causal Inference

Imbens, Lecture Notes 2, Local Average Treatment Effects, IEN, Miami, Oct 10 1

Lecture 10 Regression Discontinuity (and Kink) Design

Microeconometrics. C. Hsiao (2014), Analysis of Panel Data, 3rd edition. Cambridge, University Press.

Counterfactual worlds

The Pennsylvania State University - Department of Economics

Optimal bandwidth selection for the fuzzy regression discontinuity estimator

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Applied Microeconometrics Chapter 8 Regression Discontinuity (RD)

Instrumental Variables in Models with Multiple Outcomes: The General Unordered Case

PSC 504: Instrumental Variables

Quantitative Economics for the Evaluation of the European Policy

The Generalized Roy Model and Treatment Effects

CEPA Working Paper No

Identification and Inference in Regression Discontinuity Designs with a Manipulated Running Variable

Potential Outcomes and Causal Inference I

Generated Covariates in Nonparametric Estimation: A Short Review.

Essays on Microeconometrics

Lecture 11 Roy model, MTE, PRTE

Exploring Marginal Treatment Effects

Applied Microeconometrics. Maximilian Kasy

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

A Distinction between Causal Effects in Structural and Rubin Causal Models

Impact Evaluation Technical Workshop:

Transcription:

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity. Clément de Chaisemartin September 1, 2016 Abstract This paper gathers the supplementary material to de Chaisemartin (2016). I show that one can estimate quantile treatment eects in the surviving-compliers subpopulation, that one can test the CD condition, and that my results extend to multivariate treatment and instrument, to fuzzy regression discontinuity designs, and to the bounds in Lee (2009) for survey non-response. University of California at Santa Barbara, clementdechaisemartin@ucsb.edu

1 Testability and quantile treatment eects. In this section, I show that random instrument, exclusion restriction, and CD have a testable implication. Then, I discuss how the testable implications of random instrument, exclusion restriction, and ND studied in Huber & Mellace (2012), Kitagawa (2015), and Mourie & Wan (2014) relate to the CD condition. For any random variable R, let S(R) denote the support of R. Let U dz be a random variable uniformly distributed on [0, 1] denoting the rank of an observation in the distribution of Y D = d, Z = z. If the distribution of Y D = d, Z = z is continuous, U dz = F Y D=d,Z=z (Y ). If the distribution of Y D = d, Z = z is discrete, one can randomly allocate a rank to tied observations by setting U dz = F Y D=d,Z=z (Y ) + V (F Y D=d,Z=z (Y ) F Y D=d,Z=z (Y )), where V is uniformly distributed on [0, 1] and independent of (Y, D, Z), and Y = sup{y S(Y D = d, Z = z) : y < Y }. 1 For every d {0, 1}, let p d = F S P (D=d Z=d). Notice that both p 0 and p 1 are included between 0 and 1. Finally, let L = E(Y D = 1, Z = 1, U 11 p 1 ) E(Y D = 0, Z = 0, U 00 1 p 0 ) L = E(Y D = 1, Z = 1, U 11 1 p 1 ) E(Y D = 0, Z = 0, U 00 p 0 ). Theorem S1 If Assumptions 1, 2, and 5 are satised, L W L. (29) The intuition of this result goes as follows. Under random instrument, exclusion restriction, and CD, E(Y 1 Y 0 C V ) is point identied: following Theorem 2.1 in de Chaisemartin (2016), it is equal to W. It is also partially identied. C V is included in {D = 0, Z = 0}, and it accounts for p 0 % of this population. E(Y 0 C V ) can therefore not be larger than the mean of Y 0 of the p 0 % of this population with largest Y 0. It can also not be smaller than the mean of Y 0 of the p 0 % with lowest Y 0 (see Horowitz & Manski, 1995 and Lee (2009)). Combining this with a similar reasoning for E(Y 1 C V ) yields worst case bounds for E(Y 1 Y 0 C V ), L and L. The estimand of E(Y 1 Y 0 C V ) should lie within its bounds, hence the testable implication. To implement this test, one can use results from Andrews & Soares (2010). Let θ satisfy the two following moment inequality conditions: ( Y Z 0 E 0 E P (Z = 1) ( θ + ( F S ( Y (1 Z) P (Z = 0) F S Y DZ1{U 11 p 1 } θ + P (D = 1, Z = 1, U 11 p 1 ) Y (1 D)(1 Z)1{U 00 1 p 0 } P (D = 0, Z = 0, U 00 1 p 0 ) Y DZ1{U 11 1 p 1 } P (D = 1, Z = 1, U 11 1 p 1 ) Y (1 D)(1 Z)1{U 00 p 0 } P (D = 0, Z = 0, U 00 p 0 ) ) )) ( Y Z Y (1 Z) P (Z = 1) P (Z = 0) This denes a moment inequality model with preliminary estimated parameters. This model satises all the technical conditions required in Andrews & Soares (2010) if E( Y 2+δ ) < + 1 Y should be understood as if Y = inf{s(y D = d, Z = z)}. )). 1

for some strictly positive δ, and if P (Z = 0), P (Z = 1), P (D = 1, Z = 1, U 11 p 1 ), P (D = 1, Z = 1, U 11 1 p 1 ), P (D = 0, Z = 0, U 00 p 0 ), and P (D = 0, Z = 0, U 00 1 p 0 ) are all bounded by below by some strictly positive constant ɛ. Under these two mild restrictions, their Theorem 1 applies, and one can use it to construct a uniformly valid condence interval for θ. (29) is rejected when 0 does not belong to this condence interval. As pointed out in Balke & Pearl (1997) and Heckman & Vytlacil (2005), random instrument, exclusion restriction, and ND also have testable implications. Huber & Mellace (2011), Kitagawa (2015), Machado et al. (2013), and Mourie & Wan (2014) have developed statistical tests of these implications. I now discuss how these tests relate to the CD condition. The test suggested in Huber & Mellace (2011) is not a test of CD. Huber & Mellace (2011) use the fact that E(Y 0 NT ) and E(Y 1 AT ) are both point and partially identied under random instrument, exclusion restriction, and ND. For instance, E(Y D = 0, Z = 1) is equal to E(Y 0 NT ) and E(Y D = 0, Z = 0) is equal to (1 p 0 )E(Y 0 NT ) + p 0 E(Y 0 C). This implies E(Y D = 0, Z = 0, U 00 1 p 0 ) E(Y D = 0, Z = 1) E(Y D = 0, Z = 0, U 00 p 0 ). P (NT ) But under CD, E(Y D = 0, Z = 1) is equal to E(Y P (NT )+P (F ) 0 NT ) + the previous implication need not be true anymore. P (F ) P (NT )+P (F ) E(Y 0 F ), so The tests suggested in Kitagawa (2015), Machado et al. (2013), and Mourie & Wan (2014) are not tests of CD, but they are tests of the CDM condition introduced hereafter. Assumption S1 (Compliers-deers for marginals: CDM) There is a subpopulation of C denoted C F which satises: P (C F ) = P (F ) Y 0 C F Y 0 F Y 1 C F Y 1 F. The CDM condition requires that a subgroup of compliers have the same size and the same marginal distributions of Y 0 and Y 1 as deers. CDM is stronger than CD. On the other hand, it is invariant to the scaling of the outcome, while CD is not. With non-binary outcomes, working under assumptions invariant to scaling might be desirable so one might then prefer to invoke CDM than CD, despite the fact it is a stronger assumption. Following the logic of Theorem 2.2 in de Chaisemartin (2016), a sucient condition for CDM to hold is that there be more compliers than deers in each subgroup with the same value of (Y 0, Y 1 ): P (F Y 0, Y 1 ) P (C Y 0, Y 1 ). 2

One can show that under CDM, the marginal distributions of Y 0 and Y 1 are identied for surviving-compliers. The estimands are the same as in Imbens & Rubin (1997): f Y0 C V (y 0 ) = P (D = 0 Z = 0)f Y D=0,Z=0(y 0 ) P (D = 0 Z = 1)f Y D=0,Z=1 (y 0 ) P (D = 0 Z = 0) P (D = 0 Z = 1) f Y1 C V (y 1 ) = P (D = 1 Z = 1)f Y D=1,Z=1(y 1 ) P (D = 1 Z = 0)f Y D=1,Z=0 (y 1 ). P (D = 1 Z = 1) P (D = 1 Z = 0) The testing procedures in Kitagawa (2015) and Mourie & Wan (2014) test whether the rhs of the two previous equations are positive everywhere. Under CDM, these quantities are densities so they must be positive. These procedures are therefore tests of CDM. Machado et al. (2013) focus on binary outcomes. Their procedure tests inequalities equivalent to those considered in Kitagawa (2015). Therefore, their procedure is also a test of the CDM condition. 2 Multivariate treatment and instrument Results presented in the main paper extend to applications where treatment is multivariate. Assume that for every z {0; 1}, D z {0, 1, 2,..., J} for some integer J. Let (Y j ) j {0,1,2,...,J} denote the corresponding potential outcomes. Let C j = {D 1 j > D 0 } denote the subpopulation of compliers induced to go from a treatment below to above j by the instrument. Let F j = {D 0 j > D 1 } denote the subpopulation of deers induced to go from above to below j by the instrument. As in Angrist & Imbens (1995), I assume that the cdf of D Z = 0 stochastically dominates that of D Z = 1. This implies that P (C j ) P (F j ). Consider the following CD condition: Assumption S2 (Compliers-deers for multivariate treatment: CDMU) For every j {1, 2,..., J}, there is a subpopulation of C j denoted C j F P (C j F ) = P (F j ) E(Y j Y j 1 C j F ) = E(Y j Y j 1 F j ). which satises: If CDMU is satised, one can show that W is equal to the following average causal response: J w j E(Y j Y j 1 C j V ), j=1 where C j V is a subset of C j of size P (C j ) P (F j ), and w j are positive real numbers. This generalizes Theorem 1 in Angrist & Imbens (1995). Their Theorem 2 considers the case where the instrument is multivariate: Z {0, 1, 2,..., K}. Let β denote the coecient of D in a 2SLS regression of Y on D using the set of dummies 3

(1{Z = k}) 1 k K as instruments. Angrist & Imbens (1995) show that if D z D z for every z z, β is equal to a weighted average of average causal responses. Let C jz = {D z j > D z 1 } (resp. F jz = {D z j > D z 1 }) denote the subpopulation of compliers induced to go from a treatment below to above j (resp. above to below j) when the instrument moves from z 1 to z. Consider the following condition: Assumption S3 (Compliers-deers for multivariate treatment and instrument: CDMU2) For every j {1, 2,..., J} and z {0, 1, 2,..., K}, there is a subpopulation of C jz denoted C jz F which satises: P (C jz F ) = P (F jz ) E(Y j Y j 1 C jz F ) = E(Y j Y j 1 F jz ). Under CDMU2, one can show that β is equal to a weighted average of average causal responses for surviving-compliers. 3 Fuzzy regression discontinuity designs and survey non response My results also extend to many treatment eect models relying on ND type of conditions. An important example are fuzzy regression discontinuity (RD) designs studied in Hahn et al. (2001). Their Theorem 3 still holds if their Assumption A.3, a ND at the threshold assumption, is replaced by a CD at the threshold assumption. Let S denote the forcing variable, and let s denote the threshold. Let C s = {C, S = s} and F s = {F, S = s} respectively denote compliers and deers at the threshold. Assumption S4 (Compliers-deers at the threshold: CDT) There is a subpopulation of C s denoted C s F which satises: P (C s F S = s) = P (F s S = s) E(Y 1 Y 0 C s F, S = s) = E(Y 1 Y 0 F s, S = s). Finally, my results extend to the sample selection and survey non-response literatures. ND type of conditions have also been used in the sample selection and survey non-response literatures, for instance in Lee (2009) or Behaghel et al. (2012). Let R 0 and R 1 denote a subject's potential responses to a survey without and with the treatment. Let AR, RC, and RF respectively denote always respondents (subjects who satisfy R 0 = R 1 = 1), reponse-compliers (those who satisfy R 0 = 0 and R 1 = 1), and response-deers (those who satisfy R 0 = 1 and R 1 = 0). Under the assumption that there are no response-deers (R 1 R 0 ), Lee (2009) derives bounds for E(Y 1 Y 0 AR). Under the weaker assumption that a subpopulation RC F of RC has the same size and the same average treatment eect as RF, the bounds derived in Lee (2009) are valid bounds for E(Y 1 Y 0 AR RC F ). 4

References Andrews, D. W. K. & Soares, G. (2010), `Inference for parameters dened by moment inequalities using generalized moment selection', Econometrica 78(1), 119157. Angrist, J. D. & Imbens, G. W. (1995), `Two-stage least squares estimation of average causal eects in models with variable treatment intensity', Journal of the American Statistical Association 90(430), pp. 431442. Balke, A. & Pearl, J. (1997), `Bounds on treatment eects from studies with imperfect compliance', Journal of the American Statistical Association 92(439), 11711176. Behaghel, L., Crépon, B., Gurgand, M. & Le Barbanchon, T. (2012), Please call again: Correcting non-response bias in treatment eect models, IZA Discussion Papers 6751, Institute for the Study of Labor (IZA). de Chaisemartin, C. (2016), `Tolerating deance? monotonicity.'. identication of treatment eects without Hahn, J., Todd, P. & Van der Klaauw, W. (2001), `Identication and estimation of treatment eects with a regression-discontinuity design', Econometrica 69(1), 201209. Heckman, J. J. & Vytlacil, E. (2005), `Structural equations, treatment eects, and econometric policy evaluation', Econometrica 73(3), 669738. Horowitz, J. L. & Manski, C. F. (1995), `Identication and robustness with contaminated and corrupted data', Econometrica 63(2), 281302. Huber, M. & Mellace, G. (2011), Testing instrument validity for late identication based on inequality moment constraints, Technical report, University of St. Gallen, School of Economics and Political Science. Huber, M. & Mellace, G. (2012), Relaxing monotonicity in the identication of local average treatment eects. Working Paper. Imbens, G. W. & Rubin, D. B. (1997), `Estimating outcome distributions for compliers in instrumental variables models', Review of Economic Studies 64(4), 555574. Kitagawa, T. (2015), `A test for instrument validity', Econometrica 83(5), 20432063. Lee, D. S. (2009), `Training, wages, and sample selection: Estimating sharp bounds on treatment eects', Review of Economic Studies 76(3), 10711102. 5

Machado, C., Shaikh, A. & Vytlacil, E. (2013), Instrumental variables, and the sign of the average treatment eect, Technical report, Citeseer. Mourie, I. & Wan, Y. (2014), Testing late assumptions, Technical report. 6

A Proofs Proof of Theorem S1 Proving that L and L are valid bounds for E(Y 1 Y 0 C V ) relies on a similar argument as the proof of Proposition 1.a in Lee (2009). Due to a concern for brevity, I refer the reader to this paper for this part of the proof. The testable implication directly follows. QED. 7