A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect Heterogeneity: Local Average Treatment Effects.

Guido Imbens. IRP Lectures, UW Madison, August 2008.

Outline

1. Introduction
2. Basics
3. Local Average Treatment Effects
4. Extrapolation to the Population
5. Covariates
6. Multivalued Instruments
7. Multivalued Endogenous Regressors

1. Introduction

1. Instrumental variables estimate average treatment effects, with the average depending on the instruments.
2. Population averages are only estimable under unrealistically strong assumptions ("identification at infinity", or constant effects).
3. Compliers (for whom we can identify effects) are not necessarily the subpopulation of greatest ex ante interest; effects for other subpopulations require extrapolation.
4. The set up here allows the researcher to sharply separate the extrapolation to the (sub-)population of interest from exploration of the information in the data.

2. Basics

Linear IV with Constant Coefficients. Standard set up:

\[ Y_i = \beta_0 + \beta_1 W_i + \varepsilon_i. \]

There is concern that the regressor $W_i$ is endogenous, that is, correlated with $\varepsilon_i$. Suppose that we have an instrument $Z_i$ that is both uncorrelated with $\varepsilon_i$ and correlated with $W_i$. With a single instrument and a single endogenous regressor, the estimator of $\beta_1$ is the ratio of covariances

\[ \hat\beta^{IV} = \frac{\frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar Y)(Z_i - \bar Z)}{\frac{1}{N}\sum_{i=1}^{N} (W_i - \bar W)(Z_i - \bar Z)}. \]

Using a central limit theorem for all the moments and the delta method we can infer the large sample distribution without additional assumptions.
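The ratio-of-covariances formula can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `iv_ratio` and the toy data are illustrative and not part of the lecture.

```python
import numpy as np

def iv_ratio(y, w, z):
    """Single-instrument IV estimate: sample Cov(Y, Z) / sample Cov(W, Z)."""
    y, w, z = (np.asarray(a, dtype=float) for a in (y, w, z))
    cov_yz = np.mean((y - y.mean()) * (z - z.mean()))
    cov_wz = np.mean((w - w.mean()) * (z - z.mean()))
    if cov_wz == 0:
        raise ValueError("instrument is uncorrelated with the regressor in this sample")
    return cov_yz / cov_wz

# Toy data (hypothetical): Y = 2 + 3 W holds exactly, so the estimate
# should recover the slope 3 whenever Cov(W, Z) is nonzero.
z = np.array([0, 0, 1, 1, 1, 0])
w = np.array([0, 1, 1, 1, 0, 0])
y = 2.0 + 3.0 * w
beta_iv = iv_ratio(y, w, z)
```

With a binary instrument this ratio coincides with the difference in mean outcomes by instrument value divided by the difference in mean treatment by instrument value.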

Potential Outcome Set Up

Let $Y_i(0)$ and $Y_i(1)$ be two potential outcomes for unit $i$, one for each value of the endogenous regressor or treatment. Let $W_i$ be the realized value of the endogenous regressor, equal to zero or one. We observe $W_i$ and

\[ Y_i = Y_i(W_i) = \begin{cases} Y_i(1) & \text{if } W_i = 1, \\ Y_i(0) & \text{if } W_i = 0. \end{cases} \]

Define two potential outcomes $W_i(0)$ and $W_i(1)$, representing the value of the endogenous regressor given the two values for the instrument $Z_i$. The actual or realized value of the endogenous variable is

\[ W_i = W_i(Z_i) = \begin{cases} W_i(1) & \text{if } Z_i = 1, \\ W_i(0) & \text{if } Z_i = 0. \end{cases} \]

So we observe the triple $Z_i$, $W_i = W_i(Z_i)$ and $Y_i = Y_i(W_i(Z_i))$.

3. Local Average Treatment Effects

The key instrumental variables assumption is

Assumption 1 (Independence)

\[ Z_i \perp (Y_i(0), Y_i(1), W_i(0), W_i(1)). \]

It requires that the instrument is as good as randomly assigned, and that it does not directly affect the outcome. The assumption is formulated in a nonparametric way, without definitions of residuals that are tied to functional forms.

Assumptions (ctd)

Alternatively, we separate the assumption by postulating the existence of four potential outcomes, $Y_i(z, w)$, corresponding to the outcome that would be observed if the instrument was $Z_i = z$ and the treatment was $W_i = w$.

Assumption 2 (Random Assignment)

\[ Z_i \perp (Y_i(0,0), Y_i(0,1), Y_i(1,0), Y_i(1,1), W_i(0), W_i(1)), \]

and

Assumption 3 (Exclusion Restriction)

\[ Y_i(z, w) = Y_i(z', w) \quad \text{for all } z, z', w. \]

The first of these two assumptions is implied by random assignment of $Z_i$, but the second is substantive, and randomization has no bearing on it.

Compliance Types

It is useful for our approach to think about the compliance behavior of the different units:

                    W_i(0) = 0      W_i(0) = 1
    W_i(1) = 0      never-taker     defier
    W_i(1) = 1      complier        always-taker
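The mapping from potential treatments and potential outcomes to the observed triple can be written out directly. This is a small sketch in plain Python; the helper names `compliance_type` and `observed`, and the example unit, are hypothetical.

```python
def compliance_type(w0, w1):
    """Classify a unit by its potential treatments W(0) and W(1)."""
    table = {
        (0, 0): "never-taker",
        (0, 1): "complier",
        (1, 0): "defier",
        (1, 1): "always-taker",
    }
    return table[(w0, w1)]

def observed(z, w0, w1, y0, y1):
    """Return the observed triple (Z, W, Y) implied by the potential values."""
    w = w1 if z == 1 else w0   # W_i = W_i(Z_i)
    y = y1 if w == 1 else y0   # Y_i = Y_i(W_i(Z_i)); Z affects Y only through W
    return z, w, y

# Hypothetical unit: a complier with Y(0) = 5.0 and Y(1) = 4.5.
unit_type = compliance_type(0, 1)
triple_z0 = observed(0, 0, 1, 5.0, 4.5)
triple_z1 = observed(1, 0, 1, 5.0, 4.5)
```

Only one of the two triples is ever seen for a given unit, which is why the compliance type cannot be read off the data unit by unit.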

We cannot directly establish the type of a unit based on what we observe for them since we only see the pair $(Z_i, W_i)$, not the pair $(W_i(0), W_i(1))$. Nevertheless, we can rule out some possibilities:

                 Z_i = 0                   Z_i = 1
    W_i = 0      complier/never-taker      never-taker/defier
    W_i = 1      always-taker/defier       complier/always-taker

Monotonicity

Assumption 4 (Monotonicity/No-Defiers)

\[ W_i(1) \ge W_i(0). \]

This assumption makes sense in a lot of applications. It is implied directly by many (constant coefficient) latent index models of the type

\[ W_i(z) = 1\{\pi_0 + \pi_1 z + \varepsilon_i > 0\}, \]

but it is much weaker than that.

Implications for Compliance Types

                 Z_i = 0                   Z_i = 1
    W_i = 0      complier/never-taker      never-taker
    W_i = 1      always-taker              complier/always-taker

For individuals with $(Z_i = 0, W_i = 1)$ and for $(Z_i = 1, W_i = 0)$ we can now infer the compliance type.

Distribution of Compliance Types

Under random assignment and monotonicity we can estimate the distribution of compliance types:

\[
\begin{aligned}
\pi_a &= \Pr(W_i(0) = W_i(1) = 1) = E[W_i \mid Z_i = 0], \\
\pi_c &= \Pr(W_i(0) = 0, W_i(1) = 1) = E[W_i \mid Z_i = 1] - E[W_i \mid Z_i = 0], \\
\pi_n &= \Pr(W_i(0) = W_i(1) = 0) = 1 - E[W_i \mid Z_i = 1].
\end{aligned}
\]
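The three shares are sample analogues of simple conditional means. A minimal sketch, assuming NumPy and binary arrays `w` and `z`; the function name `type_shares` and the ten-unit sample are made up for illustration.

```python
import numpy as np

def type_shares(w, z):
    """Estimate compliance type shares under random assignment and monotonicity."""
    w, z = np.asarray(w), np.asarray(z)
    p1 = w[z == 1].mean()   # estimate of E[W | Z = 1]
    p0 = w[z == 0].mean()   # estimate of E[W | Z = 0]
    return {
        "always-taker": p0,
        "complier": p1 - p0,
        "never-taker": 1.0 - p1,
    }

# Hypothetical sample of ten units.
z = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
w = np.array([1, 0, 0, 0, 0, 1, 1, 1, 0, 0])
shares = type_shares(w, z)
```

In a finite sample the complier share can come out negative, which would be informal evidence against monotonicity or random assignment.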

Now consider average outcomes by instrument and treatment:

\[
\begin{aligned}
E[Y_i \mid W_i = 0, Z_i = 0] &= \frac{\pi_c}{\pi_c + \pi_n} E[Y_i(0) \mid \text{complier}] + \frac{\pi_n}{\pi_c + \pi_n} E[Y_i(0) \mid \text{never-taker}], \\
E[Y_i \mid W_i = 0, Z_i = 1] &= E[Y_i(0) \mid \text{never-taker}], \\
E[Y_i \mid W_i = 1, Z_i = 0] &= E[Y_i(1) \mid \text{always-taker}], \\
E[Y_i \mid W_i = 1, Z_i = 1] &= \frac{\pi_c}{\pi_c + \pi_a} E[Y_i(1) \mid \text{complier}] + \frac{\pi_a}{\pi_c + \pi_a} E[Y_i(1) \mid \text{always-taker}].
\end{aligned}
\]

From this we can infer the average outcomes for compliers, $E[Y_i(0) \mid \text{complier}]$ and $E[Y_i(1) \mid \text{complier}]$.

Local Average Treatment Effect

Hence the instrumental variables estimand, the ratio of the two reduced form estimands (the effect of $Z_i$ on $Y_i$ over the effect of $Z_i$ on $W_i$), is equal to the local average treatment effect:

\[ \beta^{IV} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[W_i \mid Z_i = 1] - E[W_i \mid Z_i = 0]} = E[Y_i(1) - Y_i(0) \mid \text{complier}]. \]

4. Extrapolating to the Full Population

We can estimate $E[Y_i(0) \mid \text{never-taker}]$ and $E[Y_i(1) \mid \text{always-taker}]$. We can learn from these averages whether there is any evidence of heterogeneity in outcomes by compliance status, by comparing the pair of average outcomes of $Y_i(0)$,

\[ E[Y_i(0) \mid \text{never-taker}] \quad \text{and} \quad E[Y_i(0) \mid \text{complier}], \]

and the pair of average outcomes of $Y_i(1)$,

\[ E[Y_i(1) \mid \text{always-taker}] \quad \text{and} \quad E[Y_i(1) \mid \text{complier}]. \]

If compliers, never-takers and always-takers are found to be substantially different in levels, then it appears much less plausible that the average effect for compliers is indicative of average effects for other compliance types.

5. Covariates

Traditionally the TSLS set up is used with the covariates entering in the outcome equation linearly and additively, as

\[ Y_i = \beta_0 + \beta_1 W_i + \beta_2' X_i + \varepsilon_i, \]

with the covariates added to the set of instruments. Given the potential outcome set up with general heterogeneity in the effects of the treatment, one may also wish to allow for more heterogeneity in the correlations between treatment effects and covariates. Here we describe a general way of doing so. Unlike TSLS type approaches, this involves modelling the dependence of both the outcome and the treatment on the covariates.
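The step from the conditional means to the local average treatment effect can be spelled out. The following is a sketch of the argument, using only independence, the exclusion restriction and monotonicity, and writing $\pi_n$, $\pi_c$, $\pi_a$ for the shares of never-takers, compliers and always-takers. Averaging over the three types within each instrument arm gives

\[
\begin{aligned}
E[Y_i \mid Z_i = 1] &= \pi_c E[Y_i(1) \mid \text{complier}] + \pi_a E[Y_i(1) \mid \text{always-taker}] + \pi_n E[Y_i(0) \mid \text{never-taker}], \\
E[Y_i \mid Z_i = 0] &= \pi_c E[Y_i(0) \mid \text{complier}] + \pi_a E[Y_i(1) \mid \text{always-taker}] + \pi_n E[Y_i(0) \mid \text{never-taker}].
\end{aligned}
\]

The always-taker and never-taker terms are identical in the two arms, so they cancel in the difference:

\[ E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0] = \pi_c \, E[Y_i(1) - Y_i(0) \mid \text{complier}]. \]

Dividing by $E[W_i \mid Z_i = 1] - E[W_i \mid Z_i = 0] = \pi_c$ leaves the average effect for compliers.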

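Heckman Selection Model

A traditional parametric model with a dummy endogenous variable might have the form (translated to the potential outcome set up used here):

\[
\begin{aligned}
W_i(z) &= 1\{\pi_0 + \pi_1 z + \pi_2' X_i + \eta_i \ge 0\}, \\
Y_i(w) &= \beta_0 + \beta_1 w + \beta_2' X_i + \varepsilon_i,
\end{aligned}
\]

with $(\eta_i, \varepsilon_i)$ jointly normally distributed (e.g., Heckman, 1978). Such a model imposes restrictions on the relation between compliance types, covariates and outcomes. For $\pi_1 > 0$, unit $i$ is a

\[
\begin{cases}
\text{never-taker} & \text{if } \eta_i < -\pi_0 - \pi_1 - \pi_2' X_i, \\
\text{complier} & \text{if } -\pi_0 - \pi_1 - \pi_2' X_i \le \eta_i < -\pi_0 - \pi_2' X_i, \\
\text{always-taker} & \text{if } -\pi_0 - \pi_2' X_i \le \eta_i,
\end{cases}
\]

which imposes strong restrictions, e.g., if $E[Y_i(0) \mid n, X_i] < E[Y_i(0) \mid c, X_i]$, then $E[Y_i(1) \mid c, X_i] < E[Y_i(1) \mid a, X_i]$.

Flexible Alternative Model

Specify

\[ f_{Y(w) \mid X, T}(y \mid x, t) = f(y \mid x; \theta_{wt}), \]

for $(w, t) = (0, n), (0, c), (1, c), (1, a)$. A natural model for the distribution of type is a trinomial logit model:

\[
\begin{aligned}
\Pr(T_i = \text{complier} \mid X_i) &= \frac{1}{1 + \exp(\pi_n' X_i) + \exp(\pi_a' X_i)}, \\
\Pr(T_i = \text{never-taker} \mid X_i) &= \frac{\exp(\pi_n' X_i)}{1 + \exp(\pi_n' X_i) + \exp(\pi_a' X_i)}, \\
\Pr(T_i = \text{always-taker} \mid X_i) &= 1 - \Pr(T_i = \text{complier} \mid X_i) - \Pr(T_i = \text{never-taker} \mid X_i).
\end{aligned}
\]

The likelihood function is then, factored in terms of the contribution by observed $(W_i, Z_i)$ values:

\[
\begin{aligned}
L(\pi_n, \pi_a, \theta_{0n}, \theta_{0c}, \theta_{1c}, \theta_{1a})
= {} & \prod_{i: W_i = 0, Z_i = 1} \frac{\exp(\pi_n' X_i)}{1 + \exp(\pi_n' X_i) + \exp(\pi_a' X_i)} \, f(Y_i \mid X_i; \theta_{0n}) \\
& \times \prod_{i: W_i = 0, Z_i = 0} \left( \frac{\exp(\pi_n' X_i)}{1 + \exp(\pi_n' X_i)} \, f(Y_i \mid X_i; \theta_{0n}) + \frac{1}{1 + \exp(\pi_n' X_i)} \, f(Y_i \mid X_i; \theta_{0c}) \right) \\
& \times \prod_{i: W_i = 1, Z_i = 1} \left( \frac{\exp(\pi_a' X_i)}{1 + \exp(\pi_a' X_i)} \, f(Y_i \mid X_i; \theta_{1a}) + \frac{1}{1 + \exp(\pi_a' X_i)} \, f(Y_i \mid X_i; \theta_{1c}) \right) \\
& \times \prod_{i: W_i = 1, Z_i = 0} \frac{\exp(\pi_a' X_i)}{1 + \exp(\pi_n' X_i) + \exp(\pi_a' X_i)} \, f(Y_i \mid X_i; \theta_{1a}).
\end{aligned}
\]

Application: Angrist (1990), effect of military service

The simple OLS regression leads to (standard errors in parentheses):

\[ \log(\text{earnings})_i = \underset{(0.0079)}{5.4364} - \underset{(0.0167)}{0.0205} \cdot \text{veteran}_i. \]

The table below presents the sample sizes of the four treatment/instrument groups. For example, with a low lottery number 5,948 individuals do not, and 1,372 individuals do serve in the military.

                 Z_i = 0     Z_i = 1
    W_i = 0      5,948       1,915
    W_i = 1      1,372       865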
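The trinomial logit type probabilities are straightforward to evaluate for a given covariate vector. A minimal sketch, assuming NumPy; the function name `type_probabilities` and the covariate and coefficient values are hypothetical and chosen only to exercise the formulas.

```python
import numpy as np

def type_probabilities(x, pi_n, pi_a):
    """Trinomial logit probabilities of (complier, never-taker, always-taker)."""
    x, pi_n, pi_a = (np.asarray(a, dtype=float) for a in (x, pi_n, pi_a))
    e_n = np.exp(x @ pi_n)
    e_a = np.exp(x @ pi_a)
    denom = 1.0 + e_n + e_a          # compliers are the base category
    p_c = 1.0 / denom
    p_n = e_n / denom
    p_a = 1.0 - p_c - p_n            # equals e_a / denom
    return p_c, p_n, p_a

# Hypothetical covariate vector: an intercept and one covariate.
x = np.array([1.0, 0.5])
equal = type_probabilities(x, pi_n=[0.0, 0.0], pi_a=[0.0, 0.0])
skewed = type_probabilities(x, pi_n=[1.0, 0.4], pi_a=[-0.5, 0.2])
```

With all coefficients set to zero the three types are equally likely, which is a convenient sanity check before plugging these probabilities into a likelihood.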

Using these data we get the following proportions of the various compliance types, given in the table below, under the no-defiers assumption. For example, the proportion of never-takers is estimated as the conditional probability of $W_i = 0$ given $Z_i = 1$:

\[ \Pr(\text{never-taker}) = \frac{1915}{1915 + 865}. \]

                    W_i(0) = 0               W_i(0) = 1
    W_i(1) = 0      never-taker (0.6888)     defier (0)
    W_i(1) = 1      complier (0.1237)        always-taker (0.1875)

Estimated Average Outcomes by Treatment and Instrument

                 Z_i = 0            Z_i = 1
    W_i = 0      E[Y] = 5.4472      E[Y] = 5.4028
    W_i = 1      E[Y] = 5.4076      E[Y] = 5.4289

There is not much variation by treatment status given the instrument, but these comparisons are not causal under the IV assumptions.

Estimated average potential outcomes by compliance type:

    never-taker      E[Y_i(0)] = 5.4028
    complier         E[Y_i(0)] = 5.6948,  E[Y_i(1)] = 5.4612
    always-taker     E[Y_i(1)] = 5.4076
    defier           NA

The local average treatment effect is -0.2336, a 23% drop in earnings as a result of serving in the military. Simply doing IV or TSLS would give you the same numerical results (standard errors in parentheses):

\[ \log(\text{earnings})_i = \underset{(0.0289)}{5.4836} - \underset{(0.1266)}{0.2336} \cdot \text{veteran}_i. \]

It is interesting in this application to inspect the average outcome for different compliance groups. Average log earnings for never-takers are 5.40, lower by 29% than average earnings for compliers who do not serve in the military. This suggests that never-takers are substantially different from compliers, and that the average effect of 23% for compliers need not be informative about never-takers. Note that

\[ E[Y_i(0) \mid n, X_i] < E[Y_i(0) \mid c, X_i], \quad \text{but also} \quad E[Y_i(1) \mid c, X_i] > E[Y_i(1) \mid a, X_i]. \]

Compliers earn more than never-takers when not serving, and more than always-takers when serving. This does not fit the standard Gaussian selection model.
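The numbers in this application can be reproduced from the cell counts and cell means alone. The sketch below uses plain Python and takes the four counts and four mean log earnings as read from the tables of the application; because those inputs are rounded, the results agree with the reported values only to about three decimals. Variable names are illustrative.

```python
# Cell counts and mean log earnings, keyed by (W, Z).
counts = {(0, 0): 5948, (0, 1): 1915, (1, 0): 1372, (1, 1): 865}
means = {(0, 0): 5.4472, (0, 1): 5.4028, (1, 0): 5.4076, (1, 1): 5.4289}

# Compliance type shares under random assignment and no defiers.
pi_a = counts[(1, 0)] / (counts[(0, 0)] + counts[(1, 0)])   # Pr(W = 1 | Z = 0)
pi_n = counts[(0, 1)] / (counts[(0, 1)] + counts[(1, 1)])   # Pr(W = 0 | Z = 1)
pi_c = 1.0 - pi_a - pi_n

# Solve the two mixture equations for the complier means.
y0_c = ((pi_c + pi_n) * means[(0, 0)] - pi_n * means[(0, 1)]) / pi_c
y1_c = ((pi_c + pi_a) * means[(1, 1)] - pi_a * means[(1, 0)]) / pi_c
late = y1_c - y0_c

# The same number as a Wald ratio: reduced form over first stage.
n_z0 = counts[(0, 0)] + counts[(1, 0)]
n_z1 = counts[(0, 1)] + counts[(1, 1)]
ey_z0 = (counts[(0, 0)] * means[(0, 0)] + counts[(1, 0)] * means[(1, 0)]) / n_z0
ey_z1 = (counts[(0, 1)] * means[(0, 1)] + counts[(1, 1)] * means[(1, 1)]) / n_z1
wald = (ey_z1 - ey_z0) / pi_c
```

The two routes, differencing the complier means and taking the Wald ratio, are algebraically identical, so `late` and `wald` agree up to floating point error.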

6. Multivalued Instruments

For any two values of the instrument $z_0$ and $z_1$ satisfying the local average treatment effect assumptions we can define the corresponding local average treatment effect:

\[ \tau_{z_1, z_0} = E[Y_i(1) - Y_i(0) \mid W_i(z_1) = 1, W_i(z_0) = 0]. \]

Note that these local average treatment effects need not be the same for different pairs of instrument values $(z_0, z_1)$.

Comparisons of estimates based on different instruments underlie conventional tests of overidentifying restrictions in TSLS settings. An alternative interpretation of rejections in such testing procedures is therefore treatment effect heterogeneity.

Interpretation of IV Estimand

Suppose that monotonicity holds for all $(z, z')$, and suppose that the instrument values $z_0, \ldots, z_K$ are ordered in such a way that $p(z_{k-1}) \le p(z_k)$, where $p(z) = E[W_i \mid Z_i = z]$. Also suppose that the instrument is relevant, $E[g(Z_i) W_i] \ne 0$. Then the instrumental variables estimator based on using $g(Z)$ as an instrument for $W$ estimates a weighted average of local average treatment effects:

\[ \tau_{g(\cdot)} = \frac{\mathrm{Cov}(Y_i, g(Z_i))}{\mathrm{Cov}(W_i, g(Z_i))} = \sum_{k=1}^{K} \lambda_k \, \tau_{z_k, z_{k-1}}, \]

\[ \lambda_k = \frac{(p(z_k) - p(z_{k-1})) \sum_{l=k}^{K} \pi_l \, (g(z_l) - E[g(Z_i)])}{\sum_{k=1}^{K} (p(z_k) - p(z_{k-1})) \sum_{l=k}^{K} \pi_l \, (g(z_l) - E[g(Z_i)])}, \qquad \pi_k = \Pr(Z_i = z_k). \]

These weights are nonnegative and sum up to one.

Marginal Treatment Effect

If the instrument is continuous, and $p(z)$ is continuous in $z$, we can define the limit of the local average treatment effects

\[ \tau_z = \lim_{z' \downarrow z, \; z'' \uparrow z} \tau_{z', z''}. \]

Suppose we have a latent index model for the receipt of treatment:

\[ W_i(z) = 1\{h(z) + \eta_i \ge 0\}, \]

with the scalar unobserved component $\eta_i$ independent of the instrument $Z_i$. Then we can define the marginal treatment effect $\tau(\eta)$ (Heckman and Vytlacil, 2005) as

\[ \tau(\eta) = E[Y_i(1) - Y_i(0) \mid \eta_i = \eta]. \]

This marginal treatment effect relates directly to the limit of the local average treatment effects:

\[ \tau(\eta) = \tau_z, \quad \text{with } \eta = -h(z). \]

Note that we can only define this for values of $\eta$ for which there is a $z$ such that $\eta = -h(z)$. Normalizing the marginal distribution of $\eta$ to be uniform on $[0, 1]$, this restricts $\eta$ to be in the interval $[\inf_z p(z), \sup_z p(z)]$, where $p(z) = \Pr(W_i = 1 \mid Z_i = z)$.

Now we can characterize various average treatment effects in terms of this limit. E.g.:

\[ \tau = \int_{\eta} \tau(\eta) \, dF_{\eta}(\eta). \]
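The weights $\lambda_k$ on the pairwise local average treatment effects can be computed directly from the instrument distribution, the propensities $p(z_k)$ and the chosen function $g$. A minimal sketch, assuming NumPy and a discrete instrument with values indexed $k = 0, \ldots, K$ in increasing order of $p(z_k)$; the function name `late_weights` and the three-point example are hypothetical.

```python
import numpy as np

def late_weights(p, g, pi):
    """Weights lambda_k, k = 1..K, for the IV estimand that uses g(Z) as instrument.

    p  : propensities p(z_0), ..., p(z_K), in nondecreasing order
    g  : instrument values g(z_0), ..., g(z_K)
    pi : probabilities Pr(Z = z_0), ..., Pr(Z = z_K)
    """
    p, g, pi = (np.asarray(a, dtype=float) for a in (p, g, pi))
    centered = pi * (g - np.sum(pi * g))       # pi_l * (g(z_l) - E[g(Z)])
    tail = np.cumsum(centered[::-1])[::-1]     # tail[k] = sum over l >= k
    raw = np.diff(p) * tail[1:]                # numerators for k = 1, ..., K
    return raw / raw.sum()

# Hypothetical three-valued instrument with g(z) = z.
p = [0.1, 0.3, 0.6]
g = [0.0, 1.0, 2.0]
pi = [0.5, 0.3, 0.2]
lam = late_weights(p, g, pi)

# The normalizing constant equals Cov(W, g(Z)) = Cov(p(Z), g(Z)).
cov_wg = float(np.sum(np.array(pi) * np.array(p) * (np.array(g) - np.dot(pi, g))))
```

In this example the weights are nonnegative because $g$ is increasing in the same ordering as $p(z)$; the sketch does not check that condition for other choices of $g$.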

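7. Multivalued Endogenous Variables

Define

\[ \tau = \frac{\mathrm{Cov}(Y_i, Z_i)}{\mathrm{Cov}(W_i, Z_i)} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[W_i \mid Z_i = 1] - E[W_i \mid Z_i = 0]}. \]

Exclusion restriction and monotonicity:

\[ (Y_i(w), W_i(z)) \perp Z_i, \qquad W_i(1) \ge W_i(0). \]

Then

\[ \tau = \sum_{j=1}^{J} \lambda_j \, E[Y_i(j) - Y_i(j-1) \mid W_i(1) \ge j > W_i(0)], \qquad \lambda_j = \frac{\Pr(W_i(1) \ge j > W_i(0))}{\sum_{k=1}^{J} \Pr(W_i(1) \ge k > W_i(0))}, \]

with the weights $\lambda_j$ estimable.

Illustration: Angrist-Krueger (1991), Returns to Education

The first stage and reduced form regressions on quarter of birth are (standard errors in parentheses):

\[ \widehat{\text{educ}}_i = \underset{(0.006)}{12.797} - \underset{(0.013)}{0.109} \cdot \text{qob}_i, \]

\[ \widehat{\log(\text{earnings})}_i = \underset{(0.001)}{5.903} - \underset{(0.003)}{0.011} \cdot \text{qob}_i. \]

The instrumental variables estimate is $\hat\beta^{IV} = 0.1020$, the ratio of the reduced form coefficient on qob ($-0.011$) to the first stage coefficient ($-0.109$), up to rounding of the displayed coefficients.

The weights $\gamma_j = \Pr(W_i(1) \ge j > W_i(0))$ can be estimated as

\[ \hat\gamma_j = \frac{1}{N_1} \sum_{i: Z_i = 1} 1\{W_i \ge j\} - \frac{1}{N_0} \sum_{i: Z_i = 0} 1\{W_i \ge j\}. \]

Figure 1: Histogram estimate of the density of years of education.
Figure 2: Normalized weight function for the instrumental variables estimand.
Figure 3: Unnormalized weight function for the instrumental variables estimand.
Figure 3: Education distribution function by quarter.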
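The weight estimator is a difference of empirical survivor functions of the treatment between the two instrument arms. A minimal sketch, assuming NumPy, a binary instrument and an integer-valued treatment; the function name `gamma_hat` and the four-unit sample are hypothetical and are not the Angrist-Krueger data.

```python
import numpy as np

def gamma_hat(w, z, levels):
    """Estimate gamma_j = Pr(W(1) >= j > W(0)) for each j in `levels`."""
    w, z = np.asarray(w), np.asarray(z)
    return np.array([
        (w[z == 1] >= j).mean() - (w[z == 0] >= j).mean()
        for j in levels
    ])

# Hypothetical sample: treatment takes values 0, 1, 2.
w = np.array([0, 1, 2, 2])
z = np.array([0, 0, 1, 1])
gam = gamma_hat(w, z, levels=[1, 2])
lam = gam / gam.sum()                       # normalized weights lambda_j

# Summing the unnormalized weights over j recovers the first stage,
# because W equals the sum over j >= 1 of the indicators 1{W >= j}.
first_stage = w[z == 1].mean() - w[z == 0].mean()
```

When the instrument lowers the treatment, as the sign of the first stage coefficient in the illustration suggests, the same code applies after recoding the instrument so that monotonicity holds in the stated direction.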