Recitation Notes 6. Konrad Menzel. October 22, 2006


1 Random Coefficient Models

1.1 Motivation

In the empirical literature on education and earnings, the main object of interest is the human capital earnings function (HCEF)

Y_i = \alpha + \beta S_i + \gamma_1 X_{1i} + \gamma_2 X_{2i} + \varepsilon_i

Typically, IV estimates of the return to education are higher than the OLS estimates, even though we'd have thought that OLS already suffered from an upward ability bias (for a summary of results from prominent studies, see the tables from Card's handbook chapter reproduced in Figure 1 below). There are several explanations for this empirical regularity:

1. OLS suffers from attenuation bias due to measurement error in schooling; IV does not.
2. Publication bias: a study by Ashenfelter and Harmon finds a positive correlation between the point estimate and its sampling error across studies.[1]
3. Card's favorite story: the instruments used in the literature affect populations with relatively low levels of, and therefore high returns to, education.

In this recitation, I am planning to elaborate a little on the last point.

1.2 Heterogeneous Treatment Effects

Assume you have data on a training program for unemployed workers, and you want to tell a policy maker whether the program was successful, so that the government should continue to finance it. Say your main outcome of interest is earnings six months after the training program, Y_i, and you know whether a particular person participated (D_i = 1) or not (D_i = 0). So far we have only looked at regressions of the type

Y_i = \alpha + D_i \beta + \varepsilon_i

This implies that, across the whole population, everyone who participates earns exactly \beta dollars more than if he hadn't received training. This doesn't make much sense in most real-world applications.

[1] The story behind the publication bias typically goes as follows: it's hard to publish a study with a small t-statistic for the return to education. A study with a very imprecise estimator can only arrive at a significant t-statistic if the value of the coefficient is higher (in absolute value, but the return to education, at least in publishable studies, is usually positive). Therefore, if studies are selected on having a large t-statistic, the imprecise estimates we see published must on average be higher than the more precise ones (and the IV standard error is typically far larger than that of OLS).
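As a numerical aside on the motivating puzzle: the following sketch is not part of the original notes, and the data-generating process and all parameter values are invented purely for illustration. It simulates the homogeneous-effects HCEF with an omitted ability term; OLS is biased upward while a valid binary instrument recovers the true return, which is why the pattern of IV estimates exceeding OLS ones calls for one of the explanations above.

```python
# Sketch (not from the original notes): homogeneous return to schooling with
# an omitted "ability" term. OLS on schooling is biased upward, a valid
# binary instrument recovers beta, so IV > OLS needs a different explanation.
# All parameter values below are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 0.08                              # true (homogeneous) return to schooling

ability = rng.normal(size=n)             # omitted variable, raises both S and Y
z = rng.binomial(1, 0.5, size=n)         # hypothetical randomized instrument
s = 12 + 2.0 * ability + 1.0 * z + rng.normal(size=n)        # years of schooling
y = 1.0 + beta * s + 0.5 * ability + rng.normal(0, 0.3, n)   # log earnings

beta_ols = np.cov(s, y, bias=True)[0, 1] / s.var()            # OLS slope
beta_iv = (y[z == 1].mean() - y[z == 0].mean()) / (s[z == 1].mean() - s[z == 0].mean())

print(f"true beta = {beta:.3f}, OLS = {beta_ols:.3f}, IV (Wald) = {beta_iv:.3f}")
```

With these made-up numbers the OLS slope comes out well above the true 0.08, while the Wald ratio stays close to it.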

Figure 1: Tables of estimates of the return to schooling from prominent studies. Source: Card, D. (1999): "The Causal Effect of Education on Earnings", Handbook of Labor Economics, ch. 30, pp. 1835-36.

For example, we'd think that, say, a basic literacy program wouldn't have much of an effect on more educated individuals, or that the causal effect of the number of children in a household on the mother's labor supply differs a lot with the mother's age, education and other characteristics. So we'd rather write the model as

Y_i = \alpha_i + D_i \beta_i

which allows individuals not only to have different initial earnings levels, but also lets each person benefit from the program to a different degree. Note that there is no \varepsilon_i in this equation. The reason is that the causal model for the potential outcomes should be understood as deterministic, i.e. \alpha_i and \beta_i are fixed for the individual. From this point of view, the disturbance \varepsilon in the regression equation only captures our imperfect knowledge about the parameters in the potential outcomes for a particular individual i.

So if people participated in the program regardless of their initial earnings level (this may stand for "ability"), \alpha_i \perp D_i, and we could run OLS, remembering that for dummy regressions[2]

\operatorname{plim}_{N \to \infty} \hat{\beta}_{OLS} = E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = E[\alpha_i + \beta_i \mid D_i = 1] - E[\alpha_i \mid D_i = 0] = E[\beta_i \mid D_i = 1]

Therefore, OLS estimates the treatment effect on the treated individuals for our training program. If treatment was truly the outcome of a randomization, i.e. (\alpha_i, \beta_i) \perp D_i, we have in addition

\operatorname{plim}_{N \to \infty} \hat{\beta}_{OLS} = E[\beta_i \mid D_i = 1] = E[\beta_i] =: \beta_{ATE}

which is called the average treatment effect, i.e. the effect for all individuals regardless of their actual treatment status. We might be tempted to think that it is a disadvantage of OLS that it doesn't pick up the treatment effect for the whole population, but for practical purposes this typically isn't so. Often the individuals who aren't reached by our program aren't too relevant for our evaluation question either: e.g. we wouldn't observe unemployed economics PhDs participating in a basic literacy program, but we wouldn't be too interested in the effect of the literacy program on them anyway, because we'd never choose to send them to that program on a priori grounds. The treatment effect on the treated answers the question of how much better off we are by running the program (ignoring the cost of running it) compared to a world in which we shut it down entirely and for everyone.

In many situations there is some degree of self-selection into, or imperfect compliance with, a particular treatment, but sometimes we have a good instrument Z for participation, e.g. a randomized encouragement to participate, some exogenous eligibility rule, or some factor that exogenously shifts individuals' cost of taking up the program. This assignment Z makes only some people switch from control to treatment (compliers), or from treatment to control (defiers). But there are also individuals who participate no matter what (always-takers) or don't in any case (never-takers). In order to formalize this, we denote the treatment a person would receive if Z_i = 1 by D_{1i}, and for Z_i = 0 we would observe D_{0i}. Now let's also assume

1. Independence: (D_{1i}, D_{0i}, \beta_i, \varepsilon_i) \perp Z_i
2. Monotonicity: D_{1i} \geq D_{0i} for all individuals

[2] This is the far too complicated derivation I gave in recitation, just for completeness:

\hat{\beta}_{OLS} \overset{p}{\to} \frac{E[D_i Y_i] - E[Y_i] E[D_i]}{E[D_i^2] - E[D_i]^2} = \frac{E[D_i \beta_i] - E[D_i \beta_i] E[D_i]}{E[D_i] - E[D_i]^2} = \frac{E[\beta_i \mid D_i = 1] E[D_i] (1 - E[D_i])}{E[D_i] (1 - E[D_i])} = E[\beta_i \mid D_i = 1]

using that D_i is binary and the law of iterated expectations.
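Before turning to what these assumptions buy us, here is a quick numerical check of the dummy-regression result above. This sketch is not part of the original notes; the selection rule and all parameter values are invented for illustration. Participation depends on the individual gain \beta_i but not on \alpha_i, so the simple difference in means recovers the effect on the treated rather than the population average effect.

```python
# Sketch (not from the original notes): Y_i = alpha_i + D_i * beta_i with
# alpha_i independent of D_i but participation more likely for high beta_i.
# The difference in means recovers E[beta_i | D_i = 1] (effect on the
# treated), not the population mean E[beta_i]. Numbers are made up.
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

alpha = rng.normal(20.0, 5.0, size=n)          # baseline earnings, independent of D
beta = rng.normal(2.0, 1.0, size=n)            # individual gains from training
p_take = 1.0 / (1.0 + np.exp(-(beta - 2.0)))   # higher gains -> more likely to participate
d = rng.binomial(1, p_take)
y = alpha + d * beta

diff_means = y[d == 1].mean() - y[d == 0].mean()
print(f"difference in means:    {diff_means:.3f}")
print(f"E[beta | D = 1] (TT):   {beta[d == 1].mean():.3f}")
print(f"E[beta]        (ATE):   {beta.mean():.3f}")
```

With this made-up selection rule, the difference in means and E[\beta_i \mid D_i = 1] agree, and both exceed E[\beta_i].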

Under these assumptions,

\hat{\beta}_{Wald} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = \frac{E[D_i \beta_i \mid Z_i = 1] - E[D_i \beta_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = \frac{E[D_{1i} \beta_i \mid Z_i = 1] - E[D_{0i} \beta_i \mid Z_i = 0]}{E[D_{1i} \mid Z_i = 1] - E[D_{0i} \mid Z_i = 0]}

\overset{\text{indep.}}{=} \frac{E[(D_{1i} - D_{0i}) \beta_i]}{E[D_{1i} - D_{0i}]} \overset{\text{LIE}}{=} \frac{E[\beta_i \mid D_{1i} = 1, D_{0i} = 0] P(D_{1i} = 1, D_{0i} = 0) - E[\beta_i \mid D_{1i} = 0, D_{0i} = 1] P(D_{1i} = 0, D_{0i} = 1)}{P(D_{1i} = 1, D_{0i} = 0) - P(D_{1i} = 0, D_{0i} = 1)}

\overset{\text{monot.}}{=} \frac{E[\beta_i \mid D_{1i} > D_{0i}] P(D_{1i} > D_{0i})}{P(D_{1i} > D_{0i})} = E[\beta_i \mid D_{1i} > D_{0i}] =: \beta_{LATE}

That is, the Wald estimator (and thereby all instrumental variables estimators) estimates the average effect of the treatment on the subpopulation of compliers with a particular instrument. The important point to take away is that each instrument has a different set of compliers. E.g., in the Angrist and Evans paper on the twin-birth and same-sex instruments, we'd expect the LATE for the effect of a third child on mothers of twins to be different from the LATE on mothers whose first two children were of the same sex: the group of mothers who have a third child in order to balance the sex composition of their offspring is different from the mothers who have twins at their second birth (and therefore automatically have a third child). This is again not a weakness of the estimator, but we just have to be aware that each instrument defines its own Wald estimand. If Z is our policy intervention (e.g. offering a training program for which participation is voluntary), the instrumental variables estimator directly answers the question about the effect on those individuals who were affected by that policy, excluding people who would have received treatment in any case. And that's the actual question we'd typically want to ask if we evaluate the policy corresponding to Z: we want to know by how much, e.g., offering that particular training program makes everyone better off, given that, on the one hand, in a world without that intervention there could still be close substitutes available, and, on the other hand, many people wouldn't want to take part in the program either way.

1.3 Example: General Exam 2003, Question I

This part is based on Question I from the 2003 Labor Generals. We are given a human capital earnings function

y_i = \alpha_i + b_i S_i - \frac{k_1}{2} S_i^2

and individuals face a convex cost of schooling

c_i = \kappa_i + r_i S_i + \frac{k_2}{2} S_i^2

where r_i is often interpreted as the individual's discount rate, or the opportunity cost of capital (for empirical purposes, earnings y_i are actually understood to be in logs, but for the theoretical derivations it should be just earnings). In order to determine the optimal level of schooling, the individual equates the marginal return to schooling to the marginal cost,

b_i - k_1 S_i = r_i + k_2 S_i \quad \Longrightarrow \quad S_i = \frac{b_i - r_i}{k_1 + k_2}

If k_2 = 0, the marginal return to schooling at the level of education chosen by the individual is

\frac{\partial y_i}{\partial S_i} = b_i - k_1 S_i = r_i

which is also the equilibrium condition for the classic model we saw in class. But if the marginal cost of education is strictly convex (k_2 > 0), we get instead

\frac{\partial y_i}{\partial S_i} = b_i - k_1 S_i = b_i - \frac{k_1}{k_1 + k_2}(b_i - r_i) = \frac{k_2}{k_1 + k_2} b_i + \frac{k_1}{k_1 + k_2} r_i

which is a convex combination of the discount rate r_i and b_i.

Now we want to estimate the return to schooling using a binary instrumental variable Z to address omitted-variable bias and potential measurement-error issues. More specifically, we assume that the instrumental variable shifts the capital cost (or discount rate) for each individual, i.e. the individual faces r_i = r_{0i} if the instrument Z_i = 0, and r_i = r_{1i} if it takes the value 1. If the instrument Z is independent of (b_i, r_{0i}, r_{1i}), we can rewrite

\operatorname{plim}_{N \to \infty} \hat{\beta}_{IV} = \frac{E[y_i \mid Z_i = 1] - E[y_i \mid Z_i = 0]}{E[S_i \mid Z_i = 1] - E[S_i \mid Z_i = 0]} = \frac{E[b_i S_i - \frac{k_1}{2} S_i^2 \mid Z_i = 1] - E[b_i S_i - \frac{k_1}{2} S_i^2 \mid Z_i = 0]}{E[S_i \mid Z_i = 1] - E[S_i \mid Z_i = 0]} \overset{\text{indep.}}{=} \frac{E[\Delta S_i \, b_i] - \frac{k_1}{2} E[\Delta(S_i^2)]}{E[\Delta S_i]}

because of the independence of (r_{0i}, r_{1i}, b_i) and Z_i. From the last part, we know that

\Delta S_i = S_{1i} - S_{0i} = \frac{b_i - r_{1i}}{k_1 + k_2} - \frac{b_i - r_{0i}}{k_1 + k_2} = \frac{r_{0i} - r_{1i}}{k_1 + k_2}

and

\Delta(S_i^2) = S_{1i}^2 - S_{0i}^2 \overset{(*)}{=} (S_{1i} + S_{0i})(S_{1i} - S_{0i}) = (S_{1i} + S_{0i}) \frac{r_{0i} - r_{1i}}{k_1 + k_2}

where the trick in (*) is to use the third binomial formula, (a + b)(a - b) = a^2 - b^2 (though brute force would have given the same result). Plugging this back into the plim for the Wald estimator, and noticing that the (k_1 + k_2) terms cancel out,

\operatorname{plim}_{N \to \infty} \hat{\beta}_{IV} = \frac{E[(r_{0i} - r_{1i}) b_i] - \frac{k_1}{2} E[(r_{0i} - r_{1i})(S_{1i} + S_{0i})]}{E[r_{0i} - r_{1i}]}

From this we can see that the Wald estimator weights individual returns by r_{0i} - r_{1i}, so that individuals who experience a large shift in the discount rate as a result of the intervention are over-represented relative to the other persons. Card's argument about discount-rate bias (he doesn't call it that) is that the instrumental variables typically used in the literature induce increases in schooling among individuals who would otherwise have obtained relatively low levels of schooling, and who therefore also have relatively high marginal returns. This means that if we are interested in a population average of the parameters in the HCEF, our IV estimates of the return to schooling are biased upwards. On the other hand, one could argue that the effect on the subpopulation affected by the instrument is a more interesting policy parameter, as long as that subpopulation is similar to the population which would be affected by a particular policy measure. E.g., as its title already suggests, the Angrist and Krueger paper, which used quarter-of-birth dummies as instruments, adequately estimates the effect of mandatory schooling on kids for whom the legal constraints are actually binding, i.e. who are at the margin of dropping out in the absence of mandatory schooling.
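A small simulation can make this weighting result concrete. The sketch below is not part of the original notes; the distributions, the values of k_1 and k_2, and the rule r_{1i} = 0.5 r_{0i} are all invented for illustration. Individuals choose schooling from the first-order condition above, the instrument lowers the discount rate most for high-r_{0i} (low-schooling) people, and the Wald ratio is compared with the unweighted and the shift-weighted average marginal return.

```python
# Sketch (not from the original notes): Card-style model above, with
# schooling chosen from the FOC S_i = (b_i - r_i)/(k1 + k2) and earnings
# y_i = a_i + b_i*S_i - (k1/2)*S_i^2. The instrument lowers the discount
# rate (here r1 = 0.5*r0, an invented rule), so high-r0 / low-schooling
# people get the largest shift. The Wald ratio matches the average marginal
# return weighted by the shift r0 - r1, not the unweighted average.
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
k1, k2 = 0.04, 0.01

b = rng.normal(0.6, 0.05, size=n)          # individual return parameter
r0 = rng.uniform(0.05, 0.5, size=n)        # discount rate without the instrument
r1 = 0.5 * r0                              # discount rate with the instrument
z = rng.binomial(1, 0.5, size=n)           # randomized instrument

r = np.where(z == 1, r1, r0)
s = (b - r) / (k1 + k2)                    # optimal schooling from the FOC
a = rng.normal(1.0, 0.2, size=n)           # individual intercepts, independent of Z
y = a + b * s - 0.5 * k1 * s**2

wald = (y[z == 1].mean() - y[z == 0].mean()) / (s[z == 1].mean() - s[z == 0].mean())

# marginal return dy/dS = b - k1*S, evaluated midway between S_0i and S_1i
s_mid = 0.5 * ((b - r0) / (k1 + k2) + (b - r1) / (k1 + k2))
mr = b - k1 * s_mid
print(f"Wald / IV estimate:              {wald:.4f}")
print(f"unweighted mean marginal return: {mr.mean():.4f}")
print(f"weighted by r0 - r1:             {np.average(mr, weights=r0 - r1):.4f}")
```

With these numbers the Wald ratio lines up with the shift-weighted average and both sit well above the unweighted mean marginal return, which is exactly the sense in which individuals with large discount-rate shifts are over-represented in the IV estimand.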