AGEC 661 Note Fourteen

Similar documents
The Problem of Causality in the Analysis of Educational Choices and Labor Market Outcomes Slides for Lectures

150C Causal Inference

Methods to Estimate Causal Effects Theory and Applications. Prof. Dr. Sascha O. Becker U Stirling, Ifo, CESifo and IZA

Imbens, Lecture Notes 2, Local Average Treatment Effects, IEN, Miami, Oct 10 1

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects.

Truncation and Censoring

The problem of causality in microeconometrics.

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions

Instrumental Variables in Action: Sometimes You get What You Need

Lecture 11/12. Roy Model, MTE, Structural Estimation

Principles Underlying Evaluation Estimators

Treatment Effects with Normal Disturbances in sampleselection Package

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL

Applied Health Economics (for B.Sc.)

Tobit and Selection Models

Instrumental Variables

Quantitative Economics for the Evaluation of the European Policy

Labor Supply and the Two-Step Estimator

1. Basic Model of Labor Supply

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Flexible Estimation of Treatment Effect Parameters

Link to Paper. The latest iteration can be found at:

Instrumental Variables in Action

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

Lecture 11 Roy model, MTE, PRTE

Potential Outcomes Model (POM)

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

The problem of causality in microeconometrics.

Four Parameters of Interest in the Evaluation. of Social Programs. James J. Heckman Justin L. Tobias Edward Vytlacil

On Variance Estimation for 2SLS When Instruments Identify Different LATEs

14.74 Lecture 10: The returns to human capital: education

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Recitation Notes 5. Konrad Menzel. October 13, 2006

Instrumental Variables

Recitation Notes 6. Konrad Menzel. October 22, 2006

Matching. James J. Heckman Econ 312. This draft, May 15, Intro Match Further MTE Impl Comp Gen. Roy Req Info Info Add Proxies Disc Modal Summ

WORKSHOP ON PRINCIPAL STRATIFICATION STANFORD UNIVERSITY, Luke W. Miratrix (Harvard University) Lindsay C. Page (University of Pittsburgh)

Causality and Experiments

Gibbs Sampling in Endogenous Variables Models

Program Evaluation in the Presence of Strategic Interactions

What s New in Econometrics. Lecture 1

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

IsoLATEing: Identifying Heterogeneous Effects of Multiple Treatments

Part VII. Accounting for the Endogeneity of Schooling. Endogeneity of schooling Mean growth rate of earnings Mean growth rate Selection bias Summary

The relationship between treatment parameters within a latent variable framework

II. MATCHMAKER, MATCHMAKER

Short Questions (Do two out of three) 15 points each

Applied Microeconometrics. Maximilian Kasy

Impact Evaluation Technical Workshop:

Instrumental Variables

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

An Economic Analysis of Exclusion Restrictions for Instrumental Variable Estimation

Empirical approaches in public economics

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives

Sample Selection. In a UW Econ Ph.D. Model. April 25, Kevin Kasberg and Alex Stevens ECON 5360 Dr. Aadland

Statistical Models for Causal Analysis

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

The Generalized Roy Model and Treatment Effects

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

An example to start off with

Predicting the Treatment Status

Potential Outcomes and Causal Inference I

Controlling for Time Invariant Heterogeneity

Instrumental Variables

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

What s New in Econometrics. Lecture 13

Trimming for Bounds on Treatment Effects with Missing Outcomes *

Introduction to Causal Inference. Solutions to Quiz 4

PSC 504: Differences-in-differeces estimators

ECON 3150/4150, Spring term Lecture 7

PSC 504: Instrumental Variables

arxiv: v1 [math.st] 7 Jan 2014

Groupe de lecture. Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings. Abadie, Angrist, Imbens

Partial Identification of Average Treatment Effects in Program Evaluation: Theory and Applications

Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models

14.32 Final : Spring 2001

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity.

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Discrete Dependent Variable Models

Instrumental Variables

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

ECO375 Tutorial 9 2SLS Applications and Endogeneity Tests

Notes on Heterogeneity, Aggregation, and Market Wage Functions: An Empirical Model of Self-Selection in the Labor Market

1 Motivation for Instrumental Variable (IV) Regression

More on Roy Model of Self-Selection

An overview of applied econometrics

Single-Equation GMM: Endogeneity Bias

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis"

Max. Likelihood Estimation. Outline. Econometrics II. Ricardo Mora. Notes. Notes

Selection on Observables: Propensity Score Matching.

Instrumental Variables (Take 2): Causal Effects in a Heterogeneous World

AN OVERVIEW OF INSTRUMENTAL VARIABLES*

Transcription:

AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population, we observe D i, Z i and X i,but Y i only if D i = 1. The parametric form of the model assumes that ε i η i X i, Z i N 0, σ2 ε ρσ ε. 0 ρσ ε 1 Note that the variance for η i is normalized to one since we only observe the sign of Z iγ + η i. Therefore γ can only be identified up to a scale. A classical example of this two-equation model is the wage regression model, where Y i is the observed wage rate and D i is the labor supply decision, which is decided by the latent calculation that one s expected wage rate is larger than her reservation wage rate such that she chooses to work. This is the selection bias problem, where individuals with observable wage are those who self-select into working since the expected payoff of working offers a higher utility than not working. Under the assumption of bi-variate normality assumption, one can estimate the model 1

efficiently using the MLE. However, the log-likelihood function is complicated and the computational problem does not behave nicely. Instead, Heckman proposed a two-step method. Thanks to the bi-variate normality assumption, one can show that the conditional mean of ε i given η i is a linear function of η i. In fact, E [ε i η i ] = δη i, where δ = ρσ ε. Thus we have E [Y i X i, η i ] = X iβ + δη i. Since Eη i 0 in the wage equation, the OLS estimate Y i on X i is biased unless ρ = 0, where the selection process is independent of the expected outcome. This is the so called selection bias or endogenous variable bias. Note that E [η i X i, Z i, D i = 1] = E [η i X i, Z i, Z iγ + η i > 0] = E [η i η i > Z iγ] = λ (Z iγ), which is the truncated (from below) mean of the normal variate η i. One can show that the truncated mean of a standard normal variate from below at a constant a takes the form λ (a) = φ (a) /Φ (a), which is the inverse Mill s ratio, wherein φ and Φ are the pdf and cdf of standard normal. Therefore, we have E [Y i X i, η i ] = X iβ + δλ (Z iγ). Heckman s two-step estimator is the following. First estimate γ by probit (recall the normality assumption), which gives consistent estimate of γ. Then for each observation with D i = 1, calculate the estimated inverse Mill s ratio ˆλ i = λ (Z iˆγ), and regress Y i on X i and ˆλ i. 2

Compared to the MLE, this two-step estimate is not necessarily efficient. However, to test if selection problem is present (δ = 0), the OLS standard error suffices here. 1.2 Exclusion Restriction So far, we do not use exclusion restriction, and Z i is allowed to be identical to X i. However, this practice may lead to nearly perfect collinearity. In this case, the identification is called identification based on functional form assumptions in the sense that λ i is a nonlinear function of X i (when Z i = X i ) and the conditional expectation of Y i given X i ends up being nonlinear in X i, where the nonlinear part is interpreted as the selection bias. Therefore, the identification depends crucially on the functional form assumptions. Instead, when we use exclusion restrictions (variables in Z i that are not in X i ), the identification is not that fragile. In this case, there is variation in λ i conditional on X i, so the selection bias coefficient is separately identified, even when the normality assumption does not hold. However, sometimes it is difficult to justify an exclusion restriction that a certain variable affects the participation decision but not the outcome. 2 Causal Inference using Instrumental Variable Angrist, Imbens and Rubin, Identification of Causal Effects Using Instrumental Variables, JASA, 1996, 91 (434): 444-455. Summary: This paper estimates the effect of veteran status in the Vietnam war on mortality, using the lottery number that assigned priority for the draft as an instrument. Consider the problem of inferring the effect of veteran status on health outcome. For person i, let Y i be the observed health outcome, D i be the observed treatment (veteran status) and Z i be the observed draft status. For simplicity, Z i is a binary variable such that those with a low lottery number (Z i = 1) would have served in the military (D i = 1) and those with a high lottery number (Z i = 0) would not have served (D i = 0). 3

2.1 Structural model Goldberger (1972) defined structural equation models as stochastic models in which each equation represents a causal link, rather than a mere empirical association. Consider the standard dummy endogenous variables model Y i = β 0 + β 1 D i + ε i, D i = α 0 + α 1 Z i + v i, and D i = 1, if D i > 0, = 0, if D i 0. In this model, β 1 represents the causal effect of treatment of D on Y. The underlying assumptions include: 1. E [Z i ε i ] = 0 and E [Z i v i ] = 0. This suggests that any effect of Z on Y must be through an effect of Z on D. 2. cov(d i, Z i ) 0, implying that α 1 0. Under this two conditions, Z is considered a valid IV. One can show that the IV estimator can be defined as the ratio of sample covariance ˆβ IV 1 = ĉov (Y i, Z i ) /ĉov (D i, Z i ) N i=1 = Y iz i / N i=1 Z i N i=1 Y i (1 Z i ) / N i=1 (1 Z i) N i=1 D iz i / N i=1 Z i N i=1 D i (1 Z i ) / N i=1 (1 Z i). This IV estimand depends crucially on the functional form (linearity) and distributional assumptions. 4

2.2 Causal inference using potential outcome Let the vector Z = [Z i ], i = 1,..., N. The vector D and Y are similarly defined. Then D i (Z) is the indicator for whether person i would serve given the random vector of draft assignment Z. [note that this is a potential outcome, instead of realized one]. In an ideal world, D i (Z) = Z i for all i if everyone complied with the draft lottery. However, in real world, we observe D i (Z) differs from Z i for various reasons. Similarly, Y i (Z, D) is another potential outcome given the draft and treatment vectors. The two potential outcomes D i (Z) and Y i (Z, D) are fixed but unknown parameters to estimate. Generally, differences in these potential outcomes due to assigned and received treatment will be revealed by analyzing data obtained by randomly assigning Z to the sample. Assumption 1: Stable Unit Treatment Value Assumption (SUTVA) If Z i = Z i, then D i (Z) = D i (Z ) If Z i = Z i and D i = D i, then Y i (Z, D) = Y i (Z, D ). Assumption 2: Random assignment (of Z). Definition 1: causal effects of Z on D for person i is D i (1) D i (0) and on Y is Y i (1, D i (1)) Y i (0, D i (0)). Given assumptions 1 and 2, the average causal effect of Z on Y is Yi Z i Zi Yi (1 Z i ) (1 Zi ) and the average causal effect of Z on D is Di Z i Zi Di (1 Z i ) (1 Zi ). Obviously, these two estimands are simple average differences between those with Z i = 1 and with Z i = 0. Next, one can show the ratio of these two estimands is the IV estimand. 5

Note that due to the presence of imperfect compliance, we can not evaluate the causal effect of D on Y by taking simple average difference between those with D i = 1 and with D i = 0 since the assignment is not ignorable. We need some extra assumptions to identify the causal effect. Assumption 3: Exclusion restrictiony (Z, D) = Y (Z, D) for all Z, Z and for all D. This assumption implies that Y i (1, d) = Y i (0, d) for d = 0, 1. In other words, any effect of Z on Y must be through an effect of Z on D. Now we can define the potential outcome Y (D) = Y (Z, D) for all Z and D. Combined with assumption 1, we can write Y i (D i ) instead of Y i (Z, D). Definition 2: the causal effect of D on Y for person i is Y i (1) Y i (0). Next we focus on average causal effects in groups of people who can be induced to change treatments. Assumption 4: Nonzero causal effects of Z on D : the causal effects of Z on D, E [D i (1) D i (0)] 0. Assumption 5: Monotonicity: D i (1) D i (0) for all i This implies that no one does the opposite of his assignment. Definition 3: Instrumental variable for the causal effect of D on Y : a variable satisfies assumption 1 5 above. 6

Now one can show that the causal effect of Z on Y for person i Y i (1, D i (1)) Y i (0, D i (0)) = Y i (D i (1)) Y i (D i (0)) (by exclusion restriction) = [Y i (1) D i (1) + Y i (0) (1 D i (1))] (since D i (1) = Pr (D i = 1 Z i = 1)) [Y i (1) D i (0) + Y i (0) (1 D i (0))] = [Y i (1) Y i (0)] [D i (1) D i (0)]. Therefore, the causal effect of Z on Y is the product of: (i) causal effect of D on Y and (ii) causal effect of Z on D. Note that D i (1) D i (0) only takes three values: 1,0 and -1. When D i (1) D i (0) = 0, the causal effect of Z on Y is clearly zero. The case D i (1) D i (0) = 1 is ruled out by the monotonicity assumption. Therefore, we obtain E [Y i (1, D i (1)) Y i (0, D i (0))] = E {[Y i (1) Y i (0)] [D i (1) D i (0)]} = E [Y i (1) Y i (0) D i (1) D i (0) = 1] Pr [D i (1) D i (0) = 1]. Finally, by Assumption 4, we obtain the causal interpretation of IV estimand: E [Y i (1, D i (1)) Y i (0, D i (0))] E [D i (1) D i (0)] = E [Y i (1, D i (1)) Y i (0, D i (0))] Pr [D i (1) D i (0) = 1] = E [Y i (1) Y i (0) D i (1) D i (0) = 1]. This treatment effect is called Local Average Treatment Effect. Therefore, the IV estimand captures the effect of the treatment of those who would be affected by the treatment, rather than the generally population. 7