Session IV Instrumental Variables

Similar documents
Click to edit Master title style

An example to start off with

Recitation Notes 6. Konrad Menzel. October 22, 2006

Instrumental Variables and the Problem of Endogeneity

Recitation Notes 5. Konrad Menzel. October 13, 2006

Lecture#12. Instrumental variables regression Causal parameters III

1 Motivation for Instrumental Variable (IV) Regression

Rockefeller College University at Albany

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

Using Instrumental Variables to Find Causal Effects in Public Health

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Job Training Partnership Act (JTPA)

Chapter 2: simple regression model

Beyond the Target Customer: Social Effects of CRM Campaigns

Applied Health Economics (for B.Sc.)

Simultaneous Equation Models Learning Objectives Introduction Introduction (2) Introduction (3) Solving the Model structural equations

CSC321 Lecture 4 The Perceptron Algorithm

Differences in Differences (Dif-in and Panel Data. Technical Session III. Manila, December Manila, December Christel Vermeersch

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover).

COGS Q250 Fall Homework 7: Learning in Neural Networks Due: 9:00am, Friday 2nd November.

Handout 12. Endogeneity & Simultaneous Equation Models

1 Impact Evaluation: Randomized Controlled Trial (RCT)

ECON Introductory Econometrics. Lecture 17: Experiments

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

An explanation of Two Stage Least Squares

Now we will define some common sampling plans and discuss their strengths and limitations.

Assessing Studies Based on Multiple Regression

Mgmt 469. Causality and Identification

The returns to schooling, ability bias, and regression

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

Introduction to Econometrics. Assessing Studies Based on Multiple Regression

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Economics 241B Estimation with Instruments

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

FNCE 926 Empirical Methods in CF

AGEC 661 Note Fourteen

Applied Quantitative Methods II

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

ECE521 Lecture7. Logistic Regression

SIMULTANEOUS EQUATION MODEL

Applied Microeconometrics I

Extended regression models using Stata 15

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Internal vs. external validity. External validity. This section is based on Stock and Watson s Chapter 9.

ECO375 Tutorial 8 Instrumental Variables

Controlling for Time Invariant Heterogeneity

Gov 2002: 4. Observational Studies and Confounding

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Essential of Simple regression

Development. ECON 8830 Anant Nyshadham

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Econometrics. Week 11. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Instrumental Variables

Empirical approaches in public economics

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Problem Set - Instrumental Variables

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Lecture 4: Multivariate Regression, Part 2

Panel data methods for policy analysis

8. Instrumental variables regression

Quantitative Economics for the Evaluation of the European Policy

Sampling and Sample Size. Shawn Cole Harvard Business School

Lecture 8: Instrumental Variables Estimation

Lecture 2. Conditional Probability

Instrumental Variables

Chapter 11. Regression with a Binary Dependent Variable

Instrumental Variables

Impact Evaluation Technical Workshop:

Gibbs Sampling in Endogenous Variables Models

Exam D0M61A Advanced econometrics

Problem Set 10: Panel Data

MS&E 226: Small Data

PSC 504: Instrumental Variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Lecture 9: Panel Data Model (Chapter 14, Wooldridge Textbook)

IV estimators and forbidden regressions

Econometrics. 7) Endogeneity

Short Questions (Do two out of three) 15 points each

Basic Linear Model. Chapters 4 and 4: Part II. Basic Linear Model

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

The Distribution of Treatment Effects in Experimental Settings: Applications of Nonlinear Difference-in-Difference Approaches. Susan Athey - Stanford

Impact Evaluation Workshop 2014: Asian Development Bank Sept 1 3, 2014 Manila, Philippines

Instrumental variables estimation using heteroskedasticity-based instruments

Rising Wage Inequality and the Effectiveness of Tuition Subsidy Policies:

Økonomisk Kandidateksamen 2004 (I) Econometrics 2. Rettevejledning

ECNS 561 Multiple Regression Analysis

Topic 2: Logistic Regression

Strategies for dealing with Missing Data

14.32 Final : Spring 2001

EMERGING MARKETS - Lecture 2: Methodology refresher

Econ 8208 Homework 2 Due Date: May 7

Lecture 4: Linear panel models

Structural Equation Modeling An Econometrician s Introduction

Online Videos FERPA. Sign waiver or sit on the sides or in the back. Off camera question time before and after lecture. Questions?

EC 533 Labour Economics Problem Set 1 Answers. = w r. S = f S. f r = 0. log y = log w + log f(s, A)

FNCE 926 Empirical Methods in CF

6. Assessing studies based on multiple regression

Transcription:

Impact Evaluation Session IV Instrumental Variables Christel M. J. Vermeersch January 008 Human Development Human Network Development Network Middle East and North Africa Middle East Region and North Africa Region World Bank Institute Spanish Impact Evaluation Fund

An example to start off with Say we wish to evaluate a voluntary job training program Anyunemployedpersoniselegible Some people choose to register ( Treatments ) Other people choose not to register ( Comparisons ) Some simple (but not-so-good) ways to evaluate the program: Compare before and after situation in the treatment group Compare situation of treatments and comparisons after the intervention Compare situation of treatments and comparisons before and after

Voluntary job training program Say we decide to compare outcomes for those who participate to the outcomes of those who do not participate: a simple model to do this: y= α+ βt + β x+ ε 1 Where T = 1 if person participates in training T = 0 if person does not participate in training x = control variables ( exogenous & observable) Why is this not working properly? 3

Voluntary job training program Say we decide to compare outcomes for those who participate to the outcomes of those who do not participate: a simple model to do this: y= α+ βt + β x+ ε 1 Where T = 1 if person participates in training T = 0 if person does not participate in training x = control variables ( exogenous & observable) Why is this not working properly? problems : > Variables that we omit (for various reasons) but that > Decision to participate in training is endogenous are important 4

Problem #1: Omitted Variables Even in a well-thought model, we'll miss > "forgotten" characteristics: we didn't know they mattered > characteristics that are too complicated to measure Examples : > Varying talent and levels of motivation > Different levels of information > Varying opportunity cost of participation > Varying level of access to services The full "correct" model is: The model we use: y = α + γ T + γ x+ γ D+ η = α + βt 1 3 y 1 + β x+ ε 5

Problem #: Endogenous Decision to Participate Participation is a decision variable? It 's endogenous! (I.e. it depends on the participants themselves.) y = α + βt + β x + ε T 1 = π + π D+ ξ 6

Problem #: Endogenous Decision to Participate Participation is a decision variable? it 's endogenous y = α + βt + β x + ε T 1 = π + π D+ ξ => y = α + β ( π + π D+ ξ) + β x + ε 1 => y = α+ βπ+ β x + βπ D+ βξ+ ε 1 1 1 Note: in the two cases: we're missing the D term, which we need to estimate the correct model, but which we cannot measure properly. 7

The correct model is: Simplified model: y y = α + γ T 1 = α + β T 1 + γ x + γ D + β x + ε 3 + η Say we estimate the treatment effect γ 1 with β 1,OLS If D is correlated with T, and we don t include D in the simplified model, then the estimator of the parameter on T will pick up part of the effect of D. This will happen to the extent that D and T are correlated. Thus: our OLS estimator β 1,OLS of the treatment effect γ captures 1 the effect of other characteristics (D) in addition to the treatment effect. This means that there is a difference between E(β 1,OLS ) y γ 1 the expected value of the OLS estimator β 1 isn t γ1, the real treatment effect β 1,OLS is a biased estimator of the treatment effect γ 1. CV4 8

Slide 8 CV4 corr (T,ε)=corr (T, γ3*d+η) = γ 3*corr (T, D) wb6893, 05/3/006

The correct model is: Simplified model: y y = α + γ T 1 = α + β T 1 + γ x + γ D + β x + ε 3 + η This means that there is a difference between E(β 1,OLS ) y γ 1 the expected value of the OLS estimator β 1 isn t γ1, the real treatment effect β 1,OLS is a biased estimator of the treatment effect γ 1. Why did this happen? One of the basic conditions for OLS to be BLUE was violated: In other words E(β 1,OLS ) γ 1 Even worse..plim(β 1,OLS ) γ 1 (biased estimator) (inconsistent estimator) CV6 9

Slide 9 CV6 corr (T,ε)=corr (T, γ3*d+η) = γ 3*corr (T, D) wb6893, 05/3/006

What can we do to solve this problem? y y = α + γ T 1 = α + β T 1 + γ x + β x + γ D 3 + ε + η Trytocleanthe correlationbetween T and ε: Isolate the variation in T that is not correlated with ε through the omitted variable D We can do this using an instrumental variable (IV) 10

Basic idea behind IV y y The basic problem is that corr (T,D) 0 = α + γ T 1 = α + β T 1 + γ x + β x + γ D + ε + η Find a variable Z that satisfies two conditions: 3 1. Correlated with T: corr (Z, T) 0 ---- Z and T are correlated, or Z predicts part of T. Z is not correlated with ε: corr (Z, ε) = 0 ----By itself, Z has no influence on y. The only way it can influence y is because it influences T. All of the effect of Z on y passes through T. Examples of Z in the case of voluntary job training program? 11

Two-stage least squares (SLS) Remember the original model with endogenous T: y = α + βt + β x+ ε 1 Step 1: Regress the endogenous variable T on the instrumental variable(s) Z and other exogenous variables T = δ + δ x+ θ Z + τ 0 1 1 > Calculate the predicted value of T for each observation: > Since Z and x are not correlated with ε, neither will be Tˆ. > You will need one instrumental variable for each potentially endogenous regressor Tˆ 1

Two-stage least squares (SLS) Step : Regress y on the predicted variable Tˆ and the other exogenous variables y = α + βtˆ + β x+ ε 1 > Note: the standard errors of the second stage OLS need to be corrected because Tˆ is not a fixed regressor. > In practice: use STATA ivreg command, which does the two steps. at once and reports correct standard errors. > Intuition: by using Z for T, we cleaned T of its correlation with ε. > It can be shown that (under certain conditions) IV yields a consistent estimator of γ (large sample theory) 1 13

Uses of instrumental variables Simultaneity: X and Y cause each other instrument X Omitted Variables: X is picking up the effect of other variables which are omitted instrument X with a variable that is not correlated with the omitted variable(s) Measurement Error: X is not measured precisely instrument X 14

Examples Hoxby (000) and Angrist (1999) The effect of class size on learning achievements Chaudhury, Gertler, Vermeersch (work in progress) Estimating the effect of school autonomy on learning in Nepal 15

School autonomy in Nepal Goal is to evaluate A. Autonomous school management by communities B. School report cards C. School information networks Data We could integrate about 1500 schools in the evaluation Each community freely chooses to participate or not. School report cards and school networks done by NGOs School network can only be done in autonomous schools School report cards can be done in any school. 16

School autonomy in Nepal Goal is to evaluate A. Autonomous school management by communities B. School report cards C. School information networks Data We could integrate about 1500 schools in the evaluation Each community freely chooses to participate or not. School report cards and school networks done by NGOs School network can only be done in autonomous schools School report cards can be done in any school. Assume each community has exactly one schools Task: design the implementation of the program so it can be evaluated propose method of evaluation. 17

School autonomy in Nepal Interventions: A. Autonomous school management by communities B. School report cards C. School information networks Creation of demand for devolution to the community (A) Feedback of performance indicators (B) Network support after devolution to the community (C) Group O No No No 00 (Control) Group A Yes No No 300 Group B No Yes No 100 Group AB Yes Yes No 300 Group AC Yes No Yes 300 Group ABC Yes Yes Yes 300 Total 1500 Number of schools in the group 18

Reminder and a word of caution. corr (Z,ε) =0 if corr (Z, ε) 0 Bad instrument Problem! Finding a good instrument is hard! Use both theory and common sense to find one! We can think of designs that yield good instruments. corr(z,t) 0 Weak instruments : the correlation between Z and T needs to be sufficiently strong. If not, the bias stays large even for large sample sizes. 19