Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics, Week 8
Institute of Economic Studies, Faculty of Social Sciences, Charles University in Prague
Fall 2012

Recommended Reading
For today: Instrumental Variables Estimation and Two Stage Least Squares, Chapter 15 (pp. 461–491).
For next week: Simultaneous Equations Models, Chapter 16 (pp. 501–523).

Today's talk
We will further study the problem of endogenous explanatory variables in multiple regression models.
- With omitted variables, OLS is generally inconsistent.
- When a suitable proxy variable can be found for the unobserved variable, the problem can be solved. But it is often difficult to find such a proxy.
We will take a more rigorous approach to the endogeneity problem:
- Instrumental Variables (IV) estimator.
- Two stage least squares (2SLS).
Today, we will show that IV can be used to obtain a consistent estimator in the presence of omitted variables.

Why use Instrumental Variables?
Consider a simple regression model:
$$y = \beta_0 + \beta_1 x + u,$$
where we think that $x$ and $u$ are correlated: $\mathrm{Cov}(x, u) \neq 0$.
To obtain consistent estimators of $\beta_0$ and $\beta_1$ in this case, we need a new variable: an instrumental variable $z$.
This variable has to satisfy the following properties:
(1) $z$ is uncorrelated with $u$, $\mathrm{Cov}(z, u) = 0$: $z$ is exogenous in the regression equation.
(2) $z$ is correlated with $x$, $\mathrm{Cov}(z, x) \neq 0$: $z$ must be related to the endogenous variable $x$.

Valid Instruments
- $\mathrm{Cov}(z, u) = 0$ cannot be tested directly ($u$ is an unobservable error), so we have to rely on economic theory and common sense to decide about exogeneity.
- $\mathrm{Cov}(z, x) \neq 0$ can easily be tested by running a simple regression:
$$x = \pi_0 + \pi_1 z + \nu.$$
$\mathrm{Cov}(z, x) \neq 0$ holds if and only if $\pi_1 \neq 0$. Thus we should be able to reject the null hypothesis $H_0: \pi_1 = 0$ against the two-sided alternative $H_A: \pi_1 \neq 0$.
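The relevance check above can be sketched in a few lines of Python with NumPy. This is only an illustration on simulated data: the function name `first_stage_t`, the sample size, and the true value $\pi_1 = 0.5$ are assumptions made for the example, not part of the lecture.

```python
import numpy as np

def first_stage_t(x, z):
    """Regress x on a constant and z; return (pi1_hat, t statistic for H0: pi1 = 0)."""
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    resid = x - Z @ coef
    sigma2 = resid @ resid / (n - 2)            # error variance estimate
    cov = sigma2 * np.linalg.inv(Z.T @ Z)       # usual OLS covariance matrix
    return coef[1], coef[1] / np.sqrt(cov[1, 1])

# Simulated data (assumption): z is a relevant instrument with pi1 = 0.5.
rng = np.random.default_rng(0)
n = 1000
z = rng.normal(size=n)
x = 1.0 + 0.5 * z + rng.normal(size=n)

pi1_hat, t_stat = first_stage_t(x, z)
print(f"pi1_hat = {pi1_hat:.3f}, t = {t_stat:.1f}")
```

A large absolute t statistic lets us reject $H_0: \pi_1 = 0$, which supports relevance. It says nothing about exogeneity.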

Example: College Education
Consider an equation for returns to college education among young workers:
$$\text{wages} = \beta_0 + \beta_1\,\text{college} + u.$$
People freely choose to go to college, so $\mathrm{Cov}(\text{college}, u) \neq 0$.
A good instrumental variable is thus one which:
- makes going to college more likely (relevance),
- does not affect wages directly (exogeneity).

Example: College Education (cont.)
Good instrumental variables in this case might be:
- Distance between pre-college residence and college.
  Those living close to a college will be more likely to go to college (relevance).
  Pre-college residence is usually the parents' decision (exogeneity).
- Father's education.
  An educated father will tend to inform the child better about the benefits of education (relevance).
  Father's education is the father's decision (exogeneity).
Now suppose we have a valid instrument $z$. What do we do with it?

IV Estimation
We use the instrument to consistently estimate the parameters, as a proper IV identifies the parameter $\beta_1$:
$$\mathrm{Cov}(z, y) = \beta_1 \mathrm{Cov}(z, x) + \mathrm{Cov}(z, u).$$
We know that
- $\mathrm{Cov}(z, u) = 0$: $z$ is exogenous,
- $\mathrm{Cov}(z, x) \neq 0$: $z$ is relevant (notice that if $z$ and $x$ are uncorrelated, the expression below fails),
so that
$$\beta_1 = \frac{\mathrm{Cov}(z, y)}{\mathrm{Cov}(z, x)}.$$
Hence $\beta_1$ is identified, and given a random sample we have the Instrumental Variable (IV) estimator:
$$\hat\beta_{1,IV} = \frac{\sum_{i=1}^{n} (z_i - \bar z)(y_i - \bar y)}{\sum_{i=1}^{n} (z_i - \bar z)(x_i - \bar x)}.$$

IV Estimation (cont.)
The intercept can be estimated as:
$$\hat\beta_{0,IV} = \bar y - \hat\beta_{1,IV}\,\bar x.$$
NOTE: When $z = x$, we obtain the OLS estimator of $\beta_1$. In other words, when $x$ is exogenous and used as its own instrument, the IV estimator is identical to the OLS estimator.
The IV estimator is consistent: $\mathrm{plim}(\hat\beta_{1,IV}) = \beta_1$.
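The slope and intercept formulas can be tried out on simulated data. The sketch below is a minimal illustration, not the lecture's own code: the function names `iv_simple` and `ols_simple` and the data-generating process (true $\beta_0 = 1$, $\beta_1 = 2$, with $x$ built to be correlated with $u$) are assumptions chosen for the example.

```python
import numpy as np

def iv_simple(y, x, z):
    """Simple-regression IV estimator using one instrument z."""
    zc = z - z.mean()
    b1 = (zc @ (y - y.mean())) / (zc @ (x - x.mean()))
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

def ols_simple(y, x):
    """OLS is the special case z = x."""
    return iv_simple(y, x, x)

# Simulated data (assumption): x depends on u, so OLS is inconsistent.
rng = np.random.default_rng(1)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.6 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

b0_iv, b1_iv = iv_simple(y, x, z)
b0_ols, b1_ols = ols_simple(y, x)
print(f"IV slope  = {b1_iv:.3f}")
print(f"OLS slope = {b1_ols:.3f}")
```

In this design the OLS slope converges to $2 + \mathrm{Cov}(x,u)/\mathrm{Var}(x) = 2.3$, while the IV slope should be close to the true value 2.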

Statistical Inference with IV Estimation
- IV estimates are asymptotically normal, so we can use standard errors for inference.
- Usually, we impose the homoskedasticity assumption: $E(u^2 \mid z) = \sigma^2 = \mathrm{Var}(u)$.
The asymptotic variance of $\hat\beta_{1,IV}$: under the homoskedasticity assumption, it is
$$\frac{\sigma^2}{n\,\sigma_x^2\,\rho_{x,z}^2},$$
where $\sigma_x^2$ is the variance of $x$ and $\rho_{x,z}^2$ is the square of the correlation between $x$ and $z$.

Statistical Inference with IV Estimation (cont.)
This is important as it provides us with the standard errors.
Standard error of $\hat\beta_{1,IV}$: the (asymptotic) standard error of $\hat\beta_{1,IV}$ can be estimated as
$$\sqrt{\frac{\hat\sigma^2}{SST_x \cdot R^2_{x,z}}},$$
where $\hat\sigma^2$ is estimated from the IV residuals, $SST_x$ is the total sum of squares of $x$, and $R^2_{x,z}$ is the $R^2$ from the simple regression of $x$ on $z$.
The resulting standard errors allow us to construct t statistics for testing hypotheses about $\beta_1$ and to build confidence intervals for $\beta_1$.
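As a rough illustration of the standard error formula, the following sketch computes $\hat\sigma^2$, $SST_x$ and $R^2_{x,z}$ by hand. The function name `iv_with_se`, the degrees-of-freedom correction $n - 2$, and the simulated data are assumptions made for this example.

```python
import numpy as np

def iv_with_se(y, x, z):
    """IV slope, its homoskedastic standard error, and R^2 of x on z."""
    n = len(y)
    zc, xc, yc = z - z.mean(), x - x.mean(), y - y.mean()
    b1 = (zc @ yc) / (zc @ xc)
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x                  # IV residuals use the actual x
    sigma2 = resid @ resid / (n - 2)
    sst_x = xc @ xc
    r2_xz = (zc @ xc) ** 2 / ((zc @ zc) * sst_x)   # squared sample correlation
    se = np.sqrt(sigma2 / (sst_x * r2_xz))
    return b1, se, r2_xz

# Simulated data (assumption): same design as a simple endogenous-x model.
rng = np.random.default_rng(2)
n = 20000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + 0.6 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + u

b1, se, r2_xz = iv_with_se(y, x, z)
t_stat = (b1 - 2.0) / se                     # t statistic for H0: beta1 = 2
ci = (b1 - 1.96 * se, b1 + 1.96 * se)        # approximate 95% confidence interval
print(f"b1 = {b1:.3f}, se = {se:.4f}, R2_xz = {r2_xz:.3f}")
```

Because $R^2_{x,z}$ sits in the denominator, a weaker instrument inflates the standard error directly.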

IV versus OLS Estimation
- Standard errors in the IV case differ from the OLS case only in the term $R^2_{x,z}$.
- Since $R^2_{x,z} < 1$, IV standard errors are always larger than OLS standard errors.
- The stronger the correlation between $z$ and $x$, the smaller the IV standard errors (when $R^2_{x,z} = 1$, they are the same as under OLS).

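The Effect of Poor Instruments
What happens if $\mathrm{Cov}(z, u) \neq 0$?
- The IV estimator will be inconsistent.
- However, it can still be better than OLS.
Asymptotic bias of the IV and OLS estimators:
$$\mathrm{plim}\,\hat\beta_{1,IV} = \beta_1 + \frac{\mathrm{Corr}(z, u)}{\mathrm{Corr}(z, x)} \cdot \frac{\sigma_u}{\sigma_x}$$
$$\mathrm{plim}\,\hat\beta_{1,OLS} = \beta_1 + \mathrm{Corr}(x, u) \cdot \frac{\sigma_u}{\sigma_x}$$
Thus the asymptotic bias in IV will be smaller (in absolute value) than the asymptotic bias in OLS if:
$$\left| \frac{\mathrm{Corr}(z, u)}{\mathrm{Corr}(z, x)} \right| < \left| \mathrm{Corr}(x, u) \right|.$$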
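A small numerical example may help here. The correlation values below are made up purely for illustration (they are not from the lecture); they show how a weak instrument with even a small correlation with $u$ can produce a larger asymptotic bias than OLS.

```python
def asymptotic_bias_iv(corr_zu, corr_zx, sd_u, sd_x):
    """Asymptotic bias of the IV slope: Corr(z,u)/Corr(z,x) * sd_u/sd_x."""
    return (corr_zu / corr_zx) * (sd_u / sd_x)

def asymptotic_bias_ols(corr_xu, sd_u, sd_x):
    """Asymptotic bias of the OLS slope: Corr(x,u) * sd_u/sd_x."""
    return corr_xu * (sd_u / sd_x)

# Hypothetical values: a weak instrument (Corr(z,x) = 0.10) that is only
# slightly correlated with the error (Corr(z,u) = 0.05).
bias_iv = asymptotic_bias_iv(corr_zu=0.05, corr_zx=0.10, sd_u=1.0, sd_x=1.0)
bias_ols = asymptotic_bias_ols(corr_xu=0.30, sd_u=1.0, sd_x=1.0)

print(f"IV bias  = {bias_iv:.2f}")
print(f"OLS bias = {bias_ols:.2f}")
print("IV better than OLS:", abs(bias_iv) < abs(bias_ols))
```

With these numbers the ratio $0.05 / 0.10 = 0.5$ exceeds $0.30$, so IV is asymptotically worse than OLS despite the instrument being "almost" exogenous.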

IV Estimation in the Multiple Regression Case
We can extend IV estimation to multiple regression. Let us start with the case where only one of the explanatory variables is correlated with the error:
$$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + u_1.$$
This is called a structural equation, in which we distinguish between endogenous and exogenous variables:
- $y_1$ is clearly endogenous, as it is correlated with $u_1$.
- $z_1$ is exogenous (uncorrelated with $u_1$, $\mathrm{Cov}(z_1, u_1) = 0$).
- $y_2$ is an explanatory variable suspected of being endogenous, i.e. correlated with $u_1$: $\mathrm{Cov}(y_2, u_1) \neq 0$.

IV Estimation in the Multiple Regression Case (cont.)
We know that the OLS estimator will be biased and inconsistent, so we need to find a proper instrumental variable $z_2$ for $y_2$, with $\mathrm{Cov}(z_2, u_1) = 0$.
$z_2$ also needs to be correlated with $y_2$:
$$y_2 = \pi_0 + \pi_1 z_1 + \pi_2 z_2 + \nu_2.$$
The key identification condition is $\pi_2 \neq 0$.
This reduced form equation regresses the endogenous variable on all exogenous variables.

Two Stage Least Squares
We may have more than one instrument for an endogenous variable, say $z_2$ and $z_3$.
In this case we could compute more than one IV estimator. BUT: none of these IV estimators would be efficient.
Since $z_1$, $z_2$ and $z_3$ are all uncorrelated with $u_1$, any linear combination of the exogenous variables is a valid IV.
Thus we choose the linear combination that is most highly correlated with $y_2$.
This estimator is known as two stage least squares (2SLS).

Two Stage Least Squares (cont.)
Consider the following model:
$$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + u_1.$$
2SLS estimates are obtained in two stages:
(1) Obtain the OLS fitted values of the endogenous variable:
$$\hat y_2 = \hat\pi_0 + \hat\pi_1 z_1 + \hat\pi_2 z_2 + \hat\pi_3 z_3.$$
(2) Regress $y_1$ on $\hat y_2$ and $z_1$ by OLS:
$$y_1 = \beta_0 + \beta_1 \hat y_2 + \beta_2 z_1 + u_1.$$
But let STATA do the estimation for you in order to get the correct (robust) standard errors.
We can extend this to multiple endogenous variables. BUT we need at least as many instruments as there are endogenous variables (a proper statement of the conditions is given in the Advanced Econometrics course).
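The two stages can be sketched by hand in Python. This is a minimal illustration under stated assumptions: the helper names `ols` and `tsls`, the simulated coefficients, and the homoskedastic (non-robust) standard errors are all choices made for this example. The point it illustrates is why the second-stage OLS standard errors should not be used as printed: the residuals must be computed with the actual $y_2$, not with $\hat y_2$.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def tsls(y1, y2, exog, instruments):
    """2SLS with one endogenous regressor y2; returns (beta, se)."""
    n = len(y1)
    const = np.ones(n)
    Z = np.column_stack([const, exog, instruments])   # all exogenous variables
    y2_hat = Z @ ols(Z, y2)                           # stage 1: fitted values
    X_hat = np.column_stack([const, y2_hat, exog])
    beta = ols(X_hat, y1)                             # stage 2
    X = np.column_stack([const, y2, exog])
    resid = y1 - X @ beta                             # residuals use actual y2
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X_hat.T @ X_hat)))
    return beta, se

# Simulated data (assumption): z1 is exogenous, z2 and z3 are instruments.
rng = np.random.default_rng(3)
n = 20000
z1 = rng.normal(size=n)
inst = rng.normal(size=(n, 2))                        # columns: z2, z3
u1 = rng.normal(size=n)
v2 = 0.7 * u1 + rng.normal(size=n)                    # makes y2 endogenous
y2 = 0.5 + 0.4 * z1 + inst @ np.array([0.6, 0.5]) + v2
y1 = 1.0 + 1.5 * y2 - 0.8 * z1 + u1

beta, se = tsls(y1, y2, z1, inst)
print("beta (const, y2, z1):", np.round(beta, 3))
print("se:", np.round(se, 4))
```

In practice a dedicated routine (such as the one in STATA mentioned above) is preferable, since it also offers heteroskedasticity-robust standard errors.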

Addressing Errors-in-Variables with IV Estimation
IV can be used to solve not only the omitted variables problem, but also the measurement error problem.
Recall (from Ch. 9) the equation:
$$y = \beta_0 + \beta_1 x_1^* + \beta_2 x_2 + u,$$
where $y$ and $x_2$ are observed but $x_1^*$ is not.
Instead, we observe $x_1 = x_1^* + \epsilon_1$, with $\mathrm{Cov}(x_1^*, \epsilon_1) = 0$.
The correlation between $x_1$ and $\epsilon_1$ makes OLS biased and inconsistent.
If there is a $z$ such that $\mathrm{Corr}(z, u) = 0$ and $\mathrm{Corr}(z, x_1) \neq 0$, IV will remove this bias.
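The attenuation caused by measurement error, and its removal by IV, can be seen in a small simulation. The setup is an assumption made for illustration: the model is reduced to a single regressor, and the instrument $z$ is taken to be a second noisy measurement of $x_1^*$ whose error is independent of both $u$ and $\epsilon_1$. The function name `iv_simple` is hypothetical.

```python
import numpy as np

def iv_simple(y, x, z):
    """Simple-regression IV estimator; with z = x it reduces to OLS."""
    zc = z - z.mean()
    b1 = (zc @ (y - y.mean())) / (zc @ (x - x.mean()))
    return y.mean() - b1 * x.mean(), b1

# Simulated data (assumption): true slope 2, classical measurement error.
rng = np.random.default_rng(5)
n = 20000
x_star = rng.normal(size=n)                 # unobserved true regressor
x_obs = x_star + rng.normal(size=n)         # observed with error eps1
z = x_star + rng.normal(size=n)             # second, independent measurement
y = 1.0 + 2.0 * x_star + rng.normal(size=n)

_, b1_ols = iv_simple(y, x_obs, x_obs)      # OLS on the mismeasured regressor
_, b1_iv = iv_simple(y, x_obs, z)           # IV using the second measurement
print(f"OLS slope = {b1_ols:.3f}")
print(f"IV slope  = {b1_iv:.3f}")
```

Here the OLS slope converges to $2 \cdot \mathrm{Var}(x_1^*) / (\mathrm{Var}(x_1^*) + \mathrm{Var}(\epsilon_1)) = 1$, while the IV slope should be close to 2.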

Testing for Endogeneity
When all explanatory variables are exogenous, both OLS and 2SLS are consistent estimators. BUT: 2SLS is less efficient than OLS, so OLS is preferred.
If we have an endogeneity problem, only IV is consistent.
Thus it is good to have a test for endogeneity (to see whether 2SLS is necessary).
Hausman test for endogeneity: $H_0$: OLS and IV are both consistent. We simply compute both estimates and use the Hausman test to compare them (more about this test in the Advanced Econometrics course).

Testing for Endogeneity (cont.)
Another alternative is to use a regression-based test. If $y_2$ is endogenous, then $\nu_2$ (from the reduced form equation) and $u_1$ (from the structural model) will be correlated.
Regression-based test for endogeneity, for the structural model
$$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + u_1:$$
(1) Regress the potentially endogenous variable $y_2$ on all exogenous variables (here $z_3$ and $z_4$ are the instruments excluded from the structural model) and obtain the residuals $\hat\nu_2$:
$$y_2 = \pi_0 + \pi_1 z_1 + \pi_2 z_2 + \pi_3 z_3 + \pi_4 z_4 + \nu_2.$$
(2) Estimate the structural model by OLS, including both the endogenous variable and the residual $\hat\nu_2$:
$$y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + \delta_1 \hat\nu_2 + u_1.$$
(3) If $H_0: \delta_1 = 0$ is rejected against $H_A: \delta_1 \neq 0$ at a small significance level, then $\mathrm{Cov}(\nu_2, u_1) \neq 0$ and $y_2$ is endogenous.
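The three steps of the regression-based test can be sketched as follows. The function name `endogeneity_t`, the simulated coefficients, and the use of the ordinary (non-robust) t statistic are assumptions made for this example.

```python
import numpy as np

def endogeneity_t(y1, y2, exog, instruments):
    """Return (delta1_hat, t statistic) for the first-stage residual term."""
    n = len(y1)
    const = np.ones(n)
    # Step 1: reduced form for y2 on all exogenous variables.
    Z = np.column_stack([const, exog, instruments])
    v2_hat = y2 - Z @ np.linalg.lstsq(Z, y2, rcond=None)[0]
    # Step 2: structural equation augmented with the residual, by OLS.
    X = np.column_stack([const, y2, exog, v2_hat])
    beta = np.linalg.lstsq(X, y1, rcond=None)[0]
    resid = y1 - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    # Step 3: t statistic for H0: delta1 = 0.
    return beta[-1], beta[-1] / se[-1]

# Simulated data (assumption): y2 is endogenous because v2 depends on u1.
rng = np.random.default_rng(4)
n = 5000
exog = rng.normal(size=(n, 2))                        # z1, z2
inst = rng.normal(size=(n, 2))                        # z3, z4
u1 = rng.normal(size=n)
v2 = 0.7 * u1 + rng.normal(size=n)
y2 = 0.5 + exog @ np.array([0.4, -0.3]) + inst @ np.array([0.6, 0.5]) + v2
y1 = 1.0 + 1.5 * y2 + exog @ np.array([-0.8, 0.2]) + u1

delta1_hat, t_delta = endogeneity_t(y1, y2, exog, inst)
print(f"delta1_hat = {delta1_hat:.3f}, t = {t_delta:.1f}")
```

In this design $\delta_1 = \mathrm{Cov}(u_1, \nu_2)/\mathrm{Var}(\nu_2) \approx 0.47$, so the test should reject exogeneity of $y_2$ clearly.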

Testing Overidentification Restrictions
If we have only one instrument for our endogenous variable, we cannot test whether the instrument is uncorrelated with the error. We say that the model is just identified.
With multiple instruments for an endogenous variable, it is possible to test whether some of the instruments are correlated with the error.
This is called testing overidentifying restrictions.

Testing Overidentification Restrictions (cont.)
(1) Estimate the structural model using IV (2SLS) and obtain the residuals $\hat u_1$.
(2) Regress $\hat u_1$ on all exogenous variables and obtain the $R^2$.
Test $H_0$: all IVs are uncorrelated with $u_1$, using
$$LM = nR^2 \overset{a}{\sim} \chi^2_q,$$
where $q$ is the number of instrumental variables from outside the model minus the total number of endogenous explanatory variables.
If we reject $H_0$, at least some of the IVs are not exogenous.
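The LM statistic can be computed by hand as in the sketch below. Everything specific here is an assumption for illustration: the helper names, the simulated design with two outside instruments and one endogenous regressor (so $q = 1$), and the "invalid instrument" case created by letting $z_3$ enter the error directly. The value 3.841 is the 5% critical value of $\chi^2_1$.

```python
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def overid_lm(y1, y2, exog, instruments):
    """LM = n * R^2 from regressing 2SLS residuals on all exogenous variables."""
    n = len(y1)
    const = np.ones(n)
    Z = np.column_stack([const, exog, instruments])
    y2_hat = Z @ ols(Z, y2)
    beta = ols(np.column_stack([const, y2_hat, exog]), y1)
    u_hat = y1 - np.column_stack([const, y2, exog]) @ beta   # 2SLS residuals
    fitted = Z @ ols(Z, u_hat)
    r2 = 1.0 - np.sum((u_hat - fitted) ** 2) / np.sum((u_hat - u_hat.mean()) ** 2)
    q = instruments.shape[1] - 1       # outside instruments minus endogenous vars
    return n * r2, q

rng = np.random.default_rng(6)
n = 5000
z1 = rng.normal(size=n)
inst = rng.normal(size=(n, 2))                        # z2, z3
e = rng.normal(size=n)
v2 = 0.7 * e + rng.normal(size=n)
y2 = 0.4 * z1 + inst @ np.array([0.6, 0.5]) + v2

y1_ok = 1.0 + 1.5 * y2 - 0.8 * z1 + e                        # valid instruments
y1_bad = 1.0 + 1.5 * y2 - 0.8 * z1 + e + 0.5 * inst[:, 1]    # z3 is invalid

lm_ok, q = overid_lm(y1_ok, y2, z1, inst)
lm_bad, _ = overid_lm(y1_bad, y2, z1, inst)
crit_5pct = 3.841                                     # chi-square(1), 5% level
print(f"q = {q}, LM (valid) = {lm_ok:.2f}, LM (invalid) = {lm_bad:.2f}")
print("reject H0 in invalid case:", lm_bad > crit_5pct)
```

A rejection tells us that at least one instrument fails, but not which one.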

Testing for Heteroskedasticity
Heteroskedasticity in 2SLS raises the same issues as with OLS.
We can use an adjusted Breusch-Pagan test:
- First, we compute the 2SLS residuals $\hat u_1$.
- Second, we regress the squared residuals $\hat u_1^2$ on all exogenous variables $z_1, z_2, \ldots, z_m$.
- Third, we use the F statistic for joint significance.
The null hypothesis of homoskedasticity is rejected if the $z_j$ are jointly significant.
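The three steps above can be sketched as follows. The helper names and the simulated design (error variance increasing in $z_1$) are assumptions made for the example. The F statistic is computed from the $R^2$ of the auxiliary regression as $F = (R^2/m) / ((1 - R^2)/(n - m - 1))$.

```python
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def bp_f_stat_2sls(y1, y2, exog, instruments):
    """F statistic from regressing squared 2SLS residuals on all exogenous variables."""
    n = len(y1)
    const = np.ones(n)
    Z = np.column_stack([const, exog, instruments])
    y2_hat = Z @ ols(Z, y2)
    beta = ols(np.column_stack([const, y2_hat, exog]), y1)
    u_hat = y1 - np.column_stack([const, y2, exog]) @ beta   # 2SLS residuals
    u2 = u_hat ** 2
    fitted = Z @ ols(Z, u2)
    r2 = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
    m = Z.shape[1] - 1                     # number of exogenous variables
    f_stat = (r2 / m) / ((1.0 - r2) / (n - m - 1))
    return f_stat, m

# Simulated data (assumption): Var(u1 | z) grows with z1.
rng = np.random.default_rng(7)
n = 5000
z1 = rng.normal(size=n)
inst = rng.normal(size=(n, 2))
e = rng.normal(size=n)
u1 = e * np.exp(0.25 * z1)                 # heteroskedastic error
v2 = 0.5 * e + rng.normal(size=n)
y2 = 0.4 * z1 + inst @ np.array([0.6, 0.5]) + v2
y1 = 1.0 + 1.5 * y2 - 0.8 * z1 + u1

f_stat, m = bp_f_stat_2sls(y1, y2, z1, inst)
print(f"F({m}, {n - m - 1}) = {f_stat:.1f}")
```

A large F statistic relative to the $F_{m,\,n-m-1}$ critical value leads us to reject homoskedasticity.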

Testing for Serial Correlation
Applying 2SLS to time series data brings the same considerations as when using OLS (Lectures 2–4).
We need a slight adjustment to test for serial correlation:
- First, we compute the 2SLS residuals $\hat u_t$.
- Second, we re-estimate the structural model by 2SLS including the lagged residuals $\hat u_{t-1}$, using the same instruments as originally (plus $\hat u_{t-1}$ itself).
- Third, we use the t statistic on the coefficient of $\hat u_{t-1}$ to test for serial correlation.
It is also possible to use 2SLS on a quasi-differenced model, using quasi-differenced instruments.

Thank you very much for your attention!