Ability Bias, Errors in Variables and Sibling Methods. James J. Heckman University of Chicago Econ 312 This draft, May 26, 2006


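1 Ability Bias

Consider the model:
$$\log Y_i = \beta_0 + \beta_1 S_i + U_i$$
where $Y_i$ = income, $S_i$ = schooling, and $\beta_0$ and $\beta_1$ are parameters of interest. What we have omitted from the above specification is unobserved ability, which is captured in the residual term. We thus rewrite the above as:
$$\log Y_i = \beta_0 + \beta_1 S_i + A_i + \varepsilon_i$$
where $A_i$ is ability, $(\varepsilon_i, \varepsilon_{i'}) \perp\!\!\!\perp (S_i, S_{i'})$, and we believe that $\mathrm{Cov}(A_i, S_i) \neq 0$. Thus, $E(U_i S_i) \neq 0$, so that OLS on our original specification gives biased and inconsistent estimates.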

1.1 Strategies for Estimation

1. Use proxies for ability: Find proxies for ability and include them as regressors. Examples may include: height, weight, etc. The problem with this approach is that proxies may measure ability with error and thus introduce additional bias (see Section 1.3).

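2. Fixed Effect Method: Find a paired comparison. Examples may include a genetic twin or a sibling with similar or identical ability. Consider two individuals $i$ and $i'$:
$$\log Y_i - \log Y_{i'} = (\beta_0 + \beta_1 S_i + A_i + \varepsilon_i) - (\beta_0 + \beta_1 S_{i'} + A_{i'} + \varepsilon_{i'}) = \beta_1 (S_i - S_{i'}) + (A_i - A_{i'}) + (\varepsilon_i - \varepsilon_{i'})$$
Note: if $A_i = A_{i'}$, then OLS performed on the differenced equation (the fixed effect estimator) is unbiased and consistent. If $A_i \neq A_{i'}$, then we just get a different bias (see Section 1.2). Further, if $S$ is measured with error, we may exacerbate the bias in our fixed effect estimator (see Section 1.3).

1.2 OLS vs. Fixed Effect (FE)

In the OLS case with ability bias, we have:
$$\mathrm{plim}\, \hat{\beta}_1^{OLS} = \beta_1 + \frac{\mathrm{Cov}(A, S)}{\mathrm{Var}(S)}$$
(See the derivation of Equation (2.2) for more background on the above derivation.)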

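We also impose (primes denote the sibling $i'$):
$$\mathrm{Var}(S) = \mathrm{Var}(S'), \qquad \mathrm{Cov}(A, S) = \mathrm{Cov}(A', S'), \qquad \mathrm{Cov}(A, S') = \mathrm{Cov}(A', S)$$
With these assumptions, our fixed effect estimator is given by:
$$\mathrm{plim}\, \hat{\beta}_1^{FE} = \beta_1 + \frac{\mathrm{Cov}\left(S - S',\ (A - A') + (\varepsilon - \varepsilon')\right)}{\mathrm{Var}(S - S')} = \beta_1 + \frac{\mathrm{Cov}(A, S) - \mathrm{Cov}(A', S)}{\mathrm{Var}(S) - \mathrm{Cov}(S, S')}$$
Note that if $\mathrm{Cov}(A', S) = 0$ and ability is positively correlated with schooling, then the fixed effect estimator is upward biased.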

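From the preceding, we see that the fixed effect estimator has more asymptotic bias if (primes denote the sibling):
$$\frac{\mathrm{Cov}(A, S) - \mathrm{Cov}(A', S)}{\mathrm{Var}(S) - \mathrm{Cov}(S, S')} > \frac{\mathrm{Cov}(A, S)}{\mathrm{Var}(S)} \iff \frac{\mathrm{Cov}(A, S)}{\mathrm{Var}(S)} > \frac{\mathrm{Cov}(A', S)}{\mathrm{Cov}(S, S')}$$
where the equivalence uses $\mathrm{Var}(S) > \mathrm{Cov}(S, S') > 0$.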
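The comparison between the two probability limits can be checked numerically. The sketch below simply evaluates the two formulas of Section 1.2; the function names and all moment values are hypothetical and chosen only for illustration.

```python
def ols_plim(beta1, cov_a_s, var_s):
    """plim of OLS: beta1 + Cov(A, S) / Var(S)."""
    return beta1 + cov_a_s / var_s


def fe_plim(beta1, cov_a_s, cov_asib_s, var_s, cov_s_ssib):
    """plim of the sibling-difference (fixed effect) estimator:
    beta1 + [Cov(A, S) - Cov(A', S)] / [Var(S) - Cov(S, S')]."""
    return beta1 + (cov_a_s - cov_asib_s) / (var_s - cov_s_ssib)


# Hypothetical population moments (assumptions, not taken from the notes).
beta1, var_s, cov_a_s, cov_s_ssib = 0.08, 4.0, 0.2, 2.0

ols = ols_plim(beta1, cov_a_s, var_s)

# Siblings with nearly identical ability: Cov(A', S) is close to Cov(A, S).
fe_similar = fe_plim(beta1, cov_a_s, 0.19, var_s, cov_s_ssib)

# Sibling's ability unrelated to own schooling: Cov(A', S) = 0.
fe_unrelated = fe_plim(beta1, cov_a_s, 0.0, var_s, cov_s_ssib)

print(f"true beta1        : {beta1:.3f}")
print(f"OLS plim          : {ols:.3f}")
print(f"FE plim, similar A: {fe_similar:.3f}")
print(f"FE plim, Cov(A',S)=0: {fe_unrelated:.3f}")
```

With these numbers, differencing helps when siblings share most of their ability and hurts when the sibling's ability is unrelated to own schooling, which is the condition stated above.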

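1.3 Measurement Error

Say $S^* = S + \nu$, where $S^*$ is observed schooling. Our model now becomes:
$$\log Y = \beta_0 + \beta_1 S + A + \varepsilon = \beta_0 + \beta_1 S^* + (A + \varepsilon - \beta_1 \nu)$$
and the fixed effect estimator is based on (primes denote the sibling):
$$\log Y - \log Y' = (\beta_0 + \beta_1 S^* + A + \varepsilon - \beta_1 \nu) - (\beta_0 + \beta_1 S^{*\prime} + A' + \varepsilon' - \beta_1 \nu') = \beta_1 (S^* - S^{*\prime}) + (A - A') + (\varepsilon - \varepsilon') - \beta_1 (\nu - \nu')$$
Now we wish to examine which estimator (OLS or fixed effect) has more asymptotic bias given our measurement error problem. For the remaining arguments of this section, we assume that the measurement error is uncorrelated with true schooling, with ability, and across siblings:
$$\mathrm{Cov}(\nu, S) = \mathrm{Cov}(\nu, S') = \mathrm{Cov}(\nu, A) = \mathrm{Cov}(\nu, A') = \mathrm{Cov}(\nu, \nu') = 0, \qquad \mathrm{Var}(\nu) = \mathrm{Var}(\nu')$$
so that the OLS estimator gives:
$$\mathrm{plim}\, \hat{\beta}_1^{OLS} = \beta_1 + \frac{\mathrm{Cov}(S^*,\ A + \varepsilon - \beta_1 \nu)}{\mathrm{Var}(S^*)} = \beta_1 + \frac{\mathrm{Cov}(A, S) - \beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu)}$$
The fixed effect estimator gives:
$$\mathrm{plim}\, \hat{\beta}_1^{FE} = \beta_1 + \frac{\mathrm{Cov}\left(S^* - S^{*\prime},\ (A - A') - \beta_1 (\nu - \nu')\right)}{\mathrm{Var}(S^* - S^{*\prime})} = \beta_1 + \frac{\mathrm{Cov}(S - S', A - A') - \beta_1 \mathrm{Var}(\nu - \nu')}{\mathrm{Var}(S - S') + \mathrm{Var}(\nu - \nu')} = \beta_1 + \frac{\mathrm{Cov}(A, S) - \mathrm{Cov}(A', S) - \beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu) - \mathrm{Cov}(S, S')}$$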

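Under what conditions will the fixed effect bias be greater? From the above, we know that this will be true if and only if the fixed effect bias is larger in absolute value:
$$\left| \frac{\mathrm{Cov}(A, S) - \mathrm{Cov}(A', S) - \beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu) - \mathrm{Cov}(S, S')} \right| > \left| \frac{\mathrm{Cov}(A, S) - \beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu)} \right|$$
When measurement error dominates, so that both biases are negative, cross-multiplying gives the equivalent condition
$$\mathrm{Cov}(A', S)\left(\mathrm{Var}(S) + \mathrm{Var}(\nu)\right) > \left(\mathrm{Cov}(A, S) - \beta_1 \mathrm{Var}(\nu)\right) \mathrm{Cov}(S, S')$$
or, with $\mathrm{Cov}(S, S') > 0$,
$$\frac{\mathrm{Cov}(A', S)}{\mathrm{Cov}(S, S')} > \frac{\mathrm{Cov}(A, S) - \beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu)}$$
(When both biases are positive, the last two inequalities are reversed.) If this inequality holds, taking differences can actually worsen the estimate relative to OLS alone. Intuitively, we see that we have differenced out the true component, $S$, and compounded our measurement error problem with the fixed effect estimator.

In the special case $A = A'$, the condition is
$$\left| \frac{-\beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu) - \mathrm{Cov}(S, S')} \right| > \left| \frac{\mathrm{Cov}(A, S) - \beta_1 \mathrm{Var}(\nu)}{\mathrm{Var}(S) + \mathrm{Var}(\nu)} \right|$$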
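To see how differencing can compound measurement error, the two asymptotic biases of Section 1.3 can be evaluated directly. This is a minimal sketch for the identical-ability case $A = A'$; the function names and all numbers are hypothetical.

```python
def ols_bias(beta1, cov_a_s, var_s, var_nu):
    """Asymptotic OLS bias with classical measurement error in schooling:
    [Cov(A,S) - beta1*Var(nu)] / [Var(S) + Var(nu)]."""
    return (cov_a_s - beta1 * var_nu) / (var_s + var_nu)


def fe_bias(beta1, cov_a_s, cov_asib_s, var_s, var_nu, cov_s_ssib):
    """Asymptotic bias of the sibling-difference estimator:
    [Cov(A,S) - Cov(A',S) - beta1*Var(nu)] / [Var(S) + Var(nu) - Cov(S,S')]."""
    return (cov_a_s - cov_asib_s - beta1 * var_nu) / (var_s + var_nu - cov_s_ssib)


# Hypothetical moments (assumptions). Identical twins: A = A', so
# Cov(A', S) equals Cov(A, S) and ability is differenced out completely.
beta1, cov_a_s, var_s, var_nu = 0.08, 0.2, 4.0, 0.4

b_ols = ols_bias(beta1, cov_a_s, var_s, var_nu)
# Moderately correlated schooling within pairs.
b_fe_moderate = fe_bias(beta1, cov_a_s, cov_a_s, var_s, var_nu, cov_s_ssib=3.0)
# Highly correlated schooling: little true variation survives differencing.
b_fe_high = fe_bias(beta1, cov_a_s, cov_a_s, var_s, var_nu, cov_s_ssib=3.8)

print(f"OLS bias                 : {b_ols:+.4f}")
print(f"FE bias, Cov(S,S') = 3.0 : {b_fe_moderate:+.4f}")
print(f"FE bias, Cov(S,S') = 3.8 : {b_fe_high:+.4f}")
```

In this illustration the fixed effect estimator beats OLS when the twins' schooling is only moderately correlated, but is worse in absolute value once most of the true variation in schooling is differenced away.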

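2 Errors in Variables

2.1 The Model

Suppose that the equation for earnings is given by:
$$y_i = X_{1i} \beta_1 + X_{2i} \beta_2 + U_i$$
where $E(U_i \mid X_{1i'}, X_{2i'}) = 0$ for all $i, i'$. Also define:
$$X_1^* = X_1 + \varepsilon_1 \quad \text{and} \quad X_2^* = X_2 + \varepsilon_2$$
Here, $X_1^*$ and $X_2^*$ are observed and measure $X_1$ and $X_2$ with error. We also impose that $(X_1, X_2) \perp\!\!\!\perp (\varepsilon_1, \varepsilon_2)$. So, our initial model can be equivalently rewritten as:
$$y = X_1^* \beta_1 + X_2^* \beta_2 + (U - \varepsilon_1 \beta_1 - \varepsilon_2 \beta_2)$$
Finally, by the assumed independence of $X$ and $\varepsilon$, we write:
$$\Sigma_{X^*} = \Sigma_X + \Sigma_\varepsilon$$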

2.2 McCallum's Problem

Question: Is it better for estimation of $\beta_1$ to include other variables measured with error? Suppose that $X_1$ is not measured with error, in the sense that $\varepsilon_1 = 0$, while $X_2$ is measured with error. In Sections 2.2.1 and 2.2.2 below, we consider both excluding and including $X_2$ and investigate the asymptotic properties of both cases.

2.2.1 Excluded $X_2$

The equation for earnings with omitted $X_2$ is:
$$y = X_1 \beta_1 + (U + X_2 \beta_2)$$

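Therefore, by arguments similar to those in the appendix, we know:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 + \beta_2 \frac{\sigma_{12}}{\sigma_{11}} \qquad (2.1)$$
Here, $\sigma_{12}$ is the covariance between the regressors, and $\sigma_{11}$ is the variance of $X_1$. Before moving on to a more general model for the inclusion of $X_2$, let us first consider the classical case for including both variables. Suppose
$$\Sigma_X = \begin{pmatrix} \sigma_{11} & 0 \\ 0 & \sigma_{22} \end{pmatrix}, \qquad \Sigma_\varepsilon = \begin{pmatrix} \sigma^{\varepsilon}_{11} & 0 \\ 0 & \sigma^{\varepsilon}_{22} \end{pmatrix}$$
We know that:
$$\mathrm{plim}\, \hat{\beta} = (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_X\, \beta \qquad (2.2)$$
where the coefficient and regressor vectors have been stacked appropriately (see Appendix for derivation). Note that $\Sigma_\varepsilon$ represents the variance-covariance matrix of the measurement errors, and $\Sigma_X$ is the variance-covariance matrix of the regressors. Straightforward computations thus give:
$$\mathrm{plim}\, \hat{\beta} = \begin{pmatrix} \sigma_{11} + \sigma^{\varepsilon}_{11} & 0 \\ 0 & \sigma_{22} + \sigma^{\varepsilon}_{22} \end{pmatrix}^{-1} \begin{pmatrix} \sigma_{11} & 0 \\ 0 & \sigma_{22} \end{pmatrix} \beta = \begin{pmatrix} \dfrac{\sigma_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}}\, \beta_1 \\[2ex] \dfrac{\sigma_{22}}{\sigma_{22} + \sigma^{\varepsilon}_{22}}\, \beta_2 \end{pmatrix}$$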

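2.2.2 Included $X_2$

In McCallum's problem we suppose that $\sigma^{\varepsilon}_{12} = 0$. Further, as $X_1$ is not measured with error, $\sigma^{\varepsilon}_{11} = 0$. Substituting this into equation (2.2) yields:
$$\mathrm{plim}\, \hat{\beta} = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} + \sigma^{\varepsilon}_{22} \end{pmatrix}^{-1} \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix} \beta$$
With a little algebra, the above gives:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 + \beta_2 \frac{\sigma_{12}\, \sigma^{\varepsilon}_{22}}{\sigma_{11}(\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2} = \beta_1 + \beta_2 \frac{\sigma_{12}}{\sigma_{11}} \left( \frac{\sigma^{\varepsilon}_{22}}{\sigma_{22}(1 - \rho_{12}^2) + \sigma^{\varepsilon}_{22}} \right)$$
where $\rho_{12}^2 = \sigma_{12}^2 / (\sigma_{11} \sigma_{22})$ is simply the squared correlation coefficient between $X_1$ and $X_2$. Since $0 \le \rho_{12}^2 \le 1$, we know that:
$$0 < \frac{\sigma^{\varepsilon}_{22}}{\sigma_{22}(1 - \rho_{12}^2) + \sigma^{\varepsilon}_{22}} \le 1$$
so including $X_2$ results in less asymptotic bias (inconsistency). (We get this result by comparing the above with the bias from excluding $X_2$ in Section 2.2.1, the result captured in equation (2.1).) So, we have justified the kitchen sink approach. This result generalizes to the multiple regressor case: one badly measured variable with good ones (Econometrica, 1972).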
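McCallum's comparison can be illustrated by plugging numbers into the two probability limits: equation (2.1) for the excluded case and the shrinkage-factor expression for the included case. The function names and moment values below are hypothetical.

```python
def plim_b1_excluded(beta1, beta2, s11, s12):
    """Equation (2.1): X2 omitted, X1 measured without error."""
    return beta1 + beta2 * s12 / s11


def plim_b1_included(beta1, beta2, s11, s12, s22, se22):
    """X2 included but measured with error variance se22; X1 error free."""
    rho_sq = s12 ** 2 / (s11 * s22)           # squared correlation of X1, X2
    shrink = se22 / (s22 * (1.0 - rho_sq) + se22)
    return beta1 + beta2 * (s12 / s11) * shrink


# Hypothetical population values (assumptions for illustration only).
beta1, beta2 = 1.0, 0.5
s11, s12, s22, se22 = 1.0, 0.5, 1.0, 0.5

excluded = plim_b1_excluded(beta1, beta2, s11, s12)
included = plim_b1_included(beta1, beta2, s11, s12, s22, se22)

print(f"plim b1, X2 excluded: {excluded:.3f}")
print(f"plim b1, X2 included: {included:.3f}")
```

Because the shrinkage factor lies between zero and one, the included-case bias is a fraction of the omitted-variable bias, here 40 percent of it.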

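2.3 General Case

In the most general case, we have:
$$\mathrm{plim}\, \hat{\beta} = (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_X\, \beta = \beta - (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_\varepsilon\, \beta$$
where $\sigma_{jk}$ and $\sigma^{\varepsilon}_{jk}$ denote the elements of $\Sigma_X$ and $\Sigma_\varepsilon$. With a little algebra we find:
$$\det(\Sigma_X + \Sigma_\varepsilon) = (\sigma_{11} + \sigma^{\varepsilon}_{11})(\sigma_{22} + \sigma^{\varepsilon}_{22}) - (\sigma_{12} + \sigma^{\varepsilon}_{12})^2$$
Therefore:
$$\mathrm{plim}\, \hat{\beta} = \beta - \frac{1}{\det(\Sigma_X + \Sigma_\varepsilon)} \begin{pmatrix} \sigma_{22} + \sigma^{\varepsilon}_{22} & -(\sigma_{12} + \sigma^{\varepsilon}_{12}) \\ -(\sigma_{12} + \sigma^{\varepsilon}_{12}) & \sigma_{11} + \sigma^{\varepsilon}_{11} \end{pmatrix} \begin{pmatrix} \sigma^{\varepsilon}_{11} & \sigma^{\varepsilon}_{12} \\ \sigma^{\varepsilon}_{12} & \sigma^{\varepsilon}_{22} \end{pmatrix} \beta$$
Supposing $\sigma^{\varepsilon}_{12} = 0$ we get:
$$\det(\Sigma_X + \Sigma_\varepsilon) = \det(\Sigma_X) + \sigma_{11} \sigma^{\varepsilon}_{22} + \sigma_{22} \sigma^{\varepsilon}_{11} + \sigma^{\varepsilon}_{11} \sigma^{\varepsilon}_{22}$$
and thus, writing $D = \det(\Sigma_X + \Sigma_\varepsilon)$:
$$\mathrm{plim}\, \hat{\beta} = \begin{pmatrix} 1 - \dfrac{\sigma^{\varepsilon}_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22})}{D} & \dfrac{\sigma_{12}\, \sigma^{\varepsilon}_{22}}{D} \\[2ex] \dfrac{\sigma_{12}\, \sigma^{\varepsilon}_{11}}{D} & 1 - \dfrac{\sigma^{\varepsilon}_{22} (\sigma_{11} + \sigma^{\varepsilon}_{11})}{D} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}$$
Note that if $\beta_2 \sigma_{12} > 0$ and $\sigma^{\varepsilon}_{22} > 0$, OLS may not be downward biased for $\beta_1$. If $\beta_2 = 0$ we get:
$$\mathrm{plim}\, \hat{\beta}_2 = \beta_1 \frac{\sigma_{12}\, \sigma^{\varepsilon}_{11}}{D}$$
so, if $X_2$ were a race variable and blacks get lower quality schooling (where schooling is measured by $X_1$), then $\sigma_{12} < 0$ and hence $\mathrm{plim}\, \hat{\beta}_2 < 0$ for $\beta_1 > 0$. This would be read as a finding in support of labor market discrimination, even though $\beta_2 = 0$.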
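The general formula $\mathrm{plim}\, \hat{\beta} = (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_X \beta$ is easy to evaluate for two regressors. The sketch below uses a hand-written 2x2 inverse so that it needs no external library; the helper name `plim_ols_2x2` and the numbers in the race example are hypothetical.

```python
def plim_ols_2x2(sigma_x, sigma_eps, beta):
    """Return (Sigma_X + Sigma_eps)^{-1} Sigma_X beta for two regressors.

    sigma_x, sigma_eps: 2x2 nested sequences; beta: length-2 sequence.
    """
    m = [[sigma_x[i][j] + sigma_eps[i][j] for j in range(2)] for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    sx_beta = [sigma_x[i][0] * beta[0] + sigma_x[i][1] * beta[1] for i in range(2)]
    return tuple(inv[i][0] * sx_beta[0] + inv[i][1] * sx_beta[1] for i in range(2))


# Hypothetical example: X1 = schooling (measured with error), X2 = a group
# indicator that is negatively correlated with schooling and has NO true
# effect on earnings (beta2 = 0).
sigma_x = ((4.0, -0.3), (-0.3, 0.25))
sigma_eps = ((1.0, 0.0), (0.0, 0.0))
beta = (0.1, 0.0)

b1, b2 = plim_ols_2x2(sigma_x, sigma_eps, beta)
print(f"plim b1 = {b1:.4f} (true 0.1), plim b2 = {b2:.4f} (true 0.0)")
```

Even though the true $\beta_2$ is zero, the probability limit of its estimate is negative, because the indicator picks up part of the mismeasured schooling effect.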

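2.4 The Kitchen Sink Revisited

McCallum's analysis suggests that one should toss in a variable measured with error if there is no measurement error in $X_1$. But suppose that there is measurement error in $X_1$. Is it still better to include the additional variable measured with error as a regressor? We proceed by imposing $\beta_2 = 0$.

(i) Excluded $X_2$. The equation for earnings with measurement error in $X_1$ and excluded $X_2$ is:
$$y = (X_1 + \varepsilon_1) \beta_1 + (U - \varepsilon_1 \beta_1) = X_1^* \beta_1 + (U - \varepsilon_1 \beta_1)$$
Therefore:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 \left( 1 - \frac{\sigma^{\varepsilon}_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}} \right) = \beta_1 \left( \frac{\sigma_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}} \right) \qquad (2.3)$$

(ii) Included $X_2$. From our analysis in the General Case (Section 2.3, with $\sigma^{\varepsilon}_{12} = 0$), we know that:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 \left( 1 - \frac{\sigma^{\varepsilon}_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22})}{\det(\Sigma_X + \Sigma_\varepsilon)} \right) \qquad (2.4)$$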

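If $\sigma^{\varepsilon}_{22} = 0$, so that $X_2$ is not measured with error:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 \left( 1 - \frac{\sigma^{\varepsilon}_{11} \sigma_{22}}{(\sigma_{11} + \sigma^{\varepsilon}_{11}) \sigma_{22} - \sigma_{12}^2} \right) = \beta_1 \left( \frac{\sigma_{11} (1 - \rho_{12}^2)}{\sigma_{11} (1 - \rho_{12}^2) + \sigma^{\varepsilon}_{11}} \right) \qquad (2.5)$$
where $\rho_{12}^2 = \sigma_{12}^2 / (\sigma_{11} \sigma_{22})$. Comparing eqn (2.3) and eqn (2.5), we see that adding the variable measured without error always exacerbates the bias. For, the bias in the excluded case will be smaller if:
$$\frac{\sigma_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}} > \frac{\sigma_{11} (1 - \rho_{12}^2)}{\sigma_{11} (1 - \rho_{12}^2) + \sigma^{\varepsilon}_{11}}$$
which is always the case, provided $\rho_{12}^2 > 0$. (Note that the coefficients on $\beta_1$ for both the excluded and included case are less than one. So, the larger coefficient is the one with less bias, as stated above.)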

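Now suppose that $\sigma^{\varepsilon}_{22} > 0$, so that both variables are measured with error. Then:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 \left( 1 - \frac{\sigma^{\varepsilon}_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22})}{\det(\Sigma_X + \Sigma_\varepsilon)} \right) = \beta_1 \left( \frac{\sigma_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2}{(\sigma_{11} + \sigma^{\varepsilon}_{11})(\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2} \right)$$
Intuitively, measurement error in $X_2$ does not repair the bias created by including it, and thus exclusion should again be preferred to inclusion. Formally, including $X_2$ gives more bias if and only if:
$$\frac{\sigma_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}} > \frac{\sigma_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2}{(\sigma_{11} + \sigma^{\varepsilon}_{11})(\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2}$$
$$\iff \sigma_{11} \left[ (\sigma_{11} + \sigma^{\varepsilon}_{11})(\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2 \right] > (\sigma_{11} + \sigma^{\varepsilon}_{11}) \left[ \sigma_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2 \right]$$
$$\iff \sigma^{\varepsilon}_{11}\, \sigma_{12}^2 > 0$$
Thus, provided $\rho_{12}^2 > 0$, including $X_2$ results in more bias than excluding it. If $\rho_{12}^2 = 0$, the probability limit from including $X_2$ is obviously seen to be:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 \left( 1 - \frac{\sigma^{\varepsilon}_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22})}{(\sigma_{11} + \sigma^{\varepsilon}_{11})(\sigma_{22} + \sigma^{\varepsilon}_{22})} \right) = \beta_1 \left( \frac{\sigma_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}} \right)$$
so that including and excluding $X_2$ yields the same result.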

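Finally, from the General Case section, we have:
$$\mathrm{plim}\, \hat{\beta}_1 = \beta_1 \left( 1 - \frac{\sigma^{\varepsilon}_{11} (\sigma_{22} + \sigma^{\varepsilon}_{22})}{(\sigma_{11} + \sigma^{\varepsilon}_{11})(\sigma_{22} + \sigma^{\varepsilon}_{22}) - \sigma_{12}^2} \right)$$
L'Hôpital's rule on the above shows that:
$$\lim_{\sigma^{\varepsilon}_{11} \to \infty} \left( \mathrm{plim}\, \hat{\beta}_1 \right) = 0 \quad \text{and} \quad \lim_{\sigma^{\varepsilon}_{22} \to \infty} \left( \mathrm{plim}\, \hat{\beta}_1 \right) = \beta_1 \left( \frac{\sigma_{11}}{\sigma_{11} + \sigma^{\varepsilon}_{11}} \right)$$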
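The comparison in this section can be tabulated numerically. The sketch below computes the attenuation factor multiplying $\beta_1$ when $X_2$ is excluded and when it is included with different amounts of measurement error; function names and values are hypothetical.

```python
def factor_excluded(s11, se11):
    """Attenuation factor on beta1 when X2 is excluded (equation 2.3)."""
    return s11 / (s11 + se11)


def factor_included(s11, s12, s22, se11, se22):
    """Attenuation factor on beta1 when X2 is included (beta2 = 0,
    uncorrelated measurement errors)."""
    s = s22 + se22
    return (s11 * s - s12 ** 2) / ((s11 + se11) * s - s12 ** 2)


# Hypothetical moments (assumptions): X1 has error variance 0.5.
s11, s12, s22, se11 = 1.0, 0.6, 1.0, 0.5

excl = factor_excluded(s11, se11)
incl = {se22: factor_included(s11, s12, s22, se11, se22)
        for se22 in (0.0, 1.0, 100.0)}

print(f"excluded X2           : {excl:.4f}")
for se22, f in incl.items():
    print(f"included, se22 = {se22:5.1f}: {f:.4f}")
```

For these values the included-case factor is always below the excluded-case factor, and it approaches the excluded value as the error variance in $X_2$ grows, which matches the second limit above.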

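Appendix

Derivation of Equation (2.2)

We can write $y = X^* \beta + (U - \varepsilon \beta)$, where:
$$X^* = \begin{pmatrix} X_1^* & X_2^* \end{pmatrix} \quad \text{and} \quad \varepsilon = \begin{pmatrix} \varepsilon_1 & \varepsilon_2 \end{pmatrix}$$
and $X_1^*, X_2^*, \varepsilon_1, \varepsilon_2$ are $N \times 1$. So:
$$\hat{\beta} = (X^{*\prime} X^*)^{-1} X^{*\prime} y = \beta + (X^{*\prime} X^*)^{-1} X^{*\prime} (U - \varepsilon \beta) = \beta + \left( \frac{X^{*\prime} X^*}{N} \right)^{-1} \left( \frac{(X + \varepsilon)' U}{N} - \frac{X' \varepsilon}{N} \beta - \frac{\varepsilon' \varepsilon}{N} \beta \right)$$
Taking probability limits term by term,
$$\mathrm{plim}\, \frac{X^{*\prime} X^*}{N} = \Sigma_X + \Sigma_\varepsilon, \qquad \mathrm{plim}\, \frac{(X + \varepsilon)' U}{N} = 0, \qquad \mathrm{plim}\, \frac{X' \varepsilon}{N} = 0, \qquad \mathrm{plim}\, \frac{\varepsilon' \varepsilon}{N} = \Sigma_\varepsilon$$
so that
$$\mathrm{plim}\, \hat{\beta} = \beta - (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_\varepsilon\, \beta = (\Sigma_X + \Sigma_\varepsilon)^{-1} (\Sigma_X + \Sigma_\varepsilon - \Sigma_\varepsilon)\, \beta = (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_X\, \beta$$
where $\mathrm{plim}\, X' \varepsilon / N = 0$ follows from the independence of $X$ and $\varepsilon$. This type of argument is also used to derive the probability limit of the $\hat{\beta}$s in Section 1.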
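A small Monte Carlo experiment can be used to check the formula $\mathrm{plim}\, \hat{\beta} = (\Sigma_X + \Sigma_\varepsilon)^{-1} \Sigma_X \beta$. The sketch below is only an illustration under assumed normal draws: the design ($\sigma_{11} = \sigma_{22} = 1$, $\sigma_{12} = 0.5$, error variances of $0.5$, $\beta = (1, 0.5)$), the sample size and the seed are all hypothetical choices.

```python
import math
import random


def simulate_ols(n, beta, seed=0):
    """OLS of y on the mismeasured regressors (x1*, x2*), no intercept
    (all variables have mean zero by construction)."""
    rng = random.Random(seed)
    b1, b2 = beta
    s11 = s12 = s22 = s1y = s2y = 0.0
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1
        x2 = 0.5 * z1 + math.sqrt(0.75) * z2      # Var = 1, Cov(x1, x2) = 0.5
        y = b1 * x1 + b2 * x2 + rng.gauss(0, 1)
        x1s = x1 + rng.gauss(0, math.sqrt(0.5))   # observed with error
        x2s = x2 + rng.gauss(0, math.sqrt(0.5))
        s11 += x1s * x1s
        s12 += x1s * x2s
        s22 += x2s * x2s
        s1y += x1s * y
        s2y += x2s * y
    det = s11 * s22 - s12 * s12
    return ((s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det)


beta = (1.0, 0.5)
estimates = simulate_ols(100_000, beta)

# Theoretical plim for this design:
# (Sigma_X + Sigma_eps)^{-1} Sigma_X = [[0.625, 0.125], [0.125, 0.625]]
theory = (0.625 * beta[0] + 0.125 * beta[1], 0.125 * beta[0] + 0.625 * beta[1])

print("simulated:", tuple(round(b, 3) for b in estimates))
print("theory   :", theory)
```

The simulated coefficients should lie close to the theoretical values, up to sampling noise.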

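3 Sibling Models: Components of Variance Scheme

Suppose that data on two brothers, say $a$ and $b$, is at our disposal. Without loss of generality, we will consider how to estimate parameters of interest for person $a$ in what follows. We will begin by introducing a general model and then focus on the two-person case mentioned above. Consider the following triangular system:
$$Y_{1ij} = \varepsilon_{1ij}$$
$$Y_{2ij} = \gamma_{12} Y_{1ij} + \varepsilon_{2ij}$$
$$\vdots$$
$$Y_{Kij} = \gamma_{1K} Y_{1ij} + \dots + \gamma_{K-1,K} Y_{K-1,ij} + \varepsilon_{Kij}$$
Here, $i$ indexes the person in group $j$. We assume that $\varepsilon_{ij}$ and $\varepsilon_{i'j'}$ are uncorrelated for $j \neq j'$ (i.e., uncorrelated across groups). Further, we suppose:
$$\varepsilon_{kij} = \alpha_k f_{ij} + u_{kij}, \qquad f_{ij} = f_j + g_{ij}, \qquad \text{for } i = a, b$$
We assume $u_{kij}$ is uncorrelated across equations and across $i$ within the group, $f_j$ is i.i.d. across groups, and $g_{ij}$ is i.i.d. within groups and uncorrelated with $f_j$.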

3.1 Estimation

We specialize the above model into a two-person framework and propose a similar three-equation system. Let $Y_1$ = early (preschool) test score, $Y_2$ = schooling (years), and $Y_3$ = earnings. It seems plausible to write the equation system
$$Y_1 = \alpha_1 A + U_1$$
$$Y_2 = \alpha_2 A + U_2$$
$$Y_3 = \gamma_{23} Y_2 + \alpha_3 A + U_3$$
where $A$ = ability. Regressing $Y_3$ on $Y_2$ clearly gives biased estimates of $\gamma_{23}$, as $E(Y_2 A) \neq 0$. If $\alpha_3 > 0$ (and ability raises schooling), then OLS estimates of $\gamma_{23}$ are upward biased. One estimation approach is to use $Y_1$ as a proxy for ability:
$$Y_3 = \gamma_{23} Y_2 + \frac{\alpha_3}{\alpha_1} (Y_1 - U_1) + U_3$$
However, this results in a similar problem: regressing $Y_3$ on $Y_1$ and $Y_2$ will give biased estimates, as $Y_1$ is correlated with our residual (i.e., $Y_1$ is an imperfect proxy).

Solutions: One solution is to use the sibling's test score $Y_{1b}$ as an instrument for $Y_{1a}$. Why is this a valid IV? From our construction of the model, we know that the $U$ are uncorrelated across equations and groups. Further, test scores are correlated across siblings. That is, $E(Y_{1a} Y_{1b}) \neq 0$ by our group structure. Another solution is possible if there exists an additional early reading on the same person:
$$Y_0 = \alpha_0 A + U_0$$
Then, if $\alpha_0 \neq 0$, $Y_0$ is a valid instrument for $Y_1$ and we can perform 2SLS.
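The sibling-instrument idea can be illustrated with a simulation. The sketch below assumes a one-factor system in which the test score and schooling load on ability, earnings depend on schooling and ability, and siblings share a family ability component; every parameter value, the normal distributions, the sample size and the seed are hypothetical. It compares OLS of earnings on schooling with an IV estimator that uses own schooling and the sibling's test score as instruments for own schooling and own test score.

```python
import random


def simulate(n=100_000, seed=1):
    rng = random.Random(seed)
    a1, a2, a3, g23 = 1.0, 1.0, 0.5, 0.1   # assumed loadings and return
    # Cross-product accumulators (all variables have mean zero).
    s22 = s21 = sz2 = sz1 = s23 = sz3 = 0.0
    for _ in range(n):
        f = rng.gauss(0, 1)                # family ability component
        abil_a = f + rng.gauss(0, 0.5)     # ability of sibling a
        abil_b = f + rng.gauss(0, 0.5)     # ability of sibling b
        y1a = a1 * abil_a + rng.gauss(0, 1)              # own test score
        z = a1 * abil_b + rng.gauss(0, 1)                # sibling's test score
        y2a = a2 * abil_a + rng.gauss(0, 1)              # own schooling
        y3a = g23 * y2a + a3 * abil_a + rng.gauss(0, 1)  # own earnings
        s22 += y2a * y2a
        s21 += y2a * y1a
        sz2 += z * y2a
        sz1 += z * y1a
        s23 += y2a * y3a
        sz3 += z * y3a
    ols = s23 / s22                        # OLS of earnings on schooling only
    # IV: instruments (y2a, z) for regressors (y2a, y1a); solve Z'X b = Z'y.
    det = s22 * sz1 - s21 * sz2
    g_iv = (s23 * sz1 - s21 * sz3) / det   # coefficient on schooling
    c_iv = (s22 * sz3 - sz2 * s23) / det   # coefficient on the test score
    return ols, g_iv, c_iv


ols, g_iv, c_iv = simulate()
print(f"OLS schooling coefficient : {ols:.3f} (true 0.1)")
print(f"IV schooling coefficient  : {g_iv:.3f} (true 0.1)")
print(f"IV test-score coefficient : {c_iv:.3f} (true ratio 0.5)")
```

In this design OLS overstates the return to schooling, while the IV estimates should be close to the assumed values, up to sampling noise.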

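3.2 Griliches and Chamberlain model

Here we have a modified triangular system as follows:
$$Y_1 = \alpha_1 A + U_1$$
$$Y_2 = \gamma_{12} Y_1 + \alpha_2 A + U_2$$
$$Y_3 = \gamma_{13} Y_1 + \gamma_{23} Y_2 + \alpha_3 A + U_3$$
where $Y_1$ = years of schooling, $Y_2$ = late test score (SAT), and $Y_3$ = earnings. Note that there are alternative models with other dependent variables. For example, {$Y_1$ = schooling, $Y_2$ = early earnings, and $Y_3$ = late earnings}, and {$Y_1$ = schooling, $Y_2$ = consumption, and $Y_3$ = earnings}. Getting the equation system into reduced form and expressing it in matrix notation, we write
$$Y = \delta A + W,$$
where:
$$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}, \qquad \delta = \Gamma^{-1} \alpha, \qquad W = \Gamma^{-1} U$$
and:
$$\Gamma = \begin{pmatrix} 1 & 0 & 0 \\ -\gamma_{12} & 1 & 0 \\ -\gamma_{13} & -\gamma_{23} & 1 \end{pmatrix}, \qquad \alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix}, \qquad U = \begin{pmatrix} U_1 \\ U_2 \\ U_3 \end{pmatrix}$$

Estimation. For estimation, we impose that $\gamma_{23} = 0$. In our second example of Section 3.2, this would be equivalent to stating that there is no correlation between transient income and consumption (permanent income hypothesis). In general, with one factor, we need one more exclusion than that implied by triangularity.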

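(i) $Y_1$ proxies $A$, that is, $A = (Y_1 - U_1)/\alpha_1$, so that
$$Y_2 = \gamma_{12} Y_1 + \frac{\alpha_2}{\alpha_1} (Y_1 - U_1) + U_2 = \left( \gamma_{12} + \frac{\alpha_2}{\alpha_1} \right) Y_1 + \left( U_2 - \frac{\alpha_2}{\alpha_1} U_1 \right)$$
We can then estimate the coefficient $\gamma_{12} + \alpha_2/\alpha_1$ consistently by using the sibling's $Y_1$ as an instrument for $Y_1$ in the equation above.

(ii) Get residuals from (i):
$$Q_2 = Y_2 - \left( \gamma_{12} + \frac{\alpha_2}{\alpha_1} \right) Y_1 = U_2 - \frac{\alpha_2}{\alpha_1} U_1$$

(iii) Use the residuals $Q_2$ as an instrument for $Y_1$ in the $Y_3$ equation. $Q_2$ is valid since it is uncorrelated with both $A$ and $U_3$, and it is correlated with $Y_1$:
$$E(Q_2 Y_1) = E\left[ \left( U_2 - \frac{\alpha_2}{\alpha_1} U_1 \right) (\alpha_1 A + U_1) \right] = -\frac{\alpha_2}{\alpha_1} E(U_1^2) \neq 0$$
if $\alpha_1 \neq 0$ and $\alpha_2 \neq 0$. Thus we can estimate $\gamma_{13}$.

(iv) Interchange the role of $Y_2$ and $Y_3$ to estimate $\gamma_{12}$.

(v) Form the residual (and recall that $\gamma_{13}$ is known and $\gamma_{23} = 0$):
$$Y_3 - \gamma_{13} Y_1 = \alpha_3 A + U_3$$

(vi) Use $Y_1$ as a proxy for ability. Substituting this into (v) gives:
$$Y_3 - \gamma_{13} Y_1 = \frac{\alpha_3}{\alpha_1} Y_1 + \left( U_3 - \frac{\alpha_3}{\alpha_1} U_1 \right)$$

(vii) Now use the sibling's $Y_1$ as an instrument for $Y_1$ in the above to get an estimate of $\alpha_3 / \alpha_1$.

(viii) Interchange the role of $Y_2$ and $Y_3$ to estimate $\alpha_2 / \alpha_1$.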
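Steps (i) to (iii) can be traced in a simulation. The sketch below assumes the three-equation one-factor system with $\gamma_{23} = 0$ and a shared family ability component; all parameter values, the normal draws, the sample size and the seed are hypothetical, and the simple covariance-ratio IV estimators stand in for whatever estimator is actually used in practice.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


def simulate(n=100_000, seed=2):
    rng = random.Random(seed)
    a1, a2, a3 = 1.0, 1.0, 0.5        # assumed ability loadings
    g12, g13 = 0.3, 0.2               # assumed structural effects (g23 = 0)
    y1a, y2a, y3a, y1b = [], [], [], []
    for _ in range(n):
        f = rng.gauss(0, 1)
        abil_a = f + rng.gauss(0, 0.5)
        abil_b = f + rng.gauss(0, 0.5)
        y1 = a1 * abil_a + rng.gauss(0, 1)
        y2 = g12 * y1 + a2 * abil_a + rng.gauss(0, 1)
        y3 = g13 * y1 + a3 * abil_a + rng.gauss(0, 1)
        y1a.append(y1)
        y2a.append(y2)
        y3a.append(y3)
        y1b.append(a1 * abil_b + rng.gauss(0, 1))   # sibling's Y1
    # Step (i): IV of Y2 on Y1 with the sibling's Y1 as instrument.
    ratio = cov(y1b, y2a) / cov(y1b, y1a)
    # Step (ii): residual purged of ability.
    q2 = [y2 - ratio * y1 for y1, y2 in zip(y1a, y2a)]
    # Step (iii): use the residual as an instrument for Y1 in the Y3 equation.
    g13_iv = cov(q2, y3a) / cov(q2, y1a)
    g13_ols = cov(y1a, y3a) / cov(y1a, y1a)
    return ratio, g13_iv, g13_ols


ratio, g13_iv, g13_ols = simulate()
print(f"step (i) coefficient : {ratio:.3f} (assumed g12 + a2/a1 = 1.3)")
print(f"step (iii) IV gamma13: {g13_iv:.3f} (assumed 0.2)")
print(f"OLS of Y3 on Y1      : {g13_ols:.3f}")
```

In this design OLS of earnings on schooling is contaminated by ability, while the residual-based instrument should recover the assumed $\gamma_{13}$ up to sampling noise.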

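3.3 Triangular systems more generally

Without loss of generality, suppose that $Y_2$ is excluded from the $t$-th equation of our system. (We are supposing the existence of one more exclusion than that implied by triangularity.) We seek to estimate the parameters of the system in equation $t$, as well as in the equations before and after equation $t$.

i. Use $Y_1$ as a proxy for ability. Solving for $A$ and substituting into the reduced form of the $k$-th equation:
$$Y_k = \delta_k A + W_k$$
We get:
$$Y_k = \frac{\delta_k}{\delta_1} Y_1 + \left( W_k - \frac{\delta_k}{\delta_1} U_1 \right)$$
and we are considering $k = 2, \dots, t-1$. The ratio $\delta_k / \delta_1$ can then be identified using the sibling's $Y_1$ as an instrument for $Y_1$.

ii. Form the residuals:
$$Q_k = Y_k - \frac{\delta_k}{\delta_1} Y_1, \qquad k = 2, \dots, t-1$$
Now we have $t-2$ IVs ($Q_2, \dots, Q_{t-1}$) for the $t-2$ independent variables in the $t$-th equation ($Y_1, Y_3, \dots, Y_{t-1}$), so we can consistently estimate the coefficients in the $t$-th equation.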

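Equations before t.

iii. Form:
$$Q_t = Y_t - \sum_{k < t} \gamma_{kt} Y_k = \alpha_t A + U_t$$
We can use $Y_1, \dots, Y_{t-1}$ to form $t-1$ purged IVs, and $Q_t$ is used as a proxy for unobserved ability, $A$. In this way, we can estimate all of the parameters in the equations before $t$. (Note the sequential order implicit in this triangular system. We must first estimate equation $t$ before this step can be made.)

Example. Suppose $t > 3$ and
$$Y_3 = \gamma_{13} Y_1 + \gamma_{23} Y_2 + \alpha_3 A + U_3$$
Use $Q_t = \alpha_t A + U_t$ as a proxy for $A$. Substituting this into our $Y_3$ equation yields:
$$Y_3 = \gamma_{13} Y_1 + \gamma_{23} Y_2 + \frac{\alpha_3}{\alpha_t} Q_t + \left( U_3 - \frac{\alpha_3}{\alpha_t} U_t \right)$$
Observe that $Y_1$ and $Y_2$ are independent of our residual, but $Q_t$ is not. We can use a cross-sibling measure (for example, the sibling's $Q_t$) as an instrument for $Q_t$ to estimate the parameters above. This obviously generalizes to all equations before $t$.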

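Equations after t.

iv. Assume identification for all equations through $t$ via an exclusion restriction in equation $t$.

Example. As an example, consider the case $t = 3$ and define:
$$Y_4 = \gamma_{14} Y_1 + \gamma_{24} Y_2 + \gamma_{34} Y_3 + \alpha_4 A + U_4$$
Solving for $Y_1$ and $Y_2$ and substituting into the equation for $Y_4$, we find:
$$Y_4 = \gamma_{34} Y_3 + \phi A + \omega$$
where:
$$\phi = (\gamma_{14} + \gamma_{24} \gamma_{12}) \alpha_1 + \gamma_{24} \alpha_2 + \alpha_4$$
$$\omega = (\gamma_{14} + \gamma_{24} \gamma_{12}) U_1 + \gamma_{24} U_2 + U_4$$
Using $Y_1$ as a proxy for $A$ and substituting, we get:
$$Y_4 = \gamma_{34} Y_3 + \frac{\phi}{\alpha_1} Y_1 + \left( \omega - \frac{\phi}{\alpha_1} U_1 \right)$$
We can then use the sibling's $Y_1$ and $Y_2$, together with $Q_3 = \alpha_3 A + U_3$, as instruments to get an estimate of $\gamma_{34}$. Define:
$$Y_4 - \gamma_{34} Y_3 = \gamma_{14} Y_1 + \gamma_{24} Y_2 + \alpha_4 A + U_4$$
(Excluding $Y_3$ allows us to estimate the remaining parameters.) Using $Q_3$ as a proxy for $A$ yields:
$$Y_4 - \gamma_{34} Y_3 = \gamma_{14} Y_1 + \gamma_{24} Y_2 + \frac{\alpha_4}{\alpha_3} Q_3 + \left( U_4 - \frac{\alpha_4}{\alpha_3} U_3 \right)$$
We can then estimate $\gamma_{14}$ and $\gamma_{24}$ by using $Y_1$, $Y_2$ and the sibling's $Q_3$ as IVs. We can continue estimating. For example, consider the $Y_5$ equation:

(i) Rewrite in terms of $A$ and $Y_4$.
(ii) Use $Y_1$ to proxy $A$.
(iii) Use a cross-member IV for $Y_1$ in addition to the instruments indexed $k = 2, 3, 4$, which gives our estimate of $\gamma_{45}$.
(iv) Now form $Y_5 - \gamma_{45} Y_4 = \gamma_{15} Y_1 + \gamma_{25} Y_2 + \gamma_{35} Y_3 + \alpha_5 A + U_5$.
(v) With $Y_4$ excluded, we can use purged IVs on $Y_5$ as before.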

3.4 Comments

1. One needs to check the rank and order conditions for identification (this requires imposing an exclusion restriction).
2. Griliches and Chamberlain (IER, 1976) find a small ability bias: a third-decimal-point difference in the schooling coefficient.

4 Twin Methods

Basic Principle: Monozygotic or MZ (identical) twins are more similar than dizygotic or DZ (fraternal) twins. The key assumption is that if environmental factors are the same for both types of twins, then we can estimate genetic components to outcomes.

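4.1 Univariate Twin Model

Let $P$ = observed phenotypic variable, $G$ = unobserved genotype, and $E$ = environment. Further, suppose that we can write our model additively:
$$P = G + E$$
and assume independence of $G$ and $E$, so that $\sigma_P^2 = \sigma_G^2 + \sigma_E^2$. Now suppose that we have data on another individual:
$$P' = G' + E'$$
Then our phenotypic covariance is:
$$\mathrm{Cov}(P, P') = \mathrm{Cov}(G, G') + \mathrm{Cov}(E, E')$$
where we are imposing the assumption:
$$\mathrm{Cov}(G, E') = \mathrm{Cov}(G', E) = 0$$
Defining standardized forms and some simplifying notation, let
$$p = \frac{P}{\sigma_P}, \qquad g = \frac{G}{\sigma_G}, \qquad \eta = \frac{E}{\sigma_E}, \qquad h = \frac{\sigma_G}{\sigma_P}, \qquad e = \frac{\sigma_E}{\sigma_P}$$
Thus, $P = G + E$, which implies $p = h g + e \eta$. We can also derive the identity:
$$h^2 + e^2 = \frac{\sigma_G^2 + \sigma_E^2}{\sigma_P^2} = 1$$
where the last step follows from our assumption of independence. Now we wish to consider the correlation between observed phenotypes of our two individuals:
$$r = \mathrm{Corr}(P, P') = E\left[ (h g + e \eta)(h g' + e \eta') \right] = h^2 E(g g') + e^2 E(\eta \eta') = h^2 \rho_g + e^2 \rho_e$$
say, with $h$ and $e$ defined as above (variables are measured as deviations from their means). We assume that $\rho_g^{MZ} = 1$ and that $\rho_g^{DZ} < 1$. That is, the genotypic variable is perfectly correlated among identical twins, but less than perfectly correlated among fraternal twins. Replacing this result into the above produces:
$$r^{MZ} = h^2 + e^2 \rho_e^{MZ}, \qquad r^{DZ} = h^2 \rho_g^{DZ} + e^2 \rho_e^{DZ}$$
Therefore:
$$r^{MZ} - r^{DZ} = (1 - \rho_g^{DZ}) h^2 + (\rho_e^{MZ} - \rho_e^{DZ}) e^2 = (1 - \rho_g^{DZ}) h^2 + (\rho_e^{MZ} - \rho_e^{DZ})(1 - h^2)$$
where the last equality follows from our established identity. Solving for $h^2$, we find:
$$h^2 = \frac{(r^{MZ} - r^{DZ}) - (\rho_e^{MZ} - \rho_e^{DZ})}{(1 - \rho_g^{DZ}) - (\rho_e^{MZ} - \rho_e^{DZ})}$$
The only known on the right-hand side of the above equality is the expression $(r^{MZ} - r^{DZ})$, which is simply the difference in the correlation coefficients of the observed phenotypic variable. The remaining two expressions, $(1 - \rho_g^{DZ})$ and $(\rho_e^{MZ} - \rho_e^{DZ})$, cannot be computed, as they represent statistics on variables we don't observe.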
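The formula for $h^2$ can be wrapped in a small helper to see how sensitive it is to the two unobservable inputs. This is only a sketch: the function name, the parameter names and all numbers below are hypothetical, and setting the environmental-correlation gap to zero corresponds to assuming that MZ and DZ twins share environments to the same degree.

```python
def heritability(r_mz, r_dz, rho_g_dz, env_corr_gap=0.0):
    """h^2 = [(r_MZ - r_DZ) - gap] / [(1 - rho_g_DZ) - gap],
    where gap = rho_e_MZ - rho_e_DZ is the difference in environmental
    correlations between MZ and DZ twins."""
    denom = (1.0 - rho_g_dz) - env_corr_gap
    if denom == 0:
        raise ValueError("h^2 is not identified when the denominator is zero")
    return ((r_mz - r_dz) - env_corr_gap) / denom


# Hypothetical phenotypic correlations (assumptions for illustration).
r_mz, r_dz = 0.7, 0.4

# Assumed DZ genetic correlation of 0.5 and equal environmental correlations.
h2_equal_env = heritability(r_mz, r_dz, rho_g_dz=0.5)

# Same data, but MZ twins assumed to share more environment (gap of 0.1).
h2_gap = heritability(r_mz, r_dz, rho_g_dz=0.5, env_corr_gap=0.1)

print(f"h^2 with equal environmental correlations: {h2_equal_env:.2f}")
print(f"h^2 with an environmental gap of 0.1     : {h2_gap:.2f}")
```

The same observed correlations imply different values of $h^2$ depending on what is assumed about the unobserved genetic and environmental correlations.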

One could impose $\rho_e^{MZ} = \rho_e^{DZ}$, so that:
$$h^2 = \frac{r^{MZ} - r^{DZ}}{1 - \rho_g^{DZ}}$$
where $r^{MZ}$ and $r^{DZ}$ are the phenotypic correlations for MZ and DZ twins. The expression $\rho_g^{DZ}$ is a measure of how closely the genetic variable is correlated across our two observations. One could then guess or estimate a value for this parameter to derive corresponding estimates of $h^2$, the share of the variance in the phenotypic variable that is explained by variance in the genetic component. Other studies have attempted to include $\mathrm{Cov}(G, E) \neq 0$, but this presents an identification problem. A typical value of the estimable portion of the above, $r^{MZ} - r^{DZ}$, is commonly reported in the literature to be


More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Endogeneity b) Instrumental

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model

Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Wooldridge, Introductory Econometrics, 4th ed. Chapter 2: The simple regression model Most of this course will be concerned with use of a regression model: a structure in which one or more explanatory

More information

Regression #3: Properties of OLS Estimator

Regression #3: Properties of OLS Estimator Regression #3: Properties of OLS Estimator Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #3 1 / 20 Introduction In this lecture, we establish some desirable properties associated with

More information

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard

More information

Chapter 14. Simultaneous Equations Models Introduction

Chapter 14. Simultaneous Equations Models Introduction Chapter 14 Simultaneous Equations Models 14.1 Introduction Simultaneous equations models differ from those we have considered in previous chapters because in each model there are two or more dependent

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information

Regression with time series

Regression with time series Regression with time series Class Notes Manuel Arellano February 22, 2018 1 Classical regression model with time series Model and assumptions The basic assumption is E y t x 1,, x T = E y t x t = x tβ

More information

The relationship between treatment parameters within a latent variable framework

The relationship between treatment parameters within a latent variable framework Economics Letters 66 (2000) 33 39 www.elsevier.com/ locate/ econbase The relationship between treatment parameters within a latent variable framework James J. Heckman *,1, Edward J. Vytlacil 2 Department

More information

Chapter 6: Linear Regression With Multiple Regressors

Chapter 6: Linear Regression With Multiple Regressors Chapter 6: Linear Regression With Multiple Regressors 1-1 Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Instrumental Variables, Simultaneous and Systems of Equations

Instrumental Variables, Simultaneous and Systems of Equations Chapter 6 Instrumental Variables, Simultaneous and Systems of Equations 61 Instrumental variables In the linear regression model y i = x iβ + ε i (61) we have been assuming that bf x i and ε i are uncorrelated

More information

Omitted Variable Bias, Coefficient Stability and Other Issues. Chase Potter 6/22/16

Omitted Variable Bias, Coefficient Stability and Other Issues. Chase Potter 6/22/16 Omitted Variable Bias, Coefficient Stability and Other Issues Chase Potter 6/22/16 Roadmap 1. Omitted Variable Bias 2. Coefficient Stability Oster 2015 Paper 3. Poorly Measured Confounders Pischke and

More information

Maximum Likelihood (ML) Estimation

Maximum Likelihood (ML) Estimation Econometrics 2 Fall 2004 Maximum Likelihood (ML) Estimation Heino Bohn Nielsen 1of32 Outline of the Lecture (1) Introduction. (2) ML estimation defined. (3) ExampleI:Binomialtrials. (4) Example II: Linear

More information

Single-Equation GMM: Endogeneity Bias

Single-Equation GMM: Endogeneity Bias Single-Equation GMM: Lecture for Economics 241B Douglas G. Steigerwald UC Santa Barbara January 2012 Initial Question Initial Question How valuable is investment in college education? economics - measure

More information

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM

Business Economics BUSINESS ECONOMICS. PAPER No. : 8, FUNDAMENTALS OF ECONOMETRICS MODULE No. : 3, GAUSS MARKOV THEOREM Subject Business Economics Paper No and Title Module No and Title Module Tag 8, Fundamentals of Econometrics 3, The gauss Markov theorem BSE_P8_M3 1 TABLE OF CONTENTS 1. INTRODUCTION 2. ASSUMPTIONS OF

More information

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1

PANEL DATA RANDOM AND FIXED EFFECTS MODEL. Professor Menelaos Karanasos. December Panel Data (Institute) PANEL DATA December / 1 PANEL DATA RANDOM AND FIXED EFFECTS MODEL Professor Menelaos Karanasos December 2011 PANEL DATA Notation y it is the value of the dependent variable for cross-section unit i at time t where i = 1,...,

More information

Multiple Equation GMM with Common Coefficients: Panel Data

Multiple Equation GMM with Common Coefficients: Panel Data Multiple Equation GMM with Common Coefficients: Panel Data Eric Zivot Winter 2013 Multi-equation GMM with common coefficients Example (panel wage equation) 69 = + 69 + + 69 + 1 80 = + 80 + + 80 + 2 Note:

More information

EC402 - Problem Set 3

EC402 - Problem Set 3 EC402 - Problem Set 3 Konrad Burchardi 11th of February 2009 Introduction Today we will - briefly talk about the Conditional Expectation Function and - lengthily talk about Fixed Effects: How do we calculate

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force

More information

Missing dependent variables in panel data models

Missing dependent variables in panel data models Missing dependent variables in panel data models Jason Abrevaya Abstract This paper considers estimation of a fixed-effects model in which the dependent variable may be missing. For cross-sectional units

More information

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous Econometrics of causal inference Throughout, we consider the simplest case of a linear outcome equation, and homogeneous effects: y = βx + ɛ (1) where y is some outcome, x is an explanatory variable, and

More information

Econometric Analysis of Cross Section and Panel Data

Econometric Analysis of Cross Section and Panel Data Econometric Analysis of Cross Section and Panel Data Jeffrey M. Wooldridge / The MIT Press Cambridge, Massachusetts London, England Contents Preface Acknowledgments xvii xxiii I INTRODUCTION AND BACKGROUND

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 6: Bias and variance (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 49 Our plan today We saw in last lecture that model scoring methods seem to be trading off two different

More information

Our point of departure, as in Chapter 2, will once more be the outcome equation:

Our point of departure, as in Chapter 2, will once more be the outcome equation: Chapter 4 Instrumental variables I 4.1 Selection on unobservables Our point of departure, as in Chapter 2, will once more be the outcome equation: Y Dβ + Xα + U, 4.1 where treatment intensity will once

More information

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 Logistic regression: Why we often can do what we think we can do Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015 1 Introduction Introduction - In 2010 Carina Mood published an overview article

More information

1 Correlation between an independent variable and the error

1 Correlation between an independent variable and the error Chapter 7 outline, Econometrics Instrumental variables and model estimation 1 Correlation between an independent variable and the error Recall that one of the assumptions that we make when proving the

More information

Rockefeller College University at Albany

Rockefeller College University at Albany Rockefeller College University at Albany PAD 705 Handout: Suggested Review Problems from Pindyck & Rubinfeld Original prepared by Professor Suzanne Cooper John F. Kennedy School of Government, Harvard

More information

Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis. 18th Class 7/2/10

Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis. 18th Class 7/2/10 Econ 300/QAC 201: Quantitative Methods in Economics/Applied Data Analysis 18th Class 7/2/10 Out of the air a voice without a face Proved by statistics that some cause was just In tones as dry and level

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY Time Series Analysis James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY & Contents PREFACE xiii 1 1.1. 1.2. Difference Equations First-Order Difference Equations 1 /?th-order Difference

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

More information

AGEC 661 Note Fourteen

AGEC 661 Note Fourteen AGEC 661 Note Fourteen Ximing Wu 1 Selection bias 1.1 Heckman s two-step model Consider the model in Heckman (1979) Y i = X iβ + ε i, D i = I {Z iγ + η i > 0}. For a random sample from the population,

More information

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation

IV Estimation WS 2014/15 SS Alexander Spermann. IV Estimation SS 2010 WS 2014/15 Alexander Spermann Evaluation With Non-Experimental Approaches Selection on Unobservables Natural Experiment (exogenous variation in a variable) DiD Example: Card/Krueger (1994) Minimum

More information

MS&E 226: Small Data. Lecture 6: Bias and variance (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 6: Bias and variance (v2) Ramesh Johari MS&E 226: Small Data Lecture 6: Bias and variance (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 47 Our plan today We saw in last lecture that model scoring methods seem to be trading o two di erent

More information

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Econometrics Week 8 Institute of Economic Studies Faculty of Social Sciences Charles University in Prague Fall 2012 1 / 25 Recommended Reading For the today Instrumental Variables Estimation and Two Stage

More information

We begin by thinking about population relationships.

We begin by thinking about population relationships. Conditional Expectation Function (CEF) We begin by thinking about population relationships. CEF Decomposition Theorem: Given some outcome Y i and some covariates X i there is always a decomposition where

More information

Regression Analysis Tutorial 34 LECTURE / DISCUSSION. Statistical Properties of OLS

Regression Analysis Tutorial 34 LECTURE / DISCUSSION. Statistical Properties of OLS Regression Analysis Tutorial 34 LETURE / DISUSSION Statistical Properties of OLS Regression Analysis Tutorial 35 Statistical Properties of OLS y = " + $x + g dependent included omitted variable explanatory

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including

More information

Final Exam Details. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, / 24

Final Exam Details. J. Parman (UC-Davis) Analysis of Economic Data, Winter 2011 March 8, / 24 Final Exam Details The final is Thursday, March 17 from 10:30am to 12:30pm in the regular lecture room The final is cumulative (multiple choice will be a roughly 50/50 split between material since the

More information

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model

More information

Econometric Methods and Applications II Chapter 2: Simultaneous equations. Econometric Methods and Applications II, Chapter 2, Slide 1

Econometric Methods and Applications II Chapter 2: Simultaneous equations. Econometric Methods and Applications II, Chapter 2, Slide 1 Econometric Methods and Applications II Chapter 2: Simultaneous equations Econometric Methods and Applications II, Chapter 2, Slide 1 2.1 Introduction An example motivating the problem of simultaneous

More information

Using regression to study economic relationships is called econometrics. econo = of or pertaining to the economy. metrics = measurement

Using regression to study economic relationships is called econometrics. econo = of or pertaining to the economy. metrics = measurement EconS 450 Forecasting part 3 Forecasting with Regression Using regression to study economic relationships is called econometrics econo = of or pertaining to the economy metrics = measurement Econometrics

More information

Instrumental Variables

Instrumental Variables Instrumental Variables Department of Economics University of Wisconsin-Madison September 27, 2016 Treatment Effects Throughout the course we will focus on the Treatment Effect Model For now take that to

More information

Experiments and Quasi-Experiments

Experiments and Quasi-Experiments Experiments and Quasi-Experiments (SW Chapter 13) Outline 1. Potential Outcomes, Causal Effects, and Idealized Experiments 2. Threats to Validity of Experiments 3. Application: The Tennessee STAR Experiment

More information

1 Appendix A: Matrix Algebra

1 Appendix A: Matrix Algebra Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix

More information

Econometrics Master in Business and Quantitative Methods

Econometrics Master in Business and Quantitative Methods Econometrics Master in Business and Quantitative Methods Helena Veiga Universidad Carlos III de Madrid Models with discrete dependent variables and applications of panel data methods in all fields of economics

More information