Part VII. Accounting for the Endogeneity of Schooling

Outline: Endogeneity of schooling; Mean growth rate of earnings; Mean growth rate; Selection bias; Summary


Much of the CPS-Census literature on the returns to schooling ignores the choice of schooling and its consequences for estimating the rate of return. It ignores uncertainty. It is static and ignores the dynamics of schooling choices and the sequential revelation of uncertainty. It also ignores ability bias.

Economists since C. Reinhold Noyes (1945) in his comment on Friedman and Kuznets (1945) have raised the specter of ability bias, noting that the estimated return to schooling may largely be a return to ability that would arise independently of schooling. Griliches (1977) and Willis (1986) summarize estimates from the conventional literature on ability bias. For the past 30 years, labor economists have been in pursuit of good instruments to estimate the rate of return to schooling, usually interpreted as a Mincer coefficient.

However, the previous sections show that, for many reasons, the Mincer coefficient is not informative on the true rate of return to schooling, and therefore is not the appropriate theoretical construct to gauge educational policy. Card (1999) is a useful reference for empirical estimates from instrumental variable models.

Even abstracting from the issues raised by the sequential updating of information, and the distinction between ex ante and ex post returns to schooling, which we discuss further below, there is the additional issue that returns, however defined, vary among persons. A random coefficients model of the economic return to schooling has been an integral part of the human capital literature since the papers by Becker and Chiswick (1966), Chiswick (1974), Chiswick and Mincer (1972) and Mincer (1974).

In its most stripped-down form and ignoring work experience terms, the Mincer model writes log earnings for person $i$ with schooling level $S_i$ as

$$\ln y_i = \alpha_i + \rho_i S_i, \tag{14}$$

where the rate of return $\rho_i$ varies among persons, as does the intercept $\alpha_i$. For the purposes of this discussion, think of $y_i$ as an annualized flow of lifetime earnings. Unless the only costs of schooling are earnings foregone, and markets are perfect, $\rho_i$ is a percentage growth rate in earnings with schooling and not a rate of return to schooling.

Let $\alpha_i = \bar{\alpha} + \varepsilon_{\alpha i}$ and $\rho_i = \bar{\rho} + \varepsilon_{\rho i}$, where $\bar{\alpha}$ and $\bar{\rho}$ are the means of $\alpha_i$ and $\rho_i$. Thus the means of $\varepsilon_{\alpha i}$ and $\varepsilon_{\rho i}$ are zero. Earnings equation (14) can be written as

$$\ln y_i = \bar{\alpha} + \bar{\rho} S_i + \{\varepsilon_{\alpha i} + \varepsilon_{\rho i} S_i\}. \tag{15}$$

Equations (14) and (15) are the basis for a human capital analysis of wage inequality in which the variance of log earnings is decomposed into components due to the variance in $S_i$ and components due to the variation in the growth rate of earnings with schooling (the variance in $\rho_i$), the mean growth rate across regions or time ($\bar{\rho}$), and mean schooling levels ($\bar{S}$). See, e.g., Mincer (1974) and Willis (1986).

Given that the growth rate $\rho_i$ is a random variable, it has a distribution that can be studied using the methods surveyed below. Following the representative agent tradition in economics, it has become conventional to summarize the distribution of growth rates by the mean, although many other summary measures of the distribution are possible. For the prototypical distribution of $\rho_i$, the conventional measure is the average growth rate $E(\rho_i)$ or $E(\rho_i \mid X)$, where the latter conditions on $X$, the observed characteristics of individuals. Other means are possible, such as the mean growth rates for persons who attain a given level of schooling.

The original Mincer model assumed that the growth rate of earnings with schooling, $\rho_i$, is uncorrelated with or is independent of $S_i$. This assumption is convenient but is not implied by economic theory. It is plausible that the growth rate of earnings with schooling declines with the level of schooling.

It is also plausible that there are unmeasured ability or motivational factors that affect the growth rate of earnings with schooling and are also correlated with the level of schooling. Rosen (1977) discusses this problem in some detail within the context of hedonic models of schooling and earnings. A similar problem arises in analyses of the impact of unionism on relative wages and is discussed in Lewis (1963).

Allowing for correlated random coefficients (so $S_i$ is correlated with $\varepsilon_{\rho i}$) raises substantial problems that are just beginning to be addressed in a systematic fashion in the recent literature. Here, we discuss recent developments starting with Card's (1999) random coefficient model of the growth rate of earnings with schooling, a model that is derived from economic theory and is based on the analysis of Becker's model by Rosen (1977). We consider conditions under which it is possible to estimate the mean effect of schooling and the distribution of returns in his model. The next section considers the more general and recent analysis of Carneiro, Heckman, and Vytlacil (2005).

In Card's (1999, 2001) model, the preferences of a person over income ($y$) and schooling ($S$) are

$$U(y, S) = \ln y(S) - \phi(S), \qquad \phi'(S) > 0, \quad \phi''(S) > 0.$$

The schooling-earnings relationship is $y = g(S)$. This is a hedonic model of schooling, where $g(S)$ reveals how schooling is priced out in the labor market.

This specification is written in terms of annualized earnings and abstracts from work experience. It assumes perfect certainty and abstracts from the sequential resolution of uncertainty that is central to the modern literature. In this formulation, discounting of future earnings is kept implicit. The first order condition for optimal determination of schooling is

$$\frac{g'(S)}{g(S)} = \phi'(S). \tag{16}$$

The term $g'(S)/g(S)$ is the percentage change of earnings with schooling, or the growth rate at level $S$. Card's model reproduces Rosen's (1977) model if $r$ is the common interest rate at which agents can freely lend or borrow and if the only costs are $S$ years of foregone earnings. In Rosen's setup, an agent with an infinite lifetime maximizes

$$\frac{1}{r} e^{-rS} g(S),$$

so $\phi(S) = rS + \ln r$, and

$$\frac{g'(S)}{g(S)} = r.$$
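A quick numerical check of Rosen's special case can make the first order condition concrete. The sketch below assumes a hypothetical earnings function $g(S) = \exp(aS - \tfrac{1}{2}bS^2)$ and illustrative values for `r`, `a`, and `b`; none of these come from the text. It grid-searches the schooling level that maximizes $\frac{1}{r}e^{-rS}g(S)$ and compares the growth rate $g'(S)/g(S) = a - bS$ at that point with the interest rate.

```python
import numpy as np

def rosen_objective(S, r, a, b):
    """Present value (1/r) * exp(-r*S) * g(S), with the ASSUMED
    earnings function g(S) = exp(a*S - 0.5*b*S**2)."""
    g = np.exp(a * S - 0.5 * b * S**2)
    return np.exp(-r * S) * g / r

def argmax_on_grid(r, a, b, s_max=40.0, n=40001):
    """Return the grid point in [0, s_max] that maximizes the objective."""
    grid = np.linspace(0.0, s_max, n)
    return grid[np.argmax(rosen_objective(grid, r, a, b))]

# Illustrative (hypothetical) parameter values.
r, a, b = 0.05, 0.15, 0.005

S_star = argmax_on_grid(r, a, b)
# For the assumed g, the growth rate g'(S)/g(S) equals a - b*S.
growth_rate_at_optimum = a - b * S_star

print(f"optimal schooling on the grid: {S_star:.3f}")
print(f"growth rate at the optimum:    {growth_rate_at_optimum:.4f} (r = {r})")
```

Under the assumed functional form the analytical optimum is $(a - r)/b$, so the grid solution should sit at that value up to the grid spacing.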

Linearizing the model, we obtain

$$\frac{g'(S_i)}{g(S_i)} = \beta_i(S_i) = \rho_i - k_1 S_i, \quad k_1 \geq 0,$$

$$\phi'(S_i) = \delta_i(S_i) = r_i + k_2 S_i, \quad k_2 \geq 0.$$

Substituting these expressions into the first order condition (16), we obtain that the optimal level of schooling is

$$S_i = \frac{\rho_i - r_i}{k}, \quad \text{where } k = k_1 + k_2.$$

Observe that if both the growth rate and the returns are independent of $S_i$ ($k_1 = 0$, $k_2 = 0$), then $k = 0$, and if $\rho_i = r_i$, there is no determinate level of schooling at the individual level. This is the original Mincer (1958) model.
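The linearized solution is simple enough to sketch directly. The function names and the numerical values below are illustrative assumptions, not part of the text; the code computes $S_i = (\rho_i - r_i)/k$ and checks that the marginal growth rate $\rho_i - k_1 S_i$ equals the marginal cost $r_i + k_2 S_i$ at that point.

```python
def optimal_schooling(rho, r, k1, k2):
    """Optimal schooling S = (rho - r) / k with k = k1 + k2.

    Raises ValueError when k = 0, the case in which schooling is
    either a corner solution or indeterminate.
    """
    k = k1 + k2
    if k <= 0:
        raise ValueError("k = k1 + k2 must be positive for an interior optimum")
    return (rho - r) / k

def foc_gap(S, rho, r, k1, k2):
    """Marginal growth rate minus marginal cost; zero at the optimum."""
    return (rho - k1 * S) - (r + k2 * S)

# Hypothetical person: growth-rate intercept 12%, cost of funds 4%.
S_opt = optimal_schooling(rho=0.12, r=0.04, k1=0.002, k2=0.003)
print(S_opt, foc_gap(S_opt, 0.12, 0.04, 0.002, 0.003))
```

Raising `r` or lowering `rho` reduces the computed schooling level, which matches the comparative statics of the linearized model.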

One source of heterogeneity among persons in the model is $\rho_i$, the way $S_i$ is transformed into earnings. School quality may operate through $\rho_i$, for example, as in Behrman and Birdsall (1983), and $\rho_i$ may also differ due to inherent ability differences. A second source of heterogeneity is $r_i$, the opportunity cost (cost of schooling) or cost of funds. Higher ability leads to higher levels of schooling. Higher costs of schooling result in lower levels of schooling.

We integrate the first order condition (16) to obtain the following hedonic model of earnings,

$$\ln y_i = \alpha_i + \rho_i S_i - \frac{1}{2} k_1 S_i^2. \tag{17}$$

To achieve the familiar-looking Mincer equation, assume $k_1 = 0$. This assumption rules out diminishing returns to schooling in terms of years of schooling.

Even under this assumption, $\rho_i$ is the percentage growth rate in earnings with schooling, but is not in general an internal rate of return to schooling. It would be a rate of return if there were no direct costs of schooling and everyone faced a constant borrowing rate. This is a version of the Mincer (1958) model, where $k_2 = 0$, and $r_i$ is constant for everyone but not necessarily the same constant.

If $\rho_i > r_i$, person $i$ takes the maximum amount of schooling. If $\rho_i < r_i$, person $i$ takes no schooling, and if $\rho_i = r_i$, schooling is indeterminate. In the Card model, $\rho_i$ is the person-specific growth rate of earnings and overstates the true rate of return if there are direct and psychic costs of schooling.

This simple model is useful in showing the sources of endogeneity in the schooling-earnings model. Since schooling depends on $\rho_i$ and $r_i$, any covariance between $\rho_i - r_i$ (in the schooling equation) and $\rho_i$ (in the earnings function) produces a random coefficient model. Least squares will not estimate the mean growth rate of earnings with schooling unless $\operatorname{Cov}(\rho_i, \rho_i - r_i) = 0$.
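A small simulation can illustrate the least squares problem. The design below is an assumption made for illustration only: $\rho_i$, $r_i$, and $\alpha_i$ are drawn as independent normals with hypothetical means and variances, $k_1 = 0$, and $k = 0.005$. Because $\operatorname{Cov}(\rho_i, \rho_i - r_i) = \operatorname{Var}(\rho_i) > 0$ in this design, the OLS slope of log earnings on schooling should exceed $\bar{\rho}$.

```python
import numpy as np

def simulate(n=200_000, seed=0, rho_bar=0.10, r_bar=0.04, k=0.005,
             sd_rho=0.01, sd_r=0.01, sd_alpha=0.3):
    """Draw (log earnings, schooling) from the linearized model with k1 = 0.
    All distributions and parameter values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    rho = rho_bar + sd_rho * rng.standard_normal(n)    # growth rates
    r = r_bar + sd_r * rng.standard_normal(n)          # cost of funds
    alpha = 1.0 + sd_alpha * rng.standard_normal(n)    # raw ability
    S = (rho - r) / k                                  # schooling choice
    log_y = alpha + rho * S                            # earnings equation
    return log_y, S

def ols_slope(y, x):
    """Slope from a least squares regression of y on x with an intercept."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

log_y, S = simulate()
ols = ols_slope(log_y, S)

# Probability limit of the OLS slope under THIS assumed design
# (independent normal rho and r): rho_bar + S_bar * k * var_rho / (var_rho + var_r).
S_bar = (0.10 - 0.04) / 0.005
plim_ols = 0.10 + S_bar * 0.005 * 0.01**2 / (0.01**2 + 0.01**2)

print(f"mean growth rate: 0.100, OLS slope: {ols:.4f}, plim: {plim_ols:.4f}")
```

The gap between the OLS slope and $\bar{\rho}$ comes entirely from sorting on $\rho_i$ here, since $\alpha_i$ is drawn independently of schooling.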

Dropping the $i$ subscripts, the conditional expectation of log earnings given $s$ is

$$E(\ln y \mid S = s) = E(\alpha \mid S = s) + E(\rho \mid S = s)\, s.$$

The first term produces the conventional ability bias if there is any dependence between $s$ and raw ability $\alpha$. Raw ability is the contribution to earnings independent of the schooling level attained. The second term arises from sorting on returns to schooling that occurs when people make schooling decisions on the basis of growth rates of earnings with schooling. It is an effect that depends on the level of schooling attained.

In his Woytinsky Lecture (1967), Becker points out the possibility that many able people may not attend school if ability ($\rho_i$) is positively correlated with the cost of funds ($r_i$). A meritocratic society would eliminate this positive correlation and might aim to make it negative. Schooling is positively correlated with the growth rate ($\rho_i$) if $\operatorname{Cov}(\rho_i, \rho_i - r_i) > 0$. If the costs of schooling are sufficiently positively correlated with the growth rate, then schooling is negatively correlated with the growth rate.

Observe that $S_i$ does not directly depend on the random intercept $\alpha_i$. Of course, $\alpha_i$ may be statistically dependent on $(\rho_i, r_i)$. In the context of Card's model, we consider conditions under which one can identify $\bar{\rho}$, the mean growth rate of earnings in the population, as well as the full distribution of $\rho_i$. First we consider the case where the marginal cost of funds, $r_i$, is observed, and consider other cases in the following subsections.

Estimating the mean growth rate of earnings when $r_i$ is observed

A huge industry surveyed in Card (1999) seeks to estimate the mean growth rate in earnings, $E(\rho_i)$, calling it the causal effect of schooling. For reasons discussed earlier in this chapter, in general, it is not an internal rate of return. However, it is one of the ingredients used in calculating the rate of return, as we develop further below. The causal effect may also be of interest in its own right if the goal is to estimate pricing equations for labor market characteristics. We discuss some simple approaches for identifying causal effects before turning to a more systematic analysis below.

Suppose that the cost of schooling, $r_i$, is measured by the economist. Use the notation $\perp\!\!\!\perp$ to denote statistical independence. Assume

$$r_i \perp\!\!\!\perp (\rho_i, \alpha_i).$$

This assumption rules out any relationship between the cost of funds ($r_i$) and either raw ability ($\alpha_i$) or the growth rate of earnings with schooling ($\rho_i$).

For example, it rules out fellowships based on ability. We make this assumption to illustrate some ideas and not because of its realism. Observing $r_i$ implies that we observe $\rho_i$ up to an additive constant. Recall that

$$S_i = \frac{\rho_i - r_i}{k},$$

so that $\rho_i = r_i + k S_i$ and

$$\bar{\rho} = E(\rho_i) = \bar{r} + k E(S_i).$$

$r_i$ is a valid instrument for $S_i$ under the assumption that $k_1 = 0$. It is independent of $\alpha_i$, $\rho_i$ (and hence $\varepsilon_{\alpha i}$, $\varepsilon_{\rho i}$) and is correlated with $S_i$ because $S_i$ depends on $r_i$.

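Form

$$\frac{\operatorname{Cov}(\ln y_i, r_i)}{\operatorname{Cov}(S_i, r_i)} = \frac{E\left\{(r_i - \bar{r})\left[(\alpha_i - \bar{\alpha}) + (\rho_i - \bar{\rho})(S_i - \bar{S}) + \bar{\rho} S_i + \rho_i \bar{S} - \bar{\rho}\bar{S}\right]\right\}}{E\left\{\left[\dfrac{\rho_i - r_i}{k}\right]\left[r_i - \bar{r}\right]\right\}} = \frac{\dfrac{1}{k} E\left[(\Delta r)(\Delta\rho)(\Delta\rho - \Delta r)\right] - \dfrac{\bar{\rho}}{k}\sigma_r^2}{-\dfrac{\sigma_r^2}{k}},$$

where $\Delta X = X - E(X)$.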

As a consequence of the assumed independence between $r_i$ and $(\rho_i, \alpha_i)$,

$$E[(\Delta r)(\Delta\rho)^2] = 0 \quad \text{and} \quad E[(\Delta r)^2 \Delta\rho] = 0,$$

so

$$\frac{\operatorname{Cov}(\ln y_i, r_i)}{\operatorname{Cov}(S_i, r_i)} = \bar{\rho}.$$
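The covariance ratio can be checked in a simulated sample. The design below is a sketch under assumed distributions and hypothetical parameter values: normal draws, $k_1 = 0$, $k = 0.005$, and a parameter `lam` (introduced here for illustration) that makes $r_i$ depend on $\rho_i$ when it is nonzero. With `lam = 0` the independence assumption holds and the ratio should be close to $\bar{\rho}$; with `lam = 0.5` it should not.

```python
import numpy as np

def simulate(n=200_000, seed=1, lam=0.0, rho_bar=0.10, r_bar=0.04, k=0.005,
             sd_rho=0.01, sd_e=0.01, sd_alpha=0.3):
    """Linearized model with k1 = 0 and an observed cost of funds r_i.
    lam = 0 gives r independent of (rho, alpha); lam != 0 breaks it.
    Every distribution and value here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    d_rho = sd_rho * rng.standard_normal(n)
    d_r = lam * d_rho + sd_e * rng.standard_normal(n)
    alpha = 1.0 + sd_alpha * rng.standard_normal(n)
    rho = rho_bar + d_rho
    r = r_bar + d_r
    S = (rho - r) / k
    log_y = alpha + rho * S
    return log_y, S, r

def iv_ratio(y, x, z):
    """Sample analogue of Cov(y, z) / Cov(x, z)."""
    return np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]

iv_independent = iv_ratio(*simulate(lam=0.0))
iv_dependent = iv_ratio(*simulate(lam=0.5))

print(f"r independent of rho: IV ratio = {iv_independent:.4f} (rho_bar = 0.10)")
print(f"r correlated with rho: IV ratio = {iv_dependent:.4f}")
```

In the dependent case the ratio moves away from $\bar{\rho}$ even though the draws are jointly normal, because $\operatorname{Cov}(r_i, \rho_i) \neq 0$ enters both the numerator and the denominator.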

Observe that $\bar{\rho}$ is not identified by this argument if $r_i \not\perp\!\!\!\perp \rho_i$ (so the mean growth rate of earnings depends on the cost of schooling). In that case, $E[(\Delta r)(\Delta\rho)^2] \neq 0$ and $E[(\Delta r)^2 (\Delta\rho)] \neq 0$. If $r_i$ is known and

$$r_i = L_i \gamma + M_i,$$

where the $L_i$ are observed variables that explain $r_i$ and $E(M_i \mid L_i) = 0$, then $\gamma$ is identified, provided a rank condition for instrumental variables is satisfied. We require that $L_i$ be at least mean independent of $(M_i, \rho_i, \alpha_i)$. From the schooling equation we can write

$$S_i = \frac{\rho_i - L_i \gamma - M_i}{k},$$

and $k$ is identified since we know $\gamma$.

Observe that we can estimate the distribution of $\rho_i$ since $\rho_i = r_i + k S_i$, $k$ is identified, and $(r_i, S_i)$ are known. This is true even if there are no instruments $L$ ($\gamma = 0$), provided that $r_i \perp\!\!\!\perp (\rho_i, \alpha_i)$. With instruments that satisfy at least the mean independence condition, we can allow $r_i \not\perp\!\!\!\perp \rho_i$, and all parameters and distributions are still identified. The model is fully identified provided $r_i$ is observed and $L_i \perp\!\!\!\perp (M_i, \rho_i, \alpha_i)$. Thus, we can identify the mean return to schooling.
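The identification argument with an observed $r_i$ and an instrument $L_i$ can be traced step by step in simulated data. Everything in the sketch below is an assumption chosen for illustration: a scalar $L_i$, normal draws, the values of $k$, $\gamma_0$, $\gamma_1$, and an $M_i$ that is deliberately correlated with $\rho_i$. The regression of $r_i$ on $L_i$ recovers $\gamma_1$, the regression of $S_i$ on $L_i$ recovers $-\gamma_1/k$, their ratio gives $k$, and $\rho_i = r_i + k S_i$ then recovers each person's growth rate.

```python
import numpy as np

def slope(y, x):
    """Least squares slope of y on x (with an intercept)."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

rng = np.random.default_rng(2)
n = 200_000
k, gamma0, gamma1 = 0.005, 0.04, 0.01          # hypothetical true values

L = rng.standard_normal(n)                      # observed cost shifter
rho = 0.10 + 0.01 * rng.standard_normal(n)      # person-specific growth rate
# M is correlated with rho, so r is NOT independent of rho in this design.
M = 0.5 * (rho - 0.10) + 0.005 * rng.standard_normal(n)
r = gamma0 + gamma1 * L + M                     # observed cost of funds
S = (rho - r) / k                               # schooling choice

gamma1_hat = slope(r, L)          # identifies gamma1
pi_hat = slope(S, L)              # identifies -gamma1 / k
k_hat = -gamma1_hat / pi_hat      # hence k
rho_hat = r + k_hat * S           # person-by-person growth rates

print(f"k_hat = {k_hat:.5f} (true {k})")
print(f"mean and sd of recovered rho: {rho_hat.mean():.4f}, {rho_hat.std():.4f}")
```

Because `rho_hat` is computed person by person, any feature of the distribution (quantiles, variance, conditional means) can be read off it, not only the mean.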

Estimating the mean growth rate when $r_i$ is not observed

If $r_i$ is not observed and so cannot be used as an instrument, but we know that $r_i$ depends on observed factors $L_i$ and an unobserved component $M_i$,

$$r_i = L_i \gamma + M_i \quad \text{and} \quad L_i \perp\!\!\!\perp (M_i, \alpha_i, \rho_i),$$

then our analysis carries over and the mean growth rate $\bar{\rho}$ is identified. Recall that

$$\ln y_i = \alpha_i + \bar{\rho} S_i + (\rho_i - \bar{\rho}) S_i.$$

Substitute for $S_i$ to get an expression for $\ln y_i$ in terms of $L_i$,

$$\ln y_i = \alpha_i + \rho_i \, \frac{\rho_i - L_i \gamma - M_i}{k}.$$

We obtain the vector moment equations

$$\operatorname{Cov}(\ln y_i, L_i) = \bar{\rho} \operatorname{Cov}(S_i, L_i),$$

so $\bar{\rho}$ is identified from the population moments because the covariances on both sides are available. Partition $\gamma = (\gamma_0, \gamma_1)$, where $\gamma_0$ is the intercept and $\gamma_1$ is the vector of slope coefficients. From the schooling equation, we obtain

$$S_i = \frac{\rho_i - L_i \gamma_1 - M_i - \gamma_0}{k} = -L_i \frac{\gamma_1}{k} + \frac{\rho_i - M_i}{k} - \frac{\gamma_0}{k}.$$

We can identify $\gamma_1 / k$ from the schooling equation, as well as the mean growth rate $\bar{\rho}$. However, we cannot identify the distribution of $\rho_i$ or $r_i$ unless further assumptions are invoked. We also cannot separately identify $\gamma_0$, $\gamma_1$, or $k$. Heckman and Vytlacil (1998) show how to define and identify a version of treatment on the treated for growth rates in the Becker-Card-Rosen model.
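When $r_i$ is unobserved, the same moment argument can be sketched with $L_i$ alone. The simulation below is an illustration under assumed normal draws and hypothetical parameter values, with a scalar $L_i$ that is independent of $(M_i, \alpha_i, \rho_i)$ while $M_i$ is allowed to depend on $\rho_i$. Only $\ln y_i$, $S_i$, and $L_i$ are used by the estimators.

```python
import numpy as np

def simulate(n=200_000, seed=3):
    """Generate (log_y, S, L); r_i and rho_i are NOT returned, mimicking
    the case where the econometrician does not observe them.
    All distributions and values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    k, gamma0, gamma1, rho_bar = 0.005, 0.04, 0.01, 0.10
    L = rng.standard_normal(n)                       # instrument
    d_rho = 0.01 * rng.standard_normal(n)
    M = 0.5 * d_rho + 0.005 * rng.standard_normal(n) # unobserved cost shock
    alpha = 1.0 + 0.3 * rng.standard_normal(n)       # raw ability
    rho = rho_bar + d_rho
    S = (rho - gamma0 - gamma1 * L - M) / k          # schooling equation
    log_y = alpha + rho * S                          # earnings with k1 = 0
    return log_y, S, L

def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return np.cov(a, b)[0, 1]

log_y, S, L = simulate()

rho_bar_iv = cov(log_y, L) / cov(S, L)   # Cov(ln y, L) / Cov(S, L)
gamma1_over_k = -cov(S, L) / cov(L, L)   # minus the slope of S on L

print(f"IV estimate of the mean growth rate: {rho_bar_iv:.4f} (true 0.10)")
print(f"estimate of gamma1 / k:              {gamma1_over_k:.3f} (true 2.0)")
```

The schooling regression delivers only the ratio $\gamma_1/k$, so without $r_i$ there is no way in this sketch to rebuild $\rho_i = r_i + k S_i$ person by person.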

Adding selection bias

Selection bias can arise in two distinct ways in the Becker-Card-Rosen model: through dependence between $\alpha_i$ and $\rho_i$ and through dependence between $\alpha_i$ and $r_i$. Allowing for selection bias,

$$E(\ln y_i \mid S_i) = E(\alpha_i \mid S_i) + E(\rho_i S_i \mid S_i) = E(\alpha_i \mid S_i) + E(\rho_i \mid S_i) S_i.$$

If there is an $L_i$ that affects $r_i$ but not $\rho_i$ and is independent of $(\alpha_i, M_i)$, i.e., $L_i \perp\!\!\!\perp (\alpha_i, \rho_i, M_i)$, and $E(r_i \mid L_i)$ is a nontrivial function of $L_i$, then in the special case of a linear schooling model,

$$E(\ln y_i \mid L_i) = E(\alpha_i \mid L_i) + E(\rho_i S_i \mid L_i) = \eta + \bar{\rho} E(S_i \mid L_i),$$

where $\eta$ is a constant that does not depend on $L_i$.

Since we can identify $E(S_i \mid L_i)$, we can identify $\bar{\rho}$. Thus, under the stated conditions, the instrumental variable (IV) method identifies $\bar{\rho}$ when there is selection bias. In a more general nonparametric case for the schooling equation, which we develop in the next section of this chapter, this argument breaks down and $\bar{\rho}$ is not identified when $\rho_i$ determines $S_i$ in a general way.

The sensitivity of the IV method to assumptions about special features of Card's model is a simple demonstration of the fragility of the method. We return to this model later and use it to motivate recent developments in the literature on identifying information available to agents when they make their schooling decisions.

Summary

Card's version of the Becker (1967)-Rosen (1977) model is a useful introduction to the modern literature on heterogeneous returns to schooling. $\rho_i$ is, in general, a person-specific growth rate of log earnings with schooling and not a rate of return. There is a distribution of $\rho_i$, and no scalar measure is an adequate summary of this distribution. Recent developments in this literature, to which we now turn, demonstrate that standard instrumental variable methods are blunt tools for recovering economically interpretable parameters.