EC 423 Labour Economics Problem Set 3 Answers. 1. Optimal Choice of Schooling and Returns to Schooling. ye rt dt = wf(s, A) e rt dt. = w r.

Steve Pischke EC 43 Labour Economics Problem Set 3 Answers 1 Optimal Choice of Schooling and Returns to Schooling (a) The individual maximizes or alternatively V = The first order condition is Notice that Z S Z ye rt dt = wf(s, A) e rt dt S = w r f(s, A)e rs max e V =logw log r +logf(s, A) rs e V S = f S f r =0 log y =logw +logf(s, A) log y S = f S f = r from the first order condition The first order condition does not depend on w This stems from the fact that w raises the cost of schooling and the additional earnings from schooling proportionately (b) First, derive the second order condition for the problem above e V S = f SSf (f S ) f < 0 which implies, using the first order condition f SS rf S < 0 Totally differentiate the first order condition (f SA rf A ) da = (f SS rf S ) we have da = f SA rf A f SS rf S 1

This is positive iff f SA rf A > 0 For f(s, A) =φ(s)δ(a) wehave f SA = φ 0 δ 0 = φ0 δφδ 0 = f Sf A = rf A φδ f Notice that the basic human capital production function used by Griliches, f(s, A) = exp(βs + γa) is multiplicative in S and A (c) S years of schooling now have an additional cost Z S C = ce rt dt = c ³ 1 e rs 0 r and the maximization problem becomes after multiplying through by r max rv = wf(s, A)e rs c ³ 1 e rs 1 The first order condition is rv S =(wf S rwf cr) e rs =0 or more intuitively wf {z S =r (wf + c) } {z } marginal marginal benefit cost The second order condition has the form w ³ f SS rf S + r f + cr < 0 Multiplying the FOC by r and substituting into the SOC yields the simplification f SS rf S < 0 which is the same as in the original problem Totally differentiating the first order condition w (f SS rf S ) +(f S rf) dw (wf + c) dr rdc =0 results in the following comparative statics results dr = wf + c w (f SS rf S ) < 0 dc = r w (f SS rf S ) < 0 dw = f S rf w (f SS rf S ) > 0 The last inequality follows from the FOC f S rf = rc w > 0

An increase in the wage now raises optimal schooling An increase in the wage still raises earnings proportionately but affects costs less than proportionately because foregone wages are only part of the total costs In order to balance marginal costs and benefits again after an increase in w schooling will have to be raised to lower the marginal benefit and raise the marginal cost 3 By totally differentiating the first order condition wf S wrf = cr it easy to see that it is still true that /da > 0iff SA rf A > 0 Using f(s, A) = exp(βs +γa), this condition becomes f SA rf A =(β r)γf = γrc/w, where the last equality follows from the first order condition It is easy to see that this expression has the sign of c if γ > 0 (d) The human capital production function implies Notice that this can be rewritten as f S f = AbSb 1 = r rs b = ASb =logf(s, A) log y =logw + r b S Since S is the only random variable on the right hand side it follows immediately that cov(log y, S) = r b var(s) OLS overestimates the returns to schooling because of ability bias Estimating Returns to Schooling (a) We know that the first order condition for this model is log y S i = β 1 +β S i = r Si = r β 1 β We cannot estimate the returns to schooling in this case because the choice of schooling is the same for everybody, so there would be no variation in the regressors (b) If the interest rate is random, schooling choices now become S i = r i β 1 β 3

The coefficients of a regression of log earnings on schooling and it s square are plim b β 1 = var(s i )cov(log y, S i ) cov(s i,s i )cov(log y, S i ) var(s i )var(s i ) cov(s i,s i ) plim b β = var(s i)cov(log y, S i ) cov(s i,s i )cov(log y, S i ) var(s i )var(s i ) cov(s i,s i ) The key covariances in these expressions are cov(log y, S i ) = β 1 var(s i )+β cov(s i,si ) cov(log y, Si ) = β 1 cov(s i,si )+β var(si ) plim b β 1 = var(s i )[β 1 var(s i )+β cov(s i,s i )] cov(s i,s i )[β 1 cov(s i,s i )+β var(s i )] var(s i )var(s i ) cov(s i,s i ) = β 1 [var(s i )var(s i ) cov(s i,s i ) ] var(s i )var(s i ) cov(s i,s i ) = β 1 + β [var(s i )cov(s i,s i ) var(s i )cov(s i,s i )] var(s i )var(s i ) cov(s i,s i ) and analogously plim β b = β There is no ability bias here because the human capital production function is multiplicative in S i and ε i (see question 1 (b)) (c) To make returns heterogeneous, make β 1 afunctionofε i : β 1 = θ 0 + θ 1 ε i Substituting in the first order condition yields S i = r θ 0 θ 1 ε i β The covariance is cov(si, ε i)= θ 1 σε β > 0 if θ 1 > 0 Whether more able individuals have higher or lower returns to schooling depends on whether we believe that schooling and ability are complements or substitutes Some schooling may make up for lack of ability, but we probably believe that ability also helps going further in school (after all, that s why we are all going through graduate training, isn t it?) (d) Note that the optimal schooling level is from part (a) S i = (1 z i)r 0 + z i r 1 + η i β 1 β 4

The probability limit of the Wald estimator is plim β b W = E(log y i z i =1) E(log y i z i =0) = E(β 0 + β 1 S i + β Si + ε i z i =1) E(β 0 + β 1 S i + β Si + ε i z i =0) = β 1 []+β [E(Si z i =1) E(Si z i =0)] E(Si = β 1 + β z i =1) E(Si z i =0) = β 1 + β = 1 (r 1 + r 0 ) r1 r 1β 1 r0 +r 0β 1 4β r 1 r 0 = β 1 + 1 (r 1 r 0 )(r 1 + r 0 β 1 ) β r 1 r 0 The population average return to schooling is Ã! d log yi E = E (β 1 +β S i ) i Ã! (1 zi )r 0 + z i r 1 + η i β 1 = β 1 +β E β = β 1 +β (1 P (z i =1))r 0 + P (z i =1)r 1 β 1 β = (1 P (z i =1))r 0 + P (z i =1)r 1 The return to schooling for an individual is equal to the interest rate faced by each individual, and on average these returns are r 0 and r 1 for each of the respective groups The population return is a weighted average of the two returns in the population The population weights depend on the fraction of the population faced with relatively low and high interest rates The Wald estimator also estimates an average of the population returns, but the weights here are fixed and don t depend on the population composition (because we are simply comparing the means for the low and high interest rate groups, no matter what the size of the groups is) Thus, only if P (z i =1)= 1 do the weights coincide and we get Ã! d log yi E = 1 i (r 1 + r 0 ) which is equal to the probability limit of the Wald estimator 5