Nonparametric Estimation of Wages and Labor Force Participation

John Pepper, University of Virginia
Steven Stern, University of Virginia

May 5, 2000

Preliminary Draft - Comments Welcome

1. Model

Let $y_i$ be a binary indicator for labor force participation: $y_i = 1$ iff $i$ works. Let $w_i$ be the wage $i$ would get if she worked, and let $r_i$ be her reservation wage. It is assumed that $y_i = 1$ iff $w_i > r_i$. Let

\[ w_i = g_w(X_i\beta_w, d_i, E_i) + u_{wi} \]

where $d_i$ is a binary indicator of $i$'s disability ($d_i = 1$ iff $i$ is disabled), $E_i$ is a continuous measure of $i$'s education, $X_i$ is a vector of other observed variables affecting $w_i$, $g_w(\cdot)$ is an unspecified function with $g_{w1}(\cdot) \ge 0$, $g_{w2}(\cdot) \le 0$, and $g_{w3}(\cdot) \ge 0$, and $u_{wi}$ is an error with finite mean and variance. Define $\bar{w}_i = g_w(X_i\beta_w, d_i, E_i)$. Similarly, let

\[ r_i = g_r(X_i\beta_r, d_i, E_i) + u_{ri} \]

where $g_r(\cdot)$ is an unspecified function with $g_{r1}(\cdot) \ge 0$ and $g_{r2}(\cdot) \ge 0$, and $u_{ri}$ is an error with finite mean and variance. Define $\bar{r}_i = g_r(X_i\beta_r, d_i, E_i)$. The joint density of $(u_{wi}, u_{ri})$, $f(u_{wi}, u_{ri})$, is unspecified beyond the moment restrictions already stated.

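2. Identification

The goal is to estimate $\theta = [\beta_w, \beta_r, g_w(\cdot), g_r(\cdot), f(\cdot)]$ or that part of $\theta$ that is identified. The data consist of $\{y_i, w_i y_i, X_i, d_i, E_i\}_{i=1}^{n}$. Assume temporarily that there are no restrictions on $g_w(X_i\beta_w, d_i, E_i)$ or $g_r(X_i\beta_r, d_i, E_i)$. Then, for observations where $w_i \le r_i$, we observe only $y_i = 0$, while, for observations where $w_i > r_i$, we observe $y_i = 1$ and $w_i$. Define

\[ F_1(u_{wi}, u_{ri}) = \int_{-\infty}^{u_{ri}} f(u_{wi}, u_r)\,du_r, \]

\[ F_0(u) = \Pr[u_{wi} - u_{ri} \le u] = \int_{-\infty}^{\infty}\int_{-\infty}^{u + u_r} f(u_w, u_r)\,du_w\,du_r. \]

Note that

\[ \frac{\partial F_1(u_{wi}, u_{ri})}{\partial u_{ri}} > 0 \quad\text{and}\quad \frac{\partial F_0(u)}{\partial u} > 0. \]

Then the likelihood contribution for $i$ is

\[ \begin{cases} F_1\big(w_i - g_w(X_i\beta_w, d_i, E_i),\; w_i - g_r(X_i\beta_r, d_i, E_i)\big) & \text{if } y_i = 1 \\ F_0\big(g_r(X_i\beta_r, d_i, E_i) - g_w(X_i\beta_w, d_i, E_i)\big) & \text{if } y_i = 0. \end{cases} \tag{2.1} \]

Now define

\[ \Im_1[w; X\beta_w, X\beta_r, d, e] = F_1\big(w - g_w(X\beta_w, d, e),\; w - g_r(X\beta_r, d, e)\big) \tag{2.2} \]

\[ \Im_0[X\beta_w, X\beta_r, d, e] = F_0\big(g_r(X\beta_r, d, e) - g_w(X\beta_w, d, e)\big) \tag{2.3} \]

Define

\[ z_j = \begin{pmatrix} X_j\beta_w \\ X_j\beta_r \\ d_j \\ E_j \end{pmatrix}. \tag{2.4} \]

Then $\Im_1$ and $\Im_0$ can be estimated nonparametrically as

\[ \hat{\Im}_1[w, z] = \frac{\frac{1}{\sigma}\sum_j y_j\, H\!\left[\frac{w_j - w}{\sigma}\right] K\!\left[(z_j - z)'\,\Omega^{-1}(z_j - z)\right]}{\sum_j K\!\left[(z_j - z)'\,\Omega^{-1}(z_j - z)\right]} \tag{2.5} \]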

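\[ \hat{\Im}_0[z] = \frac{\sum_j (1 - y_j)\, K\!\left[(z_j - z)'\,\Omega^{-1}(z_j - z)\right]}{\sum_j K\!\left[(z_j - z)'\,\Omega^{-1}(z_j - z)\right]} \tag{2.6} \]

where $K[\cdot]$ and $H[\cdot]$ are kernel functions, $\Omega$ is a bandwidth matrix, and $\sigma$ is a bandwidth. Here a second subscript $k$ on $\Im_1$, $\Im_0$, $F_1$, $g_w$, or $g_r$ denotes the partial derivative with respect to the $k$th argument, and $F_0'$ is the derivative of $F_0$. Then, note that

\[ \begin{aligned} \Im_{11} &= F_{11} + F_{12}, \\ \Im_{12} &= -F_{11}\, g_{w1}, \\ \Im_{13} &= -F_{12}\, g_{r1}, \\ \Im_{01} &= -F_0'\, g_{w1}, \\ \Im_{02} &= F_0'\, g_{r1}. \end{aligned} \tag{2.7} \]

Thus, $(F_{11}, F_{12}, F_0', g_{w1}, g_{r1})$ is identified by $(\Im_{11}, \Im_{12}, \Im_{13}, \Im_{01}, \Im_{02})$. Next, note that

\[ \begin{aligned} \Im_{14} &= -F_{12}\, g_{r2} - F_{11}\, g_{w2}, \\ \Im_{15} &= -F_{12}\, g_{r3} - F_{11}\, g_{w3}. \end{aligned} \tag{2.8} \]

Assuming that $F_{11}$ and $F_{12}$ (already identified) are not collinear as $w$ changes, $(g_{r2}, g_{w2})$ is identified by $\Im_{14}$ and variation in $(F_{11}, F_{12})$ with $w$, and $(g_{r3}, g_{w3})$ is identified by $\Im_{15}$ and variation in $(F_{11}, F_{12})$ with $w$.

Note that there are also many overidentifying restrictions. Most obviously, because

\[ \Im_{03} = F_0'\,[g_{r2} - g_{w2}], \qquad \Im_{04} = F_0'\,[g_{r3} - g_{w3}], \]

there are restrictions on $(\Im_{03}, \Im_{04})$. Also, any two values of $w$ identify $(g_{r2}, g_{w2})$ and $(g_{r3}, g_{w3})$; all other values of $w$ provide overidentifying restrictions. Finally, I think that there might be some restrictions placed on variation in structural functionals as $(w, X\beta_r, X\beta_w, d, e)$ varies, but I haven't convinced myself I am right, much less what they would be.

One needs to anchor the relevant functionals. For example, define $g_r(0, 0, \ ) = a_r$ and $g_w(0, 0, \ ) = a_w$. Once $g_r$ and $g_w$ are anchored, $F_1(w - a_w, w - a_r)$ and $F_0(a_r - a_w)$ are identified. This is equivalent to assuming the mean of the errors or not including a constant in $g_r$ and $g_w$.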
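As a rough illustration of the kernel estimators in equations (2.5) and (2.6), the following Python sketch evaluates them at a single point. The Gaussian forms chosen for $K$ and $H$, the function names, and the argument layout are assumptions made for this example only; the text leaves both kernels unspecified.

```python
import numpy as np


def kernel_weights(Z, z, omega):
    """K[(z_j - z)' Omega^{-1} (z_j - z)] for every row z_j of Z.

    Assumption: K[q] = exp(-q / 2), i.e. a Gaussian-shaped kernel.
    """
    diff = Z - z
    q = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(omega), diff)
    return np.exp(-0.5 * q)


def im1_hat(w, z, W, Z, y, omega, sigma):
    """Estimate of Im_1[w, z] as in (2.5), with H a standard normal density (assumed)."""
    k = kernel_weights(Z, z, omega)
    # Wages of nonparticipants are unobserved; they are multiplied by y_j = 0 below,
    # so any placeholder (here 0.0) is harmless.
    w_obs = np.where(y == 1, W, 0.0)
    t = (w_obs - w) / sigma
    h = np.exp(-0.5 * t**2) / np.sqrt(2.0 * np.pi)
    return np.sum(y * h * k) / (sigma * np.sum(k))


def im0_hat(z, Z, y, omega):
    """Estimate of Im_0[z] as in (2.6): kernel-weighted share of nonparticipants."""
    k = kernel_weights(Z, z, omega)
    return np.sum((1.0 - y) * k) / np.sum(k)
```

Because $H$ integrates to one, integrating the first estimate over $w$ and adding the second gives one at any $z$, which is a convenient sanity check on an implementation.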

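Note that one can estimate $\beta_r$ and $\beta_w$ without making any functional form assumptions. A nonparametric estimator of $\beta = (\beta_w, \beta_r)$ is

\[ \hat{\beta} = \arg\max_{\beta} \sum_i \Big\{ y_i \log \hat{\Im}_1[w_i, z_i(\beta)] + (1 - y_i) \log \hat{\Im}_0[z_i(\beta)] \Big\} \tag{2.9} \]

where $\hat{\Im}_1[w_i, z_i]$ and $\hat{\Im}_0[z_i]$ are defined in equations (2.5) and (2.6) (with sums over $j \ne i$) and $z_i(\beta)$ is defined in equation (2.4). What are the statistical properties of $\hat{\beta}$? Once $\beta$ is estimated, we can infer the unspecified functionals $(F_1, F_0, g_w, g_r)$ (see appendix for suggestion).

3. Estimation Strategy

3.1. Adding Restrictions

We know that equation (2.1) holds. The restrictions we want to add are

\[ \frac{\partial F_1}{\partial u_r} > 0, \quad \frac{\partial g_r}{\partial X\beta_r} \ge 0, \quad \frac{\partial g_r}{\partial d} \ge 0, \quad \frac{\partial g_w}{\partial X\beta_w} \ge 0, \quad \frac{\partial g_w}{\partial d} \le 0, \quad \frac{\partial g_w}{\partial E} \ge 0. \]

Equation (2.2) implies that

\[ \begin{aligned} \frac{\partial \Im_1}{\partial X\beta_r} &= \frac{\partial F_1}{\partial X\beta_r} = -\frac{\partial F_1}{\partial u_r}\frac{\partial g_r}{\partial X\beta_r} \le 0, \\ \frac{\partial \Im_1}{\partial d} &= \frac{\partial F_1}{\partial d} = -\frac{\partial F_1}{\partial u_w}\frac{\partial g_w}{\partial d} - \frac{\partial F_1}{\partial u_r}\frac{\partial g_r}{\partial d}, \\ \frac{\partial \Im_1}{\partial w} &= \frac{\partial F_1}{\partial w} = \frac{\partial F_1}{\partial u_w} + \frac{\partial F_1}{\partial u_r}. \end{aligned} \tag{3.1} \]

$\partial \Im_1/\partial d$ and $\partial \Im_1/\partial w$ impose no restrictions because we cannot sign $\partial F_1/\partial u_w$. But, in regions where the estimate of $\partial F_1/\partial u_w$ is nonpositive, $\partial \Im_1/\partial d \le 0$, and, in regions where the estimate of $\partial F_1/\partial u_w$ is nonnegative, $\partial \Im_1/\partial w \ge 0$. Thus imposing restrictions on the estimates of structural functions, $F_1$ and $g_r$, implies restrictions on the nonparametric function $\Im_1$. Similarly, equation (2.3) implies that

\[ \begin{aligned} \frac{\partial \Im_0}{\partial X_i\beta_r} &= \frac{\partial F_0}{\partial u}\frac{\partial g_r}{\partial X_i\beta_r} \ge 0, \\ \frac{\partial \Im_0}{\partial X_i\beta_w} &= -\frac{\partial F_0}{\partial u}\frac{\partial g_w}{\partial X_i\beta_w} \le 0, \\ \frac{\partial \Im_0}{\partial d_i} &= \frac{\partial F_0}{\partial u}\left[\frac{\partial g_r}{\partial d_i} - \frac{\partial g_w}{\partial d_i}\right] \ge 0. \end{aligned} \tag{3.2} \]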

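Thus imposing restrictions on the estimates of structural functions, $F_0$, $g_w$, and $g_r$, implies restrictions on the nonparametric function $\Im_0$.

The next issue is whether restrictions on $\Im_1$ and $\Im_0$ implied in equations (3.1) and (3.2) imply restrictions on $F_1$, $F_0$, $g_w$, and $g_r$. Note that the third restriction in equation (3.1) restricts $\partial F_1/\partial u_r$. Given that restriction, the first two restrictions in equation (3.1) restrict $\partial g_r/\partial X\beta_r$ and $\partial g_r/\partial d$. These, together with the first restriction in equation (3.2), restrict $\partial F_0/\partial u$. Then, the second restriction in equation (3.2) restricts $\partial g_w/\partial X\beta_w$. Finally, the last restriction in equation (3.2) restricts $\partial g_r/\partial d - \partial g_w/\partial d \ge 0$. But this implies only that $\partial g_w/\partial d \le \partial g_r/\partial d$, which is a much weaker restriction than $\partial g_w/\partial d \le 0$. In fact, there is no obvious way to impose restrictions on the nonparametric functions, $\Im_1$ and $\Im_0$, such that they will imply that $\partial g_w/\partial d \le 0$. Note, however, that, even though we can't restrict $\partial g_w/\partial d \le 0$, we can still identify $\partial g_w/\partial d$.

3.2. General Estimation Problem: Nonparametric Estimation of a Conditional Density

The general estimation problem consists of estimating two conditional density functions. Consider the generic conditional density function $f(\varepsilon \mid \xi)$, and consider estimating $f(\varepsilon \mid \xi)$ using local regression methods. In our problem, when $y = 0$, $f(\varepsilon \mid \xi) = \Im_0[z]$ with $\varepsilon = y$ and $\xi = z$, and, when $y = 1$, $f(\varepsilon \mid \xi) = \Im_1[w, z]$ with $\varepsilon = (y, w)$ and $\xi = z$. Define

\[ f(u \mid t) = \exp\{p(u, \varepsilon, t, \xi)\} \tag{3.3} \]

at $(u, t)$ near $(\varepsilon, \xi)$ with

\[ p(u, \varepsilon, t, \xi) = a_{\varepsilon\xi} + b_\varepsilon'(u - \varepsilon) + b_\xi'(t - \xi) + (u - \varepsilon)'C_\varepsilon(u - \varepsilon) + (t - \xi)'C_\xi(t - \xi), \]

so that

\[ \frac{\partial p(u, \varepsilon, t, \xi)}{\partial a_{\varepsilon\xi}} = 1, \]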

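\[ \frac{\partial p(u, \varepsilon, t, \xi)}{\partial b_\varepsilon} = (u - \varepsilon), \qquad \frac{\partial p(u, \varepsilon, t, \xi)}{\partial b_\xi} = (t - \xi), \]

\[ \frac{\partial p(u, \varepsilon, t, \xi)}{\partial C_\varepsilon} = (u - \varepsilon)(u - \varepsilon)', \qquad \frac{\partial p(u, \varepsilon, t, \xi)}{\partial C_\xi} = (t - \xi)(t - \xi)'. \]

Note that

\[ \log f(\varepsilon \mid \xi) = a_{\varepsilon\xi}, \qquad \frac{\partial \log f(\varepsilon \mid \xi)}{\partial \varepsilon} = b_\varepsilon, \qquad \frac{\partial \log f(\varepsilon \mid \xi)}{\partial \xi} = b_\xi. \tag{3.4} \]

Define

\[ \log \hat{f}(\varepsilon \mid \xi) = \arg\min_{a_{\varepsilon\xi}} \Big\{ \min_{b, C} L(\varepsilon, \xi; a_{\varepsilon\xi}, b, C) \Big\} \tag{3.5} \]

where

\[ \begin{aligned} L(\varepsilon, \xi; a_{\varepsilon\xi}, b, C) = {} & -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_\xi(\xi_i - \xi; \Omega_\xi)\, K_\varepsilon(\varepsilon_i - \varepsilon; \Omega_\varepsilon)\, p(\varepsilon_i, \varepsilon, \xi_i, \xi) \\ & + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_\xi(\xi_i - \xi; \Omega_\xi)\, \frac{|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \iint K_\xi(t - \xi; \Omega_\xi)\, K_\varepsilon(u - \varepsilon; \Omega_\varepsilon) \exp\{p(u, \varepsilon, t, \xi)\}\,du\,dt}{|\Omega_\xi|^{-1/2} \int K_\xi(t - \xi; \Omega_\xi)\,dt}. \end{aligned} \tag{3.6} \]

Define

\[ \psi_\varsigma(\varsigma_i, \varsigma) = |\Omega_\varsigma|^{-1/2} K_\varsigma(\varsigma_i - \varsigma; \Omega_\varsigma) \exp\big\{ b_\varsigma'(\varsigma_i - \varsigma) + (\varsigma_i - \varsigma)'C_\varsigma(\varsigma_i - \varsigma) \big\} \]

for $\varsigma = \varepsilon, \xi$ with

\[ \Psi_\varsigma(\varsigma) = \int \psi_\varsigma(v, \varsigma)\,dv, \qquad \frac{\partial \Psi_\varsigma(\varsigma)}{\partial b_\varsigma} = \int (v - \varsigma)\,\psi_\varsigma(v, \varsigma)\,dv, \qquad \frac{\partial \Psi_\varsigma(\varsigma)}{\partial C_\varsigma} = \int (v - \varsigma)(v - \varsigma)'\,\psi_\varsigma(v, \varsigma)\,dv. \]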

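Then, writing $K_{\xi i} = K_\xi(\xi_i - \xi; \Omega_\xi)$ and $K_{\varepsilon i} = K_\varepsilon(\varepsilon_i - \varepsilon; \Omega_\varepsilon)$, equation (3.6) can be written as

\[ L(\varepsilon, \xi; a, b, C) = -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_{\xi i} K_{\varepsilon i}\, p(\varepsilon_i, \varepsilon, \xi_i, \xi) + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_{\xi i}\; e^{a_{\varepsilon\xi}}\, \Psi_\xi(\xi)\, \Psi_\varepsilon(\varepsilon). \tag{3.7} \]

The partial derivatives are

\[ \frac{\partial L}{\partial a_{\varepsilon\xi}} = -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_{\xi i} K_{\varepsilon i} + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_{\xi i}\; e^{a_{\varepsilon\xi}}\, \Psi_\xi(\xi)\, \Psi_\varepsilon(\varepsilon), \tag{3.8} \]

\[ \frac{\partial L}{\partial b_\varepsilon} = -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_{\xi i} K_{\varepsilon i}\,(\varepsilon_i - \varepsilon) + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_{\xi i}\; e^{a_{\varepsilon\xi}}\, \Psi_\xi(\xi) \int (u - \varepsilon)\,\psi_\varepsilon(u, \varepsilon)\,du, \tag{3.9} \]

\[ \frac{\partial L}{\partial b_\xi} = -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_{\xi i} K_{\varepsilon i}\,(\xi_i - \xi) + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_{\xi i}\; e^{a_{\varepsilon\xi}}\, \Psi_\varepsilon(\varepsilon) \int (t - \xi)\,\psi_\xi(t, \xi)\,dt, \tag{3.10} \]

\[ \frac{\partial L}{\partial C_\varepsilon} = -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_{\xi i} K_{\varepsilon i}\,(\varepsilon_i - \varepsilon)(\varepsilon_i - \varepsilon)' + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_{\xi i}\; e^{a_{\varepsilon\xi}}\, \Psi_\xi(\xi) \int (u - \varepsilon)(u - \varepsilon)'\,\psi_\varepsilon(u, \varepsilon)\,du, \tag{3.11} \]

\[ \frac{\partial L}{\partial C_\xi} = -\frac{1}{n}\,|\Omega_\xi|^{-1/2}|\Omega_\varepsilon|^{-1/2} \sum_i K_{\xi i} K_{\varepsilon i}\,(\xi_i - \xi)(\xi_i - \xi)' + \frac{1}{n}\,|\Omega_\xi|^{-1/2} \sum_i K_{\xi i}\; e^{a_{\varepsilon\xi}}\, \Psi_\varepsilon(\varepsilon) \int (t - \xi)(t - \xi)'\,\psi_\xi(t, \xi)\,dt. \tag{3.12} \]

Also, note that, from equation (3.4), we can get a consistent estimate of $\partial \log f(\varepsilon \mid \xi)/\partial \varepsilon$ as $\hat{b}_\varepsilon$ and of $\partial \log f(\varepsilon \mid \xi)/\partial \xi$ as $\hat{b}_\xi$.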

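Now, let

\[ K_\varsigma(\tilde{v}) = (2\pi)^{-\dim(\varsigma)/2} \exp\left\{-\tfrac{1}{2}\tilde{v}'\tilde{v}\right\} \]

with

\[ \tilde{v} = R_\varsigma^{-1}(v - \varsigma), \qquad R_\varsigma R_\varsigma' = \Omega_\varsigma. \]

Then,

\[ \begin{aligned} \psi_\varsigma(v, \varsigma) &= (2\pi)^{-\dim(\varsigma)/2}\,|\Omega_\varsigma|^{-1/2} \exp\left\{-\tfrac{1}{2}\tilde{v}'\tilde{v}\right\} \exp\left\{ b_\varsigma' R_\varsigma \tilde{v} + \tilde{v}' R_\varsigma' C_\varsigma R_\varsigma \tilde{v} \right\} \\ &= (2\pi)^{-\dim(\varsigma)/2}\,|\Omega_\varsigma|^{-1/2} \exp\left\{-\tfrac{1}{2}\tilde{v}'\tilde{v} + b_\varsigma' R_\varsigma \tilde{v} + \tilde{v}' R_\varsigma' C_\varsigma R_\varsigma \tilde{v} \right\} \\ &= (2\pi)^{-\dim(\varsigma)/2}\,|\Omega_\varsigma|^{-1/2} \exp\left\{-\tfrac{1}{2}\tilde{v}'\left[I - 2R_\varsigma' C_\varsigma R_\varsigma\right]\tilde{v} + b_\varsigma' R_\varsigma \tilde{v} \right\}. \end{aligned} \tag{3.13} \]

We assume that $I - 2R_\varsigma' C_\varsigma R_\varsigma$ is positive definite. Note that, asymptotically, $R_\varsigma \to 0$, and it is likely that $C_\varsigma$ is small. Then equation (3.13) can be written as

\[ \psi_\varsigma(v, \varsigma) = (2\pi)^{-\dim(\varsigma)/2}\,|\Omega_\varsigma|^{-1/2} \exp\left\{-\tfrac{1}{2}(\tilde{v} - \mu_\varsigma)'\left[I - 2R_\varsigma' C_\varsigma R_\varsigma\right](\tilde{v} - \mu_\varsigma) + \kappa_\varsigma \right\} \tag{3.14} \]

where

\[ \mu_\varsigma = \left[I - 2R_\varsigma' C_\varsigma R_\varsigma\right]^{-1} R_\varsigma' b_\varsigma, \qquad \kappa_\varsigma = \tfrac{1}{2}\, b_\varsigma' R_\varsigma \left[I - 2R_\varsigma' C_\varsigma R_\varsigma\right]^{-1} R_\varsigma' b_\varsigma. \]

Then equation (3.14) can be written as

\[ \psi_\varsigma(v, \varsigma) = |\Omega_\varsigma|^{-1/2} \exp\{\kappa_\varsigma\}\; \phi\!\left(\left[I - 2R_\varsigma' C_\varsigma R_\varsigma\right]^{1/2}(\tilde{v} - \mu_\varsigma)\right) \]

with $\phi(\cdot)$ being the standard multivariate normal density function. Note that

\[ \Psi_\varsigma(\varsigma) = \int \psi_\varsigma(v, \varsigma)\,dv = \exp\{\kappa_\varsigma\}, \]

\[ \int \psi_\varsigma(v, \varsigma)\,(v - \varsigma)\,dv = \exp\{\kappa_\varsigma\} \left[I - 2R_\varsigma' C_\varsigma R_\varsigma\right]^{-1} \mu_\varsigma, \]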

9 à & (v; &)(v &)(v &) 0 dv =expf & g h I R 0 &C & R & i : De ne =e a "» exp n» + " o : Then equations (??) through (3.) become a "» n j»j j " j P K " (" i "; " ) j n»j P ; b " j n»j j " j P K " (" i "; " )(" i ") j n»j P h i I R"C 0 " R " " ; b» j n»j j " j P K " (" i "; " )(» i») j n»j P h I R 0» C i»r»» ; C " j n "j P i ª» (» i ;») K " (" i "; " )(" i ")(" i ") 0 P n i ª» (» i ;») hi i R»C 0» R» ; j»j P i ª " (" i ;") K» (» i»;» )(» i»)(» i») 0 P C» i ª» (» i ;») hi R 0 " C i "R " : 9

3.3. Estimation Procedure Without Restrictions

Estimation of $\theta = [\beta_w, \beta_r, g_w(\cdot), g_r(\cdot), f(\cdot)]$ follows from above. We consider the maximization problem analogous to equation (2.9):

\[ \hat{\beta} = \arg\max_{\beta} \sum_i \Big\{ y_i \log \hat{\Im}_1[w_i, z_i(\beta)] + (1 - y_i) \log \hat{\Im}_0[z_i(\beta)] \Big\} \tag{3.15} \]

where

\[ \log \hat{\Im}_1[w_i, z_i(\beta)] = \arg\min_{a_{\varepsilon\xi}} \Big\{ \min_{b, C} L(\varepsilon, \xi; a, b, C) \Big\} \tag{3.16} \]

from equation (3.5) with $\varepsilon = (y, w)$ and $\xi = z(\beta)$, and

\[ \log \hat{\Im}_0[z_i(\beta)] = \arg\min_{a_{\varepsilon\xi}} \Big\{ \min_{b, C} L(\varepsilon, \xi; a, b, C) \Big\} \tag{3.17} \]

with $\varepsilon = y$ and $\xi = z(\beta)$. We can get consistent estimates of the partial derivatives of $\hat{\Im}_1[w_i, z_i(\beta)]$ and $\hat{\Im}_0[z_i(\beta)]$ using our estimates of $b$ from equations (3.16) and (3.17). We can stack first order conditions in equations (3.1) and (3.2) to estimate the structural functionals. Details are described in the appendix.

3.4. Estimation Procedure With Restrictions

Let $\theta^k$ be a guess of $\theta$. Using equations (3.1) and (3.2), we can evaluate $\hat{\Im}_1[w_i, z_i]$ and $\hat{\Im}_0[z_i]$ and their partial derivatives at every point $(w_i, z_i)$. This implies values of $(a^k_\varepsilon, a^k_\xi, b^k_\varepsilon, b^k_\xi)$ for both $\hat{\Im}_1[w_i, z_i]$ and $\hat{\Im}_0[z_i]$ in equations (3.16) and (3.17). In particular, for example for $\hat{\Im}_1[w_i, z_i]$,

\[ b^k_\varepsilon(\varepsilon) = \frac{\partial \log \hat{\Im}_1[w_i, z_i]}{\partial w_i}, \qquad b^k_\xi(\xi) = \frac{\partial \log \hat{\Im}_1[w_i, z_i]}{\partial z_i}, \]

\[ a^k_{\varepsilon\xi}(\varepsilon, \xi) = \int^{\varepsilon} b^k_\varepsilon(u)\,du + \int^{\xi} b^k_\xi(t)\,dt. \]

A similar set of equations applies for $\hat{\Im}_0[z_i]$. The values of $(a^k_{\varepsilon\xi}, b^k_\varepsilon, b^k_\xi)$ for $\hat{\Im}_1[w_i, z_i]$ and $\hat{\Im}_0[z_i]$ can be plugged into equation (3.15) and then maximized over $\beta$. This is equivalent to solving a constrained maximization problem that imposes the structure implied by $\theta$ on the nonparametric log likelihood function. It is also straightforward now to impose monotonicity constraints on the structural equations in that they correspond to imposing nonnegativity (nonpositivity) constraints on some elements of $\theta$.
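The objective in equations (2.9) and (3.15) can be illustrated with a small Python sketch. It uses simple leave-one-out kernel ratios for $\hat{\Im}_1$ and $\hat{\Im}_0$ rather than the local likelihood fit of Section 3.2, and the Gaussian kernels, the function name `loo_log_likelihood`, and its arguments are all assumptions made for illustration.

```python
import numpy as np


def loo_log_likelihood(beta_w, beta_r, X, d, E, y, w, omega, sigma):
    """Leave-one-out value of sum_i y_i log Im1_hat + (1 - y_i) log Im0_hat.

    Assumptions: K[q] = exp(-q / 2) and H is the standard normal density.
    w may hold NaN where y == 0 (wage unobserved).
    """
    # z_i(beta) stacks the two indices with disability and education.
    z = np.column_stack([X @ beta_w, X @ beta_r, d, E])
    diff = z[:, None, :] - z[None, :, :]
    q = np.einsum("ijk,kl,ijl->ij", diff, np.linalg.inv(omega), diff)
    K = np.exp(-0.5 * q)
    np.fill_diagonal(K, 0.0)  # drop j == i from every sum
    denom = K.sum(axis=1)

    # Unobserved wages get a placeholder; they are always multiplied by y_j = 0.
    w_obs = np.where(y == 1, w, 0.0)
    t = (w_obs[None, :] - w_obs[:, None]) / sigma  # entry [i, j] is (w_j - w_i) / sigma
    H = np.exp(-0.5 * t**2) / (np.sqrt(2.0 * np.pi) * sigma)

    im1 = (K * H * y[None, :]).sum(axis=1) / denom
    im0 = (K * (1.0 - y)[None, :]).sum(axis=1) / denom

    tiny = 1e-300  # guard against log(0) when a local estimate underflows
    log_im1 = np.log(np.maximum(im1, tiny))
    log_im0 = np.log(np.maximum(im0, tiny))
    return float(np.sum(np.where(y == 1, log_im1, log_im0)))
```

In practice this function would be passed to a numerical optimizer over $(\beta_w, \beta_r)$, after imposing whatever scale and location normalization on $\beta$ the application requires; that step is not shown here.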

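4. References

[1] Mukarjee, Hari and Steven Stern (1994). Feasible Nonparametric Estimation of Multiargument Monotone Functions. Journal of the American Statistical Association. 89.

[2] Robertson, Tim, F. T. Wright, and R. L. Dykstra (1988). Order Restricted Statistical Inference. New York: John Wiley and Sons.

[3] Stern, Steven (1996). Semiparametric Estimates of the Supply and Demand Effects of Disability on Labor Force Participation. Journal of Econometrics.

5. Appendix: Proposal to Solve for Structural Functionals Nonparametrically

The set of restrictions in equations (2.7) and (2.8) can be approximated by a Taylor series expansion as

\[ \begin{aligned} \Im_{11} &= F_{11} + F_{12}, \\ \Im_{12} &= -\underline{F}_{11}\underline{g}_{w1} - \underline{F}_{11}\big(g_{w1} - \underline{g}_{w1}\big) - \underline{g}_{w1}\big(F_{11} - \underline{F}_{11}\big), \\ \Im_{13} &= -\underline{F}_{12}\underline{g}_{r1} - \underline{F}_{12}\big(g_{r1} - \underline{g}_{r1}\big) - \underline{g}_{r1}\big(F_{12} - \underline{F}_{12}\big), \\ \Im_{14} &= -\underline{F}_{12}\underline{g}_{r2} - \underline{F}_{12}\big(g_{r2} - \underline{g}_{r2}\big) - \underline{g}_{r2}\big(F_{12} - \underline{F}_{12}\big) \\ &\quad - \underline{F}_{11}\underline{g}_{w2} - \underline{F}_{11}\big(g_{w2} - \underline{g}_{w2}\big) - \underline{g}_{w2}\big(F_{11} - \underline{F}_{11}\big), \\ \Im_{15} &= -\underline{F}_{12}\underline{g}_{r3} - \underline{F}_{12}\big(g_{r3} - \underline{g}_{r3}\big) - \underline{g}_{r3}\big(F_{12} - \underline{F}_{12}\big) \\ &\quad - \underline{F}_{11}\underline{g}_{w3} - \underline{F}_{11}\big(g_{w3} - \underline{g}_{w3}\big) - \underline{g}_{w3}\big(F_{11} - \underline{F}_{11}\big), \end{aligned} \tag{5.1} \]

\[ \begin{aligned} \Im_{01} &= -\underline{F}_0'\underline{g}_{w1} - \underline{F}_0'\big(g_{w1} - \underline{g}_{w1}\big) - \underline{g}_{w1}\big(F_0' - \underline{F}_0'\big), \\ \Im_{02} &= \underline{F}_0'\underline{g}_{r1} + \underline{F}_0'\big(g_{r1} - \underline{g}_{r1}\big) + \underline{g}_{r1}\big(F_0' - \underline{F}_0'\big), \\ \Im_{03} &= \underline{F}_0'\underline{g}_{r2} + \underline{F}_0'\big(g_{r2} - \underline{g}_{r2}\big) + \underline{g}_{r2}\big(F_0' - \underline{F}_0'\big) \\ &\quad - \underline{F}_0'\underline{g}_{w2} - \underline{F}_0'\big(g_{w2} - \underline{g}_{w2}\big) - \underline{g}_{w2}\big(F_0' - \underline{F}_0'\big), \end{aligned} \tag{5.2} \]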

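\[ \begin{aligned} \Im_{04} &= \underline{F}_0'\underline{g}_{r3} + \underline{F}_0'\big(g_{r3} - \underline{g}_{r3}\big) + \underline{g}_{r3}\big(F_0' - \underline{F}_0'\big) \\ &\quad - \underline{F}_0'\underline{g}_{w3} - \underline{F}_0'\big(g_{w3} - \underline{g}_{w3}\big) - \underline{g}_{w3}\big(F_0' - \underline{F}_0'\big) \end{aligned} \]

where underlined variables are evaluated at a fixed point. These equations can be written in matrix form. Let

\[ W(w, z) = \begin{pmatrix} \Im_{11} - \underline{F}_{11} - \underline{F}_{12} \\ \Im_{12} + \underline{F}_{11}\underline{g}_{w1} \\ \Im_{13} + \underline{F}_{12}\underline{g}_{r1} \\ \Im_{14} + \underline{F}_{12}\underline{g}_{r2} + \underline{F}_{11}\underline{g}_{w2} \\ \Im_{15} + \underline{F}_{12}\underline{g}_{r3} + \underline{F}_{11}\underline{g}_{w3} \\ \Im_{01} + \underline{F}_0'\underline{g}_{w1} \\ \Im_{02} - \underline{F}_0'\underline{g}_{r1} \\ \Im_{03} - \underline{F}_0'\underline{g}_{r2} + \underline{F}_0'\underline{g}_{w2} \\ \Im_{04} - \underline{F}_0'\underline{g}_{r3} + \underline{F}_0'\underline{g}_{w3} \end{pmatrix}, \]

\[ A(w, z) = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ -\underline{g}_{w1} & 0 & 0 & -\underline{F}_{11} & 0 & 0 & 0 & 0 & 0 \\ 0 & -\underline{g}_{r1} & 0 & 0 & 0 & 0 & -\underline{F}_{12} & 0 & 0 \\ -\underline{g}_{w2} & -\underline{g}_{r2} & 0 & 0 & -\underline{F}_{11} & 0 & 0 & -\underline{F}_{12} & 0 \\ -\underline{g}_{w3} & -\underline{g}_{r3} & 0 & 0 & 0 & -\underline{F}_{11} & 0 & 0 & -\underline{F}_{12} \\ 0 & 0 & -\underline{g}_{w1} & -\underline{F}_0' & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & \underline{g}_{r1} & 0 & 0 & 0 & \underline{F}_0' & 0 & 0 \\ 0 & 0 & \underline{g}_{r2} - \underline{g}_{w2} & 0 & -\underline{F}_0' & 0 & 0 & \underline{F}_0' & 0 \\ 0 & 0 & \underline{g}_{r3} - \underline{g}_{w3} & 0 & 0 & -\underline{F}_0' & 0 & 0 & \underline{F}_0' \end{pmatrix}, \]

\[ Q(w, z) = \begin{pmatrix} F_{11} - \underline{F}_{11} \\ F_{12} - \underline{F}_{12} \\ F_0' - \underline{F}_0' \\ g_{w1} - \underline{g}_{w1} \\ g_{w2} - \underline{g}_{w2} \\ g_{w3} - \underline{g}_{w3} \\ g_{r1} - \underline{g}_{r1} \\ g_{r2} - \underline{g}_{r2} \\ g_{r3} - \underline{g}_{r3} \end{pmatrix}. \]

Then the Taylor series approximation in equations (5.1) and (5.2) can be written as

\[ W(w, z) = A(w, z)\, Q(w, z) \tag{5.3} \]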

for each combination of $(w, z)$. One might approximate the structural functionals by

1) Make an initial guess of $(F_1, F_0, g_w, g_r)$ satisfying initial conditions and set $k = 0$.
2) Evaluate $(F_1, F_0, g_w, g_r)^{(k)}$ and use it to evaluate $W^{(k)}(w, z)$ and $A^{(k)}(w, z)$.
3) Solve for $Q^{(k+1)}(w, z)$ using $Q^{(k+1)}(w, z) = \left[A^{(k)}(w, z)\right]^{-1} W^{(k)}(w, z)$.
4) Using the definition of $Q(w, z)$, solve for $(F_1, F_0, g_w, g_r)^{(k+1)}$ given $Q^{(k+1)}(w, z)$.
5) Check for convergence. If not, increment $k$ by 1 and go to (2).

Note that this algorithm does not put restrictions on $(F_1, F_0, g_w, g_r)$ that occur across different values of $(w, z)$. Such restrictions are that

\[ F_1\big(w^{(1)} - g_w^{(1)},\; w^{(1)} - g_r^{(1)}\big) = F_1\big(w^{(2)} - g_w^{(2)},\; w^{(2)} - g_r^{(2)}\big) \quad\text{if } w^{(1)} - g_w^{(1)} = w^{(2)} - g_w^{(2)} \text{ and } w^{(1)} - g_r^{(1)} = w^{(2)} - g_r^{(2)}, \]

\[ F_0\big(g_r^{(1)} - g_w^{(1)}\big) = F_0\big(g_r^{(2)} - g_w^{(2)}\big) \quad\text{if } g_r^{(1)} - g_w^{(1)} = g_r^{(2)} - g_w^{(2)}. \]

Once the algorithm has converged, we can stack the equations in equation (5.3) over all values of $(w, z)$ as $W = AQ$ and write restrictions (given estimates of $g_w$ and $g_r$) as $BQ = 0$. Putting these together, we want to solve the first order conditions for the Lagrangian equation

\[ \mathcal{L} = (W - AQ)'(W - AQ) + \lambda'(BQ). \]

The solution is

\[ \hat{Q} - (A'A)^{-1}B'\left[B(A'A)^{-1}B'\right]^{-1}(B\hat{Q}) \]

(I think), where $\hat{Q} = (A'A)^{-1}A'W$ is the unrestricted least-squares solution. Note that this involves inverting $(A'A)$ (which is easy because it is block diagonal) and $\left[B(A'A)^{-1}B'\right]$ (which is not block diagonal but is pretty sparse, so it may not be that hard).
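The last step is a least-squares problem with linear equality constraints, so the candidate solution can be checked numerically. The sketch below is a minimal illustration under assumed conditions: dense NumPy arrays, $A$ of full column rank, and $B$ of full row rank. The function name and the dense inverse are choices made for this example; they ignore the block-diagonal and sparse structure that the text suggests exploiting.

```python
import numpy as np


def restricted_least_squares(A, W, B):
    """Minimize (W - AQ)'(W - AQ) subject to BQ = 0.

    Returns Q_hat - (A'A)^{-1} B' [B (A'A)^{-1} B']^{-1} (B Q_hat),
    where Q_hat is the unrestricted least-squares solution.
    """
    AtA_inv = np.linalg.inv(A.T @ A)
    q_hat = AtA_inv @ A.T @ W                      # unrestricted solution
    M = B @ AtA_inv @ B.T                          # small matrix to invert
    correction = AtA_inv @ B.T @ np.linalg.solve(M, B @ q_hat)
    return q_hat - correction
```

A useful check is that the returned vector satisfies $BQ = 0$ up to rounding and matches the solution of the full first-order (KKT) system.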


