Lecture 19. Endogenous Regressors and Instrumental Variables

In the previous lecture we considered a regression model (I omit the subscript $i$)

(1) $Y = \beta_1 + \beta_2 D + u$

The problem is that the dummy variable $D$ is endogenous, i.e. there is a relation between $D$ and $u$, e.g. because both are related to the same unobserved variable. If we express the relation between $D$ and $u$ as a linear regression

$u = \kappa_1 + \kappa_2 D + v$

then the OLS estimator of the regression coefficient of $D$ in (1) estimates $\beta_2 + \kappa_2$ and NOT the structural regression coefficient $\beta_2$.
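A small simulation can make the bias concrete. The sketch below is only an illustration: the data-generating process, the parameter values, and the helper `ols_slope` are assumptions made for the example and are not part of the lecture. Because the data are simulated, $u$ is observable here, so the slope $\kappa_2$ of $u$ on $D$ can be computed and compared with the OLS coefficient of $D$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (all numbers are illustrative).
beta1, beta2 = 1.0, 2.0                          # structural coefficients in (1)
a = rng.normal(size=n)                           # unobserved variable behind both D and u
D = (a + rng.normal(size=n) > 0).astype(float)   # endogenous dummy
u = a + rng.normal(size=n)                       # error in (1), related to D through a
Y = beta1 + beta2 * D + u


def ols_slope(y, x):
    """Slope from the OLS regression of y on a constant and x."""
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)


kappa2 = ols_slope(u, D)   # slope of u on D (only computable in a simulation)
b_ols = ols_slope(Y, D)    # OLS coefficient of D in (1)

print("beta2          :", beta2)
print("kappa2         :", kappa2)
print("OLS slope      :", b_ols)
print("beta2 + kappa2 :", beta2 + kappa2)
```

In this design the OLS slope coincides with $\beta_2 + \kappa_2$ (up to rounding), so it overstates the structural coefficient whenever $\kappa_2 > 0$.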

Solution: Find a variable $Z$ such that
1. $Z$ is related to $D$.
2. $Z$ has no direct effect on $Y$, i.e. $Z$ only has an effect on $Y$ through $D$.

Such a variable $Z$ is called an instrumental variable. Consider the linear regression models

(1) $Y = \beta_1 + \beta_2 D + u$
(2) $D = \delta_1 + \delta_2 Z + v$
(3) $Y = \gamma_1 + \gamma_2 Z + \varepsilon$

By substitution of (2) in (1) we see that the following relation exists between the $\beta$'s, $\gamma$'s, and $\delta$'s:

$\gamma_1 = \beta_1 + \beta_2 \delta_1$
$\gamma_2 = \beta_2 \delta_2$

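In particular

$\beta_2 = \dfrac{\gamma_2}{\delta_2}$

Because $Z$ has no direct effect on $Y$, it is not an omitted variable in (1) and hence not related to $u$. Both (2) and (3) are perfectly fine regressions in which there is no relation between $Z$ and the error. Hence the OLS estimators $\hat{\delta}_2$ in (2) and $\hat{\gamma}_2$ in (3) estimate $\delta_2$ and $\gamma_2$.

Conclusion: An estimator of $\beta_2$ is

(4) $\hat{\beta}_2 = \dfrac{\hat{\gamma}_2}{\hat{\delta}_2}$

In the case that $Z$ is also a dummy variable we have

$\hat{\beta}_2 = \dfrac{\bar{Y}_{Z=1} - \bar{Y}_{Z=0}}{\bar{D}_{Z=1} - \bar{D}_{Z=0}}$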
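The ratio estimator (4) and its difference-in-means form for a dummy instrument can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, the parameter values, and the helper `ols_slope` are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical design: Z is a binary instrument that is independent of the
# unobserved variable a and affects Y only through D.
beta1, beta2 = 1.0, 2.0
a = rng.normal(size=n)                               # unobserved confounder
Z = rng.integers(0, 2, size=n).astype(float)         # dummy instrument
D = (0.8 * Z + a + rng.normal(size=n) > 0.4).astype(float)
Y = beta1 + beta2 * D + a + rng.normal(size=n)


def ols_slope(y, x):
    """Slope from the OLS regression of y on a constant and x."""
    return np.cov(x, y, bias=True)[0, 1] / np.var(x)


gamma2_hat = ols_slope(Y, Z)          # regression (3): Y on Z
delta2_hat = ols_slope(D, Z)          # regression (2): D on Z
beta2_iv = gamma2_hat / delta2_hat    # estimator (4)

# With a dummy instrument the same number is a ratio of differences in means.
wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / (D[Z == 1].mean() - D[Z == 0].mean())

print("ratio of OLS slopes :", beta2_iv)
print("difference in means :", wald)
print("OLS of Y on D       :", ols_slope(Y, D))
```

The first two numbers agree up to rounding and should lie near the assumed $\beta_2$, while the plain OLS slope of $Y$ on $D$ does not.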

There is an alternative method to obtain the estimator (4). Consider again the linear regression models

(1) $Y = \beta_1 + \beta_2 D + u$
(2) $D = \delta_1 + \delta_2 Z + v$

The procedure consists of 2 steps:
1. Estimate the regression model in (2) and obtain the predicted values $\hat{D} = \hat{\delta}_1 + \hat{\delta}_2 Z$.
2. Estimate the linear regression model with dependent variable $Y$ and independent variable $\hat{D}$.

The estimate of the regression coefficient of $\hat{D}$ is the estimate of the structural regression coefficient $\beta_2$. This estimator is called the Two-Stage Least Squares (2SLS) estimator.

To see that the 2SLS estimator indeed estimates the structural regression coefficient $\beta_2$, we write the regression equation estimated in the second step

$Y = \alpha_1 + \alpha_2 \hat{D} + w$

Substitute the predicted value $\hat{D} = \hat{\delta}_1 + \hat{\delta}_2 Z$ to obtain

$Y = (\alpha_1 + \alpha_2 \hat{\delta}_1) + \alpha_2 \hat{\delta}_2 Z + w$

Hence estimating a regression with $\hat{D}$ as the independent variable is the same as estimating a regression with $\hat{\delta}_2 Z$ as the independent variable.

Compare this to

$Y = \gamma_1 + \gamma_2 Z + \varepsilon$

OLS gives the estimator $\hat{\gamma}_2$. If we multiply the variable $Z$ by a constant $c$, then the OLS estimator of the regression coefficient of $cZ$ is $\hat{\gamma}_2 / c$ (exercise in homework). Hence the OLS estimator with $\hat{\delta}_2 Z$ as independent variable is

$\dfrac{\hat{\gamma}_2}{\hat{\delta}_2} = \hat{\beta}_2$

which is the estimator of the structural regression coefficient that we found before.

Conclusion: The 2SLS estimator estimates the structural regression coefficient $\beta_2$.
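The equivalence between the two-step procedure and the ratio estimator is an exact algebraic identity in any sample, which a short numerical check can illustrate. The sketch below uses simulated data; the design, the parameter values, and the helper `ols` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Hypothetical design with a continuous instrument Z.
a = rng.normal(size=n)                                   # unobserved confounder
Z = rng.normal(size=n)                                   # instrument
D = (0.7 * Z + a + rng.normal(size=n) > 0).astype(float) # endogenous dummy
Y = 1.0 + 2.0 * D + a + rng.normal(size=n)


def ols(y, x):
    """Return (intercept, slope) from the OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]


# Step 1: first stage, predicted values of D.
d1, d2 = ols(D, Z)
D_hat = d1 + d2 * Z

# Step 2: regress Y on the predicted values.
_, beta2_2sls = ols(Y, D_hat)

# Ratio estimator: slope of Y on Z divided by slope of D on Z.
_, g2 = ols(Y, Z)
beta2_ratio = g2 / d2

# Rescaling the regressor by c divides the slope by c (the homework exercise).
c = 3.0
_, g2_scaled = ols(Y, c * Z)

print("2SLS (two steps) :", beta2_2sls)
print("ratio estimator  :", beta2_ratio)
print("slope on cZ times c vs slope on Z :", g2_scaled * c, g2)
```

The two estimates differ only by floating-point rounding, whatever the simulated design.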

An advantage of the 2SLS procedure is that it can be used if there is more than one instrumental variable.

Warning: The standard errors and the t-values reported in the second stage regression of $Y$ on $\hat{D}$ are not correct. There is a special formula that is used in all computer packages that have 2SLS as an option.
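The lecture does not state the special formula. A common textbook version, assumed here together with homoskedastic errors, keeps the second-stage coefficient estimates but computes the residuals with the actual $D$ instead of $\hat{D}$. The sketch below contrasts the two standard errors on simulated data; the function name `tsls_with_se`, the degrees-of-freedom correction, and the data-generating process are all illustrative assumptions.

```python
import numpy as np


def tsls_with_se(Y, D, Z):
    """2SLS slope with naive and corrected (homoskedastic) standard errors.

    A sketch for one endogenous regressor D and one instrument Z.
    """
    n = len(Y)
    ones = np.ones(n)
    Zm = np.column_stack([ones, Z])          # first-stage regressors
    Xm = np.column_stack([ones, D])          # actual structural regressors

    # First stage: predicted values of D.
    delta, *_ = np.linalg.lstsq(Zm, D, rcond=None)
    Xh = np.column_stack([ones, Zm @ delta])  # constant and D_hat

    # Second stage: Y on a constant and D_hat.
    beta, *_ = np.linalg.lstsq(Xh, Y, rcond=None)
    XhXh_inv = np.linalg.inv(Xh.T @ Xh)

    # Naive residuals use D_hat; corrected residuals use the actual D.
    resid_naive = Y - Xh @ beta
    resid_correct = Y - Xm @ beta
    s2_naive = resid_naive @ resid_naive / (n - 2)
    s2_correct = resid_correct @ resid_correct / (n - 2)

    se_naive = np.sqrt(s2_naive * XhXh_inv[1, 1])
    se_correct = np.sqrt(s2_correct * XhXh_inv[1, 1])
    return beta[1], se_naive, se_correct


# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(3)
n = 50_000
a = rng.normal(size=n)
Z = rng.normal(size=n)
D = (0.7 * Z + a + rng.normal(size=n) > 0).astype(float)
Y = 1.0 + 2.0 * D + a + rng.normal(size=n)

b, se_naive, se_correct = tsls_with_se(Y, D, Z)
print("2SLS estimate      :", b)
print("naive second-stage :", se_naive)
print("corrected          :", se_correct)
```

The point estimate is the same either way; only the residual variance, and hence the standard error and t-value, changes.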

Application: Loss of labor market experience due to participation in the Vietnam war

Duration of service in Vietnam was on average 6 months. Was this lost time, or did that experience contribute to the earnings potential of draftees?

The economic model for earnings derives from human capital theory

(5) $Y_{it} = \delta_t + \gamma_1 S_i + \gamma_2 E_{it} + \gamma_3 E_{it}^2 + u_{it}$

with $Y_{it}$ the earnings of individual $i$ in year $t$, $S_i$ the years of schooling of $i$, and $E_{it}$ the labor market experience of $i$. The term $\delta_t$ captures business cycle effects. The model is linear in schooling and quadratic in experience.

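Experience is not directly observed. Instead we use

(6) $E_{it} = A_{it} - S_i - 6 - \lambda D_i$

with $A_{it}$ the age of $i$ in year $t$ and $D_i$ the indicator of participation in the Vietnam war. The parameter $\lambda$ is the loss of labor market experience due to the Vietnam war, and this is what we want to estimate. If we substitute (6) in (5) and use $D_i^2 = D_i$ we get

$Y_{it} = \delta_t + \gamma_2 (A_{it} - 6) + \gamma_3 (A_{it} - 6)^2 + (\gamma_3 \lambda^2 - \gamma_2 \lambda) D_i - 2 \gamma_3 \lambda (A_{it} - 6) D_i + (\gamma_1 - \gamma_2) S_i + \gamma_3 S_i^2 - 2 \gamma_3 (A_{it} - 6) S_i + 2 \gamma_3 \lambda S_i D_i + u_{it}$

We make the assumptions
1. Schooling and age are independent.
2. Schooling is independent of participation in the Vietnam war (more controversial).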

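Under these assumptions the terms $(\gamma_1 - \gamma_2) S_i + \gamma_3 S_i^2 - 2 \gamma_3 (A_{it} - 6) S_i + 2 \gamma_3 \lambda S_i D_i$ can be added to the error term. If we define $X_{it} = A_{it} - 6$, the regression equation is

$Y_{it} = \delta_t + \gamma_2 X_{it} + \gamma_3 X_{it}^2 + (\gamma_3 \lambda^2 - \gamma_2 \lambda) D_i - 2 \gamma_3 \lambda X_{it} D_i + v_{it}$

This is a linear regression

$Y_{it} = \delta_t + \gamma_2 X_{it} + \gamma_3 X_{it}^2 + \pi_1 D_i + \pi_2 X_{it} D_i + v_{it}$

with $X_{it}$ an exogenous and $D_i$ an endogenous variable. We use the lottery number $Z_i$ as the instrumental variable and we replace $D_i$ by the predicted value $\hat{D}_i$. Since $\pi_2 = -2 \gamma_3 \lambda$, we find

$\hat{\lambda} = -\dfrac{\hat{\pi}_2}{2 \hat{\gamma}_3}$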
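For readers who want the intermediate algebra, the substitution of (6) in (5) can be written out as follows. This is a worked expansion added for clarity, using $X_{it} = A_{it} - 6$ and the fact that $D_i$ is a dummy, so $D_i^2 = D_i$. The labels $\pi_1$ and $\pi_2$ are those of the linear regression above.

```latex
\begin{align*}
E_{it}   &= X_{it} - S_i - \lambda D_i \\
E_{it}^2 &= X_{it}^2 + S_i^2 + \lambda^2 D_i
            - 2 X_{it} S_i - 2 \lambda X_{it} D_i + 2 \lambda S_i D_i \\
\gamma_2 E_{it} + \gamma_3 E_{it}^2
         &= \gamma_2 X_{it} + \gamma_3 X_{it}^2
            + \underbrace{(\gamma_3 \lambda^2 - \gamma_2 \lambda)}_{\pi_1} D_i
            + \underbrace{(-2 \gamma_3 \lambda)}_{\pi_2} X_{it} D_i \\
         &\quad - \gamma_2 S_i + \gamma_3 S_i^2
            - 2 \gamma_3 X_{it} S_i + 2 \gamma_3 \lambda S_i D_i \\
\lambda  &= -\frac{\pi_2}{2 \gamma_3}
\end{align*}
```

The last line follows directly from $\pi_2 = -2 \gamma_3 \lambda$, which is why only the coefficients of $X_{it} D_i$ and $X_{it}^2$ are needed to recover $\lambda$.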

Result: See Table 5, first column.

Conclusion: The loss of experience is … years, which is less than the average service time of … years; that is, … years in Vietnam is equivalent to 1 year of US work experience.