Section 10 Regression with Stochastic Regressors

Meaning of random regressors
- Until now, we have assumed (against all reason) that the values of x have been controlled by the experimenter.
- Economists almost never actually control the regressors.
- We should usually think of them as random variables that are determined jointly with y and e.
- With a small adaptation of our assumptions, OLS still has the desirable properties it had before.

OLS assumptions with random regressors

With fixed x                                              With random x
SR1: $y = \beta_1 + \beta_2 x + e$ with $x$ fixed         A10.1: $y = \beta_1 + \beta_2 x + e$ with $x$, $y$, $e$ random
SR2: $E(e) = 0$                                           A10.2: $(x, y)$ obtained from IID sampling
SR3: $\mathrm{var}(e) = \sigma^2$                         A10.3: $E(e \mid x) = 0$
SR4: $\mathrm{cov}(e_i, e_j) = 0$, $i \neq j$             A10.4: $x$ takes on at least two values
SR5: $x$ takes on at least two values                     A10.5: $\mathrm{var}(e \mid x) = \sigma^2$
SR6: $e$ is normal                                        A10.6: $e$ is normal

- Note that A10.2 implies SR4 (and A10.5?)
- Note that A10.3 implies both $\mathrm{cov}(x, e) = 0$ and $E(e) = 0$.
- This assumption is a critical one. Instead of assuming that x is a fixed value and e is random, we make the properties of e conditional on the particular outcome of x.
- This allows us to operate in very much the same way as if x is fixed, as long as A10.3 holds.
- In the next section of the course, we will discuss how to deal with violations of A10.3.

OLS properties

Small-sample properties
- If A10.1 through A10.6 hold, then
  - OLS is unbiased
  - OLS is BLUE
  - OLS standard errors are unbiased
  - OLS coefficient estimators (conditional on x) are normal

Asymptotic properties
- We can replace A10.3 by the weaker A10.3*: $E(e) = 0$, $\mathrm{cov}(x, e) = 0$.
- OLS is biased in small samples if A10.3* is true but A10.3 is not.
- Under A10.1 through A10.5, replacing A10.3 with A10.3*:
  - OLS coefficient estimators are consistent
  - OLS coefficient estimators are asymptotically normal

If x is correlated with e
- If A10.3* is violated, then OLS is biased and inconsistent.
- The coefficient on x will pick up the effects of the parts of e that are correlated with it, in addition to the direct effects of x.
- The direction of the bias depends on the sign of the correlation between x and e.

Measurement error (discussed above under internal validity)
- Suppose that the dependent variable is measured accurately but that we measure x with error: $\tilde{x} = x + \eta$.
- The estimated model is $y = \beta_1 + \beta_2 \tilde{x} + (e - \beta_2 \eta)$.
- Because $\eta$ is part of $\tilde{x}$ and therefore correlated with it, the composite error term is now correlated with the actual regressor, meaning that $b_2$ is biased and inconsistent.
- If $e$ and $\eta$ are independent and normal, then $\mathrm{plim}\, b_2 = \beta_2 \dfrac{\sigma_x^2}{\sigma_x^2 + \sigma_\eta^2}$.
- The estimator is biased toward zero.
  - If most of the variation in $\tilde{x}$ comes from $x$, then the bias will be small.
  - As the variance of the measurement error grows in relation to the variation in the true variable, the magnitude of the bias increases.
  - As a worst-case limit, if the true x doesn't vary across our sample of observations and all of the variation in our measure $\tilde{x}$ is random noise, then the expected value of our coefficient is zero.
- Best solution is getting a better measure.
- Alternatives are instrumental variables or direct measurement of the degree of measurement error.
  - For example, if an alternative, precise measure is available for some arguably random sub-sample of observations, then we can calculate the variance of the true variable and the variance of the measurement error and correct the estimate.

Omitted-variables bias
- We derived this result at the beginning of the multiple regression analysis.
- The omitted variable is included in the error. If the omitted variable is correlated with the included variable, then the OLS estimator of the coefficient on the included variable is biased and inconsistent.

Simultaneous-equations bias (simultaneity bias)
- Suppose that y and x are part of a larger theoretical system of equations:
  $y = \beta_1 + \beta_2 x + e$
  $x = \alpha_1 + \alpha_2 y + u$
- The two variables are jointly determined and both are endogenous.
- There is feedback from y to x, or reverse causality (actually, bidirectional).
- $e \to y \to x$, so e and x are correlated.
- Supply and demand curves are difficult to estimate because both q and p are endogenous.

Instrumental variables
- Recall the method of moments analysis by which we derived the OLS estimators.
  - We used the assumed population moment conditions $E(e) = 0$, $\mathrm{cov}(x, e) = 0$ to derive the OLS normal equations as sample moment conditions: $\frac{1}{N}\sum_{i=1}^{N} \hat{e}_i = 0$, $\frac{1}{N}\sum_{i=1}^{N} x_i \hat{e}_i = 0$.
  - If $\mathrm{cov}(x, e) \neq 0$, then the population moment conditions are invalid and we will get biased and inconsistent estimators from the OLS sample moment conditions.
- The instrumental-variables estimator can be derived from the method of moments.
  - As usual, suppose that $y = \beta_1 + \beta_2 x + e$, but suppose that $\mathrm{cov}(x, e) \neq 0$.
  - Let z be a variable with the following properties:
    - z does not have a direct effect on y. It does not belong in the equation alongside x. (z affects y only through x, not independently.)
    - z is exogenous. It is not correlated with e.
    - z is strongly correlated with x, the endogenous regressor.
  - This makes z a valid instrumental variable.
  - We can exploit $\mathrm{cov}(z, e) = 0$ as our second moment condition in place of $\mathrm{cov}(x, e) = 0$, which is not true for this model.
  - The sample moment conditions are
    $\frac{1}{N}\sum_{i=1}^{N} \hat{e}_i = \frac{1}{N}\sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0$
    $\frac{1}{N}\sum_{i=1}^{N} z_i \hat{e}_i = \frac{1}{N}\sum_{i=1}^{N} z_i \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = 0$.
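  - Solving the normal equations yields $\hat{\beta}_2 = \dfrac{\sum_{i=1}^{N} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{N} (z_i - \bar{z})(x_i - \bar{x})}$.
  - Compare this to the standard OLS slope estimator $b_2 = \dfrac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})}$.
  - In matrix terms, $\hat{\beta}_{IV} = (Z'X)^{-1} Z'y$ vs. $b = (X'X)^{-1} X'y$.
- Properties of the IV estimator:
  - Consistent as long as z is exogenous
  - Asymptotically normal: $\hat{\beta}_2 \sim N\!\left(\beta_2, \dfrac{\sigma^2}{r_{xz}^2 \sum_{i=1}^{N} (x_i - \bar{x})^2}\right)$, where $r_{xz} = \mathrm{corr}(x, z)$
  - As usual, we estimate $\sigma^2$ by $\hat{\sigma}_{IV}^2 = \dfrac{1}{N - 2}\sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right)^2$.
- Weak instruments: If $r_{xz}$ is near zero, then the variance of $\hat{\beta}_2$ is large and the IV estimator is unreliable.

Two-stage least squares
- What if we have multiple strong instruments and/or multiple endogenous regressors in a multiple regression?
- With more instruments than endogenous regressors, we have an overidentified system with alternative choices of instruments.
- Suppose that $x_K$ is endogenous but the first $K - 1$ regressors are exogenous.
- Suppose that $z_1$ through $z_L$ are L valid instruments.
- Any linear combination of the instruments is admissible.
- Let's choose the one that is most highly correlated with $x_K$.
- To get that, we regress
  $x_K = \gamma_1 + \gamma_2 x_2 + \dots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + \dots + \theta_L z_L + v_K$
  and use the fitted values $\hat{x}_K$ as the instrument for $x_K$.
- This amounts to doing two separate regressions: the first-stage regression of $x_K$ on the exogenous x variables and the instruments z, then a second-stage regression of
  $y = \beta_1 + \beta_2 x_2 + \dots + \beta_{K-1} x_{K-1} + \beta_K \hat{x}_K + e^*$
- The estimators of $\beta$ from the second-stage regression are called 2SLS estimators.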
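The two stages described above can be sketched on simulated data. This is only an illustration under assumed conditions, not part of the course materials: the data-generating process, the true values `BETA1 = 1` and `BETA2 = 2`, and the helper names `solve` and `ols` are all hypothetical, and the example uses one endogenous regressor with two instruments and no other exogenous regressors. It uses only the Python standard library.

```python
import random

def solve(A, rhs):
    """Solve the linear system A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [A[i][:] + [rhs[i]] for i in range(n)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                    # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# --- Simulated data (all values below are assumptions for illustration) ---
random.seed(12345)
N = 5000
BETA1, BETA2 = 1.0, 2.0
z1 = [random.gauss(0, 1) for _ in range(N)]
z2 = [random.gauss(0, 1) for _ in range(N)]
e = [random.gauss(0, 1) for _ in range(N)]
# v shares a component with e, so x is endogenous: cov(x, e) = 0.8
v = [0.8 * ei + 0.6 * random.gauss(0, 1) for ei in e]
x = [a + c + d for a, c, d in zip(z1, z2, v)]
y = [BETA1 + BETA2 * xi + ei for xi, ei in zip(x, e)]

# OLS of y on x: inconsistent here because cov(x, e) != 0
b_ols = ols([[1.0, xi] for xi in x], y)

# First stage: regress x on a constant and both instruments, keep fitted values
Z = [[1.0, a, c] for a, c in zip(z1, z2)]
gamma_hat = ols(Z, x)
x_hat = [sum(g * zj for g, zj in zip(gamma_hat, row)) for row in Z]

# Second stage: regress y on a constant and the fitted values
b_2sls = ols([[1.0, xh] for xh in x_hat], y)

# Error variance: residuals built from the actual x, then (for contrast) from x_hat
sigma2_iv = sum((yi - b_2sls[0] - b_2sls[1] * xi) ** 2
                for yi, xi in zip(y, x)) / (N - 2)
sigma2_naive = sum((yi - b_2sls[0] - b_2sls[1] * xh) ** 2
                   for yi, xh in zip(y, x_hat)) / (N - 2)

print("OLS slope: ", b_ols[1])
print("2SLS slope:", b_2sls[1])
print("sigma^2 using actual x:", sigma2_iv, " using fitted x:", sigma2_naive)
```

In this simulated design the OLS slope should settle above the true value of 2 (the probability limit is 2 + 0.8/3), while the 2SLS slope should be close to 2. The last two lines contrast the error-variance estimate computed with the actual x against the one a manual second-stage regression would report.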
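- 2SLS is not exactly like doing two separate regressions, however, because our estimator of the error variance uses the actual values of $x_K$ rather than the fitted values:
  $\hat{\sigma}_{IV}^2 = \dfrac{1}{N - K}\sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_{i,2} - \dots - \hat{\beta}_K x_{i,K}\right)^2$
  (If you do the second regression manually, substituting in the fitted values, Stata will use the fitted values rather than the actual values to calculate the residuals.)
- 2SLS easily extends to multiple endogenous regressors, as long as there are at least as many independent instruments as endogenous regressors.
- Suppose there are G good exogenous regressors, B = K − G bad endogenous regressors, and L lucky instrumental variables.
  - L > B means overidentified, L = B is just identified, L < B means underidentified (and can't be estimated by IV).
  - $y = \beta_1 + \beta_2 x_2 + \dots + \beta_G x_G + \beta_{G+1} x_{G+1} + \dots + \beta_K x_K + e$
  - First-stage regressions:
    $x_{G+j} = \gamma_{1j} + \gamma_{2j} x_2 + \dots + \gamma_{Gj} x_G + \theta_{1j} z_1 + \dots + \theta_{Lj} z_L + v_j$, $j = 1, \dots, B$
  - Get fitted values:
    $\hat{x}_{G+j} = \hat{\gamma}_{1j} + \hat{\gamma}_{2j} x_2 + \dots + \hat{\gamma}_{Gj} x_G + \hat{\theta}_{1j} z_1 + \dots + \hat{\theta}_{Lj} z_L$, $j = 1, \dots, B$
  - Regress the original equation, replacing the endogenous regressors with their fitted values:
    $y = \beta_1 + \beta_2 x_2 + \dots + \beta_G x_G + \beta_{G+1} \hat{x}_{G+1} + \dots + \beta_K \hat{x}_K + e^*$
- To implement 2SLS in Stata, use: ivregress 2sls depvar exvars (endvars = instvars), options

Overidentification and generalized method of moments
- If we have additional instruments beyond the minimum (i.e., an overidentified system), then we have more information than we need to estimate the model.
- Suppose that $z_1$ and $z_2$ are both valid instruments for endogenous x.
- All three moment conditions:
  $\frac{1}{N}\sum_{i=1}^{N} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = \hat{m}_1 = 0$
  $\frac{1}{N}\sum_{i=1}^{N} z_{i,1} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = \hat{m}_2 = 0$
  $\frac{1}{N}\sum_{i=1}^{N} z_{i,2} \left(y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i\right) = \hat{m}_3 = 0$
  are theoretically true.
- This opens the door for two possibilities:
  - We can determine the degree to which we cannot satisfy all three of these conditions simultaneously and use that as evidence of whether the model's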
    assumptions are valid. (If the model is perfect, then all three should be zero except for sampling error.) These are the specification tests discussed below.
  - We can think about alternative estimators (called GMM estimators) that would minimize a weighted average of the squares of the $\hat{m}$ moments. 2SLS is a GMM estimator with a particular weighting of the moment conditions.

Instrument strength
- A strong instrument must provide correlation with the part of the endogenous regressor that is not explained by the other (exogenous) regressors.
- Regression of $x_K = \gamma_1 + \gamma_2 x_2 + \dots + \gamma_{K-1} x_{K-1} + \theta_1 z_1 + v_K$ allows us to test $\theta_1 = 0$ with a standard $F = t^2$ test. However, conventional wisdom says that the instrument is weak unless F > 10, rather than using the standard critical values for testing this hypothesis.
- This test can be applied with multiple instruments and one endogenous regressor, with 10 still being the traditional threshold for weak instruments.
- (See HGL's Appendix 10E for a really confusing exposition of the general test with multiple endogenous regressors.)

Specification tests
- If the model is overidentified, then we can do two kinds of tests:
  - A Hausman test of whether the x variables that we are treating as endogenous truly are endogenous
  - A test of the overidentifying restrictions, which can be interpreted as a test of instrument validity

Hausman test
- $H_0: \mathrm{cov}(x, e) = 0$, $H_1: \mathrm{cov}(x, e) \neq 0$
- Under the null hypothesis, OLS is consistent and efficient, and IV is consistent but inefficient. Since both are consistent, $q = b - \hat{\beta} \to 0$ in large samples.
- Under the alternative hypothesis, OLS is inconsistent but IV is consistent, so $q = b - \hat{\beta} \to c \neq 0$ in large samples.
- The Stata command hausman implements the procedure.
- HGL gives an alternative implementation: add the residuals from the first-stage regression to the OLS regression of the original equation and test whether they are significant.

Tests for instrument validity
- Is z correlated with e?
- We can't do a direct test because we can't get consistent estimators for e without valid instruments, and we can't know whether the instruments are valid without a consistent estimator of e.
- With extra instruments (overidentified model), we can use some to test the others.
- LM test: Do 2SLS/IV, get the residuals, and regress $\hat{e}$ on all z instruments and exogenous regressors. Under the null hypothesis that all instruments are valid, $N R^2$ from this regression is distributed as $\chi^2$ with L − B degrees of freedom.
- The J statistic is another common test of overidentifying restrictions:
  - As above, regress the 2SLS/IV residuals on the exogenous variables in the equation and all the instruments.
  - Compute the F statistic for the null hypothesis that the coefficients on the instruments are zero.
  - The test statistic LF (where L is the number of instruments) is asymptotically distributed as a $\chi^2$ with L − B degrees of freedom (number of instruments − number of endogenous regressors = number of overidentifying restrictions to be tested).
- Why does the J test or the LM test work?
  - If the instruments are exogenous, then they should not be correlated with y except through their effects on x.
  - The 2SLS residuals are the part of y that is orthogonal to the part of z that works through x.
  - If that is the only correlation that z has with y (there is no direct effect in either direction), then the residuals should be uncorrelated with z, conditional on the other x variables (the included exogenous variables).
  - Rejection of the null hypothesis tells us that at least one of the overidentifying restrictions does not hold, which may mean that one or more of the instruments is invalid.
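The LM test described above (regress the 2SLS residuals on the instruments and compare $N R^2$ with a $\chi^2$ critical value) can be sketched on simulated data. Everything specific here is an assumption made for illustration: the data-generating process, the `direct_effect` parameter that lets $z_2$ enter the equation for y directly (making it an invalid instrument), and the hypothetical helper names `solve`, `ols`, `r_squared`, and `nr2_statistic`. With L = 2 instruments and B = 1 endogenous regressor there is one overidentifying restriction, so the 5% critical value is that of a $\chi^2$ with 1 degree of freedom.

```python
import random

def solve(A, rhs):
    """Solve the linear system A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [A[i][:] + [rhs[i]] for i in range(n)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                    # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def r_squared(X, y):
    """R^2 from an OLS regression of y on X (X must include a constant column)."""
    coef = ols(X, y)
    fitted = [sum(cj * xj for cj, xj in zip(coef, row)) for row in X]
    ybar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ssr / sst

def nr2_statistic(direct_effect, seed, n=4000):
    """Simulate data, estimate by 2SLS, and return N * R^2 from the LM regression.

    direct_effect is the (assumed) coefficient on z2 in the equation for y;
    zero means both instruments are valid.
    """
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    e = [rng.gauss(0, 1) for _ in range(n)]
    v = [0.8 * ei + 0.6 * rng.gauss(0, 1) for ei in e]   # makes x endogenous
    x = [a + c + d for a, c, d in zip(z1, z2, v)]
    y = [1.0 + 2.0 * xi + direct_effect * zi + ei for xi, zi, ei in zip(x, z2, e)]

    Z = [[1.0, a, c] for a, c in zip(z1, z2)]
    gamma_hat = ols(Z, x)                                 # first stage
    x_hat = [sum(g * zj for g, zj in zip(gamma_hat, row)) for row in Z]
    beta_hat = ols([[1.0, xh] for xh in x_hat], y)        # second stage
    # 2SLS residuals use the actual x, not the fitted values
    resid = [yi - beta_hat[0] - beta_hat[1] * xi for yi, xi in zip(y, x)]
    # LM regression: residuals on a constant and all instruments
    return n * r_squared(Z, resid)

CRIT_5PCT_DF1 = 3.841   # 5% critical value of chi-square with L - B = 1 d.f.

stat_valid = nr2_statistic(direct_effect=0.0, seed=1)
stat_invalid = nr2_statistic(direct_effect=0.5, seed=1)
print("N*R^2, both instruments valid:", stat_valid)
print("N*R^2, z2 invalid:            ", stat_invalid)
print("Reject at 5% when z2 is invalid:", stat_invalid > CRIT_5PCT_DF1)
```

When `direct_effect` is nonzero, the residuals retain a component that is correlated with the instruments, so the statistic should be far above the critical value. When it is zero, the statistic behaves like a draw from a $\chi^2$ with 1 degree of freedom, so it will still exceed 3.841 in roughly 5% of samples.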