Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left. Jerry Hausman 1


DRAFT, January 3, 2001. Please do not quote without permission. Forthcoming: Journal of Economic Perspectives.

Jerry Hausman[1]

The effect of mismeasured variables in statistical and econometric analysis is one of the oldest known problems, dating from the 1880s in Adcock (1887). In the most straightforward regression analysis with a single regressor variable under classical mismeasurement assumptions, the least squares estimate is downward biased in magnitude towards zero.[2] At MIT I have called this the "Iron Law of Econometrics": the magnitude of the estimate is usually smaller than expected. While a mismeasured right hand side variable creates this problem, a mismeasured left hand side variable under classical assumptions does not lead to bias. The only result is less precision in the estimated coefficient and a lower t-statistic.

In this paper I consider three recent developments for mismeasurement econometric models that may be less familiar to readers. The typical solution to the right hand side mismeasurement problem is to use instrumental variables to achieve a consistent estimate.[3] However, in the last decade a literature has arisen regarding the use of weak instruments that can result in significant finite sample bias.[4] Thus, reliance on an instrumental variable estimator to solve the mismeasurement problem may be misplaced in a particular application. Here, I discuss a specification test of Hahn and Hausman (1999) that permits a test of the weak instrument hypothesis.

Next, for non-linear models with mismeasured variables, application of instrumental variables leads to inconsistent results. At MIT I call this the "forbidden regression", where improper use of instrumental variables leads to inconsistency. A paper

[1] Department of Economics, MIT, jhausman@mit.edu. This paper is given in memory of Zvi Griliches. Jason Abrevaya provided helpful comments.
[2] Surveys of these classical results are found in Aigner, Hsiao, Kapteyn and Wansbeek (1984), Fuller (1987), and Hausman, Newey, and Powell (1995). Henceforth, I will assume that the true parameter is positive so that I will not repeat the "in magnitude" qualifier.
[3] Of course, a solution may not exist if suitable instruments cannot be found.
[4] See e.g. Bound, Jaeger, and Baker (1997).

by Hausman, Ichimura, Newey and Powell (HINP) (1991) proposes a consistent estimator for the polynomial regression specification in the presence of mismeasurement. I discuss that estimator here and a recent extension of the approach to general non-linear specifications by Schennach (2000). Thus, consistent estimators now are known for mismeasured non-linear regression models.

Lastly, I return to mismeasured left hand side variables. Many limited dependent variable models in econometrics are inconsistently estimated if the left hand side variables have measurement error.[5] I discuss probit/logit type models of binary outcomes or discrete choice and demonstrate that inconsistency occurs. I also discuss consistent estimators of these models with measurement errors and also specification tests. Interestingly, these models do not require instrumental variables for consistent estimation so long as a monotonicity condition is satisfied. I next consider duration models and demonstrate that under measurement errors on the left hand side variables, inconsistency will again result for general models. But so long as a stochastic dominance condition is satisfied, these models can again be estimated consistently without the use of an instrument. The last model I consider with a mismeasured left hand side variable is the quantile regression (QR) model. Again inconsistent estimates result, but I do not currently have a solution to this problem.

I. Linear Models With Mismeasured Variables

I begin with the classic linear regression mismeasurement model. I assume a linear specification after means of the variables have been removed with the right hand side variable mismeasured with uncorrelated measurement error where z is the true unobserved regressor and x is the observed variable that contains measurement error:

$$
\begin{aligned}
y_i &= \beta z_i + \varepsilon_i = \beta x_i + \varepsilon_i - \beta\eta_i = \beta x_i + \tau_i, \\
x_i &= z_i + \eta_i, \qquad i = 1, \ldots, n \qquad (1)
\end{aligned}
$$

[5] All the models discussed here will also be inconsistently estimated if the right hand side variable is mismeasured. However, I do not discuss that topic specifically in the paper since the specifications can be considered as nonlinear specifications.
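
Equation (1) is straightforward to simulate. The following is a minimal sketch and is not taken from the paper: it assumes normally distributed z, ε, η and ω with unit variances and β = 1, all of which are illustrative choices, and the function names `ols_slope` and `simulate_linear_mismeasurement` are hypothetical. It compares the OLS slope using the true regressor z, using the mismeasured right hand side variable x = z + η, and using a mismeasured left hand side variable q = y + ω.

```python
import random


def ols_slope(x, y):
    """OLS slope of y on x with an intercept (equivalently, on demeaned data)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_linear_mismeasurement(beta=1.0, n=50_000, seed=0):
    """Simulate equation (1) under classical assumptions (illustrative values)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]          # true regressor
    y = [beta * zi + rng.gauss(0.0, 1.0) for zi in z]    # y = beta*z + eps
    x = [zi + rng.gauss(0.0, 1.0) for zi in z]           # x = z + eta
    q = [yi + rng.gauss(0.0, 1.0) for yi in y]           # q = y + omega
    return {
        "true_regressor": ols_slope(z, y),
        "mismeasured_rhs": ols_slope(x, y),
        "mismeasured_lhs": ols_slope(z, q),
    }


if __name__ == "__main__":
    for name, slope in simulate_linear_mismeasurement().items():
        print(f"{name}: {slope:.3f}")
```

With equal signal and noise variances, as assumed here, the slope on the mismeasured x should come out near half of β, while the slope with the mismeasured left hand side variable should stay near β and only be estimated less precisely.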

I will assume that the sample is independent and identically distributed (i.i.d.) and that z is the unobserved true variable and that the error in observation η is mean zero and is uncorrelated with z and with ε.[6] Other right hand side variables, assumed to be measured without error, have been partialled out of the model. I will assume that β > 0. The classic result is that usually the least squares (OLS) estimator b < β.[7] This result causes the iron law of econometrics that I suggested above, and it is also called attenuation in the statistics literature. In terms of a large sample result, $\operatorname{plim} b = \alpha\beta < \beta$ where $\alpha = \sigma_z^2/\sigma_x^2 = \sigma_z^2/(\sigma_z^2 + \sigma_\eta^2) < 1$. Thus, the amount of large sample bias depends on the ratio of the variance of the signal (true variable) to the sum of the variance of the signal and the variance of the noise (error in measurement).[8]

Another well-known classic result arises if the left hand side and right hand side variables are interchanged in the regression specification. The inverse of the OLS estimator of the coefficient in the reverse regression, g, gives an upward biased estimate of the true coefficient β. In large samples we have the bounds $\operatorname{plim} b < \beta < \operatorname{plim} g$. A less well known result tells the width of the bounds since $b/g = R^2$, where the $R^2$ arises from the regression equation. Thus, for cross section econometrics where $R^2$ is often about 0.3 the bounds can be quite wide, while in a time series application, if the $R^2$ is quite high, mismeasurement will not have a large effect on the least squares estimate.[9]

As a last result, note that if the left hand side variable is measured with error but not the right hand side variable, the OLS estimator b would be unbiased under the usual Gauss-Markov assumptions.[10] Indeed, the new measurement error term, ω, assumed to

[6] The i.i.d. assumption will be maintained throughout the paper.
[7] This result does not hold in general if the measurement error is correlated with z or with ε, although the downward bias still often occurs. A straightforward calculation demonstrates the effect of correlated measurement error on the estimator b.
[8] For the effect on the estimated coefficients of the other variables in the regression specification that have been partialled out see Meijer and Wansbeek (2000).
[9] In time series, the appropriate $R^2$ is after partialling out other right hand side variables, which can reduce the original $R^2$ by quite a lot. Also, note that this result demonstrates that the time series version of the permanent income hypothesis, in its most simple form of equation (1) with z and x permanent and measured income, does not lead to a significantly greater measured MPC because the $R^2$ in time series data is quite high.
[10] By Gauss-Markov assumptions I mean the assumptions that lead to OLS being the best linear unbiased estimator.

be uncorrelated with ε would not be separately identified because the measured variable q = y + ω would be equivalent to replacing y by q in the original regression specification and denoting the residual as ν = ε + ω. The only result would be reduced precision in the estimate b, a lower t-statistic, and a reduced $R^2$. However, if both left and right hand side variables are mismeasured with errors of measurement uncorrelated with each other and with ε, the b remains the same as above but g increases because the $R^2$ has decreased since ν has replaced ε and $\sigma_\nu^2 = \sigma_\varepsilon^2 + \sigma_\omega^2$. Thus, the bounds for the true coefficient increase when measurement error occurs in both the left and right hand side variables.

Most solutions to the mismeasurement problem for the linear specification in econometrics depend on the use of instrumental variables.[11] Instrumental variables are assumed to be correlated with the true z but uncorrelated with either ε or η. Let the vector of instrumental variables be w; the instrumental variable (IV) estimator will use a linear combination of the w to achieve a consistent estimator $b_{IV}$ of the true coefficient β. In the usual case under Gauss-Markov assumptions 2SLS will give the efficient instrumental variable estimator, while in more general situations of conditional heteroscedasticity the White (1984) efficient IV estimator should be used.[12]

However, a significant understanding has emerged over the past few years that IV estimation of the errors-in-variables model can lead to problems of inference in the situation of weak instruments, which can arise when the instruments do not have a high degree of explanatory power for the mismeasured variable(s) or when the number of instruments becomes large. The situation of weak instruments causes conventional (first order) asymptotic theory to provide a poor guide to finite sample inference. These problems of inference in the weak instrument situation can arise when conventional (first order) asymptotic inference techniques are used.
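
The instrumental variable idea can be illustrated with a minimal sketch that is not taken from the paper. With a single instrument w, the IV estimator reduces to the ratio of sample covariances cov(w, y)/cov(w, x). The data generating process below (normal draws, β = 1, an instrument w = πz + v whose strength is set by the hypothetical parameter `pi`) and all function names are assumptions made for illustration only.

```python
import random
import statistics


def cov(a, b):
    """Sample covariance (divisor n) of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / len(a)


def ols_and_iv(rng, n, beta, pi):
    """One sample from equation (1) plus an instrument w = pi*z + v."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [zi + rng.gauss(0.0, 1.0) for zi in z]         # mismeasured regressor
    y = [beta * zi + rng.gauss(0.0, 1.0) for zi in z]  # outcome
    w = [pi * zi + rng.gauss(0.0, 1.0) for zi in z]    # instrument
    b_ols = cov(x, y) / cov(x, x)
    b_iv = cov(w, y) / cov(w, x)                       # just-identified IV
    return b_ols, b_iv


def summarize(pi, n=500, reps=200, beta=1.0, seed=0):
    """Median OLS, median IV, and interquartile range of IV over repeated samples."""
    rng = random.Random(seed)
    draws = [ols_and_iv(rng, n, beta, pi) for _ in range(reps)]
    ols = [d[0] for d in draws]
    iv = [d[1] for d in draws]
    q1, _, q3 = statistics.quantiles(iv, n=4)
    return statistics.median(ols), statistics.median(iv), q3 - q1


if __name__ == "__main__":
    for pi in (1.0, 0.05):
        ols_med, iv_med, iv_iqr = summarize(pi)
        print(f"pi={pi}: median OLS={ols_med:.3f}, median IV={iv_med:.3f}, IV IQR={iv_iqr:.3f}")
```

In this sketch the strong instrument (`pi=1.0`) should give a median IV estimate close to β while OLS is attenuated. With the weak instrument (`pi=0.05`) the IV estimates become far more dispersed, which is one symptom of the inference problems described above.
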
In particular, conventional first order

[11] Of course, it has long been known that consistent estimators exist in the non-normal case, which do not require instrumental variables; see e.g. Aigner et al. (1984). However, these estimators have not been used very much in econometrics.
[12] Henceforth, I will confine the choice of estimators and tests to the Gauss-Markov setup of no heteroscedasticity or serial correlation.

asymptotics can lead to a lack of indication of a problem even though significant (large sample) bias is present because estimated standard errors are not very accurate. Hahn-Hausman (HH 1999) take a new approach to identifying the potential problem and use higher order asymptotic distribution theory to determine if the conventional first order IV asymptotics are reliable in a particular situation. HH demonstrate that the finite sample bias depends on 3 factors: (1) bias is a monotonically increasing function of $\beta\sigma_\eta^2$, the variance of the measurement error multiplied by the true coefficient; (2) bias is a monotonically increasing function of K, the number of instrumental variables; and (3) bias is a monotonically decreasing function of the $R^2$ of the reduced form specification of x regressed on the instrumental variables w.

The new specification test takes the general approach of Hausman (1978) and estimates the same parameter(s) in two different ways. In particular, we compare the difference of the forward (conventional) 2SLS estimator of the coefficient of the right hand side mismeasured variable with the reverse regression 2SLS estimator of the same unknown parameter when the normalization is changed.[13] Under the null hypothesis that conventional first order asymptotics provides a reliable guide, the two estimates should be very similar. Indeed, they have unitary correlation according to first order asymptotic distribution theory. If the IV estimator is working well, the forward and reverse estimators take the form under conventional (first order) asymptotics $\operatorname{plim} \sqrt{n}\,(b_{IV} - g_{IV}) = 0$, so that the forward and reverse estimators for β should be quite close. However, if they are not close, where the measure of closeness is determined by second order asymptotics in HH, then problems of inference are likely to be present. When second order asymptotic distribution theory is used, the two estimators will differ due to second order bias terms.
The HH test subtracts off these bias terms and then sees whether the resulting difference in the two estimates satisfies the results of second order asymptotic theory. If it does and the second order bias term is small, the HH test does not reject the use of first order asymptotic theory. If the new specification test rejects, HH then consider estimation of the equation by second-order unbiased estimators of the type first proposed by Nagar (1959). A second specification test is then used to see

[13] Thus, 2SLS is applied to the forward and reverse regression, which was discussed above.

if these second-order unbiased estimators provide a suitable basis for inference. The HH specification test may be quite useful in cross section situations with a mismeasured right hand side variable because of the significant finite sample bias that often exists along with instruments that do not have high explanatory power for the mismeasured variable.

II. Non-Linear Models With Mismeasured Variables

The linear model with measurement error is equivalent to a linear simultaneous equation model, so that two stage least squares or a closely related estimator is used. However, this relationship no longer holds in the nonlinear regression framework, as noted by Y. Amemiya (1985). The reason that 2SLS no longer leads to a consistent estimator in the nonlinear errors in variables problem is that the error of measurement is no longer additively separable from the true variable in the nonlinear regression model. Application of an IV estimator such as 2SLS or nonlinear 2SLS (N2SLS) leads to inconsistent estimates. A straightforward way in which to see why 2SLS or N2SLS does not yield consistent estimators in the nonlinear errors in variables model is to consider the linear in parameters and nonlinear in variables specification for equation (1) where z is replaced by the nonlinear function g(z), which is assumed to be a sufficiently smooth function to do Taylor approximations. Replacing the unobservable z with the observed variable x leads to higher order terms of the Taylor expansion which enter the regression error term but contain the true variable z. Thus, as demonstrated by Hausman-Newey-Powell (HNP, 1995), IV estimation will be inconsistent because the instruments will necessarily be correlated with both the mismeasured right hand side and the error term in the econometric specification.

Hausman-Ichimura-Newey-Powell (HINP, 1991) demonstrate how to solve the problem and achieve consistent estimates in the case of a polynomial specification:

$$
y_i = \sum_{j=0}^{K} \beta_j z_i^{\,j} + \varepsilon_i, \qquad i = 1, \ldots, n \qquad (2)
$$

where the unobserved variable z is replaced by x, which is observed with measurement error η as in equation (1). HINP develop a consistent estimator under either of two sets of assumptions. The first is a repeated measurement specification where w = z + v and v is independent of z, where the stronger (than no correlation) assumption is required because of the nonlinear specification. The second specification is the instrumental variables specification $z_i = w_i'\delta + v_i$ where v is independent of w so that $x_i = w_i'\delta + v_i + \eta_i$. The unknown parameter vector δ is estimated using OLS.

The HINP (1991) approach uses estimates of higher order moments to derive consistent estimators. The usual IV approach multiplies through equation (1) by $w_i'd$, where d is the estimate of δ. Then taking probability limits allows a solution of the normal equations for b. In the polynomial specification HINP use higher order moments of the indicator variable $w_i^p$ or the instrument variable $(w_i'd)^p$ to multiply both sides of the equation. HINP (1991) derive recursive relationships, which permit calculation of the estimates of the elements of the unknown parameter vector and solution for the vector b, which is a consistent estimate of the unknown parameters $\beta_j$ in equation (2). HINP also derive asymptotic variance estimators for the b vector.

HNP (1995) use this approach to estimate Engel curves for family expenditure in the U.S. Estimation of Engel curves has long been an area of interest among econometricians, where the "Leser (1963)-Working" form of Engel curve in which budget shares are regressed on the log of income or expenditure has been widely adopted in recent research. However, in a notable paper Gorman (1981) considered Engel curves in which either expenditure or budget shares are specified as polynomials in functions of expenditure, e.g. log of expenditure. Given an "exactly aggregable function", Gorman demonstrates that the rank of the matrix of coefficients for the polynomial terms in income is at most three. HNP investigate Engel curve specifications of the Gorman form and provide tests of his rank three restriction.
Few studies of Engel curves have used estimators other than least squares or nonlinear least squares. HNP investigate two alternative sets of instruments: expenditure in other periods, which follows from a life cycle model approach to consumption, or

determinants of income and expenditure such as education and age. HNP use the 1982 U.S. Consumer Expenditure Survey (CES). The CES collects data from families over four quarters so that HNP can apply the repeated measurement technique. HNP use budget share and total expenditure for each family from 1982:1 and for the repeated measurement total expenditure from 1982:2. They estimate Engel curves on five commodity groups: food, clothing, recreation, health care, and transportation. HNP find that the usual assumption of constant budget share elasticities, which the Leser-Working specification imposes, appears inconsistent with the 1982 CES data. HNP also find that a Hausman (1978) type specification test of the IV estimates versus the OLS estimates strongly rejects the OLS estimates, indicating the importance of the measurement error specification. Thus, HNP find strong evidence that use of current expenditure in estimation of Engel curves on micro data leads to errors in variables problems.

Lastly, HNP explore the Gorman results. For the polynomial specification of equation (2), the rank restriction takes the form that the ratio of coefficients of the cubic terms to the coefficients of the quadratic terms will be constant across budget share equations. HNP estimate the "Gorman statistic" to see whether the coefficients in the cubic specification of equation (2) have rank three. HNP find a rather remarkable result: the ratios of the coefficients are extremely close in actual values and precisely estimated. The ratios of the coefficients for the different budget share equations are 25.0, -25.2, -25.1, , and .[14] The near equality of the ratios provides strong support for the Gorman hypothesis. Thus, use of consistent estimates of a mismeasurement model for Engel curve analysis provides an interesting result that a restriction from economic theory applies to the data.
In a recent doctoral thesis at MIT, Schennach (2000) has extended the HINP (1991) results from the polynomial specification of equation (2) to the consistent estimation of nonlinear models with measurement errors in the explanatory variables when one repeated observation exists. Thus, regression models with nonlinear functions such as g(z), where z is an unobserved variable, can be estimated consistently. Schennach uses the Fourier transform to convert the integral equations that relate the

[14] The HINP consistent estimates provide stronger support for the Gorman hypothesis than do the OLS estimates.

distribution of the unobserved true variables to the mismeasured observed variables into algebraic equations. She then solves these equations to identify moments of the true unobserved variables. These moments are used to construct a traditional nonlinear least squares estimator. While HINP restricted their estimator to polynomial moments of the unobserved variables and their product with the left hand side variable, Schennach allows for arbitrary functions of the mismeasured right hand side variable x.[15] Thus, the approach is considerably more flexible than the HINP approach, although it does involve calculation of the empirical Fourier transforms. Schennach estimates nonlinear Engel curves using CES data from 1984:1, using the next quarter expenditure as the repeated measurement. For both of her nonlinear specifications she finds that for 4 out of the 5 budget categories that she estimates, a Hausman (1978) test rejects the traditional NLS estimator in favor of her estimator that allows for mismeasured right hand side variables. Thus, both HNP (1995) and Schennach (2000) find that in estimation of Engel curves, a least squares approach which uses current expenditure instead of a permanent income approach leads to inconsistent estimates where the usual downward bias (attenuation) is present in the least squares estimates.

III. Measurement Error in the Left Hand Side Variables: Probit and Logit

I now consider the consequences of mismeasurement in the left hand side variable in binary outcome models where the observed left hand side variable is a function of an unobserved (latent) dependent variable. This specification often arises in limited dependent variable models. The outcome is different from the usual regression specification where a mismeasured left hand side variable does not lead to problems, as discussed above. In the limited dependent variable specification, misclassification of the left hand side variable leads to biased and inconsistent estimators. The usual latent variable specification has an unobserved latent variable $y_i^*$ and an observed variable $y_i = 1$ if $y_i^* \ge 0$ and $y_i = 0$ if $y_i^* < 0$.
The regression specification that allows for misclassification is:

[15] In particular, consistent estimation for probit and logit models with mismeasured right hand side (explanatory) variables follows.

$$
q_i = \alpha_0 + (1 - \alpha_0 - \alpha_1)\, F(z_i'\beta) + \varepsilon_i \qquad (3)
$$

where $\alpha_0$ is the probability that q = 1 although the true y = 0, $\alpha_1$ is the probability that q = 0 although the true y = 1, and F is the cumulative distribution for the probit model or logit model. With no mismeasurement $\alpha_0 = \alpha_1 = 0$, and nonlinear least squares (NLS) or maximum likelihood estimation (MLE) of equation (3), imposing the no misclassification assumption, leads to consistent estimates. If misclassification is present, however, both NLS and MLE that impose this assumption lead to biased and inconsistent estimation. But if equation (3) is estimated by NLS allowing for misclassification, Hausman, Abrevaya, and Scott-Morton (HAS 1998) demonstrate that consistent estimation results. Thus, the estimated coefficients $\alpha_0$ and $\alpha_1$ provide a specification test for mismeasurement. Similarly, HAS demonstrate that MLE which permits misclassification is straightforward, and consistent estimates result so long as a monotonicity condition holds: $\alpha_0 + \alpha_1 < 1$. This monotonicity condition is relatively weak since it says that the combined probability of misclassification is not so high that on average you cannot tell which result actually occurred.

Thus, two results arise in the binary limited dependent variable cases which are different from the classical regression specification with measurement error in the left hand side variable: (1) inconsistent estimation results from the mismeasurement and (2) consistent estimation does not require instrumental variables. The analysis extends to discrete response models with more than two categories, and MLE is consistent. HAS find that relatively small amounts of misclassification, as little as 2%, can lead to significant amounts of large sample bias in Monte Carlo experiments.

The assumption of normally distributed disturbances required by the probit specification or extreme value disturbances for the logit specification, used in the specification of F in equation (3), is not necessary for consistent estimation. Thus, HAS also demonstrate that the monotonicity condition allows for semiparametric estimation where no cumulative distribution needs to be assumed.
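
The systematic part of equation (3) follows from the law of total probability: Pr(q = 1 | z) = (1 − α1)·F(z'β) + α0·(1 − F(z'β)) = α0 + (1 − α0 − α1)·F(z'β). The sketch below is not from the paper. It checks this relationship by simulation for a probit F at a single value of the index z'β; the parameter values and the function names `observed_probability` and `simulate_observed_frequency` are illustrative assumptions.

```python
import random
from statistics import NormalDist


def observed_probability(index, alpha0, alpha1):
    """Systematic part of equation (3) with a probit F evaluated at the index z'beta."""
    F = NormalDist().cdf(index)
    return alpha0 + (1.0 - alpha0 - alpha1) * F


def simulate_observed_frequency(index, alpha0, alpha1, n=100_000, seed=0):
    """Share of observed q = 1 when the true outcome y is misclassified at random."""
    rng = random.Random(seed)
    ones = 0
    for _ in range(n):
        y = 1 if index + rng.gauss(0.0, 1.0) >= 0 else 0   # y = 1 if latent y* >= 0
        if y == 0:
            q = 1 if rng.random() < alpha0 else 0           # reported 1, truly 0
        else:
            q = 0 if rng.random() < alpha1 else 1           # reported 0, truly 1
        ones += q
    return ones / n


if __name__ == "__main__":
    idx, a0, a1 = 0.5, 0.05, 0.20   # illustrative values only
    print(observed_probability(idx, a0, a1))
    print(simulate_observed_frequency(idx, a0, a1))
```

Because F lies between 0 and 1, the observed response probability is confined to the interval from α0 to 1 − α1, which is why a probit or logit that ignores misclassification fits the wrong response curve.
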
HAS demonstrate that the maximum rank correlation (MRC) estimator of Han (1987) provides a consistent

estimator for β. A further advantage of the MRC estimator is that a more flexible model of misclassification is sufficient for consistent estimation, as the exact form of misclassification need not be specified or estimated. HAS then demonstrate how to non-parametrically estimate the response function F as a function of z'b, where b is the consistent estimate, using an isotonic regression (IR) method. The IR method works in the current situation since F is monotonic in z'β because it is a distribution function. The IR technique imposes a monotonicity condition on the left hand side variable in a regression specification. The IR estimator is pointwise consistent, and HAS develop asymptotic standard errors and confidence intervals for the consistent estimator of F.

As an empirical example HAS specify and estimate a model of job change using both the CPS (Current Population Survey) and PSID (Panel Study of Income Dynamics) data sets. The results from the CPS data set, discussed here, demonstrate strong evidence of misclassification. Using the MLE of a probit specification which allows for misclassification, HAS find that $\alpha_0$, the probability of misclassification for non-job changers, is estimated to be 6% and is very precisely estimated, and $\alpha_1$, the probability of misclassification for job changers, is estimated to be 31% and is also very precisely estimated. The MLE estimates of many of the right hand side variables change by large amounts when misclassification is permitted. HAS then use the MRC and IR estimators that do not impose functional form restrictions. They find that $\alpha_0$, the probability of misclassification for non-job changers, is estimated to be 4% and is very precisely estimated, and $\alpha_1$, the probability of misclassification for job changers, is estimated to be 40% and is also very precisely estimated. Thus, HAS find that the semiparametric estimates are quite similar to the MLE estimates that allow for misclassification. Also, the MRC estimates of most of the right hand side variables change by only relatively small amounts compared to the parametric MLE estimates that allow for misclassification.
Thus, misclassification appears to be a potentially serious problem in micro data where inconsistent results may arise. While the combination MRC/IR approach is quite flexible, the limited empirical experience of HAS appears to

demonstrate that the MLE approach to probit or logit allowing for misclassification may give reasonable results in many actual empirical situations.

IV. Measurement Error in the Left Hand Side Variables: More General Models

The model in the last section is a particular example of a class of models that take the form:

$$
y_i^* = f(z_i'\beta) + \varepsilon_i \qquad (4)
$$

where f is strictly increasing and $y_i^*$ is an unobserved (latent) variable. To allow for mismeasurement in this more general situation Abrevaya-Hausman (AH 1999) model the observed left hand side variable q as a stochastic function of the underlying y*. AH demonstrate that this model allows for binary choice with misclassification as in HAS (1998), mismeasured discrete dependent variables, and mismeasured continuous dependent variables. In this latter situation the observed left hand side variable is continuous and is a function of the unobserved (latent) variable and a random disturbance so that $q_i = h(y_i^*, \omega_i)$. This last model specification includes the increasingly used duration and hazard models.

AH consider semi-parametric estimation of these models using either the MRC estimators or the more general monotone rank estimator (MRE) developed by Cavanagh and Sherman (1998). To achieve identification and allow consistent estimation using the MRC and MRE estimators AH develop a sufficient condition on the mismeasurement process. The sufficient condition is that the distribution of the observed variable $q_i$ for a higher latent $y_i^*$ stochastically dominates the distribution of $q_j$ for a lower latent $y_j^*$. Thus, the effect of the mismeasurement cannot on average permute the ordering of the observed left hand side variables with respect to the ordering of the unobserved latent variables. So the effect of the mismeasurement cannot be too large on average. The use of the notion of first order stochastic dominance is familiar from microeconomics

where it is used to order portfolios or risky distributions.[16] In terms of an actual problem with mismeasurement in the left hand side variable the appropriate question to ask to satisfy the sufficient condition is: Are observational units with larger true values for their left hand side variable more likely to report larger values than observational units with smaller true values? If the answer is yes, the AH techniques can be used. Again instrumental variables are not required for consistent estimation.

AH consider their approach for duration models such as the well-known Cox (1972) partial likelihood model, the Han-Hausman (1990) and Meyer (1990) specifications, and more generally the proportional hazard model. A hazard model, in the context of the return to employment for an unemployed person, answers the question of what is the probability of becoming employed in the next time period conditional on being unemployed up to the previous time period. Thus, the hazard model has a similarity to discrete response models, but it allows for the effect of past outcomes on the probability of the return to employment to affect the outcome in the current period. AH demonstrate that conventional MLE of commonly used duration models and hazard models is inconsistent when the left hand side variable is mismeasured, with the single exception of the highly restrictive Weibull duration model.[17] So if the person were actually unemployed for, say, 24 weeks and answered 26 weeks in the survey, mismeasurement will exist and would, in general, lead to inconsistent estimation. AH find in Monte Carlo experiments that both the Cox model and the Han-Hausman-Meyer (HHM) models have coefficient estimates that are attenuated, i.e. biased towards zero. Thus, duration model specifications that do not allow for mismeasurement in the left hand side variable have results similar to the classical regression model with mismeasurement in the right hand side variable.
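
The rank-based idea behind the MRC estimator can be sketched as follows. This is a minimal illustration and not the AH or HAS implementation. It assumes two normally distributed regressors, normalizes the first coefficient to one (rank estimators identify coefficients only up to scale, and the sketch assumes the first coefficient is positive), and searches a coarse grid for the second coefficient. The duration-like mismeasured variable q = exp(y* + ω) and all function names are hypothetical.

```python
import math
import random


def mrc_objective(b2, q, x1, x2):
    """Number of pairs whose ordering of q agrees with the ordering of x1 + b2*x2."""
    index = [u + b2 * v for u, v in zip(x1, x2)]
    n = len(q)
    agree = 0
    for i in range(n):
        for j in range(i + 1, n):
            if (q[i] - q[j]) * (index[i] - index[j]) > 0:
                agree += 1
    return agree


def mrc_estimate(q, x1, x2, grid):
    """Grid point that maximizes the rank correlation objective."""
    return max(grid, key=lambda b2: mrc_objective(b2, q, x1, x2))


def simulate_mrc(true_b2=1.0, n=200, seed=0):
    """Estimate b2 from a mismeasured, monotonically transformed outcome."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y_star = [a + true_b2 * b + rng.gauss(0.0, 0.5) for a, b in zip(x1, x2)]
    # Observed q is increasing in y* for any omega, so a higher latent value
    # shifts the whole distribution of q upward (stochastic dominance holds).
    q = [math.exp(y + rng.gauss(0.0, 0.5)) for y in y_star]
    grid = [k / 10 for k in range(-10, 31)]   # candidate values -1.0, ..., 3.0
    return mrc_estimate(q, x1, x2, grid)


if __name__ == "__main__":
    print(simulate_mrc())
```

Only the ordering of q enters the objective, so neither the transformation h nor the distribution of ω has to be specified, which is the sense in which no instrument and no parametric mismeasurement model is needed in this sketch.
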
AH demonstrate that MRC and MRE estimators do well in the Monte Carlo experiments in taking account of the mismeasurement and estimating the unknown parameters. AH discuss the finding of mismeasured duration data in survey or unemployment duration data. Mismeasured durations are common to all data sets that have survey

[16] See e.g. Mas-Colell, Whinston, and Green (1995), p.
[17] The Weibull specification requires a monotonic (baseline) hazard, which makes it too restrictive for most problems. Also, the particular application cannot allow for censoring (check) which is typically present in applications of duration models.

responses, and AH concentrate on the Survey of Income and Program Participation (SIPP) data set. Some common findings are that a significant number of reporting errors are found (in the CPS about 37% of unemployed workers overstated unemployment durations); longer spells have a higher proportion of reporting errors; and responses tend to be focal responses, for instance at a number of weeks that corresponds to an integer month amount, e.g. 4 or 8 weeks.

AH conduct an empirical analysis on unemployment duration in the SIPP data and use the Cox and HHM models, which do not allow for mismeasurement, as well as a semi-parametric MRE model that does permit mismeasurement. While many of the MRE parameter estimates for the demographic variables are similar to the Cox and HHM parameter estimates, one potentially important difference is found. Allowing for mismeasurement finds a much smaller and statistically insignificant effect of the UI benefit levels on unemployment duration. The previous wage interacted with receiving UI continues to be significant, but the actual size of the UI benefit ceases to be significant. In particular the commonly used replacement-rate specification is rejected when mismeasurement is permitted in the unemployment durations. AH conclude that mismeasurement in the left hand side variable can have an important effect in duration models.

V. An Unsolved Problem: Measurement Error in the Left Hand Side Variable of Quantile Regression Models

Koenker and Bassett (KB 1978) introduced the quantile regression (QR) estimator into econometrics. The QR estimator gives a view of the entire conditional distribution rather than just, say, the conditional mean. The advantage of the QR estimator is that it is a robust estimator of the regression specification so that it is not as sensitive to outliers, i.e. extreme observations.[18] Alternatively, the QR estimator performs well on efficiency grounds when the stochastic disturbance has thick tails that depart from the normal (Gaussian) assumption. Lastly, the QR estimator does well in the heteroscedastic situation.
The QR estimator estimates a regression specification at a number of θ-th regression quantiles for 0 < θ < 1 in the regression model specification $\varepsilon_i(\theta) = y_i - z_i'\beta(\theta)$. The QR estimator thus generalizes the regression median estimator

for the regression quantile of θ = 1/2. Interesting applications of the QR estimator are increasingly common. Buchinsky (1994) used the QR technique to explore changes in the wage distribution in the CPS over a 25 year period. Buchinsky focuses on changes in the returns to education and experience at different points of the wage distribution, given that he believes that heteroscedasticity is likely to be an important factor. He finds that the returns to education and experience are different at the different quantiles, θ = {0.1, 0.25, 0.5, 0.75, 0.9}.

To the best of my knowledge, no one has explored the effect of mismeasured left hand side variables for the QR approach. As before, assume that the measured left hand side variable is q = y + ω. Mismeasurement in the left hand side variable might be expected in some applications. For instance, in the Buchinsky paper the main left hand side variable is the weekly wage, defined in the CPS as the total income from wages and salaries last year divided by the number of weeks worked last year. Both numerator and denominator arise from survey responses in the CPS (check). However, it is not difficult to demonstrate that a mismeasured left hand side variable will lead to inconsistent QR estimation because the stochastic disturbance is now ε + ω. The presence of the error in measurement attenuates the quantile coefficient estimates, leading to estimates that are linear combinations of the actual quantile coefficients β(θ) for different θ's. As an empirical example I took a QR specification with a constant and one right hand side variable in the presence of heteroscedasticity and then added a (standard) normal measurement error to the left hand side variable. The results for estimates of the slope coefficient are in Table 1:

[18] See the paper by Professor Koenker in this issue.

Table 1: Estimated QR Slope Coefficients with and without Measurement Error (ME)19

Quantile    No ME    Normal ME

As expected, the QR slope estimates with normal measurement error are attenuated towards the median coefficient because of the mixture of the regression error ε and the measurement error ω. While the QR estimator is robust to certain data features that create problems for the classical linear regression model, it is not robust to a mismeasured left hand side variable, which does not create problems for the classical linear regression model. I have not yet found a solution to this problem, similar to the solution for the limited dependent variable problems and duration model problems. Finding a solution seems a good topic for future research.

19 I used 10,000 observations in the Monte Carlo runs. The right hand side variable is drawn from a uniform distribution. The specification contains significant conditional heteroscedasticity.
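The attenuation described for Table 1 can be illustrated with a small simulation. The sketch below is not the author's code. The design (intercept 1, slope 1, scale function 0.1 + 2z, standard normal ε, the seed) and the function name quantile_regression are assumptions chosen for illustration, since the draft does not report them; only the sample size of 10,000, the uniform regressor, the conditional heteroscedasticity and the standard normal measurement error come from the text. The check-function minimizer is approximated here with a simple majorize-minimize iteration in NumPy rather than an exact linear-programming solver.

```python
import numpy as np


def quantile_regression(X, y, theta, n_iter=500, eps=1e-6, tol=1e-8):
    """Approximate argmin_b sum_i rho_theta(y_i - x_i'b) by majorize-minimize.

    Uses rho_theta(r) = |r|/2 + (theta - 1/2) r and the quadratic bound
    |r| <= r^2 / (2|r_k|) + |r_k|/2, so each step is a weighted least
    squares solve. Hypothetical helper, written for this illustration.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS starting value
    shift = (2.0 * theta - 1.0) * X.sum(axis=0)      # linear term of the surrogate
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / (eps + np.abs(r))                  # eps guards against zero residuals
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y + shift)
        done = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if done:
            break
    return beta


# Assumed Monte Carlo design (illustrative values, not taken from the paper).
rng = np.random.default_rng(0)
n = 10_000
z = rng.uniform(0.0, 1.0, n)                         # right hand side variable
e = rng.standard_normal(n)                           # regression error
omega = rng.standard_normal(n)                       # standard normal measurement error
y = 1.0 + 1.0 * z + (0.1 + 2.0 * z) * e              # heteroscedastic outcome
q = y + omega                                        # mismeasured left hand side variable

X = np.column_stack([np.ones(n), z])
thetas = (0.1, 0.25, 0.5, 0.75, 0.9)
slope_clean = {t: quantile_regression(X, y, t)[1] for t in thetas}
slope_me = {t: quantile_regression(X, q, t)[1] for t in thetas}

print("Quantile    No ME   Normal ME")
for t in thetas:
    print(f"{t:8.2f} {slope_clean[t]:8.3f} {slope_me[t]:11.3f}")
```

In this assumed location-scale design the slope at quantile θ without measurement error is 1 + 2Φ⁻¹(θ), so the first column should fan out around the median slope of 1, while the second column should lie closer to it at the outer quantiles.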

References

Abrevaya, J. and J. Hausman (1999), "Semiparametric Estimation with Mismeasured Dependent Variables: An Application to Duration Models for Unemployment Spells", Annales d'Economie et de Statistique, 55-56.
Bound, J., D. A. Jaeger, and R. M. Baker (1997), "Problems with Instrumental Variables Estimation when the Correlation Between the Instruments and the Endogenous Explanatory Variable is Weak", Journal of the American Statistical Association, 90.
Adcock, R. (1887), "A Problem in Least Squares", The Analyst, 5.
Aigner, D.J., C. Hsiao, A. Kapteyn, and T. Wansbeek (1984), "Latent Variable Models in Econometrics", in Z. Griliches and M. D. Intriligator, Handbook of Econometrics, vol. II.
Amemiya, Y. (1985), "Instrumental Variable Estimator for the Nonlinear Errors-in-Variables Model", Journal of Econometrics, 28.
Buchinsky, M. (1994), "Changes in the U.S. Wage Structure: Application of Quantile Regression", Econometrica, 62.
Cavanagh, C. and R. Sherman (1998), "Rank Estimators for Monotonic Index Models", Journal of Econometrics, 84.
Cox, D.R. (1972), "Regression Models and Life Tables (with discussion)", Journal of the Royal Statistical Society, B, 34.
Gorman, W.M. (1981), "Some Engel Curves", in A. Deaton ed., Essays in the Theory and Measurement of Consumer Behaviour in Honor of Sir Richard Stone, Cambridge: Cambridge Univ. Press.
Hahn, J. and J. Hausman (1999), "A New Specification Test for the Validity of Instrumental Variables", forthcoming Econometrica.
Han, A. (1987), "Non-Parametric Analysis of a Generalized Regression Model: The Maximum Rank Correlation Estimator", Journal of Econometrics, 35.
Han, A. and J. Hausman (1990), "Flexible Parametric Estimation of Duration and Competing Risk Models", Journal of Applied Econometrics, 5.
Hausman, J. (1978), "Specification Tests in Econometrics", Econometrica, 46.
Hausman, J., J. Abrevaya, and F. Scott-Morton (1998), "Misclassification of the Dependent Variable in a Discrete-Response Setting", Journal of Econometrics, 87.
Hausman, J., H. Ichimura, W. Newey, and J. Powell (1991), "Measurement Errors in Polynomial Regression Models", Journal of Econometrics, 50.
Hausman, J., W. Newey, and J. Powell (1995), "Measurement Errors in Polynomial Regression Models", Journal of Econometrics, 65.
Koenker, R. and G. Bassett (1978), "Regression Quantiles", Econometrica, 46.
Leser, C.E.V. (1963), "Forms of Engel Functions", Econometrica, 31.
Liviatan, N. (1961), "Errors in Variables and Engel Curve Analysis", Econometrica, 29.
Mas-Colell, A., M. Whinston, and J. Green, Microeconomic Theory, Oxford Univ. Press.
Meijer, E. and T. Wansbeek, "Measurement Error in a Single Regressor", Economics Letters, 69.
Meyer, B. (1990), "Unemployment Insurance and Unemployment Spells", Econometrica, 58.
Nagar, A.L. (1959), "The Bias and Moment Matrix of the General k-class Estimators of the Parameters in Simultaneous Equations", Econometrica, 27.
Schennach, S. (2000), "Estimation of Nonlinear Models with Measurement Error", unpublished MIT Ph.D. thesis.
White, H. (1984), Asymptotic Theory for Econometricians, New York: Academic Press.
