DIRECT SEMIPARAMETRIC ESTIMATION OF SINGLE-INDEX MODELS WITH DISCRETE COVARIATES


by

Joel L. Horowitz
Department of Economics
University of Iowa
Iowa City, IA

and

Wolfgang Härdle
Institut für Statistik und Ökonometrie
Humboldt-Universität zu Berlin
Berlin, Germany

October 1994

Abstract

Härdle and Stoker (1989), Powell, et al. (1989), and Stoker (1991) have developed average derivative estimators of the parameter $\beta$ in the single-index model $E(Y \mid X = x) = G(x\beta)$, where $G$ is an unknown function and $X$ is a random vector. These estimators are non-iterative and, therefore, easy to compute. However, they require $X$ to be continuously distributed, which precludes their use in many applications. This paper develops a non-iterative, easily computed estimator of $\beta$ for models in which some components of $X$ are discrete. Coefficients of continuous components of $X$ are obtained using average derivative techniques. Coefficients of discrete components are obtained from differences between certain integrals of $G$. The estimator is $n^{1/2}$-consistent and asymptotically normal. An application to data on product innovation by German manufacturers illustrates the estimator's usefulness.

KEY WORDS: Index model, Average derivative estimation

Research supported in part by Deutsche Forschungsgemeinschaft, Sonderforschungsbereich 373, "Quantifikation und Simulation Ökonomischer Prozesse." The research of Joel L. Horowitz was supported in part by NSF grants DMS and SBR

1. INTRODUCTION

A single-index regression model has the form

$$E(Y \mid X = x) = G[v(x, \beta)], \tag{1.1}$$

where $Y$ is a scalar dependent variable, $X$ is a vector of explanatory variables, $\beta$ is a vector of parameters whose values are unknown, $v(\cdot,\cdot)$ is a known function, and $G$ is a function that may or may not be known. Many widely used parametric models have this form. Examples are linear regression, binary logit and probit, and tobit models. These models assume that $G$ is known up to a finite-dimensional parameter. When $G$ is unknown, (1.1) provides a specification that is more flexible than a parametric model while avoiding the loss of precision that occurs in fully nonparametric estimation with a multidimensional $x$. In most applications $v(x, \beta) = x\beta$, where $x'$ and $\beta$ are $k \times 1$ vectors. Thus,

$$E(Y \mid X = x) = G(x\beta). \tag{1.2}$$

This paper is concerned with estimating $\beta$ in (1.2) when $G$ is unknown.

Several estimators of $\beta$ that do not require a parametric specification of $G$ already exist. Ichimura (1993) developed a semiparametric least squares estimator of $\beta$. This estimator is closely related to projection pursuit regression (Friedman and Stuetzle 1981). Han (1987) and Sherman (1993) describe a maximum rank correlation estimator. Klein and Spady (1993) developed a quasi-maximum-likelihood estimator for the case in which $Y$ is binary. This estimator achieves the asymptotic efficiency bound of Cosslett (1987) if $G$ is a distribution function. The estimators of Ichimura, Han, Klein and Spady, and Sherman are $n^{1/2}$-consistent and asymptotically normal under regularity conditions.

The foregoing estimators have the disadvantage of being difficult to compute because they require solving nonlinear optimization problems whose objective functions are not necessarily concave (convex in the case of semiparametric least squares) or unimodal. If $X$ is a continuous random variable, the computational difficulty of estimating $\beta$ can be greatly reduced through the use of average-derivative estimators. These estimators rely on the fact that for any weight function $w(\cdot)$, $\beta \propto E[w(X)\,\partial G(X\beta)/\partial X]$. Average-derivative estimation does not require solving an optimization problem, and computation of average-derivative estimates is non-iterative and fast. The estimators are $n^{1/2}$-consistent and asymptotically normal under regularity conditions (Härdle and Stoker 1989, Powell et al. 1989, Stoker 1991).

Average-derivative methods cannot be used to estimate components of $\beta$ that multiply discrete components of $X$. This is because derivatives of $G(X\beta)$ with respect to discrete components of $X$ are not identified. Since $X$ has discrete components in many applications, a direct (non-iterative) method for estimating $\beta$ when $X$ has such components is needed. This paper develops such a method. The resulting non-iterative estimator is much easier to compute than estimators that require solving nonlinear optimization problems.

Section 2 of this paper describes the estimator and its properties. Section 3 presents the results of a Monte Carlo investigation of the estimator's finite-sample properties. Section 4 illustrates the use of the estimator by applying it to data on product innovation by German manufacturers. Section 5 presents concluding comments. All proofs are in the appendix.

2. THE ESTIMATOR

In order to distinguish between continuous and discrete covariates, we rewrite (1.2) in the form

$$E(Y \mid X = x, Z = z) = G(x\beta + z\alpha), \tag{2.1}$$

where $X$ denotes a $1 \times k$ vector of continuous random variables, $Z$ denotes a $1 \times \ell$ vector of discrete random variables, and $\beta$ and $\alpha$ are conformable vectors of parameters that must be estimated from data. Identification of $\beta$ and $\alpha$ requires that (2.1) have at least one continuous explanatory variable (Ichimura 1993, Klein and Spady 1993, Manski 1988), so $k \ge 1$. There do not have to be any discrete explanatory variables, but we assume that there is at least one since the focus of this paper is on estimating $\alpha$. Thus, $\ell \ge 1$. Since $\beta$ and $\alpha$ are identified only up to sign and scale, sign and scale normalizations are needed. We use $\beta_1 = 1$, where $\beta_1$ is the coefficient of the first component of $X$. Let $\tilde X$ and $\tilde\beta$ denote components 2 through $k$ of $X$ and $\beta$, respectively, if $k > 1$.
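The proportionality that average-derivative estimators exploit, and the way the normalization $\beta_1 = 1$ removes the unknown scale factor, can be written out in one step. The following is only a sketch: it assumes that $G$ is differentiable and that the scalar $E[w(X)G'(X\beta + Z\alpha)]$ is nonzero, and the symbol $\mu_w$ is introduced here purely for illustration.

$$
\mu_w \equiv E\!\left[w(X)\,\frac{\partial G(X\beta + Z\alpha)}{\partial X}\right]
= E\!\left[w(X)\,G'(X\beta + Z\alpha)\right]\beta,
\qquad\text{so that}\qquad
\tilde\beta = \frac{\tilde\mu_w}{\mu_{w,1}},
$$

where $\mu_{w,1}$ is the first component of $\mu_w$ and $\tilde\mu_w$ collects components 2 through $k$. No analogous chain-rule step exists for $Z$, because $G(x\beta + z\alpha)$ is observed as a function of $z$ only at the finitely many points of the support of $Z$. This is why $\alpha$ requires a different approach.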

The main problem to be solved here is estimating $\alpha$. The parameter $\tilde\beta$ can be estimated using existing methods. For example, one can use average-derivative methods to estimate $\tilde\beta$ for each $z$ in the support of $Z$ and then form a (possibly weighted) average of these estimates. Accordingly, in the remainder of this section we concentrate on estimating $\alpha$.

2.1 Informal Description of the Estimator

The essential idea of our estimator of $\alpha$ can be understood most easily by assuming for the moment that $G$ is a known function. Define $S_Z \equiv \{z^{(i)} : i = 1,\dots,M\}$ to be the support of the discrete random variable $Z$. Our estimator works by deducing the horizontal distance between $G(v + z^{(i)}\alpha)$ and $G(v + z^{(1)}\alpha)$ $(i = 2,\dots,M)$ on a set of $v$ values on which $G(v + z\alpha)$ is assumed to satisfy a weak monotonicity condition. Specifically, we assume that there are finite numbers $v_0$, $v_1$, $c_0$, and $c_1$ such that $v_0 < v_1$, $c_0 < c_1$, $G(v + z\alpha) < c_0$ for each $z \in S_Z$ if $v < v_0$, and $G(v + z\alpha) > c_1$ for each $z \in S_Z$ if $v > v_1$.

To see the implications of this assumption for estimation of $\alpha$, let $I(\cdot)$ denote the indicator function. For $z \in S_Z$ define

$$J(z) = \int_{v_0}^{v_1} \big\{ c_0\, I[G(v + z\alpha) < c_0] + c_1\, I[G(v + z\alpha) > c_1] + G(v + z\alpha)\, I[c_0 \le G(v + z\alpha) \le c_1] \big\}\, dv. \tag{2.2}$$

The key fact that leads to our estimator is stated in the following equation, which is proved in lemma 1 of the appendix:

$$J[z^{(i)}] - J[z^{(1)}] = (c_1 - c_0)[z^{(i)} - z^{(1)}]\alpha; \quad i = 2,\dots,M. \tag{2.3}$$

Figure 1 gives a graphical explanation of (2.3) for a model in which $z$ is a scalar whose 2 possible values are $[z^{(2)}, z^{(1)}] = (1, 0)$, and $\alpha = 2$. Let $(c_0, c_1) = (0.2, 0.8)$, and $(v_0, v_1) = (-2.85, 0.85)$. Then, it can be seen that $J[z^{(2)}]$ is the area EFJ + ABFE + BDKJ = EFJ + $1.7c_0$ + $2c_1$. $J[z^{(1)}]$ is the area ACGE + CDHG + GHK = $2c_0$ + $1.7c_0$ + GHK. But EFJ = GHK, so $J[z^{(2)}] - J[z^{(1)}] = 2(c_1 - c_0) = (c_1 - c_0)[z^{(2)} - z^{(1)}]\alpha$.

Equation (2.3) constitutes $M - 1$ linear equations in the $\ell$ unknown components of $\alpha$. These equations may be solved for $\alpha$ if a unique solution exists. To do this, define the $(M - 1) \times 1$ vector $\Delta J$ by

$$\Delta J = \begin{bmatrix} J[z^{(2)}] - J[z^{(1)}] \\ \vdots \\ J[z^{(M)}] - J[z^{(1)}] \end{bmatrix}. \tag{2.4}$$

Also, define the $(M - 1) \times \ell$ matrix $W$ by

$$W = \begin{bmatrix} z^{(2)} - z^{(1)} \\ \vdots \\ z^{(M)} - z^{(1)} \end{bmatrix}. \tag{2.5}$$

Then it can be proved (see lemma 1 of the appendix) that if $W'W$ is a nonsingular matrix,

$$\alpha = (c_1 - c_0)^{-1}(W'W)^{-1}W'\Delta J. \tag{2.6}$$

Equation (2.6) forms the basis for our estimator of $\alpha$. Of course, (2.6) cannot be used directly in estimation because $G(v + z\alpha)$ and, therefore, $\Delta J$ are not known in applications. We solve this problem by replacing $G(v + z\alpha)$ in (2.2) with a nonparametric regression estimator of $E(Y \mid Xb_n = v, Z = z)$, where $b_n$ is the estimator of $\beta$. We use a kernel estimator because it is relatively easy to analyze and implement, but other estimators could be used. Denote the estimator of $G(v + z\alpha)$ by $G_{nz}(v)$. The estimator of $\alpha$ is

$$\alpha_n = (c_1 - c_0)^{-1}(W'W)^{-1}W'\Delta J_n, \tag{2.7}$$

where

$$\Delta J_n = \begin{bmatrix} J_n[z^{(2)}] - J_n[z^{(1)}] \\ \vdots \\ J_n[z^{(M)}] - J_n[z^{(1)}] \end{bmatrix} \tag{2.8}$$

and for each $z \in S_Z$

$$J_n(z) = \int_{v_0}^{v_1} \big\{ c_0\, I[G_{nz}(v) < c_0] + c_1\, I[G_{nz}(v) > c_1] + G_{nz}(v)\, I[c_0 \le G_{nz}(v) \le c_1] \big\}\, dv. \tag{2.9}$$

These ideas are formalized in subsection 2.2, where we give conditions under which $\alpha_n$ is consistent for $\alpha$ and $n^{1/2}(\alpha_n - \alpha)$ is asymptotically normal.
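Equations (2.2) through (2.6) can be checked numerically when $G$ is known. The sketch below assumes, purely for illustration, that $G$ is the standard normal distribution function, and it uses the numbers from the Figure 1 example ($\alpha = 2$, $z \in \{0, 1\}$, $(c_0, c_1) = (0.2, 0.8)$, $(v_0, v_1) = (-2.85, 0.85)$). The helper names `Phi`, `J` and `recover_alpha` are hypothetical, and a trapezoid rule stands in for whatever quadrature one prefers. It relies on the observation that the integrand in (2.2) is simply $G(v + z\alpha)$ clipped to the interval $[c_0, c_1]$.

```python
import numpy as np
from math import erf, sqrt

def Phi(t):
    """Standard normal CDF, elementwise (an illustrative choice of a known G)."""
    t = np.asarray(t, dtype=float)
    return 0.5 * (1.0 + np.vectorize(erf)(t / sqrt(2.0)))

def J(z, alpha, v0, v1, c0, c1, G=Phi, grid=20001):
    """Equation (2.2): integral over [v0, v1] of G(v + z alpha) clipped to [c0, c1]."""
    v = np.linspace(v0, v1, grid)
    g = np.clip(G(v + float(z @ alpha)), c0, c1)
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(v)))  # trapezoid rule

def recover_alpha(support, alpha, v0, v1, c0, c1):
    """Equations (2.4)-(2.6): solve the M - 1 linear equations for alpha."""
    Jz = np.array([J(z, alpha, v0, v1, c0, c1) for z in support])
    dJ = Jz[1:] - Jz[0]               # (2.4): differences relative to z^(1)
    W = support[1:] - support[0]      # (2.5): rows z^(i) - z^(1)
    return np.linalg.solve(W.T @ W, W.T @ dJ) / (c1 - c0)  # (2.6)

support = np.array([[0.0], [1.0]])    # rows are z^(1) = 0 and z^(2) = 1
alpha_true = np.array([2.0])
alpha_rec = recover_alpha(support, alpha_true, v0=-2.85, v1=0.85, c0=0.2, c1=0.8)
print(alpha_rec)
```

Under these assumptions the recovered value should agree with $\alpha = 2$ up to discretization error, because shifting the clipped curve by $z\alpha$ only trades area at height $c_0$ for area at height $c_1$.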

2.2 Assumptions and Results

We begin this subsection by presenting our assumptions. Let $S_V$ denote the support of the distribution of $V \equiv X\beta$. Let $f(v \mid z)$ be the probability density of $V$ conditional on $Z = z$, $p(v, \tilde x \mid z)$ be the joint density of $(V, \tilde X)$ conditional on $Z = z$, $p(z)$ be the probability that $Z = z$ $(z \in S_Z)$, and $f(v, z) = f(v \mid z)p(z)$. Let $\{Y_i, X_i, Z_i : i = 1,\dots,n\}$ be a random sample of size $n$ of $\{Y, X, Z\}$, and set $V_i \equiv X_i\beta$. Let $r \ge 4$ be an integer.

Assumption 1: (a) $S_Z$ contains a finite number of points. (b) For each $z \in S_Z$, $E(\|X\|^2 \mid Z = z) < \infty$ and $E(|Y|\,\|X\|^2 \mid Z = z) < \infty$. (c) For each $z \in S_Z$, $p(v, \tilde x \mid z)$ is everywhere 3 times continuously differentiable with respect to $v$, and the third derivative is bounded uniformly over $(v, \tilde x)$. (d) $\mathrm{Var}(Y \mid V = v, Z = z)$ is bounded over $z \in S_Z$ and $v$ in any bounded interval.

The requirement that $S_Z$ be finite can always be satisfied by truncating the distribution of $Z$.

Assumption 2: Define $W$ as in (2.5). $W'W$ is nonsingular.

Assumption 3: $E(Y \mid X = x, Z = z) = G(x\beta + z\alpha)$. $G(\cdot)$ is $r$ times continuously differentiable. $G(\cdot)$ and its first $r$ derivatives are bounded on all bounded intervals.

Assumption 3 makes $E(Y \mid X = x, Z = z)$ a single-index model. The smoothness requirements are standard in nonparametric estimation.

Assumption 4: There are finite numbers $v_0$, $v_1$, $c_0$ and $c_1$ such that $v_0 < v_1$, $c_0 < c_1$, each branch of $G^{-1}(\cdot)$ is continuous at $c_0$ and $c_1$, and for each $z \in S_Z$:
a. $G(v + z\alpha) < c_0$ if $v < v_0$,
b. $G(v + z\alpha) > c_1$ if $v > v_1$, and
c. $f(v \mid z)$ is bounded away from 0 on $[v_0, v_1]$.

The purpose of parts (a) and (b) of assumption 4 is explained in the discussion of equation (2.2). Part (c) insures that $G(v + z\alpha)$ is identified over $[v_0, v_1]$ for each $z \in S_Z$. The results presented here hold with obvious modifications if $c_0 > c_1$, $G(v + z\alpha) > c_0$ for all $z \in S_Z$ if $v < v_0$, and $G(v + z\alpha) < c_1$ for all $z \in S_Z$ if $v > v_1$. The results also hold for suitably chosen data-based values of $v_0$ and $v_1$. For example, $G(v + z\alpha)$ can be replaced by a kernel estimator of $E(Y \mid Xb_n = v, Z = z)$, and $v_0$ and $v_1$ can be chosen
to satisfy the resulting "empirical" version of assumption 4. Another data-based method for choosing $v_0$ and $v_1$ is described in Section 3.

Assumption 5: If $k > 1$, there are (a) an $n^{1/2}$-consistent estimator of $\tilde\beta$, denoted by $\tilde b_n$, and (b) a $(k - 1) \times 1$ vector-valued function $\Omega(y, x, z)$ satisfying $E\,\Omega(Y, X, Z) = 0$ and

$$n^{1/2}(\tilde b_n - \tilde\beta) = n^{-1/2}\sum_{i=1}^{n} \Omega(Y_i, X_i, Z_i) + o_p(1)$$

as $n \to \infty$.

The average-derivative estimators of Härdle and Stoker (1989), Powell et al. (1989), and Stoker (1991) satisfy this assumption, after sign and scale normalization, under regularity conditions given by these authors. All of these estimators are direct in the sense of not requiring nonlinear optimization or other iterative computations. An illustration of $\Omega$ is given in section 2.3.

For each $z \in S_Z$ define $\hat V_i = X_i b_n$,

$$A_{nz}(v) = (nh_n)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,Y_i\,K\!\left(\frac{v - \hat V_i}{h_n}\right)$$

and

$$f_{nz}(v) = (nh_n)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,K\!\left(\frac{v - \hat V_i}{h_n}\right),$$

where $K$ is a function satisfying assumption 6 below, and $\{h_n\}$ is a sequence of positive real numbers satisfying assumption 7. Set

$$G_{nz}(v) = A_{nz}(v)/f_{nz}(v).$$

Assumption 6: $K$ is a bounded, symmetrical, differentiable function that is nonzero only on $[-1, 1]$. $K'(\cdot)$, the derivative of $K$, is Lipschitz continuous. For each integer $i$ between 0 and $r$:

$$\int_{-1}^{1} v^i K(v)\,dv = \begin{cases} 1 & \text{if } i = 0 \\ 0 & \text{if } 1 \le i < r \\ \text{nonzero} & \text{if } i = r \end{cases}$$

Assumption 7: As $n \to \infty$, $nh_n^{r+3} \to \infty$ and $nh_n^{2r} \to 0$.

A higher-order kernel $(r \ge 4)$ with undersmoothing is needed to prevent $n^{1/2}(\alpha_n - \alpha)$ from being asymptotically biased. At the expense of somewhat more complex proofs, assumption 7 can be relaxed to permit $\{h_n\}$ to be a random sequence and to vary according to the value of $z \in S_Z$.

For each $z \in S_Z$, define $G_z(v) = G(v + z\alpha)$, $G_z'(v) = dG_z(v)/dv$ and

$$\Gamma_z = -\int_{v_0}^{v_1} G_z'(v)\,E(\tilde X \mid v, z)\,I[c_0 \le G(v + z\alpha) \le c_1]\,dv.$$

The following theorem shows that $\alpha_n$ is consistent and $n^{1/2}(\alpha_n - \alpha)$ is asymptotically normal under assumptions 1-7.

Theorem 1: Let assumptions 1-7 hold. As $n \to \infty$,

(a) $\alpha_n \to^{p} \alpha$;

(b) $n^{1/2}(\alpha_n - \alpha) \to^{d} N(0, \Sigma_\alpha)$, where $\Sigma_\alpha$ is the covariance matrix of $(c_1 - c_0)^{-1}(W'W)^{-1}W'$ times the $(M - 1) \times 1$ random vector whose $(j - 1)$ component $(j = 2,\dots,M)$ is

$$n^{-1/2}\sum_{i=1}^{n}\Big\{ I(Z_i = z^{(j)})\,f(V_i, z^{(j)})^{-1}[Y_i - G_{z^{(j)}}(V_i)]\,I[c_0 \le G_{z^{(j)}}(V_i) \le c_1] - I(Z_i = z^{(1)})\,f(V_i, z^{(1)})^{-1}[Y_i - G_{z^{(1)}}(V_i)]\,I[c_0 \le G_{z^{(1)}}(V_i) \le c_1] + (\Gamma_{z^{(j)}} - \Gamma_{z^{(1)}})'\,\Omega(Y_i, X_i, Z_i)\Big\}.$$

2.3 Estimating $\Sigma_\alpha$

$\Sigma_\alpha$ can be estimated consistently by replacing unknown quantities with consistent estimators. It is not difficult to show that $\Gamma_z$ is estimated consistently by

$$\Gamma_{nz} = -n^{-1}\sum_{i=1}^{n} \tilde X_i\,I(Z_i = z)\,I(v_0 \le \hat V_i \le v_1)\,I[c_0 \le G_{nz}(\hat V_i) \le c_1]\,G_{nz}'(\hat V_i)/f_{nz}(\hat V_i),$$

where $G_{nz}'(v) = dG_{nz}(v)/dv$. Define $\psi_n(y, v, z)$ to be the $(M - 1) \times 1$ vector whose $(j - 1)$ component $(j = 2,\dots,M)$ is

$$\psi_{nj}(y, v, z) = I(z = z^{(j)})\,f_{nz^{(j)}}(v)^{-1}[y - G_{nz^{(j)}}(v)]\,I[c_0 \le G_{nz^{(j)}}(v) \le c_1] - I(z = z^{(1)})\,f_{nz^{(1)}}(v)^{-1}[y - G_{nz^{(1)}}(v)]\,I[c_0 \le G_{nz^{(1)}}(v) \le c_1].$$

Let $\Omega_n$ be a consistent estimator of $\Omega$. Then $\Sigma_\alpha$ is estimated consistently by the sample covariance of $(c_1 - c_0)^{-1}(W'W)^{-1}W'$ times the $(M - 1) \times 1$ vector whose $(j - 1)$ component $(j = 2,\dots,M)$ is

$$\psi_{nj}(Y_i, \hat V_i, Z_i) + (\Gamma_{nz^{(j)}} - \Gamma_{nz^{(1)}})'\,\Omega_n(Y_i, X_i, Z_i).$$

The details of $\Omega_n$ depend on the estimator of $\tilde\beta$ that is used. To illustrate, let $p(x \mid z)$ $(z \in S_Z)$ be the probability density function of $X$ conditional on $Z = z$, and let $p_n(z)$ be the empirical probability that $Z = z$. In Section 3 we estimate $\tilde\beta$ by (1) using the method of Powell et al. (1989) to estimate the density-weighted average derivative $E[p(X \mid z)\,\partial G(X\beta + z\alpha)/\partial x]$ for each $z \in S_Z$, (2) forming a weighted average of the resulting estimates with weights $p_n(z)$, and (3) imposing the normalization $\beta_1 = 1$. Let $\delta = E[p(X \mid Z)\,\partial G(X\beta + Z\alpha)/\partial x^{(1)}]$, where $x^{(1)}$ is the first component of $x$. It follows by applying the delta method to equations (3.14) and (3.16) of Powell et al. (1989) that

$$\Omega(y, x, z) = -2\delta^{-1}p(z)[y - G(x\beta + z\alpha)]\big[\partial p(x \mid z)/\partial\tilde x - \tilde\beta\,\partial p(x \mid z)/\partial x^{(1)}\big]. \tag{2.10}$$

Let $\delta_n$ be the weighted average of the estimates of $E[p(X \mid z)\,\partial G(X\beta + z\alpha)/\partial x^{(1)}]$ using weights $p_n(z)$. $\Omega_n$ can be obtained from (2.10) by replacing $\delta$ with $\delta_n$, $p$ with $p_n$, $G(x\beta + z\alpha)$ with $G_{nz}(xb_n)$ and $p(x \mid z)$ with a kernel density estimator.

3. MONTE CARLO EXPERIMENTS

This section reports the results of a small-scale Monte Carlo investigation of the finite-sample behavior of $\alpha_n$ for model (1.2). In the experiments $k = \ell = 2$, and $n$ = 250 or 500. $G$ is the cumulative standard normal distribution function. The components of $X$ are independently distributed as N(0,1). The first component of $Z$ is 0 with probability 0.5 and 1 with probability 0.5. The second component of $Z$ takes the values 0, 1, and 2 with probabilities of 0.25, 0.5, and 0.25, respectively. The components of $Z$ are independent of one another and of $X$. The first component of $\beta$ is 1 by scale normalization. The second component, whose true value is
2, was estimated by forming a weighted average of density-weighted average-derivative estimates (Powell, et al. 1989) that were computed for each point in $S_Z$. The weights in the weighted average of estimates were the empirical probabilities $p_n(z)$. $K$ is the 4th-order kernel

$$K(v) = (105/64)(1 - 5v^2 + 7v^4 - 3v^6)\,I(|v| \le 1).$$

The values of $c_0$ and $c_1$ are 0.2 and 0.8, respectively.

Existing theory does not indicate how to choose $h_n$ or the bandwidth required to estimate $\tilde\beta$ when $n$ is fixed. Härdle, et al. (1992) investigated bandwidth selection for average-derivative estimation with a scalar $X$. Their results are not applicable here because $\tilde\beta$ does not have to be estimated if $X$ is scalar. Härdle and Tsybakov (1993) investigated bandwidth selection for average derivative estimation with a multidimensional $X$, but with their recommended bandwidth the asymptotic distribution of $n^{1/2}(\tilde b_n - \tilde\beta)$ has a nonzero mean, thereby violating our assumption 5. We carried out a set of preliminary Monte Carlo experiments and, on the basis of its results, set the bandwidth for estimating $\tilde\beta$ at 2.5. To compute the kernel estimate $G_{nz}$, we used the bandwidth $h_{nz} = s_{nz}\,n_z^{-1/7.5}$, where $s_{nz}$ is the sample standard deviation of $Xb_n$ conditional on $Z = z \in S_Z$ and $n_z$ is the number of sampled observations for which $Z = z$. Experimentation with other bandwidths indicated that the results are not highly sensitive to the choice of bandwidth.

The integral in (2.9) was computed by Gauss-Legendre quadrature. To avoid edge effects in estimating $G_{nz}$, the limits of integration were set at

$$v_1 = \min_{z \in S_Z}\ \max_{1 \le i \le n}\{X_i b_n - h_{nz} : Z_i = z\}$$

and

$$v_0 = \max_{z \in S_Z}\ \min_{1 \le i \le n}\{X_i b_n + h_{nz} : Z_i = z\}.$$

There were 500 replications in each experiment. The computations were carried out in GAUSS using GAUSS pseudo-random number generators.

The results of the experiments are summarized in Table 1, which shows the empirical means and standard deviations of $\tilde b_n$, $\alpha_{n1}$ and $\alpha_{n2}$. We also computed the empirical medians and interquartile ranges of the estimates. These lead to the same conclusions as the means and standard deviations, so they are not shown.
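The computational steps described above (kernel regression within each $Z$ cell, the bandwidth rule $h_{nz} = s_{nz}n_z^{-1/7.5}$, data-based integration limits, Gauss-Legendre quadrature, and the linear solve in (2.7)) can be sketched in a few lines. This is a minimal illustration and not the authors' GAUSS code. It assumes a design that differs from the paper's: a single continuous covariate ($k = 1$, so the index is observed and no $\tilde\beta$ has to be estimated), $X$ uniform on $[-3, 2]$, one binary $Z$ with $\alpha = 1$, and $n = 4000$. The kernel polynomial is the standard fourth-order kernel with leading constant 105/64, assumed here to be the one intended in the text. All function names are hypothetical.

```python
import numpy as np

def K4(u):
    """Fourth-order kernel on [-1, 1]: (105/64)(1 - 5u^2 + 7u^4 - 3u^6)."""
    u = np.asarray(u, dtype=float)
    poly = (105.0 / 64.0) * (1 - 5 * u**2 + 7 * u**4 - 3 * u**6)
    return np.where(np.abs(u) <= 1.0, poly, 0.0)

def G_nz(v, V, Y, h):
    """Kernel regression of Y on the index V within one Z cell, at points v."""
    w = K4((v[:, None] - V[None, :]) / h)
    return (w @ Y) / w.sum(axis=1)      # A_nz / f_nz; the (n h) factors cancel

def alpha_hat(V, Y, Z, support, c0, c1, nodes=200):
    """Direct estimator (2.7), given index values V and discrete covariates Z."""
    cells = [np.all(Z == z, axis=1) for z in support]
    # Bandwidth rule of Section 3: h_nz = s_nz * n_z^(-1/7.5)
    h = [V[m].std(ddof=1) * m.sum() ** (-1 / 7.5) for m in cells]
    # Data-based limits of integration that avoid edge effects
    v1 = min(V[m].max() - hz for m, hz in zip(cells, h))
    v0 = max(V[m].min() + hz for m, hz in zip(cells, h))
    # Gauss-Legendre nodes and weights mapped from [-1, 1] to [v0, v1]
    x, wts = np.polynomial.legendre.leggauss(nodes)
    v = 0.5 * (v1 - v0) * x + 0.5 * (v1 + v0)
    # (2.9): the integrand is G_nz clipped to [c0, c1]
    Jn = np.array([0.5 * (v1 - v0) * np.sum(wts * np.clip(G_nz(v, V[m], Y[m], hz), c0, c1))
                   for m, hz in zip(cells, h)])
    dJ = Jn[1:] - Jn[0]                 # (2.8)
    W = support[1:] - support[0]        # (2.5)
    return np.linalg.solve(W.T @ W, W.T @ dJ) / (c1 - c0)

# Moment check for Assumption 6 with r = 4 (8-point rule is exact for these polynomials)
x8, w8 = np.polynomial.legendre.leggauss(8)
m0 = np.sum(w8 * K4(x8))                # integral of K
m2 = np.sum(w8 * x8**2 * K4(x8))        # second moment of K

# Illustrative design (an assumption of this sketch, not the paper's design)
rng = np.random.default_rng(0)
n, alpha_true = 4000, 1.0
V = rng.uniform(-3.0, 2.0, n)                       # k = 1, beta_1 = 1
Z = rng.integers(0, 2, size=(n, 1)).astype(float)   # single binary covariate
Y = (V + alpha_true * Z[:, 0] + rng.standard_normal(n) > 0).astype(float)
support = np.array([[0.0], [1.0]])

alpha_n = alpha_hat(V, Y, Z, support, c0=0.2, c1=0.8)
print(m0, m2, alpha_n)
```

In this design the data-based limits leave a full kernel window inside the support of $V$ at both ends, so the estimate should land near the true value of 1, with sampling error that depends on the seed. With $k > 1$ the same function could be called with $V$ replaced by the estimated index $X b_n$.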
To provide a basis for judging the performance of the semiparametric estimator, Table 1 also shows the means and standard deviations of the parametric maximum likelihood estimates of $\tilde\beta$ and $\alpha$. The asymptotic efficiency bound for semiparametric estimation of $\tilde\beta$ and $\alpha$ exceeds the Cramér-Rao bound (Cosslett 1987), so no semiparametric estimator can achieve the precision of the parametric maximum likelihood estimator. The semiparametric estimator of Klein and Spady (1993) achieves the semiparametric efficiency bound, but its computational complexity precludes carrying out Monte Carlo experiments to compare its finite-sample performance with that of the direct estimator.

The differences between the true values of $\alpha$ and the means of the semiparametric estimates are small except in the two experiments with $\alpha_2 = 1$ and $n$ = 250. In these experiments, $\alpha_{n2}$ has a relatively large downward bias. The root-mean-square errors of the semiparametric estimates of $\alpha_1$ and $\alpha_2$ exceed those of the maximum likelihood estimates by factors of 1.3 to 1.6 in all the experiments.

The cause of the downward bias in the experiments with $\alpha_2 = 1$ and $n$ = 250 can be understood by observing that $\alpha$ is estimated from the horizontal difference between functions $G_{nz}$ corresponding to different values of $z$. If the shifts caused by variations in $Z$ are large and $n_z$ is small (depending on $z$, its average value is either 31 or 62 in the experiments with $n$ = 250), there may be few values of $Xb_n$ in the interval on which the ranges of the functions $G_{nz}$ overlap. This causes the estimates of $\Delta J$ and $\alpha$ to be imprecise. The problem decreases with increasing $n$, as can be seen from the results of the experiments with $n$ = 500.

4. AN APPLICATION

This section illustrates the semiparametric estimator by applying it to data on product innovation by German manufacturers of investment goods. The data were assembled in 1989 by the IFO Institute in Munich and consist of observations on 1100 manufacturers. The dependent variable is $Y$ = 1 if a manufacturer realized an innovation in a specific product category during 1989 and 0 otherwise. The continuous independent variables are the number of employees in the product category (EMPLP), the number of employees in the entire firm (EMPLF), and an indicator of the firm's production capacity utilization (CAP). There is one discrete independent variable, DEM, which is 1 if a firm expected increasing demand in the product category and 0 otherwise. We standardized the continuous variables, so they have units of standard deviations from their means. Scale/sign normalization was achieved by setting $\beta_{EMPLP} = 1$.

The kernel, $v_0$, and $v_1$ are as in Section 3. We set $c_0$ = 0.50 and $c_1$ = 0.95 after examining graphs of $G_{nz}(v)$ for the two values of DEM. The bandwidths for estimating $G_{nz}$ were selected using the method described in Section 3. Estimation of $\tilde\beta$ was carried out using the density-weighted average derivative method described in Section 3. The bandwidth for estimating $\tilde\beta$ was obtained by scaling the value used in the Monte Carlo experiments in proportion to $n_z^{-1/5.5}$. The exponent is based on Powell, et al. (1989), who show that the bandwidth must be asymptotic to (sample size)$^p$, where $-1/6 < p < -1/5$, in order for our assumption 5 to hold. The scaling procedure yielded $h \approx 1.75$ for both values of DEM.

The semiparametric estimates of $\tilde\beta$ and $\alpha$ are shown in Table 2 together with the estimates obtained from a parametric probit model. Figures 2 and 3 show estimates of $G$ and $dG/dv$ that were obtained from kernel nonparametric regression of $Y$ on the semiparametric estimate of $X\beta + Z\alpha$.

There are large differences between the semiparametric and probit estimates. The semiparametric estimate of $\beta_{EMPLF}$ is small and statistically nonsignificant, whereas the probit estimate is similar in size to $\beta_{CAP}$. In addition, the semiparametric estimate of $\alpha_{DEM}$ is 65 percent larger than the probit estimate. A particularly striking difference between the semiparametric and probit estimates is revealed by Figures 2 and 3, which show that $dG/dv$ is a bimodal probability density function (PDF) with roughly equal probability in each lobe. This feature of the data contradicts the probit model, which assumes that $dG/dv$ is a unimodal (normal) PDF. The bimodality of $dG/dv$ suggests that the data may be a mixture of two populations. Although further investigation of this possibility is beyond the scope of this paper, an obvious next step would be to search for variables that characterize these populations.

5. CONCLUSIONS

This paper has described a direct (non-iterative) method for estimating the parameters of a semiparametric single-index model when some of the explanatory variables are discrete. The resulting estimator is $n^{1/2}$-consistent and asymptotically normal. The method described here is considerably less demanding computationally than other methods for estimating semiparametric single-index models with discrete explanatory variables, since all other methods require solving difficult nonlinear optimization problems. An application to data on product innovation by German manufacturers has illustrated the usefulness of the semiparametric estimator.

MATHEMATICAL APPENDIX

This appendix presents the proof of Theorem 1. The proof is based on four lemmas. It is assumed throughout that assumptions 1-7 hold.

Lemma 1: (a) For each $i = 2,\dots,M$,

$$J[z^{(i)}] - J[z^{(1)}] = (c_1 - c_0)[z^{(i)} - z^{(1)}]\alpha.$$

(b) $\alpha = (c_1 - c_0)^{-1}(W'W)^{-1}W'\Delta J$.

Proof: To prove part (a), define $v_a = \max\{v_0 + z\alpha : z \in S_Z\}$ and $v_b = \min\{v_1 + z\alpha : z \in S_Z\}$. Let $z = z^{(1)}$ or $z^{(i)}$. A change of variables and some algebra yield

$$\begin{aligned}
J(z) &= c_0\int_{v_0 + z\alpha}^{v_a} I[G(v) < c_0]\,dv + c_0\int_{v_a}^{v_b} I[G(v) < c_0]\,dv + \int_{v_a}^{v_b} G(v)\,I[c_0 \le G(v) \le c_1]\,dv \\
&\quad + c_1\int_{v_a}^{v_b} I[G(v) > c_1]\,dv + c_1\int_{v_b}^{v_1 + z\alpha} I[G(v) > c_1]\,dv \\
&= c_0(v_a - v_0 - z\alpha) + c_0\int_{v_a}^{v_b} I[G(v) < c_0]\,dv + \int_{v_a}^{v_b} G(v)\,I[c_0 \le G(v) \le c_1]\,dv \\
&\quad + c_1\int_{v_a}^{v_b} I[G(v) > c_1]\,dv + c_1(v_1 - v_b + z\alpha).
\end{aligned}$$

Therefore,

$$J[z^{(i)}] - J[z^{(1)}] = (c_1 - c_0)[z^{(i)} - z^{(1)}]\alpha, \tag{A1}$$

which proves part (a). Part (b) follows from nonsingularity of $W'W$ and the observation that by part (a), $W'\Delta J = (c_1 - c_0)W'W\alpha$. Q.E.D.

Define:

$$\lambda_1(v, z) = p(z)\int \tilde x\,(\partial/\partial v)\,p(v, \tilde x \mid z)\,d\tilde x,$$

$$\lambda_2(v, z) = p(z)\int \tilde x\,(\partial/\partial v)[G_z(v)\,p(v, \tilde x \mid z)]\,d\tilde x,$$

$$\lambda_3(v, z) = -G_z'(v)\,E(\tilde X \mid v, z),$$

and

$$\bar\Omega_n = n^{-1}\sum_{i=1}^{n}\Omega(Y_i, X_i, Z_i).$$

Lemma 2: Define $G_z(v) = G(v + z\alpha)$ $(z \in S_Z)$. For each $z \in S_Z$

(a) $A_{nz}(v) = (nh_n)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,Y_i\,K\!\left(\dfrac{v - V_i}{h_n}\right) - \bar\Omega_n'\lambda_2(v, z) + O_p[(nh_n^3)^{-1}]$,

(b) $A_{nz}(v) - G_z(v)f(v, z) = O_p(h_n^r)$,

(c) $f_{nz}(v) = (nh_n)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,K\!\left(\dfrac{v - V_i}{h_n}\right) - \bar\Omega_n'\lambda_1(v, z) + O_p[(nh_n^3)^{-1}]$,

and

(d) $f_{nz}(v) - f(v, z) = O_p(h_n^r)$

uniformly over $v \in (-\infty, \infty)$.

Proof: Only parts (a) and (b) are proved. The proofs of parts (c) and (d) are similar. To prove part (a), use a Taylor series expansion to obtain

$$\begin{aligned}
A_{nz}(v) &= (nh_n)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,Y_i\,K\!\left(\frac{v - V_i}{h_n}\right) - (nh_n^2)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,Y_i\,K'\!\left(\frac{v - V_i}{h_n}\right)\tilde X_i(\tilde b_n - \tilde\beta) + R_n \\
&\equiv A_{nz1}(v) - A_{nz2}(v)(\tilde b_n - \tilde\beta) + R_n,
\end{aligned}$$

where

$$R_n = -(nh_n^2)^{-1}\sum_{i=1}^{n} I(Z_i = z)\,Y_i\,\tilde X_i\left[K'\!\left(\frac{v - V_i^*}{h_n}\right) - K'\!\left(\frac{v - V_i}{h_n}\right)\right](\tilde b_n - \tilde\beta)$$

and $V_i^*$ is between $\hat V_i$ and $V_i$. Under assumption 1, Lipschitz continuity of $K'$ and $n^{1/2}$-consistency of $\tilde b_n$ imply that $R_n = O_p[(nh_n^3)^{-1}]$ uniformly over $v$. Methods similar to those used in establishing uniform consistency of derivatives of kernel nonparametric regression estimators yield the result that $A_{nz2}(v) = \lambda_2(v, z) + O_p(h_n^r) + o_p[(\log n)/(nh_n^3)^{1/2}]$ uniformly over $v$. Part (a) follows from these results and assumption 5.

To prove part (b), observe that $A_{nz1}(v) - EA_{nz1}(v) = o[(nh_n)^{-1/2}\log(n)]$ almost surely uniformly over $v$ by Theorem (2.37) of Pollard (1984). In addition, standard methods for kernel estimation show that $EA_{nz1}(v) = G_z(v)f(v \mid z)p(z) + O(h_n^r)$ uniformly over $v$. This result and part (a) establish part (b). Q.E.D.

Lemma 3: For each $z \in S_Z$

$$G_{nz}(v) - G_z(v) = [nh_n f(v, z)]^{-1}\sum_{i=1}^{n} I(Z_i = z)[Y_i - G_z(v)]\,K\!\left(\frac{v - V_i}{h_n}\right) + \bar\Omega_n'\lambda_3(v, z) + O_p(h_n^r) + O_p[(nh_n^3)^{-1}]$$

uniformly over $v \in [v_0, v_1]$.

Proof: A Taylor series expansion yields

$$\begin{aligned}
G_{nz}(v) - G_z(v) &= f(v, z)^{-1}[A_{nz}(v) - G_z(v)f_{nz}(v)] \\
&\quad + O\{[A_{nz}(v) - G_z(v)f(v, z)][f_{nz}(v) - f(v, z)]/f(v, z)^2\} \\
&\quad + O\{[f_{nz}(v) - f(v, z)]^2/f(v, z)^2\}.
\end{aligned} \tag{A2}$$

The lemma follows by applying Lemma 2 to (A2). Q.E.D.

Lemma 4: For each $z \in S_Z$,

$$J_n(z) - J(z) = n^{-1}\sum_{i=1}^{n} I(Z_i = z)\,I[c_0 \le G_z(V_i) \le c_1]\,f(V_i, z)^{-1}[Y_i - G_z(V_i)] + \bar\Omega_n'\Gamma_z + o_p(n^{-1/2}).$$

Proof: Define

$$H_i = (1/h_n)\int_{v_0}^{v_1} I[c_0 \le G_z(v) \le c_1][Y_i - G_z(v)]\,f(v, z)^{-1}K\!\left(\frac{v - V_i}{h_n}\right)dv$$

and

$$H_i^* = I(Z_i = z)\,I[c_0 \le G_z(V_i) \le c_1]\,f(V_i, z)^{-1}[Y_i - G_z(V_i)].$$

It follows from assumption 4 and lemma 2 that $I[G_{nz}(v) < c_0] - I[G_z(v) < c_0] = o_p(n^{-1/2})$ uniformly over $v \in [v_0, v_1]$. The same result holds if $c_0$ is replaced by $c_1$ and/or the directions of the inequalities are reversed. Therefore, it follows from lemma 3 that

$$J_n(z) - J(z) = n^{-1}\sum_{i=1}^{n} I(Z_i = z)H_i + \bar\Omega_n'\Gamma_z + o_p(n^{-1/2}). \tag{A3}$$

A straightforward but somewhat lengthy calculation based on Taylor series expansions shows that for each $z \in S_Z$, $E(H_i - H_i^*) = O(h_n^r)$ and $\mathrm{Var}(H_i - H_i^*) = O(h_n/n)$. The lemma now follows from Chebyshev's inequality. Q.E.D.

Proof of Theorem 1: By lemma 4 and the definition of $\alpha_n$,

$$\alpha_n - \alpha = (c_1 - c_0)^{-1}(W'W)^{-1}W'\,n^{-1/2}T_n + o_p(n^{-1/2}),$$

where $T_n$ denotes the $(M - 1) \times 1$ random vector defined in part (b) of Theorem 1. Therefore, part (a) of the theorem follows by applying the weak law of large numbers, and part (b) follows by applying the Lindeberg-Levy central limit theorem. Q.E.D.

REFERENCES

Cosslett, S.R. (1987). Efficiency bounds for distribution-free estimators of the binary choice and the censored regression models. Econometrica, 55.

Friedman, J.H. and Stuetzle, W. (1981). Projection pursuit regression. Journal of the American Statistical Association, 76.

Han, A.K. (1987). Non-parametric analysis of a generalized regression model. Journal of Econometrics, 35.

Härdle, W., Hart, J., Marron, J.S. and Tsybakov, A.B. (1992). Bandwidth choice for average derivative estimation. Journal of the American Statistical Association, 87.

Härdle, W. and Stoker, T.M. (1989). Investigating smooth multiple regression by the method of average derivatives. Journal of the American Statistical Association, 84.

Härdle, W. and Tsybakov, A.B. (1993). How sensitive are average derivatives? Journal of Econometrics, 58.

Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. Journal of Econometrics, 58.

Klein, R.L. and Spady, R.H. (1993). An efficient semiparametric estimator for discrete choice models. Econometrica, 61.

Manski, C.F. (1988). Identification of binary response models. Journal of the American Statistical Association, 83.

Pollard, D. (1984). Convergence of Stochastic Processes, New York, Springer-Verlag.

Powell, J.L., Stock, J.H., and Stoker, T.M. (1989). Semiparametric estimation of index coefficients. Econometrica, 57.

Sherman, R.P. (1993). The limiting distribution of the maximum rank correlation estimator. Econometrica, 61.

Stoker, T.M. (1991). Equivalence of direct, indirect and slope estimators of average derivatives, in W.A. Barnett, J. Powell and G. Tauchen, eds., Nonparametric and Semiparametric Methods in Econometrics and Statistics, New York, Cambridge University Press.

TABLE 1: RESULTS OF THE MONTE CARLO EXPERIMENTS (a)

Entries are Mean (Std. Deviation). Each row is one experiment; columns give $\tilde b_n$, $\alpha_{n1}$, $\alpha_{n2}$ for $n$ = 250 and then for $n$ = 500.

                      n = 250                        n = 500
          b_n      alpha_n1  alpha_n2      b_n      alpha_n1  alpha_n2

Direct Semiparametric Estimator
          (0.403)  (0.371)   (0.240)       (0.271)  (0.260)   (0.167)
          (0.436)  (0.342)   (0.262)       (0.272)  (0.252)   (0.180)
          (0.439)  (0.343)   (0.305)       (0.270)  (0.256)   (0.212)
          (0.416)  (0.372)   (0.241)       (0.265)  (0.281)   (0.173)
          (0.418)  (0.378)   (0.236)       (0.282)  (0.265)   (0.178)
          (0.440)  (0.345)   (0.285)       (0.287)  (0.256)   (0.215)

Parametric Maximum Likelihood Estimator
          (0.273)  (0.246)   (0.183)       (0.195)  (0.170)   (0.118)
          (0.306)  (0.235)   (0.194)       (0.214)  (0.165)   (0.132)
          (0.306)  (0.258)   (0.219)       (0.209)  (0.172)   (0.157)
          (0.293)  (0.255)   (0.178)       (0.204)  (0.173)   (0.119)
          (0.287)  (0.251)   (0.184)       (0.200)  (0.181)   (0.129)
          (0.290)  (0.285)   (0.215)       (0.216)  (0.198)   (0.160)

(a) Based on 500 replications. $\tilde b_n$ is the estimate of the second component of $\beta$. $\alpha_{ni}$ $(i = 1, 2)$ estimates the $i$'th component of $\alpha$.

TABLE 2: ESTIMATED COEFFICIENTS (STANDARD ERRORS) FOR A MODEL OF PRODUCT INNOVATION (a)

                          EMPLP    EMPLF     CAP       DEM
Semiparametric Model      1        (0.028)   (0.091)   (0.380)
Probit Model              1        (0.242)   (0.163)   (0.387)

(a) The coefficient of EMPLP is 1 by sign-scale normalization.
