SEMIPARAMETRIC SINGLE-INDEX MODELS. Joel L. Horowitz Department of Economics Northwestern University


INTRODUCTION
- Much of applied econometrics and statistics involves estimating a conditional mean function: $E(Y \mid X = x)$
- $Y$ may be continuous or binary
- If binary, then $E(Y \mid X = x)$ is $P(Y = 1 \mid X = x)$
- In a binary response model, $Y$ may indicate an individual's choice among two alternatives, occurrence or non-occurrence of an event, etc.
- Possible approaches:
  - Fully parametric
  - Fully nonparametric
  - Semiparametric

FULLY PARAMETRIC MODELING
- In a fully parametric model, $E(Y \mid X = x)$ is known up to a finite-dimensional parameter: $E(Y \mid X = x) = F(x, \theta)$
  - $F$ is a known function
  - $\theta$ is an unknown, finite-dimensional parameter
- Example: binary probit or logit model
- Advantages, if $F$ is correctly specified:
  - Maximizes estimation efficiency
  - Permits extrapolation of $x$ beyond the range of the data
  - Often has a natural behavioral interpretation
- Disadvantages:
  - $F$ is rarely known in applications
  - Can be highly misleading if $F$ is misspecified

FULLY NONPARAMETRIC MODELING
- $E(Y \mid X = x) \equiv G(x)$ is assumed to be a smooth function of $x$
- Nothing is assumed about the shape of $G$
- $G$ is estimated by nonparametric mean regression of $Y$ on $X$
- This minimizes a priori assumptions and the likelihood of specification error
- Disadvantages:
  - Hard to incorporate behavioral hypotheses drawn from economic or other theoretical models
  - Estimation precision is an exponentially decreasing function of the dimension of $X$
  - Extrapolation is not possible
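One common form of nonparametric mean regression is the Nadaraya-Watson kernel estimator. The following is a minimal sketch for scalar $X$, not necessarily the estimator the slides have in mind; the Gaussian kernel, the bandwidth `h`, and the function names are illustrative assumptions.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted local average of ys at the point x0 (scalar X)."""
    weights = [gaussian_kernel((x0 - x) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no data near x0; increase the bandwidth h")
    return sum(w * y for w, y in zip(weights, ys)) / total

# Noise-free illustration: Y = X**2 on an evenly spaced grid in [-1, 1].
xs = [i / 100.0 for i in range(-100, 101)]
ys = [x * x for x in xs]
g_hat_at_half = nadaraya_watson(0.5, xs, ys, h=0.05)  # close to 0.25, with a small smoothing bias
```

With a vector $X$ the same local average needs data near $x_0$ in every coordinate at once, which is where the loss of precision in high dimensions comes from.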

SEMIPARAMETRIC MODELING
- Achieves greater precision than nonparametric models but with weaker assumptions than parametric models
- Does this by restricting $G(x)$ so as to reduce the effective dimension of $x$
- Risk of specification error is greater than with a fully nonparametric model but less than with a parametric one
- Examples:
  - Single-index model: $G(x) = F(x'\beta)$, where $F$ is unknown
  - Additive model: $G(x) = H[f_1(x_1) + \cdots + f_d(x_d)]$, where $H$ is a known or unknown function and the $f_i$'s are unknown

IDENTIFICATION OF SINGLE-INDEX MODELS
- $E(Y \mid X = x) = G(x'\beta)$
- $\beta$ is not identified if $G$ is a constant function
- Sign, scale, and location normalizations are needed to identify $\beta$
  - To implement, assume $X$ has no intercept and $\beta_1 = 1$
- $X_1$ must be continuously distributed conditional on the other components of $X$
  - Example with discrete $X$: let $X = (X_1, X_2)$ and $X'\beta = X_1 + \beta_2 X_2$. $G$ and $\beta_2$ can be anything that satisfy:

  (X_1, X_2)   G(X_1 + \beta_2 X_2)   E(Y | X)
  (0, 0)       G(0)                   0
  (1, 0)       G(1)                   1
  (0, 1)       G(\beta_2)             3
  (1, 1)       G(1 + \beta_2)         4
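The discrete-support example can be checked directly. The sketch below takes the four conditional means from the table and shows two different pairs $(G, \beta_2)$ that reproduce them; the particular candidate values (3 and 2) and the dictionary representation of $G$ are hypothetical choices made for illustration.

```python
# E(Y | X) at the four support points of (X1, X2), taken from the table above.
cond_mean = {(0, 0): 0, (1, 0): 1, (0, 1): 3, (1, 1): 4}

def fits(G, beta2):
    """True if G(x1 + beta2 * x2) reproduces E(Y | X) at every support point."""
    return all(G[x1 + beta2 * x2] == m for (x1, x2), m in cond_mean.items())

# Candidate A: beta2 = 3 with G the identity on the index values {0, 1, 3, 4}.
G_a = {0: 0, 1: 1, 3: 3, 4: 4}
# Candidate B: beta2 = 2 with a different G on the index values {0, 1, 2, 3}.
G_b = {0: 0, 1: 1, 2: 3, 3: 4}

both_fit = fits(G_a, 3) and fits(G_b, 2)  # the data cannot tell the two apart
```

Because both candidates match every observable conditional mean, $\beta_2$ is not identified when all components of $X$ are discrete.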

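OPTIMIZATION ESTIMATORS
- If $G$ is known, $\beta$ can be estimated by nonlinear least squares:
  $$\underset{b}{\text{minimize}}: \quad \frac{1}{n}\sum_{i=1}^{n} w(X_i)\,[Y_i - G(X_i'b)]^2$$
  where $w(\cdot)$ is a weight function
- When $G$ is unknown, replace $G(X_i'b)$ with a nonparametric estimator $\hat G(X_i'b)$ of $E(Y \mid X_i'b)$ (e.g., kernel)
- The estimator now solves
  $$\underset{b}{\text{minimize}}: \quad \frac{1}{n}\sum_{i=1}^{n} w(X_i)\,[Y_i - \hat G(X_i'b)]^2$$
- $w$ may be chosen to:
  - Keep the denominator of $\hat G$ away from 0
  - Achieve asymptotic efficiency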
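The semiparametric least-squares idea can be sketched on a small simulated example. This is only an illustration under stated assumptions, not the author's implementation: it uses $w = 1$, a leave-one-out Nadaraya-Watson estimator with a Gaussian kernel and fixed bandwidth, a grid search in place of a proper optimizer, and a hypothetical noise-free design with $\beta_1 = 1$ and $\beta_2 = 1$.

```python
import math
import random

def loo_kernel_regression(index, ys, h):
    """Leave-one-out Nadaraya-Watson estimates of E(Y | index) at each index[i]."""
    n = len(index)
    fitted = []
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if j == i:
                continue  # leave observation i out of its own fit
            u = (index[i] - index[j]) / h
            k = math.exp(-0.5 * u * u)
            num += k * ys[j]
            den += k
        fitted.append(num / den if den > 0.0 else float("nan"))
    return fitted

def sls_objective(b2, x1, x2, ys, h):
    """Average squared residual with w = 1 and the normalization beta_1 = 1."""
    index = [a + b2 * c for a, c in zip(x1, x2)]
    fitted = loo_kernel_regression(index, ys, h)
    resid2 = [(y - g) ** 2 for y, g in zip(ys, fitted) if not math.isnan(g)]
    return sum(resid2) / len(resid2)

# Hypothetical noise-free design: Y = G(X1 + 1.0 * X2), G(v) = v + 0.5 sin(2v).
rng = random.Random(0)
n = 200
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [(a + c) + 0.5 * math.sin(2.0 * (a + c)) for a, c in zip(x1, x2)]

grid = [0.25 * m for m in range(0, 13)]  # candidate values 0.0, 0.25, ..., 3.0
b2_hat = min(grid, key=lambda b2: sls_objective(b2, x1, x2, ys, h=0.3))
```

The objective is small only when the candidate index $X_1 + b_2 X_2$ explains $Y$ through a single smooth function, so the grid minimizer should land near the true coefficient. In practice the bandwidth and the weight function would need more care than this sketch gives them.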

ASYMPTOTIC NORMALITY
- Ichimura (1993) gives conditions under which
  $$n^{1/2}(b_n - \beta) \to^d N(0, V)$$
  where $b_n$ is the weighted NLS estimator
- Proof is based on standard Taylor series methods of asymptotic distribution theory
- Estimator has an $n^{-1/2}$ rate of convergence
- Hall and Ichimura (1991) derived the asymptotic efficiency bound for $\beta$ in
  $$Y_i = G(X_i'\beta) + \sigma(X_i'\beta)\,U_i$$
  where the $U_i$ are iid with mean 0
- Hall and Ichimura also derived an asymptotically efficient estimator
  - Uses an estimate of $\sigma(X_i'\beta)^{-1}$ as the weight function in the NLS objective function and a kernel estimator of $G$

MLE FOR BINARY RESPONSE MODEL
- If $Y = 0$ or $1$, $G(x'\beta) = P(Y = 1 \mid X = x)$
- If $G$ is known, the log likelihood is
  $$\log L(b) = \sum_{i=1}^{n} \left\{ Y_i \log G(X_i'b) + (1 - Y_i)\log[1 - G(X_i'b)] \right\}$$
- If $G$ is unknown, replace it with an estimator $\hat G$:
  $$\log L(b) = \sum_{i=1}^{n} \tau_i \left\{ Y_i \log \hat G(X_i'b) + (1 - Y_i)\log[1 - \hat G(X_i'b)] \right\}$$
  - $\tau_i$ trims away observations for which $\hat G(X_i'b)$ is too close to 0 or 1
- Klein and Spady (1993) gave conditions under which the semiparametric MLE is $n^{1/2}$-consistent and asymptotically normal
- Chamberlain (1986) and Cosslett (1987) derived the asymptotic efficiency bound for the case in which $G$ is a CDF
  - The semiparametric MLE achieves the bound
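The trimmed log likelihood can be sketched as a function of the fitted probabilities. The trimming rule used here (drop observations whose fitted value lies within a fixed `eps` of 0 or 1) is a simplifying assumption made for illustration; Klein and Spady's actual trimming scheme is more elaborate, and the fitted values below are made up.

```python
import math

def trimmed_log_likelihood(ys, g_hat, eps=0.01):
    """Sum of Bernoulli log-likelihood terms, skipping fits within eps of 0 or 1."""
    total = 0.0
    for y, g in zip(ys, g_hat):
        tau = 1.0 if eps < g < 1.0 - eps else 0.0  # trimming indicator tau_i
        if tau == 0.0:
            continue
        total += y * math.log(g) + (1 - y) * math.log(1.0 - g)
    return total

ys = [1, 0, 1, 0]
g_hat = [0.8, 0.3, 0.999, 0.5]  # hypothetical fitted values; the third one is trimmed
value = trimmed_log_likelihood(ys, g_hat)  # log(0.8) + log(0.7) + log(0.5)
```

In the semiparametric estimator, `g_hat` would itself depend on the candidate `b` through a nonparametric fit on the index, and the function would be maximized over `b`.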

DIRECT ESTIMATORS
- NLS and ML estimators are hard to compute
- Direct estimators avoid the need to solve an optimization problem
- Direct estimators are not asymptotically efficient
  - An efficient estimator can be obtained easily by a one-step method
- If $X$ is a continuous random vector, $\beta$ is proportional to an average derivative of $G$:
  $$\beta \propto E\left[ w(X)\,\frac{\partial G(X'\beta)}{\partial X} \right]$$
  where $w$ is a weight function
- Only a weighted average derivative is needed because $\beta$ is identified only up to scale
- If $w(x) = 1$, get the average derivative estimator of $\beta$ (Härdle and Stoker 1989)
  - This estimator is hard to analyze because of its random denominator
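The proportionality follows from the chain rule. As a short worked step, assuming $G$ is differentiable and writing $G'$ for its derivative with respect to the scalar index (notation introduced here for illustration):

```latex
\frac{\partial G(x'\beta)}{\partial x} = G'(x'\beta)\,\beta
\quad\Longrightarrow\quad
E\left[ w(X)\,\frac{\partial G(X'\beta)}{\partial X} \right]
  = \beta \; E\left[ w(X)\, G'(X'\beta) \right].
```

The expectation on the right is a scalar, so the weighted average derivative is a scalar multiple of $\beta$ whenever that scalar is nonzero.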

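DENSITY WEIGHTED AVERAGE DERIVATIVE ESTIMATORS
- The random denominator problem can be overcome by setting $w(x) = f(x)$, the density of $X$
- Integration by parts gives
  $$\delta \equiv E\left[ f(X)\,\frac{\partial G(X'\beta)}{\partial X} \right] = -2E\left[ G(X'\beta)\,\frac{\partial f(X)}{\partial X} \right] = -2E\left[ Y\,\frac{\partial f(X)}{\partial X} \right]$$
- Estimate $\delta$ by replacing $E$ with the sample average and $f$ with a kernel estimator to get
  $$\delta_n = -\frac{2}{n}\sum_{i=1}^{n} Y_i\,\frac{\partial f_{ni}(X_i)}{\partial x}$$
  where $f_{ni}$ is the leave-one-out kernel estimator of $f(x)$
- Powell, Stock, and Stoker (1989) gave conditions under which
  $$n^{1/2}(\delta_n - \delta) \to^d N(0, V)$$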
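The density-weighted average derivative estimator can be sketched in a few lines. This is a rough illustration under assumptions that are not in the slides: a Gaussian product kernel (a second-order kernel, whereas the asymptotic theory calls for a higher-order one), a fixed bandwidth `h`, and a hypothetical noise-free design with $Y = X_1 + 2X_2$.

```python
import math
import random

def dwade(xs, ys, h):
    """Density-weighted average derivative: -(2/n) * sum_i Y_i * grad f_i(X_i)."""
    n, k = len(xs), len(xs[0])
    norm = (n - 1) * h ** (k + 1) * (2.0 * math.pi) ** (k / 2.0)
    delta = [0.0] * k
    for i in range(n):
        grad = [0.0] * k  # unnormalized gradient of the leave-one-out density at X_i
        for j in range(n):
            if j == i:
                continue
            u = [(xs[i][l] - xs[j][l]) / h for l in range(k)]
            kern = math.exp(-0.5 * sum(v * v for v in u))
            for l in range(k):
                grad[l] += -u[l] * kern  # gradient of the Gaussian kernel at u
        for l in range(k):
            delta[l] += -2.0 * ys[i] * grad[l] / (norm * n)
    return delta

# Hypothetical design: independent standard normal covariates, beta = (1, 2).
rng = random.Random(1)
xs = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(300)]
ys = [x[0] + 2.0 * x[1] for x in xs]
delta = dwade(xs, ys, h=0.5)
ratio = delta[1] / delta[0]  # estimates beta_2 / beta_1
```

Since $\delta$ identifies $\beta$ only up to scale, the ratio of components is the quantity to compare with the true coefficients; no optimization step is involved.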

METHOD OF PROOF
- Write $\delta_n$ as a U statistic of order 2 with a bandwidth-dependent kernel
- The U statistic is asymptotically equivalent to its projection, which gives
  $$\delta_n - E(\delta_n) = \frac{2}{n}\sum_{i=1}^{n}\left[ r_n(Y_i, X_i) - E\,r_n(Y_i, X_i) \right] + o_p(n^{-1/2}),$$
  where, with $k = \dim(X)$,
  $$r_n(Y_i, X_i) = -\frac{1}{h^{k+1}}\int K'\!\left(\frac{X_i - x}{h}\right)[Y_i - E(Y \mid X = x)]\,f(x)\,dx$$
- Changing variables in the integral shows that the leading term of $r_n$ does not depend on $h$ or $n$
- So $\delta_n$ is asymptotically equivalent to a sum of iid random variables
- $n^{1/2}$-consistency and asymptotic normality follow from the Lindeberg-Levy theorem
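The change of variables can be written out as follows. This is a sketch under stated assumptions: $r_n$ is taken with a leading minus sign, $K'$ denotes the gradient of a kernel that integrates to one and vanishes at the boundary of its support, $g(x) = E(Y \mid X = x)$, $k = \dim(X)$, and $g$ and $f$ are smooth enough for a Taylor expansion.

```latex
% Substitute u = (X_i - x)/h, so x = X_i - hu and dx = h^k du:
r_n(Y_i, X_i)
  = -\frac{1}{h^{k+1}} \int K'\!\left(\frac{X_i - x}{h}\right) [Y_i - g(x)]\, f(x)\, dx
  = -\frac{1}{h} \int K'(u)\, m_i(X_i - hu)\, du,
\qquad m_i(x) \equiv [Y_i - g(x)]\, f(x).

% Expand m_i(X_i - hu) = m_i(X_i) - h\,u'\nabla m_i(X_i) + \dots and use
% \int K'(u)\,du = 0 and \int K'(u)\,u'\,du = -I:
r_n(Y_i, X_i)
  = -\nabla m_i(X_i) + \text{(terms that vanish as } h \to 0)
  = f(X_i)\,\frac{\partial g(X_i)}{\partial x}
    - [Y_i - g(X_i)]\,\frac{\partial f(X_i)}{\partial x} + \dots
```

The leading term involves neither $h$ nor $n$, and its expectation is $E[f(X)\,\partial g(X)/\partial x] = \delta$ because $E[Y - g(X) \mid X] = 0$.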

TECHNICAL DETAILS
- Must use a higher-order $K$ with undersmoothing to ensure that the asymptotic distribution of $n^{1/2}(\delta_n - \delta)$ is centered at 0
- Härdle and Tsybakov (1993) and Powell and Stoker (1996) describe methods for selecting $h$ in applications
- Horowitz and Härdle (1996) show how to include discrete components of $X$ in a direct estimator

ESTIMATOR WITH DISCRETE COVARIATES
- Write the model as $E(Y \mid X = x, Z = z) = G(X'\beta + Z'\alpha)$, where $X$ is continuous and $Z$ is discrete with $M$ points of support
- Identification requires a continuous covariate
- Assume an estimator of $\beta$, $b_n$, is available, possibly an average of average derivative estimates computed at each point in the support of $Z$
- Suppose there are finite numbers $c_0$, $c_1$, $v_0$, $v_1$ such that $G(v + z'\alpha)$ is bounded for all $v \in [v_0, v_1]$ and $z \in \text{supp}(Z)$, and
  $$v < v_0 \;\Rightarrow\; G(v + z'\alpha) < c_0 \quad \text{for each } z \in \text{supp}(Z)$$
  $$v > v_1 \;\Rightarrow\; G(v + z'\alpha) > c_1 \quad \text{for each } z \in \text{supp}(Z)$$
- Define
  $$J(z) = \int_{v_0}^{v_1} \left\{ c_0\, I[G(v + z'\alpha) < c_0] + c_1\, I[G(v + z'\alpha) > c_1] + G(v + z'\alpha)\, I[c_0 \le G(v + z'\alpha) \le c_1] \right\} dv$$

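DISCRETE COVARIATES (cont.)
- Then for $i = 2, \ldots, M$:
  $$J[z^{(i)}] - J[z^{(1)}] = (c_1 - c_0)\,[z^{(i)} - z^{(1)}]'\alpha$$
- This is $M - 1$ linear equations in the components of $\alpha$
- To solve, write
  $$\Delta J = \begin{pmatrix} J[z^{(2)}] - J[z^{(1)}] \\ \vdots \\ J[z^{(M)}] - J[z^{(1)}] \end{pmatrix}; \qquad W = \begin{pmatrix} [z^{(2)} - z^{(1)}]' \\ \vdots \\ [z^{(M)} - z^{(1)}]' \end{pmatrix}$$
- Then $\alpha = (c_1 - c_0)^{-1}(W'W)^{-1}W'\Delta J$
- Obtain the estimator by replacing $G$ with a nonparametric regression estimate of $E(Y \mid X'b_n = v, Z = z)$
- Let $\Delta J_n$ be the resulting estimator of $\Delta J$
- The estimator of $\alpha$ is $\alpha_n = (c_1 - c_0)^{-1}(W'W)^{-1}W'\Delta J_n$
- Horowitz and Härdle (1996) give conditions under which $n^{1/2}(\alpha_n - \alpha) \to^d N(0, V_\alpha)$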
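The identity linking differences in $J$ to $\alpha$ can be checked numerically when $G$ is known. The sketch below is an illustration under assumptions not in the slides: a logistic $G$, a scalar $Z$ with support $\{0, 1\}$, true $\alpha = 2$, and a midpoint-rule integral. In the actual estimator $G$ would be replaced by a nonparametric regression estimate.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def J(z, alpha, c0, c1, v0, v1, steps=20000):
    """Midpoint-rule integral over [v0, v1] of G(v + z*alpha) censored to [c0, c1]."""
    width = (v1 - v0) / steps
    total = 0.0
    for s in range(steps):
        v = v0 + (s + 0.5) * width
        total += min(max(logistic(v + z * alpha), c0), c1)
    return total * width

# Hypothetical setup: scalar Z with support {0, 1}, true alpha = 2, logistic G.
alpha_true, c0, c1, v0, v1 = 2.0, 0.2, 0.8, -4.0, 2.0
z_support = [0.0, 1.0]
delta_J = (J(z_support[1], alpha_true, c0, c1, v0, v1)
           - J(z_support[0], alpha_true, c0, c1, v0, v1))
w = z_support[1] - z_support[0]
alpha_recovered = delta_J / ((c1 - c0) * w)  # scalar case of (c1 - c0)^{-1} (W'W)^{-1} W' dJ
```

The limits $v_0 = -4$ and $v_1 = 2$ are chosen so that $G(v + z\alpha)$ lies below $c_0$ to the left of $v_0$ and above $c_1$ to the right of $v_1$ for both support points, as the conditions require.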

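[Figure: $G(V)$ and $G(V + 2)$ plotted against $V$, vertical axis from 0 to 1 with levels marked at 0.2 and 0.8, $V$-axis ticks at -2.85, -1.15, -0.85 and 0.85, and points labeled A through K]
- $(c_0, c_1) = (0.2, 0.8)$, $(v_0, v_1) = (-2.85, 0.85)$
- $J[z^{(1)}] = ACGE + CDHG + GHK = 2c_0 + 1.7c_0 + GHK$
- $J[z^{(2)}] = ABFE + BDKJ + EFJ = 1.7c_0 + 2c_1 + EFJ$
- Regions $EFJ$ and $GHK$ have equal area because $G(V + 2)$ is a horizontal shift of $G(V)$, so
  $$J[z^{(2)}] - J[z^{(1)}] = 2(c_1 - c_0) = (c_1 - c_0)\,[z^{(2)} - z^{(1)}]'\alpha$$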

HIGH-DIMENSIONAL X
- Average derivative estimators require $G$ and $f$ to have many derivatives if $X$ is high dimensional
  - This is a form of the curse of dimensionality
  - Implies that the finite-sample precision of average derivatives may be low if $\dim(X)$ is large
- Hristache, Juditsky, and Spokoiny (2001) proposed a method for iteratively improving an average derivative estimator
  - Method uses two bandwidths: a large one in the direction orthogonal to the current estimate and a small one in the parallel direction
  - Calculate a new estimate of $\beta$ using average derivatives with the two bandwidths
- This procedure yields an estimator that is $n^{1/2}$-consistent and asymptotically normal regardless of the dimension of $X$ when $G$ is twice differentiable
- Monte Carlo evidence indicates that the iterated estimator has smaller finite-sample errors than the noniterated one

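OUTLINE OF ITERATIVE METHOD
1. Initialization: Specify parameters $\rho_1$, $\rho_{\min}$, $a_\rho$, $h_1$, $h_{\max}$, $a_h$; set $k = 1$ and choose $\hat\beta_0$ (initial estimate of $\beta$)
2. Compute $S_k = (I + \rho_k^{-2}\,\hat\beta_{k-1}\hat\beta_{k-1}')^{1/2}$
3. For every $i = 1, \ldots, n$, compute $\nabla\hat f_k(X_i)$ from
   $$\begin{pmatrix} \hat f_k(X_i) \\ \nabla\hat f_k(X_i) \end{pmatrix} = \left[ \sum_{j=1}^{n} \begin{pmatrix} 1 \\ X_{ij} \end{pmatrix}\begin{pmatrix} 1 \\ X_{ij} \end{pmatrix}' K\!\left(\frac{|S_k X_{ij}|^2}{h_k^2}\right) \right]^{-1} \sum_{j=1}^{n} \begin{pmatrix} 1 \\ X_{ij} \end{pmatrix} Y_j\, K\!\left(\frac{|S_k X_{ij}|^2}{h_k^2}\right)$$
   where $X_{ij} = X_j - X_i$
4. Compute $\hat\beta_k = n^{-1}\sum_{i=1}^{n} \nabla\hat f_k(X_i)$
5. Set $h_{k+1} = a_h h_k$, $\rho_{k+1} = a_\rho \rho_k$. If $\rho_{k+1} > \rho_{\min}$, set $k = k + 1$ and return to step 2. Otherwise, stop.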
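The role of $S_k$ is easiest to see through the kernel weights it produces. The sketch below assumes $S_k = (I + \rho_k^{-2}\hat\beta_{k-1}\hat\beta_{k-1}')^{1/2}$, so that $|S_k x|^2 = |x|^2 + (\hat\beta_{k-1}'x)^2/\rho_k^2$ and no matrix square root is needed. The kernel $K(t) = (1 - t)_+^2$ and all numerical values are assumptions chosen for illustration, not taken from the slides.

```python
import math

def elongated_weight(x_ij, beta_prev, rho, h):
    """K(|S x|^2 / h^2) with S = (I + rho^-2 * beta beta')^(1/2)."""
    sq_norm = sum(v * v for v in x_ij)
    proj = sum(b * v for b, v in zip(beta_prev, x_ij))
    # |S x|^2 = x'(I + rho^-2 beta beta')x = |x|^2 + (beta'x)^2 / rho^2
    t = (sq_norm + (proj / rho) ** 2) / h ** 2
    return max(0.0, 1.0 - t) ** 2  # assumed kernel: K(t) = (1 - t)_+^2

beta_prev = [1.0, 0.0]  # hypothetical current estimate of the index direction
w_parallel = elongated_weight([0.5, 0.0], beta_prev, rho=0.25, h=1.0)
w_orthogonal = elongated_weight([0.0, 0.5], beta_prev, rho=0.25, h=1.0)
```

Two displacements of the same length receive different weights: the one along the current estimate falls outside the window, while the orthogonal one stays inside. Shrinking $\rho_k$ and growing $h_k$ across iterations narrows the window along the index direction and widens it in the orthogonal directions.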

AN APPLICATION
- Model of product innovation by German manufacturers of investment goods
- Data assembled by the IFO Institute in Munich
  - Consist of observations on 1100 manufacturers
- Model: $P(Y = 1 \mid X = x) = G(X'\beta)$, where $Y = 1$ if a manufacturer realized an innovation in a specific product category in 1989 and 0 otherwise
- Variables:
  - No. of employees in product category (EMPLP)
  - No. of employees in entire firm (EMPLF)
  - Indicator of firm's production capacity utilization (CAP)
  - DEM = 1 if firm expected increasing demand for product and 0 otherwise

ESTIMATED COEFFICIENTS FOR MODEL OF PRODUCT INNOVATION

                        EMPLP   EMPLF     CAP       DEM
Semiparametric Model    1       0.032     0.346     1.732
                                (0.028)   (0.078)   (0.509)
Probit Model            1       0.516     0.520     1.895
                                (0.242)   (0.163)   (0.387)

[Figure: ESTIMATE OF G(V). Upper panel: $G(V)$ against $V$; lower panel: $dG/dV$ against $V$; $V$ ranges from -4 to 12 in both panels]

CONCLUSIONS
- Single-index models:
  - Provide a compromise between the restrictions of parametric models and the imprecision of fully nonparametric models
  - May be structural (e.g., random utility binary-response model)
  - Asymptotic efficiency bounds are available in some cases
- Two classes of estimators:
  - Nonlinear optimization: provides an asymptotically efficient estimator in some cases
  - Direct: non-iterative, does not require solving a nonlinear optimization problem
  - One-step estimation from a direct estimate yields asymptotic efficiency when an efficient estimator is available
- Example based on real data illustrates usefulness