SMOOTHING QUANTILE REGRESSIONS

Emmanuel Guerre (School of Economics and Finance, Queen Mary University of London), Marcelo Fernandes (School of Economics and Finance, Queen Mary University of London), Eduardo Horta (Department of Statistics, UFRGS)

ABSTRACT: This paper proposes a new smoothed version of the quantile regression estimator. We show that smoothing the objective function improves the efficiency of quantile regression estimation relative to smoothing only the check function. The second-order mean squared error of the resulting estimator is studied and shown to be smaller than that of the Koenker and Bassett (1978) quantile regression estimator when an optimal bandwidth is used. The paper proposes a data-driven choice of the bandwidth through cross-validation. A simulation experiment reveals that the improvement can be substantial, with a Mean Squared Error reduction as high as 4% compared to the usual quantile regression estimator of Koenker and Bassett (1978).

JEL CLASSIFICATION: C4.

KEYWORDS: Bahadur approximation, data-driven bandwidth, kernel estimation, quantile regression.

ACKNOWLEDGEMENTS:

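1. INTRODUCTION

2. THE OTHER SMOOTHED QUANTILE REGRESSOR ESTIMATOR

In what follows, we restrict attention to median estimation. This is just for simplicity: it is straightforward to extend the discussion to any other quantile. Define the median estimator as the minimizer of

$$\hat H(b)=\frac1n\sum_{i=1}^n (Y_i-b)\left[\mathcal K\!\left(\frac{Y_i-b}{h}\right)-\frac12\right],$$

where $\mathcal K(t)=\int_{-\infty}^{t}K(u)\,du$. The first and second derivatives of this objective are

$$\hat H^{(1)}(b)=-\frac1n\sum_{i=1}^n\left\{\frac{Y_i-b}{h}\,K\!\left(\frac{Y_i-b}{h}\right)+\mathcal K\!\left(\frac{Y_i-b}{h}\right)-\frac12\right\},$$

$$\hat H^{(2)}(b)=\frac1n\sum_{i=1}^n\left\{\frac{Y_i-b}{h^2}\,K'\!\left(\frac{Y_i-b}{h}\right)+\frac2h\,K\!\left(\frac{Y_i-b}{h}\right)\right\}.$$

2.1. The second-order term. Observe that $\int t^kK'(t)\,dt=-k\int t^{k-1}K(t)\,dt$ for any integer $k$ as long as both integrals exist. It then holds that

$$E\big[\hat H^{(2)}(b)\big]=\int\frac{y-b}{h^2}K'\!\left(\frac{y-b}{h}\right)f(y)\,dy+\frac2h\int K\!\left(\frac{y-b}{h}\right)f(y)\,dy=\int tK'(t)f(b+ht)\,dt+2\int K(z)f(b+hz)\,dz$$

$$=-f(b)+O(h^s)+2f(b)+O(h^s)=f(b)+O(h^s).$$

2.2. The first-order term. The variance of the first-order term satisfies

$$\mathrm{var}\big[\hat H^{(1)}(b)\big]=\frac1n\left\{\mathrm{var}\left[\mathcal K\!\left(\frac{Y_i-b}{h}\right)\right]+\mathrm{var}\left[\frac{Y_i-b}{h}K\!\left(\frac{Y_i-b}{h}\right)\right]+2\,\mathrm{cov}\left[\frac{Y_i-b}{h}K\!\left(\frac{Y_i-b}{h}\right),\mathcal K\!\left(\frac{Y_i-b}{h}\right)\right]\right\}.$$

It turns out that

$$\mathrm{var}\left[\mathcal K\!\left(\frac{Y_i-b}{h}\right)\right]=F(b)-F(b)^2-h\,f(b)\int\mathcal K(z)\,[1-\mathcal K(z)]\,dz+o(h),$$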

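whereas

$$\mathrm{var}\left[\frac{Y_i-b}{h}K\!\left(\frac{Y_i-b}{h}\right)\right]=h\,f(b)\int u^2K^2(u)\,du+o(h),$$

$$\mathrm{cov}\left[\frac{Y_i-b}{h}K\!\left(\frac{Y_i-b}{h}\right),\mathcal K\!\left(\frac{Y_i-b}{h}\right)\right]=h\int zK(z)\mathcal K(z)f(b+hz)\,dz+o(h)=\frac{h\,f(b)}{2}\int\mathcal K(z)\,[1-\mathcal K(z)]\,dz+o(h),$$

given that $2\int zK(z)\mathcal K(z)\,dz=\int\mathcal K(z)\,[1-\mathcal K(z)]\,dz$. It then follows that

$$\mathrm{var}\big[\hat H^{(1)}(b)\big]=\frac1n\left\{F(b)-F(b)^2-h\,f(b)\int\mathcal K(z)\,[1-\mathcal K(z)]\,dz+o(h)+2h\,f(b)\int zK(z)\mathcal K(z)\,dz+h\,f(b)\int u^2K^2(u)\,du+o(h)\right\}.$$

3. SMOOTHING QUANTILE REGRESSIONS

Consider a random sample $\{(Y_i,X_i),\ i=1,\dots,n\}$ of $(Y,X)\in\mathbb R\times\mathbb R^d_+$ such that

$$Y_i=X_i'\beta(U_i)=X_{i1}\beta_1(U_i)+\cdots+X_{id}\beta_d(U_i),$$

where $U_i$ is a uniform random variable on $[0,1]$ independent of $X_i$. Let $F(y|x)=P(Y\le y\mid X=x)$ denote the conditional cumulative distribution function of $Y$ given $X$. Assuming that the mappings $u\mapsto\beta_j(u)$ for $j=1,\dots,d$ are continuous and increasing over the unit interval, the conditional quantile function is then uniquely defined by $Q(\tau|x)=F^{-1}(\tau|x)=x'\beta(\tau)$. Koenker and Bassett (1978) estimate the quantile regression by

$$\hat\beta(\tau)=\arg\min_{b\in\mathbb R^d}\frac1n\sum_{i=1}^n\rho_\tau(Y_i-X_i'b),\qquad(2)$$

with $\rho_\tau(v)=v\,[\tau-I(v<0)]$. The standard approach in the literature is to smooth the check function $\rho_\tau$ by replacing the indicator function $I(v<0)$ with a kernel-based counterpart (see Amemiya, 1982; Horowitz, 1998). We take a different route by using a smoothed estimator of the cumulative distribution of the deviations $V_i(b)=Y_i-X_i'b$ rather than the empirical distribution in (2). We show that this entails a more efficient smoothing of the objective function than just smoothing the check function.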
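As a small illustration of definition (2), the sketch below evaluates the check function and verifies numerically that, in the intercept-only case, the minimizer of the average check loss is a sample quantile. The function names `check_loss` and `sample_check_minimizer` are ours and hypothetical. The search is restricted to the observed data points, which suffices because the objective is convex and piecewise linear with kinks at the observations.

```python
import numpy as np


def check_loss(v, tau):
    """Koenker-Bassett check function rho_tau(v) = v * (tau - 1{v < 0})."""
    v = np.asarray(v, dtype=float)
    return v * (tau - (v < 0))


def sample_check_minimizer(y, tau):
    """Minimize the average check loss over the observed values of y.

    Intercept-only case (d = 1, X_i = 1): the objective is convex and
    piecewise linear in b, so a minimizer is attained at a data point.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - b, tau).mean() for b in y]
    return y[int(np.argmin(losses))]


y = np.arange(1.0, 100.0)                      # the values 1, 2, ..., 99
median_hat = sample_check_minimizer(y, 0.5)    # sample median
q25_hat = sample_check_minimizer(y, 0.25)      # sample first quartile
```

With regressors, the same loss is minimized over $b\in\mathbb R^d$, usually by linear programming; the grid search above is only meant to make the definition concrete.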

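Let $\hat F(t;b)=\frac1n\sum_{i=1}^nI(V_i(b)\le t)$ and $\hat F_h(t;b)=\int_{-\infty}^{t}\hat f_h(v;b)\,dv$, where $\hat f_h(v;b)$ is the kernel density estimator, with bandwidth $h>0$, given by

$$\hat f_h(v;b)=\frac1h\int K\!\left(\frac{v-t}{h}\right)d\hat F(t;b)=\frac1{nh}\sum_{i=1}^nK\!\left(\frac{v-V_i(b)}{h}\right),$$

with $\int K(t)\,dt=1$. Integration by parts gives way to

$$\frac1n\sum_{i=1}^n\rho_\tau(V_i(b))=\int\rho_\tau(v)\,d\hat F(v;b)=-(1-\tau)\int_{-\infty}^{0}v\,d\hat F(v;b)+\tau\int_{0}^{\infty}v\,d\hat F(v;b)$$

$$=(1-\tau)\int_{-\infty}^{0}\hat F(v;b)\,dv+\tau\int_{0}^{\infty}\big[1-\hat F(v;b)\big]\,dv.$$

We propose smoothing the objective function by replacing $\hat F(t;b)$ with $\hat F_h(t;b)$, namely,

$$\hat R(b;\tau,h)=(1-\tau)\int_{-\infty}^{0}\hat F_h(v;b)\,dv+\tau\int_{0}^{\infty}\big[1-\hat F_h(v;b)\big]\,dv.$$

The resulting smoothed quantile regression estimator then is

$$\hat\beta(\tau;h)=\arg\min_{b\in\mathbb R^d}\hat R(b;\tau,h).\qquad(3)$$

It is easier to appreciate the motivation for smoothing the objective function directly by looking at the first-order condition, i.e., at the score function $\hat R^{(1)}(b;\tau,h)=\frac{\partial}{\partial b}\hat R(b;\tau,h)$. It follows from the definitions of $\hat F_h$ and $\hat f_h$ that

$$\hat R^{(1)}(b;\tau,h)=(1-\tau)\int_{-\infty}^{0}\frac{\partial}{\partial b}\hat F_h(t;b)\,dt-\tau\int_{0}^{\infty}\frac{\partial}{\partial b}\hat F_h(t;b)\,dt$$

$$=\frac1n\sum_{i=1}^nX_i\left\{(1-\tau)\int_{-\infty}^{0}\frac1hK\!\left(\frac{t-V_i(b)}{h}\right)dt-\tau\int_{0}^{\infty}\frac1hK\!\left(\frac{t-V_i(b)}{h}\right)dt\right\}$$

$$=\frac1n\sum_{i=1}^nX_i\left\{(1-\tau)\,\mathcal K\!\left(\frac{X_i'b-Y_i}{h}\right)-\tau\left[1-\mathcal K\!\left(\frac{X_i'b-Y_i}{h}\right)\right]\right\}=\frac1n\sum_{i=1}^nX_i\left[\mathcal K\!\left(\frac{X_i'b-Y_i}{h}\right)-\tau\right].$$

If $d=1$ and $X_i=1$, solving $\hat R^{(1)}(b;\tau,h)=0$ reduces to

$$(1-\tau)\,\hat F_h(b)=\tau\,\big[1-\hat F_h(b)\big],\qquad(4)$$

with $\hat F_h(b)=\frac1n\sum_{i=1}^n\mathcal K\!\left(\frac{b-Y_i}{h}\right)$ or, equivalently, $\hat F_h(b)=\tau$. This means that the solution of (4) is the well-known smoothed quantile estimator of Nadaraya (1964). This also illustrates the fact that $\hat R^{(1)}(b;\tau,h)$ is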

analogous to a smoothed cumulative distribution function. This is in contrast with the smoothing procedure put forth by Horowitz (1998), whose first-order condition involves a quantity analogous to a kernel-based estimator of the probability density function. As a result, the second-order derivative $\hat R^{(2)}(b;\tau,h)$ of our smoothed objective function is similar to a kernel density estimator, whereas that of Horowitz (1998) involves terms that are similar to a kernel estimator of the derivative of a probability density function. This ensures that our estimator has better higher-order properties than that of Horowitz (1998), given that the kernel density estimator converges at a faster rate than the kernel derivative estimator.

Our main assumptions are as follows. Denote the conditional probability density function of $Y$ given $X=x$ by

$$f(y|x)=\left[\sum_{j=1}^dx_j\,\beta_j^{(1)}\big(Q^{-1}(y|x)\big)\right]^{-1}$$

and let $f^{(j)}(y|x)$ denote its $j$th derivative $\frac{\partial^j}{\partial y^j}f(y|x)$. Let also $\lfloor s\rfloor$ denote the lower integer part of any positive real number $s$, that is to say, the unique integer satisfying $s-1\le\lfloor s\rfloor<s$. Define $\mathcal X$ and $\mathcal Y\subset\mathbb R$ as the supports of $X$ and $Y$.

Assumption Q. The conditional probability density function $f(y|x)$ is continuous over $\mathcal X\times\mathcal Y$, with $f(y|x)>0$ for all $(x,y)\in\mathcal X\times\mathcal Y$. The conditional quantile function $y\mapsto Q(y|x)$ is strictly increasing for all $x\in\mathcal X$.

Assumption D. There exist some $s,L>0$ such that $\sup_{x,y}|f^{(j)}(y|x)|\le L$, with $\lim_{y\to\pm\infty}f^{(j)}(y|x)=0$ for $j=0,\dots,\lfloor s\rfloor$, and $|f^{(\lfloor s\rfloor)}(y|x)-f^{(\lfloor s\rfloor)}(y'|x)|\le L\,|y-y'|^{s-\lfloor s\rfloor}$ for all $x\in\mathcal X$ and $y,y'\in\mathcal Y$.

Assumption X. The covariates $X=(X_1,\dots,X_d)'$ are almost surely positive, bounded and such that $D=E(XX')$ has full rank.

Assumption K. The kernel $K$ is symmetric, continuous and twice differentiable with $\int|K(y)|\,dy<\infty$, $\int K(y)\,dy=1$, $\int|K'(y)|\,dy<\infty$, $\int K^2(y)\,dy<\infty$, and $0<\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy<\infty$. For $s$ as in Assumption D, $\int|y|^{s+1}|K(y)|\,dy<\infty$ and $\int yK(y)\,dy=\cdots=\int y^{\lfloor s\rfloor+1}K(y)\,dy=0$.

Assumption H. The bandwidth $h$ is in the interval $[\underline h,\bar h]$, with $1/\underline h=o(n/\ln n)$ and $\bar h=o(1)$.

Remarks. (a) Comments on Assumption D and the Laplace distribution $\frac12\exp(-|x|)$, for which $s=1$ even if there is no derivative. (b) Any symmetric non-negative kernel automatically meets the condition $\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy>0$ in Assumption K. The latter is otherwise fairly general, requiring only that the kernel function is moderately large in the tails so as to ensure that $\mathcal K(y)$ remains inferior to one.

Before establishing the main results, it is necessary to introduce some further notation. Let

$$\hat R^{(2)}(b;\tau,h)=\frac{\partial^2}{\partial b\,\partial b'}\hat R(b;\tau,h)$$

and denote the candidate limit of $\hat R^{(2)}(b;\tau,h)$ at $b=\beta(\tau)$ by

$$D(\tau)=E\big[XX'f(X'\beta(\tau)|X)\big].$$

Assumptions Q, D and X then ensure that $D^{-1}(\tau)$ exists for all $\tau$. Let $\|\cdot\|$ denote the Euclidean norm of a vector or a matrix, namely, $\|V\|^2=\mathrm{tr}(V'V)$.

4. MAIN RESULTS

Although this paper ought to illustrate the positive aspects of smoothing, any reader familiar with nonparametric approaches is probably already aware that the benefits expected from these techniques come with potentially important drawbacks. Indeed, the smoothed quantile regression estimator $\hat\beta(\tau;h)$ should not be viewed as an estimator of $\beta(\tau)$, an approach which would amount to ignoring the impact of smoothing. It is more honest to interpret $\hat\beta(\tau;h)$ as an estimator of

$$\beta(\tau;h)=\arg\min_{b\in\mathbb R^d}E\big[\hat R(b;\tau,h)\big],$$

a point of view that acknowledges that $\hat\beta(\tau;h)$ can be a biased estimator of $\beta(\tau)$. The next result studies the order of the bias term $\beta(\tau;h)-\beta(\tau)$.

THEOREM 1 (BiasQ). Given Assumptions Q, X, K, $\beta(\tau;h)$ is uniquely defined for all $\tau\in[\underline\tau,\bar\tau]$ and satisfies, uniformly with respect to $(\tau,h)\in[\underline\tau,\bar\tau]\times[\underline h,\bar h]$,

$$\beta(\tau;h)=\beta(\tau)+O\big(L\,h^{s+1}\big).$$

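Additionally, if $s$ is an integer and $y\mapsto f(y|x)$ is $s$ times continuously differentiable for all $x$, the following expansion holds:

$$\beta(\tau;h)=\beta(\tau)-h^{s+1}B(\tau)\,\{1+o(1)\},\qquad(5)$$

where

$$B(\tau)=\left[\int f(x'\beta(\tau)|x)\,xx'f(x)\,dx\right]^{-1}\frac{\int z^{s+1}K(z)\,dz}{(s+1)!}\;E\big[Xf^{(s)}(X'\beta(\tau)|X)\big].$$

Let us now turn to the positive improvements induced by smoothing. The next theorem deals with Bahadur representations for $\hat\beta(\tau;h)-\beta(\tau;h)$, that is, the approximation of this quantity by a standardized sum.

THEOREM 2 (Baha). Given Assumptions K, Q and X, $\hat\beta(\tau;h)$ is unique for $(\tau,h)\in[\underline\tau,\bar\tau]\times[\underline h,\bar h]$ with probability tending to 1 and satisfies

$$\hat\beta(\tau;h)-\beta(\tau;h)=-\big[\hat R^{(2)}(\beta(\tau;h);\tau,h)\big]^{-1}\hat R^{(1)}(\beta(\tau;h);\tau,h)+O_P\!\left(\frac{\ln^{1/2}n}{(nh)^{3/2}}+\frac1{n^{(1+s)/2}}\right)\qquad(6)$$

$$=-D^{-1}(\tau)\,\hat R^{(1)}(\beta(\tau;h);\tau,h)+O_P\!\left(\frac{\ln^{1/2}n}{(nh)^{3/2}}+\frac1{n^{(1+s)/2}}+\frac{\ln^{1/2}n}{n\,h^{1/2}}+\frac{L\,h^s}{n^{1/2}}\right)\qquad(7)$$

uniformly with respect to $(\tau,h)\in[\underline\tau,\bar\tau]\times[\underline h,\bar h]$, where $\sup_{(\tau,h)}\big\|[\hat R^{(2)}(\beta(\tau;h);\tau,h)]^{-1}\big\|$ and $\sup_{(\tau,h)}\sqrt n\,\big\|\hat R^{(1)}(\beta(\tau;h);\tau,h)\big\|$ are both of order $O_P(1)$.

These linear approximations for $\hat\beta(\tau;h)-\beta(\tau;h)$ differ only because they employ distinct standardizations. The standardization is the random sum $\hat R^{(2)}(\beta(\tau;h);\tau,h)$ in (6), whereas (7) uses instead its limiting counterpart $D(\tau)$. As a consequence, the remainder term of (7) includes an additional term driven by $(\ln n/(nh))^{1/2}+h^s$ relative to (6), corresponding to the order of $\sup_{(\tau,h)}\big\|\hat R^{(2)}(\beta(\tau;h);\tau,h)-D(\tau)\big\|$. As usual, there is a factor $\ln n$ because of the uniform nature of the result. Note that both approximations involve the term $\hat R^{(1)}(\beta(\tau;h);\tau,h)$, which is centered given that

$$E\big[\hat R^{(1)}(\beta(\tau;h);\tau,h)\big]=\frac{\partial}{\partial b}E\big[\hat R(b;\tau,h)\big]\Big|_{b=\beta(\tau;h)}=0.$$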
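The smoothed estimator (3), together with the score $\hat R^{(1)}$ and the Hessian $\hat R^{(2)}$ that enter the Bahadur representations, can be sketched in a few lines. The sketch below is ours and rests on several assumptions not made in the text: it uses the logistic kernel, for which $\mathcal K(t)=1/(1+e^{-t})$, $K=\mathcal K(1-\mathcal K)$ and the smoothed objective has a closed form; it solves the first-order condition by damped Newton iterations; and the names `kernel_cdf`, `objective`, `score`, `hessian` and `fit_smoothed_qr`, as well as the simulated design, are hypothetical.

```python
import numpy as np


def kernel_cdf(t):
    """Integrated logistic kernel, 1 / (1 + exp(-t)), in a stable form."""
    return 0.5 * (1.0 + np.tanh(0.5 * t))


def objective(b, X, y, tau, h):
    """Smoothed objective; closed form for the logistic kernel (assumption).

    Each term equals h * log(1 + exp(v / h)) - (1 - tau) * v with
    v = y - x'b, which tends to the check function rho_tau(v) as h -> 0.
    """
    v = y - X @ b
    return np.mean(h * np.logaddexp(0.0, v / h) - (1.0 - tau) * v)


def score(b, X, y, tau, h):
    """First derivative: average of X_i * [kernel_cdf((X_i'b - Y_i)/h) - tau]."""
    u = (X @ b - y) / h
    return X.T @ (kernel_cdf(u) - tau) / len(y)


def hessian(b, X, y, h):
    """Second derivative: average of X_i X_i' * K((X_i'b - Y_i)/h) / h."""
    c = kernel_cdf((X @ b - y) / h)
    weights = c * (1.0 - c) / h        # logistic density = cdf * (1 - cdf)
    return (X * weights[:, None]).T @ X / len(y)


def fit_smoothed_qr(X, y, tau, h, tol=1e-8, max_iter=200):
    """Damped Newton iterations on the smoothed objective."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]    # least-squares starting value
    for _ in range(max_iter):
        g = score(b, X, y, tau, h)
        if np.linalg.norm(g) < tol:
            break
        step = np.linalg.solve(hessian(b, X, y, h), g)
        f0, t = objective(b, X, y, tau, h), 1.0
        # Halve the step until the objective no longer increases.
        while objective(b - t * step, X, y, tau, h) > f0 + 1e-12 and t > 1e-8:
            t *= 0.5
        b = b - t * step
    return b


rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(0.5, 2.0, size=n)               # positive, bounded covariate
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.standard_normal(n)      # so that beta(0.5) = (1, 2)
beta_hat = fit_smoothed_qr(X, y, tau=0.5, h=0.3)
```

Because the integrated kernel is increasing, the smoothed objective is convex and twice differentiable, which is what makes a Newton-type solver applicable here; the unsmoothed objective in (2) is only piecewise linear.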

Koenker (2005) discusses similar approximations for the regression quantile estimator $\hat\beta(\tau)-\beta(\tau)$; see page 23 as well as references therein. It is well known that the best possible error for the corresponding remainder term is $n^{-3/4}$ up to inessential logarithmic factors. This contrasts with the remainder term of (6), which is at worst of order $n^{-3/4}$ provided that $s>1/2$ and $h$ remains larger than $\ln^{1/3}n/n^{1/2}$. Indeed, the remainder term of (6) achieves the order $1/n$, which is typical of estimators that minimize a smooth criterion function, provided $s\ge1$ and assuming that the order of $h$ remains larger than or equal to $(\ln n/n)^{1/3}$. The study of the remainder term of (7) is more complex due to the additional term $(\ln n/(nh))^{1/2}+h^s$. Choosing a bandwidth proportional to $n^{-1/(2s+1)}$, which is optimal in view of the criterion considered in Theorem 3 below, gives that the order of $(\ln n/(nh))^{1/2}+h^s$ is $(\ln n/n)^{s/(2s+1)}$, which is always larger than $n^{-1/2}$ but is smaller than $n^{-1/4}$ provided $s>1/2$.

Theorem 2 is also a crucial step for the study of the asymptotic properties of the smoothed quantile regression estimator. The linear approximation (7) shows that the asymptotic properties of $\hat\beta(\tau;h)$ are given by the leading term $\beta(\tau;h)-D^{-1}(\tau)\hat R^{(1)}(\beta(\tau;h);\tau,h)$. In what follows, we focus on the optimal choice of the bandwidth when estimating a linear combination $\lambda'\beta(\tau)$. This includes in particular the estimation of each of the coefficients $\beta_j(\tau)$ by a proper choice of the vector $\lambda$. Define the Asymptotic Mean Square Error of $\lambda'\hat\beta(\tau;h)$ as

$$\mathrm{AMSE}\big(\lambda'\hat\beta(\tau;h)\big)=E\Big[\Big\{\lambda'\Big(\beta(\tau;h)-D^{-1}(\tau)\hat R^{(1)}(\beta(\tau;h);\tau,h)-\beta(\tau)\Big)\Big\}^2\Big].$$

This quantity is a proxy for the mean square error $\mathrm{MSE}\big(\lambda'\hat\beta(\tau;h)\big)=E\big[\{\lambda'(\hat\beta(\tau;h)-\beta(\tau))\}^2\big]$. Studying the AMSE instead of the MSE amounts to neglecting the remainder term of (7), which shrinks to 0. The next result describes an optimal bandwidth choice with respect to the AMSE criterion.

Footnote: For the quantile estimator (4), Ralescu (1996) gives a linear approximation with a remainder of order $n^{-3/4}$ up to a logarithmic term without restriction on $s>0$ but provided that the bandwidth is larger than $(\ln\ln n/n)^{1/4}$, suggesting that the order of (6) is too pessimistic for small $s$. Indeed our proof techniques are centered on larger values of $s$ for which a linear approximation can hold with a better order than $n^{-3/4}$.
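To make the rate comparison concrete, here is a short worked example for $s=2$ and $h=n^{-1/(2s+1)}=n^{-1/5}$. It assumes that the remainder of (6) reads $O_P\big(\ln^{1/2}n/(nh)^{3/2}+n^{-(1+s)/2}\big)$ and that the additional term in (7) is $(\ln n/(nh))^{1/2}+h^s$ multiplied by the $n^{-1/2}$ order of the score; both readings are our interpretation of the statements above.

```latex
\begin{align*}
nh &= n^{4/5}, \qquad (nh)^{3/2} = n^{6/5}, \\
\text{remainder of (6):}\quad
  & \frac{\ln^{1/2} n}{n^{6/5}} + \frac{1}{n^{3/2}}
    = O\!\left(n^{-6/5}\ln^{1/2} n\right), \\
\text{additional factor in (7):}\quad
  & \left(\frac{\ln n}{nh}\right)^{1/2} + h^{s}
    = n^{-2/5}\ln^{1/2} n + n^{-2/5}
    = O\!\left(n^{-2/5}\ln^{1/2} n\right), \\
\text{times the score order } n^{-1/2}:\quad
  & O\!\left(n^{-9/10}\ln^{1/2} n\right).
\end{align*}
```

Under these assumptions both remainders are smaller than the $n^{-3/4}$ benchmark of the unsmoothed estimator, since $6/5>3/4$ and $9/10>3/4$, and the exponent $2/5$ agrees with $s/(2s+1)$ at $s=2$.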

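THEOREM 3 (VarQ). Given Assumptions Q, D, X, K, and H we have, uniformly with respect to $(\tau,h)\in[\underline\tau,\bar\tau]\times[\underline h,\bar h]$,

$$\mathrm{var}\Big[\sqrt n\,D^{-1}(\tau)\,\hat R^{(1)}(\beta(\tau;h);\tau,h)\Big]=\tau(1-\tau)\,D^{-1}(\tau)\,D\,D^{-1}(\tau)\left\{1-2h\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy\right\}+O\big(h^{1+s}\big).\qquad(8)$$

If the expansion in (5) holds, $\mathrm{AMSE}\big(\lambda'\hat\beta(\tau;h)\big)$ is asymptotically minimal if $h=(1+o(1))\,h_{opt}$, where

$$h_{opt}=\left[\frac{\lambda'D^{-1}(\tau)\,D\,D^{-1}(\tau)\lambda\;\tau(1-\tau)\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy}{(s+1)\,\big(\lambda'B(\tau)\big)^2\,n}\right]^{\frac1{2s+1}},$$

as long as $\lambda'B(\tau)\ne0$. In this case,

$$\mathrm{AMSE}\big(\lambda'\hat\beta(\tau;h_{opt})\big)=\frac{\lambda'D^{-1}(\tau)\,D\,D^{-1}(\tau)\lambda\;\tau(1-\tau)}{n}\left\{1-\frac{2(2s+1)}{2s+2}\,h_{opt}\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy\right\}+o\Big(n^{-\frac{2s+2}{2s+1}}\Big).$$

A key point in Theorem 3 is the fact that the variance expansion in (8) includes a negative term, namely, $-2h\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy$. This implies that the asymptotic variance of the smoothed $\hat\beta(\tau;h)$ is smaller than that of the quantile regression estimator $\hat\beta(\tau)$. As already noticed by Azzalini (1981) for standard quantile estimation, it then follows that the AMSE of $\lambda'\hat\beta(\tau;h)$ can be made smaller than that of $\lambda'\hat\beta(\tau)$. In other words, smoothing the loss function $\rho_\tau$ as in (3) improves on the standard regression quantile estimator. In the case of quantile estimation, a fundamental reason for such an improvement is that the smooth estimator of the cumulative distribution function dominates the sample cumulative distribution function, as shown in Reiss (1981).

Under Assumptions Q and D, the optimal bandwidth is of order $n^{-1/(2s+1)}$, like the one that would be used to estimate the unconditional probability density function of $Y$. The order of the second-order variance term with $h_{opt}$ is $n^{-1/(2s+1)}/n=n^{-(2s+2)/(2s+1)}$, which is close to the order of the first-order variance term $\lambda'D^{-1}(\tau)\,D\,D^{-1}(\tau)\lambda\,\tau(1-\tau)/n$ when $s$ is reasonably large, being for instance $n^{-6/5}$ if $s=2$. Theorem 3 gives a more detailed expansion in the case where $y\mapsto f(y|x)$ is $s$ times continuously differentiable. The expression of the optimal bandwidth $h_{opt}$ resembles the one given in Lio and Padgett (1991) for estimation of the cumulative distribution function; see also Reiss (1981).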
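The kernel enters the variance expansion only through the constant $\int\mathcal K(y)\,[1-\mathcal K(y)]\,dy$. As a rough numerical check, the sketch below evaluates this constant for two kernels of our own choosing (Gaussian and logistic, neither of which is prescribed by the text) and confirms the identity $2\int zK(z)\mathcal K(z)\,dz=\int\mathcal K(z)\,[1-\mathcal K(z)]\,dz$ used in Section 2. The helper names and the integration grid are hypothetical.

```python
import math

import numpy as np


def trapezoid(values, grid):
    """Plain trapezoidal rule on a fixed grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


grid = np.linspace(-40.0, 40.0, 160001)

# Gaussian kernel: density and integrated kernel.
gauss_pdf = np.exp(-0.5 * grid**2) / math.sqrt(2.0 * math.pi)
gauss_cdf = 0.5 * (1.0 + np.vectorize(math.erf)(grid / math.sqrt(2.0)))

# Logistic kernel: the density equals cdf * (1 - cdf).
logis_cdf = 0.5 * (1.0 + np.tanh(0.5 * grid))
logis_pdf = logis_cdf * (1.0 - logis_cdf)


def smoothing_constant(cdf_vals):
    """Integral of K_cdf(y) * (1 - K_cdf(y)) over the grid."""
    return trapezoid(cdf_vals * (1.0 - cdf_vals), grid)


def cross_moment(pdf_vals, cdf_vals):
    """Integral of z * K(z) * K_cdf(z) over the grid."""
    return trapezoid(grid * pdf_vals * cdf_vals, grid)


c_gauss = smoothing_constant(gauss_cdf)      # analytically 1 / sqrt(pi)
c_logis = smoothing_constant(logis_cdf)      # analytically 1
m_gauss = cross_moment(gauss_pdf, gauss_cdf)
m_logis = cross_moment(logis_pdf, logis_cdf)
```

Both constants are strictly positive, in line with Remark (b) on symmetric non-negative kernels.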

10 5. A DATA-DRIVEN BANDWIDTH CHOICE Defie β λ τ; λ βτ; ad Bλ β λ τ; β λ τ; for λ R d. For a badwidt b, let D X ix i, D R 2 βτ; b; τ, b, D τ τ D D D, ad D λ λ Dλ. Let also H {,,..., K }, wit k k. Cosider te data-drive badwidt ĥ arg mi H Q;, wit Q; B 2 λ 4 Dλ K y K y dy, for. We propose to compute te badwidt by ĥ ĥ+, leadig to te fully data-drive estimator β λ τ β λ τ; ĥ. Let argmi H Q;, were Q; B 2 λ 2 D λ τ β λ τ; λ βτ;, ad B λ β λ τ; β λ τ;. K y K y dy, Assumptio B: Te badwidt b is suc tat b o ad /b o/ l, wereas te badwidts ad K respectively satisfy C /2 l ad K o, wit Kl. I additio, > is suc tat wit meetig te followig coditios. l l l. Tere also exist ukow s > /2 ad L > L > B B λ / Ls + s, L s+ B λ, ad is uiquely defied i te limit wit /2s++. B2 For all ɛ >, lim if mi 2s/2s++ H; / ɛ Q; Q ; >. THEOREM 4 DD. Uder Assumptios B ad K, te data-drive smooted quatile regressio estimator is suc tat, for +, βτ βτ Z + o P, were EZ ad varz { 2 } K y K y dy τ τd τd D τ. Note tat tere is o bias term above because βτ; βτ o P.

11 Lettig Rb; τ, E Rb; τ, yields 6. PROOFS Rb; τ Rb; τ, E ρ τ Y X b x b τ F v x dv + τ wose first- ad secod-order derivatives are respectively x b F v x dv fx dx, R b; τ R 2 b; τ F x b x τ x fx dx, fx b xxx fx dx. Let also deote S b,τ, te set R d,, to wic b, τ, belog. 6.. Prelimiary results. LEMMA BiasRF. Assumptios K, D ad X esure tat R 2 b; τ, R 2 b; τ b,τ, S b,τ, L s O, R b; τ, R b; τ b,τ, S b,τ, L s+ O, Rb; τ, Rb; τ b,τ, S b,τ, maxl, s+ O, R 2 max j j 2 b + δ; τ, R 2 j j 2 b; τ, j,j 2 d b,τ, S b,τ, L δ s O. Proof of Lemma. We ave Rb; τ, τ + τ τ + τ { v E { v E K t Y X b } dt E K t Y X b } dt dv, dv { } v+x b E E K t Y X dt dv { E v+x b K t Y X } dt dv.

12 It te follows from te Lebesgue Domiated Covergece teorem tat { R b; τ, E τ XE K v + X b Y X dv τ E τ X b X K v Y dv τ X b X K v Y dv R 2 b; τ, τe XX K X b Y + τe XX K X b Y E XX K X b Y. XE K v + X b Y X Assume from ow o tat s give tat te case s is trivial. Uder Assumptio D, a Taylor expasio wit itegral remaider gives s fv + z x f l v x l zl l! + zs s! Usig a cage of variables y t + x b + z yields uder Assumptio K E K t Y X b X x ft + x b x f s v + wz x w s dw. K t y x bfy x dy ft + x b x Kzft + x b + z x ft + x b x dy, } dv It te follows from Assumptio D tat E K X b Y { f s t + x b + wz x f s t + x b x z s fx b xfxdx C s z s Kz dz. s! Kz d Tis yields te first stated result. We ow tur our attetio to te stated results for R ad R. Itegratig 9 wit b yields x b { } E K t Y X x ft x dt { } z f s x b + wz x f s x b x s Kz dz w s dw s! { } z f s x b + vwz x f s x s+ b x dv Kz dz w w s dw s! C L s+, 2

13 wereas v+x b { } E K t Y X x ft x dt { } z f s 2 v + x b + wz x f s 2 v + x s b x s! Kz dz w s dw C maxl, s+. Tis implies tat { R b; τ, τf x b x τ F x b x } x fx dx CLs+, + } { Rb; τ, τ F v + x b x dv + τ F v + x b x dv fx dx CL s+ for all τ,. Observe ow tat R 2 j j 2 b; τ, E X j X j2 K X b Y x x 2 Kz fx b + z x fx dx dz. It te follows tat R 2 j j 2 b + δ; τ, R 2 j j 2 b; τ, uiformly i b,, δ, τ, j ad j 2. L x x 2 Kz fx b + x δ + z x fx b + z x fx dx dz x x 2 Kz x δ s fx dx dz CL δ s, LEMMA 2 StocR. Give Assumptios K ad X, max b,τ, R d,, b,τ, R d,, j,j 2 d b,τ, R d,, were, for te latter R b; τ, R b; τ, O P, R2 b; τ, R 2 b; τ, l / l l O P, 2 R j j 2 b; τ, R 2 j j 2 b; τ, δ l / l l O P, R 2 j j 2 b; τ, R 2 j j 2 b + δ; τ, R 2 j j 2 b; τ,, R 2 j j 2 b; τ, R 2 j j 2 b + δ; τ, R 2 j j 2 b; τ,. Proof of Lemma 2. Recall tat 3

14 R b; τ, { τ t Yi X i X i K b } dt τ X X i K i b Y i τ. + { t Yi X i X i K b } dt Cosider te classes of fuctios { x F j x, y x j K b y ; b, R d, }, j,..., d. Sice K < ad X j is bouded, Lemma 22-ii i Nola ad Pollard 987 esures tat F j is Euclidea wit a square itegrable evelope C X j. Te te remark below Defiitio 8 i Nola ad Pollard 987, Propositio ad iequality p. 39 i Eimal ad Maso 25 imply tat { } X X ji K i b Y i X E X ji K i b Y i O P. b, R d, Te te expressio of R b; τ, give above sows tat R b; τ, R b; τ, O P. b,τ, R d,, Te proof of te result for R 2 follows te same steps ta te proof of Teorem i Eimal ad Maso 25, usig K < ad Assumptio K2. Te proof of te result for R 2 follows te same steps, usig K < ad X j X j2 X b + δ Y K δ X b Y K to boud te summads i R 2 ad to compute teir variace X j X j2 X δ X K b + wδ Y dw δ C X K b + wδ Y dw 6.2. Proof of Teorem. Let βτ; be ay miimizer of Rb; τ,. Lemma sows, sice β τ arg mi b R b; τ R βτ; ; τ R βτ; τ if b Rb; τ, mi R βτ; τ + R βτ; ; τ R βτ; ; τ, O max L, s+, 4 b

15 uiformly i ad τ. Sice R βτ; τ, te Taylor formula wit itegral remaider gives R βτ; ; τ R βτ; τ βτ; βτ R 2 βτ + w βτ; βτ ; τ dw βτ; β τ. Sice R 2 b; τ f x b x xx fxdx, Assumptio Q ad X esures tat te miimal eigevalue of R 2 βτ + w βτ; βτ ; τ dw is larger ta C > for all τ τ, τ ad,. Hece substitutig i te iequalities above gives βτ; βτ max L, s+ /2 O.,τ, τ,τ We ow sow tat βτ; is uique, tat is Rb; τ, as a uique miimizer, provided tat is small eoug. Cosider te compact set C C ɛ { } b R d ; b j β j τ ɛ, β j τ + ɛ, j,..., p, ɛ > recallig tat β j icreases wit τ uder Q. Takig ɛ small eoug esures via Q ad X tat te miimal eigevalue of te defiite positive R 2 b; τ is larger ta a C > for all b C, τ τ, τ. Hece Lemma esures tat te maps b C Rb; τ,, τ τ, τ ad,, are all strictly covex provided is small eoug. Sice sows tat all te cadidate miimizers β τ;, τ τ, τ ad,, are i C for is small eoug, te strict covexity of te correspodig Rb; τ, esures tat all tese βτ; are uiquely defied. We ow improve. Sice Lemma gives tat R βτ; ; τ, adr βτ; τ, τ,,, τ,,, R β τ; ; τ R βτ; τ L s+ R β τ; ; τ, R βτ; ; τ L s+ O. 5

16 Now te Taylor formula, te expressio of R 2, Assumptio Q ad give R βτ; ; τ R βτ; τ R 2 βτ + w βτ; βτ ; τ dw βτ; βτ R 2 βτ; τ + O L βτ; βτ s β τ; βτ R 2 βτ; τ + O L max L, s+ s /2 βτ; βτ uiformly i τ,,,. Sice te LHS is O L s+, tis gives uiformly i τ,,, ad provided tat is small eoug, ad βτ; βτ + O L s+ is prove. Te proof of 5 similarly uses te expasios { } βτ; βτ + R 2 βτ; τ O L s+ R βτ; τ, R βτ; ; τ, R βτ; τ, { R 2 βτ; τ + O C} βτ; βτ, ad, usig s s +, ad w 2 w s dw/2 /ss + s + 2 R z s+ Kzdz βτ; τ, + o f s x βτ x fxdx, s +! so tat 5 olds Proof of Teorem 2. Lemmas ad 2 give sice τ,,, τ,,, b,τ, R d,, + o P. Sice τ,,, β R τ; ; τ R βτ; τ β R τ; ; τ R b; τ R b; τ, + R βτ; ; τ, R βτ; ; τ R βτ; τ b,τ, R d,, R b; τ, R b; τ, R 2 βτ + w βτ; βτ ; τ dw βτ; βτ 6

17 ad because te miimal eigevalue of R2 βτ + w b βτ ; τ dw is larger ta C > for all b ad τ τ, τ uder Assumptios Q ad X, tis gives tat τ, τ,τ, β τ; βτ o P, ad by Teorem, τ, τ,τ, β τ; βτ; o P. 2 Note te former sows tat it is possible to fid a compact B suc tat { βτ; }, τ, τ, τ, B wit a probability tedig to. Because Lemmas ad 2 imply tat all te fuctios b B R b; τ, are strictly covex wit a probability tedig to, it follows tat all te βτ;, τ, τ, τ, are uiquely defied wit a probability goig to. Sice R β τ; ; τ,, Lemmas 2, ad Teorem, we ave uiformly i τ, τ, τ,, R βτ; ; τ, R β τ; ; τ, R β τ; ; τ, R 2 β τ; + w βτ; β τ; ; τ, dw β τ; βτ; R 2 l /2 βτ; ; τ, + O P βτ; β τ; + O L β τ; βτ; s βτ; βτ;, 3 were te eigevalues of R 2 β τ; ; τ, are bouded away from ad ifiity wit a probability tedig to. Sice R βτ; ; τ,, Lemma 2 sows tat R βτ; ; τ, O P / uiformly. Hece applyig 3 ad 2 give, sice o/ l, 3/2 βτ; βτ; OP τ, τ,τ, l /2 ad, reapplyig 3, Applyig 3 agai gives βτ; β τ; τ, τ,τ, β τ; βτ; O P R 2 l /2 β τ; ; τ, + O P 3/2 + s /2 R βτ; ; τ, R2 βτ; ; τ, R l /2 βτ; ; τ, + O P 3/2 + s /2 7

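uniformly with respect to $(\tau,h)\in[\underline\tau,\bar\tau]\times[\underline h,\bar h]$. This gives the first Bahadur representation (6) in Theorem 2. The Bahadur representation (7) then follows from Lemmas 1 and 2.

Proof of Theorem 3. We have, since $E\big[\hat R^{(1)}(\beta(\tau;h);\tau,h)\big]=0$,

$$\mathrm{var}\Big[\sqrt n\,\hat R^{(1)}(\beta(\tau;h);\tau,h)\Big]=\mathrm{var}\left\{X\left[\mathcal K\!\left(\frac{X'\beta(\tau;h)-Y}{h}\right)-\tau\right]\right\}=E\left\{XX'\left[\mathcal K\!\left(\frac{X'\beta(\tau;h)-Y}{h}\right)-\tau\right]^2\right\}$$

$$=E\left[XX'\,\mathcal K\!\left(\frac{X'\beta(\tau;h)-Y}{h}\right)^2\right]-2\tau\,E\left[XX'\,\mathcal K\!\left(\frac{X'\beta(\tau;h)-Y}{h}\right)\right]+\tau^2E\big[XX'\big].$$

Assumptions D and K give, integrating by parts and using Theorem 1,

$$E\left[\mathcal K\!\left(\frac{X'\beta(\tau;h)-Y}{h}\right)\Big|\,X=x\right]=\int\mathcal K\!\left(\frac{x'\beta(\tau;h)-y}{h}\right)f(y|x)\,dy=\int\frac1hK\!\left(\frac{x'\beta(\tau;h)-y}{h}\right)F(y|x)\,dy$$

$$=F(x'\beta(\tau;h)|x)+\int\big[F(x'\beta(\tau;h)-hz|x)-F(x'\beta(\tau;h)|x)\big]K(z)\,dz$$

$$=F(x'\beta(\tau)|x)+O\big(h^{s+1}\big)+O\big(h^{s+1}\big)=\tau+O\big(h^{s+1}\big),$$

since $x'\beta(\tau)=F^{-1}(\tau|x)$ and arguing as in Lemma 1. For the term involving $\mathcal K^2$, define

$$\bar K(z)=2K(z)\mathcal K(z)=\frac{d}{dz}\big[\mathcal K(z)^2\big],$$

which is such that $\int\bar K(z)\,dz=\lim_{z\to+\infty}\mathcal K(z)^2=1$.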

19 Arguig as above ow gives X E K βτ; Y 2 X x x K βτ; y F y x dy τ + O s+ + F x βτ; z x F x βτ; x Kz dz τ f x βτ; x + O s zkz dz + O s+ τ f x βτ x + O +s s + O s zkz dz + O s+ τ f x βτ x zkz dz + O +s. Substitutig gives te variace expasio sice K z K z, zkz dz 2 zkzk zdz zd K z zd K z + K z 2 dz + + K z K z 2 dz K z dz 2 Te variace expasio ad 5 te gives AMSE λ +s βτ; g + o were g Sice g s+2, + K z K z dz. τ τ λ K z K D z dz τd D τλ + 2s+2 λ Bτ 2. K z K z dz τ τ λ D τd D solvig g gives te desired opt ad te correspodig AMSE expasio. τλ +2s + 2 2s+ λ Bτ 2, 7. WORK IN PROGRESS Defie for λ R d β λ τ; λ βτ;, Bλ β λ τ; β λ τ;. 9

20 Cosider a badwidt b. Let D D τ τ X i X i, D R 2 β τ; b ; τ, b, D D D, Dλ λ Dλ. Let H {,,..., K } were k k. A first data-drive badwidt is, for ad, Te proposed optimal badwidt is ĥ arg mi H Q ; were Q ; B 2 λ 4 Dλ ĥ ĥ+ + K y dy. leadig to te fully data-drive estimator β λ τ β λ τ; ĥ. 7.. Proof i progress. Set β λ τ; λ β τ;, B λ β λ τ; β λ τ; ad Q ; B 2 λ 4 D λ τ arg mi Q ;. H + K y dy, Assumptio B: Te badwidts, K ad K satisfy l /2 C, K o adk/ l. > is suc tat wit lim L > L > suc tat l l l +. Tere are some ukow s > /2 ad B: B λ / L s + s, L s+ B λ ad is uiquely defied asymptotically wit /2s++ ; B2: lim if 2s/2s++ mi H; / ɛ {Q ; Q ; } > for all ɛ >. Routie modificatios of Arcoes 996 give l βλ τ; β λ τ λ D τ R /4 βτ; τ, + O P, 2

21 see also He ad Sao 996, Remark 3.4. Hece 7 gives β λ τ; β /2 λ τ; B λ τ; λ were +O P l /3 3/4 + l/2 3/2 + /2 λ λ D τ R βτ; ; τ, R βτ; τ, λ D τ s /2+ + B λ /2 { X X i K i βτ; Y i I Y i X iβτ }. We ave E λ ad X E K β τ; Y I Y X βτ 2 X x + x βτ + + x βτ x K βτ; y x βτ βτ;/ x βτ βτ;/ + f x βτ x { x K βτ; y 2 fy x dy 2 fy x dy K u 2 f x β τ; u x du K u 2 f x βτ; u x du K u 2 du + + } 2 K u du uiformly i, ad x uder Assumptios B ad X. Hece, uiformly i,, { + } 2 var λ K u 2 du + K u du λ D τe XX f X βτ X D τλ. Te arguig as i Eimal ad Maso 25 gives λ τ, max O P. l /, 2

22 It follows tat, uiformly wit respect to H, β λ τ; β l / λ τ; B λ τ; + o P O P 3/4 +O P l /3 + l/2 3/2. /2 Observe ow tat te coditios o, ad K i Assumptio B esure tat { } { l / max H max l } { max exp l H H\{ } O exp l l l K l max H\{ } 3/4 l /3 / l/2 3/2 { 3/4 /2 } l /3 mi H\{ } max H\{ } l /3 l l l o K o, } 3/4 3/2 2 l /3 /2 l /6 O, 3/4 /2 /2 /2. l Tis gives uiformly wit respect to H B λ τ; β λ τ; β /2 λ τ; B λ τ; + o P + o P. Sice ad D λ D λ + o P, it follows tat max H max H B 2 λ τ; + / Q ; + / O Observe ow tat, sice /K o/ l ad usig B, Q ; Q ; Q ; + / o P. 4 C Q ; mi H {L 2 2s+2 C } C + o. Tis implies i particular tat Q ; Q ; + o P. Cosider ow some ɛ,. Uder B, we ave for all ɛ C Q ; L2 2s+2 C C ad 4 gives mi Q ; mi Q ; + o P Q ;. H; ɛ H; ɛ 22

23 It follows tat, uder B, B2 ad sice Q ; Q ; + o P, mi Q ; > mi Q ; H; ɛ H; / <ε wit a probability tedig to. Cosider ow + ɛ. Set κ 4D λ τ + K y dy so tat Q; Bλ 2 ; τ κ /. We ave for tose suc tat Q; κ/2 / Q ; Q ;. κ If Q; < κ/2 /, te B 2 λ ; τ 3κ/2 / so tat B gives 3κ/2L 2 /2s++ implyig tat Q; CQ ;. Hece 4 gives mi Q ; + op mi Q ; + o P Q ;, H; ɛ H; ɛ wic togeter wit Q ; Q ; + o P ad B2 gives mi Q ; > mi Q ;. H; +ɛ H; / <ε Hece ĥ/ + o P wic also sows tat ĥ+ + o P. We ow tur to βτ β τ; ĥ, ĥ ĥ+. 7 gives, for η ĥ+ / + were + β λ τ β λ τ; + / + + /2 E λ η l /2 +O P + /2 E λ η λ D τ R β τ; η + { λ D τ X i 3/2 + ; τ, η + X K i β τ; η + s /2+ + η + + /2 s+ R β τ; + Yi ; τ, + X K i β τ; + + } Yi. 23

24 We first study te icremets of E λ. Assumptio B gives for all η, η 2 /2, 2, { E X + K β τ; η + Y X η + K β τ; η 2 + } 2 Y η 2 + X x η {K u K u + x β τ; η + η β τ; η 2 + η 2 } 2 f x β τ; η + η + u x du x η 2 βτ;η + η βτ;η 2 + /η 2 Cη Ku + v dv 2 du C η 2 β τ; η + η β τ; η C η2 β τ; η + β τ; η2 + + η2 η β τ; η2 + 2 C η2 η 2. Hece va der Vaart ad Weller Corollary 2.2.5, 996 gives tat for ay ɛ > 2 E /2 E λ η 2 E λ η η,η 2 /2,2; η 2 η ɛ Cɛ /2. Tis gives E λ η O P ĥ + + /2 o P, ad, sice + l / /2, β λ τ β λ τ; + op + + /2 o P. /2 l /2 + O P + 3/2 + s / /2 s+ Tis gives te desired result. REFERENCES AMEMIYA, T. 982: Two stage least absolute deviatios estimators, Ecoometrica, 5, AZZALINI, A. 98: A ote i te estimatio of a distributio ad quatiles by a kerel metod, Biometrika, 68, HOROWITZ, J. L. 998: Bootstrap metods for media regressio models, Ecoometrica, 66, KOENKER, R. 25: Quatile Regressio. Cambridge Uiversity Press, Cambridge. KOENKER, R., AND G. BASSETT 978: Regressio quatiles, Ecoometrica, 46, LIO, Y. L., AND W. J. PADGETT 99: A ote o te asymptotically optimal badwidt for Nadaraya s quatile estimator, Statistics & Probability Letters,, NADARAYA, E. A. 964: Some ew estimates for distributio fuctio, Teory of Probability ad its Applicatios, 9, RALESCU, S. S. 996: A Baadur-Kiefer law for te Nadaraya empiric-quatile processes, Teory of Probability ad its Applicatios, 4, REISS, R.-D. 98: Noparametric estimatio of smoot distributio fuctios, Scadiavia Joural of Statistics, 8,
