Part 4b: Asymptotic Results for MRR2 using PRESS

Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971)) particular to the regression problem, and involves finding Ŷᵢ,₋ᵢ, the estimate at the i-th observation found by removing the i-th data pair, (xᵢ, Yᵢ), from the data set. In the MRR2 case,

    PRESS(λ) = Σ_{i=1}^{n} ( yᵢ − ( f̂ᵢ,₋ᵢ + λ ĝᵢ,₋ᵢ ) )².

Once again, we choose λ by finding the value of λ that minimizes PRESS(λ). This is done by setting (d/dλ) PRESS(λ) = PRESS′(λ) = 0 and solving for λ. We obtain

    PRESS′(λ) = −2 Σ_{i=1}^{n} ĝᵢ,₋ᵢ ( yᵢ − f̂ᵢ,₋ᵢ − λ ĝᵢ,₋ᵢ ) = 0.

Solving this equation results in λ̂ as

    λ̂ = Σ_{i=1}^{n} ĝᵢ,₋ᵢ ( yᵢ − f̂ᵢ,₋ᵢ ) / Σ_{i=1}^{n} ĝᵢ,₋ᵢ² = ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ / ‖ ĝ_(-i) ‖².

Observe that this parameter estimate is similar to λ̂* except that the parametric and nonparametric estimates have been replaced with the analogous cross validated estimates. Of course, we must ensure that PRESS″(λ) > 0. This follows from PRESS″(λ) = 2 ‖ ĝ_(-i) ‖² > 0 for λ ∈ ℝ, except for the degenerate case ĝ_(-i) = 0, which we will not worry about here. Thus λ̂ does, in fact, produce a global minimum. And it is this estimate that we will study asymptotically in the remainder of the section. We will again obtain the difference between λ* and λ̂, then investigate that difference asymptotically.
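The PRESS minimization above can be illustrated numerically. The sketch below is a minimal stand-in for the MRR2 setup, not the procedure studied in this work: it assumes a simple-linear OLS fit for the parametric estimate and a Nadaraya-Watson smoother of the OLS residuals (Gaussian weights, fixed bandwidth h = 0.2) for the nonparametric estimate, both recomputed with the i-th pair deleted; the closed-form minimizer of PRESS(λ) is then checked against a grid search. All modeling choices here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)  # toy data

def nw(xs, rs, x0, h=0.2):
    """Nadaraya-Watson smooth of residuals rs at x0 (stand-in for g-hat)."""
    w = np.exp(-0.5 * ((x0 - xs) / h) ** 2)
    return np.sum(w * rs) / np.sum(w)

# Leave-one-out estimates f-hat_{i,-i} and g-hat_{i,-i}.
f_loo = np.empty(n)
g_loo = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    b1, b0 = np.polyfit(x[keep], y[keep], 1)   # simple-linear OLS, pair i deleted
    f_loo[i] = b0 + b1 * x[i]
    resid = y[keep] - (b0 + b1 * x[keep])
    g_loo[i] = nw(x[keep], resid, x[i])

def press(lam):
    """PRESS(lambda): sum of squared deletion prediction errors."""
    return np.sum((y - (f_loo + lam * g_loo)) ** 2)

# Closed-form minimizer obtained by setting dPRESS/dlambda = 0.
lam_hat = np.sum(g_loo * (y - f_loo)) / np.sum(g_loo ** 2)

# Sanity check: lam_hat should agree with a fine grid search over lambda.
grid = np.linspace(-2.0, 3.0, 5001)
lam_grid = grid[np.argmin([press(l) for l in grid])]
print(lam_hat, lam_grid)
```

Because PRESS(λ) is an upward-opening quadratic in λ, the single stationary point is the global minimum, which is what the grid search confirms.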
    λ* − λ̂ = ⟨ ĝ, θ − f̂ ⟩ / ‖ĝ‖² − ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ / ‖ ĝ_(-i) ‖²    (4B1)

           = ⟨ ĝ, θ − f̂ ⟩ / ‖ĝ‖² − ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ / ( ‖ĝ‖² + ( ‖ĝ_(-i)‖² − ‖ĝ‖² ) ).    (4B2)

We have the same denominator problem as in section 3b. Recall from that section that the asymptotic rates for f̂ − f̂_(-i) and ĝ − ĝ_(-i) are O(n^{-1}) and O(n^{-1}), respectively (Burman and Chaudhuri (1992), results 61 and 60, respectively). Next, set α_n = ‖ĝ_(-i)‖² − ‖ĝ‖². Recall that in using cross validated estimates this difference is very important. We will need the following lemma and its corollary. These results deal with the difference term α_n and a closely related term which will prove important in the results that follow. The proofs for both results are found in appendix 4b.

Lemma 4b1: Assuming conditions A1–A6,

    α_n = O(n^{-1}),    if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
        = O(n^{-3/2}),  if lim_{n→∞} n^{1/2} ‖g‖ = 0.

Corollary 4b1: Assuming conditions A1–A6,

    ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ − ⟨ ĝ, Y − f̂ ⟩ = O(n^{-1}),    if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
                                            = O(n^{-3/2}),  if lim_{n→∞} n^{1/2} ‖g‖ = 0.

An important artifact of this lemma is that α_n converges to zero faster than ‖ĝ‖². This implies that the denominator on the right side of (4B2) can be handled (asymptotically) by dealing with ‖ĝ‖².
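The claim that α_n is negligible relative to ‖ĝ‖² can be checked numerically. The sketch below reuses a toy MRR2 setup (simple-linear OLS plus a Gaussian-weight residual smoother, both illustrative assumptions) and takes ‖v‖² = n^{-1} Σ vᵢ², which is an assumption about the norm convention:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

def nw(xs, rs, x0, h=0.15):
    """Gaussian-weight smoother of residuals rs at x0."""
    w = np.exp(-0.5 * ((x0 - xs) / h) ** 2)
    return np.sum(w * rs) / np.sum(w)

# Full-data parametric fit and nonparametric fit g-hat of its residuals.
b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)
g_full = np.array([nw(x, resid, xi) for xi in x])

# Leave-one-out counterpart g-hat_{(-i)}: refit everything with pair i deleted.
g_loo = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    bb1, bb0 = np.polyfit(x[keep], y[keep], 1)
    r = y[keep] - (bb0 + bb1 * x[keep])
    g_loo[i] = nw(x[keep], r, x[i])

def norm2(v):
    return np.mean(v ** 2)   # ||v||^2 = n^{-1} sum v_i^2 (assumed convention)

alpha_n = norm2(g_loo) - norm2(g_full)
print(alpha_n, norm2(g_full))
```

With each deleted point carrying only a small share of the smoother's weight, the leave-one-out norm differs from the full-data norm by an amount far smaller than the norm itself, which is the behavior Lemma 4b1 formalizes.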
Rewriting the right hand term in (4B2), we have

    ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ / ( ‖ĝ‖² + α_n )
        = ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ / ‖ĝ‖² − α_n ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ / ( ‖ĝ‖² ( ‖ĝ‖² + α_n ) ).

As before, the left part of the last term is what we need to complete the problem. The right part, however, we must ultimately deal with, and shall call it the remainder term (R_n). We have the following lemmas that give asymptotic results for R_n, and will ultimately provide a foundation for finding the asymptotic convergence rates for (4B2). The proofs for Lemmas 4b2 and 4b3 are found in appendix 4b.

Lemma 4b2: Assuming conditions A1–A6,

    ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ = O(1),        if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
                            = O(n^{-1/2}),  if lim_{n→∞} n^{1/2} ‖g‖ = 0.

Lemma 4b3: Assuming conditions A1–A6,

    R_n = O(n^{-1}),    if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
        = O(n^{-1/2}),  if lim_{n→∞} n^{1/2} ‖g‖ = 0.

The importance of the preceding result is that it will allow us to rewrite (4B2) with a common denominator, which will lead to the important result of Lemma 4b4.
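The rewriting above rests on the elementary identity 1/(G + α) = 1/G − α/(G(G + α)); the remainder term is exactly the second piece applied to the cross validated inner product. A quick numerical check, with arbitrary hypothetical values standing in for the inner product, ‖ĝ‖², and α_n:

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.normal()             # stands in for <g-hat_(-i), Y - f-hat_(-i)>
G = abs(rng.normal()) + 1.0  # stands in for ||g-hat||^2 (kept positive)
alpha = 0.01 * rng.normal()  # stands in for alpha_n, small relative to G

lhs = S / (G + alpha)                      # original term of (4B2)
leading = S / G                            # part kept for the main argument
remainder = S * alpha / (G * (G + alpha))  # the remainder term R_n
print(lhs, leading - remainder)
```

The two printed values agree to floating-point precision, since the split is an exact algebraic identity rather than an approximation.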
With our new notation, (4B2) becomes

    λ* − λ̂ = [ ⟨ ĝ, θ − f̂ ⟩ − ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ ] / ‖ĝ‖² + R_n.

Writing ĝ_(-i) = ĝ + (ĝ_(-i) − ĝ), f̂_(-i) = f̂ + (f̂_(-i) − f̂), and Y = θ + ε, the bracketed difference expands as

    ⟨ ĝ, θ − f̂ ⟩ − ⟨ ĝ_(-i), Y − f̂_(-i) ⟩
        = −⟨ ĝ, ε ⟩ − ⟨ ĝ, f̂ − f̂_(-i) ⟩ − ⟨ ĝ_(-i) − ĝ, (θ − f̂) + ε ⟩ − ⟨ ĝ_(-i) − ĝ, f̂ − f̂_(-i) ⟩.

Then

    | ⟨ ĝ, θ − f̂ ⟩ − ⟨ ĝ_(-i), Y − f̂_(-i) ⟩ |
        ≤ | ⟨ ĝ, ε ⟩ | + ‖ĝ‖ ‖f̂ − f̂_(-i)‖ + ‖ĝ_(-i) − ĝ‖ ( ‖θ − f̂‖ + ‖ε‖ ) + ‖ĝ_(-i) − ĝ‖ ‖f̂ − f̂_(-i)‖

(by the Cauchy-Schwarz and Triangle inequalities), where ‖θ − f̂‖ ≤ ‖θ − f(b**)‖ + ‖f(b**) − f̂‖. By Burman and Chaudhuri (1992), results 60 and 61, and conditions A4 and A3 (with the pursuant comments), ‖f̂ − f̂_(-i)‖ and ‖ĝ_(-i) − ĝ‖ are O(n^{-1}) while |⟨ĝ, ε⟩| = O(n^{-1/2}), so that, when ‖ĝ‖² is bounded away from zero,

    λ* − λ̂ = O(n^{-1/2}) + O(n^{-1}),    (4B3)

by the Triangle inequality and the definition of R_n. With this result for λ* − λ̂ in hand,
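The bookkeeping in this expansion, splitting the cross validated inner product via ĝ_(-i) = ĝ + (ĝ_(-i) − ĝ), f̂_(-i) = f̂ + (f̂_(-i) − f̂), and Y = θ + ε, can be verified directly with arbitrary vectors standing in for each quantity (all values below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
theta = rng.normal(size=n)                 # true mean vector (hypothetical)
eps = rng.normal(size=n)                   # noise, so Y = theta + eps
Y = theta + eps
f_hat = theta + 0.1 * rng.normal(size=n)   # parametric fit
g_hat = rng.normal(size=n)                 # nonparametric fit
f_loo = f_hat + 0.01 * rng.normal(size=n)  # leave-one-out versions differ
g_loo = g_hat + 0.01 * rng.normal(size=n)  # slightly from the full fits

def ip(u, v):
    return np.mean(u * v)   # <u, v> = n^{-1} sum u_i v_i (assumed convention)

lhs = ip(g_hat, theta - f_hat) - ip(g_loo, Y - f_loo)
rhs = (-ip(g_hat, eps)
       - ip(g_hat, f_hat - f_loo)
       - ip(g_loo - g_hat, theta - f_hat + eps)
       - ip(g_loo - g_hat, f_hat - f_loo))
print(lhs, rhs)
```

The two sides agree exactly (up to floating-point rounding), confirming that the four terms account for the entire difference between the full and cross validated inner products.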
we may proceed with the following lemma dealing with convergence rates for the PRESS selected mixing parameter to the theoretically optimal mixing parameter. Lemma 4b4 is the most important lemma leading up to the estimate convergence theorems in this section. It is analogous to Lemma 4a3 in the previous section, and its proof is in appendix 4b.

Lemma 4b4: Assuming conditions A1–A6,

    | λ* − λ̂ | = O(n^{-1/2}) + O(n^{-1}),  if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
               = O(n^{-1/2}),              if lim_{n→∞} n^{1/2} ‖g‖ = 0.

The next lemma gives asymptotic convergence rates for all of the previous quantities in the instance in which the parametric estimate becomes correct as the sample size increases (lim_{n→∞} ‖g‖ = 0). It is analogous to Lemma 4a4 in the previous section, and its proof can be found in appendix 4b.

Lemma 4b5: Now assume that lim_{n→∞} ‖g‖ = 0, with ‖g‖ = O(n^{-δ}) for some δ > 0. Under assumptions A1–A6,

    a) α_n = O(n^{-3/2}),  if δ > 1/2,
           = O(n^{-1-δ}),  if δ < 1/2.

    b) R_n = O(n^{-3/2}),  if δ > 1/2,
           = O(n^{-1-δ}),  if 2/5 < δ < 1/2,
           = O(n^{-7/5}),  if δ < 2/5.
Lemma 4b5 (cont.):

    c) | λ* − λ̂ | = O(n^{-1/2}) + O(n^{-δ}),    if δ > 1/2,
                  = O(n^{-1/2}) + O(n^{-2/5}),  if 2/5 < δ < 1/2,
                  = O(n^{-2/5}),                if δ < 2/5.

Before taking on any of the theorems dealing with estimate convergence, we need to do a little algebra similar to that done in Part 3a (particularly in the proof of Theorem 3A1). Observe that

    ‖ λ̂ĝ + f̂ − θ ‖ ≤ ‖ λ̂ĝ + f̂ − (λ*ĝ + f̂) ‖ + ‖ λ*ĝ + f̂ − θ ‖

(following the proof of Theorem 3A1). Writing t₁ = λ̂ĝ + f̂ and t₂ = λ*ĝ + f̂ (say), this is just the decomposition

    t₁ − θ = (t₁ − t₂) + (t₂ − θ),

with t₁ − t₂ = (λ̂ − λ*) ĝ. So that

    ‖ λ̂ĝ + f̂ − θ ‖ ≤ | λ̂ − λ* | ‖ĝ‖ + ‖ λ*ĝ + f̂ − θ ‖.    (4B4)
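The bound (4B4) comes from inserting and subtracting the optimal combination λ*ĝ + f̂ and applying the triangle inequality. A direct numerical check with hypothetical vectors and mixing parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
theta = rng.normal(size=n)                 # true mean vector (hypothetical)
f_hat = theta + 0.1 * rng.normal(size=n)   # parametric fit
g_hat = rng.normal(size=n)                 # nonparametric fit
lam_star, lam_hat = 0.8, 0.7               # optimal and PRESS-selected lambda

def norm(v):
    return np.sqrt(np.mean(v ** 2))   # ||v|| = (n^{-1} sum v_i^2)^{1/2} (assumed)

t1 = lam_hat * g_hat + f_hat               # estimate with lambda-hat
t2 = lam_star * g_hat + f_hat              # estimate with lambda-star

# Exact decomposition: t1 - theta = (t1 - t2) + (t2 - theta).
assert np.allclose(t1 - theta, (t1 - t2) + (t2 - theta))

# Triangle-inequality bound (4B4), using t1 - t2 = (lam_hat - lam_star) g_hat.
bound = abs(lam_hat - lam_star) * norm(g_hat) + norm(t2 - theta)
print(norm(t1 - theta), bound)
```

The printed error never exceeds the printed bound, which is all the subsequent theorems need: a rate for |λ̂ − λ*| and a rate for the optimally mixed estimate together control the PRESS-mixed estimate.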
We can now obtain the following two theorems dealing with estimate convergence rates. The proofs of Theorems 4B1 and 4B4 are found in appendix 4b. The numbering is such that they can be compared with their counterparts in the previous sections. Will the MRR2 estimate using the PRESS selected mixing parameter yield results that are comparable?

Theorem 4B1: Assuming conditions A1–A6,

    ‖ λ̂ĝ + f̂ − θ ‖² = O( ‖θ̂ − θ‖² ),  if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
                     = O(n^{-1}),       if lim_{n→∞} n^{1/2} ‖g‖ = 0,

where θ̂ denotes the nonparametric estimate of θ.

Theorem 4B1 gives us an affirmative response (to the previous question) in the form of a third Golden Result of Model Robust Regression. This time the result demonstrates the flexibility of the MRR2 procedure to handle a mixing parameter estimate that involves cross validation, and is the first result of this type in MRR. We will later discuss the reasons for this. We will demonstrate the convergence rates of MRR2 with an example. Suppose a user is estimating a function θ by using MRR2 and attempting to model the function parametrically with an OLS quartic regression and nonparametrically by a Local Linear Regression (LLR) using the asymptotically optimal constant bandwidth, h_ROT, from Fan and Gijbels (1996). We will once again use the Epanechnikov kernel in the nonparametric estimate, and λ̂ for the mixing parameter. From Ruppert and Wand (1994) we have that, at any given x ∈ C, the convergence rate of the LLR estimate is given by

    ( θ̂(x) − θ(x) )² = O( h_ROT⁴ ) + O( (n h_ROT)^{-1} ),

where for LLR, h_ROT = O(n^{-1/5}).
Then

    ( θ̂(x) − θ(x) )² = O(n^{-4/5}).

Next, we extend this result to the n-dimensional nonparametric vector estimate. For a rigorous presentation of this extension, see the proof of the corresponding lemma in the appendix. The extension results in ‖θ̂ − θ‖² = O(n^{-4/5}), so that asymptotically the user has an estimate such that

    ‖ λ̂ĝ + f̂ − θ ‖² = O(n^{-4/5}),  if lim_{n→∞} n^{1/2} ‖g‖ = ∞,
                     = O(n^{-1}),    if lim_{n→∞} n^{1/2} ‖g‖ = 0.

This MRR2 estimate will converge to the true mean function at a rate no slower than O(n^{-4/5}) if the model is misspecified, and as fast as O(n^{-1}) if θ(x) is truly a quartic function on C. We present one final theorem in this section for the case where lim_{n→∞} ‖g‖ = 0. Once more, MRR2 proves to be a capable alternative to MRR1.

Theorem 4B4: Assuming conditions A1–A6 hold, and that lim_{n→∞} ‖g‖ = 0 with ‖g‖ = O(n^{-δ}),

    ‖ λ̂ĝ + f̂ − θ ‖² = O(n^{-1}),    if δ > 1/2,
                     = O(n^{-2δ}),   if 2/5 < δ < 1/2,
                     = O(n^{-4/5}),  if δ < 2/5.

Theorem 4B4 is comparable to Theorem 3A4, even though this theorem deals with the MRR2 estimate using the PRESS selected mixing parameter. Thus, this result is as striking as that of Theorem 4B1. We will discuss the reasons for this in the next part of this section.
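The nonparametric component of the example can be sketched generically: a local linear fit with the Epanechnikov kernel and a bandwidth of the stated order n^{-1/5}. The constant 0.5 in the bandwidth, the quartic test function, and the noise level below are illustrative assumptions; this is not Fan and Gijbels' h_ROT formula.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def llr(x0, xs, ys, h):
    """Local linear regression at x0: weighted least squares of ys on
    (xs - x0); the fitted intercept is the estimate at x0."""
    w = epanechnikov((xs - x0) / h)
    X = np.column_stack([np.ones_like(xs), xs - x0])
    WX = X * w[:, None]
    beta = np.linalg.lstsq(WX.T @ X, WX.T @ ys, rcond=None)[0]
    return beta[0]

rng = np.random.default_rng(5)
n = 2000
x = rng.uniform(0.0, 1.0, size=n)

def theta(t):
    return t ** 4 - 2 * t ** 2 + t       # a quartic test function (illustrative)

y = theta(x) + rng.normal(scale=0.1, size=n)

h = 0.5 * n ** (-1 / 5)                  # bandwidth of the stated order n^(-1/5)
est = llr(0.5, x, y, h)
print(est, theta(0.5))
```

With the bandwidth shrinking at order n^{-1/5}, the squared bias O(h⁴) and variance O((nh)^{-1}) terms both shrink at order n^{-4/5}, matching the pointwise rate quoted from Ruppert and Wand (1994).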
Comments

In the MRR2 case the mixing parameter λ̂ outperforms its MRR1 counterpart for the most part. In comparing Theorem 3B4 to Theorem 4B4, it is evident that the MRR2 estimate with λ̂ has the capability of converging more rapidly in either of the last two cases (2/5 < δ < 1/2, or δ < 2/5), and is equal in the first (δ > 1/2). In fact, in the same context, its asymptotic performance is equal to that of the MRR2 estimate using the asymptotically optimal mixing parameter λ̂*. Observe the results in the cases in which lim_{n→∞} n^{1/2} ‖g‖ = ∞, or lim_{n→∞} ‖g‖ = 0, i.e. compare Theorems 3B1 and 4B1. The MRR2 estimate with λ̂ equals its MRR1 counterpart estimate in the first instance and betters it (asymptotically) in the second. This is most likely attributable to the robustness of the MRR2 estimate, in particular the limited role of the nonparametric estimate (even if we allow λ to be larger than one). Note that the MRR2 estimate retains all the advantages of the parametric estimate. That is, it is never slowed down by the mixing parameter as in the MRR1 case. This is a desirable quality, and it has been demonstrated mathematically in this section.

In conclusion, our work would indicate that MRR2 is more robust when λ̂ is used to select the mixing parameter. The MRR2 estimate retains all of the positive asymptotic properties of the MRR1 estimate and does not lose those capabilities when using λ̂, the mixing parameter selected using PRESS. We now turn our attention to the application of MRR (particularly MRR2) to quantal regression.