Part 4b: Asymptotic Results for MRR2 using PRESS


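Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971)) particular to the regression problem. It involves finding $\hat{Y}_{i,-i}$, the estimate at the $i$th observation found by removing the $i$th data pair, $(x_i, Y_i)$, from the data set. In the MRR2 case

$$\mathrm{PRESS}(\lambda) = \sum_{i=1}^{n} \left( Y_i - \left( \hat{f}_{i,-i} + \lambda \hat{g}_{i,-i} \right) \right)^2 .$$

Once again, we choose $\lambda$ by finding the value of $\lambda$ that minimizes $\mathrm{PRESS}(\lambda)$. This is done by setting $\frac{d}{d\lambda}\mathrm{PRESS}(\lambda) = \mathrm{PRESS}'(\lambda) = 0$ and solving for $\lambda$. We obtain

$$\mathrm{PRESS}'(\lambda) = -2 \sum_{i=1}^{n} \left( Y_i - \left( \hat{f}_{i,-i} + \lambda \hat{g}_{i,-i} \right) \right) \hat{g}_{i,-i} = 0 .$$

Solving this equation results in $\hat{\lambda}$ as

$$\hat{\lambda} = \frac{\sum_{i=1}^{n} \hat{g}_{i,-i} \left( Y_i - \hat{f}_{i,-i} \right)}{\sum_{i=1}^{n} \hat{g}_{i,-i}^{2}} = \frac{\langle \hat{g}^{(-)},\, Y - \hat{f}^{(-)} \rangle}{\| \hat{g}^{(-)} \|^{2}},$$

where $\hat{f}^{(-)}$ and $\hat{g}^{(-)}$ denote the vectors of cross validated estimates. Observe that this parameter estimate is similar to $\hat{\lambda}^*$ except that the parametric and nonparametric estimates have been replaced with the analogous cross validated estimates. Of course we must ensure that $\mathrm{PRESS}''(\lambda) > 0$. This follows from

$$\mathrm{PRESS}''(\lambda) = 2 \sum_{i=1}^{n} \hat{g}_{i,-i}^{2} > 0, \quad \text{for } \lambda \in \mathbb{R},$$

except for the degenerate case, which we will not worry about here. Thus $\hat{\lambda}$ does, in fact, produce a global minimum, and it is this estimate that we will study asymptotically in the remainder of the section. We will again obtain the difference between $\lambda^*$ and $\hat{\lambda}$, then investigate that difference asymptotically.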
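The closed form for the PRESS-selected mixing parameter can be illustrated numerically. The following is a minimal sketch, not the author's implementation: it assumes a straight-line OLS fit for the parametric part and, as a simpler stand-in for local linear regression, a Nadaraya-Watson smoother with Epanechnikov weights applied to the parametric residuals. The function names, the bandwidth `h`, and the simulated data are all hypothetical.

```python
import numpy as np

def loo_fits(x, y, degree=1, h=0.15):
    """Brute-force leave-one-out fits for an MRR2-style estimate.

    Returns (f_loo, g_loo): the parametric prediction at x[i] and the
    kernel-smoothed residual prediction at x[i], each computed with the
    i-th data pair removed.
    """
    n = len(x)
    f_loo = np.empty(n)
    g_loo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        xi, yi = x[keep], y[keep]
        coef = np.polyfit(xi, yi, degree)        # parametric (OLS) fit
        f_loo[i] = np.polyval(coef, x[i])
        resid = yi - np.polyval(coef, xi)        # residuals from that fit
        u = (x[i] - xi) / h
        w = np.maximum(1.0 - u ** 2, 0.0)        # Epanechnikov weights (up to a constant)
        g_loo[i] = np.dot(w, resid) / w.sum() if w.sum() > 0 else 0.0
    return f_loo, g_loo

def press(lam, y, f_loo, g_loo):
    """PRESS(lambda) for the fit f + lambda * g."""
    return float(np.sum((y - (f_loo + lam * g_loo)) ** 2))

def press_lambda(y, f_loo, g_loo):
    """Closed-form minimizer of PRESS(lambda)."""
    return float(np.dot(g_loo, y - f_loo) / np.dot(g_loo, g_loo))

# Hypothetical data: a line plus a smooth departure the line cannot capture.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y = 1.0 + 2.0 * x + 0.5 * np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

f_loo, g_loo = loo_fits(x, y)
lam_hat = press_lambda(y, f_loo, g_loo)
```

Because PRESS is a quadratic in $\lambda$ with a positive leading coefficient, evaluating `press` at values on either side of `lam_hat` should never give a smaller value.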

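$$\lambda^* - \hat{\lambda} = \frac{\langle \hat{g},\, \theta - \hat{f} \rangle}{\| \hat{g} \|^{2}} - \frac{\langle \hat{g}^{(-)},\, Y - \hat{f}^{(-)} \rangle}{\| \hat{g}^{(-)} \|^{2}} \qquad (4B1)$$

$$= \frac{\sum_{i} \hat{g}_i \left( \theta_i - \hat{f}_i \right)}{\sum_{i} \hat{g}_i^{2}} - \frac{\sum_{i} \hat{g}_{i,-i} \left( Y_i - \hat{f}_{i,-i} \right)}{\sum_{i} \left( \hat{g}_i + \left( \hat{g}_{i,-i} - \hat{g}_i \right) \right)^{2}} \qquad (4B2)$$

We have the same denominator problem as in section 3b. Recall from that section that the asymptotic rates for […] and […] are $O_p([\ldots])$ and $O_p([\ldots])$ respectively (Burman and Chaudhuri (1992) results 6[…] and 6[…]0 respectively). Next, set

$$\alpha = \| \hat{g}^{(-)} \|^{2} - \| \hat{g} \|^{2} .$$

Recall that in using cross validated estimates this difference is very important. We will need the following lemma and its corollary. These results deal with the difference term $\alpha$ and a closely related term which will prove important in the results that follow. The proofs for both results are found in appendix 4b.

Lemma 4b1: Assuming conditions A1-A6, $\alpha = O_p([\ldots])$ if $\lim [\ldots]$, and $\alpha = O_p([\ldots])$ if $[\ldots] = 0$.

Corollary 4b1: Assuming conditions A1-A6, $[\ldots] = O_p([\ldots])$ if $\lim [\ldots]$, and $[\ldots] = O_p([\ldots])$ if $[\ldots] = 0$.

An important artifact of this lemma is that $\alpha$ converges to zero faster than $\| \hat{g} \|^{2}$. This implies that the denominator on the right side of 4B2 can be handled (asymptotically) by dealing with $\| \hat{g} \|^{2}$.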

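Rewriting the right hand term in 4B2 we have

$$\frac{\sum_{i} \hat{g}_{i,-i} \left( Y_i - \hat{f}_{i,-i} \right)}{\| \hat{g} \|^{2} + \left( \| \hat{g}^{(-)} \|^{2} - \| \hat{g} \|^{2} \right)} = \frac{\sum_{i} \hat{g}_{i,-i} \left( Y_i - \hat{f}_{i,-i} \right)}{\| \hat{g} \|^{2} + \alpha} = \frac{\sum_{i} \hat{g}_{i,-i} \left( Y_i - \hat{f}_{i,-i} \right)}{\| \hat{g} \|^{2}} - \frac{\alpha \sum_{i} \hat{g}_{i,-i} \left( Y_i - \hat{f}_{i,-i} \right)}{\| \hat{g} \|^{2} \left( \| \hat{g} \|^{2} + \alpha \right)} .$$

As before, the left part of the last term is what we need to complete the problem. The right part, however, we must ultimately deal with, and we shall call it the remainder term ($R_\alpha$). We have the following lemmas that give asymptotic results for $R_\alpha$ and will ultimately provide a foundation for finding the asymptotic convergence rates for 4B2. The proofs for Lemmas 4b2 and 4b3 are found in appendix 4b.

Lemma 4b2: Assuming conditions A1-A6, $\sum_{i} \hat{g}_{i,-i} ( Y_i - \hat{f}_{i,-i} ) = O_p([\ldots])$ if $\lim [\ldots]$, and $O_p([\ldots])$ if $[\ldots] = 0$.

Lemma 4b3: Assuming conditions A1-A6, $R_\alpha = O_p([\ldots])$ if $\lim [\ldots]$, and $R_\alpha = O_p([\ldots])$ if $[\ldots] = 0$.

The importance of the preceding result is that it will allow us to rewrite 4B2 with a common denominator, which will lead to the important result of Lemma 4b4.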
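For reference, the split of the cross validated term rests on an elementary identity. In the block below, $A$ stands for the cross validated numerator and $B$ for $\| \hat{g} \|^{2}$; both labels are shorthand introduced here for illustration and are not the source's notation.

```latex
\frac{A}{B+\alpha}
  = \frac{A\,(B+\alpha) - A\,\alpha}{B\,(B+\alpha)}
  = \frac{A}{B} \;-\; \frac{\alpha A}{B\,(B+\alpha)},
\qquad B \neq 0,\quad B+\alpha \neq 0 .
```

The first piece has the same denominator as the $\lambda^*$ term, and the second piece carries a factor of $\alpha$, which is why its rate can be treated separately as a remainder.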

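With our new notation 4B2 becomes

$$\begin{aligned}
\lambda^* - \hat{\lambda}
&= \frac{\sum_i \hat{g}_i ( \theta_i - \hat{f}_i )}{\| \hat{g} \|^{2}} - \frac{\sum_i \hat{g}_{i,-i} ( Y_i - \hat{f}_{i,-i} )}{\| \hat{g} \|^{2}} + R_\alpha \\
&= \frac{1}{\| \hat{g} \|^{2}} \sum_i \left( \hat{g}_i \theta_i - \hat{g}_i \hat{f}_i - \hat{g}_{i,-i} Y_i + \hat{g}_{i,-i} \hat{f}_{i,-i} \right) + R_\alpha \\
&= \frac{1}{\| \hat{g} \|^{2}} \sum_i \left( \left( \hat{g}_i \theta_i - \hat{g}_{i,-i} Y_i \right) + \left( \hat{g}_{i,-i} \hat{f}_{i,-i} - \hat{g}_i \hat{f}_i \right) \right) + R_\alpha \\
&= \frac{1}{\| \hat{g} \|^{2}} \sum_i \left( \hat{g}_i \theta_i - \left( \hat{g}_i + ( \hat{g}_{i,-i} - \hat{g}_i ) \right) ( \theta_i + \varepsilon_i ) \right) \\
&\quad + \frac{1}{\| \hat{g} \|^{2}} \sum_i \left( ( \hat{g}_{i,-i} - \hat{g}_i ) \hat{f}_i + \hat{g}_i ( \hat{f}_{i,-i} - \hat{f}_i ) + ( \hat{g}_{i,-i} - \hat{g}_i )( \hat{f}_{i,-i} - \hat{f}_i ) \right) + R_\alpha \\
&= \frac{1}{\| \hat{g} \|^{2}} \Big( - \langle \hat{g}^{(-)} - \hat{g},\, \varepsilon \rangle - \langle \hat{g},\, \varepsilon \rangle + \langle \hat{g}^{(-)} - \hat{g},\, \hat{f} - \theta \rangle + \langle \hat{g},\, \hat{f}^{(-)} - \hat{f} \rangle + \langle \hat{g}^{(-)} - \hat{g},\, \hat{f}^{(-)} - \hat{f} \rangle \Big) + R_\alpha ,
\end{aligned}$$

so that

$$| \lambda^* - \hat{\lambda} | \le \frac{1}{\| \hat{g} \|^{2}} \Big( \| \hat{g}^{(-)} - \hat{g} \| \, \| \varepsilon \| + \| \hat{g} \| \, \| \varepsilon \| + \| \hat{g}^{(-)} - \hat{g} \| \, \| \hat{f} - \theta \| + \| \hat{g} \| \, \| \hat{f}^{(-)} - \hat{f} \| + \| \hat{g}^{(-)} - \hat{g} \| \, \| \hat{f}^{(-)} - \hat{f} \| \Big) + | R_\alpha |$$

(by the Cauchy-Schwarz and triangle inequalities)

$$= O_p([\ldots]) + O_p([\ldots]) \, \| \hat{f} - f(b^{**}) + f(b^{**}) - \theta \| \, O_p([\ldots]) + [\ldots]$$

(by Burman and Chaudhuri (1992) results 6[…]0 and 6[…], and A4[…]A3 (with pursuant comments))

$$\le O_p([\ldots]) + O_p([\ldots]) \, O_p([\ldots]) + [\ldots] \qquad (4B3)$$

by the triangle inequality, […]4[…], and the definition of […]. With this result for $\lambda^* - \hat{\lambda}$ in hand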
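The add-and-subtract step in this derivation reduces the common numerator $\langle \hat{g}, \theta - \hat{f} \rangle - \langle \hat{g}^{(-)}, Y - \hat{f}^{(-)} \rangle$ to five inner products, which Cauchy-Schwarz then bounds term by term. A quick numerical check of that algebra is sketched below. The vectors are random stand-ins rather than fitted values, and every variable name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
theta = rng.normal(size=n)        # stand-in for the true mean values
eps = rng.normal(size=n)          # stand-in for the errors
Y = theta + eps
f_hat = rng.normal(size=n)        # stand-ins for full-sample fitted values
g_hat = rng.normal(size=n)
d_f = 0.1 * rng.normal(size=n)    # f_loo - f_hat
d_g = 0.1 * rng.normal(size=n)    # g_loo - g_hat
f_loo = f_hat + d_f
g_loo = g_hat + d_g

# Common numerator before the add-and-subtract step.
lhs = g_hat @ (theta - f_hat) - g_loo @ (Y - f_loo)

# The five inner products after the step.
rhs = (-(d_g @ eps) - g_hat @ eps + d_g @ (f_hat - theta)
       + g_hat @ d_f + d_g @ d_f)

# Cauchy-Schwarz applied to each inner product, then the triangle inequality.
norm = np.linalg.norm
bound = (norm(d_g) * norm(eps) + norm(g_hat) * norm(eps)
         + norm(d_g) * norm(f_hat - theta)
         + norm(g_hat) * norm(d_f) + norm(d_g) * norm(d_f))
```

Here `lhs` and `rhs` agree up to floating-point rounding, and `abs(lhs)` does not exceed `bound`.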

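we may proceed with the following lemma dealing with convergence rates for the PRESS selected mixing parameter to the theoretically optimal mixing parameter. Lemma 4b4 is the most important lemma leading up to the estimate convergence theorems in this section. It is analogous to Lemma 4a3 in the previous section, and its proof is in appendix 4b.

Lemma 4b4: Assuming conditions A1-A6, $\lambda^* - \hat{\lambda} = O_p([\ldots]) + O_p([\ldots])$ if $\lim [\ldots]$, and $\lambda^* - \hat{\lambda} = O_p([\ldots])$ if $[\ldots] = 0$.

The next lemma gives asymptotic convergence rates for all of the previous quantities in the instance in which the parametric estimate becomes correct as the sample size increases ($\lim [\ldots] = 0$). It is analogous to Lemma 4a4 in the previous section, and its proof can be found in appendix 4b.

Lemma 4b5: Now assume that $\lim [\ldots] = 0$. Under assumptions A1-A6,

a) $\alpha = O_p([\ldots])$ if $[\ldots] > [\ldots]$, and $\alpha = O_p([\ldots])$ if $[\ldots] < [\ldots]$;

b) $R_\alpha = O_p([\ldots])$ if $[\ldots] > [\ldots]$; $R_\alpha = O_p([\ldots])$ if $[\ldots] < [\ldots] < [\ldots]$; and $R_\alpha = O_p([\ldots])$ if $[\ldots] < [\ldots]$;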

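Lemma 4b5 (cont.):

c) $\lambda^* - \hat{\lambda} = O_p([\ldots]) + O_p([\ldots])$ if $[\ldots] > [\ldots]$; $\lambda^* - \hat{\lambda} = O_p([\ldots]) + O_p([\ldots])$ if $[\ldots] < [\ldots] < [\ldots]$; and $\lambda^* - \hat{\lambda} = O_p([\ldots])$ if $[\ldots] < [\ldots]$.

Before taking on any of the theorems dealing with estimate convergence, we need to do a little algebra similar to that done in Part 3a (particularly the proof of Theorem 3A[…]). Observe that

$$\begin{aligned}
\| \hat{\lambda} \hat{g} + \hat{f} - \theta \|^{2} - \| \lambda^* \hat{g} + \hat{f} - \theta \|^{2}
&= \sum_i ( \hat{\lambda} \hat{g}_i + \hat{f}_i - \theta_i )^{2} - \sum_i ( \lambda^* \hat{g}_i + \hat{f}_i - \theta_i )^{2} \quad \text{(following the proof of Theorem 3A[…])} \\
&= \| t_1 - \theta \|^{2} - \| t_2 - \theta \|^{2} \quad \text{(say)} \\
&= \| t_1 - t_2 \|^{2} + 2 \langle t_1 - t_2,\, t_2 - \theta \rangle \\
&= \| ( \hat{\lambda} - \lambda^* ) \hat{g} \|^{2} + 2 ( \hat{\lambda} - \lambda^* ) \langle \hat{g},\, \lambda^* \hat{g} + \hat{f} - \theta \rangle .
\end{aligned}$$

So that

$$\| \hat{\lambda} \hat{g} + \hat{f} - \theta \|^{2} \le ( \hat{\lambda} - \lambda^* )^{2} \| \hat{g} \|^{2} + 2 | \hat{\lambda} - \lambda^* | \, \| \hat{g} \| \, \| \lambda^* \hat{g} + \hat{f} - \theta \| + \| \lambda^* \hat{g} + \hat{f} - \theta \|^{2} .$$

Then

$$\| \hat{\lambda} \hat{g} + \hat{f} - \theta \|^{2} \le | \hat{\lambda} - \lambda^* | \, \| \hat{g} \| \left( | \hat{\lambda} - \lambda^* | \, \| \hat{g} \| + 2 \| \lambda^* \hat{g} + \hat{f} - \theta \| \right) + \| \lambda^* \hat{g} + \hat{f} - \theta \|^{2} . \qquad (4B4)$$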
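The expansion of the squared norm around $\lambda^*$ can be checked numerically. The sketch below assumes $\lambda^* = \langle \hat{g}, \theta - \hat{f} \rangle / \| \hat{g} \|^{2}$, the form carried over from the earlier section; under that assumption the cross term vanishes exactly, because $\lambda^* \hat{g} + \hat{f} - \theta$ is orthogonal to $\hat{g}$. The vectors are random stand-ins and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
theta = rng.normal(size=n)   # stand-in for the true mean values
f_hat = rng.normal(size=n)   # stand-in for the parametric fit
g_hat = rng.normal(size=n)   # stand-in for the nonparametric residual fit

# Assumed form of the optimal mixing parameter.
lam_star = g_hat @ (theta - f_hat) / (g_hat @ g_hat)
lam_hat = lam_star + 0.3     # any other value of the mixing parameter

t1 = lam_hat * g_hat + f_hat
t2 = lam_star * g_hat + f_hat

lhs = np.sum((t1 - theta) ** 2)
cross = 2 * (lam_hat - lam_star) * (g_hat @ (t2 - theta))
rhs = ((lam_hat - lam_star) ** 2 * (g_hat @ g_hat)
       + cross + np.sum((t2 - theta) ** 2))
```

With this choice of `lam_star`, `cross` is zero up to rounding, so the excess squared error of any other mixing parameter is exactly `(lam_hat - lam_star)**2 * (g_hat @ g_hat)`. The Cauchy-Schwarz bound on the cross term is therefore conservative but still valid.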

We can now obtain the following two theorems dealing with estimate convergence rates. The proofs of Theorems 4B1 and 4B4 are found in appendix 4b. The numbering is such that they can be compared with their counterparts in the previous sections. Will the MRR2 estimate using the PRESS selected mixing parameter yield results that are comparable?

Theorem 4B1: Assuming conditions A1-A6, $\| \hat{\lambda} \hat{g} + \hat{f} - \theta \|^{2} = O_p([\ldots])$ if $\lim [\ldots]$, and $O_p([\ldots])$ if $[\ldots] = 0$.

Theorem 4B1 gives us an affirmative response (to the previous question) in the form of a third Golden Result of Model Robust Regression. This time the result demonstrates the flexibility of the MRR2 procedure to handle a mixing parameter estimate that involves cross validation, and is the first result of this type in MRR. We will later discuss the reasons for this.

We will demonstrate the convergence rates of MRR2 with an example. Suppose a user is estimating a function $\theta$ by using MRR2 and attempting to model the function parametrically with an OLS quartic regression and nonparametrically by a Local Linear Regression (LLR) using the asymptotically optimal constant bandwidth, $h_{ROT}$, from p. […] of Fan and Gijbels (1996). We will once again use the Epanechnikov kernel in the nonparametric estimate and $\hat{\lambda}$ for the mixing parameter. From Ruppert and Wand (1994) we have that at any given $x \in C$, the convergence rate of the LLR estimate is given by

$$\left( [\ldots](x) - \theta(x) \right)^{2} = O_p( h_{ROT}^{4} ) + O_p\!\left( ( n h_{ROT} )^{-1} \right),$$

where for LLR, $h_{ROT}$ is of order $n^{-1/5}$.

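Then

$$\left( [\ldots](x) - \theta(x) \right)^{2} = O_p( n^{-4/5} ) .$$

Next, we extend this result to the $n$ dimensional nonparametric vector estimate. For a rigorous presentation of this extension see the proof of Lemma […]a[…] in appendix […]a. The extension results in $O_p( n^{-4/5} )$, so that asymptotically the user has an estimate such that $\| \hat{\lambda} \hat{g} + \hat{f} - \theta \|^{2} = O_p( n^{-4/5} )$ if $\lim [\ldots]$, and $O_p( n^{-1} )$ if $[\ldots] = 0$. This MRR2 estimate will converge to the true mean function at a rate no slower than $O_p( n^{-4/5} )$ if the model is misspecified, and as fast as $O_p( n^{-1} )$ if $\theta(x)$ is truly a quartic function on $C$[…].

We present one final theorem in this section for the case where $\lim [\ldots] = 0$. Once more, MRR2 proves to be a capable alternative to MRR1.

Theorem 4B4: Assuming conditions A1-A6 hold, and that $\lim [\ldots] = 0$, $\| \hat{\lambda} \hat{g} + \hat{f} - \theta \|^{2} = O_p([\ldots])$ if $[\ldots] > [\ldots]$; $O_p([\ldots])$ if $[\ldots] < [\ldots] < [\ldots]$; and $O_p([\ldots])$ if $[\ldots] < [\ldots]$.

Theorem 4B4 is comparable to Theorem 3A4 even though this theorem deals with the MRR2 estimate using the PRESS selected mixing parameter. Thus, this result is as striking as that of Theorem 4B1. We will discuss the reasons for this in the next part of this section.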
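The $n^{-4/5}$ rate in the LLR example can be verified by direct substitution. The sketch below assumes that the squared-error rate has the usual bias-variance form $h^{4} + (nh)^{-1}$ and that the bandwidth is $h = c\,n^{-1/5}$, where the constant $c > 0$ is hypothetical.

```latex
h = c\,n^{-1/5}
\;\Longrightarrow\;
h^{4} = c^{4}\, n^{-4/5},
\qquad
(nh)^{-1} = c^{-1}\, n^{-1+1/5} = c^{-1}\, n^{-4/5},
\qquad\text{so}\qquad
h^{4} + (nh)^{-1} = \left( c^{4} + c^{-1} \right) n^{-4/5} .
```

The squared-bias and variance terms are of the same order at this bandwidth. A bandwidth shrinking at any other polynomial rate makes one of the two terms dominate, which gives a slower overall rate.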

Comments

In the MRR2 case the mixing parameter $\hat{\lambda}$ outperforms its MRR1 counterpart for the most part. In comparing Theorem 3B4 to Theorem 4B4 it is evident that the MRR2 estimate with $\hat{\lambda}$ has the capability of converging more rapidly in either of the last two cases, $[\ldots] < [\ldots] < [\ldots]$ or $[\ldots] < [\ldots]$, and is equal in the first, $[\ldots] > [\ldots]$. In fact, in the same context, its asymptotic performance is equal to that of the MRR2 estimate using the asymptotically optimal mixing parameter $\hat{\lambda}^*$.

Observe the results in the cases in which $\lim [\ldots]$, or $[\ldots] = 0$, i.e. compare Theorems 3B1 and 4B1. The MRR2 estimate with $\hat{\lambda}$ equals its MRR1 counterpart estimate in the first instance and betters it (asymptotically) in the second. This is most likely attributable to the robustness of the MRR2 estimate, particularly the limited role of the nonparametric estimate (even if we allow $\lambda$ to be larger than one). Note that the MRR2 estimate retains all the advantages of the parametric estimate. That is, it is never slowed down by the mixing parameter as in the MRR1 case. This is a desirable quality and has been demonstrated mathematically in this section.

In conclusion, our work would indicate that MRR2 is more robust when $\hat{\lambda}$ is used to select the mixing parameter. The MRR2 estimate retains all of the positive asymptotic properties of the MRR estimate and does not lose those capabilities when using $\hat{\lambda}$, the mixing parameter selected using PRESS. We now turn our attention to the application of MRR (particularly MRR2) to quantal regression.