Online Appendix to: Decoupling Noise and Features via Weighted $\ell_1$-Analysis Compressed Sensing

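RUIMIN WANG, ZHOUWANG YANG, LIGANG LIU, JIANSONG DENG, and FALAI CHEN
University of Science and Technology of China

ACM Transactions on Graphics, Vol. 33, No. 2, Article 18, Publication date: March 2014.
© 2014 ACM 0730-0301/2014/03-ART18 \$15.00. DOI: http://dx.doi.org/10.1145/2557449

In this appendix, we prove the convergence rate and the asymptotic optimality of the DLRS estimator, based on the asymptotic behavior of the eigenvalues of the matrix $M$ and on statistical theory.

Let $\Omega$ be a bounded 2D manifold in the domain and $H$ the space of $C^2$-continuous functions defined on $\Omega$. A semi-norm in $H$ is defined by

$$|f|_{\Omega,2}^2 = \int_\Omega (\Delta f)^2 .$$

With a set of sampling points $\{X_i\}_{i=1}^n$ in the domain, we can also give a discrete version of the aforementioned semi-norm as

$$|f|_{n,2}^2 = \frac{1}{n}\sum_{i=1}^n (\Delta f(X_i))^2 .$$

Specially, $|f|_{\Omega,0}^2 = \int_\Omega f^2$ and $|f|_{n,0}^2 = \frac{1}{n}\sum_{i=1}^n f(X_i)^2$.

We now have a few assumptions as follows.

A1. The input $\Omega$ is a bounded Lipschitz domain satisfying the uniform cone conditions. See Utreras [1988] for the detailed definition.

A2. The set of sampling points $\{X_i\}_{i=1}^n$ in the domain $\Omega$ satisfies the following quasi-uniform assumption: there exists a constant $\xi_0 > 0$ such that

$$\frac{\delta_{\max}}{\delta_{\min}} \le \xi_0, \quad \text{where} \quad \delta_{\max} = \sup_{X \in \Omega} \inf_{X_i} |X - X_i| \quad \text{and} \quad \delta_{\min} = \min_{i \ne j} |X_i - X_j| .$$

A3. Given $\{X_i\}_{i=1}^n$, there exist constants $\xi_1$ and $\xi_2$ depending on $\Omega$ such that

$$\xi_1 |f|_{\Omega,2}^2 \le |f|_{n,2}^2 \le \xi_2 |f|_{\Omega,2}^2$$

for any function $f \in H$.

Remark. Suppose $\{X_i\}_{i=1}^n$ is an equidistributed sequence in the region $\Omega$. From the law of large numbers, we have

$$\lim_{n \to \infty} |f|_{n,2}^2 = \frac{1}{\mathrm{Area}(\Omega)} |f|_{\Omega,2}^2 .$$

Since $\Omega$ is bounded, $\mathrm{Area}(\Omega)$ is also bounded. Thus A3 is satisfied with probability one as the sample size goes to infinity.

1. PROOF OF THEOREM 1

Before we prove Theorem 1, we need some propositions.

PROPOSITION 1.1. For any $f \in H$, there exists a matrix $M_{n,2}$, depending on $\Omega$, such that

$$|f|_{n,2}^2 = \min_{\phi \in H,\ \phi(X_i) = f_i,\ i = 1, \ldots, n} |\phi|_{\Omega,2}^2 = \mathbf{f}^T M_{n,2}\, \mathbf{f},$$

where $\mathbf{f} = (f_1, \ldots, f_n)^T = (f(X_1), \ldots, f(X_n))^T$ is the vector of function values at $\{X_i\}_{i=1}^n$.

The proof of the preceding proposition can be found in the textbook of Halmos [1982] using the Riesz representation theorem, and thus the details are omitted.

PROPOSITION 1.2. If $\Omega$ is a bounded 2D manifold and $\mu_n$ is the largest eigenvalue of the matrix $M_{n,2}$, then $n\delta_{\max}^2$ and $\delta_{\max}^4\, n\mu_n$ are both bounded from above.

PROOF. Suppose that $V_{\mathrm{unit}}$ is the area of the unit geodesic disk on $\Omega$. So we have $V_{\mathrm{unit}}\, \delta_{\min}^2\, n \le \mathrm{Area}(\Omega)$, and then get

$$n\delta_{\max}^2 \le \frac{\mathrm{Area}(\Omega)}{V_{\mathrm{unit}}} \cdot \frac{\delta_{\max}^2}{\delta_{\min}^2} \le \xi_0^2\, \frac{\mathrm{Area}(\Omega)}{V_{\mathrm{unit}}} = O(1). \qquad (1)$$

So $n\delta_{\max}^2$ is bounded from above. Let $u$ be the function such that

$$\mathbf{u}^T M_{n,2}\, \mathbf{u} = |u|_{n,2}^2 = \min_{\phi \in H,\ \phi(X_i) = u_i,\ i = 1, \ldots, n} |\phi|_{\Omega,2}^2,$$

where $\mathbf{u} = (u_1, \ldots, u_n)^T$ is the eigenvector of $M_{n,2}$ corresponding to the largest eigenvalue, that is, $M_{n,2}\mathbf{u} = \mu_n \mathbf{u}$. We define a compactly supported radial basis function

$$w(s) = \begin{cases} e^{-s^2/(1 - s^2)}, & 0 \le s < 1, \\ 0, & s \ge 1, \end{cases}$$

and specify an interpolant

$$\phi(X) = \sum_{i=1}^n u_i\, w_{X_i}(X), \quad \text{where} \quad w_{X_i}(X) = w\!\left(\frac{|X - X_i|}{\delta_{\min}}\right).$$

By the definition of $\delta_{\min}$, it is easy to see that $\phi(X_i) = u_i$, $i = 1, \ldots, n$. Moreover, for $\beta \in \mathbb{Z}_+^3$ with $|\beta| = 2$, the derivatives of $w_{X_i}$ scale as

$$|D^\beta w_{X_i}| \le \delta_{\min}^{-2}\, |D^\beta w(0)| .$$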

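Hence, bounding $|u|_{n,2}^2 \le |\phi|_{\Omega,2}^2$ term by term over the multi-indices $|\beta| = 2$, with weights $2!/\beta!$ and the derivative bound on $w_{X_i}$, we obtain $n\mu_n \le c_w \delta_{\min}^{-4}$, where we denote the constant $c_w = \sum_{|\beta| = 2} \frac{2!}{\beta!} (D^\beta w(0))^2$. Finally we get

$$\delta_{\max}^4\, n\mu_n \le c_w \frac{\delta_{\max}^4}{\delta_{\min}^4} \le c_w\, \xi_0^4,$$

which proves that $\delta_{\max}^4\, n\mu_n$ is bounded from above.

PROPOSITION 1.3. Suppose that $\xi_1 i^m \le n\mu_i \le \xi_2 i^m$ for $m > 0$ and $i = 1, \ldots, n$, where $\xi_1, \xi_2 > 0$ are constants. Then we have, for $n > 0$, $\lambda > 0$,

$$\sum_{i=1}^n \frac{1}{(1 + n\lambda\mu_i)^2} = O(\lambda^{-1/m}).$$

PROOF. First of all we have, for $i = 1, \ldots, n$,

$$\frac{1}{(1 + \lambda\xi_2 i^m)^2} \le \frac{1}{(1 + n\lambda\mu_i)^2} \le \frac{1}{(1 + \lambda\xi_1 i^m)^2} .$$

For the lower bound we have

$$\sum_{i=1}^n \frac{1}{(1 + \lambda\xi_2 i^m)^2} \ge \int_1^{n+1} \frac{dx}{(1 + \lambda\xi_2 x^m)^2} = \frac{(\lambda\xi_2)^{-1/m}}{m} \int_{\lambda\xi_2}^{\lambda\xi_2 (n+1)^m} \frac{y^{1/m - 1}}{(1 + y)^2}\, dy,$$

which is of order $\xi_2^{-1/m}\lambda^{-1/m} = O(\lambda^{-1/m})$. Here the second equation reflects the change of variable $y = \lambda\xi_2 x^m$, and $\lambda \to 0$ corresponds to $n \to \infty$. Similarly, with the same change of variable, we also have

$$\sum_{i=1}^n \frac{1}{(1 + \lambda\xi_1 i^m)^2} \le \int_0^n \frac{dx}{(1 + \lambda\xi_1 x^m)^2} \le \frac{(\lambda\xi_1)^{-1/m}}{m} \int_0^\infty \frac{y^{1/m - 1}}{(1 + y)^2}\, dy = O(\xi_1^{-1/m}\lambda^{-1/m}) = O(\lambda^{-1/m}).$$

We are now ready to exhibit the Rayleigh quotient inequalities connecting the semi-norms in $H$ and their discretized versions.

LEMMA 1.4. Let $\Omega$ satisfy A1 and $f \ne 0$ satisfy A3. Then there exist a constant $\gamma_1 > 0$ depending only on $\Omega, \xi_0, \xi_1$ and a $\delta_0 > 0$, such that if $\delta_{\max} \le \delta_0$ we have

$$\frac{|f|_{n,2}^2}{|f|_{n,0}^2} \ge \frac{|f|_{\Omega,2}^2}{\gamma_1 \left( |f|_{\Omega,0}^2 + \delta_{\max}^4 |f|_{\Omega,2}^2 \right)}, \qquad (2)$$

for any $|f|_{n,0} \ne 0$.

PROOF. According to Theorem 3.3 of Utreras [1988], there exist a constant $c(\Omega, \xi_0) > 0$ and a $\delta_0 > 0$ such that for $\delta_{\max} \le \delta_0$,

$$|f|_{n,0}^2 \le c(\Omega, \xi_0) \left( |f|_{\Omega,0}^2 + \delta_{\max}^4 |f|_{\Omega,2}^2 \right).$$

Since $|f|_{n,2}^2 \ge \xi_1 |f|_{\Omega,2}^2$, we have

$$\frac{|f|_{n,2}^2}{|f|_{n,0}^2} \ge \frac{\xi_1 |f|_{\Omega,2}^2}{c(\Omega, \xi_0) \left( |f|_{\Omega,0}^2 + \delta_{\max}^4 |f|_{\Omega,2}^2 \right)} = \frac{|f|_{\Omega,2}^2}{\gamma_1 \left( |f|_{\Omega,0}^2 + \delta_{\max}^4 |f|_{\Omega,2}^2 \right)},$$

where $\gamma_1 = c(\Omega, \xi_0)/\xi_1$.

LEMMA 1.5. Assume the same conditions as in Lemma 1.4. Then there exist a constant $\gamma_2 > 0$ depending only on $\Omega, \xi_0, \xi_1, \xi_2$ and a $\delta_0 > 0$, such that if $\delta_{\max} \le \delta_0$ we have

$$\frac{|f|_{\Omega,2}^2}{|f|_{\Omega,0}^2} \ge \frac{|f|_{n,2}^2}{\gamma_2 \left( |f|_{n,0}^2 + \delta_{\max}^4 |f|_{n,2}^2 \right)}, \qquad (3)$$

for any $|f|_{\Omega,0} \ne 0$.

PROOF. According to Theorem 3.4 of Utreras [1988], there exist a constant $c(\Omega, \xi_0) > 0$ and a $\delta_0 > 0$ such that for $\delta_{\max} \le \delta_0$,

$$|f|_{\Omega,0}^2 \le c(\Omega, \xi_0) \left( |f|_{n,0}^2 + \delta_{\max}^4 |f|_{\Omega,2}^2 \right).$$

Since $\xi_1 |f|_{\Omega,2}^2 \le |f|_{n,2}^2 \le \xi_2 |f|_{\Omega,2}^2$, we have

$$\frac{|f|_{\Omega,2}^2}{|f|_{\Omega,0}^2} \ge \frac{|f|_{n,2}^2 / \xi_2}{c(\Omega, \xi_0) \left( |f|_{n,0}^2 + \delta_{\max}^4 |f|_{n,2}^2 / \xi_1 \right)} \ge \frac{|f|_{n,2}^2}{\gamma_2 \left( |f|_{n,0}^2 + \delta_{\max}^4 |f|_{n,2}^2 \right)},$$

where $\gamma_2 = c(\Omega, \xi_0)\, \xi_2 \max\{1, 1/\xi_1\}$.

Lemma 1.4 and Lemma 1.5 build a connection between the continuous semi-norms and the discrete semi-norms. This enables us to study the behavior of the eigenvalues of $M_{n,2}$ through the corresponding variational eigenvalue problem. Let $\mu_1 \le \cdots \le \mu_n$ be the eigenvalues of $M_{n,2}$ in ascending order. Clearly the $\{\mu_i\}$ are non-negative real numbers, since the matrix $M_{n,2}$ is positive semi-definite. Next we study the behavior of these eigenvalues and show that they can be bounded by the discrete spectrum of the differential operator $\Delta^2$, where $\Delta$ is the Laplace-Beltrami operator on $\Omega$.

LEMMA 1.6. Let $\Omega$ satisfy A1 and $\{X_i\}$ satisfy A2. Then there exist constants $c_1, c_2 > 0$ such that

$$c_1 \rho_i \le n\mu_i \le c_2 \rho_i, \qquad (4)$$

where $\rho_1 \le \rho_2 \le \cdots \le \rho_n$ are the first $n$ eigenvalues of the variational eigenvalue problem

$$\int_\Omega \Delta\phi\, \Delta\psi = \rho \int_\Omega \phi\, \psi, \quad \forall \psi \in H.$$

PROOF. From Lemma 1.4 we get

$$\frac{|\phi|_{n,2}^2}{|\phi|_{n,0}^2} \ge \frac{|\phi|_{\Omega,2}^2}{\gamma_1 \left( |\phi|_{\Omega,0}^2 + \delta_{\max}^4 |\phi|_{\Omega,2}^2 \right)}$$

for any $\phi \in H$ with $|\phi|_{n,0} \ne 0$. Thus $n\mu_i \ge \vartheta_i / \gamma_1$, where $\vartheta_1 \le \cdots \le \vartheta_n$ are the first $n$ eigenvalues of the variational eigenvalue problem

$$|\phi|_{\Omega,2}^2 = \vartheta \left( |\phi|_{\Omega,0}^2 + \delta_{\max}^4 |\phi|_{\Omega,2}^2 \right),$$

which implies

$$\vartheta_i = \frac{\rho_i}{1 + \delta_{\max}^4 \rho_i}, \quad i = 1, \ldots, n.$$

Note that $\delta_{\max}^4 \rho_n$ is bounded from above, since $\rho_n = O(n^2)$ according to Theorem 14.6 of Agmon [1965] and $\delta_{\max}^2 = O(1/n)$ from Eq. (1). So there exists $c_1 > 0$ such that $c_1 \le \frac{1}{\gamma_1 (1 + \delta_{\max}^4 \rho_i)}$, and then we have $n\mu_i \ge c_1 \rho_i$.

On the other hand, using Lemma 1.5 we have

$$\frac{|\phi|_{\Omega,2}^2}{|\phi|_{\Omega,0}^2} \ge \frac{|\phi|_{n,2}^2}{\gamma_2 \left( |\phi|_{n,0}^2 + \delta_{\max}^4 |\phi|_{n,2}^2 \right)} \quad \Longrightarrow \quad \rho_i \ge \frac{\nu_i}{\gamma_2},$$

where $\nu_1 \le \cdots \le \nu_n$ are the first $n$ eigenvalues of the variational eigenvalue problem

$$|\phi|_{n,2}^2 = \nu \left( |\phi|_{n,0}^2 + \delta_{\max}^4 |\phi|_{n,2}^2 \right),$$

which gives

$$\nu_i = \frac{n\mu_i}{1 + \delta_{\max}^4\, n\mu_i}, \quad i = 1, \ldots, n.$$

So there exists $c_2 > 0$ such that

$$n\mu_i = \gamma_2 (1 + \delta_{\max}^4\, n\mu_i)\, \frac{\nu_i}{\gamma_2} \le \gamma_2 (1 + \delta_{\max}^4\, n\mu_n)\, \rho_i \le c_2 \rho_i,$$

since $\delta_{\max}^4\, n\mu_n$ is bounded according to Proposition 1.2.

LEMMA 1.7. Suppose $\Omega$ satisfies A1. Let $\mu_1 \le \cdots \le \mu_n$ be the eigenvalues of $M_{n,2}$ in ascending order. Then there exist constants $c_3, c_4 > 0$ such that for $1 < i \le n$ we have

$$c_3\, i^2 \le n\mu_i \le c_4\, i^2 .$$

PROOF. According to Lemma 1.6, it suffices to prove that the eigenvalues $\rho_1 \le \cdots \le \rho_n$ satisfy the same type of relationship. By using integration by parts, we observe that the $\rho_i$ are the eigenvalues of the differential operator $\Delta^2$, which has a discrete spectrum contained in the non-negative real axis. We can then apply Theorem 14.6 of Agmon [1965] to get $\rho_i \asymp i^2$, $i > 1$. This concludes the proof.

THEOREM 1.8. Let $f$ be an element of $H$ and let the samples satisfy

$$y_i = f(X_i) + \varepsilon_i, \quad i = 1, \ldots, n, \qquad (5)$$

where $y_1, \ldots, y_n$ are the observed function values at $\{X_i\}$, and $\varepsilon_1, \ldots, \varepsilon_n$ are i.i.d. random variables with zero mean and finite variance $\sigma^2 > 0$. Suppose A1 and A2 are fulfilled. Let

$$\hat{f}_\lambda = A(\lambda)\, y = (I + n\lambda M_{n,2})^{-1} y$$

be the estimator from the DLRS model. Denote $r_n(\lambda) = \frac{1}{n} \|\hat{f}_\lambda - f\|^2$. As $n \to \infty$, if $\lambda \asymp n^{-2/3}$ is chosen, then $E[r_n(\lambda)] = O(n^{-2/3})$.

PROOF. By using the bounds $n\mu_i = O(i^2)$ on the eigenvalues obtained in Lemma 1.7, we have

$$E[r_n(\lambda)] = \frac{1}{n} E\|\hat{f}_\lambda - f\|^2 = \frac{1}{n} f^T (A(\lambda) - I)^2 f + \frac{\sigma^2}{n} \mathrm{tr}[A^2(\lambda)] \le \lambda\, f^T M f + \frac{\sigma^2}{n} \sum_{i=1}^n \frac{1}{(1 + n\lambda\mu_i)^2} = O(\lambda) + O\!\left( \frac{1}{n\lambda^{1/2}} \right), \qquad (6)$$

where the last equation is based on the result of Proposition 1.3 with $m = 2$. The bias bound uses the fact that each eigenvalue of $(A(\lambda) - I)^2$ equals $(n\lambda\mu_i)^2/(1 + n\lambda\mu_i)^2 \le n\lambda\mu_i$. In particular, if the smoothing parameter is chosen to satisfy $\lambda \asymp n^{-2/3}$, then we achieve the convergence rate $E[r_n(\lambda)] = O(n^{-2/3})$.

According to Stone [1982], $n^{-2/3}$ is the optimal rate for multivariate function estimation of order 2 in the 2D domain under some standard assumptions. Since assumption A3 is satisfied with probability one as $n \to \infty$, we know the DLRS estimator achieves the optimal convergence rate with probability one.

Using Theorem 1.8, we can easily prove Theorem 1 in the submission. Specifically, in the DLRS model we let the unknown function $f$ be a $C^2$-smooth surface $S$ itself and the observations $y = (y_1, \ldots, y_n)^T$ be the noisy samples of the surface positions $P = (p_1, \ldots, p_n)^T$. Therefore we come to the conclusion of Theorem 1 in the submission.

2. PROOF OF THEOREM 2

We will show that the DLRS estimator satisfies some general conditions and then prove the asymptotic optimality of GCV under our proposed framework. Let $\hat{f}_\lambda = A(\lambda)\, y = (I + n\lambda M)^{-1} y$ be the estimator of our DLRS model and denote $r_n(\lambda) = \frac{1}{n}\|\hat{f}_\lambda - f\|^2$. The asymptotic optimality of GCV is defined as

$$\frac{r_n(\hat\lambda_G)}{\inf_{\lambda \in \mathbb{R}_+} r_n(\lambda)} \xrightarrow{p} 1, \qquad (7)$$

which verifies the closeness between the values of the risk function given by the GCV choice $\hat\lambda_G$ and by the theoretically optimal choice $\lambda^* = \arg\inf_{\lambda \in \mathbb{R}_+} r_n(\lambda)$. The main result here is to show that our estimator satisfies the following three conditions.

C1. $\inf_{\lambda \in \mathbb{R}_+} n E[r_n(\lambda)] \to \infty$.

C2. There exists a sequence $\{\lambda_n\}$ such that $r_n(\lambda_n) \xrightarrow{p} 0$, with convergence in probability.

C3. Let $0 \le \kappa_1 \le \cdots \le \kappa_n$ be the eigenvalues of $K(\lambda) = n\lambda M$. For any $l$ such that $l/n \to 0$,

$$\frac{\left( \frac{1}{n} \sum_{i=l+1}^n \kappa_i^{-1} \right)^2}{\frac{1}{n} \sum_{i=l+1}^n \kappa_i^{-2}} \to 0 \quad \text{as } n \to \infty.$$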
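The estimator $\hat{f}_\lambda = (I + n\lambda M)^{-1} y$ and the bias-variance form of its expected risk used in Eq. (6) can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: the penalty matrix `M` is a hypothetical stand-in built from 1D second differences (the paper's $M_{n,2}$ comes from a surface Laplacian), and the names `smoother`, `expected_risk`, and `expected_risk_spectral` are illustrative. It evaluates the risk once in matrix form and once through the eigenvalues $\mu_i$ and coefficients $e_i = u_i^T f$.

```python
import numpy as np

def smoother(M, lam):
    """Return A(lam) = (I + n*lam*M)^{-1} for a symmetric PSD penalty M."""
    n = M.shape[0]
    return np.linalg.inv(np.eye(n) + n * lam * M)

def expected_risk(M, f, lam, sigma2):
    """Matrix form: (1/n)||(A - I) f||^2 + (sigma^2/n) tr[A^2]."""
    n = M.shape[0]
    A = smoother(M, lam)
    bias2 = np.sum(((A - np.eye(n)) @ f) ** 2) / n
    variance = sigma2 * np.trace(A @ A) / n
    return bias2 + variance

def expected_risk_spectral(M, f, lam, sigma2):
    """Spectral form using eigenpairs (mu_i, u_i) of M and e_i = u_i^T f."""
    n = M.shape[0]
    mu, U = np.linalg.eigh(M)
    e = U.T @ f
    x = n * lam * mu
    bias2 = np.sum((x / (1.0 + x)) ** 2 * e ** 2) / n
    variance = sigma2 * np.sum(1.0 / (1.0 + x) ** 2) / n
    return bias2 + variance

# Hypothetical stand-in penalty: squared second differences on a 1D grid.
n = 50
D = np.diff(np.eye(n), n=2, axis=0)      # (n-2) x n second-difference operator
M = D.T @ D / n                          # symmetric positive semi-definite
t = np.linspace(0.0, 1.0, n)
f = np.sin(2.0 * np.pi * t)              # assumed "true" signal

risk_matrix = expected_risk(M, f, lam=1e-3, sigma2=0.01)
risk_spectral = expected_risk_spectral(M, f, lam=1e-3, sigma2=0.01)
```

Both functions evaluate the same quantity, so they should agree up to floating-point error; the spectral form makes the roles of the two sums in Eq. (6) explicit.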

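Condition C1 states that the convergence rate of the risk function to zero should be lower than $O(n^{-1})$. Otherwise, the estimates may possess unattainably small risk. Denote by $\mathrm{null}(\Delta)$ the null space of the Laplacian operator. From the behavior of the eigenvalues shown in Lemma 1.7, it is not difficult to verify that our proposed model meets condition C1 except for $f \in \mathrm{null}(\Delta)$.

LEMMA 2.1. If $f \notin \mathrm{null}(\Delta)$, the estimator $\hat{f}_\lambda$ from our DLRS model satisfies

$$\inf_{\lambda \in \mathbb{R}_+} n E[r_n(\lambda)] \to \infty. \qquad (8)$$

This verifies condition C1.

PROOF. Let $0 \le \mu_1 \le \cdots \le \mu_n$ be the eigenvalues of the design matrix $M$, and $u_i$ the unit eigenvector corresponding to $\mu_i$, $i = 1, \ldots, n$, so we have

$$E[r_n(\lambda)] = \frac{1}{n} E\|\hat{f}_\lambda - f\|^2 = \frac{1}{n} f^T (A(\lambda) - I)^2 f + \frac{\sigma^2}{n} \mathrm{tr}[A^2(\lambda)] = \frac{1}{n} \sum_{i=1}^n \frac{(n\lambda\mu_i)^2}{(1 + n\lambda\mu_i)^2}\, e_i^2 + \frac{\sigma^2}{n} \sum_{i=1}^n \frac{1}{(1 + n\lambda\mu_i)^2},$$

where $e_i = u_i^T f$. If $\lambda = O(1)$ or $\lambda \to \infty$ corresponds to $n \to \infty$, then since $n\mu_i \ge c_3 i^2$ there exists $i_0$ such that $i_0/n \to 0$ and $\frac{n\lambda\mu_i}{1 + n\lambda\mu_i} \ge \frac{1}{2}$ for $i > i_0$, and then

$$n E[r_n(\lambda)] \ge \sum_{i=1}^n \frac{(n\lambda\mu_i)^2}{(1 + n\lambda\mu_i)^2}\, e_i^2 \ge \frac{1}{4} \sum_{i > i_0} e_i^2 \ge \frac{1}{4} \left( n |f|_{n,0}^2 - i_0 \max\{e_1^2, \ldots, e_{i_0}^2\} \right) \to \infty.$$

On the other hand, if $\lambda \to 0$ corresponds to $n \to \infty$, we have

$$n E[r_n(\lambda)] \ge \sigma^2 \sum_{i=1}^n \frac{1}{(1 + n\lambda\mu_i)^2} = O(\lambda^{-1/2}) \to \infty,$$

where the second equation is also based on Proposition 1.3.

LEMMA 2.2. Under condition C1, we have in probability

$$\sup_{\lambda > 0} \left| \frac{r_n(\lambda)}{E[r_n(\lambda)]} - 1 \right| \to 0. \qquad (9)$$

PROOF. Since $r_n(\lambda) - E[r_n(\lambda)] = \frac{2}{n} f^T (A(\lambda) - I) A(\lambda) \varepsilon + \frac{1}{n} \left( \|A(\lambda)\varepsilon\|^2 - \sigma^2 \mathrm{tr}[A^2(\lambda)] \right)$, to get Eq. (9) it suffices to show in probability

$$\sup_{\lambda > 0} \frac{\frac{1}{n} \left| f^T (A(\lambda) - I) A(\lambda) \varepsilon \right|}{E[r_n(\lambda)]} \to 0 \qquad (10)$$

and

$$\sup_{\lambda > 0} \frac{\frac{1}{n} \left| \|A(\lambda)\varepsilon\|^2 - \sigma^2 \mathrm{tr}[A^2(\lambda)] \right|}{E[r_n(\lambda)]} \to 0. \qquad (11)$$

According to the Chebyshev inequality, we have for any given $\delta > 0$

$$P\left\{ \frac{1}{n} \left| f^T (A(\lambda) - I) A(\lambda) \varepsilon \right| \ge \delta E[r_n(\lambda)] \right\} \le \frac{E\left[ (f^T (A(\lambda) - I) A(\lambda) \varepsilon)^2 \right]}{n^2 \delta^2 E[r_n(\lambda)]^2} = \frac{\sigma^2 \|A(\lambda)(A(\lambda) - I) f\|^2}{n^2 \delta^2 E[r_n(\lambda)]^2} \le \frac{\sigma^2 \|(A(\lambda) - I) f\|^2}{n^2 \delta^2 E[r_n(\lambda)]^2} \le \frac{\sigma^2}{n \delta^2 E[r_n(\lambda)]} \to 0,$$

since $E[r_n(\lambda)] \ge \frac{1}{n} \|(A(\lambda) - I) f\|^2$. Thus Eq. (10) holds in probability.

Again using the Chebyshev inequality, we have for any given $\delta > 0$

$$P\left\{ \frac{1}{n} \left| \|A(\lambda)\varepsilon\|^2 - \sigma^2 \mathrm{tr}[A^2(\lambda)] \right| \ge \delta E[r_n(\lambda)] \right\} \le \frac{E\left[ \left( \|A(\lambda)\varepsilon\|^2 - \sigma^2 \mathrm{tr}[A^2(\lambda)] \right)^2 \right]}{n^2 \delta^2 E[r_n(\lambda)]^2} = \frac{E\|A(\lambda)\varepsilon\|^4 - \sigma^4\, \mathrm{tr}^2[A^2(\lambda)]}{n^2 \delta^2 E[r_n(\lambda)]^2} .$$

Since $E[r_n(\lambda)] \ge \frac{\sigma^2}{n} \mathrm{tr}[A^2(\lambda)]$, we only need to show

$$\frac{E\|A(\lambda)\varepsilon\|^4 - \sigma^4\, \mathrm{tr}^2[A^2(\lambda)]}{\sigma^2\, \mathrm{tr}[A^2(\lambda)]} < \text{Constant}. \qquad (12)$$

Denote $B = A^2(\lambda) = (B_{ij})$; then we have

$$E\|A(\lambda)\varepsilon\|^4 = E\left[ (\varepsilon^T B \varepsilon)^2 \right] = E\Big[ \sum_{i,j} B_{ij} \varepsilon_i \varepsilon_j \sum_{k,l} B_{kl} \varepsilon_k \varepsilon_l \Big] = \sum_{i \ne k} B_{ii} B_{kk}\, \sigma^4 + \sum_i B_{ii}^2\, E[\varepsilon_i^4] + 2 \sum_{i \ne j} B_{ij}^2\, \sigma^4 .$$

There exists a constant $c$ such that $E[\varepsilon_i^4] \le c\sigma^2$ and $\sigma^4 \le c\sigma^2$, so we get

$$E\|A(\lambda)\varepsilon\|^4 \le \Big( \sum_i B_{ii} \Big)^2 \sigma^4 + 2c \sum_{i,j} B_{ij}^2\, \sigma^2 = \sigma^4\, \mathrm{tr}^2[A^2(\lambda)] + 2c\sigma^2\, \mathrm{tr}[A^4(\lambda)] \le \sigma^4\, \mathrm{tr}^2[A^2(\lambda)] + 2c\sigma^2\, \mathrm{tr}[A^2(\lambda)],$$

which implies Eq. (12), and immediately leads to Eq. (11) in probability.

Condition C2 requires that the risk function $r_n(\lambda)$ converge to zero in probability for an appropriate sequence $\{\lambda_n\}$. The conclusion of condition C2 can be easily derived from Theorem 1.8 and Lemma 2.2. Therefore, condition C2 holds true.

Condition C3 gives a ratio

$$\frac{\left( \frac{1}{n} \sum_{i=l+1}^n \kappa_i^{-1} \right)^2}{\frac{1}{n} \sum_{i=l+1}^n \kappa_i^{-2}}, \qquad (13)$$

which is defined on the eigenvalues of $K(\lambda) = n\lambda M$ and often plays an important role in the asymptotic analysis.

LEMMA 2.3. In our model, for any $l$ such that $l/n \to 0$ and $\kappa_{l+1} > 0$, the ratio of Eq. (13) converges to zero as the sample size goes to infinity. This verifies condition C3.

PROOF. From Lemma 1.7, namely $n\mu_i \asymp i^{2m/d}$ with $m = d = 2$, the factor $\lambda$ cancels in the ratio, and comparing the sums with integrals we get

$$\lim_{n \to \infty} \frac{\left( \frac{1}{n} \sum_{i=l+1}^n \kappa_i^{-1} \right)^2}{\frac{1}{n} \sum_{i=l+1}^n \kappa_i^{-2}} = \lim_{n \to \infty} \frac{\left( \frac{1}{n} \sum_{i=l+1}^n (n\mu_i)^{-1} \right)^2}{\frac{1}{n} \sum_{i=l+1}^n (n\mu_i)^{-2}} \asymp \lim_{n \to \infty} \frac{1}{n} \cdot \frac{\left( \int_{l+1}^n x^{-2m/d}\, dx \right)^2}{\int_{l+1}^n x^{-4m/d}\, dx} \asymp \lim_{n \to \infty} \frac{4m/d - 1}{(2m/d - 1)^2} \cdot \frac{l + 1}{n} = 0 .$$

In conclusion, we have verified that the three conditions C1, C2, and C3 hold true for our model. We now prove the asymptotic optimality of GCV under these three conditions.

LEMMA 2.4. Under condition C2, we have

$$\frac{1}{n} \mathrm{tr}[I - A(\lambda_n)] \to 1, \qquad (14)$$

and

$$\frac{1}{n} \|(I - A(\lambda_n))\, y\|^2 \to \sigma^2. \qquad (15)$$

PROOF. From the fact that

$$\sigma^2 \left( \frac{1}{n} \mathrm{tr}[A(\lambda_n)] \right)^2 \le \frac{\sigma^2}{n} \mathrm{tr}[A^2(\lambda_n)] \le E[r_n(\lambda_n)] \to 0,$$

we have $\frac{1}{n} \mathrm{tr}[A(\lambda_n)] \to 0$ and then get $\frac{1}{n} \mathrm{tr}[I - A(\lambda_n)] \to 1$. By the fact that $\frac{1}{n} \|\varepsilon\|^2 \to \sigma^2$ and the Cauchy-Schwarz inequality, we have

$$\frac{1}{n} \|(I - A(\lambda_n))\, y\|^2 = \frac{1}{n} \|\varepsilon\|^2 + \frac{1}{n} \|f - \hat{f}_{\lambda_n}\|^2 + \frac{2}{n} (f - \hat{f}_{\lambda_n})^T \varepsilon \to \sigma^2 .$$

LEMMA 2.5. Under condition C3, for $\lambda_n$ such that $r_n(\lambda_n) \to 0$, we have

$$\frac{\left( \frac{1}{n} \mathrm{tr}[A(\lambda_n)] \right)^2}{\frac{1}{n} \mathrm{tr}[A^2(\lambda_n)]} \to 0. \qquad (16)$$

PROOF. Recall $A(\lambda_n) = (I + n\lambda_n M)^{-1} = (I + K(\lambda_n))^{-1}$. We get

$$\frac{\left( \frac{1}{n} \mathrm{tr}[A(\lambda_n)] \right)^2}{\frac{1}{n} \mathrm{tr}[A^2(\lambda_n)]} = \frac{\left( \frac{1}{n} \sum_{i=1}^n \frac{1}{1 + \kappa_i} \right)^2}{\frac{1}{n} \sum_{i=1}^n \frac{1}{(1 + \kappa_i)^2}}, \qquad (17)$$

where $0 \le \kappa_1 \le \cdots \le \kappa_n$ are the eigenvalues of $K(\lambda_n)$. Let $l$ be the number such that $\kappa_l \le 1 < \kappa_{l+1}$; then we have

$$\sum_{i=1}^n \frac{1}{1 + \kappa_i} \le l + \sum_{i=l+1}^n \frac{1}{\kappa_i}, \qquad (18)$$

and

$$\sum_{i=1}^n \frac{1}{(1 + \kappa_i)^2} \ge \frac{l}{4} + \frac{1}{4} \sum_{i=l+1}^n \frac{1}{\kappa_i^2}. \qquad (19)$$

To reach Eq. (16), it suffices to show

$$\frac{\left( \frac{1}{n} \left( l + \sum_{i=l+1}^n \kappa_i^{-1} \right) \right)^2}{\frac{1}{n} \left( l + \sum_{i=l+1}^n \kappa_i^{-2} \right)} \to 0. \qquad (20)$$

On the other hand, $E[r_n(\lambda_n)] \to 0$ since $r_n(\lambda_n)$ is non-negative; thus we get $\frac{1}{n} \mathrm{tr}[A^2(\lambda_n)] \to 0$ and have $l/n \to 0$ due to Eq. (19). So it is not hard to see that Eq. (20) holds under condition C3.

LEMMA 2.6. For any $\hat\lambda$ such that $r_n(\hat\lambda) \to 0$ and

$$\frac{\left( \frac{1}{n} \mathrm{tr}[A(\hat\lambda)] \right)^2}{\frac{1}{n} \mathrm{tr}[A^2(\hat\lambda)]} \to 0, \qquad (21)$$

under condition C1 we have

$$\frac{\mathrm{SURE}_n(\hat\lambda) - r_n(\hat\lambda) - \frac{1}{n} \|\varepsilon\|^2 + \sigma^2}{r_n(\hat\lambda)} \xrightarrow{p} 0, \qquad (22)$$

and

$$\frac{\frac{1}{n} \|\tilde{f}_{\hat\lambda} - \hat{f}_{\hat\lambda}\|^2}{r_n(\hat\lambda)} \xrightarrow{p} 0, \qquad (23)$$

where

$$\mathrm{SURE}_n(\lambda) = \sigma^2 - \frac{\sigma^4\, \mathrm{tr}^2[I - A(\lambda)]}{n \|(I - A(\lambda))\, y\|^2}, \qquad \tilde{f}_\lambda = y - \frac{\sigma^2\, \mathrm{tr}[I - A(\lambda)]}{\|(I - A(\lambda))\, y\|^2} (I - A(\lambda))\, y,$$

and $\tilde{r}_n(\lambda) = \frac{1}{n} \|\tilde{f}_\lambda - f\|^2$.

The proof of Lemma 2.6 is left to the Appendix.

LEMMA 2.7. Under conditions C2 and C3, $\hat{f}_{\hat\lambda_G}$ is consistent, that is, $r_n(\hat\lambda_G) \to 0$, where $\hat\lambda_G$ is chosen by GCV.

PROOF. According to the proof of Lemma 5 in Li [1985], and similarly to Girard [1991], the preceding lemma can be established.

Asymptotic Optimality Theorem

THEOREM 2.8. Under conditions C1, C2, and C3, $\hat{f}_{\hat\lambda_G}$ is asymptotically optimal, where $\hat\lambda_G$ is the GCV choice.

PROOF. From condition C2, for $\lambda^*$ that is the minimizer of $r_n(\lambda)$, we have $r_n(\lambda^*) \to 0$. According to Lemma 2.5, we have

$$\frac{\left( \frac{1}{n} \mathrm{tr}[A(\lambda^*)] \right)^2}{\frac{1}{n} \mathrm{tr}[A^2(\lambda^*)]} \to 0 .$$

Hence from Lemma 2.6, we have

$$\mathrm{SURE}_n(\lambda^*) - \frac{1}{n} \|\varepsilon\|^2 + \sigma^2 = r_n(\lambda^*) (1 + o_p(1)). \qquad (24)$$

On the other hand, from Lemma 2.7 this also holds for $\hat\lambda = \hat\lambda_G$. Therefore we have

$$\mathrm{SURE}_n(\hat\lambda_G) - \frac{1}{n} \|\varepsilon\|^2 + \sigma^2 = r_n(\hat\lambda_G) (1 + o_p(1)). \qquad (25)$$

Since $\mathrm{SURE}_n(\hat\lambda_G) \le \mathrm{SURE}_n(\lambda^*)$ and $r_n(\lambda^*) \le r_n(\hat\lambda_G)$, we have $r_n(\hat\lambda_G)/r_n(\lambda^*) \to 1$ in probability.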
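The link between the GCV choice and the SURE criterion used in the proof can be illustrated numerically. The sketch below is an illustration under assumptions rather than the paper's code: it uses the standard GCV score $\mathrm{GCV}(\lambda) = \frac{1}{n}\|(I - A(\lambda))y\|^2 / (\frac{1}{n}\mathrm{tr}[I - A(\lambda)])^2$, the form $\mathrm{SURE}_n(\lambda) = \sigma^2 - \sigma^4\, \mathrm{tr}^2[I - A(\lambda)] / (n\|(I - A(\lambda))y\|^2)$ with $\sigma^2$ assumed known, a hypothetical 1D second-difference penalty as a stand-in for $M$, and the illustrative function name `gcv_and_sure`. With these definitions $\mathrm{SURE}_n = \sigma^2 - \sigma^4/\mathrm{GCV}$, so both criteria are minimized by the same $\lambda$.

```python
import numpy as np

def gcv_and_sure(M, y, lams, sigma2):
    """Evaluate GCV(lam) and SURE_n(lam) on a grid for A(lam) = (I + n*lam*M)^{-1}."""
    n = M.shape[0]
    mu, U = np.linalg.eigh(M)            # eigen-decomposition of the penalty
    z = U.T @ y                          # data in the eigenbasis
    gcv, sure = [], []
    for lam in lams:
        shrink = 1.0 - 1.0 / (1.0 + n * lam * mu)   # eigenvalues of I - A(lam)
        resid2 = np.sum((shrink * z) ** 2)          # ||(I - A(lam)) y||^2
        tr = np.sum(shrink)                         # tr[I - A(lam)]
        gcv.append((resid2 / n) / (tr / n) ** 2)
        sure.append(sigma2 - sigma2 ** 2 * tr ** 2 / (n * resid2))
    return np.array(gcv), np.array(sure)

# Hypothetical test problem: noisy samples of a smooth 1D signal.
rng = np.random.default_rng(0)
n = 60
D = np.diff(np.eye(n), n=2, axis=0)      # second-difference operator (assumed penalty)
M = D.T @ D / n
t = np.linspace(0.0, 1.0, n)
sigma2 = 0.01
y = np.sin(2.0 * np.pi * t) + np.sqrt(sigma2) * rng.standard_normal(n)

lams = np.logspace(-8, 0, 30)
gcv, sure = gcv_and_sure(M, y, lams, sigma2)
lam_gcv = lams[np.argmin(gcv)]           # GCV choice on the grid
```

Working in the eigenbasis of `M` avoids forming a matrix inverse for every grid value of $\lambda$; the grid range is an arbitrary choice for this example.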

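APPENDIX

Proof of Lemma 2.6

PROOF. Throughout, write $A = A(\lambda)$. We first prove Eq. (22). Using $(I - A)\, y = \varepsilon + f - \hat{f}_\lambda$ and $(f - \hat{f}_\lambda)^T \varepsilon = f^T (I - A) \varepsilon - \varepsilon^T A \varepsilon$, its left-hand side can be rewritten as

$$\frac{\mathrm{SURE}_n(\lambda) - r_n(\lambda) - \frac{1}{n} \|\varepsilon\|^2 + \sigma^2}{r_n(\lambda)} = \frac{\frac{2}{n} f^T (I - A) \varepsilon}{r_n(\lambda)} - \frac{\frac{2}{n} \left( \varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A] \right)}{r_n(\lambda)} - \frac{n}{\|(I - A)\, y\|^2} \cdot \frac{\left( \frac{1}{n} \left( \sigma^2 \mathrm{tr}[I - A] - \|(I - A)\, y\|^2 \right) \right)^2}{r_n(\lambda)}. \qquad (26)$$

Note that $\frac{1}{n} \mathrm{tr}[I - A] \to 1$ and $\frac{1}{n} \|(I - A)\, y\|^2 \to \sigma^2$ from Lemma 2.4, and $\sup_{\lambda > 0} \left| \frac{r_n(\lambda)}{E[r_n(\lambda)]} - 1 \right| \to 0$ by Lemma 2.2. Thus it suffices for us to show the following three equations:

$$\sup_{\lambda > 0} \frac{\frac{1}{n} \left| f^T (I - A) \varepsilon \right|}{E[r_n(\lambda)]} \to 0, \qquad (27)$$

$$\sup_{\lambda > 0} \frac{\frac{1}{n} \left| \varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A] \right|}{E[r_n(\lambda)]} \to 0, \qquad (28)$$

$$\sup_{\lambda > 0} \frac{\left( \frac{1}{n} \left( \sigma^2 \mathrm{tr}[I - A] - \|(I - A)\, y\|^2 \right) \right)^2}{E[r_n(\lambda)]} \to 0. \qquad (29)$$

For Eq. (27), according to the Chebyshev inequality, we have for any given $\delta > 0$

$$P\left\{ \frac{1}{n} \left| f^T (I - A) \varepsilon \right| \ge \delta E[r_n(\lambda)] \right\} \le \frac{E\left[ (f^T (I - A) \varepsilon)^2 \right]}{n^2 \delta^2 E[r_n(\lambda)]^2} = \frac{\sigma^2 \|(I - A) f\|^2}{n^2 \delta^2 E[r_n(\lambda)]^2} \le \frac{\sigma^2}{n \delta^2 E[r_n(\lambda)]} \to 0,$$

since $E[r_n(\lambda)] \ge \frac{1}{n} \|(I - A) f\|^2$.

For Eq. (28), again using the Chebyshev inequality, we have for any given $\delta > 0$

$$P\left\{ \frac{1}{n} \left| \varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A] \right| \ge \delta E[r_n(\lambda)] \right\} \le \frac{E\left[ (\varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A])^2 \right]}{n^2 \delta^2 E[r_n(\lambda)]^2} = \frac{E\left[ (\varepsilon^T A \varepsilon)^2 \right] - \sigma^4\, \mathrm{tr}^2[A]}{n^2 \delta^2 E[r_n(\lambda)]^2} .$$

Since $E[r_n(\lambda)] \ge \frac{\sigma^2}{n} \mathrm{tr}[A^2]$, we only need to show

$$\frac{E\left[ (\varepsilon^T A \varepsilon)^2 \right] - \sigma^4\, \mathrm{tr}^2[A]}{\sigma^2\, \mathrm{tr}[A^2]} < \text{Constant}. \qquad (30)$$

Denote $A(\lambda) = (A_{ij})$; then we have

$$E\left[ (\varepsilon^T A \varepsilon)^2 \right] = E\Big[ \sum_{i,j} A_{ij} \varepsilon_i \varepsilon_j \sum_{k,l} A_{kl} \varepsilon_k \varepsilon_l \Big] = \sum_{i \ne k} A_{ii} A_{kk}\, \sigma^4 + \sum_i A_{ii}^2\, E[\varepsilon_i^4] + 2 \sum_{i \ne j} A_{ij}^2\, \sigma^4 .$$

There exists a constant $c$ such that $E[\varepsilon_i^4] \le c\sigma^2$ and $\sigma^4 \le c\sigma^2$, so we get

$$E\left[ (\varepsilon^T A \varepsilon)^2 \right] \le \Big( \sum_i A_{ii} \Big)^2 \sigma^4 + 2c \sum_{i,j} A_{ij}^2\, \sigma^2 = \sigma^4\, \mathrm{tr}^2[A] + 2c\sigma^2\, \mathrm{tr}[A^2],$$

which implies Eq. (30), and immediately leads to Eq. (28).

For Eq. (29), note that

$$\frac{1}{n} \left| \sigma^2 \mathrm{tr}[I - A] - \|(I - A)\, y\|^2 \right| = \left| \sigma^2 - \frac{\sigma^2}{n} \mathrm{tr}[A] - \frac{1}{n} \|\varepsilon + f - \hat{f}_\lambda\|^2 \right| = \left| \sigma^2 - \frac{\sigma^2}{n} \mathrm{tr}[A] - \frac{1}{n} \|\varepsilon\|^2 - r_n(\lambda) - \frac{2}{n} (f - \hat{f}_\lambda)^T \varepsilon \right|$$

$$\le \left| \sigma^2 - \frac{1}{n} \|\varepsilon\|^2 \right| + r_n(\lambda) + \frac{2}{n} \left| f^T (I - A) \varepsilon \right| + \frac{2}{n} \left| \varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A] \right| + \frac{\sigma^2}{n} \mathrm{tr}[A] .$$

Using the proved Eqs. (27) and (28), together with Eq. (21) and $\frac{\sigma^2}{n} \mathrm{tr}[A^2] \le E[r_n(\lambda)]$, we only need to show

$$\sup_{\lambda > 0} \frac{\left| \sigma^2 - \frac{1}{n} \|\varepsilon\|^2 \right|}{E[r_n(\lambda)]^{1/2}} \to 0. \qquad (31)$$

By the Chebyshev inequality, we have for any given $\delta > 0$

$$P\left\{ \left| \sigma^2 - \frac{1}{n} \|\varepsilon\|^2 \right| \ge \delta E[r_n(\lambda)]^{1/2} \right\} \le \frac{E\left[ \left( \sigma^2 - \frac{1}{n} \|\varepsilon\|^2 \right)^2 \right]}{\delta^2 E[r_n(\lambda)]} = \frac{E[\varepsilon_i^4] - \sigma^4}{n \delta^2 E[r_n(\lambda)]} \to 0,$$

which implies Eq. (31).

Now it remains to prove Eq. (23), the numerator of which can be rearranged as

$$\frac{1}{n} \|\tilde{f}_{\hat\lambda} - \hat{f}_{\hat\lambda}\|^2 = \frac{\left( \sigma^2 \mathrm{tr}[I - A] - \|(I - A)\, y\|^2 \right)^2}{n \|(I - A)\, y\|^2} = \frac{n}{\|(I - A)\, y\|^2} \left( \sigma^2 - \frac{1}{n} \|\varepsilon\|^2 - r_n(\lambda) - \frac{2}{n} f^T (I - A) \varepsilon + \frac{2}{n} \left( \varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A] \right) + \frac{\sigma^2}{n} \mathrm{tr}[A] \right)^2 .$$

To get Eq. (23), since $\frac{1}{n} \|(I - A)\, y\|^2 \to \sigma^2$, it suffices to show the following:

$$\frac{\left( \sigma^2 - \frac{1}{n} \|\varepsilon\|^2 \right)^2}{r_n(\lambda)} \to 0, \qquad (32)$$

$$\frac{\left( \frac{1}{n} f^T (I - A) \varepsilon \right)^2}{r_n(\lambda)} \to 0, \qquad (33)$$

$$\frac{\left( \frac{1}{n} \left( \varepsilon^T A \varepsilon - \sigma^2 \mathrm{tr}[A] \right) \right)^2}{r_n(\lambda)} \to 0, \qquad (34)$$

$$\frac{\left( \frac{1}{n} \mathrm{tr}[A] \right)^2}{r_n(\lambda)} \to 0. \qquad (35)$$

Note that $\sup_{\lambda > 0} \left| \frac{r_n(\lambda)}{E[r_n(\lambda)]} - 1 \right| \to 0$; then Eqs. (32), (33), and (34) can be easily proved from Eqs. (31), (27), and (28), respectively. The last equation, Eq. (35), follows from $\frac{\sigma^2}{n} \mathrm{tr}[A^2] \le E[r_n(\lambda)]$ and Eq. (21). Hence, we complete the proof of Lemma 2.6.

REFERENCES

S. Agmon. 1965. Lectures on Elliptic Boundary Value Problems. D. Van Nostrand, Princeton, NJ.
D. A. Girard. 1991. Asymptotic optimality of the fast randomized versions of GCV and C_L in ridge regression and regularization. Ann. Statist. 19, 4, 1950-1963.
P. R. Halmos. 1982. A Hilbert Space Problem Book. Vol. 19 of Graduate Texts in Mathematics, 2nd Ed. Springer.
K.-C. Li. 1985. From Stein's unbiased risk estimates to the method of generalized cross validation. Ann. Statist. 13, 4, 1352-1377.
C. J. Stone. 1982. Optimal global rates of convergence for nonparametric regression. Ann. Statist. 10, 4, 1040-1053.
F. I. Utreras. 1988. Convergence rates for multivariate smoothing spline functions. J. Approx. Theory 52, 1, 1-27.

Received December 2012; revised December 2013; accepted December 2013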