5.1 A mutual information bound based on metric entropy
- Patience Cain
- 6 years ago
Chapter 5 Global Fano Method

In this chapter, we extend the techniques of Chapter 2.4 on Fano's method (the local Fano method) to a more global construction. In particular, we show that, rather than constructing a local packing, choosing a scaling $\delta > 0$, and then optimizing over this $\delta$, it is in many cases possible to prove lower bounds on minimax error directly using packing and covering numbers (metric entropy and packing entropy). The material in this chapter is based on a paper of Yang and Barron [5].

5.1 A mutual information bound based on metric entropy

To begin, we recall the classical Fano inequality, which says that for any Markov chain $V \to X \to \widehat{V}$, where $V$ is uniform on the finite set $\mathcal{V}$, we have

$$P(\widehat{V} \neq V) \ge 1 - \frac{I(V; X) + \log 2}{\log |\mathcal{V}|}.$$

(Recall Corollary 2.11.) Thus, there are two ingredients in proving lower bounds on the error in a hypothesis test: upper bounding the mutual information and lower bounding the size $|\mathcal{V}|$. Here, we state a proposition doing the former.

Before stating our result, we require a bit of notation. First, we assume that $V$ is drawn from a distribution $\mu$, and conditional on $V = v$, assume the sample $X \sim P_v$. Then a standard calculation (or simply the definition of mutual information; recall equation (2.4.4)) gives that

$$I(V; X) = \int D_{\mathrm{kl}}(P_v \,\|\, \overline{P}) \, d\mu(v), \quad \text{where} \quad \overline{P} = \int P_v \, d\mu(v). \tag{5.1.1}$$

Now, we show how to connect this mutual information quantity to a covering number of a set of distributions. Assume that for all $v$, we have $P_v \in \mathcal{P}$, where $\mathcal{P}$ is a collection of distributions. In analogy with Definition 2.1, we say that the collection of distributions $\{Q_i\}_{i=1}^N$ forms an $\epsilon$-cover of $\mathcal{P}$ in KL-divergence if for all $P \in \mathcal{P}$, there exists some $i$ such that $D_{\mathrm{kl}}(P \,\|\, Q_i) \le \epsilon^2$. With this, we may define the KL-covering number of the set $\mathcal{P}$ as

$$N_{\mathrm{kl}}(\epsilon, \mathcal{P}) := \inf\Big\{ N \in \mathbb{N} \;\Big|\; \exists\, Q_i,\ i = 1, \ldots, N,\ \sup_{P \in \mathcal{P}} \min_i D_{\mathrm{kl}}(P \,\|\, Q_i) \le \epsilon^2 \Big\}, \tag{5.1.2}$$

where $N_{\mathrm{kl}}(\epsilon, \mathcal{P}) = +\infty$ if no such cover exists. With definition (5.1.2) in place, we have the following proposition.
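Stanford Statistics 311/Electrical Engineering 377

Proposition 5.1. Under the conditions of the preceding paragraphs, we have

$$I(V; X) \le \inf_{\epsilon > 0} \big\{ \epsilon^2 + \log N_{\mathrm{kl}}(\epsilon, \mathcal{P}) \big\}. \tag{5.1.3}$$

Proof. First, we claim that

$$\int D_{\mathrm{kl}}(P_v \,\|\, \overline{P}) \, d\mu(v) \le \int D_{\mathrm{kl}}(P_v \,\|\, Q) \, d\mu(v) \tag{5.1.4}$$

for any distribution $Q$. Indeed, briefly, we have

$$\begin{aligned}
\int D_{\mathrm{kl}}(P_v \,\|\, \overline{P}) \, d\mu(v)
&= \int_{\mathcal{V}} \int_{\mathcal{X}} dP_v \log \frac{dP_v}{d\overline{P}} \, d\mu(v)
= \int_{\mathcal{V}} \int_{\mathcal{X}} dP_v \Big[ \log \frac{dP_v}{dQ} + \log \frac{dQ}{d\overline{P}} \Big] d\mu(v) \\
&= \int D_{\mathrm{kl}}(P_v \,\|\, Q) \, d\mu(v) + \int_{\mathcal{X}} \underbrace{\Big( \int_{\mathcal{V}} d\mu(v) \, dP_v \Big)}_{= \, d\overline{P}} \log \frac{dQ}{d\overline{P}} \\
&= \int D_{\mathrm{kl}}(P_v \,\|\, Q) \, d\mu(v) - D_{\mathrm{kl}}(\overline{P} \,\|\, Q)
\le \int D_{\mathrm{kl}}(P_v \,\|\, Q) \, d\mu(v),
\end{aligned}$$

so that inequality (5.1.4) holds. By carefully choosing the distribution $Q$ in the upper bound (5.1.4), we obtain the proposition.

Now, assume that the distributions $Q_i$, $i = 1, \ldots, N$, form an $\epsilon$-cover of the family $\mathcal{P}$ in KL-divergence, meaning that $\min_{i \in [N]} D_{\mathrm{kl}}(P \,\|\, Q_i) \le \epsilon^2$ for all $P \in \mathcal{P}$. Let $p_v$ and $q_i$ denote the densities of $P_v$ and $Q_i$ with respect to some fixed base measure on $\mathcal{X}$ (the choice of base measure does not matter). Then defining the distribution $Q = (1/N) \sum_{i=1}^N Q_i$, with density $q = (1/N) \sum_{i=1}^N q_i$, we obtain for any $v$ that, in expectation over $X \sim P_v$,

$$\begin{aligned}
D_{\mathrm{kl}}(P_v \,\|\, Q)
&= \mathbb{E}_{P_v}\Big[ \log \frac{p_v(X)}{q(X)} \Big]
= \mathbb{E}_{P_v}\Big[ \log \frac{p_v(X)}{N^{-1} \sum_{i=1}^N q_i(X)} \Big] \\
&= \log N + \mathbb{E}_{P_v}\Big[ \log \frac{p_v(X)}{\sum_{i=1}^N q_i(X)} \Big]
\le \log N + \mathbb{E}_{P_v}\Big[ \log \frac{p_v(X)}{\max_i q_i(X)} \Big] \\
&\le \log N + \min_i \mathbb{E}_{P_v}\Big[ \log \frac{p_v(X)}{q_i(X)} \Big]
= \log N + \min_i D_{\mathrm{kl}}(P_v \,\|\, Q_i).
\end{aligned}$$

By our assumption that the $Q_i$ form a cover, the right-hand side is at most $\log N + \epsilon^2$ for every $v$. Integrating over $v$ and applying (5.1.1) and (5.1.4) gives the desired result, as $\epsilon \ge 0$ was arbitrary, as was our choice of the cover.

By a completely parallel proof, we also immediately obtain the following corollary.

Corollary 5.2. Assume that $X_1, \ldots, X_n$ are drawn i.i.d. from $P_v$ conditional on $V = v$. Let $N_{\mathrm{kl}}(\epsilon, \mathcal{P})$ denote the KL-covering number of a collection $\mathcal{P}$ containing the distributions (over a single observation) $P_v$ for all $v \in \mathcal{V}$. Then

$$I(V; X_1, \ldots, X_n) \le \inf_{\epsilon \ge 0} \big\{ n\epsilon^2 + \log N_{\mathrm{kl}}(\epsilon, \mathcal{P}) \big\}.$$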
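As a small numerical sanity check of Proposition 5.1, the following Python sketch evaluates each quantity in the proof for a finite family of discrete distributions. The family `P`, the uniform prior `mu`, and the two-element cover `cover` are hypothetical choices made only for illustration; they do not come from the notes.

```python
import math

def kl(p, q):
    """KL divergence D_kl(p || q), in nats, between discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mixture(dists, weights):
    """Weighted mixture of discrete distributions on a common finite alphabet."""
    size = len(dists[0])
    return [sum(w * d[x] for w, d in zip(weights, dists)) for x in range(size)]

# Hypothetical family {P_v} on a three-letter alphabet; V is uniform on four values.
P = [[0.70, 0.20, 0.10], [0.60, 0.30, 0.10],
     [0.10, 0.20, 0.70], [0.10, 0.30, 0.60]]
mu = [0.25] * 4

# Mutual information via (5.1.1): the mu-average of D_kl(P_v || P_bar).
P_bar = mixture(P, mu)
mutual_info = sum(m * kl(p, P_bar) for m, p in zip(mu, P))

# A hand-picked cover {Q_1, Q_2}; eps_sq is the KL radius this cover achieves.
cover = [[0.65, 0.25, 0.10], [0.10, 0.25, 0.65]]
eps_sq = max(min(kl(p, q) for q in cover) for p in P)

# Q is the uniform mixture of the cover elements, as in the proof.
Q = mixture(cover, [1 / len(cover)] * len(cover))
avg_kl_to_Q = sum(m * kl(p, Q) for m, p in zip(mu, P))
bound = eps_sq + math.log(len(cover))

print(f"I(V;X)           = {mutual_info:.4f}")
print(f"average KL to Q  = {avg_kl_to_Q:.4f}")
print(f"eps^2 + log N    = {bound:.4f}")
```

The three printed values should appear in increasing order, mirroring the chain (5.1.1), (5.1.4), and the final display of the proof. The cover here is not optimal, so the last value is only an upper bound on the infimum in (5.1.3).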
With Corollary 5.2 and Proposition 5.1 in place, we thus see that the global covering numbers in KL-divergence govern the behavior of information. We remark in passing that the quantity (5.1.3), and its i.i.d. analogue in Corollary 5.2, is known as the index of resolvability, and it controls estimation rates and redundancy of coding schemes for unknown distributions in a variety of scenarios; see, for example, Barron [1] and Barron and Cover [2]. It is also similar to notions of complexity in Dudley's entropy integral (cf. Dudley [3]) in empirical process theory, where the fluctuations of an empirical process are governed by a tradeoff between covering number and approximation of individual terms in the process.

5.2 Minimax bounds using global packings

There is now a four step process to proving minimax lower bounds using the global Fano method. Our starting point is to recall the Fano minimax lower bound in Proposition 2.12, which begins with the construction of a set of points $\{\theta(P_v)\}_{v \in \mathcal{V}}$ that form a $2\delta$-packing of a set $\Theta$ in some $\rho$-semimetric. With this inequality in mind, we perform the following four steps:

(i) Bound the packing entropy. Give a lower bound on the packing number of the set $\Theta$ with $2\delta$-separation (call this lower bound $M(\delta)$).

(ii) Bound the metric entropy. Give an upper bound on the KL-metric entropy of the class $\mathcal{P}$ of distributions containing all the distributions $P_v$, that is, an upper bound on $\log N_{\mathrm{kl}}(\epsilon, \mathcal{P})$.

(iii) Find the critical radius. Noting as in Corollary 5.2 that with $n$ i.i.d. observations, we have

$$I(V; X_1, \ldots, X_n) \le \inf_{\epsilon \ge 0} \big\{ n\epsilon^2 + \log N_{\mathrm{kl}}(\epsilon, \mathcal{P}) \big\},$$

we now balance the information $I(V; X_1^n)$ and the packing entropy $\log M(\delta)$. To that end, we choose $\epsilon_n$ and $\delta_n > 0$ at the critical radius, defined as follows: choose any $\epsilon_n$ such that

$$n\epsilon_n^2 \ge \log N_{\mathrm{kl}}(\epsilon_n, \mathcal{P}),$$

and choose the largest $\delta_n > 0$ such that

$$\log M(\delta_n) \ge 4n\epsilon_n^2 + 2\log 2 \ge 2\log N_{\mathrm{kl}}(\epsilon_n, \mathcal{P}) + 2n\epsilon_n^2 + 2\log 2 \ge 2\big( I(V; X_1^n) + \log 2 \big).$$

(We could have chosen the $\epsilon_n$ attaining the infimum in the mutual information, but this way we need only an upper bound on $\log N_{\mathrm{kl}}(\epsilon, \mathcal{P})$.)

(iv) Apply the Fano minimax bound. Having chosen $\delta_n$ and $\epsilon_n$ as above, we immediately obtain that for the Markov chain $V \to X_1^n \to \widehat{V}$,

$$P(\widehat{V} \neq V) \ge 1 - \frac{I(V; X_1, \ldots, X_n) + \log 2}{\log M(\delta_n)} \ge 1 - \frac{1}{2} = \frac{1}{2},$$

and thus, applying the Fano minimax bound in Proposition 2.12, we obtain

$$\mathfrak{M}_n(\theta(\mathcal{P}); \Phi \circ \rho) \ge \frac{1}{2} \Phi(\delta_n).$$
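Steps (iii) and (iv) can be sketched numerically as follows. This is a minimal illustration, not part of the notes: the function `critical_radii`, its parameters, and the two entropy bounds passed to it (of order $1/\epsilon$ and $1/\delta$ with constant 1) are all assumptions. The bisection also assumes that both entropy functions are non-increasing, and it returns the smallest admissible $\epsilon_n$ on the search interval, although step (iii) permits any admissible choice.

```python
import math

def critical_radii(n, log_n_kl, log_m, eps_hi=1.0, delta_hi=1.0, iters=200):
    """Return (eps_n, delta_n) for step (iii).

    log_n_kl: upper bound on the KL-metric entropy, non-increasing in eps.
    log_m:    lower bound on the packing entropy, non-increasing in delta.
    """
    if n * eps_hi ** 2 < log_n_kl(eps_hi):
        raise ValueError("eps_hi is too small: n*eps^2 < log N_kl(eps) there")
    # Smallest eps in (0, eps_hi] with n * eps^2 >= log N_kl(eps).
    lo, hi = 0.0, eps_hi
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * mid ** 2 >= log_n_kl(mid):
            hi = mid
        else:
            lo = mid
    eps_n = hi

    # Largest delta in (0, delta_hi] with log M(delta) >= 4 n eps_n^2 + 2 log 2.
    target = 4 * n * eps_n ** 2 + 2 * math.log(2)
    if log_m(delta_hi) >= target:
        return eps_n, delta_hi
    lo, hi = 0.0, delta_hi
    for _ in range(iters):
        mid = (lo + hi) / 2
        if log_m(mid) >= target:
            lo = mid
        else:
            hi = mid
    return eps_n, lo

# Hypothetical entropy bounds of order 1/eps and 1/delta.
n = 1000
log_n_kl = lambda eps: 1.0 / eps
log_m = lambda delta: 1.0 / delta

eps_n, delta_n = critical_radii(n, log_n_kl, log_m)

# Step (iv): plug the information bound of Corollary 5.2 into Fano's inequality.
info_bound = n * eps_n ** 2 + log_n_kl(eps_n)
fano_lower = 1 - (info_bound + math.log(2)) / log_m(delta_n)
print(eps_n, delta_n, fano_lower)
```

With these particular bounds the balance $n\epsilon^2 = 1/\epsilon$ gives $\epsilon_n = n^{-1/3}$, and the resulting Fano lower bound on the error probability sits at $1/2$ up to rounding.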
5.3 Example: non-parametric regression

In this section, we flesh out the outline in the prequel to show how to obtain a minimax lower bound for a non-parametric regression problem directly with packing and metric entropies. In this example, we sketch the result, leaving explicit constant calculations to the dedicated reader. Nonetheless, we recover an analogue of Theorem 4.4 on minimax risks for estimation of 1-Lipschitz functions on $[0, 1]$. We use the standard non-parametric regression setting, where our observations $Y_i$ follow the independent noise model (4.1.1), that is, $Y_i = f(X_i) + \varepsilon_i$. Letting

$$\mathcal{F} := \{ f : [0, 1] \to \mathbb{R},\ f(0) = 0,\ f \text{ is 1-Lipschitz} \}$$

be the family of 1-Lipschitz functions with $f(0) = 0$, we have

Proposition 5.3. There exists a universal constant $c > 0$ such that

$$\mathfrak{M}_n(\mathcal{F}, \|\cdot\|_\infty) := \inf_{\widehat{f}_n} \sup_{f \in \mathcal{F}} \mathbb{E}_f \big[ \| \widehat{f}_n - f \|_\infty \big] \ge c \Big( \frac{\sigma^2}{n} \Big)^{1/3},$$

where $\widehat{f}_n$ is constructed based on the $n$ independent observations $f(X_i) + \varepsilon_i$.

The rate in Proposition 5.3 is sharp to within factors logarithmic in $n$; a more precise analysis of the upper and lower bounds on the minimax rate yields

$$\mathfrak{M}_n(\mathcal{F}, \|\cdot\|_\infty) = \inf_{\widehat{f}_n} \sup_{f \in \mathcal{F}} \mathbb{E}_f \big[ \| \widehat{f}_n - f \|_\infty \big] \asymp \Big( \frac{\sigma^2 \log n}{n} \Big)^{1/3}.$$

See, for example, Tsybakov [4] for a proof of this fact.

Proof. Our first step is to note that the covering and packing numbers of the set $\mathcal{F}$ in the $\ell_\infty$ metric satisfy

$$\log N(\delta, \mathcal{F}, \|\cdot\|_\infty) \asymp \log M(\delta, \mathcal{F}, \|\cdot\|_\infty) \asymp \frac{1}{\delta}. \tag{5.3.1}$$

To see this, fix some $\delta \in (0, 1)$ and assume for simplicity that $1/\delta$ is an integer. Define the sets $E_j = [\delta(j - 1), \delta j)$, and for each $v \in \{-1, 1\}^{1/\delta}$ define $h_v(x) = \sum_{j=1}^{1/\delta} v_j 1\{x \in E_j\}$. Then define the function $f_v(t) = \int_0^t h_v(s) \, ds$, which increases or decreases linearly on each interval of width $\delta$ in $[0, 1]$. Then these $f_v$ form a $2\delta$-packing and a $2\delta$-cover of $\mathcal{F}$, and there are $2^{1/\delta}$ such $f_v$. Thus the asymptotic approximation (5.3.1) holds.
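The packing half of this construction is easy to check by brute force for a small grid. The sketch below is illustrative only: the helper names `f_v` and `sup_distance` and the choice $\delta = 1/4$ are assumptions. It checks the separation of the $f_v$ and does not verify the covering claim.

```python
import itertools

def f_v(v, delta, t):
    """Evaluate f_v(t), the integral from 0 to t of h_v, where h_v equals
    v[j] on the j-th interval [delta * j, delta * (j + 1)) (0-indexed)."""
    total = 0.0
    for j, sign in enumerate(v):
        left = j * delta
        # Length of the overlap between [0, t] and the j-th interval.
        overlap = min(max(t - left, 0.0), delta)
        total += sign * overlap
    return total

def sup_distance(u, v, delta):
    """Sup-norm distance between f_u and f_v on [0, 1].

    The difference is piecewise linear with kinks only at grid points
    j * delta, so its largest absolute value is attained on that grid.
    """
    grid = [j * delta for j in range(len(u) + 1)]
    return max(abs(f_v(u, delta, t) - f_v(v, delta, t)) for t in grid)

delta = 0.25                       # hypothetical scale; 1/delta = 4 is an integer
k = round(1 / delta)
signs = list(itertools.product([-1, 1], repeat=k))
min_sep = min(sup_distance(u, v, delta)
              for u, v in itertools.combinations(signs, 2))

print("number of functions:", len(signs))
print("minimum pairwise sup-norm distance:", min_sep)
```

Two sign vectors that first differ in coordinate $j$ produce functions that agree up to the start of $E_j$ and drift apart at slope 2 across it, which is why the minimum separation comes out as $2\delta$.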
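[TODO: Draw a picture.]

Now, if for some fixed $x \in [0, 1]$ and $f, g \in \mathcal{F}$ we define $P_f$ and $P_g$ to be the distributions of the observations $f(x) + \varepsilon$ or $g(x) + \varepsilon$, then, taking the noise to be Gaussian with variance $\sigma^2$, we have that

$$D_{\mathrm{kl}}(P_f \,\|\, P_g) = \frac{1}{2\sigma^2} \big( f(x) - g(x) \big)^2 \le \frac{\| f - g \|_\infty^2}{2\sigma^2},$$

and if $P_f^n$ is the distribution of the $n$ observations $f(X_i) + \varepsilon_i$, $i = 1, \ldots, n$, we also have

$$D_{\mathrm{kl}}(P_f^n \,\|\, P_g^n) = \sum_{i=1}^n \frac{1}{2\sigma^2} \big( f(X_i) - g(X_i) \big)^2 \le \frac{n}{2\sigma^2} \| f - g \|_\infty^2.$$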
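The closed form for the Gaussian KL divergence can be confirmed by direct numerical integration. In the sketch below, the values standing in for $f(x)$, $g(x)$, and $\sigma$, the two example functions, and the equispaced design are hypothetical; the integration uses a plain midpoint rule rather than any library routine.

```python
import math

def gaussian_pdf(y, mean, sigma):
    """Density of N(mean, sigma^2) at y."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def kl_numeric(a, b, sigma, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of D_kl(N(a, sigma^2) || N(b, sigma^2))."""
    lo = a - half_width * sigma
    h = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        y = lo + (i + 0.5) * h
        # log(p/q) in closed form, which avoids dividing by a tiny density.
        log_ratio = ((y - b) ** 2 - (y - a) ** 2) / (2 * sigma ** 2)
        total += gaussian_pdf(y, a, sigma) * log_ratio * h
    return total

a, b, sigma = 0.3, -0.1, 0.5       # hypothetical f(x), g(x), and noise level
closed_form = (a - b) ** 2 / (2 * sigma ** 2)
numeric = kl_numeric(a, b, sigma)
print(closed_form, numeric)

# n observations: the divergences add up and are bounded through the sup norm.
n = 50
xs = [i / n for i in range(1, n + 1)]   # hypothetical fixed design
f = lambda x: x                          # 1-Lipschitz, f(0) = 0
g = lambda x: x / 2                      # 1-Lipschitz, g(0) = 0
kl_n = sum((f(x) - g(x)) ** 2 for x in xs) / (2 * sigma ** 2)
sup_norm = 0.5                           # sup over [0, 1] of |x - x/2|
sup_bound = n * sup_norm ** 2 / (2 * sigma ** 2)
print(kl_n, sup_bound)
```

The second part illustrates why the bound through $\|f - g\|_\infty$ can be loose: only design points near $x = 1$ come close to the sup-norm gap for this pair of functions.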
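In particular, this implies the upper bound

$$\log N_{\mathrm{kl}}(\epsilon, \mathcal{P}) \lesssim \frac{1}{\sigma\epsilon}$$

on the KL-metric entropy of the class $\mathcal{P} = \{ P_f : f \in \mathcal{F} \}$, as $\log N(\delta, \mathcal{F}, \|\cdot\|_\infty) \asymp \delta^{-1}$. Thus we have completed steps (i) and (ii) in our program above. It remains to choose the critical radius in step (iii), but this is now relatively straightforward: by choosing $\epsilon_n \asymp (1/(\sigma n))^{1/3}$, and whence $n\epsilon_n^2 \asymp (n/\sigma^2)^{1/3}$, we find that taking $\delta_n \asymp (\sigma^2/n)^{1/3}$ is sufficient to ensure that

$$\log N(\delta_n, \mathcal{F}, \|\cdot\|_\infty) \gtrsim \delta_n^{-1} \ge 4n\epsilon_n^2 + 2\log 2.$$

Thus we have

$$\mathfrak{M}_n(\mathcal{F}, \|\cdot\|_\infty) \ge \frac{1}{2} \delta_n \asymp \Big( \frac{\sigma^2}{n} \Big)^{1/3},$$

as desired.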
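For readers who want the balancing step spelled out, the scalings of $\epsilon_n$ and $\delta_n$ follow from equating $n\epsilon^2$ with the entropy bound $1/(\sigma\epsilon)$. The display below is a worked version of that arithmetic with all constants suppressed, as in the proof.

```latex
\begin{align*}
n\epsilon_n^2 \asymp \frac{1}{\sigma\epsilon_n}
  &\iff \epsilon_n^3 \asymp \frac{1}{\sigma n}
  \iff \epsilon_n \asymp \Big(\frac{1}{\sigma n}\Big)^{1/3}, \\
n\epsilon_n^2
  &\asymp n \, (\sigma n)^{-2/3}
  = \Big(\frac{n}{\sigma^2}\Big)^{1/3}, \\
\delta_n^{-1} \asymp n\epsilon_n^2
  &\iff \delta_n \asymp \Big(\frac{\sigma^2}{n}\Big)^{1/3}.
\end{align*}
```

The constant hidden in the last line must be chosen small enough that $\delta_n^{-1}$ dominates $4n\epsilon_n^2 + 2\log 2$; this is the constant calculation the proof leaves to the reader.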
Bibliography

[1] A. R. Barron. Complexity regularization with application to artificial neural networks. In Nonparametric Functional Estimation and Related Topics. Kluwer Academic.
[2] A. R. Barron and T. M. Cover. Minimum complexity density estimation. IEEE Transactions on Information Theory, 37.
[3] R. M. Dudley. Uniform Central Limit Theorems. Cambridge University Press.
[4] A. B. Tsybakov. Introduction to Nonparametric Estimation. Springer.
[5] Y. Yang and A. Barron. Information-theoretic determination of minimax rates of convergence. Annals of Statistics, 27(5).
Sectio.4. Power Series De itio. The fuctio de ed by f (x) (x a) () c 0 + c (x a) + c 2 (x a) 2 + c (x a) + ::: is called a power series cetered at x a with coe ciet sequece f g :The domai of this fuctio
More informationThis section is optional.
4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore
More informationMetric Space Properties
Metric Space Properties Math 40 Fial Project Preseted by: Michael Brow, Alex Cordova, ad Alyssa Sachez We have already poited out ad will recogize throughout this book the importace of compact sets. All
More informationCHAPTER 10 INFINITE SEQUENCES AND SERIES
CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece
More informationApproximation by Superpositions of a Sigmoidal Function
Zeitschrift für Aalysis ud ihre Aweduge Joural for Aalysis ad its Applicatios Volume 22 (2003, No. 2, 463 470 Approximatio by Superpositios of a Sigmoidal Fuctio G. Lewicki ad G. Mario Abstract. We geeralize
More informationLecture 19. sup y 1,..., yn B d n
STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s
More informationAgnostic Learning and Concentration Inequalities
ECE901 Sprig 2004 Statistical Regularizatio ad Learig Theory Lecture: 7 Agostic Learig ad Cocetratio Iequalities Lecturer: Rob Nowak Scribe: Aravid Kailas 1 Itroductio 1.1 Motivatio I the last lecture
More informationMAS111 Convergence and Continuity
MAS Covergece ad Cotiuity Key Objectives At the ed of the course, studets should kow the followig topics ad be able to apply the basic priciples ad theorems therei to solvig various problems cocerig covergece
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Itroductio to Probability ad Statistics Lecture 23: Cotiuous radom variables- Iequalities, CLT Puramrita Sarkar Departmet of Statistics ad Data Sciece The Uiversity of Texas at Austi www.cs.cmu.edu/
More information