Rates of Convergence by Moduli of Continuity
|
|
- Leo Glenn
- 5 years ago
- Views:
Transcription
1 Rates of Covergece by Moduli of Cotiuity Joh Duchi: Notes for Statistics 300b March, Itroductio I this ote, we give a presetatio showig the importace, ad relatioship betwee, the modulis of cotiuity of a stochastic process ad certai growth-like properties of the populatio quatity beig modeled or optimized Our treatmet roughly follows va der Vaart ad Weller 1, Chapter 3, though we make a few simplificatios i attempt to make the approach somewhat cleaer To set otatio, let Θ be some parameter space with distace d, ad let R : Θ R be a sequece of radom risk fuctioals with expectatio Rθ := ER θ A typical example of such a process is whe we have data X i X ad a loss fuctio l : Θ X R, for example, the iid loss may be the egative log likelihood log p θ x for some model p θ We the draw X i P, ad we defie R θ := 1 lθ, X i ad Rθ := Elθ, X i=1 We would like to uderstad the covergece rate properties of θ = argmi θ Θ R θ to θ 0 := argmi θ Θ Rθ, the populatio miimizer It is atural, based o a Taylor expasio, to assume that i a eighborhood of θ 0, the populatio risk grows at least quadratically because Rθ 0 = 0 Thus, throughout this ote, we assume that there is a costat η > 0 ad a growth costat > 0 such that Rθ Rθ 0 + dθ, θ 0 for θ Θ st dθ, θ 0 η 1 With such a coditio, it is possible to give rates of covergece of θ to θ 0, at least so log as the radom fuctios R do ot have so much variability i a eighborhood of θ 0 that they swamp the quadratic growth away from θ 0 Rates of covergece ad compariso of fuctios Because we would like to uderstad miimizig the populatio risk R ad fidig θ 0, we do ot particularly care if Rθ ad R θ are close While havig R θ Rθ uiformly is sufficiet to guaratee that θ θ 0, it is ot ecessary Ideed, all we really care about is that R θ > R θ 0 for θ sufficietly far from θ 0 That is, as we expect to have roughly R θ R θ 0 Rθ Rθ 0, where Rθ Rθ 0 + dθ, θ 0, so that we hope that R θ > R θ 0 wheever dθ, θ 0 is large eough that it swamps the stochastic error i R θ R θ 0 Moreover, as log as θ is close eough to θ 0, we ca give stroger covergece guaratees, because we expect VarR θ R θ 0 to be smaller tha VarR θ by itself A bit more precisely, we must have deviatios roughly 1
2 1 Varlθ, X i ay uiform estimate of Rθ, by the cetral limit theorem However, if θ is ear θ 0, the R θ R θ 0 = Rθ Rθ 0 + O P 1 Varlθ; X lθ0 ; X, ad the latter variace may be substatially smaller tha Varlθ, X whe dθ, θ 0 is small With the above motivatio i mid, as we wish to compare R θ R θ 0 to Rθ Rθ 0, our first step i providig rates of covergece is to uderstad the modulus of cotiuity of the process θ R θ i a eighborhood of θ 0 We make the followig defiitio Defiitio 1 Let Θ δ := {θ Θ : dθ, θ 0 δ} The expected modulus of cotiuity of the process R i a radius δ aroud θ 0 is E sup R θ Rθ R θ 0 Rθ 0 θ Θ δ For otatioal coveiece, we also defie the error processes θ, x := lθ, x Rθ lθ 0, x Rθ 0 ad θ := R θ Rθ R θ 0 Rθ 0 Both of these processes are evidetly mea zero We are most ofte cocered with upper bouds o the modulus of cotiuity relative to 1/ the typical cetral limit theorem rate That is, we cosider fuctios φ of the form that E sup R θ Rθ R θ 0 Rθ 0 θ Θ δ φδ Ofte, these fuctios satisfy φδ σδ, where σ is a type of stadard deviatio/variace measure though for fuller geerality, we will cosider fuctios φδ = σδ α for parameters α 0, A example makes this more apparet Example 1: Let l be L-Lipschitz i Θ R d ad take the orm as the distace fuctio, that is, lθ; x lθ ; x L θ θ Recallig the compariso process, we the have λ L θ θ E exp λ θ, X exp by the stadard sub-gaussia iequality for bouded radom variables Thus, lettig NΘ δ,, ɛ be the coverig umber of Θ δ for the orm, we have log NΘ δ,, ɛ d log 1 + δ ɛ ad log NΘ δ,, ɛ = 0 for ɛ δ Thus, a stadard etropy itegral calculatio, usig that L θ is a -sub-gaussia process, yields E sup θ c L δ log NΘδ,, ɛdɛ c L d δ log 1 + δ dɛ c L d δ, θ Θ δ ɛ 0 where c is a umerical costat That is, we have modulus of cotiuity boud with φδ = L dδ, or σ = L d 0
3 1 For ituitio: o-stochastic bouds o differeces i empirical risk Because we would like to uderstad the relative differeces betwee R ad R, we begi for ituitio by assumig that we have the ucoditioal boud that θ φδ wheever dθ, θ 0 δ The ituitively, we must have d θ, θ 0 small wheever the quadratic growth dθ, θ 0 i Rθ away from θ 0 domiates or overcomes the stochastic error φδ/ i our estimatio Let us make this rigorous, ad begi by assumig that dθ, θ 0 η, that is, θ is i the regio of quadratic growth 1 of R away from Rθ 0, ad let = 1 for simplicity ad wlog Now, let δ = dθ, θ 0, ad assume that R θ R θ 0, that is, θ has smaller empirical risk tha θ 0 The we have R θ R θ 0 = R θ 0 Rθ 0 + Rθ + Rθ 0 Rθ }{{} dθ,θ 0 R θ 0 Rθ 0 + Rθ dθ, θ 0, where we have used the coditio 1 Rearragig, we fid that dθ, θ 0 R θ 0 Rθ 0 + Rθ R θ θ φδ That is, we have the key iequality δ φδ 3 This iequality is the key isight to all of our cosideratios of moduli of cotiuity: if φδ does ot grow as fast as δ ad δ were large, this would cotradict iequality 3, so δ = dθ, θ 0 must be small Said differetly, for suitably large δ suitably large will decrease as grows, the quadratic growth δ will evetually swamp the stochastic error φδ/ based o iequality 3 More carefully, suppose that φδ σδ α for some α 0, The iequality 3 implies δ σδα, or δ σ 1 α Moduli of cotiuity ad covergece guaratees We ow show how to make the o-stochastic heuristic argumet of the precedig sectio rigorous Assume that we have the modulus of cotiuity boud E sup θ = E sup R θ Rθ R θ 0 Rθ 0 φδ 4 θ Θ δ θ Θ δ for all δ η, where η > 0 is the regio of strog covexity of R iequality 1 Assume additioally that φδ σδ α for some variace parameter σ ad a power α 0, The we choose the rate 3
4 δ to be the poit at which the quadratic growth domiates the stochastic error i the modulus of cotiuity 4, that is, { δ := if δ 0 : δ φδ } 5 Notig that φδ σδ α, the we certaily have that σ δ = 1 α is sufficiet to satisfy this domiatio coditio, that is, we have δ δ Moreover, we have φδ/δ 1, ad similarly for δ Thus, at least ituitively, we expect that the rate of covergece of θ to θ 0 should be roughly of order δ δ, because this is the poit at which the curvature of the risk domiates the stochastic error i its estimatio We may formalize this i the followig theorem Theorem 1 Rates of covergece Let δ be the smallest domiatig radius 5, where the empirical risk R satisfies the modulus coditio 4 ad φδ σδ α Assume also that θ = argmi θ R θ is cosistet, that is, θ p θ0 The d θ, θ 0 = O P δ = O P δ = O P σ 1 α Proof Our proof builds off of a so-called peelig argumet, where we argue that the behavior of the local relative errors θ is ice o shells aroud θ 0 Ideed, for each ad all j N, defie the shells S j, := { θ Θ : δ j 1 dθ, θ 0 δ j} Recall the defiitio η > 0 of the radius i the quadratic growth coditio 1 Now, fix ay t R +, ad cosider the evet that d θ, θ 0 t δ The either d θ, θ 0 η or we have θ S j, for some j with j t but j δ η I particular, P d θ, θ 0 t δ j:j t, j δ η P θ S j, + Pd θ, θ 0 η 6 The fial term is o1, so we may igore it i what follows Now, cosider the evet that θ S j, This implies that there exists some θ S j, such that R θ R θ 0, i tur implyig R θ R θ 0 Rθ 0 + Rθ + Rθ 0 Rθ R θ 0 Rθ 0 + Rθ dθ, θ 0, where we have used the growth coditio 1 that Rθ Rθ 0 + dθ, θ 0, which holds for θ S j, as dθ, θ 0 η Notig that dθ, θ 0 δ j, we rearrage the precedig iequality to obtai that θ S j, implies δ j R θ 0 Rθ 0 R θ Rθ sup θ S j, θ Returig to the probability sum 6, we thus have P θ S j, P sup θ δ j θ S j, 4 Esup θ S j, θ δ j
5 But of course, by assumptio 4, this i tur has the boud P θ S j, j φ j δ δ j jα by the defiitio 5 of the critical radius for δ Summig iequality 6, we thus obtai P d θ, θ 0 t δ 4 φδ δ j jα j α + o1 For ay ɛ > 0, we may take t sufficietly large that j t j α ɛ, which is the defiitio of d θ, θ 0 = O P δ j t Refereces 1 A W va der Vaart ad J A Weller Weak Covergece ad Empirical Processes: With Applicatios to Statistics Spriger, New York,
Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence
Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i
More informationGlivenko-Cantelli Classes
CS28B/Stat24B (Sprig 2008 Statistical Learig Theory Lecture: 4 Gliveko-Catelli Classes Lecturer: Peter Bartlett Scribe: Michelle Besi Itroductio This lecture will cover Gliveko-Catelli (GC classes ad itroduce
More informationLecture 19: Convergence
Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may
More informationECE 901 Lecture 13: Maximum Likelihood Estimation
ECE 90 Lecture 3: Maximum Likelihood Estimatio R. Nowak 5/7/009 The focus of this lecture is to cosider aother approach to learig based o maximum likelihood estimatio. Ulike earlier approaches cosidered
More information5.1 A mutual information bound based on metric entropy
Chapter 5 Global Fao Method I this chapter, we exted the techiques of Chapter 2.4 o Fao s method the local Fao method) to a more global costructio. I particular, we show that, rather tha costructig a local
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet
More informationConvergence of random variables. (telegram style notes) P.J.C. Spreij
Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space
More informationEmpirical Process Theory and Oracle Inequalities
Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi
More informationLecture 3: August 31
36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,
More informationSequences and Series of Functions
Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationEmpirical Processes: Glivenko Cantelli Theorems
Empirical Processes: Gliveko Catelli Theorems Mouliath Baerjee Jue 6, 200 Gliveko Catelli classes of fuctios The reader is referred to Chapter.6 of Weller s Torgo otes, Chapter??? of VDVW ad Chapter 8.3
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS
MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak
More information(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3
MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special
More information1 Convergence in Probability and the Weak Law of Large Numbers
36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec
More informationThe Growth of Functions. Theoretical Supplement
The Growth of Fuctios Theoretical Supplemet The Triagle Iequality The triagle iequality is a algebraic tool that is ofte useful i maipulatig absolute values of fuctios. The triagle iequality says that
More informationECE 901 Lecture 12: Complexity Regularization and the Squared Loss
ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality
More informationST5215: Advanced Statistical Theory
ST525: Advaced Statistical Theory Departmet of Statistics & Applied Probability Tuesday, September 7, 2 ST525: Advaced Statistical Theory Lecture : The law of large umbers The Law of Large Numbers The
More informationFall 2013 MTH431/531 Real analysis Section Notes
Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters
More informationNotes 5 : More on the a.s. convergence of sums
Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series
More informationLecture 6: Coupon Collector s problem
Radomized Algorithms Lecture 6: Coupo Collector s problem Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Radomized Algorithms - Lecture 6 1 / 16 Variace: key features
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More information1 Review and Overview
DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,
More informationA survey on penalized empirical risk minimization Sara A. van de Geer
A survey o pealized empirical risk miimizatio Sara A. va de Geer We address the questio how to choose the pealty i empirical risk miimizatio. Roughly speakig, this pealty should be a good boud for the
More informationSieve Estimators: Consistency and Rates of Convergence
EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes
More informationLecture 2. The Lovász Local Lemma
Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationMATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4
MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.
More informationLecture 13: Maximum Likelihood Estimation
ECE90 Sprig 007 Statistical Learig Theory Istructor: R. Nowak Lecture 3: Maximum Likelihood Estimatio Summary of Lecture I the last lecture we derived a risk (MSE) boud for regressio problems; i.e., select
More informationThis section is optional.
4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More informationLecture 11 October 27
STATS 300A: Theory of Statistics Fall 205 Lecture October 27 Lecturer: Lester Mackey Scribe: Viswajith Veugopal, Vivek Bagaria, Steve Yadlowsky Warig: These otes may cotai factual ad/or typographic errors..
More informationStatistical Inference Based on Extremum Estimators
T. Rotheberg Fall, 2007 Statistical Iferece Based o Extremum Estimators Itroductio Suppose 0, the true value of a p-dimesioal parameter, is kow to lie i some subset S R p : Ofte we choose to estimate 0
More informationPoint Estimation: properties of estimators 1 FINITE-SAMPLE PROPERTIES. finite-sample properties (CB 7.3) large-sample properties (CB 10.
Poit Estimatio: properties of estimators fiite-sample properties CB 7.3) large-sample properties CB 10.1) 1 FINITE-SAMPLE PROPERTIES How a estimator performs for fiite umber of observatios. Estimator:
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationOutput Analysis and Run-Length Control
IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%
More informationLECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if
LECTURE 14 NOTES 1. Asymptotic power of tests. Defiitio 1.1. A sequece of -level tests {ϕ x)} is cosistet if β θ) := E θ [ ϕ x) ] 1 as, for ay θ Θ 1. Just like cosistecy of a sequece of estimators, Defiitio
More informationMetric Space Properties
Metric Space Properties Math 40 Fial Project Preseted by: Michael Brow, Alex Cordova, ad Alyssa Sachez We have already poited out ad will recogize throughout this book the importace of compact sets. All
More informationLecture 9: September 19
36-700: Probability ad Mathematical Statistics I Fall 206 Lecturer: Siva Balakrisha Lecture 9: September 9 9. Review ad Outlie Last class we discussed: Statistical estimatio broadly Pot estimatio Bias-Variace
More informationLecture 12: September 27
36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.
More informationLecture 10 October Minimaxity and least favorable prior sequences
STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least
More informationECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors
ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic
More informationMath 2784 (or 2794W) University of Connecticut
ORDERS OF GROWTH PAT SMITH Math 2784 (or 2794W) Uiversity of Coecticut Date: Mar. 2, 22. ORDERS OF GROWTH. Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really
More informationECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization
ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where
More informationSTAT Homework 1 - Solutions
STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better
More informationMATH 413 FINAL EXAM. f(x) f(y) M x y. x + 1 n
MATH 43 FINAL EXAM Math 43 fial exam, 3 May 28. The exam starts at 9: am ad you have 5 miutes. No textbooks or calculators may be used durig the exam. This exam is prited o both sides of the paper. Good
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationRegression with quadratic loss
Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,
More informationRandom Variables, Sampling and Estimation
Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig
More informationSequences. Notation. Convergence of a Sequence
Sequeces A sequece is essetially just a list. Defiitio (Sequece of Real Numbers). A sequece of real umbers is a fuctio Z (, ) R for some real umber. Do t let the descriptio of the domai cofuse you; it
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More informationLecture Notes 15 Hypothesis Testing (Chapter 10)
1 Itroductio Lecture Notes 15 Hypothesis Testig Chapter 10) Let X 1,..., X p θ x). Suppose we we wat to kow if θ = θ 0 or ot, where θ 0 is a specific value of θ. For example, if we are flippig a coi, we
More informationSingular Continuous Measures by Michael Pejic 5/14/10
Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable
More informationLecture 2: Concentration Bounds
CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy
More informationECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002
ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom
More informationLecture 7: Density Estimation: k-nearest Neighbor and Basis Approach
STAT 425: Itroductio to Noparametric Statistics Witer 28 Lecture 7: Desity Estimatio: k-nearest Neighbor ad Basis Approach Istructor: Ye-Chi Che Referece: Sectio 8.4 of All of Noparametric Statistics.
More information62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +
62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of
More informationAn Introduction to Asymptotic Theory
A Itroductio to Asymptotic Theory Pig Yu School of Ecoomics ad Fiace The Uiversity of Hog Kog Pig Yu (HKU) Asymptotic Theory 1 / 20 Five Weapos i Asymptotic Theory Five Weapos i Asymptotic Theory Pig Yu
More informationDistribution of Random Samples & Limit theorems
STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to
More informationINFINITE SEQUENCES AND SERIES
11 INFINITE SEQUENCES AND SERIES INFINITE SEQUENCES AND SERIES 11.4 The Compariso Tests I this sectio, we will lear: How to fid the value of a series by comparig it with a kow series. COMPARISON TESTS
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 6 9/23/2013. Brownian motion. Introduction
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 6 9/23/203 Browia motio. Itroductio Cotet.. A heuristic costructio of a Browia motio from a radom walk. 2. Defiitio ad basic properties
More informationMath 525: Lecture 5. January 18, 2018
Math 525: Lecture 5 Jauary 18, 2018 1 Series (review) Defiitio 1.1. A sequece (a ) R coverges to a poit L R (writte a L or lim a = L) if for each ǫ > 0, we ca fid N such that a L < ǫ for all N. If the
More informationsin(n) + 2 cos(2n) n 3/2 3 sin(n) 2cos(2n) n 3/2 a n =
60. Ratio ad root tests 60.1. Absolutely coverget series. Defiitio 13. (Absolute covergece) A series a is called absolutely coverget if the series of absolute values a is coverget. The absolute covergece
More informationSeunghee Ye Ma 8: Week 5 Oct 28
Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value
More informationMath 113, Calculus II Winter 2007 Final Exam Solutions
Math, Calculus II Witer 7 Fial Exam Solutios (5 poits) Use the limit defiitio of the defiite itegral ad the sum formulas to compute x x + dx The check your aswer usig the Evaluatio Theorem Solutio: I this
More informationNotes On Median and Quantile Regression. James L. Powell Department of Economics University of California, Berkeley
Notes O Media ad Quatile Regressio James L. Powell Departmet of Ecoomics Uiversity of Califoria, Berkeley Coditioal Media Restrictios ad Least Absolute Deviatios It is well-kow that the expected value
More informationAdvanced Stochastic Processes.
Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.
More informationExercise 4.3 Use the Continuity Theorem to prove the Cramér-Wold Theorem, Theorem. (1) φ a X(1).
Assigmet 7 Exercise 4.3 Use the Cotiuity Theorem to prove the Cramér-Wold Theorem, Theorem 4.12. Hit: a X d a X implies that φ a X (1) φ a X(1). Sketch of solutio: As we poited out i class, the oly tricky
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationAdvanced Analysis. Min Yan Department of Mathematics Hong Kong University of Science and Technology
Advaced Aalysis Mi Ya Departmet of Mathematics Hog Kog Uiversity of Sciece ad Techology September 3, 009 Cotets Limit ad Cotiuity 7 Limit of Sequece 8 Defiitio 8 Property 3 3 Ifiity ad Ifiitesimal 8 4
More informationTR/46 OCTOBER THE ZEROS OF PARTIAL SUMS OF A MACLAURIN EXPANSION A. TALBOT
TR/46 OCTOBER 974 THE ZEROS OF PARTIAL SUMS OF A MACLAURIN EXPANSION by A. TALBOT .. Itroductio. A problem i approximatio theory o which I have recetly worked [] required for its solutio a proof that the
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio
More informationy X F n (y), To see this, let y Y and apply property (ii) to find a sequence {y n } X such that y n y and lim sup F n (y n ) F (y).
Modica Mortola Fuctioal 2 Γ-Covergece Let X, d) be a metric space ad cosider a sequece {F } of fuctioals F : X [, ]. We say that {F } Γ-coverges to a fuctioal F : X [, ] if the followig properties hold:
More informationTechnical Proofs for Homogeneity Pursuit
Techical Proofs for Homogeeity Pursuit bstract This is the supplemetal material for the article Homogeeity Pursuit, submitted for publicatio i Joural of the merica Statistical ssociatio. B Proofs B. Proof
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationSelf-normalized deviation inequalities with application to t-statistic
Self-ormalized deviatio iequalities with applicatio to t-statistic Xiequa Fa Ceter for Applied Mathematics, Tiaji Uiversity, 30007 Tiaji, Chia Abstract Let ξ i i 1 be a sequece of idepedet ad symmetric
More informationDifferentiable Convex Functions
Differetiable Covex Fuctios The followig picture motivates Theorem 11. f ( x) f ( x) f '( x)( x x) ˆx x 1 Theorem 11 : Let f : R R be differetiable. The, f is covex o the covex set C R if, ad oly if for
More informationlim za n n = z lim a n n.
Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationChapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities
Chapter 5 Iequalities 5.1 The Markov ad Chebyshev iequalities As you have probably see o today s frot page: every perso i the upper teth percetile ears at least 1 times more tha the average salary. I other
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 12
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig
More informationBasics of Probability Theory (for Theory of Computation courses)
Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.
More informationRandom Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.
Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)
More informationMaximum Likelihood Estimation and Complexity Regularization
ECE90 Sprig 004 Statistical Regularizatio ad Learig Theory Lecture: 4 Maximum Likelihood Estimatio ad Complexity Regularizatio Lecturer: Rob Nowak Scribe: Pam Limpiti Review : Maximum Likelihood Estimatio
More informationAda Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities
CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We
More informationSolutions to HW Assignment 1
Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5
CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio
More informationG. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan
Deviatio of the Variaces of Classical Estimators ad Negative Iteger Momet Estimator from Miimum Variace Boud with Referece to Maxwell Distributio G. R. Pasha Departmet of Statistics Bahauddi Zakariya Uiversity
More informationDefinition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.
4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad
More information1 Review and Overview
CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we
More informationMonte Carlo Integration
Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce
More informationNotes 27 : Brownian motion: path properties
Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece
More informationLecture 15: Learning Theory: Concentration Inequalities
STAT 425: Itroductio to Noparametric Statistics Witer 208 Lecture 5: Learig Theory: Cocetratio Iequalities Istructor: Ye-Chi Che 5. Itroductio Recall that i the lecture o classificatio, we have see that
More informationOptimally Sparse SVMs
A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but
More information1 Duality revisited. AM 221: Advanced Optimization Spring 2016
AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R
More informationBeurling Integers: Part 2
Beurlig Itegers: Part 2 Isomorphisms Devi Platt July 11, 2015 1 Prime Factorizatio Sequeces I the last article we itroduced the Beurlig geeralized itegers, which ca be represeted as a sequece of real umbers
More informationNotes 19 : Martingale CLT
Notes 9 : Martigale CLT Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: [Bil95, Chapter 35], [Roc, Chapter 3]. Sice we have ot ecoutered weak covergece i some time, we first recall
More information