On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities


1 On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities. Sasha Rakhlin, Department of Statistics, The Wharton School, University of Pennsylvania. Dec 16, 2015. Joint work with K. Sridharan. arXiv:

2 Outline: Introduction; Beyond Banach spaces; Extras

4 If $Z_1,\dots,Z_n$ are independent with $\mathbb{E} Z_t = 0$, then $\mathbb{E}\big(\sum_{t=1}^n Z_t\big)^2 = \sum_{t=1}^n \mathbb{E} Z_t^2$. This extends to Hilbert space: $\mathbb{E}\big\|\sum_{i=1}^n Z_i\big\|^2 = \sum_{i=1}^n \mathbb{E}\|Z_i\|^2$.
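A short derivation of this identity, included as a reading aid: expand the square and note that the cross terms vanish. Only $\mathbb{E}[Z_s Z_t] = 0$ for $s \neq t$ is used, which is why the same computation goes through for square-integrable martingale differences.

```latex
\begin{align*}
\mathbb{E}\Big(\sum_{t=1}^n Z_t\Big)^2
  &= \sum_{t=1}^n \mathbb{E} Z_t^2 + 2\sum_{s<t} \mathbb{E}[Z_s Z_t] \\
  &= \sum_{t=1}^n \mathbb{E} Z_t^2,
  \qquad \text{since } \mathbb{E}[Z_s Z_t] = \mathbb{E} Z_s \,\mathbb{E} Z_t = 0 \text{ for } s \neq t .
\end{align*}
```

In a Hilbert space the same expansion applies with $Z_s Z_t$ replaced by $\langle Z_s, Z_t \rangle$.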

7 (Pinelis 94): Let $Z_1,\dots,Z_n$ be a martingale difference sequence in a separable 2-smooth Banach space $(\mathcal{B}, \|\cdot\|)$. For any $u > 0$, $\mathbb{P}\big(\sup_{n\ge 1}\big\|\sum_{t=1}^n Z_t\big\| \ge \sigma u\big) \le 2\exp\{-u^2/(2D^2)\}$, where $\sigma^2 \ge \sum_t \|Z_t\|_\infty^2$. Questions: replace $\sigma$ with a sequence-dependent version? Is it always possible? Extend beyond the linear structure of Banach spaces? Contributions: address these questions; the actual technique: equivalence of tail bounds and deterministic pathwise regret inequalities.

10 Baby version. Unit Euclidean ball $B$ in $\mathbb{R}^d$. Let $z_1,\dots,z_n \in B$ be arbitrary. Define $\hat{y}_{t+1} = \hat{y}_{t+1}(z_1,\dots,z_t) = \mathrm{Proj}_B\big(\hat{y}_t - n^{-1/2} z_t\big)$ with $\hat{y}_1 = 0$. Then $\forall y, z_1,\dots,z_n \in B$, $\sum_{t=1}^n \langle \hat{y}_t - y, z_t\rangle \le \sqrt{n}$. Rewrite as: $\exists (\hat{y}_t)\ \forall z_1,\dots,z_n \in B$, $\big\|\sum_{t=1}^n z_t\big\| - \sqrt{n} \le \sum_{t=1}^n \langle \hat{y}_t, -z_t\rangle$.
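The baby version can be checked numerically. The sketch below plays projected gradient descent with step $n^{-1/2}$ on the unit Euclidean ball and evaluates the gap between the two sides of the rewritten inequality $\|\sum_t z_t\| - \sqrt{n} \le \sum_t \langle \hat{y}_t, -z_t\rangle$. The function names and the random test sequences are illustrative assumptions, not part of the talk.

```python
import math
import random


def project_to_unit_ball(v):
    """Euclidean projection of v onto the unit ball."""
    norm = math.sqrt(sum(c * c for c in v))
    return list(v) if norm <= 1.0 else [c / norm for c in v]


def learner_loss(zs):
    """Run y_hat_{t+1} = Proj_B(y_hat_t - z_t / sqrt(n)) from y_hat_1 = 0.

    Returns sum_t <y_hat_t, z_t>.
    """
    n, d = len(zs), len(zs[0])
    step = 1.0 / math.sqrt(n)
    y_hat = [0.0] * d
    total = 0.0
    for z in zs:
        # y_hat_t is committed before z_t is revealed
        total += sum(a * b for a, b in zip(y_hat, z))
        y_hat = project_to_unit_ball([a - step * b for a, b in zip(y_hat, z)])
    return total


def random_ball_sequence(n, d, rng):
    """Hypothetical test data: n points inside the unit ball of R^d."""
    zs = []
    for _ in range(n):
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in g)) or 1.0
        radius = rng.random()
        zs.append([radius * c / norm for c in g])
    return zs


def inequality_gap(zs):
    """Right side minus left side of the inequality; nonnegative when it holds."""
    n = len(zs)
    sum_z = [sum(col) for col in zip(*zs)]
    norm_sum = math.sqrt(sum(c * c for c in sum_z))
    return -learner_loss(zs) - (norm_sum - math.sqrt(n))
```

Calling `inequality_gap(random_ball_sequence(50, 3, random.Random(0)))` should return a nonnegative number (up to floating-point error) for any sequence in the ball, since the statement is pathwise rather than probabilistic.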

15 Deterministic inequality: $\exists (\hat{y}_t)\ \forall z_1,\dots,z_n \in B$, $\big\|\sum_{t=1}^n z_t\big\| - \sqrt{n} \le \sum_{t=1}^n \langle \hat{y}_t, -z_t\rangle$. (1) Apply to an MDS $Z_1,\dots,Z_n$ with values in $B$: $\mathbb{P}\big(\big\|\sum_{t=1}^n Z_t\big\| - \sqrt{n} \ge u\big) \le \mathbb{P}\big(\sum_{t=1}^n \langle \hat{y}_t, -Z_t\rangle \ge u\big) \le \exp\{-u^2/(2n)\}$ (2) by Azuma-Hoeffding. Integrate tails: $\mathbb{E}\big\|\sum_{t=1}^n Z_t\big\| \le c\sqrt{n}$. (3) Using the von Neumann minimax theorem, it is possible to show $\exists (\hat{y}_t)\ \forall y, z_1,\dots,z_n \in B$, $\sum_{t=1}^n \langle \hat{y}_t - y, z_t\rangle \le \sup_{\text{mds}} \mathbb{E}\big\|\sum_{t=1}^n W_t\big\| \le c\sqrt{n}$. Conclusion: (1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (1) (up to constants).
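The "integrate tails" step can be made explicit. The computation below is a sketch that assumes the tail bound reads $\mathbb{P}(\|\sum_t Z_t\| - \sqrt{n} \ge u) \le \exp\{-u^2/(2n)\}$ for all $u > 0$; the resulting constant is one admissible value of $c$, not necessarily the one used in the talk.

```latex
\begin{align*}
\mathbb{E}\Big\|\sum_{t=1}^n Z_t\Big\| - \sqrt{n}
  &\le \mathbb{E}\Big(\Big\|\sum_{t=1}^n Z_t\Big\| - \sqrt{n}\Big)_+
   = \int_0^\infty \mathbb{P}\Big(\Big\|\sum_{t=1}^n Z_t\Big\| - \sqrt{n} \ge u\Big)\,du \\
  &\le \int_0^\infty e^{-u^2/(2n)}\,du
   = \sqrt{\pi n/2},
\end{align*}
so that $\mathbb{E}\big\|\sum_{t=1}^n Z_t\big\| \le \big(1 + \sqrt{\pi/2}\big)\sqrt{n}$.
```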

18 $\exists (\hat{y}_t)\ \forall y, z_1,\dots,z_n \in B$, $\sum_{t=1}^n \langle \hat{y}_t - y, z_t\rangle \le \sqrt{n}$; $\ \mathbb{P}\big(\big\|\sum_{t=1}^n Z_t\big\| - \sqrt{n} \ge u\big) \le \exp\{-u^2/(2n)\}$; $\ \mathbb{E}\big\|\sum_{t=1}^n Z_t\big\| \le c\sqrt{n}$. Curiosities: in particular, (3) $\Rightarrow$ (2) amplifies an in-expectation bound to a high-probability one; improve tail bounds by taking a better gradient descent; improve gradient descent by finding better tail bounds; move beyond the linear structure of Banach space. </end of baby version>
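As a sanity check on the high-probability statement, the following sketch simulates one particular martingale difference sequence in the unit ball and compares the empirical frequency of $\{\|\sum_t Z_t\| - \sqrt{n} \ge u\}$ with $\exp\{-u^2/(2n)\}$. The choice of sequence (a random sign times a unit direction determined by the past) and all function names are assumptions made for illustration.

```python
import math
import random


def norm_of_martingale_sum(n, rng):
    """Sample ||sum_t Z_t|| for Z_t = eps_t * v_t in R^2.

    v_t is a unit vector fixed by the past (the direction of the running sum,
    or e_1 when the sum is zero); eps_t is a fresh random sign, so (Z_t) is a
    martingale difference sequence with ||Z_t|| = 1.
    """
    s = [0.0, 0.0]
    for _ in range(n):
        norm = math.hypot(s[0], s[1])
        v = [s[0] / norm, s[1] / norm] if norm > 0 else [1.0, 0.0]
        eps = rng.choice((-1.0, 1.0))
        s = [s[0] + eps * v[0], s[1] + eps * v[1]]
    return math.hypot(s[0], s[1])


def empirical_tail(n, u, trials, seed=0):
    """Monte Carlo estimate of P(||sum_t Z_t|| - sqrt(n) >= u)."""
    rng = random.Random(seed)
    hits = sum(
        norm_of_martingale_sum(n, rng) - math.sqrt(n) >= u for _ in range(trials)
    )
    return hits / trials


def azuma_bound(n, u):
    """The right-hand side exp(-u^2 / (2n))."""
    return math.exp(-u * u / (2.0 * n))
```

For example, `empirical_tail(50, math.sqrt(50), 2000)` can be compared against `azuma_bound(50, math.sqrt(50))`; the simulation only illustrates the bound on one sequence and does not probe its tightness.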

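19 Warmup: mirror descent with adaptive step size. $(\mathcal{B}, \|\cdot\|)$ is 2-smooth; $(\mathcal{B}^*, \|\cdot\|_*)$ denotes the dual. $D_R: \mathcal{B}^* \times \mathcal{B}^* \to \mathbb{R}$ is the Bregman divergence w.r.t. $R$, which is 1-strongly convex on the unit ball $B^* \subset \mathcal{B}^*$. Denote $R_{\max}^2 \triangleq \sup_{f,g \in B^*} D_R(f, g)$. Here the $z_t$'s need not be in the unit ball. Lemma. Let $\mathcal{F} \subseteq B^*$ be convex. Define $\hat{y}_{t+1} = \hat{y}_{t+1}(z_1,\dots,z_t) = \operatorname{argmin}_{y \in \mathcal{F}} \{\eta_t \langle y, z_t\rangle + D_R(y, \hat{y}_t)\}$ and $\eta_t \triangleq R_{\max} \min\Big\{1, \Big(\sqrt{\sum_{s=1}^{t} \|z_s\|^2} + \sqrt{\sum_{s=1}^{t-1} \|z_s\|^2}\Big)^{-1}\Big\}$. Then for any $y \in \mathcal{F}$ and any $z_1,\dots,z_n \in \mathcal{B}$, $\sum_{t=1}^n \langle \hat{y}_t - y, z_t\rangle \le 2.5\,R_{\max}\Big(\sqrt{\sum_{t=1}^n \|z_t\|^2} + 1\Big)$.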
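A minimal sketch of the adaptive-step-size lemma in the simplest instance, assuming the Euclidean setting: $R(y) = \tfrac12\|y\|^2$, so $D_R(y, y') = \tfrac12\|y - y'\|^2$, $\mathcal{F}$ is the unit ball, the mirror step reduces to a projection, and $R_{\max}^2 = 2$. These choices, and the helper names, are illustrative assumptions; the lemma itself is stated for general 2-smooth spaces.

```python
import math
import random

R_MAX = math.sqrt(2.0)  # sup over the unit ball of 0.5 * ||f - g||^2 equals 2


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _project(v):
    """Euclidean projection onto the unit ball (the set F in this sketch)."""
    norm = math.sqrt(_dot(v, v))
    return list(v) if norm <= 1.0 else [c / norm for c in v]


def regret_and_bound(zs):
    """Return (regret against the best y in the unit ball, 2.5*R_max*(sqrt(V_n)+1))."""
    y_hat = [0.0] * len(zs[0])
    learner = 0.0
    v_prev = 0.0  # running sum of ||z_s||^2 up to time t-1
    for z in zs:
        learner += _dot(y_hat, z)
        v_curr = v_prev + _dot(z, z)
        denom = math.sqrt(v_curr) + math.sqrt(v_prev)
        # eta_t = R_max * min{1, 1/denom}; a zero denominator gives min{1, inf} = 1
        eta = R_MAX * (1.0 if denom == 0.0 else min(1.0, 1.0 / denom))
        # argmin_y { eta <y, z> + 0.5 ||y - y_hat||^2 } over the ball is a projection
        y_hat = _project([a - eta * b for a, b in zip(y_hat, z)])
        v_prev = v_curr
    sum_z = [sum(col) for col in zip(*zs)]
    best = -math.sqrt(_dot(sum_z, sum_z))  # min over the unit ball of <y, sum_t z_t>
    return learner - best, 2.5 * R_MAX * (math.sqrt(v_prev) + 1.0)
```

The inputs are not restricted to the unit ball: the step size adapts to the observed norms, which is what makes the right-hand side sequence-dependent.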

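20 Warmup: mirror descent with adaptive step size. Let $\mathbb{E}_t$ be the conditional expectation. Theorem. Let $Z_1,\dots,Z_n$ be a $\mathcal{B}$-valued MDS. For any $u > 0$, $\mathbb{P}\left(\dfrac{\big\|\sum_{t=1}^n Z_t\big\| - 2.5\,R_{\max}\big(\sqrt{V_n} + 1\big)}{\sqrt{V_n + W_n + \big(\mathbb{E}\sqrt{V_n + W_n}\big)^2}} > u\right) \le 2\exp\{-u^2/16\}$, where $V_n = \sum_{t=1}^n \|Z_t\|^2$ and $W_n = \sum_{t=1}^n \mathbb{E}_{t-1}\|Z_t\|^2$. Holds with $W_n \equiv 0$ if the MDS is conditionally symmetric. The bound is $n$-independent and self-normalized, and can be extended to $p$-smooth spaces.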

21 Summary so far: a connection between first-order convex optimization methods and one-sided probabilistic tail bounds.

22 Outline: Introduction; Beyond Banach spaces; Extras

24 Interpret as the supremum of a stochastic process: $\big\|\sum_{t=1}^n Z_t\big\| = \sup_{\|y\|_* \le 1} \sum_{t=1}^n \langle y, Z_t\rangle$. Generalization (after centering): take any stochastic process $Z_t$ and $\sup_{g \in \mathcal{G}} \sum_{t=1}^n \big(g(Z_t) - \mathbb{E}_{t-1}[g(Z_t)]\big)$. It is enough to consider $\mathcal{D}_t = \sigma(\epsilon_1,\dots,\epsilon_t)$ generated by i.i.d. Rademacher variables: $\sup_{f \in \mathcal{F}} \sum_{t=1}^n \epsilon_t f(x_t)$, where $x_t$ is $\mathcal{D}_{t-1}$-measurable (extend Panchenko's symmetrization technique to martingales): $f(x_t(\epsilon_{1:t-1})) = g(z_t(\epsilon_{1:t-1}, +1)) - g(z_t(\epsilon_{1:t-1}, -1))$.

26 Deterministic regret inequalities. Let $y_1,\dots,y_n \in \{\pm 1\}$, $x_1,\dots,x_n \in \mathcal{X}$, $\mathcal{F} = \{f: \mathcal{X} \to \mathbb{R}\}$. For a given function $B: \mathcal{F} \times \mathcal{X}^n \to \mathbb{R}$, we want a prediction strategy $\hat{y}_t = \hat{y}_t(x_1,\dots,x_t, y_1,\dots,y_{t-1})$ such that $\forall (x_t, y_t)_{t=1}^n$, $\sum_{t=1}^n -\hat{y}_t y_t \le \inf_{f \in \mathcal{F}} \big\{\sum_{t=1}^n -y_t f(x_t) + 2B(f; x_1,\dots,x_n)\big\}$. If the existence of $(\hat{y}_t)$ is certified, apply it to $y_t = \epsilon_t$ and $x_t = x_t(\epsilon)$: $\mathbb{P}\big(\sup_{f \in \mathcal{F}} \big\{\sum_{t=1}^n \epsilon_t f(x_t) - 2B(f; x_1,\dots,x_n)\big\} \ge u\big) \le \mathbb{P}\big(\sum_{t=1}^n \epsilon_t \hat{y}_t \ge u\big) \le \exp\{\dots\}$.
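The passage from the regret inequality to the tail bound is a one-line rearrangement, spelled out here as a reading aid. The only fact used beyond algebra is that $\hat{y}_t$ depends on $\epsilon_1,\dots,\epsilon_{t-1}$ alone, so that $(\epsilon_t \hat{y}_t)$ is a martingale difference sequence.

```latex
\begin{align*}
\sum_{t=1}^n -\hat{y}_t y_t
  \le \inf_{f \in \mathcal{F}} \Big\{ \sum_{t=1}^n -y_t f(x_t) + 2B(f; x_1,\dots,x_n) \Big\}
\;\Longleftrightarrow\;
\sup_{f \in \mathcal{F}} \Big\{ \sum_{t=1}^n y_t f(x_t) - 2B(f; x_1,\dots,x_n) \Big\}
  \le \sum_{t=1}^n \hat{y}_t y_t .
\end{align*}
Setting $y_t = \epsilon_t$ and $x_t = x_t(\epsilon_{1:t-1})$ gives the pathwise domination
\[
\sup_{f \in \mathcal{F}} \Big\{ \sum_{t=1}^n \epsilon_t f(x_t) - 2B(f; x_1,\dots,x_n) \Big\}
  \le \sum_{t=1}^n \epsilon_t \hat{y}_t ,
\qquad \mathbb{E}\big[\epsilon_t \hat{y}_t \,\big|\, \epsilon_{1:t-1}\big] = 0 ,
\]
so any tail bound for the martingale on the right transfers to the supremum on the left.
```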

28 Lemma. If for any predictable process $x = (x_1,\dots,x_n)$, $\mathbb{E}\big[\sup_{f \in \mathcal{F}} \big\{\sum_{t=1}^n \epsilon_t f(x_t) - 2B(f; x_1,\dots,x_n)\big\}\big] \le 0$, then there exists a strategy $(\hat{y}_t)$ with values $|\hat{y}_t| \le \sup_{f \in \mathcal{F}} |f(x_t)|$ such that the deterministic inequality holds for all sequences. Remarks: automatic amplification to high probability; existential, with no explicit prediction strategy $(\hat{y}_t)$; an offset version of the sequential Rademacher complexity $\mathcal{R}_n(\mathcal{F}; x) = \mathbb{E}\big[\sup_{f \in \mathcal{F}} \sum_{t=1}^n \epsilon_t f(x_t)\big]$; the map $(\epsilon_1,\dots,\epsilon_n) \mapsto \sup_{f \in \mathcal{F}} \sum_{t=1}^n \epsilon_t f(x_t)$ is not Lipschitz, so concentration methods fail.
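For small $n$ and a finite class, the sequential Rademacher complexity and its offset version can be computed by brute force, which may help make the definitions concrete. In this sketch a predictable process is represented by a hypothetical callable `tree` that maps the tuple of past signs $(\epsilon_1,\dots,\epsilon_{t-1})$ to a point $x_t$, and `offset` plays the role of $B(f; x_1,\dots,x_n)$; both names are assumptions made for illustration.

```python
import itertools


def sequential_rademacher(functions, tree, n, offset=None):
    """E_eps sup_f { sum_t eps_t f(x_t) - 2 * offset(f, xs) } by enumerating 2^n sign paths.

    With offset=None this is the plain sequential Rademacher complexity.
    """
    total = 0.0
    for eps in itertools.product((-1, 1), repeat=n):
        # x_t may depend only on eps_1, ..., eps_{t-1} (predictability)
        xs = [tree(eps[:t]) for t in range(n)]
        total += max(
            sum(e * f(x) for e, x in zip(eps, xs))
            - (2.0 * offset(f, xs) if offset is not None else 0.0)
            for f in functions
        )
    return total / 2 ** n


# Illustrative class {+1, -1} of constant functions on a constant tree.
constant_class = [lambda x: 1.0, lambda x: -1.0]
constant_tree = lambda prefix: 0.0
```

With these illustrative inputs, `sequential_rademacher(constant_class, constant_tree, n)` equals $\mathbb{E}|\sum_t \epsilon_t|$. The enumeration is exponential in $n$ and is meant only for toy cases.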

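29 Definition. Let $r \in (1, 2]$. We say that the sequential Rademacher complexity of $\mathcal{F}$ exhibits an $n^{1/r}$ growth if $\forall n \ge 1$, $\forall x$, $\mathcal{R}_n(\mathcal{F}; x) \le C\, n^{1/r} \sup_{f \in \mathcal{F},\, \epsilon \in \{\pm 1\}^n,\, t \le n} |f(x_t(\epsilon))|$.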

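30 Using amplification and reverse Hölder (due to Burkholder/Pisier): Lemma. Let $\mathcal{F} \subseteq \mathbb{R}^{\mathcal{X}}$. Suppose the sequential Rademacher complexity exhibits $n^{1/r}$ growth, $r \in (1, 2]$. For any $p < r$, $\mathbb{E} \sup_{f \in \mathcal{F}} \sum_{t=1}^n \epsilon_t f(x_t) \le C_{r,p}\, \mathbb{E}\big(\sum_{t=1}^n \sup_{f \in \mathcal{F}} |f(x_t)|^p\big)^{1/p}$. Further, if $\mathcal{F} \subseteq [-1, 1]^{\mathcal{X}}$, then $\mathbb{E} \sup_{f \in \mathcal{F}} \sum_{t=1}^n \epsilon_t f(x_t) \le C \log n \ \mathbb{E}\big(\sum_{t=1}^n \sup_{f \in \mathcal{F}} |f(x_t)|^r\big)^{1/r}$. In the spirit of: if one can prove $\mathbb{E}\big\|\sum_{t=1}^n Z_t\big\| \lesssim \sqrt{n}$, then $\mathbb{E}\big\|\sum_{t=1}^n Z_t\big\| \lesssim \mathbb{E}\sqrt{\sum_{t=1}^n \|Z_t\|^2}$.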

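31 Definition. We say $\mathcal{G} \subseteq \mathbb{R}^{\mathcal{Z}}$ has martingale type $p$ if $\exists C$ such that $\mathbb{E}\big[\sup_{g \in \mathcal{G}} \sum_{t=1}^n \big(g(Z_t) - \mathbb{E}_{t-1}[g(Z_t)]\big)\big] \le C\, \mathbb{E}\big(\sum_{t=1}^n \mathbb{E}_{t-1} \sup_{g \in \mathcal{G}} |g(Z_t) - g(Z'_t)|^p\big)^{1/p}$. Theorem. For any $\mathcal{G} \subseteq \mathbb{R}^{\mathcal{Z}}$: 1. If the sequential Rademacher complexity exhibits $n^{1/r}$ growth, $r \in (1, 2]$, then $\mathcal{G}$ has martingale type $p$ for every $p < r$. 2. If $\mathcal{G}$ has martingale type $p$, then the sequential Rademacher complexity exhibits $n^{1/p}$ growth.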

33 Finer analysis for type 2. Define $\mathrm{Var}_n = \sum_{t=1}^n \sup_{f \in \mathcal{F}} f(x_t)^2$ and $\mathrm{Var}_n(f) = \sum_{t=1}^n f(x_t)^2$. Whenever $\log \mathcal{N}_{\mathrm{seq}}(\alpha) \le \alpha^{-q}$, $q \in [0, 2]$, $\mathbb{E}\big[\sup_{f \in \mathcal{F}} \big\{\sum_{t=1}^n \epsilon_t f(x_t) - C\, (\mathrm{Var}_n^{1/2})^{q/4}\, (\mathrm{Var}_n^{1/2}(f))^{(2-q)/4}\big\}\big] \le 0$. High probability via amplification. Compare to (Massart, Rossignol 13), a weak-variance improvement of the Nemirovskii inequality: for i.i.d. zero-mean $Z_1,\dots,Z_n \in \mathbb{R}^d$, $\mathbb{E}\big[\max_{j \le d} \big|\sum_{t=1}^n \epsilon_t Z_{t,j}\big|\big] \le \sqrt{2\ln(2d)\, \mathbb{E} \max_{j \le d} \sum_{t=1}^n Z_{t,j}^2}$. We match the index $j$ on both sides and extend to martingales beyond the finite case.

34 Conclusions. Equivalence of deterministic regret inequalities and martingale tail bounds gives a way of proving tail bounds (for martingales or i.i.d.) by exhibiting a method or certifying its existence, with amplification to high probability. Use it to extend the notion of martingale type to general classes. Not in this talk: data-dependent bounds for online learning.

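35 Open questions. What is behind the equivalence? Replace $\mathbb{E}\big(\sum_{t=1}^n \mathbb{E}_{t-1} \sup_{g \in \mathcal{G}} |g(Z_t) - g(Z'_t)|^p\big)^{1/p}$ with $\mathbb{E}\big(\sum_{t=1}^n \sup_{g \in \mathcal{G}} |g(Z_t) - \mathbb{E}_{t-1} g(Z_t)|^p\big)^{1/p}$. If the sequential Rademacher complexity exhibits an $n^{1/r}$ growth rate, does $\mathcal{G}$ have martingale type $r$? We only prove martingale type $p$ for any $p < r$.

36 Outline: Introduction; Beyond Banach spaces; Extras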

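37 Reverse Hölder principle. For $p \in (0, \infty)$, define $\|Z\|_{p,\infty} = \big(\sup_{t > 0} t^p\, \mathbb{P}(|Z| > t)\big)^{1/p}$. Lemma (Pisier). For any $\delta \in (0, 1)$ and any $R$ there exists $C_p(\delta, R)$ such that the following holds. For i.i.d. $(Z_i)_{i \ge 0}$, if $\sup_{N \ge 1} \mathbb{P}\big(\sup_{i \le N} N^{-1/p} |Z_i| > R\big) \le \delta$, then $\|Z\|_{p,\infty} \le C_p(\delta, R)$. Corollary: For any $0 < q < p < \infty$ there exists $C_{p,q}$ such that $\|Z\|_{p,\infty} \le C_{p,q} \sup_{N \ge 1} \big\| N^{-1/p} \sup_{i \le N} |Z_i| \big\|_q$.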
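A standard example, added here for orientation, shows how the weak norm $\|Z\|_{p,\infty} = (\sup_{t>0} t^p\, \mathbb{P}(|Z| > t))^{1/p}$ differs from the usual $L_p$ norm: a Pareto-type variable has finite weak norm but infinite $p$-th moment. The specific distribution is an illustrative choice, not taken from the talk.

```latex
Let $Z \ge 0$ with $\mathbb{P}(Z > t) = \min\{1, t^{-p}\}$. Then
\begin{align*}
\|Z\|_{p,\infty}^p &= \sup_{t > 0} t^p \min\{1, t^{-p}\} = 1, \\
\mathbb{E} Z^p &= \int_0^\infty p\, t^{p-1}\, \mathbb{P}(Z > t)\, dt
  = \int_0^1 p\, t^{p-1}\, dt + \int_1^\infty \frac{p}{t}\, dt = \infty .
\end{align*}
```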



More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

The log-behavior of n p(n) and n p(n)/n

The log-behavior of n p(n) and n p(n)/n Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Equivalent Banach Operator Ideal Norms 1

Equivalent Banach Operator Ideal Norms 1 It. Joural of Math. Aalysis, Vol. 6, 2012, o. 1, 19-27 Equivalet Baach Operator Ideal Norms 1 Musudi Sammy Chuka Uiversity College P.O. Box 109-60400, Keya sammusudi@yahoo.com Shem Aywa Maside Muliro Uiversity

More information

Lecture 3: August 31

Lecture 3: August 31 36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,

More information

Entropy Rates and Asymptotic Equipartition

Entropy Rates and Asymptotic Equipartition Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,

More information

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT Itroductio to Extreme Value Theory Laures de Haa, ISM Japa, 202 Itroductio to Extreme Value Theory Laures de Haa Erasmus Uiversity Rotterdam, NL Uiversity of Lisbo, PT Itroductio to Extreme Value Theory

More information

On equivalent strictly G-convex renormings of Banach spaces

On equivalent strictly G-convex renormings of Banach spaces Cet. Eur. J. Math. 8(5) 200 87-877 DOI: 0.2478/s533-00-0050-3 Cetral Europea Joural of Mathematics O equivalet strictly G-covex reormigs of Baach spaces Research Article Nataliia V. Boyko Departmet of

More information

1 Lecture 2: Sequence, Series and power series (8/14/2012)

1 Lecture 2: Sequence, Series and power series (8/14/2012) Summer Jump-Start Program for Aalysis, 202 Sog-Yig Li Lecture 2: Sequece, Series ad power series (8/4/202). More o sequeces Example.. Let {x } ad {y } be two bouded sequeces. Show lim sup (x + y ) lim

More information

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

The value of Banach limits on a certain sequence of all rational numbers in the interval (0,1) Bao Qi Feng

The value of Banach limits on a certain sequence of all rational numbers in the interval (0,1) Bao Qi Feng The value of Baach limits o a certai sequece of all ratioal umbers i the iterval 0, Bao Qi Feg Departmet of Mathematical Scieces, Ket State Uiversity, Tuscarawas, 330 Uiversity Dr. NE, New Philadelphia,

More information

Absolute Boundedness and Absolute Convergence in Sequence Spaces* Martin Buntinas and Naza Tanović Miller

Absolute Boundedness and Absolute Convergence in Sequence Spaces* Martin Buntinas and Naza Tanović Miller Absolute Boudedess ad Absolute Covergece i Sequece Spaces* Marti Butias ad Naza Taović Miller 1 Itroductio We maily use stadard otatio as give i sectio 2 For a F K space E, various forms of sectioal boudedess

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Lecture 3 The Lebesgue Integral

Lecture 3 The Lebesgue Integral Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified

More information

OFF-DIAGONAL MULTILINEAR INTERPOLATION BETWEEN ADJOINT OPERATORS

OFF-DIAGONAL MULTILINEAR INTERPOLATION BETWEEN ADJOINT OPERATORS OFF-DIAGONAL MULTILINEAR INTERPOLATION BETWEEN ADJOINT OPERATORS LOUKAS GRAFAKOS AND RICHARD G. LYNCH 2 Abstract. We exted a theorem by Grafakos ad Tao [5] o multiliear iterpolatio betwee adjoit operators

More information

A REMARK ON A PROBLEM OF KLEE

A REMARK ON A PROBLEM OF KLEE C O L L O Q U I U M M A T H E M A T I C U M VOL. 71 1996 NO. 1 A REMARK ON A PROBLEM OF KLEE BY N. J. K A L T O N (COLUMBIA, MISSOURI) AND N. T. P E C K (URBANA, ILLINOIS) This paper treats a property

More information

Riesz-Fischer Sequences and Lower Frame Bounds

Riesz-Fischer Sequences and Lower Frame Bounds Zeitschrift für Aalysis ud ihre Aweduge Joural for Aalysis ad its Applicatios Volume 1 (00), No., 305 314 Riesz-Fischer Sequeces ad Lower Frame Bouds P. Casazza, O. Christese, S. Li ad A. Lider Abstract.

More information

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression

Outline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques

More information

for all x ; ;x R. A ifiite sequece fx ; g is said to be ND if every fiite subset X ; ;X is ND. The coditios (.) ad (.3) are equivalet for =, but these

for all x ; ;x R. A ifiite sequece fx ; g is said to be ND if every fiite subset X ; ;X is ND. The coditios (.) ad (.3) are equivalet for =, but these sub-gaussia techiques i provig some strog it theorems Λ M. Amii A. Bozorgia Departmet of Mathematics, Faculty of Scieces Sista ad Baluchesta Uiversity, Zaheda, Ira Amii@hamoo.usb.ac.ir, Fax:054446565 Departmet

More information

Solutions to Tutorial 3 (Week 4)

Solutions to Tutorial 3 (Week 4) The Uiversity of Sydey School of Mathematics ad Statistics Solutios to Tutorial Week 4 MATH2962: Real ad Complex Aalysis Advaced Semester 1, 2017 Web Page: http://www.maths.usyd.edu.au/u/ug/im/math2962/

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematics of Machie Learig Lecturer: Philippe Rigollet Lecture 4 Scribe: Cheg Mao Sep., 05 I this lecture, we cotiue to discuss the effect of oise o the rate of the excess risk E(h) = R(h) R(h

More information

Lecture 3 : Random variables and their distributions

Lecture 3 : Random variables and their distributions Lecture 3 : Radom variables ad their distributios 3.1 Radom variables Let (Ω, F) ad (S, S) be two measurable spaces. A map X : Ω S is measurable or a radom variable (deoted r.v.) if X 1 (A) {ω : X(ω) A}

More information

NYU Center for Data Science: DS-GA 1003 Machine Learning and Computational Statistics (Spring 2018)

NYU Center for Data Science: DS-GA 1003 Machine Learning and Computational Statistics (Spring 2018) NYU Ceter for Data Sciece: DS-GA 003 Machie Learig ad Computatioal Statistics (Sprig 208) Brett Berstei, David Roseberg, Be Jakubowski Jauary 20, 208 Istructios: Followig most lab ad lecture sectios, we

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

Introduction to Optimization Techniques. How to Solve Equations

Introduction to Optimization Techniques. How to Solve Equations Itroductio to Optimizatio Techiques How to Solve Equatios Iterative Methods of Optimizatio Iterative methods of optimizatio Solutio of the oliear equatios resultig form a optimizatio problem is usually

More information

2.4 Sequences, Sequences of Sets

2.4 Sequences, Sequences of Sets 72 CHAPTER 2. IMPORTANT PROPERTIES OF R 2.4 Sequeces, Sequeces of Sets 2.4.1 Sequeces Defiitio 2.4.1 (sequece Let S R. 1. A sequece i S is a fuctio f : K S where K = { N : 0 for some 0 N}. 2. For each

More information