Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate

Size: px
Start display at page:

Download "Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate"

Transcription

1 Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We itroduce two cocetratio iequalities i Lemma 4, which are used frequetly i the proofs Lemma 4 Radomized versio of Hoeffdig s iequality Suppose is a radom variable takig value o N +, ad let X,, X be idepedet radom variables Defie X = X + + X If every X i is strictly bouded by the itervals [a i, b i ], the we have with probability at least, X E X l/ i= b i a i 7 Similarly, with probability at least, E X X l/ i= b i a i 8 i= Proof he proof is quite straightforward For the radom- 9 Pr X + + X EX = Pr X E X t= Pr = t Pr = t =, t= l/ l/ i= b i a i i= b i a i = t where the first iequality follows from the determiistic Radomized versio of vector cocetratio iequality Suppose is a radom variable takig value versio of Hoeffdig s iequality It is easy to show the correctess of the radomized o N +, ad let X,, X R d be iid radom versio of vector cocetratio iequality by employig variables If φ : R d H, where H is a Hilbert the same techique he determiistic versio space edowed with orm actually we ca take ca be derived via McDiarmid s iequality McDiarmid, H to be R d edowed with ifiity orm Suppose 989 A stadard proof ca be foud i the sectio B = sup x R d φx < he we have with probability 4 of Shawe-aylor & Cristiaii, 004 For at least, completeess, we iclude the proof here o derive φx i E φx B [ + the determiistic versio, defie S = X,, X, ] S = X log/,, X to be two collectios of idepe- det samples, S = X,, X i, X i, X i+,, X, ad fs = i= φx i EφX, we have fs fs B/ By McDiarmid s iequality, we have Departmet of Computer Sciece, he Uiversity of Iowa, Iowa City, IA 54, USA Uiversity of Sciece ad echology of Chia 3 Itellifusio Correspodece to: Migrui Liu <migrui-liu@uiowaedu>, Xiaoxua Zhag <xiaoxuazhag@uiowaedu>, Zaiyi Che <czy656@mailutsceduc>, Xiaoyu Wag <faghuaxue@gmailcom>, iabao Yag <tiabao-yag@uiowaedu> ɛ Pr fs EfS > ɛ exp 4B 0 Proceedigs of the 35 th Iteratioal Coferece o Machie Learig, Stockholm, Swede, PMLR 80, 08 Copyright 08 by the authors Defie σ = σ,, σ to be Rademacher variables, ie Prσ i = = Prσ i = = /, ad σ i s are iid

2 Fast Stochastic AUC Maximizatio Defie φ S = i= φz i, the E fs = E φs Eφ S = E φ S Eφ S = E E φ S φ S E φ S φ S = E σ i φx i φx i i= E σ i φx i i= = E σi φ X i + σ i σ j φx i φx j i= i j E σi φ X i + σ i σ j φx i φx j i= i j = Eφ X i B i= Combig this result with 0, ad takig ɛ = B suffice to get the result log Proof of Lemma Proof Accordig to the equatio 6 i Yig et al, 06, we have fv, α = fw, a, b, α = p p w x a + αw x P x y = dx+ x w x b + + αw x } P x y = α x = p p w Exx y = + Exx y = w w aex y = + bex y = + a + b + + αw Ex y = Ex y = α } Whe α = w Ex y = Ex y = Ω, it is easy to see that α Ω by employig Cauchy-Schwarz iequality, ie α w Ex y = Ex y = Rκ, fv, α achieves its maximum with respect to α, so we get f v = f w, a, b, w Ex y = Ex y = = p p w Exx y = w aw Ex y = + a + w Exx y = w bw Ex y = + b + [ w Ex y = Ex y = } + w Ex y = Ex y = ] = p p [ w, a, b M w, a, b + affie fuctio of v ], where M = M + M + M 3, M = Exx y = Ex y = 0 Ex y = M = Exx y = 0 Ex y = Ex y = 0 M 3 = qq q = Ex y = Ex y = Note that M, M, M 3 are positive semidefiite matrix, ad hece M is positive semidefiite So f v is covex ad piecewise quadratic Sice Ω is a polyhedro, accordig to Corollary 3 of Li, 03, we ca kow that f v restricted o Ω satisfies the quadratic growth coditio 3 Proof of Lemma 3 Proof By applyig the iequality 9 i Lemma 4, the triagle iequality, ad the uio boud, we have with probability at least 6, Â A κ + l + κ + l +, Note that both ad + follow the Beroulli distributio, ad deote p = Pry = By applyig determiistic versio of Hoeffdig s iequality i Lemma 4 ie, iequality 8 to idicator fuctios of radom variables I [yi= ] ad I [yi=] respectively ad the uio boud, we have with probability at least 6, the followig two equatios hold simultaeously: p l l, + p

3 Fast Stochastic AUC Maximizatio Accordig to ad, by uio boud, we kow that with probability at least 3, we have  A κ + p l l + κ + p l l Note that is equivalet to l p p, l p p 3 4 Utilizig 4 ad pluggig it ito 3, we kow that with probability at least 3, κ + l  A l p l κ + l + p + κ = l l p 4κ + l, ξ where l l + ξ mi p, p 4 Proof of heorem κ + l l p l Proof Defie = log, ad a, = G 3γ + γ l 6, µ 0 = R0 a 0,, µ k = k µ 0, R k = R 0 / k, where k =,, m he we have µ k Rk = k µ 0 R0 By defiitio of m, whe 00, 0 < log log m log log log, 5 so we have m 4 log 6 o employ the result of Lemma, at the i-th stage, we eed R 0 to satisfy R +4κ = = 4 i / + 4κ, which should Ri 4 i R0 hold for ay i m So 0 4 m suffices to achieve this requiremet Now we argue that this coditio ca be implied by 00 Note that 0 = /m m o show the impli- 4 m, it suffices to prove log 3 log, ie, ad 4 m 4 log catio from 00 to 0 that whe 00, log log log log log log log, log log = log log 3, which obviously holds Accordig to Lemma, we kow that P v f v satisfy the quadratic growth coditio, which implies that there exists some c > 0, such that v v cp v P v, where v is the closest poit to v i Ω We ca assume c R0 G, ie, c G R 0 Otherwise we ca set c to be R0 G such that the quadratic growth property i Lemma still holds Whe 00, we have µ m = m µ 0 4 log R 0 G G R 0 log G 3 l3 0 log R 0 log 0 G 3 l3 log R 0 log m + G 3 l3 log R 0 log + log log = G R 3 l3 log G 0 R 0 log log +3 log 3γ + γ l 60 / l 3 0 log log

4 Fast Stochastic AUC Maximizatio where the first iequality holds because of 6, the secod iequality stems from the fact that γ, γ, 0 < <, ad the defiitio of, the third iequality holds by employig a + b ab, the fourth iequality holds because 0 m +, the fifth iequality holds because of the lower boud of m i 5, ad the last iequality holds sice 00 ad the fuctio is mootoically icreasig with respect to So G R 0 µ m Recall that c G R 0, ad thus c µ m Give v k, deote v k by the closest optimal solutio to v k We cosider two cases Case If c µ 0, the µ 0 c µ m So there exists a k such that µ k c µ k, where 0 k < m o utilize this fact, we have the followig lemma Lemma 5 Let k satisfy µ k c µ k he for ay k k, there exists a Borel set A k Ω of probability at least k, such that for ω A k, the poits v k } m k= geerated by the Algorithm satisfy v k v k R k = k+ R 0, 7 P v k P µ k R k = k µ 0 R 0 8 Moreover, for k > k there is a Borel set C k Ω of probability at least k k such that o C k, we have P v k P v k µ k Rk 9 Proof We prove 7 ad 8 by iductio Note that 7 holds for k = Assume it is true for some k > o A k Accordig to the Lemma, there exists a Borel set B k with PrB k such that P v k P R k a 0, = µ k k R 0 R k = µ k R k, which is 8 By the iductive hypothesis, v k v k R k o the set A k Defie A k = A k B k Note that PrA k PrA k + PrB k k, ad o A k, by the HEB ad the defiitio of k, we have v k v k c P v k P the RHS becomes zero ad hece we ca get a tighter boud of P v k P v k, we here relax the boud to be R k a 0,, which is, there exists a Borel set B k with PrB k such that P v k P v k R k a 0, = k k R k a 0, = k k µ k R k = µ kr k, which implies that o C k = k j=k + B j, we have P v k P v k = k j=k + k j=k + P v j P v j k j µ k R k µ k R k By uio boud, we have PrC k = Pr k j=k + B j k k Here completes the proof Now we proceed the proof as follows Note that µ 0 c µ m At the ed of k -th stage, o the Borel set A k of probability at least k, we have P v k P µ k R k he o the Borel set D m = C m A k = m j=k + B j A k with PrD m m, we have P v m P = P v m P v k + P v k P µ k Rk 4µ k c µ k R k = 4c a 0, By the defiitio of m ad, ad the fact that m log, we have m So PrD m Case If c < µ 0, the o A = B, P v P R 0 a 0, = R 0 a 0, a 0, P v k P µ k µ kr k µ k R k, = µ 0 a 0, c a 0, which is 7 for k + Now we prove 9 For k > k, it is easy to show a similar coclusio as i Lemma Remark: At k-th stage with k > k, oe ca use the similar proof of Lemma by substitutig all v to v k, the first term I o Hece o A C m, by usig Lemma 5 ad a similar argumet as i case, we have P v m P =P v m P v + P v P R 0 a 0, c a 0,,

5 where PrA C m Combiig the two cases, we have with probability at least, P v m P 4c c G 3γ + γ l 60 log 0 l = Õ 0 Fast Stochastic AUC Maximizatio Refereces Li, Guoyi Global error bouds for piecewise covex polyomials Mathematical Programmig, pp 8, 03 McDiarmid, Coli O the method of bouded differeces Surveys i combiatorics, 4:48 88, 989 Shawe-aylor, Joh ad Cristiaii, Nello Kerel methods for patter aalysis Cambridge uiversity press, 004 Yig, Yimig, We, Logyi, ad Lyu, Siwei Stochastic olie auc maximizatio I Advaces i Neural Iformatio Processig Systems, pp , 06

Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate

Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We

More information

The log-behavior of n p(n) and n p(n)/n

The log-behavior of n p(n) and n p(n)/n Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity

More information

REGRESSION WITH QUADRATIC LOSS

REGRESSION WITH QUADRATIC LOSS REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

MATH 112: HOMEWORK 6 SOLUTIONS. Problem 1: Rudin, Chapter 3, Problem s k < s k < 2 + s k+1

MATH 112: HOMEWORK 6 SOLUTIONS. Problem 1: Rudin, Chapter 3, Problem s k < s k < 2 + s k+1 MATH 2: HOMEWORK 6 SOLUTIONS CA PRO JIRADILOK Problem. If s = 2, ad Problem : Rudi, Chapter 3, Problem 3. s + = 2 + s ( =, 2, 3,... ), prove that {s } coverges, ad that s < 2 for =, 2, 3,.... Proof. The

More information

BIRKHOFF ERGODIC THEOREM

BIRKHOFF ERGODIC THEOREM BIRKHOFF ERGODIC THEOREM Abstract. We will give a proof of the poitwise ergodic theorem, which was first proved by Birkhoff. May improvemets have bee made sice Birkhoff s orgial proof. The versio we give

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learig Theory: Lecture Notes Kamalika Chaudhuri October 4, 0 Cocetratio of Averages Cocetratio of measure is very useful i showig bouds o the errors of machie-learig algorithms. We will begi with a basic

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

A Hadamard-type lower bound for symmetric diagonally dominant positive matrices

A Hadamard-type lower bound for symmetric diagonally dominant positive matrices A Hadamard-type lower boud for symmetric diagoally domiat positive matrices Christopher J. Hillar, Adre Wibisoo Uiversity of Califoria, Berkeley Jauary 7, 205 Abstract We prove a ew lower-boud form of

More information

Rademacher Complexity

Rademacher Complexity EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for

More information

1+x 1 + α+x. x = 2(α x2 ) 1+x

1+x 1 + α+x. x = 2(α x2 ) 1+x Math 2030 Homework 6 Solutios # [Problem 5] For coveiece we let α lim sup a ad β lim sup b. Without loss of geerality let us assume that α β. If α the by assumptio β < so i this case α + β. By Theorem

More information

Self-normalized deviation inequalities with application to t-statistic

Self-normalized deviation inequalities with application to t-statistic Self-ormalized deviatio iequalities with applicatio to t-statistic Xiequa Fa Ceter for Applied Mathematics, Tiaji Uiversity, 30007 Tiaji, Chia Abstract Let ξ i i 1 be a sequece of idepedet ad symmetric

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Computation of Error Bounds for P-matrix Linear Complementarity Problems

Computation of Error Bounds for P-matrix Linear Complementarity Problems Mathematical Programmig mauscript No. (will be iserted by the editor) Xiaoju Che Shuhuag Xiag Computatio of Error Bouds for P-matrix Liear Complemetarity Problems Received: date / Accepted: date Abstract

More information

1 Review and Overview

1 Review and Overview DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,

More information

1 Review and Overview

1 Review and Overview CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we

More information

Seunghee Ye Ma 8: Week 5 Oct 28

Seunghee Ye Ma 8: Week 5 Oct 28 Week 5 Summary I Sectio, we go over the Mea Value Theorem ad its applicatios. I Sectio 2, we will recap what we have covered so far this term. Topics Page Mea Value Theorem. Applicatios of the Mea Value

More information

1 Convergence in Probability and the Weak Law of Large Numbers

1 Convergence in Probability and the Weak Law of Large Numbers 36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1 Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity

More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory 1. Graph Theory Prove that there exist o simple plaar triagulatio T ad two distict adjacet vertices x, y V (T ) such that x ad y are the oly vertices of T of odd degree. Do ot use the Four-Color Theorem.

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

Glivenko-Cantelli Classes

Glivenko-Cantelli Classes CS28B/Stat24B (Sprig 2008 Statistical Learig Theory Lecture: 4 Gliveko-Catelli Classes Lecturer: Peter Bartlett Scribe: Michelle Besi Itroductio This lecture will cover Gliveko-Catelli (GC classes ad itroduce

More information

Section 11.8: Power Series

Section 11.8: Power Series Sectio 11.8: Power Series 1. Power Series I this sectio, we cosider geeralizig the cocept of a series. Recall that a series is a ifiite sum of umbers a. We ca talk about whether or ot it coverges ad i

More information

Lecture 01: the Central Limit Theorem. 1 Central Limit Theorem for i.i.d. random variables

Lecture 01: the Central Limit Theorem. 1 Central Limit Theorem for i.i.d. random variables CSCI-B609: A Theorist s Toolkit, Fall 06 Aug 3 Lecture 0: the Cetral Limit Theorem Lecturer: Yua Zhou Scribe: Yua Xie & Yua Zhou Cetral Limit Theorem for iid radom variables Let us say that we wat to aalyze

More information

APPENDIX A SMO ALGORITHM

APPENDIX A SMO ALGORITHM AENDIX A SMO ALGORITHM Sequetial Miimal Optimizatio SMO) is a simple algorithm that ca quickly solve the SVM Q problem without ay extra matrix storage ad without usig time-cosumig umerical Q optimizatio

More information

ECE534, Spring 2018: Solutions for Problem Set #2

ECE534, Spring 2018: Solutions for Problem Set #2 ECE534, Srig 08: s for roblem Set #. Rademacher Radom Variables ad Symmetrizatio a) Let X be a Rademacher radom variable, i.e., X = ±) = /. Show that E e λx e λ /. E e λx = e λ + e λ = + k= k=0 λ k k k!

More information

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences Commuicatios of the Korea Statistical Society 29, Vol. 16, No. 5, 841 849 Precise Rates i Complete Momet Covergece for Negatively Associated Sequeces Dae-Hee Ryu 1,a a Departmet of Computer Sciece, ChugWoo

More information

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number MATH 532 Itegrable Fuctios Dr. Neal, WKU We ow shall defie what it meas for a measurable fuctio to be itegrable, show that all itegral properties of simple fuctios still hold, ad the give some coditios

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

Sequences and Limits

Sequences and Limits Chapter Sequeces ad Limits Let { a } be a sequece of real or complex umbers A ecessary ad sufficiet coditio for the sequece to coverge is that for ay ɛ > 0 there exists a iteger N > 0 such that a p a q

More information

Lecture Chapter 6: Convergence of Random Sequences

Lecture Chapter 6: Convergence of Random Sequences ECE5: Aalysis of Radom Sigals Fall 6 Lecture Chapter 6: Covergece of Radom Sequeces Dr Salim El Rouayheb Scribe: Abhay Ashutosh Doel, Qibo Zhag, Peiwe Tia, Pegzhe Wag, Lu Liu Radom sequece Defiitio A ifiite

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities

On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities O Equivalece of Martigale Tail Bouds ad Determiistic Regret Iequalities Sasha Rakhli Departmet of Statistics, The Wharto School Uiversity of Pesylvaia Dec 16, 2015 Joit work with K. Sridhara arxiv:1510.03925

More information

arxiv: v1 [math.pr] 4 Dec 2013

arxiv: v1 [math.pr] 4 Dec 2013 Squared-Norm Empirical Process i Baach Space arxiv:32005v [mathpr] 4 Dec 203 Vicet Q Vu Departmet of Statistics The Ohio State Uiversity Columbus, OH vqv@statosuedu Abstract Jig Lei Departmet of Statistics

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Assignment 5: Solutions

Assignment 5: Solutions McGill Uiversity Departmet of Mathematics ad Statistics MATH 54 Aalysis, Fall 05 Assigmet 5: Solutios. Let y be a ubouded sequece of positive umbers satisfyig y + > y for all N. Let x be aother sequece

More information

The random version of Dvoretzky s theorem in l n

The random version of Dvoretzky s theorem in l n The radom versio of Dvoretzky s theorem i l Gideo Schechtma Abstract We show that with high probability a sectio of the l ball of dimesio k cε log c > 0 a uiversal costat) is ε close to a multiple of the

More information

Lecture 7: October 18, 2017

Lecture 7: October 18, 2017 Iformatio ad Codig Theory Autum 207 Lecturer: Madhur Tulsiai Lecture 7: October 8, 207 Biary hypothesis testig I this lecture, we apply the tools developed i the past few lectures to uderstad the problem

More information

Solutions to HW Assignment 1

Solutions to HW Assignment 1 Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

Lecture 3 : Random variables and their distributions

Lecture 3 : Random variables and their distributions Lecture 3 : Radom variables ad their distributios 3.1 Radom variables Let (Ω, F) ad (S, S) be two measurable spaces. A map X : Ω S is measurable or a radom variable (deoted r.v.) if X 1 (A) {ω : X(ω) A}

More information

Chapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities

Chapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities Chapter 5 Iequalities 5.1 The Markov ad Chebyshev iequalities As you have probably see o today s frot page: every perso i the upper teth percetile ears at least 1 times more tha the average salary. I other

More information

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We

More information

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018)

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018) Radomized Algorithms I, Sprig 08, Departmet of Computer Sciece, Uiversity of Helsiki Homework : Solutios Discussed Jauary 5, 08). Exercise.: Cosider the followig balls-ad-bi game. We start with oe black

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

Statistical Machine Learning II Spring 2017, Learning Theory, Lecture 7

Statistical Machine Learning II Spring 2017, Learning Theory, Lecture 7 Statistical Machie Learig II Sprig 2017, Learig Theory, Lecture 7 1 Itroductio Jea Hoorio jhoorio@purdue.edu So far we have see some techiques for provig geeralizatio for coutably fiite hypothesis classes

More information

The Boolean Ring of Intervals

The Boolean Ring of Intervals MATH 532 Lebesgue Measure Dr. Neal, WKU We ow shall apply the results obtaied about outer measure to the legth measure o the real lie. Throughout, our space X will be the set of real umbers R. Whe ecessary,

More information

Lecture 3: August 31

Lecture 3: August 31 36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,

More information

Sieve Estimators: Consistency and Rates of Convergence

Sieve Estimators: Consistency and Rates of Convergence EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes

More information

Notes 27 : Brownian motion: path properties

Notes 27 : Brownian motion: path properties Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X

More information

Research Article Nonexistence of Homoclinic Solutions for a Class of Discrete Hamiltonian Systems

Research Article Nonexistence of Homoclinic Solutions for a Class of Discrete Hamiltonian Systems Abstract ad Applied Aalysis Volume 203, Article ID 39868, 6 pages http://dx.doi.org/0.55/203/39868 Research Article Noexistece of Homocliic Solutios for a Class of Discrete Hamiltoia Systems Xiaopig Wag

More information

Math 341 Lecture #31 6.5: Power Series

Math 341 Lecture #31 6.5: Power Series Math 341 Lecture #31 6.5: Power Series We ow tur our attetio to a particular kid of series of fuctios, amely, power series, f(x = a x = a 0 + a 1 x + a 2 x 2 + where a R for all N. I terms of a series

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

TESTING FOR THE BUFFERED AUTOREGRESSIVE PROCESSES (SUPPLEMENTARY MATERIAL)

TESTING FOR THE BUFFERED AUTOREGRESSIVE PROCESSES (SUPPLEMENTARY MATERIAL) TESTING FOR THE BUFFERED AUTOREGRESSIVE PROCESSES SUPPLEMENTARY MATERIAL) By Ke Zhu, Philip L.H. Yu ad Wai Keug Li Chiese Academy of Scieces ad Uiversity of Hog Kog APPENDIX: PROOFS I this appedix, we

More information

Mi-Hwa Ko and Tae-Sung Kim

Mi-Hwa Ko and Tae-Sung Kim J. Korea Math. Soc. 42 2005), No. 5, pp. 949 957 ALMOST SURE CONVERGENCE FOR WEIGHTED SUMS OF NEGATIVELY ORTHANT DEPENDENT RANDOM VARIABLES Mi-Hwa Ko ad Tae-Sug Kim Abstract. For weighted sum of a sequece

More information

for all x ; ;x R. A ifiite sequece fx ; g is said to be ND if every fiite subset X ; ;X is ND. The coditios (.) ad (.3) are equivalet for =, but these

for all x ; ;x R. A ifiite sequece fx ; g is said to be ND if every fiite subset X ; ;X is ND. The coditios (.) ad (.3) are equivalet for =, but these sub-gaussia techiques i provig some strog it theorems Λ M. Amii A. Bozorgia Departmet of Mathematics, Faculty of Scieces Sista ad Baluchesta Uiversity, Zaheda, Ira Amii@hamoo.usb.ac.ir, Fax:054446565 Departmet

More information

Monte Carlo Integration

Monte Carlo Integration Mote Carlo Itegratio I these otes we first review basic umerical itegratio methods (usig Riema approximatio ad the trapezoidal rule) ad their limitatios for evaluatig multidimesioal itegrals. Next we itroduce

More information

1 Duality revisited. AM 221: Advanced Optimization Spring 2016

1 Duality revisited. AM 221: Advanced Optimization Spring 2016 AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R

More information

ECE 6980 An Algorithmic and Information-Theoretic Toolbox for Massive Data

ECE 6980 An Algorithmic and Information-Theoretic Toolbox for Massive Data ECE 6980 A Algorithmic ad Iformatio-Theoretic Toolbo for Massive Data Istructor: Jayadev Acharya Lecture # Scribe: Huayu Zhag 8th August, 017 1 Recap X =, ε is a accuracy parameter, ad δ is a error parameter.

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

Lecture 2: Concentration Bounds

Lecture 2: Concentration Bounds CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy

More information

Read carefully the instructions on the answer book and make sure that the particulars required are entered on each answer book.

Read carefully the instructions on the answer book and make sure that the particulars required are entered on each answer book. THE UNIVERSITY OF WARWICK FIRST YEAR EXAMINATION: Jauary 2009 Aalysis I Time Allowed:.5 hours Read carefully the istructios o the aswer book ad make sure that the particulars required are etered o each

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

1 Lecture 2: Sequence, Series and power series (8/14/2012)

1 Lecture 2: Sequence, Series and power series (8/14/2012) Summer Jump-Start Program for Aalysis, 202 Sog-Yig Li Lecture 2: Sequece, Series ad power series (8/4/202). More o sequeces Example.. Let {x } ad {y } be two bouded sequeces. Show lim sup (x + y ) lim

More information

Supplement for SADAGRAD: Strongly Adaptive Stochastic Gradient Methods"

Supplement for SADAGRAD: Strongly Adaptive Stochastic Gradient Methods Suppleme for SADAGRAD: Srogly Adapive Sochasic Gradie Mehods" Zaiyi Che * 1 Yi Xu * Ehog Che 1 iabao Yag 1. Proof of Proposiio 1 Proposiio 1. Le ɛ > 0 be fixed, H 0 γi, γ g, EF (w 1 ) F (w ) ɛ 0 ad ieraio

More information

MAT1026 Calculus II Basic Convergence Tests for Series

MAT1026 Calculus II Basic Convergence Tests for Series MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002 ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom

More information

Lecture 19. sup y 1,..., yn B d n

Lecture 19. sup y 1,..., yn B d n STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s

More information

2 Banach spaces and Hilbert spaces

2 Banach spaces and Hilbert spaces 2 Baach spaces ad Hilbert spaces Tryig to do aalysis i the ratioal umbers is difficult for example cosider the set {x Q : x 2 2}. This set is o-empty ad bouded above but does ot have a least upper boud

More information

Concavity Solutions of Second-Order Differential Equations

Concavity Solutions of Second-Order Differential Equations Proceedigs of the Paista Academy of Scieces 5 (3): 4 45 (4) Copyright Paista Academy of Scieces ISSN: 377-969 (prit), 36-448 (olie) Paista Academy of Scieces Research Article Cocavity Solutios of Secod-Order

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

Rates of Convergence for Quicksort

Rates of Convergence for Quicksort Rates of Covergece for Quicksort Ralph Neiiger School of Computer Sciece McGill Uiversity 480 Uiversity Street Motreal, HA 2K6 Caada Ludger Rüschedorf Istitut für Mathematische Stochastik Uiversität Freiburg

More information

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero?

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero? 2 Lebesgue Measure I Chapter 1 we defied the cocept of a set of measure zero, ad we have observed that every coutable set is of measure zero. Here are some atural questios: If a subset E of R cotais a

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

MATH301 Real Analysis (2008 Fall) Tutorial Note #7. k=1 f k (x) converges pointwise to S(x) on E if and

MATH301 Real Analysis (2008 Fall) Tutorial Note #7. k=1 f k (x) converges pointwise to S(x) on E if and MATH01 Real Aalysis (2008 Fall) Tutorial Note #7 Sequece ad Series of fuctio 1: Poitwise Covergece ad Uiform Covergece Part I: Poitwise Covergece Defiitio of poitwise covergece: A sequece of fuctios f

More information

Supplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting

Supplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting Supplemetary Materials for Statistical-Computatioal Phase Trasitios i Plated Models: The High-Dimesioal Settig Yudog Che The Uiversity of Califoria, Berkeley yudog.che@eecs.berkeley.edu Jiamig Xu Uiversity

More information

<, if ε > 0 2nloglogn. =, if ε < 0.

<, if ε > 0 2nloglogn. =, if ε < 0. GLASNIK MATEMATIČKI Vol. 52(72)(207), 35 360 THE DAVIS-GUT LAW FOR INDEPENDENT AND IDENTICALLY DISTRIBUTED BANACH SPACE VALUED RANDOM ELEMENTS Pigya Che, Migyag Zhag ad Adrew Rosalsky Jia Uversity, P.

More information

Lecture 11 and 12: Basic estimation theory

Lecture 11 and 12: Basic estimation theory Lecture ad 2: Basic estimatio theory Sprig 202 - EE 94 Networked estimatio ad cotrol Prof. Kha March 2 202 I. MAXIMUM-LIKELIHOOD ESTIMATORS The maximum likelihood priciple is deceptively simple. Louis

More information

Notes for Lecture 11

Notes for Lecture 11 U.C. Berkeley CS78: Computatioal Complexity Hadout N Professor Luca Trevisa 3/4/008 Notes for Lecture Eigevalues, Expasio, ad Radom Walks As usual by ow, let G = (V, E) be a udirected d-regular graph with

More information

36-755, Fall 2017 Homework 5 Solution Due Wed Nov 15 by 5:00pm in Jisu s mailbox

36-755, Fall 2017 Homework 5 Solution Due Wed Nov 15 by 5:00pm in Jisu s mailbox Poits: 00+ pts total for the assigmet 36-755, Fall 07 Homework 5 Solutio Due Wed Nov 5 by 5:00pm i Jisu s mailbox We first review some basic relatios with orms ad the sigular value decompositio o matrices

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

Linear Support Vector Machines

Linear Support Vector Machines Liear Support Vector Machies David S. Roseberg The Support Vector Machie For a liear support vector machie (SVM), we use the hypothesis space of affie fuctios F = { f(x) = w T x + b w R d, b R } ad evaluate

More information

Technical Proofs for Homogeneity Pursuit

Technical Proofs for Homogeneity Pursuit Techical Proofs for Homogeeity Pursuit bstract This is the supplemetal material for the article Homogeeity Pursuit, submitted for publicatio i Joural of the merica Statistical ssociatio. B Proofs B. Proof

More information

15.083J/6.859J Integer Optimization. Lecture 3: Methods to enhance formulations

15.083J/6.859J Integer Optimization. Lecture 3: Methods to enhance formulations 15.083J/6.859J Iteger Optimizatio Lecture 3: Methods to ehace formulatios 1 Outlie Polyhedral review Slide 1 Methods to geerate valid iequalities Methods to geerate facet defiig iequalities Polyhedral

More information

Research Article Approximate Riesz Algebra-Valued Derivations

Research Article Approximate Riesz Algebra-Valued Derivations Abstract ad Applied Aalysis Volume 2012, Article ID 240258, 5 pages doi:10.1155/2012/240258 Research Article Approximate Riesz Algebra-Valued Derivatios Faruk Polat Departmet of Mathematics, Faculty of

More information

Entropy Rates and Asymptotic Equipartition

Entropy Rates and Asymptotic Equipartition Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,

More information

Lecture 4: April 10, 2013

Lecture 4: April 10, 2013 TTIC/CMSC 1150 Mathematical Toolkit Sprig 01 Madhur Tulsiai Lecture 4: April 10, 01 Scribe: Haris Agelidakis 1 Chebyshev s Iequality recap I the previous lecture, we used Chebyshev s iequality to get a

More information

Berry-Esseen bounds for self-normalized martingales

Berry-Esseen bounds for self-normalized martingales Berry-Essee bouds for self-ormalized martigales Xiequa Fa a, Qi-Ma Shao b a Ceter for Applied Mathematics, Tiaji Uiversity, Tiaji 30007, Chia b Departmet of Statistics, The Chiese Uiversity of Hog Kog,

More information

Learnability with Rademacher Complexities

Learnability with Rademacher Complexities Learability with Rademacher Complexities Daiel Khashabi Fall 203 Last Update: September 26, 206 Itroductio Our goal i study of passive ervised learig is to fid a hypothesis h based o a set of examples

More information

Math 451: Euclidean and Non-Euclidean Geometry MWF 3pm, Gasson 204 Homework 3 Solutions

Math 451: Euclidean and Non-Euclidean Geometry MWF 3pm, Gasson 204 Homework 3 Solutions Math 451: Euclidea ad No-Euclidea Geometry MWF 3pm, Gasso 204 Homework 3 Solutios Exercises from 1.4 ad 1.5 of the otes: 4.3, 4.10, 4.12, 4.14, 4.15, 5.3, 5.4, 5.5 Exercise 4.3. Explai why Hp, q) = {x

More information