Statistical Machine Learning II Spring 2017, Learning Theory, Lecture 7


Jean Honorio
jhonorio@purdue.edu

1 Introduction

So far we have seen some techniques for proving generalization for finite hypothesis classes (e.g., union bound), as well as for infinite hypothesis classes (e.g., primal-dual witness, Rademacher complexity).

Example 1. Let's start with a very simple example: classification of one-dimensional data. Our task is to give a binary label in $\{0, 1\}$ to an input $z \in \mathbb{R}$. In this example, the hypothesis class is the set of threshold functions:

$$\mathcal{F} = \{f : \mathbb{R} \to \{0, 1\} \mid f(z) = 1[z > \theta],\ \theta \in \mathbb{R}\} \cup \{f : \mathbb{R} \to \{0, 1\} \mid f(z) = 1[z < \theta],\ \theta \in \mathbb{R}\}$$

Could we use union bounds in this case? The cardinality of $\mathcal{F}$ is equal to the number of scalars in $\mathbb{R}$ [1]. Instead of counting the number of functions in $\mathcal{F}$, we will count the possible ways in which $n$ training samples could be labeled with functions in $\mathcal{F}$.

[1] $\mathbb{R}$ is not a countable set, by Cantor's diagonalisation argument.

2 Growth function

We will assume an arbitrary domain $\mathcal{Z}$ and a dataset $S = \{z_1, \dots, z_n\}$ containing $n$ samples where $z_i \in \mathcal{Z}$ for all $i$. In general, we will assume a hypothesis class $\mathcal{F} \subseteq \{f \mid f : \mathcal{Z} \to \{0, 1\}\}$. We will use the following shorthand notation:

$$\mathcal{F}(S) = \{(f(z_1), \dots, f(z_n)) \in \{0, 1\}^n \mid f \in \mathcal{F}\}$$

That is, $\mathcal{F}(S)$ contains all the $\{0, 1\}^n$ vectors that can be produced by applying all functions in $\mathcal{F}$ to the dataset $S$. A natural measure of complexity is the following.

Definition 7.1. The growth function (or shatter coefficient) of the hypothesis class $\mathcal{F} \subseteq \{f \mid f : \mathcal{Z} \to \{0, 1\}\}$ for $n$ samples is:

$$G(\mathcal{F}, n) = \max_{S \in \mathcal{Z}^n} |\mathcal{F}(S)|$$
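To make the shorthand F(S) concrete, the following is a minimal sketch that enumerates F(S) for the threshold class of Example 1 on a given dataset. The function name `threshold_labelings` and the choice of candidate thresholds (one below all points, one between each pair of neighbouring points, one above all points) are assumptions made for illustration; they are not part of the lecture notes.

```python
def threshold_labelings(samples):
    """Return F(S) for the threshold class of Example 1 as a set of 0/1 tuples.

    Hypothetical helper: only thresholds that fall below, between, or above
    the distinct sample values are tried, since any other threshold produces
    one of the same labelings.
    """
    pts = sorted(set(samples))
    cuts = [pts[0] - 1.0]
    cuts += [(a + b) / 2.0 for a, b in zip(pts, pts[1:])]
    cuts += [pts[-1] + 1.0]

    labelings = set()
    for theta in cuts:
        # f(z) = 1[z > theta]
        labelings.add(tuple(int(z > theta) for z in samples))
        # f(z) = 1[z < theta]
        labelings.add(tuple(int(z < theta) for z in samples))
    return labelings


if __name__ == "__main__":
    S = [0.3, 1.2, 2.5]
    for vec in sorted(threshold_labelings(S)):
        print(vec)
    print("|F(S)| =", len(threshold_labelings(S)))
```

The growth function takes the maximum of this count over all datasets of size n; the sketch only evaluates it for one dataset at a time.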

Note that the growth function does not depend on the specific training set $S$, but it is a measure of the worst case among all possible training sets. Clearly $G(\mathcal{F}, n) \le 2^n$, but often it is much smaller.

Example 1 (continued). Assume we sort all samples in $S$ in increasing order, and recall that we have threshold functions. After sorting, it is therefore not possible to label three consecutive samples as $0, 1, 0$ or $1, 0, 1$. In other words, all samples to the left should be 0 and all samples to the right should be 1 (or alternatively, all samples to the left should be 1 and all samples to the right should be 0). Let's see this more graphically. Each column is one of the $n$ samples, and each row is a possible $\{0, 1\}^n$ vector in $\mathcal{F}(S)$. Clearly, $G(\mathcal{F}, n) = 2n$ for Example 1.

$$\begin{array}{c}
(1, 0, 0, 0, \dots, 0) \\
(1, 1, 0, 0, \dots, 0) \\
(1, 1, 1, 0, \dots, 0) \\
\vdots \\
(1, 1, 1, 1, \dots, 1) \\
(0, 1, 1, 1, \dots, 1) \\
(0, 0, 1, 1, \dots, 1) \\
(0, 0, 0, 1, \dots, 1) \\
\vdots \\
(0, 0, 0, 0, \dots, 0)
\end{array}$$

3 Vapnik-Chervonenkis (VC) dimension

Definition 7.2. The VC dimension of the hypothesis class $\mathcal{F} \subseteq \{f \mid f : \mathcal{Z} \to \{0, 1\}\}$ is:

$$VC(\mathcal{F}) = \max_{n \in \mathbb{N}} \{n \mid G(\mathcal{F}, n) = 2^n\}$$

Like the growth function, the VC dimension does not depend on the specific training set $S$. Also, by definition, if $VC(\mathcal{F}) = d$ then for all $n > d$ we have $G(\mathcal{F}, n) < 2^n$. (The inequality is strict.) In the following section, we show a less obvious result.

Example 1 (continued). As before, assume we sort all samples in $S$ in increasing order, and recall that we have threshold functions. Let's list all possible
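$\{0, 1\}^n$ vectors and strike out the ones that are not in the set $\mathcal{F}(S)$.

$n = 1$: $(0)$, $(1)$. Nothing is struck out, so $G(\mathcal{F}, 1) = 2$.
$n = 2$: $(0, 0)$, $(0, 1)$, $(1, 0)$, $(1, 1)$. Nothing is struck out, so $G(\mathcal{F}, 2) = 4$.
$n = 3$: $(0, 0, 0)$, $(0, 0, 1)$, $(0, 1, 1)$, $(1, 0, 0)$, $(1, 1, 0)$, $(1, 1, 1)$ remain, while $(0, 1, 0)$ and $(1, 0, 1)$ are struck out, so $G(\mathcal{F}, 3) = 6$.

Clearly, $VC(\mathcal{F}) = 2$ for Example 1. In fact, we previously found that $G(\mathcal{F}, n) = 2n$, so by Definition 7.2 we have:

$$VC(\mathcal{F}) = \max_{n \in \mathbb{N}} \{n \mid G(\mathcal{F}, n) = 2^n\} = \max_{n \in \mathbb{N}} \{n \mid 2n = 2^n\} = 2$$

4 Sauer-Shelah lemma

Lemma 7.1. The growth function and the VC dimension of a hypothesis class $\mathcal{F} \subseteq \{f \mid f : \mathcal{Z} \to \{0, 1\}\}$ fulfill:

$$G(\mathcal{F}, n) \le \sum_{i=0}^{VC(\mathcal{F})} \binom{n}{i} \le (n + 1)^{VC(\mathcal{F})}$$

Proof. The right-hand side is just a consequence of the binomial theorem, thus we will concentrate on the left-hand side. We will use proof by induction. First, define for clarity:

$$H(n, d) = \sum_{i=0}^{d} \binom{n}{i}$$

Since the binomial coefficient fulfills $\binom{n}{i} = \binom{n-1}{i} + \binom{n-1}{i-1}$, it is clear that:

$$H(n, d) = H(n - 1, d) + H(n - 1, d - 1) \qquad (1)$$

We can restate the theorem as follows: if $VC(\mathcal{F}) \le d$ then:

$$G(\mathcal{F}, n) \le H(n, d) \qquad (2)$$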
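The recurrence in eq.(1) and the right-hand bound of Lemma 7.1 can be checked numerically for small values. The sketch below is only a sanity check under the definitions above, not part of the proof; the function name `H` mirrors the notation of the notes, and the ranges of n and d are arbitrary choices.

```python
from math import comb


def H(n, d):
    """H(n, d) = sum_{i=0}^{d} C(n, i); comb(n, i) is 0 whenever i > n."""
    return sum(comb(n, i) for i in range(d + 1))


if __name__ == "__main__":
    for n in range(1, 11):
        for d in range(1, 6):
            # eq.(1): Pascal's rule summed over i = 0..d
            assert H(n, d) == H(n - 1, d) + H(n - 1, d - 1)
            # right-hand side of Lemma 7.1
            assert H(n, d) <= (n + 1) ** d
        # Example 1: growth function 2n, VC dimension 2
        assert 2 * n <= H(n, 2)
    print("H(3, 2) =", H(3, 2))
```

For Example 1 the check `2 * n <= H(n, 2)` compares the growth function found earlier with the Sauer-Shelah bound at d = 2.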

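Base case. We show that eq.(2) holds for $n = 1$ and all $d \ge 1$. Since we have only one sample, we have only two possible $\{0, 1\}^1$ vectors, $(0)$ and $(1)$, and therefore:

$$G(\mathcal{F}, 1) \le |\{(0), (1)\}| = 2$$

On the other hand:

$$H(1, d) = \sum_{i=0}^{d} \binom{1}{i} = \binom{1}{0} + \binom{1}{1} = 2$$

Thus, $G(\mathcal{F}, 1) \le H(1, d) = 2$ and eq.(2) holds for $n = 1$ and all $d \ge 1$.

Inductive step. Assume that eq.(2) holds for $n - 1$ and all $d \ge 1$, and show that it holds for $n$ and $d$. Fix a dataset $S$ and define:

$$S = \{z_1, z_2, \dots, z_n\}, \qquad S_2 = \{z_2, \dots, z_n\}, \qquad \mathcal{F}_2 = \mathcal{F}(S_2)$$

Furthermore, define:

$$\mathcal{F}_2' = \{(f(z_2), \dots, f(z_n)) \mid f \in \mathcal{F} \text{ such that } (\exists f' \in \mathcal{F})\ f'(z_1) = 1 - f(z_1) \text{ and } f'(z_i) = f(z_i) \text{ for } i = 2 \dots n\}$$

Let's see this more graphically. For $\mathcal{F}(S)$, each column is one of the $n$ samples, and each row is a possible $\{0, 1\}^n$ vector. For $\mathcal{F}_2$ and $\mathcal{F}_2'$, each column is one of the $n - 1$ samples, and each row is a possible $\{0, 1\}^{n-1}$ vector. Here $b_i \in \{0, 1\}$.

There are three cases | $\mathcal{F}(S)$ | $\mathcal{F}_2$ | $\mathcal{F}_2'$
1. Two vectors in $\mathcal{F}(S)$ match in entries 2 to $n$, but not in entry 1 | $(0, b_2, \dots, b_n)$ and $(1, b_2, \dots, b_n)$ | $(b_2, \dots, b_n)$ | $(b_2, \dots, b_n)$
2. A vector in $\mathcal{F}(S)$ is unique in entries 2 to $n$, entry 1 is 0 | $(0, b_2, \dots, b_n)$ | $(b_2, \dots, b_n)$ | (none)
3. A vector in $\mathcal{F}(S)$ is unique in entries 2 to $n$, entry 1 is 1 | $(1, b_2, \dots, b_n)$ | $(b_2, \dots, b_n)$ | (none)

Let $c_1$, $c_2$ and $c_3$ be the number of times cases 1, 2 and 3 occur for $S$, respectively.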

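The next table shows the number of binary vectors in $\mathcal{F}(S)$, $\mathcal{F}_2$ and $\mathcal{F}_2'$.

There are three cases | $\mathcal{F}(S)$ | $\mathcal{F}_2$ | $\mathcal{F}_2'$
1. Two vectors in $\mathcal{F}(S)$ match in entries 2 to $n$ | $2c_1$ | $c_1$ | $c_1$
2. A vector in $\mathcal{F}(S)$ is unique, entry 1 is 0 | $c_2$ | $c_2$ | $0$
3. A vector in $\mathcal{F}(S)$ is unique, entry 1 is 1 | $c_3$ | $c_3$ | $0$
Total number of binary vectors | $2c_1 + c_2 + c_3$ | $c_1 + c_2 + c_3$ | $c_1$

From the above, it is clear that:

$$|\mathcal{F}(S)| = |\mathcal{F}_2| + |\mathcal{F}_2'| \qquad (3)$$
$$|\mathcal{F}_2| \le |\mathcal{F}(S)|$$
$$2\,|\mathcal{F}_2'| \le |\mathcal{F}(S)|$$

Recall that Definition 7.2 (VC dimension) depends on powers of 2. Thus from the above, it is clear that if $VC(\mathcal{F}(S)) \le d$ then:

$$VC(\mathcal{F}_2) \le d, \qquad VC(\mathcal{F}_2') \le d - 1$$

Recall that the number of samples in $S$ is $n$, while the number of samples in $S_2$ is $n - 1$. From eq.(3), the above and eq.(1), we have:

$$|\mathcal{F}(S)| = |\mathcal{F}_2| + |\mathcal{F}_2'| \le H(n - 1, d) + H(n - 1, d - 1) = H(n, d)$$

Since the choice of $S$ was arbitrary, the above holds for any dataset $S$, and thus:

$$G(\mathcal{F}, n) = \max_{S \in \mathcal{Z}^n} |\mathcal{F}(S)| \le H(n, d)$$

Therefore, eq.(2) holds and we prove our claim.

5 Massart lemma and Rademacher complexity

Lemma 7.2. Let $A$ be a finite subset of $\mathbb{R}^n$. Let $\sigma = \{\sigma_1, \dots, \sigma_n\}$ be $n$ independent Rademacher random variables. We have:

$$\mathbb{E}_\sigma\left[\sup_{a \in A} \sum_{i=1}^{n} \sigma_i a_i\right] \le \sqrt{2 \log |A|}\ \sup_{a \in A} \|a\|_2$$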

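Proof. For any $t > 0$ we have:

$$\begin{aligned}
\exp\left(t\, \mathbb{E}_\sigma\left[\sup_{a \in A} \sum_{i=1}^{n} \sigma_i a_i\right]\right)
&\le \mathbb{E}_\sigma\left[\exp\left(t \sup_{a \in A} \sum_{i=1}^{n} \sigma_i a_i\right)\right] && (4.a) \\
&= \mathbb{E}_\sigma\left[\sup_{a \in A} \exp\left(t \sum_{i=1}^{n} \sigma_i a_i\right)\right] \\
&\le \sum_{a \in A} \mathbb{E}_\sigma\left[\exp\left(t \sum_{i=1}^{n} \sigma_i a_i\right)\right] \\
&= \sum_{a \in A} \mathbb{E}_\sigma\left[\prod_{i=1}^{n} \exp(t \sigma_i a_i)\right] \\
&= \sum_{a \in A} \prod_{i=1}^{n} \mathbb{E}_{\sigma_i}\left[\exp(t \sigma_i a_i)\right] \\
&= \sum_{a \in A} \prod_{i=1}^{n} \frac{\exp(t a_i) + \exp(-t a_i)}{2} \\
&\le \sum_{a \in A} \prod_{i=1}^{n} \exp\left(\tfrac{1}{2} t^2 a_i^2\right) && (4.b) \\
&= \sum_{a \in A} \exp\left(\tfrac{1}{2} t^2 \|a\|_2^2\right) \\
&\le |A| \exp\left(\tfrac{1}{2} t^2 \sup_{a \in A} \|a\|_2^2\right)
\end{aligned}$$

where the step in eq.(4.a) follows from Jensen's inequality. The step in eq.(4.b) follows since for all $z \in \mathbb{R}$ we have that $(e^z + e^{-z})/2 \le e^{z^2/2}$. By taking logarithms and dividing by $t$ on both sides of the above, we have:

$$\mathbb{E}_\sigma\left[\sup_{a \in A} \sum_{i=1}^{n} \sigma_i a_i\right] \le \frac{\log |A|}{t} + \frac{t}{2} \sup_{a \in A} \|a\|_2^2$$

In order to minimize the function $f(t) = \frac{\log |A|}{t} + \frac{t}{2} \sup_{a \in A} \|a\|_2^2$, we make the derivative equal to zero and solve for $t$. That is:

$$0 = \frac{\partial f(t)}{\partial t} = -\frac{\log |A|}{t^2} + \frac{1}{2} \sup_{a \in A} \|a\|_2^2$$

Thus, $t = \sqrt{2 \log |A|} \,/\, \sup_{a \in A} \|a\|_2$. Plugging this back in the above, we prove our claim.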
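For a small finite set A, the expectation on the left-hand side of Lemma 7.2 can be computed exactly by enumerating all 2^n sign vectors, which gives a direct numerical check of the bound. The following is a minimal sketch; the function names `massart_lhs` and `massart_rhs` and the example set are assumptions chosen for illustration (the set is F(S) for the threshold class of Example 1 with n = 3).

```python
from itertools import product
from math import log, sqrt


def massart_lhs(A):
    """Exact E_sigma[ max_{a in A} sum_i sigma_i a_i ] over all 2^n sign vectors."""
    n = len(A[0])
    total = 0.0
    for sigma in product((-1, 1), repeat=n):
        total += max(sum(s * x for s, x in zip(sigma, a)) for a in A)
    return total / 2 ** n


def massart_rhs(A):
    """sqrt(2 log|A|) * max_{a in A} ||a||_2."""
    largest_norm = max(sqrt(sum(x * x for x in a)) for a in A)
    return sqrt(2.0 * log(len(A))) * largest_norm


if __name__ == "__main__":
    A = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
    print("left-hand side :", massart_lhs(A))
    print("right-hand side:", massart_rhs(A))
```

The enumeration costs 2^n evaluations, so this check is only practical for small n.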

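Lemma 7.3. Let $\mathcal{F} \subseteq \{f \mid f : \mathcal{Z} \to \{0, 1\}\}$ be a hypothesis class. The empirical Rademacher complexity (Definition 5.2) of the hypothesis class $\mathcal{F}$ with respect to $n$ samples is bounded as follows:

$$\hat{R}_S(\mathcal{F}) \le \sqrt{\frac{2 \log G(\mathcal{F}, n)}{n}}$$

Proof.

$$\begin{aligned}
\hat{R}_S(\mathcal{F}) &= \mathbb{E}_\sigma\left[\sup_{h \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i h(z_i)\right] && (5.a) \\
&= \frac{1}{n} \mathbb{E}_\sigma\left[\sup_{a \in \mathcal{F}(S)} \sum_{i=1}^{n} \sigma_i a_i\right] \\
&\le \frac{1}{n} \sqrt{2 \log |\mathcal{F}(S)|}\ \sup_{a \in \mathcal{F}(S)} \|a\|_2 && (5.b) \\
&\le \frac{1}{n} \sqrt{2 \log G(\mathcal{F}, n)}\ \sqrt{n} && (5.c) \\
&= \sqrt{\frac{2 \log G(\mathcal{F}, n)}{n}}
\end{aligned}$$

where the step in eq.(5.a) follows from Definition 5.2. The step in eq.(5.b) follows from the Massart lemma (Lemma 7.2). The step in eq.(5.c) follows since $(\forall S \in \mathcal{Z}^n)\ |\mathcal{F}(S)| \le G(\mathcal{F}, n)$, and since for all $a \in \{0, 1\}^n$ we have that $\|a\|_2 \le \sqrt{n}$.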
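Combining Lemma 7.3 with Lemma 7.1 gives a bound that depends only on the VC dimension d, namely sqrt(2 d log(n + 1) / n). The sketch below evaluates both forms for the threshold class of Example 1, where G(F, n) = 2n and d = 2. The function names are hypothetical and the sample sizes are arbitrary.

```python
from math import log, sqrt


def rademacher_bound_from_growth(growth, n):
    """Lemma 7.3: sqrt(2 log G(F, n) / n) for a given growth-function value."""
    return sqrt(2.0 * log(growth) / n)


def rademacher_bound_from_vc(d, n):
    """Lemma 7.3 with G(F, n) <= (n + 1)^d substituted from Lemma 7.1."""
    return sqrt(2.0 * d * log(n + 1) / n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        tight = rademacher_bound_from_growth(2 * n, n)  # Example 1
        loose = rademacher_bound_from_vc(2, n)
        print(f"n={n:6d}  growth-based={tight:.4f}  VC-based={loose:.4f}")
```

Both bounds shrink as n grows, and the growth-based value is never larger than the VC-based one because 2n <= (n + 1)^2.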

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematics of Machie Learig Lecturer: Philippe Rigollet Lecture 4 Scribe: Cheg Mao Sep., 05 I this lecture, we cotiue to discuss the effect of oise o the rate of the excess risk E(h) = R(h) R(h

More information

Empirical Process Theory and Oracle Inequalities

Empirical Process Theory and Oracle Inequalities Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi

More information

Rademacher Complexity

Rademacher Complexity EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 3

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 3 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture 3 Tolstikhi Ilya Abstract I this lecture we will prove the VC-boud, which provides a high-probability excess risk boud for the ERM algorithm whe

More information

2.1. The Algebraic and Order Properties of R Definition. A binary operation on a set F is a function B : F F! F.

2.1. The Algebraic and Order Properties of R Definition. A binary operation on a set F is a function B : F F! F. CHAPTER 2 The Real Numbers 2.. The Algebraic ad Order Properties of R Defiitio. A biary operatio o a set F is a fuctio B : F F! F. For the biary operatios of + ad, we replace B(a, b) by a + b ad a b, respectively.

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learig Theory: Lecture Notes Kamalika Chaudhuri October 4, 0 Cocetratio of Averages Cocetratio of measure is very useful i showig bouds o the errors of machie-learig algorithms. We will begi with a basic

More information

1 Review and Overview

1 Review and Overview CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we

More information

The Boolean Ring of Intervals

The Boolean Ring of Intervals MATH 532 Lebesgue Measure Dr. Neal, WKU We ow shall apply the results obtaied about outer measure to the legth measure o the real lie. Throughout, our space X will be the set of real umbers R. Whe ecessary,

More information

Lecture 1. January 8, 2018

Lecture 1. January 8, 2018 Lecture 1 Jauary 8, 018 1 Primes A prime umber p is a positive iteger which caot be writte as ab for some positive itegers a, b > 1. A prime p also have the property that if p ab, the p a or p b. This

More information

Introduction to Probability. Ariel Yadin. Lecture 7

Introduction to Probability. Ariel Yadin. Lecture 7 Itroductio to Probability Ariel Yadi Lecture 7 1. Idepedece Revisited 1.1. Some remiders. Let (Ω, F, P) be a probability space. Give a collectio of subsets K F, recall that the σ-algebra geerated by K,

More information

It is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial.

It is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial. Taylor Polyomials ad Taylor Series It is ofte useful to approximate complicated fuctios usig simpler oes We cosider the task of approximatig a fuctio by a polyomial If f is at least -times differetiable

More information

Homework 9. (n + 1)! = 1 1

Homework 9. (n + 1)! = 1 1 . Chapter : Questio 8 If N, the Homewor 9 Proof. We will prove this by usig iductio o. 2! + 2 3! + 3 4! + + +! +!. Base step: Whe the left had side is. Whe the right had side is 2! 2 +! 2 which proves

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

1 Review and Overview

1 Review and Overview DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

1 Introduction. 1.1 Notation and Terminology

1 Introduction. 1.1 Notation and Terminology 1 Itroductio You have already leared some cocepts of calculus such as limit of a sequece, limit, cotiuity, derivative, ad itegral of a fuctio etc. Real Aalysis studies them more rigorously usig a laguage

More information

Agnostic Learning and Concentration Inequalities

Agnostic Learning and Concentration Inequalities ECE901 Sprig 2004 Statistical Regularizatio ad Learig Theory Lecture: 7 Agostic Learig ad Cocetratio Iequalities Lecturer: Rob Nowak Scribe: Aravid Kailas 1 Itroductio 1.1 Motivatio I the last lecture

More information

The multiplicative structure of finite field and a construction of LRC

The multiplicative structure of finite field and a construction of LRC IERG6120 Codig for Distributed Storage Systems Lecture 8-06/10/2016 The multiplicative structure of fiite field ad a costructio of LRC Lecturer: Keeth Shum Scribe: Zhouyi Hu Notatios: We use the otatio

More information

Math 104: Homework 2 solutions

Math 104: Homework 2 solutions Math 04: Homework solutios. A (0, ): Sice this is a ope iterval, the miimum is udefied, ad sice the set is ot bouded above, the maximum is also udefied. if A 0 ad sup A. B { m + : m, N}: This set does

More information

Limit superior and limit inferior c Prof. Philip Pennance 1 -Draft: April 17, 2017

Limit superior and limit inferior c Prof. Philip Pennance 1 -Draft: April 17, 2017 Limit erior ad limit iferior c Prof. Philip Peace -Draft: April 7, 207. Defiitio. The limit erior of a sequece a is the exteded real umber defied by lim a = lim a k k Similarly, the limit iferior of a

More information

} is said to be a Cauchy sequence provided the following condition is true.

} is said to be a Cauchy sequence provided the following condition is true. Math 4200, Fial Exam Review I. Itroductio to Proofs 1. Prove the Pythagorea theorem. 2. Show that 43 is a irratioal umber. II. Itroductio to Logic 1. Costruct a truth table for the statemet ( p ad ~ r

More information

Modern Algebra 1 Section 1 Assignment 1. Solution: We have to show that if you knock down any one domino, then it knocks down the one behind it.

Modern Algebra 1 Section 1 Assignment 1. Solution: We have to show that if you knock down any one domino, then it knocks down the one behind it. Moder Algebra 1 Sectio 1 Assigmet 1 JOHN PERRY Eercise 1 (pg 11 Warm-up c) Suppose we have a ifiite row of domioes, set up o ed What sort of iductio argumet would covice us that ocig dow the first domio

More information

Maximum Likelihood Estimation and Complexity Regularization

Maximum Likelihood Estimation and Complexity Regularization ECE90 Sprig 004 Statistical Regularizatio ad Learig Theory Lecture: 4 Maximum Likelihood Estimatio ad Complexity Regularizatio Lecturer: Rob Nowak Scribe: Pam Limpiti Review : Maximum Likelihood Estimatio

More information

REGRESSION WITH QUADRATIC LOSS

REGRESSION WITH QUADRATIC LOSS REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d

More information

Learnability with Rademacher Complexities

Learnability with Rademacher Complexities Learability with Rademacher Complexities Daiel Khashabi Fall 203 Last Update: September 26, 206 Itroductio Our goal i study of passive ervised learig is to fid a hypothesis h based o a set of examples

More information

3 Gauss map and continued fractions

3 Gauss map and continued fractions ICTP, Trieste, July 08 Gauss map ad cotiued fractios I this lecture we will itroduce the Gauss map, which is very importat for its coectio with cotiued fractios i umber theory. The Gauss map G : [0, ]

More information

Bertrand s Postulate

Bertrand s Postulate Bertrad s Postulate Lola Thompso Ross Program July 3, 2009 Lola Thompso (Ross Program Bertrad s Postulate July 3, 2009 1 / 33 Bertrad s Postulate I ve said it oce ad I ll say it agai: There s always a

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Topics. Homework Problems. MATH 301 Introduction to Analysis Chapter Four Sequences. 1. Definition of convergence of sequences.

Topics. Homework Problems. MATH 301 Introduction to Analysis Chapter Four Sequences. 1. Definition of convergence of sequences. MATH 301 Itroductio to Aalysis Chapter Four Sequeces Topics 1. Defiitio of covergece of sequeces. 2. Fidig ad provig the limit of sequeces. 3. Bouded covergece theorem: Theorem 4.1.8. 4. Theorems 4.1.13

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

Measure and Measurable Functions

Measure and Measurable Functions 3 Measure ad Measurable Fuctios 3.1 Measure o a Arbitrary σ-algebra Recall from Chapter 2 that the set M of all Lebesgue measurable sets has the followig properties: R M, E M implies E c M, E M for N implies

More information

HOMEWORK #4 - MA 504

HOMEWORK #4 - MA 504 HOMEWORK #4 - MA 504 PAULINHO TCHATCHATCHA Chapter 2, problem 19. (a) If A ad B are disjoit closed sets i some metric space X, prove that they are separated. (b) Prove the same for disjoit ope set. (c)

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 2 : Learig Frameworks, Examples Settig up learig problems. X : istace space or iput space Examples: Computer Visio: Raw M N image vectorized X = 0, 255 M N, SIFT

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 3 : Olie Learig, miimax value, sequetial Rademacher complexity Recap: Miimax Theorem We shall use the celebrated miimax theorem as a key tool to boud the miimax rate

More information

Glivenko-Cantelli Classes

Glivenko-Cantelli Classes CS28B/Stat24B (Sprig 2008 Statistical Learig Theory Lecture: 4 Gliveko-Catelli Classes Lecturer: Peter Bartlett Scribe: Michelle Besi Itroductio This lecture will cover Gliveko-Catelli (GC classes ad itroduce

More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

Sieve Estimators: Consistency and Rates of Convergence

Sieve Estimators: Consistency and Rates of Convergence EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes

More information

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where

More information

Section 11.8: Power Series

Section 11.8: Power Series Sectio 11.8: Power Series 1. Power Series I this sectio, we cosider geeralizig the cocept of a series. Recall that a series is a ifiite sum of umbers a. We ca talk about whether or ot it coverges ad i

More information

Math 25 Solutions to practice problems

Math 25 Solutions to practice problems Math 5: Advaced Calculus UC Davis, Sprig 0 Math 5 Solutios to practice problems Questio For = 0,,, 3,... ad 0 k defie umbers C k C k =! k!( k)! (for k = 0 ad k = we defie C 0 = C = ). by = ( )... ( k +

More information

Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate

Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We

More information

Lecture 10 October Minimaxity and least favorable prior sequences

Lecture 10 October Minimaxity and least favorable prior sequences STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least

More information

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero?

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero? 2 Lebesgue Measure I Chapter 1 we defied the cocept of a set of measure zero, ad we have observed that every coutable set is of measure zero. Here are some atural questios: If a subset E of R cotais a

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

ON THE HAUSDORFF DIMENSION OF A FAMILY OF SELF-SIMILAR SETS WITH COMPLICATED OVERLAPS. 1. Introduction and Statements

ON THE HAUSDORFF DIMENSION OF A FAMILY OF SELF-SIMILAR SETS WITH COMPLICATED OVERLAPS. 1. Introduction and Statements ON THE HAUSDORFF DIMENSION OF A FAMILY OF SELF-SIMILAR SETS WITH COMPLICATED OVERLAPS Abstract. We ivestigate the properties of the Hausdorff dimesio of the attractor of the iterated fuctio system (IFS)

More information

and each factor on the right is clearly greater than 1. which is a contradiction, so n must be prime.

and each factor on the right is clearly greater than 1. which is a contradiction, so n must be prime. MATH 324 Summer 200 Elemetary Number Theory Solutios to Assigmet 2 Due: Wedesday July 2, 200 Questio [p 74 #6] Show that o iteger of the form 3 + is a prime, other tha 2 = 3 + Solutio: If 3 + is a prime,

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information

Lecture 3 : Random variables and their distributions

Lecture 3 : Random variables and their distributions Lecture 3 : Radom variables ad their distributios 3.1 Radom variables Let (Ω, F) ad (S, S) be two measurable spaces. A map X : Ω S is measurable or a radom variable (deoted r.v.) if X 1 (A) {ω : X(ω) A}

More information

Learning Bounds for Support Vector Machines with Learned Kernels

Learning Bounds for Support Vector Machines with Learned Kernels Learig Bouds for Support Vector Machies with Leared Kerels Nati Srebro TTI-Chicago Shai Be-David Uiversity of Waterloo Mostly based o a paper preseted at COLT 06 Kerelized Large-Margi Liear Classificatio

More information

Relations Among Algebras

Relations Among Algebras Itroductio to leee Algebra Lecture 6 CS786 Sprig 2004 February 9, 2004 Relatios Amog Algebras The otio of free algebra described i the previous lecture is a example of a more geeral pheomeo called adjuctio.

More information

Review Problems 1. ICME and MS&E Refresher Course September 19, 2011 B = C = AB = A = A 2 = A 3... C 2 = C 3 = =

Review Problems 1. ICME and MS&E Refresher Course September 19, 2011 B = C = AB = A = A 2 = A 3... C 2 = C 3 = = Review Problems ICME ad MS&E Refresher Course September 9, 0 Warm-up problems. For the followig matrices A = 0 B = C = AB = 0 fid all powers A,A 3,(which is A times A),... ad B,B 3,... ad C,C 3,... Solutio:

More information

2.4 Sequences, Sequences of Sets

2.4 Sequences, Sequences of Sets 72 CHAPTER 2. IMPORTANT PROPERTIES OF R 2.4 Sequeces, Sequeces of Sets 2.4.1 Sequeces Defiitio 2.4.1 (sequece Let S R. 1. A sequece i S is a fuctio f : K S where K = { N : 0 for some 0 N}. 2. For each

More information

Homework 1 Solutions. The exercises are from Foundations of Mathematical Analysis by Richard Johnsonbaugh and W.E. Pfaffenberger.

Homework 1 Solutions. The exercises are from Foundations of Mathematical Analysis by Richard Johnsonbaugh and W.E. Pfaffenberger. Homewor 1 Solutios Math 171, Sprig 2010 Hery Adams The exercises are from Foudatios of Mathematical Aalysis by Richard Johsobaugh ad W.E. Pfaffeberger. 2.2. Let h : X Y, g : Y Z, ad f : Z W. Prove that

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

Math 61CM - Solutions to homework 3

Math 61CM - Solutions to homework 3 Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig

More information

FUNDAMENTALS OF REAL ANALYSIS by. V.1. Product measures

FUNDAMENTALS OF REAL ANALYSIS by. V.1. Product measures FUNDAMENTALS OF REAL ANALSIS by Doğa Çömez V. PRODUCT MEASURE SPACES V.1. Product measures Let (, A, µ) ad (, B, ν) be two measure spaces. I this sectio we will costruct a product measure µ ν o that coicides

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Square-Congruence Modulo n

Square-Congruence Modulo n Square-Cogruece Modulo Abstract This paper is a ivestigatio of a equivalece relatio o the itegers that was itroduced as a exercise i our Discrete Math class. Part I - Itro Defiitio Two itegers are Square-Cogruet

More information

1 Convergence in Probability and the Weak Law of Large Numbers

1 Convergence in Probability and the Weak Law of Large Numbers 36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec

More information

MATH 112: HOMEWORK 6 SOLUTIONS. Problem 1: Rudin, Chapter 3, Problem s k < s k < 2 + s k+1

MATH 112: HOMEWORK 6 SOLUTIONS. Problem 1: Rudin, Chapter 3, Problem s k < s k < 2 + s k+1 MATH 2: HOMEWORK 6 SOLUTIONS CA PRO JIRADILOK Problem. If s = 2, ad Problem : Rudi, Chapter 3, Problem 3. s + = 2 + s ( =, 2, 3,... ), prove that {s } coverges, ad that s < 2 for =, 2, 3,.... Proof. The

More information

An analog of the arithmetic triangle obtained by replacing the products by the least common multiples

An analog of the arithmetic triangle obtained by replacing the products by the least common multiples arxiv:10021383v2 [mathnt] 9 Feb 2010 A aalog of the arithmetic triagle obtaied by replacig the products by the least commo multiples Bair FARHI bairfarhi@gmailcom MSC: 11A05 Keywords: Al-Karaji s triagle;

More information

The log-behavior of n p(n) and n p(n)/n

The log-behavior of n p(n) and n p(n)/n Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information
