Sieve Estimators: Consistency and Rates of Convergence

Size: px
Start display at page:

Download "Sieve Estimators: Consistency and Rates of Convergence"

Transcription

1 EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for formal publicatios. They may be distributed outside this class oly with the permissio of the Istructor. 1 Itroductio We ca decompose the excess risk of a discrimiatio ruleas follows: Rĥ R = Rĥ RH + }{{} RH R }{{} estimatio error approximatio error The first term is the estimatio error ad measures the performace of the discrimiatio rule with respect to the best hypothesis i H. I previous lectures, we studied performace guaratees for this quatity whe ĥ is ERM. Now, we also cosider the approximatio error, which captures how well a class of hypotheses {H k } approximates the Bayes decisio boudary. For example, cosider the histogram classifiers i Figure 1 ad Figure 2, respectively. As the grid becomes more fie-graied, the class of hypotheses approximates the Bayes decisio boudary with icreasig accuracy. The examiatio of the approximatio error will lead to the desig of sieve estimators that perform ERM over H k where k = k grows with at a appropriate rate such that both the approximatio error ad the estimatio error coverge to 0. Note that while the estimatio error is radom because it depeds o the sample, the approximatio error is ot radom. 2 Approximatio Error The followig assumptio will be adopted to establish uiversal cosistecy. Defiitio 1. A sequece of sets of classifiers H 1, H 2,... is said to have the uiversal approximatio property UAP if for all P XY, as k. R H k R Observe that RH k is ot radom; it does ot deped o the traiig data. Also, it depeds o P XY because it is a expectatio. The followig theorem gives a sufficiet coditio for a sequece of classifiers to have the UAP. Theorem 1. Suppose H k cosists of classifiers that are piecewise costat o a paritito of X ito cells A k,1, A k,2,.... If sup j 1 diama k,j 0 as k, the {H k } has the UAP. Proof. See the proof of Theorem 6.1 i [1]. Example. Suppose X = [0, 1] d. Cosider H k = { histogram classifiers based o cells of sidelegth 1 k }. The, diama k,j = d k for j. I particular, sup diama k,j = j 1 as k. By the above theorem, {H k } has the UAP. 0 d k 1

2 2 Figure 1: Histogram classifier based o 6 6 grid Figure 2: Histogram classifier based o grid 3 Covergece of Radom Variables This sectio offers a brief review of covergece i probability ad almost surely, ad gives some useful results for provig both kids of covergece. Defiitio 2. Let Z 1, Z 2,... be a sequece of radom variables i R. We say that Z coverges to Z i i.p. probability, ad write Z Z, if ɛ > 0, lim Pr Z Z ɛ = 0. We say Z coverges to Z almost surely with probability oe, ad write Z a.s. Z, if Pr{ω : lim Z ω = Zω} = 1. Example. I aalysis of discrimiatio rules, we cosider Ω = {ω = X 1, Y 1, X 2, Y 2,... X Y } where X i, Y i i.i.d. P XY. To study covergece of the risk to the Bayes risk, we take Z = Rĥ where ĥ is a classifier based o the first etries of ω, amely X 1, Y 1,..., X, Y, ad Z = R. Now, we cosider some useful lemmas for provig covergece i probability ad almost sure covergece. We begi with useful results for covergece i probability. a.s. i.p. Lemma 1. If Z Z, the Z Z. Proof. This was proved i EECS 501. Lemma 2. If E{ Z Z } 0 as, the Z Proof. By Markov s iequality i.p. Z. Pr Z Z ɛ E{ Z Z } ɛ 0. Now, we tur our attetio to almost sure covergece. We begi with the Borel-Catelli Lemma, which yields a useful corollary.

3 3 Lemma 3 Borel-Catelli Lemma. Cosider a probability space Ω, A, P. Let {A } 1 be a sequece of evets. If P A < the where 1 P lim sup A = 0 lim sup A := {ω Ω : N N such that ω A } = {ω that occur ifiitely ofte i the sequece of evets} = A. N 1 N Proof. Observe that P lim sup A = P lim N N A. Let B N = N A. Observe that B 1 B 2 B 3, i.e., {B N } is a decreasig sequece of evets. The, by cotiuity of P, we have P lim B N = lim P B N N N = lim P A N lim = 0 N N N P A where the iequality i the third lie follows from uio boud ad the last equality follows from the hypothesis that 1 P A <. The followig corollary of the Borel-Catelli lemma is very useful for showig almost sure covergece. Corollary 1. If for all ɛ > 0, we have the Z a.s. Z. Pr Z Z ɛ < 1 Proof. Defie the evet A ɛ = {ω Ω : N N such that Z w Zw ɛ}. Let A ɛ = {ω Ω : Z ω Zω ɛ}. Observe that lim sup A ɛ = A ɛ. By the hypothesis, we may apply the Borel-Catelli lemma to obtai PrA ɛ = 0. Now defie A = {ω Ω : Z ω Z} = {ω Ω : ɛ > 0 N N such that Z ω Zω ɛ}. I words, this meas: for some ɛ, o matter how far alog you go i the sequece, you ca fid some such that Z ω Zω ɛ. Now, let {ɛ j } be a strictly decreasig sequece covergig to 0. Observe that A j=1 Aɛj. To see this, take some ω A. The, for some ɛ > 0 N N such that Z ω Zω ɛ. As j, evetually there is some j such that ɛ j ɛ, so that ω A ɛ j. Therefore, PrA Pr A ɛ j This cocludes the proof. j=1 PrA ɛj j=1 = 0.

4 4 4 Sieve Estimators Let {H k } be a sequece of sets of classifiers with the uiform approximatio property UAP. Assume that we have a uiform deviatio boud for H k e.g., if H k is fiite or a VC class. We deote by ĥ,k the classifier we obtai from empirical risk miimizatio ERM over H k : 1 ĥ,k = arg mi h H k 1 {hxi Y i}. Let k be a iteger-valued sequece ad defie ĥ = ĥ,k. Sice we have the UAP, it should be clear that the approximatio error goes to 0 as log as k with. We wish to restrict the rate of growth of k appropriately so that the estimatio error also goes to zero, i probability or almost surely. This is called a sieve estimator, whose ame is geerally attributed to Greader [3]. Formally, we eed that Rĥ RH k 0, i.p/a.s. Let s apply some previously developed theory to this problem. Cosider the case where H k <. We have see that Pr Rĥ RH k ɛ 2 H k e ɛ2 /2. 1 We eed to make sure that H k does ot domiate the expoetial term, so that the sum i Corollary 1 coverges. Rewrite the right-had side of Eq. 1 as Hk e ɛ 2 /2 = e l H k ɛ 2 /2. i Thus it suffices to have l H k 0, as. Aother way to write this is usig little-oh otatio. We say that a sequece a = ob if a b. So, the above is equivalet to l Hk = o. Uder this coditio, 0 as The for all ɛ > 0, 1 ɛ > 0, N ɛ, N ɛ, l Hk ɛ 2 Pr Rĥ RH k ɛ N ɛ + 2e ɛ 2 /4 <. N ɛ 2 ɛ The summatio o the right is simply a covergig geometric series. By Corollary 1 we have established Theorem 2. Let {H k } satisfy the UAP with Hk <. Let k be a iteger sequece such that as, k ad l Hk = o. The R ĥ R almost surely, i.e., ĥ is strogly uiversally cosistet. Corollary 2. Let {H k } = {the set of histogram classifiers with side legth 1/k}, X = [0, 1] d. Let k such that k d = o. The ĥ is strogly uiversally cosistet. Proof. We previous saw i Sectio 2 that histograms have the UAP. Also, Hk = 2 k d, so the corollary follows from Theorem 2. I Corollary 2, we require kd 0. Note that /kd is the umber of samples per cell. Therefore the umber of samples per cell must go to ifiity; this seems like a reasoable coditio for strog cosistecy.

5 5 Now, let s cosider the case where the VC dimesio of H k is fiite. From our discussio of VC theory, we have that for all ɛ > 0, Pr Rĥ RH k ɛ 8S Hk e ɛ2 /128. We also kow from Sauer s lemma e V S H, V, V where V is the VC dimesio of H. Usig this, we deduce Pr Rĥ RH k ɛ 8S Hk e ɛ2 /128 C V k e ɛ2 /128 = C exp V k l ɛ2 128 C is simply a costat that does ot deped o. If we choose k such that V k = o l, the we obtai the followig iequality, which is similar to the oe i Eq. 2:. Therefore We have established 1 ɛ > 0, N, N, V k l ɛ2 128 ɛ Pr Rĥ RH k ɛ N + Ce ɛ2 /256 <. N Theorem 3. Let {H k } satisfy the UAP, ad let V k < deote the VC dimesio of H k. Let k be a iteger-valued sequece such that as, k ad V k = o l. The Rĥ R almost surely, i.e., ĥ is strogly uiversally cosistet. 5 Rates of Covergece Oe could ask ] whether there is a uiversal rate of covergece. Put more formally, is there a ĥ such that E [Rĥ R coverges to 0 at a fixed rate, for ay joit distributio P XY? It turs out that the aswer is o. A theorem from [2] makes this cocrete. Theorem 4. Let {a } with a 0 such that 1 16 a 1 a For ay ĥ, there exists a joit distributio P XY such that R = 0 ad ERĥ a. Proof. See Chapter 7 of [1]. Sice the sequece {a } is arbitrary, we ca also choose how slowly it coverges to 0, thus showig that there will be o uiversal rate of covergece. I order to establish rates of covergece, we will therefore have to place restrictios o P XY. Some possible reasoable restrictios o P XY iclude: P X Y =y is a cotiuous radom variable. ηx is smooth. there exists a t 0 such that P X ηx 1 2 t 0 = 0. the Bayes decisio boudary is smooth. Below we cosider oe particular distributioal assumptio uder which we ca establish a rate of covergece.

6 6 Figure 3: Illustratio of Bayes decisio boudary BDB. 6 Box-Coutig Class Let X = [0, 1] d. For m 1, defie P m to be the partitio of X ito cubes of side legth 1/m. Note that P m = m d. Defiitio 3. The box coutig class B is the set of all P XY such that A P X has a bouded desity. B There exists a costat C such that m 1, the Bayes decisio boudary BDB {x : ηx = 1/2} passes through at most Cm d 1 elemets of P m see Fig. 3. This essetially says that the Bayes decisio boudary BDB has dimesio d 1. Look up the term box-coutig dimesio for more. We have the followig rate of covergece result. Theorem 5. Let H k ={histogram classifiers based o P k }, assume P XY B, ad let ĥ be a sieve estimator with ad k 1 d+2. The [ ] E Rĥ R = O 1 d+2. Proof. Recall the decompositio ito estimatio ad approximatio errors: [ E Rĥ R ] ] = E [Rĥ R Hk + RH k R. To boud the estimatio error, deote := Rĥ R H k. By the law of total expectatio, Choosig ɛ = [ ] E = P r [ ɛ E }{{} ] ɛ + P r [ < ɛ E }{{}}{{} ] < ɛ. }{{} δ 1 1 trivial sice R 1 ɛ 2[k d l 2+l2/δ] [ ] ad δ = 1/, we have E Rĥ RH k k = O d. To boud the approximatio error, let h = Bayes classifier ad h k = best classifier i H k. Observe [ Rh k Rh = 2E X ηx 1 ] 2 1 {h k X h X} E X [1 {h k X h X}] by ηx = P X h kx h X. 3

7 7 Let := {x : h k x h x}, ad let f be the desity of P X. The P X = fxdx f λ, where λ deotes volume/lebesgue measure. Now classificatio errors occur oly o cells itersectig the Bayes decisio boudary see shaded cells i Fig. 3, ad so λ Ck d 1 /k d = O 1 k. Settig 1 k = k d, we see that k 1 d+2 gives the best rate for histograms. Histograms[ do ot ] achieve the best rate amog all discrimiatio rules for the box-coutig class; this best rate is E Rĥ R = O 1 d [4]. I the ext set of otes we ll examie a discrimiatio rule that achieves this rate to withi a logarithmic factor. Exercises 1. With X = R d let H k = {hx = 1 {fx 0} : f is a polyomial of degree at most k}. Let P be the set of all joit distributios P XY such that if Rh R h H k as k. Determie a explicit sufficiet coditio o k for the sieve estimator to be cosistet for all P XY P. 2. Up to this poit i our discussio of empirical risk miimizatio, we have always assumed that a empirical risk miimizer exists. However, it could be that the ifimum of the empirical risk is ot attaied by ay classifier. I this case, we ca still have a cosistet sieve estimator based o a approximate empirical risk miimizer. Thus, let τ k be a sequece of positive real umbers decreasig to zero. Defie ĥ,k to be ay classifier ĥ,k {h H k : R h if R h + τ k }. h H k Now cosider the discrimiatio rule ĥ := ĥ,k. Show that Theorems 2 ad 3 still hold for this discrimiatio rule. State ay additioal assumptios o the rate of covergece of τ k that may be eeded. 3. Assume X R d. A fuctio f : X R is said to be Lipschitz cotiuous if there exists a costat L > 0 such that for all x, x X, fx fx L x x. Show that if oe coordiate of the Bayes decisio boudary is a Lipschitz cotiuous fuctio of the others, the B holds i the defiitio of the box-coutig class. A example of the stated coditio is whe the Bayes decisio boudary is the graph of, say, x d = fx 1,..., x d 1, where f is Lipschitz. 4. Establish the followig partial coverse to Lemma 2: If Z Z is bouded, the Z i.p. Z implies E{ Z Z } 0. Therefore, i the cotext of classificatio, ERĥ R is equivalet to weak cosistecy.

8 8 Refereces [1] L. Devroye, L. Györfi ad G. Lugosi, A Probabilistic Theory of Patter Recogitio, Spriger, [2] L. Devroye, Ay discrimiatio rule ca have a arbitrarily bad probability of error for fiite sample size, IEEE Tras. Patter Aalysis ad Machie Itelligece, vol. 4, pp , [3] U. Greader, Abstract Iferece, Wiley, [4] C. Scott ad R. Nowak, Miimax-Optimal Classificatio with Dyadic Decisio Trees, IEEE Tras. Iform. Theory, vol. 52, pp , 2006.

Rademacher Complexity

Rademacher Complexity EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Empirical Process Theory and Oracle Inequalities

Empirical Process Theory and Oracle Inequalities Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Entropy Rates and Asymptotic Equipartition

Entropy Rates and Asymptotic Equipartition Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig

More information

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 3

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 3 Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture 3 Tolstikhi Ilya Abstract I this lecture we will prove the VC-boud, which provides a high-probability excess risk boud for the ERM algorithm whe

More information

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We

More information

Agnostic Learning and Concentration Inequalities

Agnostic Learning and Concentration Inequalities ECE901 Sprig 2004 Statistical Regularizatio ad Learig Theory Lecture: 7 Agostic Learig ad Cocetratio Iequalities Lecturer: Rob Nowak Scribe: Aravid Kailas 1 Itroductio 1.1 Motivatio I the last lecture

More information

Intro to Learning Theory

Intro to Learning Theory Lecture 1, October 18, 2016 Itro to Learig Theory Ruth Urer 1 Machie Learig ad Learig Theory Comig soo 2 Formal Framework 21 Basic otios I our formal model for machie learig, the istaces to be classified

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

Lecture 10: Universal coding and prediction

Lecture 10: Universal coding and prediction 0-704: Iformatio Processig ad Learig Sprig 0 Lecture 0: Uiversal codig ad predictio Lecturer: Aarti Sigh Scribes: Georg M. Goerg Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved

More information

Empirical Processes: Glivenko Cantelli Theorems

Empirical Processes: Glivenko Cantelli Theorems Empirical Processes: Gliveko Catelli Theorems Mouliath Baerjee Jue 6, 200 Gliveko Catelli classes of fuctios The reader is referred to Chapter.6 of Weller s Torgo otes, Chapter??? of VDVW ad Chapter 8.3

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

10.6 ALTERNATING SERIES

10.6 ALTERNATING SERIES 0.6 Alteratig Series Cotemporary Calculus 0.6 ALTERNATING SERIES I the last two sectios we cosidered tests for the covergece of series whose terms were all positive. I this sectio we examie series whose

More information

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number MATH 532 Itegrable Fuctios Dr. Neal, WKU We ow shall defie what it meas for a measurable fuctio to be itegrable, show that all itegral properties of simple fuctios still hold, ad the give some coditios

More information

Sequences. Notation. Convergence of a Sequence

Sequences. Notation. Convergence of a Sequence Sequeces A sequece is essetially just a list. Defiitio (Sequece of Real Numbers). A sequece of real umbers is a fuctio Z (, ) R for some real umber. Do t let the descriptio of the domai cofuse you; it

More information

1 Review and Overview

1 Review and Overview CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we

More information

REGRESSION WITH QUADRATIC LOSS

REGRESSION WITH QUADRATIC LOSS REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

Fall 2013 MTH431/531 Real analysis Section Notes

Fall 2013 MTH431/531 Real analysis Section Notes Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters

More information

A survey on penalized empirical risk minimization Sara A. van de Geer

A survey on penalized empirical risk minimization Sara A. van de Geer A survey o pealized empirical risk miimizatio Sara A. va de Geer We address the questio how to choose the pealty i empirical risk miimizatio. Roughly speakig, this pealty should be a good boud for the

More information

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

INFINITE SEQUENCES AND SERIES

INFINITE SEQUENCES AND SERIES INFINITE SEQUENCES AND SERIES INFINITE SEQUENCES AND SERIES I geeral, it is difficult to fid the exact sum of a series. We were able to accomplish this for geometric series ad the series /[(+)]. This is

More information

Math 341 Lecture #31 6.5: Power Series

Math 341 Lecture #31 6.5: Power Series Math 341 Lecture #31 6.5: Power Series We ow tur our attetio to a particular kid of series of fuctios, amely, power series, f(x = a x = a 0 + a 1 x + a 2 x 2 + where a R for all N. I terms of a series

More information

Estimation of the essential supremum of a regression function

Estimation of the essential supremum of a regression function Estimatio of the essetial supremum of a regressio fuctio Michael ohler, Adam rzyżak 2, ad Harro Walk 3 Fachbereich Mathematik, Techische Uiversität Darmstadt, Schlossgartestr. 7, 64289 Darmstadt, Germay,

More information

APPENDIX A SMO ALGORITHM

APPENDIX A SMO ALGORITHM AENDIX A SMO ALGORITHM Sequetial Miimal Optimizatio SMO) is a simple algorithm that ca quickly solve the SVM Q problem without ay extra matrix storage ad without usig time-cosumig umerical Q optimizatio

More information

Lecture 2. The Lovász Local Lemma

Lecture 2. The Lovász Local Lemma Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio

More information

18.657: Mathematics of Machine Learning

18.657: Mathematics of Machine Learning 8.657: Mathematics of Machie Learig Lecturer: Philippe Rigollet Lecture 4 Scribe: Cheg Mao Sep., 05 I this lecture, we cotiue to discuss the effect of oise o the rate of the excess risk E(h) = R(h) R(h

More information

1 Review and Overview

1 Review and Overview DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,

More information

Lecture 3: August 31

Lecture 3: August 31 36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,

More information

Introduction to Probability. Ariel Yadin

Introduction to Probability. Ariel Yadin Itroductio to robability Ariel Yadi Lecture 2 *** Ja. 7 ***. Covergece of Radom Variables As i the case of sequeces of umbers, we would like to talk about covergece of radom variables. There are may ways

More information

Ma 530 Infinite Series I

Ma 530 Infinite Series I Ma 50 Ifiite Series I Please ote that i additio to the material below this lecture icorporated material from the Visual Calculus web site. The material o sequeces is at Visual Sequeces. (To use this li

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 2 : Learig Frameworks, Examples Settig up learig problems. X : istace space or iput space Examples: Computer Visio: Raw M N image vectorized X = 0, 255 M N, SIFT

More information

Lecture 15: Learning Theory: Concentration Inequalities

Lecture 15: Learning Theory: Concentration Inequalities STAT 425: Itroductio to Noparametric Statistics Witer 208 Lecture 5: Learig Theory: Cocetratio Iequalities Istructor: Ye-Chi Che 5. Itroductio Recall that i the lecture o classificatio, we have see that

More information

1 Convergence in Probability and the Weak Law of Large Numbers

1 Convergence in Probability and the Weak Law of Large Numbers 36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec

More information

MAT1026 Calculus II Basic Convergence Tests for Series

MAT1026 Calculus II Basic Convergence Tests for Series MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information

Regression with quadratic loss

Regression with quadratic loss Regressio with quadratic loss Maxim Ragisky October 13, 2015 Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X,Y, where, as before,

More information

BIRKHOFF ERGODIC THEOREM

BIRKHOFF ERGODIC THEOREM BIRKHOFF ERGODIC THEOREM Abstract. We will give a proof of the poitwise ergodic theorem, which was first proved by Birkhoff. May improvemets have bee made sice Birkhoff s orgial proof. The versio we give

More information

The log-behavior of n p(n) and n p(n)/n

The log-behavior of n p(n) and n p(n)/n Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity

More information

7 Sequences of real numbers

7 Sequences of real numbers 40 7 Sequeces of real umbers 7. Defiitios ad examples Defiitio 7... A sequece of real umbers is a real fuctio whose domai is the set N of atural umbers. Let s : N R be a sequece. The the values of s are

More information

Solutions to HW Assignment 1

Solutions to HW Assignment 1 Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information

Lecture Notes for Analysis Class

Lecture Notes for Analysis Class Lecture Notes for Aalysis Class Topological Spaces A topology for a set X is a collectio T of subsets of X such that: (a) X ad the empty set are i T (b) Uios of elemets of T are i T (c) Fiite itersectios

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero?

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero? 2 Lebesgue Measure I Chapter 1 we defied the cocept of a set of measure zero, ad we have observed that every coutable set is of measure zero. Here are some atural questios: If a subset E of R cotais a

More information

Glivenko-Cantelli Classes

Glivenko-Cantelli Classes CS28B/Stat24B (Sprig 2008 Statistical Learig Theory Lecture: 4 Gliveko-Catelli Classes Lecturer: Peter Bartlett Scribe: Michelle Besi Itroductio This lecture will cover Gliveko-Catelli (GC classes ad itroduce

More information

Math 61CM - Solutions to homework 3

Math 61CM - Solutions to homework 3 Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet

More information

Chapter 7 Isoperimetric problem

Chapter 7 Isoperimetric problem Chapter 7 Isoperimetric problem Recall that the isoperimetric problem (see the itroductio its coectio with ido s proble) is oe of the most classical problem of a shape optimizatio. It ca be formulated

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 6 9/23/2013. Brownian motion. Introduction

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 6 9/23/2013. Brownian motion. Introduction MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 6 9/23/203 Browia motio. Itroductio Cotet.. A heuristic costructio of a Browia motio from a radom walk. 2. Defiitio ad basic properties

More information

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS 18th Feb, 016 Defiitio (Lipschitz fuctio). A fuctio f : R R is said to be Lipschitz if there exists a positive real umber c such that for ay x, y i the domai

More information

2.1. Convergence in distribution and characteristic functions.

2.1. Convergence in distribution and characteristic functions. 3 Chapter 2. Cetral Limit Theorem. Cetral limit theorem, or DeMoivre-Laplace Theorem, which also implies the wea law of large umbers, is the most importat theorem i probability theory ad statistics. For

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Rates of Convergence by Moduli of Continuity

Rates of Convergence by Moduli of Continuity Rates of Covergece by Moduli of Cotiuity Joh Duchi: Notes for Statistics 300b March, 017 1 Itroductio I this ote, we give a presetatio showig the importace, ad relatioship betwee, the modulis of cotiuity

More information

MA131 - Analysis 1. Workbook 3 Sequences II

MA131 - Analysis 1. Workbook 3 Sequences II MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................

More information

Math 155 (Lecture 3)

Math 155 (Lecture 3) Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,

More information

Information Theory Tutorial Communication over Channels with memory. Chi Zhang Department of Electrical Engineering University of Notre Dame

Information Theory Tutorial Communication over Channels with memory. Chi Zhang Department of Electrical Engineering University of Notre Dame Iformatio Theory Tutorial Commuicatio over Chaels with memory Chi Zhag Departmet of Electrical Egieerig Uiversity of Notre Dame Abstract A geeral capacity formula C = sup I(; Y ), which is correct for

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

Lecture 3 : Random variables and their distributions

Lecture 3 : Random variables and their distributions Lecture 3 : Radom variables ad their distributios 3.1 Radom variables Let (Ω, F) ad (S, S) be two measurable spaces. A map X : Ω S is measurable or a radom variable (deoted r.v.) if X 1 (A) {ω : X(ω) A}

More information

Lecture 10 October Minimaxity and least favorable prior sequences

Lecture 10 October Minimaxity and least favorable prior sequences STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

MATH 413 FINAL EXAM. f(x) f(y) M x y. x + 1 n

MATH 413 FINAL EXAM. f(x) f(y) M x y. x + 1 n MATH 43 FINAL EXAM Math 43 fial exam, 3 May 28. The exam starts at 9: am ad you have 5 miutes. No textbooks or calculators may be used durig the exam. This exam is prited o both sides of the paper. Good

More information

1 Lecture 2: Sequence, Series and power series (8/14/2012)

1 Lecture 2: Sequence, Series and power series (8/14/2012) Summer Jump-Start Program for Aalysis, 202 Sog-Yig Li Lecture 2: Sequece, Series ad power series (8/4/202). More o sequeces Example.. Let {x } ad {y } be two bouded sequeces. Show lim sup (x + y ) lim

More information

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1 Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity

More information

Entropy and Ergodic Theory Lecture 5: Joint typicality and conditional AEP

Entropy and Ergodic Theory Lecture 5: Joint typicality and conditional AEP Etropy ad Ergodic Theory Lecture 5: Joit typicality ad coditioal AEP 1 Notatio: from RVs back to distributios Let (Ω, F, P) be a probability space, ad let X ad Y be A- ad B-valued discrete RVs, respectively.

More information

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1 EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum

More information

Detailed proofs of Propositions 3.1 and 3.2

Detailed proofs of Propositions 3.1 and 3.2 Detailed proofs of Propositios 3. ad 3. Proof of Propositio 3. NB: itegratio sets are geerally omitted for itegrals defied over a uit hypercube [0, s with ay s d. We first give four lemmas. The proof of

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Lecture 11: Decision Trees

Lecture 11: Decision Trees ECE9 Sprig 7 Statistical Learig Theory Istructor: R. Nowak Lecture : Decisio Trees Miimum Complexity Pealized Fuctio Recall the basic results of the last lectures: let X ad Y deote the iput ad output spaces

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete

More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

4x 2. (n+1) x 3 n+1. = lim. 4x 2 n+1 n3 n. n 4x 2 = lim = 3

4x 2. (n+1) x 3 n+1. = lim. 4x 2 n+1 n3 n. n 4x 2 = lim = 3 Exam Problems (x. Give the series (, fid the values of x for which this power series coverges. Also =0 state clearly what the radius of covergece is. We start by settig up the Ratio Test: x ( x x ( x x

More information

Lecture 8: Convergence of transformations and law of large numbers

Lecture 8: Convergence of transformations and law of large numbers Lecture 8: Covergece of trasformatios ad law of large umbers Trasformatio ad covergece Trasformatio is a importat tool i statistics. If X coverges to X i some sese, we ofte eed to check whether g(x ) coverges

More information

A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed

More information

Lecture 9: Expanders Part 2, Extractors

Lecture 9: Expanders Part 2, Extractors Lecture 9: Expaders Part, Extractors Topics i Complexity Theory ad Pseudoradomess Sprig 013 Rutgers Uiversity Swastik Kopparty Scribes: Jaso Perry, Joh Kim I this lecture, we will discuss further the pseudoradomess

More information

MAS111 Convergence and Continuity

MAS111 Convergence and Continuity MAS Covergece ad Cotiuity Key Objectives At the ed of the course, studets should kow the followig topics ad be able to apply the basic priciples ad theorems therei to solvig various problems cocerig covergece

More information

Sequences, Series, and All That

Sequences, Series, and All That Chapter Te Sequeces, Series, ad All That. Itroductio Suppose we wat to compute a approximatio of the umber e by usig the Taylor polyomial p for f ( x) = e x at a =. This polyomial is easily see to be 3

More information

Statistical Machine Learning II Spring 2017, Learning Theory, Lecture 7

Statistical Machine Learning II Spring 2017, Learning Theory, Lecture 7 Statistical Machie Learig II Sprig 2017, Learig Theory, Lecture 7 1 Itroductio Jea Hoorio jhoorio@purdue.edu So far we have see some techiques for provig geeralizatio for coutably fiite hypothesis classes

More information

Lecture Chapter 6: Convergence of Random Sequences

Lecture Chapter 6: Convergence of Random Sequences ECE5: Aalysis of Radom Sigals Fall 6 Lecture Chapter 6: Covergece of Radom Sequeces Dr Salim El Rouayheb Scribe: Abhay Ashutosh Doel, Qibo Zhag, Peiwe Tia, Pegzhe Wag, Lu Liu Radom sequece Defiitio A ifiite

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Lecture 2: Concentration Bounds

Lecture 2: Concentration Bounds CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy

More information

It is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial.

It is often useful to approximate complicated functions using simpler ones. We consider the task of approximating a function by a polynomial. Taylor Polyomials ad Taylor Series It is ofte useful to approximate complicated fuctios usig simpler oes We cosider the task of approximatig a fuctio by a polyomial If f is at least -times differetiable

More information

Definition An infinite sequence of numbers is an ordered set of real numbers.

Definition An infinite sequence of numbers is an ordered set of real numbers. Ifiite sequeces (Sect. 0. Today s Lecture: Review: Ifiite sequeces. The Cotiuous Fuctio Theorem for sequeces. Usig L Hôpital s rule o sequeces. Table of useful its. Bouded ad mootoic sequeces. Previous

More information

4.1 Sigma Notation and Riemann Sums

4.1 Sigma Notation and Riemann Sums 0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas

More information

MA131 - Analysis 1. Workbook 2 Sequences I

MA131 - Analysis 1. Workbook 2 Sequences I MA3 - Aalysis Workbook 2 Sequeces I Autum 203 Cotets 2 Sequeces I 2. Itroductio.............................. 2.2 Icreasig ad Decreasig Sequeces................ 2 2.3 Bouded Sequeces..........................

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS

ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9939(XX0000-0 ON MEAN ERGODIC CONVERGENCE IN THE CALKIN ALGEBRAS MARCH T. BOEDIHARDJO AND WILLIAM B. JOHNSON 2

More information

n=1 a n is the sequence (s n ) n 1 n=1 a n converges to s. We write a n = s, n=1 n=1 a n

n=1 a n is the sequence (s n ) n 1 n=1 a n converges to s. We write a n = s, n=1 n=1 a n Series. Defiitios ad first properties A series is a ifiite sum a + a + a +..., deoted i short by a. The sequece of partial sums of the series a is the sequece s ) defied by s = a k = a +... + a,. k= Defiitio

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Lecture 3 The Lebesgue Integral

Lecture 3 The Lebesgue Integral Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified

More information