CONCENTRATION INEQUALITIES


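MAXIM RAGINSKY

Date: January 24

In the previous lecture, the following result was stated without proof. If $X_1, \ldots, X_n$ are independent Bernoulli($\theta$) random variables representing the outcomes of a sequence of $n$ tosses of a coin with bias (probability of heads) $\theta$, then for any $\varepsilon \in (0, 1)$

$$\mathbb{P}\left(|\hat\theta_n - \theta| \ge \varepsilon\right) \le 2e^{-2n\varepsilon^2}, \tag{1}$$

where $\hat\theta_n = \frac{1}{n}\sum_{i=1}^n X_i$ is the fraction of heads in $X^n = (X_1, \ldots, X_n)$. Since $\theta = \mathbb{E}\hat\theta_n$, (1) says that the sample (or empirical) average of the $X_i$'s concentrates sharply around the statistical average $\theta = \mathbb{E}X_1$. Bounds like these are fundamental in statistical learning theory. In the next few lectures, we will learn the techniques needed to derive such bounds for settings much more complicated than coin tossing. This is not meant to be a complete picture; more details and additional results can be found in the excellent survey by Boucheron et al. [BBL04].

1. The basic tools

We start with Markov's inequality: Let $X \in \mathbb{R}$ be a nonnegative random variable. Then for any $t > 0$ we have

$$\mathbb{P}(X \ge t) \le \frac{\mathbb{E}X}{t}. \tag{2}$$

The proof is simple:

\begin{align}
\mathbb{P}(X \ge t) &= \mathbb{E}\left[\mathbf{1}_{\{X \ge t\}}\right] \tag{3} \\
&\le \frac{\mathbb{E}\left[X \mathbf{1}_{\{X \ge t\}}\right]}{t} \tag{4} \\
&\le \frac{\mathbb{E}X}{t}, \tag{5}
\end{align}

where:

(3) uses the fact that the probability of an event can be expressed as the expectation of its indicator function:
$$\mathbb{P}(X \in A) = \int_A P_X(dx) = \int \mathbf{1}_{\{x \in A\}} P_X(dx) = \mathbb{E}\left[\mathbf{1}_{\{X \in A\}}\right];$$

(4) uses the fact that
$$X \ge t > 0 \implies \frac{X}{t} \ge 1;$$

(5) uses the fact that
$$X \ge 0 \implies X \mathbf{1}_{\{X \ge t\}} \le X,$$
so consequently $\mathbb{E}\left[X \mathbf{1}_{\{X \ge t\}}\right] \le \mathbb{E}X$.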
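As a quick numerical illustration of Markov's inequality (2), the following sketch compares the empirical tail frequency of a nonnegative random variable with the bound $\mathbb{E}X/t$. The choice of an Exponential(1) distribution, the function name `markov_check`, and the sample size are assumptions made for illustration only; they do not come from the notes.

```python
import random

def markov_check(t, n_samples=100_000, seed=0):
    """Return (empirical tail frequency, Markov bound) for X ~ Exponential(1).

    The sample mean stands in for EX, so the bound is itself an estimate.
    """
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n_samples)]
    tail = sum(x >= t for x in xs) / n_samples   # fraction of samples with X >= t
    bound = (sum(xs) / n_samples) / t            # (sample mean) / t
    return tail, bound

if __name__ == "__main__":
    for t in (1.0, 2.0, 4.0):
        tail, bound = markov_check(t)
        print(f"t={t}: empirical tail={tail:.4f}, Markov bound={bound:.4f}")
```

For nonnegative samples the empirical tail never exceeds the empirical bound, which is the proof above applied to the empirical distribution. The gap between the two is typically large, which is one reason to look for sharper tools.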

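Markov's inequality leads to our first bound on the probability that a random variable deviates from its expectation by more than a given amount: Chebyshev's inequality. Let $X$ be an arbitrary real random variable. Then for any $t > 0$

$$\mathbb{P}\left(|X - \mathbb{E}X| \ge t\right) \le \frac{\operatorname{Var} X}{t^2}, \tag{6}$$

where $\operatorname{Var} X \triangleq \mathbb{E}\left[|X - \mathbb{E}X|^2\right] = \mathbb{E}X^2 - (\mathbb{E}X)^2$ is the variance of $X$. To prove (6), we apply Markov's inequality (2) to the nonnegative random variable $|X - \mathbb{E}X|^2$:

\begin{align}
\mathbb{P}\left(|X - \mathbb{E}X| \ge t\right) &= \mathbb{P}\left(|X - \mathbb{E}X|^2 \ge t^2\right) \tag{7} \\
&\le \frac{\mathbb{E}|X - \mathbb{E}X|^2}{t^2}, \tag{8}
\end{align}

where the first step uses the fact that the function $\varphi(x) = x^2$ is monotonically increasing on $[0, \infty)$, so that $a \ge b \ge 0$ if and only if $a^2 \ge b^2$.

Now let's apply these tools to the problem of bounding the probability that, for a coin with bias $\theta$, the fraction of heads in $n$ trials differs from $\theta$ by more than some $\varepsilon > 0$. To that end, let us represent the outcomes of the $n$ tosses by $n$ independent Bernoulli($\theta$) random variables $X_1, \ldots, X_n \in \{0, 1\}$, where $\mathbb{P}(X_i = 1) = \theta$ for all $i$. Let $\hat\theta_n = \frac{1}{n}\sum_{i=1}^n X_i$. Then

$$\mathbb{E}\hat\theta_n = \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n}\sum_{i=1}^n \underbrace{\mathbb{E}X_i}_{=\mathbb{P}(X_i = 1)} = \theta$$

and

$$\operatorname{Var}\hat\theta_n = \operatorname{Var}\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{1}{n^2}\sum_{i=1}^n \operatorname{Var} X_i = \frac{\theta(1-\theta)}{n},$$

where we have used the fact that the $X_i$'s are i.i.d., so $\operatorname{Var}(X_1 + \ldots + X_n) = \sum_{i=1}^n \operatorname{Var} X_i = n \operatorname{Var} X_1$. Now we are in a position to apply Chebyshev's inequality:

$$\mathbb{P}\left(|\hat\theta_n - \theta| \ge \varepsilon\right) \le \frac{\operatorname{Var}\hat\theta_n}{\varepsilon^2} = \frac{\theta(1-\theta)}{n\varepsilon^2}. \tag{9}$$

At the very least, (9) shows that the probability of getting a "bad" sample decreases with sample size. Unfortunately, it does not decrease fast enough. To see why, we can appeal to the Central Limit Theorem, which (roughly) states that

$$\mathbb{P}\left(\sqrt{\frac{n}{\theta(1-\theta)}}\,(\hat\theta_n - \theta) \ge t\right) \xrightarrow{n \to \infty} 1 - \Phi(t) \le \frac{1}{\sqrt{2\pi}}\,\frac{e^{-t^2/2}}{t},$$

where $\Phi(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^t e^{-x^2/2}\,dx$ is the standard Gaussian CDF. This would suggest something like

$$\mathbb{P}\left(\hat\theta_n - \theta \ge \varepsilon\right) \approx \exp\left(-\frac{n\varepsilon^2}{2\theta(1-\theta)}\right),$$

which decays with $n$ much faster than the right-hand side of (9).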
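To get a feel for the gap between the two rates, the sketch below evaluates the Chebyshev bound (9) and the Gaussian-type expression suggested by the Central Limit Theorem for a few sample sizes. The function names and the values of $n$, $\varepsilon$, and $\theta$ are illustrative assumptions. The second quantity is only a heuristic approximation, not a proven bound.

```python
import math

def chebyshev_bound(n, eps, theta):
    """Right-hand side of (9): theta * (1 - theta) / (n * eps^2)."""
    return theta * (1.0 - theta) / (n * eps ** 2)

def gaussian_heuristic(n, eps, theta):
    """CLT-suggested decay exp(-n * eps^2 / (2 * theta * (1 - theta)))."""
    return math.exp(-n * eps ** 2 / (2.0 * theta * (1.0 - theta)))

if __name__ == "__main__":
    eps, theta = 0.1, 0.5
    for n in (100, 1000, 10000):
        print(f"n={n}: Chebyshev={chebyshev_bound(n, eps, theta):.3e}, "
              f"Gaussian heuristic={gaussian_heuristic(n, eps, theta):.3e}")
```

The Chebyshev bound shrinks like $1/n$, while the heuristic shrinks exponentially in $n$.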

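2. The Chernoff bounding trick and Hoeffding's inequality

To fix (9), we will use a very powerful technique, known as the Chernoff bounding trick [Che52]. Let $X$ be a nonnegative random variable. Suppose we are interested in bounding the probability $\mathbb{P}(X \ge t)$ for some particular $t > 0$. Observe that for any $s > 0$ we have

$$\mathbb{P}(X \ge t) = \mathbb{P}\left(e^{sX} \ge e^{st}\right) \le e^{-st}\,\mathbb{E}\left[e^{sX}\right], \tag{10}$$

where the first step is by monotonicity of the function $\varphi(x) = e^{sx}$ and the second step is by Markov's inequality (2). The Chernoff trick is to choose an $s > 0$ that would make the right-hand side of (10) suitably small. In fact, since (10) holds simultaneously for all $s > 0$, the optimal thing to do is to take

$$\mathbb{P}(X \ge t) \le \inf_{s > 0} e^{-st}\,\mathbb{E}\left[e^{sX}\right].$$

However, often a good upper bound on the moment-generating function $\mathbb{E}\left[e^{sX}\right]$ is enough. One such bound was developed by Hoeffding [Hoe63] for the case when $X$ is bounded with probability one:

Lemma 1 (Hoeffding). Let $X$ be a random variable with $\mathbb{E}X = 0$ and $\mathbb{P}(a \le X \le b) = 1$ for some $-\infty < a \le b < \infty$. Then for all $s > 0$

$$\mathbb{E}\left[e^{sX}\right] \le e^{s^2(b-a)^2/8}. \tag{11}$$

Proof. The proof uses elementary calculus and convexity. First we note that the function $\varphi(x) = e^{sx}$ is convex on $\mathbb{R}$. Any $x \in [a, b]$ can be written as

$$x = \frac{x-a}{b-a}\,b + \frac{b-x}{b-a}\,a.$$

Hence

$$e^{sx} \le \frac{x-a}{b-a}\,e^{sb} + \frac{b-x}{b-a}\,e^{sa}.$$

Since $\mathbb{E}X = 0$, we have

$$\mathbb{E}\left[e^{sX}\right] \le \frac{b}{b-a}\,e^{sa} - \frac{a}{b-a}\,e^{sb} = \left(\frac{b}{b-a} - \frac{a}{b-a}\,e^{s(b-a)}\right)e^{sa}.$$

We have $s(b-a)$ in the exponent in the parentheses. To get the same thing in the $e^{sa}$ term multiplying the parentheses, we (with a bit of foresight) seek $\lambda$ such that $sa = -\lambda s(b-a)$, which gives us $\lambda = -a/(b-a)$. Then

$$\left(\frac{b}{b-a} - \frac{a}{b-a}\,e^{s(b-a)}\right)e^{sa} = \left(1 - \lambda + \lambda e^{s(b-a)}\right)e^{-\lambda s(b-a)}.$$

Now let $u = s(b-a)$, so we can write

$$\mathbb{E}\left[e^{sX}\right] \le \left(1 - \lambda + \lambda e^{u}\right)e^{-\lambda u}. \tag{12}$$

Again with a bit of foresight, let us express the right-hand side of (12) as an exponential of a function of $u$:

$$\left(1 - \lambda + \lambda e^{u}\right)e^{-\lambda u} = e^{\phi(u)}, \qquad \text{where } \phi(u) = \log\left(1 - \lambda + \lambda e^{u}\right) - \lambda u.$$

Now the whole affair hinges on us being able to show that $\phi(u) \le u^2/8$ for any $u \ge 0$. To that end, we first note that $\phi(0) = \phi'(0) = 0$, and that $\phi''(u) \le 1/4$ for all $u \ge 0$. Therefore, by Taylor's theorem we have

$$\phi(u) = \phi(0) + \phi'(0)\,u + \frac{1}{2}\phi''(\alpha)\,u^2$$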

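for some $\alpha \in [0, u]$, and we can upper-bound the right-hand side of the above expression by $u^2/8$. Thus,

$$\mathbb{E}\left[e^{sX}\right] \le e^{\phi(u)} \le e^{u^2/8} = e^{s^2(b-a)^2/8},$$

which gives us (11).

We will now use the Chernoff method and the above lemma to prove the following

Theorem 1 (Hoeffding's inequality). Let $X_1, \ldots, X_n$ be independent random variables, such that $X_i \in [a_i, b_i]$ with probability one. Let $S_n \triangleq \sum_{i=1}^n X_i$. Then for any $t > 0$

$$\mathbb{P}\left(S_n - \mathbb{E}S_n \ge t\right) \le \exp\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right); \tag{13}$$

$$\mathbb{P}\left(S_n - \mathbb{E}S_n \le -t\right) \le \exp\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right). \tag{14}$$

Consequently,

$$\mathbb{P}\left(|S_n - \mathbb{E}S_n| \ge t\right) \le 2\exp\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right). \tag{15}$$

Proof. By replacing each $X_i$ with $X_i - \mathbb{E}X_i$, we may as well assume that $\mathbb{E}X_i = 0$. Then $S_n = \sum_{i=1}^n X_i$. Using Chernoff's trick, we write

$$\mathbb{P}(S_n \ge t) = \mathbb{P}\left(e^{sS_n} \ge e^{st}\right) \le e^{-st}\,\mathbb{E}\left[e^{sS_n}\right]. \tag{16}$$

Since the $X_i$'s are independent,

$$\mathbb{E}\left[e^{sS_n}\right] = \mathbb{E}\left[e^{s(X_1 + \ldots + X_n)}\right] = \mathbb{E}\left[\prod_{i=1}^n e^{sX_i}\right] = \prod_{i=1}^n \mathbb{E}\left[e^{sX_i}\right]. \tag{17}$$

Since $X_i \in [a_i, b_i]$, we can apply Lemma 1 to write $\mathbb{E}\left[e^{sX_i}\right] \le e^{s^2(b_i - a_i)^2/8}$. Substituting this into (17) and (16), we obtain

$$\mathbb{P}(S_n \ge t) \le e^{-st}\prod_{i=1}^n e^{s^2(b_i - a_i)^2/8} = \exp\left(-st + \frac{s^2}{8}\sum_{i=1}^n (b_i - a_i)^2\right).$$

If we choose $s = \dfrac{4t}{\sum_{i=1}^n (b_i - a_i)^2}$, then we obtain (13). The proof of (14) is similar.

Now we will apply Hoeffding's inequality to improve our crude concentration bound (9) for the sum of $n$ independent Bernoulli($\theta$) random variables, $X_1, \ldots, X_n$. Since each $X_i \in \{0, 1\}$, we can apply Theorem 1 to get, for any $t > 0$,

$$\mathbb{P}\left(\left|\sum_{i=1}^n X_i - n\theta\right| \ge t\right) \le 2e^{-2t^2/n}.$$

Therefore,

$$\mathbb{P}\left(|\hat\theta_n - \theta| \ge \varepsilon\right) = \mathbb{P}\left(\left|\sum_{i=1}^n X_i - n\theta\right| \ge n\varepsilon\right) \le 2e^{-2n\varepsilon^2},$$

which gives us the claimed bound (1).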
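The coin-tossing bound (1) is easy to turn into numbers. The sketch below evaluates the bound, inverts it to get a sample size that guarantees deviation probability at most $\delta$, and compares the bound against a simulated deviation frequency. The function names, the confidence parameter `delta`, and the simulation settings are assumptions introduced here for illustration.

```python
import math
import random

def hoeffding_bound(n, eps):
    """Right-hand side of (1): 2 * exp(-2 * n * eps^2)."""
    return 2.0 * math.exp(-2.0 * n * eps ** 2)

def sample_size(eps, delta):
    """Smallest n for which 2 * exp(-2 * n * eps^2) <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def empirical_deviation_freq(n, eps, theta, trials=2000, seed=0):
    """Fraction of simulated n-toss experiments with |theta_hat - theta| >= eps."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        heads = sum(rng.random() < theta for _ in range(n))
        # Small tolerance guards against floating-point rounding at equality.
        if abs(heads / n - theta) >= eps - 1e-12:
            bad += 1
    return bad / trials

if __name__ == "__main__":
    n, eps, theta = 100, 0.1, 0.5
    print("Hoeffding bound:", hoeffding_bound(n, eps))
    print("simulated frequency:", empirical_deviation_freq(n, eps, theta))
    print("n needed for eps=0.1, delta=0.05:", sample_size(0.1, 0.05))
```

The simulated frequency should sit below the bound, since (1) holds for every $\theta$ and is not tuned to any particular one.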

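3. From bounded variables to bounded differences: McDiarmid's inequality

Hoeffding's inequality applies to sums of independent random variables. We will now develop its generalization, due to McDiarmid [McD89], to arbitrary real-valued functions of independent random variables that satisfy a certain condition. Let $\mathsf{X}$ be some set, and consider a function $g : \mathsf{X}^n \to \mathbb{R}$. We say that $g$ has bounded differences if there exist nonnegative numbers $c_1, \ldots, c_n$, such that

$$\sup_{x_1, \ldots, x_n, x_i' \in \mathsf{X}} \left|g(x_1, \ldots, x_n) - g(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n)\right| \le c_i \tag{18}$$

for all $i = 1, \ldots, n$. In words, if we change the $i$th variable while keeping all the others fixed, the value of $g$ will not change by more than $c_i$.

Theorem 2 (McDiarmid's inequality [McD89]). Let $X^n = (X_1, \ldots, X_n) \in \mathsf{X}^n$ be an $n$-tuple of independent $\mathsf{X}$-valued random variables. If a function $g : \mathsf{X}^n \to \mathbb{R}$ has bounded differences, as in (18), then, for all $t > 0$,

$$\mathbb{P}\left(g(X^n) - \mathbb{E}g(X^n) \ge t\right) \le \exp\left(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\right); \tag{19}$$

$$\mathbb{P}\left(\mathbb{E}g(X^n) - g(X^n) \ge t\right) \le \exp\left(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\right). \tag{20}$$

Proof. Let me first sketch the general idea behind the proof. Let $Z = g(X^n)$ and $V = Z - \mathbb{E}Z = g(X^n) - \mathbb{E}g(X^n)$. The first step will be to write $V$ as a sum $\sum_{i=1}^n V_i$, where the terms $V_i$ are constructed so that:

(1) $V_i$ is a function only of $X^i = (X_1, \ldots, X_i)$.

(2) There exists a function $\Psi_i : \mathsf{X}^{i-1} \to \mathbb{R}$ such that, conditionally on $X^{i-1}$, $\Psi_i(X^{i-1}) \le V_i \le \Psi_i(X^{i-1}) + c_i$.

Provided we can arrange things in this way, we can apply Lemma 1 to $V_i$ conditionally on $X^{i-1}$:

$$\mathbb{E}\left[e^{sV_i} \,\middle|\, X^{i-1}\right] \le e^{s^2 c_i^2/8}. \tag{21}$$

Then, using Chernoff's method, we have

\begin{align*}
\mathbb{P}(Z - \mathbb{E}Z \ge t) &= \mathbb{P}(V \ge t) \\
&\le e^{-st}\,\mathbb{E}\left[e^{sV}\right] \\
&= e^{-st}\,\mathbb{E}\left[e^{s\sum_{i=1}^{n} V_i}\right] \\
&= e^{-st}\,\mathbb{E}\left[e^{s\sum_{i=1}^{n-1} V_i}\,e^{sV_n}\right] \\
&= e^{-st}\,\mathbb{E}\left[e^{s\sum_{i=1}^{n-1} V_i}\,\mathbb{E}\left[e^{sV_n} \,\middle|\, X^{n-1}\right]\right] \\
&\le e^{-st}\,e^{s^2 c_n^2/8}\,\mathbb{E}\left[e^{s\sum_{i=1}^{n-1} V_i}\right],
\end{align*}

where in the next-to-last step we used the fact that $V_1, \ldots, V_{n-1}$ depend only on $X^{n-1}$, and in the last step we used (21) with $i = n$. If we continue peeling off the terms involving $V_{n-1}, V_{n-2}, \ldots, V_1$, we will get

$$\mathbb{P}(Z - \mathbb{E}Z \ge t) \le \exp\left(-st + \frac{s^2}{8}\sum_{i=1}^n c_i^2\right).$$

Taking $s = 4t/\sum_{i=1}^n c_i^2$, we end up with (19).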
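The choice of $s$ in the last step can be checked by minimizing the exponent directly. In the worked computation below, $C$ and $q$ are shorthand introduced here (they do not appear in the notes): $C = \sum_{i=1}^n c_i^2$, and $q(s)$ is the exponent as a function of $s$.

```latex
\begin{align*}
q(s) &= -st + \frac{s^2 C}{8}, \qquad C = \sum_{i=1}^n c_i^2, \\
q'(s) &= -t + \frac{sC}{4} = 0 \iff s^* = \frac{4t}{C}, \\
q(s^*) &= -\frac{4t^2}{C} + \frac{16t^2}{8C} = -\frac{2t^2}{C}.
\end{align*}
```

Since $q$ is a convex quadratic in $s$ and $s^* > 0$, this is the minimum over $s > 0$, and $e^{q(s^*)}$ is exactly the right-hand side of (19). The same computation with $C = \sum_{i=1}^n (b_i - a_i)^2$ gives the exponent in Hoeffding's inequality.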

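It remains to construct the $V_i$'s with the desired properties. To that end, let

$$H_i(X^i) = \mathbb{E}\left[Z \,\middle|\, X^i\right] \qquad \text{and} \qquad V_i = H_i(X^i) - H_{i-1}(X^{i-1}),$$

with the convention $H_0 \triangleq \mathbb{E}Z$. Then

$$\sum_{i=1}^n V_i = \sum_{i=1}^n \left\{\mathbb{E}\left[Z \,\middle|\, X^i\right] - \mathbb{E}\left[Z \,\middle|\, X^{i-1}\right]\right\} = \mathbb{E}\left[Z \,\middle|\, X^n\right] - \mathbb{E}Z = Z - \mathbb{E}Z = V.$$

Note that $V_i$ depends only on $X^i$ by construction, and that $\mathbb{E}\left[V_i \,\middle|\, X^{i-1}\right] = 0$ by the tower property of conditional expectation, which is the zero-mean condition needed to apply Lemma 1 conditionally. Moreover, let

\begin{align*}
\Psi_i(X^{i-1}) &= \inf_{x \in \mathsf{X}} \left\{H_i(X^{i-1}, x) - H_{i-1}(X^{i-1})\right\}, \\
\Psi_i'(X^{i-1}) &= \sup_{x' \in \mathsf{X}} \left\{H_i(X^{i-1}, x') - H_{i-1}(X^{i-1})\right\},
\end{align*}

where, owing to the fact that the $X_i$'s are independent, we have

$$H_i(X^{i-1}, x) = \mathbb{E}\left[Z \,\middle|\, X^{i-1}, X_i = x\right] = \int g(X^{i-1}, x, x_{i+1}^n)\,P(dx_{i+1}^n)$$

($x_{i+1}^n$ denoting the tuple $(x_{i+1}, \ldots, x_n)$). Then

\begin{align*}
\Psi_i'(X^{i-1}) - \Psi_i(X^{i-1}) &= \sup_{x' \in \mathsf{X}} \left\{H_i(X^{i-1}, x') - H_{i-1}(X^{i-1})\right\} - \inf_{x \in \mathsf{X}} \left\{H_i(X^{i-1}, x) - H_{i-1}(X^{i-1})\right\} \\
&= \sup_{x \in \mathsf{X}} \sup_{x' \in \mathsf{X}} \left|H_i(X^{i-1}, x) - H_i(X^{i-1}, x')\right| \\
&= \sup_{x \in \mathsf{X}} \sup_{x' \in \mathsf{X}} \left|\mathbb{E}\left[Z \,\middle|\, X^{i-1}, X_i = x\right] - \mathbb{E}\left[Z \,\middle|\, X^{i-1}, X_i = x'\right]\right| \\
&= \sup_{x \in \mathsf{X}} \sup_{x' \in \mathsf{X}} \left|\int \left[g(X^{i-1}, x, x_{i+1}^n) - g(X^{i-1}, x', x_{i+1}^n)\right] P(dx_{i+1}^n)\right| \\
&\le \sup_{x \in \mathsf{X}} \sup_{x' \in \mathsf{X}} \int \left|g(X^{i-1}, x, x_{i+1}^n) - g(X^{i-1}, x', x_{i+1}^n)\right| P(dx_{i+1}^n) \\
&\le c_i,
\end{align*}

where the last step follows from the bounded difference property. Thus, we can write $\Psi_i'(X^{i-1}) \le \Psi_i(X^{i-1}) + c_i$, which implies that, indeed,

$$\Psi_i(X^{i-1}) \le V_i \le \Psi_i(X^{i-1}) + c_i$$

conditionally on $X^{i-1}$.

4. McDiarmid's inequality in action

McDiarmid's inequality is an extremely powerful and often used tool in statistical learning theory. We will now discuss several examples of its use. To that end, we will first introduce some notation and definitions. Let $\mathsf{X}$ be some (measurable) space. If $Q$ is a probability distribution of an $\mathsf{X}$-valued random variable $X$, then we can compute the expectation of any (measurable) function $f : \mathsf{X} \to \mathbb{R}$ w.r.t. $Q$. So far, we have denoted this expectation by $\mathbb{E}f(X)$ or by $\mathbb{E}_Q f(X)$. We will often find it convenient to use an alternative notation, $Q(f)$.

Let $X^n = (X_1, \ldots, X_n)$ be $n$ independent identically distributed (i.i.d.) $\mathsf{X}$-valued random variables with common distribution $P$. The main object of interest to us is the empirical distribution induced by $X^n$, which we will denote by $P_{X^n}$. The empirical distribution assigns the probability $1/n$ to each $X_i$, i.e.,

$$P_{X^n} = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}.$$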

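Here, $\delta_x$ denotes a unit mass concentrated at a point $x \in \mathsf{X}$, i.e., the probability distribution on $\mathsf{X}$ defined by

$$\delta_x(A) = \mathbf{1}_{\{x \in A\}}, \qquad \forall \text{ measurable } A \subseteq \mathsf{X}.$$

We note the following important facts about $P_{X^n}$:

(1) Being a function of the sample $X^n$, $P_{X^n}$ is a random variable taking values in the space of probability distributions over $\mathsf{X}$.

(2) The probability of a set $A \subseteq \mathsf{X}$ under $P_{X^n}$,
$$P_{X^n}(A) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}_{\{X_i \in A\}},$$
is the empirical frequency of the set $A$ on the sample $X^n$. The expectation of $P_{X^n}(A)$ is equal to $P(A)$, the $P$-probability of $A$. Indeed,
$$\mathbb{E}P_{X^n}(A) = \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n \mathbf{1}_{\{X_i \in A\}}\right] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}\left[\mathbf{1}_{\{X_i \in A\}}\right] = \frac{1}{n}\sum_{i=1}^n \mathbb{P}(X_i \in A) = P(A).$$

(3) Given a function $f : \mathsf{X} \to \mathbb{R}$, we can compute its expectation w.r.t. $P_{X^n}$:
$$P_{X^n}(f) = \frac{1}{n}\sum_{i=1}^n f(X_i),$$
which is just the sample mean of $f$ on $X^n$. It is also referred to as the empirical expectation of $f$ on $X^n$. We have
$$\mathbb{E}P_{X^n}(f) = \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n f(X_i)\right] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}f(X_i) = \mathbb{E}f(X) = P(f).$$

We can now proceed to our examples.

4.1. Sums of bounded random variables. In the special case when $\mathsf{X} = \mathbb{R}$, $P$ is a probability distribution supported on a finite interval, and $g(X^n)$ is the sum

$$g(X^n) = \sum_{i=1}^n X_i,$$

McDiarmid's inequality simply reduces to Hoeffding's. Indeed, for any $x^n \in [a, b]^n$ and $x_i' \in [a, b]$ we have

$$g(x^{i-1}, x_i, x_{i+1}^n) - g(x^{i-1}, x_i', x_{i+1}^n) = x_i - x_i' \le b - a.$$

Interchanging the roles of $x_i'$ and $x_i$, we get

$$g(x^{i-1}, x_i', x_{i+1}^n) - g(x^{i-1}, x_i, x_{i+1}^n) = x_i' - x_i \le b - a.$$

Hence, we may apply Theorem 2 with $c_i = b - a$ for all $i$ to get

$$\mathbb{P}\left(|g(X^n) - \mathbb{E}g(X^n)| \ge t\right) \le 2\exp\left(-\frac{2t^2}{n(b-a)^2}\right).$$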
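The reduction of McDiarmid's inequality to Hoeffding's for sums can be checked numerically. The sketch below evaluates the two-sided McDiarmid bound for a list of bounded-difference constants and compares it with the Hoeffding expression for $n$ variables supported on $[a, b]$. The function names and the particular values of $n$, $a$, $b$, and $t$ are assumptions chosen for illustration.

```python
import math

def mcdiarmid_two_sided(t, c):
    """2 * exp(-2 t^2 / sum_i c_i^2) for bounded-difference constants c."""
    return 2.0 * math.exp(-2.0 * t ** 2 / sum(ci ** 2 for ci in c))

def hoeffding_two_sided(t, n, a, b):
    """2 * exp(-2 t^2 / (n (b - a)^2)) for n variables supported on [a, b]."""
    return 2.0 * math.exp(-2.0 * t ** 2 / (n * (b - a) ** 2))

if __name__ == "__main__":
    n, a, b, t = 20, -1.0, 3.0, 5.0
    # For g(x^n) = x_1 + ... + x_n, each c_i equals b - a.
    print(mcdiarmid_two_sided(t, [b - a] * n))
    print(hoeffding_two_sided(t, n, a, b))
```

Both calls evaluate the same expression, because $\sum_{i=1}^n c_i^2 = n(b-a)^2$ when every $c_i = b - a$.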

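4.2. Uniform deviations. Let $X_1, \ldots, X_n$ be $n$ i.i.d. $\mathsf{X}$-valued random variables with common distribution $P$. By the Law of Large Numbers, for any $A \subseteq \mathsf{X}$ and any $\varepsilon > 0$

$$\lim_{n \to \infty} \mathbb{P}\left(|P_{X^n}(A) - P(A)| \ge \varepsilon\right) = 0.$$

In fact, we can use Hoeffding's inequality to show that

$$\mathbb{P}\left(|P_{X^n}(A) - P(A)| \ge \varepsilon\right) \le 2e^{-2n\varepsilon^2}.$$

This probability bound holds for each $A$ separately. However, in learning theory we are often interested in the deviation of empirical frequencies from true probabilities simultaneously over some collection of the subsets of $\mathsf{X}$. To that end, let $\mathcal{A}$ be such a collection and consider the function

$$g(X^n) \triangleq \sup_{A \in \mathcal{A}} |P_{X^n}(A) - P(A)|. \tag{22}$$

Later in the course we will see that, for certain choices of $\mathcal{A}$, $\mathbb{E}g(X^n) = O(1/\sqrt{n})$. However, regardless of what $\mathcal{A}$ is, it is easy to see that, by changing only one $X_i$, the value of $g(X^n)$ can change at most by $1/n$. Let $x^n = (x_1, \ldots, x_n)$, choose some other $x_i' \in \mathsf{X}$, and let $x^n_{(i)}$ denote $x^n$ with $x_i$ replaced by $x_i'$:

$$x^n = (x^{i-1}, x_i, x_{i+1}^n), \qquad x^n_{(i)} = (x^{i-1}, x_i', x_{i+1}^n).$$

Then

\begin{align*}
g(x^n) - g(x^n_{(i)}) &= \sup_{A \in \mathcal{A}} |P_{x^n}(A) - P(A)| - \sup_{A' \in \mathcal{A}} |P_{x^n_{(i)}}(A') - P(A')| \\
&= \sup_{A \in \mathcal{A}} \inf_{A' \in \mathcal{A}} \left\{|P_{x^n}(A) - P(A)| - |P_{x^n_{(i)}}(A') - P(A')|\right\} \\
&\le \sup_{A \in \mathcal{A}} \left\{|P_{x^n}(A) - P(A)| - |P_{x^n_{(i)}}(A) - P(A)|\right\} \\
&\le \sup_{A \in \mathcal{A}} |P_{x^n}(A) - P_{x^n_{(i)}}(A)| \\
&= \frac{1}{n}\sup_{A \in \mathcal{A}} \left|\mathbf{1}_{\{x_i \in A\}} - \mathbf{1}_{\{x_i' \in A\}}\right| \\
&\le \frac{1}{n}.
\end{align*}

Interchanging the roles of $x^n$ and $x^n_{(i)}$, we obtain

$$g(x^n_{(i)}) - g(x^n) \le \frac{1}{n}.$$

Thus, $|g(x^n) - g(x^n_{(i)})| \le 1/n$. Note that this bound holds for all $i$ and all choices of $x^n$ and $x^n_{(i)}$. This means that the function $g$ defined in (22) has bounded differences with $c_1 = \ldots = c_n = 1/n$. Consequently, we can use Theorem 2 to get

$$\mathbb{P}\left(|g(X^n) - \mathbb{E}g(X^n)| \ge \varepsilon\right) \le 2e^{-2n\varepsilon^2}.$$

This shows that the uniform deviation $g(X^n)$ concentrates sharply around its mean $\mathbb{E}g(X^n)$.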
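The bounded-differences property of the uniform deviation can be observed on a concrete class. The sketch below takes $P$ to be the uniform distribution on $[0, 1]$ and $\mathcal{A}$ to be the class of half-lines $(-\infty, a]$, in which case $g$ is the Kolmogorov-Smirnov statistic. It then replaces one sample point at a time and records the largest observed change in $g$. The distribution, the class, and all function names are assumptions made for this illustration and are not taken from the notes.

```python
import random

def uniform_deviation(sample):
    """sup over half-lines (-inf, a] of |P_n(A) - P(A)| for P = Uniform[0, 1].

    The empirical CDF is piecewise constant, so the supremum is approached
    at a sample point or just below one.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(max((k + 1) / n - x, x - k / n) for k, x in enumerate(xs))

def max_change_after_one_swap(n, trials, seed=0):
    """Largest observed |g(x^n) - g(x^n_(i))| over random single-point swaps."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    g_x = uniform_deviation(x)
    worst = 0.0
    for _ in range(trials):
        y = list(x)
        y[rng.randrange(n)] = rng.random()   # replace one coordinate only
        worst = max(worst, abs(g_x - uniform_deviation(y)))
    return worst

if __name__ == "__main__":
    n = 50
    print("largest observed change:", max_change_after_one_swap(n, 1000))
    print("bounded-difference constant 1/n:", 1 / n)
```

The observed change should not exceed $1/n$, in agreement with the argument above. The experiment illustrates the bound but does not replace the proof.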

4.3. Uniform deviations continued. The same idea applies to arbitrary real-valued functions over $\mathsf{X}$. Let $X^n = (X_1, \ldots, X_n)$ be as in the previous example. Given any function $f : \mathsf{X} \to [0, 1]$, Hoeffding's inequality tells us that

$$\mathbb{P}\left(|P_{X^n}(f) - \mathbb{E}f(X)| \ge \varepsilon\right) \le 2e^{-2n\varepsilon^2}.$$

However, just as in the previous example, in learning theory we are primarily interested in controlling the deviations of empirical means from true means simultaneously over whole classes of functions. To that end, let $\mathcal{F}$ be such a class consisting of functions $f : \mathsf{X} \to [0, 1]$ and consider the uniform deviation

$$g(X^n) \triangleq \sup_{f \in \mathcal{F}} |P_{X^n}(f) - P(f)|.$$

An argument entirely similar to the one in the previous example (exercise: verify this!) shows that this $g$ has bounded differences with $c_1 = \ldots = c_n = 1/n$. Therefore, applying McDiarmid's inequality, we obtain

$$\mathbb{P}\left(|g(X^n) - \mathbb{E}g(X^n)| \ge \varepsilon\right) \le 2e^{-2n\varepsilon^2}.$$

We will see later that, for certain function classes $\mathcal{F}$, we will have $\mathbb{E}g(X^n) = O(1/\sqrt{n})$.

4.4. Kernel density estimation. For our final example, let $X^n = (X_1, \ldots, X_n)$ be an $n$-tuple of i.i.d. real-valued random variables whose common distribution $P$ has a probability density function (pdf) $f$, i.e.,

$$P(A) = \int_A f(x)\,dx$$

for any measurable set $A \subseteq \mathbb{R}$. We wish to estimate $f$ from the sample $X^n$. A popular method is to use a kernel estimate (the book by Devroye and Lugosi [DL01] has plenty of material on density estimation, including kernel methods, from the viewpoint of statistical learning theory). To that end, we pick a nonnegative function $K : \mathbb{R} \to \mathbb{R}$ that integrates to one, $\int K(x)\,dx = 1$ (such a function is called a kernel), as well as a positive bandwidth (or smoothing constant) $h > 0$, and form the estimate

$$\hat{f}_n(x) = \frac{1}{nh}\sum_{i=1}^n K\left(\frac{x - X_i}{h}\right).$$

It is not hard to verify (another exercise!) that $\hat{f}_n$ is a valid pdf, i.e., that it is nonnegative and integrates to one. A common way of quantifying the performance of a density estimator is to use the $L_1$ distance to the true density $f$:

$$\|\hat{f}_n - f\|_{L_1} = \int_{\mathbb{R}} \left|\hat{f}_n(x) - f(x)\right| dx.$$

Note that $\|\hat{f}_n - f\|_{L_1}$ is a random variable since it depends on the random sample $X^n$. Thus, we can write it as a function $g(X^n)$ of the sample $X^n$. Leaving aside the problem of actually bounding $\mathbb{E}g(X^n)$, we can easily establish a concentration bound for it using McDiarmid's inequality. To do

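that, we need to check that $g$ has bounded differences. Choosing $x^n$ and $x^n_{(i)}$ as before, we have

\begin{align*}
g(x^n) - g(x^n_{(i)}) &= \int_{\mathbb{R}} \left|\frac{1}{nh}\sum_{j=1}^{i-1} K\left(\frac{x - x_j}{h}\right) + \frac{1}{nh}K\left(\frac{x - x_i}{h}\right) + \frac{1}{nh}\sum_{j=i+1}^{n} K\left(\frac{x - x_j}{h}\right) - f(x)\right| dx \\
&\quad - \int_{\mathbb{R}} \left|\frac{1}{nh}\sum_{j=1}^{i-1} K\left(\frac{x - x_j}{h}\right) + \frac{1}{nh}K\left(\frac{x - x_i'}{h}\right) + \frac{1}{nh}\sum_{j=i+1}^{n} K\left(\frac{x - x_j}{h}\right) - f(x)\right| dx \\
&\le \frac{1}{nh}\int_{\mathbb{R}} \left|K\left(\frac{x - x_i}{h}\right) - K\left(\frac{x - x_i'}{h}\right)\right| dx \\
&\le \frac{2}{nh}\int_{\mathbb{R}} K\left(\frac{x}{h}\right) dx \\
&= \frac{2}{n}.
\end{align*}

Thus, we see that $g(X^n)$ has the bounded differences property with $c_1 = \ldots = c_n = 2/n$, so that

$$\mathbb{P}\left(|g(X^n) - \mathbb{E}g(X^n)| \ge \varepsilon\right) \le 2e^{-n\varepsilon^2/2}.$$

References

[BBL04] S. Boucheron, O. Bousquet, and G. Lugosi. Concentration inequalities. In O. Bousquet, U. von Luxburg, and G. Rätsch, editors, Advanced Lectures in Machine Learning. Springer.

[Che52] H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23.

[DL01] L. Devroye and G. Lugosi. Combinatorial Methods in Density Estimation. Springer.

[Hoe63] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13-30.

[McD89] C. McDiarmid. On the method of bounded differences. In Surveys in Combinatorics. Cambridge University Press.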


1 Review and Overview

1 Review and Overview CS9T/STATS3: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #6 Scribe: Jay Whag ad Patrick Cho October 0, 08 Review ad Overview Recall i the last lecture that for ay family of scalar fuctios F, we

More information

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We

More information

Lecture 9: Regression: Regressogram and Kernel Regression

Lecture 9: Regression: Regressogram and Kernel Regression STAT 425: Itroductio to Noparametric Statistics Witer 208 Lecture 9: Regressio: Regressogram ad erel Regressio Istructor: Ye-Ci Ce Referece: Capter 5 of All of oparametric statistics 9 Itroductio Let X,

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

ON LOCAL LINEAR ESTIMATION IN NONPARAMETRIC ERRORS-IN-VARIABLES MODELS 1

ON LOCAL LINEAR ESTIMATION IN NONPARAMETRIC ERRORS-IN-VARIABLES MODELS 1 Teory of Stocastic Processes Vol2 28, o3-4, 2006, pp*-* SILVELYN ZWANZIG ON LOCAL LINEAR ESTIMATION IN NONPARAMETRIC ERRORS-IN-VARIABLES MODELS Local liear metods are applied to a oparametric regressio

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Limit Theorems. Convergence in Probability. Let X be the number of heads observed in n tosses. Then, E[X] = np and Var[X] = np(1-p).

Limit Theorems. Convergence in Probability. Let X be the number of heads observed in n tosses. Then, E[X] = np and Var[X] = np(1-p). Limit Theorems Covergece i Probability Let X be the umber of heads observed i tosses. The, E[X] = p ad Var[X] = p(-p). L O This P x p NM QP P x p should be close to uity for large if our ituitio is correct.

More information

Notes 19 : Martingale CLT

Notes 19 : Martingale CLT Notes 9 : Martigale CLT Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: [Bil95, Chapter 35], [Roc, Chapter 3]. Sice we have ot ecoutered weak covergece i some time, we first recall

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Introduction to Probability. Ariel Yadin

Introduction to Probability. Ariel Yadin Itroductio to robability Ariel Yadi Lecture 2 *** Ja. 7 ***. Covergece of Radom Variables As i the case of sequeces of umbers, we would like to talk about covergece of radom variables. There are may ways

More information

1 Convergence in Probability and the Weak Law of Large Numbers

1 Convergence in Probability and the Weak Law of Large Numbers 36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec

More information

LECTURE 8: ASYMPTOTICS I

LECTURE 8: ASYMPTOTICS I LECTURE 8: ASYMPTOTICS I We are iterested i the properties of estimators as. Cosider a sequece of radom variables {, X 1}. N. M. Kiefer, Corell Uiversity, Ecoomics 60 1 Defiitio: (Weak covergece) A sequece

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

The random version of Dvoretzky s theorem in l n

The random version of Dvoretzky s theorem in l n The radom versio of Dvoretzky s theorem i l Gideo Schechtma Abstract We show that with high probability a sectio of the l ball of dimesio k cε log c > 0 a uiversal costat) is ε close to a multiple of the

More information

ALLOCATING SAMPLE TO STRATA PROPORTIONAL TO AGGREGATE MEASURE OF SIZE WITH BOTH UPPER AND LOWER BOUNDS ON THE NUMBER OF UNITS IN EACH STRATUM

ALLOCATING SAMPLE TO STRATA PROPORTIONAL TO AGGREGATE MEASURE OF SIZE WITH BOTH UPPER AND LOWER BOUNDS ON THE NUMBER OF UNITS IN EACH STRATUM ALLOCATING SAPLE TO STRATA PROPORTIONAL TO AGGREGATE EASURE OF SIZE WIT BOT UPPER AND LOWER BOUNDS ON TE NUBER OF UNITS IN EAC STRATU Lawrece R. Erst ad Cristoper J. Guciardo Erst_L@bls.gov, Guciardo_C@bls.gov

More information

Maximum Likelihood Estimation and Complexity Regularization

Maximum Likelihood Estimation and Complexity Regularization ECE90 Sprig 004 Statistical Regularizatio ad Learig Theory Lecture: 4 Maximum Likelihood Estimatio ad Complexity Regularizatio Lecturer: Rob Nowak Scribe: Pam Limpiti Review : Maximum Likelihood Estimatio

More information

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where

More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

Elements of Statistical Methods Lots of Data or Large Samples (Ch 8)

Elements of Statistical Methods Lots of Data or Large Samples (Ch 8) Elemets of Statistical Methods Lots of Data or Large Samples (Ch 8) Fritz Scholz Sprig Quarter 2010 February 26, 2010 x ad X We itroduced the sample mea x as the average of the observed sample values x

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1 8. The cetral limit theorems 8.1. The cetral limit theorem for i.i.d. sequeces. ecall that C ( is N -separatig. Theorem 8.1. Let X 1, X,... be i.i.d. radom variables with EX 1 = ad EX 1 = σ (,. Suppose

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Mathematics 170B Selected HW Solutions.

Mathematics 170B Selected HW Solutions. Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli

More information

Lecture 8: Convergence of transformations and law of large numbers

Lecture 8: Convergence of transformations and law of large numbers Lecture 8: Covergece of trasformatios ad law of large umbers Trasformatio ad covergece Trasformatio is a importat tool i statistics. If X coverges to X i some sese, we ofte eed to check whether g(x ) coverges

More information

HOMEWORK I: PREREQUISITES FROM MATH 727

HOMEWORK I: PREREQUISITES FROM MATH 727 HOMEWORK I: PREREQUISITES FROM MATH 727 Questio. Let X, X 2,... be idepedet expoetial radom variables with mea µ. (a) Show that for Z +, we have EX µ!. (b) Show that almost surely, X + + X (c) Fid the

More information

Asymptotic distribution of products of sums of independent random variables

Asymptotic distribution of products of sums of independent random variables Proc. Idia Acad. Sci. Math. Sci. Vol. 3, No., May 03, pp. 83 9. c Idia Academy of Scieces Asymptotic distributio of products of sums of idepedet radom variables YANLING WANG, SUXIA YAO ad HONGXIA DU ollege

More information

Lecture 2: Poisson Sta*s*cs Probability Density Func*ons Expecta*on and Variance Es*mators

Lecture 2: Poisson Sta*s*cs Probability Density Func*ons Expecta*on and Variance Es*mators Lecture 2: Poisso Sta*s*cs Probability Desity Fuc*os Expecta*o ad Variace Es*mators Biomial Distribu*o: P (k successes i attempts) =! k!( k)! p k s( p s ) k prob of each success Poisso Distributio Note

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 19

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 19 CS 70 Discrete Mathematics ad Probability Theory Sprig 2016 Rao ad Walrad Note 19 Some Importat Distributios Recall our basic probabilistic experimet of tossig a biased coi times. This is a very simple

More information

Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate

Supplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We

More information

Fall 2013 MTH431/531 Real analysis Section Notes

Fall 2013 MTH431/531 Real analysis Section Notes Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

Statistical Theory; Why is the Gaussian Distribution so popular?

Statistical Theory; Why is the Gaussian Distribution so popular? Statistical Theory; Why is the Gaussia Distributio so popular? Rob Nicholls MRC LMB Statistics Course 2014 Cotets Cotiuous Radom Variables Expectatio ad Variace Momets The Law of Large Numbers (LLN) The

More information

5. Likelihood Ratio Tests

5. Likelihood Ratio Tests 1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,

More information

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n,

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n, CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 9 Variace Questio: At each time step, I flip a fair coi. If it comes up Heads, I walk oe step to the right; if it comes up Tails, I walk oe

More information

Empirical Process Theory and Oracle Inequalities

Empirical Process Theory and Oracle Inequalities Stat 928: Statistical Learig Theory Lecture: 10 Empirical Process Theory ad Oracle Iequalities Istructor: Sham Kakade 1 Risk vs Risk See Lecture 0 for a discussio o termiology. 2 The Uio Boud / Boferoi

More information

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency Math 152. Rumbos Fall 2009 1 Solutios to Review Problems for Exam #2 1. I the book Experimetatio ad Measuremet, by W. J. Youde ad published by the by the Natioal Sciece Teachers Associatio i 1962, the

More information

1 Review and Overview

1 Review and Overview DRAFT a fial versio will be posted shortly CS229T/STATS231: Statistical Learig Theory Lecturer: Tegyu Ma Lecture #3 Scribe: Migda Qiao October 1, 2013 1 Review ad Overview I the first half of this course,

More information

Lecture 7 Testing Nonlinear Inequality Restrictions 1

Lecture 7 Testing Nonlinear Inequality Restrictions 1 Eco 75 Lecture 7 Testig Noliear Iequality Restrictios I Lecture 6, we discussed te testig problems were te ull ypotesis is de ed by oliear equality restrictios: H : ( ) = versus H : ( ) 6= : () We sowed

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Lecture 2: April 3, 2013

Lecture 2: April 3, 2013 TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 2: April 3, 203 Scribe: Shubhedu Trivedi Coi tosses cotiued We retur to the coi tossig example from the last lecture agai: Example. Give,

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

Estimation of the Mean and the ACVF

Estimation of the Mean and the ACVF Chapter 5 Estimatio of the Mea ad the ACVF A statioary process {X t } is characterized by its mea ad its autocovariace fuctio γ ), ad so by the autocorrelatio fuctio ρ ) I this chapter we preset the estimators

More information