DISCRETE PREDICTION PROBLEMS: RANDOMIZED PREDICTION


1 DISCRETE PREDICTION PROBLEMS: RANDOMIZED PREDICTION
Csaba Szepesvári, University of Alberta, CMPUT. UofA, October 2006

2 OUTLINE
1 DISCRETE PREDICTION PROBLEMS
2 RANDOMIZED FORECASTERS
3 WEIGHTED AVERAGE FORECASTER
4 FOLLOW THE PERTURBED LEADER
5 BIBLIOGRAPHY

3 BINARY PREDICTION PROBLEMS
Binary prediction problem: $D = Y = \{0, 1\}$, $\ell(p, y) = \mathbb{I}\{p \neq y\}$.
Loss of forecaster: $\hat{L}_n = \sum_{t=1}^n \ell(\hat{p}_t, y_t)$
Loss of expert $i$: $L_{i,n} = \sum_{t=1}^n \ell(f_{it}, y_t)$
Loss of best expert: $L_n^* = \min_i L_{i,n}$
Goal: minimize the regret, i.e., $R_n = \hat{L}_n - L_n^*$.
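To make these quantities concrete, here is a minimal Python sketch (illustrative, not part of the lecture; the forecaster shown simply follows expert 0) computing $\hat{L}_n$, the $L_{i,n}$ and the regret on random data:

```python
# Illustrative only: losses and regret in the binary prediction problem.
import numpy as np

rng = np.random.default_rng(0)
n, N = 100, 3
y = rng.integers(0, 2, size=n)          # outcomes y_t in {0, 1}
f = rng.integers(0, 2, size=(N, n))     # expert advice f_{it}
p_hat = f[0]                            # a deterministic forecaster: follow expert 0

L_hat = int(np.sum(p_hat != y))         # forecaster's loss, hat L_n
L_exp = np.sum(f != y, axis=1)          # expert losses L_{i,n}
print(L_hat, L_exp, L_hat - L_exp.min())  # regret R_n = hat L_n - min_i L_{i,n}
```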

4 BINARY PREDICTION PROBLEMS/2
Proposition: Consider binary prediction problems. For any deterministic forecaster there exists $y_{1:n}$ s.t. $\hat{L}_n(y_{1:n}) = n$, where $\hat{L}_n(y_{1:n})$ is the forecaster's loss on $y_{1:n}$.
Proof: $\hat{p}_t$ is based on past information only. Hence, for every $t$, $y_t$ can be selected (namely $y_t = 1 - \hat{p}_t$) to make $\ell(\hat{p}_t, y_t) = 1$. Q.e.d.
Corollary: There is no deterministic forecaster whose regret is sublinear for every binary prediction problem and every set of experts.
Proof: Let $N = 2$, $f_{1t} \equiv 0$, $f_{2t} \equiv 1$. Then for all $y_{1:n}$, $L_n^*(y_{1:n}) \le n/2$. Pick some $y_{1:n}$ that forces $\hat{L}_n(y_{1:n}) = n$. Hence $\hat{L}_n(y_{1:n}) - L_n^*(y_{1:n}) \ge n - n/2 = n/2$.
Idea: Randomize the forecaster, as this falsifies the above proposition (prevents the worst case)!
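The proof's adversary is one line of code: it observes the deterministic prediction and outputs the opposite bit. A sketch, with a hypothetical majority-vote forecaster standing in for an arbitrary deterministic one:

```python
# Illustrative: the proof's adversary forces any deterministic forecaster
# to suffer loss 1 in every round by playing y_t = 1 - \hat p_t.
def adversarial_outcomes(forecaster, n):
    """forecaster maps the list of past outcomes to a prediction in {0, 1}."""
    ys = []
    for _ in range(n):
        p_t = forecaster(ys)    # the prediction depends only on the past
        ys.append(1 - p_t)      # the outcome is chosen to disagree with it
    return ys

majority = lambda ys: int(sum(ys) > len(ys) / 2)   # hypothetical example forecaster
ys = adversarial_outcomes(majority, 10)
loss = sum(majority(ys[:t]) != y_t for t, y_t in enumerate(ys))
print(loss)   # 10 = n: the forecaster loses every round
```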

5 RANDOMIZED FORECASTERS
Convention: identify the decision space with $\{1, 2, \dots, N\}$ and write $\ell : \{1,\dots,N\} \times Y \to \mathbb{R}$, $\ell(i, y)$.
Note: since $\ell$ and $Y$ are not restricted further, no generality is lost.
Random choice: $I_t \in \{1,\dots,N\}$ is a random variable. The forecaster computes $I_t$ based on past information (past decisions, past outcomes) and $U_t \sim U[0,1)$.
Notation: $p_{it} \stackrel{\mathrm{def}}{=} P(I_t = i \mid I_{1:t-1}, Y_{1:t-1})$.
Outcomes can also be randomized, but they do not depend on the past actions $I_{1:t-1}$: an oblivious or non-reactive opponent/environment (stock market, weather, etc.).
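One concrete way the forecaster can turn $U_t$ and the distribution $p_t$ into the random choice $I_t$ is inverse-CDF sampling; a small sketch (the function name is mine):

```python
# Illustrative: drawing I_t with P(I_t = i) = p_{it} from a single U_t ~ U[0,1)
# via inverse-CDF sampling.
import numpy as np

def sample_action(p_t, u_t):
    """Return the index i such that u_t falls into the i-th CDF segment of p_t."""
    return int(np.searchsorted(np.cumsum(p_t), u_t, side="right"))

rng = np.random.default_rng(1)
p_t = np.array([0.2, 0.5, 0.3])
print(sample_action(p_t, rng.random()))   # some i in {0, 1, 2}
```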

6 WEIGHTED AVERAGE FORECASTER [LITTLESTONE AND WARMUTH, 1994]
Previous result on EWA:
THEOREM (LOSS BOUND FOR THE EWA FORECASTER). Assume that $D$ is a convex subset of some vector space. Let $\ell : D \times Y \to [0, 1]$ be convex in its first argument. Then for EWA ($\hat{p}_t = \frac{\sum_i w_{i,t-1} f_{it}}{\sum_j w_{j,t-1}}$, $w_{i,t-1} = e^{-\eta L_{i,t-1}}$) it holds that
$\hat{L}_n - L_n^* \le \frac{\ln N}{\eta} + \frac{\eta n}{8}$.
With $\eta = \sqrt{\frac{8 \ln N}{n}}$, $\hat{L}_n - L_n^* \le \sqrt{(n/2) \ln N}$.
To apply this here, let $f_{it} = e_i$ (the $i$th unit vector), $\hat{p}_{it} = \frac{w_{i,t-1}}{\sum_{j=1}^N w_{j,t-1}}$, and $\bar{\ell}(p, y) \stackrel{\mathrm{def}}{=} \sum_{i=1}^N p_i \ell(i, y)$; $\bar{\ell}$ is convex (linear) in $p$, and $D = \Delta_N = \{p \in \mathbb{R}^N \mid p_i \ge 0, \sum_j p_j = 1\} \subset \mathbb{R}^N$ is convex.

7 BOUND ON THE PSEUDO-EXPECTED REGRET
EWA: $\hat{p}_t = \frac{\sum_i w_{i,t-1} e_i}{\sum_j w_{j,t-1}}$, i.e., $p_{it} = \frac{w_{i,t-1}}{\sum_j w_{j,t-1}}$, with $w_{i,t-1} = e^{-\eta L_{i,t-1}}$.
THEOREM (LOSS BOUND FOR THE EWA FORECASTER: RANDOMIZED PREDICTIONS). Let $\ell : \{1,\dots,N\} \times Y \to [0, 1]$. Then for EWA it holds that
$\bar{L}_n - L_n^* \le \frac{\ln N}{\eta} + \frac{\eta n}{8}$.
With $\eta = \sqrt{\frac{8 \ln N}{n}}$, $\bar{L}_n - L_n^* \le \sqrt{(n/2) \ln N}$.
Here
$\bar{L}_n = \sum_{t=1}^n \bar{\ell}(\hat{p}_t, Y_t) = \sum_{t=1}^n \sum_{i=1}^N p_{it} \ell(i, Y_t)$.
Note: $\bar{\ell}(\hat{p}_t, Y_t) = E[\ell(I_t, Y_t) \mid Y_{1:t}, I_{1:t-1}]$ $(= E_t[\ell(I_t, Y_t)])$.
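An illustrative implementation of this randomized EWA forecaster (assuming full information, i.e., the whole loss column is revealed after each round):

```python
# Illustrative implementation of the randomized EWA forecaster of the theorem.
import numpy as np

def ewa_play(loss_matrix, eta, rng):
    """loss_matrix[i, t] = l(i, y_t) in [0, 1]; returns the realized losses l(I_t, Y_t)."""
    N, n = loss_matrix.shape
    L = np.zeros(N)                    # cumulative expert losses L_{i,t-1}
    realized = np.empty(n)
    for t in range(n):
        w = np.exp(-eta * L)           # w_{i,t-1} = exp(-eta L_{i,t-1})
        p = w / w.sum()                # p_{it}
        I_t = rng.choice(N, p=p)       # randomized prediction
        realized[t] = loss_matrix[I_t, t]
        L += loss_matrix[:, t]         # full-information update
    return realized

rng = np.random.default_rng(0)
losses = rng.random((5, 1000))
eta = np.sqrt(8 * np.log(5) / 1000)    # the tuning from the theorem
print(ewa_play(losses, eta, rng).sum(), losses.sum(axis=1).min())
```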

8 BOUND ON THE ACTUAL REGRET?
What about $\hat{L}_n - L_n^*$? Now $\hat{L}_n = \sum_{t=1}^n \ell(I_t, Y_t)$, while $\bar{L}_n = \sum_{t=1}^n \bar{\ell}(\hat{p}_t, Y_t)$, and $\bar{\ell}(\hat{p}_t, Y_t)$ is the (conditional) expected value of $\ell(I_t, Y_t)$.
Sums of independent random variables are $O(\sqrt{n})$-close to their expectations!
Hoeffding: if $X_1, \dots, X_n$ are independent with $a_t \le X_t \le b_t$, then for $S_n = \sum_{t=1}^n (X_t - E[X_t])$,
$P(S_n > \epsilon) \le \exp\left(-\frac{2\epsilon^2}{\sum_t (b_t - a_t)^2}\right)$, $P(S_n < -\epsilon) \le \exp\left(-\frac{2\epsilon^2}{\sum_t (b_t - a_t)^2}\right)$.
When $b_t - a_t \le 1$: with prob. $1 - \delta$, $\sum_{t=1}^n (X_t - E[X_t]) \le \sqrt{(n/2) \ln(1/\delta)}$.
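For a sense of scale, the deviation term evaluates as follows (illustrative numbers):

```python
# Illustrative numbers for the Hoeffding deviation term with b_t - a_t <= 1.
import math

n, delta = 1000, 0.05
print(math.sqrt(n / 2 * math.log(1 / delta)))   # ~38.7: with prob. 0.95 the sum of
                                                # the centered terms stays below this
```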

9 WEIGHTED AVERAGE FORECASTER: ACTUAL REGRET
$\hat{L}_n - \bar{L}_n = \sum_{t=1}^n (\ell(I_t, Y_t) - \bar{\ell}(\hat{p}_t, Y_t))$; bound? $E[\ell(I_t, Y_t)] \ne \bar{\ell}(\hat{p}_t, Y_t)$ in general, but
$E[\ell(I_t, Y_t) \mid Y_{1:t}, I_{1:t-1}] = \sum_{i=1}^N E[\ell(I_t, Y_t) \mid Y_{1:t}, I_{1:t-1}, I_t = i]\, P(I_t = i \mid Y_{1:t}, I_{1:t-1}) = \sum_{i=1}^N \ell(i, Y_t)\, P(I_t = i \mid Y_{1:t-1}, I_{1:t-1}) = \sum_{i=1}^N \ell(i, Y_t)\, p_{it} = \bar{\ell}(\hat{p}_t, Y_t)$,
where the second equality uses that $I_t$ and $Y_t$ are independent given the past.
Boundedness: $\ell(I_t, Y_t) - \bar{\ell}(\hat{p}_t, Y_t) \in [-\bar{\ell}(\hat{p}_t, Y_t),\, 1 - \bar{\ell}(\hat{p}_t, Y_t)]$, an interval of length 1.

10 HOEFFDING-AZUMA INEQUALITY
DEFINITION (MARTINGALE DIFFERENCE SERIES). The sequence of random variables $V_1, V_2, \dots$ is a martingale difference series w.r.t. $X_1, X_2, \dots$ if for all $t \in \mathbb{N}$, $V_t$ is a function of $X_1, \dots, X_t$ and $E[V_t \mid X_{1:t-1}] = 0$ w.p. 1.
THEOREM (HOEFFDING-AZUMA). Assume that $V_1, V_2, \dots$ is a martingale difference series w.r.t. $X_1, X_2, \dots$ such that $V_t \in [A_t, A_t + c_t]$, where $c_t$ is a (non-random) positive constant and $A_t$ is a function of $X_{1:t-1}$. Then for $S_n = \sum_{t=1}^n V_t$,
$P(S_n > \epsilon) \le \exp\left(-\frac{2\epsilon^2}{\sum_t c_t^2}\right)$, $P(S_n < -\epsilon) \le \exp\left(-\frac{2\epsilon^2}{\sum_t c_t^2}\right)$.
COROLLARY. If $c_t \le 1$ then w.p. $1 - \delta$, $S_n \le \sqrt{(n/2) \ln(1/\delta)}$.

11 BOUND ON THE RANDOM REGRET
By applying the Hoeffding-Azuma (H-A) inequality to $V_t = \ell(I_t, Y_t) - \bar{\ell}(\hat{p}_t, Y_t)$, $X_1 = (I_1, Y_1, Y_2)$, $X_t = (I_t, Y_{t+1})$ $(t > 1)$, we get:
THEOREM (LOSS BOUND FOR THE EWA FORECASTER: RANDOM REGRET). Let $\ell : \{1,\dots,N\} \times Y \to [0, 1]$. Then for EWA, with probability at least $1 - \delta$,
$\hat{L}_n - L_n^* \le \frac{\ln N}{\eta} + \frac{\eta n}{8} + \sqrt{\frac{n}{2} \ln\frac{1}{\delta}}$.
With $\eta = \sqrt{\frac{8 \ln N}{n}}$, $\hat{L}_n - L_n^* \le \sqrt{\frac{n}{2} \ln N} + \sqrt{\frac{n}{2} \ln\frac{1}{\delta}}$.

12 BERNSTEIN'S INEQUALITY
THEOREM (BERNSTEIN'S INEQUALITY FOR MARTINGALE DIFFERENCES). Assume that $V_1, V_2, \dots$ is a martingale difference series w.r.t. $X_1, X_2, \dots$ such that $|V_t| \le K$. Let
$\Sigma_n^2 = \sum_{t=1}^n E[V_t^2 \mid X_{1:t-1}]$, $S_n = \sum_{t=1}^n V_t$.
Then for all $\Sigma, \delta > 0$,
$P\left(S_n \le \sqrt{2\Sigma^2 \log\tfrac{1}{\delta}} + \tfrac{\sqrt{2}}{3} K \log\tfrac{1}{\delta} \;\text{ or }\; \Sigma_n^2 \ge \Sigma^2\right) \ge 1 - \delta$.

13 SMALL LOSSES
Previous small-loss bound: $\bar{L}_n - L_n^* \le \sqrt{2 L_n^* \ln N} + \ln N$.
Random fluctuations: adding $\sqrt{(n/2) \ln(1/\delta)}$ is too big!
Bernstein's inequality uses the predictable variance to bound the fluctuations.
Bound on the predictable variance:
$E_t[(\ell(I_t, Y_t) - \bar{\ell}(\hat{p}_t, Y_t))^2] = E_t[\ell(I_t, Y_t)^2] - \bar{\ell}^2(\hat{p}_t, Y_t) \le E_t[\ell(I_t, Y_t)^2] \le E_t[\ell(I_t, Y_t)] = \bar{\ell}(\hat{p}_t, Y_t)$,
so $\Sigma_n^2 \le \bar{L}_n$, and the effect of the random fluctuations is comparable with the bound on the expected regret:
$\sum_{t=1}^n \left( \ell(I_t, Y_t) - \bar{\ell}(\hat{p}_t, Y_t) \right) \lesssim \sqrt{2 L_n^* \ln(1/\delta)} + \ln(1/\delta)$.
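A quick numeric comparison (illustrative values) of the two fluctuation terms when $L_n^*$ is much smaller than $n$:

```python
# Illustrative: when L*_n << n, the Bernstein-style fluctuation term based on the
# predictable variance beats Hoeffding's sqrt((n/2) ln(1/delta)) by a wide margin.
import math

n, L_star, delta = 10_000, 50.0, 0.05
print(math.sqrt(2 * L_star * math.log(1 / delta)))   # ~17.3 (variance-based)
print(math.sqrt(n / 2 * math.log(1 / delta)))        # ~122.4 (Hoeffding)
```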

14 FOLLOW THE LEADER
Does the unperturbed leader ($I_t = \mathrm{argmin}_i\, L_{i,t-1}$) work? Take $N = 2$ and the loss sequences
$\ell(1, y_t): 1/2, 0, 1, 0, 1, 0, \dots$
$\ell(2, y_t): 1/2, 1, 0, 1, 0, 1, \dots$
Cumulative losses:
$L_{1,t}: 0.5, 0.5, 1.5, 1.5, 2.5, \dots$
$L_{2,t}: 0.5, 1.5, 1.5, 2.5, 2.5, \dots$
In every round $t \ge 2$ the leader (with ties broken adversarially) is an expert that incurs loss 1 in round $t$, so
$\hat{L}_n = n - 1/2$, whilst $L_{i,n} \approx n/2$, $i = 1, 2$, hence $\hat{L}_n - L_n^* \ge n/2 - 1.5$.
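The failure is easy to reproduce; a simulation sketch (illustrative, with tie-breaking chosen adversarially, as the argument requires):

```python
# Illustrative simulation of FTL on the slide's loss sequence, breaking ties
# adversarially: FTL's loss approaches n while the best expert stays near n/2.
import numpy as np

n, N = 20, 2
losses = np.zeros((N, n))
losses[:, 0] = 0.5
losses[0, 1:] = ([0, 1] * n)[: n - 1]    # 0, 1, 0, 1, ... for expert 1
losses[1, 1:] = 1 - losses[0, 1:]        # 1, 0, 1, 0, ... for expert 2

L = np.zeros(N)
ftl_loss = 0.0
for t in range(n):
    leaders = np.flatnonzero(L == L.min())
    I_t = max(leaders, key=lambda i: losses[i, t])   # adversarial tie-breaking
    ftl_loss += losses[I_t, t]
    L += losses[:, t]
print(ftl_loss, L.min())   # 19.5 = n - 1/2 versus 9.5 ~ n/2
```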

15 FOLLOW THE PERTURBED LEADER
[Hannan, 1957] Follow the perturbed leader (randomized fictitious play):
$I_t = \mathrm{argmin}_{i=1,\dots,N} \left( L_{i,t-1} + Z_{it} \right)$, $Z_t \sim f(\cdot)$, i.i.d.
Goal: develop a bound on $\bar{L}_n$!
Relate to BEH, the hindsight forecaster that also sees the current loss:
$\hat{I}_t = \mathrm{argmin}_{i} \left( L_{i,t} + Z_{i,t} \right)$.
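A direct implementation sketch of FPL (illustrative; per-coordinate Laplace noise with scale $1/\eta$ is exactly the density $(\eta/2)^N e^{-\eta\|z\|_1}$ chosen on the later slides):

```python
# Illustrative FPL implementation; Laplace(scale 1/eta) per coordinate matches
# the density (eta/2)^N exp(-eta ||z||_1) used later in the analysis.
import numpy as np

def fpl_play(loss_matrix, eta, rng):
    """loss_matrix[i, t] = l(i, y_t); returns the realized losses l(I_t, y_t)."""
    N, n = loss_matrix.shape
    L = np.zeros(N)                                   # L_{i,t-1}
    realized = np.empty(n)
    for t in range(n):
        Z_t = rng.laplace(scale=1.0 / eta, size=N)    # fresh i.i.d. perturbation
        I_t = int(np.argmin(L + Z_t))                 # follow the perturbed leader
        realized[t] = loss_matrix[I_t, t]
        L += loss_matrix[:, t]
    return realized

rng = np.random.default_rng(0)
losses = rng.random((4, 500))
print(fpl_play(losses, eta=0.1, rng=rng).sum(), losses.sum(axis=1).min())
```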

16 FPL: ANALYSIS, PLAN
1. $\hat{L}_n$ and $\hat{L}_n^{BEH}$ are close in expectation: $E\left[\sum_t \ell(I_t, y_t)\right] \le e^{\eta}\, E\left[\sum_t \ell(\hat{I}_t, y_t)\right]$ (Step 1 below).
2. $\hat{L}_n^{BEH}$ and $L_n^*$ are close: $\sum_t \ell(\hat{I}_t, y_t) \le \sum_t \ell(\hat{I}_n, y_t) + \mathrm{Bound} \le L_n^* + \mathrm{Bound}$.
3. Estimate $E[\mathrm{Bound}]$.

17 STEP 1: $\hat{L}^{BEH}$-$\hat{L}$ BOUND
Goal: compare $E\left[\sum_t \ell(I_t, y_t)\right]$ with $E\left[\sum_t \ell(\hat{I}_t, y_t)\right]$.
$E[\ell(I_t, y_t)] = E\left[\ell(\mathrm{argmin}_i (L_{i,t-1} + Z_{it}), y_t)\right] = E[F_t(Z_t)] = \int F_t(z) f(z)\,dz$, where $F_t(z) = \ell(\mathrm{argmin}_i (L_{i,t-1} + z_i), y_t)$.
$E[\ell(\hat{I}_t, y_t)] = E\left[\ell(\mathrm{argmin}_i (L_{i,t} + Z_{it}), y_t)\right] = E\left[\ell(\mathrm{argmin}_i (L_{i,t-1} + \ell_{it} + Z_{it}), y_t)\right] = E[F_t(Z_t + \ell_t)] = \int F_t(z + \ell_t) f(z)\,dz$,
where $\ell_{it} = \ell(i, y_t)$ and $\ell_t = (\ell(1, y_t), \dots, \ell(N, y_t))$.

18 STEP 1: $\hat{L}^{BEH}$-$\hat{L}$ BOUND/2
$E[\ell(I_t, y_t)] = \int F_t(z) f(z)\,dz$,
$E[\ell(\hat{I}_t, y_t)] = \int F_t(z + \ell_t) f(z)\,dz = \int F_t(z) f(z - \ell_t)\,dz$.
Hence
$E[\ell(I_t, y_t)] = \int F_t(z) f(z)\,dz \le \sup_{z,t} \left( \frac{f(z)}{f(z - \ell_t)} \right) \int F_t(z) f(z - \ell_t)\,dz = \sup_{z,t} \left( \frac{f(z)}{f(z - \ell_t)} \right) E[\ell(\hat{I}_t, y_t)]$.
Choose e.g. $f(z) = (\eta/2)^N e^{-\eta \|z\|_1}$; then
$\frac{f(z)}{f(z - \ell_t)} = e^{-\eta (\|z\|_1 - \|z - \ell_t\|_1)} \le e^{\eta \|\ell_t\|_1} \le e^{\eta}$,
provided that $\|\ell_t\|_1 \le 1$: TODO!

19 STEP 2: $\hat{L}^{BEH}$-$L^*$ BOUND
$\hat{L}_n^{BEH} = \sum_t \ell(\hat{I}_t, y_t)$, $\hat{I}_t = \mathrm{argmin}_i \left( \left(\sum_{s=1}^t \ell(i, y_s)\right) + Z_{it} \right)$.
Plan:
1. Bound $\hat{L}_n^{BEH} = \sum_t \ell(\hat{I}_t, y_t)$ by $\sum_t \ell(\hat{I}_n, y_t)$.
2. Bound $\sum_t \ell(\hat{I}_n, y_t)$ by $L_n^*$.
In fact, for Step 2:
$\sum_t \ell(\hat{I}_n, y_t) + Z_{\hat{I}_n, n} = \min_i \left( L_{i,n} + Z_{i,n} \right) \le \min_i \left( L_{i,n} + \max_j Z_{j,n} \right) = L_n^* + \max_j Z_{j,n}$.
Here $L_n^*$ has $n$ terms, so it overgrows $\max_j Z_{j,n}$!

20 STEP 2.1: $\hat{L}^{BEH}$-$L^*$ BOUND
We know: for $L_t(p) = \sum_{s=1}^t \ell_s(p)$ and $p_t^* = \mathrm{argmin}_p L_t(p)$,
$\sum_{t=1}^n \ell_t(p_t^*) \le \sum_{t=1}^n \ell_t(p_n^*)$ (the be-the-leader lemma). Reuse?
$\hat{I}_t = \mathrm{argmin}_i \left\{ \left(\sum_{s=1}^t \ell(i, y_s)\right) + Z_{it} \right\}$. Rewrite as the minimizer of a sum of losses:
$\hat{I}_t = \mathrm{argmin}_i \sum_{s=1}^t \left( \ell(i, y_s) + Z_{is} - Z_{i,s-1} \right) =: \mathrm{argmin}_i \sum_{s=1}^t \hat{\ell}_s(i)$, where $Z_{i0} = 0$ and
$\hat{\ell}_s(i) \stackrel{\mathrm{def}}{=} \ell(i, y_s) + Z_{is} - Z_{i,s-1}$.
Reuse: $\sum_t \hat{\ell}_t(\hat{I}_t) \le \sum_t \hat{\ell}_t(\hat{I}_n)$.

21 STEP 2: $\hat{L}^{BEH}$-$L^*$ BOUND/3
$\sum_t \hat{\ell}_t(\hat{I}_t) \le \sum_t \hat{\ell}_t(\hat{I}_n)$;
$\sum_t \hat{\ell}_t(\hat{I}_t) = \sum_t \left( \ell(\hat{I}_t, y_t) + Z_{\hat{I}_t,t} - Z_{\hat{I}_t,t-1} \right) = \hat{L}_n^{BEH} + \sum_t \left( Z_{\hat{I}_t,t} - Z_{\hat{I}_t,t-1} \right)$;
$\sum_t \hat{\ell}_t(\hat{I}_n) = \sum_t \ell(\hat{I}_n, y_t) + Z_{\hat{I}_n,n} \le L_n^* + \max_j Z_{j,n}$ (see above).
Hence
$\hat{L}_n^{BEH} + \sum_t \left( Z_{\hat{I}_t,t} - Z_{\hat{I}_t,t-1} \right) \le L_n^* + \max_i Z_{i,n}$, so
$\hat{L}_n^{BEH} \le L_n^* + \max_i Z_{i,n} + \sum_t \max_i \left( -Z_{i,t} + Z_{i,t-1} \right)$.

22 STEP 3: TAKE EXPECTATIONS
$\hat{L}_n^{BEH} = \sum_t \ell(\hat{I}_t, y_t) \le L_n^* + \max_i Z_{i,n} + \sum_t \max_i (Z_{i,t-1} - Z_{i,t})$ (*)
Plan: take expectations. Problem: the $E[\max_i (Z_{i,t-1} - Z_{i,t})]$ terms are hard to control! Plan: get rid of them!
Idea: let $Z'_t = Z'_{t-1}$ for $t \ge 2$ (one shared perturbation) with $Z'_1 \sim f(\cdot)$. Then for $\hat{I}'_t = \mathrm{argmin}_i (L_{it} + Z'_{it})$ we have $E[\ell(\hat{I}'_t, y_t)] = E[\ell(\hat{I}_t, y_t)]$, and (*) still applies to $Z'_t$ and $\hat{I}'_t$! Since the summed terms now vanish for $t \ge 2$,
$E\left[\sum_t \ell(\hat{I}_t, y_t)\right] = E\left[\sum_t \ell(\hat{I}'_t, y_t)\right] \le L_n^* + E\left[\max_i Z_i\right] + E\left[\max_i (-Z_i)\right]$.

23 SUMMARY
Assuming that $\|\ell_t\|_1 \le 1$ and $Z_t \sim f(z) = (\eta/2)^N e^{-\eta \|z\|_1}$:
$E\left[\sum_t \ell(I_t, y_t)\right] \le \sup_{z,t} \left( \frac{f(z)}{f(z - \ell_t)} \right) E\left[\sum_t \ell(\hat{I}_t, y_t)\right] \le e^{\eta}\, E\left[\sum_t \ell(\hat{I}_t, y_t)\right]$,
$E\left[\sum_t \ell(\hat{I}_t, y_t)\right] \le L_n^* + E\left[\max_i Z_i\right] + E\left[\max_i (-Z_i)\right]$.
Hence
$E\left[\sum_t \ell(I_t, y_t)\right] \le e^{\eta} \left( L_n^* + E\left[\max_i Z_i\right] + E\left[\max_i (-Z_i)\right] \right)$.
Outstanding issues:
- Show that we may assume that $\|\ell_t\|_1 \le 1$.
- Estimate $E[\max_i Z_i]$ and $E[\max_i (-Z_i)]$.
Note: $Z$ and $-Z$ are identically distributed, hence $E[\max_i Z_i] = E[\max_i (-Z_i)]$.

24 ESTIMATE OF $E[\max_i Z_i]$
$E[\max_i Z_{i1}] \le E[\max_i |Z_{i1}|] = \int_0^\infty P(\max_i |Z_{i1}| > u)\,du$ (integrate the tail),
$P(\max_i |Z_{i1}| > u) \le N\, P(|Z_{11}| > u) \le N e^{-\eta u}$ (union bound),
$\int_v^\infty e^{-\eta u}\,du = e^{-\eta v}/\eta$.
Hence
$\int_0^\infty P(\max_i |Z_{i1}| > u)\,du \le v + \int_v^\infty N e^{-\eta u}\,du = v + \frac{N}{\eta} e^{-\eta v}$.
Choose $v = \ln(N)/\eta$ to get
$E\left[\max_i |Z_{i1}|\right] \le (1 + \ln N)/\eta$
...and $\bar{L}_n \le e^{\eta} \left( L_n^* + \frac{2(1 + \ln N)}{\eta} \right)$.
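A Monte Carlo sanity check of this estimate (illustrative):

```python
# Illustrative Monte Carlo check of E[max_i |Z_i|] <= (1 + ln N)/eta for
# i.i.d. Laplace coordinates with density (eta/2) exp(-eta |z|).
import numpy as np

rng = np.random.default_rng(0)
N, eta, reps = 50, 0.5, 100_000
Z = rng.laplace(scale=1.0 / eta, size=(reps, N))
print(np.abs(Z).max(axis=1).mean(), (1 + np.log(N)) / eta)   # ~9.0 <= ~9.8
```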

25 CAN WE ASSUME THAT $\|\ell_t\|_1 \le 1$?
In general: NO.
Idea: $\|\ell_t\|_1$ small ⇔ sparse losses (many zeroes). Sparsify the losses!

26 SPARSIFYING THE LOSSES
Transform each round into $N$ rounds: round $\#t$ with losses $(\ell_{1t}, \ell_{2t}, \dots, \ell_{Nt})$ becomes
round $\#N(t-1)+1$: $(\ell_{1t}, 0, \dots, 0)$
round $\#N(t-1)+2$: $(0, \ell_{2t}, \dots, 0)$
...
round $\#Nt$: $(0, \dots, 0, \ell_{Nt})$.
We have $\|\ell_s^{new}\|_1 \le 1$, since $0 \le \ell_{it} \le 1$.
$p^{orig}_{it}$, $p^{new}_{is}$: action probabilities; $T_t = N(t-1) + 1$.
Synchronicity of losses: $L^{orig}_{i,t-1} = L^{new}_{i,T_t - 1}$, hence $p^{orig}_{it} = p^{new}_{i,T_t}$.
$\ell_{1t} \ge 0$ ⇒ the first action's probability decreases from $T_t$ to $T_t + 1$: $p^{new}_{1,T_t+1} \le p^{new}_{1,T_t}$, while the others increase. Repeat for $T_t + 2, \dots$; so when action $i$'s loss is charged, its probability is at least $p^{orig}_{it}$. Hence
$\bar{\ell}^{orig}_t \le \bar{\ell}^{new}_{T_t} + \dots + \bar{\ell}^{new}_{Nt}$, and therefore
$\bar{L}^{orig}_n \le \bar{L}^{new}_{Nn} \le e^{\eta} \left( L^{*,new}_{Nn} + \frac{2(1 + \ln N)}{\eta} \right) = e^{\eta} \left( L^{*,orig}_n + \frac{2(1 + \ln N)}{\eta} \right)$.
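A sketch of the transformation (illustrative):

```python
# Illustrative: each round with loss vector (l_1t, ..., l_Nt) becomes N rounds,
# the j-th new round charging only expert j; every new column has 1-norm <= 1.
import numpy as np

def sparsify(loss_matrix):
    """(N, n) loss matrix -> (N, N*n) matrix with one nonzero entry per column."""
    N, n = loss_matrix.shape
    out = np.zeros((N, N * n))
    for t in range(n):
        for j in range(N):
            out[j, N * t + j] = loss_matrix[j, t]
    return out

print(sparsify(np.array([[0.3, 0.8], [0.6, 0.1]])))
```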

27 FPL BOUND
THEOREM (FPL BOUND [KALAI AND VEMPALA, 2003]). Let $\ell : \{1,\dots,N\} \times Y \to [0, 1]$ and consider FPL with $Z_t \sim (\eta/2)^N e^{-\eta \|z\|_1}$. Then
$E[\hat{L}_n] \le e^{\eta} \left( E[L_n^*] + \frac{2(1 + \ln N)}{\eta} \right)$.
Choose $\eta = \min\left\{1, \sqrt{2(1 + \ln N)/((e - 1) L_n^*)}\right\}$. Then
$E[\hat{L}_n] - E[L_n^*] \le 2\sqrt{2(e - 1) L_n^* (1 + \ln N)} + 2(e + 1)(1 + \ln N)$.
PROOF. Just combine the facts of the previous slides!
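Plugging in concrete values (illustrative) to see the scale of the tuning and of the bound:

```python
# Illustrative evaluation of the theorem's tuning and regret bound.
import math

N, L_star = 10, 500.0
eta = min(1.0, math.sqrt(2 * (1 + math.log(N)) / ((math.e - 1) * L_star)))
bound = (2 * math.sqrt(2 * (math.e - 1) * L_star * (1 + math.log(N)))
         + 2 * (math.e + 1) * (1 + math.log(N)))
print(round(eta, 3), round(bound, 1))   # 0.088 175.2: regret grows with sqrt(L*_n)
```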

28 REFERENCES
Hannan, J. (1957). Approximation to Bayes risk in repeated play. Contributions to the Theory of Games, 3.
Kalai, A. and Vempala, S. (2003). Efficient algorithms for the online decision problem. In Proceedings of the 16th Annual Conference on Learning Theory. Springer.
Littlestone, N. and Warmuth, M. (1994). The weighted majority algorithm. Information and Computation, 108.
