Lecture 16: Achieving and Estimating the Fundamental Limit


EE378A Statistical Signal Processing    Lecture 16 - 05/25/2017

Lecturer: Jiantao Jiao    Scribe: William Clary

In this lecture, we formally define the two distinct problems of achieving and estimating the fundamental limit, and show that under the logarithmic loss, it is easier to estimate the fundamental limit than to achieve it.

1 The Bayes envelope

The Bayes envelope introduced in the previous lectures can be viewed as the fundamental limit of prediction. Indeed, for a specified loss function $\Lambda(x, \hat{x})$, the minimum average loss in predicting $X \sim P_X$ is given by the Bayes envelope:

$$U(P_X) \triangleq \min_{\hat{x}} \mathbb{E}_{X \sim P_X}[\Lambda(X, \hat{x})]. \qquad (1)$$

Throughout this lecture, we observe $X_1, X_2, \ldots, X_n \overset{\text{i.i.d.}}{\sim} P_X$, where $X \in \mathcal{X} = \{1, 2, \ldots, S\}$. In other words, the alphabet size of $X$ is $|\mathcal{X}| = S$. We denote by $M_S$ the space of probability measures on $\mathcal{X}$. We take $\Lambda(x, \hat{x})$ to be the logarithmic loss in the sequel; in other words, we have

$$\Lambda(x, \hat{x}) = \Lambda(x, \hat{P}) = \log \frac{1}{\hat{P}(x)}, \qquad (2)$$

for any $x \in \mathcal{X}$, $\hat{P} \in M_S$.

For non-negative sequences $a_\gamma, b_\gamma$, we use the notation $a_\gamma \lesssim b_\gamma$ to denote that there exists a universal constant $C$ such that $\sup_\gamma \frac{a_\gamma}{b_\gamma} \le C$, and $a_\gamma \gtrsim b_\gamma$ is equivalent to $b_\gamma \lesssim a_\gamma$. The notation $a_\gamma \asymp b_\gamma$ is equivalent to $a_\gamma \lesssim b_\gamma$ and $b_\gamma \lesssim a_\gamma$. The notation $a_\gamma \ll b_\gamma$ means that $\liminf_\gamma \frac{b_\gamma}{a_\gamma} = \infty$, and $a_\gamma \gg b_\gamma$ is equivalent to $b_\gamma \ll a_\gamma$. We write $a \wedge b = \min\{a, b\}$ and $a \vee b = \max\{a, b\}$. Moreover, $\mathrm{poly}_K$ denotes the set of all polynomials of degree no more than $K$.

2 Achieving the fundamental limit

Given i.i.d. observations $X_1, X_2, \ldots, X_n \sim P_X$, we would like to construct a predictor $\hat{P} = \hat{P}(X_1, X_2, \ldots, X_n)$ to predict a fresh new independent random variable $X \sim P_X$, where $X$ is independent of the training data $\{X_i\}_{i=1}^n$. The average risk of predicting $X$ using the predictor $\hat{P}$ under the logarithmic loss is given by

$$\mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right], \qquad (3)$$

where the expectation is over the randomness of $(X_1, X_2, \ldots, X_n, X) \sim P_X^{\otimes(n+1)}$.
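Under the logarithmic loss, the Bayes envelope is exactly the Shannon entropy: the risk (3) decomposes as $H(P_X) + D(P_X \| \hat{P})$, so the best possible predictor is $\hat{P} = P_X$. The following minimal sketch (Python; not part of the original notes, with an illustrative distribution) checks this decomposition numerically.

```python
import numpy as np

def log_loss_risk(p, q):
    """Average logarithmic loss E[log 1/q(X)] for X ~ p, in nats."""
    return -np.sum(p * np.log(q))

def entropy(p):
    """Shannon entropy H(p) in nats, with the convention 0 log(1/0) = 0."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

p = np.array([0.5, 0.25, 0.125, 0.125])   # an illustrative P_X
# The predictor P_hat = P_X attains risk exactly H(P_X) ...
print(log_loss_risk(p, p), entropy(p))    # identical values
# ... and any other P_hat pays an extra D(P_X || P_hat) >= 0 on top of H(P_X).
q = np.array([0.25, 0.25, 0.25, 0.25])
print(log_loss_risk(p, q) - entropy(p))   # = KL divergence, strictly positive here
```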

2.1 The inappropriate question of minimax risk

Since the distribution $P_X$ is unknown, we may take the minimax approach in decision theory and aim at solving the minimax risk. In other words, we aim at solving

$$\inf_{\hat{P}} \sup_{P_X \in M_S} \mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right]. \qquad (4)$$

We now show that this question leads to a degenerate answer that may not be what we want.

Theorem 1. The minimax risk is given by

$$\inf_{\hat{P}} \sup_{P_X \in M_S} \mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right] = \log S, \qquad (5)$$

and the minimax risk achieving $\hat{P}$ can be taken to be $U = (\frac{1}{S}, \frac{1}{S}, \ldots, \frac{1}{S})$, where $U$ is the uniform distribution on $\mathcal{X}$.

Proof. We first show that the minimax risk is at least $\log S$. Indeed, for any predictor $\hat{P} \in M_S$, we have

$$\mathbb{E}\left[\log \frac{1}{\hat{P}(X)} \,\middle|\, X_1, X_2, \ldots, X_n\right] = \sum_{x \in \mathcal{X}} P_X(x) \log \frac{1}{\hat{P}(x)} \qquad (6)$$
$$\ge \sum_{x \in \mathcal{X}} P_X(x) \log \frac{1}{P_X(x)} \qquad (7)$$
$$= H(P_X), \qquad (8)$$

where we used the non-negativity of the KL divergence, and $H(P_X)$ is the Shannon entropy. Taking $P_X = U$, we have $H(P_X) = \log S$. Taking expectations on both sides with respect to $X_1, X_2, \ldots, X_n$, we know that

$$\mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right] \ge \log S \qquad (9)$$

for any predictor $\hat{P}$. On the other hand, taking $\hat{P} \equiv U$, we have

$$\mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right] = \log S, \qquad (10)$$

which proves that the minimax risk is at most $\log S$.

Theorem 1 shows that solving for the minimax risk in prediction may lead to inappropriate answers. Indeed, the minimax optimal solution turns out to be a degenerate one that ignores all the training data. What we show next is that focusing on the minimax regret solves this problem in a meaningful way.
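The degeneracy in Theorem 1 is easy to see numerically. In the sketch below (Python, not from the original notes; the add-one predictor is just an illustrative competitor), under the least favorable $P_X = U$ the data-ignoring uniform predictor attains risk exactly $\log S$, while a data-dependent predictor does strictly worse on this $P_X$.

```python
import numpy as np

rng = np.random.default_rng(0)
S, n, trials = 10, 20, 5000

def risk(predict, p):
    """Monte Carlo estimate of E[log 1/P_hat(X)] for a predictor P_hat(counts)."""
    total = 0.0
    for _ in range(trials):
        data = rng.choice(S, size=n, p=p)
        counts = np.bincount(data, minlength=S)
        q = predict(counts)
        total += -np.sum(p * np.log(q))  # E_X[log 1/q(X)] given the training data
    return total / trials

uniform = lambda counts: np.full(S, 1.0 / S)       # ignores the data entirely
laplace = lambda counts: (counts + 1.0) / (n + S)  # add-one smoothing

p_unif = np.full(S, 1.0 / S)  # the least favorable P_X from the proof of Theorem 1
print(np.log(S))              # the minimax risk log(S)
print(risk(uniform, p_unif))  # exactly log(S), for every realization of the data
print(risk(laplace, p_unif))  # strictly larger on this P_X
```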

2.2 Achieving the fundamental limit: minimax regret

As we argued in the proof of Theorem 1, for any predictor $\hat{P} = \hat{P}(X_1, X_2, \ldots, X_n)$, we have

$$\mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right] \ge H(P_X). \qquad (11)$$

This motivates us to define the minimax regret as follows:

$$\inf_{\hat{P}} \sup_{P_X \in M_S} \left\{ \mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right] - H(P_X) \right\}. \qquad (12)$$

We have the following algebraic manipulations for any predictor $\hat{P}$:

$$\mathbb{E}\left[\log \frac{1}{\hat{P}(X)} \,\middle|\, X_1, X_2, \ldots, X_n\right] - H(P_X) = \sum_{x \in \mathcal{X}} P_X(x) \log \frac{1}{\hat{P}(x)} - \sum_{x \in \mathcal{X}} P_X(x) \log \frac{1}{P_X(x)} \qquad (13)$$
$$= \sum_{x \in \mathcal{X}} P_X(x) \log \frac{P_X(x)}{\hat{P}(x)} \qquad (14)$$
$$= D(P_X \| \hat{P}), \qquad (15)$$

where $D(P \| Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}$ is the KL divergence between $P$ and $Q$. In other words, solving the minimax regret of predicting a fresh new independent random variable $X$ based on i.i.d. training samples $X_1, X_2, \ldots, X_n$ is equivalent to solving the problem of estimating the discrete distribution $P_X$ under the KL divergence loss.

The minimax regret is characterized by the following theorem [1, 2].

Theorem 2.

$$\inf_{\hat{P}} \sup_{P_X \in M_S} \left\{ \mathbb{E}\left[\log \frac{1}{\hat{P}(X)}\right] - H(P_X) \right\} = \begin{cases} (1 + o(1)) \, \frac{S - 1}{2n} \log(e) & \text{if } n \gg S \\ (1 + o(1)) \, \log \frac{S}{n} & \text{if } S \gg n \end{cases} \qquad (16)$$

Moreover, if $\limsup_{n \to \infty} \frac{n}{S} = c \in (0, \infty)$, the minimax regret is bounded away from zero. The predictor $\hat{P}$ that achieves the performance above in the regime $n \gg S$ is

$$\hat{P}(x) = \frac{N(x) + \beta(N(x))}{n + \sum_{j=1}^{S} \beta(N(j))} \quad \text{for any } x \in \mathcal{X}, \qquad (17)$$

where

$$N(x) = \sum_{i=1}^{n} \mathbb{1}(X_i = x), \qquad (18)$$

and $(X_1, X_2, \ldots, X_n)$ is the training data. Here

$$\beta(k) = \begin{cases} \frac{1}{2} & \text{if } k = 0 \\ 1 & \text{if } k = 1 \\ \frac{3}{4} & \text{otherwise} \end{cases} \qquad (19)$$

The predictor $\hat{P}$ that achieves the performance above in the regime $S \gg n$ is

$$\hat{P}(x) = \frac{N(x) + \frac{n}{S} \log \frac{S}{n}}{n + n \log \frac{S}{n}} \quad \text{for any } x \in \mathcal{X}. \qquad (20)$$

[1] Paninski, Liam. "Variational minimax estimation of discrete distributions under KL loss." In Advances in Neural Information Processing Systems (NIPS), 2004.
[2] Braess, Dietrich, and Thomas Sauer. "Bernstein polynomials and learning theory." Journal of Approximation Theory 128, no. 2 (2004).
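The predictor (17)-(19) is straightforward to implement. Below is a minimal sketch (Python, not part of the original notes; names are mine) of this add-$\beta$ rule: each count is smoothed by $\beta(N(x))$ and the result renormalized, so unseen symbols still receive positive mass.

```python
import numpy as np

def beta(k):
    """The smoothing function of (19): 1/2 for unseen symbols, 1 for singletons, 3/4 otherwise."""
    if k == 0:
        return 0.5
    if k == 1:
        return 1.0
    return 0.75

def braess_sauer_predictor(counts):
    """The predictor of (17): P_hat(x) = (N(x) + beta(N(x))) / (n + sum_j beta(N(j)))."""
    n = counts.sum()
    b = np.array([beta(k) for k in counts])
    return (counts + b) / (n + b.sum())

# Usage: S = 5 symbols, n = 8 training samples with counts (3, 2, 2, 1, 0).
counts = np.array([3, 2, 2, 1, 0])
p_hat = braess_sauer_predictor(counts)
print(p_hat, p_hat.sum())  # a valid distribution; the unseen symbol gets mass beta(0)/(n + sum_j beta)
```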

Vanishing minimax regret implies that there exists a predictor $\hat{P}$ whose average prediction error $\mathbb{E}[\log \frac{1}{\hat{P}(X)}]$ on the test set approaches the fundamental limit $H(P_X)$. Theorem 2 shows that it takes $n \gg S$ samples to achieve vanishing minimax regret. Intuitively, one needs to see all the symbols at least once to be able to construct a predictor whose performance approaches the fundamental limit.

The minimax regret definition reflects the traditional way of understanding the difficulty of machine learning tasks. In machine learning practice, we iteratively improve our training algorithm, and use its prediction accuracy on the test set to measure the performance of our prediction algorithm. The best performance achieved by existing schemes on the test set is usually understood as the limit of prediction for a specific dataset. In this context, Theorem 2 can be interpreted as saying that with $n \lesssim S$ samples, there does not exist any prediction algorithm based on $n$ training samples whose performance on the test set can approach the Bayes envelope in the worst case. As we show in the next section, there exist algorithms that can estimate the fundamental limit with $n \ll S$ samples without explicitly constructing a prediction algorithm.

3 Estimating the fundamental limit

We define the problem of estimating the fundamental limit as solving the following minimax problem:

$$\inf_{\hat{H}} \sup_{P_X \in M_S} \mathbb{E}\left|\hat{H} - H(P_X)\right|, \qquad (21)$$

where the infimum is taken over all possible estimators $\hat{H} = \hat{H}(X_1, X_2, \ldots, X_n)$ that are functions of the empirical training data. The materials in this section are mainly taken from [3, 4].

3.1 The minimax rates

We have the following theorem [3, 5, 6, 7].

Theorem 3. Suppose $n \gtrsim \frac{S}{\log S}$. Then,

$$\inf_{\hat{H}} \sup_{P_X \in M_S} \mathbb{E}\left|\hat{H} - H(P_X)\right| \asymp \frac{S}{n \log S} + \frac{\ln S}{\sqrt{n}}. \qquad (22)$$

Theorem 3 shows that it suffices to take $n \gg \frac{S}{\log S}$ samples to consistently estimate the fundamental limit $H(P_X)$. It is very surprising that the number of samples required is in fact sublinear in $S$: one can estimate the Shannon entropy uniformly over all $P_X \in M_S$ even if one has not seen most of the symbols of the alphabet $\mathcal{X}$ in the empirical samples.

[3] Jiao, Jiantao, Kartik Venkat, Yanjun Han, and Tsachy Weissman. "Minimax estimation of functionals of discrete distributions." IEEE Transactions on Information Theory 61, no. 5 (2015).
[4] Jiao, Jiantao, Kartik Venkat, Yanjun Han, and Tsachy Weissman. "Maximum likelihood estimation of functionals of discrete distributions." arXiv preprint (2014).
[5] Valiant, Gregory, and Paul Valiant. "Estimating the unseen: an n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs." In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, ACM, 2011.
[6] Valiant, Gregory, and Paul Valiant. "The power of linear estimators." In Foundations of Computer Science (FOCS), 2011 IEEE 52nd Annual Symposium on, IEEE, 2011.
[7] Wu, Yihong, and Pengkun Yang. "Minimax rates of entropy estimation on large alphabets via best polynomial approximation." IEEE Transactions on Information Theory 62, no. 6 (2016).
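To get a feel for the sublinear sample complexity in Theorem 3, one can simply evaluate the right-hand side of (22) for a large alphabet. A tiny sketch (Python, not from the notes; the alphabet size is purely illustrative, and the rate holds only up to universal constants):

```python
import numpy as np

S = 10**6  # a large alphabet size, for illustration
for n in [int(2 * S / np.log(S)), S, 10 * S]:
    rate = S / (n * np.log(S)) + np.log(S) / np.sqrt(n)  # right side of (22)
    print(f"n = {n:>8d} (n/S = {n / S:.2f}): rate ~ {rate:.3f}")
# Already at n ~ S/log S the bound is a constant, and it vanishes once n >> S/log S,
# even though most of the S symbols are never observed at that sample size.
```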

3.2 Natural candidate: the empirical entropy

One of the most natural estimators of the Shannon entropy $H(P_X)$ given $n$ i.i.d. samples is the empirical entropy, which is defined as follows. Denote the empirical distribution by $\hat{P}_n = (\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_S)$, where $\hat{p}_i = \frac{1}{n} \sum_{j=1}^{n} \mathbb{1}(X_j = i)$ is the empirical frequency of symbol $i$ in the training set. The empirical entropy is defined as $H(\hat{P}_n)$, which plugs the empirical distribution into the Shannon entropy functional.

Intuitively, since the Shannon entropy is a continuous functional for finite alphabet distributions, and $\hat{P}_n$ converges to the true distribution $P_X$ as $n \to \infty$, the plug-in estimate $H(\hat{P}_n)$ should be a decent estimator of $H(P_X)$ if $S$ is fixed and $n \to \infty$. This is indeed true: it is only in high dimensions that the empirical entropy starts to behave poorly as an estimate of the Shannon entropy. We have the following theorem quantifying the performance of the empirical entropy in estimating $H(P_X)$ [4, 7].

Theorem 4. Suppose $n \gtrsim S$. Then,

$$\sup_{P_X \in M_S} \mathbb{E}\left|H(\hat{P}_n) - H(P_X)\right| \asymp \frac{S}{n} + \frac{\ln S}{\sqrt{n}}. \qquad (23)$$

Comparing Theorems 4 and 3, the main difference is that the term $\frac{S}{n}$ has been improved to $\frac{S}{n \log S}$ in the minimax rate-optimal entropy estimator, while the second term is unchanged. We now investigate where the two terms come from, and how one may construct minimax rate-optimal estimators based on the empirical entropy.

3.3 Analysis of the empirical entropy

For any estimator $\hat{H}$, its performance in estimating $H(P_X)$ can be characterized via its bias, defined as $\mathbb{E}\hat{H} - H(P_X)$, and the concentration of $\hat{H}$ around its expectation $\mathbb{E}\hat{H}$. The concentration property may be partially characterized by the variance of the estimator $\hat{H}$, namely $\mathrm{Var}(\hat{H}) = \mathbb{E}(\hat{H} - \mathbb{E}\hat{H})^2$. We now argue that in Theorem 4, the term $\frac{S}{n}$ comes from the bias, and the term $\frac{\ln S}{\sqrt{n}}$ comes from the variance.

Introduce the concave function $f(x) = x \ln \frac{1}{x}$ on $[0, 1]$. It is clear that

$$H(\hat{P}_n) = \sum_{i=1}^{S} f(\hat{p}_i). \qquad (24)$$

We have the following claim.

Claim 5. If $p_i \in [0, 1]$, then

$$0 \le f(p_i) - \mathbb{E} f(\hat{p}_i) \le \frac{1}{n}. \qquad (25)$$

Moreover,

$$\mathrm{Var}(H(\hat{P}_n)) \le \frac{2(\ln(n) + 3)^2}{n} \lesssim \frac{(\ln(n))^2}{n}. \qquad (26)$$

The results in Claim 5 are inspiring. They show that the variance of the empirical entropy can be universally bounded regardless of the support size. Moreover, the biases contributed by the individual symbols add up linearly, contributing the $\frac{S}{n}$ term. It is clear that in the regime of fixed $S$ and $n \to \infty$ the variance dominates, but in high dimensions the bias dominates. Hence, the key to improving the empirical entropy is to reduce the bias in high dimensions without incurring too much additional variance.
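The bias/variance split in Claim 5 is easy to observe in simulation. The sketch below (Python, not from the notes; parameters are illustrative) measures the bias and variance of $H(\hat{P}_n)$ under the uniform distribution, for which the first-order bias is about $-\frac{S-1}{2n}$: the bias scales like $S/n$ while the variance stays much smaller, consistent with (25)-(26).

```python
import numpy as np

rng = np.random.default_rng(1)

def empirical_entropy(counts, n):
    """Plug-in estimate H(P_hat_n) = sum_i f(p_hat_i), f(x) = x ln(1/x), in nats."""
    p = counts[counts > 0] / n
    return -np.sum(p * np.log(p))

S, trials = 100, 2000
p = np.full(S, 1.0 / S)  # uniform P_X, so H(P_X) = ln(S)
for n in [500, 1000, 2000, 4000]:
    H_hat = np.array([
        empirical_entropy(np.bincount(rng.choice(S, size=n, p=p), minlength=S), n)
        for _ in range(trials)
    ])
    bias, var = H_hat.mean() - np.log(S), H_hat.var()
    # The bias is negative and close to -(S-1)/(2n); the variance is an order smaller.
    print(f"n={n}: bias={bias:+.4f}  -(S-1)/(2n)={-(S - 1) / (2 * n):+.4f}  var={var:.5f}")
```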

3.4 How can we improve the empirical entropy?

It has been a long journey to find the minimax rate-optimal estimators. Harris in 1975 proposed expanding $\mathbb{E}_P H(\hat{P}_n)$ using a Taylor expansion and obtained

$$\mathbb{E}_P H(\hat{P}_n) = H(P_X) - \frac{S - 1}{2n} + \frac{1}{12 n^2} \left(1 - \sum_{i=1}^{S} \frac{1}{p_i}\right) + O\left(\frac{1}{n^3}\right). \qquad (27)$$

The Taylor expansion result looks decent in the regime where the $p_i$'s are not too small. Indeed, for very small $p_i$ the remainder term $\frac{1}{12 n^2} \sum_{i=1}^{S} \frac{1}{p_i}$ may be much larger than the true entropy $H(P_X)$ itself. This intuition turns out to be correct: it suffices to do a first-order bias correction using Taylor series in the regime of not too small $p_i$. In general, for $n \hat{p} \sim B(n, p)$, we may write

$$\mathbb{E}[f(\hat{p})] = f(p) + \frac{1}{2} f''(p) \frac{p(1 - p)}{n} + O\left(\frac{1}{n^2}\right),$$

which motivates the bias correction

$$\hat{f}_c = f(\hat{p}) - \frac{1}{2} f''(\hat{p}) \frac{\hat{p}(1 - \hat{p})}{n}.$$

In the entropy estimation case, we follow the bias correction above and do the following [8].

Construction 6. If the true $p_i \gtrsim \frac{\ln n}{n}$, we use $f(\hat{p}_i) + \frac{1}{2n}$ instead of $f(\hat{p}_i)$ to estimate $f(p_i)$.

[8] Note that this bias correction intuition does not easily generalize to higher-order corrections. For a systematic approach to higher-order bias correction with Taylor series, we refer the reader to Yanjun Han, Jiantao Jiao, and Tsachy Weissman, "Minimax rate-optimal estimation of divergences between discrete distributions," arXiv preprint (2016).
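Construction 6 is a one-line change on top of the plug-in estimator. A minimal sketch follows (Python, not from the notes); for simplicity it applies the $\frac{1}{2n}$ correction to every observed symbol, which gives a Miller-Madow-type estimator rather than the gated version of Construction 6 that corrects only coordinates with $p_i \gtrsim \frac{\ln n}{n}$.

```python
import numpy as np

def corrected_entropy(counts, n):
    """Plug-in entropy with the first-order correction of Construction 6 applied to
    every observed symbol: sum over counts > 0 of [ f(p_hat_i) + 1/(2n) ]."""
    p = counts[counts > 0] / n
    return -np.sum(p * np.log(p)) + len(p) / (2.0 * n)

# Usage: compare against the plain plug-in on one draw from the uniform distribution.
rng = np.random.default_rng(2)
S, n = 100, 500
counts = np.bincount(rng.choice(S, size=n), minlength=S)  # uniform sampling over S symbols
p_hat = counts[counts > 0] / n
plug_in = -np.sum(p_hat * np.log(p_hat))
print(plug_in, corrected_entropy(counts, n), np.log(S))  # the corrected value is closer to ln(S)
```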

Now the focus is on the small $p_i$ regime. We need to understand precisely which term contributed the $\frac{S}{n}$ bias bound. Assume for now that all $p_i \lesssim \frac{\ln n}{n}$. We have the following manipulations:

$$H(\hat{P}_n) - H(P_X) = \sum_{i=1}^{S} f(\hat{p}_i) - f(p_i) \qquad (28)$$
$$= \sum_{i=1}^{S} (f(\hat{p}_i) - P_K(\hat{p}_i)) - \sum_{i=1}^{S} (f(p_i) - P_K(p_i)) + \sum_{i=1}^{S} (P_K(\hat{p}_i) - P_K(p_i)), \qquad (29)$$

where $P_K(\cdot)$ is an arbitrary polynomial of order no more than $K$. The following two observations are crucial for the improvements over the empirical entropy.

Claim 7. If $p_i \lesssim \frac{\ln n}{n}$, we have $\hat{p}_i \lesssim \frac{\ln n}{n}$ with probability at least $1 - n^{-4}$.

Claim 8. Suppose $K \asymp \ln n$. Then for any constant $c > 0$,

$$\inf_{P_K \in \mathrm{poly}_K} \sup_{x \in [0, \frac{c \ln n}{n}]} |f(x) - P_K(x)| \lesssim \frac{1}{n \ln n}. \qquad (30)$$

Utilizing those two claims, and conditioning on the event that all $\hat{p}_i \lesssim \frac{c \ln n}{n}$ and $p_i \lesssim \frac{c \ln n}{n}$, we immediately obtain that

$$\left|\sum_{i=1}^{S} (f(\hat{p}_i) - P_K(\hat{p}_i))\right| \lesssim \frac{S}{n \ln n}, \qquad (31)$$
$$\left|\sum_{i=1}^{S} (f(p_i) - P_K(p_i))\right| \lesssim \frac{S}{n \ln n}, \qquad (32)$$

which implies that the bias of the remaining term,

$$\left|\mathbb{E}_P\left[\sum_{i=1}^{S} (P_K(\hat{p}_i) - P_K(p_i))\right]\right|, \qquad (33)$$

must be of order $\frac{S}{n}$ in the worst case, since we already know from Claim 5 (and Theorem 4) that the overall bias of $H(\hat{P}_n)$ is of order $\frac{S}{n}$ in the worst case. Thus, we have identified the reason for the poor bias of the empirical entropy: the plug-in approach to estimating the polynomial $P_K$ incurs too much bias. Realizing this turns out to be the crucial step that leads to the minimax rate-optimal estimator: under the multinomial model there exist unbiased estimators for any polynomial $P_K$ whose order is no more than $n$. Indeed, when $X \sim B(n, p)$, for any integer $r \in \{1, \ldots, n\}$,

$$\mathbb{E}\left[\frac{X(X - 1) \cdots (X - r + 1)}{n(n - 1) \cdots (n - r + 1)}\right] = p^r$$

(a numerical check of this identity is given at the end of the section). We complete the construction of the minimax rate-optimal estimator by doing the following.

Construction 9. If the true $p_i \lesssim \frac{\ln n}{n}$, we use the unbiased estimator of the polynomial $P_K(p_i)$ to estimate $f(p_i)$. Here $P_K(\cdot)$ is the best approximating polynomial of $f$ over the interval $[0, \frac{c \ln n}{n}]$ introduced in Claim 8.

As for the last step, we need to use the Chernoff bound to show the following results on confidence intervals in the binomial model.

Claim 10. There exist positive real numbers $c_1, c_2, c_3, c_4$ such that:

- if $\hat{p}_i \in [0, c_1 \frac{\log n}{n}]$, then $p_i \in [0, c_2 \frac{\log n}{n}]$ with probability at least $1 - n^{-4}$;
- if $\hat{p}_i \in [c_3 \frac{\log n}{n}, 1]$, then $p_i \in [c_4 \frac{\log n}{n}, 1]$ with probability at least $1 - n^{-4}$.

There are other details needed to make the whole proof work: for example, one needs to argue that this approach does not increase the variance by too much, and one also needs to prove matching minimax lower bounds. In practice one may also remove the constant term in $P_K(\cdot)$ to ensure that the estimator assigns zero to symbols that have never appeared in the training data. Thus, we have constructed a minimax rate-optimal estimator that does not require knowledge of the support size $S$, yet behaves nearly as well as the exact minimax estimator that knows the support size.
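Finally, here is the promised numerical check of the unbiasedness identity behind Construction 9 (Python, not from the notes; $n$, $p$, and $r$ are illustrative). Since $X \sim B(n, p)$ has finite support, the expectation of the falling-factorial estimator can be computed exactly and compared with $p^r$.

```python
import numpy as np
from math import comb

def falling(x, r):
    """Falling factorial x(x-1)...(x-r+1)."""
    out = 1.0
    for j in range(r):
        out *= (x - j)
    return out

def expected_estimator(n, p, r):
    """Exact E[ falling(X, r) / falling(n, r) ] for X ~ B(n, p)."""
    pmf = np.array([comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)])
    vals = np.array([falling(x, r) / falling(n, r) for x in range(n + 1)])
    return float(pmf @ vals)

n, p = 30, 0.17
for r in [1, 2, 3, 5]:
    print(r, expected_estimator(n, p, r), p**r)  # the two columns agree up to float error
```

In Construction 9 this identity is applied coefficient by coefficient: writing $P_K(p) = \sum_r a_r p^r$, each monomial $p^r$ is replaced by its unbiased falling-factorial estimate.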
