4.1 Data processing inequality
- Bryan Ferguson
- 6 years ago
ECE598: Information-theoretic methods in high-dimensional statistics, Spring 2016
Lecture 4: Total variation/inequalities between f-divergences
Lecturer: Yihong Wu. Scribe: Matthew Tsao, Feb 8, 2016 [Ed. Mar 22]

Recall the definition of f-divergences from last time. Write $\bar\alpha \triangleq 1 - \alpha$. If a function $f : \mathbb{R}_+ \to \mathbb{R}$ satisfies the following properties:

- $f$ is a convex function.
- $f(1) = 0$.
- $f$ is strictly convex at $x = 1$, i.e., for all $x \neq y$ and $\alpha \in (0,1)$ such that $\alpha x + \bar\alpha y = 1$, the inequality $f(1) < \alpha f(x) + \bar\alpha f(y)$ is strict.

Then the functional that maps pairs of distributions to $\mathbb{R}_+$ defined by

$$ D_f(P \| Q) \triangleq \mathbb{E}_Q\left[ f\left( \frac{dP}{dQ} \right) \right] $$

is an f-divergence.

4.1 Data processing inequality

Theorem 4.1. Consider a channel that produces $Y$ given $X$ based on the law $P_{Y|X}$ (shown below). If $P_Y$ is the distribution of $Y$ when $X$ is generated by $P_X$ and $Q_Y$ is the distribution of $Y$ when $X$ is generated by $Q_X$, then for any f-divergence $D_f$,

$$ D_f(P_Y \| Q_Y) \le D_f(P_X \| Q_X). $$

[Figure: channel diagram, $P_X \to P_{Y|X} \to P_Y$ and $Q_X \to P_{Y|X} \to Q_Y$.]

One interpretation of this result is that processing the observation $x$ makes it more difficult to determine whether it came from $P_X$ or $Q_X$.
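As a numerical sanity check of the data processing inequality on finite alphabets, the following minimal sketch pushes two input laws through the same channel and compares the divergences before and after. The distributions, the channel matrix, and the helper names (`f_divergence`, `push_through`) are hypothetical choices made for illustration; they are not part of the lecture.

```python
import math

def f_divergence(f, p, q):
    """D_f(P||Q) = sum_x q(x) f(p(x)/q(x)) on a finite alphabet, assuming q(x) > 0."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def push_through(channel, p):
    """Law of Y when X ~ p and channel[x][y] = P_{Y|X}(y|x)."""
    n_out = len(channel[0])
    return [sum(p[x] * channel[x][y] for x in range(len(p))) for y in range(n_out)]

def kl_gen(x):
    # f(x) = x log x generates the KL divergence (convention 0 log 0 = 0)
    return x * math.log(x) if x > 0 else 0.0

def tv_gen(x):
    # f(x) = |x - 1| / 2 generates total variation
    return 0.5 * abs(x - 1.0)

# Hypothetical inputs and channel (each row of `channel` sums to one).
p_x = [0.5, 0.3, 0.2]
q_x = [0.2, 0.3, 0.5]
channel = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
p_y = push_through(channel, p_x)
q_y = push_through(channel, q_x)

for name, f in [("KL", kl_gen), ("TV", tv_gen)]:
    before = f_divergence(f, p_x, q_x)
    after = f_divergence(f, p_y, q_y)
    print(f"{name}: input {before:.4f} >= output {after:.4f}")
```

Both divergences shrink after the channel, as the theorem predicts; the check says nothing about how large the gap is in general.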
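Proof. Let $P_{XY} = P_X P_{Y|X}$ and $Q_{XY} = Q_X P_{Y|X}$. Then

$$ D_f(P_X \| Q_X) = \mathbb{E}_{Q_X}\left[ f\left( \frac{P_X}{Q_X} \right) \right] \overset{(a)}{=} \mathbb{E}_{Q_{XY}}\left[ f\left( \frac{P_{XY}}{Q_{XY}} \right) \right] = \mathbb{E}_{Q_Y} \mathbb{E}_{Q_{X|Y}}\left[ f\left( \frac{P_{XY}}{Q_{XY}} \right) \right] $$

$$ \ge \mathbb{E}_{Q_Y}\left[ f\left( \mathbb{E}_{Q_{X|Y}}\left[ \frac{P_{XY}}{Q_{XY}} \right] \right) \right] \quad \text{(Jensen's inequality)} $$

$$ \overset{(b)}{=} \mathbb{E}_{Q_Y}\left[ f\left( \frac{P_Y}{Q_Y} \right) \right] = D_f(P_Y \| Q_Y). $$

Note that (a) means $D_f(P_X \| Q_X) = D_f(P_{XY} \| Q_{XY})$; (b) can be alternatively understood by noting that $\mathbb{E}_Q\left[ \frac{P_{XY}}{Q_{XY}} \,\middle|\, Y \right]$ is precisely the relative density $\frac{P_Y}{Q_Y}$, by checking the definition of change of measure, i.e., $\mathbb{E}_P[g(Y)] = \mathbb{E}_Q\left[ g(Y) \frac{P_{XY}}{Q_{XY}} \right] = \mathbb{E}_Q\left[ g(Y)\, \mathbb{E}\left[ \frac{P_{XY}}{Q_{XY}} \,\middle|\, Y \right] \right]$ for any $g$.

Remark 4.1. $P_{Y|X}$ can be a deterministic map so that $Y = f(X)$. More specifically, if $f(X) = \mathbf{1}_E(X)$ for any event $E$, then $Y$ is Bernoulli with parameter $P(E)$ or $Q(E)$ and the data processing inequality gives

$$ D_f(P_X \| Q_X) \ge D_f(\mathrm{Ber}(P(E)) \| \mathrm{Ber}(Q(E))). \qquad (4.1) $$

This is how we prove the converse direction of large deviation.

Example 4.1. If $X = (X_1, X_2)$ and $f(X) = X_1$, then we have

$$ D_f(P_{X_1 X_2} \| Q_{X_1 X_2}) \ge D_f(P_{X_1} \| Q_{X_1}). $$

As seen from the proof of Theorem 4.1, this is in fact equivalent to the data processing inequality.

Remark 4.2. If $D_f(P \| Q)$ is an f-divergence, then $D_{\tilde f}(P \| Q)$ with $\tilde f(x) := x f\left(\frac{1}{x}\right)$ is also an f-divergence and $D_{\tilde f}(P \| Q) = D_f(Q \| P)$. Example: if $D_f(P \| Q) = D(P \| Q)$ then $D_{\tilde f}(P \| Q) = D(Q \| P)$.

Proof. First we verify that $\tilde f$ has all three properties required for $D_{\tilde f}$ to be an f-divergence.

For $x, y \in \mathbb{R}_+$ and $\alpha \in [0,1]$, write $\bar\alpha = 1 - \alpha$ and define $c = \alpha x + \bar\alpha y$ so that $\frac{\alpha x}{c} + \frac{\bar\alpha y}{c} = 1$. Observe that

$$ \tilde f(\alpha x + \bar\alpha y) = c f\left( \frac{1}{c} \right) = c f\left( \frac{\alpha x}{c} \cdot \frac{1}{x} + \frac{\bar\alpha y}{c} \cdot \frac{1}{y} \right) \le c\, \frac{\alpha x}{c} f\left( \frac{1}{x} \right) + c\, \frac{\bar\alpha y}{c} f\left( \frac{1}{y} \right) = \alpha \tilde f(x) + \bar\alpha \tilde f(y). $$

Thus $\tilde f : \mathbb{R}_+ \to \mathbb{R}$ is a convex function.

$\tilde f(1) = f(1) = 0$.

For $x, y \in \mathbb{R}_+$, $\alpha \in [0,1]$, if $\alpha x + \bar\alpha y = 1$, then by strict convexity of $f$ at 1,

$$ 0 = \tilde f(1) = f(1) = f\left( \alpha x \cdot \frac{1}{x} + \bar\alpha y \cdot \frac{1}{y} \right) < \alpha x f\left( \frac{1}{x} \right) + \bar\alpha y f\left( \frac{1}{y} \right) = \alpha \tilde f(x) + \bar\alpha \tilde f(y). $$

So $\tilde f$ is strictly convex at 1 and thus $D_{\tilde f}$ is a valid f-divergence. Finally,

$$ D_{\tilde f}(P \| Q) = \mathbb{E}_Q\left[ \tilde f\left( \frac{P}{Q} \right) \right] = \mathbb{E}_Q\left[ \frac{P}{Q} f\left( \frac{Q}{P} \right) \right] = \mathbb{E}_P\left[ f\left( \frac{Q}{P} \right) \right] = D_f(Q \| P). $$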
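The identity $D_{\tilde f}(P \| Q) = D_f(Q \| P)$ from Remark 4.2 can be checked numerically on a finite alphabet. The sketch below uses $f(x) = x \log x$, for which $\tilde f(x) = -\log x$; the distributions and the helper name `reverse_generator` are hypothetical and chosen only for illustration.

```python
import math

def f_divergence(f, p, q):
    """D_f(P||Q) = sum_x q(x) f(p(x)/q(x)), assuming all masses are positive."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def reverse_generator(f):
    """Return f~(x) = x * f(1/x), defined for x > 0."""
    return lambda x: x * f(1.0 / x)

def kl_gen(x):
    # f(x) = x log x generates D(P||Q)
    return x * math.log(x)

# Hypothetical strictly positive distributions.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

lhs = f_divergence(reverse_generator(kl_gen), p, q)  # D_{f~}(P||Q)
rhs = f_divergence(kl_gen, q, p)                     # D_f(Q||P) = D(Q||P)
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, which is the content of the final chain of equalities in the proof.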
4.2 Total variation and hypothesis testing

Recall that the choice of $f(x) = \frac{1}{2}|x - 1|$ gives rise to the total variation distance,

$$ D_f(P \| Q) = \frac{1}{2} \mathbb{E}_Q\left| \frac{P}{Q} - 1 \right| = \frac{1}{2} \int |P - Q|, $$

where $\int |P - Q|$ is a shorthand understood in the usual sense, namely, $\int \left| \frac{dP}{d\mu} - \frac{dQ}{d\mu} \right| d\mu$ where $\mu$ is a dominating measure, e.g., $\mu = P + Q$, and the value of the integral does not depend on $\mu$. We will denote total variation by $d_{\mathrm{TV}}(P, Q)$ or $\mathrm{TV}(P, Q)$.

Theorem 4.2. The following definitions for total variation are equivalent:

1. $$ d_{\mathrm{TV}}(P, Q) = \sup_E \; P(E) - Q(E), \qquad (4.2) $$
   where the supremum is over all measurable sets $E$.
2. $1 - d_{\mathrm{TV}}(P, Q)$ is the minimal sum of Type-I and Type-II error probabilities for testing $P$ versus $Q$, and
   $$ d_{\mathrm{TV}}(P, Q) = 1 - \int P \wedge Q. \qquad (4.3) $$
3. Provided the diagonal $\{(x, x) : x \in \mathcal{X}\}$ is measurable,
   $$ d_{\mathrm{TV}}(P, Q) = \inf_{P_{XY} :\, P_X = P,\, P_Y = Q} P[X \neq Y]. \qquad (4.4) $$
4. Let $\mathcal{F} = \{ f : \mathcal{X} \to \mathbb{R}, \|f\|_\infty \le 1 \}$. Then
   $$ d_{\mathrm{TV}}(P, Q) = \frac{1}{2} \sup_{f \in \mathcal{F}} \; \mathbb{E}_P f(X) - \mathbb{E}_Q f(X). \qquad (4.5) $$

Remark 4.3 (Variational representation). Equations (4.2) and (4.5) provide sup-representations of total variation, which will be extended to general f-divergences later. Note that (4.4) is an inf-representation of total variation in terms of couplings, meaning total variation is the Wasserstein distance with respect to Hamming distance. The benefit of variational representations is that choosing a particular coupling in (4.4) gives an upper bound on $d_{\mathrm{TV}}(P, Q)$, and choosing a particular $f$ in (4.5) yields a lower bound.

Remark 4.4 (Operational meaning). In the binary hypothesis test for $H_0 : X \sim P$ or $H_1 : X \sim Q$, Theorem 4.2 shows that $1 - d_{\mathrm{TV}}(P, Q)$ is the sum of false alarm and missed detection probabilities. This can be seen either from (4.2), where $E$ is the decision region for deciding $P$, or from (4.3), since the optimal test for average probability of error is the likelihood ratio test $\frac{dP}{dQ} > 1$. In particular,

- $d_{\mathrm{TV}}(P, Q) = 1 \Leftrightarrow P \perp Q$: the probability of error is zero since essentially $P$ and $Q$ have disjoint supports.
- $d_{\mathrm{TV}}(P, Q) = 0 \Leftrightarrow P = Q$: the minimal sum of error probabilities is one, meaning the best thing to do is to flip a coin.

Throughout the course, $a \wedge b = \min\{a, b\}$ and $a \vee b = \max\{a, b\}$.
Footnote to (4.3): Here again $\int P \wedge Q$ is a shorthand understood in the usual sense, namely, $\int \frac{dP}{d\mu} \wedge \frac{dQ}{d\mu} \, d\mu$ where $\mu$ is any dominating measure.
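On a finite alphabet the four characterizations of total variation in Theorem 4.2 can be compared directly. The sketch below is only an illustration under that finite-alphabet assumption: the supremum over events is brute-forced over all subsets, the test function in (4.5) is taken to be $\mathrm{sign}(p - q)$, and the coupling is the standard maximal coupling. All function names and the example distributions are hypothetical.

```python
from itertools import combinations

def tv_l1(p, q):
    """Half the L1 distance between two pmfs."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def tv_sup_events(p, q):
    """Brute-force sup_E P(E) - Q(E) over all subsets E, as in (4.2)."""
    best = 0.0  # the empty event contributes 0
    for r in range(1, len(p) + 1):
        for event in combinations(range(len(p)), r):
            best = max(best, sum(p[i] - q[i] for i in event))
    return best

def tv_overlap(p, q):
    """1 - sum_x min(p(x), q(x)), as in (4.3)."""
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))

def tv_sign_function(p, q):
    """Value of (4.5) at f = sign(p - q), which satisfies |f| <= 1."""
    return 0.5 * sum((1.0 if a > b else -1.0) * (a - b) for a, b in zip(p, q))

def maximal_coupling(p, q):
    """Joint pmf with marginals p and q that attains P[X != Y] = d_TV in (4.4)."""
    d = tv_l1(p, q)
    n = len(p)
    joint = [[0.0] * n for _ in range(n)]
    for i in range(n):
        joint[i][i] = min(p[i], q[i])  # shared mass stays on the diagonal
    if d > 0:
        for i in range(n):
            for j in range(n):
                # excess of p at i is matched with deficit of p at j
                joint[i][j] += max(p[i] - q[i], 0.0) * max(q[j] - p[j], 0.0) / d
    return joint

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
joint = maximal_coupling(p, q)
prob_differ = sum(joint[i][j] for i in range(3) for j in range(3) if i != j)
print(tv_l1(p, q), tv_sup_events(p, q), tv_overlap(p, q),
      tv_sign_function(p, q), prob_differ)
```

Any other coupling would give an upper bound on the same quantity, and any other bounded test function a lower bound, which is the point made in Remark 4.3.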
4.3 Motivating example: Hypothesis testing with multiple samples

Observation:

1. Not all f-divergences are born equal. Different f-divergences have different operational significance. For example, as we saw in Section 4.2, testing two hypotheses boils down to total variation, which determines the fundamental limit (minimum average probability of error). Later in the course we will encounter another f-divergence: $L(P \| Q) = \int \frac{(P - Q)^2}{P + Q}$, which is useful for estimation.
2. Some f-divergences are easier to evaluate than others. For example, for product distributions, Hellinger distance and $\chi^2$-divergence tensorize in the sense that they are easily expressible in terms of those of the one-dimensional marginals; however, computing the total variation between product measures is frequently difficult. Another example is that computing the $\chi^2$-divergence between a product distribution and a mixture of product distributions is convenient, which will become useful later in the course.

Therefore the punchline is that it is often fruitful to bound one f-divergence by another, and this sometimes leads to tight characterizations. In this section we consider a specific useful example to drive this point home. Then in the next section we develop inequalities between f-divergences systematically.

Consider a binary hypothesis test where the data $X = (X_1, X_2, \ldots, X_n)$ are drawn i.i.d. from either $P$ or $Q$ and the goal is to test

$$ H_0 : X \sim P^{\otimes n} \quad \text{vs} \quad H_1 : X \sim Q^{\otimes n}. $$

As mentioned before, $1 - d_{\mathrm{TV}}(P^{\otimes n}, Q^{\otimes n})$ gives the minimal sum of Type-I and Type-II error probabilities, achieved by the maximum likelihood test. By the data processing inequality,

$$ d_{\mathrm{TV}}(P^{\otimes m}, Q^{\otimes m}) \le d_{\mathrm{TV}}(P^{\otimes n}, Q^{\otimes n}) \quad \text{for } m < n. $$

From this we see that $d_{\mathrm{TV}}(P^{\otimes n}, Q^{\otimes n})$ is an increasing sequence in $n$ (and bounded by 1 by definition) and hence converges. One would hope that as $n \to \infty$, $d_{\mathrm{TV}}(P^{\otimes n}, Q^{\otimes n})$ converges to 1 and consequently, the probability of error in the hypothesis test converges to zero. It turns out that if the distributions $P, Q$ are independent of $n$, then large deviation theory gives

$$ d_{\mathrm{TV}}(P^{\otimes n}, Q^{\otimes n}) = 1 - \exp(-n C(P, Q) + o(n)), \qquad (4.6) $$

where the constant $C(P, Q) = -\log \inf_{0 \le \alpha \le 1} \int P^\alpha Q^{1 - \alpha}$ is the Chernoff information of $P, Q$.
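For Bernoulli observations the total variation between the $n$-fold products can be computed exactly, because the likelihood ratio depends on the sample only through the number of ones. The following sketch relies on that reduction and approximates the Chernoff information by a grid search over $\alpha$; the parameter values, the grid size, and the function names are assumptions made for illustration only.

```python
import math

def tv_bernoulli_product(p, q, n):
    """d_TV(Ber(p)^n, Ber(q)^n), summed over the number of ones k."""
    total = 0.0
    for k in range(n + 1):
        pk = p**k * (1 - p) ** (n - k)   # mass of one sequence with k ones under P
        qk = q**k * (1 - q) ** (n - k)   # same under Q
        total += math.comb(n, k) * abs(pk - qk)
    return 0.5 * total

def chernoff_information(p, q, grid=10_000):
    """-log min_a sum_x P(x)^a Q(x)^(1-a) for Bernoulli P, Q, via grid search on a."""
    best = min(
        p**a * q ** (1 - a) + (1 - p) ** a * (1 - q) ** (1 - a)
        for a in (i / grid for i in range(grid + 1))
    )
    return -math.log(best)

p, q = 0.5, 0.6  # hypothetical parameters
c = chernoff_information(p, q)
for n in (1, 10, 50, 200):
    d = tv_bernoulli_product(p, q, n)
    # Chernoff bound: 1 - d_TV <= exp(-n C), consistent with (4.6)
    print(n, round(d, 4), round(1 - d, 6), round(math.exp(-n * c), 6))
```

The printed total variation increases with $n$, and the error term $1 - d_{\mathrm{TV}}$ stays below $\exp(-nC)$; the $o(n)$ correction in (4.6) is not resolved by this check.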
It is clear from (4.6) that $d_{\mathrm{TV}}(P^{\otimes n}, Q^{\otimes n}) \to 1$ as $n \to \infty$, and, in fact, exponentially fast.

However, as frequently encountered in high-dimensional problems, if the distributions $P = P_n$ and $Q = Q_n$ depend on $n$, then the large-deviation approach that leads to (4.6) is no longer valid. In such a situation, total variation is still relevant for hypothesis testing, but its behavior as $n \to \infty$ is not obvious or easy to compute. In this case, understanding how a more computationally tractable f-divergence is related to total variation may give insight on hypothesis testing without needing to directly compute the total variation. It turns out Hellinger distance is precisely suited for this task (see Theorem 4.3 below).

Recall that the squared Hellinger distance, $H^2(P, Q) = \mathbb{E}_Q\left[ \left( 1 - \sqrt{\frac{P}{Q}} \right)^2 \right]$, is an f-divergence with $f(x) = (1 - \sqrt{x})^2$, which provides a sandwich bound for total variation:

$$ 0 \le \frac{1}{2} H^2(P, Q) \le d_{\mathrm{TV}}(P, Q) \le H(P, Q) \sqrt{1 - \frac{H^2(P, Q)}{4}} \le 1. \qquad (4.7) $$
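The sandwich bound (4.7) can be spot-checked on randomly generated finite distributions. This is only a numerical sanity check under assumed settings (alphabet size, number of trials, and seed are arbitrary, and the helper names are hypothetical); it is not a proof.

```python
import math
import random

def tv(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def hellinger_sq(p, q):
    """H^2(P,Q) = sum_x (sqrt(p(x)) - sqrt(q(x)))^2, a value in [0, 2]."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def random_dist(k, rng):
    """A random pmf on k points obtained by normalizing uniform weights."""
    w = [rng.random() for _ in range(k)]
    s = sum(w)
    return [x / s for x in w]

def sandwich_holds(trials=1000, k=5, seed=0, tol=1e-12):
    """Check (1/2) H^2 <= d_TV <= H sqrt(1 - H^2/4) on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        p, q = random_dist(k, rng), random_dist(k, rng)
        h2, d = hellinger_sq(p, q), tv(p, q)
        upper = math.sqrt(h2 * (1.0 - h2 / 4.0))
        if not (0.5 * h2 <= d + tol and d <= upper + tol):
            return False
    return True

print(sandwich_holds())
```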
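The proof of the sandwich bound (4.7) will be explained in the next lecture. A few observations which are direct consequences of these inequalities:

- $H^2(P, Q) = 2$ if and only if $d_{\mathrm{TV}}(P, Q) = 1$.
- $H^2(P, Q) = 0$ if and only if $d_{\mathrm{TV}}(P, Q) = 0$.
- Hellinger consistency $\Leftrightarrow$ TV consistency, namely $H^2(P_n, Q_n) \to 0 \Leftrightarrow d_{\mathrm{TV}}(P_n, Q_n) \to 0$.

Theorem 4.3. For any sequence of distributions $P_n$ and $Q_n$, as $n \to \infty$,

$$ d_{\mathrm{TV}}(P_n^{\otimes n}, Q_n^{\otimes n}) \to 0 \iff H^2(P_n, Q_n) = o\left( \frac{1}{n} \right), $$

$$ d_{\mathrm{TV}}(P_n^{\otimes n}, Q_n^{\otimes n}) \to 1 \iff H^2(P_n, Q_n) = \omega\left( \frac{1}{n} \right). $$

(For positive sequences $\{a_n\}, \{b_n\}$, we say $a_n = \omega(b_n)$ if $b_n = o(a_n)$.)

Proof. Because the observations $X = (X_1, X_2, \ldots, X_n)$ are i.i.d., the joint distribution factors:

$$ H^2(P_n^{\otimes n}, Q_n^{\otimes n}) = 2 - 2\, \mathbb{E}_{Q_n^{\otimes n}}\left[ \sqrt{ \prod_{i=1}^n \frac{P_n}{Q_n}(X_i) } \right] $$

$$ = 2 - 2 \prod_{i=1}^n \mathbb{E}_{Q_n}\left[ \sqrt{ \frac{P_n}{Q_n}(X_i) } \right] \quad \text{(by independence)} $$

$$ = 2 - 2 \left( \mathbb{E}_{Q_n}\left[ \sqrt{ \frac{P_n}{Q_n} } \right] \right)^n = 2 - 2 \left( 1 - \frac{1}{2} H^2(P_n, Q_n) \right)^n. $$

By the sandwich bound (4.7), $d_{\mathrm{TV}}(P_n^{\otimes n}, Q_n^{\otimes n}) \to 0$ if and only if $H^2(P_n^{\otimes n}, Q_n^{\otimes n}) \to 0$, which happens precisely when $\left( 1 - \frac{1}{2} H^2(P_n, Q_n) \right)^n \to 1$, which happens when $H^2(P_n, Q_n) = o\left( \frac{1}{n} \right)$. Similarly, $d_{\mathrm{TV}}(P_n^{\otimes n}, Q_n^{\otimes n}) \to 1$ if and only if $H^2(P_n^{\otimes n}, Q_n^{\otimes n}) \to 2$, which happens precisely when $\left( 1 - \frac{1}{2} H^2(P_n, Q_n) \right)^n \to 0$, if and only if $H^2(P_n, Q_n) = \omega\left( \frac{1}{n} \right)$.

Remark 4.5. The proof of Theorem 4.3 relies on two ingredients:

1. Sandwich bound (4.7).
2. Tensorization properties of Hellinger:
   $$ H^2\left( \prod_{i=1}^n P_i, \prod_{i=1}^n Q_i \right) = 2 - 2 \prod_{i=1}^n \left( 1 - \frac{1}{2} H^2(P_i, Q_i) \right). \qquad (4.8) $$

Note that there are other f-divergences that are also tensorizable, e.g., $\chi^2$-divergences:

$$ \chi^2\left( \prod_{i=1}^n P_i, \prod_{i=1}^n Q_i \right) = \prod_{i=1}^n \left( 1 + \chi^2(P_i, Q_i) \right) - 1; \qquad (4.9) $$

however, no sandwich inequality like (4.7) exists for $d_{\mathrm{TV}}$ and $\chi^2$ and hence there is no $\chi^2$-version of Theorem 4.3. Asserting the non-existence of such inequalities requires understanding the relationship between these two f-divergences.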
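The two ingredients of Remark 4.5 can be exercised numerically. The sketch below first verifies the tensorization identities for Hellinger and $\chi^2$ by brute-force enumeration of a small product alphabet, and then combines the Hellinger identity with the sandwich bound to bracket $d_{\mathrm{TV}}$ of an $n$-fold product, mirroring the proof of Theorem 4.3. The marginals, the scalings $n^{-2}$ and $n^{-1/2}$, and the function names are hypothetical choices for illustration.

```python
import math
from itertools import product

def hellinger_sq(p, q):
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def chi_sq(p, q):
    """chi^2(P||Q) = sum_x (p(x) - q(x))^2 / q(x), assuming q(x) > 0."""
    return sum((a - b) ** 2 / b for a, b in zip(p, q))

def product_dist(marginals):
    """Joint pmf of independent coordinates, listed over the product alphabet."""
    index_sets = [range(len(m)) for m in marginals]
    return [math.prod(m[i] for m, i in zip(marginals, idx))
            for idx in product(*index_sets)]

def tv_bounds_iid(h2_single, n):
    """Lower and upper bounds on d_TV(P^n, Q^n) from tensorization plus the sandwich bound."""
    h2_n = 2.0 - 2.0 * (1.0 - 0.5 * h2_single) ** n
    return 0.5 * h2_n, math.sqrt(h2_n * (1.0 - 0.25 * h2_n))

# Hypothetical marginals on alphabets of sizes 2, 3 and 2.
ps = [[0.5, 0.5], [0.2, 0.3, 0.5], [0.9, 0.1]]
qs = [[0.4, 0.6], [0.3, 0.3, 0.4], [0.7, 0.3]]
h2_brute = hellinger_sq(product_dist(ps), product_dist(qs))
h2_formula = 2.0 - 2.0 * math.prod(1.0 - 0.5 * hellinger_sq(p, q) for p, q in zip(ps, qs))
chi_brute = chi_sq(product_dist(ps), product_dist(qs))
chi_formula = math.prod(1.0 + chi_sq(p, q) for p, q in zip(ps, qs)) - 1.0
print(h2_brute, h2_formula, chi_brute, chi_formula)

n = 10_000
print(tv_bounds_iid(n ** -2.0, n))   # H^2 = o(1/n): both bounds are near 0
print(tv_bounds_iid(n ** -0.5, n))   # H^2 = omega(1/n): both bounds are near 1
```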
4.4 Inequalities between f-divergences

We will discuss two methods for finding inequalities between f-divergences.

- ad hoc approach: case-by-case proof using results like Jensen's inequality, max $\ge$ mean $\ge$ min, Cauchy-Schwarz, etc.
- systematic approach: joint range of f-divergences.

Definition 4.1. The joint range between two f-divergences $D_f$ and $D_g$ is the range of the mapping $(P, Q) \mapsto (D_f(P \| Q), D_g(P \| Q))$, i.e., the set $\mathcal{R} \subset \mathbb{R}_+ \times \mathbb{R}_+$ where $(x, y) \in \mathcal{R}$ if there exist distributions $P, Q$ on some common measurable space such that $x = D_f(P \| Q)$ and $y = D_g(P \| Q)$.

[Figure: an example joint range $\mathcal{R}$, drawn as a green region, with axes $D_f$ and $D_g$.]

The green region in the above figure shows what a joint range between $D_f$ and $D_g$ might look like. By definition of $\mathcal{R}$, the lower boundary gives the sharpest lower bound of $D_f$ in terms of $D_g$, namely:

$$ D_f(P \| Q) \ge V(D_g(P \| Q)), \quad \text{where } V(t) \triangleq \inf\{ D_f(P \| Q) : D_g(P \| Q) = t \}; $$

similarly, the upper boundary gives the best upper bound. As will be discussed in the next lecture, the sandwich bound (4.7) corresponds precisely to the lower and upper boundaries of the joint range of $H^2$ and $d_{\mathrm{TV}}$, and is therefore not improvable. It is important to note, however, that $\mathcal{R}$ may be an unbounded region and some of the boundaries may not exist, meaning it is impossible to bound one by the other, such as $\chi^2$ versus $d_{\mathrm{TV}}$.
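To see concretely why $\chi^2$ cannot be bounded from above by a function of $d_{\mathrm{TV}}$, one can look at a family of Bernoulli pairs. The sketch below is an illustration with hypothetical parameter choices: it keeps $P = \mathrm{Ber}(0.1)$ fixed and lets $Q = \mathrm{Ber}(\delta)$ with $\delta \to 0$, so that the total variation stays at most $0.1$ while $\chi^2(P \| Q)$ grows without bound.

```python
def tv_bernoulli(p, q):
    """d_TV(Ber(p), Ber(q)) = |p - q|."""
    return abs(p - q)

def chi_sq_bernoulli(p, q):
    """chi^2(Ber(p) || Ber(q)) = (p - q)^2 / q + (p - q)^2 / (1 - q), for 0 < q < 1."""
    return (p - q) ** 2 / q + (p - q) ** 2 / (1 - q)

p = 0.1  # hypothetical fixed parameter of P
for delta in (1e-2, 1e-4, 1e-6):
    print(delta, tv_bernoulli(p, delta), chi_sq_bernoulli(p, delta))
```

These points all lie in the joint range of $(d_{\mathrm{TV}}, \chi^2)$ with a bounded first coordinate and an unbounded second one, so that the corresponding boundary does not exist.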
To gain some intuition, we start with the ad hoc approach by proving Pinsker's inequality, which bounds total variation from above by the KL divergence.

Theorem 4.4 (Pinsker's inequality).

$$ D(P \| Q) \ge 2\, d_{\mathrm{TV}}^2(P, Q). \qquad (4.10) $$

Proof. First we show that, by the data processing inequality, it suffices to prove the result for Bernoulli distributions. For any event $E$, let $Y = \mathbf{1}\{X \in E\}$, which is Bernoulli with parameter $P(E)$ or $Q(E)$. By the data processing inequality, $D(P \| Q) \ge d(P(E) \| Q(E))$, where $d(p \| q)$ denotes the KL divergence between $\mathrm{Ber}(p)$ and $\mathrm{Ber}(q)$. If Pinsker's inequality is true for all Bernoulli random variables, we have

$$ \sqrt{ \frac{1}{2} D(P \| Q) } \ge d_{\mathrm{TV}}(\mathrm{Ber}(P(E)), \mathrm{Ber}(Q(E))) = |P(E) - Q(E)|. $$

Taking the supremum over $E$ gives

$$ \sqrt{ \frac{1}{2} D(P \| Q) } \ge \sup_E |P(E) - Q(E)| = d_{\mathrm{TV}}(P, Q), $$

in view of Theorem 4.2. The binary case follows easily from Taylor's theorem:

$$ d(p \| q) = \int_q^p \frac{p - t}{t(1 - t)} \, dt \ge 4 \int_q^p (p - t) \, dt = 2 (p - q)^2, $$

since $t(1 - t) \le \frac{1}{4}$, and $d_{\mathrm{TV}}(\mathrm{Ber}(p), \mathrm{Ber}(q)) = |p - q|$.

Remark 4.6. Pinsker's inequality is known to be sharp in the sense that the constant 2 in (4.10) is not improvable, i.e., there exist $\{P_n, Q_n\}$ such that $\frac{D(P_n \| Q_n)}{d_{\mathrm{TV}}^2(P_n, Q_n)} \to 2$ as $n \to \infty$. (Why?) Nevertheless, this does not mean that (4.10) itself is not improvable, because it might be possible to tighten the bound by a higher-order term in $d_{\mathrm{TV}}$. This is indeed the case and there are many refinements of Pinsker's inequality. But what is the best inequality? Settling this question rests on characterizing the joint range and the lower boundary. This is the topic of the next lecture.
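Both the binary case of Pinsker's inequality and the sharpness of the constant 2 can be explored numerically. In the sketch below the KL divergence is measured in nats, the inequality is checked on a grid of Bernoulli parameters, and the ratio $D / d_{\mathrm{TV}}^2$ is evaluated along the sequence $P_n = \mathrm{Ber}(1/2 + 1/n)$, $Q_n = \mathrm{Ber}(1/2)$. That sequence, the grid, and the function names are assumptions chosen for illustration; the lecture does not specify a particular sequence.

```python
import math

def kl_bernoulli(p, q):
    """d(p||q) in nats for 0 < p, q < 1."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def pinsker_holds_on_grid(steps=99, tol=1e-12):
    """Check d(p||q) >= 2 (p - q)^2 on the grid {1/(steps+1), ..., steps/(steps+1)}^2."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return all(kl_bernoulli(p, q) >= 2 * (p - q) ** 2 - tol
               for p in grid for q in grid)

def pinsker_ratio(n):
    """D / d_TV^2 for the assumed sequence P_n = Ber(1/2 + 1/n), Q_n = Ber(1/2)."""
    eps = 1.0 / n
    return kl_bernoulli(0.5 + eps, 0.5) / eps**2

print(pinsker_holds_on_grid())
for n in (4, 10, 100, 1000):
    print(n, pinsker_ratio(n))
```

The ratio decreases toward 2 along this sequence, which is consistent with the constant in (4.10) being unimprovable while leaving room for higher-order refinements.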
More informationECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015
ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],
More informationLecture 19. sup y 1,..., yn B d n
STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s
More informationHOMEWORK I: PREREQUISITES FROM MATH 727
HOMEWORK I: PREREQUISITES FROM MATH 727 Questio. Let X, X 2,... be idepedet expoetial radom variables with mea µ. (a) Show that for Z +, we have EX µ!. (b) Show that almost surely, X + + X (c) Fid the
More informationMathematical Methods for Physics and Engineering
Mathematical Methods for Physics ad Egieerig Lecture otes Sergei V. Shabaov Departmet of Mathematics, Uiversity of Florida, Gaiesville, FL 326 USA CHAPTER The theory of covergece. Numerical sequeces..
More informationLecture Notes for Analysis Class
Lecture Notes for Aalysis Class Topological Spaces A topology for a set X is a collectio T of subsets of X such that: (a) X ad the empty set are i T (b) Uios of elemets of T are i T (c) Fiite itersectios
More informationFirst Year Quantitative Comp Exam Spring, Part I - 203A. f X (x) = 0 otherwise
First Year Quatitative Comp Exam Sprig, 2012 Istructio: There are three parts. Aswer every questio i every part. Questio I-1 Part I - 203A A radom variable X is distributed with the margial desity: >
More informationChapter 7 Isoperimetric problem
Chapter 7 Isoperimetric problem Recall that the isoperimetric problem (see the itroductio its coectio with ido s proble) is oe of the most classical problem of a shape optimizatio. It ca be formulated
More informationMachine Learning Theory (CS 6783)
Machie Learig Theory (CS 6783) Lecture 2 : Learig Frameworks, Examples Settig up learig problems. X : istace space or iput space Examples: Computer Visio: Raw M N image vectorized X = 0, 255 M N, SIFT
More information13.1 Shannon lower bound
ECE598: Iformatio-theoretic methods i high-dimesioal statistics Srig 016 Lecture 13: Shao lower boud, Fao s method Lecturer: Yihog Wu Scribe: Daewo Seo, Mar 8, 016 [Ed Mar 11] I the last class, we leared
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More informationBeurling Integers: Part 2
Beurlig Itegers: Part 2 Isomorphisms Devi Platt July 11, 2015 1 Prime Factorizatio Sequeces I the last article we itroduced the Beurlig geeralized itegers, which ca be represeted as a sequece of real umbers
More informationREGRESSION WITH QUADRATIC LOSS
REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d
More informationAda Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities
CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We
More informationCEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering
CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 5
CS434a/54a: Patter Recogitio Prof. Olga Veksler Lecture 5 Today Itroductio to parameter estimatio Two methods for parameter estimatio Maimum Likelihood Estimatio Bayesia Estimatio Itroducto Bayesia Decisio
More informationInfinite Sequences and Series
Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet
More information