
LECTURE 2

Information Measures

2.1 ENTROPY

Let X be a discrete random variable on an alphabet 𝒳 drawn according to the probability mass function (pmf) p(x) = P(X = x), x ∈ 𝒳, denoted in short as X ~ p(x). The uncertainty about the outcome of X, or equivalently, the amount of information gained by observing X, is measured by its entropy

H(X) = −∑_{x∈𝒳} p(x) log p(x) = E[log(1/p(X))].

By continuity, we use the convention 0 log 0 = 0 in the above summation. Sometimes we denote H(X) by H(p(x)), highlighting the fact that H(X) is a functional of the pmf p(x).

Example 2.1. If X is a Bernoulli random variable with parameter p = P{X = 1} ∈ [0,1] (in short, X ~ Bern(p)), then

H(X) = p log(1/p) + (1 − p) log(1/(1 − p)).

With a slight abuse of notation, we denote this quantity by H(p) and refer to it as the binary entropy function.

[Figure 2.1. The binary entropy function H(p): zero at p = 0 and p = 1, maximal at p = 1/2.]
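As a quick numerical companion (a minimal Python sketch; the function name is ours, not from the notes), the binary entropy function can be evaluated directly from its definition:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = p log2(1/p) + (1-p) log2(1/(1-p)), with the 0 log 0 = 0 convention."""
    if p in (0.0, 1.0):
        return 0.0  # by continuity, 0 log 0 = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# H(p) is symmetric about p = 1/2 and attains its maximum of 1 bit there.
print(binary_entropy(0.5))   # 1.0
print(binary_entropy(0.11))  # ~0.5 bits
```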

Example 2.2. If X is a geometric random variable with parameter p ∈ (0,1] (in short, X ~ Geom(p)), then

H(X) = −∑_{i=1}^∞ p(1 − p)^{i−1} log(p(1 − p)^{i−1}) = log(1/p) + ((1 − p)/p) log(1/(1 − p)) = H(p)/p.

The entropy H(X) satisfies the following properties.

1. H(X) ≥ 0.
2. H(X) is a concave function in p(x).
3. H(X) ≤ log|𝒳|.
4. H(X) is invariant under any one-to-one transformation of X, i.e., H(X) = H(f(X)), provided that f is injective. In particular, H(X) = H(X + a) and H(X) = H(aX) for a ≠ 0.

The first property is trivial. The proof of the second property is left as an exercise. For the proof of the third property, we recall the following.

Lemma 2.1 (Jensen's inequality). If f(x) is convex, then E[f(X)] ≥ f(E[X]). If f(x) is concave, then E[f(X)] ≤ f(E[X]).

Now by the concavity of the logarithm function and Jensen's inequality,

H(X) = E[log(1/p(X))] ≤ log E[1/p(X)] = log ∑_{x: p(x)≠0} p(x)·(1/p(x)) = log|{x: p(x) ≠ 0}| ≤ log|𝒳|.

Let (X, Y) be a pair of discrete random variables. Then the conditional entropy of Y given X is defined as

H(Y|X) = E[log(1/p(Y|X))],
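The bound H(X) ≤ log|𝒳| and the geometric-entropy formula are both easy to check numerically (a sketch; `entropy` is our own helper, and the infinite series is truncated):

```python
import math

def entropy(pmf):
    """H(X) = -sum_x p(x) log2 p(x), with 0 log 0 = 0 handled by skipping zeros."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# H(X) <= log|X|, with equality for the uniform pmf.
assert entropy([0.5, 0.25, 0.125, 0.125]) <= math.log2(4)
assert entropy([0.25] * 4) == math.log2(4)

# Geometric source: H(Geom(p)) = H(p)/p; compare a truncated series to the closed form.
p = 0.3
series = -sum(p * (1 - p) ** (i - 1) * math.log2(p * (1 - p) ** (i - 1))
              for i in range(1, 1000))
closed_form = (-p * math.log2(p) - (1 - p) * math.log2(1 - p)) / p
print(series, closed_form)  # both ~2.94 bits
```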

where p(y|x) = p(x, y)/p(x) is the conditional pmf of Y given {X = x}. We sometimes use the notation H(Y|X = x) = H(p(y|x)), x ∈ 𝒳. In this notation,

H(Y|X) = ∑_x p(x) ∑_y p(y|x) log(1/p(y|x)) = ∑_x p(x) H(Y|X = x).

More generally, if X is an arbitrary random variable and Y|{X = x} ~ p(y|x) is discrete for every x ∈ 𝒳, then the conditional entropy can be defined as in the two displays above:

H(Y|X) = E[log(1/p(Y|X))] = ∫ H(Y|X = x) dP(x).

If Y is discrete, then by the concavity of H(p(y)) in p(y) and Jensen's inequality,

∫ H(p(y|x)) dP(x) ≤ H(∫ p(y|x) dP(x)) = H(p(y)),

where the inequality holds with equality if p(y|x) ≡ p(y), or equivalently, X and Y are independent. We summarize this relationship between the conditional and unconditional entropies as follows.

Conditioning reduces entropy.

H(Y|X) ≤ H(Y)

with equality if X and Y are independent.

Let (X, Y) ~ p(x, y) be a pair of discrete random variables. Their joint entropy is

H(X, Y) = E[log(1/p(X, Y))].

By the chain rule of probability p(x, y) = p(x)p(y|x) = p(y)p(x|y), we have the chain rule of entropy

H(X, Y) = E[log(1/p(X))] + E[log(1/p(Y|X))] = H(X) + H(Y|X) = H(Y) + H(X|Y).

More generally, for an n-tuple of random variables Xⁿ = (X₁, X₂, ..., Xₙ), we have the following.
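For a concrete check of the chain rule and of conditioning reducing entropy, consider a small made-up joint pmf (a sketch; the numbers are arbitrary):

```python
import math

def H(probs):
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}  # p(x, y)
px = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
py = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}

# H(Y|X) = sum_x p(x) H(p(y|x)) ...
H_Y_given_X = sum(px[x] * H([joint[(x, y)] / px[x] for y in (0, 1)]) for x in (0, 1))
# ... which by the chain rule must equal H(X,Y) - H(X).
chain = H(joint.values()) - H(px.values())
print(H_Y_given_X, chain, H(py.values()))  # H(Y|X) = H(X,Y) - H(X) <= H(Y)
```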

Chain rule of entropy.

H(Xⁿ) = H(X₁) + H(X₂|X₁) + ⋯ + H(Xₙ|Xⁿ⁻¹) = ∑_{i=1}^n H(Xᵢ|Xⁱ⁻¹),

where X⁰ is set to be an unspecified constant by convention.

By the chain rule and the fact that conditioning reduces entropy, we can upper bound the joint entropy as

H(Xⁿ) ≤ ∑_{i=1}^n H(Xᵢ)

with equality if X₁, ..., Xₙ are mutually independent.

2.2 DIFFERENTIAL ENTROPY

Let X be a continuous random variable drawn according to the probability density function (pdf)

p(x) = dP(X ≤ x)/dx, x ∈ ℝ,

denoted in short as X ~ p(x). Its differential entropy is defined as

h(X) = −∫ p(x) log p(x) dx = E[log(1/p(X))].

Example 2.3. If X is a uniform random variable over the interval [a, b] (in short, X ~ Unif[a, b]), then

h(X) = ∫_a^b (1/(b − a)) log(b − a) dx = E[log(b − a)] = log(b − a).

Example 2.4. If X is a Gaussian random variable with mean μ and variance σ² (in short, X ~ N(μ, σ²)), then

h(X) = −∫ (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)} log((1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}) dx
     = (1/2) log(2πσ²) + (log e/(2σ²)) E[(X − μ)²]
     = (1/2) log(2πeσ²).
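The Gaussian formula above can be sanity-checked by Monte Carlo (a sketch; the parameters and sample size are arbitrary): the sample average of log(1/p(X)) over draws X ~ N(μ, σ²) should approach (1/2) log(2πeσ²).

```python
import math
import random

random.seed(0)
mu, sigma, n = 1.0, 2.0, 200_000

def log2_pdf(x):
    """log2 of the N(mu, sigma^2) density at x."""
    return (-math.log2(math.sqrt(2 * math.pi) * sigma)
            - ((x - mu) ** 2 / (2 * sigma ** 2)) * math.log2(math.e))

# Monte Carlo estimate of h(X) = E[log2(1/p(X))].
estimate = -sum(log2_pdf(random.gauss(mu, sigma)) for _ in range(n)) / n
closed_form = 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)
print(estimate, closed_form)  # both ~3.05 bits
```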

The differential entropy h(X) satisfies the following properties.

1. h(X) can be negative.
2. h(X) is a concave function in p(x).
3. h(X) is invariant under any transformation that preserves the Lebesgue measure. For example, h(X) = h(−X) and h(X) = h(X + a).
4. h(aX) = h(X) + log|a| for a ≠ 0.

If X is an arbitrary random variable and Y|{X = x} is continuous with pdf p(y|x) for every x ∈ 𝒳, then the conditional differential entropy of Y given X is defined as

h(Y|X) = E[log(1/p(Y|X))] = ∫∫ p(y|x) log(1/p(y|x)) dy dP(x) = ∫ h(Y|X = x) dP(x),

where h(Y|X = x) = h(p(y|x)) denotes the differential entropy of p(y|x) for a fixed x. Suppose that Y is continuous. As in the discrete case, by concavity and Jensen's inequality, we have the following.

Conditioning reduces differential entropy.

h(Y|X) ≤ h(Y)

with equality if X and Y are independent.

Let (X, Y) ~ p(x, y) be a pair of continuous random variables. Their joint differential entropy is

h(X, Y) = E[log(1/p(X, Y))] = E[log(1/p(X))] + E[log(1/p(Y|X))] = h(X) + h(Y|X) = h(Y) + h(X|Y).

As in the discrete case, for an n-tuple of continuous random variables Xⁿ = (X₁, X₂, ..., Xₙ), we have the following.

Chain rule of differential entropy.

h(Xⁿ) = ∑_{i=1}^n h(Xᵢ|Xⁱ⁻¹),

where X⁰ is set to be an unspecified constant by convention. By the chain rule and the fact that conditioning reduces differential entropy, we can upper bound the joint differential entropy as

h(Xⁿ) ≤ ∑_{i=1}^n h(Xᵢ)

with equality if X₁, ..., Xₙ are mutually independent.

Example 2.5. Let Xⁿ = (X₁, ..., Xₙ) be jointly Gaussian random variables with mean vector μ and covariance matrix K. Then their differential entropy is

h(Xⁿ) = (1/2) log((2πe)ⁿ |K|),

where |K| denotes the determinant of K. By the upper bound above,

h(Xⁿ) ≤ ∑_{i=1}^n h(Xᵢ) = ∑_{i=1}^n (1/2) log(2πe Kᵢᵢ).

By comparing these two displays, we can easily establish the famous Hadamard inequality

|K| ≤ ∏_{i=1}^n Kᵢᵢ.

2.3 RELATIVE ENTROPY

Let p(x) and q(x) be a pair of pmfs on 𝒳. The extent of discrepancy between p(x) and q(x) is measured by their relative entropy (also referred to as Kullback–Leibler divergence)

D(p‖q) = D(p(x)‖q(x)) = E_p[log(p(X)/q(X))] = ∑_{x∈𝒳} p(x) log(p(x)/q(x)),

where the expectation is taken w.r.t. X ~ p(x). Note that this quantity is well defined only when p(x) is absolutely continuous w.r.t. q(x), namely, p(x) = 0 whenever q(x) = 0. Otherwise, we define D(p‖q) = ∞, which follows by adopting the convention 1/0 = ∞
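Relative entropy between pmfs is straightforward to compute; the sketch below (our own helper, with the ∞ convention built in) also shows nonnegativity and asymmetry on an arbitrary example:

```python
import math

def D(p, q):
    """D(p||q) in bits; infinite when p is not absolutely continuous w.r.t. q."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # p puts mass where q does not: convention 1/0 = inf
            total += pi * math.log2(pi / qi)
    return total

p = [0.5, 0.3, 0.2]
q = [1 / 3, 1 / 3, 1 / 3]
print(D(p, q), D(q, p))           # nonnegative and, in general, asymmetric
print(D(p, p))                    # 0.0: vanishes iff the two pmfs coincide
print(D([1.0, 0.0], [0.0, 1.0]))  # inf: p not absolutely continuous w.r.t. q
```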

as well. The relative entropy of two pdfs p(x) and q(x) is similarly defined, with the summation replaced by an integral.

The relative entropy D(p‖q) satisfies the following properties.

1. D(p‖q) ≥ 0 with equality if and only if (iff) p ≡ q, namely, p(x) = q(x) for every x ∈ 𝒳.
2. D(p‖q) is not symmetric, i.e., D(p‖q) ≠ D(q‖p) in general.
3. D(p‖q) is convex in (p, q), i.e., for any (p₁, q₁), (p₂, q₂), and λ ∈ [0,1] with λ̄ = 1 − λ,

λD(p₁‖q₁) + λ̄D(p₂‖q₂) ≥ D(λp₁ + λ̄p₂ ‖ λq₁ + λ̄q₂).

4. Chain rule. For any p(x, y) and q(x, y),

D(p(x, y)‖q(x, y)) = D(p(x)‖q(x)) + ∑_x p(x) D(p(y|x)‖q(y|x)) = D(p(x)‖q(x)) + E_p[D(p(y|X)‖q(y|X))].

The proof of the first three properties is left as an exercise. For the fourth property, consider

D(p(x, y)‖q(x, y)) = E_p[log(p(X, Y)/q(X, Y))] = E_p[log(p(X)/q(X))] + E_p[log(p(Y|X)/q(Y|X))].

The notion of relative entropy can be extended to arbitrary probability measures P and Q defined on the same sample space and set of events as

D(P‖Q) = ∫ log(dP/dQ) dP,

where dP/dQ is the Radon–Nikodym derivative of P w.r.t. Q. (If P is not absolutely continuous w.r.t. Q, then D(P‖Q) = ∞.) In particular, if P and Q have respective densities p and q w.r.t. a σ-finite measure μ on 𝒳, then

D(P‖Q) = D(p‖q) = ∫ p(x) log(p(x)/q(x)) dμ(x).

This definition generalizes our earlier definition of the relative entropy for pmfs and pdfs since they can be viewed as densities w.r.t. counting and Lebesgue measures, respectively.

2.4 f-DIVERGENCE

We digress a bit to generalize the notion of relative entropy. Let f: [0, ∞) → ℝ be convex with f(1) = 0. Then the f-divergence between a pair of densities p and q w.r.t. μ is defined as

D_f(p‖q) = ∫ q(x) f(p(x)/q(x)) dμ(x) = E_q[f(p(X)/q(X))].
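The f-divergence definition is easy to exercise numerically; the sketch below (restricted to strictly positive pmfs to avoid edge cases) recovers D(p‖q) and D(q‖p) from the two choices of f discussed next in the notes, along with the total variation distance, which corresponds to f(u) = |u − 1|/2:

```python
import math

def f_divergence(p, q, f):
    """D_f(p||q) = sum_x q(x) f(p(x)/q(x)), for strictly positive pmfs p, q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

kl = f_divergence(p, q, lambda u: u * math.log2(u))         # f(u) = u log u -> D(p||q)
rkl = f_divergence(p, q, lambda u: -math.log2(u))           # f(u) = -log u  -> D(q||p)
sym = f_divergence(p, q, lambda u: (u - 1) * math.log2(u))  # symmetrized sum
tv = f_divergence(p, q, lambda u: abs(u - 1) / 2)           # total variation distance
print(kl, rkl, sym, tv)
```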

Example 2.6. Let f(u) = u log u. Then

D_f(p‖q) = ∫ q(x) (p(x)/q(x)) log(p(x)/q(x)) dμ(x) = ∫ p(x) log(p(x)/q(x)) dμ(x) = D(p‖q).

Example 2.7. Now let f(u) = −log u. Then

D_f(p‖q) = −∫ q(x) log(p(x)/q(x)) dμ(x) = ∫ q(x) log(q(x)/p(x)) dμ(x) = D(q‖p).

Example 2.8. Combining the above two cases, let f(u) = (u − 1) log u. Then

D_f(p‖q) = D(p‖q) + D(q‖p),

which is symmetric in (p, q).

Many basic distance functions on probability measures can be represented as f-divergences; see, for example, Liese and Vajda.

2.5 MUTUAL INFORMATION

Let (X, Y) be a pair of discrete random variables with joint pmf p(x, y) = p(x)p(y|x). The amount of information about one provided by the other is measured by their mutual information

I(X;Y) = D(p(x, y)‖p(x)p(y)) = ∑_{x,y} p(x, y) log(p(x, y)/(p(x)p(y))) = ∑_x p(x) ∑_y p(y|x) log(p(y|x)/p(y)) = ∑_x p(x) D(p(y|x)‖p(y)).

The mutual information I(X;Y) satisfies the following properties.

1. I(X;Y) is a nonnegative function of p(x, y).
2. I(X;Y) = 0 iff X and Y are independent, i.e., p(x, y) ≡ p(x)p(y).
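The equivalent expressions for I(X;Y) can be cross-checked on a small joint pmf (a sketch with made-up numbers):

```python
import math

def H(probs):
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.15, (1, 1): 0.35}  # p(x, y)
px = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
py = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}

# I(X;Y) as a relative entropy D(p(x,y) || p(x)p(y)) ...
I_kl = sum(pxy * math.log2(pxy / (px[x] * py[y])) for (x, y), pxy in joint.items())
# ... and via entropies: I(X;Y) = H(X) + H(Y) - H(X,Y).
I_ent = H(px.values()) + H(py.values()) - H(joint.values())
print(I_kl, I_ent)  # the two expressions agree (~0.191 bits)
```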

3. As a function of (p(x), p(y|x)), I(X;Y) is concave in p(x) for a fixed p(y|x), and convex in p(y|x) for a fixed p(x).
4. Mutual information and entropy.

I(X;X) = H(X)

and

I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X,Y).

5. Variational characterization.

I(X;Y) = min_{q(y)} ∑_x p(x) D(p(y|x)‖q(y)),

where the minimum is attained by q(y) ≡ p(y).

The proof of the first four properties is left as an exercise. For the fifth property, consider

I(X;Y) = ∑_{x,y} p(x, y) log(p(y|x)/p(y)) = ∑_{x,y} p(x)p(y|x) log((p(y|x)/q(y)) · (q(y)/p(y))) = ∑_x p(x) D(p(y|x)‖q(y)) − ∑_y p(y) log(p(y)/q(y)) ≤ ∑_x p(x) D(p(y|x)‖q(y)),

where the last inequality follows since the subtracted term is equal to D(p(y)‖q(y)) ≥ 0, and holds with equality iff p(y) ≡ q(y).

The notion of mutual information can be generalized to an arbitrary pair of random variables. Let (X, Y) be drawn according to the probability distribution P_{X,Y}, and let P_X and P_Y be the marginal distributions of X and Y, respectively. Then the mutual information between X and Y is defined as

I(X;Y) = D(P_{X,Y} ‖ P_X × P_Y).

Recall that in the discrete case the joint pmf is always absolutely continuous w.r.t. the product of the marginal pmfs; in general, if P_{X,Y} is not absolutely continuous w.r.t. P_X × P_Y, then I(X;Y) = ∞. This mutual information can be expressed equivalently as

I(X;Y) = sup I(x̃(X); ỹ(Y)),

where the supremum is over all finite-valued functions x̃(x) and ỹ(y).
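The variational characterization can be checked by brute force: scan candidate output pmfs q(y) on a grid and confirm that the objective ∑_x p(x) D(p(y|x)‖q(y)) is minimized at the true marginal p(y) (a sketch; the input pmf and channel below are arbitrary):

```python
import math

def kl(p, q):
    """D(p||q) in bits for strictly positive q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

px = [0.4, 0.6]
W = [[0.8, 0.2], [0.3, 0.7]]  # channel p(y|x)
py = [sum(px[x] * W[x][y] for x in range(2)) for y in range(2)]  # marginal p(y)

def objective(q):
    return sum(px[x] * kl(W[x], q) for x in range(2))

# Scan candidate q(y) on a grid: the minimizer should be the marginal p(y).
best_q = min(([t, 1 - t] for t in (i / 1000 for i in range(1, 1000))),
             key=objective)
print(best_q, py)  # best grid point coincides with p(y) = (0.5, 0.5)
```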

As a special case, suppose that X is discrete with pmf p(x) and Y|{X = x} is continuous with probability density function (pdf) p(y|x). Then Y is continuous with marginal pdf p(y) = ∑_x p(x)p(y|x), and the conditional distribution of X given Y (or the posterior) is

p(x|y) = p(x)p(y|x)/p(y) = p(x)p(y|x) / ∑_{x′} p(x′)p(y|x′).

The mutual information can now be written as

I(X;Y) = H(X) − H(X|Y).

Flipping the roles of X and Y, this mutual information can also be written as

I(X;Y) = h(Y) − h(Y|X).

2.6 CAPACITY

Sometimes we are interested in the maximum mutual information max_{p(x)} I(X;Y) of a conditional pmf p(y|x), which is referred to as the information capacity (or the capacity in short). By the variational characterization, the information capacity can be expressed as

max_{p(x)} I(X;Y) = max_{p(x)} min_{q(y)} ∑_x p(x) D(p(y|x)‖q(y)),

which can be viewed as a game between two players, one choosing p(x) first and the other choosing q(y) next, with the payoff function

f(p(x), q(y)) = ∑_x p(x) D(p(y|x)‖q(y)).

Using the following fundamental result in game theory, we show that the order of plays can be exchanged without affecting the outcome of the game.

Minimax theorem (Sion). Suppose that U and V are compact convex subsets of the Euclidean space, and that a real-valued continuous function f(u, v) on U × V is concave in u for each v and convex in v for each u. Then

max_{u∈U} min_{v∈V} f(u, v) = min_{v∈V} max_{u∈U} f(u, v).
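This alternating max–min structure is exactly what the classical Blahut–Arimoto algorithm exploits to compute the information capacity numerically. The algorithm is not covered in these notes; the following is a standard sketch of it:

```python
import math

def blahut_arimoto(W, iters=200):
    """Approximate max_{p(x)} I(X;Y) in bits for a channel W[x][y] = p(y|x)
    by alternating optimization over p(x) and q(y) (Blahut-Arimoto)."""
    nx, ny = len(W), len(W[0])
    p = [1.0 / nx] * nx  # start from the uniform input pmf
    for _ in range(iters):
        q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # c(x) = 2^{D(p(y|x) || q(y))}
        c = [2 ** sum(W[x][y] * math.log2(W[x][y] / q[y])
                      for y in range(ny) if W[x][y] > 0)
             for x in range(nx)]
        s = sum(p[x] * c[x] for x in range(nx))
        p = [p[x] * c[x] / s for x in range(nx)]  # reweight toward informative inputs
    return math.log2(s)

# Binary symmetric channel with crossover 0.1: capacity = 1 - H(0.1) ~ 0.531 bits.
bsc = [[0.9, 0.1], [0.1, 0.9]]
print(blahut_arimoto(bsc))
```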

Since f(p(x), q(y)) is linear (thus concave) in p(x) and convex in q(y) (recall Property 3 of relative entropy), we can apply the minimax theorem and conclude that

max_{p(x)} I(X;Y) = max_{p(x)} min_{q(y)} ∑_x p(x) D(p(y|x)‖q(y)) = min_{q(y)} max_{p(x)} ∑_x p(x) D(p(y|x)‖q(y)) = min_{q(y)} max_x D(p(y|x)‖q(y)),

where the last equality follows by noting that the maximum expectation is attained by putting all the weight on the value x that maximizes D(p(y|x)‖q(y)). Furthermore, if p*(x) attains the maximum, then by the optimality condition of the variational characterization,

q*(y) ≡ ∑_x p*(x)p(y|x)

attains the minimum.

2.7 ENTROPY RATE

Let X = (Xₙ)_{n=1}^∞ be a random process on a finite alphabet 𝒳. The amount of uncertainty per symbol is measured by its entropy rate

H(X) = lim_{n→∞} (1/n) H(X₁, ..., Xₙ),

if the limit exists.

Example 2.9. If X is an aperiodic irreducible Markov chain, then the limit exists and

H(X) = lim_{n→∞} H(Xₙ|Xₙ₋₁) = ∑_{x₁} π(x₁) H(p(x₂|x₁)),

where π is the unique stationary distribution of the chain.

Example 2.10. If X₁, X₂, ... are independent and identically distributed (i.i.d.), then H(X) = H(X₁).

Example 2.11. Let Y = (Yₙ)_{n=1}^∞ be a stationary Markov chain and Xₙ = f(Yₙ), n = 1, 2, .... The resulting random process X = (Xₙ)_{n=1}^∞ is hidden Markov and its entropy rate satisfies

H(Xₙ|Xⁿ⁻¹, Y₁) ≤ H(X) ≤ H(Xₙ|Xⁿ⁻¹)

and

H(X) = lim_{n→∞} H(Xₙ|Xⁿ⁻¹, Y₁) = lim_{n→∞} H(Xₙ|Xⁿ⁻¹).
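The Markov-chain entropy rate formula is easy to evaluate for a two-state chain (a sketch; the transition probabilities are arbitrary):

```python
import math

def H(probs):
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Two-state chain with transition matrix P[x1][x2] (rows sum to 1).
P = [[0.9, 0.1], [0.2, 0.8]]
# The stationary distribution solves pi = pi P; for two states,
# pi(0) = P[1][0] / (P[0][1] + P[1][0]).
pi0 = P[1][0] / (P[0][1] + P[1][0])
pi = [pi0, 1 - pi0]

# Entropy rate: H(X) = sum_{x1} pi(x1) H(p(x2 | x1)).
rate = sum(pi[x] * H(P[x]) for x in range(2))
print(pi, rate)  # pi = (2/3, 1/3), rate ~0.553 bits per symbol
```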

Example 2.12. If X is stationary, namely,

P{X₁ = x₁, X₂ = x₂, ..., Xₙ = xₙ} = P{X₁₊ₘ = x₁, X₂₊ₘ = x₂, ..., Xₙ₊ₘ = xₙ}

for every m, n, and x₁, x₂, ..., xₙ ∈ 𝒳, then the entropy rate is

H(X) = lim_{n→∞} H(Xₙ|Xⁿ⁻¹).

2.8 ERGODICITY

Let X = (Xₙ)_{n∈N} be a random process on 𝒳 with indices in N, and let (Ω, F, P) be the associated probability space, where Ω = 𝒳^N is the sample space (the set of outcomes), F = σ((Xₙ)_{n∈N}) is the σ-algebra generated by X (the set of events), and P is the probability measure. Note that Xₙ(ω) = ωₙ, where the latter denotes the n-th coordinate of ω. The index set N is typically one of ℕ = {1, 2, ...}, ℤ⁺ = {0, 1, ...}, and ℤ = {..., −1, 0, 1, ...}, and is omitted when it is irrelevant or clear from the context.

Let T be a time shift operator on Ω, that is, if

ω = (..., ω₋₁, [ω₀], ω₁, ...), then Tω = (..., ω₀, [ω₁], ω₂, ...).

Here the boxes denote the zeroth coordinate. In this notation, the process X = (Xₙ) is stationary if

P(T⁻¹A) = P({ω: Tω ∈ A}) = P(A) for every A ∈ F.

If the process is double-sided (i.e., N = ℤ), stationarity is equivalently defined by P(A) = P(TA). For example, if A = {ω: ω₀ = 1}, then

T⁻¹A = {ω: (Tω)₀ = 1} = {x₁ = 1},   TA = {Tω: ω₀ = 1} = {x₋₁ = 1}.

Thus P(A) = P(X₀ = 1), while P(T⁻¹A) = P(X₁ = 1) and P(TA) = P(X₋₁ = 1).

An event A is said to be shift-invariant if A = T⁻¹A. The process X = (Xₙ) is said to be ergodic if every shift-invariant event A is trivial, namely, P(A) = 0 or 1.

Example 2.13. Let

X = (0, 1, 0, 1, ...) with probability (w.p.) 1/2,   X = (1, 0, 1, 0, ...) w.p. 1/2,

that is, X alternates deterministically between 0 and 1 with a uniformly random starting value. Then P(T⁻¹A) = P(A) for every A. Thus, X is stationary. Moreover, A = T⁻¹A implies A = ∅ or Ω. Thus, P(A) ∈ {0, 1} and X is ergodic.

Example 2.14. Let X be defined as in Example 2.13, Z be an independent copy of X, and Yₙ = (Xₙ, Zₙ). As before, Y is stationary. Consider the event

A = {y = (x, z): xᵢ = zᵢ for all i},

which is shift-invariant. However, P(A) = P(X = Z) = 1/2, and Y is not ergodic. Thus, a composite of two independent ergodic processes is not necessarily ergodic.

Example 2.15. Let X₁, X₂, ... be i.i.d. Then X = (Xₙ)_{n=1}^∞ is ergodic. To prove this, first observe that any shift-invariant A must be in the tail σ-algebra T = ⋂_{n=1}^∞ σ(Xₙ, Xₙ₊₁, ...); see the problem on the tail σ-algebra at the end of this lecture. Since X₁, X₂, ... are independent, by the Kolmogorov 0-1 law, P(A) = 0 or 1 and X is ergodic.

Example 2.16. Let

Θ = 1/3 w.p. 1/2,   Θ = 2/3 w.p. 1/2,

and given {Θ = θ}, let X₁, X₂, ... be i.i.d. Bern(θ). Note that X₁, X₂, ... are unconditionally dependent. Consider the shift-invariant event

A = {ω: limsup_{n→∞} (1/n) ∑_{i=1}^n Xᵢ(ω) ≤ 1/2} = {ω: limsup_{n→∞} (1/n) ∑_{i=2}^{n+1} Xᵢ(ω) ≤ 1/2} = T⁻¹A.

Then by the strong law of large numbers,

P(A|Θ = θ) = 1 if θ = 1/3, and 0 otherwise,

and

P(A) = (1/2) P(A|Θ = 1/3) + (1/2) P(A|Θ = 2/3) = 1/2.

Thus, X is not ergodic. In general, a mixture of ergodic processes is not ergodic. However, every stationary process can be viewed as a mixture of stationary ergodic processes.

The strong law of large numbers states that the time average of an i.i.d. random sequence converges to the ensemble average with probability one. This deterministic behavior arises more generally for ergodic processes.

Theorem (Pointwise ergodic theorem (Birkhoff)). Let X = (Xₙ)_{n=1}^∞ be stationary and ergodic with E|X₁| < ∞. Then

lim_{n→∞} (1/n) ∑_{i=1}^n Xᵢ = E(X₁) a.s.
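Birkhoff's theorem can be illustrated by simulation (a sketch; the chain and its transition probabilities are arbitrary choices): for an ergodic two-state Markov chain, the time average of the samples approaches the stationary mean. We start the chain at a fixed state, which does not affect the limit for an irreducible chain.

```python
import random

random.seed(1)
n = 100_000
x, total = 0, 0
for _ in range(n):
    total += x
    # Transitions: P(1|0) = 0.1, P(0|1) = 0.2, so the stationary pmf is (2/3, 1/3).
    x = int(random.random() < 0.1) if x == 0 else int(random.random() >= 0.2)
time_average = total / n
print(time_average)  # close to the ensemble average E(X) = pi(1) = 1/3
```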

If E(X₁²) < ∞, then the convergence above also holds in the L² sense, which is often referred to as von Neumann's mean ergodic theorem.

The following result is an analog of the ergodic theorem for information theory.

Theorem (Shannon, McMillan, Breiman). If X = (Xₙ)_{n=1}^∞ is stationary and ergodic, then

lim_{n→∞} −(1/n) log p(Xⁿ) = H(X) a.s.

Roughly speaking, the theorem states that every realized sequence xⁿ has a probability nearly equal to 2^{−nH(X)} (asymptotic equipartition property). For the proof of the theorem, refer to Cover and Thomas.

2.9 RELATIVE ENTROPY RATE

Let P and Q be two probability measures on 𝒳^∞ with n-th order densities p(xⁿ) and q(xⁿ), respectively. The normalized discrepancy between them is measured by their relative entropy rate

D(P‖Q) = lim_{n→∞} (1/n) D(p(xⁿ)‖q(xⁿ)),

if the limit exists.

Example 2.17. If P is stationary, Q is stationary finite-order Markov, and P is absolutely continuous w.r.t. Q, then the limit exists and

D(P‖Q) = lim_{n→∞} ∫ D(p(xₙ|xⁿ⁻¹)‖q(xₙ|xⁿ⁻¹)) dP(xⁿ⁻¹).

See, for example, Barron and Gray.

Example 2.18. Similarly, if P and Q are stationary hidden Markov and P is absolutely continuous w.r.t. Q, then the limit exists and the expression above holds (Juang and Rabiner, Leroux, Ephraim and Merhav).

The following generalizes the Shannon–McMillan–Breiman theorem to the relative entropy rate between densities.

Theorem (Barron). If P is a stationary ergodic probability measure, Q is a stationary ergodic Markov probability measure, and P is absolutely continuous w.r.t. Q, then

lim_{n→∞} (1/n) log(p(Xⁿ)/q(Xⁿ)) = D(P‖Q) P-a.s.
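Both theorems can be sanity-checked in the simplest i.i.d. setting (a sketch; the parameters are arbitrary): for X₁, X₂, ... i.i.d. Bern(0.3), −(1/n) log p(Xⁿ) should approach H(X₁), and (1/n) log(p(Xⁿ)/q(Xⁿ)) with q the Bern(1/2) pmf should approach D(p‖q) = 1 − H(0.3) bits.

```python
import math
import random

random.seed(2)
p, n = 0.3, 100_000
xs = [random.random() < p for _ in range(n)]

# Shannon-McMillan-Breiman (i.i.d. case = SLLN): -(1/n) log2 p(X^n) -> H(X1).
neg_log_p = -sum(math.log2(p) if x else math.log2(1 - p) for x in xs) / n
entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Barron's theorem (i.i.d. case): (1/n) log2 p(X^n)/q(X^n) -> D(p||q), q = Bern(1/2).
llr = sum((math.log2(p) if x else math.log2(1 - p)) - math.log2(0.5) for x in xs) / n
divergence = 1 - entropy  # D(Bern(p) || Bern(1/2)) = 1 - H(p) bits
print(neg_log_p, entropy, llr, divergence)
```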

PROBLEMS

1. Prove Property 2 of entropy in Section 2.1 and find the equality condition for Property 3.

2. Prove Properties 1 through 3 of relative entropy in Section 2.3.

3. Entropy and relative entropy. Let 𝒳 be a finite alphabet and q(x) be the uniform pmf on 𝒳. Show that for any pmf p(x) on 𝒳,

D(p‖q) = log|𝒳| − H(p(x)).

4. The total variation distance between two pmfs p(x) and q(x) is defined as

δ(p, q) = (1/2) ∑_{x∈𝒳} |p(x) − q(x)|.

(a) Show that this distance is an f-divergence by finding the corresponding f.
(b) Show that δ(p, q) = max_{A⊆𝒳} |P(A) − Q(A)|, where P and Q are the corresponding probability measures, e.g., P(A) = ∑_{x∈A} p(x).

5. Pinsker's inequality. Show that

δ(p, q) ≤ √(D(p‖q)/(2 log e)),

where the logarithm has the same base as the relative entropy. (Hint: First consider the case that 𝒳 is binary.)

6. Let p(x, y) be a joint pmf on 𝒳 × 𝒴. Show that p(x, y) is absolutely continuous w.r.t. p(x)p(y).

7. Prove Properties 1 through 4 of mutual information in Section 2.5.

8. Let X = (Xₙ)_{n=1}^∞ be a stationary random process. Show that

H(X) = lim_{n→∞} H(Xₙ|Xⁿ⁻¹).

9. Let Y = (Yₙ)_{n=1}^∞ be a stationary Markov chain and Xₙ = g(Yₙ) be a hidden Markov process. Show that

H(Xₙ|Xⁿ⁻¹, Y₁) ≤ H(X) ≤ H(Xₙ|Xⁿ⁻¹)

and conclude that

H(X) = lim_{n→∞} H(Xₙ|Xⁿ⁻¹, Y₁) = lim_{n→∞} H(Xₙ|Xⁿ⁻¹).

10. Recurrence time. Let X₀, X₁, X₂, ... be i.i.d. copies of X ~ p(x), and let N = min{n ≥ 1: Xₙ = X₀} be the waiting time to the next occurrence of X₀.

(a) Show that E(N) = |𝒳|.
(b) Show that E(log N) ≤ H(X).

11. The past and the future. Let X = (Xₙ)_{n=1}^∞ be stationary. Show that

lim_{n→∞} (1/n) I(X₁, ..., Xₙ; Xₙ₊₁, ..., X₂ₙ) = 0.

12. Variable-duration symbols. A discrete memoryless source has the alphabet {1, 2}, where symbol 1 has duration 1 and symbol 2 has duration 2. Let X = (Xₙ)_{n=1}^∞ be the resulting random process.

(a) Find its entropy rate H(X) in terms of the probability p of symbol 1.
(b) Find the maximum entropy rate by optimizing over p.

13. Shift-invariant sets. Show that the collection I of shift-invariant sets is a σ-algebra.

14. Tail σ-algebra. Let X = (Xₙ)_{n=1}^∞ be a random process and T = ⋂_{n=1}^∞ σ(Xₙ, Xₙ₊₁, ...).

(a) Show that every shift-invariant A ∈ F is in T, i.e., I ⊆ T.
(b) Does the converse hold, that is, T ⊆ I?

15. Multiple time shifts. Let X = (Xₙ)_{n=1}^∞ be ergodic and Yₙ = X₂ₙ, n = 1, 2, .... Is Y = (Yₙ)_{n=1}^∞ ergodic? Prove or provide a counterexample.

16. Mixture of ergodic processes. Let U = (Uₙ) and V = (Vₙ) be two stationary ergodic processes on the same alphabet 𝒳, but with different entropy rates H(U) and H(V), respectively. Let X = (Xₙ) be a random process that is either U or V uniformly at random, i.e.,

X = U w.p. 1/2,   X = V w.p. 1/2.

(a) Is X stationary?
(b) Find its entropy rate H(X) in terms of H(U) and H(V).
(c) Does the random sequence −(1/n) log p(Xⁿ) converge almost surely? If so, characterize the limiting random variable.

17. Cesàro summation. Let {aₙ} be a sequence of real numbers. We say that aₙ converges if lim_{n→∞} aₙ = a < ∞; that aₙ converges in the strong Cesàro sense if

lim_{n→∞} (1/n) ∑_{i=1}^n |aᵢ − a| = 0;

and that aₙ converges in the Cesàro sense if

lim_{n→∞} (1/n) ∑_{i=1}^n aᵢ = a.

(a) Show that convergence implies strong Cesàro convergence, which, in turn, implies Cesàro convergence.
(b) Using part (a), show that strong mixing implies weak mixing, which, in turn, implies ergodicity.


Bibliography

Barron, A. R. The strong ergodic theorem for densities: Generalized Shannon–McMillan–Breiman theorem. Ann. Probab.

Birkhoff, G. D. Proof of the ergodic theorem. Proc. Natl. Acad. Sci. USA.

Breiman, L. The individual ergodic theorem of information theory. Ann. Math. Statist.; correction, ibid.

Cover, T. M. and Thomas, J. A. Elements of Information Theory, 2nd ed. Wiley, New York.

Ephraim, Y. and Merhav, N. Hidden Markov processes. IEEE Trans. Inf. Theory.

Gray, R. M. Entropy and Information Theory. Springer, New York.

Juang, B.-H. F. and Rabiner, L. R. A probabilistic distance measure for hidden Markov models. AT&T Tech. J.

Leroux, B. G. Maximum-likelihood estimation for hidden Markov models. Stoch. Proc. Appl.

Liese, F. and Vajda, I. On divergences and informations in statistics and information theory. IEEE Trans. Inf. Theory.

McMillan, B. The basic theorems of information theory. Ann. Math. Statist.

Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J.

Sion, M. On general minimax theorems. Pacific J. Math.


Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

4. Basic probability theory

4. Basic probability theory Cotets Basic cocepts Discrete radom variables Discrete distributios (br distributios) Cotiuous radom variables Cotiuous distributios (time distributios) Other radom variables Lect04.ppt S-38.45 - Itroductio

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

Probability 2 - Notes 10. Lemma. If X is a random variable and g(x) 0 for all x in the support of f X, then P(g(X) 1) E[g(X)].

Probability 2 - Notes 10. Lemma. If X is a random variable and g(x) 0 for all x in the support of f X, then P(g(X) 1) E[g(X)]. Probability 2 - Notes 0 Some Useful Iequalities. Lemma. If X is a radom variable ad g(x 0 for all x i the support of f X, the P(g(X E[g(X]. Proof. (cotiuous case P(g(X Corollaries x:g(x f X (xdx x:g(x

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

Lecture 7: Channel coding theorem for discrete-time continuous memoryless channel

Lecture 7: Channel coding theorem for discrete-time continuous memoryless channel Lecture 7: Chael codig theorem for discrete-time cotiuous memoryless chael Lectured by Dr. Saif K. Mohammed Scribed by Mirsad Čirkić Iformatio Theory for Wireless Commuicatio ITWC Sprig 202 Let us first

More information

Lecture 13: Maximum Likelihood Estimation

Lecture 13: Maximum Likelihood Estimation ECE90 Sprig 007 Statistical Learig Theory Istructor: R. Nowak Lecture 3: Maximum Likelihood Estimatio Summary of Lecture I the last lecture we derived a risk (MSE) boud for regressio problems; i.e., select

More information

Lecture 11 and 12: Basic estimation theory

Lecture 11 and 12: Basic estimation theory Lecture ad 2: Basic estimatio theory Sprig 202 - EE 94 Networked estimatio ad cotrol Prof. Kha March 2 202 I. MAXIMUM-LIKELIHOOD ESTIMATORS The maximum likelihood priciple is deceptively simple. Louis

More information

Lecture 6: Source coding, Typicality, and Noisy channels and capacity

Lecture 6: Source coding, Typicality, and Noisy channels and capacity 15-859: Iformatio Theory ad Applicatios i TCS CMU: Sprig 2013 Lecture 6: Source codig, Typicality, ad Noisy chaels ad capacity Jauary 31, 2013 Lecturer: Mahdi Cheraghchi Scribe: Togbo Huag 1 Recap Uiversal

More information

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002 ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom

More information

Lecture 11: Channel Coding Theorem: Converse Part

Lecture 11: Channel Coding Theorem: Converse Part EE376A/STATS376A Iformatio Theory Lecture - 02/3/208 Lecture : Chael Codig Theorem: Coverse Part Lecturer: Tsachy Weissma Scribe: Erdem Bıyık I this lecture, we will cotiue our discussio o chael codig

More information

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory 1. Graph Theory Prove that there exist o simple plaar triagulatio T ad two distict adjacet vertices x, y V (T ) such that x ad y are the oly vertices of T of odd degree. Do ot use the Four-Color Theorem.

More information

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15 17. Joit distributios of extreme order statistics Lehma 5.1; Ferguso 15 I Example 10., we derived the asymptotic distributio of the maximum from a radom sample from a uiform distributio. We did this usig

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Asymptotic Coupling and Its Applications in Information Theory

Asymptotic Coupling and Its Applications in Information Theory Asymptotic Couplig ad Its Applicatios i Iformatio Theory Vicet Y. F. Ta Joit Work with Lei Yu Departmet of Electrical ad Computer Egieerig, Departmet of Mathematics, Natioal Uiversity of Sigapore IMS-APRM

More information

Ma 4121: Introduction to Lebesgue Integration Solutions to Homework Assignment 5

Ma 4121: Introduction to Lebesgue Integration Solutions to Homework Assignment 5 Ma 42: Itroductio to Lebesgue Itegratio Solutios to Homework Assigmet 5 Prof. Wickerhauser Due Thursday, April th, 23 Please retur your solutios to the istructor by the ed of class o the due date. You

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

ECE 6980 An Algorithmic and Information-Theoretic Toolbox for Massive Data

ECE 6980 An Algorithmic and Information-Theoretic Toolbox for Massive Data ECE 6980 A Algorithmic ad Iformatio-Theoretic Toolbo for Massive Data Istructor: Jayadev Acharya Lecture # Scribe: Huayu Zhag 8th August, 017 1 Recap X =, ε is a accuracy parameter, ad δ is a error parameter.

More information

lim za n n = z lim a n n.

lim za n n = z lim a n n. Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget

More information

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS 18th Feb, 016 Defiitio (Lipschitz fuctio). A fuctio f : R R is said to be Lipschitz if there exists a positive real umber c such that for ay x, y i the domai

More information

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound

Lecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound Lecture 7 Ageda for the lecture Gaussia chael with average power costraits Capacity of additive Gaussia oise chael ad the sphere packig boud 7. Additive Gaussia oise chael Up to this poit, we have bee

More information

Lecture 14: Graph Entropy

Lecture 14: Graph Entropy 15-859: Iformatio Theory ad Applicatios i TCS Sprig 2013 Lecture 14: Graph Etropy March 19, 2013 Lecturer: Mahdi Cheraghchi Scribe: Euiwoog Lee 1 Recap Bergma s boud o the permaet Shearer s Lemma Number

More information

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes

The Maximum-Likelihood Decoding Performance of Error-Correcting Codes The Maximum-Lielihood Decodig Performace of Error-Correctig Codes Hery D. Pfister ECE Departmet Texas A&M Uiversity August 27th, 2007 (rev. 0) November 2st, 203 (rev. ) Performace of Codes. Notatio X,

More information

Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables

Some Basic Probability Concepts. 2.1 Experiments, Outcomes and Random Variables Some Basic Probability Cocepts 2. Experimets, Outcomes ad Radom Variables A radom variable is a variable whose value is ukow util it is observed. The value of a radom variable results from a experimet;

More information

Mathematics 170B Selected HW Solutions.

Mathematics 170B Selected HW Solutions. Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli

More information

Lecture 2: Concentration Bounds

Lecture 2: Concentration Bounds CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy

More information

IIT JAM Mathematical Statistics (MS) 2006 SECTION A

IIT JAM Mathematical Statistics (MS) 2006 SECTION A IIT JAM Mathematical Statistics (MS) 6 SECTION A. If a > for ad lim a / L >, the which of the followig series is ot coverget? (a) (b) (c) (d) (d) = = a = a = a a + / a lim a a / + = lim a / a / + = lim

More information

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1 8. The cetral limit theorems 8.1. The cetral limit theorem for i.i.d. sequeces. ecall that C ( is N -separatig. Theorem 8.1. Let X 1, X,... be i.i.d. radom variables with EX 1 = ad EX 1 = σ (,. Suppose

More information

5.1 A mutual information bound based on metric entropy

5.1 A mutual information bound based on metric entropy Chapter 5 Global Fao Method I this chapter, we exted the techiques of Chapter 2.4 o Fao s method the local Fao method) to a more global costructio. I particular, we show that, rather tha costructig a local

More information

Solutions to HW Assignment 1

Solutions to HW Assignment 1 Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information

Introduction to Probability. Ariel Yadin. Lecture 7

Introduction to Probability. Ariel Yadin. Lecture 7 Itroductio to Probability Ariel Yadi Lecture 7 1. Idepedece Revisited 1.1. Some remiders. Let (Ω, F, P) be a probability space. Give a collectio of subsets K F, recall that the σ-algebra geerated by K,

More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

Singular Continuous Measures by Michael Pejic 5/14/10

Singular Continuous Measures by Michael Pejic 5/14/10 Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Differential Entropy

Differential Entropy School o Iormatio Sciece Dieretial Etropy 009 - Course - Iormatio Theory - Tetsuo Asao ad Tad matsumoto Email: {t-asao matumoto}@jaist.ac.jp Japa Advaced Istitute o Sciece ad Techology Asahidai - Nomi

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

Machine Learning Theory (CS 6783)

Machine Learning Theory (CS 6783) Machie Learig Theory (CS 6783) Lecture 3 : Olie Learig, miimax value, sequetial Rademacher complexity Recap: Miimax Theorem We shall use the celebrated miimax theorem as a key tool to boud the miimax rate

More information

STA Object Data Analysis - A List of Projects. January 18, 2018

STA Object Data Analysis - A List of Projects. January 18, 2018 STA 6557 Jauary 8, 208 Object Data Aalysis - A List of Projects. Schoeberg Mea glaucomatous shape chages of the Optic Nerve Head regio i aimal models 2. Aalysis of VW- Kedall ati-mea shapes with a applicatio

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Lecture Chapter 6: Convergence of Random Sequences

Lecture Chapter 6: Convergence of Random Sequences ECE5: Aalysis of Radom Sigals Fall 6 Lecture Chapter 6: Covergece of Radom Sequeces Dr Salim El Rouayheb Scribe: Abhay Ashutosh Doel, Qibo Zhag, Peiwe Tia, Pegzhe Wag, Lu Liu Radom sequece Defiitio A ifiite

More information

Lecture 10: Universal coding and prediction

Lecture 10: Universal coding and prediction 0-704: Iformatio Processig ad Learig Sprig 0 Lecture 0: Uiversal codig ad predictio Lecturer: Aarti Sigh Scribes: Georg M. Goerg Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved

More information

2.1. The Algebraic and Order Properties of R Definition. A binary operation on a set F is a function B : F F! F.

2.1. The Algebraic and Order Properties of R Definition. A binary operation on a set F is a function B : F F! F. CHAPTER 2 The Real Numbers 2.. The Algebraic ad Order Properties of R Defiitio. A biary operatio o a set F is a fuctio B : F F! F. For the biary operatios of + ad, we replace B(a, b) by a + b ad a b, respectively.

More information

Asymptotic distribution of products of sums of independent random variables

Asymptotic distribution of products of sums of independent random variables Proc. Idia Acad. Sci. Math. Sci. Vol. 3, No., May 03, pp. 83 9. c Idia Academy of Scieces Asymptotic distributio of products of sums of idepedet radom variables YANLING WANG, SUXIA YAO ad HONGXIA DU ollege

More information

Mathematical Statistics - MS

Mathematical Statistics - MS Paper Specific Istructios. The examiatio is of hours duratio. There are a total of 60 questios carryig 00 marks. The etire paper is divided ito three sectios, A, B ad C. All sectios are compulsory. Questios

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Lecture 20: Multivariate convergence and the Central Limit Theorem

Lecture 20: Multivariate convergence and the Central Limit Theorem Lecture 20: Multivariate covergece ad the Cetral Limit Theorem Covergece i distributio for radom vectors Let Z,Z 1,Z 2,... be radom vectors o R k. If the cdf of Z is cotiuous, the we ca defie covergece

More information

EE 4TM4: Digital Communications II Probability Theory

EE 4TM4: Digital Communications II Probability Theory 1 EE 4TM4: Digital Commuicatios II Probability Theory I. RANDOM VARIABLES A radom variable is a real-valued fuctio defied o the sample space. Example: Suppose that our experimet cosists of tossig two fair

More information

Measure and Measurable Functions

Measure and Measurable Functions 3 Measure ad Measurable Fuctios 3.1 Measure o a Arbitrary σ-algebra Recall from Chapter 2 that the set M of all Lebesgue measurable sets has the followig properties: R M, E M implies E c M, E M for N implies

More information

Approximations and more PMFs and PDFs

Approximations and more PMFs and PDFs Approximatios ad more PMFs ad PDFs Saad Meimeh 1 Approximatio of biomial with Poisso Cosider the biomial distributio ( b(k,,p = p k (1 p k, k λ: k Assume that is large, ad p is small, but p λ at the limit.

More information

Pattern Classification

Pattern Classification Patter Classificatio All materials i these slides were tae from Patter Classificatio (d ed) by R. O. Duda, P. E. Hart ad D. G. Stor, Joh Wiley & Sos, 000 with the permissio of the authors ad the publisher

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Lecture 6: Coupon Collector s problem

Lecture 6: Coupon Collector s problem Radomized Algorithms Lecture 6: Coupo Collector s problem Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Radomized Algorithms - Lecture 6 1 / 16 Variace: key features

More information

BIRKHOFF ERGODIC THEOREM

BIRKHOFF ERGODIC THEOREM BIRKHOFF ERGODIC THEOREM Abstract. We will give a proof of the poitwise ergodic theorem, which was first proved by Birkhoff. May improvemets have bee made sice Birkhoff s orgial proof. The versio we give

More information

Universal source coding for complementary delivery

Universal source coding for complementary delivery SITA2006 i Hakodate 2005.2. p. Uiversal source codig for complemetary delivery Akisato Kimura, 2, Tomohiko Uyematsu 2, Shigeaki Kuzuoka 2 Media Iformatio Laboratory, NTT Commuicatio Sciece Laboratories,

More information