The method of types
PhD short course: Information Theory and Statistics
Siena, 15-19 September 2014
Mauro Barni, University of Siena

Outline of the course
Part 1: Information theory in a nutshell
Part 2: The method of types and its relationship with statistics
Part 3: Information theory and large deviation theory
Part 4: Information theory and hypothesis testing
Part 5: Application to adversarial signal processing

Outline of Part 2
The method of types
  Definitions
  Basic properties with proof of theorems
Law of large numbers
Source coding, universal source coding

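Type or empirical probability
Type, or empirical probability, of a sequence $x^n$:
$$P_{x^n}(a) = \frac{N(a \mid x^n)}{n}, \quad \forall a \in \mathcal{X}$$
where $N(a \mid x^n)$ is the number of occurrences of the symbol $a$ in $x^n$.
Set of all the types with denominator $n$: $\mathcal{P}_n$.
Example: if $\mathcal{X} = \{0,1\}$,
$$\mathcal{P}_5 = \left\{ (0,1),\ \left(\tfrac{1}{5},\tfrac{4}{5}\right),\ \left(\tfrac{2}{5},\tfrac{3}{5}\right),\ \left(\tfrac{3}{5},\tfrac{2}{5}\right),\ \left(\tfrac{4}{5},\tfrac{1}{5}\right),\ (1,0) \right\}$$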
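The definition can be checked on small cases with a short Python sketch. The function names `type_of` and `all_types` are illustrative choices, not part of the course material, and exact fractions are used so that each type keeps its denominator $n$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def type_of(seq, alphabet):
    """Return the type of seq: P_x(a) = N(a|x) / n for each a in alphabet."""
    n = len(seq)
    counts = Counter(seq)
    return tuple(Fraction(counts[a], n) for a in alphabet)


def all_types(n, alphabet_size):
    """Enumerate every type with denominator n (brute force over count vectors)."""
    return sorted(
        tuple(Fraction(k, n) for k in counts)
        for counts in product(range(n + 1), repeat=alphabet_size)
        if sum(counts) == n
    )


print(type_of("01100", "01"))   # three 0s and two 1s out of five symbols
print(len(all_types(5, 2)))     # number of binary types with denominator 5
```

The enumeration is exhaustive, so it is only meant for small $n$ and small alphabets.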

Type class
Type class: all the sequences having the same type
$$T(P) = \{ x^n \in \mathcal{X}^n : P_{x^n} = P \}$$
Example: $x^5 = 01100 \ \Rightarrow\ P_{x^5} = \left(\tfrac{3}{5}, \tfrac{2}{5}\right)$
$$T(P_{x^5}) = \{ 11000, 10100, 10010, 10001, 01100, 01010, 01001, 00110, 00101, 00011 \}$$
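As a sanity check on the example, the type class of a short sequence can be listed by brute force. This is a minimal sketch; `type_class` is a hypothetical helper name and the exhaustive search over all $|\mathcal{X}|^n$ sequences is only practical for small $n$.

```python
from collections import Counter
from itertools import product


def type_class(seq, alphabet):
    """List every sequence of the same length whose symbol counts match seq."""
    target = Counter(seq)
    return [
        "".join(x)
        for x in product(alphabet, repeat=len(seq))
        if Counter(x) == target
    ]


members = type_class("01100", "01")
print(len(members), members)
```

For `"01100"` the list contains the ten binary strings of length five with exactly two 1s.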

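Number of types
The number of types grows polynomially with $n$.
Theorem. The number of types with denominator $n$ is upper bounded by:
$$|\mathcal{P}_n| \le (n+1)^{|\mathcal{X}|}$$
Proof. Obvious: each of the $|\mathcal{X}|$ components of a type has a numerator in $\{0, 1, \ldots, n\}$, that is, at most $n+1$ possible values.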
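To see how loose the polynomial bound is, it can be compared with the exact count. The sketch below assumes the standard stars-and-bars formula $\binom{n+|\mathcal{X}|-1}{|\mathcal{X}|-1}$ for the number of count vectors summing to $n$; this formula is not stated in the slides, and the function names are illustrative.

```python
from math import comb


def num_types(n, alphabet_size):
    """Exact number of types: count vectors of length alphabet_size summing to n."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)


def num_types_bound(n, alphabet_size):
    """Upper bound (n+1)^|X| from the theorem."""
    return (n + 1) ** alphabet_size


for n in (5, 50, 500):
    print(n, num_types(n, 3), num_types_bound(n, 3))
```

Both quantities are polynomial in $n$, which is all the later proofs need.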

Probability of a sequence
Theorem. The probability that a sequence $x = x^n$ is emitted by a DMS source with pmf $Q$ is
$$Q(x) = 2^{-n\left( H(P_x) + D(P_x \| Q) \right)}$$
If $P_x = Q$:
$$Q(x) = 2^{-nH(P_x)} = 2^{-nH(Q)}$$
Remember: the larger the KL distance between the type of $x$ and $Q$, the lower the probability.

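Probability of a sequence
Proof.
$$Q(x) = \prod_{i=1}^{n} Q(x_i) = \prod_{a \in \mathcal{X}} Q(a)^{N(a \mid x)} = \prod_{a \in \mathcal{X}} Q(a)^{n P_x(a)} = \prod_{a \in \mathcal{X}} 2^{n P_x(a) \log Q(a)}$$
$$= 2^{\,n \sum_{a} \left[ P_x(a) \log Q(a) - P_x(a) \log P_x(a) + P_x(a) \log P_x(a) \right]}$$
$$= 2^{-n \sum_{a} \left[ P_x(a) \log \frac{P_x(a)}{Q(a)} + P_x(a) \log \frac{1}{P_x(a)} \right]}$$
$$= 2^{-n \left[ H(P_x) + D(P_x \| Q) \right]}$$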
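The identity $Q(x) = 2^{-n[H(P_x) + D(P_x \| Q)]}$ can be verified numerically by computing the probability both as a product of symbol probabilities and through the type. This is a minimal sketch with illustrative function names; logarithms are base 2 and $Q(a) > 0$ is assumed for every symbol that appears in the sequence.

```python
from collections import Counter
from math import isclose, log2, prod


def entropy(p):
    """Entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)


def kl(p, q):
    """KL divergence D(p||q) in bits; assumes q > 0 wherever p > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def seq_prob_direct(seq, alphabet, q):
    """Product of the per-symbol probabilities (memoryless source)."""
    q_of = dict(zip(alphabet, q))
    return prod(q_of[s] for s in seq)


def seq_prob_via_type(seq, alphabet, q):
    """Same probability written as 2^(-n (H(P_x) + D(P_x||Q)))."""
    n = len(seq)
    counts = Counter(seq)
    p = [counts[a] / n for a in alphabet]
    return 2 ** (-n * (entropy(p) + kl(p, q)))


q = (1 / 3, 2 / 3)
print(seq_prob_direct("01100", "01", q), seq_prob_via_type("01100", "01", q))
```

The two printed values agree up to floating-point rounding, and depend on the sequence only through its type.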

Examples
Probability of a specific sequence with $n/2$ tails and $n/2$ heads:
  Fair coin
  Biased coin with P(H) = 1/3, P(T) = 2/3
Same as above with $n/3$ heads:
  Fair coin
  Biased coin with P(H) = 1/3, P(T) = 2/3
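The four cases can be worked out by evaluating the exponent $H(P_x) + D(P_x \| Q)$ in $Q(x) = 2^{-n[H(P_x)+D(P_x\|Q)]}$. The sketch below is one way to do it; the helper names are illustrative and the numerical answers are computed here, not quoted from the slides.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of the pmf (p, 1 - p)."""
    return -sum(t * log2(t) for t in (p, 1 - p) if t > 0)


def binary_kl(p, q):
    """D((p, 1-p) || (q, 1-q)) in bits; assumes 0 < q < 1."""
    return sum(a * log2(a / b) for a, b in ((p, q), (1 - p, 1 - q)) if a > 0)


def exponent(frac_heads, p_heads):
    """E such that a specific sequence with that fraction of heads has probability 2^(-n E)."""
    return binary_entropy(frac_heads) + binary_kl(frac_heads, p_heads)


for frac_heads in (1 / 2, 1 / 3):
    for p_heads in (1 / 2, 1 / 3):
        print(frac_heads, p_heads, round(exponent(frac_heads, p_heads), 4))
```

For the fair coin the exponent is 1 in both cases, so every specific sequence has probability $2^{-n}$. For the biased coin the exponent is about 1.085 with $n/2$ heads and about 0.918, that is $H(1/3)$, with $n/3$ heads, where the type coincides with the source pmf.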

Size of a type class
Theorem. The size of a type class $T(P)$ can be bounded as follows:
$$\frac{1}{(n+1)^{|\mathcal{X}|}}\, 2^{nH(P)} \le |T(P)| \le 2^{nH(P)}$$
Remember: the size of a type class grows exponentially with $n$, with growth rate equal to the entropy of the type.

Size of a type class
Proof (upper bound). Given $P \in \mathcal{P}_n$, consider the probability that a source with pmf $P$ emits a sequence in $T(P)$. We have
$$1 \ge \sum_{x \in T(P)} P(x) = \sum_{x \in T(P)} 2^{-nH(P)} = |T(P)|\, 2^{-nH(P)} \quad \Rightarrow \quad |T(P)| \le 2^{nH(P)}$$

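Size of a type class
Proof (lower bound). Writing $n_i = nP(a_i)$, the size of the type class is a multinomial coefficient:
$$|T(P)| = \binom{n}{nP(a_1), \ldots, nP(a_{|\mathcal{X}|})} = \frac{n!}{n_1!\, n_2! \cdots n_{|\mathcal{X}|}!}$$
Stirling approximation:
$$n! \approx \left(\frac{n}{e}\right)^{n} \quad \Rightarrow \quad |T(P)| \approx \frac{\left(\frac{n}{e}\right)^{n}}{\left(\frac{n_1}{e}\right)^{n_1} \cdots \left(\frac{n_{|\mathcal{X}|}}{e}\right)^{n_{|\mathcal{X}|}}}$$
After some algebra:
$$|T(P)| \ge \frac{1}{(n+1)^{|\mathcal{X}|}}\, 2^{nH(P)}$$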
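The two bounds $\frac{1}{(n+1)^{|\mathcal{X}|}} 2^{nH(P)} \le |T(P)| \le 2^{nH(P)}$ can be checked against the exact multinomial coefficient. The following is a minimal sketch in which a type is represented by its vector of symbol counts; the function names are illustrative.

```python
from math import factorial, log2


def type_class_size(counts):
    """Exact |T(P)| = n! / (n_1! ... n_k!) for the type with the given counts."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)  # every intermediate quotient is an integer
    return size


def entropy_of_counts(counts):
    """Entropy in bits of the type P(a_i) = n_i / n."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def size_bounds(counts):
    """Lower and upper bounds on |T(P)| from the theorem."""
    n, k = sum(counts), len(counts)
    upper = 2 ** (n * entropy_of_counts(counts))
    return upper / (n + 1) ** k, upper


lower, upper = size_bounds((3, 2))
print(lower, type_class_size((3, 2)), upper)
```

For the type $(3/5, 2/5)$ the exact size is 10, which sits between the two bounds.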

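Probability of a type class
Theorem. The probability that a DMS with pmf $Q$ emits a sequence belonging to $T(P)$ can be bounded as follows:
$$\frac{1}{(n+1)^{|\mathcal{X}|}}\, 2^{-nD(P \| Q)} \le Q(T(P)) \le 2^{-nD(P \| Q)}$$
Remember: the larger the KL distance between $P$ and $Q$, the smaller the probability. If $P \ne Q$, the probability tends to 0 exponentially fast; if $P = Q$, the exponent is zero and the bounds only guarantee that the probability is at least $(n+1)^{-|\mathcal{X}|}$.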

Probability of a type class
Proof.
$$Q(T(P)) = \sum_{x \in T(P)} Q(x) = \sum_{x \in T(P)} 2^{-n(H(P) + D(P \| Q))} = |T(P)|\, 2^{-n(H(P) + D(P \| Q))}$$
By remembering the bounds on the size of $T(P)$:
$$\frac{1}{(n+1)^{|\mathcal{X}|}}\, 2^{-nD(P \| Q)} \le Q(T(P)) \le 2^{-nD(P \| Q)}$$
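The bounds on $Q(T(P))$ can be checked in the same way, using the fact that all sequences in a type class have the same probability. This is a sketch under the assumption that $Q(a) > 0$ for every symbol; a type is again given by its count vector and the names are illustrative.

```python
from math import factorial, log2, prod


def type_class_prob(counts, q):
    """Exact Q(T(P)) = |T(P)| * prod_a Q(a)^(n_a)."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size * prod(qa ** c for qa, c in zip(q, counts))


def kl_of_counts(counts, q):
    """D(P||Q) in bits for the type P(a) = n_a / n."""
    n = sum(counts)
    return sum((c / n) * log2((c / n) / qa) for c, qa in zip(counts, q) if c > 0)


def prob_bounds(counts, q):
    """Lower and upper bounds on Q(T(P)) from the theorem."""
    n, k = sum(counts), len(counts)
    upper = 2 ** (-n * kl_of_counts(counts, q))
    return upper / (n + 1) ** k, upper


q = (1 / 3, 2 / 3)
lower, upper = prob_bounds((3, 2), q)
print(lower, type_class_prob((3, 2), q), upper)
```

Summing `type_class_prob` over all types with a given denominator returns 1, since the type classes partition $\mathcal{X}^n$.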

In summary
$$|\mathcal{P}_n| \le (n+1)^{|\mathcal{X}|}$$
$$Q(x) = 2^{-n\left[ D(P_x \| Q) + H(P_x) \right]}$$
$$|T(P)| \approx 2^{nH(P)}$$
$$Q(T(P)) \approx 2^{-nD(P \| Q)}$$

Information Theory and Statistics

Law of large numbers
The law of large numbers provides the link between Information Theory and Statistics.
The weak form of the LLN states that, given a sequence of iid random variables $X_i$ with mean $\mu_X$ and sample mean
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i,$$
we have
$$\forall \varepsilon > 0 \quad \lim_{n \to \infty} \Pr\{ |\bar{X}_n - \mu_X| > \varepsilon \} = 0$$
The standard proof is based on the Chebyshev inequality.
The LLN can be easily extended to relative frequencies and probabilities (for discrete random variables).

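Law of large numbers (IT perspective)
$$Q(T(P)) \le 2^{-nD(P \| Q)}$$
When $n$ grows, the only type classes with a non-negligible probability are those whose type is close to $Q$.
Theorem (law of large numbers). Let
$$T_Q^{\varepsilon} = \{ x : D(P_x \| Q) \le \varepsilon \}$$
Then
$$P(x \notin T_Q^{\varepsilon}) = \sum_{P : D(P \| Q) > \varepsilon} Q(T(P)) \le \sum_{P : D(P \| Q) > \varepsilon} 2^{-nD(P \| Q)} \le \sum_{P : D(P \| Q) > \varepsilon} 2^{-n\varepsilon} \le (n+1)^{|\mathcal{X}|}\, 2^{-n\varepsilon} = 2^{-n\left( \varepsilon - |\mathcal{X}| \frac{\log(n+1)}{n} \right)}$$
which tends to 0 when $n$ tends to infinity.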
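For a binary source, the probability of emitting a sequence with $D(P_x \| Q) > \varepsilon$ can be computed exactly and compared with the bound $(n+1)^{|\mathcal{X}|} 2^{-n\varepsilon}$. The sketch below works in the log domain to avoid overflow for large $n$; the function names and the choice $Q(1) = 1/3$, $\varepsilon = 0.01$ are illustrative assumptions.

```python
from math import lgamma, log, log2


def binary_kl(p, q):
    """D((p, 1-p) || (q, 1-q)) in bits; assumes 0 < q < 1."""
    return sum(a * log2(a / b) for a, b in ((p, q), (1 - p, 1 - q)) if a > 0)


def log2_type_class_prob(n, k, q):
    """log2 of the probability that exactly k of the n symbols equal 1, with Q(1) = q."""
    log2_size = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / log(2)
    return log2_size + k * log2(q) + (n - k) * log2(1 - q)


def prob_atypical(n, q, eps):
    """Probability that the type of the emitted sequence has D(P_x||Q) > eps."""
    return sum(
        2 ** log2_type_class_prob(n, k, q)
        for k in range(n + 1)
        if binary_kl(k / n, q) > eps
    )


def lln_bound(n, eps, alphabet_size=2):
    """Upper bound (n+1)^|X| 2^(-n eps) from the theorem."""
    return (n + 1) ** alphabet_size * 2 ** (-n * eps)


for n in (200, 2000, 5000):
    print(n, prob_atypical(n, 1 / 3, 0.01), lln_bound(n, 0.01))
```

For small $n$ the bound exceeds 1 and says nothing, because the polynomial factor dominates; the exponential term takes over only when $n\varepsilon$ is large compared with $|\mathcal{X}| \log(n+1)$.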

Source coding (achievability)
Source coding theorem (Shannon '48). Given a DMS source $Q$, any rate $R$ such that $R = H(Q) + \varepsilon$ is achievable (for any $\varepsilon > 0$).
Code sequences of increasing length $n$.
Code efficiently only the sequences in (or close to) $T(Q)$, since the others will (almost) never occur.
To do that we need only about $nH(Q)$ bits.

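Source coding: rigorous proof
Choose a small $\varepsilon$ and define
$$T_Q^{\varepsilon} = \{ x : D(P_x \| Q) \le \varepsilon \}$$
By the continuity of $D$: $d(P_x, Q) \le \varepsilon'$, which $\to 0$ if $\varepsilon \to 0$.
By the continuity of $H$: $H(P_x) \le H(Q) + \varepsilon''$, which $\to 0$ if $\varepsilon' \to 0$.
1. Code sequences in $T_Q^{\varepsilon}$ by counting them in $T_Q^{\varepsilon}$.
2. Code sequences not in $T_Q^{\varepsilon}$ by counting them in $\mathcal{X}^n$.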

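Source coding: rigorous proof
$T_Q^{\varepsilon}$ contains at most $(n+1)^{|\mathcal{X}|}$ type classes, each with at most $2^{n(H(Q)+\varepsilon'')}$ sequences, so indexing a sequence in $T_Q^{\varepsilon}$ takes at most $n(H(Q)+\varepsilon'') + |\mathcal{X}| \log(n+1)$ bits.
The average number of bits is
$$nL \le \Pr\{T_Q^{\varepsilon}\} \left[ nH(Q) + n\varepsilon'' + |\mathcal{X}| \log(n+1) \right] + \left( 1 - \Pr\{T_Q^{\varepsilon}\} \right) n \log|\mathcal{X}|$$
$$L \le H(Q) + \varepsilon'' + |\mathcal{X}| \frac{\log(n+1)}{n} + \delta \log|\mathcal{X}|$$
where $\delta$ upper-bounds $1 - \Pr\{T_Q^{\varepsilon}\}$.
The excess over $H(Q)$ can be made arbitrarily small by increasing $n$ and by properly choosing $\varepsilon$ and $\delta$.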

Universal source coding
What if $Q$ is not known?
The surprising result is that we can still code at any rate larger than the entropy.
Observe a sequence of $n$ emitted symbols and estimate $Q$, then transmit information about the type and the index of the sequence within the type.

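Universal source coding (rigorous proof)
Choose an arbitrarily small $\varepsilon$ and let $T_Q^{\varepsilon} = \{ x : D(P_x \| Q) \le \varepsilon \}$.
Given a sequence $x$, use $|\mathcal{X}| \log(n+1)$ bits to indicate its type and $nH(P_x)$ bits to index $x$ within the type.
The average number of bits per symbol used by the code is:
$$L \le |\mathcal{X}| \frac{\log(n+1)}{n} + \sum_{x \notin T_Q^{\varepsilon}} Q(x) H(P_x) + \sum_{x \in T_Q^{\varepsilon}} Q(x) H(P_x)$$
$$\le |\mathcal{X}| \frac{\log(n+1)}{n} + Q(x \notin T_Q^{\varepsilon}) \log|\mathcal{X}| + Q(x \in T_Q^{\varepsilon}) \left[ H(Q) + \delta \right] \le H(Q) + \delta'$$
Since $\varepsilon$ and $\delta$ (and hence $\delta'$) are arbitrarily small, any rate larger than $H(Q)$ can be obtained.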
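The two-part description (type first, then index within the type class) can be illustrated by computing the code length for a given sequence without any knowledge of $Q$. This is a sketch of the length computation only, not an encoder; it uses $\lceil \log |T(P_x)| \rceil$ index bits, which is at most $nH(P_x)$ rounded up, and the function names are illustrative.

```python
from collections import Counter
from math import ceil, factorial, log2


def two_part_code_length(seq, alphabet):
    """Bits used to send the type of seq plus the index of seq inside its type class."""
    n, k = len(seq), len(alphabet)
    counts = Counter(seq)
    type_bits = ceil(k * log2(n + 1))      # enough to index one of at most (n+1)^k types
    class_size = factorial(n)
    for a in alphabet:
        class_size //= factorial(counts[a])
    index_bits = ceil(log2(class_size))    # enough to index one member of T(P_x)
    return type_bits + index_bits


def empirical_entropy(seq, alphabet):
    """H(P_x) in bits per symbol."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((counts[a] / n) * log2(counts[a] / n) for a in alphabet if counts[a] > 0)


seq = "0" * 700 + "1" * 300
print(two_part_code_length(seq, "01") / len(seq), empirical_entropy(seq, "01"))
```

The per-symbol length exceeds $H(P_x)$ only by the cost of describing the type, which vanishes as $\log(n+1)/n$.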

Channel coding
The method of types can be used to prove many other results in IT, including the channel coding theorem.
Outside the scope of this course.
