Large Deviations Performance of Interval Algorithm for Random Number Generation

Size: px
Start display at page:

Download "Large Deviations Performance of Interval Algorithm for Random Number Generation"

Transcription

1 Large Deviations Performance of Interval Algorithm for Random Number Generation Akisato KIMURA Tomohiko UYEMATSU February 22, 999 No. AK-TR Abstract We investigate large deviations performance of interval algorithm for random number generation. First, we show that the length of input sequence per the length of output sequence approaches to the ratio of entropies of input and output distributions almost surely. Next, we investigate large deviations performance especially for intrinsic randomness. We show that the length of output fair random bits per input sample approaches to the entropy of the input source almost surely, and we can determine the exponent in this case. Further, we consider to obtain the fixed number of fair random bits from the input sequence with fixed length. We show that the approximation error measured by the variational distance and divergence vanishes exponentially as the length of input sequence tends to infinity, if the number of output random bits per input sample is below the entropy of the source. Contrarily, the approximation error measured by the variational distance approaches to two exponentially and the approximation error measured by the divergence approaches to infinity linearly, if the number of random bits per input sample is above the entropy of the source. Dept. of Electrical and Electronic Eng., Tokyo Institute of Technology, 2-2- Ookayama, Meguro-ku, Tokyo , Japan

2 I. Introduction Random number generation is a problem of simulating some prescribed target distribution by using a given source. This problem has been investigated in computer science, and has a close relation to information theory, 2, 3]. Some practical algorithms for random number generation have been proposed so far, i.e., 3, 4, 5]. In this paper, we consider the interval algorithm proposed by Han and Hoshi 3]. Performance of the interval algorithm has already been investigated in 3, 6, 7]. Han and Hoshi 3] have showed that the expected length of input sequence per the length of output sequence can be characterized by the ratio of entropies of the input and output distributions. Uyematsu and Kanaya 6] have investigated large deviations performance of the interval algorithm where the distribution of input source is uniform. Further, Uchida and Han 7] have extended the result of Uyematsu and Kanaya to stationary ergodic Markov process. We investigate large deviations performance, where the input and output distributions is both non-uniform. First, we show that the length of input sequence per the length of output sequence approaches to the ratio of entropies of input and output distributions almost surely. Next, we investigate large deviations performance especially for intrinsic randomness. We show that the length of output fair random bits per input sample approaches to the entropy of the input source almost surely, and we can determine the exponent in this case. Further, we consider to obtain the fixed number of fair random bits from the input sequence with fixed length. We show that the approximation error measured by the variational distance and divergence vanishes exponentially as the length of input sequence tends to infinity, if the number of random bits per input sample is below the entropy of the source. Contrarily, the approximation error measured by the variational distance approaches to two exponentially and the approximation error measured by the divergence approaches to infinity linearly, if the number of random bits per input sample is above the entropy of the source. II. Basic Definitions (a) Discrete Memoryless Source Let X be a finite set. We denote by M(X ) the set of all probability distributions on X. Throughout this paper, by a source X with alphabet X, we mean a discrete memoryless source (DMS) of distribution P X M(X ). To denote a source, we will use both notations X and P X interchangeably. 2

3 For random variable X which has a distribution P X, we shall denote this entropy as H(P X )andh(x), interchangeably. H(P X ) = x X P X (x)logp X (x). Further, for arbitrary distributions P, Q M(X ), we denote by D(P Q) the information divergence D(P Q) = x X P (x)log P (x) Q(x). Lastly, we denote by d(p, Q) thevariational distance or l distance between two distributions P and Q on X d(p, Q) = P (x) Q(x). x X From now on, all logarithms and exponentials are to the base two. (b) Type of Sequence The type of a sequence x X n is defined as a distribution Px M(X ), where Px(a) isgivenby Px(a) = (number of occurrences of a X in x). () n We shall write P n or P for the set of types of sequences in X n. We denoted by TP n or T P the set of sequences of type P in X n. On the contrary for a distribution P M(X ), if T P then we denote by P the type of sequences in X n. We introduce some well-known facts, cf. Csiszár-Körner 8]: For the set of types in X n,wehave P n (n +) X (2) where denotes the cardinality of the set. For the set of sequences of type P in X n, (n +) X exp(nh(p )) T P exp(nh(p )) (3) If x T P,wethenhave From (3)(4), Q n (x) =exp n{d(p Q)+H(P )}]. (4) (n +) X exp( nd(p Q)) Q n (T P ) exp( nd(p Q)) (5) 3

4 (c) Intrinsic Randomness In this paper, we especially investigate the problem to generate a uniform random number with as large size as possible from a general source X = {X n } n=. This problem is called intrinsic randomness problem 9]. Here, we shall introduce basic definitions and a result for intrinsic randomness problem. Definition : For arbitrary source X = {X n } n=,rateris achievable Intrinsic Randomness (IR) rate if and only if there exists a map ϕ n : X n U Mn such that lim inf n n log M n R lim d(u M n n,ϕ n (X n )) = 0, where U Mn = {, 2,,Mn } and U Mn is a uniform distribution on U Mn. Definition 2 (sup achievable IR rate): S(X) =sup{r R is achievable IR rate} As for the characterization of IR rate, Vembu and Verdú 2]provedthe following fundamental theorem. Theorem : For any stationary source X, where H(X) is the entropy rate of X. S(X) =H(X) (6) III. Interval Algorithm In this chapter, we introduce the interval algorithm for random number generation, proposed by Han and Hoshi 3]. Let us consider to produce an i.i.d. random sequence Y n =(Y,Y 2,,Y n ). Each random variable Y i (i =, 2,,n) is subject to a generic distribution q =(q,q 2,,q N ). We generate this sequence by using an i.i.d. random sequence X,X 2,, with a generic distribution p =(p,p 2,,p M ). Interval Algorithm for Generating Random Process 4

5 a) Partition an unit interval 0, ) into N disjoint subinterval J(),J(2),,J(N) such that b) Set J(i) = Q i,q i ) i =, 2,,N i Q i = q k i =, 2,,N; Q 0 =0. P j = k= j p k j =, 2,,M; P 0 =0. k= 2) Set s = t = λ (null string), α s = γ t =0,β s = δ t =,I(s) =α s,β s ), J(t) =γ t,δ t ), and m =. 3) Obtain an output symbol from the source X to have a value a {, 2,,M}, and generate the subinterval of I(s) where I(sa) = α sa,β sa ) α sa = α s +(β s α s )P a β sa = α s +(β s α s )P a. 4a) If I(sa) is entirely contained in some J(ti) (i =, 2,,N), then output i as the value of the mth random number Y m and set t = ti. Otherwise, go to 5). 4b) If m = n then stop the algorithm. Otherwise, partition the interval J(t) γ t,δ t )inton disjoint subinterval J(t),J(t2),,J(tN) such that where J(tj) = γ tj,δ tj ) j =, 2,,N γ tj = γ t +(δ t γ t )Q j δ tj = γ t +(δ t γ t )Q j and set m = m + and go to 4a). 5

6 5) Set s = sa andgoto3). Han and Hoshi have shown that E(L) lim = H(p) n n H(q), (7) where E(L) is the average length of input sequence to obtain output sequences of length n. IV. Almost Sure Convergence Theorem We shall investigate large deviations performance of the interval algorithm for random number generation. Let us consider to produce an i.i.d. random sequence Y n =(Y,Y 2,,Y n ). Each random variable Y i (i =, 2,,n) is subject to a generic distribution P Y on Y. We generate this sequence by using an i.i.d. random sequence X,X 2, with a generic distribution P X on X.WedenotebyT n (x, y) the length of input sequence x X necessary to generate y Y n. Then, we obtain the following theorem: Theorem 2: lim n n T n(x, Y n )= H(Y ) a.s. (8) H(X) Before the proof of theorem, we shall give a necessary definition and some lemmas for strongly typical sequence 4]. Definition 3: LetPx be a type of the sequence x X n. For δ>0anda distribution P M(X ), if Px satisfies D(Px P ) δ then we call x X n P -typical sequence or strongly typical sequence. Further, when a random variable X with alphabet X has the distribution P,wealsocallx X n X-typical sequence. We shall write Tδ n(p )ort δ(p ) for the set of P -typical sequences in X n, and Tδ n(x) ort δ(x) for the set of X-typical sequences in X n. Lemma : For every 0 <δ 8,ifx T δ n (P )then n log P n (x) H(P ) γ n (9) where γ n = δ 2δ log 2δ X. (0) 6

7 Lemma 2: Suppose that a sequence {δ n } satisfies lim δ n = 0, lim nδ n = n n. For every P M(X ), P n (T δn (P )) ɛ n () where { ɛ n =exp ( n δ n ) } X log(n +). (2) n Proof of Theorem 2: Suppose that a sequence {δ n } satisfies lim δ n = 0 and lim nδ n =. n n Due to the nature of the interval algorithm, we can correspond each y Y n to a distinct subinterval J(y) of0, ) with its width PY n (y). On the other hand, we can also correspond each x X to I(x) withitswidth (x). Then, each subinterval I(x) corresponds to the input sequence. If the subinterval I(x) is included in some J(y), then the input sequence corresponding to I(x) can terminate the algorithm. PX (a)achievable part Assume that we don t have to terminate the algorithm for x / T δ (X). 7

8 Then, by using Lemma,2 and (2)-(5), we obtain { } Pr n T n(x, Y n ) R 2 max (x) y Y n : PY n(y) min x T δ (X) P X (x) + y Y n : PY n(y) min x T δ (X) P X (x) D(Q P Y )+H(Q)R(H(X)+γ n ) + D(Q P Y )+H(Q) R(H(X)+γ n ) D(Q P Y )+H(Q)R(H(X)+γ n ) + D(Q P Y )+H(Q) R(H(X)+γ n) P X x T δ (X) P n Y (y)+ Q P n 2exp n{d(q P Y ) 2Rγ n x X : x T δ (X) PX (x) 2exp{ n(r H(X) H(Q) Rγ n)} exp( nd(q P Y )) + ɛ n 2exp{ n(r H(X) H(Q) Rγ n )} exp{ n(d(q P Y ) 2Rγ n )} + ɛ n + R H(X) H(Q) D(Q P Y )+Rγ n + }]+ɛ n 2(n +) Y exp n min {D(Q P Y ) 2Rγ n Q M(Y) + R H(X) H(Q) D(Q P Y )+Rγ n + }]+ɛ n where x + =max{0,x}. Here,letbe E r (R, P X,P Y ) = min Q M(X ) {D(Q P Y )+ R H(X) H(Q) D(Q P Y ) + }. E r (R, P X,P Y ) = 0 if and only if Q = P Y and RH(X) H(Q), i.e. R H(Y ) H(X). This implies that E r(r, P X,P Y ) > 0 if and only if R> H(Y ) H(X). From this for every δ>0, there exists a sufficiently large n 0 such that γ n < δh(x) H(X) H(Y ) +δ 8

9 for all n n 0. Therefore, for all n n 0 we can see min {D(Q P Y ) 2Rγ n + R H(X) H(Q) D(Q P Y )+Rγ n + } Q M(X ) { ( ) H(Y ) = min D(Q P Y ) 2 Q M(Y) H(X) + δ γ n ( ) + H(Y ) H(Y ) H(Q)+δH(X) D(Q P Y )+ H(X) + δ + γ n > 0 R= H(Y ) H(X) +δ + } Hence, n= { Pr n T n(x, Y n ) H(Y ) } H(X) δ < (3) (b)converse part If n is sufficiently large, we can δ n 8 for all n n. Thus from Lemma, if x T δ (X) then exp{ (H(X)+γ )}PX (x) exp{ (H(X) γ )}. Let N(X ) be an integer such that exp{(h(x)+2γ )} N(X ) exp{(h(x)+3γ )}. It is easy to see PX (x) > N(X ) for all x T δ (X). Assume that all x / T δ (X) stop the algorithm, by using Lemma,2 and (2)-(5) we obtain { } Pr n T n(x, Y n ) R y Y n : P n Y (y) N(X ) P n Y (y)+ H(Q)+D(Q P Y ) n log(n(x )) H(Q)+D(Q P Y )R(H(X)+3γ n ) x X : x/ T δ (X) PX (x) exp( nd(q P Y )) + ɛ n exp( nd(q P Y )) + ɛ n (n +) Y exp{ n min Q M(Y): H(Q)+D(Q P Y )R(H(X)+3γ n ) D(Q P Y )} + ɛ n. 9

10 Here, let be F (R, P X,P Y ) = min D(Q P Y ). Q M(Y): H(Q)+D(Q P Y )R H(X) F (R, P X,P Y ) = 0 if and only if Q = P Y and H(Q) RH(X), i.e. R H(Y ) H(X). This implies that F (R, P X,P Y ) > 0 if and only if R< H(Y ) H(X). From this, for every δ > 0thereexistsasufficientlylargen 2 satisfying γ n < δh(x) H(Y ) for all n n 2. Therefore, for all n n 2 we can see 3 H(X) δ min D(Q P Y ) Q M(Y): H(Q)+D(Q P Y )R(H(X)+3γ n) H(Y ) R= H(X) δ Hence, = min Q M(Y): > 0 H(Q)+D(Q P Y )H(Y ) δh(x)+3γ n n= D(Q P Y ) H(Y ) H(X) δ { Pr n T n(x, Y ) H(Y ) } H(X) δ < (4) From (3)(4), by using the Borel-Cantelli s principle (e.g. 0]) we can obtain (8). Contrarily, let us consider to generate an i.i.d. random sequence Y,Y 2, by using an i.i.d. random sequence X n =(X,X 2,,X n ). We denote by L n (X n,y) the length of the generated sequence. Then from Theorem 2, we immediately obtain the following corollary. Corollary : lim n n L n(x n,y)= H(X) H(Y ) a.s. (5) V. Almost Sure Convergence of Number of Fair Bits per Input Sample In above chapter, we showed that the length of input sequence per the length of output sequence converges to the ratio of entropies of the input and output distributions almost surely. To investigate asymptotic properties, we consider more restricted case. Let us consider to produce a sequence of fair bits by using an i.i.d. random 0

11 sequence X n =(X,X 2,,X n )oflengthn. Each random variable X i (i =, 2,,n) is subject to a generic distribution P X on X.WedenotebyL n (x) the number of generated fair bits from the input sequence x X n. Here, we define the following functions: E r (R, P X ) = min H(Q)R D(Q P X ), (6) E sp (R, P X ) = min D(Q P X ), (7) D(Q P X )+H(Q)R F (R, P X ) = min D(Q P X ), (8) D(Q P X )+H(Q) R G(R, P X ) = min H(Q) R D(Q P X ). (9) Then, we obtain the following large deviations performances of interval algorithm: Theorem 3: ForR>0, lim inf { n n log Pr n L n(x n ) R} ] E r (R, P X ). (20) For R>R min = max x X log P X (x) lim sup { n n log Pr n L n(x n ) R} ] E sp (R, P X ). (2) Further, E r (R, P X ) > 0 if and only if R<H(X), E sp (R, P X ) > 0ifand only if R min <R<H(X), and E r (R, P X ) <E sp (R, P X )forr<h(x). Proof: We can show this theorem in a similar manner as Theorem 2. We can correspond each x X n to a distinct subinterval I(x) of0, ) with its width PX n (x). Partition a unit interval 0, ) into exp() subintervals J i = (i ) exp( ), iexp( )) i =, 2,, exp(). First, we shall show (20). The number of input sequences not to stop

12 the algorithm is not more than exp(). Then we obtain { } Pr n L n(x n ) R ( ) T Q, exp() PX n (x) min x T Q Q P n min = ], exp{ n(h(q) R)} exp( nd(q P X )) exp n{d(q P X )+ H(Q) R + }] Q P n ] (n +) X exp n min {D(Q P X)+ H(Q) R + } Q M(X ) which implies lim inf { }] n n log Pr n L n(x n ) R min {D(Q P X)+ H(Q) R + }. Q M(X ) Note that D(Q P X )+H(Q) R resp. H(Q)] is a linear (resp. convex) function of Q. Then, min H(Q) R = min {D(Q P X )+ H(Q) R + } H(Q)=R = min H(Q)=R {D(Q P X )+ H(Q) R + } D(Q P X ). Hence, we obtain (20). E r (R, P X ) = 0 if and only if Q = P X and H(Q) R, i.e. R H(X). This implies that E r (R, P X ) > 0 if and only if R<H(X). 2

13 Next, we show (2). We have { } Pr n L n(x n ) R n X (x) P x X n : PX n (x) exp( ) H(Q)+D(Q P X )R (n +) X exp( nd(q P X )) (n +) X exp{ n min D(Q P X )} H(Q)+D(Q P X )R which implies (2) for R>R min. It should be noted that the minimum of (7) is taken over the non-empty set of Q if R>R min. E sp (R, P X ) = 0 if and only if Q = P X and H(Q) R, i.e. R H(X). This implies that E sp (R, P X ) > 0 if and only if R min <R<H(X). Theorem 4: For0<R<R max = min x X lim inf n log P X(x), n log Pr { n L n(x n ) R} ] F (R, P X ). (22) For 0 <R<log X, lim sup { n n log Pr n L n(x n ) R} ] G(R, P X ). (23) Further, F (R, P X ) > 0 if and only if H(X) <R<R max, G(R, P X ) > 0if and only if H(X) <R<log X, andf (R, P X ) <G(R, P X )forr>h(x). Proof: We can show this theorem in a similar manner as the proof of Theorem 3. First, we shall show (22). We have { } Pr n L n(x n n ) R X (x) P x X n : PX n (x)exp( ) H(Q)+D(Q P X ) R exp( nd(q P X )) (n +) X exp{ nf (R, P X )} 3

14 which implies (22) for R<R max. It should be noted that the minimum of (8) is taken over the non-empty set of Q if R<R max. F (R, P X ) = 0 if and only if Q = P X and H(Q) R, i.e. R H(X). This implies that F (R, P X ) > 0 if and only if H(X) <R<R max. Next, we show (23). We have { } Pr n L n(x n ) R x T Q, T Q 2exp() 2 T Q P n X(x) 2 (n +) X T Q 2exp() 2 (n +) X H(Q) R+ log 2(n+) X n 2 (n +) X exp{ n min exp( nd(q P X )) exp( nd(q P X )) H(Q) R+ log 2(n+) X n D(Q P X )}. By the continuity of divergence, we can obtain (23) for R<log X. It should be noted that the minimum of (9) is taken over the non-empty set of Q if R<log X. G(R, P X ) = 0 if and only if Q = P X and H(Q) R, i.e. R H(X). This implies that G(R, P X ) > 0 if and only if H(X) <R<log X. Remark : Let us consider to produce a specified number of fair bits by using a sequence from the source X. We denote by T n (X) the length of input sequence to obtain fair bits of length n. Then, we obtain similar relations as (20)-(23). For example, corresponding to (20), we have lim inf { n n log Pr n T n(x) R} ] Ẽr(R, P X ) (24) where Ẽ r (R, P X )= min RD(Q P X ). (25) H(Q)/R VI. Error Exponent for Intrinsic Randomness 4

15 In this chapter, let us consider to produce fixed number of random bits with an input sequence of length n. In this case, we cannot generate fair bits exactly but approximately. First, we modify the interval algorithm for generating random process so that the algorithm outputs a specified sequence Y whenever the algorithm does not stop with an input sequence of length n, where Y = {0, }. The modified algorithm can be described below. Modified Interval Algorithm for Generating Fair Bits with Fixed Input Length a) Partition an unit interval 0, ) into disjoint subinterval J(0),J() such that J(i) = 2 i, ) (i +) i =0,. 2 b) Set P j = j p k j =, 2,,M; P 0 =0. k= 2) Set s = t = λ (null string), α s = γ t =0,β s = δ t =,I(s) =α s,β s ), J(t) =γ t,δ t ), l =0,andm =. 3) If l = n then output as the output sequence Y,andstopthe algorithm. Otherwise obtain an input symbol from the source X to have a value a {, 2,,M}, and generate the subinterval of I(s) where and set l = l +. I(sa) = α sa,β sa ) α sa = α s +(β s α s )P a β sa = α s +(β s α s )P a, 4a) If I(sa) is entirely contained in some J(ti) (i =0, ), then set t = ti. Otherwise, go to 5). 4b) If m = then output t as the output sequence Y,andstopthe algorithm. Otherwise, partition the interval J(t) γ t,δ t ) into disjoint subinterval J(t0),J(t) such that J(tj) = γ tj,δ tj ) j =0, 5

16 where γ tj = γ t + 2 j(δ t γ t ) and set m = m + and go to 4a). 5) Set s = sa andgoto3). δ tj = γ t + 2 (j +)(δ t γ t ), (a) Approximation Error by Variational Distance We first measured the approximation error by the variational distance between the desired and approximated output distribution. Then, we obtain the following theorems: Theorem 5: If the modified interval algorithm is used, then we have lim inf n n log d ( U exp(),py ) ] E r (R, P X ), (26) where U exp() is a uniform distribution on U exp() = {, 2,, exp()}, PY denote the output distribution of the modified interval algorithm, and E r (R, P X ) is given by (6). Further, for R>R min, lim sup n n log d ( U exp(),py ) ] E sp (R, P X ) (27) where E sp (R, P X ) is given by (7). Proof: First, we shall show (26). The number of input sequences to output a specified sequence Y is not more than exp(). Then, we 6

17 obtain d(u exp(), P Y )= = y Y : y = 2 y Y : y 2 min x T Q y Y exp( ) P Y (y) exp( ) P Y (y) + exp( ) P Y (y) ( ) T Q, exp() PX n (x) y Y : y (exp( ) P Y ] 2(n +) X exp n min {D(Q P X)+ H(Q) R + } Q M(X ) (y)) which implies (26). Next, we show (27). In a similar manner as the proof of (26), we obtain d ( U exp(), PY ) =2 exp( ) P Y (y) 2 P x X n : PX n (x) exp( ) y Y : y n X(x) 2(n +) X exp{ n min D(Q P X )} D(Q P X )+H(Q)R which implies (27) for R>R min. This theorem implies that if the length of output sequence per input sample is below the entropy of the source, the approximation error measured by the variational distance vanishes exponentially as the length of input sequence tends to infinity, by using the modified interval algorithm. Next theorem shows the upper bounds of the error exponent. Theorem 6: Let P Y denote a distribution on U exp() using any algorithm for random number generation with fixed input length n. Then for R> R min, lim sup ] n n log d(u exp(), P Y ) E sp (R, P X ), (28) 7

18 where E sp (R, P X ) is given by (7). Proof: It should be noted that P Y exp( ) (y) 2exp( ). Then, we have d(u exp(), P Y )= x X n : P n X (x) 2exp( ) y Y P Y exp( ) 2 P n X(x) 2 (n +) X D(Q P X )+H(Q)R n (y) 2 P Y (y) P Y (y) if exp( nd(q P X )) 2 (n +) X exp{ n min D(Q P X )}. D(Q P X )+H(Q)R n From the continuity of divergence, we can obtain (28). Note that E r (R, P X ) <E sp (R, P X )forr<h(x). Hence, it is still an open problem to obtain the exact error exponent of the proposed algorithm. Next theorem shows the converse result. Theorem 7: If the modified interval algorithm is used, then for R<R max, lim inf n n log { 2 d(u exp(), PY ) } ] F (R, P X ), (29) where F (R, P X ) is given by (8). Further, for R<log X lim sup n where G(R, P X ) is given by (9). n log { 2 d(u exp(),p Y ) } ] G(R, P X ), (30) Proof: First, we shall show (29). From the equality a + b = a b + 8

19 2min(a, b), we obtain 2 d ( U exp(), PY ) = 2 exp( ) P Y (y) = 2 = 2 2 y Y y Y min y Y : y ( ) exp( ), PY (y) PY (y)+2exp( ) P x X n : PX n (x)exp( ) n X(x)+2exp( ) 2(n +) X exp{ n min D(Q P X )+H(Q) R D(Q P X )} +2exp( ). Now that F (H(X),P X )=0,F(R, P X ) is monotonously increasing for R H(X) and D(Q P X ) = Q(x)log Q(x) P (x) x X Q(x)logP X (x) x X Q(x)logmin P X( x) bx X x X = log min P X( x) bx X = R max, then F (R, P X ) <Rfor R<R max. Hence, from the convexity of divergence we can obtain (29) for R<R max. Next, we show (30). In a similar manner as the proof of (29), we have 2 d(u exp(), P Y )=2 y Y min 2 x X n : x T Q, T Q 2exp() 2 T Q P n X(x) ( exp( ), P Y (y) ) (n +) X exp{ n min H(Q) R+ n log 2(n+) X D(Q P X )} 9

20 which implies (30) for R<log X. This theorem implies that if the length of output sequence per input sample is above the entropy of the source, the approximation error measured by the variational distance approaches to two exponentially as the length of input sequence tends to infinity. Next theorem was due to Ohama. Theorem 8 5]: Consider the optimum algorithm for random number generation with fixed input length n, let P Y denote the distribution on Y which minimizes the variational distance. Then, we have n { log 2 d(u exp(), P Y )} ] = F (R, P X ) (3) where lim n F (R, P X )= min {D(Q P X)+ R H(Q) D(Q P X ) + }. (32) Q M(X ) Further, F (R, P X ) F (R, P X ) and equality holds for R R 0,where R 0 = D(U X P X )+log X. (33) Proof: (a)converse part For x X n such that PX n (x) exp( ), we assign x to a certain y Y one by one. We shall denote the set of these y Y by A. Also, for x X n such that PX n (x) exp( ), we assign as many x as possible to a certain y A c, on condition that the sum of probability of assigned x is not over the probability of y. We shall denote the set of these y A c by B. If there are some x to be corresponded to no y, we assign these x to suitable y B one by one. We shall denote the set of these y B by B 2. 20

21 Then, we have 2 d(u exp(), P Y ) = 2 ( ) exp( ), P Y (y) y Y min = 2 y A exp( )+2 = 2 2 y B B c 2 2 exp( )+2 y A y B P Y (y)+2 exp( ) y B 2 P Y (y) exp( )+2 P x X n : PX n (x) exp( ) x X n : PX n (x)exp( ) D(Q P X )+H(Q)R = 2 Q P n exp exp{ n(r H(Q))} +2 n X (x) D(Q P X )+H(Q) R exp( nd(q P X )) n{d(q P X )+ R H(Q) D(Q P X ) + } ] 2(n +) X exp n min {D(Q P X)+ R H(Q) D(Q P X ) + } Q M(X ) which implies lim inf n n log { 2 d(u exp(), P Y )} ] F (R, P X ). (b)achievable part (b-i) Suppose that R R 0.Weassignx X n such that PX n (x) exp( ) to y Y one by one, and arbitrary for other x X n. Then, we have 2 d(u exp(), P Y )=2 ( ) exp( ), P Y (y) 2 y Y min exp( ) x X n : PX n (x) exp( ) 2(n +) X D(Q P X )+H(Q)R exp{ n(r H(Q))} 2(n +) X exp{ n min (R H(Q))}. D(Q P X )+H(Q)R 2 ]

22 By the way since R R 0, from the convexity of R H(Q) wehave Therefore, we obtain min (R H(Q)) = R log X. D(Q P X )+H(Q)R min {D(Q P X)+ R H(Q) D(Q P X ) + } Q M(X ) = min (R H(Q)), min Here, min D(Q P X )+H(Q)R = min R log X, Then, we have Q =arg min D(Q P X )+H(Q) R min D(Q P X )+H(Q) R D(Q P X )+H(Q) R D(Q P X ). D(Q P X ). D(Q P X ) D(Q P X ) (R log X ) R H(Q ) (R log X ) H(Q )+log X 0 which implies that min {D(Q P X)+ R H(Q) D(Q P X ) + } = R log X Q M(X ) for R R 0. Hence, we have lim sup n for R R 0. n { log 2 d(u exp(), P Y )} ] F (R, P X ) (b-ii) Suppose that R<R 0. We can select a type Q such that D(Q P X )+H(Q ) R and Q minimizes D(Q P X ) > 0. Then, we assign as many x T Q as possible to y Y, on condition that the sum of the 22

23 probability of assigned x is not over the probability of y, and arbitrary for x / T Q. In this case, the number of x corresponding to a certain y is k = exp( ) exp n{d(q P X )+H(Q )}] Thus, for a sufficiently large n 0 and all n n 0,thenumberofy to be assigned is upperbounded as follows. exp(nh(q )) exp(nh(q )) + k k exp(nh(q )) exp n{r H(Q ) D(Q P X )}] + Therefore, we have. 2exp(nH(Q )) exp n{r H(Q ) D(Q P X )}] + = 2expn{R D(Q P X )}]+ exp(). 2 d(u exp(), P Y ) = 2 ( ) exp( ), P Y (y) y Y min 2 x X n : x T Q P n X(x) 2(n +) X exp{ n min D(Q P X )}. D(Q P X )+H(Q) R By the way, note that R H(Q) resp.d(q P X )+H(Q)] is convex (resp. linear) function of Q. Thus, for R<R 0, (R H(Q)) can be attained at its boundary, that is, min D(Q P X )+H(Q)R min (R H(Q)) = min (R H(Q)) D(Q P X )+H(Q)R D(Q P X )+H(Q)=R = min D(Q P X )+H(Q)=R D(Q P X ). 23

24 This implies that for R<R 0 min {D(Q P X)+ R H(Q) D(Q P X ) + } = min D(Q P X ). Q M(X ) D(Q P X )+H(Q) R Hence, we have lim sup n n { } ] log 2 d(u exp(), P Y ) F (R, P X ) for R<R 0. From (a)(b-i)(b-ii), we obtain (3). F (R, P X ) = 0 if and only if Q = P X and R H(Q), i.e. R H(X). This implies that F (R, P X ) > 0 if and only if R>H(X). Theorem 7 and 8 imply that the modified interval algorithm is not optimum if R R 0. (b) Approximation Error by Divergence Next, we shall consider to measure the approximation error by the divergence between the desired and approximated output distribution. First, we show the following lemma. Lemma 3 Let P n,q n be arbitrary distributions on X n.if then d(p n,q n ) ɛ, D(P n Q n ) ɛ log P n minq n min, where Pmin n (resp. Qn min ) is the minimum of P n (resp. Q n )onx n. Proof: Using the inequality Q n (x)logq n (x) Q n (x)logp n (x), (34) x X n x X n we have (P n (x)logp n (x) Q n (x)logq n (x)) x X n (P n (x) Q n (x)) log P n (x) x X n log Pmin n P n (x) Q n (x) x X n = d(p n,q n )logpmin n ɛ log Pmin n. 24

25 Hence, we obtain D(P n Q n ) = P n (x)log P n (x) Q n (x) x X n = P n (x)logp n (x) P n (x)logq n (x) x X n x X n Q n (x)logq n (x) ɛ log Pmin n P n (x)logq n (x) x X n x X n = (Q n (x) P n (x)) log Q n (x) ɛ log Pmin n x X n log Q n min P n (x) Q n (x) ɛlog Pmin n x X n ɛ log P n minq n min. From Theorem 5 and Lemma 3, we immediately obtain the following corollary. Corollary 2: If the modified interval algorithm is used, then we have lim inf n n log D ( U exp() PY ) ] E r (R, P X ) (35) where E r (R, P X ) is given by (6). This corollary implies that if the length of output sequence per input sample is below the entropy of the source, the approximation error measured by the divergence also vanishes exponentially as the length of input sequence tends to infinity. Remark 2: Han9] has showed that there exists an algorithm for random number generation of which normalized divergence vanishes. However as shown in Corollary 2, for DMS (more generally finite-state unifilar sources), even divergence can vanish as the length of input sequence tends to infinity. Next, we show the following lemmas. Lemma 4: LetP n,q n be arbitrary distributions on X n.if 2 d(p n,q n ) ɛ, 25

26 then D(P n Q n )+D(Q n P n ) log ɛ. Proof: Using the log-sum inequality8] and (34), we obtain D(P n Q n )+D(Q n P n ) = P n (x)log P n (x) Q n (x) + Q n (x)log Qn (x) P n (x) x X n x X n = (P n (x)logp n (x)+q n (x)logq n (x)) x X n (P n (x)logq n (x)+q n (x)logp n (x)) x X n (P n (x)+q n (x)) log P n (x) log Q n (x) x X n = (P n (x)+q n (x)) log P n (x) Q n (x) x X n = (P n (x)+q n (x)) log max(p n (x), Qn (x)) min(p n (x), Q n (x)) x X n P n P n (x) (x)log min(p n (x), Q n (x)) x X n log min(p n (x), Q n (x)) x X n log ɛ. Lemma 5: LetP n,q n be arbitrary distributions on X n.if 2 d(p n,q n ) ɛ, then Proof: Note that D(P n Q n ) (2 ɛ)logp n min Qn min. d(p n,q n ) 2 ɛ. 26

27 Then, in a similar manner as the proof of Lemma 4, we obtain D(P n Q n ) = P n (x)log P n (x) Q n (x) x X n = P n (x)logp n (x) P n (x)logq n (x) x X n x X n Q n (x)logq n (x) (2 ɛ)logpmin n P n (x)logq n (x) x X n x X n = (Q n (x) P n (x)) log Q n (x) (2 ɛ)logpmin n x X n log Q n min P n (x) Q n (x) (2 ɛ)logpmin n x X n (2 ɛ)logpminq n n min. From these lemmas and Theorem 7, we immediately obtain the following corollary. Corollary 3: If the modified interval algorithm is used, then for R<R max, lim inf n { n D ( U exp() P Y ) + D ( P Y U exp() ) } F (R, P X ), (36) where F (R, P X ) is given by (8). Further, for R<log X lim sup n n D ( U exp() PY ) R( log min P X( x)). (37) bx X This corollary implies that if the length of output sequence per input sample is above the entropy of the source, the approximation error measured by the divergence approaches to two linearly as the length of input sequence tends to infinity. Remark 3: We can easily extend the result of Chapter IV, V and VI to stationary ergodic Markov source by using Markov type ]. Further, we can extend these result to finite-state unifilar Markov source by using the definition of type in 2, 3]. VII. Conclusion 27

28 We have investigated large deviations performance of the interval algorithm for random number generation. We have showed almost surely proposition for i.i.d. random sequence. We have clarified some asymptotic properties, when target random number is subject to uniform distribution. As future researches, we are going to generalize our results to more complex sources. References ] D. Knuth and A. Yao, The complexity of nonuniform random number generation, Algorithm and Complexity, New Directions and Results, pp , ed. by J. F. Traub, Academic Press, New York, ]S.VembuandS.Verdú, Generating random bits from and arbitrary source: Fundamental limits, IEEE Trans. on Inform. Theory, vol.it- 4, pp , Sep ] T. S. Han and M. Hoshi, Interval algorithm for random number generation, IEEE Trans. on Inform. Theory, vol.43, pp.599-6, Mar ] F. Kanaya, An asymptotically optimal algorithm for generating Markov random sequences, Proc. of SITA 97, pp.77-80, Matsuyama, Japan, Dec. 997 (in Japanese). 5] Y. Ohama, Fixed to fixed length random number generation using one dimensional piecewise linear maps, Proc. of SITA 98, pp.57-60, Gifu, Japan, Dec. 998 (in Japanese). 6] T. Uyematsu and F. Kanaya, Methods of channel simulation achieving conditional resolvability by statistically stable transformation, submitted to IEEE Trans. on Inform. Theory 7] O. Uchida and T. S. Han, Performance analysis of interval algorithm for generating Markov processes, Proc. of SITA 98, pp.65-68, Gifu, Japan, Dec ] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic, 98. 9] T. S. Han: Information-Spectrum Methods in Information Theory, Baifukan, Tokyo, 998 (in Japanese). 28

29 0] P. C. Shields: The ergodic theory of discrete sample paths, Graduate Studies in Math. vol.3, American Math. Soc., 996. ] L. D. Davisson, G. Longo and A. Sgarro, Error exponent for the noiseless of finite ergodic Markov sources, IEEE Trans. on Inform. Theory, vol.it-27, pp , Jul ] N. Merhav, Universal coding with minimum probability of codeword length overflow, IEEE Trans. on Inform. Theory, vol.37, pp , May ] N. Merhav and D. L. Neuhoff, Variable-to-fixed length codes provide better large deviations performance than fixed-to-variable length codes, IEEE Trans. on Inform. Theory, vol.38, pp.35-40, Jan ] T. Uyematsu: Today s Shannon Theory, Baifukan, Tokyo, 998 (in Japanese). 29

Large Deviations Performance of Knuth-Yao algorithm for Random Number Generation

Large Deviations Performance of Knuth-Yao algorithm for Random Number Generation Large Deviations Performance of Knuth-Yao algorithm for Random Number Generation Akisato KIMURA akisato@ss.titech.ac.jp Tomohiko UYEMATSU uematsu@ss.titech.ac.jp April 2, 999 No. AK-TR-999-02 Abstract

More information

EECS 750. Hypothesis Testing with Communication Constraints

EECS 750. Hypothesis Testing with Communication Constraints EECS 750 Hypothesis Testing with Communication Constraints Name: Dinesh Krithivasan Abstract In this report, we study a modification of the classical statistical problem of bivariate hypothesis testing.

More information

(each row defines a probability distribution). Given n-strings x X n, y Y n we can use the absence of memory in the channel to compute

(each row defines a probability distribution). Given n-strings x X n, y Y n we can use the absence of memory in the channel to compute ENEE 739C: Advanced Topics in Signal Processing: Coding Theory Instructor: Alexander Barg Lecture 6 (draft; 9/6/03. Error exponents for Discrete Memoryless Channels http://www.enee.umd.edu/ abarg/enee739c/course.html

More information

Arimoto Channel Coding Converse and Rényi Divergence

Arimoto Channel Coding Converse and Rényi Divergence Arimoto Channel Coding Converse and Rényi Divergence Yury Polyanskiy and Sergio Verdú Abstract Arimoto proved a non-asymptotic upper bound on the probability of successful decoding achievable by any code

More information

Information measures in simple coding problems

Information measures in simple coding problems Part I Information measures in simple coding problems in this web service in this web service Source coding and hypothesis testing; information measures A(discrete)source is a sequence {X i } i= of random

More information

An Achievable Error Exponent for the Mismatched Multiple-Access Channel

An Achievable Error Exponent for the Mismatched Multiple-Access Channel An Achievable Error Exponent for the Mismatched Multiple-Access Channel Jonathan Scarlett University of Cambridge jms265@camacuk Albert Guillén i Fàbregas ICREA & Universitat Pompeu Fabra University of

More information

INFORMATION THEORY AND STATISTICS

INFORMATION THEORY AND STATISTICS CHAPTER INFORMATION THEORY AND STATISTICS We now explore the relationship between information theory and statistics. We begin by describing the method of types, which is a powerful technique in large deviation

More information

Subset Source Coding

Subset Source Coding Fifty-third Annual Allerton Conference Allerton House, UIUC, Illinois, USA September 29 - October 2, 205 Subset Source Coding Ebrahim MolavianJazi and Aylin Yener Wireless Communications and Networking

More information

An Extended Fano s Inequality for the Finite Blocklength Coding

An Extended Fano s Inequality for the Finite Blocklength Coding An Extended Fano s Inequality for the Finite Bloclength Coding Yunquan Dong, Pingyi Fan {dongyq8@mails,fpy@mail}.tsinghua.edu.cn Department of Electronic Engineering, Tsinghua University, Beijing, P.R.

More information

Convexity/Concavity of Renyi Entropy and α-mutual Information

Convexity/Concavity of Renyi Entropy and α-mutual Information Convexity/Concavity of Renyi Entropy and -Mutual Information Siu-Wai Ho Institute for Telecommunications Research University of South Australia Adelaide, SA 5095, Australia Email: siuwai.ho@unisa.edu.au

More information

The Method of Types and Its Application to Information Hiding

The Method of Types and Its Application to Information Hiding The Method of Types and Its Application to Information Hiding Pierre Moulin University of Illinois at Urbana-Champaign www.ifp.uiuc.edu/ moulin/talks/eusipco05-slides.pdf EUSIPCO Antalya, September 7,

More information

EE 4TM4: Digital Communications II. Channel Capacity

EE 4TM4: Digital Communications II. Channel Capacity EE 4TM4: Digital Communications II 1 Channel Capacity I. CHANNEL CODING THEOREM Definition 1: A rater is said to be achievable if there exists a sequence of(2 nr,n) codes such thatlim n P (n) e (C) = 0.

More information

Chapter 4. Data Transmission and Channel Capacity. Po-Ning Chen, Professor. Department of Communications Engineering. National Chiao Tung University

Chapter 4. Data Transmission and Channel Capacity. Po-Ning Chen, Professor. Department of Communications Engineering. National Chiao Tung University Chapter 4 Data Transmission and Channel Capacity Po-Ning Chen, Professor Department of Communications Engineering National Chiao Tung University Hsin Chu, Taiwan 30050, R.O.C. Principle of Data Transmission

More information

Information Theory and Statistics, Part I

Information Theory and Statistics, Part I Information Theory and Statistics, Part I Information Theory 2013 Lecture 6 George Mathai May 16, 2013 Outline This lecture will cover Method of Types. Law of Large Numbers. Universal Source Coding. Large

More information

Homework Set #2 Data Compression, Huffman code and AEP

Homework Set #2 Data Compression, Huffman code and AEP Homework Set #2 Data Compression, Huffman code and AEP 1. Huffman coding. Consider the random variable ( x1 x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0.11 0.04 0.04 0.03 0.02 (a Find a binary Huffman code

More information

A new converse in rate-distortion theory

A new converse in rate-distortion theory A new converse in rate-distortion theory Victoria Kostina, Sergio Verdú Dept. of Electrical Engineering, Princeton University, NJ, 08544, USA Abstract This paper shows new finite-blocklength converse bounds

More information

Correlation Detection and an Operational Interpretation of the Rényi Mutual Information

Correlation Detection and an Operational Interpretation of the Rényi Mutual Information Correlation Detection and an Operational Interpretation of the Rényi Mutual Information Masahito Hayashi 1, Marco Tomamichel 2 1 Graduate School of Mathematics, Nagoya University, and Centre for Quantum

More information

Capacity of a channel Shannon s second theorem. Information Theory 1/33

Capacity of a channel Shannon s second theorem. Information Theory 1/33 Capacity of a channel Shannon s second theorem Information Theory 1/33 Outline 1. Memoryless channels, examples ; 2. Capacity ; 3. Symmetric channels ; 4. Channel Coding ; 5. Shannon s second theorem,

More information

Variable Length Codes for Degraded Broadcast Channels

Variable Length Codes for Degraded Broadcast Channels Variable Length Codes for Degraded Broadcast Channels Stéphane Musy School of Computer and Communication Sciences, EPFL CH-1015 Lausanne, Switzerland Email: stephane.musy@ep.ch Abstract This paper investigates

More information

Coding on Countably Infinite Alphabets

Coding on Countably Infinite Alphabets Coding on Countably Infinite Alphabets Non-parametric Information Theory Licence de droits d usage Outline Lossless Coding on infinite alphabets Source Coding Universal Coding Infinite Alphabets Enveloppe

More information

Information Theory and Hypothesis Testing

Information Theory and Hypothesis Testing Summer School on Game Theory and Telecommunications Campione, 7-12 September, 2014 Information Theory and Hypothesis Testing Mauro Barni University of Siena September 8 Review of some basic results linking

More information

Information Theory in Intelligent Decision Making

Information Theory in Intelligent Decision Making Information Theory in Intelligent Decision Making Adaptive Systems and Algorithms Research Groups School of Computer Science University of Hertfordshire, United Kingdom June 7, 2015 Information Theory

More information

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16

EE5139R: Problem Set 4 Assigned: 31/08/16, Due: 07/09/16 EE539R: Problem Set 4 Assigned: 3/08/6, Due: 07/09/6. Cover and Thomas: Problem 3.5 Sets defined by probabilities: Define the set C n (t = {x n : P X n(x n 2 nt } (a We have = P X n(x n P X n(x n 2 nt

More information

Coding for Discrete Source

Coding for Discrete Source EGR 544 Communication Theory 3. Coding for Discrete Sources Z. Aliyazicioglu Electrical and Computer Engineering Department Cal Poly Pomona Coding for Discrete Source Coding Represent source data effectively

More information

On the Reliability Function of Variable-Rate Slepian-Wolf Coding

On the Reliability Function of Variable-Rate Slepian-Wolf Coding entropy Article On the Reliability Function of Variable-Rate Slepian-Wolf Coding Jun Chen 1,2, *, Da-ke He 3, Ashish Jagmohan 4 and Luis A. Lastras-Montaño 4 1 College of Electronic Information and Automation,

More information

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15

EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 EE5139R: Problem Set 7 Assigned: 30/09/15, Due: 07/10/15 1. Cascade of Binary Symmetric Channels The conditional probability distribution py x for each of the BSCs may be expressed by the transition probability

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Transmission of Information Spring 2006 MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.44 Transmission of Information Spring 2006 Homework 2 Solution name username April 4, 2006 Reading: Chapter

More information

Hypothesis Testing with Communication Constraints

Hypothesis Testing with Communication Constraints Hypothesis Testing with Communication Constraints Dinesh Krithivasan EECS 750 April 17, 2006 Dinesh Krithivasan (EECS 750) Hyp. testing with comm. constraints April 17, 2006 1 / 21 Presentation Outline

More information

Multiaccess Channels with State Known to One Encoder: A Case of Degraded Message Sets

Multiaccess Channels with State Known to One Encoder: A Case of Degraded Message Sets Multiaccess Channels with State Known to One Encoder: A Case of Degraded Message Sets Shivaprasad Kotagiri and J. Nicholas Laneman Department of Electrical Engineering University of Notre Dame Notre Dame,

More information

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1

An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if. 2 l i. i=1 Kraft s inequality An instantaneous code (prefix code, tree code) with the codeword lengths l 1,..., l N exists if and only if N 2 l i 1 Proof: Suppose that we have a tree code. Let l max = max{l 1,...,

More information

Solutions to Set #2 Data Compression, Huffman code and AEP

Solutions to Set #2 Data Compression, Huffman code and AEP Solutions to Set #2 Data Compression, Huffman code and AEP. Huffman coding. Consider the random variable ( ) x x X = 2 x 3 x 4 x 5 x 6 x 7 0.50 0.26 0. 0.04 0.04 0.03 0.02 (a) Find a binary Huffman code

More information

Pointwise Redundancy in Lossy Data Compression and Universal Lossy Data Compression

Pointwise Redundancy in Lossy Data Compression and Universal Lossy Data Compression Pointwise Redundancy in Lossy Data Compression and Universal Lossy Data Compression I. Kontoyiannis To appear, IEEE Transactions on Information Theory, Jan. 2000 Last revision, November 21, 1999 Abstract

More information

Hash Property and Fixed-rate Universal Coding Theorems

Hash Property and Fixed-rate Universal Coding Theorems 1 Hash Property and Fixed-rate Universal Coding Theorems Jun Muramatsu Member, IEEE, Shigeki Miyake Member, IEEE, Abstract arxiv:0804.1183v1 [cs.it 8 Apr 2008 The aim of this paper is to prove the achievability

More information

ECE 4400:693 - Information Theory

ECE 4400:693 - Information Theory ECE 4400:693 - Information Theory Dr. Nghi Tran Lecture 8: Differential Entropy Dr. Nghi Tran (ECE-University of Akron) ECE 4400:693 Lecture 1 / 43 Outline 1 Review: Entropy of discrete RVs 2 Differential

More information

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST

Information Theory. Lecture 5 Entropy rate and Markov sources STEFAN HÖST Information Theory Lecture 5 Entropy rate and Markov sources STEFAN HÖST Universal Source Coding Huffman coding is optimal, what is the problem? In the previous coding schemes (Huffman and Shannon-Fano)it

More information

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018

EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 Please submit the solutions on Gradescope. EE376A: Homework #3 Due by 11:59pm Saturday, February 10th, 2018 1. Optimal codeword lengths. Although the codeword lengths of an optimal variable length code

More information

DISCRIMINATING two distributions is treated as a fundamental problem in the field of statistical inference. This problem

DISCRIMINATING two distributions is treated as a fundamental problem in the field of statistical inference. This problem Discrimination of two channels by adaptive methods and its application to quantum system Masahito Hayashi 1 arxiv:0804.0686v1 [quant-ph] 4 Apr 2008 Abstract The optimal exponential error rate for adaptive

More information

Network coding for multicast relation to compression and generalization of Slepian-Wolf

Network coding for multicast relation to compression and generalization of Slepian-Wolf Network coding for multicast relation to compression and generalization of Slepian-Wolf 1 Overview Review of Slepian-Wolf Distributed network compression Error exponents Source-channel separation issues

More information

Lecture 5 - Information theory

Lecture 5 - Information theory Lecture 5 - Information theory Jan Bouda FI MU May 18, 2012 Jan Bouda (FI MU) Lecture 5 - Information theory May 18, 2012 1 / 42 Part I Uncertainty and entropy Jan Bouda (FI MU) Lecture 5 - Information

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 7: Information Theory Cosma Shalizi 3 February 2009 Entropy and Information Measuring randomness and dependence in bits The connection to statistics Long-run

More information

Second-Order Asymptotics in Information Theory

Second-Order Asymptotics in Information Theory Second-Order Asymptotics in Information Theory Vincent Y. F. Tan (vtan@nus.edu.sg) Dept. of ECE and Dept. of Mathematics National University of Singapore (NUS) National Taiwan University November 2015

More information

Source Coding with Lists and Rényi Entropy or The Honey-Do Problem

Source Coding with Lists and Rényi Entropy or The Honey-Do Problem Source Coding with Lists and Rényi Entropy or The Honey-Do Problem Amos Lapidoth ETH Zurich October 8, 2013 Joint work with Christoph Bunte. A Task from your Spouse Using a fixed number of bits, your spouse

More information

Universal source coding for complementary delivery

Universal source coding for complementary delivery SITA2006 i Hakodate 2005.2. p. Uiversal source codig for complemetary delivery Akisato Kimura, 2, Tomohiko Uyematsu 2, Shigeaki Kuzuoka 2 Media Iformatio Laboratory, NTT Commuicatio Sciece Laboratories,

More information

A Novel Asynchronous Communication Paradigm: Detection, Isolation, and Coding

A Novel Asynchronous Communication Paradigm: Detection, Isolation, and Coding A Novel Asynchronous Communication Paradigm: Detection, Isolation, and Coding The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation

More information

Upper Bounds on the Capacity of Binary Intermittent Communication

Upper Bounds on the Capacity of Binary Intermittent Communication Upper Bounds on the Capacity of Binary Intermittent Communication Mostafa Khoshnevisan and J. Nicholas Laneman Department of Electrical Engineering University of Notre Dame Notre Dame, Indiana 46556 Email:{mhoshne,

More information

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 53, NO. 12, DECEMBER

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 53, NO. 12, DECEMBER IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 53, NO 12, DECEMBER 2007 4457 Joint Source Channel Coding Error Exponent for Discrete Communication Systems With Markovian Memory Yangfan Zhong, Student Member,

More information

IN [1], Forney derived lower bounds on the random coding

IN [1], Forney derived lower bounds on the random coding Exact Random Coding Exponents for Erasure Decoding Anelia Somekh-Baruch and Neri Merhav Abstract Random coding of channel decoding with an erasure option is studied By analyzing the large deviations behavior

More information

Lecture 2: August 31

Lecture 2: August 31 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 2: August 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy

More information

x log x, which is strictly convex, and use Jensen s Inequality:

x log x, which is strictly convex, and use Jensen s Inequality: 2. Information measures: mutual information 2.1 Divergence: main inequality Theorem 2.1 (Information Inequality). D(P Q) 0 ; D(P Q) = 0 iff P = Q Proof. Let ϕ(x) x log x, which is strictly convex, and

More information

Common Information. Abbas El Gamal. Stanford University. Viterbi Lecture, USC, April 2014

Common Information. Abbas El Gamal. Stanford University. Viterbi Lecture, USC, April 2014 Common Information Abbas El Gamal Stanford University Viterbi Lecture, USC, April 2014 Andrew Viterbi s Fabulous Formula, IEEE Spectrum, 2010 El Gamal (Stanford University) Disclaimer Viterbi Lecture 2

More information

Two Applications of the Gaussian Poincaré Inequality in the Shannon Theory

Two Applications of the Gaussian Poincaré Inequality in the Shannon Theory Two Applications of the Gaussian Poincaré Inequality in the Shannon Theory Vincent Y. F. Tan (Joint work with Silas L. Fong) National University of Singapore (NUS) 2016 International Zurich Seminar on

More information

On Large Deviation Analysis of Sampling from Typical Sets

On Large Deviation Analysis of Sampling from Typical Sets Communications and Signal Processing Laboratory (CSPL) Technical Report No. 374, University of Michigan at Ann Arbor, July 25, 2006. On Large Deviation Analysis of Sampling from Typical Sets Dinesh Krithivasan

More information

Entropy and Ergodic Theory Lecture 4: Conditional entropy and mutual information

Entropy and Ergodic Theory Lecture 4: Conditional entropy and mutual information Entropy and Ergodic Theory Lecture 4: Conditional entropy and mutual information 1 Conditional entropy Let (Ω, F, P) be a probability space, let X be a RV taking values in some finite set A. In this lecture

More information

Fixed-Length-Parsing Universal Compression with Side Information

Fixed-Length-Parsing Universal Compression with Side Information Fixed-ength-Parsing Universal Compression with Side Information Yeohee Im and Sergio Verdú Dept. of Electrical Eng., Princeton University, NJ 08544 Email: yeoheei,verdu@princeton.edu Abstract This paper

More information

Tight Bounds for Symmetric Divergence Measures and a New Inequality Relating f-divergences

Tight Bounds for Symmetric Divergence Measures and a New Inequality Relating f-divergences Tight Bounds for Symmetric Divergence Measures and a New Inequality Relating f-divergences Igal Sason Department of Electrical Engineering Technion, Haifa 3000, Israel E-mail: sason@ee.technion.ac.il Abstract

More information

Lecture 6: Gaussian Channels. Copyright G. Caire (Sample Lectures) 157

Lecture 6: Gaussian Channels. Copyright G. Caire (Sample Lectures) 157 Lecture 6: Gaussian Channels Copyright G. Caire (Sample Lectures) 157 Differential entropy (1) Definition 18. The (joint) differential entropy of a continuous random vector X n p X n(x) over R is: Z h(x

More information

Soft Covering with High Probability

Soft Covering with High Probability Soft Covering with High Probability Paul Cuff Princeton University arxiv:605.06396v [cs.it] 20 May 206 Abstract Wyner s soft-covering lemma is the central analysis step for achievability proofs of information

More information

Information Theoretic Limits of Randomness Generation

Information Theoretic Limits of Randomness Generation Information Theoretic Limits of Randomness Generation Abbas El Gamal Stanford University Shannon Centennial, University of Michigan, September 2016 Information theory The fundamental problem of communication

More information

A General Formula for Compound Channel Capacity

A General Formula for Compound Channel Capacity A General Formula for Compound Channel Capacity Sergey Loyka, Charalambos D. Charalambous University of Ottawa, University of Cyprus ETH Zurich (May 2015), ISIT-15 1/32 Outline 1 Introduction 2 Channel

More information

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe

SHARED INFORMATION. Prakash Narayan with. Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe SHARED INFORMATION Prakash Narayan with Imre Csiszár, Sirin Nitinawarat, Himanshu Tyagi, Shun Watanabe 2/41 Outline Two-terminal model: Mutual information Operational meaning in: Channel coding: channel

More information

Feedback Capacity of a Class of Symmetric Finite-State Markov Channels
