Lecture #3: Distributed Lossless Compression
(Reading: NIT 10.1–10.5, 4.4)

- Distributed lossless source coding
- Lossless source coding via random binning
- Time sharing
- Achievability proof of the Slepian–Wolf theorem
- Extension to more than two sources

Copyright 2002–2015 Abbas El Gamal and Young-Han Kim


Distributed lossless compression system

[Figure: encoder 1 compresses $X_1^n$ into $M_1$ and encoder 2 compresses $X_2^n$ into $M_2$, separately; a single decoder reconstructs $(\hat{X}_1^n, \hat{X}_2^n)$ from $(M_1, M_2)$]

- Two-component DMS (2-DMS) $(\mathcal{X}_1 \times \mathcal{X}_2, p(x_1, x_2))$
- A $(2^{nR_1}, 2^{nR_2}, n)$ code consists of:
  - two encoders $m_1(x_1^n) \in [1 : 2^{nR_1}]$ and $m_2(x_2^n) \in [1 : 2^{nR_2}]$, and
  - a decoder $(\hat{x}_1^n, \hat{x}_2^n)(m_1, m_2)$
- Probability of error: $P_e^{(n)} = \mathrm{P}\{(\hat{X}_1^n, \hat{X}_2^n) \ne (X_1^n, X_2^n)\}$
- $(R_1, R_2)$ is achievable if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ codes such that $\lim_{n \to \infty} P_e^{(n)} = 0$
- Optimal rate region $\mathscr{R}^*$: the closure of the set of achievable rate pairs $(R_1, R_2)$
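To make the definitions concrete, here is a minimal Python sketch of a trivial $(2^{nR_1}, 2^{nR_2}, n)$ code with $R_1 = R_2 = 1$ bit/symbol for a binary 2-DMS (each encoder sends its sequence verbatim), together with a Monte Carlo estimate of $P_e^{(n)}$. The pmf and all names (`sample_pair`, `enc1`, `enc2`, `dec`) are illustrative assumptions, not from the lecture.

```python
import random

# Illustrative joint pmf p(x1, x2) of a binary 2-DMS (assumed, not from the text)
n = 8
pmf = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

def sample_pair():
    """Draw (x1^n, x2^n) i.i.d. from p(x1, x2)."""
    pairs = random.choices(list(pmf), weights=list(pmf.values()), k=n)
    x1, x2 = zip(*pairs)
    return x1, x2

# Trivial rate-(1, 1) code: each encoder indexes its sequence verbatim.
enc1 = lambda x1: int("".join(map(str, x1)), 2)      # m1(x1^n) in [0 : 2^n)
enc2 = lambda x2: int("".join(map(str, x2)), 2)      # m2(x2^n) in [0 : 2^n)
dec = lambda m1, m2: tuple(tuple(int(b) for b in format(m, f"0{n}b"))
                           for m in (m1, m2))        # invert the indexing

trials = 5000
errors = sum(dec(enc1(x1), enc2(x2)) != (x1, x2)
             for x1, x2 in (sample_pair() for _ in range(trials)))
print("empirical P_e^(n):", errors / trials)         # 0 for this lossless code
```

The interesting question, addressed next, is how far below $(1, 1)$ the rate pair can be pushed while keeping $P_e^{(n)} \to 0$.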
Bounds on the optimal rate region

[Figure: inner and outer bounds on $\mathscr{R}^*$ in the $(R_1, R_2)$ plane; the axes are marked at $H(X_1 \mid X_2)$ and $H(X_1)$ on the $R_1$ axis, and at $H(X_2 \mid X_1)$ and $H(X_2)$ on the $R_2$ axis]

- Sufficient condition (individual compression of each source): $R_1 > H(X_1)$, $R_2 > H(X_2)$
- Necessary condition (even for centralized compression): $R_1 + R_2 \ge H(X_1, X_2)$
- One can also show that $R_1 \ge H(X_1 \mid X_2)$ and $R_2 \ge H(X_2 \mid X_1)$ are necessary


Slepian–Wolf theorem

Theorem 10.1 (Slepian–Wolf 1973). The optimal rate region $\mathscr{R}^*$ is the set of rate pairs $(R_1, R_2)$ such that
$$R_1 \ge H(X_1 \mid X_2), \qquad R_2 \ge H(X_2 \mid X_1), \qquad R_1 + R_2 \ge H(X_1, X_2).$$

[Figure: the Slepian–Wolf region, with corner points $(H(X_1), H(X_2 \mid X_1))$ and $(H(X_1 \mid X_2), H(X_2))$]
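The three inequalities are easy to check numerically. The helper below (hypothetical function names, illustrative pmf) decides whether a rate pair lies in the Slepian–Wolf region for a joint pmf given as a dictionary.

```python
import math

def H(probs):
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

def sw_region_contains(pmf, R1, R2):
    """Check the three Slepian-Wolf inequalities for pmf = {(x1, x2): prob}."""
    p1, p2 = {}, {}
    for (x1, x2), q in pmf.items():
        p1[x1] = p1.get(x1, 0) + q
        p2[x2] = p2.get(x2, 0) + q
    HX1X2 = H(pmf.values())
    H1given2 = HX1X2 - H(p2.values())     # H(X1|X2) = H(X1,X2) - H(X2)
    H2given1 = HX1X2 - H(p1.values())     # H(X2|X1) = H(X1,X2) - H(X1)
    return R1 >= H1given2 and R2 >= H2given1 and R1 + R2 >= HX1X2

# Illustrative: DSBS(0.25), for which H(X1, X2) = 1 + H(0.25) ~ 1.811 bits
pmf = {(0, 0): 0.375, (0, 1): 0.125, (1, 0): 0.125, (1, 1): 0.375}
print(sw_region_contains(pmf, 1.0, 0.9))  # True: both > 0.811, sum > 1.811
```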
Example: doubly symmetric binary source

A DSBS(p) is the 2-DMS $(X_1, X_2)$ with $X_1 \sim \mathrm{Bern}(1/2)$ and $X_2 = X_1 \oplus Z$, where $Z \sim \mathrm{Bern}(p)$ is independent of $X_1$, i.e., with joint pmf
$$p(0,0) = p(1,1) = \frac{1-p}{2}, \qquad p(0,1) = p(1,0) = \frac{p}{2}.$$

Let $p = 0.01$:
- Individual compression: $H(X_1) + H(X_2) = 2$ bits/symbol-pair
- Slepian–Wolf coding: $H(X_1, X_2) = 1 + H(0.01) = 1.0808$ bits/symbol-pair
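A quick check of these numbers, using only the binary entropy function:

```python
import math

# Binary entropy function H(p) in bits
h = lambda p: -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.01
print(2.0)        # individual compression: H(X1) + H(X2) = 2 bits/symbol-pair
print(1 + h(p))   # Slepian-Wolf: H(X1, X2) = 1 + H(0.01) ~ 1.0808 bits/symbol-pair
```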
Lossless source coding via random binning

- Codebook generation: randomly assign an index $m(x^n) \in [1 : 2^{nR}]$ to each sequence $x^n \in \mathcal{X}^n$. The set of sequences with the same index $m$ forms a bin $\mathcal{B}(m)$, $m \in [1 : 2^{nR}]$. The bin assignments are revealed to the encoder and the decoder.
- Encoding: upon observing $x^n \in \mathcal{B}(m)$, send the bin index $m$
- Decoding: find the unique typical sequence $\hat{x}^n \in \mathcal{B}(m)$

[Figure: the bins $\mathcal{B}(1), \ldots, \mathcal{B}(2^{nR})$ partition $\mathcal{X}^n$; the decoder looks for the unique element of $\mathcal{B}(m)$ that lies in $\mathcal{T}_\epsilon^{(n)}$]
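The scheme can be run directly at a small blocklength. The following toy Monte Carlo sketch uses, as a computable stand-in for the typical set, the $2^{n(H(X)+\delta)}$ most probable sequences; the source, the parameters, and all names are illustrative assumptions.

```python
import math, random
from itertools import product

# Random binning for a Bern(p) DMS at a tiny blocklength (illustrative values,
# chosen so that R > H(X) + delta, as the analysis below requires).
p, n, delta, R = 0.11, 16, 0.1, 0.8
HX = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))      # H(X) ~ 0.5
prob = lambda x: math.prod(p if b else 1 - p for b in x)

seqs = sorted(product((0, 1), repeat=n), key=prob, reverse=True)
typical = set(seqs[: round(2 ** (n * (HX + delta)))])      # stand-in for T_eps
bins = {x: random.randrange(2 ** round(n * R)) for x in seqs}

def decode(m):
    """Return the unique 'typical' sequence in bin m, or None (an error)."""
    cands = [x for x in typical if bins[x] == m]
    return cands[0] if len(cands) == 1 else None

trials, errors = 3000, 0
for _ in range(trials):
    x = tuple(int(random.random() < p) for _ in range(n))
    errors += decode(bins[x]) != x
print(f"R = {R} > H(X) ~ {HX:.2f}: empirical P_e ~ {errors / trials:.3f}")
# The error probability is noticeable at n = 16 but decays to 0 as n grows.
```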
Analysis of the probability of error

- We bound $P_e^{(n)}$ averaged over the random bin assignments
- Let $M$ denote the random bin index of $X^n$, i.e., $X^n \in \mathcal{B}(M)$. Note that $M \sim \mathrm{Unif}[1 : 2^{nR}]$, independent of $X^n$
- Error events:
  $$\mathcal{E}_1 = \{X^n \notin \mathcal{T}_\epsilon^{(n)}\} \quad \text{or} \quad \mathcal{E}_2 = \{\tilde{x}^n \in \mathcal{B}(M) \text{ for some } \tilde{x}^n \ne X^n,\ \tilde{x}^n \in \mathcal{T}_\epsilon^{(n)}\}$$
- Thus
  $$\mathrm{P}(\mathcal{E}) \le \mathrm{P}(\mathcal{E}_1) + \mathrm{P}(\mathcal{E}_2) = \mathrm{P}(\mathcal{E}_1) + \mathrm{P}(\mathcal{E}_2 \mid X^n \in \mathcal{B}(1))$$
- By the LLN, $\mathrm{P}(\mathcal{E}_1) \to 0$
- Consider
$$
\begin{aligned}
\mathrm{P}(\mathcal{E}_2 \mid X^n \in \mathcal{B}(1))
&= \sum_{x^n} \mathrm{P}\{X^n = x^n \mid X^n \in \mathcal{B}(1)\}\, \mathrm{P}\{\tilde{x}^n \in \mathcal{B}(1) \text{ for some } \tilde{x}^n \ne x^n,\ \tilde{x}^n \in \mathcal{T}_\epsilon^{(n)} \mid x^n \in \mathcal{B}(1),\, X^n = x^n\} \\
&\le \sum_{x^n} p(x^n) \sum_{\substack{\tilde{x}^n \in \mathcal{T}_\epsilon^{(n)} \\ \tilde{x}^n \ne x^n}} \mathrm{P}\{\tilde{x}^n \in \mathcal{B}(1) \mid x^n \in \mathcal{B}(1),\, X^n = x^n\} \\
&= \sum_{x^n} p(x^n) \sum_{\substack{\tilde{x}^n \in \mathcal{T}_\epsilon^{(n)} \\ \tilde{x}^n \ne x^n}} \mathrm{P}\{\tilde{x}^n \in \mathcal{B}(1)\} \\
&\le 2^{n(H(X)+\delta(\epsilon))} \cdot 2^{-nR}
\end{aligned}
$$
- Hence $\mathrm{P}(\mathcal{E}_2) \to 0$ as $n \to \infty$ if $R > H(X) + \delta(\epsilon)$
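To get a feel for how quickly the final bound vanishes, here is a quick numerical evaluation of $2^{n(H(X)+\delta(\epsilon))} 2^{-nR} = 2^{-n(R - H(X) - \delta(\epsilon))}$ under assumed values of $H(X)$, $\delta$, and $R$:

```python
# Illustrative values only: H(X) = 0.5, delta = 0.05, R = 0.6,
# so the bound is 2^{-0.05 n}.
H, delta, R = 0.5, 0.05, 0.6
for n in (100, 200, 500, 1000):
    print(n, 2 ** (-n * (R - H - delta)))
```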
Achievability via linear binning

- Let $X$ be a Bern(p) source
- $R = H(X)$ can be achieved via linear binning (hashing):
  - Let $H$ be a randomly generated $nR \times n$ binary parity-check matrix
  - The encoder sends $H X^n$ (matrix multiplication over GF(2))
  - The decoder recovers $X^n$ with high probability if $R > H(p)$ (why?)
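A toy syndrome coder along these lines is sketched below. For a workable demo at tiny $n$, in-coset maximum-likelihood decoding is substituted for unique-typical decoding (it can only do better); all parameter values are illustrative.

```python
import math, random
from itertools import product

# Linear binning: the bin index of x^n is its syndrome H x^n over GF(2),
# for a random k x n parity-check matrix with k = nR (illustrative values).
p, n, R = 0.11, 14, 0.8                       # H(p) ~ 0.5 < R
k = round(n * R)
Hmat = [[random.randint(0, 1) for _ in range(n)] for _ in range(k)]

def syndrome(x):
    return tuple(sum(h * b for h, b in zip(row, x)) % 2 for row in Hmat)

prob = lambda x: math.prod(p if b else 1 - p for b in x)

# Precompute the most probable sequence in each coset (the ML decoder).
best = {}
for x in product((0, 1), repeat=n):
    s = syndrome(x)
    if s not in best or prob(x) > prob(best[s]):
        best[s] = x

trials, errors = 2000, 0
for _ in range(trials):
    x = tuple(int(random.random() < p) for _ in range(n))
    errors += best[syndrome(x)] != x
print("empirical error rate:", errors / trials)   # small, since R > H(p)
```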
Time sharing

Proposition 4.1. If $(R_1', R_2') \in \mathscr{R}^*$ and $(R_1'', R_2'') \in \mathscr{R}^*$, then
$$(R_1, R_2) = (\alpha R_1' + \bar{\alpha} R_1'',\ \alpha R_2' + \bar{\alpha} R_2'') \in \mathscr{R}^* \quad \text{for every } \alpha \in [0, 1],$$
i.e., the rate region $\mathscr{R}^*$ is convex.

Proof (time sharing argument):
- Let $\mathcal{C}_k'$ be a sequence of $(2^{kR_1'}, 2^{kR_2'}, k)$ codes with $P_{e1}^{(k)} \to 0$
- Let $\mathcal{C}_k''$ be a sequence of $(2^{kR_1''}, 2^{kR_2''}, k)$ codes with $P_{e2}^{(k)} \to 0$
- Construct a new sequence of codes by using $\mathcal{C}_{\alpha n}'$ for $i \in [1 : \alpha n]$ and $\mathcal{C}_{\bar{\alpha} n}''$ for $i \in [\alpha n + 1 : n]$
- By the union of events bound, $P_e^{(n)} \le P_{e1}^{(\alpha n)} + P_{e2}^{(\bar{\alpha} n)} \to 0$

Remarks:
- Time division is a special case of time sharing (between $(R_1, 0)$ and $(0, R_2)$)
- The rate (capacity) region of any source (channel) coding problem is convex
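As a numerical illustration, reusing the DSBS(0.01) example from earlier, time sharing between the two Slepian–Wolf corner points traces out the entire sum-rate face of the region. The corner points come from Theorem 10.1; the code is only a sketch.

```python
import math

# Time sharing between the two Slepian-Wolf corner points of DSBS(p):
# every convex combination is achievable, and all have sum rate 1 + H(p).
p = 0.01
hp = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))   # H(p) ~ 0.0808
corner1 = (1.0, hp)                                     # (H(X1), H(X2|X1))
corner2 = (hp, 1.0)                                     # (H(X1|X2), H(X2))
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    R1 = alpha * corner1[0] + (1 - alpha) * corner2[0]
    R2 = alpha * corner1[1] + (1 - alpha) * corner2[1]
    print(f"alpha = {alpha:4.2f}: (R1, R2) = ({R1:.4f}, {R2:.4f}), "
          f"R1 + R2 = {R1 + R2:.4f}")
```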
Achievability proof of the Slepian–Wolf theorem (Cover 1975)

- We show that the corner point $(H(X_1), H(X_2 \mid X_1))$ is achievable
- Achievability of the other corner point $(H(X_1 \mid X_2), H(X_2))$ follows similarly
- The rest of the region is achieved using time sharing

[Figure: the Slepian–Wolf region with the two corner points $(H(X_1), H(X_2 \mid X_1))$ and $(H(X_1 \mid X_2), H(X_2))$ marked]
Achievability of $(H(X_1), H(X_2 \mid X_1))$

Codebook generation:
- Assign a distinct index $m_1 \in [1 : 2^{nR_1}]$ to each $x_1^n \in \mathcal{T}_\epsilon^{(n)}(X_1)$, and $m_1 = 1$ otherwise. (Distinct indices exist since $|\mathcal{T}_\epsilon^{(n)}(X_1)| \le 2^{n(H(X_1) + \delta(\epsilon))}$, so any $R_1 > H(X_1) + \delta(\epsilon)$ suffices.)
- Randomly assign an index $m_2(x_2^n) \in [1 : 2^{nR_2}]$ to each $x_2^n \in \mathcal{X}_2^n$. The sequences with the same index $m_2$ form a bin $\mathcal{B}(m_2)$.

[Figure: the jointly typical set $\mathcal{T}_\epsilon^{(n)}(X_1, X_2)$ depicted over the grid of indices $m_1 \in [1 : 2^{nR_1}]$ and bins $\mathcal{B}(1), \ldots, \mathcal{B}(2^{nR_2})$]

Encoding:
- Upon observing $x_1^n$, encoder 1 sends the index $m_1(x_1^n)$
- Upon observing $x_2^n \in \mathcal{B}(m_2)$, encoder 2 sends the bin index $m_2$
Decoding (the sources are recovered successively):
- Declare $\hat{x}_1^n = x_1^n(m_1)$ for the unique $x_1^n(m_1) \in \mathcal{T}_\epsilon^{(n)}(X_1)$
- Find the unique $\hat{x}_2^n \in \mathcal{B}(m_2) \cap \mathcal{T}_\epsilon^{(n)}(X_2 \mid \hat{x}_1^n)$
- If there is none or more than one, the decoder declares an error
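Before the error analysis, here is a toy simulation of the scheme just described, specialized to the DSBS, where joint typicality of $(x_1^n, \tilde{x}_2^n)$ reduces to a Hamming-distance test. Since $X_1 \sim \mathrm{Bern}(1/2)$, encoder 1 simply sends $x_1^n$ verbatim (rate $H(X_1) = 1$). All parameter values are illustrative; at this tiny $n$ the rate $R_2$ needs slack above $H(X_2 \mid X_1) = H(p)$.

```python
import math, random
from itertools import product, combinations

# Slepian-Wolf corner point (H(X1), H(X2|X1)) for DSBS(p), toy version.
p, n, R2, delta = 0.05, 14, 0.8, 0.1            # H(X2|X1) = H(0.05) ~ 0.29
t = math.floor(n * (p + delta))                 # allowed disagreements
bins = {x: random.randrange(2 ** round(n * R2))
        for x in product((0, 1), repeat=n)}     # encoder 2's random binning

def ball(x, radius):
    """All binary sequences within Hamming distance `radius` of x."""
    for r in range(radius + 1):
        for pos in combinations(range(n), r):
            y = list(x)
            for i in pos:
                y[i] ^= 1
            yield tuple(y)

trials, errors = 2000, 0
for _ in range(trials):
    x1 = tuple(random.randrange(2) for _ in range(n))
    x2 = tuple(b ^ int(random.random() < p) for b in x1)   # x2 = x1 xor Z
    # Decoder: recover x1 directly, then look for the unique sequence in
    # bin m2 that disagrees with x1 in at most t places.
    cands = [y for y in ball(x1, t) if bins[y] == bins[x2]]
    errors += cands != [x2]
print("empirical P_e ~", errors / trials)       # small, and -> 0 as n grows
```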
Analysis of the probability of error

- Let $M_1$ and $M_2$ denote the random bin indices for $X_1^n$ and $X_2^n$
- Error events:
  $$\mathcal{E}_1 = \{(X_1^n, X_2^n) \notin \mathcal{T}_\epsilon^{(n)}\}, \qquad \mathcal{E}_2 = \{\tilde{x}_2^n \in \mathcal{B}(M_2) \text{ for some } \tilde{x}_2^n \ne X_2^n,\ (X_1^n, \tilde{x}_2^n) \in \mathcal{T}_\epsilon^{(n)}\}$$
- Then $\mathrm{P}(\mathcal{E}) \le \mathrm{P}(\mathcal{E}_1) + \mathrm{P}(\mathcal{E}_2)$, and $\mathrm{P}(\mathcal{E}_1) \to 0$ by the LLN
- By symmetry,
$$
\begin{aligned}
\mathrm{P}(\mathcal{E}_2) &= \mathrm{P}(\mathcal{E}_2 \mid X_2^n \in \mathcal{B}(1)) \\
&= \sum_{(x_1^n, x_2^n)} \mathrm{P}\{(X_1^n, X_2^n) = (x_1^n, x_2^n) \mid X_2^n \in \mathcal{B}(1)\} \\
&\qquad\quad \cdot \mathrm{P}\{\tilde{x}_2^n \in \mathcal{B}(1) \text{ for some } \tilde{x}_2^n \ne x_2^n,\ (x_1^n, \tilde{x}_2^n) \in \mathcal{T}_\epsilon^{(n)} \mid x_2^n \in \mathcal{B}(1),\, (X_1^n, X_2^n) = (x_1^n, x_2^n)\} \\
&\le \sum_{(x_1^n, x_2^n)} p(x_1^n, x_2^n) \sum_{\substack{\tilde{x}_2^n \in \mathcal{T}_\epsilon^{(n)}(X_2 \mid x_1^n) \\ \tilde{x}_2^n \ne x_2^n}} \mathrm{P}\{\tilde{x}_2^n \in \mathcal{B}(1)\} \\
&\le 2^{n(H(X_2 \mid X_1) + \delta(\epsilon))} \cdot 2^{-nR_2}
\end{aligned}
$$
- Hence $\mathrm{P}(\mathcal{E}_2) \to 0$ as $n \to \infty$ if $R_2 > H(X_2 \mid X_1) + \delta(\epsilon)$


Extension to more than two sources

Theorem 10.3. The optimal rate region $\mathscr{R}^*(X_1, \ldots, X_k)$ for the $k$-DMS $(X_1, \ldots, X_k)$ is the set of rate tuples $(R_1, \ldots, R_k)$ such that
$$\sum_{j \in S} R_j \ge H(X(S) \mid X(S^c)) \quad \text{for all } S \subseteq [1 : k].$$

For $k = 3$, $\mathscr{R}^*(X_1, X_2, X_3)$ is the set of $(R_1, R_2, R_3)$ such that
$$
\begin{aligned}
R_1 &\ge H(X_1 \mid X_2, X_3), & R_1 + R_2 &\ge H(X_1, X_2 \mid X_3), \\
R_2 &\ge H(X_2 \mid X_1, X_3), & R_1 + R_3 &\ge H(X_1, X_3 \mid X_2), \\
R_3 &\ge H(X_3 \mid X_1, X_2), & R_2 + R_3 &\ge H(X_2, X_3 \mid X_1), \\
& & R_1 + R_2 + R_3 &\ge H(X_1, X_2, X_3).
\end{aligned}
$$
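The region in Theorem 10.3 is straightforward to check numerically for small $k$, since $H(X(S) \mid X(S^c)) = H(X_1, \ldots, X_k) - H(X(S^c))$. The helper below (hypothetical names, illustrative pmf) verifies all $2^k - 1$ inequalities for $k = 3$.

```python
import math
from itertools import product

def H(pmf):
    """Entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(q * math.log2(q) for q in pmf.values() if q > 0)

def marginal(pmf, coords):
    out = {}
    for x, q in pmf.items():
        key = tuple(x[i] for i in coords)
        out[key] = out.get(key, 0) + q
    return out

def in_sw_region(pmf, rates, k=3):
    """Check sum_{j in S} R_j >= H(X(S) | X(S^c)) for every nonempty S."""
    total = H(pmf)
    for mask in range(1, 2 ** k):
        S = [j for j in range(k) if mask >> j & 1]
        Sc = [j for j in range(k) if j not in S]
        # H(X(S) | X(S^c)) = H(X_1, ..., X_k) - H(X(S^c))
        if sum(rates[j] for j in S) < total - H(marginal(pmf, Sc)):
            return False
    return True

# Illustrative 3-DMS: x3 = x1 xor x2 with (x1, x2) independent fair bits,
# so H(X1, X2, X3) = 2 and H(X3 | X1, X2) = 0.
pmf = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
print(in_sw_region(pmf, (1.0, 1.0, 0.1)))   # True
print(in_sw_region(pmf, (0.5, 0.5, 0.5)))   # False: sum rate 1.5 < 2
```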
Summary

- $k$-component discrete memoryless source ($k$-DMS)
- Distributed lossless source coding for a $k$-DMS: Slepian–Wolf optimal rate region
- Random binning
- Time sharing

References

- Cover, T. M. (1975). A proof of the data compression theorem of Slepian and Wolf for ergodic sources. IEEE Trans. Inf. Theory, 21(2), 226–228.
- Slepian, D. and Wolf, J. K. (1973). Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory, 19(4), 471–480.