UCSD ECE 255C Handout #14
Prof. Young-Han Kim, Thursday, March 9, 2017

Solutions to Homework Set #5

3.18 Bounds on the quadratic rate distortion function. Recall that
\[
R(D) = \inf_{F(\hat{x}|x):\, \mathrm{E}[(X-\hat{X})^2] \le D} I(X;\hat{X}).
\]
Let $\hat{X} = (1 - D/P)(X + Z)$, where $Z \sim \mathrm{N}(0, PD/(P-D))$ is independent of $X$. It is straightforward to check that for this choice of $F(\hat{x}|x)$, $\mathrm{E}[(X-\hat{X})^2] = D$, and therefore $I(X;\hat{X})$ is an upper bound on $R(D)$. But
\begin{align*}
I(X;\hat{X}) &= h(\hat{X}) - h(\hat{X}\mid X)\\
&= h(\hat{X}) - \tfrac{1}{2}\log\Bigl(2\pi e\,(1-D/P)^2 \tfrac{PD}{P-D}\Bigr)\\
&\le \tfrac{1}{2}\log\bigl(2\pi e\, \mathrm{E}[\hat{X}^2]\bigr) - \tfrac{1}{2}\log\Bigl(2\pi e\,(1-D/P)^2 \tfrac{PD}{P-D}\Bigr)\\
&\le \tfrac{1}{2}\log\Bigl(2\pi e\,(1-D/P)^2 \Bigl(P + \tfrac{PD}{P-D}\Bigr)\Bigr) - \tfrac{1}{2}\log\Bigl(2\pi e\,(1-D/P)^2 \tfrac{PD}{P-D}\Bigr)\\
&= \tfrac{1}{2}\log\frac{P}{D}.
\end{align*}
Thus, $R(D) \le \tfrac{1}{2}\log(P/D)$. (Note that by the discretization method, any rate $R > \tfrac{1}{2}\log(P/D)$ is achievable.)

For the lower bound, consider
\[
R(D) = \min\,\bigl(h(X) - h(X\mid\hat{X})\bigr) = h(X) - \max h(X\mid\hat{X}) = h(X) - \max h(X-\hat{X}\mid\hat{X}) \ge h(X) - \max h(X-\hat{X}) \ge h(X) - \tfrac{1}{2}\log(2\pi e D),
\]
where the optimizations are over $F(\hat{x}|x)$ with $\mathrm{E}[(X-\hat{X})^2] \le D$, and the last inequality holds since the Gaussian distribution maximizes differential entropy under a second-moment constraint. Since the upper bound is equal to the rate-distortion function of the Gaussian source of power $P$, this suggests that among all sources of the same power, the Gaussian source is the hardest to compress.
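As a quick supplementary check (not part of the original solution), the following sketch simulates the test channel above for a Gaussian source, an arbitrary choice made only to have concrete samples; it estimates $\mathrm{E}[(X-\hat{X})^2]$ by Monte Carlo and prints the corresponding bound $\tfrac12\log(P/D)$:

# Minimal numerical check of the test channel in Problem 3.18 (illustrative only).
# Assumption: X ~ N(0, P) is chosen just to have a concrete source; the identity
# E[(X - Xhat)^2] = D holds for any source with E[X^2] = P.
import numpy as np

rng = np.random.default_rng(0)
P, D, n = 4.0, 1.0, 1_000_000                       # source power, target distortion, samples

X = rng.normal(0.0, np.sqrt(P), n)                  # source samples with E[X^2] = P
Z = rng.normal(0.0, np.sqrt(P * D / (P - D)), n)    # test-channel noise
Xhat = (1 - D / P) * (X + Z)                        # reconstruction via the test channel

mse = np.mean((X - Xhat) ** 2)
rate_upper_bound = 0.5 * np.log2(P / D)             # (1/2) log(P/D), in bits

print(f"empirical E[(X - Xhat)^2] = {mse:.4f}  (target D = {D})")
print(f"upper bound (1/2) log2(P/D) = {rate_upper_bound:.4f} bits")

The empirical distortion matches $D$, as the algebraic check $\mathrm{E}[(X-\hat{X})^2] = (D/P)^2 P + (1-D/P)^2 \tfrac{PD}{P-D} = D$ confirms.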
3.19 Lossy source coding from a noisy observation. Since the encoder has access only to $Y^n$, the distortion becomes $\mathrm{E}[d(X^n, \hat{x}^n(m(Y^n)))]$. This distortion can be rewritten as follows:
\begin{align*}
\mathrm{E}[d(X^n,\hat{x}^n(m(Y^n)))] &= \frac{1}{n}\sum_{i=1}^{n} \mathrm{E}_{X_i,Y^n}[d(X_i,\hat{x}_i(m(Y^n)))]\\
&\overset{(a)}{=} \sum_{y^n} p(y^n)\, \frac{1}{n}\sum_{i=1}^{n} \mathrm{E}[d(X_i,\hat{x}_i(m(y^n))) \mid Y^n = y^n]\\
&\overset{(b)}{=} \sum_{y^n} p(y^n)\, \frac{1}{n}\sum_{i=1}^{n} \mathrm{E}[d(X_i,\hat{x}_i(m(y^n))) \mid Y_i = y_i]\\
&\overset{(c)}{=} \sum_{y^n} p(y^n)\, \frac{1}{n}\sum_{i=1}^{n} d'(y_i,\hat{x}_i(m(y^n)))\\
&= \frac{1}{n}\sum_{i=1}^{n} \mathrm{E}[d'(Y_i,\hat{x}_i(m(Y^n)))] = \mathrm{E}[d'(Y^n,\hat{x}^n(m(Y^n)))],
\end{align*}
where (a) follows from the law of total expectation; (b) follows from the fact that $\hat{x}_i(m(y^n))$ is a function of $y^n$ and that the pairs $(X_i,Y_i)$ are i.i.d., so $X_i$ is conditionally independent of the rest of $Y^n$ given $Y_i$; and (c) follows from the definition $d'(y,\hat{x}) = \mathrm{E}[d(X,\hat{x}) \mid Y = y]$ in the hint. Hence, this problem is equivalent to lossy source coding for the DMS $Y$ with the new distortion measure $d'$. Therefore, the rate-distortion function is
\[
R(D) = \min_{p(\hat{x}|y):\, \mathrm{E}[d'(Y,\hat{X})] \le D} I(Y;\hat{X})
= \min_{p(\hat{x}|y):\, \mathrm{E}_Y[\mathrm{E}[d(X,\hat{X})\mid Y]] \le D} I(Y;\hat{X})
= \min_{p(\hat{x}|y):\, \mathrm{E}[d(X,\hat{X})] \le D} I(Y;\hat{X}).
\]

11.6 Side information with occasional erasures. We show that
\[
R_{\text{SI-D}}(D) = R_{\text{SI-ED}}(D) = p\,R(D/p) =
\begin{cases}
p\bigl(1 - H(D/p)\bigr) & 0 \le D/p \le 1/2,\\
0 & \text{otherwise},
\end{cases}
\]
where $R(D)$ is the rate-distortion function without side information. For the proof of the converse, consider
\[
R_{\text{SI-ED}}(D) = \min_{p(\hat{x}|x,y):\, \mathrm{E}[d(X,\hat{X})] \le D} I(X;\hat{X}\mid Y),
\]
and note that any $p(\hat{x}|x,y)$ satisfying the distortion constraint must also satisfy
\[
D \ge \mathrm{E}[d(X,\hat{X})] \ge p\,\mathrm{E}[d(X,\hat{X})\mid Y = e].
\]
Hence
\[
I(X;\hat{X}\mid Y) = p\,I(X;\hat{X}\mid Y = e) \overset{(a)}{\ge} p\,R(D/p),
\]
where the equality holds since $X$ is determined by $Y$ whenever $Y \ne e$, and (a) follows by noting that $X \mid \{Y = e\} \sim p(x)$ and that $p_{\hat{X}|X,Y}(\hat{x}|x,e)$ satisfies $\mathrm{E}[d(X,\hat{X}) \mid Y = e] \le D/p$, and concluding that $I(X;\hat{X}\mid Y = e) \ge R(D/p)$. Thus, $R_{\text{SI-ED}}(D) \ge p\,R(D/p)$.

For achievability, consider
\[
R_{\text{SI-D}}(D) = \min_{p(u|x),\,\hat{x}(u,y):\, \mathrm{E}[d(X,\hat{X})] \le D} I(X;U\mid Y),
\]
and set $p(u|x)$ to attain the minimum in
\[
R(D/p) = \min_{p(u|x):\, \mathrm{E}[d(X,U)] \le D/p} I(X;U)
\]
and $\hat{x}(u,y)$ to satisfy
\[
\hat{X} = \begin{cases} U & Y = e,\\ Y & \text{otherwise}.\end{cases}
\]
Then
\[
I(X;U\mid Y) = p\,I(X;U\mid Y = e) = p\,R(D/p)
\]
and
\[
\mathrm{E}[d(X,\hat{X})] = p\,\mathrm{E}[d(X,\hat{X})\mid Y = e] + (1-p)\,\mathrm{E}[d(X,\hat{X})\mid Y = X] \le p\cdot\frac{D}{p} + 0 = D.
\]
Thus, $R_{\text{SI-D}}(D) \le p\,R(D/p)$. Note that this argument, due to Perron, Diggavi, and Telatar (2007), is applicable to any DMS $X \sim p(x)$ and the erasure side information
\[
Y = \begin{cases} X & \text{w.p. } 1-p,\\ e & \text{w.p. } p.\end{cases}
\]
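As a supplementary numerical check (not in the original handout), the sketch below evaluates $p\,R(D/p) = p(1 - H(D/p))$ for the Bern(1/2) source with Hamming distortion and compares it with the no-side-information rate $R(D) = 1 - H(D)$; the erasure probability and distortion values are arbitrary illustrative choices:

# Evaluate R_SI(D) = p*(1 - H2(D/p)) for erased side information versus
# R(D) = 1 - H2(D) without side information (Bern(1/2) source, Hamming distortion).
import numpy as np

def H2(q):
    """Binary entropy in bits, with the convention 0*log(0) = 0."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

p = 0.3                                  # erasure probability (illustrative value)
for D in (0.01, 0.05, 0.10, 0.14):       # distortions with D/p <= 1/2
    r_si = p * (1 - H2(D / p))           # rate with erased side information
    r_no = 1 - H2(D)                     # rate without side information
    print(f"D = {D:.2f}:  p*R(D/p) = {r_si:.3f} bits,  R(D) = {r_no:.3f} bits")

As expected, the rate with erased side information never exceeds the rate without side information.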
15.7 Triangular cyclic network.

(a) The cutset bound is the set of rate triples $(R_1,R_2,R_3)$ such that $R_1 \le 1$, $R_2 \le 1$, and $R_3 \le 1$.

(b) Achievability follows simply by routing. For the proof of the converse, consider any sequence of codes with $\lim_{n\to\infty} P_e^{(n)} = 0$. Since, by assumption, $M_3$ can be recovered from $M_{12}$ and $M_2$ with high probability at node 2, $M_3$ can also be recovered at node 1 with high probability. By applying similar arguments to the other nodes, it can be shown that the capacity region is the same as that of the new network depicted in Figure 1. Now the cutset bound for the new network is the set of rate triples $(R_1,R_2,R_3)$ such that
\[
R_1 + R_2 \le 1, \qquad R_2 + R_3 \le 1, \qquad R_3 + R_1 \le 1.
\]
This completes the proof of the converse.

Figure 1: Equivalent triangular cyclic network.

16.13 Properties of the Gaussian relay channel capacity. The first property follows from the fact that $C(P) \ge \tfrac{1}{2}\log(1 + g_{31}^2 P)$, the capacity of the direct channel, which is strictly greater than zero for $P > 0$ and tends to infinity as $P \to \infty$. The second property follows from the fact that the cutset bound, which is greater than or equal to $C(P)$, tends to zero as $P \to 0$. To prove the third property (concavity), we use the following time-sharing argument. For any blocklength $n$, let $k = \alpha n$ and $k' = n - k$. For any $P, P' > 0$ and $\epsilon > 0$, there exists $n$ such that $C(P) < C^{(k)}(P) + \epsilon$ and $C(P') < C^{(k')}(P') + \epsilon$. Consider the input distribution $F(x_1^k)$ and relay encoding functions $\{x_{2i}\}_{i=1}^{k}$ that satisfy the power constraint $P$ and attain $C^{(k)}(P)$, and $F'(x_1^{k'})$ and $\{x'_{2i}\}_{i=1}^{k'}$ that satisfy the power constraint $P'$ and attain $C^{(k')}(P')$. Now the distribution $F(x_1^k)\,F'(x_{1,k+1}^{n})$ and relay functions $(\{x_{2i}\}_{i=1}^{k}, \{x'_{2i}\}_{i=k+1}^{n})$ satisfy the power constraint $\alpha P + \bar{\alpha} P'$ and attain $\alpha C^{(k)}(P) + \bar{\alpha} C^{(k')}(P')$. Hence
\[
\alpha C(P) + \bar{\alpha} C(P') \le \alpha C^{(k)}(P) + \bar{\alpha} C^{(k')}(P') + \epsilon \le C^{(n)}(\alpha P + \bar{\alpha} P') + \epsilon \le C(\alpha P + \bar{\alpha} P') + \epsilon.
\]
Taking $\epsilon \to 0$ establishes the concavity. Note that $C^{(k)}(P)$ may not be concave for fixed $k$. Finally, that $C(P)$ is strictly monotonically increasing in $P$ follows from the first two properties and concavity. This completes the proofs of the properties of $C(P)$.
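The shape of the inequality proved above can be visualized with a simple stand-in (not part of the original solution): the direct-transmission rate $R_{\mathrm{dt}}(P) = \tfrac12\log(1 + g_{31}^2 P)$ is a concrete achievable rate, and the sketch below checks the same time-sharing inequality for it numerically; the gain and power values are arbitrary illustrative choices, and the true capacity $C(P)$ is what the argument above actually concerns:

# Numerical illustration of the time-sharing inequality, using the direct-transmission
# rate R_dt(P) = (1/2) log2(1 + g31^2 * P) as a concrete achievable-rate curve.
# The concavity proof above establishes the same inequality for the capacity C(P) itself.
import numpy as np

g31 = 1.5                                   # illustrative direct-channel gain

def R_dt(P):
    """Direct-transmission rate (1/2) log2(1 + g31^2 P), in bits per transmission."""
    return 0.5 * np.log2(1 + g31**2 * P)

P, Pp = 2.0, 10.0
for alpha in (0.25, 0.5, 0.75):
    lhs = alpha * R_dt(P) + (1 - alpha) * R_dt(Pp)      # time-sharing rate
    rhs = R_dt(alpha * P + (1 - alpha) * Pp)            # rate at the averaged power
    print(f"alpha = {alpha:.2f}:  {lhs:.4f} <= {rhs:.4f}  ({lhs <= rhs})")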
17.11 Directed information.

(a) Consider
\[
p(y^n, x^n) = \prod_{i=1}^{n} p(y_i, x_i \mid y^{i-1}, x^{i-1})
= \prod_{i=1}^{n} p(x_i \mid y^{i-1}, x^{i-1})\, p(y_i \mid y^{i-1}, x^{i})
= p(x^n \,\|\, y^{n-1})\, p(y^n \,\|\, x^n).
\]

(b) Since
\[
I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i; Y_i \mid Y^{i-1})
\]
and each term in the summation is nonnegative, so is $I(X^n \to Y^n)$. Moreover, it is equal to zero iff each term is equal to zero, or equivalently, $p(y_i \mid y^{i-1}, x^i) = p(y_i \mid y^{i-1})$ for all $i \in [1:n]$. In the causal conditioning notation, this can be rewritten as $p(y^n \,\|\, x^n) = p(y^n)$.
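The factorization in part (a) and the nonnegativity in part (b) can be verified numerically for a toy case (this sketch is not part of the original solution): for $n = 2$ and binary alphabets, we draw an arbitrary joint pmf, form the causal-conditioning factors from its conditionals, and evaluate $I(X^2 \to Y^2) = I(X_1;Y_1) + I(X_1,X_2;Y_2 \mid Y_1)$; the helper routines below are ad hoc and written only for this small example:

# Verify, for n = 2 and binary alphabets, the causal factorization
#   p(x^2, y^2) = p(x^2 || y^1) p(y^2 || x^2)
# and the nonnegativity of I(X^2 -> Y^2) = I(X1;Y1) + I(X1,X2;Y2|Y1).
import numpy as np

rng = np.random.default_rng(1)
p = rng.random((2, 2, 2, 2))          # joint pmf over (X1, X2, Y1, Y2); axes 0,1,2,3
p /= p.sum()

def marg_prob(joint, axes, idx):
    """P{variables on `axes` take the values given by the full index `idx`}."""
    return sum(joint[full] for full in np.ndindex(joint.shape)
               if all(full[a] == idx[a] for a in axes))

def cond_prob(joint, target, given, idx):
    """P{target = idx[target] | given = idx[given]}."""
    den = marg_prob(joint, given, idx)
    return marg_prob(joint, target + given, idx) / den if den > 0 else 0.0

def cond_mi(joint, A, B, C):
    """I(A;B|C) in bits, where A, B, C are lists of axis indices of `joint`."""
    mi = 0.0
    for idx in np.ndindex(joint.shape):
        pr = joint[idx]
        if pr == 0:
            continue
        num = marg_prob(joint, A + B + C, idx) * marg_prob(joint, C, idx)
        den = marg_prob(joint, A + C, idx) * marg_prob(joint, B + C, idx)
        mi += pr * np.log2(num / den)
    return mi

# Part (a): p(x^2 || y^1) = p(x1) p(x2|x1,y1) and p(y^2 || x^2) = p(y1|x1) p(y2|x1,x2,y1).
err = max(abs(p[idx]
              - cond_prob(p, [0], [], idx) * cond_prob(p, [1], [0, 2], idx)
              * cond_prob(p, [2], [0], idx) * cond_prob(p, [3], [0, 1, 2], idx))
          for idx in np.ndindex(p.shape))
print(f"max factorization error: {err:.2e}")

# Part (b): directed information is a sum of conditional mutual informations, hence >= 0.
di = cond_mi(p, [0], [2], []) + cond_mi(p, [0, 1], [3], [2])
print(f"I(X^2 -> Y^2) = {di:.4f} bits (nonnegative)")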
(c) Consider
\begin{align*}
I(X^n;Y^n) &= \mathrm{E}\biggl[\log\frac{p(Y^n, X^n)}{p(Y^n)\,p(X^n)}\biggr]\\
&= \mathrm{E}\biggl[\log\frac{p(Y^n \,\|\, X^n)\, p(X^n \,\|\, Y^{n-1})}{p(Y^n)\,p(X^n)}\biggr]\\
&= \mathrm{E}\biggl[\log\frac{p(Y^n \,\|\, X^n)}{p(Y^n)}\biggr] + \mathrm{E}\biggl[\log\frac{p(X^n \,\|\, Y^{n-1})}{p(X^n)}\biggr]\\
&= I(X^n \to Y^n) + I((Y_0, Y^{n-1}) \to X^n),
\end{align*}
where the second term is the directed information from the delayed output sequence $(Y_0, Y^{n-1})$, with $Y_0$ a constant, to $X^n$.

(d) The inequality $I(X^n \to Y^n) \le I(X^n;Y^n)$ follows immediately from part (c). By an argument similar to part (b), equality holds iff $p(x^n \,\|\, y^{n-1}) = p(x^n)$.

17.14 Gaussian two-way channel. Assume without loss of generality that the noise covariance matrix is
\[
K = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}.
\]
By the maximum differential entropy lemma, the outer bound on the capacity region in Proposition 17.3 simplifies to
\begin{align*}
R_1 &\le I(X_1;Y_2\mid X_2) \le \tfrac{1}{2}\log\bigl(1 + g_{12}^2 \operatorname{Var}(X_1\mid X_2)\bigr) \le \tfrac{1}{2}\log(1 + g_{12}^2 P),\\
R_2 &\le I(X_2;Y_1\mid X_1) \le \tfrac{1}{2}\log\bigl(1 + g_{21}^2 \operatorname{Var}(X_2\mid X_1)\bigr) \le \tfrac{1}{2}\log(1 + g_{21}^2 P), \qquad (1)
\end{align*}
which is attained with equality by $X_1 \sim \mathrm{N}(0,P)$ and $X_2 \sim \mathrm{N}(0,P)$, independent of each other. By setting the same Gaussian $(X_1,X_2)$ and $Q = \emptyset$ in the inner bound in Proposition 17.2 and applying the standard discretization argument, the rectangular rate region in (1) is achievable. Note that the point-to-point capacities are achieved and that the noise correlation $\rho$ is irrelevant.
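As a small numerical complement (not in the original solution), the sketch below evaluates the conditional-variance step above: for jointly Gaussian inputs of power $P$ with correlation coefficient $r$, $\operatorname{Var}(X_1\mid X_2) = (1-r^2)P$, so the bound is largest at $r = 0$, where it coincides with the point-to-point capacity; the gain and power are arbitrary illustrative values:

# Evaluate (1/2) log2(1 + g12^2 * Var(X1|X2)) for jointly Gaussian inputs with
# correlation coefficient r, where Var(X1|X2) = (1 - r^2) * P.  Illustrative parameters only.
import numpy as np

g12, P = 1.0, 5.0
for r in (0.0, 0.3, 0.6, 0.9):
    bound = 0.5 * np.log2(1 + g12**2 * (1 - r**2) * P)
    print(f"r = {r:.1f}:  (1/2) log2(1 + g12^2 (1-r^2) P) = {bound:.4f} bits")
print(f"point-to-point capacity (1/2) log2(1 + g12^2 P) = {0.5*np.log2(1 + g12**2*P):.4f} bits")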
17.15 Common-message feedback capacity of broadcast channels. The common-message feedback capacity is
\[
C_{\mathrm{F}} = \max_{p(x)} \min\{I(X;Y_1),\, I(X;Y_2)\}.
\]
Achievability follows from the corresponding result for the case without feedback. For the converse, consider
\begin{align*}
nR = H(M) &\le I(M;Y_1^n) + n\epsilon_n\\
&= \sum_{i=1}^{n} I(M;Y_{1i}\mid Y_1^{i-1}) + n\epsilon_n\\
&\le \sum_{i=1}^{n} I(M, Y_1^{i-1}, Y_2^{i-1}; Y_{1i}) + n\epsilon_n\\
&\overset{(a)}{=} \sum_{i=1}^{n} I(M, Y_1^{i-1}, Y_2^{i-1}, X_i; Y_{1i}) + n\epsilon_n\\
&\overset{(b)}{=} \sum_{i=1}^{n} I(X_i;Y_{1i}) + n\epsilon_n,
\end{align*}
where (a) follows since $X_i$ is a function of $(M, Y_1^{i-1}, Y_2^{i-1})$ and (b) follows since $I(M, Y_1^{i-1}, Y_2^{i-1}; Y_{1i}\mid X_i) = 0$ for all $i$. Following the usual procedure of introducing a time-sharing random variable, we have shown that $R \le I(X;Y_1)$. Similarly, $R \le I(X;Y_2)$. Thus, $R \le \min\{I(X;Y_1), I(X;Y_2)\}$ for some pmf $p(x)$.

18.7 Broadcasting over a diamond network. The capacity is
\[
C = \max_{p(x_1),\, p(x_2,x_3)} \min\{I(X_1;Y_2),\, I(X_1;Y_3),\, I(X_2,X_3;Y_4)\}.
\]
For achievability, consider a two-hop relaying scheme in which node 1 communicates to nodes 2 and 3 using common-message broadcasting at rate $R_1 < \min\{I(X_1;Y_2), I(X_1;Y_3)\}$, and nodes 2 and 3, in turn, communicate to node 4 using cooperative multiple access at rate $R_2 < I(X_2,X_3;Y_4)$. The minimum of $R_1$ and $R_2$ is achievable by operating the two hops over multiple blocks. For the proof of the converse, consider the cutset bound
\[
C \le \max_{p(x_1,x_2,x_3)} \min_{S:\, 1 \in S,\, S \subsetneq \{1,2,3,4\}} I(X(S); Y(S^c) \mid X(S^c))
\]
with $S = \{1,3,4\}$, $S = \{1,2,4\}$, and $S = \{1,2,3\}$. The corresponding mutual information terms can be upper bounded as
\begin{align*}
C &\le I(X_1,X_3;Y_2\mid X_2) \le I(X_1,X_2,X_3;Y_2) = I(X_1;Y_2),\\
C &\le I(X_1,X_2;Y_3\mid X_3) \le I(X_1,X_2,X_3;Y_3) = I(X_1;Y_3),\\
C &\le I(X_1,X_2,X_3;Y_4) = I(X_2,X_3;Y_4).
\end{align*}
Noting that these bounds depend on $p(x_1,x_2,x_3)$ only through the marginals $p(x_1)$ and $p(x_2,x_3)$ completes the proof.
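To make the two-hop achievability concrete, the following sketch (not part of the original solution) evaluates the two hops for a hypothetical symbol-by-symbol model: the broadcast hop consists of two independent BSCs from $X_1$ to $Y_2$ and $Y_3$, and the cooperative MAC hop is $Y_4 = X_2 \oplus X_3$ observed through a BSC; all crossover probabilities are arbitrary illustrative values. With uniform $X_1$ and $X_2 \oplus X_3$ uniform, each mutual information term evaluates to $1 - H(\cdot)$:

# Achievable rate of the two-hop scheme for a hypothetical diamond network:
# broadcast hop X1 -> (Y2, Y3) through independent BSC(p2), BSC(p3);
# cooperative MAC hop Y4 = (X2 XOR X3) through BSC(p4).  With uniform X1 and
# X2 XOR X3 uniform, the three mutual informations below equal 1 - H2(p).
import numpy as np

def H2(q):
    """Binary entropy in bits (0*log 0 := 0)."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

p2, p3, p4 = 0.05, 0.10, 0.02            # illustrative crossover probabilities
I_12 = 1 - H2(p2)                        # I(X1;Y2) with uniform X1
I_13 = 1 - H2(p3)                        # I(X1;Y3) with uniform X1
I_234 = 1 - H2(p4)                       # I(X2,X3;Y4) with X2 XOR X3 uniform

print(f"I(X1;Y2) = {I_12:.3f}, I(X1;Y3) = {I_13:.3f}, I(X2,X3;Y4) = {I_234:.3f}")
print(f"two-hop achievable rate = min of the above = {min(I_12, I_13, I_234):.3f} bits/use")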