
CHAPTER 3

Problem 3.1:

(a) The mutual information between the output letter B_j and the input letter A_i is

I(B_j; A_i) = \log \frac{P(B_j | A_i)}{P(B_j)} = \log \frac{P(B_j, A_i)}{P(B_j) P(A_i)}

The marginal probabilities are obtained from the joint probabilities:

P(B_j) = \sum_{i=1}^{4} P(B_j, A_i):  P(B_1) = .3,  P(B_2) = .7,  P(B_3) = .4

P(A_i) = \sum_{j=1}^{3} P(B_j, A_i):  P(A_1) = .3,  P(A_2) = .7,  P(A_3) = .3,  P(A_4) = .

Substituting the joint and marginal probabilities gives the twelve pairwise values:

I(B_1; A_1) = +.57 bits,   I(B_1; A_2) = -.76 bits,   I(B_1; A_3) = -.943 bits,   I(B_1; A_4) = +.757 bits
I(B_2; A_1) = -.65 bits,   I(B_2; A_2) = -.64 bits,   I(B_2; A_3) = +.5 bits,     I(B_2; A_4) = -.53 bits
I(B_3; A_1) = -. bits,     I(B_3; A_2) = +.334 bits,  I(B_3; A_3) = +.5 bits,     I(B_3; A_4) = -.556 bits

(b) The average mutual information is

I(B; A) = \sum_{j=1}^{3} \sum_{i=1}^{4} P(A_i, B_j) I(B_j; A_i) = .677 bits

Problem 3.2:

H(B) = -\sum_{j=1}^{3} P(B_j) \log P(B_j) = -[.3 \log .3 + .7 \log .7 + .4 \log .4] = .56 bits/letter

Problem 3.3:

Let f(u) = u - 1 - \ln u. The first and second derivatives of f(u) are

\frac{df}{du} = 1 - \frac{1}{u},   \frac{d^2 f}{du^2} = \frac{1}{u^2} > 0,   u > 0

Hence the function achieves its minimum where df/du = 0, i.e. at u = 1, and the minimum value is f(1) = 0, so \ln u = u - 1 at u = 1. For all other values of u > 0 (0 < u < 1 or u > 1) we have f(u) > 0, that is, u - 1 > \ln u.

Problem 3.4:

We will show that I(X; Y) ≥ 0. Write

-I(X; Y) = -\sum_i \sum_j P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} = \frac{1}{\ln 2} \sum_i \sum_j P(x_i, y_j) \ln \frac{P(x_i) P(y_j)}{P(x_i, y_j)}

We use the inequality \ln u ≤ u - 1 and need only consider those terms for which P(x_i, y_j) > 0. Applying the inequality to each term of -I(X; Y):

-I(X; Y) ≤ \frac{1}{\ln 2} \sum_i \sum_j P(x_i, y_j) \left[ \frac{P(x_i) P(y_j)}{P(x_i, y_j)} - 1 \right] = \frac{1}{\ln 2} \sum_i \sum_j \left[ P(x_i) P(y_j) - P(x_i, y_j) \right] ≤ 0

The first inequality becomes an equality if and only if P(x_i) P(y_j) / P(x_i, y_j) = 1, i.e. P(x_i) P(y_j) = P(x_i, y_j), whenever P(x_i, y_j) > 0. Also, since the summation \sum_i \sum_j [P(x_i) P(y_j) - P(x_i, y_j)] contains only the terms for which P(x_i, y_j) > 0, it equals zero if and only if P(x_i) P(y_j) = 0 whenever P(x_i, y_j) = 0. Therefore both inequalities become equalities, and hence I(X; Y) = 0, if and only if X and Y are statistically independent.
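As a quick numerical companion to Problems 3.1 and 3.4, the short Python sketch below computes I(X;Y) from a joint PMF and confirms that it is non-negative and vanishes for a product distribution. The joint PMFs and the helper name mutual_information_bits are illustrative only; they are not taken from the problem statement.

import numpy as np

def mutual_information_bits(P):
    # I(X;Y) in bits for a joint PMF given as a 2-D array (rows: x, columns: y).
    P = np.asarray(P, dtype=float)
    px = P.sum(axis=1, keepdims=True)          # marginal of X
    py = P.sum(axis=0, keepdims=True)          # marginal of Y
    mask = P > 0                               # only terms with P(x,y) > 0 contribute
    return float(np.sum(P[mask] * np.log2(P[mask] / (px @ py)[mask])))

P_dep = np.array([[0.30, 0.05],
                  [0.10, 0.55]])               # a dependent pair
P_ind = np.outer([0.4, 0.6], [0.7, 0.3])       # product of marginals, i.e. independent

print(mutual_information_bits(P_dep))          # strictly positive
print(mutual_information_bits(P_ind))          # 0 (up to round-off), as proved in Problem 3.4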

Problem 3.5:

We shall prove that H(X) ≤ \log n:

H(X) - \log n = \sum_{i=1}^{n} p_i \log \frac{1}{p_i} - \log n = \sum_{i=1}^{n} p_i \log \frac{1}{p_i} - \sum_{i=1}^{n} p_i \log n = \sum_{i=1}^{n} p_i \log \frac{1}{n p_i}
= \frac{1}{\ln 2} \sum_{i=1}^{n} p_i \ln \frac{1}{n p_i} ≤ \frac{1}{\ln 2} \sum_{i=1}^{n} p_i \left( \frac{1}{n p_i} - 1 \right) = 0

Hence H(X) ≤ \log n. Also, if p_i = 1/n for all i, then H(X) = \log n.

Problem 3.6:

By definition, the differential entropy is H(X) = -\int p(x) \log p(x)\, dx. For the uniformly distributed random variable,

H(X) = -\int_0^a \frac{1}{a} \log \frac{1}{a}\, dx = \log a

(a) For a = 1, H(X) = 0.
(b) For a = 4, H(X) = \log 4 = 2 bits.
(c) For a = 1/4, H(X) = \log(1/4) = -2 bits.

Problem 3.7:

(a) The figure below depicts the design of a binary Huffman code for this source (we follow the convention that the lower-probability branch is assigned a 1):

[Figure: binary Huffman code tree showing the source-letter probabilities and the resulting codewords.]

(b) The average number of binary digits per source letter is

\bar{R} = \sum_i P(x_i) n_i = 2(0.45) + 3(0.37) + 4(0.08) + 5(0.10) = 2.83 bits/letter

(c) The entropy of the source is

H(X) = -\sum_i P(x_i) \log P(x_i) = 2.8 bits/letter

As expected, the entropy of the source is less than the average codeword length.

Problem 3.8:

The source entropy is

H(X) = \sum_{i=1}^{5} p_i \log \frac{1}{p_i} = \log 5 = 2.3 bits/letter

(a) When we encode one letter at a time we require R = 3 bits/letter. Hence, the efficiency is 2.3/3 = 0.77 (77%).

(b) If we encode two letters at a time, we have 25 possible sequences. Hence, we need 5 bits per two-letter symbol, or R = 2.5 bits/letter; the efficiency is 2.3/2.5 = 0.93.

(c) In the case of encoding three letters at a time, we have 125 possible sequences. Hence we need 7 bits per three-letter symbol, so R = 7/3 bits/letter; the efficiency is 2.3/(7/3) = 0.994.

Problem 3.9:

(a)
I(x_i; y_j) = \log \frac{P(x_i | y_j)}{P(x_i)} = \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} = \log \frac{P(y_j | x_i)}{P(y_j)} = \log \frac{1}{P(y_j)} - \log \frac{1}{P(y_j | x_i)} = I(y_j) - I(y_j | x_i)

(b)
I(x_i; y_j) = \log \frac{P(x_i | y_j)}{P(x_i)} = \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} = \log \frac{1}{P(x_i)} + \log \frac{1}{P(y_j)} - \log \frac{1}{P(x_i, y_j)} = I(x_i) + I(y_j) - I(x_i, y_j)

Problem 3.10:

(a) For the geometric distribution P(X = k) = p(1-p)^{k-1}, k = 1, 2, ...:

H(X) = -\sum_{k=1}^{\infty} p(1-p)^{k-1} \log \left( p(1-p)^{k-1} \right)
= -p \log p \sum_{k=1}^{\infty} (1-p)^{k-1} - p \log(1-p) \sum_{k=1}^{\infty} (k-1)(1-p)^{k-1}
= -p \log p \cdot \frac{1}{p} - p \log(1-p) \cdot \frac{1-p}{p^2}
= -\log p - \frac{1-p}{p} \log(1-p)

(b) Clearly P(X = k | X > K) = 0 for k ≤ K. If k > K, then

P(X = k | X > K) = \frac{P(X = k, X > K)}{P(X > K)} = \frac{p(1-p)^{k-1}}{P(X > K)}

But

P(X > K) = \sum_{k=K+1}^{\infty} p(1-p)^{k-1} = p \left[ \sum_{k=1}^{\infty} (1-p)^{k-1} - \sum_{k=1}^{K} (1-p)^{k-1} \right] = p \left[ \frac{1}{p} - \frac{1 - (1-p)^K}{p} \right] = (1-p)^K

so that

P(X = k | X > K) = \frac{p(1-p)^{k-1}}{(1-p)^K}

If we let k = K + l with l = 1, 2, ..., then

P(X = k | X > K) = \frac{p(1-p)^{K}(1-p)^{l-1}}{(1-p)^K} = p(1-p)^{l-1}

that is, P(X = k | X > K) is geometrically distributed. Hence, using the result of part (a),

H(X | X > K) = -\sum_{l=1}^{\infty} p(1-p)^{l-1} \log \left( p(1-p)^{l-1} \right) = -\log p - \frac{1-p}{p} \log(1-p)

Problem 3.11:

(a) The marginal distribution P(x) is given by P(x) = \sum_y P(x, y). Hence,

H(X) = -\sum_x P(x) \log P(x) = -\sum_x \sum_y P(x, y) \log P(x) = -\sum_{x,y} P(x, y) \log P(x)

Similarly, H(Y) = -\sum_{x,y} P(x, y) \log P(y).

(b) Using the inequality \ln w ≤ w - 1 with w = \frac{P(x) P(y)}{P(x, y)}, we obtain

\ln \frac{P(x) P(y)}{P(x, y)} ≤ \frac{P(x) P(y)}{P(x, y)} - 1

Multiplying by P(x, y) and summing over x, y, we obtain

\sum_{x,y} P(x, y) \ln P(x) P(y) - \sum_{x,y} P(x, y) \ln P(x, y) ≤ \sum_{x,y} P(x) P(y) - \sum_{x,y} P(x, y) = 1 - 1 = 0

Hence,

H(X, Y) ≤ -\sum_{x,y} P(x, y) \ln P(x) P(y) = -\sum_{x,y} P(x, y) \left( \ln P(x) + \ln P(y) \right) = H(X) + H(Y)

Equality holds when \frac{P(x) P(y)}{P(x, y)} = 1, i.e. when X and Y are independent.

(c) H(X, Y) = H(X) + H(Y | X) = H(Y) + H(X | Y). Also, from part (b), H(X, Y) ≤ H(X) + H(Y). Combining the two relations,

H(Y) + H(X | Y) ≤ H(X) + H(Y)  ⟹  H(X | Y) ≤ H(X)

Suppose now that the previous relation holds with equality, i.e. H(X | Y) = H(X). Then H(X, Y) = H(Y) + H(X | Y) = H(X) + H(Y), so equality holds in part (b), which was shown to occur if and only if P(x, y) = P(x) P(y) for all x, y. This means that X and Y are independent.

Problem 3.12:

The marginal probabilities are given by

P(X = 0) = \sum_k P(X = 0, Y = k) = P(X = 0, Y = 0) + P(X = 0, Y = 1) = 2/3
P(X = 1) = \sum_k P(X = 1, Y = k) = P(X = 1, Y = 1) = 1/3
P(Y = 0) = \sum_k P(X = k, Y = 0) = P(X = 0, Y = 0) = 1/3
P(Y = 1) = \sum_k P(X = k, Y = 1) = P(X = 0, Y = 1) + P(X = 1, Y = 1) = 2/3

Hence,

H(X) = -\sum_i P_i \log P_i = -\left( \frac{1}{3} \log \frac{1}{3} + \frac{2}{3} \log \frac{2}{3} \right) = 0.9183
H(Y) = -\left( \frac{1}{3} \log \frac{1}{3} + \frac{2}{3} \log \frac{2}{3} \right) = 0.9183
H(X, Y) = -3 \cdot \frac{1}{3} \log \frac{1}{3} = 1.585
H(X | Y) = H(X, Y) - H(Y) = 1.585 - 0.9183 = 0.6667
H(Y | X) = H(X, Y) - H(X) = 1.585 - 0.9183 = 0.6667

Problem 3.13:

H = \lim_{n \to \infty} H(X_n | X_1, ..., X_{n-1})
= \lim_{n \to \infty} \left[ -\sum_{x_1, ..., x_n} P(x_1, ..., x_n) \log P(x_n | x_1, ..., x_{n-1}) \right]
= \lim_{n \to \infty} \left[ -\sum_{x_1, ..., x_n} P(x_1, ..., x_n) \log P(x_n | x_{n-1}) \right]
= \lim_{n \to \infty} \left[ -\sum_{x_{n-1}, x_n} P(x_{n-1}, x_n) \log P(x_n | x_{n-1}) \right]
= \lim_{n \to \infty} H(X_n | X_{n-1})

where the second step uses the Markov property. However, for a stationary process P(x_{n-1}, x_n) and P(x_n | x_{n-1}) are independent of n, so that

H = \lim_{n \to \infty} H(X_n | X_{n-1}) = H(X_n | X_{n-1})

Problem 3.14:

With Y = g(X),

H(X, Y) = H(X, g(X)) = H(X) + H(g(X) | X) = H(g(X)) + H(X | g(X))

But H(g(X) | X) = 0, since g(·) is deterministic. Therefore,

H(X) = H(g(X)) + H(X | g(X))

Since each term in the previous equation is non-negative, we obtain H(X) ≥ H(g(X)). Equality holds when H(X | g(X)) = 0. This means that the value g(X) uniquely determines X, or that g(·) is a one-to-one mapping.

Problem 3.15:

I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)}
= \sum_{i} \sum_{j} P(x_i, y_j) \log P(x_i, y_j) - \sum_{i} \sum_{j} P(x_i, y_j) \log P(x_i) - \sum_{i} \sum_{j} P(x_i, y_j) \log P(y_j)
= \sum_{i} \sum_{j} P(x_i, y_j) \log P(x_i, y_j) - \sum_{i} P(x_i) \log P(x_i) - \sum_{j} P(y_j) \log P(y_j)
= -H(X, Y) + H(X) + H(Y)

Problem 3.16:

H(X_1 X_2 \cdots X_n) = -\sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} \cdots \sum_{j_n=1}^{m_n} P(x_1, x_2, ..., x_n) \log P(x_1, x_2, ..., x_n)

Since the {x_i} are statistically independent, P(x_1, x_2, ..., x_n) = P(x_1) P(x_2) \cdots P(x_n) and

\sum_{j_2=1}^{m_2} \cdots \sum_{j_n=1}^{m_n} P(x_1) P(x_2) \cdots P(x_n) = P(x_1)

(similarly for the other x_i). Then

H(X_1 X_2 \cdots X_n) = -\sum_{j_1=1}^{m_1} \cdots \sum_{j_n=1}^{m_n} P(x_1) P(x_2) \cdots P(x_n) \log \left[ P(x_1) P(x_2) \cdots P(x_n) \right]
= -\sum_{j_1=1}^{m_1} P(x_1) \log P(x_1) - \sum_{j_2=1}^{m_2} P(x_2) \log P(x_2) - \cdots - \sum_{j_n=1}^{m_n} P(x_n) \log P(x_n)
= \sum_{i=1}^{n} H(X_i)
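A short numerical sketch, assuming the joint PMF reconstructed in Problem 3.12 (mass 1/3 on each of three (x, y) pairs), checks the identity of Problem 3.15 and the chain rule used above; the helper H is ours:

import numpy as np

P = np.array([[1/3, 1/3],      # rows: x = 0, 1 ; columns: y = 0, 1
              [0.0, 1/3]])

def H(p):
    # Entropy in bits of a probability vector or joint PMF array.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

Hx, Hy, Hxy = H(P.sum(axis=1)), H(P.sum(axis=0)), H(P)
print(Hx, Hy, Hxy)             # 0.9183, 0.9183, 1.585 (values of Problem 3.12)
print(Hx + Hy - Hxy)           # I(X;Y), Problem 3.15 identity
print(Hxy - Hy)                # H(X|Y) = 0.6667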

Problem 3.17:

We consider an n-input, n-output channel. Since it is noiseless,

P(y_j | x_i) = 1 if i = j, and 0 if i ≠ j

Hence,

H(X | Y) = -\sum_{i=1}^{n} \sum_{j=1}^{n} P(x_i, y_j) \log P(x_i | y_j) = -\sum_{i=1}^{n} \sum_{j=1}^{n} P(y_j | x_i) P(x_i) \log P(x_i | y_j)

But it is also true that P(x_i | y_j) = 1 if i = j, and 0 if i ≠ j. Hence,

H(X | Y) = -\sum_{i=1}^{n} P(x_i) \log 1 = 0

Problem 3.18:

The conditional mutual information between x_3 and x_2 given x_1 is defined as

I(x_3; x_2 | x_1) = \log \frac{P(x_3, x_2 | x_1)}{P(x_3 | x_1) P(x_2 | x_1)} = \log \frac{P(x_3 | x_1 x_2)}{P(x_3 | x_1)}

Hence,

I(x_3; x_2 | x_1) = I(x_3 | x_1) - I(x_3 | x_1 x_2)

and

I(X_3; X_2 | X_1) = \sum_{x_1} \sum_{x_2} \sum_{x_3} P(x_1, x_2, x_3) \log \frac{P(x_3 | x_1 x_2)}{P(x_3 | x_1)}
= -\sum_{x_1, x_2, x_3} P(x_1, x_2, x_3) \log P(x_3 | x_1) + \sum_{x_1, x_2, x_3} P(x_1, x_2, x_3) \log P(x_3 | x_1 x_2)
= H(X_3 | X_1) - H(X_3 | X_1 X_2)

Since I(X_3; X_2 | X_1) ≥ 0, it follows that

H(X_3 | X_1) ≥ H(X_3 | X_1 X_2)

Problem 3.19:

Assume that a > 0. For the linear transformation Y = aX + b we know that

p_Y(y) = \frac{1}{a} p_X\left( \frac{y - b}{a} \right)

Hence,

H(Y) = -\int p_Y(y) \log p_Y(y)\, dy = -\int \frac{1}{a} p_X\left( \frac{y - b}{a} \right) \log \left[ \frac{1}{a} p_X\left( \frac{y - b}{a} \right) \right] dy

Let u = (y - b)/a. Then dy = a\, du and

H(Y) = -\int \frac{1}{a} p_X(u) \left[ \log p_X(u) - \log a \right] a\, du = -\int p_X(u) \log p_X(u)\, du + \int p_X(u) \log a\, du = H(X) + \log a

In a similar way we can prove that for a < 0, H(Y) = H(X) + \log(-a).

Problem 3.20:

The linear transformation produces the symbols y_i = a x_i + b, i = 1, 2, 3, with corresponding probabilities p_1 = 0.45, p_2 = 0.35, p_3 = 0.2. Since the {y_i} have the same probability distribution as the {x_i}, it follows that H(Y) = H(X). Hence, the entropy of a DMS is not affected by the linear transformation.

Problem 3.21:

(a) The figure below depicts the design of the Huffman code when encoding a single level at a time:

[Figure: Huffman code tree for the four levels a_1, a_2, a_3, a_4 with probabilities 0.3365, 0.3365, 0.1635, 0.1635.]

The average number of binary digits per source level is

\bar{R} = \sum_i P(a_i) n_i = 1.9905 bits/level

The entropy of the source is

H(X) = -\sum_i P(a_i) \log P(a_i) = 1.9118 bits/level

(b) Encoding two levels at a time:

[Figure: Huffman code tree for the 16 level pairs a_i a_j, with pair probabilities P(a_i) P(a_j).]

The average number of binary digits per level pair is

\bar{R}_2 = \sum_k P(a_k) n_k = 3.874 bits/pair

resulting in an average of \bar{R} = 1.937 bits/level.

(c) In general,

H(X) ≤ \frac{\bar{R}_J}{J} < H(X) + \frac{1}{J}

As J → ∞, \bar{R}_J / J → H(X) = 1.9118 bits/level.
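The averages above can be reproduced with a short Python sketch, assuming the level probabilities read from the solution (0.3365, 0.3365, 0.1635, 0.1635). It uses the standard fact that the expected Huffman codeword length equals the sum of the weights of all merged (internal) nodes; the helper name is ours.

import heapq

def huffman_average_length(probs):
    # Expected binary-Huffman codeword length: accumulate the weight of every merge.
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        heapq.heappush(heap, a + b)
        total += a + b
    return total

p = [0.3365, 0.3365, 0.1635, 0.1635]
print(huffman_average_length(p))                   # ~1.99 bits/level
pairs = [pi * pj for pi in p for pj in p]          # second extension: 16 level pairs
print(huffman_average_length(pairs) / 2)           # ~1.937 bits/level, closer to H(X)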

Problem 3.22:

First, we need the state probabilities P(x_i), i = 1, 2. For stationary Markov processes these can be found, in general, by solving the system

P \Pi = P,   \sum_i P_i = 1

where P is the state probability vector and \Pi is the transition matrix, \Pi_{ij} = P(x_j | x_i). However, in the case of a two-state Markov source, we can find the P(x_i) in a simpler way by noting that the probability of a transition from state 1 to state 2 equals the probability of a transition from state 2 to state 1 (so that the probability of each state remains the same). Hence,

P(x_2 | x_1) P(x_1) = P(x_1 | x_2) P(x_2)  ⟹  0.2 P(x_1) = 0.3 P(x_2)  ⟹  P(x_1) = 0.6,  P(x_2) = 0.4

Then

H(X) = P(x_1) \left[ -P(x_1|x_1) \log P(x_1|x_1) - P(x_2|x_1) \log P(x_2|x_1) \right] + P(x_2) \left[ -P(x_1|x_2) \log P(x_1|x_2) - P(x_2|x_2) \log P(x_2|x_2) \right]
= 0.6 \left[ -0.8 \log 0.8 - 0.2 \log 0.2 \right] + 0.4 \left[ -0.3 \log 0.3 - 0.7 \log 0.7 \right] = 0.7857 bits/letter

If the source were a binary DMS with output letter probabilities P(x_1) = 0.6, P(x_2) = 0.4, its entropy would be

H_{DMS}(X) = -0.6 \log 0.6 - 0.4 \log 0.4 = 0.971 bits/letter

We see that the entropy of the Markov source is smaller, since the memory inherent in it reduces the information content of each output.

Problem 3.23:

(a)
H(X) = -(0.05 \log 0.05 + 0.1 \log 0.1 + 0.1 \log 0.1 + 0.05 \log 0.05 + 0.15 \log 0.15 + 0.25 \log 0.25 + 0.3 \log 0.3) = 2.528 bits

(b) After quantization, the new alphabet is B = {-4, 0, 4} and the corresponding symbol probabilities are given by

P(-4) = P(-5) + P(-3) = 0.05 + 0.1 = 0.15
P(0) = P(-1) + P(0) + P(1) = 0.1 + 0.05 + 0.15 = 0.3
P(4) = P(3) + P(5) = 0.25 + 0.3 = 0.55

Hence, H(Q(X)) = 1.406 bits. As observed, quantization decreases the entropy of the source.
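A small sketch, assuming the symbol probabilities as read from the solution above, reproduces both entropies and the quantized PMF; the quantization rule and helper are ours:

import numpy as np

def H(p):
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log2(p)))

p = {-5: 0.05, -3: 0.1, -1: 0.1, 0: 0.05, 1: 0.15, 3: 0.25, 5: 0.3}
print(H(list(p.values())))                 # ~2.528 bits

q = {}
for x, px in p.items():                    # map to B = {-4, 0, 4}
    xq = -4 if x < -1 else (4 if x > 1 else 0)
    q[xq] = q.get(xq, 0.0) + px
print(q)                                   # {-4: 0.15, 0: 0.3, 4: 0.55}
print(H(list(q.values())))                 # ~1.406 bits: quantization reduces entropy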

Problem 3.24:

The figure below depicts the design of a ternary Huffman code for the seven-symbol source.

[Figure: ternary Huffman code tree for the symbol probabilities 0.22, 0.18, 0.17, 0.15, 0.13, 0.10, 0.05.]

The average codeword length is

\bar{R}(X) = \sum_x P(x) n_x = 0.22(1) + 2(0.18 + 0.17 + 0.15 + 0.13 + 0.10 + 0.05) = 1.78 (ternary symbols/output)

For a fair comparison of the average codeword length with the entropy of the source, we compute the latter with logarithms in base 3. Hence,

H_3(X) = -\sum_x P(x) \log_3 P(x) = 1.7047

As expected, H_3(X) ≤ \bar{R}(X).

Problem 3.25:

Parsing the sequence by the rules of the Lempel-Ziv coding scheme, we obtain 18 phrases; the dictionary is shown in the table below. For each phrase we need 5 bits (to index the dictionary location of its prefix) plus an extra bit to represent the new source output.

[Table: Lempel-Ziv dictionary, locations 1 through 18, with the corresponding dictionary contents and codewords.]

Problem 3.26:

(a) For the exponential pdf p(x) = \frac{1}{\lambda} e^{-x/\lambda}, x ≥ 0:

H(X) = -\int_0^{\infty} \frac{1}{\lambda} e^{-x/\lambda} \ln\left( \frac{1}{\lambda} e^{-x/\lambda} \right) dx
= \ln \lambda \int_0^{\infty} \frac{1}{\lambda} e^{-x/\lambda}\, dx + \frac{1}{\lambda} \int_0^{\infty} x \frac{1}{\lambda} e^{-x/\lambda}\, dx
= \ln \lambda + \frac{1}{\lambda} \cdot \lambda = 1 + \ln \lambda

where we have used \int_0^{\infty} \frac{1}{\lambda} e^{-x/\lambda}\, dx = 1 and E[X] = \int_0^{\infty} x \frac{1}{\lambda} e^{-x/\lambda}\, dx = \lambda.

(b) For the Laplacian pdf p(x) = \frac{1}{2\lambda} e^{-|x|/\lambda}:

H(X) = -\int_{-\infty}^{\infty} \frac{1}{2\lambda} e^{-|x|/\lambda} \ln\left( \frac{1}{2\lambda} e^{-|x|/\lambda} \right) dx
= \ln(2\lambda) \int_{-\infty}^{\infty} \frac{1}{2\lambda} e^{-|x|/\lambda}\, dx + \frac{1}{\lambda} \int_{-\infty}^{\infty} |x| \frac{1}{2\lambda} e^{-|x|/\lambda}\, dx

= \ln(2\lambda) + \frac{1}{\lambda} \cdot \lambda = 1 + \ln(2\lambda)

(The differential entropies in this problem are in nats.)

(c) For the triangular pdf p(x) = \frac{x + \lambda}{\lambda^2} for -\lambda ≤ x ≤ 0 and p(x) = \frac{\lambda - x}{\lambda^2} for 0 ≤ x ≤ \lambda:

H(X) = -\int_{-\lambda}^{0} \frac{x + \lambda}{\lambda^2} \ln \frac{x + \lambda}{\lambda^2}\, dx - \int_{0}^{\lambda} \frac{\lambda - x}{\lambda^2} \ln \frac{\lambda - x}{\lambda^2}\, dx
= \ln(\lambda^2) - \frac{2}{\lambda^2} \int_0^{\lambda} z \ln z\, dz
= \ln(\lambda^2) - \frac{2}{\lambda^2} \left[ \frac{z^2}{2} \ln z - \frac{z^2}{4} \right]_0^{\lambda}
= \ln(\lambda^2) - \ln \lambda + \frac{1}{2} = \ln \lambda + \frac{1}{2}

(with the substitution z = x + \lambda in the first integral and z = \lambda - x in the second).

Problem 3.27:

(a) Since R(D) = \log \frac{\lambda}{D} and D = \lambda/2, we obtain R(D) = \log \frac{\lambda}{\lambda/2} = \log 2 = 1 bit/sample.

(b) The following figure depicts R(D) for \lambda = 0.1, 0.2 and 0.3. As observed from the figure, an increase of the parameter \lambda increases the required rate for a given distortion.

[Figure: R(D) versus the distortion D for \lambda = 0.1, 0.2, 0.3.]
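The closed forms of Problem 3.26 can be checked by direct numerical integration. A minimal sketch (the value of λ is arbitrary):

import numpy as np

lam = 0.7
x = np.linspace(1e-9, 60 * lam, 1_000_001)
p_exp = np.exp(-x / lam) / lam
h_exp = -np.trapz(p_exp * np.log(p_exp), x)
print(h_exp, 1 + np.log(lam))              # exponential: matches 1 + ln(lambda) nats

y = np.linspace(-60 * lam, 60 * lam, 1_000_001)
p_lap = np.exp(-np.abs(y) / lam) / (2 * lam)
h_lap = -np.trapz(p_lap * np.log(p_lap), y)
print(h_lap, 1 + np.log(2 * lam))          # Laplacian: matches 1 + ln(2*lambda) nats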

Problem 3.28:

(a) For a Gaussian random variable of zero mean and variance \sigma^2, the rate-distortion function is R(D) = \frac{1}{2} \log \frac{\sigma^2}{D}. Hence the upper bound is satisfied with equality. For the lower bound, recall that H(X) = \frac{1}{2} \log(2\pi e \sigma^2). Thus,

H(X) - \frac{1}{2} \log(2\pi e D) = \frac{1}{2} \log(2\pi e \sigma^2) - \frac{1}{2} \log(2\pi e D) = \frac{1}{2} \log \frac{2\pi e \sigma^2}{2\pi e D} = R(D)

As observed, the upper and the lower bounds coincide.

(b) The differential entropy of a Laplacian source with parameter \lambda is H(X) = 1 + \ln(2\lambda) nats. The variance of the Laplacian distribution is

\sigma^2 = \int_{-\infty}^{\infty} x^2 \frac{1}{2\lambda} e^{-|x|/\lambda}\, dx = 2\lambda^2

Hence, with \sigma^2 = 1, we obtain \lambda = 1/\sqrt{2} and H(X) = 1 + \ln(2\lambda) = 1 + \ln\sqrt{2} = 1.3466 nats/symbol = 1.943 bits/symbol. A plot of the lower and upper bounds of R(D) is given in the next figure.

[Figure: upper and lower bounds on R(D) versus the distortion D for the unit-variance Laplacian source.]

(c) The variance of the triangular distribution is given by

\sigma^2 = \int_{-\lambda}^{0} x^2 \frac{x + \lambda}{\lambda^2}\, dx + \int_{0}^{\lambda} x^2 \frac{\lambda - x}{\lambda^2}\, dx = \frac{1}{\lambda^2} \left( \frac{x^4}{4} + \frac{\lambda x^3}{3} \right)\Big|_{-\lambda}^{0} + \frac{1}{\lambda^2} \left( -\frac{x^4}{4} + \frac{\lambda x^3}{3} \right)\Big|_{0}^{\lambda} = \frac{\lambda^2}{6}

Hence, with \sigma^2 = 1, we obtain \lambda = \sqrt{6} and H(X) = \ln \lambda + \frac{1}{2} = \ln\sqrt{6} + \frac{1}{2} = 1.396 nats ≈ 2.01 bits/source output. A plot of the lower and upper bounds of R(D) is given in the next figure.

[Figure: upper and lower bounds on R(D) versus the distortion D for the unit-variance triangular source.]

Problem 3.29:

\sigma_X^2 = E[X^2(t)] = R_X(\tau)\big|_{\tau = 0} = \frac{A^2}{2}

Hence,

SQNR = \frac{3 \cdot 4^{\nu} \sigma_X^2}{x_{max}^2} = \frac{3 \cdot 4^{\nu} A^2/2}{A^2} = \frac{3}{2} \cdot 4^{\nu}

With SQNR = 60 dB, we obtain

10 \log_{10}\left( \frac{3 \cdot 4^{q}}{2} \right) = 60  ⟹  q = 9.6733

The smallest integer larger than q is 10. Hence, the required number of bits is \nu = 10, i.e. 2^{10} = 1024 quantization levels.

Problem 3.30:

(a)
H(X | G) = -\int\!\!\int p(x, g) \log p(x | g)\, dx\, dg

But X and G are independent, so p(x, g) = p(x) p(g) and p(x | g) = p(x). Hence,

H(X | G) = \int p(g) \left[ -\int p(x) \log p(x)\, dx \right] dg = \int p(g) H(X)\, dg = H(X) = \frac{1}{2} \log(2\pi e \sigma_x^2)

where the last equality follows from the Gaussian pdf of X.

(b) I(X; Y) = H(Y) - H(Y | X). Since Y is the sum of two independent zero-mean Gaussian random variables, it is also a zero-mean Gaussian random variable with variance \sigma_y^2 = \sigma_x^2 + \sigma_n^2. Hence H(Y) = \frac{1}{2} \log\left( 2\pi e (\sigma_x^2 + \sigma_n^2) \right). Also, since y = x + g,

p(y | x) = p_g(y - x) = \frac{1}{\sqrt{2\pi}\sigma_n} e^{-(y-x)^2 / 2\sigma_n^2}

Hence,

H(Y | X) = -\int\!\!\int p(x, y) \log p(y | x)\, dx\, dy
= -\log e \int p(x) \int p_g(y - x) \ln\left( \frac{1}{\sqrt{2\pi}\sigma_n} e^{-(y-x)^2 / 2\sigma_n^2} \right) dy\, dx
= \log e \int p(x) \left[ \ln(\sqrt{2\pi}\sigma_n) \int p_g(y - x)\, dy + \frac{1}{2\sigma_n^2} \int (y - x)^2 p_g(y - x)\, dy \right] dx
= \log e \left[ \ln(\sqrt{2\pi}\sigma_n) + \frac{1}{2} \right] \int p(x)\, dx = \frac{1}{2} \log\left( 2\pi e \sigma_n^2 \right) = H(G)

where we have used \int p_g(y - x)\, dy = 1 and \int (y - x)^2 p_g(y - x)\, dy = E[G^2] = \sigma_n^2. From H(Y) and H(Y | X),

I(X; Y) = H(Y) - H(Y | X) = \frac{1}{2} \log\left( 2\pi e (\sigma_x^2 + \sigma_n^2) \right) - \frac{1}{2} \log\left( 2\pi e \sigma_n^2 \right) = \frac{1}{2} \log\left( 1 + \frac{\sigma_x^2}{\sigma_n^2} \right)
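A minimal numerical sketch of the result, with illustrative signal and noise variances of our choosing, confirms that computing I(X;Y) as h(Y) - h(G) via the Gaussian differential-entropy formula agrees with the closed form:

import numpy as np

def h_gauss_bits(var):
    # Differential entropy (bits) of a zero-mean Gaussian with the given variance.
    return 0.5 * np.log2(2 * np.pi * np.e * var)

sx2, sn2 = 4.0, 1.0                                       # illustrative variances
I_formula = 0.5 * np.log2(1 + sx2 / sn2)
I_entropy = h_gauss_bits(sx2 + sn2) - h_gauss_bits(sn2)   # h(Y) - h(G)
print(I_formula, I_entropy)                               # both ~1.161 bits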

Problem 3.31:

[Figure: ternary Huffman code tree for the letters x_1, ..., x_9 with probabilities 0.25, 0.2, 0.15, 0.12, 0.1, 0.08, 0.05, 0.03, 0.02, showing the resulting codewords.]

\bar{R} = \sum_i P(x_i) n_i = 1.85 ternary symbols/letter

Problem 3.32:

Given (n_1, n_2, n_3, n_4) = (1, 2, 2, 3), we have

\sum_{k=1}^{4} 2^{-n_k} = \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} = \frac{9}{8} > 1

Since the Kraft inequality is not satisfied, a binary code with codeword lengths (1, 2, 2, 3) that satisfies the prefix condition does not exist.

Problem 3.33:

\sum_{k=1}^{2^n} 2^{-n_k} = \sum_{k=1}^{2^n} 2^{-n} = 2^n \cdot 2^{-n} = 1

Therefore the Kraft inequality is satisfied.
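A two-line check of the Kraft sums used in Problems 3.32 and 3.33 (the helper kraft_sum is ours):

def kraft_sum(lengths, D=2):
    # Kraft sum for codeword lengths over a D-ary code alphabet.
    return sum(D ** (-n) for n in lengths)

print(kraft_sum([1, 2, 2, 3]))        # 1.125 = 9/8 > 1: no binary prefix code exists
n = 5
print(kraft_sum([n] * 2 ** n))        # 2^n codewords of length n: sum = 1.0, Kraft satisfied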

Problem 3.34:

The n-dimensional Gaussian pdf is

p(X) = \frac{1}{(2\pi)^{n/2} |M|^{1/2}} e^{-X' M^{-1} X / 2}

so that

-\log p(X) = \frac{1}{2} \log\left[ (2\pi)^n |M| \right] + (\log e) \frac{X' M^{-1} X}{2}

But

\int \cdots \int (\log e) \frac{X' M^{-1} X}{2}\, p(X)\, dX = \frac{\log e}{2} E[X' M^{-1} X] = \frac{n}{2} \log e

Hence,

H(X) = -\int \cdots \int p(X) \log p(X)\, dX = \frac{1}{2} \log\left[ (2\pi)^n |M| \right] + \frac{1}{2} \log e^n = \frac{1}{2} \log\left[ (2\pi e)^n |M| \right]

Problem 3.35:

R(D) = 1 + D \log D + (1 - D) \log(1 - D),   where D = P_e and 0 ≤ D ≤ 1/2

[Figure: R(D) versus D for the binary source, decreasing from 1 at D = 0 to 0 at D = 0.5.]
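A short sketch evaluates the Hamming-distortion rate-distortion functions of Problems 3.35 and 3.36 (the M-ary formula as reconstructed here) and checks the endpoints R(0) = log M and R(D_max) = 0 for D_max = (M-1)/M:

import math

def R(D, M=2):
    # R(D) for a uniform M-ary source under Hamming distortion, 0 <= D <= (M-1)/M.
    r = math.log2(M)
    if D > 0:
        r += D * math.log2(D / (M - 1))
    if D < 1:
        r += (1 - D) * math.log2(1 - D)
    return r

for M in (2, 4, 8):
    print(M, R(0.0, M), R((M - 1) / M, M))   # log2(M) and 0, as in the plots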

Problem 3.36:

R(D) = \log M + D \log \frac{D}{M - 1} + (1 - D) \log(1 - D)

[Figure: R(D) versus D for M = 2, 4 and 8.]

Problem 3.37:

Let W = P' P. Then

d_W(X, \tilde{X}) = (X - \tilde{X})' W (X - \tilde{X}) = (X - \tilde{X})' P' P (X - \tilde{X}) = \left( P(X - \tilde{X}) \right)' P(X - \tilde{X}) = \frac{1}{n} (Y - \tilde{Y})'(Y - \tilde{Y})

where, by definition, Y = \sqrt{n}\, P X and \tilde{Y} = \sqrt{n}\, P \tilde{X}. Hence, d_W(X, \tilde{X}) = d_2(Y, \tilde{Y}).

Problem 3.38:

(a) The first-order predictor is \hat{x}(n) = a_1 x(n-1). The coefficient a_1 that minimizes the MSE is found from the orthogonality of the prediction error to the prediction data:

E[e(n) x(n-1)] = E[(x(n) - a_1 x(n-1)) x(n-1)] = 0  ⟹  \phi(1) - a_1 \phi(0) = 0  ⟹  a_1 = \phi(1)/\phi(0) = 1/2

The minimum MSE is

\epsilon_1 = \phi(0)\left( 1 - a_1^2 \right) = 3/4

(b) For the second-order predictor \hat{x}(n) = a_1 x(n-1) + a_2 x(n-2), following the Levinson-Durbin algorithm,

a_2^{(2)} = \frac{\phi(2) - a_1^{(1)} \phi(1)}{\epsilon_1} = \frac{1/4}{3/4} = \frac{1}{3},   a_1^{(2)} = a_1^{(1)} - a_2^{(2)} a_1^{(1)} = \frac{1}{3}

The minimum MSE is

\epsilon_2 = \epsilon_1 \left( 1 - (a_2^{(2)})^2 \right) = \frac{3}{4} \cdot \frac{8}{9} = \frac{2}{3}
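A small Levinson-Durbin sketch reproduces these predictor coefficients, assuming the autocorrelation values implied by the solution, \phi(0) = 1 and \phi(1) = \phi(2) = 1/2 (the function below is a generic implementation, not code from the text):

def levinson_durbin(phi, order):
    # Returns predictor coefficients and minimum MSE at each order, printing the recursion.
    a, eps = [], phi[0]
    for m in range(1, order + 1):
        k = (phi[m] - sum(a[i] * phi[m - 1 - i] for i in range(m - 1))) / eps
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        eps *= (1 - k * k)
        print("order", m, "coefficients", a, "MMSE", eps)
    return a, eps

levinson_durbin([1.0, 0.5, 0.5], 2)
# order 1: [0.5],           MMSE 0.75
# order 2: [1/3, 1/3],      MMSE 2/3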

Problem 3.39:

p(x_1, x_2) = \frac{5}{7ab} for (x_1, x_2) ∈ C, and 0 otherwise.

If x_1 and x_2 are quantized separately, using uniform intervals of length 1, the numbers of levels needed are L_1 = 2a and L_2 = b, and the number of bits is

R_x = R_1 + R_2 = \log L_1 + \log L_2 = \log 2ab

By using vector quantization with squares of unit area, we have L_x = \frac{7ab}{5} levels and \tilde{R}_x = \log L_x = \log \frac{7ab}{5} bits. The difference in bit rate is

R_x - \tilde{R}_x = \log 2ab - \log \frac{7ab}{5} = \log \frac{10}{7} = 0.515 bits/output sample pair

for all a, b > 0.

Problem 3.40:

(a) The area between the two squares is 4 \cdot 4 - 2 \cdot 2 = 12. Hence, p_{X,Y}(x, y) = \frac{1}{12} on this region. The marginal probability p_X(x) is given by p_X(x) = \int p_{X,Y}(x, y)\, dy. If -2 ≤ X < -1, then

p_X(x) = \int_{-2}^{2} \frac{1}{12}\, dy = \frac{1}{3}

If -1 ≤ X < 1, then

p_X(x) = \int_{-2}^{-1} \frac{1}{12}\, dy + \int_{1}^{2} \frac{1}{12}\, dy = \frac{1}{6}

Finally, if 1 ≤ X ≤ 2, then

p_X(x) = \int_{-2}^{2} \frac{1}{12}\, dy = \frac{1}{3}

The next figure depicts the marginal distribution p_X(x).

[Figure: p_X(x), equal to 1/3 for 1 ≤ |x| ≤ 2 and 1/6 for |x| < 1.]

Similarly we find that

p_Y(y) = 1/3 for -2 ≤ y < -1,   1/6 for -1 ≤ y < 1,   1/3 for 1 ≤ y ≤ 2

(b) The quantization levels \hat{x}_1, \hat{x}_2, \hat{x}_3, \hat{x}_4 are set to -\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}, \frac{3}{2} respectively. The resulting distortion is

D_X = \int_{-2}^{-1} \left( x + \tfrac{3}{2} \right)^2 p_X(x)\, dx + \int_{-1}^{0} \left( x + \tfrac{1}{2} \right)^2 p_X(x)\, dx + (symmetric terms for 0 ≤ x ≤ 2)
= 2 \left[ \frac{1}{3} \int_{-2}^{-1} \left( x^2 + 3x + \tfrac{9}{4} \right) dx + \frac{1}{6} \int_{-1}^{0} \left( x^2 + x + \tfrac{1}{4} \right) dx \right]
= 2 \left( \frac{1}{36} + \frac{1}{72} \right) = \frac{1}{12}

By symmetry, D_Y = 1/12 as well. The total distortion is

D_{total} = D_X + D_Y = \frac{1}{12} + \frac{1}{12} = \frac{1}{6}

whereas the resulting number of bits per (X, Y) pair is

R = R_X + R_Y = \log 4 + \log 4 = 4

(c) Suppose that we divide the region over which p(x, y) ≠ 0 into L equal subregions. The case of L = 4 is depicted in the next figure.

[Figure: the region between the two squares divided into L = 4 equal subregions.]

For each subregion the quantization output vector (\hat{x}, \hat{y}) is the centroid of the corresponding rectangle. Since each subregion has the same shape (uniform quantization), a rectangle of width 1 and length 12/L, the distortion of the vector quantizer is

D = L \int\!\!\int_{\text{one subregion}} \frac{1}{12} \left[ \left( x - \tfrac{1}{2} \right)^2 + \left( y - \tfrac{6}{L} \right)^2 \right] dx\, dy = \frac{1}{12} + \frac{12}{L^2}

If we set D = \frac{1}{6}, we obtain

\frac{1}{12} + \frac{12}{L^2} = \frac{1}{6}  ⟹  L^2 = 144  ⟹  L = 12

Thus, we have to divide the region over which p(x, y) ≠ 0 into 12 equal subregions in order to achieve the same distortion. In this case the resulting number of bits per source output pair (X, Y) is R = \log 12 = 3.585.
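A brief numerical check of Problem 3.40, under the reconstruction above: the per-component distortion of the four-level scalar quantizer is integrated against the piecewise marginal, and the vector-quantizer expression D(L) = 1/12 + 12/L^2 is evaluated at L = 12.

import numpy as np

x = np.linspace(-2, 2, 400_001)
pX = np.where(np.abs(x) >= 1, 1/3, 1/6)               # marginal pdf from part (a)
xq = np.clip(np.round(x - 0.5) + 0.5, -1.5, 1.5)      # nearest level in {-1.5, -0.5, 0.5, 1.5}
print(np.trapz((x - xq) ** 2 * pX, x))                # ~1/12 per component, so 1/6 in total

L = 12
print(1/12 + 12 / L**2)                               # 1/6: same total distortion at L = 12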

Problem 3.41:

(a) The joint probability density function is p_{XY}(x, y) = \frac{1}{(2\sqrt{2})^2} = \frac{1}{8} on the square region with vertices (±2, 0) and (0, ±2). The marginal distribution p_X(x) is p_X(x) = \int p_{XY}(x, y)\, dy. If -2 ≤ x ≤ 0, then

p_X(x) = \int_{-x-2}^{x+2} \frac{1}{8}\, dy = \frac{1}{8} y \Big|_{-x-2}^{x+2} = \frac{x + 2}{4}

If 0 ≤ x ≤ 2, then

p_X(x) = \int_{x-2}^{-x+2} \frac{1}{8}\, dy = \frac{1}{8} y \Big|_{x-2}^{-x+2} = \frac{-x + 2}{4}

The next figure depicts p_X(x).

[Figure: the triangular marginal pdf p_X(x), rising linearly from 0 at x = -2 to 1/2 at x = 0 and falling back to 0 at x = 2.]

From the symmetry of the problem we have

p_Y(y) = \frac{y + 2}{4} for -2 ≤ y < 0,   \frac{-y + 2}{4} for 0 ≤ y ≤ 2

(b)
D_X = \int_{-2}^{-1} \left( x + \tfrac{3}{2} \right)^2 p_X(x)\, dx + \int_{-1}^{0} \left( x + \tfrac{1}{2} \right)^2 p_X(x)\, dx + (symmetric terms for 0 ≤ x ≤ 2)
= 2 \left[ \int_{-2}^{-1} \left( x + \tfrac{3}{2} \right)^2 \frac{x + 2}{4}\, dx + \int_{-1}^{0} \left( x + \tfrac{1}{2} \right)^2 \frac{x + 2}{4}\, dx \right]
= 2 \left( \frac{1}{96} + \frac{1}{32} \right) = \frac{1}{12}

The total distortion is

D_{total} = D_X + D_Y = \frac{1}{12} + \frac{1}{12} = \frac{1}{6}

whereas the required number of bits per source output pair is

R = R_X + R_Y = \log 4 + \log 4 = 4

(c) We divide the square over which p(x, y) ≠ 0 into 4 × 4 = 16 equal square regions aligned with its sides; each has area 1/2 (side 1/\sqrt{2}). The resulting distortion is

D = 16 \int\!\!\int_{\text{one cell}} \frac{1}{8} \left[ (x - \hat{x})^2 + (y - \hat{y})^2 \right] dx\, dy = 4 \int\!\!\int_{\text{one cell}} (x - \hat{x})^2\, dx\, dy = \frac{1}{12}

Hence, using vector quantization at the same rate (R = \log 16 = 4) we obtain half the distortion.
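The comparison in Problem 3.41 can be checked by Monte Carlo. The sketch below assumes the 16 vector-quantizer cells are the uniform 4 × 4 grid in coordinates rotated by 45 degrees (i.e. aligned with the sides of the square support), which is how the 16 equal squares of area 1/2 tile the region; the helper q4 and the sample size are ours.

import numpy as np

rng = np.random.default_rng(0)
n = 500_000
pts = rng.uniform(-2, 2, size=(4 * n, 2))
pts = pts[np.abs(pts[:, 0]) + np.abs(pts[:, 1]) <= 2][:n]     # uniform on |x|+|y| <= 2

def q4(u, width):
    # Uniform 4-level quantizer on [-2*width, 2*width] with cell size `width`.
    idx = np.clip(np.floor(u / width), -2, 1)
    return (idx + 0.5) * width

# Scalar quantization of each coordinate: cell size 1, levels {-1.5, -0.5, 0.5, 1.5}.
xs, ys = q4(pts[:, 0], 1.0), q4(pts[:, 1], 1.0)
print(np.mean((pts[:, 0] - xs) ** 2 + (pts[:, 1] - ys) ** 2))   # ~1/6

# Vector quantization: a 4x4 grid of squares of side 1/sqrt(2) in rotated coordinates.
Rot = np.array([[1, 1], [-1, 1]]) / np.sqrt(2)                  # 45-degree rotation
uv = pts @ Rot.T
uvq = q4(uv, np.sqrt(2) / 2)
print(np.mean(np.sum((uv - uvq) ** 2, axis=1)))                 # ~1/12: half the distortion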