
CHAPTER 3

Problem 3.1:

(a) The mutual information between the output letter B_j and the input letter A_i is

I(B_j; A_i) = \log \frac{P(B_j | A_i)}{P(B_j)} = \log \frac{P(B_j, A_i)}{P(B_j) P(A_i)}

The marginal probabilities are obtained from the joint probabilities:

P(B_j) = \sum_{i=1}^{4} P(B_j, A_i):  P(B_1) = .3,  P(B_2) = .7,  P(B_3) = .4

P(A_i) = \sum_{j=1}^{3} P(B_j, A_i):  P(A_1) = .3,  P(A_2) = .7,  P(A_3) = .3,  P(A_4) = .

Substituting the joint and marginal probabilities gives the twelve pairwise values:

I(B_1; A_1) = +.57 bits,   I(B_1; A_2) = -.76 bits,   I(B_1; A_3) = -.943 bits,   I(B_1; A_4) = +.757 bits
I(B_2; A_1) = -.65 bits,   I(B_2; A_2) = -.64 bits,   I(B_2; A_3) = +.5 bits,     I(B_2; A_4) = -.53 bits
I(B_3; A_1) = -. bits,     I(B_3; A_2) = +.334 bits,  I(B_3; A_3) = +.5 bits,     I(B_3; A_4) = -.556 bits

(b) The average mutual information is

I(B; A) = \sum_{j=1}^{3} \sum_{i=1}^{4} P(A_i, B_j) I(B_j; A_i) = .677 bits

Problem 3.2:

H(B) = -\sum_{j=1}^{3} P(B_j) \log P(B_j) = -[.3 \log .3 + .7 \log .7 + .4 \log .4] = .56 bits/letter

Problem 3.3:

Let f(u) = u - 1 - \ln u. The first and second derivatives of f(u) are

\frac{df}{du} = 1 - \frac{1}{u},   \frac{d^2 f}{du^2} = \frac{1}{u^2} > 0,   u > 0

Hence the function achieves its minimum where df/du = 0, i.e. at u = 1, and the minimum value is f(1) = 0, so \ln u = u - 1 at u = 1. For all other values of u > 0 (0 < u < 1 or u > 1) we have f(u) > 0, that is, u - 1 > \ln u.

Problem 3.4:

We will show that I(X; Y) ≥ 0. Write

-I(X; Y) = -\sum_i \sum_j P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} = \frac{1}{\ln 2} \sum_i \sum_j P(x_i, y_j) \ln \frac{P(x_i) P(y_j)}{P(x_i, y_j)}

We use the inequality \ln u ≤ u - 1 and need only consider those terms for which P(x_i, y_j) > 0. Applying the inequality to each term of -I(X; Y):

-I(X; Y) ≤ \frac{1}{\ln 2} \sum_i \sum_j P(x_i, y_j) \left[ \frac{P(x_i) P(y_j)}{P(x_i, y_j)} - 1 \right] = \frac{1}{\ln 2} \sum_i \sum_j \left[ P(x_i) P(y_j) - P(x_i, y_j) \right] ≤ 0

The first inequality becomes an equality if and only if P(x_i) P(y_j) / P(x_i, y_j) = 1, i.e. P(x_i) P(y_j) = P(x_i, y_j), whenever P(x_i, y_j) > 0. Also, since the summation \sum_i \sum_j [P(x_i) P(y_j) - P(x_i, y_j)] contains only the terms for which P(x_i, y_j) > 0, it equals zero if and only if P(x_i) P(y_j) = 0 whenever P(x_i, y_j) = 0. Therefore both inequalities become equalities, and hence I(X; Y) = 0, if and only if X and Y are statistically independent.
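As a quick numerical companion to Problems 3.1 and 3.4, the short Python sketch below computes I(X;Y) from a joint PMF and confirms that it is non-negative and vanishes for a product distribution. The joint PMFs and the helper name mutual_information_bits are illustrative only; they are not taken from the problem statement.

import numpy as np

def mutual_information_bits(P):
    # I(X;Y) in bits for a joint PMF given as a 2-D array (rows: x, columns: y).
    P = np.asarray(P, dtype=float)
    px = P.sum(axis=1, keepdims=True)          # marginal of X
    py = P.sum(axis=0, keepdims=True)          # marginal of Y
    mask = P > 0                               # only terms with P(x,y) > 0 contribute
    return float(np.sum(P[mask] * np.log2(P[mask] / (px @ py)[mask])))

P_dep = np.array([[0.30, 0.05],
                  [0.10, 0.55]])               # a dependent pair
P_ind = np.outer([0.4, 0.6], [0.7, 0.3])       # product of marginals, i.e. independent

print(mutual_information_bits(P_dep))          # strictly positive
print(mutual_information_bits(P_ind))          # 0 (up to round-off), as proved in Problem 3.4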

Problem 3.5:

We shall prove that H(X) ≤ \log n:

H(X) - \log n = \sum_{i=1}^{n} p_i \log \frac{1}{p_i} - \log n = \sum_{i=1}^{n} p_i \log \frac{1}{p_i} - \sum_{i=1}^{n} p_i \log n = \sum_{i=1}^{n} p_i \log \frac{1}{n p_i}
= \frac{1}{\ln 2} \sum_{i=1}^{n} p_i \ln \frac{1}{n p_i} ≤ \frac{1}{\ln 2} \sum_{i=1}^{n} p_i \left( \frac{1}{n p_i} - 1 \right) = 0

Hence H(X) ≤ \log n. Also, if p_i = 1/n for all i, then H(X) = \log n.

Problem 3.6:

By definition, the differential entropy is H(X) = -\int p(x) \log p(x)\, dx. For the uniformly distributed random variable,

H(X) = -\int_0^a \frac{1}{a} \log \frac{1}{a}\, dx = \log a

(a) For a = 1, H(X) = 0.
(b) For a = 4, H(X) = \log 4 = 2 bits.
(c) For a = 1/4, H(X) = \log(1/4) = -2 bits.

Problem 3.7:

(a) The figure below depicts the design of a binary Huffman code for this source (we follow the convention that the lower-probability branch is assigned a 1):

[Figure: binary Huffman code tree showing the source-letter probabilities and the resulting codewords.]

(b) The average number of binary digits per source letter is

\bar{R} = \sum_i P(x_i) n_i = 2(0.45) + 3(0.37) + 4(0.08) + 5(0.10) = 2.83 bits/letter

(c) The entropy of the source is

H(X) = -\sum_i P(x_i) \log P(x_i) = 2.8 bits/letter

As expected, the entropy of the source is less than the average codeword length.

Problem 3.8:

The source entropy is

H(X) = \sum_{i=1}^{5} p_i \log \frac{1}{p_i} = \log 5 = 2.3 bits/letter

(a) When we encode one letter at a time we require R = 3 bits/letter. Hence, the efficiency is 2.3/3 = 0.77 (77%).

(b) If we encode two letters at a time, we have 25 possible sequences. Hence, we need 5 bits per two-letter symbol, or R = 2.5 bits/letter; the efficiency is 2.3/2.5 = 0.93.

(c) In the case of encoding three letters at a time, we have 125 possible sequences. Hence we need 7 bits per three-letter symbol, so R = 7/3 bits/letter; the efficiency is 2.3/(7/3) = 0.994.

Problem 3.9:

(a)
I(x_i; y_j) = \log \frac{P(x_i | y_j)}{P(x_i)} = \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} = \log \frac{P(y_j | x_i)}{P(y_j)} = \log \frac{1}{P(y_j)} - \log \frac{1}{P(y_j | x_i)} = I(y_j) - I(y_j | x_i)

(b)
I(x_i; y_j) = \log \frac{P(x_i | y_j)}{P(x_i)} = \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)} = \log \frac{1}{P(x_i)} + \log \frac{1}{P(y_j)} - \log \frac{1}{P(x_i, y_j)} = I(x_i) + I(y_j) - I(x_i, y_j)

Problem 3.10:

(a) For the geometric distribution P(X = k) = p(1-p)^{k-1}, k = 1, 2, ...:

H(X) = -\sum_{k=1}^{\infty} p(1-p)^{k-1} \log \left( p(1-p)^{k-1} \right)
= -p \log p \sum_{k=1}^{\infty} (1-p)^{k-1} - p \log(1-p) \sum_{k=1}^{\infty} (k-1)(1-p)^{k-1}
= -p \log p \cdot \frac{1}{p} - p \log(1-p) \cdot \frac{1-p}{p^2}
= -\log p - \frac{1-p}{p} \log(1-p)

(b) Clearly P(X = k | X > K) = 0 for k ≤ K. If k > K, then

P(X = k | X > K) = \frac{P(X = k, X > K)}{P(X > K)} = \frac{p(1-p)^{k-1}}{P(X > K)}

But

P(X > K) = \sum_{k=K+1}^{\infty} p(1-p)^{k-1} = p \left[ \sum_{k=1}^{\infty} (1-p)^{k-1} - \sum_{k=1}^{K} (1-p)^{k-1} \right] = p \left[ \frac{1}{p} - \frac{1 - (1-p)^K}{p} \right] = (1-p)^K

so that

P(X = k | X > K) = \frac{p(1-p)^{k-1}}{(1-p)^K}

If we let k = K + l with l = 1, 2, ..., then

P(X = k | X > K) = \frac{p(1-p)^{K}(1-p)^{l-1}}{(1-p)^K} = p(1-p)^{l-1}

that is, P(X = k | X > K) is geometrically distributed. Hence, using the result of part (a),

H(X | X > K) = -\sum_{l=1}^{\infty} p(1-p)^{l-1} \log \left( p(1-p)^{l-1} \right) = -\log p - \frac{1-p}{p} \log(1-p)

Problem 3.11:

(a) The marginal distribution P(x) is given by P(x) = \sum_y P(x, y). Hence,

H(X) = -\sum_x P(x) \log P(x) = -\sum_x \sum_y P(x, y) \log P(x) = -\sum_{x,y} P(x, y) \log P(x)

Similarly, H(Y) = -\sum_{x,y} P(x, y) \log P(y).

(b) Using the inequality \ln w ≤ w - 1 with w = \frac{P(x) P(y)}{P(x, y)}, we obtain

\ln \frac{P(x) P(y)}{P(x, y)} ≤ \frac{P(x) P(y)}{P(x, y)} - 1

Multiplying by P(x, y) and summing over x, y, we obtain

\sum_{x,y} P(x, y) \ln P(x) P(y) - \sum_{x,y} P(x, y) \ln P(x, y) ≤ \sum_{x,y} P(x) P(y) - \sum_{x,y} P(x, y) = 1 - 1 = 0

Hence,

H(X, Y) ≤ -\sum_{x,y} P(x, y) \ln P(x) P(y) = -\sum_{x,y} P(x, y) \left( \ln P(x) + \ln P(y) \right) = H(X) + H(Y)

Equality holds when \frac{P(x) P(y)}{P(x, y)} = 1, i.e. when X and Y are independent.

(c) H(X, Y) = H(X) + H(Y | X) = H(Y) + H(X | Y). Also, from part (b), H(X, Y) ≤ H(X) + H(Y). Combining the two relations,

H(Y) + H(X | Y) ≤ H(X) + H(Y)  ⟹  H(X | Y) ≤ H(X)

Suppose now that the previous relation holds with equality, i.e. H(X | Y) = H(X). Then H(X, Y) = H(Y) + H(X | Y) = H(X) + H(Y), so equality holds in part (b), which was shown to occur if and only if P(x, y) = P(x) P(y) for all x, y. This means that X and Y are independent.

Problem 3.12:

The marginal probabilities are given by

P(X = 0) = \sum_k P(X = 0, Y = k) = P(X = 0, Y = 0) + P(X = 0, Y = 1) = 2/3
P(X = 1) = \sum_k P(X = 1, Y = k) = P(X = 1, Y = 1) = 1/3
P(Y = 0) = \sum_k P(X = k, Y = 0) = P(X = 0, Y = 0) = 1/3
P(Y = 1) = \sum_k P(X = k, Y = 1) = P(X = 0, Y = 1) + P(X = 1, Y = 1) = 2/3

Hence,

H(X) = -\sum_i P_i \log P_i = -\left( \frac{1}{3} \log \frac{1}{3} + \frac{2}{3} \log \frac{2}{3} \right) = 0.9183
H(Y) = -\left( \frac{1}{3} \log \frac{1}{3} + \frac{2}{3} \log \frac{2}{3} \right) = 0.9183
H(X, Y) = -3 \cdot \frac{1}{3} \log \frac{1}{3} = 1.585
H(X | Y) = H(X, Y) - H(Y) = 1.585 - 0.9183 = 0.6667
H(Y | X) = H(X, Y) - H(X) = 1.585 - 0.9183 = 0.6667

Problem 3.13:

H = \lim_{n \to \infty} H(X_n | X_1, ..., X_{n-1})
= \lim_{n \to \infty} \left[ -\sum_{x_1, ..., x_n} P(x_1, ..., x_n) \log P(x_n | x_1, ..., x_{n-1}) \right]
= \lim_{n \to \infty} \left[ -\sum_{x_1, ..., x_n} P(x_1, ..., x_n) \log P(x_n | x_{n-1}) \right]
= \lim_{n \to \infty} \left[ -\sum_{x_{n-1}, x_n} P(x_{n-1}, x_n) \log P(x_n | x_{n-1}) \right]
= \lim_{n \to \infty} H(X_n | X_{n-1})

where the second step uses the Markov property. However, for a stationary process P(x_{n-1}, x_n) and P(x_n | x_{n-1}) are independent of n, so that

H = \lim_{n \to \infty} H(X_n | X_{n-1}) = H(X_n | X_{n-1})

Problem 3.14:

With Y = g(X),

H(X, Y) = H(X, g(X)) = H(X) + H(g(X) | X) = H(g(X)) + H(X | g(X))

But H(g(X) | X) = 0, since g(·) is deterministic. Therefore,

H(X) = H(g(X)) + H(X | g(X))

Since each term in the previous equation is non-negative, we obtain H(X) ≥ H(g(X)). Equality holds when H(X | g(X)) = 0. This means that the value g(X) uniquely determines X, or that g(·) is a one-to-one mapping.

Problem 3.15:

I(X; Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log \frac{P(x_i, y_j)}{P(x_i) P(y_j)}
= \sum_{i} \sum_{j} P(x_i, y_j) \log P(x_i, y_j) - \sum_{i} \sum_{j} P(x_i, y_j) \log P(x_i) - \sum_{i} \sum_{j} P(x_i, y_j) \log P(y_j)
= \sum_{i} \sum_{j} P(x_i, y_j) \log P(x_i, y_j) - \sum_{i} P(x_i) \log P(x_i) - \sum_{j} P(y_j) \log P(y_j)
= -H(X, Y) + H(X) + H(Y)

Problem 3.16:

H(X_1 X_2 \cdots X_n) = -\sum_{j_1=1}^{m_1} \sum_{j_2=1}^{m_2} \cdots \sum_{j_n=1}^{m_n} P(x_1, x_2, ..., x_n) \log P(x_1, x_2, ..., x_n)

Since the {x_i} are statistically independent, P(x_1, x_2, ..., x_n) = P(x_1) P(x_2) \cdots P(x_n) and

\sum_{j_2=1}^{m_2} \cdots \sum_{j_n=1}^{m_n} P(x_1) P(x_2) \cdots P(x_n) = P(x_1)

(similarly for the other x_i). Then

H(X_1 X_2 \cdots X_n) = -\sum_{j_1=1}^{m_1} \cdots \sum_{j_n=1}^{m_n} P(x_1) P(x_2) \cdots P(x_n) \log \left[ P(x_1) P(x_2) \cdots P(x_n) \right]
= -\sum_{j_1=1}^{m_1} P(x_1) \log P(x_1) - \sum_{j_2=1}^{m_2} P(x_2) \log P(x_2) - \cdots - \sum_{j_n=1}^{m_n} P(x_n) \log P(x_n)
= \sum_{i=1}^{n} H(X_i)
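A short numerical sketch, assuming the joint PMF reconstructed in Problem 3.12 (mass 1/3 on each of three (x, y) pairs), checks the identity of Problem 3.15 and the chain rule used above; the helper H is ours:

import numpy as np

P = np.array([[1/3, 1/3],      # rows: x = 0, 1 ; columns: y = 0, 1
              [0.0, 1/3]])

def H(p):
    # Entropy in bits of a probability vector or joint PMF array.
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

Hx, Hy, Hxy = H(P.sum(axis=1)), H(P.sum(axis=0)), H(P)
print(Hx, Hy, Hxy)             # 0.9183, 0.9183, 1.585 (values of Problem 3.12)
print(Hx + Hy - Hxy)           # I(X;Y), Problem 3.15 identity
print(Hxy - Hy)                # H(X|Y) = 0.6667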

Problem 3.17:

We consider an n-input, n-output channel. Since it is noiseless,

P(y_j | x_i) = 1 if i = j, and 0 if i ≠ j

Hence,

H(X | Y) = -\sum_{i=1}^{n} \sum_{j=1}^{n} P(x_i, y_j) \log P(x_i | y_j) = -\sum_{i=1}^{n} \sum_{j=1}^{n} P(y_j | x_i) P(x_i) \log P(x_i | y_j)

But it is also true that P(x_i | y_j) = 1 if i = j, and 0 if i ≠ j. Hence,

H(X | Y) = -\sum_{i=1}^{n} P(x_i) \log 1 = 0

Problem 3.18:

The conditional mutual information between x_3 and x_2 given x_1 is defined as

I(x_3; x_2 | x_1) = \log \frac{P(x_3, x_2 | x_1)}{P(x_3 | x_1) P(x_2 | x_1)} = \log \frac{P(x_3 | x_1 x_2)}{P(x_3 | x_1)}

Hence,

I(x_3; x_2 | x_1) = I(x_3 | x_1) - I(x_3 | x_1 x_2)

and

I(X_3; X_2 | X_1) = \sum_{x_1} \sum_{x_2} \sum_{x_3} P(x_1, x_2, x_3) \log \frac{P(x_3 | x_1 x_2)}{P(x_3 | x_1)}
= -\sum_{x_1, x_2, x_3} P(x_1, x_2, x_3) \log P(x_3 | x_1) + \sum_{x_1, x_2, x_3} P(x_1, x_2, x_3) \log P(x_3 | x_1 x_2)
= H(X_3 | X_1) - H(X_3 | X_1 X_2)

Since I(X_3; X_2 | X_1) ≥ 0, it follows that

H(X_3 | X_1) ≥ H(X_3 | X_1 X_2)

Problem 3.19:

Assume that a > 0. For the linear transformation Y = aX + b we know that

p_Y(y) = \frac{1}{a} p_X\left( \frac{y - b}{a} \right)

Hence,

H(Y) = -\int p_Y(y) \log p_Y(y)\, dy = -\int \frac{1}{a} p_X\left( \frac{y - b}{a} \right) \log \left[ \frac{1}{a} p_X\left( \frac{y - b}{a} \right) \right] dy

Let u = (y - b)/a. Then dy = a\, du and

H(Y) = -\int \frac{1}{a} p_X(u) \left[ \log p_X(u) - \log a \right] a\, du = -\int p_X(u) \log p_X(u)\, du + \int p_X(u) \log a\, du = H(X) + \log a

In a similar way we can prove that for a < 0, H(Y) = H(X) + \log(-a).

Problem 3.20:

The linear transformation produces the symbols y_i = a x_i + b, i = 1, 2, 3, with corresponding probabilities p_1 = 0.45, p_2 = 0.35, p_3 = 0.2. Since the {y_i} have the same probability distribution as the {x_i}, it follows that H(Y) = H(X). Hence, the entropy of a DMS is not affected by the linear transformation.

Problem 3.21:

(a) The figure below depicts the design of the Huffman code when encoding a single level at a time:

[Figure: Huffman code tree for the four levels a_1, a_2, a_3, a_4 with probabilities 0.3365, 0.3365, 0.1635, 0.1635.]

The average number of binary digits per source level is

\bar{R} = \sum_i P(a_i) n_i = 1.9905 bits/level

The entropy of the source is

H(X) = -\sum_i P(a_i) \log P(a_i) = 1.9118 bits/level

(b) Encoding two levels at a time:

[Figure: Huffman code tree for the 16 level pairs a_i a_j, with pair probabilities P(a_i) P(a_j).]

The average number of binary digits per level pair is

\bar{R}_2 = \sum_k P(a_k) n_k = 3.874 bits/pair

resulting in an average of \bar{R} = 1.937 bits/level.

(c) In general,

H(X) ≤ \frac{\bar{R}_J}{J} < H(X) + \frac{1}{J}

As J → ∞, \bar{R}_J / J → H(X) = 1.9118 bits/level.
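The averages above can be reproduced with a short Python sketch, assuming the level probabilities read from the solution (0.3365, 0.3365, 0.1635, 0.1635). It uses the standard fact that the expected Huffman codeword length equals the sum of the weights of all merged (internal) nodes; the helper name is ours.

import heapq

def huffman_average_length(probs):
    # Expected binary-Huffman codeword length: accumulate the weight of every merge.
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        heapq.heappush(heap, a + b)
        total += a + b
    return total

p = [0.3365, 0.3365, 0.1635, 0.1635]
print(huffman_average_length(p))                   # ~1.99 bits/level
pairs = [pi * pj for pi in p for pj in p]          # second extension: 16 level pairs
print(huffman_average_length(pairs) / 2)           # ~1.937 bits/level, closer to H(X)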

Problem 3.22:

First, we need the state probabilities P(x_i), i = 1, 2. For stationary Markov processes these can be found, in general, by solving the system

P \Pi = P,   \sum_i P_i = 1

where P is the state probability vector and \Pi is the transition matrix, \Pi_{ij} = P(x_j | x_i). However, in the case of a two-state Markov source, we can find the P(x_i) in a simpler way by noting that the probability of a transition from state 1 to state 2 equals the probability of a transition from state 2 to state 1 (so that the probability of each state remains the same). Hence,

P(x_2 | x_1) P(x_1) = P(x_1 | x_2) P(x_2)  ⟹  0.2 P(x_1) = 0.3 P(x_2)  ⟹  P(x_1) = 0.6,  P(x_2) = 0.4

Then

H(X) = P(x_1) \left[ -P(x_1|x_1) \log P(x_1|x_1) - P(x_2|x_1) \log P(x_2|x_1) \right] + P(x_2) \left[ -P(x_1|x_2) \log P(x_1|x_2) - P(x_2|x_2) \log P(x_2|x_2) \right]
= 0.6 \left[ -0.8 \log 0.8 - 0.2 \log 0.2 \right] + 0.4 \left[ -0.3 \log 0.3 - 0.7 \log 0.7 \right] = 0.7857 bits/letter

If the source were a binary DMS with output letter probabilities P(x_1) = 0.6, P(x_2) = 0.4, its entropy would be

H_{DMS}(X) = -0.6 \log 0.6 - 0.4 \log 0.4 = 0.971 bits/letter

We see that the entropy of the Markov source is smaller, since the memory inherent in it reduces the information content of each output.

Problem 3.23:

(a)
H(X) = -(0.05 \log 0.05 + 0.1 \log 0.1 + 0.1 \log 0.1 + 0.05 \log 0.05 + 0.15 \log 0.15 + 0.25 \log 0.25 + 0.3 \log 0.3) = 2.528 bits

(b) After quantization, the new alphabet is B = {-4, 0, 4} and the corresponding symbol probabilities are given by

P(-4) = P(-5) + P(-3) = 0.05 + 0.1 = 0.15
P(0) = P(-1) + P(0) + P(1) = 0.1 + 0.05 + 0.15 = 0.3
P(4) = P(3) + P(5) = 0.25 + 0.3 = 0.55

Hence, H(Q(X)) = 1.406 bits. As observed, quantization decreases the entropy of the source.
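A small sketch, assuming the symbol probabilities as read from the solution above, reproduces both entropies and the quantized PMF; the quantization rule and helper are ours:

import numpy as np

def H(p):
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log2(p)))

p = {-5: 0.05, -3: 0.1, -1: 0.1, 0: 0.05, 1: 0.15, 3: 0.25, 5: 0.3}
print(H(list(p.values())))                 # ~2.528 bits

q = {}
for x, px in p.items():                    # map to B = {-4, 0, 4}
    xq = -4 if x < -1 else (4 if x > 1 else 0)
    q[xq] = q.get(xq, 0.0) + px
print(q)                                   # {-4: 0.15, 0: 0.3, 4: 0.55}
print(H(list(q.values())))                 # ~1.406 bits: quantization reduces entropy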

Problem 3.24:

The figure below depicts the design of a ternary Huffman code for the seven-symbol source.

[Figure: ternary Huffman code tree for the symbol probabilities 0.22, 0.18, 0.17, 0.15, 0.13, 0.10, 0.05.]

The average codeword length is

\bar{R}(X) = \sum_x P(x) n_x = 0.22(1) + 2(0.18 + 0.17 + 0.15 + 0.13 + 0.10 + 0.05) = 1.78 (ternary symbols/output)

For a fair comparison of the average codeword length with the entropy of the source, we compute the latter with logarithms in base 3. Hence,

H_3(X) = -\sum_x P(x) \log_3 P(x) = 1.7047

As expected, H_3(X) ≤ \bar{R}(X).

Problem 3.25:

Parsing the sequence by the rules of the Lempel-Ziv coding scheme, we obtain 18 phrases; the dictionary is shown in the table below. For each phrase we need 5 bits (to index the dictionary location of its prefix) plus an extra bit to represent the new source output.

[Table: Lempel-Ziv dictionary, locations 1 through 18, with the corresponding dictionary contents and codewords.]

Problem 3.26:

(a) For the exponential pdf p(x) = \frac{1}{\lambda} e^{-x/\lambda}, x ≥ 0:

H(X) = -\int_0^{\infty} \frac{1}{\lambda} e^{-x/\lambda} \ln\left( \frac{1}{\lambda} e^{-x/\lambda} \right) dx
= \ln \lambda \int_0^{\infty} \frac{1}{\lambda} e^{-x/\lambda}\, dx + \frac{1}{\lambda} \int_0^{\infty} x \frac{1}{\lambda} e^{-x/\lambda}\, dx
= \ln \lambda + \frac{1}{\lambda} \cdot \lambda = 1 + \ln \lambda

where we have used \int_0^{\infty} \frac{1}{\lambda} e^{-x/\lambda}\, dx = 1 and E[X] = \int_0^{\infty} x \frac{1}{\lambda} e^{-x/\lambda}\, dx = \lambda.

(b) For the Laplacian pdf p(x) = \frac{1}{2\lambda} e^{-|x|/\lambda}:

H(X) = -\int_{-\infty}^{\infty} \frac{1}{2\lambda} e^{-|x|/\lambda} \ln\left( \frac{1}{2\lambda} e^{-|x|/\lambda} \right) dx
= \ln(2\lambda) \int_{-\infty}^{\infty} \frac{1}{2\lambda} e^{-|x|/\lambda}\, dx + \frac{1}{\lambda} \int_{-\infty}^{\infty} |x| \frac{1}{2\lambda} e^{-|x|/\lambda}\, dx

= \ln(2\lambda) + \frac{1}{\lambda} \cdot \lambda = 1 + \ln(2\lambda)

(The differential entropies in this problem are in nats.)

(c) For the triangular pdf p(x) = \frac{x + \lambda}{\lambda^2} for -\lambda ≤ x ≤ 0 and p(x) = \frac{\lambda - x}{\lambda^2} for 0 ≤ x ≤ \lambda:

H(X) = -\int_{-\lambda}^{0} \frac{x + \lambda}{\lambda^2} \ln \frac{x + \lambda}{\lambda^2}\, dx - \int_{0}^{\lambda} \frac{\lambda - x}{\lambda^2} \ln \frac{\lambda - x}{\lambda^2}\, dx
= \ln(\lambda^2) - \frac{2}{\lambda^2} \int_0^{\lambda} z \ln z\, dz
= \ln(\lambda^2) - \frac{2}{\lambda^2} \left[ \frac{z^2}{2} \ln z - \frac{z^2}{4} \right]_0^{\lambda}
= \ln(\lambda^2) - \ln \lambda + \frac{1}{2} = \ln \lambda + \frac{1}{2}

(with the substitution z = x + \lambda in the first integral and z = \lambda - x in the second).

Problem 3.27:

(a) Since R(D) = \log \frac{\lambda}{D} and D = \lambda/2, we obtain R(D) = \log \frac{\lambda}{\lambda/2} = \log 2 = 1 bit/sample.

(b) The following figure depicts R(D) for \lambda = 0.1, 0.2 and 0.3. As observed from the figure, an increase of the parameter \lambda increases the required rate for a given distortion.

[Figure: R(D) versus the distortion D for \lambda = 0.1, 0.2, 0.3.]
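The closed forms of Problem 3.26 can be checked by direct numerical integration. A minimal sketch (the value of λ is arbitrary):

import numpy as np

lam = 0.7
x = np.linspace(1e-9, 60 * lam, 1_000_001)
p_exp = np.exp(-x / lam) / lam
h_exp = -np.trapz(p_exp * np.log(p_exp), x)
print(h_exp, 1 + np.log(lam))              # exponential: matches 1 + ln(lambda) nats

y = np.linspace(-60 * lam, 60 * lam, 1_000_001)
p_lap = np.exp(-np.abs(y) / lam) / (2 * lam)
h_lap = -np.trapz(p_lap * np.log(p_lap), y)
print(h_lap, 1 + np.log(2 * lam))          # Laplacian: matches 1 + ln(2*lambda) nats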

Problem 3.28:

(a) For a Gaussian random variable of zero mean and variance \sigma^2, the rate-distortion function is R(D) = \frac{1}{2} \log \frac{\sigma^2}{D}. Hence the upper bound is satisfied with equality. For the lower bound, recall that H(X) = \frac{1}{2} \log(2\pi e \sigma^2). Thus,

H(X) - \frac{1}{2} \log(2\pi e D) = \frac{1}{2} \log(2\pi e \sigma^2) - \frac{1}{2} \log(2\pi e D) = \frac{1}{2} \log \frac{2\pi e \sigma^2}{2\pi e D} = R(D)

As observed, the upper and the lower bounds coincide.

(b) The differential entropy of a Laplacian source with parameter \lambda is H(X) = 1 + \ln(2\lambda) nats. The variance of the Laplacian distribution is

\sigma^2 = \int_{-\infty}^{\infty} x^2 \frac{1}{2\lambda} e^{-|x|/\lambda}\, dx = 2\lambda^2

Hence, with \sigma^2 = 1, we obtain \lambda = 1/\sqrt{2} and H(X) = 1 + \ln(2\lambda) = 1 + \ln\sqrt{2} = 1.3466 nats/symbol = 1.943 bits/symbol. A plot of the lower and upper bounds of R(D) is given in the next figure.

[Figure: upper and lower bounds on R(D) versus the distortion D for the unit-variance Laplacian source.]

(c) The variance of the triangular distribution is given by

\sigma^2 = \int_{-\lambda}^{0} x^2 \frac{x + \lambda}{\lambda^2}\, dx + \int_{0}^{\lambda} x^2 \frac{\lambda - x}{\lambda^2}\, dx = \frac{1}{\lambda^2} \left( \frac{x^4}{4} + \frac{\lambda x^3}{3} \right)\Big|_{-\lambda}^{0} + \frac{1}{\lambda^2} \left( -\frac{x^4}{4} + \frac{\lambda x^3}{3} \right)\Big|_{0}^{\lambda} = \frac{\lambda^2}{6}

Hence, with \sigma^2 = 1, we obtain \lambda = \sqrt{6} and H(X) = \ln \lambda + \frac{1}{2} = \ln\sqrt{6} + \frac{1}{2} = 1.396 nats ≈ 2.01 bits/source output. A plot of the lower and upper bounds of R(D) is given in the next figure.

[Figure: upper and lower bounds on R(D) versus the distortion D for the unit-variance triangular source.]

Problem 3.29:

\sigma_X^2 = E[X^2(t)] = R_X(\tau)\big|_{\tau = 0} = \frac{A^2}{2}

Hence,

SQNR = \frac{3 \cdot 4^{\nu} \sigma_X^2}{x_{max}^2} = \frac{3 \cdot 4^{\nu} A^2/2}{A^2} = \frac{3}{2} \cdot 4^{\nu}

With SQNR = 60 dB, we obtain

10 \log_{10}\left( \frac{3 \cdot 4^{q}}{2} \right) = 60  ⟹  q = 9.6733

The smallest integer larger than q is 10. Hence, the required number of bits is \nu = 10, i.e. 2^{10} = 1024 quantization levels.

Problem 3.30:

(a)
H(X | G) = -\int\!\!\int p(x, g) \log p(x | g)\, dx\, dg

But X and G are independent, so p(x, g) = p(x) p(g) and p(x | g) = p(x). Hence,

H(X | G) = \int p(g) \left[ -\int p(x) \log p(x)\, dx \right] dg = \int p(g) H(X)\, dg = H(X) = \frac{1}{2} \log(2\pi e \sigma_x^2)

where the last equality follows from the Gaussian pdf of X.

(b) I(X; Y) = H(Y) - H(Y | X). Since Y is the sum of two independent zero-mean Gaussian random variables, it is also a zero-mean Gaussian random variable with variance \sigma_y^2 = \sigma_x^2 + \sigma_n^2. Hence H(Y) = \frac{1}{2} \log\left( 2\pi e (\sigma_x^2 + \sigma_n^2) \right). Also, since y = x + g,

p(y | x) = p_g(y - x) = \frac{1}{\sqrt{2\pi}\sigma_n} e^{-(y-x)^2 / 2\sigma_n^2}

Hence,

H(Y | X) = -\int\!\!\int p(x, y) \log p(y | x)\, dx\, dy
= -\log e \int p(x) \int p_g(y - x) \ln\left( \frac{1}{\sqrt{2\pi}\sigma_n} e^{-(y-x)^2 / 2\sigma_n^2} \right) dy\, dx
= \log e \int p(x) \left[ \ln(\sqrt{2\pi}\sigma_n) \int p_g(y - x)\, dy + \frac{1}{2\sigma_n^2} \int (y - x)^2 p_g(y - x)\, dy \right] dx
= \log e \left[ \ln(\sqrt{2\pi}\sigma_n) + \frac{1}{2} \right] \int p(x)\, dx = \frac{1}{2} \log\left( 2\pi e \sigma_n^2 \right) = H(G)

where we have used \int p_g(y - x)\, dy = 1 and \int (y - x)^2 p_g(y - x)\, dy = E[G^2] = \sigma_n^2. From H(Y) and H(Y | X),

I(X; Y) = H(Y) - H(Y | X) = \frac{1}{2} \log\left( 2\pi e (\sigma_x^2 + \sigma_n^2) \right) - \frac{1}{2} \log\left( 2\pi e \sigma_n^2 \right) = \frac{1}{2} \log\left( 1 + \frac{\sigma_x^2}{\sigma_n^2} \right)
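A minimal numerical sketch of the result, with illustrative signal and noise variances of our choosing, confirms that computing I(X;Y) as h(Y) - h(G) via the Gaussian differential-entropy formula agrees with the closed form:

import numpy as np

def h_gauss_bits(var):
    # Differential entropy (bits) of a zero-mean Gaussian with the given variance.
    return 0.5 * np.log2(2 * np.pi * np.e * var)

sx2, sn2 = 4.0, 1.0                                       # illustrative variances
I_formula = 0.5 * np.log2(1 + sx2 / sn2)
I_entropy = h_gauss_bits(sx2 + sn2) - h_gauss_bits(sn2)   # h(Y) - h(G)
print(I_formula, I_entropy)                               # both ~1.161 bits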

Problem 3.31:

[Figure: ternary Huffman code tree for the letters x_1, ..., x_9 with probabilities 0.25, 0.2, 0.15, 0.12, 0.1, 0.08, 0.05, 0.03, 0.02, showing the resulting codewords.]

\bar{R} = \sum_i P(x_i) n_i = 1.85 ternary symbols/letter

Problem 3.32:

Given (n_1, n_2, n_3, n_4) = (1, 2, 2, 3), we have

\sum_{k=1}^{4} 2^{-n_k} = \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} = \frac{9}{8} > 1

Since the Kraft inequality is not satisfied, a binary code with codeword lengths (1, 2, 2, 3) that satisfies the prefix condition does not exist.

Problem 3.33:

\sum_{k=1}^{2^n} 2^{-n_k} = \sum_{k=1}^{2^n} 2^{-n} = 2^n \cdot 2^{-n} = 1

Therefore the Kraft inequality is satisfied.
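A two-line check of the Kraft sums used in Problems 3.32 and 3.33 (the helper kraft_sum is ours):

def kraft_sum(lengths, D=2):
    # Kraft sum for codeword lengths over a D-ary code alphabet.
    return sum(D ** (-n) for n in lengths)

print(kraft_sum([1, 2, 2, 3]))        # 1.125 = 9/8 > 1: no binary prefix code exists
n = 5
print(kraft_sum([n] * 2 ** n))        # 2^n codewords of length n: sum = 1.0, Kraft satisfied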

Problem 3.34:

The n-dimensional Gaussian pdf is

p(X) = \frac{1}{(2\pi)^{n/2} |M|^{1/2}} e^{-X' M^{-1} X / 2}

so that

-\log p(X) = \frac{1}{2} \log\left[ (2\pi)^n |M| \right] + (\log e) \frac{X' M^{-1} X}{2}

But

\int \cdots \int (\log e) \frac{X' M^{-1} X}{2}\, p(X)\, dX = \frac{\log e}{2} E[X' M^{-1} X] = \frac{n}{2} \log e

Hence,

H(X) = -\int \cdots \int p(X) \log p(X)\, dX = \frac{1}{2} \log\left[ (2\pi)^n |M| \right] + \frac{1}{2} \log e^n = \frac{1}{2} \log\left[ (2\pi e)^n |M| \right]

Problem 3.35:

R(D) = 1 + D \log D + (1 - D) \log(1 - D),   where D = P_e and 0 ≤ D ≤ 1/2

[Figure: R(D) versus D for the binary source, decreasing from 1 at D = 0 to 0 at D = 0.5.]
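A short sketch evaluates the Hamming-distortion rate-distortion functions of Problems 3.35 and 3.36 (the M-ary formula as reconstructed here) and checks the endpoints R(0) = log M and R(D_max) = 0 for D_max = (M-1)/M:

import math

def R(D, M=2):
    # R(D) for a uniform M-ary source under Hamming distortion, 0 <= D <= (M-1)/M.
    r = math.log2(M)
    if D > 0:
        r += D * math.log2(D / (M - 1))
    if D < 1:
        r += (1 - D) * math.log2(1 - D)
    return r

for M in (2, 4, 8):
    print(M, R(0.0, M), R((M - 1) / M, M))   # log2(M) and 0, as in the plots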

Problem 3.36:

R(D) = \log M + D \log \frac{D}{M - 1} + (1 - D) \log(1 - D)

[Figure: R(D) versus D for M = 2, 4 and 8.]

Problem 3.37:

Let W = P' P. Then

d_W(X, \tilde{X}) = (X - \tilde{X})' W (X - \tilde{X}) = (X - \tilde{X})' P' P (X - \tilde{X}) = \left( P(X - \tilde{X}) \right)' P(X - \tilde{X}) = \frac{1}{n} (Y - \tilde{Y})'(Y - \tilde{Y})

where, by definition, Y = \sqrt{n}\, P X and \tilde{Y} = \sqrt{n}\, P \tilde{X}. Hence, d_W(X, \tilde{X}) = d_2(Y, \tilde{Y}).

Problem 3.38:

(a) The first-order predictor is \hat{x}(n) = a_1 x(n-1). The coefficient a_1 that minimizes the MSE is found from the orthogonality of the prediction error to the prediction data:

E[e(n) x(n-1)] = E[(x(n) - a_1 x(n-1)) x(n-1)] = 0  ⟹  \phi(1) - a_1 \phi(0) = 0  ⟹  a_1 = \phi(1)/\phi(0) = 1/2

The minimum MSE is

\epsilon_1 = \phi(0)\left( 1 - a_1^2 \right) = 3/4

(b) For the second-order predictor \hat{x}(n) = a_1 x(n-1) + a_2 x(n-2), following the Levinson-Durbin algorithm,

a_2^{(2)} = \frac{\phi(2) - a_1^{(1)} \phi(1)}{\epsilon_1} = \frac{1/4}{3/4} = \frac{1}{3},   a_1^{(2)} = a_1^{(1)} - a_2^{(2)} a_1^{(1)} = \frac{1}{3}

The minimum MSE is

\epsilon_2 = \epsilon_1 \left( 1 - (a_2^{(2)})^2 \right) = \frac{3}{4} \cdot \frac{8}{9} = \frac{2}{3}
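A small Levinson-Durbin sketch reproduces these predictor coefficients, assuming the autocorrelation values implied by the solution, \phi(0) = 1 and \phi(1) = \phi(2) = 1/2 (the function below is a generic implementation, not code from the text):

def levinson_durbin(phi, order):
    # Returns predictor coefficients and minimum MSE at each order, printing the recursion.
    a, eps = [], phi[0]
    for m in range(1, order + 1):
        k = (phi[m] - sum(a[i] * phi[m - 1 - i] for i in range(m - 1))) / eps
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        eps *= (1 - k * k)
        print("order", m, "coefficients", a, "MMSE", eps)
    return a, eps

levinson_durbin([1.0, 0.5, 0.5], 2)
# order 1: [0.5],           MMSE 0.75
# order 2: [1/3, 1/3],      MMSE 2/3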

Problem 3.39:

p(x_1, x_2) = \frac{5}{7ab} for (x_1, x_2) ∈ C, and 0 otherwise.

If x_1 and x_2 are quantized separately, using uniform intervals of length 1, the numbers of levels needed are L_1 = 2a and L_2 = b, and the number of bits is

R_x = R_1 + R_2 = \log L_1 + \log L_2 = \log 2ab

By using vector quantization with squares of unit area, we have L_x = \frac{7ab}{5} levels and \tilde{R}_x = \log L_x = \log \frac{7ab}{5} bits. The difference in bit rate is

R_x - \tilde{R}_x = \log 2ab - \log \frac{7ab}{5} = \log \frac{10}{7} = 0.515 bits/output sample pair

for all a, b > 0.

Problem 3.40:

(a) The area between the two squares is 4 \cdot 4 - 2 \cdot 2 = 12. Hence, p_{X,Y}(x, y) = \frac{1}{12} on this region. The marginal probability p_X(x) is given by p_X(x) = \int p_{X,Y}(x, y)\, dy. If -2 ≤ X < -1, then

p_X(x) = \int_{-2}^{2} \frac{1}{12}\, dy = \frac{1}{3}

If -1 ≤ X < 1, then

p_X(x) = \int_{-2}^{-1} \frac{1}{12}\, dy + \int_{1}^{2} \frac{1}{12}\, dy = \frac{1}{6}

Finally, if 1 ≤ X ≤ 2, then

p_X(x) = \int_{-2}^{2} \frac{1}{12}\, dy = \frac{1}{3}

The next figure depicts the marginal distribution p_X(x).

[Figure: p_X(x), equal to 1/3 for 1 ≤ |x| ≤ 2 and 1/6 for |x| < 1.]

Similarly we find that

p_Y(y) = 1/3 for -2 ≤ y < -1,   1/6 for -1 ≤ y < 1,   1/3 for 1 ≤ y ≤ 2

(b) The quantization levels \hat{x}_1, \hat{x}_2, \hat{x}_3, \hat{x}_4 are set to -\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}, \frac{3}{2} respectively. The resulting distortion is

D_X = \int_{-2}^{-1} \left( x + \tfrac{3}{2} \right)^2 p_X(x)\, dx + \int_{-1}^{0} \left( x + \tfrac{1}{2} \right)^2 p_X(x)\, dx + (symmetric terms for 0 ≤ x ≤ 2)
= 2 \left[ \frac{1}{3} \int_{-2}^{-1} \left( x^2 + 3x + \tfrac{9}{4} \right) dx + \frac{1}{6} \int_{-1}^{0} \left( x^2 + x + \tfrac{1}{4} \right) dx \right]
= 2 \left( \frac{1}{36} + \frac{1}{72} \right) = \frac{1}{12}

By symmetry, D_Y = 1/12 as well. The total distortion is

D_{total} = D_X + D_Y = \frac{1}{12} + \frac{1}{12} = \frac{1}{6}

whereas the resulting number of bits per (X, Y) pair is

R = R_X + R_Y = \log 4 + \log 4 = 4

(c) Suppose that we divide the region over which p(x, y) ≠ 0 into L equal subregions. The case of L = 4 is depicted in the next figure.

[Figure: the region between the two squares divided into L = 4 equal subregions.]

For each subregion the quantization output vector (\hat{x}, \hat{y}) is the centroid of the corresponding rectangle. Since each subregion has the same shape (uniform quantization), a rectangle of width 1 and length 12/L, the distortion of the vector quantizer is

D = L \int\!\!\int_{\text{one subregion}} \frac{1}{12} \left[ \left( x - \tfrac{1}{2} \right)^2 + \left( y - \tfrac{6}{L} \right)^2 \right] dx\, dy = \frac{1}{12} + \frac{12}{L^2}

If we set D = \frac{1}{6}, we obtain

\frac{1}{12} + \frac{12}{L^2} = \frac{1}{6}  ⟹  L^2 = 144  ⟹  L = 12

Thus, we have to divide the region over which p(x, y) ≠ 0 into 12 equal subregions in order to achieve the same distortion. In this case the resulting number of bits per source output pair (X, Y) is R = \log 12 = 3.585.
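A brief numerical check of Problem 3.40, under the reconstruction above: the per-component distortion of the four-level scalar quantizer is integrated against the piecewise marginal, and the vector-quantizer expression D(L) = 1/12 + 12/L^2 is evaluated at L = 12.

import numpy as np

x = np.linspace(-2, 2, 400_001)
pX = np.where(np.abs(x) >= 1, 1/3, 1/6)               # marginal pdf from part (a)
xq = np.clip(np.round(x - 0.5) + 0.5, -1.5, 1.5)      # nearest level in {-1.5, -0.5, 0.5, 1.5}
print(np.trapz((x - xq) ** 2 * pX, x))                # ~1/12 per component, so 1/6 in total

L = 12
print(1/12 + 12 / L**2)                               # 1/6: same total distortion at L = 12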

Problem 3.41:

(a) The joint probability density function is p_{XY}(x, y) = \frac{1}{(2\sqrt{2})^2} = \frac{1}{8} on the square region with vertices (±2, 0) and (0, ±2). The marginal distribution p_X(x) is p_X(x) = \int p_{XY}(x, y)\, dy. If -2 ≤ x ≤ 0, then

p_X(x) = \int_{-x-2}^{x+2} \frac{1}{8}\, dy = \frac{1}{8} y \Big|_{-x-2}^{x+2} = \frac{x + 2}{4}

If 0 ≤ x ≤ 2, then

p_X(x) = \int_{x-2}^{-x+2} \frac{1}{8}\, dy = \frac{1}{8} y \Big|_{x-2}^{-x+2} = \frac{-x + 2}{4}

The next figure depicts p_X(x).

[Figure: the triangular marginal pdf p_X(x), rising linearly from 0 at x = -2 to 1/2 at x = 0 and falling back to 0 at x = 2.]

From the symmetry of the problem we have

p_Y(y) = \frac{y + 2}{4} for -2 ≤ y < 0,   \frac{-y + 2}{4} for 0 ≤ y ≤ 2

(b)
D_X = \int_{-2}^{-1} \left( x + \tfrac{3}{2} \right)^2 p_X(x)\, dx + \int_{-1}^{0} \left( x + \tfrac{1}{2} \right)^2 p_X(x)\, dx + (symmetric terms for 0 ≤ x ≤ 2)
= 2 \left[ \int_{-2}^{-1} \left( x + \tfrac{3}{2} \right)^2 \frac{x + 2}{4}\, dx + \int_{-1}^{0} \left( x + \tfrac{1}{2} \right)^2 \frac{x + 2}{4}\, dx \right]
= 2 \left( \frac{1}{96} + \frac{1}{32} \right) = \frac{1}{12}

The total distortion is

D_{total} = D_X + D_Y = \frac{1}{12} + \frac{1}{12} = \frac{1}{6}

whereas the required number of bits per source output pair is

R = R_X + R_Y = \log 4 + \log 4 = 4

(c) We divide the square over which p(x, y) ≠ 0 into 4 × 4 = 16 equal square regions aligned with its sides; each has area 1/2 (side 1/\sqrt{2}). The resulting distortion is

D = 16 \int\!\!\int_{\text{one cell}} \frac{1}{8} \left[ (x - \hat{x})^2 + (y - \hat{y})^2 \right] dx\, dy = 4 \int\!\!\int_{\text{one cell}} (x - \hat{x})^2\, dx\, dy = \frac{1}{12}

Hence, using vector quantization at the same rate (R = \log 16 = 4) we obtain half the distortion.
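The comparison in Problem 3.41 can be checked by Monte Carlo. The sketch below assumes the 16 vector-quantizer cells are the uniform 4 × 4 grid in coordinates rotated by 45 degrees (i.e. aligned with the sides of the square support), which is how the 16 equal squares of area 1/2 tile the region; the helper q4 and the sample size are ours.

import numpy as np

rng = np.random.default_rng(0)
n = 500_000
pts = rng.uniform(-2, 2, size=(4 * n, 2))
pts = pts[np.abs(pts[:, 0]) + np.abs(pts[:, 1]) <= 2][:n]     # uniform on |x|+|y| <= 2

def q4(u, width):
    # Uniform 4-level quantizer on [-2*width, 2*width] with cell size `width`.
    idx = np.clip(np.floor(u / width), -2, 1)
    return (idx + 0.5) * width

# Scalar quantization of each coordinate: cell size 1, levels {-1.5, -0.5, 0.5, 1.5}.
xs, ys = q4(pts[:, 0], 1.0), q4(pts[:, 1], 1.0)
print(np.mean((pts[:, 0] - xs) ** 2 + (pts[:, 1] - ys) ** 2))   # ~1/6

# Vector quantization: a 4x4 grid of squares of side 1/sqrt(2) in rotated coordinates.
Rot = np.array([[1, 1], [-1, 1]]) / np.sqrt(2)                  # 45-degree rotation
uv = pts @ Rot.T
uvq = q4(uv, np.sqrt(2) / 2)
print(np.mean(np.sum((uv - uvq) ** 2, axis=1)))                 # ~1/12: half the distortion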