Solutions to Homework Set #3 Channel and Source coding

1. Rates
(a) Channel coding rate: Assume you are sending 4 different messages using 2 usages of a channel. What is the rate (in bits per channel use) at which you send?
(b) Source coding rate: Assume you have a file with 10^6 ASCII characters, where the alphabet of ASCII characters is of size 256, so that each ASCII character is represented by 8 bits (binary alphabet before compression). After compressing the file we get 4·10^6 bits. What is the compression rate?

Solution: Rates.
(a) R = log(4)/2 = 1 bit per channel use.
(b) R = 4·10^6 / (10^6 · log 256) = 4/8 = 0.5 compressed bits per source bit.
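As a quick numerical sanity check of both rates, here is a minimal sketch (the message count, channel uses, file size and the 8-bit representation are the values given in the problem):

```python
import math

# (a) channel coding rate: log2(#messages) per channel use
num_messages, channel_uses = 4, 2
R_channel = math.log2(num_messages) / channel_uses           # = 1.0 bit per use

# (b) compression rate: compressed bits per uncompressed bit
num_chars = 10**6
bits_per_char = math.log2(256)                               # = 8 bits per ASCII character
compressed_bits = 4 * 10**6
R_compress = compressed_bits / (num_chars * bits_per_char)   # = 0.5

print(R_channel, R_compress)
```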

2. Preprocessing the output. One is given a communication channel with transition probabilities p(y|x) and channel capacity C = max_{p(x)} I(X;Y). A helpful statistician preprocesses the output by forming Ỹ = g(Y), yielding a channel p(ỹ|x). He claims that this will strictly improve the capacity.
(a) Show that he is wrong.
(b) Under what conditions does he not strictly decrease the capacity?

Solution: Preprocessing the output.
(a) The statistician calculates Ỹ = g(Y). Since X → Y → Ỹ forms a Markov chain, we can apply the data processing inequality. Hence for every distribution on X,

I(X;Y) ≥ I(X;Ỹ).

Let p̃(x) be the distribution on X that maximizes I(X;Ỹ). Then

C = max_{p(x)} I(X;Y) ≥ I(X;Y)|_{p̃(x)} ≥ I(X;Ỹ)|_{p̃(x)} = max_{p(x)} I(X;Ỹ) = C̃.

Thus, the helpful suggestion is wrong and processing the output does not increase capacity.

(b) We have equality (no decrease in capacity) in the above sequence of inequalities only if we have equality in the data processing inequality, i.e., for the distribution p̃(x) that maximizes I(X;Ỹ), X → Ỹ → Y also forms a Markov chain. Thus, Ỹ should be a sufficient statistic.

3. The Z channel. The Z-channel has binary input and output alphabets and transition probabilities p(y|x) given by the following matrix:

Q = [ 1     0
      1/2   1/2 ],    x, y ∈ {0,1}.

Find the capacity of the Z-channel and the maximizing input probability distribution.

Solution: The Z channel. First we express I(X;Y), the mutual information between the input and output of the Z-channel, as a function of α = Pr(X = 1):

H(Y|X) = Pr(X = 0)·0 + Pr(X = 1)·1 = α
H(Y) = H_b(Pr(Y = 1)) = H_b(α/2)
I(X;Y) = H(Y) − H(Y|X) = H_b(α/2) − α.

Since I(X;Y) is strictly concave in α (why?) and I(X;Y) = 0 when α = 0 and α = 1, the maximum mutual information is obtained for some value of α such that 0 < α < 1.

Using elementary calculus, we determine that

d/dα I(X;Y) = (1/2) log( (1 − α/2) / (α/2) ) − 1,

which is equal to zero for α = 2/5. (It is reasonable that Pr(X = 1) < 1/2, since X = 1 is the noisy input to the channel.) So the capacity of the Z-channel in bits is C = H_b(1/5) − 2/5 ≈ 0.722 − 0.4 = 0.322.
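A short numerical verification of this maximization, as a sketch (only the crossover probability 1/2 from the transition matrix is assumed):

```python
import math

def hb(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mi_z(alpha, eps=0.5):
    """I(X;Y) = H_b(alpha*(1-eps)) - alpha*H_b(eps) for the Z-channel."""
    return hb(alpha * (1 - eps)) - alpha * hb(eps)

# grid search over the input distribution Pr(X = 1) = alpha
value, alpha_star = max((mi_z(a / 10000), a / 10000) for a in range(10001))
print(value, alpha_star)   # ≈ 0.322, attained at alpha = 0.4 = 2/5
```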

4. Using two channels at once. Consider two discrete memoryless channels (X_1, p(y_1|x_1), Y_1) and (X_2, p(y_2|x_2), Y_2) with capacities C_1 and C_2 respectively. A new channel (X_1 × X_2, p(y_1|x_1)·p(y_2|x_2), Y_1 × Y_2) is formed in which x_1 ∈ X_1 and x_2 ∈ X_2 are sent simultaneously, resulting in y_1, y_2. Find the capacity of this channel.

Solution: Using two channels at once. To find the capacity of the product channel (X_1 × X_2, p(y_1,y_2|x_1,x_2), Y_1 × Y_2), we have to find the distribution p(x_1,x_2) on the input alphabet X_1 × X_2 that maximizes I(X_1,X_2; Y_1,Y_2). Since the transition probabilities are given as p(y_1,y_2|x_1,x_2) = p(y_1|x_1) p(y_2|x_2),

p(x_1,x_2,y_1,y_2) = p(x_1,x_2) p(y_1,y_2|x_1,x_2) = p(x_1,x_2) p(y_1|x_1) p(y_2|x_2).

Therefore, Y_1 → X_1 → X_2 → Y_2 forms a Markov chain and

I(X_1,X_2; Y_1,Y_2) = H(Y_1,Y_2) − H(Y_1,Y_2|X_1,X_2)
= H(Y_1,Y_2) − H(Y_1|X_1,X_2) − H(Y_2|X_1,X_2)      (1)
= H(Y_1,Y_2) − H(Y_1|X_1) − H(Y_2|X_2)              (2)
≤ H(Y_1) + H(Y_2) − H(Y_1|X_1) − H(Y_2|X_2)         (3)
= I(X_1;Y_1) + I(X_2;Y_2),

where Eqs. (1) and (2) follow from Markovity, and Eq. (3) is met with equality if X_1 and X_2 are independent and hence Y_1 and Y_2 are independent. Therefore

C = max_{p(x_1,x_2)} I(X_1,X_2; Y_1,Y_2)
≤ max_{p(x_1,x_2)} I(X_1;Y_1) + max_{p(x_1,x_2)} I(X_2;Y_2)
= max_{p(x_1)} I(X_1;Y_1) + max_{p(x_2)} I(X_2;Y_2)
= C_1 + C_2,

with equality iff p(x_1,x_2) = p*(x_1) p*(x_2), where p*(x_1) and p*(x_2) are the distributions that achieve C_1 and C_2 respectively.

5. A channel with two independent looks at Y. Let Y_1 and Y_2 be conditionally independent and conditionally identically distributed given X. Thus p(y_1,y_2|x) = p(y_1|x) p(y_2|x).
(a) Show I(X; Y_1,Y_2) = 2 I(X;Y_1) − I(Y_1;Y_2).
(b) Conclude that the capacity of the channel X → (Y_1,Y_2) is less than twice the capacity of the channel X → Y_1.

Solution: A channel with two independent looks at Y.
(a)

I(X; Y_1,Y_2) = H(Y_1,Y_2) − H(Y_1,Y_2|X)                                  (4)
= H(Y_1) + H(Y_2) − I(Y_1;Y_2) − H(Y_1|X) − H(Y_2|X)                       (5)
  (since Y_1 and Y_2 are conditionally independent given X)                (6)
= I(X;Y_1) + I(X;Y_2) − I(Y_1;Y_2)                                          (7)
= 2 I(X;Y_1) − I(Y_1;Y_2)  (since Y_1 and Y_2 are conditionally identically distributed).   (8)

(b) The capacity of the single-look channel X → Y_1 is

C_1 = max_{p(x)} I(X;Y_1).                                                  (9)

The capacity of the channel X → (Y_1,Y_2) is

C_2 = max_{p(x)} I(X; Y_1,Y_2)                                              (10)
= max_{p(x)} [ 2 I(X;Y_1) − I(Y_1;Y_2) ]                                    (11)
≤ 2 max_{p(x)} I(X;Y_1)                                                     (12)
= 2 C_1.                                                                    (13)

Hence, two independent looks cannot be more than twice as good as one look.
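The identity in (a) can also be checked numerically. The sketch below assumes, purely for illustration, that each look is a BSC with crossover probability 0.1 and that Pr(X = 1) = 0.3; neither value is part of the problem:

```python
import itertools, math

def H(masses):
    """Entropy in bits of a collection of probability masses."""
    return -sum(v * math.log2(v) for v in masses if v > 0)

p = 0.1   # assumed crossover probability of each look (illustration only)
q = 0.3   # assumed Pr(X = 1) (illustration only)

flip = lambda x, y: 1 - p if x == y else p
pjoint = {(x, y1, y2): (q if x else 1 - q) * flip(x, y1) * flip(x, y2)
          for x, y1, y2 in itertools.product((0, 1), repeat=3)}

def marginal(idx):
    out = {}
    for k, v in pjoint.items():
        key = tuple(k[i] for i in idx)
        out[key] = out.get(key, 0.0) + v
    return out.values()

H_X, H_Y1, H_Y2 = H(marginal([0])), H(marginal([1])), H(marginal([2]))
H_Y1Y2, H_XY1 = H(marginal([1, 2])), H(marginal([0, 1]))
H_XY1Y2 = H(pjoint.values())

I_X_Y1Y2 = H_X + H_Y1Y2 - H_XY1Y2
I_X_Y1 = H_X + H_Y1 - H_XY1
I_Y1_Y2 = H_Y1 + H_Y2 - H_Y1Y2

# identity from part (a): I(X;Y1,Y2) = 2 I(X;Y1) - I(Y1;Y2) <= 2 I(X;Y1)
print(I_X_Y1Y2, 2 * I_X_Y1 - I_Y1_Y2)
```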

6. Choice of channels. Find the capacity C of the union of the two channels (X_1, p_1(y_1|x_1), Y_1) and (X_2, p_2(y_2|x_2), Y_2), where, at each time, one can send a symbol over channel 1 or over channel 2 but not both. Assume the output alphabets are distinct and do not intersect.
(a) Show 2^C = 2^{C_1} + 2^{C_2}.
(b) What is the capacity of the following channel?

[Figure: inputs 1 and 2 form a BSC with crossover probability p between outputs 1 and 2; input 3 is connected noiselessly to output 3.]

Solution: Choice of channels.
(a) Let θ = 1 if the signal is sent over channel 1, and θ = 2 if the signal is sent over channel 2.

Consider the following communication scheme: the sender chooses between the two channels according to a Bern(α) coin flip, and the channel input is X = (θ, X_θ). Since the output alphabets Y_1 and Y_2 are disjoint, θ is a function of Y, i.e., X → Y → θ. Therefore,

I(X;Y) = I(X; Y,θ)
= I(X_θ, θ; Y, θ)
= I(θ; Y, θ) + I(X_θ; Y, θ | θ)
= I(θ; Y, θ) + I(X_θ; Y | θ)
= H(θ) + α I(X_θ; Y | θ = 1) + (1 − α) I(X_θ; Y | θ = 2)
= H_b(α) + α I(X_1; Y_1) + (1 − α) I(X_2; Y_2).

Thus, it follows that

C = sup_α { H_b(α) + α C_1 + (1 − α) C_2 },

where the expression inside the supremum is a strictly concave function of α. Hence the maximum exists and, by elementary calculus, one can easily show that

C = log_2( 2^{C_1} + 2^{C_2} ),

which is attained with α = 2^{C_1} / ( 2^{C_1} + 2^{C_2} ).

If one interprets M = 2^C as the effective number of noise-free symbols, then the above result follows in a rather intuitive manner: we have M_1 = 2^{C_1} noise-free symbols from channel 1 and M_2 = 2^{C_2} noise-free symbols from channel 2. Since at each step we get to choose which channel to use, we essentially have M_1 + M_2 = 2^{C_1} + 2^{C_2} noise-free symbols for the new channel. Therefore, the capacity of this channel is C = log_2( 2^{C_1} + 2^{C_2} ). This argument is very similar to the effective alphabet argument given in Problem 9 of the text.

(b) From part (a), the capacity of the channel in the figure is C = log_2( 2^{1 − H_b(p)} + 2^0 ) = log_2( 2^{1 − H_b(p)} + 1 ).
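The closed form C = log_2(2^{C_1} + 2^{C_2}) can be checked against a direct maximization over the time-sharing bias α. In the sketch below the sub-channel capacities C1 = 0.7 and C2 = 0.3 are arbitrary illustrative values:

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

C1, C2 = 0.7, 0.3   # arbitrary sub-channel capacities, for illustration only

# direct maximization of H_b(alpha) + alpha*C1 + (1 - alpha)*C2 over the channel-choice bias
C_numeric = max(hb(a / 100000) + (a / 100000) * C1 + (1 - a / 100000) * C2
                for a in range(100001))

C_closed = math.log2(2**C1 + 2**C2)
alpha_star = 2**C1 / (2**C1 + 2**C2)
print(C_numeric, C_closed, alpha_star)   # the first two agree to grid precision
```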

7. Cascaded BSCs. Consider the two discrete memoryless channels (X, p_1(y|x), Y) and (Y, p_2(z|y), Z). Let p_1(y|x) and p_2(z|y) be binary symmetric channels with crossover probabilities λ_1 and λ_2 respectively.

[Figure: a cascade of two BSCs, X → Y with crossover probability λ_1 followed by Y → Z with crossover probability λ_2.]

(a) What is the capacity C_1 of p_1(y|x)?
(b) What is the capacity C_2 of p_2(z|y)?
(c) We now cascade these channels, so that p_3(z|x) = Σ_y p_1(y|x) p_2(z|y). What is the capacity C_3 of p_3(z|x)? Show C_3 ≤ min{C_1, C_2}.
(d) Now let us actively intervene between channels 1 and 2, rather than passively transmitting y^n. What is the capacity of channel 1 followed by channel 2 if you are allowed to decode the output y^n of channel 1 and then re-encode it as ỹ^n for transmission over channel 2? (Think W → x^n(W) → y^n → ỹ^n(y^n) → z^n → Ŵ.)
(e) What is the capacity of the cascade in part (c) if the receiver can view both Y and Z?

Solution: Cascaded BSCs.
(a) This is a simple BSC with capacity C_1 = 1 − H_b(λ_1).
(b) Similarly, C_2 = 1 − H_b(λ_2).
(c) This is also a BSC, with transition probability λ_1 ⋆ λ_2 = λ_1(1 − λ_2) + (1 − λ_1)λ_2, and thus C_3 = 1 − H_b(λ_1 ⋆ λ_2). From the Markov chain X → Y → Z we can see that

C_3 = I(X;Z) ≤ I(X;Y,Z) = I(X;Y) ≤ C_1,

and additionally

C_3 = I(X;Z) ≤ I(X,Y;Z) = I(Y;Z) ≤ C_2,

and thus C_3 ≤ min{C_1, C_2}.

(d) If one can re-encode Y, then we obtain C = min{C_1, C_2}, since only one of the channels is the bottleneck between X and Z.

(e) From the Markov chain X → Y → Z we can see that

C = max_{p(x)} I(X; Y,Z) = max_{p(x)} I(X;Y) = C_1.
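A quick numerical check of part (c), as a sketch with arbitrary illustrative crossover probabilities λ1 = 0.1 and λ2 = 0.2:

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

lam1, lam2 = 0.1, 0.2                                  # illustrative crossover probabilities
C1, C2 = 1 - hb(lam1), 1 - hb(lam2)

lam_cascade = lam1 * (1 - lam2) + (1 - lam1) * lam2    # effective crossover lam1 * lam2
C3 = 1 - hb(lam_cascade)

print(C1, C2, C3, C3 <= min(C1, C2))                   # C3 stays below both single-hop capacities
```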

8. Channel capacity
(a) What is the capacity of the following channel?

[Figure: a channel with input alphabet {1,2,3,4} and output alphabet {1,2,3,4}. Inputs 1 and 2 are connected noiselessly to outputs 1 and 2, input 3 is connected noiselessly to output 3, and input 4 goes to output 4 with probability 1/2 and to output 3 with probability 1/2 (a Z-channel on the symbols 3 and 4).]

(b) Provide a simple scheme that can transmit at rate R = log 3 bits per channel use through this channel.

Solution for Channel capacity
(a) We can use the solution of the previous question on the union of channels:

C = log_2( 2^{C_1} + 2^{C_2} + 2^{C_3} ).

Now we need to calculate the capacity of each sub-channel. Sub-channel 1 (the single noiseless symbol 1) gives

C_1 = max I(X;Y) = H(Y) − H(Y|X) = 0 − 0 = 0,

C = maxi(x;y) = H(Y) H(Y X) = = C 3 = max I(X;Y) = max = max {H(Y) H(Y X)} [ ( ) p log p ( ) ( )] p +p 3 log p +p 3 p Assigning p 3 = p and derive against p : di(x;y) dp = p p ( log p ) + p p + ( ) log p = And as result p = 5 : C 3.3 And, finally: C = log( + +.3 ).7 (b) Here is a simple code that achieves capacity. Encoding: You just use ternary representation of the message and send using,, but no 3 (or,,3 but no ) of the input channel. Decoding: map the ternary output into the message. 9

9. A channel with a switch. Consider the channel depicted in Figure 1: there are two channels with conditional probabilities p(y_1|x) and p(y_2|x). These two channels have a common input alphabet X and two disjoint output alphabets Y_1, Y_2 (a symbol that appears in Y_1 cannot appear in Y_2). The position of the switch is determined by a random variable Z which is independent of X, where Pr(Z = 1) = λ.

[Figure 1: The channel. The input X feeds both p(y_1|x) and p(y_2|x); the switch selects Y = Y_1 when Z = 1 and Y = Y_2 when Z = 2.]

(a) Show that

I(X;Y) = λ I(X;Y_1) + (1 − λ) I(X;Y_2).     (4)

(b) The capacity of this system is given by C = max_{p(x)} I(X;Y). Show that

C ≤ λ C_1 + (1 − λ) C_2,     (5)

where C_i = max_{p(x)} I(X;Y_i). When is equality achieved?

(c) The sub-channels defined by p(y_1|x) and p(y_2|x) are now given in Figure 2, where p = 1/2. Find the input probability distribution that maximizes I(X;Y). For this case, does the equality C = λ C_1 + (1 − λ) C_2 hold? Explain!

[Figure 2: (a) describes channel 1, a BSC with crossover probability p; (b) describes channel 2, a Z-channel with crossover probability p.]

Solution: A channel with a switch.
(a) Since the output alphabets Y_1 and Y_2 are disjoint, Z is determined by Y, so the Markov chain X → Y → Z holds. Additionally, X and Z are independent. Hence

I(X;Y) = I(X; Y, Z)    (Z is a function of Y)
= I(X;Z) + I(X;Y|Z)
= I(X;Y|Z)             (X and Z are independent)
= Pr(Z = 1) I(X;Y|Z = 1) + Pr(Z = 2) I(X;Y|Z = 2)
= λ I(X;Y|Z = 1) + (1 − λ) I(X;Y|Z = 2)
= λ I(X;Y_1) + (1 − λ) I(X;Y_2).

(b) Consider the following:

C = max_{p(x)} I(X;Y)
= max_{p(x)} { λ I(X;Y_1) + (1 − λ) I(X;Y_2) }
≤ max_{p(x)} { λ I(X;Y_1) } + max_{p(x)} { (1 − λ) I(X;Y_2) }
= λ C_1 + (1 − λ) C_2.

Equality is achieved when a single input distribution p(x) maximizes I(X;Y_1) and I(X;Y_2) simultaneously.

(c) Since p = 1/2, the capacity of the first channel (a BSC with crossover probability 1/2) is C_1 = 0, and thus we only need to maximize over the second channel. That is, we would like to find the α that maximizes

(1 − λ) I(X;Y_2) = (1 − λ) ( H_b(α/2) − α H_b(1/2) ),

where Pr(X = 1) = α ∈ [0,1]. In this case the solution is α = 0.4, as in Question 3. Moreover, since I(X;Y_1) = 0 for every input distribution, this choice gives C = (1 − λ) C_2 = λ C_1 + (1 − λ) C_2, so we can see that the equality in (5) holds.
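A small numerical illustration of part (c), as a sketch (λ = 0.3 is an arbitrary illustrative value for the switch probability):

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

lam = 0.3   # Pr(Z = 1), illustrative only

I1 = lambda a: hb(a * 0.5 + (1 - a) * 0.5) - hb(0.5)   # BSC(1/2): H(Y) - H(Y|X) = 0 for every input
I2 = lambda a: hb(a / 2) - a * hb(0.5)                  # Z-channel with crossover 1/2

grid = [a / 10000 for a in range(10001)]
C = max(lam * I1(a) + (1 - lam) * I2(a) for a in grid)
C1 = max(I1(a) for a in grid)    # = 0
C2 = max(I2(a) for a in grid)    # ≈ 0.322, attained at a = 0.4

print(C, lam * C1 + (1 - lam) * C2)   # both sides of (5) coincide here
```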

10. Channel with state. A discrete memoryless (DM) state-dependent channel with state space S is defined by an input alphabet X, an output alphabet Y and a set of channel transition matrices {p(y|x,s)}_{s∈S}. Namely, for each s ∈ S the transmitter sees a different channel. The capacity of such a channel, where the state is known causally to both encoder and decoder, is given by

C = max_{p(x|s)} I(X;Y|S).     (6)

Let |S| = 3 and let the three different channels (one for each state s ∈ S) be as illustrated in the following figure.

[Figure 3: The three state-dependent channels: an S-channel with parameter ε (S = 1), a BSC with crossover probability δ (S = 2), and a Z-channel with parameter ε (S = 3).]

The state process is i.i.d. according to the distribution p(s).

(a) Find an expression for the capacity of the S-channel (the channel the transmitter sees given S = 1) as a function of ε.
(b) Find an expression for the capacity of the BSC (the channel the transmitter sees given S = 2) as a function of δ.
(c) Find an expression for the capacity of the Z-channel (the channel the transmitter sees given S = 3) as a function of ε.
(d) Find an expression for the capacity of the DM state-dependent channel (using formula (6)) for p(s) = [1/2 1/3 1/6] as a function of ε and δ.
(e) Let us define a conditional probability matrix P_{X|S} for two random variables X and S with X = {0,1} and S = {1,2,3} by

[P_{X|S}]_{i,j} = p(X = j | S = i),   i = 1,2,3,  j = 0,1.     (7)

What is the input conditional probability matrix P_{X|S} that achieves the capacity you have found in (d)?

Solution: Channel with state.
(a) Denote the capacity of the S-channel by

C_S = max_{p(x|s=1)} I(X;Y|S = 1) = max_{p(x|s=1)} [ H(Y|S = 1) − H(Y|X, S = 1) ].

Assume that the input X is distributed according to X ~ Bernoulli(α) for s = 1, and let us calculate the entropy terms:

H(Y|X, S = 1) = Σ_{x∈X} p(x|s=1) H(Y|X = x, S = 1)
= α H(Y|X = 1, S = 1) + (1 − α) H(Y|X = 0, S = 1)
= α H_b(ε) + (1 − α)·0 = α H_b(ε).

Consider

p(y = 1|s = 1) = p(y = 1, x = 1|s = 1) + p(y = 1, x = 0|s = 1) = α(1 − ε).

Then

H(Y|S = 1) = H_b( α(1 − ε) ).

We want to maximize

I(X;Y|S = 1) = H_b( α(1 − ε) ) − α H_b(ε).

Taking the derivative with respect to α and finding the root of the derivative gives the capacity.

(b) For the binary symmetric channel (BSC) the capacity is given by C_BSC = 1 − H_b(δ).

(c) The capacities of the Z-channel and the S-channel are equal because the channels are equivalent up to switching 0 with 1 and vice versa. Hence C_Z = C_S.

(d) The capacity of the state-dependent channel is

C = max_{p(x|s)} I(X;Y|S)
= max_{p(x|s)} [ p(s=1) I(X;Y|S=1) + p(s=2) I(X;Y|S=2) + p(s=3) I(X;Y|S=3) ]
= (1/2) C_S + (1/3) C_BSC + (1/6) C_Z.

(e) The rows of the matrix P_{X|S} are the conditional probability functions that achieve capacity for each of the sub-channels:

P_{X|S} = [ p(x=0|s=1)   p(x=1|s=1)
            p(x=0|s=2)   p(x=1|s=2)
            p(x=0|s=3)   p(x=1|s=3) ].
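A numerical sketch of parts (a)-(d) for concrete parameter values (ε = 0.2 and δ = 0.1 are arbitrary illustrative choices, not values from the problem):

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

eps, delta = 0.2, 0.1   # illustrative channel parameters

grid = [a / 10000 for a in range(10001)]
# S-channel capacity: maximize H_b(alpha*(1 - eps)) - alpha*H_b(eps) over alpha
C_S = max(hb(a * (1 - eps)) - a * hb(eps) for a in grid)
C_Z = C_S                # identical capacity up to relabeling 0 <-> 1
C_BSC = 1 - hb(delta)

C = 0.5 * C_S + (1 / 3) * C_BSC + (1 / 6) * C_Z
print(C_S, C_BSC, C)
```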

11. Modulo channel
(a) Consider the DMC defined as follows: the output is Y = X ⊕ Z, where X, taking values in {0,1}, is the channel input, ⊕ is the modulo-2 summation operation, and Z is binary channel noise, uniform over {0,1} and independent of X. What is the capacity of this channel?
(b) Consider the channel of the previous part, but suppose that instead of modulo-2 addition Y = X ⊕ Z, we perform modulo-3 addition, Y = X ⊕_3 Z. Now what is the capacity?

Solution: Modulo channel.
(a) This is a simple case of a BSC with transition probability p = 1/2, and thus the capacity in this case is C_BSC = 0.
(b) In this case we can model the channel as a BEC, as studied in class. The input has a binary alphabet and the output has a ternary alphabet: the outputs Y = 0 and Y = 2 identify the input exactly, while Y = 1 can be produced by either input and acts as an erasure. The erasure probability is Pr(Z = 1) = p = 1/2, and thus the capacity of this channel is C_BEC = 1 − p = 1/2.
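A brute-force check of the modulo-3 case, as a sketch (the uniform input used here is the capacity-achieving distribution for this channel):

```python
import itertools, math

def H(masses):
    return -sum(v * math.log2(v) for v in masses if v > 0)

# X uniform on {0,1}, Z uniform on {0,1} and independent of X, Y = (X + Z) mod 3
pj = {}
for x, z in itertools.product((0, 1), repeat=2):
    y = (x + z) % 3
    pj[(x, y)] = pj.get((x, y), 0.0) + 0.25

py = {}
for (x, y), v in pj.items():
    py[y] = py.get(y, 0.0) + v

I = H([0.5, 0.5]) + H(py.values()) - H(pj.values())   # I(X;Y) = H(X) + H(Y) - H(X,Y)
print(I)   # 0.5 bits, matching the BEC formula 1 - Pr(erasure)
```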

12. Cascaded BSCs. Given is a cascade of k identical and independent binary symmetric channels, each with crossover probability α.
(a) In the case where no encoding or decoding is allowed at the intermediate terminals, what is the capacity of this cascaded channel as a function of k and α?
(b) Now assume that encoding and decoding are allowed at the intermediate points. What is the capacity as a function of k and α?
(c) What is the capacity of each of the above settings in the case where the number of cascaded channels, k, goes to infinity?

Solution: Cascaded BSCs.
(a) Cascaded BSCs result in a new BSC with a new crossover parameter β. Therefore, the capacity is C_a = 1 − H_b(β), and the parameter β can be found as follows:

β = Σ_{1 ≤ i ≤ k, i odd} (k choose i) α^i (1 − α)^{k−i}
= (1/2) [ Σ_{i=0}^{k} (k choose i) α^i (1 − α)^{k−i} − Σ_{i=0}^{k} (k choose i) (−α)^i (1 − α)^{k−i} ]
= (1/2) ( 1 − (1 − 2α)^k ).

Answers leaving β as the initial sum over odd indices, or as the binary convolution of k identical parameters α, have been accepted as well.

(b) We have seen in the HW that, in the case of encoding and decoding at the intermediate points, the capacity of the cascaded channel equals C_b = min_i {C_i}. Since all the channels are identical with capacity 1 − H_b(α), we have C_b = 1 − H_b(α).

(c) In (a), β → 0.5 as k → ∞, so C_a → 0. For (b), the number of cascaded channels does not change the capacity, which remains C_b = 1 − H_b(α).
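A short numerical illustration of parts (a) and (c), as a sketch with an arbitrary illustrative crossover probability α = 0.1:

```python
import math

def hb(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

alpha = 0.1   # illustrative crossover probability
for k in (1, 2, 5, 10, 50, 200):
    beta = 0.5 * (1 - (1 - 2 * alpha)**k)   # effective crossover of the cascade
    C_a = 1 - hb(beta)                      # no coding at intermediate terminals
    C_b = 1 - hb(alpha)                     # with intermediate decode/re-encode
    print(k, round(beta, 4), round(C_a, 4), round(C_b, 4))
# beta -> 0.5 and C_a -> 0 as k grows, while C_b stays fixed at 1 - H_b(alpha)
```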