Solutions to Homework Set #4 Differential Entropy and Gaussian Channel


1. Differential entropy. Evaluate the differential entropy $h(X) = -\int f \ln f$ for the following:

(a) The exponential density $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$.

(b) The sum of $X_1$ and $X_2$, where $X_1$ and $X_2$ are independent normal random variables with means $\mu_i$ and variances $\sigma_i^2$, $i = 1, 2$.

Solution: Differential entropy.

(a) $$h(f) = \log\frac{e}{\lambda} \text{ bits}. \tag{1}$$

(b) Sum of two normal distributions. The sum of two independent normal random variables is also normal, so applying the result derived in class for the normal distribution, since $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$,
$$h(f) = \frac{1}{2}\log 2\pi e(\sigma_1^2 + \sigma_2^2) \text{ bits}. \tag{2}$$

2. Mutual information for correlated normals. Find the mutual information $I(X;Y)$, where
$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim N\left(0, \begin{bmatrix} \sigma^2 & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 \end{bmatrix}\right).$$
Evaluate $I(X;Y)$ for $\rho = 1$, $\rho = 0$, and $\rho = -1$, and comment.

Solution: Mutual information for correlated normals.
$$\begin{bmatrix} X \\ Y \end{bmatrix} \sim N\left(0, \begin{bmatrix} \sigma^2 & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 \end{bmatrix}\right) \tag{3}$$
Using the expression for the entropy of a multivariate normal derived in class,
$$h(X,Y) = \frac{1}{2}\log\left((2\pi e)^2 |K|\right) = \frac{1}{2}\log\left((2\pi e)^2 \sigma^4 (1-\rho^2)\right). \tag{4}$$

Since $X$ and $Y$ are individually normal with variance $\sigma^2$,
$$h(X) = h(Y) = \frac{1}{2}\log 2\pi e\sigma^2. \tag{5}$$
Hence
$$I(X;Y) = h(X) + h(Y) - h(X,Y) = -\frac{1}{2}\log(1-\rho^2). \tag{6}$$

(a) $\rho = 1$. In this case, $X = Y$, and knowing $X$ implies perfect knowledge about $Y$. Hence the mutual information is infinite, which agrees with the formula.

(b) $\rho = 0$. In this case, $X$ and $Y$ are independent, and hence $I(X;Y) = 0$, which agrees with the formula.

(c) $\rho = -1$. In this case, $X = -Y$, and again the mutual information is infinite, as in the case $\rho = 1$.

3. Markov Gaussian mutual information. Suppose that $(X, Y, Z)$ are jointly Gaussian and that $X \to Y \to Z$ forms a Markov chain. Let $X$ and $Y$ have correlation coefficient $\rho_1$ and let $Y$ and $Z$ have correlation coefficient $\rho_2$. Find $I(X;Z)$.

Solution: Markov Gaussian mutual information.

First note that we may, without loss of generality, assume that the means of $X$, $Y$ and $Z$ are zero. If the means are not zero, one can subtract the vector of means without affecting the mutual information or the conditional independence of $X$ and $Z$ given $Y$. Similarly, we can assume the variances of $X$, $Y$, and $Z$ to be 1. (The scaling may change the differential entropy, but not the mutual information.) Let
$$\Sigma = \begin{pmatrix} 1 & \rho_{xz} \\ \rho_{xz} & 1 \end{pmatrix}$$
be the covariance matrix of $X$ and $Z$. From Eqs. (9.93) and (9.94),
$$I(X;Z) = h(X) + h(Z) - h(X,Z) = \frac{1}{2}\log(2\pi e) + \frac{1}{2}\log(2\pi e) - \frac{1}{2}\log\left((2\pi e)^2 \det\Sigma\right) = -\frac{1}{2}\log(1-\rho_{xz}^2).$$
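The identity $I = h(X) + h(Z) - h(X,Z) = -\frac{1}{2}\log(1-\rho^2)$ used in Problems 2 and 3 can be checked numerically. The following is a minimal sketch, not part of the original solutions; the function names `gaussian_mi_bits` and `closed_form_bits` are hypothetical helpers introduced only for this illustration. It computes the three Gaussian differential entropies from a 2x2 covariance matrix and compares the result with the closed form.

```python
import math


def gaussian_mi_bits(var_x, var_y, cov_xy):
    """Mutual information (bits) of a jointly Gaussian pair, via h(X)+h(Y)-h(X,Y)."""
    det = var_x * var_y - cov_xy ** 2  # determinant of the 2x2 covariance matrix
    if det <= 0:
        # Degenerate covariance (|rho| = 1): one variable determines the other.
        return math.inf
    two_pi_e = 2 * math.pi * math.e
    h_x = 0.5 * math.log2(two_pi_e * var_x)
    h_y = 0.5 * math.log2(two_pi_e * var_y)
    h_xy = 0.5 * math.log2(two_pi_e ** 2 * det)
    return h_x + h_y - h_xy


def closed_form_bits(rho):
    """Closed form -1/2 log2(1 - rho^2) from equation (6)."""
    if abs(rho) >= 1:
        return math.inf
    return -0.5 * math.log2(1 - rho ** 2)


if __name__ == "__main__":
    sigma2 = 3.0  # arbitrary variance; the result should not depend on it
    for rho in (0.0, 0.5, -0.9):
        print(rho, gaussian_mi_bits(sigma2, sigma2, rho * sigma2), closed_form_bits(rho))
```

The variance cancels in the difference of entropies, which is why only the correlation coefficient appears in the closed form.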

Now, from the conditional independence of $X$ and $Z$ given $Y$, we have
$$\rho_{xz} = E[XZ] = E\big[E[XZ \mid Y]\big] = E\big[E[X \mid Y]\, E[Z \mid Y]\big] = E[\rho_1 Y \cdot \rho_2 Y] = \rho_1\rho_2.$$
We can thus conclude that
$$I(X;Z) = -\frac{1}{2}\log(1 - \rho_1^2\rho_2^2).$$

4. Output power constraint. Consider an additive white Gaussian noise channel with an expected output power constraint $P$. (We might want to protect the eardrums of the listener.) Thus $Y = X + Z$, $Z \sim N(0, \sigma^2)$, $Z$ is independent of $X$, and $EY^2 \le P$. Assume $\sigma^2 < P$. Find the channel capacity.

Solution: Output power constraint.

The output power constraint $EY^2 \le P$ is equivalent to an input power constraint, since
$$E(X+Z)^2 = EX^2 + EZ^2 = EX^2 + \sigma^2 \le P,$$
that is, $EX^2 \le P - \sigma^2$. Thus, we reduce the problem to a previously known one and get
$$C = \frac{1}{2}\log\left(1 + \frac{P - \sigma^2}{\sigma^2}\right) = \frac{1}{2}\log\left(\frac{P}{\sigma^2}\right).$$

5. Multipath Gaussian channel. Consider a Gaussian noise channel of power constraint $P$, where the signal takes two different paths and the received noisy signals are added together at the antenna.

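[Figure: the input $X$ travels over two paths, $Y_1 = X + Z_1$ and $Y_2 = X + Z_2$, which are added to form $Y$.]

Let $Y = Y_1 + Y_2$ and $EX^2 \le P$.

(a) Find the capacity of this channel if $Z_1$ and $Z_2$ are jointly normal with covariance matrix
$$K = \begin{bmatrix} N & N\rho \\ N\rho & N \end{bmatrix}.$$

(b) What is the capacity for $\rho = 0$, $1$, and $-1$?

Solution: Multipath Gaussian channel.

(a) Since $Y = Y_1 + Y_2 = X + Z_1 + X + Z_2 = 2X + (Z_1 + Z_2)$, and $Z_1 + Z_2$ is $N(0, 2N(1+\rho))$, the capacity is given by
$$C = \frac{1}{2}\log\left(1 + \frac{4P}{2N(1+\rho)}\right) = \frac{1}{2}\log\left(1 + \frac{2P}{N(1+\rho)}\right).$$

(b) When $\rho = 0$,
$$C = \frac{1}{2}\log\left(1 + \frac{2P}{N}\right).$$
When $\rho = 1$,
$$C = \frac{1}{2}\log\left(1 + \frac{P}{N}\right),$$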

which makes sense since $Y_1 = Y_2$ and $Y = 2Y_1$. (Scaling the output does not change the mutual information.) When $\rho = -1$, we have $C = \infty$. Since $Z_1 + Z_2 = 0$, the channel is given by $Y = 2X$ without any additive noise. Hence we can transmit an unbounded amount of information (any real number satisfying the power constraint) over the channel without any error.

6. The two-look Gaussian channel.

[Figure: $X \to (Y_1, Y_2)$.]

Consider the ordinary additive noise Gaussian channel with two correlated looks at $X$, i.e., $Y = (Y_1, Y_2)$, where
$$Y_1 = X + Z_1, \qquad Y_2 = X + Z_2,$$
with a power constraint $P$ on $X$, and $(Z_1, Z_2) \sim N(0, K)$, where
$$K = \begin{bmatrix} N & N\rho \\ N\rho & N \end{bmatrix}.$$
Find the capacity $C$ for

(a) $\rho = 1$.

(b) $\rho = 0$.

(c) $\rho = -1$.

Solution: The two-look Gaussian channel.

It is clear that the input distribution that maximizes the mutual information is $X \sim N(0, P)$. Evaluating the mutual information for this distribution,
$$C = \max I(X; Y_1, Y_2) = h(Y_1, Y_2) - h(Y_1, Y_2 \mid X) = h(Y_1, Y_2) - h(Z_1, Z_2 \mid X) = h(Y_1, Y_2) - h(Z_1, Z_2).$$

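Now, since
$$(Z_1, Z_2) \sim N\left(0, \begin{bmatrix} N & N\rho \\ N\rho & N \end{bmatrix}\right),$$
we have
$$h(Z_1, Z_2) = \frac{1}{2}\log\left((2\pi e)^2 |K_Z|\right) = \frac{1}{2}\log\left((2\pi e)^2 N^2 (1-\rho^2)\right).$$
Since $Y_1 = X + Z_1$ and $Y_2 = X + Z_2$, we have
$$(Y_1, Y_2) \sim N\left(0, \begin{bmatrix} P+N & P+\rho N \\ P+\rho N & P+N \end{bmatrix}\right),$$
and
$$h(Y_1, Y_2) = \frac{1}{2}\log\left((2\pi e)^2 |K_Y|\right) = \frac{1}{2}\log\left((2\pi e)^2 \left(N^2(1-\rho^2) + 2PN(1-\rho)\right)\right).$$
Hence the capacity is
$$C = h(Y_1, Y_2) - h(Z_1, Z_2) = \frac{1}{2}\log\left(1 + \frac{2P}{N(1+\rho)}\right).$$

(a) $\rho = 1$. In this case, $C = \frac{1}{2}\log\left(1 + \frac{P}{N}\right)$, which is the capacity of a single-look channel. This is not surprising, since in this case $Y_1 = Y_2$.

(b) $\rho = 0$. In this case,
$$C = \frac{1}{2}\log\left(1 + \frac{2P}{N}\right),$$
which corresponds to using twice the power in a single look. The capacity is the same as the capacity of the channel $X \to (Y_1 + Y_2)$.

(c) $\rho = -1$. In this case, $C = \infty$, which is not surprising since if we add $Y_1$ and $Y_2$, we can recover $X$ exactly.

Note that the capacity of the above channel in all cases is the same as the capacity of the channel $X \to Y_1 + Y_2$.

7. Diversity System. For the following system, a message $W \in \{1, 2, \ldots, 2^{nR}\}$ is encoded into two symbol blocks $X_1^n = (X_{1,1}, X_{1,2}, \ldots, X_{1,n})$ and $X_2^n = (X_{2,1}, X_{2,2}, \ldots, X_{2,n})$ that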

are being transmitted over a channel. The average power constraints on the inputs are $\frac{1}{n}E\left[\sum_{i=1}^n X_{1,i}^2\right] \le P_1$ and $\frac{1}{n}E\left[\sum_{i=1}^n X_{2,i}^2\right] \le P_2$. The channel multiplies $X_1$ and $X_2$ by the factors $h_1$ and $h_2$, respectively, i.e., $Y = h_1 X_1 + h_2 X_2 + Z$, where $Z$ is white Gaussian noise, $Z \sim N(0, \sigma_Z^2)$.

(a) Find the joint distribution of $X_1$ and $X_2$ that maximizes the mutual information $I(Y; X_1, X_2)$. (You need to find $\arg\max_{P_{X_1,X_2}} I(X_1, X_2; Y)$.)

Figure 1: The communication model. [The encoder maps $W$ to $X_1$ and $X_2$; these are scaled by $h_1$ and $h_2$, summed, and corrupted by $Z$ to give $Y$, from which the decoder produces $\hat{W}$.]

(b) What is the capacity of the system?

(c) Express the capacity for the following cases:

i. $h_1 = 1$, $h_2 = 1$?

ii. $h_1 = 1$, $h_2 = 0$?

iii. $h_1 = 0$, $h_2 = 0$?

Solution: Diversity System.

(a) With $Y = h_1 X_1 + h_2 X_2 + Z$, the mutual information is
$$I(X_1, X_2; Y) = h(Y) - h(Y \mid X_1, X_2) = h(Y) - h(Z).$$

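Since $h(Z)$ is constant, we need to find the maximum of $h(Y)$. The second moment of $Y$ satisfies (for $h_1 h_2 \ge 0$, as in the cases considered in part (c))
$$\begin{aligned}
E[Y^2] &= E[(h_1 X_1 + h_2 X_2 + Z)^2] \\
&\stackrel{(i)}{=} E[(h_1 X_1 + h_2 X_2)^2] + E[Z^2] \\
&= h_1^2 E[X_1^2] + h_2^2 E[X_2^2] + 2 h_1 h_2 E[X_1 X_2] + \sigma_Z^2 \\
&\le h_1^2 P_1 + h_2^2 P_2 + 2 h_1 h_2 E[X_1 X_2] + \sigma_Z^2 \\
&\stackrel{(ii)}{\le} h_1^2 P_1 + h_2^2 P_2 + 2 h_1 h_2 \sqrt{E[X_1^2] E[X_2^2]} + \sigma_Z^2 \\
&\le h_1^2 P_1 + h_2^2 P_2 + 2 h_1 h_2 \sqrt{P_1 P_2} + \sigma_Z^2 \\
&= \left(h_1\sqrt{P_1} + h_2\sqrt{P_2}\right)^2 + \sigma_Z^2,
\end{aligned}$$
where (i) holds because $Z$ is independent of $X_1, X_2$, and (ii) is the Cauchy-Schwarz inequality. Choosing $X_1 = \alpha X_2$ with $\alpha = \sqrt{P_1/P_2}$, i.e.,
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N(0, K), \qquad K = \begin{pmatrix} P_1 & \sqrt{P_1 P_2} \\ \sqrt{P_1 P_2} & P_2 \end{pmatrix},$$
gives equality in every step and makes $Y$ Gaussian, which maximizes $h(Y)$ for the given second moment. Therefore, the mutual information is bounded by
$$I(X_1, X_2; Y) \le \frac{1}{2}\log\left(1 + \frac{\left(h_1\sqrt{P_1} + h_2\sqrt{P_2}\right)^2}{\sigma_Z^2}\right).$$

(b) The capacity of the system is
$$C = \max_{P_{X_1,X_2}} I(X_1, X_2; Y) = \frac{1}{2}\log\left(1 + \frac{\left(h_1\sqrt{P_1} + h_2\sqrt{P_2}\right)^2}{\sigma_Z^2}\right).$$

(c) For $h_1 = 1$ and $h_2 = 1$, the capacity of the system is
$$C = \frac{1}{2}\log\left(1 + \frac{\left(\sqrt{P_1} + \sqrt{P_2}\right)^2}{\sigma_Z^2}\right) = \frac{1}{2}\log\left(1 + \frac{P_1 + 2\sqrt{P_1 P_2} + P_2}{\sigma_Z^2}\right).$$
For $h_1 = 1$ and $h_2 = 0$, the capacity of the system is
$$C = \frac{1}{2}\log\left(1 + \frac{P_1}{\sigma_Z^2}\right).$$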

For $h_1 = 0$ and $h_2 = 0$, the capacity of the system is
$$C = \frac{1}{2}\log(1) = 0.$$
We can see that, for Gaussian channels carrying a single message, it is best to transmit the signals coherently.
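The benefit of coherent transmission in Problem 7 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original solution: it assumes zero-mean Gaussian inputs at full power with a chosen correlation coefficient `rho`, and the function name `diversity_rate_bits` is hypothetical. Setting `rho = 1` reproduces the capacity expression of part (b), while `rho = 0` corresponds to independent inputs.

```python
import math


def diversity_rate_bits(h1, h2, p1, p2, noise_var, rho=1.0):
    """Rate 1/2 log2(1 + E[(h1 X1 + h2 X2)^2] / noise_var) for Gaussian inputs.

    Assumes E[X1^2] = p1, E[X2^2] = p2 and correlation coefficient rho,
    so that E[X1 X2] = rho * sqrt(p1 * p2).
    """
    signal_power = (h1 ** 2 * p1 + h2 ** 2 * p2
                    + 2 * h1 * h2 * rho * math.sqrt(p1 * p2))
    return 0.5 * math.log2(1 + signal_power / noise_var)


if __name__ == "__main__":
    p1, p2, noise_var = 4.0, 9.0, 1.0  # illustrative values only
    print("coherent   :", diversity_rate_bits(1, 1, p1, p2, noise_var, rho=1.0))
    print("independent:", diversity_rate_bits(1, 1, p1, p2, noise_var, rho=0.0))
    print("h2 = 0     :", diversity_rate_bits(1, 0, p1, p2, noise_var))
    print("h1 = h2 = 0:", diversity_rate_bits(0, 0, p1, p2, noise_var))
```

With nonnegative gains the rate is increasing in `rho`, which matches the Cauchy-Schwarz step (ii) in part (a).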

8. AWGN with two noises. Figure 2 depicts a communication system with an AWGN (additive white Gaussian noise) channel with two i.i.d. noises $Z_1 \sim N(0, \sigma_1^2)$, $Z_2 \sim N(0, \sigma_2^2)$ that are independent of each other and are added to the signal $X$, i.e., $Y = X + Z_1 + Z_2$. The average power constraint on the input is $P$, i.e., $\frac{1}{n}E\left[\sum_{i=1}^n X_i^2\right] \le P$. In the sub-questions below we consider the cases where the noise $Z_2$ may or may not be known to the encoder and decoder.

Figure 2: Two noise sources. [The encoder maps $W$ to $X$; the noises $Z_1$ and $Z_2$ are added to give $Y$, from which the decoder produces $\hat{W}$. Line 1 connects $Z_2$ to the encoder and line 2 connects $Z_2$ to the decoder.]

(a) Find the channel capacity for the case in which the noise is not known to either side (lines 1 and 2 are disconnected from the encoder and the decoder).

(b) Find the capacity for the case in which the noise $Z_2$ is known to the encoder and decoder (lines 1 and 2 are connected to both the encoder and decoder). This means that the codeword $X^n$ may depend on the message $W$ and the noise $Z_2^n$, and the decoder decision $\hat{W}$ may depend on the output $Y^n$ and the noise $Z_2^n$. (Hint: Could the capacity be larger than $\frac{1}{2}\log\left(1 + \frac{P}{\sigma_1^2}\right)$?)

(c) Find the capacity for the case in which the noise $Z_2$ is known only to the decoder (line 1 is disconnected from the encoder and line 2 is connected to the decoder). This means that the codewords $X^n$ may depend only on the message $W$, and the decoder decision $\hat{W}$ may depend on the output $Y^n$ and the noise $Z_2^n$.

Solution: AWGN with two noises.

(a) Since the noise is not known to either side, the total noise variance is $\sigma_1^2 + \sigma_2^2$ and the capacity is
$$C = \frac{1}{2}\log\left(1 + \frac{P}{\sigma_1^2 + \sigma_2^2}\right).$$

(b) Once $Z_2$ is known to the receiver, we can add a subtraction unit in the decoder that subtracts $Z_2$, and therefore the noise is only $Z_1$. The capacity is
$$C = \frac{1}{2}\log\left(1 + \frac{P}{\sigma_1^2}\right).$$

(c) Same as in (b), the capacity is
$$C = \frac{1}{2}\log\left(1 + \frac{P}{\sigma_1^2}\right).$$

9. Parallel channels and waterfilling. Consider a pair of parallel Gaussian channels, i.e.,
$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} + \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix},$$
where
$$\begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} \sim N\left(0, \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}\right),$$
and there is a power constraint $E(X_1^2 + X_2^2) \le 2P$. Assume that $\sigma_1^2 > \sigma_2^2$. At what power does the channel stop behaving like a single channel with noise variance $\sigma_2^2$, and begin behaving like a pair of channels, i.e., at what power does the worst channel become useful?

Solution: Parallel channels and waterfilling.

By the water-filling result taught in class, we put all the signal power into the channel with less noise until the total power of noise plus signal in that channel equals the noise power in the other channel. After that, we split any additional power evenly between the two channels. Thus the combined channel begins to behave like a pair of parallel channels when the signal power is equal to the difference of the two noise powers, i.e., when $2P = \sigma_1^2 - \sigma_2^2$.

10. Blahut-Arimoto's algorithm and KKT conditions. Recall that the capacity of a memoryless channel is given by $C = \max_{p(x)} I(X;Y)$. Solving this optimization problem is a difficult task for a general channel. In this question we develop an iterative algorithm for finding the solution for a fixed channel $p(y|x)$.

(a) Prove that the mutual information, as a function of $p(x)$ and $p(x|y)$, may be written as
$$I(X;Y) = \sum_{x,y} p(x) p(y|x) \log\frac{p(x|y)}{p(x)}.$$

(b) Show that $I(X;Y)$ as written above is jointly concave in the pair $(p(x), p(x|y))$. (Hint: You may use the log-sum inequality.)

(c) Find an expression for $p(x)$ that maximizes $I(X;Y)$ when $p(x|y)$ is fixed. (Hint: You may use the Lagrange multipliers method with the constraint $\sum_x p(x) = 1$. There is no need to take into account that $p(x) \ge 0$, since it will be obtained anyway.)

(d) Find an expression for $p(x|y)$ that maximizes $I(X;Y)$ when $p(x)$ is fixed. (Hint: You may use the Lagrange multipliers method with the constraints $\sum_x p(x|y) = 1$ for all $y$. There is no need to take into account that $p(x|y) \ge 0$, since it will be obtained anyway.)

(e) Using (d), conclude that $C = \max_{p(x),\, p(x|y)} I(X;Y)$.

The Blahut-Arimoto algorithm alternates the maximization between the two variables: first over $p(x)$ when $p(x|y)$ is fixed, then over $p(x|y)$ when $p(x)$ is fixed, and so on. This iterative algorithm converges, and hence one can find the capacity of any DMC $p(y|x)$ with reasonable alphabet size.

Solutions

(a) Since $I(X;Y) = H(X) - H(X|Y)$, the answer is obvious.

(b) Recall that the log-sum inequality is
$$\sum_{i=1}^n a_i \log\frac{a_i}{b_i} \ge \left(\sum_{i=1}^n a_i\right) \log\frac{\sum_{i=1}^n a_i}{\sum_{i=1}^n b_i}.$$
Hence
$$\left(\lambda p_1(x) + (1-\lambda) p_2(x)\right) \log\frac{\lambda p_1(x) + (1-\lambda) p_2(x)}{\lambda p_1(x|y) + (1-\lambda) p_2(x|y)} \le \lambda p_1(x) \log\frac{p_1(x)}{p_1(x|y)} + (1-\lambda) p_2(x) \log\frac{p_2(x)}{p_2(x|y)}.$$
Taking the reciprocal inside each logarithm (which flips the sign and hence the direction of the inequality) yields
$$\left(\lambda p_1(x) + (1-\lambda) p_2(x)\right) \log\frac{\lambda p_1(x|y) + (1-\lambda) p_2(x|y)}{\lambda p_1(x) + (1-\lambda) p_2(x)} \ge \lambda p_1(x) \log\frac{p_1(x|y)}{p_1(x)} + (1-\lambda) p_2(x) \log\frac{p_2(x|y)}{p_2(x)}.$$

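Multiplying by $p(y|x)$ and summing over all $x, y$, and letting $I(p(x), p(x|y))$ be the mutual information as in (a), we obtain
$$I\left(\lambda p_1(x) + (1-\lambda) p_2(x),\ \lambda p_1(x|y) + (1-\lambda) p_2(x|y)\right) \ge \lambda I(p_1(x), p_1(x|y)) + (1-\lambda) I(p_2(x), p_2(x|y)).$$

(c) Define the Lagrangian
$$L = \sum_{x,y} p(x) p(y|x) \log\frac{p(x|y)}{p(x)} + \mu\left(\sum_x p(x) - 1\right),$$
and differentiate with respect to $p(x)$. Solving $\frac{\partial L}{\partial p(x)} = 0$ provides us with
$$p(x) = \frac{\prod_y p(x|y)^{p(y|x)}}{\sum_{x'} \prod_y p(x'|y)^{p(y|x')}}.$$

(d) Define the Lagrangian
$$J = \sum_{x,y} p(x) p(y|x) \log\frac{p(x|y)}{p(x)} + \sum_y \mu(y)\left(\sum_x p(x|y) - 1\right),$$
and differentiate with respect to $p(x|y)$. Solving $\frac{\partial J}{\partial p(x|y)} = 0$ provides us with
$$p(x|y) = \frac{p(x) p(y|x)}{\sum_{x'} p(x') p(y|x')}.$$

(e) The expression for $p(x|y)$ is the one that corresponds to $p(x)$, and hence maximizing over $p(x), p(x|y)$ is the same as maximizing over $p(x)$ alone.

11. Fading channel. Consider an additive noise fading channel

[Figure: $X$ is multiplied by $V$ and then $Z$ is added to give $Y$.]

$$Y = XV + Z,$$
where $Z$ is additive noise, $V$ is a random variable representing fading, and $Z$ and $V$ are independent of each other and of $X$.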
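Returning to Problem 10, the alternating updates derived in parts (c) and (d) can be turned into a short program. The following is a minimal sketch under stated assumptions, not the course's reference implementation: the channel is given as a nested list `channel[x][y]` holding $p(y|x)$, the iteration starts from the uniform input distribution, a fixed iteration count is used instead of a convergence test, and the function names are hypothetical.

```python
import math


def posterior(p, channel):
    """Step (d): q(x|y) = p(x) p(y|x) / sum_x' p(x') p(y|x')."""
    nx, ny = len(channel), len(channel[0])
    q = [[0.0] * ny for _ in range(nx)]
    for y in range(ny):
        p_y = sum(p[x] * channel[x][y] for x in range(nx))
        if p_y > 0:
            for x in range(nx):
                q[x][y] = p[x] * channel[x][y] / p_y
    return q


def blahut_arimoto(channel, iterations=200):
    """Alternate steps (d) and (c); return (rate in bits, input distribution)."""
    nx, ny = len(channel), len(channel[0])
    p = [1.0 / nx] * nx  # assumed starting point: uniform input distribution
    for _ in range(iterations):
        q = posterior(p, channel)
        # Step (c): p(x) proportional to prod_y q(x|y)^{p(y|x)}.
        weights = []
        for x in range(nx):
            if p[x] == 0.0:
                weights.append(0.0)
                continue
            log_w = sum(channel[x][y] * math.log(q[x][y])
                        for y in range(ny) if channel[x][y] > 0)
            weights.append(math.exp(log_w))
        total = sum(weights)
        p = [w / total for w in weights]
    # Evaluate the mutual information of part (a) at the final pair (p, q).
    q = posterior(p, channel)
    rate = sum(p[x] * channel[x][y] * math.log2(q[x][y] / p[x])
               for x in range(nx) for y in range(ny)
               if p[x] > 0 and channel[x][y] > 0)
    return rate, p


if __name__ == "__main__":
    bsc = [[0.9, 0.1], [0.1, 0.9]]  # binary symmetric channel, crossover 0.1
    print(blahut_arimoto(bsc))
```

For a binary symmetric channel the uniform input is already a fixed point of the updates, so the returned rate should agree with $1 - H_b(0.1)$; an asymmetric channel exercises the iteration itself.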

(a) Argue that knowledge of the fading factor $V$ improves capacity by showing
$$I(X;Y \mid V) \ge I(X;Y).$$

(b) Incidentally, conditioning does not always increase mutual information. Give an example of $p(u, r, s)$ such that $I(U;R \mid S) < I(U;R)$.

Solution: Fading channel.

(a) We may show the inequality as follows:
$$\begin{aligned}
I(X;Y \mid V) &= h(X \mid V) - h(X \mid Y, V) \\
&= h(X) - h(X \mid Y, V) && (7) \\
&\ge h(X) - h(X \mid Y) && (8) \\
&= I(X;Y),
\end{aligned}$$
where (7) follows from the independence of $X$ and $V$, and (8) is true because conditioning reduces entropy.

(b) There are many examples for this case. For instance, consider two cascaded BSCs where the input to the channel is $U$, the output of the first channel is $R$, and the output of the second channel is $S$. By construction, $U \to R \to S$ forms a Markov chain. Now we know that
$$\begin{aligned}
I(U;R) &\stackrel{(a)}{=} I(U;R,S) && (9) \\
&\stackrel{(b)}{=} I(U;S) + I(U;R \mid S) && (10) \\
&\ge I(U;R \mid S), && (11)
\end{aligned}$$
where (a) follows from the Markov chain, (b) is the chain rule, and the inequality follows since mutual information is always non-negative. The inequality is strict whenever $I(U;S) > 0$, which holds, for example, when both BSCs have crossover probability different from $1/2$ and $U$ is not deterministic.

12. Additive Gaussian channel where the noise might be a relay. In this question we consider a channel with additive Gaussian noise as seen in class. Consider the channel presented in Fig. 3.

Figure 3: Additive Gaussian noise channel. [The noises $Z$ and $N$ are added to the input $X$ to give $Y$.]

$$Y = X + Z + N,$$
where $Z \sim N(0, \sigma_1^2)$ and $N \sim N(0, \sigma_2^2)$ are additive noises and the input $X$ has power constraint $P$. $N$, $Z$ and $X$ are independent.

(a) Calculate the capacity of the channel, assuming that the noise is independent of the message that the encoder uses for determining $X_i$.

(b) Now it is given that $Z_i$ is the output of a relay-encoder which has access to the same message $M$ that the channel encoder has. Hence $X$ and $Z$ are no longer independent. It is also given that $Z$ has a power constraint $P$, namely $\frac{1}{n}\sum_{i=1}^n Z_i^2 \le P$ with high probability. Find the capacity of the channel and the probability density function $f(x, z)$ for which it is achieved.

Solution: Additive Gaussian channel where the noise might be a relay.

See attached file.

13. True or False. Copy the following to your notebook and write true or false. Then, if it is true, prove it. Otherwise, if it is false, give a counterexample or prove that the opposite is true.

Let $X$ be a continuous random variable. Then the following holds:
$$I(X;X) = h(X).$$

Solution: This claim is false, since the mutual information term represents a Gaussian channel with no noise. For this channel the capacity is infinite, whereas the entropy of $X$ is bounded above by the entropy of a Gaussian RV with the same variance, which is $\frac{1}{2}\log(2\pi e \sigma_x^2)$.

14. Two antennas with Gaussian noise. In this question we consider a point-to-point discrete memoryless channel (DMC) in which the transmitter and the receiver both have two antennas, as illustrated in Fig. 4. This channel is defined by two input alphabets $\mathcal{X}_1$ and $\mathcal{X}_2$, two output alphabets $\mathcal{Y}_1$ and $\mathcal{Y}_2$, and a channel transition matrix $P_{Y_1 Y_2 \mid X_1 X_2}$. A message $M$ is chosen uniformly at random from the message set $\mathcal{M} = \{1, 2, \ldots, 2^{nR}\}$ and is to be transmitted from the encoder to the decoder in a lossless manner (as defined in class).

Figure 4: Two antenna point-to-point DMC. [The encoder maps $M$ to $(X_1^n, X_2^n)$; the channel $P_{Y_1 Y_2 \mid X_1 X_2}$ outputs $(Y_1^n, Y_2^n)$, from which the decoder produces $\hat{M}$.]

(a) What is the capacity of the channel?

Now, consider the following Gaussian two antenna point-to-point DMC, illustrated in Fig. 5. The outputs of the channel for every time $i \in \{1, \ldots, n\}$ are given by
$$Y_{1,i} = X_{1,i} + Z_1, \tag{12}$$
$$Y_{2,i} = X_{1,i} + X_{2,i} + Z_1 + Z_2, \tag{13}$$
where $(Z_1, Z_2)$ are two independent (of each other and of everything else) Gaussian random variables distributed according to $Z_1 \sim N(0, N_1)$ and $Z_2 \sim N(0, N_2)$. The input signals are bound to average power constraints,
$$E\left[\frac{1}{n}\sum_{i=1}^n X_{1,i}^2\right] \le P_1; \qquad E\left[\frac{1}{n}\sum_{i=1}^n X_{2,i}^2\right] \le P_2. \tag{14}$$

Figure 5: A Gaussian two antenna point-to-point DMC.

(b) Find the capacity of the Gaussian channel in terms of the provided parameters and state the joint distribution of $(X_1, X_2)$ that achieves it.

Solution:

(a) Let us denote the input pair $(X_1^n, X_2^n)$ by $\tilde{X}^n$ and the output pair $(Y_1^n, Y_2^n)$ by $\tilde{Y}^n$. An equivalent channel to the one considered in this question is the point-to-point DMC for which $(\tilde{X}^n, \tilde{Y}^n)$ serve as the channel's input and output sequences, respectively, and the channel transition matrix is $P_{\tilde{Y} \mid \tilde{X}}$. Recalling that the point-to-point channel capacity is given by $\max_{P_{\tilde{X}}} I(\tilde{X}; \tilde{Y})$, and substituting $\tilde{X}^n = (X_1^n, X_2^n)$ and $\tilde{Y}^n = (Y_1^n, Y_2^n)$, we obtain
$$C = \max_{P_{X_1 X_2}} I(X_1, X_2; Y_1, Y_2). \tag{15}$$

(b) First note that $Y_2$ can be rewritten as $Y_2 = Y_1 + X_2 + Z_2$. Now, we upper

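bound the capacity as:
$$\begin{aligned}
I(X_1, X_2; Y_1, Y_2) &= I(X_1, X_2; Y_1) + I(X_1, X_2; Y_2 \mid Y_1) \\
&= h(Y_1) - h(Y_1 \mid X_1, X_2) + h(Y_2 \mid Y_1) - h(Y_2 \mid X_1, X_2, Y_1) \\
&\stackrel{(a)}{=} h(Y_1) - h(Z_1) + h(Y_1 + X_2 + Z_2 \mid Y_1) - h(Y_1 + X_2 + Z_2 \mid X_1, X_2, Y_1) \\
&\stackrel{(b)}{=} h(Y_1) - h(Z_1) + h(X_2 + Z_2 \mid Y_1) - h(Z_2) \\
&\stackrel{(c)}{\le} h(Y_1) - h(Z_1) + h(X_2 + Z_2) - h(Z_2) \\
&= h(Y_1) - \frac{1}{2}\log(2\pi e N_1) + h(X_2 + Z_2) - \frac{1}{2}\log(2\pi e N_2) \\
&\stackrel{(d)}{\le} \frac{1}{2}\log\left(2\pi e (P_1 + N_1)\right) - \frac{1}{2}\log(2\pi e N_1) + \frac{1}{2}\log\left(2\pi e (P_2 + N_2)\right) - \frac{1}{2}\log(2\pi e N_2) \\
&= \frac{1}{2}\sum_{i=1}^{2} \log\left(1 + \frac{P_i}{N_i}\right),
\end{aligned}$$
where:

(a) follows from the definitions of $Y_1$ and $Y_2$ and the fact that $Z_1$ is independent of $(X_1, X_2)$;

(b) follows from the fact that $Z_2$ is independent of $(X_1, X_2, Z_1)$ and therefore it is independent of $(X_1, X_2, Y_1)$;

(c) follows from the fact that conditioning reduces entropy;

(d) follows from the maximum differential entropy property.

This upper bound is achieved by choosing $(X_1, X_2)$ to be jointly Gaussian RVs with the following distribution:
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} P_1 & 0 \\ 0 & P_2 \end{pmatrix}\right). \tag{16}$$
This distribution achieves (c) with equality, since by this choice $Y_1 = X_1 + Z_1$ and $X_2 + Z_2$ are independent, whereas (d) is achieved with equality since by this choice $Y_1$ and $X_2 + Z_2$ are Gaussian RVs with variances $P_1 + N_1$ and $P_2 + N_2$, respectively (which achieves the maximum of entropy).

15. Complex Gaussian Channel. The following question focuses on the complex Gaussian point-to-point communication channel.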

Let $Z = U + iV$ be a complex Gaussian RV in the sense that $U$ and $V$ are independent and identically distributed real Gaussian RVs. In the following sections $Z \sim CN(0, \gamma)$, where
$$0 = E[Z]; \qquad \gamma = E\left[|Z|^2\right], \tag{17}$$
and $\gamma$ is a given positive parameter.

(a) Find the distribution of the random vector $(\Re\{Z\}, \Im\{Z\})^T = (U, V)^T$.

(b) Is it true that $h(Z) = h(U, V)$? Justify your answer.

(c) Calculate $h(Z)$.

(d) What is the maximum of the differential entropy over all centered complex RVs $Z = U + iV$ with $E[U^2] + E[V^2] \le \gamma$? Which distribution of $Z$ achieves this maximum?

Finally, consider the complex Gaussian channel illustrated in Fig. 6.

Figure 6: A complex Gaussian point-to-point channel. [The noise $Z_i \sim CN(0, N)$ is added to $X_i$ to give $Y_i$.]

The output of the channel for every time $i \in \{1, \ldots, n\}$ is given by
$$Y_i = X_i + Z_i, \tag{18}$$
where $X_i$, for $i \in \{1, \ldots, n\}$, is a complex channel input and $Z_i$ is distributed i.i.d. according to $Z_i \sim CN(0, N)$. The input signal is bound to an average power constraint,
$$E\left[\frac{1}{n}\sum_{i=1}^n |X_i|^2\right] \le P. \tag{19}$$

The capacity of the complex Gaussian channel is given by
$$C = \max_{f_X:\ E[|X|^2] \le P} I(X;Y). \tag{20}$$

(e) Express the capacity in (20) in terms of the parameters of the problem (i.e., as a function of $P$ and $N$) and state the distribution of the complex input RV $X$ that achieves the maximum.

(f) Compare the result to the capacity of the real point-to-point channel. Explain the difference.

Solution: Complex Gaussian Channel.

See attached file.

16. True or False. Copy each relation to your notebook and write true or false. Then, if it is true, prove it. If it is false, give a counterexample or prove that the opposite is true.

(a) Let $X, Y$ be a pair of random variables jointly distributed according to $P_{X,Y}$. For every $y \in \mathcal{Y}$ we have $H(X \mid Y = y) \le H(X)$.

(b) Consider the channel in Fig. 7, where $Z$ is a Gaussian noise with power $N$ and $a$ is a deterministic constant. There is a power constraint on $X$ such that $E[X^2] \le P$. The capacity between $X$ and $Y$ is denoted by $C$. Is $C = \frac{1}{2}\log\left(1 + \frac{P}{N}\right)$?

Figure 7: Cascaded AWGN channel. [The noise $Z$ is added to $X$ to give $\tilde{Y}$, which is multiplied by $a$ to give $Y$.]

Solutions

(a) False. For example, consider the joint distribution
$$\begin{array}{c|cc} Y \backslash X & 0 & 1 \\ \hline 0 & 0.25 & 0.25 \\ 1 & 0.5 & 0 \end{array}$$
For this distribution we can calculate $H_b(0.25) = H(X) < H(X \mid Y = 0) = 1$.

(b) True. For the converse, if multiplying by $a$ could increase the capacity, then we would use it in the point-to-point channel. For achievability, we divide by $a$ and apply the decoding procedure as in the point-to-point channel.

(c) True. This can be done as in the previous question. Another approach is by noting that the function $(\cdot)^3$ is bijective, and therefore by having $Y$ we can indeed recover $\tilde{Y}$, and
$$I(X;\tilde{Y}) = H(\tilde{Y}) - H(\tilde{Y} \mid X) = H(\tilde{Y}, Y) - H(Y, \tilde{Y} \mid X) = H(Y) - H(Y \mid X) = I(X;Y),$$
and thus $\max_{p(x)} I(X;Y) = \max_{p(x)} I(X;\tilde{Y}) = \frac{1}{2}\log(1 + \mathrm{SNR})$.

17. Parallel Gaussian channels. Consider a channel consisting of parallel Gaussian channels, with inputs $X_1$ and $X_2$ and outputs given by
$$Y_1 = X_1 + Z_1, \qquad Y_2 = X_2 + Z_2.$$

Figure 8: Parallel Gaussian channels.

The random variables $Z_1$ and $Z_2$ are independent of each other and of the inputs, and have the variances $\sigma_1^2$ and $\sigma_2^2$ respectively, with $\sigma_1^2 < \sigma_2^2$.

(a) Suppose $X_1 = X_2 = X$ and we have the power constraint $E[X^2] \le P$. At the receiver, an output $Y = Y_1 + Y_2$ is generated. What is the capacity $C_a$ of the resulting channel with $X$ as the input and $Y$ as the output?

(b) Suppose that we still have to transmit the same signal on both channels, but we can now choose how to distribute the power between the channels, i.e., $X_1 = aX$ and $X_2 = bX$. The new constraint is $E[X_1^2] + E[X_2^2] \le 2P$. What is the capacity, $C_b$, of this channel with $X$ as the input and $(Y_1, Y_2)$ as the output? Which $a$ and $b$ achieve that capacity?

(c) We now assume that $Z_1$ and $Z_2$ are dependent, specifically, $Z_2 = Z_1$. As in subsection (b), we can choose how to distribute the power between the channels, i.e., $X_1 = aX$ and $X_2 = bX$ under the power constraint $E[X_1^2] + E[X_2^2] \le 2P$. The outputs of the channels are given by
$$Y_1 = aX + Z_1, \qquad Y_2 = bX + Z_1.$$
What is the capacity, $C_c$, of this channel with $X$ as the input and $(Y_1, Y_2)$ as the output? Which $a$ and $b$ achieve that capacity?

Solution: Parallel Gaussian channels.

(a) This channel has an input $X$ and output $Y$ and, as we learned in class, the capacity of the Gaussian channel is given by
$$C = \frac{1}{2}\log(1 + \mathrm{SNR}). \tag{21}$$
In our case,
$$\mathrm{SNR} = \frac{E[(X_1 + X_2)^2]}{E[(Z_1 + Z_2)^2]} = \frac{4E[X^2]}{\sigma_1^2 + \sigma_2^2} \le \frac{4P}{\sigma_1^2 + \sigma_2^2}. \tag{22}$$
So, the capacity of this channel is given by
$$C = \frac{1}{2}\log\left(1 + \frac{4P}{\sigma_1^2 + \sigma_2^2}\right). \tag{23}$$

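(b) Let
$$Y_1 = aX + Z_1, \qquad Y_2 = bX + Z_2, \tag{24}$$
where $Z_1$ and $Z_2$ are independent of each other and have the variances $\sigma_1^2$ and $\sigma_2^2$ respectively, with $\sigma_1^2 < \sigma_2^2$. We seek the values of $a, b$ that maximize
$$I(X; Y_1, Y_2) = h(Y_1, Y_2) - h(Y_1, Y_2 \mid X) = h(Y_1, Y_2) - h(Z_1, Z_2) = h(Y_1, Y_2) - \frac{1}{2}\log\left((2\pi e)^2 \sigma_1^2 \sigma_2^2\right), \tag{25}$$
under the constraint $a^2 + b^2 \le 2$. In order to find $h(Y_1, Y_2)$ we need the covariance matrix of $(Y_1, Y_2)$, which is given by
$$\Sigma_{Y_1,Y_2} = \begin{pmatrix} a^2 P + \sigma_1^2 & abP \\ abP & b^2 P + \sigma_2^2 \end{pmatrix}. \tag{26}$$
Then,
$$\begin{aligned}
|\Sigma_{Y_1,Y_2}| &= (a^2 P + \sigma_1^2)(b^2 P + \sigma_2^2) - a^2 b^2 P^2 \\
&= P(a^2 \sigma_2^2 + b^2 \sigma_1^2) + \sigma_1^2 \sigma_2^2 \\
&\le P\left(a^2 \sigma_2^2 + (2 - a^2)\sigma_1^2\right) + \sigma_1^2 \sigma_2^2 \\
&= a^2 P(\sigma_2^2 - \sigma_1^2) + (2P + \sigma_2^2)\sigma_1^2,
\end{aligned} \tag{27}$$
and
$$h(Y_1, Y_2) \le \log 2\pi e + \frac{1}{2}\log\left[a^2 P(\sigma_2^2 - \sigma_1^2) + (2P + \sigma_2^2)\sigma_1^2\right]. \tag{28}$$
We can now see that, since $\sigma_1^2 < \sigma_2^2$, the expression in (28) achieves its maximum value when $a^2$ achieves its maximal value, namely, for $a^2 = 2$. We conclude that the optimal strategy in this case is to use only $X_1$ to transmit the data, and the capacity is thus
$$C_b = \frac{1}{2}\log\left(1 + \frac{2P}{\sigma_1^2}\right). \tag{29}$$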
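The conclusion of part (b), that all power should go to the less noisy branch, can be checked with a simple grid search over $a^2$. This is a minimal sketch under stated assumptions, not part of the original solution: it assumes a Gaussian input $X$ with variance $P$, uses the full budget $b^2 = 2 - a^2$, and the function name `two_branch_rate_bits` and the numeric values are hypothetical.

```python
import math


def two_branch_rate_bits(a_sq, power, var1, var2):
    """I(X; Y1, Y2) in bits for Y1 = aX + Z1, Y2 = bX + Z2 with b^2 = 2 - a^2.

    Uses I = 1/2 log2(det(Sigma_Y) / (var1 * var2)) for a Gaussian input
    of variance `power`, following equations (25) and (26).
    """
    b_sq = 2.0 - a_sq
    det_y = ((a_sq * power + var1) * (b_sq * power + var2)
             - a_sq * b_sq * power ** 2)
    return 0.5 * math.log2(det_y / (var1 * var2))


if __name__ == "__main__":
    power, var1, var2 = 1.0, 1.0, 3.0  # illustrative values with var1 < var2
    grid = [2.0 * k / 100 for k in range(101)]  # candidate values of a^2 in [0, 2]
    best_a_sq = max(grid, key=lambda a_sq: two_branch_rate_bits(a_sq, power, var1, var2))
    print("best a^2:", best_a_sq)
    print("rate    :", two_branch_rate_bits(best_a_sq, power, var1, var2))
    print("eq. (29):", 0.5 * math.log2(1 + 2 * power / var1))
```

Because the determinant in (27) is linear and increasing in $a^2$ when $\sigma_1^2 < \sigma_2^2$, the search should land on the endpoint $a^2 = 2$.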

(c) In this case, we can set $X_1 = 0$, $X_2 = X$ and $Y = Y_2 - Y_1$. Substituting the equations for $Y_1$, $Y_2$ and $Z_2$, we see that
$$Y = X. \tag{30}$$
Thus, the capacity is infinite.

18. Fast fading Gaussian channel. Consider a Gaussian channel given by
$$Y_i = G_i X_i + Z_i,$$
where $Z_i \overset{\text{i.i.d.}}{\sim} N(0, N)$ and $G_i \overset{\text{i.i.d.}}{\sim} P_G(g)$.

Figure 9: Fast fading Gaussian channel. [The encoder maps $M$ to $X_i$; the channel multiplies by $G_i$ and adds $Z_i$ to give $Y_i$, from which the decoder produces $\hat{M}$.]

The gains and noise are independent, i.e., $\{Z_i\} \perp \{G_i\}$, and
$$P_G(g) = \begin{cases} 0.5 & \text{if } g = 1 \\ 0.5 & \text{if } g = 2. \end{cases}$$

(a) Assume that the states are known at the decoder only, and there is an input constraint $P$.

i. What is the capacity formula?

ii. Find the optimal input distribution in the formula you gave.

iii. Compute the capacity as a function of $N$ and $P$.

(b) Now the states are known both to the encoder and decoder, and the input constraint is $P$.

i. What is the capacity formula?

ii. Compute the capacity as a function of $N$ and $P$. You can write your answer as an optimization problem.

(c) Assume
$$P_G(g) = \begin{cases} 0.5 & \text{if } g = 0 \\ 0.5 & \text{if } g = 1. \end{cases}$$
Repeat 18b.

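Solution: Fast fading Gaussian channel.

(a) i. As we saw in the lectures, the capacity is given by
$$C_1 = \sup_{P_X} I(X;Y \mid G), \tag{31}$$
where the maximum is taken over all $X$-distributions such that $E(X^2) \le P$.

ii. We show in the next item that the optimal input is $X \sim N(0, P)$.

iii. The states are known only to the decoder, and thus we may assume that $X \perp G$. We have
$$\begin{aligned}
I(X;Y \mid G) &= h(Y \mid G) - h(Y \mid X, G) \\
&= h(Y \mid G) - h(Y - GX \mid X, G) \\
&= h(Y \mid G) - h(Z \mid X, G) \\
&\stackrel{(a)}{=} h(Y \mid G) - h(Z) \\
&= P(G = 1)\, h(Y \mid G = 1) + P(G = 2)\, h(Y \mid G = 2) - h(Z) \\
&= \frac{1}{2} h(X + Z \mid G = 1) + \frac{1}{2} h(2X + Z \mid G = 2) - h(Z) \\
&\stackrel{(a)}{=} \frac{1}{2} h(X + Z) + \frac{1}{2} h(2X + Z) - \frac{1}{2}\log(2\pi e N),
\end{aligned} \tag{32}$$
where (a) follows from the fact that $Z \perp (X, G)$. Now, the maximal entropy lemma implies
$$h(X + Z) \le \frac{1}{2}\log\left(2\pi e (P + N)\right), \tag{33}$$
$$h(2X + Z) \le \frac{1}{2}\log\left(2\pi e (4P + N)\right), \tag{34}$$
with equality if and only if $X \sim N(0, P)$. Thus,
$$I(X;Y \mid G) \le \frac{1}{4}\log\left(\frac{(P + N)(4P + N)}{N^2}\right), \tag{35}$$
again, with equality if and only if $X \sim N(0, P)$. Therefore
$$C_1 = \max_{P_X} I(X;Y \mid G) = \frac{1}{4}\log\left(1 + \frac{P}{N}\right) + \frac{1}{4}\log\left(1 + \frac{4P}{N}\right). \tag{36}$$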
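Equation (36) is the average, over the fading states, of the Gaussian capacity $\frac{1}{2}\log(1 + g^2 P/N)$. The short sketch below, added as an illustration and not taken from the original solution, evaluates that average for a gain distribution passed as a dictionary and compares it with the closed form (35); the function name `decoder_csi_capacity_bits` and the numeric values are hypothetical.

```python
import math


def decoder_csi_capacity_bits(gain_probs, power, noise):
    """Average of 1/2 log2(1 + g^2 P / N) over the gain distribution.

    gain_probs maps each gain value g to its probability P_G(g).
    Assumes a Gaussian input with variance `power`, independent of the gain.
    """
    return sum(prob * 0.5 * math.log2(1 + gain ** 2 * power / noise)
               for gain, prob in gain_probs.items())


if __name__ == "__main__":
    power, noise = 3.0, 1.0  # illustrative values only
    c1 = decoder_csi_capacity_bits({1: 0.5, 2: 0.5}, power, noise)
    closed_form = 0.25 * math.log2((power + noise) * (4 * power + noise) / noise ** 2)
    print(c1, closed_form)
```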

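(b) i. Again, as we saw in the lectures, the capacity in this case is given by
$$C_2 = \sup_{P_{X \mid G}} I(X;Y \mid G), \tag{37}$$
where the maximum is over all distributions $P_{X \mid G}$ which satisfy the power constraint.

ii. As before, we get
$$I(X;Y \mid G) = \frac{1}{2} h(X + Z \mid G = 1) + \frac{1}{2} h(2X + Z \mid G = 2) - \frac{1}{2}\log(2\pi e N). \tag{38}$$
Define the functional
$$\mathrm{var}(X \mid W = w) = E(X^2 \mid W = w). \tag{39}$$
Then, let $\mathrm{var}(X \mid G = i) = P_i$ for $i = 1, 2$. By the power constraint, we have that $P_1 + P_2 \le 2P$. Accordingly, due to the maximal entropy lemma, we get
$$h(X + Z \mid G = 1) \le \frac{1}{2}\log\left(2\pi e (P_1 + N)\right), \tag{40}$$
$$h(2X + Z \mid G = 2) \le \frac{1}{2}\log\left(2\pi e (4P_2 + N)\right), \tag{41}$$
where both inequalities hold with equality if $X \mid G = i$ is Gaussian with variance $P_i$. Hence,
$$I(X;Y \mid G) \le \frac{1}{4}\log\left(\frac{(P_1 + N)(4P_2 + N)}{N^2}\right). \tag{42}$$
Therefore, the capacity is given by
$$C_2 = \sup_{(P_1, P_2):\ P_1 + P_2 \le 2P} \frac{1}{4}\log\left(\frac{(P_1 + N)(4P_2 + N)}{N^2}\right) \tag{43}$$
$$= \sup_{(P_1, P_2):\ P_1 + P_2 \le 2P} \left\{\frac{1}{4}\log\left(1 + \frac{P_1}{N}\right) + \frac{1}{4}\log\left(1 + \frac{4P_2}{N}\right)\right\}. \tag{44}$$
Remark: This optimization problem can be solved by water-filling.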
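The water-filling remark can be made concrete. In (44) the two fading states behave like parallel channels with effective noise levels $N$ (for $G = 1$) and $N/4$ (for $G = 2$) and total power budget $2P$. The sketch below is one possible way to solve it, offered under these assumptions rather than as the course's method: it finds the water level by bisection with a fixed number of halvings, and the function names `water_fill` and `full_csi_capacity_bits` are hypothetical.

```python
import math


def water_fill(noise_levels, total_power, halvings=100):
    """Return powers max(nu - n_i, 0) whose sum equals total_power (bisection on nu)."""
    lo = min(noise_levels)
    hi = max(noise_levels) + total_power
    for _ in range(halvings):
        nu = 0.5 * (lo + hi)
        used = sum(max(nu - n, 0.0) for n in noise_levels)
        if used > total_power:
            hi = nu
        else:
            lo = nu
    nu = 0.5 * (lo + hi)
    return [max(nu - n, 0.0) for n in noise_levels]


def full_csi_capacity_bits(power, noise):
    """Evaluate (44) at the water-filling allocation for gains 1 and 2."""
    effective_noise = [noise, noise / 4.0]  # N / g^2 for g = 1 and g = 2
    p1, p2 = water_fill(effective_noise, 2.0 * power)
    rate = (0.25 * math.log2(1 + p1 / noise)
            + 0.25 * math.log2(1 + 4 * p2 / noise))
    return rate, p1, p2


if __name__ == "__main__":
    print(full_csi_capacity_bits(3.0, 1.0))  # moderate power: both states used
    print(full_csi_capacity_bits(0.1, 1.0))  # low power: only the strong state used
```

At low power the allocation puts everything on the stronger state $G = 2$, mirroring the threshold behaviour discussed in Problem 9.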

(c) As before, we get
$$\begin{aligned}
I(X;Y \mid G) &= \frac{1}{2} h(Z \mid G = 0) + \frac{1}{2} h(X + Z \mid G = 1) - \frac{1}{2}\log(2\pi e N) \\
&= \frac{1}{2} h(X + Z \mid G = 1) - \frac{1}{4}\log(2\pi e N).
\end{aligned} \tag{45}$$
Also,
$$h(X + Z \mid G = 1) \le \frac{1}{2}\log\left(2\pi e (P_1 + N)\right). \tag{46}$$
Thus,
$$C_3 = \sup_{P_1:\ P_1 \le 2P} \frac{1}{4}\log\left(\frac{P_1 + N}{N}\right) = \frac{1}{4}\log\left(1 + \frac{2P}{N}\right). \tag{47}$$
Intuition: To achieve (47) the transmitter will not send any data when $G = 0$ (because in this case our information will be lost), but rather all the data will be transmitted when $G = 1$ (and with power $2P$, to satisfy the power constraint). Now, when $G = 1$, the channel reduces to a simple Gaussian channel, $Y = X + Z$, with signal-to-noise ratio $2P/N$; namely, we can achieve a rate of
$$\frac{1}{2}\log\left(1 + \frac{2P}{N}\right).$$
However, since with high probability $G = 1$ half of the time, we need to multiply the last result by half, and we get the quarter factor in (47).