Gaussian channel
Information theory 2013, lecture 6
Jens Sjölund
8 May 2013
Outline
1. Definitions
2. The coding theorem for the Gaussian channel
3. Bandlimited channels
4. Parallel Gaussian channels
5. Channels with colored Gaussian noise
6. Gaussian channels with feedback
Introduction

Definition. A Gaussian channel is a time-discrete channel with output $Y_i$, input $X_i$, and noise $Z_i$ at time $i$ such that

$$Y_i = X_i + Z_i, \quad Z_i \sim N(0, N),$$

where the $Z_i$ are i.i.d. and independent of $X_i$.

The Gaussian channel is the most important continuous-alphabet channel, modeling a wide range of communication channels. The validity of the Gaussian noise model follows from the central limit theorem: the aggregate of many small independent noise sources is approximately Gaussian.
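To make the channel model concrete, here is a minimal simulation sketch; the power P, noise variance N, and block length n are illustrative values, not from the lecture.

```python
import numpy as np

# Simulate n uses of the Gaussian channel Y_i = X_i + Z_i.
rng = np.random.default_rng(0)
P, N, n = 1.0, 0.5, 100_000              # assumed illustrative values

X = rng.normal(0.0, np.sqrt(P), size=n)  # Gaussian input with E[X^2] = P
Z = rng.normal(0.0, np.sqrt(N), size=n)  # i.i.d. noise, independent of X
Y = X + Z

print(np.var(Y))                         # concentrates around P + N = 1.5
```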
The power constraint

If the noise variance is zero or the input is unconstrained, the capacity of the channel is infinite. The most common limitation on the input is a power constraint: $E X_i^2 \le P$.
Information capacity

Theorem. The information capacity of a Gaussian channel with power constraint P and noise variance N is

$$C = \max_{f(x)\,:\,E X^2 \le P} I(X;Y) = \frac{1}{2}\log\left(1 + \frac{P}{N}\right).$$

Proof. Use that $E Y^2 = P + N$ implies $h(Y) \le \frac{1}{2}\log 2\pi e (P + N)$ (Theorem 8.6.5). Then

$$I(X;Y) = h(Y) - h(Y|X) = h(Y) - h(X + Z|X) = h(Y) - h(Z|X) = h(Y) - h(Z)$$
$$\le \frac{1}{2}\log 2\pi e (P + N) - \frac{1}{2}\log 2\pi e N = \frac{1}{2}\log\frac{P + N}{N} = \frac{1}{2}\log\left(1 + \frac{P}{N}\right),$$

which holds with equality iff $X \sim N(0, P)$.
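The closed form can be sanity-checked numerically: with $X \sim N(0, P)$, both $f(y|x)$ and $f(y)$ are known Gaussians, so $I(X;Y) = E[\log f(Y|X) - \log f(Y)]$ can be estimated by Monte Carlo. A minimal sketch, with assumed values of P and N:

```python
import numpy as np

rng = np.random.default_rng(1)
P, N, n = 1.0, 0.5, 1_000_000    # assumed illustrative values

X = rng.normal(0.0, np.sqrt(P), size=n)
Y = X + rng.normal(0.0, np.sqrt(N), size=n)

# Log-densities of Y|X ~ N(X, N) and Y ~ N(0, P + N).
log_f_cond = -0.5 * np.log(2 * np.pi * N) - (Y - X) ** 2 / (2 * N)
log_f_marg = -0.5 * np.log(2 * np.pi * (P + N)) - Y ** 2 / (2 * (P + N))

I_hat = np.mean(log_f_cond - log_f_marg) / np.log(2)  # nats -> bits
print(I_hat, 0.5 * np.log2(1 + P / N))  # estimate vs. closed form
```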
Statement of the theorem

Theorem (9.1.1). The capacity of a Gaussian channel with power constraint P and noise variance N is

$$C = \frac{1}{2}\log\left(1 + \frac{P}{N}\right).$$

Proof. See the book. Based on random codes and joint typicality decoding (similar to the proof for the discrete case).
Handwaving motivation

The volume of an n-dimensional sphere is proportional to $r^n$. The radius of the received vectors is $\sqrt{\sum_{i=1}^{n} Y_i^2} \approx \sqrt{n(P + N)}$. Using decoding spheres of radius $\sqrt{nN}$, the maximum number of nonintersecting decoding spheres is therefore no more than

$$M = \frac{\left(\sqrt{n(P + N)}\right)^n}{\left(\sqrt{nN}\right)^n} = \left(1 + \frac{P}{N}\right)^{n/2} = 2^{\frac{n}{2}\log\left(1 + \frac{P}{N}\right)}$$

(note the typo in eq. 9.22 in the book). This corresponds to a rate

$$R = \frac{\log M}{n} = \frac{1}{2}\log\left(1 + \frac{P}{N}\right) = C.$$
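The concentration of the received vector on a sphere of radius $\sqrt{n(P + N)}$ is easy to check numerically; the values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
P, N, n = 1.0, 0.5, 10_000       # assumed illustrative values

# One received block Y = X + Z; its norm concentrates sharply.
Y = rng.normal(0, np.sqrt(P), n) + rng.normal(0, np.sqrt(N), n)
print(np.linalg.norm(Y), np.sqrt(n * (P + N)))  # nearly equal for large n
```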
Bandlimited channels

A bandlimited channel with white noise can be described as

$$Y(t) = (X(t) + Z(t)) * h(t),$$

where $X(t)$ is the signal waveform, $Z(t)$ is the waveform of the white Gaussian noise, and $h(t)$ is the impulse response of an ideal bandpass filter, which cuts out all frequencies greater than W.

Theorem (Nyquist-Shannon sampling theorem). Suppose that a function $f(t)$ is bandlimited to W. Then the function is completely determined by samples of the function spaced $\frac{1}{2W}$ seconds apart.
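A minimal numerical sketch of the sampling theorem: reconstructing a bandlimited signal at an arbitrary time from its samples by sinc interpolation. The bandwidth, test tone, and truncation window below are assumed for illustration.

```python
import numpy as np

W = 4.0                           # assumed bandwidth, hertz
T = 1.0 / (2.0 * W)               # sampling interval 1/(2W) seconds
k = np.arange(-200, 201)          # finite window of sample indices

def f(t):
    # Bandlimited test signal: a 3 Hz tone, below W.
    return np.cos(2 * np.pi * 3.0 * t)

# Shannon reconstruction f(t) = sum_k f(kT) sinc((t - kT)/T),
# truncated to the finite window above (hence only approximate).
t = 0.123
f_hat = np.sum(f(k * T) * np.sinc((t - k * T) / T))
print(f_hat, f(t))                # nearly equal, up to truncation error
```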
Bandlimited channels

Sampling at 2W hertz, the energy per sample is $P/2W$. Assuming a noise spectral density of $N_0/2$, the capacity is

$$C = \frac{1}{2}\log\left(1 + \frac{P/2W}{N_0/2}\right) = \frac{1}{2}\log\left(1 + \frac{P}{N_0 W}\right) \text{ bits per sample}$$
$$= W \log\left(1 + \frac{P}{N_0 W}\right) \text{ bits per second},$$

which is arguably one of the most famous formulas of information theory.
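As a quick numerical illustration of the formula (the bandwidth and signal-to-noise ratio below are assumed, textbook-style telephone-line numbers, not from the slides):

```python
import numpy as np

def capacity_bps(W, snr):
    """W log2(1 + P/(N0 W)) with snr = P/(N0 W), in bits per second."""
    return W * np.log2(1 + snr)

# Assumed values: W = 3300 Hz, SNR = 33 dB.
print(capacity_bps(3300.0, 10 ** (33 / 10)))  # roughly 36 kbit/s
```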
Parallel Gaussian channels

Consider k independent Gaussian channels in parallel with a common power constraint:

$$Y_j = X_j + Z_j, \quad Z_j \sim N(0, N_j), \quad j = 1, 2, \ldots, k,$$
$$E \sum_{j=1}^{k} X_j^2 \le P.$$

The objective is to distribute the total power among the channels so as to maximize the capacity.
Information capacity of parallel Gaussian channels

Following the same line of reasoning as in the single-channel case, one may show

$$I(X_1,\ldots,X_k; Y_1,\ldots,Y_k) \le \sum_j \left( h(Y_j) - h(Z_j) \right) \le \sum_j \frac{1}{2}\log\left(1 + \frac{P_j}{N_j}\right),$$

where $P_j = E X_j^2$ and $\sum_j P_j = P$. Equality is achieved by

$$(X_1,\ldots,X_k) \sim N\left(0, \begin{pmatrix} P_1 & 0 & \cdots & 0 \\ 0 & P_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & P_k \end{pmatrix}\right).$$
Maximizing the information capacity

The objective is thus to maximize the capacity subject to the constraint $\sum_j P_j = P$, which can be done using a Lagrange multiplier $\lambda$:

$$J(P_1,\ldots,P_k) = \sum_j \frac{1}{2}\log\left(1 + \frac{P_j}{N_j}\right) + \lambda \sum_j P_j,$$
$$\frac{\partial J}{\partial P_i} = \frac{1}{2(P_i + N_i)} + \lambda = 0 \implies P_i = \nu - N_i,$$

where the constant $-1/(2\lambda)$ has been absorbed into $\nu$. However, $P_i \ge 0$, so the solution needs to be modified to

$$P_i = (\nu - N_i)^+ = \begin{cases} \nu - N_i & \text{if } \nu \ge N_i, \\ 0 & \text{if } \nu < N_i. \end{cases}$$
Water-filling

This gives the capacity

$$C = \sum_{j=1}^{k} \frac{1}{2}\log\left(1 + \frac{(\nu - N_j)^+}{N_j}\right),$$

where $\nu$ is chosen so that $\sum_{j=1}^{k} (\nu - N_j)^+ = P$ (note the typo in the book's summary). As the signal power increases, power is first allotted to the channels with the least noise, bringing the sum of noise and signal power to a common level $\nu$. By analogy with the way water distributes itself in a vessel, this is referred to as water-filling.
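A minimal water-filling sketch, assuming the level $\nu$ is found by bisection; the noise levels and total power below are illustrative.

```python
import numpy as np

def water_filling(noise, P, iters=100):
    """Find the water level nu with sum_j (nu - noise_j)^+ = P by bisection."""
    lo, hi = 0.0, float(np.max(noise)) + P   # nu must lie in this interval
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        if np.maximum(nu - noise, 0.0).sum() > P:
            hi = nu
        else:
            lo = nu
    power = np.maximum(nu - noise, 0.0)      # P_j = (nu - N_j)^+
    capacity = 0.5 * np.log2(1 + power / noise).sum()
    return nu, power, capacity

noise = np.array([1.0, 2.0, 4.0])            # assumed noise variances N_j
nu, power, C = water_filling(noise, P=3.0)
print(nu, power, C)  # nu = 3, power = [2, 1, 0]
```

Note how the noisiest channel receives no power at all until the total power budget raises the water level above its noise floor.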
Channels with colored Gaussian noise

Consider parallel channels with dependent noise:

$$Y_i = X_i + Z_i, \quad Z^n \sim N(0, K_Z).$$

This can also represent the case of a single channel with Gaussian noise with memory. For a channel with memory, a block of n consecutive uses can be considered as n channels in parallel with dependent noise.
Maximizing the information capacity (1/3)

Let $K_X$ be the input covariance matrix. The power constraint can then be written

$$\frac{1}{n} \sum_i E X_i^2 = \frac{1}{n} \operatorname{tr}(K_X) \le P.$$

As for the independent channels,

$$I(X_1,\ldots,X_n; Y_1,\ldots,Y_n) = h(Y_1,\ldots,Y_n) - h(Z_1,\ldots,Z_n).$$

The entropy of the output is maximal when Y is normally distributed, and since the input and noise are independent, $K_Y = K_X + K_Z$, so

$$h(Y_1,\ldots,Y_n) \le \frac{1}{2}\log\left((2\pi e)^n |K_X + K_Z|\right).$$
Maximizing the information capacity (2/3)

Thus, maximizing the capacity amounts to maximizing $|K_X + K_Z|$ subject to the constraint $\operatorname{tr}(K_X) \le nP$. The eigenvalue decomposition of $K_Z$ gives $K_Z = Q \Lambda Q^t$, where $Q Q^t = I$. Then

$$|K_X + K_Z| = |Q| \, |Q^t K_X Q + \Lambda| \, |Q^t| = |Q^t K_X Q + \Lambda| = |A + \Lambda|,$$

where $A = Q^t K_X Q$, and

$$\operatorname{tr}(A) = \operatorname{tr}(Q^t K_X Q) = \operatorname{tr}(Q Q^t K_X) = \operatorname{tr}(K_X) \le nP.$$
Maximizing the information capacity (3/3)

From Hadamard's inequality (Theorem 17.9.2),

$$|A + \Lambda| \le \prod_i (A_{ii} + \lambda_i),$$

with equality iff A is diagonal. Since $\sum_i A_{ii} \le nP$ and $A_{ii} \ge 0$, it can be shown using the KKT conditions that the optimal solution is given by

$$A_{ii} = (\nu - \lambda_i)^+, \quad \sum_i (\nu - \lambda_i)^+ = nP.$$
Resulting information capacity

For a channel with colored Gaussian noise,

$$C = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2}\log\left(1 + \frac{(\nu - \lambda_i)^+}{\lambda_i}\right),$$

where $\lambda_i$ are the eigenvalues of $K_Z$ and $\nu$ is chosen so that $\sum_{i=1}^{n} (\nu - \lambda_i)^+ = nP$ (note the typo in the book's summary).
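The colored-noise capacity is thus water-filling over the noise eigenvalues. A sketch reusing the water_filling() function from the water-filling slide above; the covariance matrix $K_Z$ and power P are assumed for illustration.

```python
import numpy as np

K_Z = np.array([[1.0, 0.5],
                [0.5, 1.0]])                 # assumed noise covariance
n = K_Z.shape[0]
lam = np.linalg.eigvalsh(K_Z)                # eigenvalues lambda_i of K_Z

nu, power, _ = water_filling(lam, P=1.0 * n) # total power nP with P = 1
C = np.mean(0.5 * np.log2(1 + power / lam))  # bits per channel use
print(lam, power, C)
```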
Gaussian channels with feedback

$$Y_i = X_i + Z_i, \quad Z^n \sim N(0, K_Z^{(n)}).$$

Because of the feedback, $X^n$ and $Z^n$ are not independent; $X_i$ is causally dependent on the past values of Z.
Capacity with and without feedback

For memoryless Gaussian channels, feedback does not increase capacity. However, for channels with memory, it does.

Capacity without feedback:

$$C_n = \max_{\operatorname{tr}(K_X) \le nP} \frac{1}{2n} \log \frac{|K_X + K_Z|}{|K_Z|} = \frac{1}{2n} \sum_i \log\left(1 + \frac{(\nu - \lambda_i^{(n)})^+}{\lambda_i^{(n)}}\right),$$

where $\lambda_i^{(n)}$ are the eigenvalues of $K_Z^{(n)}$ and $\nu$ is chosen so that $\sum_i (\nu - \lambda_i^{(n)})^+ = nP$.

Capacity with feedback:

$$C_{n,FB} = \max_{\operatorname{tr}(K_X) \le nP} \frac{1}{2n} \log \frac{|K_{X+Z}|}{|K_Z|}.$$
Bounds on the capacity with feedback

Theorem. $C_{n,FB} \le C_n + \frac{1}{2}$

Theorem (Pinsker's statement). $C_{n,FB} \le 2 C_n$

In words: feedback increases the capacity of a Gaussian channel by at most half a bit, and by at most a factor of 2.