LECTURE 18

Last time:
- White Gaussian noise
- Bandlimited WGN
- Additive White Gaussian Noise (AWGN) channel
- Capacity of the AWGN channel
- Application: DS-CDMA systems
- Spreading
- Coding theorem

Lecture outline:
- Gaussian channels: parallel
- Colored noise
- Inter-symbol interference
- General case: multiple inputs and outputs

Reading: Sections 10.4-10.5.
Parallel Gaussian channels

$Y^j = X^j + N^j$, where $\sigma_j^2$ is the noise variance for channel $j$ (the superscript indicates that these may be distinct channels rather than several samples from a single channel).

- The noises on the channels are mutually independent.
- The constraint on energy, however, is over all the channels: $E\left[\sum_{j=1}^k (X^j)^2\right] \le P$.
Parallel Gaussian channels

How do we allocate our resources across channels when we want to maximize the total mutual information? We seek the maximum of $I\left((X^1,\ldots,X^k);(Y^1,\ldots,Y^k)\right)$ over all densities $f_{X^1,\ldots,X^k}(x^1,\ldots,x^k)$ such that $E\left[\sum_{j=1}^k (X^j)^2\right] \le P$.

Intuitively, we expect channels with good SNR to get more input energy and channels with bad SNR to get less.
Parallel Gaussian channels

$$
\begin{aligned}
I\left((X^1,\ldots,X^k);(Y^1,\ldots,Y^k)\right)
&= h(Y^1,\ldots,Y^k) - h(Y^1,\ldots,Y^k \mid X^1,\ldots,X^k) \\
&= h(Y^1,\ldots,Y^k) - h(N^1,\ldots,N^k) \\
&\le \sum_{j=1}^k h(Y^j) - \sum_{j=1}^k h(N^j) \\
&\le \sum_{j=1}^k \frac{1}{2}\ln\left(1 + \frac{P_j}{\sigma_j^2}\right)
\end{aligned}
$$

where $E[(X^j)^2] = P_j$, hence $\sum_{j=1}^k P_j \le P$. Equality is achieved for the $X^j$s independent and Gaussian (but not necessarily IID).
Parallel Gaussian channels

Hence $(X^1,\ldots,X^k)$ is $0$-mean with

$$\Lambda_{(X^1,\ldots,X^k)} = \begin{pmatrix} P_1 & 0 & \cdots & 0 \\ 0 & P_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & P_k \end{pmatrix}$$

The total energy constraint is a constraint we handle using Lagrange multipliers. The function we now consider is

$$\sum_{j=1}^k \frac{1}{2}\ln\left(1 + \frac{P_j}{\sigma_j^2}\right) + \lambda \sum_{j=1}^k P_j$$

After differentiating with respect to $P_j$:

$$\frac{1}{2} \cdot \frac{1}{P_j + \sigma_j^2} + \lambda = 0$$

so we want to choose the $P_j + \sigma_j^2$ to be constant, subject to the additional constraint that the $P_j$s must be non-negative. Select a water level $\nu$; then

$$\sum_{j=1}^k \left(\nu - \sigma_j^2\right)^+ = P$$

determines the allocation $P_j = \left(\nu - \sigma_j^2\right)^+$.
Parallel Gaussian channels

- Water-filling: graphical interpretation
- Revisit the issue of spreading in frequency
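Numerically, the water level $\nu$ can be found by bisection, since $\sum_j (\nu - \sigma_j^2)^+$ is continuous and non-decreasing in $\nu$. A minimal sketch in Python (the function names and the tolerance are illustrative choices, not from the lecture):

```python
import math

def water_filling(noise_vars, P, tol=1e-12):
    """Return the allocation P_j = (nu - sigma_j^2)^+ with sum_j P_j = P.

    The water level nu is found by bisection: the total allocated power
    is continuous and non-decreasing in nu.
    """
    lo = min(noise_vars)       # below this level no channel gets power
    hi = max(noise_vars) + P   # at this level total power already exceeds P
    while hi - lo > tol:
        nu = (lo + hi) / 2
        if sum(max(nu - s, 0.0) for s in noise_vars) < P:
            lo = nu
        else:
            hi = nu
    nu = (lo + hi) / 2
    return [max(nu - s, 0.0) for s in noise_vars]

def capacity(noise_vars, powers):
    """Total mutual information sum_j (1/2) ln(1 + P_j / sigma_j^2), in nats."""
    return sum(0.5 * math.log(1 + p / s) for p, s in zip(powers, noise_vars))
```

For noise variances $(1, 2, 3)$ and $P = 3$ the level is $\nu = 3$, giving the allocation $(2, 1, 0)$: the worst channel is left unused, as the water-filling picture suggests.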
Colored noise

$Y_i = X_i + N_i$ for $n$ time samples; $\Lambda_{(N_1,\ldots,N_n)}$ is not a diagonal matrix: colored stationary Gaussian noise.

Energy constraint: $\frac{1}{n}\sum_{i=1}^n E[X_i^2] \le P$.

As before, we can write

$$
\begin{aligned}
I\left((X_1,\ldots,X_n);(Y_1,\ldots,Y_n)\right)
&= h(Y_1,\ldots,Y_n) - h(Y_1,\ldots,Y_n \mid X_1,\ldots,X_n) \\
&= h(Y_1,\ldots,Y_n) - h(N_1,\ldots,N_n)
\end{aligned}
$$

We need to maximize the first entropy. Using the fact that a Gaussian maximizes entropy for a given autocorrelation matrix, we obtain that the maximum is

$$\frac{1}{2}\ln\left((2\pi e)^n \left|\Lambda_{(X_1,\ldots,X_n)} + \Lambda_{(N_1,\ldots,N_n)}\right|\right)$$

Note that the constraint on energy is a constraint on the trace of $\Lambda_{(X_1,\ldots,X_n)}$. Consider $\left|\Lambda_{(X_1,\ldots,X_n)} + \Lambda_{(N_1,\ldots,N_n)}\right|$.
Colored noise

Consider the decomposition $\Lambda_{(N_1,\ldots,N_n)} = Q\Lambda Q^T$, where $QQ^T = I$. Indeed, $\Lambda_{(N_1,\ldots,N_n)}$ is a symmetric positive semi-definite matrix. Also,

$$
\begin{aligned}
\left|\Lambda_{(X_1,\ldots,X_n)} + \Lambda_{(N_1,\ldots,N_n)}\right|
&= \left|\Lambda_{(X_1,\ldots,X_n)} + Q\Lambda Q^T\right| \\
&= |Q|\,\left|Q^T \Lambda_{(X_1,\ldots,X_n)} Q + \Lambda\right|\,\left|Q^T\right| \\
&= \left|Q^T \Lambda_{(X_1,\ldots,X_n)} Q + \Lambda\right|
\end{aligned}
$$

and

$$\mathrm{trace}\left(Q^T \Lambda_{(X_1,\ldots,X_n)} Q\right) = \mathrm{trace}\left(QQ^T \Lambda_{(X_1,\ldots,X_n)}\right) = \mathrm{trace}\left(\Lambda_{(X_1,\ldots,X_n)}\right)$$

so the energy constraint on the input becomes a constraint on the new matrix $Q^T \Lambda_{(X_1,\ldots,X_n)} Q$.
Colored noise

We know that because conditioning reduces entropy, $h(W,V) \le h(W) + h(V)$. In particular, if $W$ and $V$ are jointly Gaussian, this means

$$\ln\left(\left|\Lambda_{W,V}\right|\right) \le \ln\left(\sigma_W^2\right) + \ln\left(\sigma_V^2\right)$$

hence $\left|\Lambda_{W,V}\right| \le \sigma_W^2 \sigma_V^2$; the RHS is the product of the diagonal terms of $\Lambda_{W,V}$.

Hence, we can use information theory to show Hadamard's inequality, which states that the determinant of any positive definite matrix is upper bounded by the product of its diagonal elements.
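Hadamard's inequality is easy to spot-check numerically. A small sketch for the $2 \times 2$ case (the covariance matrix below is an arbitrary positive-definite example, not from the lecture):

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# An arbitrary positive-definite covariance matrix Lambda_{W,V}:
# the diagonal entries are the variances sigma_W^2 and sigma_V^2.
cov = [[2.0, 1.5],
       [1.5, 3.0]]

# Hadamard: det(Lambda_{W,V}) <= product of the diagonal elements,
# with equality when the off-diagonal terms vanish (W, V independent).
assert det2(cov) <= cov[0][0] * cov[1][1]
```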
Colored noise

Hence, $\left|Q^T \Lambda_{(X_1,\ldots,X_n)} Q + \Lambda\right|$ is upper bounded by the product $\prod_i (\alpha_i + \lambda_i)$, where the $\alpha_i$s are the diagonal elements of $Q^T \Lambda_{(X_1,\ldots,X_n)} Q$; their sum is upper bounded by $nP$.

To maximize the product, we would want to take the $\alpha_i + \lambda_i$ equal to some constant; at least, we want to make them as equal as possible:

$$\alpha_i = \left(\nu - \lambda_i\right)^+ \quad \text{where} \quad \sum_i \alpha_i = nP$$

and finally $\Lambda_{(X_1,\ldots,X_n)} = Q\,\mathrm{diag}(\alpha_i)\,Q^T$.
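For $n = 2$ the whole construction can be carried out by hand: diagonalize $\Lambda_N$, water-fill over its eigenvalues, and rotate back. A sketch with an illustrative noise covariance (all numbers are made up for the example):

```python
import math

# Illustrative colored noise covariance Lambda_N for n = 2 samples.
lam_N = [[2.0, 1.0],
         [1.0, 2.0]]

# Eigendecomposition by hand: eigenvectors (1,1)/sqrt(2) and (1,-1)/sqrt(2),
# with eigenvalues 3 and 1 respectively.
s = 1 / math.sqrt(2)
Q = [[s,  s],          # columns are the eigenvectors
     [s, -s]]
lam = [3.0, 1.0]

# Water-fill n*P = 6 units of energy over the noise eigenvalues:
# alpha_i = (nu - lambda_i)^+ with sum_i alpha_i = n*P; here nu = 5.
nu = 5.0
alpha = [max(nu - l, 0.0) for l in lam]   # -> [2.0, 4.0]
assert sum(alpha) == 6.0

# Rotate back: Lambda_X = Q diag(alpha) Q^T.
lam_X = [[sum(Q[r][i] * alpha[i] * Q[c][i] for i in range(2))
          for c in range(2)] for r in range(2)]

# The energy constraint is a trace constraint, and the trace is
# preserved by the rotation.
assert abs(lam_X[0][0] + lam_X[1][1] - 6.0) < 1e-9
```

Note that here $\Lambda_X + \Lambda_N = 5I$: water-filling equalizes the output covariance along the noise eigendirections.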
ISI channels

$$Y_j = \sum_{k=0}^{T_d} \alpha_k X_{j-k} + N_j$$

- We may rewrite this as $Y^n = AX^n + N^n$, with some correction factor at the beginning for $X$s before time 1
- $A^{-1}Y^n = X^n + A^{-1}N^n$
- Consider the mutual information between $X^n$ and $A^{-1}Y^n$ - same as between $X^n$ and $Y^n$
- Equivalent to a colored Gaussian noise channel
- Spectral-domain water-filling
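A quick numeric sketch of this reduction: build the lower-triangular Toeplitz matrix $A$ from the taps, then solving $Az = y$ by forward substitution yields $z = X^n + A^{-1}N^n$, an additive (colored) noise channel. The taps and signal values below are illustrative:

```python
def forward_solve(A, y):
    """Solve the lower-triangular system A z = y by forward substitution."""
    n = len(y)
    z = [0.0] * n
    for i in range(n):
        z[i] = (y[i] - sum(A[i][j] * z[j] for j in range(i))) / A[i][i]
    return z

# Illustrative ISI taps alpha_0, alpha_1 (so T_d = 1), with n = 3 samples.
taps = [1.0, 0.5]
x = [2.0, -1.0, 0.5]          # channel input X^n
noise = [0.1, -0.2, 0.05]     # additive noise samples N^n

# Lower-triangular Toeplitz matrix A with A[i][j] = alpha_{i-j}
# (inputs before time 1 taken to be zero).
n = len(x)
A = [[taps[i - j] if 0 <= i - j < len(taps) else 0.0 for j in range(n)]
     for i in range(n)]

# Channel: y = A x + noise.
y = [sum(A[i][j] * x[j] for j in range(n)) + noise[i] for i in range(n)]

# Whitened output: A^{-1} y = x + A^{-1} noise.
z = forward_solve(A, y)
w = forward_solve(A, noise)   # the correlated noise A^{-1} n
assert all(abs(z[i] - (x[i] + w[i])) < 1e-9 for i in range(n))
```

The noise $A^{-1}N^n$ is no longer white, which is exactly why the colored-noise water-filling of the previous slides applies.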
General case

Single user in multipath: $Y_k = f_k S_k + N_k$, where $f_k$ is the complex matrix with entries

$$f[j,i] = \begin{cases} \sum_{\text{all paths } m} g_m[j, j-i] & \text{for } 0 \le j - i \\ 0 & \text{otherwise} \end{cases}$$

For the multiple-access model, each source has its own time-varying channel:

$$Y_k = \sum_{i=1}^K f_k^i S_k^i + N_k$$

We assume the receiver and the sender have perfect knowledge of the channel for all times. In the case of a time-varying channel, this would require knowledge of the future behavior of the channel.

The mutual information between input and output is $I(Y_k; S_k) = h(Y_k) - h(N_k)$.
General case

We may actually deal with complex random variables, in which case we have $2k$ degrees of freedom. We shall use the random vectors $S^{2k}$, $Y^{2k}$ and $N^{2k}$, whose first $k$ components and last $k$ components are, respectively, the real and imaginary parts of the corresponding vectors $S_k$, $Y_k$ and $N_k$.

More generally, the channel may change the dimensionality of the problem, for instance because of time variations:

$$Y^{2k} = f^{2k \times 2k} S^{2k} + N^{2k}$$

Let us consider the $2k$ by $2k$ matrix $\left(f^{2k \times 2k}\right)^T f^{2k \times 2k}$, and let $\lambda_1, \ldots, \lambda_{2k}$ be its eigenvalues. These eigenvalues are real and non-negative.
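The eigenvalues are real and non-negative because $f^T f$ is symmetric and $v^T f^T f v = \|fv\|^2 \ge 0$ for every $v$. A quick numeric spot-check, with an arbitrary random matrix standing in for $f^{2k \times 2k}$:

```python
import random

random.seed(0)

# Arbitrary real channel matrix F (a stand-in for f^{2k x 2k}).
F = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]

# G = F^T F is symmetric ...
G = [[sum(F[k][i] * F[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]
assert all(abs(G[i][j] - G[j][i]) < 1e-12
           for i in range(3) for j in range(3))

# ... and positive semi-definite: v^T G v = |F v|^2 >= 0 for every v,
# hence all eigenvalues of G are real and non-negative.
for _ in range(100):
    v = [random.uniform(-1, 1) for _ in range(3)]
    Fv = [sum(F[i][j] * v[j] for j in range(3)) for i in range(3)]
    quad = sum(v[i] * G[i][j] * v[j] for i in range(3) for j in range(3))
    assert abs(quad - sum(c * c for c in Fv)) < 1e-9
    assert quad >= 0.0
```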
General case

Using water-filling arguments similar to the ones for colored noise, we may establish that the maximum mutual information per second is

$$\frac{1}{2T} \sum_{i=1}^{2k} \ln\left(1 + \frac{u_i \lambda_i}{N_0 W / 2}\right)$$

where $u_i$ is given by

$$u_i = \left(\gamma - \frac{N_0 W}{2\lambda_i}\right)^+$$

and the $u_i$s satisfy $\sum_{i=1}^{2k} u_i = TP$.
General case

Let us consider the multiple-access case. We place a constraint, $P$, on the sum of all the $K$ users' powers. The users may cooperate, and therefore act as an antenna array. Such a model is only reasonable if the users are co-located or linked to each other in some fashion.

There are $M = 2Kk$ input degrees of freedom and $2k$ output degrees of freedom:

$$\left[Y[1] \ldots Y[2k]\right]^T = f^{2k \times M} \left[\hat{S}[1] \ldots \hat{S}[M]\right]^T + \left[N[1] \ldots N[2k]\right]^T$$

where we have defined

$$\left[\hat{S}[1] \ldots \hat{S}[M]\right] = \left[S^1[1] \ldots S^1[2k],\; S^2[1] \ldots S^2[2k],\; \ldots,\; S^K[1] \ldots S^K[2k]\right]$$

$$f^{2k \times M} = \left[f_1^{2k \times 2k},\; f_2^{2k \times 2k},\; \ldots,\; f_K^{2k \times 2k}\right]$$
General case

$\left(f^{2k \times M}\right)^T f^{2k \times M}$ has $M$ eigenvalues, all of which are real and non-negative and at most $2k$ of which are non-zero. Let us assume that there are $\kappa$ positive eigenvalues, which we denote $\lambda_1, \ldots, \lambda_\kappa$.

We have decomposed our multiple-access channel into $\kappa$ channels, which may be interpreted as parallel independent channels. The input has $M - \kappa$ additional degrees of freedom, but those degrees of freedom do not reach the output.

The maximization along the $\kappa$ active channels may now be performed using water-filling techniques. Let $T$ be the duration of the transmission.
General case

We choose

$$u_i = \left(\gamma - \frac{N_0 W}{2\lambda_i}\right)^+ \quad \text{for } \lambda_i \neq 0$$

where $\gamma$ satisfies

$$\sum_{i \,:\, \lambda_i \neq 0} \left(\gamma - \frac{N_0 W}{2\lambda_i}\right)^+ = TP$$

so that the $u_i$s satisfy $\sum_i u_i = TP$.

We have reduced several channels, each with its own user, to a single channel with a composite user. The sum of all the mutual informations averaged over time is upper bounded by

$$\frac{1}{T} \sum_{i \,:\, \lambda_i \neq 0} \frac{1}{2} \ln\left(1 + \frac{\left(\gamma - \frac{N_0 W}{2\lambda_i}\right)^+ \lambda_i}{N_0 W / 2}\right)$$
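As in the parallel-channel case, $\gamma$ can be found by bisection; the only change is the per-channel floor $N_0 W / (2\lambda_i)$. A sketch with illustrative numbers (the function name and all values are assumptions for the example, not from the lecture):

```python
def solve_gamma(lambdas, N0, W, E, tol=1e-12):
    """Find gamma with sum_i (gamma - N0*W/(2*lambda_i))^+ = E = T*P,
    the sum running over the channels with lambda_i != 0."""
    floors = [N0 * W / (2 * l) for l in lambdas if l > 0]
    lo, hi = min(floors), max(floors) + E   # brackets the solution
    while hi - lo > tol:
        gamma = (lo + hi) / 2
        if sum(max(gamma - f, 0.0) for f in floors) < E:
            lo = gamma
        else:
            hi = gamma
    return (lo + hi) / 2

# Illustrative: positive eigenvalues of f^T f, noise density N0,
# bandwidth W, and total transmit energy E = T*P.
lambdas = [4.0, 1.0, 0.5]
N0, W, E = 1.0, 2.0, 3.0
gamma = solve_gamma(lambdas, N0, W, E)
u = [max(gamma - N0 * W / (2 * l), 0.0) for l in lambdas]
assert abs(sum(u) - E) < 1e-9
```

Stronger eigenvalues (larger $\lambda_i$) have lower floors and therefore receive more energy, matching the water-filling picture.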