Capacity Region of Reversely Degraded Gaussian MIMO Broadcast Channel

Jun Chen
Dept. of Electrical and Computer Engr., McMaster University, Hamilton, Ontario, Canada

Chao Tian
AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932, USA

Abstract

We consider the problem of broadcasting a common message and two individual messages to two users on a product channel of two reversely degraded Gaussian multiple-input multiple-output (MIMO) broadcast channels. Though El Gamal provided a single-letter characterization for the general discrete memoryless problem in 1980, this characterization does not include a channel cost constraint, and thus does not apply directly to the Gaussian MIMO setting. We first show that El Gamal's single-letter characterization can indeed be generalized to include channel cost constraints; however, special care has to be taken, and the characterization holds only for a certain class of cost functions. This characterization has an equivalent form, and by utilizing this form, together with the enhancement technique and an extremal inequality that were discovered only recently, we show that Gaussian codebooks are indeed optimal for this MIMO setting.

I. INTRODUCTION

The capacity region of general memoryless broadcast channels remains an open problem in multi-user information theory, despite the establishment of the capacity region in many special cases such as degraded [1][2], deterministic [3][4], more powerful broadcast channels [5] and broadcast channels with degraded message sets [6] (see [7] for a review). Recently, important progress was made in this area for Gaussian multiple-input multiple-output (MIMO) channels in [8], where Weingarten et al. showed that dirty paper coding based on Gaussian codebooks is optimal when only individual messages are required, and thus completely characterized the capacity region for this case.
When a common message is involved in non-degraded broadcast channels in addition to individual messages, known results in the literature are quite limited. In [9], El Gamal provided a single-letter characterization of the capacity region for the product of reversely degraded broadcast channels (often simply referred to as the reversely degraded broadcast channel), and furthermore showed that when the two individual broadcast channels are scalar Gaussian, Gaussian codebooks are optimal. In [10], the Gaussian MIMO broadcast channel with a common message is considered, and it was shown that a coding scheme based on dirty paper coding [11] with Gaussian codebooks is optimal under certain specific conditions.

In this work, we consider the problem of broadcasting a common message and two individual messages to two users on a reversely degraded Gaussian MIMO broadcast channel. Although a single-letter characterization of the capacity region was established in [9] for general discrete memoryless channels, this characterization does not include a channel cost constraint, and thus cannot be applied directly to the MIMO setting. We first show that the result in [9] can indeed be generalized to include channel cost constraints; however, special care has to be taken, and the characterization holds only for a certain class of cost functions. The characterization has an equivalent form, and by utilizing this form, as well as the enhancement technique [8] and an extremal inequality [12], we prove the non-trivial fact that Gaussian codebooks are optimal for this MIMO setting under separate covariance constraints on the two sub-channels, individual antenna power constraints, or a total power constraint. Similar to [8], the case with separate covariance constraints on the two sub-channels can be understood as an intermediate step toward the more realistic constraint on total power.

The rest of the paper is organized as follows.
In Section II, we describe the channel model and provide the necessary notation. The main results are given in Section III. The generalization of El Gamal's characterization to include certain cost functions and the alternative single-letter characterization are discussed in Section IV, and the proof of the main result is given in Section V. Finally, Section VI concludes the paper.

Footnote (on the constraint settings considered in this work): These settings do not satisfy the specific conditions under which a tight characterization was given in [10].

II. CHANNEL MODELS

We first describe the system for general discrete memoryless channels, and then introduce the specific Gaussian MIMO notation. The broadcast channel in question is given by a channel input in the alphabet $\mathcal{X} = (\mathcal{X}_1, \mathcal{X}_2)$, a first-user channel output in the alphabet $\mathcal{Y} = (\mathcal{Y}_1, \mathcal{Y}_2)$, and a second-user channel output in the alphabet $\mathcal{Z} = (\mathcal{Z}_1, \mathcal{Z}_2)$, where the transition probability factorizes as follows:

$$P(Y_1,Y_2,Z_1,Z_2 \mid X_1,X_2) = P(Y_1 \mid X_1)\,P(Z_1 \mid Y_1)\,P(Z_2 \mid X_2)\,P(Y_2 \mid Z_2). \tag{1}$$

Thus, the overall broadcast channel consists of two parallel broadcast channels with reversely degraded outputs; note that this also implies that $Y_2 \leftrightarrow Z_2 \leftrightarrow X_2 \leftrightarrow X_1 \leftrightarrow Y_1 \leftrightarrow Z_1$ is a Markov chain. For channels with cost constraints, a set of cost functions $c_i : \mathcal{X} \to \mathcal{C}_i$, $i = 1,2,\dots,K$, is defined, where $\mathcal{C}_i$ is a space of $t_i$-by-$t_i$ positive semidefinite matrices, with the partial order defined by positive semidefiniteness.

Definition 1: An $(M_0, M_1, M_2, n)$ broadcast code consists of an encoding function

$$f : I_{M_0} \times I_{M_1} \times I_{M_2} \to \mathcal{X}^n, \tag{2}$$

where $I_n = \{1,2,\dots,n\}$, and two decoding functions

$$g_1 : \mathcal{Y}^n \to I_{M_0} \times I_{M_1}, \quad g_1(Y^n) = (\hat{W}_0, \hat{W}_1), \qquad g_2 : \mathcal{Z}^n \to I_{M_0} \times I_{M_2}, \quad g_2(Z^n) = (\hat{W}_0, \hat{W}_2). \tag{3}$$

A decoding error is defined in the usual way: it occurs if either $(w_0,w_1) \neq (\hat{w}_0,\hat{w}_1)$ at user one or $(w_0,w_2) \neq (\hat{w}_0,\hat{w}_2)$ at user two, where each user forms its own estimate of the common message. The probability of decoding error $P_e$ is averaged over the message space $I_{M_0} \times I_{M_1} \times I_{M_2}$, i.e., assuming the messages are distributed uniformly.

Definition 2: A rate triple $(R_0,R_1,R_2)$ is called achievable under channel cost constraints $C_i \in \mathcal{C}_i$, where $C_i \succeq 0$, $i = 1,2,\dots,K$, if for any $\epsilon > 0$ and sufficiently large $n$ there exists an $(M_0,M_1,M_2,n)$ code such that

$$M_i \ge 2^{nR_i}, \quad i = 0,1,2, \qquad E\left[\frac{1}{n}\sum_{j=1}^{n} c_i(X(j))\right] \preceq C_i + \epsilon I, \quad i = 1,2,\dots,K, \tag{4}$$

where $X(j)$ is the channel codeword input at the $j$-th position, and $P_e < \epsilon$. The capacity region under cost constraints $(C_1,C_2,\dots,C_K)$ is the closure of the set of achievable rate triples, denoted as $\mathcal{C}(C_1,C_2,\dots,C_K)$.

For the Gaussian MIMO problem, we consider the setting in which the input $X_i$ is in fact a vector $\mathbf{X}_i$ for $i = 1,2$, and

$$Y_1 = \mathbf{Y}_1 = \mathbf{X}_1 + \mathbf{n}_1, \quad Z_1 = \mathbf{Z}_1 = \mathbf{X}_1 + \mathbf{m}_1, \quad Y_2 = \mathbf{Y}_2 = \mathbf{X}_2 + \mathbf{n}_2, \quad Z_2 = \mathbf{Z}_2 = \mathbf{X}_2 + \mathbf{m}_2, \tag{5}$$

where $\mathbf{n}_i$ is a $t_i$-dimensional Gaussian random vector with covariance $N_i$, and $\mathbf{m}_i$ is a $t_i$-dimensional Gaussian random vector with covariance $M_i$, for $i = 1,2$; moreover, $(\mathbf{n}_1,\mathbf{m}_1)$ is independent of $(\mathbf{n}_2,\mathbf{m}_2)$. The covariances are such that $0 \prec N_1 \preceq M_1$ and $0 \prec M_2 \preceq N_2$. We shall consider three kinds of constraints:

Separate covariance constraints: given two positive semidefinite matrices $S_1$ and $S_2$, the two cost functions over $t_i$-dimensional vectors are given by $c_i(x) = x_i x_i^t$, $i = 1,2$. In other words, two covariance constraints are placed separately over the two sub-channels. We denote the capacity region as $\mathcal{C}(S_1,S_2)$.

Individual antenna power constraints: given a $(t_1+t_2)$-dimensional non-negative vector $(P_1,P_2,\dots,P_{t_1+t_2})$, the $(t_1+t_2)$ cost functions are given by $c_i(x) = x(i)^2$, the squared $i$-th component of the input. We denote the capacity region as $\mathcal{C}(P_1,P_2,\dots,P_{t_1+t_2})$.

Total power constraint: given a non-negative power constraint $P$, the scalar-valued cost function is given by $c_1(x) = x^t x$. We denote the capacity region as $\mathcal{C}(P)$.

As we shall discuss in Section IV, these three kinds of cost functions belong to the so-called class of statistically separable cost functions, which allows an explicit characterization of the capacity region.

III. MAIN RESULTS

Define the region $\mathcal{R}(S_1,S_2)$ as the set of non-negative rate triples $(R_0,R_1,R_2)$ for which there exist some positive semidefinite matrices $(Q_1,Q_2)$, $0 \preceq Q_i \preceq S_i$ for $i = 1,2$, such that

$$\begin{aligned} R_0 &\le R_0^*, \\ R_0 + R_1 &\le R_0^* + R_1^*, \\ R_0 + R_2 &\le R_0^* + R_2^*, \\ R_0 + R_1 + R_2 &\le R_0^* + R_1^* + R_2^*, \end{aligned} \tag{6}$$

where

$$R_0^* = \min\left\{ \frac{1}{2}\log\frac{|S_1+N_1|}{|Q_1+N_1|} + \frac{1}{2}\log\frac{|S_2+N_2|}{|Q_2+N_2|},\ \frac{1}{2}\log\frac{|S_1+M_1|}{|Q_1+M_1|} + \frac{1}{2}\log\frac{|S_2+M_2|}{|Q_2+M_2|} \right\}, \tag{7}$$

$$R_1^* = \frac{1}{2}\log\frac{|Q_1+N_1|}{|N_1|}, \tag{8}$$

$$R_2^* = \frac{1}{2}\log\frac{|Q_2+M_2|}{|M_2|}. \tag{9}$$

We have the following theorems.

Theorem 1: $\mathcal{C}(S_1,S_2) = \mathrm{conv}(\mathcal{R}(S_1,S_2))$, where $\mathrm{conv}(\cdot)$ is the convex hull operator.

Theorem 2:

$$\mathcal{C}(P_1,P_2,\dots,P_{t_1+t_2}) = \bigcup_{(S_1,S_2) \in \mathcal{P}(P_1,P_2,\dots,P_{t_1+t_2})} \mathrm{conv}(\mathcal{R}(S_1,S_2)),$$

where $\mathcal{P}(P_1,P_2,\dots,P_{t_1+t_2})$ is the set of pairs of positive semidefinite matrices such that $S_1(i) \le P_i$ and $S_2(i) \le P_{i+t_1}$, where $S_i(j)$ is the $j$-th diagonal element of $S_i$.

Theorem 3:

$$\mathcal{C}(P) = \bigcup_{(S_1,S_2) \in \mathcal{P}(P)} \mathrm{conv}(\mathcal{R}(S_1,S_2)),$$

where $\mathcal{P}(P)$ is the set of pairs of positive semidefinite matrices such that $\mathrm{Tr}(S_1) + \mathrm{Tr}(S_2) \le P$, where $\mathrm{Tr}(\cdot)$ is the trace.

Remark: We write the capacity region in the form (6)-(9) because it reveals a fundamental constraint in broadcast when a common message is involved. Since the common message can also be used to convey individual messages, and vice versa, the rate region in (6) can be taken as a minimal capacity template, or the so-called latent capacity region [13][14].

The rest of the paper is largely devoted to the proof of Theorem 1, along the direction outlined in Section I.

IV. CAPACITY CHARACTERIZATION WITH CHANNEL COST CONSTRAINT AND AN ALTERNATIVE FORM

In this section, we first show that the capacity region characterization in [9] can be generalized to include channel cost constraints; however, the result holds only for a certain class of cost functions, namely channel cost functions that are statistically separable. Through the proof of this capacity result, we introduce an alternative form of the capacity region, which will become useful in the next section. In this section, we focus on the general discrete memoryless model.

Definition 3: A cost function $c(\cdot)$ is called statistically separable if, for any joint distribution $P(X_1,X_2)$ and $P'(X_1,X_2) = P(X_1)P(X_2)$,

$$E_P(c(X_1,X_2)) = E_{P'}(c(X_1,X_2)),$$

where the expectations are with respect to the distributions $P(X_1,X_2)$ and $P'(X_1,X_2)$, respectively.

Let $\mathcal{R}(U_1,U_2,X_1,X_2)$ be the collection of non-negative rate triples $(R_0,R_1,R_2)$, for a given choice of probability distribution $P(U_1,U_2,X_1,X_2,Y_1,Y_2,Z_1,Z_2)$, such that

$$R_0 \le \min\{I(U_1;Y_1) + I(U_2;Y_2),\ I(U_1;Z_1) + I(U_2;Z_2)\}, \tag{10}$$

$$R_0 + R_1 \le I(U_1;Y_1) + I(U_2;Y_2) + I(X_1;Y_1 \mid U_1), \tag{11}$$

$$R_0 + R_2 \le I(U_1;Z_1) + I(U_2;Z_2) + I(X_2;Z_2 \mid U_2), \tag{12}$$

$$R_0 + R_1 + R_2 \le I(X_1;Y_1 \mid U_1) + I(X_2;Z_2 \mid U_2) + \min\{I(U_1;Y_1) + I(U_2;Y_2),\ I(U_1;Z_1) + I(U_2;Z_2)\}. \tag{13}$$

The following theorem states that when the cost functions are statistically separable, the characterization in [9] can be generalized to the case with channel cost constraints.

Theorem 4: If the cost functions $c_1,c_2,\dots,c_K$ are statistically separable, then

$$\mathcal{C}(C_1,C_2,\dots,C_K) = \bigcup_{(U_1,U_2,X_1,X_2) \in \mathcal{P}(C_1,C_2,\dots,C_K)} \mathcal{R}(U_1,U_2,X_1,X_2),$$

where $\mathcal{P}(C_1,C_2,\dots,C_K)$ is the set of distributions satisfying the following conditions:

1) The distribution factorizes as $P(U_1)P(U_2)P(X_1 \mid U_1)P(X_2 \mid U_2)P(Y_1,Z_1 \mid X_1)P(Y_2,Z_2 \mid X_2)$; in other words, $(U_1,X_1)$ and $(U_2,X_2)$ are independent.

2) $E(c_i(X_1,X_2)) \preceq C_i$, for $i = 1,2,\dots,K$.

Remark: The only differences from [9] are the condition on the cost functions and the last condition on the distribution, which is induced by the cost functions. We can in fact similarly define the notion of statistical sub-separability, when $E_{P'}(c(X_1,X_2)) \preceq E_P(c(X_1,X_2))$, and it can also be shown that this is sufficient for Theorem 4 to hold.

Remark: It is clear that for the Gaussian MIMO case, all three channel cost functions given in Section II are statistically separable, since each is a sum of terms that depend on only one of $X_1$ and $X_2$.

To prove Theorem 4, we need the achievability result stated as Theorem 5 below.
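As an illustration of Definition 3, the following minimal sketch compares the expected cost under a joint distribution with the expected cost under the product of its marginals. It is only a numerical check on one hand-picked distribution, not a proof of separability; the scalar inputs, the example distribution `joint`, and all function names are assumptions made here for illustration and do not come from the paper.

```python
from itertools import product


def expectation(pmf, cost):
    """Expected cost under a joint pmf given as {(x1, x2): probability}."""
    return sum(p * cost(x1, x2) for (x1, x2), p in pmf.items())


def product_of_marginals(pmf):
    """Build P'(x1, x2) = P(x1) P(x2) from the joint pmf P."""
    p1, p2 = {}, {}
    for (x1, x2), p in pmf.items():
        p1[x1] = p1.get(x1, 0.0) + p
        p2[x2] = p2.get(x2, 0.0) + p
    return {(a, b): p1[a] * p2[b] for a, b in product(p1, p2)}


def is_separable_on(pmf, cost, tol=1e-12):
    """Check E_P[c] == E_P'[c] for this one pmf (a necessary check only)."""
    e_joint = expectation(pmf, cost)
    e_product = expectation(product_of_marginals(pmf), cost)
    return abs(e_joint - e_product) <= tol


def total_power(x1, x2):
    """Scalar analogue of the total power cost x^t x."""
    return x1 * x1 + x2 * x2


def cross_term(x1, x2):
    """A hypothetical cost that couples the two sub-channels."""
    return x1 * x2


# Hypothetical, fully correlated scalar inputs: X1 = X2 with probability 1.
joint = {(-1.0, -1.0): 0.5, (1.0, 1.0): 0.5}

print(is_separable_on(joint, total_power))  # True
print(is_separable_on(joint, cross_term))   # False
```

The total power cost passes because it is a sum of terms that each involve only one sub-channel input, so only the marginals matter. The cross term fails because its expectation depends on the correlation between the two inputs, which is exactly what is lost when the dependence is severed.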
Let $\tilde{\mathcal{R}}(U_1,U_2,X_1,X_2)$ be the collection of non-negative rate triples $(R_0,R_1,R_2)$, for a given choice of probability distribution $P(U_1,U_2,X_1,X_2,Y_1,Y_2,Z_1,Z_2)$, such that

$$R_0 \le \tilde{R}_0, \tag{14}$$

$$R_0 + R_1 \le \tilde{R}_0 + I(X_1;Y_1 \mid U_1), \tag{15}$$

$$R_0 + R_2 \le \tilde{R}_0 + I(X_2;Z_2 \mid U_2), \tag{16}$$

$$R_0 + R_1 + R_2 \le \tilde{R}_0 + I(X_1;Y_1 \mid U_1) + I(X_2;Z_2 \mid U_2), \tag{17}$$

where $\tilde{R}_0 = \min\{I(U_1;Y_1) + I(U_2;Y_2),\ I(U_1;Z_1) + I(U_2;Z_2)\}$.

Theorem 5:

$$\mathcal{C}(C_1,C_2,\dots,C_K) \supseteq \bigcup_{(U_1,U_2,X_1,X_2) \in \mathcal{P}(C_1,C_2,\dots,C_K)} \tilde{\mathcal{R}}(U_1,U_2,X_1,X_2),$$

where $\mathcal{P}(C_1,C_2,\dots,C_K)$ is the same as in Theorem 4.

The proof of this theorem is rather straightforward: it uses the well-known superposition codebook used in degraded broadcast channels [], and the cost function conditions are direct consequences of the properties of typical sequences. We thus omit the details of the proof. Next we prove Theorem 4.

Proof of Theorem 4: The converse part follows directly from the proof in [9]. However, in the last step of that proof, the dependence between $(U_1,X_1)$ and $(U_2,X_2)$ cannot be severed arbitrarily: although the mutual information quantities involve only the marginal distributions, the cost functions may not preserve their costs under this transformation. When the cost functions are statistically separable, however, we can indeed sever the dependence between $(U_1,X_1)$ and $(U_2,X_2)$ without causing any change to the region.

Now we turn to the forward part of the proof. We claim that for any $(U_1,U_2,X_1,X_2)$ such that $(U_1,X_1)$ is independent of $(U_2,X_2)$ and $U_i \to X_i \to (Y_i,Z_i)$ is a Markov string for $i = 1,2$, there exist some $(U_1',U_2',X_1,X_2)$, where $(U_1',X_1)$ is also independent of $(U_2',X_2)$, which preserve the marginal distribution of $(X_1,X_2)$, for which $U_i' \to X_i \to (Y_i,Z_i)$ is a Markov string for $i = 1,2$, and furthermore,

$$\mathcal{R}(U_1,U_2,X_1,X_2) \subseteq \tilde{\mathcal{R}}(U_1',U_2',X_1,X_2). \tag{18}$$

Note that since the marginal distribution of $(X_1,X_2)$ is preserved, the cost function conditions are also preserved. Once this inclusion is proved, Theorem 5 can be invoked to complete the proof of Theorem 4. Without loss of generality, we assume $I(U_1;Y_1) + I(U_2;Y_2) \ge I(U_1;Z_1) + I(U_2;Z_2)$.
If this inequality in fact holds with equality, then clearly $\mathcal{R}(U_1,U_2,X_1,X_2) = \tilde{\mathcal{R}}(U_1,U_2,X_1,X_2)$. If the inequality is strict, then define $U_2^{(\alpha)} \triangleq (T_\alpha,\hat{U})$, $\alpha \in [0,1]$, where $T_\alpha$ is a Bernoulli random variable with $P(T_\alpha = 1) = \alpha$ and

$$\hat{U} = \begin{cases} U_2, & T_\alpha = 0, \\ X_2, & T_\alpha = 1. \end{cases} \tag{19}$$

This implies that $U_2^{(0)} = (0,U_2)$ and $U_2^{(1)} = (1,X_2)$; moreover, $U_2 \to U_2^{(\alpha)} \to X_2$ is a Markov string for any $\alpha \in [0,1]$, the marginal distribution of $(U_1,X_1,X_2)$ is preserved, and clearly the Markov string $U_2^{(\alpha)} \to X_2 \to (Y_2,Z_2)$ holds for any $\alpha$. We shall increase $\alpha$ until $I(U_1;Y_1) + I(U_2;Y_2) = I(U_1;Z_1) + I(U_2^{(\alpha)};Z_2)$ or $\alpha = 1$ (i.e., $U_2^{(\alpha)} = X_2$); we denote the corresponding $\alpha$ as $\alpha^*$. Next we show that in either case, $\mathcal{R}(U_1,U_2,X_1,X_2) \subseteq \tilde{\mathcal{R}}(U_1,U_2^{(\alpha^*)},X_1,X_2)$.

If $\alpha^* < 1$, the respective comparisons between (10)-(13) and (14)-(17) are straightforward by noticing that both $I(U_1;Y_1) + I(U_2^{(\alpha)};Y_2)$ and $I(U_1;Z_1) + I(U_2^{(\alpha)};Z_2)$ are monotone increasing functions of $\alpha$, and thus

$$I(U_1;Z_1) + I(U_2^{(\alpha^*)};Z_2) = I(U_1;Y_1) + I(U_2;Y_2) \le I(U_1;Y_1) + I(U_2^{(\alpha^*)};Y_2).$$

If $\alpha^* = 1$, then we clearly have

$$I(U_1;Z_1) + I(X_2;Z_2) \le I(U_1;Y_1) + I(U_2;Y_2). \tag{20}$$

The comparison of the other quantities is again straightforward, and we only need to consider the comparison between (11) and (15). Note that because of (20), the right hand side of (13) satisfies

$$I(U_1;Z_1) + I(U_2;Z_2) + I(X_1;Y_1 \mid U_1) + I(X_2;Z_2 \mid U_2) = I(U_1;Z_1) + I(X_2;Z_2) + I(X_1;Y_1 \mid U_1) \tag{21}$$

$$\le I(U_1;Y_1) + I(U_2;Y_2) + I(X_1;Y_1 \mid U_1). \tag{22}$$

The quantity in (22) is exactly the right hand side of (11), and since $R_0,R_1,R_2$ are non-negative, the condition in (11) can be safely removed and replaced by the quantity in (21), i.e.,

$$R_0 + R_1 \le I(U_1;Z_1) + I(X_2;Z_2) + I(X_1;Y_1 \mid U_1),$$

which is exactly the inequality (15) for $\alpha^* = 1$. This completes the proof.

From Theorem 4 and Theorem 5, it is seen that an alternative characterization of $\mathcal{C}(C_1,C_2,\dots,C_K)$ is as follows.

Corollary 1:

$$\mathcal{C}(C_1,C_2,\dots,C_K) = \bigcup_{(U_1,U_2,X_1,X_2) \in \mathcal{P}(C_1,C_2,\dots,C_K)} \tilde{\mathcal{R}}(U_1,U_2,X_1,X_2),$$

where $\mathcal{P}(C_1,C_2,\dots,C_K)$ is the same as in Theorem 4.

V. REVERSELY DEGRADED GAUSSIAN MIMO BROADCAST CHANNEL

Now we turn to the reversely degraded Gaussian MIMO broadcast channel. We first prove Theorem 1. The proofs of Theorems 2 and 3 will also be discussed briefly. To prove the converse, a result from [12] is needed.

Theorem 6 ([12]): Let $Q$ be a positive semidefinite matrix such that $0 \preceq Q \preceq S$ and such that

$$\beta_1 \sum_{i=1}^{K_1} \lambda_i^{(1)} (Q + N_i)^{-1} = \beta_2 \sum_{j=1}^{K_2} \lambda_j^{(2)} (Q + M_j)^{-1} + O,$$

where $\beta_i > 0$ for $i = 1,2$, $\lambda_i^{(1)} \ge 0$, $\lambda_j^{(2)} \ge 0$, $O \succeq 0$, $(S - Q)O = 0$, and there exists an $N^*$ such that $N_i \preceq N^* \preceq M_j$ for all $i$ and $j$. Then for any $T$ independent of $W_i \sim \mathcal{N}(0,N_i)$ and $V_j \sim \mathcal{N}(0,M_j)$, and $X$ such that $E(XX^t) \preceq S$, we have

$$\beta_1 \sum_{i=1}^{K_1} \lambda_i^{(1)} h(X + W_i \mid T) - \beta_2 \sum_{j=1}^{K_2} \lambda_j^{(2)} h(X + V_j \mid T) \le \frac{\beta_1}{2} \sum_{i=1}^{K_1} \lambda_i^{(1)} \log|2\pi e(Q + N_i)| - \frac{\beta_2}{2} \sum_{j=1}^{K_2} \lambda_j^{(2)} \log|2\pi e(Q + M_j)|.$$

Next consider the optimization problem

$$\max_{0 \preceq Q_1 \preceq S_1,\ 0 \preceq Q_2 \preceq S_2} \mu_0 R_0^* + \mu_1 R_1^* + \mu_2 R_2^*,$$

where the clearly achievable rate triple $(R_0^*,R_1^*,R_2^*)$ is defined in (7)-(9), and $\mu_0 \ge \mu_1 \ge 0$, $\mu_0 \ge \mu_2 \ge 0$ (since otherwise one can set $R_0 = 0$ and the problem reduces to the case with only private messages). Clearly, we can assume $\mu_0 > 0$, since the objective function is zero otherwise. It is possible to show, using the Fritz John necessary conditions [16], that the following lemma is true.

Lemma 1: There exist $\lambda_i \ge 0$ ($i = 1,2$) with $\lambda_1 + \lambda_2 = 1$, and $Q_1^*$ and $Q_2^*$, such that

$$\begin{aligned} \max_{0 \preceq Q_1 \preceq S_1,\ 0 \preceq Q_2 \preceq S_2} \mu_0 R_0^* + \mu_1 R_1^* + \mu_2 R_2^* ={}& \mu_0\lambda_1\left(\frac{1}{2}\log\frac{|S_1+N_1|}{|Q_1^*+N_1|} + \frac{1}{2}\log\frac{|S_2+N_2|}{|Q_2^*+N_2|}\right) \\ &+ \mu_0\lambda_2\left(\frac{1}{2}\log\frac{|S_1+M_1|}{|Q_1^*+M_1|} + \frac{1}{2}\log\frac{|S_2+M_2|}{|Q_2^*+M_2|}\right) \\ &+ \frac{\mu_1}{2}\log\frac{|Q_1^*+N_1|}{|N_1|} + \frac{\mu_2}{2}\log\frac{|Q_2^*+M_2|}{|M_2|}, \end{aligned}$$

and furthermore there exist positive semidefinite matrices $O_1$, $O_2$, $O_1'$ and $O_2'$ such that

1) $\mu_1(Q_1^*+N_1)^{-1} + O_1 = \mu_0\lambda_1(Q_1^*+N_1)^{-1} + \mu_0\lambda_2(Q_1^*+M_1)^{-1} + O_1'$ and $\mu_2(Q_2^*+M_2)^{-1} + O_2 = \mu_0\lambda_1(Q_2^*+N_2)^{-1} + \mu_0\lambda_2(Q_2^*+M_2)^{-1} + O_2'$.

2) $Q_i^* O_i = 0$ and $(S_i - Q_i^*)O_i' = 0$, $i = 1,2$.

3) We have $\lambda_1 = 1$ and $\lambda_2 = 0$ if

$$\frac{1}{2}\log\frac{|S_1+N_1|}{|Q_1^*+N_1|} + \frac{1}{2}\log\frac{|S_2+N_2|}{|Q_2^*+N_2|} < \frac{1}{2}\log\frac{|S_1+M_1|}{|Q_1^*+M_1|} + \frac{1}{2}\log\frac{|S_2+M_2|}{|Q_2^*+M_2|},$$

and $\lambda_1 = 0$ and $\lambda_2 = 1$ if the reverse strict inequality holds.

Now we are ready to prove Theorem 1.

Proof of Theorem 1: The forward part is clearly true, and we only need to prove the converse. Without loss of generality, we shall assume $\mu_0 \ge \mu_1 \ge \mu_2 \ge 0$, and subsequently choose $\lambda_1$ and $\lambda_2$ as specified in Lemma 1. We start by writing the following inequality using Corollary 1:

$$\begin{aligned} \mu_0 R_0 + \mu_1 R_1 + \mu_2 R_2 &= \mu_2(R_0 + R_1 + R_2) + (\mu_1 - \mu_2)(R_0 + R_1) + (\mu_0 - \mu_1)R_0 \\ &\le \mu_0 \min\{I(U_1;Y_1) + I(U_2;Y_2),\ I(U_1;Z_1) + I(U_2;Z_2)\} + \mu_1 I(X_1;Y_1 \mid U_1) + \mu_2 I(X_2;Z_2 \mid U_2) \\ &\le \mu_0\lambda_1(I(U_1;Y_1) + I(U_2;Y_2)) + \mu_0\lambda_2 I(U_1;Z_1) + \mu_0\lambda_2 I(U_2;Z_2) + \mu_1 I(X_1;Y_1 \mid U_1) + \mu_2 I(X_2;Z_2 \mid U_2). \end{aligned}$$
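The first two steps of the chain above can be checked by matching coefficients. The expansion below is only a sketch of that bookkeeping; the shorthand $A$, $B$, $I_1$, $I_2$ is introduced here for compactness and is not notation from the paper: $A = I(U_1;Y_1) + I(U_2;Y_2)$, $B = I(U_1;Z_1) + I(U_2;Z_2)$, $I_1 = I(X_1;Y_1 \mid U_1)$, $I_2 = I(X_2;Z_2 \mid U_2)$.

```latex
\begin{align*}
\mu_0 R_0 + \mu_1 R_1 + \mu_2 R_2
  &= \mu_2 (R_0 + R_1 + R_2) + (\mu_1 - \mu_2)(R_0 + R_1) + (\mu_0 - \mu_1) R_0 \\
  &\le \mu_2 \bigl(\min\{A,B\} + I_1 + I_2\bigr)
     + (\mu_1 - \mu_2)\bigl(\min\{A,B\} + I_1\bigr)
     + (\mu_0 - \mu_1)\min\{A,B\} \\
  &= \mu_0 \min\{A,B\} + \mu_1 I_1 + \mu_2 I_2 \\
  &\le \mu_0 (\lambda_1 A + \lambda_2 B) + \mu_1 I_1 + \mu_2 I_2 .
\end{align*}
```

The second line applies the bounds on $R_0 + R_1 + R_2$, $R_0 + R_1$ and $R_0$ from the alternative characterization, and is valid because the weights $\mu_2$, $\mu_1 - \mu_2$ and $\mu_0 - \mu_1$ are all non-negative under the assumed ordering of the $\mu_i$. The last line uses the fact that a minimum of two numbers is no larger than any convex combination of them, with $\lambda_1, \lambda_2 \ge 0$ and $\lambda_1 + \lambda_2 = 1$. The bound on $R_0 + R_2$ is not needed for this ordering.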

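Note that

$$\begin{aligned} &\mu_0\lambda_1(I(U_1;Y_1) + I(U_2;Y_2)) + \mu_0\lambda_2(I(U_1;Z_1) + I(U_2;Z_2)) + \mu_1 I(X_1;Y_1 \mid U_1) + \mu_2 I(X_2;Z_2 \mid U_2) \\ &\quad= \mu_0\lambda_1(h(Y_1) + h(Y_2)) + \mu_0\lambda_2(h(Z_1) + h(Z_2)) - \mu_1 h(Y_1 \mid X_1) - \mu_2 h(Z_2 \mid X_2) \\ &\qquad+ (\mu_1 - \mu_0\lambda_1)h(Y_1 \mid U_1) - \mu_0\lambda_2 h(Z_1 \mid U_1) + (\mu_2 - \mu_0\lambda_2)h(Z_2 \mid U_2) - \mu_0\lambda_1 h(Y_2 \mid U_2) \\ &\quad\le \mu_0\lambda_1\left(\frac{1}{2}\log|2\pi e(S_1+N_1)| + \frac{1}{2}\log|2\pi e(S_2+N_2)|\right) + \mu_0\lambda_2\left(\frac{1}{2}\log|2\pi e(S_1+M_1)| + \frac{1}{2}\log|2\pi e(S_2+M_2)|\right) \\ &\qquad- \mu_1 h(Y_1 \mid X_1) - \mu_2 h(Z_2 \mid X_2) + (\mu_1 - \mu_0\lambda_1)h(Y_1 \mid U_1) - \mu_0\lambda_2 h(Z_1 \mid U_1) \\ &\qquad+ (\mu_2 - \mu_0\lambda_2)h(Z_2 \mid U_2) - \mu_0\lambda_1 h(Y_2 \mid U_2). \end{aligned}$$

To complete the proof, we need to show that

$$(\mu_1 - \mu_0\lambda_1)h(Y_1 \mid U_1) - \mu_0\lambda_2 h(Z_1 \mid U_1) - \mu_1 h(Y_1 \mid X_1) \le \frac{\mu_1 - \mu_0\lambda_1}{2}\log|2\pi e(Q_1^*+N_1)| - \frac{\mu_0\lambda_2}{2}\log|2\pi e(Q_1^*+M_1)| - \frac{\mu_1}{2}\log|2\pi e N_1|, \tag{23}$$

$$(\mu_2 - \mu_0\lambda_2)h(Z_2 \mid U_2) - \mu_0\lambda_1 h(Y_2 \mid U_2) - \mu_2 h(Z_2 \mid X_2) \le \frac{\mu_2 - \mu_0\lambda_2}{2}\log|2\pi e(Q_2^*+M_2)| - \frac{\mu_0\lambda_1}{2}\log|2\pi e(Q_2^*+N_2)| - \frac{\mu_2}{2}\log|2\pi e M_2|, \tag{24}$$

where $Q_1^*, Q_2^*$ are as specified by Lemma 1. By symmetry, we only need to show that (23) is true, where $O_1$ denotes the matrix in Lemma 1 satisfying $Q_1^* O_1 = 0$ and $O_1'$ the matrix satisfying $(S_1 - Q_1^*)O_1' = 0$.

If $O_1 = 0$ in Lemma 1, then we must have $\mu_1 - \mu_0\lambda_1 \ge 0$, and it follows from Theorem 6 that (23) is true. Now consider the case $O_1 \ne 0$. If $\mu_1 - \mu_0\lambda_1 < 0$, then $O_1 \succ 0$, since

$$\mu_1(Q_1^*+N_1)^{-1} + O_1 = \mu_0\lambda_1(Q_1^*+N_1)^{-1} + \mu_0\lambda_2(Q_1^*+M_1)^{-1} + O_1',$$

which further implies $Q_1^* = 0$ due to the fact that $Q_1^* O_1 = 0$. Then it is clear that (23) is true. If $\mu_1 - \mu_0\lambda_1 = 0$, then again we have either $Q_1^* = 0$ or $\lambda_2 = 0$. It can be seen that (23) is true in both cases.

The remaining case is $\mu_1 - \mu_0\lambda_1 > 0$ and $O_1 \ne 0$. Define $\tilde{N}_1 \succ 0$ such that

$$(\mu_1 - \mu_0\lambda_1)(Q_1^* + \tilde{N}_1)^{-1} = (\mu_1 - \mu_0\lambda_1)(Q_1^* + N_1)^{-1} + O_1.$$

It is clear that $\tilde{N}_1 \preceq N_1 \preceq M_1$. Define an enhanced receiver $\tilde{Y}_1$ associated with $\tilde{N}_1$. It is important to note that if we replaced the overall sub-channel output $Y_1$ by $\tilde{Y}_1$, then the rate of the common message would also be affected. Instead, we shall only enhance the component related to the private messages. More specifically, let $Y_1$ be a physically degraded version of $\tilde{Y}_1$; then it follows that

$$I(X_1;Y_1 \mid U_1) \le I(X_1;\tilde{Y}_1,Y_1 \mid U_1) = I(X_1;\tilde{Y}_1 \mid U_1).$$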
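The inequality and the equality in the last display both follow from the chain rule for mutual information. The expansion below is a sketch under the assumption stated in the text, namely that $Y_1$ is a physically degraded version of the enhanced output $\tilde{Y}_1$, so that $U_1 \to X_1 \to \tilde{Y}_1 \to Y_1$ forms a Markov chain.

```latex
\begin{align*}
I(X_1; \tilde{Y}_1, Y_1 \mid U_1)
  &= I(X_1; Y_1 \mid U_1) + I(X_1; \tilde{Y}_1 \mid U_1, Y_1)
   \;\ge\; I(X_1; Y_1 \mid U_1), \\
I(X_1; \tilde{Y}_1, Y_1 \mid U_1)
  &= I(X_1; \tilde{Y}_1 \mid U_1) + I(X_1; Y_1 \mid U_1, \tilde{Y}_1)
   \;=\; I(X_1; \tilde{Y}_1 \mid U_1).
\end{align*}
```

The first line drops a non-negative conditional mutual information term. In the second line, the term $I(X_1; Y_1 \mid U_1, \tilde{Y}_1)$ vanishes because, given $\tilde{Y}_1$, the degraded output $Y_1$ is conditionally independent of $(U_1, X_1)$.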
It follows that we can now bound the left hand side of (23) as given next:

$$\begin{aligned} &(\mu_1 - \mu_0\lambda_1)h(Y_1 \mid U_1) - \mu_0\lambda_2 h(Z_1 \mid U_1) - \mu_1 h(Y_1 \mid X_1) \\ &\quad\le \mu_1 I(X_1;\tilde{Y}_1 \mid U_1) - \mu_0\lambda_1 h(Y_1 \mid U_1) - \mu_0\lambda_2 h(Z_1 \mid U_1) \\ &\quad= \mu_1 h(\tilde{Y}_1 \mid U_1) - \mu_0\lambda_1 h(Y_1 \mid U_1) - \mu_0\lambda_2 h(Z_1 \mid U_1) - \mu_1 h(\tilde{Y}_1 \mid X_1) \\ &\quad\overset{(a)}{\le} \frac{\mu_1}{2}\log|2\pi e(Q_1^* + \tilde{N}_1)| - \frac{\mu_0\lambda_1}{2}\log|2\pi e(Q_1^* + N_1)| - \frac{\mu_0\lambda_2}{2}\log|2\pi e(Q_1^* + M_1)| - \frac{\mu_1}{2}\log|2\pi e\tilde{N}_1|, \end{aligned}$$

where (a) is due to Theorem 6. Since $Q_1^* O_1 = 0$ and $(\mu_1 - \mu_0\lambda_1)(Q_1^* + \tilde{N}_1)^{-1} = (\mu_1 - \mu_0\lambda_1)(Q_1^* + N_1)^{-1} + O_1$, we have

$$(\mu_1 - \mu_0\lambda_1)Q_1^*(Q_1^* + \tilde{N}_1)^{-1} = (\mu_1 - \mu_0\lambda_1)Q_1^*(Q_1^* + N_1)^{-1} + Q_1^* O_1 = (\mu_1 - \mu_0\lambda_1)Q_1^*(Q_1^* + N_1)^{-1},$$

i.e., $Q_1^*(Q_1^* + \tilde{N}_1)^{-1} = Q_1^*(Q_1^* + N_1)^{-1}$. This implies $\tilde{N}_1(Q_1^* + \tilde{N}_1)^{-1} = N_1(Q_1^* + N_1)^{-1}$. Therefore, we have

$$\frac{|Q_1^* + \tilde{N}_1|}{|\tilde{N}_1|} = \frac{|Q_1^* + N_1|}{|N_1|}.$$

Thus (23) is proved. Since both regions in the statement of Theorem 1 are convex, and we have essentially shown that their supporting hyperplanes match, the proof is now complete.

The proof of Theorem 2 can be outlined as follows. By the single-letter characterization of Theorem 4, for any achievable rate triple there exist independent random variables $(U_1,X_1)$ and $(U_2,X_2)$ that satisfy the individual antenna power constraints. Such random variables induce separate covariances $E(X_1X_1^t)$ and $E(X_2X_2^t)$, the diagonal elements of which satisfy the individual antenna power constraints. This rate triple must therefore belong to the capacity region $\mathcal{C}(E(X_1X_1^t), E(X_2X_2^t))$, again due to Theorem 4. Thus the union of the capacity regions over all admissible covariance pairs $(S_1,S_2)$ gives the complete capacity region $\mathcal{C}(P_1,P_2,\dots,P_{t_1+t_2})$. The proof of Theorem 3 follows a similar argument.

VI. CONCLUSION

We considered the problem of broadcasting on reversely degraded Gaussian MIMO channels. The characterization given in [9] is generalized to include channel cost constraints for statistically separable cost functions. By utilizing an alternative characterization, an extremal inequality and the enhancement technique, we showed that Gaussian codebooks are optimal for the reversely degraded Gaussian MIMO channels under various power constraints.

REFERENCES

[1] P. Bergmans, "A simple converse proof for the broadcast channels with additive white Gaussian noise," IEEE Trans. Information Theory, vol. 20, no. 2, pp. 279-280, Mar. 1974.

[2] R. G. Gallager, "Capacity and coding for degraded broadcast channels," Probl. Pered. Inform., vol. 10, no. 3, pp. 3-14, Jul.-Sep. 1974; translated in Probl. Inform. Transm., vol. 10, no. 3, pp. 185-193, Jul.-Sep. 1974.

[3] K. Marton, "The capacity region of deterministic broadcast channels," in Trans. Int. Symp. Inform. Theory, pp. 43-48, Cachan, France, 1977.

[4] M. S. Pinsker, "The capacity region of noiseless broadcast channels," Probl. Pered. Inform., vol. 14, no. 2, pp. 28-34, Apr.-Jun. 1978; translated in Probl. Inform. Transm., vol. 14, no. 2, pp. 97-102, Apr.-Jun. 1978.

[5] A. A. El Gamal, "The capacity of a class of broadcast channels," IEEE Trans. Information Theory, vol. 25, no. 2, pp. 166-169, Mar. 1979.

[6] J. Korner and K. Marton, "General broadcast channels with degraded message sets," IEEE Trans. Information Theory, vol. 23, no. 1, pp. 60-64, Jan. 1977.

[7] T. Cover, "Comments on broadcast channels," IEEE Trans. Information Theory, vol. 44, no. 6, pp. 2524-2530, Oct. 1998.

[8] H. Weingarten, Y. Steinberg, and S. Shamai, "The capacity region of the Gaussian multiple-input multiple-output broadcast channel," IEEE Trans. Information Theory, vol. 52, no. 9, pp. 3936-3964, Sep. 2006.

[9] A. A. El Gamal, "The capacity of the product and sum of two reversely degraded broadcast channels," Probl. Pered. Inform., vol. 16, no. 1, pp. 3-23, Jan.-Mar. 1980; translated in Probl. Inform. Transm., vol. 16, no. 1, pp. 1-16, Jan.-Mar. 1980.

[10] H. Weingarten, Y. Steinberg, and S. Shamai, "On the capacity region of the multi-antenna broadcast channel with common messages," IEEE Int. Symp. Inform. Th., pp. 2195-2199, Jul. 2006, Seattle, USA.

[11] N. Jindal and A. Goldsmith, "Capacity and dirty paper coding for Gaussian broadcast channels with common information," IEEE Int. Symp. Inform. Th., p. 215, Jun.-Jul. 2004, Chicago, USA.

[12] H. Weingarten, T. Liu, S. Shamai, Y. Steinberg and P. Viswanath, "Capacity region of the degraded MIMO compound broadcast channel," IEEE Trans. Information Theory, to appear.

[13] L. Grokop and D. N. C. Tse, "Fundamental constraints on multicast capacity regions," IEEE Trans. Information Theory, submitted for publication. Arxiv:0809.835v.

[14] C. Tian, "Latent capacity region: a case study on symmetric broadcast with common messages," IEEE Trans. Information Theory, submitted for publication.

[15] T. Berger, "Multiterminal source coding," in Lecture notes at CISM summer school on the information theory approach to communications, 1977.

[16] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA: Athena Scientific, 1999.