A Single-letter Upper Bound for the Sum Rate of Multiple Access Channels with Correlated Sources


Wei Kang    Sennur Ulukus
Department of Electrical and Computer Engineering
University of Maryland, College Park, MD 20742
wkang@eng.umd.edu    ulukus@umd.edu

This work was supported by NSF Grants CCR 03-11311, CCF 04-47613 and CCF 05-14846; and ARL/CTA Grant DAAD 19-01-2-0011.

arXiv:cs/0511096v1 [cs.IT] 8 Nov 2005

Abstract: The capacity region of the multiple access channel with arbitrarily correlated sources remains an open problem. Cover, El Gamal and Salehi gave an achievable region in the form of single-letter entropy and mutual information expressions, without a single-letter converse. Cover, El Gamal and Salehi also gave a converse in terms of some n-letter mutual informations, which are incomputable. In this paper, we derive an upper bound for the sum rate of this channel in a single-letter expression by using spectrum analysis. The incomputability of the sum rate of the Cover, El Gamal and Salehi scheme comes from the difficulty of characterizing the possible joint distributions for the n-letter channel inputs. Here we introduce a new data processing inequality, which leads to a single-letter necessary condition on these possible joint distributions. We develop a single-letter upper bound for the sum rate by using this single-letter necessary condition on the possible joint distributions.

I. INTRODUCTION

The problem of determining the capacity region of the multiple access channel with correlated sources can be formulated as follows. Given a pair of correlated sources $(U,V)$ described by the joint probability distribution $p(u,v)$, and a discrete, memoryless, multiple access channel characterized by the transition probability $p(y|x_1,x_2)$, what are the necessary and sufficient conditions for the reliable transmission of $n$ independent identically distributed (i.i.d.) samples of the sources through the channel, in $n$ channel uses, as $n \to \infty$?

This problem was studied by Cover, El Gamal and Salehi in [1], where an achievable region expressed by single-letter entropies and mutual informations was given. This region was shown to be suboptimal by Dueck [2]. Cover, El Gamal and Salehi [1] also provided a capacity result with both achievability and converse, but in incomputable expressions in the form of some $n$-letter mutual informations. In this paper, we derive an upper bound for the sum rate of this channel in a single-letter expression. The incomputability of the sum rate of the Cover, El Gamal and Salehi scheme is due to the difficulty of characterizing the possible joint distributions for the $n$-letter channel inputs.

The Cover, El Gamal, Salehi converse for the sum rate is

$$H(U,V) \le \frac{1}{n} I(X_1^n, X_2^n; Y^n) \qquad (1)$$

where the random variables involved have a joint distribution of the form

$$\prod_{i=1}^{n} p(u_i, v_i) \; p(x_1^n \mid u^n) \; p(x_2^n \mid v^n) \prod_{i=1}^{n} p(y_i \mid x_{1i}, x_{2i}) \qquad (2)$$

i.e., the sources and the channel inputs satisfy the Markov chain relation $X_1^n - U^n - V^n - X_2^n$. It is difficult to evaluate the mutual information on the right hand side of (1) when the joint probability distribution of the random variables involved is subject to (2). A usual way to upper bound the mutual information in (1) is

$$\frac{1}{n} I(X_1^n, X_2^n; Y^n) \le \frac{1}{n} \sum_{i=1}^{n} I(X_{1i}, X_{2i}; Y_i) \le \max I(X_1, X_2; Y) \qquad (3)$$

where the maximization in (3) is over all possible $X_1$ and $X_2$ such that $X_1 - U^n - V^n - X_2$. Therefore, combining (1) and (3), a single-letter upper bound for the sum rate is obtained as

$$H(U,V) \le \max I(X_1, X_2; Y) \qquad (4)$$

where the maximization is over all $X_1$, $X_2$ such that $X_1 - U^n - V^n - X_2$. However, a closed form expression for the set of $p(x_1, x_2)$ satisfying this Markov chain, for all $U$, $V$ and $n$, seems intractable to obtain.
The data processing inequality [3, p. 32] is an intuitive way to obtain a necessary condition on $p(x_1,x_2)$ under the above Markov chain constraint, i.e., we may try to solve the following problem as an upper bound for (4):

$$\max I(X_1, X_2; Y) \quad \text{s.t.} \quad I(X_1; X_2) \le I(U^n; V^n) = n I(U;V) \qquad (5)$$

where the constraint restricts the feasible set of $p(x_1,x_2)$. However, when $n$ is large, this upper bound becomes trivial, since $n I(U;V)$ quickly grows larger than any value $I(X_1;X_2)$ can take, even without the Markov chain constraint; the short numerical sketch below illustrates this. Although the data processing inequality in its usual form does not prove useful in this problem, we will still use the basic methodology of employing a data processing inequality to represent the Markov chain constraint on the valid input distributions. For this, we will introduce a new data processing inequality.

Spectrum analysis has been instrumental in the study of some properties of pairs of correlated random variables, especially those of i.i.d. sequences of pairs of correlated random variables, e.g., common information in [4] and isomorphism in [5]. In this paper, we use spectrum analysis to introduce a new data processing inequality. Our new data processing inequality provides a single-letter necessary condition for the joint distributions satisfying the Markov chain condition, and leads to a non-trivial single-letter upper bound for the sum rate of the multiple access channel with correlated sources.
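To make the triviality of (5) concrete: with binary channel inputs, $I(X_1;X_2)$ can never exceed 1 bit, so the constraint in (5) stops binding as soon as $n I(U;V) \ge 1$. The following minimal sketch computes that crossover point; the source used is the first example of Section VI and is only an illustrative choice.

```python
import numpy as np

def mutual_information(P):
    """I(U;V) in bits for a joint distribution matrix P (rows: u, columns: v)."""
    pu = P.sum(axis=1, keepdims=True)
    pv = P.sum(axis=0, keepdims=True)
    mask = P > 0
    return float(np.sum(P[mask] * np.log2(P[mask] / (pu @ pv)[mask])))

# Illustrative source (first example in Section VI); any source with I(U;V) > 0 works.
P_UV = np.array([[1/3, 1/6],
                 [1/6, 1/3]])
I_UV = mutual_information(P_UV)

# With binary X1, X2 we always have I(X1;X2) <= 1 bit, so the constraint
# I(X1;X2) <= n * I(U;V) in (5) is vacuous once n * I(U;V) >= 1.
n_star = int(np.ceil(1.0 / I_UV))
print(f"I(U;V) = {I_UV:.4f} bits; the constraint in (5) is vacuous for all n >= {n_star}")
```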

II. SOME PRELIMINARIES

In this section, we provide some basic results that will be used in our later development. The concepts used here were originally introduced by Witsenhausen in [4] in the context of operator theory. Here, we limit ourselves to the finite alphabet case, and derive our results by means of matrix theory.

We first introduce our matrix notation for probability distributions. For a pair of discrete random variables $X$ and $Y$, which take values in $\mathcal{X} = \{x_1, x_2, \ldots, x_m\}$ and $\mathcal{Y} = \{y_1, y_2, \ldots, y_n\}$, respectively, the joint distribution matrix $P_{XY}$ is defined by $P_{XY}(i,j) \triangleq \Pr(X = x_i, Y = y_j)$, where $P_{XY}(i,j)$ denotes the $(i,j)$-th element of the matrix $P_{XY}$. From this definition, we have $P_{XY}^T = P_{YX}$. The marginal distribution of a random variable $X$ is defined as a diagonal matrix $P_X$ with $P_X(i,i) \triangleq \Pr(X = x_i)$. The vector-form marginal distribution is defined as $p_X(i) \triangleq \Pr(X = x_i)$, i.e., $p_X = P_{XY} e$, where $e$ is a vector of all ones. Similarly, we define $p_Y \triangleq P_{YX} e = P_Y e$. The conditional distribution of $X$ given $Y$ is defined in matrix form as $P_{X|Y}(i,j) \triangleq \Pr(X = x_i \mid Y = y_j)$, and $P_{XY} = P_{X|Y} P_Y$. We define a new quantity, $\tilde{P}_{XY}$, which will play an important role in the rest of the paper, as

$$\tilde{P}_{XY} = P_X^{-1/2} P_{XY} P_Y^{-1/2} \qquad (6)$$

Our main theorem in this section identifies the spectral properties of $\tilde{P}_{XY}$. Before stating our theorem, we provide the following lemma, which will be used in its proof.

Lemma 1 [6, p. 49]: The spectral radius of a stochastic matrix is 1. A non-negative matrix $T$ is stochastic if and only if $e$ is an eigenvector of $T$ corresponding to the eigenvalue 1.

Theorem 1: An $m \times n$ non-negative matrix $P_{XY}$ is a joint distribution matrix with marginal distributions $P_X$ and $P_Y$, i.e., $P_{XY} e = p_X \triangleq P_X e$ and $P_{XY}^T e = p_Y \triangleq P_Y e$, if and only if the singular value decomposition (SVD) of $\tilde{P}_{XY} \triangleq P_X^{-1/2} P_{XY} P_Y^{-1/2}$ satisfies

$$\tilde{P}_{XY} = U \Lambda V^T = \sqrt{p_X} \sqrt{p_Y}^T + \sum_{i=2}^{l} \lambda_i u_i v_i^T \qquad (7)$$

where $U \triangleq [u_1, \ldots, u_l]$ and $V \triangleq [v_1, \ldots, v_l]$ are two unitary matrices, $\Lambda \triangleq \mathrm{diag}[\lambda_1, \ldots, \lambda_l]$ and $l = \min(m,n)$; $u_1 = \sqrt{p_X}$, $v_1 = \sqrt{p_Y}$, and $\lambda_1 = 1 \ge \lambda_2 \ge \cdots \ge \lambda_l \ge 0$. That is, all of the singular values of $\tilde{P}_{XY}$ are between 0 and 1, the largest singular value of $\tilde{P}_{XY}$ is 1, and the corresponding left and right singular vectors are $\sqrt{p_X}$ and $\sqrt{p_Y}$.
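The following small numerical sketch checks the claims of Theorem 1 on a randomly drawn joint distribution; the dimensions and the random seed are arbitrary choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random m x n joint distribution matrix with strictly positive entries.
m, n = 4, 3
P = rng.random((m, n))
P /= P.sum()

p_x = P.sum(axis=1)                                 # vector-form marginal of X
p_y = P.sum(axis=0)                                 # vector-form marginal of Y

# P_tilde = P_X^{-1/2} P_XY P_Y^{-1/2}, as defined in (6).
P_tilde = np.diag(p_x ** -0.5) @ P @ np.diag(p_y ** -0.5)

U, s, Vt = np.linalg.svd(P_tilde)

print(np.isclose(s[0], 1.0))                        # largest singular value is 1
print(np.all((s > -1e-12) & (s < 1 + 1e-12)))       # all singular values lie in [0, 1]
# The singular vectors for the singular value 1 are sqrt(p_x) and sqrt(p_y) (up to sign).
print(np.allclose(np.abs(U[:, 0]), np.sqrt(p_x)))
print(np.allclose(np.abs(Vt[0]), np.sqrt(p_y)))
```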
Proof: Let $\tilde{P}_{XY}$ satisfy (7). Then

$$P_X^{1/2} \tilde{P}_{XY} P_Y^{1/2} e = P_X^{1/2} \Big( \sqrt{p_X} \sqrt{p_Y}^T + \sum_{i=2}^{l} \lambda_i u_i v_i^T \Big) \sqrt{p_Y} = P_X^{1/2} \sqrt{p_X} = p_X \qquad (8)$$

since $\sqrt{p_Y}^T \sqrt{p_Y} = 1$ and $v_i^T \sqrt{p_Y} = 0$ for $i \ge 2$. Similarly, $e^T P_X^{1/2} \tilde{P}_{XY} P_Y^{1/2} = p_Y^T$. Thus, the non-negative matrix $P_X^{1/2} \tilde{P}_{XY} P_Y^{1/2}$ is a joint distribution matrix with marginal distributions $p_X$ and $p_Y$.

Conversely, consider a joint distribution $P_{XY}$ with marginal distributions $p_X$ and $p_Y$. We need to show that the singular values of $\tilde{P}_{XY}$ lie in $[0,1]$, that the largest singular value equals 1, and that $\sqrt{p_X}$ and $\sqrt{p_Y}$, respectively, are the left and right singular vectors corresponding to the singular value 1. To this end, we first construct a Markov chain $Z - Y - X$ with $P_{Z|Y} = P_{X|Y}$. Note that this also implies $P_Z = P_X$, $p_Z = p_X$, and $P_{ZY} = P_{XY}$. The special structure of the constructed Markov chain provides the following:

$$P_{Z|X} = P_{Z|Y} P_{Y|X} = P_{X|Y} P_{Y|X} = P_{XY} P_Y^{-1} P_{YX} P_X^{-1} = P_X^{1/2} \big( P_X^{-1/2} P_{XY} P_Y^{-1/2} \big) \big( P_Y^{-1/2} P_{YX} P_X^{-1/2} \big) P_X^{-1/2} = P_X^{1/2} \tilde{P}_{XY} \tilde{P}_{XY}^T P_X^{-1/2} \qquad (9)$$

We note that the matrix $P_{Z|X}$ is similar to the matrix $\tilde{P}_{XY} \tilde{P}_{XY}^T$ [7, p. 44]. Therefore, all eigenvalues of $P_{Z|X}$ are eigenvalues of $\tilde{P}_{XY} \tilde{P}_{XY}^T$ as well, and if $v$ is a left eigenvector of $P_{Z|X}$ corresponding to an eigenvalue $\mu$, then $P_X^{1/2} v$ is a left eigenvector of $\tilde{P}_{XY} \tilde{P}_{XY}^T$ corresponding to the same eigenvalue. We note that $P_{Z|X}$ is a stochastic matrix; therefore, from Lemma 1, $e$ is a left eigenvector of $P_{Z|X}$ corresponding to the eigenvalue 1, which is also equal to the spectral radius of $P_{Z|X}$. Since $P_{Z|X}$ is similar to $\tilde{P}_{XY} \tilde{P}_{XY}^T$, we have that $\sqrt{p_X} = P_X^{1/2} e$ is a left eigenvector of $\tilde{P}_{XY} \tilde{P}_{XY}^T$ with eigenvalue 1, and the rest of the eigenvalues of $\tilde{P}_{XY} \tilde{P}_{XY}^T$ lie in $[-1,1]$. In addition, $\tilde{P}_{XY} \tilde{P}_{XY}^T$ is a symmetric positive semi-definite matrix, which implies that its eigenvalues are real and non-negative. Since the eigenvalues of $\tilde{P}_{XY} \tilde{P}_{XY}^T$ are non-negative, and the largest eigenvalue is equal to 1, we conclude that all of the eigenvalues of $\tilde{P}_{XY} \tilde{P}_{XY}^T$ lie in the interval $[0,1]$. The singular values of $\tilde{P}_{XY}$ are the square roots of the eigenvalues of $\tilde{P}_{XY} \tilde{P}_{XY}^T$, and the left singular vectors of $\tilde{P}_{XY}$ are the eigenvectors of $\tilde{P}_{XY} \tilde{P}_{XY}^T$. Thus, the singular values of $\tilde{P}_{XY}$ lie in $[0,1]$, the largest singular value is equal to 1, and $\sqrt{p_X}$ is a left singular vector corresponding to the singular value 1. The corresponding right singular vector is

$$v_1^T = u_1^T \tilde{P}_{XY} = \sqrt{p_X}^T P_X^{-1/2} P_{XY} P_Y^{-1/2} = e^T P_{XY} P_Y^{-1/2} = p_Y^T P_Y^{-1/2} = \sqrt{p_Y}^T \qquad (10)$$

which concludes the proof.

III. A NEW DATA PROCESSING INEQUALITY

In this section, we introduce a new data processing inequality in the following theorem. We first provide a lemma that will be used in its proof.

Lemma 2 [8, p. 178]: For matrices $A$ and $B$,

$$\lambda_i(AB) \le \lambda_i(A) \, \lambda_1(B) \qquad (11)$$

where $\lambda_i(\cdot)$ denotes the $i$-th largest singular value of a matrix.

Theorem 2: If $X - Y - Z$ forms a Markov chain, then

$$\lambda_i(\tilde{P}_{XZ}) \le \lambda_i(\tilde{P}_{XY}) \, \lambda_2(\tilde{P}_{YZ}) \le \lambda_i(\tilde{P}_{XY}) \qquad (12)$$

where $i = 2, \ldots, \mathrm{rank}(\tilde{P}_{XZ})$.

Proof: From the structure of the Markov chain, which gives $P_{XZ} = P_{X|Y} P_{YZ} = P_{XY} P_Y^{-1} P_{YZ}$, and from the definition of $\tilde{P}_{XY}$ in (6), we have

$$\tilde{P}_{XZ} = P_X^{-1/2} P_{XZ} P_Z^{-1/2} = \tilde{P}_{XY} \tilde{P}_{YZ} \qquad (13)$$

Using (7) for $\tilde{P}_{XZ}$, we obtain

$$\tilde{P}_{XZ} = \sqrt{p_X} \sqrt{p_Z}^T + \sum_{i \ge 2} \lambda_i(\tilde{P}_{XZ}) \, u_i(\tilde{P}_{XZ}) \, v_i(\tilde{P}_{XZ})^T \qquad (14)$$

and using (7) for $\tilde{P}_{XY}$ and $\tilde{P}_{YZ}$ yields

$$\tilde{P}_{XY} \tilde{P}_{YZ} = \Big( \sqrt{p_X} \sqrt{p_Y}^T + \sum_{i \ge 2} \lambda_i(\tilde{P}_{XY}) u_i(\tilde{P}_{XY}) v_i(\tilde{P}_{XY})^T \Big) \Big( \sqrt{p_Y} \sqrt{p_Z}^T + \sum_{i \ge 2} \lambda_i(\tilde{P}_{YZ}) u_i(\tilde{P}_{YZ}) v_i(\tilde{P}_{YZ})^T \Big) = \sqrt{p_X} \sqrt{p_Z}^T + \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{XY}) u_i(\tilde{P}_{XY}) v_i(\tilde{P}_{XY})^T \Big) \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{YZ}) u_i(\tilde{P}_{YZ}) v_i(\tilde{P}_{YZ})^T \Big) \qquad (15)$$

where the two cross-terms vanish since $\sqrt{p_Y}$ is both $v_1(\tilde{P}_{XY})$ and $u_1(\tilde{P}_{YZ})$, and therefore $\sqrt{p_Y}$ is orthogonal to $v_i(\tilde{P}_{XY})$ and $u_j(\tilde{P}_{YZ})$ for all $i, j \ge 2$. Using (13) and equating (14) and (15), we obtain

$$\sum_{i \ge 2} \lambda_i(\tilde{P}_{XZ}) u_i(\tilde{P}_{XZ}) v_i(\tilde{P}_{XZ})^T = \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{XY}) u_i(\tilde{P}_{XY}) v_i(\tilde{P}_{XY})^T \Big) \Big( \sum_{i \ge 2} \lambda_i(\tilde{P}_{YZ}) u_i(\tilde{P}_{YZ}) v_i(\tilde{P}_{YZ})^T \Big) \qquad (16)$$

The proof is completed by applying Lemma 2 to (16): the $(i-1)$-st largest singular value of the left hand side is $\lambda_i(\tilde{P}_{XZ})$, while the two factors on the right hand side have $(i-1)$-st largest and largest singular values $\lambda_i(\tilde{P}_{XY})$ and $\lambda_2(\tilde{P}_{YZ})$, respectively. The second inequality in (12) follows since $\lambda_2(\tilde{P}_{YZ}) \le 1$ by Theorem 1.
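As a quick sanity check of Theorem 2, the sketch below builds a random Markov chain $X - Y - Z$ from a random joint distribution $p(x,y)$ and a random channel $p(z|y)$, and verifies the inequality numerically; the alphabet sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def tilde(P):
    """P_X^{-1/2} P_XY P_Y^{-1/2} for a joint distribution matrix P, as in (6)."""
    px, py = P.sum(axis=1), P.sum(axis=0)
    return np.diag(px ** -0.5) @ P @ np.diag(py ** -0.5)

def singular_values(P):
    return np.linalg.svd(tilde(P), compute_uv=False)     # descending order

# Random Markov chain X - Y - Z: a joint p(x,y) and a channel p(z|y).
P_XY = rng.random((4, 5)); P_XY /= P_XY.sum()
W = rng.random((5, 3));    W /= W.sum(axis=1, keepdims=True)   # rows: y, columns: z

P_YZ = np.diag(P_XY.sum(axis=0)) @ W        # p(y,z) = p(y) p(z|y)
P_XZ = P_XY @ W                             # p(x,z) = sum_y p(x,y) p(z|y)

lam_xz = singular_values(P_XZ)
lam_xy = singular_values(P_XY)
lam_yz = singular_values(P_YZ)

# Theorem 2 in zero-based indexing: lam_xz[i] <= lam_xy[i] * lam_yz[1] for i >= 1.
for i in range(1, len(lam_xz)):
    assert lam_xz[i] <= lam_xy[i] * lam_yz[1] + 1e-12
print("Theorem 2 holds here:", lam_xz[1:], "<=", lam_xy[1:len(lam_xz)] * lam_yz[1])
```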

IV. ON I.I.D. SEQUENCES

Let $(U^n, V^n)$ be a pair of i.i.d. sequences, where each pair of letters of these sequences has the joint distribution $P_{UV}$. Thus, the joint distribution of the sequences is $P_{U^n V^n} = P_{UV}^{\otimes n}$, where $A^{\otimes 1} \triangleq A$, $A^{\otimes k} \triangleq A \otimes A^{\otimes (k-1)}$, and $\otimes$ represents the Kronecker product of matrices [7]. From (6),

$$\tilde{P}_{UV} = P_U^{-1/2} P_{UV} P_V^{-1/2} \qquad (17)$$

Then,

$$P_{U^n V^n} = P_{UV}^{\otimes n} = \big( P_U^{1/2} \tilde{P}_{UV} P_V^{1/2} \big)^{\otimes n} = \big( P_U^{1/2} \big)^{\otimes n} \, \tilde{P}_{UV}^{\otimes n} \, \big( P_V^{1/2} \big)^{\otimes n} \qquad (18)$$

We also have $P_{U^n} = P_U^{\otimes n}$ and $P_{V^n} = P_V^{\otimes n}$. Thus,

$$\tilde{P}_{U^n V^n} = P_{U^n}^{-1/2} P_{U^n V^n} P_{V^n}^{-1/2} = \big( P_U^{-1/2} \big)^{\otimes n} \big( P_U^{1/2} \big)^{\otimes n} \, \tilde{P}_{UV}^{\otimes n} \, \big( P_V^{1/2} \big)^{\otimes n} \big( P_V^{-1/2} \big)^{\otimes n} = \tilde{P}_{UV}^{\otimes n} \qquad (19)$$

Applying the SVD to $\tilde{P}_{U^n V^n}$, we have

$$\tilde{P}_{U^n V^n} = U_n \Lambda_n V_n^T = \tilde{P}_{UV}^{\otimes n} = U^{\otimes n} \Lambda^{\otimes n} \big( V^{\otimes n} \big)^T \qquad (20)$$

From the uniqueness of the SVD, we know that $U_n = U^{\otimes n}$, $\Lambda_n = \Lambda^{\otimes n}$ and $V_n = V^{\otimes n}$. Then, the ordered singular values of $\tilde{P}_{U^n V^n}$ are $\{1, \lambda_2(\tilde{P}_{UV}), \ldots, \lambda_2(\tilde{P}_{UV}), \ldots\}$, where the second through the $(n+1)$-st singular values are all equal to $\lambda_2(\tilde{P}_{UV})$.
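A short sketch verifying (19) and the stated singular value structure for a small source; the source matrix and the value of $n$ are arbitrary illustrative choices.

```python
import numpy as np

def tilde(P):
    px, py = P.sum(axis=1), P.sum(axis=0)
    return np.diag(px ** -0.5) @ P @ np.diag(py ** -0.5)

P_UV = np.array([[1/3, 1/6],
                 [1/6, 1/3]])
n = 3

# Joint distribution of the i.i.d. pair (U^n, V^n): the n-fold Kronecker power of P_UV.
P_n = P_UV.copy()
T_n = tilde(P_UV).copy()
for _ in range(n - 1):
    P_n = np.kron(P_n, P_UV)
    T_n = np.kron(T_n, tilde(P_UV))

assert np.allclose(tilde(P_n), T_n)                 # equation (19)

s = np.linalg.svd(T_n, compute_uv=False)            # descending order
lam2 = np.linalg.svd(tilde(P_UV), compute_uv=False)[1]
print(s[: n + 2])                                   # 1, then lam2 repeated n times, ...
print(np.allclose(s[1 : n + 1], lam2))              # second through (n+1)-st values
```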

V. A NECESSARY CONDITION

As stated in Section I, the sum rate can be upper bounded as

$$H(U,V) \le \max I(X_1, X_2; Y) \qquad (21)$$

where the maximization is over all possible $X_1$ and $X_2$ that satisfy the Markov chain $X_1 - U^n - V^n - X_2$. From Theorem 2 in Section III (applied twice along this chain, using the fact that $\tilde{P}_{YX} = \tilde{P}_{XY}^T$ has the same singular values), we know that if $X_1 - U^n - V^n - X_2$, then, for $i = 2, \ldots, \mathrm{rank}(\tilde{P}_{X_1 X_2})$,

$$\lambda_i(\tilde{P}_{X_1 X_2}) \le \lambda_2(\tilde{P}_{X_1 U^n}) \, \lambda_i(\tilde{P}_{U^n V^n}) \, \lambda_2(\tilde{P}_{V^n X_2}) \qquad (22)$$

We showed in Section IV that $\lambda_i(\tilde{P}_{U^n V^n}) \le \lambda_2(\tilde{P}_{UV})$ for $i \ge 2$, and $\lambda_i(\tilde{P}_{U^n V^n}) = \lambda_2(\tilde{P}_{UV})$ for $i = 2, \ldots, n+1$. Therefore, for $i = 2, \ldots, \mathrm{rank}(\tilde{P}_{X_1 X_2})$, we have

$$\lambda_i(\tilde{P}_{X_1 X_2}) \le \lambda_2(\tilde{P}_{X_1 U^n}) \, \lambda_2(\tilde{P}_{UV}) \, \lambda_2(\tilde{P}_{V^n X_2}) \qquad (23)$$

From Theorem 1, we know that $\lambda_2(\tilde{P}_{X_1 U^n}) \le 1$ and $\lambda_2(\tilde{P}_{V^n X_2}) \le 1$. Next, in Theorem 3, we determine that the least upper bound for $\lambda_2(\tilde{P}_{X_1 U^n})$ and $\lambda_2(\tilde{P}_{V^n X_2})$ is also 1.

Theorem 3: Let $F(n, P_X)$ be the set of all joint distributions for $X$ and $U^n$ with a given marginal distribution $P_X$ for $X$. Then,

$$\sup_{F(n, P_X),\; n = 1, 2, \ldots} \lambda_2(\tilde{P}_{X U^n}) = 1 \qquad (24)$$

The proof of Theorem 3 is given in the Appendix. Combining (23) and Theorem 3, we obtain the main result of our paper, which is stated in the following theorem.

Theorem 4: If a pair of i.i.d. sources $(U,V)$ with joint distribution $P_{UV}$ can be transmitted reliably through a discrete, memoryless, multiple access channel characterized by $p(y | x_1, x_2)$, then

$$H(U,V) \le I(X_1, X_2; Y) \qquad (25)$$

for some $(X_1, X_2)$ with

$$\lambda_i(\tilde{P}_{X_1 X_2}) \le \lambda_2(\tilde{P}_{UV}), \qquad i = 2, \ldots, \mathrm{rank}(\tilde{P}_{X_1 X_2}) \qquad (26)$$

VI. SOME SIMPLE EXAMPLES

We consider a multiple access channel where the alphabets of $X_1$, $X_2$ and $Y$ are all binary, and the channel transition probability matrix $p(y | x_1, x_2)$ is given as

x1 x2 \ y |  0  |  1
   00     |  1  |  0
   01     | 1/2 | 1/2
   10     | 1/2 | 1/2
   11     |  0  |  1

The following is a trivial upper bound, which we provide as a benchmark,

$$\max_{p(x_1, x_2)} I(X_1, X_2; Y) = 1 \qquad (27)$$

where the maximization is over all binary bivariate distributions. The maximum is achieved by $P(X_1 = 1, X_2 = 1) = P(X_1 = 0, X_2 = 0) = 1/2$. We note that this upper bound does not depend on the source distribution.

First, we consider a binary source $(U,V)$ with the following joint distribution $p(u,v)$:

u \ v |  0  |  1
  0   | 1/3 | 1/6
  1   | 1/6 | 1/3

In this case, $H(U,V) = 1.92$. We first note, using the trivial upper bound in (27), that it is impossible to transmit this source through the given channel reliably. The upper bound we developed in this paper gives 2/3 for this source. We also note that, for this case, our upper bound coincides with the single-letter achievability expression given in [1], which is

$$H(U,V) \le I(X_1, X_2; Y) \qquad (28)$$

where $X_1$, $X_2$ are such that $X_1 - U - V - X_2$ holds. Therefore, for this case, our upper bound is the converse, as it matches the achievability expression.

Next, we consider a binary source $(U,V)$ with the following joint distribution $p(u,v)$:

u \ v |  0  |  1
  0   |  0  | 0.1
  1   | 0.1 | 0.8

In this case, $H(U,V) = 0.92$, the single-letter achievability in (28) reaches 0.5, and our upper bound is 0.56. The gap between the achievability and our upper bound is quite small. We note that, in this case, the trivial upper bound in (27) fails to test whether reliable transmission is possible or not, while our upper bound determines conclusively that reliable transmission is not possible.

Finally, we consider a binary source $(U,V)$ with the following joint distribution $p(u,v)$:

u \ v |  0   |  1
  0   | 0.85 |  0
  1   | 0.05 | 0.1

In this case, $H(U,V) = 0.75$, the single-letter achievability expression in (28) gives 0.57, and our upper bound is 0.9. We note that the joint entropy of the sources falls into the gap between the achievability expression and our upper bound, which means that we cannot conclude whether it is possible (or not) to transmit these sources through the channel reliably.
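The following sketch reproduces these numbers: it computes $\lambda_2(\tilde{P}_{UV})$ for each source and then maximizes $I(X_1, X_2; Y)$ over binary input distributions satisfying the condition of Theorem 4. To keep the search one-dimensional, it restricts attention to symmetric inputs of the form $[a, 1/2-a; 1/2-a, a]$; this restriction is a simplification made only in this sketch, not part of the bound, but for these binary examples it recovers the values quoted above.

```python
import numpy as np

def tilde(P):
    px, py = P.sum(axis=1), P.sum(axis=0)
    return np.diag(px ** -0.5) @ P @ np.diag(py ** -0.5)

def lam2(P):
    return np.linalg.svd(tilde(P), compute_uv=False)[1]

def entropy(p):
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_info(P_in, W):
    """I(X1,X2;Y); rows of W are p(y|x1,x2) for (x1,x2) = 00, 01, 10, 11."""
    p_pair = P_in.reshape(-1)
    p_y = p_pair @ W
    return entropy(p_y) - sum(p_pair[k] * entropy(W[k]) for k in range(4))

# Channel of Section VI: Y = x1 when x1 = x2, and Y is uniform when x1 != x2.
W = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.5, 0.5],
              [0.0, 1.0]])

sources = {
    "first":  np.array([[1/3, 1/6], [1/6, 1/3]]),
    "second": np.array([[0.0, 0.1], [0.1, 0.8]]),
    "third":  np.array([[0.85, 0.0], [0.05, 0.1]]),
}

for name, P_UV in sources.items():
    lam_star = lam2(P_UV)
    best = 0.0
    # One-dimensional search over symmetric inputs [a, 1/2-a; 1/2-a, a].
    for a in np.linspace(0.0, 0.5, 5001):
        P_in = np.array([[a, 0.5 - a], [0.5 - a, a]])
        if lam2(P_in) <= lam_star + 1e-9:
            best = max(best, mutual_info(P_in, W))
    print(f"{name}: H(U,V) = {entropy(P_UV.reshape(-1)):.2f}, "
          f"lambda_2(P_UV) = {lam_star:.3f}, upper bound ~ {best:.3f}")
```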
VII. CONCLUSION

In this paper, we investigated the problem of transmitting correlated sources through a multiple access channel. We used spectrum analysis to develop a new data processing inequality, which provides a single-letter necessary condition for the joint distributions satisfying the Markov chain condition. By using our new data processing inequality, we developed a new single-letter upper bound for the sum rate of the multiple access channel with correlated sources.

APPENDIX
PROOF OF THEOREM 3

To find $\sup_{F(n,P_X),\, n=1,2,\ldots} \lambda_2(\tilde{P}_{X U^n})$, we need to exhaust the sets $F(n, P_X)$ with $n = 1, 2, \ldots$. In the following, we show that it suffices to check only the asymptotic case. For any joint distribution $P_{X U^n} \in F(n, P_X)$, we attach an independent $U$, say $U_{n+1}$, to the existing $n$-sequence, and get a new joint distribution $P_{X U^{n+1}} = P_{X U^n} \otimes p_U^T$, where $p_U$ is the marginal distribution of $U$ in vector form. By arguments similar to those in Section IV, we have that $\lambda_i(\tilde{P}_{X U^{n+1}}) = \lambda_i(\tilde{P}_{X U^n})$. Therefore, for every $P_{X U^n} \in F(n, P_X)$, there exists some $P_{X U^{n+1}} \in F(n+1, P_X)$ such that $\lambda_i(\tilde{P}_{X U^{n+1}}) = \lambda_i(\tilde{P}_{X U^n})$. Thus,

$$\sup_{F(n, P_X)} \lambda_2(\tilde{P}_{X U^n}) \le \sup_{F(n+1, P_X)} \lambda_2(\tilde{P}_{X U^{n+1}}) \qquad (29)$$

From (29), we see that $\sup_{F(n, P_X)} \lambda_2(\tilde{P}_{X U^n})$ is monotonically non-decreasing in $n$. We also note that $\lambda_2(\tilde{P}_{X U^n})$ is upper bounded by 1 for all $n$, i.e., $\lambda_2(\tilde{P}_{X U^n}) \le 1$. Therefore,

$$\sup_{F(n, P_X),\, n = 1, 2, \ldots} \lambda_2(\tilde{P}_{X U^n}) = \lim_{n \to \infty} \sup_{F(n, P_X)} \lambda_2(\tilde{P}_{X U^n}) \qquad (30)$$

To complete the proof, we need the following lemma.

Lemma 3 [4]: $\lambda_2(\tilde{P}_{XY}) = 1$ if and only if $P_{XY}$ decomposes. By "$P_{XY}$ decomposes" we mean that there exist sets $S_1 \subset \mathcal{X}$ and $S_2 \subset \mathcal{Y}$ such that $P(S_1)$, $P(\bar{S}_1)$, $P(S_2)$, $P(\bar{S}_2)$ are all positive, while $P(\bar{S}_1 \times S_2) = P(S_1 \times \bar{S}_2) = 0$.

In the following, we show by construction that there exists a joint distribution that decomposes asymptotically. For a given marginal distribution $p_X$, we arbitrarily choose a subset $S_1$ of the alphabet of $X$. We find a set $S_2$ in the alphabet of $U^n$ such that $P(S_2) = P(S_1)$ if it is possible. Otherwise, we pick $S_2$ such that $|P(S_1) - P(S_2)|$ is minimized. We denote by $S(n)$ the set of all subsets of the alphabet of $U^n$, and we also define $P_{\max} = \max_{u \in \mathcal{U}} \Pr(u)$. Then, we have

$$\min_{S_2 \in S(n)} |P(S_1) - P(S_2)| \le P_{\max}^n \qquad (31)$$

We construct a joint distribution for $X$ and $U^n$ as follows. First, we construct the joint distribution $P^i$ corresponding to the case where $X$ and $U^n$ are independent. Second, we rearrange the alphabets of $X$ and $U^n$ and group the sets $S_1$, $\bar{S}_1$, $S_2$ and $U^n \setminus S_2$ as follows:

$$P^i = \begin{bmatrix} P^i_{11} & P^i_{12} \\ P^i_{21} & P^i_{22} \end{bmatrix} \qquad (32)$$

where $P^i_{11}$, $P^i_{12}$, $P^i_{21}$, $P^i_{22}$ correspond to the sets $S_1 \times S_2$, $S_1 \times (U^n \setminus S_2)$, $\bar{S}_1 \times S_2$, $\bar{S}_1 \times (U^n \setminus S_2)$, respectively. Here, we assume that $P(S_1) \le P(S_2)$. Then, we scale these four sub-matrices as $P_{11} = P^i_{11} \frac{1}{P(S_2)}$, $P_{12} = 0$, $P_{21} = P^i_{21} \frac{P(S_2) - P(S_1)}{P(S_2)(1 - P(S_1))}$, $P_{22} = P^i_{22} \frac{1}{1 - P(S_1)}$, and let

$$P = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} \qquad (33)$$

We note that $P$ is a joint distribution for $X$ and $U^n$ with the given marginal distributions. Next, we move the mass in the sub-matrix $P_{21}$ to $P_{11}$, which yields

$$P' = \begin{bmatrix} P'_{11} & 0 \\ 0 & P_{22} \end{bmatrix} = P + E = \begin{bmatrix} P_{11} & 0 \\ P_{21} & P_{22} \end{bmatrix} + \begin{bmatrix} E_{11} & 0 \\ -E_{21} & 0 \end{bmatrix} \qquad (34)$$

where $E_{21} \triangleq P_{21}$, $E_{11} \triangleq P^i_{11} \frac{P(S_2) - P(S_1)}{P(S_1) P(S_2)}$, and $P'_{11} = P_{11} \frac{P(S_2)}{P(S_1)}$. We denote by $P'_X$ and $P'_{U^n}$ the marginal distributions of $P'$. We note that $P'_{U^n} = P_{U^n}$ and $P'_X = P_X M$, where $M$ is a scaling diagonal matrix: the elements in the set $S_1$ are scaled up by a factor of $P(S_2)/P(S_1)$, and those in the set $\bar{S}_1$ are scaled down by a factor of $(1 - P(S_2))/(1 - P(S_1))$. Then,

$$\tilde{P}'_{X U^n} = M^{-1/2} \tilde{P}_{X U^n} + M^{-1/2} \tilde{E}_{X U^n} \qquad (35)$$

where $\tilde{E}_{X U^n} \triangleq P_X^{-1/2} E \, P_{U^n}^{-1/2}$. We will need the following lemmas in the remainder of our derivations. Lemma 5 can be proved using techniques similar to those in the proof of Lemma 4 [9].

Lemma 4 [9]: If $A' = A + E$, then $|\lambda_i(A') - \lambda_i(A)| \le \|E\|$, where $\|E\|$ is the spectral norm of $E$.

Lemma 5: If $A' = M A$, where $M$ is an invertible matrix, then $\|M^{-1}\|^{-1} \le \lambda_i(A') / \lambda_i(A) \le \|M\|$.

Since $P'$ decomposes, using Lemma 3, we conclude that $\lambda_2(\tilde{P}'_{X U^n}) = 1$. We upper bound $\|\tilde{E}_{X U^n}\|$ as follows,

$$\|\tilde{E}_{X U^n}\| \le \|\tilde{E}_{X U^n}\|_F \qquad (36)$$

where $\|\cdot\|_F$ is the Frobenius norm. Combining (32) and (34), we have

$$\|\tilde{E}_{X U^n}\|_F \le \frac{P(S_2) - P(S_1)}{P_1 P(S_2)} \, \|\tilde{P}^i_{X U^n}\|_F \qquad (37)$$

where $P_1 \triangleq \min(P(S_1), P(\bar{S}_1))$. Since $P^i$ corresponds to the independent case, we have $\|\tilde{P}^i_{X U^n}\|_F = 1$ from (7). Then, from (31), (36) and (37), we obtain

$$\|\tilde{E}_{X U^n}\| \le c_1 P_{\max}^n \qquad (38)$$

where $c_1 \triangleq \frac{1}{P_1 P(S_2)}$. From Lemma 2, we have

$$\|M^{-1/2} \tilde{E}_{X U^n}\| \le \|M^{-1/2}\| \, \|\tilde{E}_{X U^n}\| \le \sqrt{\frac{1 - P(S_1)}{1 - P(S_2)}} \, c_1 P_{\max}^n \le c_2 P_{\max}^n \qquad (39)$$

From Lemma 4 applied to (35), we have

$$1 - c_2 P_{\max}^n \le \lambda_2(M^{-1/2} \tilde{P}_{X U^n}) \le 1 + c_2 P_{\max}^n \qquad (40)$$

We upper bound $\|M^{1/2}\|$ as follows:

$$\|M^{1/2}\| = \sqrt{\frac{P(S_2)}{P(S_1)}} = \sqrt{1 + \frac{P(S_2) - P(S_1)}{P(S_1)}} \le 1 + \sqrt{\frac{P_{\max}^n}{P(S_1)}} \le 1 + c_3 P_{\max}^{n/2} \qquad (41)$$

Similarly, $\|M^{-1/2}\| \le 1 + c_4 P_{\max}^{n/2}$. From Lemma 5, we have

$$\frac{1}{1 + c_4 P_{\max}^{n/2}} \le \frac{\lambda_2(\tilde{P}_{X U^n})}{\lambda_2(M^{-1/2} \tilde{P}_{X U^n})} \le 1 + c_3 P_{\max}^{n/2} \qquad (42)$$

Since $\tilde{P}_{X U^n}$ corresponds to a joint distribution matrix, from Theorem 1, we know that $\lambda_2(\tilde{P}_{X U^n}) \le 1$. Therefore, combining (40) and (42), we have

$$\big( 1 - c_4 P_{\max}^{n/2} \big) \big( 1 - c_2 P_{\max}^{n} \big) \le \lambda_2(\tilde{P}_{X U^n}) \le 1 \qquad (43)$$

When $P_{\max} < 1$, corresponding to the non-trivial case, $\lim_{n \to \infty} P_{\max}^{n/2} = 0$, and using (30), (24) follows. The case $P(S_1) > P(S_2)$ can be proved similarly.
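The construction above is easy to simulate. The sketch below assumes a binary $X$ with $p_X = (0.3, 0.7)$, so that $S_1 = \{x = 0\}$, and a binary $U$ with $p_U = (0.4, 0.6)$; both marginals are arbitrary illustrative choices. It picks $S_2$ greedily, builds the joint distribution of (33), and shows that the gap $P(S_2) - P(S_1)$ and the distance of $\lambda_2(\tilde{P}_{X U^n})$ from 1 both vanish as $n$ grows; the mass-moving and perturbation steps (34)-(43) are only needed for the formal proof, not for this numerical check.

```python
import numpy as np
from itertools import product

def lam2(P):
    px, py = P.sum(axis=1), P.sum(axis=0)
    T = np.diag(px ** -0.5) @ P @ np.diag(py ** -0.5)
    return np.linalg.svd(T, compute_uv=False)[1]

p_X = np.array([0.3, 0.7])      # marginal of X; S1 = {x = 0}, so P(S1) = 0.3
p_U = np.array([0.4, 0.6])      # single-letter marginal of U
P_S1 = p_X[0]

for n in range(1, 13):
    # Atom probabilities of U^n and a greedy choice of S2 with P(S2) >= P(S1).
    atoms = np.array([np.prod(c) for c in product(p_U, repeat=n)])
    order = np.argsort(atoms)[::-1]
    in_S2 = np.zeros(len(atoms), dtype=bool)
    mass = 0.0
    for idx in order:
        if mass >= P_S1:
            break
        in_S2[idx] = True
        mass += atoms[idx]
    P_S2 = mass

    # Block-scaled joint distribution of (32)-(33): zero mass on S1 x (complement of S2),
    # with marginals p_X and the i.i.d. product marginal of U^n by construction.
    P = np.outer(p_X, atoms)
    P[0, in_S2] /= P_S2
    P[0, ~in_S2] = 0.0
    P[1, in_S2] *= (P_S2 - P_S1) / (P_S2 * (1 - P_S1))
    P[1, ~in_S2] /= (1 - P_S1)

    print(n, round(P_S2 - P_S1, 6), round(lam2(P), 6))   # gap -> 0 and lambda_2 -> 1
```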
REFERENCES

[1] T. M. Cover, A. El Gamal, and M. Salehi, "Multiple access channel with arbitrarily correlated sources," IEEE Trans. Inform. Theory, vol. 26, pp. 648-657, Nov. 1980.
[2] G. Dueck, "A note on the multiple access channel with correlated sources," IEEE Trans. Inform. Theory, vol. 27, pp. 232-235, Mar. 1981.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley and Sons, 1991.
[4] H. S. Witsenhausen, "On sequences of pairs of dependent random variables," SIAM Journal on Applied Mathematics, vol. 28, pp. 100-113, Jan. 1975.
[5] K. Marton, "The structure of isomorphisms of discrete memoryless correlated sources," Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 56(3), pp. 317-327, 1981.
[6] A. Berman and R. J. Plemmons, Nonnegative Matrices in the Mathematical Sciences. Academic Press, 1979.
[7] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press, 1985.
[8] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis. Cambridge University Press, 1991.
[9] G. W. Stewart, "On the early history of the singular value decomposition," SIAM Review, vol. 35, pp. 551-566, Dec. 1993.