Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper extends the recent analysis of optial zero-delay jaing proble, specialized to scalar sources over scalar additive noise channels, to vector spaces. Particularly, we find the saddle point solution to the jaing proble in vector spaces, within its ost general, non-gaussian setting; the proble has been open even for the iportant special case of Gaussian source and channel. Siilar to the scalar setting, linearity conditions for encoding and decoding appings play a pivotal role in jaing, in the sense that the optial jaing strategy is to effectively force both transitter and receiver to default to linear appings, i.e., the jaer ensures, whenever possible, that the transitter and receiver cannot benefit fro non-linear strategies. The optial strategy is then randoized linear encoding for transitter and generating independent noise for the jaing. Moreover, the optial jaer allocates power according to the well-known waterfilling over the eigenvalues of the channel noise, and the density of the jaing noise that render the optial encoding and decoding appings linear, is deterined by the power constraints and the source density. I. INTRODUCTION We consider the proble of optial jaing, by a power constrained agent, over additive noise channel, in vector spaces. This proble was solved in [1], [2] for scalar Gaussian source and scalar channel. The saddle point solution to this zero-su gae, derived for Gaussian sourcechannel pair, involves randoized linear apping for the transitter and generating independent, Gaussian noise as the jaer output and a linear decoder. We recently extended this work to non-gaussian scalar sources and scalar channels [3], and showed that the linearity of encoding and decoding appings is essential while Gaussianity of the jaer is erely to satisfy the linearity conditions for the Gaussian source and channel. In this paper, by leveraging the recent results on conditions for linearity of optial estiation and counication appings, [4], [5], we extend the scalar analysis to vector sources and channels. The contributions of this paper are: We derive the necessary and sufficient condition (called the vector atching condition ) on the jaing noise density to ensure linearity of the optial transitter and the receiver, within the vector setting. The condition is uch ore involved than the scalar one, due to dependencies aong the source and channel coponents. The jaer ais to render both encoder and decoder linear appings, while allocates jaing power over This work is supported in part by the NSF under grants CCF-1016861 and CCF-1118075. Erah Akyol and Kenneth Rose are with Departent of Electrical and Coputer Engineering, University of California at Santa Barbara, CA 93106, USA {eakyol, rose}@ece.ucsb.edu Transitter Fig. 1. Y Jaer + N U The jaing proble. Receiver the channel eigenvalues by the well known waterfilling solution. We discover an analogy between zero-delay vector jaing and vector rate-distortion and vector channel capacity, in ters of power/rate/jaing power allocation as waterfilling. This paper is organized as follows. In Section II, we present the proble definition and preliinaries. In Section III, we review the prior results related to jaing, estiation and counication probles. In Section IV, we derive the linearity result and then the ain result on optial jaing. Finally, we discuss the future directions in Section V. II. PROBLEM DEFINITION Let R and R + denote the respective sets of real nubers and positive real nubers. Let E( ), P( ) and denote the expectation, probability and convolution operators, respectively. Let Bern(p) denote the Bernoulli rando variable, taking values in { 1, 1} with probability {p, 1 p}. The Gaussian density with ean µ and variance σ 2 is denoted as df (x) dx N (µ, σ 2 ). Let f (x) denote the first order derivative of the function f( ). All logariths in the paper are natural logariths and ay in general be coplex, and the integrals are, in general, Lebesgue integrals. Let us define the set S as the set of Borel easurable R R k appings. In general, lowercase letters (e.g., c) denote scalars, boldface lowercase (e.g., x) vectors, uppercase (e.g., C, ) atrices and rando variables. I denotes the identity atrix. R, and R denote the auto-covariance of and cross covariance of and respectively. A T denotes the transpose of atrix A. (x) + denotes the function ax(0,x). denotes the gradient and x denotes the partial gradient with respect to x. Throughout this paper, we assue that the source is an - diensional vector with zero ean and covariance R. The ˆ 978-1-4799-3410-2/13/$31.00 2013 IEEE 1329
channel noise is additive k-diensional Gaussian, of zero ean and covariance R N. Covariance atrices R and R N allow the diagonalization R Q Λ Q T, and R N Q N Λ N Q T N (1) where Q Q T Q N Q T N I and Λ and Λ N are diagonal atrices, having ordered eigenvalues as entries, i.e., Λ diag{λ } and Λ N diag{λ N } where λ and λ are ordered (descending) eigenvalues and Q T and QT N are the KLT atrices of the source and the channel, respectively. We will ake use of the following auxiliary lea, see eg. [6] for a proof. Lea 1. Let λ and λ N be two ordered vectors in R + with descending entries (λ (1) λ (2),...,λ ()) and Π denote any perutation of the indices {1, 2,...,}, then and in Π λ (Π(i))λ N (i) λ (i)λ N ( i) (2) ax Π λ ((i))λ N (Π(i)) λ (i)λ N (i) (3) We consider the general counication syste whose block diagra is shown in Figure 1. Source R is apped into Y R k by function g T ( ) Sand transitted over an additive noise channel. The adversary receives the sae signal and generates the jaing signal through function g A ( ) Swhich is added to the channel output, and ais to coproise the transission of the source. The received signal U Y + + N is apped by the decoder to an estiate ˆ via function h( ) S. The zero ean noise N is assued to be independent of the source. The source density is denoted f ( ) and the noise density is f N ( ) with characteristic functions F (ω) and F N (ω), respectively. The overall cost, easured as the ean squared error (MSE) between source and its estiate at the decoder ˆ, is a function of the transitter, jaer and the receiver appings: J(g T ( ), g A ( ), h( )) E{ ˆ 2 }. (4) Transitter g T ( ) :R R k and receiver h( ) :R R k seek to iniize this cost while the adversary (jaer) seeks to axiize it by appropriate choice of g A ( ) :R R k. Power constraints ust be satisfied by the transitter and jaer E{ g T () 2 } P T, (5) E{ g A () 2 } P A. (6) The conflict of interest underlying this proble iplies that the optial transitter-receiver-adversarial policy is the saddle point solution (g T ( ),g A ( ),h ( )) satisfying the set of inequalities J(g T, g A, h ) J(g T, g A, h ) J(g T, g A, h). (7) 1330 + Fig. 2. U Estiator The estiation proble. III. PRIOR WORK The jaing proble, in the for defined here, was studied in [1], [2], for scalar Gaussian sources and channels. The proble of interest is intrinsically connected to the fundaental probles of estiation theory and the theory of zero-delay source-channel coding. In particular, conditions for linearity of optial estiation [4] and optial appings in counications [5] are relevant to our proble here. We start with the estiation proble. A. Estiation Proble Consider the setting in Figure 2. The estiator receives U, the noisy version of the source and generates the estiate ˆ by the function h : R R such that MSE, E{( ˆ) 2 } is iniized. It is well known that, when a Gaussian source is containated with Gaussian noise, a linear estiator iniizes MSE. Recent work [4] analyzed, ore generally, the conditions for linearity of optial estiators. Here, we present the basic result pertaining to the jaing proble considered here. Specifically, we present the necessary and sufficient condition for source and channel distributions such that the linear estiator h(u) κ κ+1 ˆ U is optial where κ σ2 σ 2 is the SNR. Theore 1 ( [4]). Given SNR level κ, and noise with characteristic function F (ω), there exists a source for which the optial estiator is linear if and only if F (ω) F κ (ω). (8) Given a valid characteristic function F (ω), and for soe κ R +, the function F κ (ω) ay or ay not be a valid characteristic function, which deterines the existence of a atching source. For exaple, atching is guaranteed for integer κ and it is also guaranteed for infinitely divisible. More coprehensive discussion of the conditions on κ and F (ω) for F κ (ω) to be a valid characteristic function can be found in [4]. Extension of the conditions to the vector case is nontrivial due to the dependencies across coponents of the source and noise. Forally, we consider the proble of estiating the vector source R given the observation Y +, where and R are independent, as shown in Figure 2. Let Q be the eigenatrix of R R 1, and U Q 1 and let eigenvalues λ 1,...,λ be the eleents of the diagonal atrix Λ, i.e., the following holds: R R 1 U 1 ΛU (9)
Transitter Fig. 3. Y + U The counication setting. Receiver We are looking for the conditions on F (ω) and F (ω) such that h(y ) KY with K R (R + R ) 1 iniizes the estiation error E{ h(y ) 2 }. By following a siilar approach to the scalar case, the necessary and sufficient condition for linearity was derived in [4], reproduced below: Theore 2 ( [4]). Let the characteristic functions of the transfored source and noise (U and U) be F U (ω) and F U (ω). The necessary and sufficient condition for linearity of optial estiation is: log F U (ω) B. Counication Proble λ i log F U (ω), 1 i (10) In [5], a counication scenario whose block diagra is shown in Figure 3 was studied. In this setting, a scalar source R is apped into Y R by function g S, and transitted over an additive noise channel. The channel output U Y + is apped by the decoder to the estiate ˆ via function h S. The zero ean noise is assued to be independent of the source. The source density is denoted f ( ) and the noise density is f ( ) with characteristic functions F (ω) and F (ω), respectively. The objective is to iniize, over the choice of encoder g and decoder h, the distortion D E{( ˆ) 2 }, (11) subject to the average transission power constraint, E{g 2 ()} P T. (12) The necessary and sufficient condition for linearity for both appings is given by the following theore. Theore 3 ( [5]). For a given power constraint P T, noise with variance σ 2 and characteristic function F (ω), source with variance σ 2 and characteristic function F (ω), the optial encoder and decoder appings are linear if and only if F (αω) F κ (ω) (13) where κ P T σ 2 and α PT. σ 2 Here, we first extend this result to vector spaces, in Section III-b. Towards, deriving this extension, we need the optial encoding and decoding transfors for a general counication proble with source and channel noise covariances R and R N and total encoding power liit P T, here we ˆ reproduce the classical result due to [7] (see also [8], [9] for alternative derivations of this result.). Theore 4 ( [7] [9]). The encoding transfor that iniizes the MSE distortion subject to the power constraint P T is C Q N ΣQ T (14) where Σ is diagonal power allocation atrix. Moreover the total distortion is w ( 2 λ (i)λ N ( i) D P T + w + λ (i) (15) λ N ( i) w+1 where w is the nuber of active channels deterined by the power P T. Reark 1. Distortion expression (15) has an interesting interpretation of power allocation as reverse water-filling over the source eigenvalues. As we will show in the next section, the optial jaer also perfors power allocation as water-filling over the channel eigenvalues. Reark 2. Note that the ordering of the eigenvalues are so that the largest source eigenvalue is ultiplied with the sallest of the and so on, which physically eans that the encoder uses the best channel for the sallest variance source coponent, and so on. This is siply due to Lea 1. Assuption 1: In this paper, we assue the source and the channel have atched diensions, i.e., k (which eans bandwidth copression or expansion). This assuption is essential in the sense that when k the jaer cannot ensure linearity of g() and h(y ). A well known exaple is when 2and k 1, the optial appings are highly nonlinear, even in the case of Gaussian source and channel (see eg. [5], [10]and there references there in for details). Assuption 2: Throughout this paper, we assue that P T is high enough, so that all the channels are active, (each channel is allocated strictly positive power), i.e., w, hence (15) can be rewritten as J(λ, λ N ) ( λ (i)λ N ( i) P T + λ N (i) 2 (16) This assuption is not necessary but significantly siplifies the results. C. Gaussian Scalar Jaing Proble The proble of transitting independent and identically distributed Gaussian rando variables over a Gaussian channel in the presence of an additive jaer was considered in [1], [2]. In [2] a gae theoretic approach was developed and it was shown that the proble adits a ixed saddle point solution where the optial transitter and receiver eploy a randoized strategy. The randoization inforation can 1331
be sent over a side channel between transitter and receiver or it could be viewed as the inforation generated by the third party and observed by both transitter and receiver 1. Surprisingly, the optial jaing strategy ignores the input to the jaer and erely generates Gaussian noise, independent of the source. Theore 5 ( [1], [2]). The optial encoding function for the transitter is randoized linear apping: Y (i) γ(i)α T (i), (17) where γ(i) is i.i.d. Bernoulli ( 1 2 ) over the alphabet { 1, 1} and α T PT σ 2 Gaussian output (i) γ(i) Bern( 1 2 ) (18). The optial jaer generates i.i.d. (i) N(0,P A ) (19) where (i) is independent of the source (i). The proof of Theore 5 relies on the well known fact that for a Gaussian source over a Gaussian channel, zerodelay linear appings achieve the perforance of the asyptotically high delay optial source-channel counication syste [11]. This fact is unique to the Gaussian sourcechannel pair, hence it ight be tepting to conclude that the saddle point solution in Theore 5 can only be obtained in the all Gaussian setting. Perhaps surprisingly, in [3], we showed that there are infinitely any source-noise pairs that yield a saddle point solution siilar to Theore 5. We also proved that the linearity property of the optial transitter and receiver at the saddle point solution still holds, while the Gaussianity of the jaer output in the early special case was erely a eans to satisfy this linearity condition, and does not hold in general. Here, we reproduce the ain result in [3] for the scalar, non-gaussian jaing proble. Theore 6 ( [3]). For the jaing proble, the optial encoding function for the transitter is randoized linear apping: Y (i) γ(i)α T (i), (20) where γ(i) is i.i.d. Bernoulli ( 1 2 ) over the alphabet { 1, 1} and α T PT σ 2 γ(i) Bern( 1 2 ) (21). The optial jaing function is to generate i.i.d. output (i) with characteristic function F (ω) F β (α T ω) F N (ω) (22) where (i) is independent of the adversarial input (i) and β P A+σ 2 N P T. 1 In practice, randoization can be achieved by (pseudo)rando nuber generators at the transitter and receiver using the sae seed. IV. MAIN RESULTS A. An Upper Bound on Distortion Based on Linear Mappings In this section, we present a new lea that is used to upper bound the distortion of any zero-delay counication syste by that of the fixed, best linear encoder and decoder. A scalar version of this lea appeared in [3], here we extend this result to vector spaces. Lea 2. Consider the proble setting in Figure 1. For any given jaer, the distortion achievable by the transitterreceiver, D, is upper bounded by the distortion achieved by linear encoder and decoder 2 λ (i) ax(θ, λ N ( i)) J u P T + ax(θ, λ N (i)) (23) where θ satisfies the water-filling condition: (θ λ N (i)) + P A. (24) Note that this upper bound is deterined by only second order statistics, regardless of the noralized densities. Proof: The proof is based on the fact that the transitter and receiver can always use the linear appings that satisfy the power constraints. The jaer will try to ake all effective noise eigenvalues identical since in Π J(λ (Π), λ N ) (25) is axiized when λ N is unifor (see eg. [12] for a proof based on ajorization). The effective channel eigenvalues are upper bounded by the vector σ λ N +λ where λ is the ascending ordered eigenvalues of the jaer noise. To achieve this upper bound, the jaer sets R Q N Λ Q T N, i.e., the eigenvectors of the noise and the jaer ust be identical. If P A is not large enough, then the solution of is known to be the waterfilling solution ; intuitively the jaer ais to ake entries of σ as close to unifor as possible. Reark 3. Note that the optial jaer perfors waterfilling over the channel eigenvalues while the encoder allocates power according to reverse water-filling over the source eigenvalues. This observation parallels the inforation theoretical (asyptotically high delay) water-filling duality, where the rate distortion optial vector encoding schee allocates total rate by reverse water-filling over source eigenvalues and vector channel capacity achieving schee allocates power over channel eigenvalues by waterfilling. Reark 4. Lea 2 is the key result that connects the recent results on linearity of optial estiation and counication appings to the jaing proble. Lea 2 iplies that the optial strategy for a jaer which can 1332
only control the density of the additive noise channel, is to force the transitter and receiver to use linear appings. B. Conditions for Linearity of Counication Mappings in Vector Spaces For source R and channel R, we derive the necessary and sufficient condition for siultaneous linearity of optial encoder and decoder. In [5], it was shown that the counication proble is convex in the density of the channel input for the scalar setting. It is easy to extend this convexity result to atched diensions, see Assuption 1. Moreover, by siply extending the related scalar result in [3], it is straightforward to show that the optial decoder is linear if and only if the optial decoder is linear. Hence, we will investigate the conditions for linearity of optial decoder given that the encoder is linear. Towards deriving our results, we will ake use of the following auxiliary lea fro atrix analysis. Lea 3. Given a function f : R n R, atrix A R n and vector x R xf(ax) A T f(ax) (26) See Appendix II. Next, we present the necessary and sufficient condition for linearity of optial decoder for a given, optial linear decoder. Theore 7. Let the characteristic functions of the transfored source and noise (ΣQ T and QT ) be F ΣQ T (ω) and F Q T (ω). The necessary and sufficient condition for linearity of optial estiation is: log F ΣQ T (ω) where S ΣΛ ΣΛ 1. C. Main Result S i log F QT (ω), 1 i (27) Our ain result concerns the optial strategy for the transitter, the adversary and the receiver in Figure 1 for the transission index i. Theore 8. For the jaing proble, the optial encoding function for the transitter is: Y (i) γ(i)c(i), (28) where γ(i) is i.i.d. Bernoulli ( 1 2 ) over the alphabet { 1, 1} γ(i) Bern( 1 2 ) (29) and C Q N ΣQ T. The optial jaing function is to generate i.i.d. output (i), independent of (i), that satisfies: log F ΣQ T (ω) log F QT S i for S ΣΛ ΣΛ 1 and N (N+) (ω), 1 i (30) R Q N λ Q T N (31) where λ (i) (θ λ N (i)) + (32) and θ satisfies the water-filling condition: (θ λ N (i)) + P A (33) The optial receiver is h(u(i)) R C T (CR C T + R N + R ) 1 U(i), (34) and total cost is 2 λ (i) ax(θ, λ N ( i)) J P T + ax(θ, λ N (i)) (35) Moreover, this saddle point solution is (alost surely) unique. Proof: The proof follows fro verification of the saddle point inequalities given in (7), following the approach in [13], and is erely a siple extension of the scalar result in [3]. The essential idea of the proof is that the jaer renders the channel noise to atch the source in a way that the optial encoding and decoding appings are linear. The condition for such atching is given in Theore 7. Moreover, R shares the sae eigenvectors as R N as explained in the proof of Lea 2. V. DISCUSSION In this paper, we studied the vector jaing proble. Siilar to scalar setting, linearity conditions for encoding and decoding appings play the key role in jaing, in the sense that the optial jaing strategy is to effectively force both transitter and receiver to default to linear appings. Hence, the optial strategy is randoized linear encoding for transitter and generating independent noise for the jaing. The eigenvalues of the optial jaing noise are allocated according to water-filling over the eigenvalues of the channel noise, and the density of the jaing noise is atched to source and the channel to render the appings linear. We derived the atching condition to be satisfied by the jaer and also the second order statistics of the jaing noise. The power allocation solutions in the zero-delay probles ( water-filling for jaer and reverse water-filling for the transitter) nicely parallels the resource allocation strategies in asyptotically high delay (Shannon type) probles, such as rate allocation in rate-distortion (reverse water-filling) and power allocation in channel capacity (water-filling). Throughout this paper, we assued that a atching jaing noise, that forces the transitter and receiver to use linear appings, can always be generated by the jaer. The case for which this source-channel-jaer atching is not possible was analyzed for scalar sources and channels in our recent work [3]. Intuitively, the jaer approxiates the atching solution in soe polynoials which are orthogonal 1333
under the easure of the channel output. The analysis for vector settings can be carried out siilarly, and is oitted since it is orthogonal to the focus of the current paper. APPENDI I DERIVATION OF THE MATCHING CONDITION Let us rewrite the MSE optial decoder, h(y) E{ y} using Bayes rule and independence of and : xf (x)f (y Cx) dx h(y) (36) f (x)f (y Cx) dx Plugging h(y) Ky in (36) we obtain, Ky f (x)f (y Cx) dx xf (x)f (y Cx) dx Taking the Fourier transfor of both sides (37) jk ω [F (Cω)F (ω)] jc 1 [ ω F (Cω)]F (ω) (38) and rearranging ters, we get C 1 K 1 F (Cω) F (Cω) K 1 F (ω) F (ω) Using log F (Cω) 1 F (Cω) F (Cω), (39) log F (Cω) (C 1 K) 1 K log F (ω) (40) Note that (see eg. [14]) K R C T (CR C T + R ) 1 (41) Let use define β (C 1 K) 1 K then, β (C 1 R C T (CR C T +R ) 1 ) 1 It is easy to see that and hence Plugging (14) in (45), we have and hence R C T (CR C T +R ) 1 (42) β 1 R (CR C T ) 1 (43) β (CR C T )R 1 (44) β Q (ΣΛ ΣΛ 1 )QT (45) log F (Cω) Q (ΣΛ ΣΛ 1 )QT log F (ω) (46) defining S (ΣΛ ΣΛ 1 ), we have Q T ω log F (Cω) SQ T log F (ω) (47) Let us define ω Q T ω, hence ω Q ω. Plugging this in (47), we have Q T Q ω log F (CQ ω) SQ T Q ω log F (Q ω) (48) Using Lea 3, we can rewrite (48) as ω log F (C T Q ω) S ω log F (Q ω) (49) noting that C T Q Q Σ and the characteristic functions of the source and noise after transforation can be written in ters of the known characteristic functions F (ω) and F (ω), specifically F ΣQ T (ω) F (ΣQ T ω) and F Q T (ω) F (Q ω), we have ω log F ΣQ T ( ω) S ω log F Q T ( ω) (50) Using the fact that S is diagonal, we convert (50) to the set of scalar differential equations of (30). Converse can be shown by retracing the steps in the derivation of the necessity. Note that none of these steps, (36)-(47), introduce any loss of generality, hence retracing back fro (47) to (36), we show that if (47) is satisfied, the optial decoder is linear given the encoder is linear. APPENDI II PROOF OF LEMMA 3 By the chain rule we have, f(ax) n f(ax) [Ax] k (51) x i [Ax] k [x] i k1 n f(ax) ([A] k x) [Ax] k [x] i (52) n f(ax) [A] ki [Ax] k (53) n k f(ax)[a] ki (54) k1 k1 k1 [A] i T f(ax) (55) It follows fro (55) that x f(ax) A T f(ax). REFERENCES [1] T. Başar, The Gaussian test channel with an intelligent jaer, IEEE Transactions on Inforation Theory, vol. 29, no. 1, pp. 152 157, 1983. [2] T. Başar and Y.W. Wu, A coplete characterization of iniax and axiin encoder-decoder policies for counication channels with incoplete statistical description, IEEE Transactions on Inforation Theory,, vol. 31, no. 4, pp. 482 489, 1985. [3] E. Akyol, K. Rose, and T. Başar, On optial jaing over an additive noise channel, draft, available online at: http://arxiv.org/abs/1303.3049. [4] E. Akyol, K. Viswanatha, and K. Rose, On conditions for linearity of optial estiation, IEEE Transactions on Inforation Theory, vol. 58, no. 6, pp. 3497 3508, 2012. [5] E. Akyol, K. Viswanatha, K. Rose, and Rastad T., On zero-delay source-channel coding, IEEE Transactions on Inforation Theory, in review, available online at: http://arxiv.org/abs/1302.3660. [6] R.A. Horn and C.R. Johnson, Matrix Analysis, Cabridge University Press, 1985. [7] K.H. Lee and D. Petersen, Optial linear coding for vector channels, IEEE Transactions on Counications, vol. 24, no. 12, pp. 1283 1290, 1976. [8] T. Başar, B. Sankur, and H. Abut, Perforance bounds and optial linear coding for discrete-tie ultichannel counication systes (corresp.), IEEE Transactions on Inforation Theory, vol. 26, no. 2, pp. 212 217, 1980. [9] E. Akyol and K. Rose, On linear transfors in zero-delay gaussian source channel coding, in IEEE International Syposiu on Inforation Theory Proceedings (ISIT),. IEEE, 2012, pp. 1543 1547. [10] F. Hekland, P.A. Floor, and T.A. Rastad, Shannon-Kotelnikov appings in joint source-channel coding, IEEE Transactions on Counications, vol. 57, no. 1, pp. 94 105, 2009. 1334
[11] T. M Cover and J. A. Thoas, Eleents of inforation theory, Wileyinterscience, 2006. [12] AW Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Applications, Acadeic Press, New York, 1979. [13] T. Başar and Y.W. Wu, Solutions to a class of iniax decision probles arising in counication systes, Journal of optiization theory and applications, vol. 51, no. 3, pp. 375 404, 1986. [14] T. Kailath, A. Sayed, and B. Hassibi, Linear Estiation, Prentice Hall Upper Saddle River, NJ, 2000. 1335