Generalized Writing on Dirty Paper

Aaron S. Cohen (acohen@mit.edu)
MIT, 36-689, 77 Massachusetts Ave., Cambridge, MA 02139-4307

Amos Lapidoth (lapidoth@isi.ee.ethz.ch)
ETF E107, ETH-Zentrum, CH-8092 Zürich, Switzerland

(Submitted for presentation at ISIT 2002. Kindly direct all correspondence to A. Cohen.)

Abstract

We consider a generalization of Costa's "writing on dirty paper" model in which a power-limited communicator encounters two independent sources of additive noise, one of which is known non-causally to the encoder. We seek to characterize when the capacity of this channel would be unchanged if the source of noise known to the encoder were also known to the decoder; we call this property private-public equivalence (PPE). Costa showed that this model has PPE if both sources of noise are IID Gaussian. We show that the model has PPE as long as the unknown noise is Gaussian (but not necessarily IID), for any distribution on the known noise. We also conjecture that, for a general class of coding strategies and for IID noise sequences (with some additional assumptions), if the model has PPE then the unknown noise must be Gaussian. This result relies on the Darmois-Skitovič Theorem, which states that two linear combinations of independent random variables can themselves be independent only if the variables entering both combinations are Gaussian.

1 Introduction

Costa's "writing on dirty paper" [1] is a communication model in which there are two independent sources of additive white Gaussian noise, one of them known non-causally to the encoder. Costa showed that a power-constrained encoder can reliably transmit at all rates less than

$\frac{1}{2} \log\Bigl(1 + \frac{P}{N}\Bigr)$ bits/symbol,

where P is the power constraint on the encoder and N is the variance of the unknown noise. Note the surprising fact that this capacity does not depend on the variance of the known noise: the achievable rates would not change if this noise were absent, or if it were also known at the decoder and could simply be subtracted off.

We generalize Costa's model by allowing different distributions on the two noise sources. We show that the known noise does not affect the capacity when the unknown noise is Gaussian (possibly non-white), for any distribution on the known noise; a similar result has been shown in [2], see below for discussion. We also conjecture that, for a particular coding strategy known as distortion-compensated quantization index modulation [3] and for independent, identically distributed (IID) noise sequences (with some additional assumptions), this is the most general condition under which the result holds (even allowing for input constraints other than power).
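To make Costa's formula concrete, here is a minimal numerical sketch (ours, not part of the paper; the function name and the sample values P = 1, N = 0.5 are illustrative only). Note that the known-noise power, denoted Q below, never enters the computation; that absence is the PPE property in miniature:

```python
import math

def costa_capacity(P: float, N: float) -> float:
    """Costa's rate (1/2) * log2(1 + P/N) in bits per symbol.

    The variance Q of the *known* noise S is deliberately absent:
    the capacity does not depend on it.
    """
    return 0.5 * math.log2(1.0 + P / N)

# The achievable rate is unchanged no matter how strong the known noise is.
for Q in (0.0, 1.0, 100.0):  # Q plays no role in the formula
    print(f"Q = {Q:6.1f}  ->  C = {costa_capacity(1.0, 0.5):.4f} bits/symbol")
```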

[Figure 1: Generalized writing on dirty paper. The message M enters the encoder, which observes S and produces the input X; the decoder observes Y and outputs the estimate M̂. S and Z are independent noise sequences; S is known non-causally to the encoder.]

Other extensions of writing on dirty paper have been given. First, in [4] and [5], it was shown that the known noise does not affect the capacity as long as both noise sources are Gaussian, but not necessarily white. Second, in [2], Erez, Shamai and Zamir extended this result to the case where the known noise is an arbitrary deterministic sequence. This latter result is similar to our first main result; we nevertheless present ours here, since the proof is significantly different.

Writing on dirty paper and its extensions have gained interest in recent years due to their similarity to watermarking, or information embedding; see, e.g., [6, 3]. In these problems, an encoder wishes to transmit information by modifying a known data sequence (e.g., audio or video). The encoder does not wish to modify the data too much, hence a constraint similar to the power constraint above is placed on the encoder. Also as above, the decoder must recover the information in the presence of additional noise.

The writing on dirty paper result has also been applied to Gaussian broadcast channels; see, e.g., [7, 8]. For example, a broadcaster to two receivers can use its knowledge of the signal it wishes to send to the first receiver when designing the signal it will send to the other receiver. There is no loss of capacity (by the writing on dirty paper result), and each receiver only has to decode its own message. This contrasts with the usual superposition coding, in which one receiver must decode both messages. Thus, it is of interest to study the conditions under which Costa's result holds.

The remainder of this paper is organized as follows. In Section 2, we give a detailed description of our generalized writing on dirty paper model and state our main results. In Sections 3 and 4, we sketch the proofs of our main results.

2 Model and Main Results

We now describe the model, called generalized writing on dirty paper (GWDP), that we will use throughout the paper. The model is illustrated in Figure 1.

Channel: The output of the channel is given by

$\mathbf{Y} = \mathbf{X} + \mathbf{S} + \mathbf{Z}, \qquad (1)$

where X is the input to the channel and S and Z are independent noise sequences with distributions $P_{\mathbf{S}}$ and $P_{\mathbf{Z}}$, respectively. The noise sequence S is available to the encoder, while the noise sequence Z is not. We use bold to signify a vector of length n, upper case to signify random variables or vectors, and lower case to signify their realizations. All of the random variables and vectors we discuss take value in ℝ and ℝⁿ, respectively. The distributions $P_{\mathbf{S}}$ and $P_{\mathbf{Z}}$ are specified for each n.
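As a concrete instance of the channel law (1), the following sketch (ours; the Laplace choice for S and all numerical values are arbitrary illustrations) draws one blocklength-n realization of the model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                  # blocklength
N = 0.5                                     # variance of the unknown noise

S = rng.laplace(scale=1.0, size=n)          # known noise: any distribution
Z = rng.normal(scale=np.sqrt(N), size=n)    # unknown noise: IID Gaussian here
X = rng.normal(scale=1.0, size=n)           # placeholder input; a real encoder
                                            # would pick X from S and message M
Y = X + S + Z                               # channel law (1)
```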

Encoder: For a given blocklength n, the encoder produces the input sequence X as a function of the entire known noise sequence S and an independent message M, which is uniformly distributed over the set $\{1, \ldots, 2^{nR}\}$. Here, R is the rate of the system. The input sequence must satisfy

$\frac{1}{n} \sum_{i=1}^{n} d(X_i) \le D \quad \text{a.s.}, \qquad (2)$

where d(·) is a non-negative function and "a.s." stands for almost surely, i.e., with probability one.

Decoder: The decoder observes the output Y and estimates the message with M̂. We measure the resulting probability of error $P_e^{(n)} = \Pr(\hat{M} \ne M)$ by averaging over all sources of randomness.

Capacity: A rate R is achievable if there exists a sequence of rate-R encoders and associated decoders such that the probability of error $P_e^{(n)}$ tends to zero as the blocklength n tends to infinity. The capacity is the supremum of all achievable rates.

Private-Public Equivalence (PPE): As in watermarking [6], we refer to the situation described above as the public version, since the noise sequence S is known only at the encoder. We also consider a private version, in which the sequence S is known to the decoder as well. We say that GWDP has private-public equivalence (PPE) if the capacities of the two versions coincide. The capacity of the private version is the same as the capacity of the additive noise channel Y = X + Z, where X must satisfy (2).

Costa's result is that GWDP has PPE if both S and Z are IID Gaussian and d(x) = x². We now characterize a more general situation in which this is true.

Theorem 1. If Z is ergodic and Gaussian and d(x) = x², then GWDP has PPE, for any ergodic distribution on S.

This theorem is proved in Section 3. A significantly different proof was found independently and concurrently in [2].

We next conjecture that, for a general class of coding schemes and IID noise sequences, if GWDP has PPE then the unknown noise must be Gaussian. The class of coding schemes we are interested in is distortion-compensated quantization index modulation (DC-QIM); this name was introduced in [3], but the same coding strategy was used to achieve capacity originally by Costa [1] and also for watermarking with malicious attackers [6]. In DC-QIM, the input sequence X is formed as a linear combination of the known noise sequence S and a codeword U, which depends on S and the message M.

Conjecture 1. Let S and Z be IID and let S_i and Z_i satisfy the conditions under which Conjecture 2 is true (including that S_i is not deterministic). If DC-QIM can be used to achieve PPE for GWDP (with finite capacity for both versions), then Z is IID Gaussian and d(x) ∝ x².

The proof of this conjecture is given in Section 4; it is complete except for one sub-conjecture (Conjecture 2), for which we have not yet established the conditions of validity. Note that if S were deterministic, then GWDP would trivially have PPE. Note further that the assumption that both versions have finite capacity rules out any z with Pr(Z_i = z) > 0. For simplicity, we further assume that each Z_i has a density and that the differential entropy h(Z_i) exists. We hope to complete the proof of this result and to extend it so that it does not depend on the coding strategy.

3 Gaussian Unknown Noise is Sufficient

We take the positive part of Conjecture 1 as given and use it for a quick proof of Theorem 1. The positive part of Conjecture 1 says that if S and Z are IID with Z Gaussian, then GWDP has PPE. If Z is not IID, but still Gaussian, then we can diagonalize the problem and reduce it to a set of parallel scalar channels whose unknown noise components are IID Gaussian; see, e.g., [9]. If S is not IID, but still ergodic, then we can interleave so that S is IID on each sub-channel. Thus, it suffices to prove the positive part of Conjecture 1, which we do below in Section 4 (the only part whose proof is incomplete is the converse, i.e., that Z must be Gaussian).

4 Gaussian Unknown Noise is Necessary for DC-QIM

In this section, we first show that if Z is IID Gaussian, S is IID, and d(x) = x², then GWDP has PPE. We then argue that if Z and S are IID and GWDP has PPE under DC-QIM, then Z must be Gaussian. We begin by characterizing when GWDP has PPE.

Lemma 1. If GWDP has PPE for $P_{\mathbf{S}} = (P_S)^n$ and $P_{\mathbf{Z}} = (P_Z)^n$, then there exists a conditional distribution $P_{U,X|S}$ such that

$I(U; X + S + Z) - I(U; S) = \max_{P_{X'}:\ \mathrm{E}[d(X')] \le D} I(X'; X' + Z) \qquad (3)$

and E[d(X)] ≤ D, where the mutual informations are evaluated with respect to the joint distribution $P_{S,U,X,Z} = P_S \, P_{U,X|S} \, P_Z$.

Proof. The right-hand side (RHS) of (3) is the capacity of the private version of GWDP; all smaller rates are achievable by subtracting off S at the encoder and decoder. For finite alphabets and no distortion constraint, it was shown in [10] that a rate R is achievable if and only if there exists such a conditional distribution with the left-hand side (LHS) of (3) greater than R. In [11], this result was extended to include a distortion constraint. The proofs can be further extended so that they do not depend on the finiteness of the alphabets.

The conditional distribution $P_{U,X|S}$ describes the coding strategy that achieves all rates below the LHS of (3): the marginal distribution $P_U$ is used to create the codewords, and the conditional distribution $P_{X|U,S}$ is used to form the input sequence x from the codeword u and the known noise sequence s. For DC-QIM, the conditional distribution $P_{X|U,S}(\cdot \mid u, s)$ places a unit mass at u − αs for some constant α; this corresponds to the DC-QIM algorithm, in which the input sequence is a linear combination of the codeword and the known noise sequence.

We first find general conditions on $P_{U,X|S}$ under which (3) is met with equality, and only later restrict attention to DC-QIM. That the LHS of (3) is at most the RHS can be seen through the following steps:

$I(U; X+S+Z) - I(U; S)$
$\quad \le I(U; X+S+Z, S) - I(U; S) \qquad (4)$
$\quad = I(U; X+S+Z \mid S)$
$\quad \le I(X; X+S+Z \mid S) \qquad (5)$
$\quad = h(X+S+Z \mid S) - h(X+S+Z \mid X, S)$
$\quad = h(X+Z \mid S) - h(X+Z \mid X) \qquad (6)$
$\quad \le h(X+Z) - h(X+Z \mid X) \qquad (7)$
$\quad = I(X; X+Z)$
$\quad \le \max_{P_{X'}:\ \mathrm{E}[d(X')] \le D} I(X'; X'+Z). \qquad (8)$

Here, (5) follows from the data processing inequality, since $U \to (X, S) \to X+S+Z$ forms a Markov chain; the mutual informations may be expanded as differences of differential entropies because we are assuming that Z has a density; (6) follows since $h(X+S+Z \mid X, S) = h(Z \mid X, S) = h(Z) = h(Z \mid X) = h(X+Z \mid X)$; and (7) follows since conditioning reduces entropy.
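Before turning to the conditions for equality, here is a quick Monte Carlo sanity check (ours, not the paper's) of the RHS of (3) in the quadratic Gaussian case, where the maximum equals ½ log(1 + D/N) and is attained by a Gaussian input. For a jointly Gaussian pair the mutual information is determined by the correlation coefficient, so a sample estimate suffices:

```python
import numpy as np

rng = np.random.default_rng(1)
D, N, n = 1.0, 0.5, 200_000

X = rng.normal(scale=np.sqrt(D), size=n)
Z = rng.normal(scale=np.sqrt(N), size=n)

# For a jointly Gaussian pair, I(X; X+Z) = -(1/2) log2(1 - rho^2), so an
# empirical correlation coefficient yields a consistent estimate of the MI.
rho = np.corrcoef(X, X + Z)[0, 1]
mi_hat = -0.5 * np.log2(1.0 - rho**2)
mi_closed_form = 0.5 * np.log2(1.0 + D / N)
print(f"Monte Carlo: {mi_hat:.4f} bits   closed form: {mi_closed_form:.4f} bits")
```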

The conditions for equality in (4), (5), (7) and (8) are, respectively:

A. The random variables $U \to X+S+Z \to S$ form a Markov chain.

B. The random variables $X \to (S, U) \to X+S+Z$ form a Markov chain.

C. The random variables S and X+Z are independent.

D. The random variable X has the maximizing distribution on the RHS of (8).

The above conditions are necessary for GWDP to have PPE with IID noise sequences. Let us now assume that Z is Gaussian (with mean zero and variance N) and that d(x) = x², and exhibit a conditional distribution $P_{U,X|S}$ satisfying all four conditions. Let X be Gaussian (with mean zero and variance D) and independent of S, and let U = X + αS, where α = D/(D+N). We note that X − α(X+Z) and X+Z are independent, since they are jointly Gaussian and uncorrelated. Condition A is satisfied since

$I(U; S \mid X+S+Z)$
$\quad = h(X + \alpha S \mid X+S+Z) - h(X + \alpha S \mid X+S+Z, S)$
$\quad = h\bigl(X - \alpha(X+Z) \mid X+S+Z\bigr) - h\bigl(X - \alpha(X+Z) \mid X+S+Z, S\bigr)$
$\quad = h\bigl(X - \alpha(X+Z)\bigr) - h\bigl(X - \alpha(X+Z)\bigr) = 0,$

where the final equality follows since X − α(X+Z) is independent of both S and X+Z. Condition B is satisfied since X is a function of S and U. Condition C is satisfied since S is independent of both X and Z. Condition D is satisfied since a Gaussian distribution is capacity achieving on a power-limited additive white Gaussian noise channel. Thus, if Z is IID Gaussian, S is IID, and d(x) = x², then GWDP has PPE.

We now argue that for conditions A through D to hold with U = X + αS for some α (the constraint imposed by DC-QIM), the unknown noise Z must be Gaussian. To do so, we shall use the following three technical claims, the first of which can be found in [12]¹ and the last of which is the only unproven part of the argument.

Lemma 2 (Darmois-Skitovič Theorem). If X_1, ..., X_n are independent real random variables and if, for constants (a_1, ..., a_n) and (b_1, ..., b_n), $\sum_{i=1}^n a_i X_i$ is independent of $\sum_{i=1}^n b_i X_i$, then every X_i with $a_i b_i \ne 0$ has a Gaussian distribution.

Lemma 3. For real random variables X_1, X_2 and X_3, if (X_1, X_2) and X_3 are independent and X_1 and X_2 + X_3 are independent, then X_1 and X_2 are independent.

Proof. We use the joint characteristic function, which we denote $g_X(r) = \mathrm{E}[e^{j r^\top X}]$. Recall that X_1 and X_2 are independent if and only if their joint characteristic function factors, i.e., iff $g_{X_1,X_2}(r_1, r_2) = g_{X_1}(r_1)\, g_{X_2}(r_2)$. By our assumption that X_1 and X_2 + X_3 are independent, $g_{X_1,X_2,X_3}(r_1, r, r) = g_{X_1}(r_1)\, g_{X_2,X_3}(r, r)$ for any (r_1, r). Furthermore, since X_2 and X_3 are independent, $g_{X_2,X_3}(r, r) = g_{X_2}(r)\, g_{X_3}(r)$. By our assumption that (X_1, X_2) and X_3 are independent, $g_{X_1,X_2,X_3}(r_1, r_2, r_3) = g_{X_1,X_2}(r_1, r_2)\, g_{X_3}(r_3)$ for any (r_1, r_2, r_3), and in particular for r_2 = r_3 = r. Comparing these two factorizations of the joint characteristic function of X_1, X_2 and X_3, we see that $g_{X_1,X_2}(r_1, r_2) = g_{X_1}(r_1)\, g_{X_2}(r_2)$, and thus X_1 and X_2 are independent.

¹ Thanks to Randy Berry for pointing out this result.
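The phenomenon behind Lemma 2 is easy to observe numerically. In the following sketch (ours, purely illustrative), X1 + X2 and X1 - X2 are uncorrelated by construction, yet a simple fourth-moment probe exposes their dependence unless the inputs are Gaussian:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

def probe(X1, X2, label):
    # A and B are uncorrelated whenever X1 and X2 have equal variances,
    # but by Darmois-Skitovic they are *independent* only in the Gaussian
    # case; correlating their squares exposes any residual dependence.
    A, B = X1 + X2, X1 - X2
    print(f"{label}: cov(A,B) = {np.cov(A, B)[0, 1]:+.4f}, "
          f"cov(A^2,B^2) = {np.cov(A**2, B**2)[0, 1]:+.4f}")

probe(rng.uniform(-1, 1, n), rng.uniform(-1, 1, n), "uniform ")
probe(rng.normal(size=n), rng.normal(size=n), "gaussian")
```

For the uniform inputs the second covariance is visibly nonzero (about -0.27 in expectation), while for Gaussian inputs both covariances vanish, as the theorem predicts.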

Conjecture 2. Under some natural conditions on the real random variables X_1, X_2 and X_3, if (X_1, X_2) and X_3 are independent and $X_1 \to X_2 + X_3 \to X_2$ forms a Markov chain, then X_1 and X_2 are independent.

Remark: We shall invoke this result with X_1 = (1−α)X − αZ, X_2 = X + Z and X_3 = S. Clearly, if X_3 is deterministic or, more generally, if X_2 can be determined from X_2 + X_3, then the statement fails. However, if X_3 takes value on the entire real line, then we believe the statement to be true. We are looking for the smallest set of assumptions under which this conjecture holds.

We now show that Gaussian unknown noise is necessary in the scenario under consideration. First, since (X, S) and Z are independent (by definition) and S and X+Z are independent (condition C), Lemma 3 shows that X and S must be independent. Second, condition A is equivalent to the random variables $(1-\alpha)X - \alpha Z \to X+S+Z \to X+Z$ forming a Markov chain for some α (to see this, subtract α(X+S+Z) from the left-most random variable in condition A and X+S+Z from the right-most). By Conjecture 2, these two facts imply that X+Z and (1−α)X − αZ must be independent for some α. Note that α = 0 cannot give equality in (3), since I(X; X+S+Z) < I(X; X+Z) for non-deterministic S independent of (X, Z). Thus, (1−α)X − αZ and X+Z must be independent for some α ≠ 0, and by Lemma 2 (here Z enters the two linear combinations with nonzero coefficients −α and 1), Z must be Gaussian in order for GWDP to have PPE. Furthermore, as long as α ≠ 1, the input X must be Gaussian as well, and the only constraint under which a Gaussian input is optimal for an additive Gaussian noise channel is (up to a constant) d(x) = x². Thus, assuming Conjecture 2, we have shown that Conjecture 1 is true.

References

[1] M. H. M. Costa, "Writing on dirty paper," IEEE Trans. Inform. Theory, vol. 29, pp. 439-441, May 1983.
[2] U. Erez, S. Shamai, and R. Zamir, "Capacity and lattice-strategies for cancelling known interference," in Proc. Cornell Summer Workshop on Inform. Theory, Aug. 2000.
[3] B. Chen and G. W. Wornell, "Quantization index modulation: A class of provably good methods for digital watermarking and information embedding," IEEE Trans. Inform. Theory, vol. 47, pp. 1423-1443, May 2001.
[4] B. Chen, Design and Analysis of Digital Watermarking, Information Embedding, and Data Hiding Systems. PhD thesis, MIT, Cambridge, MA, 2000.
[5] W. Yu, A. Sutivong, D. Julian, T. M. Cover, and M. Chiang, "Writing on colored paper," in Proc. ISIT, Washington, DC, 2001.
[6] A. S. Cohen and A. Lapidoth, "The Gaussian watermarking game," to appear in IEEE Trans. Inform. Theory.
[7] G. Caire and S. Shamai, "On achievable rates in a multi-access Gaussian broadcast channel," in Proc. ISIT, Washington, DC, p. 147, 2001.
[8] W. Yu and J. M. Cioffi, "Trellis precoding for the broadcast channel," in Proc. GlobeCom, 2001.
[9] W. Hirt and J. L. Massey, "Capacity of the discrete-time Gaussian channel with intersymbol interference," IEEE Trans. Inform. Theory, vol. 34, no. 3, pp. 380-388, 1988.
[10] S. I. Gel'fand and M. S. Pinsker, "Coding for channel with random parameters," Problems of Control and Inform. Theory, vol. 9, no. 1, pp. 19-31, 1980.
[11] R. J. Barron, B. Chen, and G. W. Wornell, "The duality between information embedding and source coding with side information and some applications," in Proc. ISIT, Washington, DC, 2001.
[12] R. M. Dudley, Uniform Central Limit Theorems. Cambridge University Press, 1999.