STRONG CONVERSE FOR GEL'FAND-PINSKER CHANNEL

Pierre Moulin

Beckman Inst., Coord. Sci. Lab and ECE Department, University of Illinois at Urbana-Champaign, USA

ABSTRACT

A strong converse for the Gel'fand-Pinsker channel is established in this paper. The method is then extended to a multiuser scenario: a strong converse is established for the multiple-access Gel'fand-Pinsker channel under the maximum error criterion, and the capacity region is determined.

1. INTRODUCTION

The Gel'fand-Pinsker (GP) channel [1] and its variants have attracted considerable interest in the information theory literature. Applications include coding in the presence of an interference known to the transmitter, and watermarking [2]. This paper derives a strong converse for the GP channel. In addition to strengthening the classical weak converse [1], the derivation provides new insights into the problem, in particular into the role of the auxiliary random variable. These insights are particularly useful for the multiple-access version of the GP channel, for which an outer rate region can be derived using a weak converse based on Fano's inequality, but that region does not coincide with the achievable region identified in [3]. We prove that in fact the maximum error probability tends to 1 for any rate pair outside the region of [3], thereby determining the capacity region of the multiple-access GP channel. The proof does not require the wringing methods of [4] that were used to prove the strong converse for the multiple-access channel without side information (SI) under the average error criterion. Also note that (i) according to Ahlswede [5, 6], the maximum error criterion is more natural in multiuser communications than the average error criterion, because the latter guarantees a small error probability only if the users choose their messages with uniform probability; and (ii) capacity regions under the maximum error criterion are generally smaller than capacity regions under the average error criterion [6]. These capacity regions coincide if stochastic encoders are allowed [7, pp. 284-285]. This holds a fortiori if common randomness between encoders and receiver is allowed.

2. GEL'FAND-PINSKER CHANNEL

Consider the GP channel p(y|x,s) with input alphabet X, output alphabet Y, and channel state S ∈ S distributed according to a pmf p_S [1]. A message M drawn uniformly from M_N ≜ {1, ..., 2^{NR}} is to be sent over the channel using a length-N code. The channel state sequence S = (S_1, ..., S_N) ∈ S^N is iid p_S, independent of M, and available to the encoder. The transmitted sequence is denoted by f_N(M, S) ∈ X^N and the decoding rule by g_N(Y) ∈ M_N. Gel'fand and Pinsker established the capacity formula

C = max_{p_{XU|S}} [I(U;Y) − I(U;S)]    (1)

where U is an auxiliary random variable taking values in an alphabet U of cardinality |U| ≤ |S| |X| + 1, and U → (S,X) → Y forms a Markov chain. The maximum over p_{XU|S} is achieved by a deterministic conditional pmf: p_{X|US} = 1{X = f(U,S)} for some function f : U × S → X. The direct part of the theorem was proved using a random binning technique: an arbitrarily small ε > 0 is chosen; codewords u(l,m), 1 ≤ l ≤ 2^{N[I(U;S)+ε]}, 1 ≤ m ≤ 2^{NR}, are drawn iid from the marginal p_U associated with the capacity-achieving distribution p_{XU|S}; and a virtual memoryless channel p_{Y|U} is created from U to Y. The transmitted sequence is given by x_t = f(u_t(m,s), s_t) for t = 1, ..., N. Decoding of the codeword u(l,m) selected by the encoder succeeds with high probability if I(U;S) + R ≤ I(U;Y) − 2ε. The (weak) converse part of the theorem was proved using a telescoping formula.
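To make (1) concrete, the following minimal Python sketch estimates the GP capacity of a toy channel by brute force. Everything in it (binary alphabets, |U| = 2 rather than the full cardinality bound |S||X|+1, the additive-state channel, and the crude random search over p(u|s)) is an illustrative assumption, not part of the paper, so the printed value is only a search-based lower bound on (1).

```python
# Numerically evaluate the Gel'fand-Pinsker formula (1) for a toy binary
# channel (all parameters here are illustrative assumptions). We use the fact,
# quoted above, that the maximum is attained with X = f(U,S) deterministic:
# enumerate every f: U x S -> X and randomly search over p(u|s).
import itertools
import numpy as np

nS = nU = nX = nY = 2                       # alphabet sizes (assumed)
pS = np.array([0.5, 0.5])                   # state pmf (assumed)
eps = 0.1                                   # crossover noise (assumed)
# pYgXS[y, x, s]: the state adds to the input mod 2, then a BSC(eps) acts.
pYgXS = np.zeros((nY, nX, nS))
for x, s in itertools.product(range(nX), range(nS)):
    clean = (x + s) % 2
    pYgXS[clean, x, s] = 1 - eps
    pYgXS[1 - clean, x, s] = eps

def mutual_info(pAB):
    """I(A;B) in bits from a joint pmf pAB[a, b]."""
    pA = pAB.sum(1, keepdims=True)
    pB = pAB.sum(0, keepdims=True)
    mask = pAB > 0
    return float((pAB[mask] * np.log2(pAB[mask] / (pA @ pB)[mask])).sum())

def gp_rate(pUgS, f):
    """I(U;Y) - I(U;S) induced by p(u|s) and the encoder x = f(u, s)."""
    pUS = pUgS * pS                         # joint pmf pUS[u, s]
    pUY = np.zeros((nU, nY))
    for u, s in itertools.product(range(nU), range(nS)):
        pUY[u] += pUS[u, s] * pYgXS[:, f[u, s], s]
    return mutual_info(pUY) - mutual_info(pUS)

rng = np.random.default_rng(0)
best = 0.0
for f_flat in itertools.product(range(nX), repeat=nU * nS):  # all encoders f
    f = np.array(f_flat).reshape(nU, nS)
    for _ in range(2000):                   # crude random search over p(u|s)
        pUgS = rng.dirichlet(np.ones(nU), size=nS).T         # pUgS[u, s]
        best = max(best, gp_rate(pUgS, f))
print(f"capacity estimate (lower bound by search): {best:.3f} bits")
# Here the encoder can cancel the state (f(u,s) = u + s mod 2), so the
# search should approach 1 - h(eps) ~ 0.531 bits.
```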
In that derivation, U_T = (M, S_{T+1}^N, Y^{T−1}, T) (where T is a time-sharing random variable), which does not admit an obvious coding interpretation, since Y is not available at the encoder. In this paper we establish a strong converse. The notion of a virtual channel p_{Y|U} appears clearly in our derivation, and the construction of U does not involve feedback from the decoder. The source and channel coding aspects of the problem, which arise from the tension between providing the decoder with information about S and about M, respectively, are also apparent in the derivation. Our main result is stated below.

Theorem 2.1. Assume that H(S) > 0 and min_{y,x,s} p_{Y|XS}(y|x,s) > 0. For any sequence of length-N codes (f_N, g_N) with rate R > C, the average error probability P_e(f_N, g_N) for the GP channel tends to 1 as N → ∞.

Extended sketch of the proof. Assume without loss of generality (wlog) that min_s p_S(s) ≥ ε₁ and min_{y,x,s} p_{Y|XS}(y|x,s) ≥ ε₁, for some arbitrarily small ε₁ > 0.

ɛ, for some arbitrarily small ɛ>0. Step. Assume wlog that the codewords are given by the following two-step procedure: a Markov chain. The distribution of Y given m, s is the product pmf p Y US (y u(m, s), s). Denote by D m the decoding region for message m, i.e., An alphabet U of arbitrarily large cardinality, a function f : U S X, and a codebook with codewords u(m, s) V are defined; Each channel input symbol is obtained as x t = f(u t (m, s),s t ), t. Since this construction contains the choice U = X, u(m, s) x(m, s), f(x, s) =x as a special case, there is no loss of generality in making the above assumption. Moreover, it can be shown that capacity is not reduced if, instead of S, a slightly degraded version of S is available to the encoder (output of an R/D code with vanishing Hamming distortion for each m M ) and so we may restrict our attention to codes that satisfy property (P) below. Such codes may be thought of as including an elementary amount of binning, since for each m, many (albeit not necessarily exponentially many) sequences s map to the same codeword u(m, s). Let d H (s, s ) denote Hamming distance between two sequences s and s and Ω the set of pairs (m, s) such that u(m, s) =u(m, s ), s : d H (s, s ). (2) In other words, given m, arbitrarily changing any one sample s t of the sequence s does not change the value of the codeword u(m, s). Denote by Σ(m) {s : (m, s) Ω} the sections of Ω along the m direction. Also denote by p s the type of the sequence s (and empirical pmf over S), by T ɛ = {s : max s S p s (s) p S (s) ɛ} the strong ɛ-typical set, and by Σ ɛ (m) Σ(m) T ɛ its intersection with Σ(m). (P). For each m M, the set Σ ɛ (m) has probability P S (Σ ɛ(m)) o(). (3) Step 2. Define the random variables U t = u t (M,S) U and Q t = {S j,j t} S (note Q t is independent of S t ) for t. The equivalence relation s =(s t,q t ) holds for each t. Let T be a time-sharing random variable uniformly distributed over {, 2,, } and independent of all other random variables. Let S = S T, Q = Q T, X = X T, U = U T, and Y = Y T. For each t, the joint pmf of (S,M,U t,x t,y t ) is given by p(s,m,u t,x t,y t ) = p S (s) p M(m) {u t = u(m, s)} {x t = f(u t,s t )} p Y XS (y t x t,s t ). y D m g (y) =m, m M. The decoding regions form a partition of Y. The probability of correct decoding of message m M is given by P c (f,g,m)=pr[g (Y) =m] = p S (s) p Y US (y u(m, s), s). (4) y D m s S The average probability of correct decoding is given by P c (f,g )=2 R P c (f,g,m). m M Step 3. For each m M and s S, denote by λ = λ(m, s) P [] U S the conditional type of u(m, s) given s (empirical conditional pmf, implicitly dependent on (m, s)). The quadruple (p S,λ,f,p Y XS ) induces a joint pmf on S U X Y: λ SUXY (s, u, x, y) =p S(s) λ(u s) {X =f(v,s)} p Y XS (y x, s). Thus U (X, S) Y forms a Markov chain for each λ, f. We denote by λ Y,λ S U, etc. the various marginals and conditional marginals associated with λ SUXY and therefore induced by (λ, f). Consider the conditional mutual informations I λ (U; S) = s,u I λ,f (U; Y ) = s,u,y p S (s) λ(u s) log λ S U(s u) p S (s) p S (s) λ YU S (y, u s) log λ Y U (y u) λ Y (y) which will be viewed as functions of λ and f. Also define the empirical conditional self-informations Î λ (U; S) =α(m, s) Î λ (U; Y )= ˆβ(m, s, y) Ĭ λ,f (U; Y )=β(m, s) log λ S U (s t u t (m, s)), (5) p S (s t ) log λ Y U(y t u t (m, s)), λ Y (y t ) p Y US (y t u t (m, s),s t ) y t Y log λ Y U(y t u t (m, s)) λ Y (y t ) = E Y M,S [ ˆβ(m, s, Y)]. 
(6) Hence the joint pmf of (T,S,Q,U,X,Y) is p T p S p Q p U SQT {X = f(v,s)} p Y XS. ote that V (S, X) Y forms These quantities do not coincide with I λ (U; S) and I λ,f (U; Y ) because the type of s does not coincide with p S in general.
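To see what (5) measures, here is a small Python sketch on synthetic data (the correlated codeword rule, alphabet sizes, and pmfs are illustrative assumptions, not the paper's code construction). It computes the conditional type λ, the empirical α(m,s) of (5), and the induced I_λ(U;S), and shows that the two agree when the type of s is close to p_S, which is what the bound (7) below formalizes.

```python
# Sketch of Step 3's empirical quantities: the conditional type lambda(u|s),
# the self-information alpha(m,s) of (5), and I_lambda(U;S) computed from
# (p_S, lambda). The codeword/state pair below is synthetic (u depends on s
# through a toy rule), not the paper's construction.
import numpy as np

rng = np.random.default_rng(1)
N, nS, nU = 100_000, 2, 3
pS = np.array([0.4, 0.6])
s = rng.choice(nS, size=N, p=pS)                    # iid state sequence
u = (s + rng.choice(2, size=N, p=[0.8, 0.2])) % nU  # stand-in codeword u(m,s)

counts = np.zeros((nU, nS))
np.add.at(counts, (u, s), 1.0)
lam = counts / counts.sum(0)                        # conditional type lambda(u|s)
type_s = counts.sum(0) / N                          # type p_s of the sequence s

# lambda_{S|U} induced by the *true* pS, as in the pmf lambda_{SUXY} of Step 3.
joint = lam * pS                                    # joint[u, s] = pS(s) lam(u|s)
lam_SgU = joint / joint.sum(1, keepdims=True)       # lam_SgU[u, s]

# Empirical alpha(m,s) per (5) vs. the distributional I_lambda(U;S).
alpha = np.mean(np.log2(lam_SgU[u, s] / pS[s]))
mask = joint > 0
I_lam = float((joint[mask] * np.log2((lam_SgU / pS)[mask])).sum())
print(f"alpha(m,s) = {alpha:.4f} bits, I_lambda(U;S) = {I_lam:.4f} bits")
print(f"gap {abs(alpha - I_lam):.4f} is small since max|p_s - p_S| = "
      f"{np.abs(type_s - pS).max():.4f}")
```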

However, for strongly typical s ∈ T_ε we have

|Î_λ(U;S) − I_λ(U;S)| ≤ ε log|S|,  |Ĭ_{λ,f}(U;Y) − I_{λ,f}(U;Y)| ≤ ε log|Y|.    (7)

Also, the following inequality holds for all (m,s) ∈ Ω and s' differing from s in position t only:

max_{s_t ∈ S} α(m,s) − min_{s_t ∈ S} α(m,s) ≤ (2/N) log(2/ε₁).    (8)

Step 4. Define the following subsets of Y^N, indexed by (m,s):

B_ε(m,s) ≜ {y : β̂(m,s,y) ≤ β(m,s) + ε}.    (9)

For all m ∈ M_N, s ∈ S^N, the probability

Pr[Y ∉ B_ε(m,s) | M = m, S = s] ≤ (log(2/ε₁))² / (N ε²)    (10)

vanishes as N → ∞. This follows from Chebyshev's inequality and the fact that Y_t, 1 ≤ t ≤ N, are conditionally independent given (m,s).

Step 5. The probability of correct decoding for m may be upper-bounded as

P_c(f_N, g_N, m) ≤ Pr[Y ∉ B_ε(m,S) | M = m] + Pr[S ∉ Σ_ε(m)] + P̃_c(f_N, g_N, m)    (11)

where the first two terms on the right side were upper-bounded in (10) and (3), respectively, and

P̃_c(f_N, g_N, m) ≜ Pr[g_N(Y) = m, S ∈ Σ_ε(m), Y ∈ B_ε(m,S) | M = m]
 = Σ_{s ∈ Σ_ε(m)} p_S(s) Σ_{y ∈ D_m ∩ B_ε(m,s)} p_{Y|US}(y | u(m,s), s).    (12)

Define the disjoint events

E(m,λ) ≜ {S ∈ Σ_ε(m), λ(m,S) = λ, Y ∈ D_m ∩ B_ε(m,S)}

for all m ∈ M_N and λ ∈ P^{[N]}_{U|S}, and write (12) as

P̃_c(f_N, g_N, m)
 = Σ_λ Σ_{s ∈ S^N} Σ_{y ∈ Y^N} p_S(s) p_{Y|US}(y | u(m,s), s) 1{E(m,λ)}
 (a)= Σ_λ Σ_s Σ_y 2^{−Nα(m,s)} λ_{S|U}(s | u(m,s)) p_{Y|US}(y | u(m,s), s) 1{E(m,λ)}
 = Σ_λ Σ_y Σ_s 2^{−Nα(m,s)} ∏_{t=1}^N λ_{S|U}(s_t | u_t(m,s)) p_{Y|US}(y_t | u_t(m,s), s_t) 1{E(m,λ)}
 (b)≤ (16/ε₁⁴) Σ_λ Σ_y Σ_s 2^{−Nα(m,s)} ∏_{t=1}^N w(s_t | y_t) λ_{Y|U}(y_t | u_t(m,s)) 1{E(m,λ)}
 (c)≤ (16/ε₁⁴) Σ_λ Σ_y Σ_s 2^{N[β(m,s) − α(m,s) + ε]} w^N(s|y) λ^N_Y(y) 1{E(m,λ)}
 (d)≤ (16/ε₁⁴) Σ_λ Σ_y Σ_s 2^{N[I_{λ,f}(U;Y) − I_λ(U;S) + ε']} w^N(s|y) λ^N_Y(y) 1{E(m,λ)}    (13)

where ε' ≜ ε log(2|S||Y|), w^N(s|y) ≜ ∏_t w(s_t|y_t), and λ^N_Y(y) ≜ ∏_t λ_Y(y_t). Equality (a) follows from (5). In (b) the conditional pmf w(s|y) is arbitrary; there we have used property (P1), which implies that given any m ∈ M_N, s ∈ Σ(m), and 1 ≤ t ≤ N, u_t(m,s) is independent of s_t. Similarly, B_ε(m,s) is independent of s_t; and by (8), α(m,s) is almost independent of s_t. Inequality (c) follows from (9) and (6), and inequality (d) from (7). Averaging (13) over m ∈ M_N, we obtain

2^{−NR} Σ_{m ∈ M_N} P̃_c(f_N, g_N, m)
 ≤ (16/ε₁⁴) 2^{−NR} Σ_{m,λ} Σ_s Σ_y 2^{N[I_{λ,f}(U;Y) − I_λ(U;S) + ε']} w^N(s|y) λ^N_Y(y) 1{E(m,λ)}
 ≤ (16/ε₁⁴) sup_U max_{λ,f} 2^{−N[R − I_{λ,f}(U;Y) + I_λ(U;S) − ε']} Σ_s Σ_y Σ_{m,λ} w^N(s|y) λ^N_Y(y) 1{E(m,λ)}
 (a)≤ (16/ε₁⁴) 2^{−N[R − sup_U max_{λ,f}(I_{λ,f}(U;Y) − I_λ(U;S)) − ε']} = (16/ε₁⁴) 2^{−N[R − C − ε']}    (14)

where (a) holds because the events E(m,λ) are disjoint.

Step 6. Combining (11) with (3), (10), and (14) yields

P_c(f_N, g_N) ≤ o(1) + (log(2/ε₁))² / (N ε²) + (16/ε₁⁴) 2^{−N(R − C − ε')}.    (15)

Hence P_c(f_N, g_N) vanishes for every sequence of codes (f_N, g_N) of rate R > C + ε'. Since this inequality holds for arbitrarily small ε > 0, we conclude that P_c(f_N, g_N) vanishes for all R > C, i.e., the average error probability tends to 1. This concludes the proof.
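The concentration step (9)-(10) can be checked numerically. In the Python sketch below, the virtual channel, codeword, and parameters are synthetic assumptions, and for simplicity Y_t is drawn from λ_{Y|U}(·|u_t) itself, so that the mean of β̂ equals the β-type average computed in the code; the empirical probability of falling outside B_ε(m,s) is already negligible at moderate N, consistent with the 1/(Nε²) Chebyshev rate.

```python
# Numerical sketch of Step 4 (eqs. (9)-(10)): beta_hat(m,s,y) is an average of
# N independent per-letter terms, so Chebyshev gives concentration around its
# mean at rate O(1/(N eps^2)). Channel and codeword are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(2)
N, TRIALS, EPS = 2000, 200, 0.05
nU, nY = 3, 2
lam_YgU = rng.dirichlet(np.ones(nY) * 5, size=nU)   # virtual channel [u, y]
pU = np.ones(nU) / nU
lam_Y = pU @ lam_YgU                                # output marginal

u = rng.integers(nU, size=N)                        # stand-in codeword u(m,s)
llr = np.log2(lam_YgU / lam_Y)                      # per-letter log ratios
beta = np.mean((lam_YgU[u] * llr[u]).sum(axis=1))   # mean of beta_hat, cf. (6)

cdf = np.cumsum(lam_YgU[u], axis=1)                 # for vectorized sampling
exceed = 0
for _ in range(TRIALS):
    y = (rng.random((N, 1)) > cdf).sum(axis=1)      # Y_t ~ lam_{Y|U}(.|u_t)
    beta_hat = np.mean(llr[u, y])                   # cf. (5)
    exceed += beta_hat > beta + EPS                 # left B_eps(m,s)?
print(f"empirical Pr[Y outside B_eps(m,s)] = {exceed / TRIALS:.3f}")
```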

3. MULTIPLE-ACCESS GEL'FAND-PINSKER CHANNEL

Consider the multiple-access GP channel p(y|x_1,x_2,s) with alphabets S, X_1, X_2, Y and message sets {1, ..., 2^{NR_1}} and {1, ..., 2^{NR_2}}. The channel state sequence S ∈ S^N is iid p_S and is known to both encoders. The encoders transmit sequences f_{1N}(m_1, s) ∈ X_1^N and f_{2N}(m_2, s) ∈ X_2^N, respectively, where M_1 and M_2 are independent of S and are drawn uniformly and independently from their respective message sets. Given the channel output sequence y ∈ Y^N, the decoder outputs (m̂_1, m̂_2) = g_N(y).

Somekh-Baruch and Merhav [3] have shown that the following rate region R is achievable. For a pmf P of the form p_S p_T p_{X_1V_1|ST} p_{X_2V_2|ST} p_{Y|X_1X_2S}, let R(L, P) be the region of rate pairs (R_1, R_2) that satisfy

R_1 < I(V_1; Y | V_2, T) − I(V_1; S | V_2, T)
R_2 < I(V_2; Y | V_1, T) − I(V_2; S | V_1, T)
R_1 + R_2 < I(V_1, V_2; Y | T) − I(V_1, V_2; S | T)    (16)

where the alphabets for the auxiliary random variables V_1 and V_2 have cardinality L. Let R denote the closure of ∪_{L,P} R(L, P). Rate pairs in R(L, P) are achieved using a time-shared binning scheme with codeword arrays {u_1(l_1, m_1)} and {u_2(l_2, m_2)}; each transmitter selects its row index so that the corresponding codeword is jointly typical with s.

Attempts to find an outer rate region for this problem by deriving a weak converse (based on Fano's inequality and the telescoping formula) have met only partial success. We have derived a rate region of the form (16), but the maximization is over a larger set of distributions, with the distribution of (X_1, V_1, X_2, V_2) given (S, T) of the form p_{X_1|ST} p_{X_2|ST} p_{V_1V_2|X_1X_2ST} instead of p_{X_1V_1|ST} p_{X_2V_2|ST}. (See [9] for a related problem, and [8] for a similar mismatch in the case of SI causally available to the encoders.) Apparently the resulting outer region is strictly larger than the inner region R of (16). However, using a strong converse we have established the following result.

Theorem 3.1. Assume that H(S) > 0 and min_{y,x_1,x_2,s} p_{Y|X_1X_2S}(y|x_1,x_2,s) > 0. For any sequence of length-N codes with rate pair (R_1, R_2) ∉ R, the maximum error probability (over all pairs of messages (m_1, m_2)) tends to 1 as N → ∞. Furthermore, in the definition of R, it suffices to consider conditional pmfs p_{X_iV_i|ST} of the form p_{V_i|ST} 1{X_i = f_i(V_i, S)}, where f_i is a mapping from V_i × S to X_i, for each i = 1, 2.
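To make the region (16) concrete, the following Python sketch evaluates its three bounds for one randomly drawn pmf P of the required product form p_S p_T p_{X_1V_1|ST} p_{X_2V_2|ST} p_{Y|X_1X_2S}. All alphabet sizes and distributions are illustrative assumptions; the full region R is the closure over all such P and cardinalities L, which this sketch does not attempt to compute.

```python
# Evaluate the inner bound (16) for one fixed pmf of the required product
# form. All distributions below are random placeholders; negative values
# simply mean that this particular P contributes an empty region.
import itertools
import numpy as np

rng = np.random.default_rng(3)
nS = nT = nV1 = nX1 = nV2 = nX2 = nY = 2     # alphabet sizes (assumed)
pS = rng.dirichlet(np.ones(nS))
pT = rng.dirichlet(np.ones(nT))
pX1V1gST = rng.dirichlet(np.ones(nX1 * nV1), size=(nS, nT))  # [s,t,(x1,v1)]
pX2V2gST = rng.dirichlet(np.ones(nX2 * nV2), size=(nS, nT))  # [s,t,(x2,v2)]
pYgX1X2S = rng.dirichlet(np.ones(nY), size=(nX1, nX2, nS))   # [x1,x2,s,y]

# Full joint p(s,t,v1,x1,v2,x2,y) of the product form above.
axes = (nS, nT, nV1, nX1, nV2, nX2, nY)
p = np.zeros(axes)
for s, t, v1, x1, v2, x2, y in itertools.product(*map(range, axes)):
    p[s, t, v1, x1, v2, x2, y] = (pS[s] * pT[t]
        * pX1V1gST[s, t, x1 * nV1 + v1] * pX2V2gST[s, t, x2 * nV2 + v2]
        * pYgX1X2S[x1, x2, s, y])

def H(pm):
    """Entropy (bits) of a pmf given as any-shaped array."""
    pm = pm[pm > 0]
    return float(-(pm * np.log2(pm)).sum())

def cmi(p, a, b, c):
    """I(A;B|C) from the joint p; a, b, c are tuples of axis indices."""
    def marg(keep):
        drop = tuple(i for i in range(p.ndim) if i not in keep)
        return p.sum(axis=drop)
    return H(marg(a + c)) + H(marg(b + c)) - H(marg(a + b + c)) - H(marg(c))

AX = dict(S=0, T=1, V1=2, X1=3, V2=4, X2=5, Y=6)
R1 = (cmi(p, (AX['V1'],), (AX['Y'],), (AX['V2'], AX['T']))
      - cmi(p, (AX['V1'],), (AX['S'],), (AX['V2'], AX['T'])))
R2 = (cmi(p, (AX['V2'],), (AX['Y'],), (AX['V1'], AX['T']))
      - cmi(p, (AX['V2'],), (AX['S'],), (AX['V1'], AX['T'])))
Rsum = (cmi(p, (AX['V1'], AX['V2']), (AX['Y'],), (AX['T'],))
        - cmi(p, (AX['V1'], AX['V2']), (AX['S'],), (AX['T'],)))
print(f"R1 < {R1:.3f}, R2 < {R2:.3f}, R1+R2 < {Rsum:.3f}  (bits)")
```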
Sketch of the proof. The proof extends the methods of Sec. 2; however, our derivation does not make use of types (presumably wringing methods would have to be used to show that certain correlated types have low probability). Define the decoding regions D(m_1, m_2), which form a partition of Y^N.

Step 1. As in Step 1 of the proof of Theorem 2.1, assume wlog that the codewords are given by the following two-step procedure: (i) for i = 1, 2, an alphabet V_i of arbitrarily large cardinality, a function f_i : V_i × S → X_i, and a codebook with codewords v_i(m_i, s) ∈ V_i^N are defined; (ii) each channel input symbol is obtained as x_{it} = f_i(v_{it}(m_i, s), s_t), i = 1, 2, 1 ≤ t ≤ N. Observe that the channel from (v_1, v_2, s) to y is time-invariant and memoryless, with

p_{Y|V_1V_2S}(y | v_1, v_2, s) = p_{Y|X_1X_2S}(y | f_1(v_1, s), f_2(v_2, s), s).

Define the random variables V_{it} = v_{it}(M_i, S) ∈ V_i for i = 1, 2, and Q_t = {S_j, j ≠ t} ∈ S^{N−1} (again, Q_t is independent of S_t) for 1 ≤ t ≤ N. The equivalence s = (s_t, q_t) holds for each t. Let T be a time-sharing random variable uniformly distributed over {1, 2, ..., N} and independent of all other random variables. Let S = S_T, Q = Q_T, X_1 = X_{1T}, V_1 = V_{1T}, X_2 = X_{2T}, V_2 = V_{2T}, and Y = Y_T. For each t, the joint pmf of (S, M_1, M_2, V_{1t}, X_{1t}, V_{2t}, X_{2t}, Y_t) is given by

p(s, m_1, m_2, v_{1t}, x_{1t}, v_{2t}, x_{2t}, y_t) = p_S(s) p_{M_1}(m_1) p_{M_2}(m_2) (∏_{i=1,2} 1{v_{it} = v_{it}(m_i, s)} 1{x_{it} = f_i(v_{it}, s_t)}) p_{Y|X_1X_2S}(y_t | x_{1t}, x_{2t}, s_t).

Hence the joint pmf of (T, S, Q, V_1, V_2, X_1, X_2, Y) is p_T p_S p_Q p_{V_1|SQT} p_{V_2|SQT} 1{X_1 = f_1(V_1, S)} 1{X_2 = f_2(V_2, S)} p_{Y|X_1X_2S}. The probability of correct decoding for (m_1, m_2) is given by

P_c(f_{1N}, f_{2N}, g_N, m_1, m_2) = Σ_{s ∈ S^N} p_S(s) Σ_{y ∈ D(m_1, m_2)} p_{Y|V_1V_2S}(y | v_1(m_1, s), v_2(m_2, s), s).

Under the maximal error criterion we have

P_c(f_{1N}, f_{2N}, g_N, m_1, m_2) ≥ 1 − δ,  ∀ (m_1, m_2),

where δ is the maximum error probability. Denote by Ω the set of triples (m_1, m_2, s) such that

v_i(m_i, s) = v_i(m_i, s'), i = 1, 2,  ∀ s' : d_H(s, s') ≤ 1.

As in (2), for (m_1, m_2, s) ∈ Ω, arbitrarily changing any one sample s_t of the sequence s changes neither codeword v_1(m_1, s) nor v_2(m_2, s). Denote by Σ(m_1, m_2) ≜ {s : (m_1, m_2, s) ∈ Ω} and M(s) ≜ {(m_1, m_2) : (m_1, m_2, s) ∈ Ω} the sections of Ω along the (m_1, m_2) and s directions, respectively. Choose an arbitrarily small ε > 0. Similarly to (P1), without loss of optimality we restrict our attention to codes that satisfy the following property.

(P2) For each (m_1, m_2), the set Σ(m_1, m_2) has probability

P_S(Σ(m_1, m_2)) ≥ 1 − o(1).    (17)

Hence the sets

Σ_ε ≜ {s : |M(s)| ≥ (1 − ε) 2^{N(R_1 + R_2)}},  Σ_ε(m_1, m_2) ≜ Σ(m_1, m_2) ∩ Σ_ε

have probabilities P_S(·) ≥ 1 − o(1).

Step 2. Define three conditional self-informations. For the sum rate, let

α^{(3)}(m_1, m_2, s) ≜ (1/N) Σ_{t=1}^N Î(V_{1t}V_{2t}; S_t | Q_t = q_t),
Î(V_{1t}V_{2t}; S_t | Q_t = q_t) ≜ Σ_{s_t ∈ S} p_S(s_t) log [p_{S_t|V_{1t}V_{2t}Q_t}(s_t | v_{1t}(m_1, s), v_{2t}(m_2, s), q_t) / p_{S_t|Q_t}(s_t | q_t)],

and for j = 1, 2, define α^{(j)}(m_1, m_2, s) similarly, using the log ratios p_{S_t|V_{1t}V_{2t}Q_t} / p_{S_t|V_{jt}Q_t}. For any sequence s ∈ Σ(m_1, m_2), 1 ≤ t ≤ N, and j = 1, 2, 3, the quantity α^{(j)}(m_1, m_2, s) does not depend on s_t. We also define the three conditional self-informations

β^{(3)}(m_1, m_2, s) ≜ (1/N) Σ_{t=1}^N Î(V_{1t}V_{2t}; Y_t | Q_t = q_t),
Î(V_{1t}V_{2t}; Y_t | Q_t = q_t) ≜ Σ_{y_t ∈ Y} p_{Y|V_1V_2S}(y_t | v_{1t}(m_1, s), v_{2t}(m_2, s), s_t) log [p_{Y_t|V_{1t}V_{2t}Q_t}(y_t | v_{1t}(m_1, s), v_{2t}(m_2, s), q_t) / p_{Y_t|Q_t}(y_t | q_t)],

and for j = 1, 2, define β^{(j)}(m_1, m_2, s) similarly, using the log ratios p_{Y_t|V_{1t}V_{2t}Q_t} / p_{Y_t|V_{jt}Q_t}. Define

R_{1t}(q_t) ≜ I(V_{1t}; Y_t | V_{2t}, Q_t = q_t) − I(V_{1t}; S_t | V_{2t}, Q_t = q_t),
R_{2t}(q_t) ≜ I(V_{2t}; Y_t | V_{1t}, Q_t = q_t) − I(V_{2t}; S_t | V_{1t}, Q_t = q_t),
R_{3t}(q_t) ≜ I(V_{1t}V_{2t}; Y_t | Q_t = q_t) − I(V_{1t}V_{2t}; S_t | Q_t = q_t).

It may be shown that, for j = 1, 2, 3 and s ∈ Σ_ε,

(1 − ε) E[β^{(j)}(M_1, M_2, s) − α^{(j)}(M_1, M_2, s)] ≤ (1/N) Σ_{t=1}^N R_{jt}(q_t) + ε.    (18)

Similarly to (9), three high-probability subsets B^{(j)}_ε(m_1, m_2, s) of Y^N are defined, together with three conditional reference pmfs: r^{(1)}(y | v_2, s), r^{(2)}(y | v_1, s), and r^{(3)}(y | s) ≜ ∏_{t=1}^N p_{Y_t|Q_t}(y_t | q_t).

Step 3. Three upper bounds are evaluated for the probability of correct decoding of (m_1, m_2):

1 − δ ≤ P_c(f_{1N}, f_{2N}, g_N, m_1, m_2) ≤ δ_ε + P^{(j)}_c(f_{1N}, f_{2N}, g_N, m_1, m_2),  j = 1, 2, 3,

where δ_ε = o(1) collects the probabilities of the atypical events (as in (11)) and

P^{(j)}_c(f_{1N}, f_{2N}, g_N, m_1, m_2) ≜ Pr[g_N(Y) = (m_1, m_2), S ∈ Σ_ε(m_1, m_2), Y ∈ B^{(j)}_ε(m_1, m_2, S) | M_1 = m_1, M_2 = m_2].    (19)

Step 4. The case j = 3 yields an upper bound on the sum rate R_3 ≜ R_1 + R_2. Define the good event

E^{(3)}(m_1, m_2) ≜ {S ∈ Σ_ε(m_1, m_2), Y ∈ D(m_1, m_2) ∩ B^{(3)}_ε(m_1, m_2, S)}.

Analogously to (13), we derive

1 − δ − ε ≤ P^{(3)}_c(f_{1N}, f_{2N}, g_N, m_1, m_2) ≤ Σ_{s ∈ S^N} Σ_{y ∈ Y^N} 2^{N[β^{(3)}(m_1, m_2, s) − α^{(3)}(m_1, m_2, s) + ε]} p_S(s) r^{(3)}(y | s) 1{E^{(3)}(m_1, m_2)}.    (20)

Now, since the average value of β^{(3)}(m_1, m_2, s) − α^{(3)}(m_1, m_2, s) over (m_1, m_2) satisfies (18) for all s ∈ Σ_ε, there must exist a large set Γ_3(s) of pairs (m_1, m_2) for which

β^{(3)}(m_1, m_2, s) − α^{(3)}(m_1, m_2, s) ≤ E[· | s] + ε ≤ (1/N) Σ_{t=1}^N R_{3t}(q_t) + 2ε.

It is easily shown that |Γ_3(s)| ≥ ε₂ 2^{N R_3}, where ε₂ ≜ ε(1 − δ − ε)/(ε + log|Y|). Averaging (20) over (m_1, m_2) ∈ Γ_3(s), we obtain

R_3 ≤ (1/N) log(1/ε₂) + max_{{q_t}} (1/N) Σ_{t=1}^N R_{3t}(q_t) + 2ε.    (21)

In the cases j = 1 and j = 2, the decoder uses a helper who reveals one of the messages m_i and the corresponding codeword v_i(m_i, s) (but not s). This leads to the same inequality as (21), with R_j and R_{jt} in place of R_3 and R_{3t}, respectively.

Step 5. Let W ≜ (T, S, V_1, V_2, X_1, X_2, Y) and define

R_j(p_{WQ}) ≜ (1/N) Σ_{t=1}^N Σ_{q_t} p_{Q_t}(q_t) R_{jt}(q_t),  j = 1, 2, 3.

We obtain

(1/N) log [ε(1 − δ − ε)/(ε + log|Y|)] ≤ min_{j=1,2,3} [−R_j + max_{p_Q} R_j(p_{W|Q} p_Q)] + 2ε.    (22)

Taking ε → 0, we conclude that, for (22) to hold for all 0 ≤ δ < 1, there must exist a pmf P = p_{WQJ} of the form

p_S p_{QJT} p_{V_1|SQJT} p_{V_2|SQJT} 1{X_1 = f_1(V_1, S)} 1{X_2 = f_2(V_2, S)} p_{Y|X_1X_2S}

such that (16) holds with (Q, J, T) in place of T.
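A plausible route to the bound on |Γ_3(s)| is a Markov-inequality step; the following sketch assumes the range of Z ≜ β^{(3)}(M_1, M_2, s) − α^{(3)}(M_1, M_2, s) is at most log|Y| (an assumption consistent with the form of ε₂). Applying Markov's inequality to the nonnegative variable Z − min Z at threshold E[Z − min Z] + ε,

Pr[Z ≥ E[Z] + ε] ≤ (E[Z] − min Z) / (E[Z] − min Z + ε) ≤ log|Y| / (log|Y| + ε),

so a fraction at least ε/(ε + log|Y|) of the message pairs satisfies Z ≤ E[Z] + ε; the remaining factor (1 − δ − ε) in ε₂ would then account for restricting to pairs in M(s) that are decoded correctly.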
4. REFERENCES

[1] S. I. Gel'fand and M. S. Pinsker, "Coding for Channel with Random Parameters," Probl. Contr. Inform. Theory, Vol. 9, No. 1, pp. 19-31, 1980.
[2] G. Keshet, Y. Steinberg, and N. Merhav, "Channel Coding in the Presence of Side Information: Subject Review," Foundations and Trends in Communications and Information Theory, 2007.
[3] A. Somekh-Baruch and N. Merhav, "On the Random Coding Error Exponents of the Single-User and the Multiple-Access Gel'fand-Pinsker Channels," Proc. ISIT, p. 448, Chicago, IL, June-July 2004.
[4] R. Ahlswede, "An Elementary Proof of the Strong Converse Theorem for the Multiple-Access Channel," J. Combinatorics, Information and System Sciences, Vol. 7, No. 3, pp. 216-230, 1982.
[5] R. Ahlswede, "On Two-Way Communication Channels and a Problem by Zarankiewicz," 6th Prague Conf. on Information Theory, Statistical Decision Functions, and Random Processes, Prague, 1971.
[6] G. Dueck, "Maximal Error Capacity Regions Are Smaller than Average Error Capacity Regions for Multi-User Channels," Problems of Control and Information Theory, Vol. 7, No. 1, pp. 11-19, 1978.
[7] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, NY, 1981.
[8] S. Sigurjónsson and Y.-H. Kim, "On Multiple User Channels with State Information at the Transmitters," Proc. ISIT 2005.
[9] Y. Wang and P. Moulin, "Blind Fingerprinting," arXiv:0803.0265v1 [cs.IT], March 2008.