Research Collection. The relation between block length and reliability for a cascade of AWGN links. Conference Paper. ETH Library


Research Collection — Conference Paper
The relation between block length and reliability for a cascade of AWGN links
Author: Subramanian, Ramanan
Publication Date: 2012
Permanent Link: https://doiorg/0399/ethz-a-00705046
Rights / License: In Copyright – Non-Commercial Use Permitted
This page was generated automatically upon download from the ETH Zurich Research Collection. For more information, please consult the Terms of use.

International Zurich Seminar on Communications (IZS), February 29 – March 2, 2012

The Relation Between Block Length and Reliability for a Cascade of AWGN Links

Ramanan Subramanian
Institute for Telecommunications Research, University of South Australia, Mawson Lakes, SA 5095, Australia
Email: RamananSubramanian@unisa.edu.au
(This work has been supported by the Australian Research Council under the ARC Discovery Grant DP0986089.)

Abstract—This paper presents a fundamental information-theoretic scaling law for a heterogeneous cascade of links corrupted by Gaussian noise. For any set of fixed-rate encoding/decoding strategies applied at the links, it is claimed that a scaling of Θ(log n) in block length is both sufficient and necessary for reliable communication. A proof of this claim is provided using tools from the convergence theory for inhomogeneous Markov chains.

I. INTRODUCTION

Consider a line network in which a message originating at a certain node is conveyed to the intended destination through a series of hops. The reception at every hop is impaired by zero-mean additive white Gaussian noise (AWGN) of a certain variance. The originating node as well as the intermediate nodes are subject to an average transmit-power constraint P_0 for the duration of transmission of any message. The messages, drawn from a certain set of size M, are encoded and transmitted as codewords in ℝ^N that satisfy the transmit power constraint. Every intermediate hop decodes (possibly erroneously) the original message from its noisy observation and re-encodes it before transmitting to the next hop. One can form the row-stochastic probability transition matrix P_i for the i-th hop, in which each row gives the conditional probability distribution of the message decoded at hop i, given a certain message sent by the previous hop:

    P_i = [ p_i(1,1)  p_i(1,2)  …  p_i(1,M) ]
          [ p_i(2,1)  p_i(2,2)  …  p_i(2,M) ]
          [    ⋮          ⋮      ⋱      ⋮   ]
          [ p_i(M,1)  p_i(M,2)  …  p_i(M,M) ]

Assume that a network of the type outlined above consists of n hops. If the encoder-decoder pairs are identical at the hops, and if the noise variance at every hop is σ², then the matrices P_i will be identical to, say, P for all i. Then, the rows of P^n will give the conditional probability distributions of the message Ŵ_n decoded at the final destination, given a certain message W at the originating node. For decisions made on signals corrupted by Gaussian noise, P has the property that any column is either all-zero or contains only non-zero entries. For such a matrix P, the rows of P^n tend to be identical for large n (see Theorem 4.9 in [1]), if M, N, P_0, and σ² are independent of n. In other words, the probability distribution of Ŵ_n will be almost independent of the original message W. This implies that the mutual information I(W; Ŵ_n) between the original message and the decoded message tends to zero with n. Hence, communication in the network is highly unreliable if N and M do not vary with the number of hops n.

On the other hand, if N is large and if one chooses M = 2^{NR} for any R < (1/2) log(1 + P_0/σ²), the random-coding theorem states that there exists a channel code of block length N such that the probability of error during decoding at any link is bounded above by e^{−N E(R, P_0/σ²)}, where E(·) is the random-coding exponent for the corresponding AWGN channel [2], [3]. By employing such a code for each link in the line network model, choosing any f(n) ∈ ω(1), and letting N vary with n as

    N = (log n + f(n)) / E(R, P_0/σ²),

one obtains the following union bound on the probability of communication failure in the network:

    P[Ŵ_n ≠ W] ≤ n P_B ≤ n e^{−N E(R, P_0/σ²)} = e^{−f(n)} → 0.   (1)
Hence, it is possible to transmit messages with very high reliability in the network for a fixed code rate R, if the code length N satisfies N ∈ Θ(log n). On the other hand, the previous discussion about the diminishing I(W; Ŵ_n) implies that any fixed-rate, fixed-length coding scheme is not scalable for the line network. This leads to the natural question whether the scaling Θ(log n) for N is indeed optimal. In this paper, we show that for any scaling of N satisfying N ∈ o(log n), the mutual information I(W; Ŵ_n) diminishes to zero with n for any set of fixed-rate codes. This implies that the scaling order Θ(log n) for N is necessary for reliable communication in the network. Hence, the sufficiency criterion for N given by the union bound in Eqn. (1) is indeed optimal up to a constant factor. While this scaling problem has been solved partially in the past for cascades of homogeneous DMCs by Niesen et al. in [4], it has also been pointed out in the same work that the tools developed there apply neither to non-homogeneous cascades nor to cascades of continuous-input channels, as is the case in the current paper. In contrast, the results developed here are applicable to continuous-input AWGN links where the links can be non-identical.

Footnote: For any f(n) > 0 and g(n) > 0, f(n) ∈ ω(g(n)) ⟺ lim_{n→∞} f(n)/g(n) = ∞; f(n) ∈ o(g(n)) ⟺ lim_{n→∞} f(n)/g(n) = 0; and f(n) ∈ Θ(g(n)) ⟺ there exists c > 0 such that lim_{n→∞} f(n)/g(n) = c.
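As a numerical illustration of the sufficiency direction sketched above, the following Python fragment computes the block length N = (log n + f(n)) / E(R, P_0/σ²) and the resulting union bound n e^{−N E(R, P_0/σ²)} ≤ e^{−f(n)} for a few network sizes. This is only a sketch of the arithmetic behind Eqn. (1): the numeric value used for the random-coding exponent is an arbitrary placeholder (the true exponent depends on R and the SNR and is not computed here), and the particular f(n) is just one choice in ω(1).

    import math

    def required_block_length(n_hops, exponent, f_of_n):
        # N = (log n + f(n)) / E(R, P0/sigma^2), rounded up to an integer block length
        return math.ceil((math.log(n_hops) + f_of_n) / exponent)

    def union_bound_failure(n_hops, exponent, N):
        # Union bound over the n hops: P[W_hat_n != W] <= n * e^{-N E}
        return n_hops * math.exp(-N * exponent)

    E_placeholder = 0.1                      # stand-in value for E(R, P0/sigma^2)
    for n in (10, 100, 1000, 10000):
        f = math.log(1.0 + math.log(n))      # one possible f(n) in omega(1)
        N = required_block_length(n, E_placeholder, f)
        print(n, N, union_bound_failure(n, E_placeholder, N))

The printed block lengths grow proportionally to log n, while the union bound stays below e^{−f(n)} and therefore vanishes as n grows.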

II. NOTATIONS AND DEFINITIONS

The set of all real numbers is denoted by ℝ and the set of all natural numbers by ℕ. Natural logarithms are assumed in the notations, unless the base is specified. The notation ‖·‖ represents the L2 norm throughout. We employ the following terminologies in this paper. Let N ∈ ℕ be the code length or block length of the transmission scheme. A code rate R > 0 is a real number such that NR is an integer. Let M ≜ {1, 2, 3, …, 2^{NR}}, called the message alphabet. Let P_0 > 0 be the input power constraint for transmission.

Definition 1: A rate-R length-N code C with a power constraint of P_0 is an ordering of M = 2^{NR} elements from ℝ^N, called code words, that satisfies

    C = (x_1, x_2, x_3, …, x_M)  s.t.  ∀ w ∈ M, ‖x_w‖² ≤ N P_0.   (2)

Definition 2: A rate-R length-N decision rule R is an ordering of M = 2^{NR} subsets of ℝ^N, called decision regions, spanning ℝ^N and pairwise disjoint:

    R = (R_1, R_2, …, R_M)  s.t.  ⋃_{w ∈ M} R_w = ℝ^N, and ∀ w ≠ w′, R_w ∩ R_{w′} = ∅.   (3)

Definition 3: The encoding function ENC_C : M → ℝ^N for a code C is defined by

    ENC_C(w) = x_w,   (4)

where x_w is the w-th code word in C.

Definition 4: The decoding function DEC_R : ℝ^N → M for a decision rule R is defined by

    DEC_R(y) = Σ_{w ∈ M} w · 1{y ∈ R_w},   (5)

where 1{·} is the indicator function and R_w is the w-th subset in R.

III. NETWORK MODEL

The line network model to be considered is described in Fig. 1. There are n + 1 nodes in the network, identified by the indices {0, 1, 2, …, n}. The n hops in the network are each associated with noise variances σ_i², 1 ≤ i ≤ n. For encoding, nodes 0, 1, …, n−1 choose C_0, C_1, …, C_{n−1} respectively to transmit, each code being rate-R, length-N with a power constraint of P_0. For decoding, nodes 1, 2, …, n choose rate-R length-N decision rules R_1, R_2, …, R_n for reception. From here on, we write ENC_i and DEC_i in place of ENC_{C_i} and DEC_{R_i} respectively, for simplicity.

Node 0 generates a random message W ∈ M with probability distribution p_W(w) and intends to convey the same to node n through the cascade of noisy links in a multihop fashion. Each of the n−1 intermediate nodes estimates the message sent by the node in the previous hop from its own noisy observation, re-encodes the decoded message, and transmits the resulting codeword to the next hop. The codeword transmitted by node i, for any 0 ≤ i ≤ n−1, is given by

    X_i = ENC_i(Ŵ_i),   (6)

where Ŵ_i is the estimate of the message at node i after decoding. Note that Ŵ_0 = W in this notation. The observation received by node i, for any 1 ≤ i ≤ n, is given by Y_i, which follows a conditional density function that depends on the codeword X_{i−1} sent by the previous node:

    p_{Y_i|X_{i−1}}(y | x) = (2πσ_i²)^{−N/2} exp(−‖y − x‖² / (2σ_i²)).   (7)

The above density function follows from the assumptions of AWGN noise and memorylessness in the channel. The message Ŵ_i decoded by node i is given by

    Ŵ_i = DEC_i(Y_i).   (8)

Note that the random variable Ŵ_n represents the message decoded by the final destination, as per the above notation.

Fig. 1: Line network model. (Node 0 encodes W into X_0 = ENC_0(W); each node i receives Y_i through the AWGN link p_{Y_i|X_{i−1}}, decodes Ŵ_i = DEC_i(Y_i), and re-encodes X_i = ENC_i(Ŵ_i); node n outputs Ŵ_n = DEC_n(Y_n).)
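The following Python sketch simulates a single hop of the model just described: a message is mapped to a codeword satisfying the power constraint of Definition 1, corrupted by AWGN as in Eqn. (7), and decoded by a minimum-distance rule whose pre-images form decision regions in the sense of Definition 2. The orthogonal codebook, the parameter values, and the choice of minimum-distance decoding are illustrative assumptions only; the paper's results hold for any codes and decision rules satisfying the definitions.

    import numpy as np

    rng = np.random.default_rng(0)

    N, M, P0, sigma2 = 8, 4, 1.0, 0.5     # block length, |M|, power constraint, noise variance

    # A toy length-N code (Definition 1): M orthogonal codewords, each with ||x_w||^2 = N*P0.
    codebook = np.zeros((M, N))
    for w in range(M):
        codebook[w, w] = np.sqrt(N * P0)

    def enc(w):
        # Encoding function ENC_C of Definition 3.
        return codebook[w]

    def dec(y):
        # Minimum-distance decoding; its pre-images are decision regions as in Definition 2.
        return int(np.argmin(np.linalg.norm(codebook - y, axis=1)))

    # One hop, Eqns (6)-(8): X = ENC(W), Y = X + Z with Z ~ N(0, sigma2 * I), W_hat = DEC(Y).
    trials, errors = 20000, 0
    for _ in range(trials):
        w = int(rng.integers(M))
        y = enc(w) + rng.normal(0.0, np.sqrt(sigma2), size=N)
        errors += (dec(y) != w)
    print("empirical per-hop error probability:", errors / trials)

Repeating this hop n times, with re-encoding of the decoded message at each node, yields exactly the cascade studied in the next section.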
IV. AN UPPER BOUND ON THE RELIABILITY OF THE NETWORK

The key result presented here is summarized by the following theorem and its corollary:

Theorem 1: In a line network consisting of a cascade of n AWGN links, let the noise variances of m ≤ n of the links satisfy σ_i² ≥ σ_0² > 0. Then, for any choice of codes C_0, C_1, …, C_{n−1} and decision rules R_1, R_2, …, R_n with rate R and length N, and for any ɛ > 0, the following holds for sufficiently large m:

    I(W; Ŵ_n) ≤ 2NR [1 − (Γ(N/2 + 1))^{−1} e^{−N E(P_0/σ_0²)}]^{m^{1−ɛ}},

where

    E(S) = (1/2) [exp(cosh⁻¹(S/2 + 1)) + cosh⁻¹(S/2 + 1)].

The corollary to the above theorem leads to the upper bound on the reliability-versus-block-length scaling as claimed in the introductory section:

Corollary 1: Let R > 0, σ_0² > 0, and r > 0 be independent of n. In a line network of n links, let the noise variance be at least σ_0² > 0 for rn of the hops. Then, for any N ∈ o(log n), lim_{n→∞} I(W; Ŵ_n) = 0.

Proof of Theorem 1: At any hop i, the process of channel encoding, noisy reception, followed by decoding induces the conditional probabilities p_{Ŵ_i|Ŵ_{i−1}}(k | j), j, k ∈ M. For every hop i, let P_i be the M × M row-stochastic matrix whose entry in row j and column k is p_{Ŵ_i|Ŵ_{i−1}}(k | j). Note that the j-th row in P_i gives the conditional probability mass function on the estimate of the message W at hop i (i.e., Ŵ_i), given that the message sent by hop i−1 was j. Let

    P ≜ ∏_{i=1}^{n} P_i.   (9)

Then, P clearly represents the row-stochastic probability transition matrix between the original message W and the message decoded at the final destination, Ŵ_n. The transition matrix P, along with p_W, the probability mass function of the original message W, together induce a joint distribution between W and Ŵ_n.
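As a small numerical illustration of this construction, the Python fragment below forms the cascade matrix P as a product of per-hop matrices P_i and evaluates I(W; Ŵ_n) from the induced joint distribution. The symmetric per-hop transition matrix used here is a stand-in chosen only for illustration; it is not derived from any particular code or decision rule.

    import numpy as np

    def mutual_information(p_w, P):
        # I(W; W_hat_n) in nats for the joint p(w, w_hat) = p_w(w) * P[w, w_hat].
        joint = p_w[:, None] * P
        p_hat = joint.sum(axis=0)
        mask = joint > 0
        return float(np.sum(joint[mask] *
                            np.log(joint[mask] / (p_w[:, None] * p_hat[None, :])[mask])))

    M = 4
    p_w = np.full(M, 1.0 / M)                 # uniform message distribution

    # Stand-in per-hop matrix P_i: correct decision with probability 1-e, uniform error otherwise.
    e = 0.05
    P_hop = (1 - e) * np.eye(M) + (e / (M - 1)) * (np.ones((M, M)) - np.eye(M))

    P = np.eye(M)
    for n in range(1, 501):
        P = P @ P_hop                         # P = P_1 P_2 ... P_n (identical hops in this toy case)
        if n in (1, 5, 20, 100, 500):
            print(n, mutual_information(p_w, P))

The printed values of I(W; Ŵ_n) decay towards zero as the number of hops grows, which is the effect that the remainder of the proof quantifies for arbitrary, non-identical links.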

Our goal is to find an upper bound on I(W; Ŵ_n), with the constraints as given in the statement of Theorem 1. For any M × M row-stochastic matrix Q = {Q_{jk}}, let us consider the following measures of deviation:

    δ(Q) ≜ max_{j, j′} max_k |Q_{jk} − Q_{j′k}|,   (10)

    λ(Q) ≜ min_{j, j′} Σ_{k=1}^{M} min(Q_{jk}, Q_{j′k}).   (11)

Both the above deviations measure the degree to which the rows of Q differ. The measure λ(Q) is known as Dobrushin's coefficient of ergodicity and plays a role in the convergence of non-homogeneous stochastic processes [5]. Moreover, for any stochastic matrix Q, 0 ≤ λ(Q) ≤ 1. Now consider the total variational distance between the joint distribution of W and Ŵ_n and the product of the respective marginal distributions:

    τ(W; Ŵ_n) ≜ Σ_{(w,ŵ) ∈ M×M} |p_W(w) p_{Ŵ_n}(ŵ) − p_{W,Ŵ_n}(w, ŵ)|.   (12)

In the above expression, substituting p_W(w) p_{Ŵ_n|W}(ŵ | w) for p_{W,Ŵ_n}(w, ŵ) and Σ_{w′ ∈ M} p_{Ŵ_n|W}(ŵ | w′) p_W(w′) for p_{Ŵ_n}(ŵ), and by using the triangle inequality, one obtains the following bound:

    τ(W; Ŵ_n) ≤ Σ_{(w,ŵ)} p_W(w) Σ_{w′} p_W(w′) δ(P) = M δ(P) = 2^{NR} δ(P).   (13)

Recalling the definition of P in our case, and from the product inequality for ergodicity coefficients in [6] (equivalently, in [7]), we have:

    δ(P) = δ(∏_{i=1}^{n} P_i) ≤ ∏_{i=1}^{n} (1 − λ(P_i)) ≤ ∏_{i: σ_i² ≥ σ_0²} (1 − λ(P_i)).   (14)

The second inequality in the above follows by applying 0 ≤ λ(Q) ≤ 1, so that the factors 1 − λ(P_i) ≤ 1 corresponding to those hops for which σ_i² < σ_0² may be dropped. Note that, as per the statement of the theorem, we have at least m terms in the product in the last expression above.
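The next Python fragment is a small numerical check of these quantities: it implements δ(Q) and the Dobrushin coefficient λ(Q) exactly as defined in Eqns. (10)-(11) and verifies the product inequality of Eqn. (14) on a product of random row-stochastic matrices. The random matrices are arbitrary stand-ins; nothing here depends on the AWGN structure of the links.

    import numpy as np

    rng = np.random.default_rng(1)

    def delta(Q):
        # delta(Q) = max over row pairs (j, j') and columns k of |Q[j,k] - Q[j',k]|   (Eqn (10))
        return float(np.max(np.abs(Q[:, None, :] - Q[None, :, :])))

    def lam(Q):
        # Dobrushin coefficient: min over row pairs of sum_k min(Q[j,k], Q[j',k])     (Eqn (11))
        return float(np.min(np.minimum(Q[:, None, :], Q[None, :, :]).sum(axis=2)))

    def random_stochastic(M):
        Q = rng.random((M, M))
        return Q / Q.sum(axis=1, keepdims=True)

    M = 5
    product = np.eye(M)
    bound = 1.0
    for _ in range(10):
        Q = random_stochastic(M)
        product = product @ Q
        bound *= 1.0 - lam(Q)
    print("delta of the product:   ", delta(product))
    print("product of (1 - lambda):", bound)   # Eqn (14): the first value never exceeds the second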

Now consider λ(P_i) for any hop i such that σ_i² ≥ σ_0². Suppose P_i has L columns, with indices k_1, k_2, …, k_L, such that

    ∀ j, P_{j k_1} = p_{Ŵ_i|Ŵ_{i−1}}(k_1 | j) ≥ p_1,   (15)
        ⋮
    ∀ j, P_{j k_L} = p_{Ŵ_i|Ŵ_{i−1}}(k_L | j) ≥ p_L.   (16)

Then from the definition in Eqn. (11), it follows that

    λ(P_i) ≥ Σ_{l=1}^{L} p_l.   (17)

We now show that the conditions given by Eqns. (15)-(16) hold true in our case, so that we can apply Eqn. (17) to our bound. Let α > 0. Consider one particular hop i satisfying σ_i² ≥ σ_0². Define the spherical regions

    S ≜ {y ∈ ℝ^N : ‖y‖ ≤ √(N P_0)},
    S_α ≜ {y ∈ ℝ^N : ‖y‖ ≤ α √(N P_0)}.

Clearly, any code word in C_{i−1} transmitted by the previous hop has to be in S. Next, observe that R_i has to contain decision regions, say L in number, spanning S_α entirely. Let us designate them as R_{k_1}, R_{k_2}, …, R_{k_L}. Also define, for each 1 ≤ l ≤ L, V_l ≜ R_{k_l} ∩ S_α. Hence, we have:

    ⋃_{l=1}^{L} V_l = S_α, and ∀ l ≠ l′, V_l ∩ V_{l′} = ∅.   (18)

Let Ŵ_{i−1} = j be the message as decoded by the previous hop. Hence, the code word x_j from C_{i−1} was transmitted by node i−1. The probability that this was decoded as message k_l at hop i is given by:

    p_{Ŵ_i|Ŵ_{i−1}}(k_l | j) = ∫_{R_{k_l}} p_{Y_i|X_{i−1}}(y | x_j) dy
        ≥ ∫_{V_l} p_{Y_i|X_{i−1}}(y | x_j) dy
        = (2πσ_i²)^{−N/2} ∫_{V_l} exp(−‖y − x_j‖² / (2σ_i²)) dy        … (a)
        ≥ (2πσ_i²)^{−N/2} ∫_{V_l} exp(−(‖y‖ + ‖x_j‖)² / (2σ_i²)) dy    … (b)
        ≥ (2πσ_i²)^{−N/2} ∫_{V_l} exp(−(1 + α)² N P_0 / (2σ_i²)) dy    … (c)
        = (2πσ_i²)^{−N/2} e^{−(1+α)² N P_0 / (2σ_i²)} |V_l|,   (19)

where |V_l| denotes the N-dimensional volume of V_l. In the above series of inequalities, step (a) follows from Eqn. (7), (b) from the triangle inequality, and (c) from the fact that any y ∈ V_l must satisfy ‖y‖ ≤ α √(N P_0), since V_l ⊆ S_α, and from the fact that x_j, being a code word in C_{i−1}, naturally satisfies the power constraint criterion given by ‖x_j‖² ≤ N P_0. Note that the lower bound given by Eqn. (19) is independent of the message j decoded by the previous hop. Hence, the conditions given by Eqns. (15)-(16) are satisfied for

    p_l = (2πσ_i²)^{−N/2} e^{−(1+α)² N P_0 / (2σ_i²)} |V_l|.   (20)

Hence, the bound given by Eqn. (17) holds true. In our case, it gives:

    λ(P_i) ≥ Σ_{l=1}^{L} p_l = (2πσ_i²)^{−N/2} e^{−(1+α)² N P_0 / (2σ_i²)} Σ_{l=1}^{L} |V_l|
           = (2πσ_i²)^{−N/2} e^{−(1+α)² N P_0 / (2σ_i²)} · π^{N/2} (α² N P_0)^{N/2} / Γ(N/2 + 1)
           = (Γ(N/2 + 1))^{−1} (α² N P_0 / (2σ_i²))^{N/2} e^{−(1+α)² N P_0 / (2σ_i²)}.   (21)

The last equality follows from the fact that the V_l are mutually disjoint and span the entire N-dimensional sphere S_α (Eqn. (18)). The volume of S_α is computed as:

    |S_α| = π^{N/2} (α² N P_0)^{N/2} / Γ(N/2 + 1).   (22)

The ideas in the above argument are outlined in Figs. 2(a) and 2(b).

Fig. 2: Geometrical argument used for bounding λ(P_i). (a) Intersecting regions V_1, …, V_L of S_α with the decision rule R_i. (b) Integrating the density function of the noise centered at x_j ∈ S over a subset V_l of S_α.

Next, note that the bound given by Eqn. (21) holds true for any α > 0. Hence, we can maximize the right-hand side of Eqn. (21) over all α > 0 to obtain

    λ(P_i) ≥ (Γ(N/2 + 1))^{−1} e^{−N E(P_0/σ_i²)},   (23)

where the function E(S) is as given in the statement of the theorem. Noting further that E(S) is a non-decreasing function of S, we will have, for hops satisfying σ_i² ≥ σ_0²,

    λ(P_i) ≥ (Γ(N/2 + 1))^{−1} e^{−N E(P_0/σ_0²)}.   (24)

Since at least m hops satisfy the minimum noise variance criterion σ_i² ≥ σ_0², we can now combine Eqn. (24) with Eqn. (14) to obtain

    δ(P) ≤ [1 − (Γ(N/2 + 1))^{−1} e^{−N E(P_0/σ_0²)}]^m.   (25)

Applying the above to Eqn. (13), we will have

    τ(W; Ŵ_n) ≤ 2^{NR} [1 − (Γ(N/2 + 1))^{−1} e^{−N E(P_0/σ_0²)}]^m.   (26)

We can finally bound I(W; Ŵ_n) using Lemma 2.7 in [8] by:

    I(W; Ŵ_n) ≤ 2NR τ(W; Ŵ_n) log(1 / τ(W; Ŵ_n)).   (27)

The bound given by the statement of the theorem then follows from combining the above with Eqn. (26) and by observing that, for sufficiently large m, τ(W; Ŵ_n) ≤ e^{−1}, that x log(1/x) is non-decreasing for x ≤ e^{−1}, and finally from the fact that, for any ɛ > 0, x^ɛ ≥ log x for sufficiently large x. ∎

Proof of Corollary 1: For the corollary, we have m = nr. Moreover, Γ(N/2 + 1) ≤ 2^{N²}. Hence, the bound given by Theorem 1 reduces to:

    I(W; Ŵ_n) ≤ 2NR [1 − 2^{−N²} e^{−N E(P_0/σ_0²)}]^{(nr)^{1−ɛ}}   (28)
              ≤ 2NR exp(−(nr)^{1−ɛ} 2^{−N²} e^{−N E(P_0/σ_0²)})   (29)
              = exp(log(2NR) − exp((1 − ɛ) log(nr) − N² log 2 − N E(P_0/σ_0²))).   (30)

The last expression diminishes to 0 with n if N ∈ o(log n). Hence, we will have I(W; Ŵ_n) → 0 for any N ∈ o(log n). ∎
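To see the two regimes of the corollary numerically, the following Python sketch evaluates the right-hand side of the Theorem-1 bound for a block length growing like log n and for one growing strictly slower. The rate, SNR, fraction r, and ɛ are arbitrary illustrative values, and E(S) is implemented from the closed form given with the theorem as reconstructed above, so the numbers indicate qualitative behaviour only.

    import math

    def E(S):
        # E(S) = (1/2)[exp(cosh^{-1}(S/2 + 1)) + cosh^{-1}(S/2 + 1)] as in Theorem 1
        t = math.acosh(S / 2.0 + 1.0)
        return 0.5 * (math.exp(t) + t)

    def rhs_bound(n, N, R=0.5, snr=1.0, r=1.0, eps=0.1):
        # 2NR * [1 - Gamma(N/2+1)^{-1} e^{-N E(snr)}]^{(r n)^{1-eps}}
        lam_lower = math.exp(-N * E(snr) - math.lgamma(N / 2.0 + 1.0))
        return 2 * N * R * (1.0 - lam_lower) ** ((r * n) ** (1.0 - eps))

    for n in (10**2, 10**4, 10**6, 10**8):
        N_theta_log = max(2, round(5 * math.log(n)))         # N growing like log n
        N_sub_log = max(2, round(math.sqrt(math.log(n))))    # N in o(log n)
        print(n, rhs_bound(n, N_theta_log), rhs_bound(n, N_sub_log))

For the o(log n) schedule the bound collapses to zero as n grows, while for the Θ(log n) schedule it remains of order NR, consistent with the sufficiency argument of the introduction.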
REFERENCES

[1] David A. Levin, Yuval Peres, and Elizabeth L. Wilmer, Markov Chains and Mixing Times, American Mathematical Society, 1st edition, 2008.
[2] Claude E. Shannon, "Probability of error for optimal codes in a Gaussian channel," in Claude Elwood Shannon: Collected Papers, 1993.
[3] R. Gallager, "A simple derivation of the coding theorem and some applications," IEEE Trans. Inform. Theory, vol. 11, no. 1, pp. 3-18, Jan. 1965.
[4] U. Niesen, C. Fragouli, and D. Tuninetti, "On capacity of line networks," IEEE Trans. Inform. Theory, vol. 53, no. 11, pp. 4039-4058, Nov. 2007.
[5] Adolf Rhodius, "On the maximum of ergodicity coefficients, the Dobrushin ergodicity coefficient, and products of stochastic matrices," Linear Algebra and its Applications, vol. 253, no. 1-3, pp. 141-154, 1997.
[6] J. Hajnal and M. S. Bartlett, "Weak ergodicity in non-homogeneous Markov chains," Mathematical Proceedings of the Cambridge Philosophical Society, vol. 54, pp. 233-246, 1958.
[7] J. Wolfowitz, "Products of indecomposable, aperiodic, stochastic matrices," Proceedings of the American Mathematical Society, vol. 14, no. 5, pp. 733-737, Oct. 1963.
[8] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Probability and Mathematical Statistics, Academic Press, 1981.