Network Coding on Directed Acyclic Graphs

John MacLaren Walsh, Ph.D.
Multiterminal Information Theory, Spring Quarter

1 Reference

These notes are directly derived from a chapter of R. W. Yeung's Information Theory and Network Coding, Springer, 2008.

2 Motivating Example: Butterfly Network

Figure 1: The butterfly example (two sources s_1, s_2, two sinks t_1, t_2, and a bottleneck link b).

We presented the butterfly example as a case where we could increase the capacity region of a network by incorporating coding between flows at intermediate network nodes. Summing the messages on link b allows both s_1 and s_2 to be multicast to both sinks t_1 and t_2.
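
To make the coding gain concrete, here is a minimal sketch (our illustration, not part of the original notes) of the butterfly network with one bit per source per network use: the bottleneck link b carries the binary sum (XOR) of the two source bits, and each sink combines it with the bit it receives directly.

    # Butterfly network sketch: sources s1 and s2 each emit one bit.
    # The bottleneck edge b forwards the XOR of the two bits, so both sinks
    # recover both sources even though every edge carries only one bit.

    def butterfly(x1: int, x2: int):
        b = x1 ^ x2            # coded message on the bottleneck link
        t1 = (x1, b ^ x1)      # sink t1 hears x1 directly plus b, recovers (x1, x2)
        t2 = (b ^ x2, x2)      # sink t2 hears x2 directly plus b, recovers (x1, x2)
        return t1, t2

    if __name__ == "__main__":
        for x1 in (0, 1):
            for x2 in (0, 1):
                assert butterfly(x1, x2) == ((x1, x2), (x1, x2))
        print("both sinks decode both sources for every message pair")

Without coding, the single bottleneck edge would have to carry both bits uncoded, so one of the two multicast sessions would be forced below rate one.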

3 Definition of a Network Code & Coding Capacity Region

In these lectures we model a network as a directed acyclic graph G = (V, E). There is a finite set of vertices or nodes V, and a collection of edges e ∈ E which are ordered pairs of vertices e = (v_1, v_2), v_1, v_2 ∈ V. We call vertex v_1 the tail of edge e = (v_1, v_2) and vertex v_2 the head of edge e. A sequence of edges e_1, e_2, ..., e_k such that the head of edge e_n is the tail of the next edge e_{n+1} is called a directed path in the graph G. A directed path with the property that the tail of e_1 is the head of edge e_k is called a cycle, and the graph is called acyclic if it has no cycles. There is also a set of source nodes S ⊆ V and sink nodes T ⊆ V with S ∩ T = ∅. Each source node s is endowed with a source variable X_s uniformly distributed over the set 𝒳_s = {1, 2, ..., ⌈2^{N τ_s}⌉}. The variables X_s, s ∈ S, are mutually independent, and represent messages to be sent over the network. Additionally, the edges of the graph are associated with capacity limitations R_e, indicating the number of bits per source time instant which can be sent over these edges. Each sink t ∈ T has a subset of source variables, those with indices in β(t) ⊆ S, that it wishes to determine.

In order to make this possible, each node in the network will encode all of the messages it hears on its incoming edges (i.e. all those edges that have it as their head) into messages to be sent on its outgoing edges (i.e. all those edges that have it as their tail). The intermediate nodes do this using the functions

    k_e : ∏_{d ∈ In(i)} {0, 1, ..., η_d} → {0, 1, ..., η_e},   i ∈ V \ (S ∪ T), e ∈ Out(i)    (1)

where In(i) = {e ∈ E : e = (j, i) for some j ∈ V} is the set of edges having node i as their head and Out(i) = {e ∈ E : e = (i, j) for some j ∈ V} is the set of edges having node i as their tail. The source nodes encode their sources into messages through the functions

    k_e : 𝒳_s → {0, 1, ..., η_e},   s ∈ S, e ∈ Out(s)    (2)

The sink nodes reproduce the source messages using the encoded messages available locally to them through the functions

    g_t : ∏_{d ∈ In(t)} {0, 1, ..., η_d} → ∏_{s ∈ β(t)} 𝒳_s,   t ∈ T    (3)

The aggregate of these functions (1), (2), (3) is collectively known as an (N, (η_e : e ∈ E), (τ_s : s ∈ S)) network code. For this to work, all of the messages must have arrived on the incoming edges of a node before the ones on its outgoing edges are calculated. This is enabled through the assumption of an acyclic network. For any finite directed acyclic graph, it is possible to order the nodes in a sequence such that if e ∈ E, e = (i, j), then node i appears before node j in the sequence. By selecting such an order to perform the encoding among the nodes, every node will have all of the messages from its incoming edges before it calculates the messages on its outgoing edges.

Let g_t(X_S) represent the composition of all functions from the sources to the sink t. We say that a collection of source rates ω_s, s ∈ S, is achievable if for arbitrarily small ε > 0 there exists a network code such that

    (1/N) log η_e ≤ R_e + ε,   e ∈ E    (4)
    τ_s ≥ ω_s − ε,   s ∈ S    (5)
    P[ g_t(X_S) ≠ X_{β(t)} ] ≤ ε,   t ∈ T    (6)

The set of all achievable rate vectors ω, denoted by R, is the network coding capacity region.
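
The role of the topological ordering can be made concrete with a short sketch (our illustration; the node and edge names are hypothetical). Processing the nodes in a topological order of the DAG guarantees that every encoding function k_e sees all of the messages on its node's incoming edges before it is applied.

    # Sketch: evaluate a network code on a DAG by visiting nodes in topological order.
    from graphlib import TopologicalSorter

    def run_network_code(edges, encoders, source_messages):
        """edges: list of (tail, head) pairs.
        encoders: maps each edge e to a function k_e(incoming, source), where incoming
          is a dict {incoming edge -> message} at the tail of e and source is the
          source message at that node (None for non-source nodes).
        source_messages: maps each source node to its message X_s.
        Returns a dict {edge -> message carried on that edge}."""
        predecessors = {}
        for (u, v) in edges:
            predecessors.setdefault(u, set())
            predecessors.setdefault(v, set()).add(u)
        messages = {}
        for node in TopologicalSorter(predecessors).static_order():
            incoming = {e: messages[e] for e in edges if e[1] == node}
            for e in edges:
                if e[0] == node:
                    messages[e] = encoders[e](incoming, source_messages.get(node))
        return messages

    # The butterfly network again, with the XOR performed at the coding node m.
    E = [("s1", "t2"), ("s2", "t1"), ("s1", "m"), ("s2", "m"),
         ("m", "n"), ("n", "t1"), ("n", "t2")]
    relay = lambda edge: (lambda inc, src: inc[edge])
    enc = {("s1", "t2"): lambda inc, src: src,
           ("s1", "m"):  lambda inc, src: src,
           ("s2", "t1"): lambda inc, src: src,
           ("s2", "m"):  lambda inc, src: src,
           ("m", "n"):   lambda inc, src: inc[("s1", "m")] ^ inc[("s2", "m")],
           ("n", "t1"):  relay(("m", "n")),
           ("n", "t2"):  relay(("m", "n"))}
    print(run_network_code(E, enc, {"s1": 1, "s2": 0}))

Any topological order works; graphlib's TopologicalSorter (Python 3.9+) simply produces one such order from the predecessor lists.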

4 Network Coding Capacity Region

If a collection of rates ω were achievable with zero probability of error and a block length of N = 1 for a particular network code, then, after identifying the random variables Y_s, s ∈ S, with the sources X_s, s ∈ S, and the random variables U_e with the coded messages on the edges e, for this code we would have the inequalities

    H(Y_s) ≥ ω_s,   s ∈ S    (7)
    H(Y_S) = Σ_{s ∈ S} H(Y_s)    (8)
    H(U_{Out(s)} | Y_s) = 0,   s ∈ S    (9)
    H(U_{Out(i)} | U_{In(i)}) = 0,   i ∈ V \ (S ∪ T)    (10)
    H(U_e) ≤ R_e,   e ∈ E    (11)
    H(Y_{β(t)} | U_{In(t)}) = 0,   t ∈ T    (12)

Indeed,

    (7) reflects the fact that the sources must be uniform over a set with cardinality ⌈2^{N τ_s}⌉ with τ_s ≥ ω_s;
    (8) reflects the requirement that the sources are independent of one another;
    (9) reflects that the messages encoded by a source node are a function of the source available to it, i.e. (2);
    (10) reflects that the messages on the outgoing edges from a node are a function of the messages on its incoming edges, i.e. (1);
    (11) reflects the limitations on edge capacity (4);
    (12) indicates the zero-probability-of-error reconstruction.

Of course, our notion of an achievable rate in the network coding capacity region R was the usual Shannon lossless notion, which allows a non-zero but arbitrarily small probability of error as indicated by (6), an arbitrarily large block length N, and slack in rate space as represented by (4) and (5). Surprisingly, this lossless network coding capacity region can be written directly in terms of the inequalities (7)-(12) with an expression we shall define presently.

The first bit of notation is to stack subset entropies into a vector. That is, given a collection of M = |S| + |E| random variables (the Y_s, s ∈ S, together with the U_e, e ∈ E), there are 2^M − 1 non-empty subsets of the random variables, and to each such subset we have an associated entropy. We stack the entropies of these subsets into a vector h of dimension 2^M − 1, and will index this vector via the subset, so that for instance h_A will represent the joint entropy of the random variables in A. (The ordering for the indexing can be done, for instance, by using the integer associated with the length-M binary string whose kth bit indicates whether or not the kth random variable is in A.)
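
The subset-entropy vector of any explicit joint distribution can be computed directly, which is handy for checking small examples against the constraint sets defined next. The sketch below is our illustration, keyed by the subset itself (a frozenset) rather than by the equivalent binary-string integer index.

    # Sketch: stack all 2^M - 1 non-empty subset entropies of a joint pmf into a dict
    # keyed by frozensets of variable indices {1, ..., M}.
    from itertools import combinations
    from math import log2

    def entropy_vector(joint_pmf, M):
        """joint_pmf: dict mapping M-tuples (z_1, ..., z_M) to probabilities."""
        h = {}
        for size in range(1, M + 1):
            for A in combinations(range(M), size):
                marginal = {}
                for outcome, p in joint_pmf.items():
                    key = tuple(outcome[i] for i in A)
                    marginal[key] = marginal.get(key, 0.0) + p
                h[frozenset(i + 1 for i in A)] = -sum(
                    p * log2(p) for p in marginal.values() if p > 0)
        return h

    # Example: Z_1, Z_2 i.i.d. fair bits and Z_3 = Z_1 XOR Z_2.
    pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
    h = entropy_vector(pmf, 3)
    print(h[frozenset({1})], h[frozenset({1, 2})], h[frozenset({1, 2, 3})])  # 1.0 2.0 2.0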

We then consider each of the inequalities (8), (9), (10), (11), (12) as linear constraints on this vector, defining the linear constraint sets

    L_1 = { h ∈ R^{2^M − 1} : h_{Y_S} = Σ_{s ∈ S} h_{Y_s} }    (13)
    L_2 = { h ∈ R^{2^M − 1} : h_{U_{Out(s)}, Y_s} − h_{Y_s} = 0, s ∈ S }    (14)
    L_3 = { h ∈ R^{2^M − 1} : h_{U_{Out(i)}, U_{In(i)}} − h_{U_{In(i)}} = 0, i ∈ V \ (S ∪ T) }    (15)
    L_4 = { h ∈ R^{2^M − 1} : h_{U_e} ≤ R_e, e ∈ E }    (16)
    L_5 = { h ∈ R^{2^M − 1} : h_{Y_{β(t)}, U_{In(t)}} − h_{U_{In(t)}} = 0, t ∈ T }    (17)

Additionally, introduce the following notation:

    1. proj_{Y_S}(B) := { (h_{Y_s} : s ∈ S) : h ∈ B }
    2. Λ(B) := { h ∈ R_+^{|S|} : h ≤ h' for some h' ∈ B }
    3. conv(B), the convex hull of the set B
    4. cl(B), the closure of the set B
    5. Γ*_M := { h ∈ R_+^{2^M − 1} : there exist finite discrete random variables (Z_1, Z_2, ..., Z_M) with h_A = H(Z_A) for all A ⊆ {1, ..., M} }, the region of entropic vectors

Writing L_{123} := L_1 ∩ L_2 ∩ L_3, Yeung and his co-workers have shown that the network coding capacity region R is equal to

    R = Λ( proj_{Y_S}( cl( conv( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 ) )    (18)

4.1 Converse Sketch

For the converse, we must show that R is contained in the right hand side of (18). Consider an achievable rate vector ω ∈ R and a monotone decreasing sequence ε_k → 0 as k → ∞. Then for every k, for every N sufficiently large, there exists a network code such that

    (1/N) log η_e ≤ R_e + ε_k,   e ∈ E    (19)
    τ_s ≥ ω_s − ε_k,   s ∈ S    (20)
    P[ g_t(X_S) ≠ X_{β(t)} ] ≤ ε_k,   t ∈ T    (21)

Let U_e be the message sent by the network code on edge e for all e ∈ E, and identify Y_s = X_s as the source variables. Because the source variables are independent and because the encodings on outgoing edges are functions of the messages on the incoming edges to a node, the constraints (8), (9), (10) hold among these finite discrete random variables Y_s, s ∈ S, U_e, e ∈ E. Now, writing P^{(t)}_k := P[ g_t(X_S) ≠ X_{β(t)} ] ≤ ε_k, Fano's inequality states that

    H(Y_{β(t)} | U_{In(t)}) ≤ 1 + P^{(t)}_k log|𝒴_{β(t)}| = 1 + P^{(t)}_k H(Y_{β(t)}) ≤ 1 + ε_k H(Y_{β(t)})    (22)

where the equality uses the uniformity of the sources. We can upper bound the entropy H(Y_{β(t)}) using

    H(Y_{β(t)}) = I(Y_{β(t)}; U_{In(t)}) + H(Y_{β(t)} | U_{In(t)})
                ≤ H(U_{In(t)}) + H(Y_{β(t)} | U_{In(t)})    (23)
                ≤ Σ_{e ∈ In(t)} log(η_e + 1) + 1 + ε_k H(Y_{β(t)})
                ≤ Σ_{e ∈ In(t)} N(R_e + ε_k) + 1 + ε_k H(Y_{β(t)})    (24)

Solving for an inequality for H(Y_{β(t)}) we get

    H(Y_{β(t)}) ≤ ( N / (1 − ε_k) ) ( Σ_{e ∈ In(t)} (R_e + ε_k) + 1/N )    (25)

which when substituted back into Fano's inequality (22) gives

    H(Y_{β(t)} | U_{In(t)}) ≤ N [ 1/N + ( ε_k / (1 − ε_k) ) ( Σ_{e ∈ In(t)} (R_e + ε_k) + 1/N ) ] =: N φ_t(N, ε_k)    (26)

Here, it is clear that the function φ_t(N, ε_k) is bounded, is monotone decreasing in both k and N, and approaches 0 as k, N → ∞. Moving next to the edge capacity constraints and the entropies of the sources, we observe that

    H(U_e) ≤ log(η_e + 1) ≤ N(R_e + ε_k),   H(Y_s) ≥ N(ω_s − ε_k)    (27)

If we define the half spaces, reminiscent of L_4, L_5 in the more general case,

    L^N_{4,ε_k} = { h ∈ R^{2^M − 1} : h_{U_e} ≤ N(R_e + ε_k), e ∈ E }    (28)
    L^N_{5,ε_k} = { h ∈ R^{2^M − 1} : h_{Y_{β(t)}, U_{In(t)}} − h_{U_{In(t)}} ≤ N φ_t(N, ε_k), t ∈ T }    (29)

we observe that the subset entropies h^{(k)} of this network code lie in the set

    h^{(k)} ∈ Γ*_M ∩ L_{123} ∩ L^N_{4,ε_k} ∩ L^N_{5,ε_k},   with h^{(k)}_{Y_s} ≥ N(ω_s − ε_k), s ∈ S    (30)

Now, since 0 ∈ Γ*_M ∩ L_{123} ∩ L^N_{4,ε_k} ∩ L^N_{5,ε_k}, and (1/N) h can be viewed as a convex combination of h and zero, we observe that

    (1/N) h^{(k)} ∈ conv( Γ*_M ∩ L_{123} ) ∩ L_{4,ε_k} ∩ L_{5,ε_k},   with (1/N) h^{(k)}_{Y_s} ≥ ω_s − ε_k    (31)

where

    L_{4,ε_k} = { h ∈ R^{2^M − 1} : h_{U_e} ≤ R_e + ε_k, e ∈ E }    (32)
    L_{5,ε_k} = { h ∈ R^{2^M − 1} : h_{Y_{β(t)}, U_{In(t)}} − h_{U_{In(t)}} ≤ φ_t(N, ε_k), t ∈ T }    (33)

Defining the constraint set in (31) to be B^{(N,k)}, we observe that the B^{(N,k)} are monotone decreasing in that

    B^{(N+1,k)} ⊆ B^{(N,k)},   B^{(N,k+1)} ⊆ B^{(N,k)}    (34)

hence

    lim_{N,k→∞} B^{(N,k)} = ∩_{N=1}^∞ ∩_{k=1}^∞ B^{(N,k)}    (35)

and the latter set, since it involves the intersection of the inequalities in L_{4,ε_k} and L_{5,ε_k} over all N and k, becomes

    lim_{N,k→∞} (1/N) h^{(k)} ∈ cl( conv( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5,   lim_{N,k→∞} (1/N) h^{(k)}_{Y_s} ≥ ω_s, s ∈ S    (36)

Rearranging this fact, we have that if ω is achievable, then

    ω ∈ Λ( proj_{Y_S}( cl( conv( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 ) )    (37)

which is what we needed to prove.

4.2 Obtaining Inner and Outer Bounds

We discussed that the capacity region presented in (18) is implicit, in that we don't generally know all of the inequalities necessary to describe Γ*_M (in fact, its closure cl(Γ*_M) is not even polyhedral for M ≥ 4). It is possible to obtain inner and outer bounds on the capacity region by substituting inner and outer bounds for Γ*_M into (18). We discussed a polyhedral outer bound for Γ*_M known as the Shannon outer bound Γ_M. One way to write the Shannon outer bound is as the set of vectors obeying the properties that entropy is a non-decreasing set function that is sub-modular:

    Γ_M := { h ∈ R^{2^M − 1} : h_{A'} ≤ h_A for all A' ⊆ A;  h_A + h_B ≥ h_{A ∪ B} + h_{A ∩ B} for all A, B ⊆ {1, ..., M} }    (38)

Inner bound TBD.
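
While the inner bound is left for later, the Shannon outer bound Γ_M in (38) is polyhedral, so membership of a candidate entropy vector can be checked mechanically, and the same inequalities can be combined with L_1-L_5 in a linear program to compute polyhedral outer bounds on (18). The sketch below is our illustration (it enumerates the inequalities exactly as written in (38), not the smaller set of elemental inequalities) and checks a dict-indexed vector against Γ_M.

    # Sketch: test whether a subset-entropy vector lies in the Shannon outer bound (38).
    from itertools import combinations

    def nonempty_subsets(ground):
        return [frozenset(c) for r in range(1, len(ground) + 1)
                for c in combinations(ground, r)]

    def in_shannon_outer_bound(h, M, tol=1e-9):
        """h: dict mapping frozensets of {1,...,M} to entropies (the empty set has entropy 0)."""
        subsets = nonempty_subsets(range(1, M + 1))
        val = lambda A: 0.0 if not A else h[A]
        # monotonicity: h_{A'} <= h_A whenever A' is a subset of A
        for A in subsets:
            for Ap in nonempty_subsets(A):
                if val(Ap) > val(A) + tol:
                    return False
        # submodularity: h_A + h_B >= h_{A union B} + h_{A intersect B}
        for A in subsets:
            for B in subsets:
                if val(A) + val(B) < val(A | B) + val(A & B) - tol:
                    return False
        return True

    # The entropy vector of any actual distribution must pass, e.g. the XOR example
    # from the entropy_vector() sketch above:
    # print(in_shannon_outer_bound(entropy_vector(pmf, 3), 3))   # True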

4.3 Achievability Sketch

We begin by proving an alternate form of the capacity region (18). Let D(·) be a set operator which scales all the points in the set by numbers between zero and one:

    D(A) = { αh : h ∈ A, α ∈ [0, 1] }    (39)

We will prove that the convex hull can be replaced by these scalings in the capacity region expression, so that

    Λ( proj_{Y_S}( cl( conv( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 ) ) = Λ( proj_{Y_S}( cl( D( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 ) )    (40)

To do this, we will show that

    cl( D( Γ*_M ∩ L_{123} ) ) = cl( conv( Γ*_M ∩ L_{123} ) )    (41)

Consider a point h ∈ cl( D( Γ*_M ∩ L_{123} ) ). It is the limit of some sequence h_k ∈ D( Γ*_M ∩ L_{123} ), where h_k = α_k ĥ_k for some ĥ_k ∈ Γ*_M ∩ L_{123} and α_k ∈ [0, 1]. Noting that 0 ∈ Γ*_M ∩ L_{123}, we can view α_k ĥ_k as the convex combination α_k ĥ_k + (1 − α_k) 0 ∈ conv( Γ*_M ∩ L_{123} ). This shows that cl( D( Γ*_M ∩ L_{123} ) ) ⊆ cl( conv( Γ*_M ∩ L_{123} ) ).

To prove the other containment, we show that cl( D( Γ*_M ∩ L_{123} ) ) is a convex set containing Γ*_M ∩ L_{123}. (Since the convex hull conv( Γ*_M ∩ L_{123} ) is defined as the smallest convex set containing Γ*_M ∩ L_{123}, the convexity of cl( D( Γ*_M ∩ L_{123} ) ) will guarantee that conv( Γ*_M ∩ L_{123} ) ⊆ cl( D( Γ*_M ∩ L_{123} ) ).) Consider two points h_1, h_2 ∈ cl( D( Γ*_M ∩ L_{123} ) ), and select any λ ∈ [0, 1]. These points are limits of the sequences h_1^{(k)} = α_1^{(k)} ĥ_1^{(k)} and h_2^{(k)} = α_2^{(k)} ĥ_2^{(k)} with α_1^{(k)}, α_2^{(k)} ∈ (0, 1] and ĥ_1^{(k)}, ĥ_2^{(k)} ∈ Γ*_M ∩ L_{123}. Select sequences of positive integers n_1^{(k)}, n_2^{(k)} ∈ N with n_1^{(k)}, n_2^{(k)} → ∞ as k → ∞ and with

    lim_{k→∞} ( n_1^{(k)} α_2^{(k)} ) / ( n_2^{(k)} α_1^{(k)} ) = λ / (1 − λ)    (42)

Letting the collections of random variables (Z_1^1, ..., Z_M^1) and (Z_1^2, ..., Z_M^2) be random variables attaining ĥ_1^{(k)}, ĥ_2^{(k)} ∈ Γ*_M ∩ L_{123} respectively, we observe that the collection of random variables Z_1, ..., Z_M defined via

    Z_m = ( Z_{m,1}^1, ..., Z_{m,n_1^{(k)}}^1, Z_{m,1}^2, ..., Z_{m,n_2^{(k)}}^2 ),   m ∈ {1, ..., M}    (43)

(that is, n_1^{(k)} i.i.d. copies of Z_m^1 followed by n_2^{(k)} i.i.d. copies of Z_m^2) is associated with the entropies n_1^{(k)} ĥ_1^{(k)} + n_2^{(k)} ĥ_2^{(k)}. This then shows that

    n_1^{(k)} ĥ_1^{(k)} + n_2^{(k)} ĥ_2^{(k)} ∈ Γ*_M ∩ L_{123}    (44)

Additionally, for k sufficiently large we have

    ( α_1^{(k)} α_2^{(k)} ) / ( n_1^{(k)} α_2^{(k)} + n_2^{(k)} α_1^{(k)} ) ≤ 1    (45)

which then implies that

    ( α_1^{(k)} α_2^{(k)} ) / ( n_1^{(k)} α_2^{(k)} + n_2^{(k)} α_1^{(k)} ) ( n_1^{(k)} ĥ_1^{(k)} + n_2^{(k)} ĥ_2^{(k)} ) ∈ D( Γ*_M ∩ L_{123} )    (46)

However, rearranging the terms inside the limit we have

    lim_{k→∞} ( α_1^{(k)} α_2^{(k)} ) / ( n_1^{(k)} α_2^{(k)} + n_2^{(k)} α_1^{(k)} ) ( n_1^{(k)} ĥ_1^{(k)} + n_2^{(k)} ĥ_2^{(k)} )
        = lim_{k→∞} [ ( n_1^{(k)} α_2^{(k)} ) / ( n_1^{(k)} α_2^{(k)} + n_2^{(k)} α_1^{(k)} ) α_1^{(k)} ĥ_1^{(k)} + ( n_2^{(k)} α_1^{(k)} ) / ( n_1^{(k)} α_2^{(k)} + n_2^{(k)} α_1^{(k)} ) α_2^{(k)} ĥ_2^{(k)} ]
        = λ h_1 + (1 − λ) h_2    (47)

which is therefore in cl( D( Γ*_M ∩ L_{123} ) ), proving that it is convex. This establishes (41), and hence (40).
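
The key step in this argument, that placing independent copies of two collections of random variables side by side as in (43) adds their entropy vectors, can be checked numerically on small examples. The sketch below is our illustration with made-up distributions.

    # Sketch: subset entropies add under independent, per-variable concatenation,
    # i.e. Z_m = (Z^1_m, Z^2_m) has subset entropies h^1_A + h^2_A.
    from math import log2

    def H(pmf):
        return -sum(p * log2(p) for p in pmf.values() if p > 0)

    def marginal(pmf, A):
        out = {}
        for z, p in pmf.items():
            key = tuple(z[m] for m in A)
            out[key] = out.get(key, 0.0) + p
        return out

    def concatenate_independent(pmf1, pmf2, M):
        # per-variable concatenation of two mutually independent collections
        return {tuple((a[m], b[m]) for m in range(M)): p * q
                for a, p in pmf1.items() for b, q in pmf2.items()}

    pmf1 = {(0, 0, 0): 0.5, (1, 1, 0): 0.5}
    pmf2 = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
    pmf12 = concatenate_independent(pmf1, pmf2, 3)
    for A in [(0,), (0, 1), (0, 1, 2)]:
        assert abs(H(marginal(pmf12, A)) - (H(marginal(pmf1, A)) + H(marginal(pmf2, A)))) < 1e-9
    print("subset entropies add under independent concatenation")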

We must now prove that any vector ω in the alternate rate region representation

    ω ∈ Λ( proj_{Y_S}( cl( D( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 ) )    (48)

is achievable. Since we can discard any excess rate we do not wish to use, this amounts to showing that any rate vector ω ∈ proj_{Y_S}( cl( D( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 ) is achievable. Let h ∈ cl( D( Γ*_M ∩ L_{123} ) ) ∩ L_4 ∩ L_5 be a vector such that ω = proj_{Y_S}(h). Since h ∈ cl( D( Γ*_M ∩ L_{123} ) ), it is the limit of some sequence h_k ∈ D( Γ*_M ∩ L_{123} ) of the form h_k = α_k ĥ_k with ĥ_k ∈ Γ*_M ∩ L_{123}. Let (Y_s^k : s ∈ S), (U_e^k : e ∈ E) be the random variables associated with ĥ_k; since the entropies of these variables are in L_{123}, they obey

    H(Y_S^k) = Σ_{s ∈ S} H(Y_s^k)    (49)
    H(U_{Out(s)}^k | Y_s^k) = 0,   s ∈ S    (50)
    H(U_{Out(i)}^k | U_{In(i)}^k) = 0,   i ∈ V \ (S ∪ T)    (51)

The equalities (50) and (51) show that we may think of the random variable U_e^k = f_{e,k}(Y_s^k) as a deterministic function of the source random variable Y_s^k for each e ∈ Out(s), s ∈ S, and the random variable U_e^k = f_{e,k}(U_{In(i)}^k) as a deterministic function of the random variables U_{In(i)}^k for each e ∈ Out(i), i ∈ V \ (S ∪ T). Additionally, since the limit of the scaled entropies α_k ĥ_k is in L_4 ∩ L_5 and has lim_{k→∞} proj_{Y_S}(α_k ĥ_k) = ω, we have

    α_k H(Y_s^k) = ω_s^k,   s ∈ S    (52)
    α_k H(U_e^k) ≤ R_e + μ_k,   e ∈ E    (53)
    α_k H(Y_{β(t)}^k | U_{In(t)}^k) ≤ γ_k,   t ∈ T    (54)

where γ_k → 0 and μ_k → 0 as k → ∞ while ω_s^k → ω_s.

Let N̂_k = ⌊α_k N_k⌋. For each source s, generate a ⌈2^{N_k τ_s^k}⌉ × N̂_k dimensional matrix by sampling its elements i.i.d. according to the distribution p_{Y_s^k}, and let the jth row be denoted by Y_s^{N̂_k}(j), j ∈ {1, ..., ⌈2^{N_k τ_s^k}⌉}. For each edge e, enumerate all of the length-N̂_k typical sequences in T_ε^{N̂_k}(U_e^k) as U_{e,k}^{N̂_k}(1), ..., U_{e,k}^{N̂_k}(η_e^k). Due to the bound on the cardinality of the typical set, for such an enumeration

    η_e^k ≤ 2^{N̂_k (H(U_e^k) + ε c)} ≤ 2^{α_k N_k (H(U_e^k) + ε c_{e,k})}    (55)

so that

    (1/N_k) log η_e^k ≤ α_k ( H(U_e^k) + ε c_{e,k} ) ≤ R_e + μ_k + ε c_{e,k}    (56)

The encoder at source node s selects at random one of the ⌈2^{N_k τ_s^k}⌉ rows in its matrix, say row j, then calculates the deterministic function f_{e,k}(Y_s^{N̂_k}(j)) for each e ∈ Out(s), operating elementwise on each of the N̂_k positions in the vector. Provided we select the typicality parameters so that

    ε_{e,k} ≥ |𝒴_s^k| ε_{s,k},   e ∈ Out(s)    (57)

if Y_s^{N̂_k}(j) ∈ T_{ε_{s,k}}^{N̂_k}(Y_s^k), then the results of these deterministic functions will be in T_{ε_{e,k}}^{N̂_k}(U_e^k), and together the outgoing messages will all be jointly typical, i.e. in T_{ε_{Out(s),k}}^{N̂_k}(U_{Out(s)}^k, Y_s^k). The message sent on edge e is the index in {1, ..., η_e^k} of the associated typical sequence produced by the deterministic function, or 0 if the input was not typical. Via the Markov lemma, if we take N_k sufficiently large, we observe that if Y_s^{N̂_k}(j) ∈ T_{ε_{s,k}}^{N̂_k}(Y_s^k) for each s ∈ S, then all of the messages outgoing from these sources are together jointly typical. Proceeding via the order defined by the directed acyclic graph for the operation of the encoders (such that all incoming messages are available before an outgoing message is calculated), we observe that provided all incoming messages are jointly typical, the outgoing messages (calculated with the deterministic functions f_{e,k} operating on the typical sequences associated with the incoming indices) will be jointly typical themselves, and jointly typical with everything computed so far. Thus, provided that each of the selected messages Y_s^{N̂_k}(j_s) is typical, they will be jointly typical with the sequences associated with each of the incoming messages at a sink.

The sink t operates by looking in the rows of the codebooks for the sources in β(t) for a collection of codewords that are jointly typical together with the incoming messages U_{In(t)}^{N̂_k}. By the logic above, there will be at least one such collection (corresponding to the correct decoding) provided that the Y_s^{N̂_k}(j_s) are typical. If there is more than one such collection, then an error is declared. This error event is then the union, over all subsets A of β(t), of the events E_A for which ( U_{In(t)}^{N̂_k}, Y_A^{N̂_k}(j'_A), Y_{A^c}^{N̂_k}(j_{A^c}) ) are jointly typical for some j'_s ≠ j_s for each s ∈ A.
These events, associated with the independent codewords Y_s^{N̂_k}(j'_s), s ∈ A, winding up in the jointly typical set with Y_s^{N̂_k}(j_s), s ∈ A^c, and U_{In(t)}^{N̂_k}, have probabilities bounded by

    P[E_A] ≤ 2^{ N_k Σ_{s ∈ A} τ_s^k − N̂_k ( I(Y_A^k ; Y_{A^c}^k, U_{In(t)}^k) − ε c ) }    (58)

These will go to zero exponentially as N_k → ∞ provided that we select τ_s^k slightly less than α_k H(Y_s^k) = ω_s^k. This, together with (49), (50), (51), and (56), shows that the rates ω_s^k are achievable according to the definition (4), (5), (6) for sufficiently large k and N_k. Since ω_s^k → ω_s as k → ∞, this establishes the achievability of ω.
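
To see the exponent in (58) behave as claimed, the small sketch below (our illustration with made-up numbers) evaluates the exponent N_k Σ_{s ∈ A} τ_s^k − N̂_k ( I − ε c ) for a single source, taking I(Y_A^k; Y_{A^c}^k, U_{In(t)}^k) ≈ H(Y_s^k) as in the zero-conditional-entropy limit: choosing τ_s^k slightly below α_k H(Y_s^k) makes the exponent negative, so the bound on P[E_A] decays exponentially in N_k.

    # Sketch: the exponent of the bound (58) for one source, with illustrative numbers.
    def error_exponent(N_k, alpha_k, tau_s, mutual_info, eps_c=0.01):
        N_hat = int(alpha_k * N_k)                       # \hat{N}_k = floor(alpha_k * N_k)
        return N_k * tau_s - N_hat * (mutual_info - eps_c)

    alpha_k, H_Ys = 0.5, 2.0                             # hypothetical values
    omega_s = alpha_k * H_Ys                             # target rate = alpha_k * H(Y_s^k)
    tau_s = omega_s - 0.05                               # slightly below the target rate
    for N_k in (100, 1000, 10000):
        e = error_exponent(N_k, alpha_k, tau_s, mutual_info=H_Ys)
        print(N_k, e, 2.0 ** min(e, 0.0))                # exponent and the bound 2^exponent (capped at 1)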