
EE 121: Coding for Digital Communication & Beyond    Fall 2013
Lecture 14: October 22
Lecturer: Prof. Anant Sahai    Scribe: Jingyan Wang

This lecture covers:
- LT codes
- the Ideal Soliton distribution

14.1 Introduction

So far in this course, we have been working with codes of fixed rate. In traditional linear codes, the length k of the message block and the length n of the codeword need to be fixed in advance in order to construct the n × k generator matrix. When n becomes large, the decoding strategy of storing the nearest-neighbor codeword for each potential output in a hash table becomes impractical.

Rateless codes (also called fountain codes) are a different kind of code, in which the number of encoded symbols n can be potentially limitless [1]. The receiver is able to decode the message as long as the number of encoded symbols it receives is slightly larger than the number of symbols k in the original message.

To appreciate the advantage of rateless codes, consider the following scenario: one transmitter is sending a message to multiple receivers. In order for all receivers to decode the correct message, the transmitter must send at a rate R low enough that even the receiver with the most noise can decode. For the good receivers with little noise, most of the received symbols are then redundant. To save energy, we would like these good receivers to switch off as soon as they have received enough information to decode.

In the context of linear codes, the codewords are generated by c = G m, where m is the message. If the channel is an erasure channel (with no bit errors), the symbols a receiver gets are a subset of the elements of the codeword c. By picking out the corresponding rows of the generator matrix G, we can potentially solve for m if there is still enough information in the equation c_sub = G_sub m.

In 2002, Michael Luby introduced Luby transform codes (LT codes), the first class of rateless erasure codes that are very efficient as the data length grows [1]. We will discuss the mechanism of LT codes in this lecture.
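To make the erasure-channel picture concrete, here is a minimal sketch (my own illustration, not part of the original notes) that recovers m from a subset of codeword symbols by solving c_sub = G_sub m over GF(2) with Gaussian elimination. The helper name solve_gf2 and the sizes k = 4, n = 8 are arbitrary choices for the example.

    # Illustrative sketch: recover m from a subset of codeword symbols over GF(2).
    import numpy as np

    def solve_gf2(A, y):
        """Solve A x = y over GF(2) by Gaussian elimination; return x or None."""
        A = A.copy() % 2
        y = y.copy() % 2
        n, k = A.shape
        row = 0
        for col in range(k):
            pivot = next((r for r in range(row, n) if A[r, col]), None)
            if pivot is None:
                return None                          # G_sub is rank deficient: cannot decode
            A[[row, pivot]] = A[[pivot, row]]
            y[[row, pivot]] = y[[pivot, row]]
            for r in range(n):
                if r != row and A[r, col]:
                    A[r] ^= A[row]
                    y[r] ^= y[row]
            row += 1
        return y[:k]

    rng = np.random.default_rng(0)
    k, n = 4, 8
    G = rng.integers(0, 2, size=(n, k))              # n-by-k generator matrix over GF(2)
    m = rng.integers(0, 2, size=k)                   # message
    c = (G @ m) % 2                                  # codeword
    kept = rng.choice(n, size=6, replace=False)      # erasure channel delivers only 6 symbols
    m_hat = solve_gf2(G[kept], c[kept])
    print(m, m_hat)                                  # equal whenever G_sub has full column rank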

14.2 LT Codes

14.2.1 General Idea

In the last section, we mentioned that a receiver can still recover the message m given only a subset x of the elements of the codeword c. If the system of linear equations x = G_sub m has a unique solution, the original message m is successfully recovered. In this design, there are two problems that we need to address:

- The decoding strategy of solving a system of linear equations by Gaussian elimination has a time complexity of O(n^3), where n is the number of received symbols. This is too slow for practical use.
- We need to guarantee a high probability of successful decoding. For the system of linear equations to have a unique solution, we need a good generator matrix G_sub.

Hence, the problem reduces to the following:

- Design a sparse matrix, with mostly 0's and only a few 1's, that has good rows.
- Design a substitution algorithm that leverages the sparsity of the matrix and is significantly faster than Gaussian elimination.

With these ideas in mind, we will develop the construction of LT codes in more detail as we proceed.

14.2.2 Decoding

Decoding involves solving a system of linear equations

x_{n×1} = G_{n×k} b_{k×1},

where x is the received message, G is an n × k matrix, and b is the unknown message we want to solve for. The decoding strategy goes as follows:

(1) Find an equation of the form x_i = b_j.
(2) Record x_i as the value of b_j.
(3) Substitute x_i in place of b_j in the other equations that contain b_j. Simplify.
(4) Go back to (1).

We provide a graph interpretation of this decoding scheme:

[Figure: bipartite graph with data nodes b_j on one side and check nodes, one per received node x_i, on the other.]

- Each data node represents a bit of the original message b.
- Each received node represents a bit of the received symbols x.
- For each received node x_i, create a corresponding check node and connect them. This check node also connects to all the data nodes that appear in the equation for x_i. Hence, each check node represents an XOR of all the data nodes it connects to.

For example, the graph above represents the set of linear equations

x_1 = b_1 + b_2
x_2 = b_2
x_3 = b_1 + b_2 + b_k
...
x_k = b_3 + b_k

As a running example of the decoding strategy, consider the following situation:

[Figure: the decoding graph with the received values filled in at the check nodes.]

Observe that b_2 = x_2, and pull the value of x_2 up as the value of b_2:

[Figure: b_2 is decoded from the degree-1 check node x_2.]

Substitute the value of b_2 into all check nodes that connect to b_2: XOR b_2 into the received values of those check nodes and destroy the corresponding edges:

[Figure: the edges incident to b_2 are removed after the substitution.]

Repeat. Observe that b_1 = x_1:

[Figure: b_1 is decoded next from a check node that now has a single remaining edge.]

... and so on and so forth. If the system of linear equations has a unique solution, we can hope that all data nodes are decoded at the end. This requires that during each iteration there always exists a check node that connects to exactly one not-yet-decoded data node.
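This decoding loop is often called a peeling decoder. Here is a minimal sketch of it (my own illustrative code, not from the lecture), where each check node is kept as a set of data indices together with its running XOR value; the name peel_decode and the representation are assumptions for the example.

    def peel_decode(k, checks):
        """checks: list of (data_index_set, xor_value) pairs. Returns the list b, or None."""
        checks = [[set(idx), val] for idx, val in checks]
        b = [None] * k
        ripple = [c for c in checks if len(c[0]) == 1]        # degree-1 check nodes
        while ripple:
            node = ripple.pop()
            if len(node[0]) != 1:
                continue                                      # already emptied by an earlier substitution
            j = next(iter(node[0]))
            if b[j] is None:
                b[j] = node[1]                                 # step (2): record the value
            for other in checks:                               # step (3): substitute b[j] everywhere
                if j in other[0]:
                    other[0].remove(j)
                    other[1] ^= b[j]
                    if len(other[0]) == 1:
                        ripple.append(other)                   # this check node is released
        return b if all(v is not None for v in b) else None

    # Toy example with k = 3 (0-indexed): checks b0^b1 = 1, b1 = 0, b1^b2 = 1 for b = (1, 0, 1).
    print(peel_decode(3, [({0, 1}, 1), ({1}, 0), ({1, 2}, 1)]))    # -> [1, 0, 1]

A production decoder would additionally keep, for each data node, the list of check nodes containing it, so that each edge is touched only once; the linear scan above just keeps the sketch short. Touching each edge once is what makes a sparse G attractive compared to O(n^3) Gaussian elimination.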

14.2.3 Encoding

We now consider how to construct the generator matrix G (i.e., how to connect the check nodes with the data nodes). The encoding strategy goes as follows. For each row i of the matrix:

- Draw a degree D at random from the distribution P_D(·).
- Draw D positions uniformly at random from {1, 2, ..., k} and fill in 1's at those positions of row i of G.
- XOR together the message bits at those positions to obtain the encoded symbol x_i.

The task of generating a good matrix G thus reduces to designing a good distribution P_D(·); a small code sketch of this encoding rule follows below.
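As a concrete illustration (a sketch under my own naming, not the lecture's code), the encoder for one symbol can be written directly from the three steps above; the toy degree distribution at the end is arbitrary and only for demonstration.

    import random

    def lt_encode_symbol(message_bits, degree_dist, rng=random):
        """degree_dist: list of (degree, probability) pairs; returns (positions, encoded bit)."""
        k = len(message_bits)
        degrees, probs = zip(*degree_dist)
        d = rng.choices(degrees, weights=probs, k=1)[0]      # draw D from P_D
        positions = rng.sample(range(k), d)                  # D distinct positions: the 1's of this row of G
        value = 0
        for j in positions:
            value ^= message_bits[j]                         # XOR the selected message bits
        return positions, value

    message = [1, 0, 1, 1, 0, 0, 1, 0]
    toy_dist = [(1, 0.5), (2, 0.5)]                          # illustrative P_D, not one from the lecture
    print(lt_encode_symbol(message, toy_dist))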

14.3 Degree Distribution P_D(·)

14.3.1 All-At-Once Distribution

P_D(i) = 1 if i = 1, and P_D(i) = 0 otherwise.

Every check node connects to exactly one data node. This distribution gives the simplest decoding, because the problem reduces to finding received nodes that cover all the data nodes; there is no substitution work at all. Each check node connects to any particular data node with probability 1/k. We now ask how many received nodes we need in order to decode all the data nodes. This is essentially the coupon collector's problem, where each received node is a box and each data node is a coupon.

Coupon Collector's Problem

We are trying to collect a set of k different cards. We get the cards by buying boxes. Each box contains exactly one card, and it is equally likely to be any of the k cards. How many boxes do we need to buy until we have collected at least one copy of every card?

Expectation

Let X be the number of boxes we need to buy to collect all k types of cards. Let X_i be the number of boxes bought to collect the i-th new type of card, after i - 1 distinct types have already been collected. X_i is a geometric random variable with parameter (k - i + 1)/k, so

E[X] = E[ ∑_{i=1}^{k} X_i ] = ∑_{i=1}^{k} E[X_i] = ∑_{i=1}^{k} k / (k - i + 1) = k ∑_{j=1}^{k} 1/j ≈ k ln k.

Fluctuation

How many boxes do we need to buy so that we collect at least one copy of every card with probability 1 - δ? The probability that we haven't collected a particular card after the r-th box is

(1 - 1/k)^r ≈ e^{-r/k}.

Using the union bound, the probability that we haven't collected all the cards after the r-th box is roughly at most

k e^{-r/k}.

Suppose we buy k ln k + m boxes, where the extra m boxes are there to absorb the fluctuation. The probability that we still haven't collected all the cards is then about k e^{-(k ln k + m)/k}. Setting this equal to δ:

k e^{-(k ln k + m)/k} = δ
k e^{-ln k} e^{-m/k} = δ
e^{-m/k} = δ
m = k ln(1/δ)

Hence, we need to buy about k ln k + k ln(1/δ) boxes in order to collect all the cards with probability 1 - δ.

Going back to the context of the code, we need to receive about k ln k + k ln(1/δ) encoded symbols in order to decode all the data symbols with probability 1 - δ. This doesn't look too bad, but the ln k factor is still too much overhead in practice, since the effective rate k / (k ln k + k ln(1/δ)) goes to 0 as k gets large.
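A quick simulation (my own sketch, not part of the notes) backs up the k ln k + k ln(1/δ) bound; the parameters k = 100, δ = 0.05 and the helper name boxes_until_complete are arbitrary choices for the example.

    import math
    import random

    def boxes_until_complete(k, rng=random):
        """Buy boxes until all k card types have been seen; return how many boxes it took."""
        seen, boxes = set(), 0
        while len(seen) < k:
            seen.add(rng.randrange(k))
            boxes += 1
        return boxes

    k, delta, trials = 100, 0.05, 2000
    budget = math.ceil(k * math.log(k) + k * math.log(1 / delta))
    hits = sum(boxes_until_complete(k) <= budget for _ in range(trials))
    print(f"budget = {budget} boxes, empirical success rate = {hits / trials:.3f} (target {1 - delta})")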

14.3.2 Let's Improve It

Since the All-At-Once distribution doesn't quite work, we need nodes with higher degrees. Let's first introduce some new concepts:

- Ripple: the set of all bits that are visible but not yet decoded. Concretely, the ripple is the set of check nodes x_i that connect to exactly one data node b_j whose value has not yet been substituted.
- Release: a check node x_i is released when it is left with only one edge and joins the ripple; this happens when some value b_j becomes known and is substituted out of x_i.
- Degree: the number of data nodes that a check node connects to. Note that the degree is fixed when a node is created; if some of its edges are later destroyed, we still consider its degree unchanged, even though its number of remaining edges is smaller.

Initially, all the degree-1 nodes are in the ripple, and we would like the ripple to keep propagating outward: degree-1 nodes release some degree-2 nodes; degree-2 nodes release other degree-2 nodes and, further along, some degree-3 nodes, and so on. Another metaphor is a nuclear chain reaction: we would like the reaction to run to completion as a controlled burn, neither fizzling out nor exploding.

The All-At-Once distribution requires about k ln k received symbols, so there are about k ln k edges on average. We now aim for only about k received symbols. If we want k received symbols to cover all k data nodes, we still need about k ln k edges in total when the edges are chosen independently and uniformly at random. Of course, with higher-degree nodes the edges are not independent, but this is a good approximation if we keep the total number of edges the same as in the all-at-once analysis. Hence, each check node should have ln k edges in expectation:

∑_i i P_D(i) = ln k ≈ ∑_{i=1}^{k} 1/i.

Matching term by term, we may try i P_D(i) = 1/i, which gives a second attempt at the distribution:

P_D(i) = 1/i^2  for 1 ≤ i ≤ k.

This is a converging series, but unfortunately the distribution does not sum to 1 (∑_{i≥1} 1/i^2 = π^2/6). To get around this problem we could simply normalize the distribution, but there turns out to be a better way.

14.3.3 Ideal Soliton Distribution

P_D(1) = 1/k, and P_D(i) = 1 / (i(i - 1)) for 1 < i ≤ k.

The sum telescopes, and we can verify that the distribution does sum to 1:

∑_{i=1}^{k} P_D(i) = 1/k + ∑_{i=2}^{k} 1/(i(i - 1))
                   = 1/k + (1 - 1/2) + (1/2 - 1/3) + ... + (1/(k - 1) - 1/k)
                   = 1/k + (1 - 1/k) = 1.
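As a small numerical check (my own sketch, not in the notes), the Ideal Soliton distribution can be tabulated directly; its total mass is 1 and its mean degree is 1/k + H_{k-1}, i.e., on the order of ln k, matching the edge budget argued for above. The function name ideal_soliton and the choice k = 1000 are illustrative.

    import math

    def ideal_soliton(k):
        """Return p with p[i] = P_D(i) for 1 <= i <= k (index 0 unused)."""
        p = [0.0] * (k + 1)
        p[1] = 1.0 / k
        for i in range(2, k + 1):
            p[i] = 1.0 / (i * (i - 1))
        return p

    k = 1000
    p = ideal_soliton(k)
    print(sum(p))                                               # 1.0, up to floating-point error
    print(sum(i * p[i] for i in range(1, k + 1)), math.log(k))  # mean degree vs ln k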

We are going to prove that this distribution leads to perfect propagation.

Perfect Propagation

We index time by L, the number of data symbols still unknown. Let q(i, L) be the probability that an encoded symbol of degree i releases some data symbol when there are L data symbols still unknown. Initially (L = k), every degree-1 node releases a data symbol and no other node can release:

q(1, k) = 1,    q(i, k) = 0 for i > 1.

Now consider q(i, L) for L < k. The encoded symbol connects to i data nodes. Since it releases exactly at time L, exactly one of its neighbors is still unknown, one of its neighbors is the data node that was just decoded in the step that brought the number of unknowns down to L, and its remaining i - 2 neighbors were decoded earlier. For the denominator, from the k data nodes we pick the i neighbors in order, giving ∏_{j=0}^{i-1} (k - j) possibilities in total. (The ordering here is actually irrelevant, so we are artificially over-counting in the denominator; to compensate, we over-count in the numerator in the same way.) For the numerator, there are i possible positions and L possible choices for the unknown neighbor, (i - 1) possible positions for the node just decoded at the last time step, and ∏_{j=0}^{i-3} (k - L - 1 - j) ways to choose and order the remaining (i - 2) neighbors from the (k - L - 1) previously decoded nodes. Hence,

q(i, L) = [ i (i - 1) L ∏_{j=0}^{i-3} (k - L - 1 - j) ] / [ ∏_{j=0}^{i-1} (k - j) ].

Let r(L) be the release probability of a randomly chosen encoded symbol when there are still L unknown symbols:

r(L) = ∑_i P_D(i) q(i, L).

At L = k,

r(k) = P_D(1) q(1, k) = (1/k) · 1 = 1/k.
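As a quick sanity check (not in the original notes), specialize the formula to degree i = 2: the product in the numerator is empty, so

q(2, L) = [ 2 · 1 · L ] / [ k (k - 1) ] = 2L / (k (k - 1)),

which matches a direct count: a degree-2 check node releases at time L exactly when its two neighbors are the just-decoded node and one of the L still-unknown nodes, out of k(k - 1)/2 equally likely pairs, i.e., with probability L / (k(k - 1)/2).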

At L < k,

r(L) = ∑_{i=2}^{k} P_D(i) q(i, L)
     = ∑_{i=2}^{k} [ 1/(i(i - 1)) ] · [ i (i - 1) L ∏_{j=0}^{i-3} (k - L - 1 - j) ] / [ ∏_{j=0}^{i-1} (k - j) ]
     = (1/k) ∑_{i=2}^{k} [ L ∏_{j=0}^{i-3} (k - 1 - L - j) ] / [ ∏_{j=0}^{i-2} (k - 1 - j) ]
     = 1/k.

Professor Sahai foresees that you, the reader, are not very happy with the last step here, so he makes up a story to fully convince you that

∑_{i=2}^{k} [ L ∏_{j=0}^{i-3} (k - 1 - L - j) ] / [ ∏_{j=0}^{i-2} (k - 1 - j) ] = 1.

The story goes as follows: (k - 1) people are standing in a line; L of them are civilians and (k - 1 - L) of them are military targets. A sniper keeps shooting at the people until a civilian is hit. Each bullet hits one of the people still standing, uniformly at random, so no one is hit twice. What is the probability that the (i - 1)-th bullet is the first bullet to hit a civilian? Counting ordered choices of the (i - 1) people who are hit: there are L choices for the civilian hit by the (i - 1)-th bullet, and ∏_{j=0}^{i-3} (k - 1 - L - j) ordered ways to choose the (i - 2) military targets hit before. Dividing by the ∏_{j=0}^{i-2} (k - 1 - j) ordered ways to choose any (i - 1) people, the probability is

[ L ∏_{j=0}^{i-3} (k - 1 - L - j) ] / [ ∏_{j=0}^{i-2} (k - 1 - j) ].

These probabilities sum to 1 over all possible values of i, since the first civilian must be hit by the (i - 1)-th bullet for some 2 ≤ i ≤ k - L + 1:

∑_{i=2}^{k-L+1} [ L ∏_{j=0}^{i-3} (k - 1 - L - j) ] / [ ∏_{j=0}^{i-2} (k - 1 - j) ] = 1.

For k - L + 2 ≤ i ≤ k, the numerator product contains the factor 0, so

∑_{i=k-L+2}^{k} [ L ∏_{j=0}^{i-3} (k - 1 - L - j) ] / [ ∏_{j=0}^{i-2} (k - 1 - j) ] = 0,

and therefore

∑_{i=2}^{k} [ L ∏_{j=0}^{i-3} (k - 1 - L - j) ] / [ ∏_{j=0}^{i-2} (k - 1 - j) ] = 1.

This is exactly what we were trying to prove. This kind of argument is called a combinatorial proof, since it is proved by telling a story.
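For readers who would rather check the algebra numerically, here is a small sketch (mine, not part of the notes) that evaluates r(L) exactly with rational arithmetic for a small k and confirms it equals 1/k at every step; the function names q, P_D, r and the choice k = 12 are illustrative.

    from fractions import Fraction
    from math import prod

    def q(i, L, k):
        """Release probability of a degree-i symbol when L data symbols are still unknown."""
        num = i * (i - 1) * L * prod(Fraction(k - L - 1 - j) for j in range(i - 2))
        den = prod(Fraction(k - j) for j in range(i))
        return Fraction(num) / den

    def P_D(i, k):
        """Ideal Soliton distribution."""
        return Fraction(1, k) if i == 1 else Fraction(1, i * (i - 1))

    def r(L, k):
        return sum(P_D(i, k) * q(i, L, k) for i in range(2, k + 1))

    k = 12
    print(all(r(L, k) == Fraction(1, k) for L in range(1, k)))   # True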

Conclusion

What have we shown? Under the Ideal Soliton distribution, the probability that any given encoded symbol is released at any given time step is 1/k. Given that we start with about k encoded symbols, the expected number of symbols released at each time step is 1. If everything happens exactly as expected, we find exactly one symbol to release at each time step: the Ideal Soliton distribution describes a ripple that propagates at a constant rate. However, it cannot protect against fluctuations. The number of symbols released at L = k is only 1 in expectation, so there is a substantial probability that no degree-1 symbol exists at all and nothing triggers the whole process; similar fluctuations can kill the ripple at any later step.

In the next lecture, we will continue from here and modify the Ideal Soliton distribution. The new distribution, called the Robust Soliton distribution, will be able to absorb these fluctuations.

References

[1] M. Luby, "LT Codes," in Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2002.