
Tornado and Luby Transform Codes. Ashish Khisti, 6.454 Presentation, October 22, 2003

Background: Erasure Channel. Elias [1956] studied the erasure channel. [Figure: erasure channel diagram; each input symbol x_1, ..., x_k is received correctly with probability 1−β and erased ("?") with probability β, so m transmitted symbols yield about m(1−β) received symbols.] The capacity of the noiseless erasure channel is 1−β. No feedback is necessary to achieve capacity. A random linear code can achieve capacity: encoding O(n^2), decoding O(n^3). Applications: communication links over the Internet, storage media.

Classical MDS Codes. [Figure: codeword symbols c_1, c_2, ..., c_n; any k of the n symbols suffice.] Features: any set of k coordinates is an information set for an (n, k, d) MDS code; the receiver knows the codeword once it receives any k symbols and knows their positions; capacity-achieving codes. Drawbacks: Reed-Solomon (RS) codes require O(k^2) time for decoding; block codes need prior knowledge of the erasure probability.

Digital Fountain Approach. A new protocol for bulk data distribution. Scenario: one server, multiple receivers. Encoding: construct encoding symbols on the fly and send them whenever at least one receiver is listening. Decoding: collect the desired number of symbols from the server and reconstruct the original file. Goals: reliable, efficient, on demand, and tolerant.

Tornado Codes. Features: correct a (1−R)(1−ε) fraction of erasures over the BEC; time for encoding and decoding is proportional to n log(1/ε); very fast software implementations. Tradeoffs: the assumption of independent erasures is critical; high latency; low-rate implementations are less attractive; they are block codes, hence not suitable for a heterogeneous receiver population.

Irregular Bipartite Graph. Irregular random graphs are used for generating check symbols: (x_1, x_2, ..., x_n) → (x_1, ..., x_n, c_1, ..., c_{nβ}). [Figure: bipartite graph with input/message symbols x_1, ..., x_n on the left and check symbols c_1, ..., c_{nβ} on the right; e.g., c_1 = x_1 + x_2.] Degree sequences: left degree sequence (λ_1, λ_2, ..., λ_n); right degree sequence (ρ_1, ρ_2, ..., ρ_m). Definition: λ_k (ρ_k) is the fraction of edges that are incident on a left (right) node of degree k.

Irregular Graphs: Example. Given: (λ_1, λ_2) = (1/2, 1/2), (ρ_1, ρ_2) = (0, 1), and number of edges E = 4. Let l_i be the number of left nodes of degree i and r_i the number of right nodes of degree i, so l_i = λ_i E / i and r_i = ρ_i E / i. Then (l_1, l_2) = (2, 1) and (r_1, r_2) = (0, 2). A random permutation π between the edge sockets induces a uniform distribution over the ensemble.
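
As a quick numeric check of these relations, a minimal sketch (the names E, lam, and rho simply mirror the example above):

```python
# Node counts from edge-degree fractions: a degree-i node accounts for i
# edges, so l_i = lambda_i * E / i and r_i = rho_i * E / i.
E = 4                                  # total number of edges
lam = {1: 0.5, 2: 0.5}                 # left edge fractions (lambda_1, lambda_2)
rho = {1: 0.0, 2: 1.0}                 # right edge fractions (rho_1, rho_2)

l = {i: lam[i] * E / i for i in lam}   # -> {1: 2.0, 2: 1.0}
r = {i: rho[i] * E / i for i in rho}   # -> {1: 0.0, 2: 2.0}
print(l, r)
```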

Construction of Tornado Codes. [Figure: cascade of bipartite graphs B_0, B_1, ..., B_m followed by a conventional code C, with layer sizes n, βn, β^2 n, ..., β^{m+1} n; B_i: irregular graph, C: conventional code.] Code C(B_0, B_1, ..., B_m, C): each B_i is an irregular bipartite graph with the same degree sequences; C is a conventional rate (1−β) code with O(n^2) complexity; m is chosen so that the last layer is small (roughly √n symbols), so that C contributes only O(n) work. Length of the code C(B_0, ..., B_m, C): Σ_{i=0}^{m+1} n β^i + n β^{m+2}/(1−β) = n/(1−β). This is a rate (1−β) code with encoding/decoding complexity of O(n).
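
A small sketch checking that the series above telescopes to n/(1−β) (assuming the β^{m+1} n ≈ √n choice of m; the numbers are illustrative):

```python
import math

# Layer sizes of the Tornado cascade and the resulting overall rate.
n, beta = 640_000, 0.5
# pick m so that beta^(m+1) * n is roughly sqrt(n)
m = math.ceil(math.log(1 / math.sqrt(n), beta)) - 1
layers = [n * beta**i for i in range(m + 2)]   # message layer + B_0..B_m outputs
tail = n * beta**(m + 2) / (1 - beta)          # symbols added by the code C
total = sum(layers) + tail                     # telescopes to n / (1 - beta)
print(m, total, n / total)                     # n/total -> 1 - beta
```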

Linear Time Decoding Algorithm.
1. Find a check node c that is connected to only one source node s_k. If no such c exists, stop and declare an error.
(a) Set s_k = c.
(b) Find all check nodes c' that are neighbors of s_k and set c' = c' + s_k.
(c) Remove all edges connected to s_k.
2. Repeat step 1 until all source nodes are determined.
[Figure: six snapshots of the algorithm recovering source symbols s_1, s_2, s_3 on a small example graph.]
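
A minimal Python sketch of this decoder (a hypothetical interface: the graph is given as one neighbor set per check node, check_values holds the received check symbols, and unerased source symbols are assumed to have been XORed out of their checks already):

```python
def peel_decode(check_neighbors, check_values, num_sources):
    """Recover erased source symbols by repeatedly solving degree-one checks."""
    nbrs = [set(s) for s in check_neighbors]   # source indices feeding each check
    vals = list(check_values)                  # current value of each check
    source = [None] * num_sources
    progress = True
    while progress:
        progress = False
        for j in range(len(nbrs)):
            if len(nbrs[j]) == 1:              # step 1: a degree-one check
                k = nbrs[j].pop()
                source[k] = vals[j]            # (a) set s_k = c
                for j2 in range(len(nbrs)):    # (b)+(c): update neighbors of s_k
                    if k in nbrs[j2]:
                        vals[j2] ^= source[k]
                        nbrs[j2].discard(k)
                progress = True
    if any(v is None for v in source):
        raise ValueError("decoding failure: no degree-one check node found")
    return source
```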

What has to be solved? So far: identified the structure of the encoder as a cascade of irregular bipartite graphs, and suggested a candidate decoding algorithm with linear complexity. Goal: specify the set of degree sequences (λ_1, λ_2, ..., λ_n) and (ρ_1, ρ_2, ..., ρ_m) for which this simple decoding algorithm succeeds. Main contributions of the paper: 1. Develop mathematical conditions on the degree sequences under which this decoding scheme succeeds. 2. Provide explicit degree sequences that achieve the capacity of the BEC.

Conditions on Degree Sequences. Define ρ(x) = Σ_i ρ_i x^{i−1} and λ(x) = Σ_i λ_i x^{i−1}; δ: erasure probability of the memoryless channel. Necessary condition: if the decoding algorithm succeeds in recovering all message symbols, then ρ(1 − δλ(x)) > 1 − x for all x in (0, 1]. Approach: compute the expected value of the fraction of edges with degree one on the right and require that it is > 0. Sufficient condition: the above condition is also sufficient if we impose λ_1 = λ_2 = 0. Approach: the proof uses tools from statistical mechanics to show that the variance in the degree distribution is small.
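
A hedged numeric illustration of this condition (a sketch: λ and ρ are given as lists of edge fractions, and the inequality is checked on a finite grid rather than on all of (0, 1]; the example pair is the standard (3,6)-regular ensemble, consistent with its well-known BEC threshold near 0.429):

```python
def edge_poly(fracs, x):
    # fracs[i] = fraction of edges of degree i+1, so poly(x) = sum_i fracs[i]*x^i
    return sum(f * x**i for i, f in enumerate(fracs))

def decoding_condition(lam, rho, delta, grid=10_000):
    """Check rho(1 - delta*lambda(x)) > 1 - x at x = 1/grid, ..., 1."""
    return all(edge_poly(rho, 1 - delta * edge_poly(lam, x)) > 1 - x
               for x in (j / grid for j in range(1, grid + 1)))

# example: lambda(x) = x^2, rho(x) = x^5 (a (3,6)-regular graph)
print(decoding_condition([0, 0, 1], [0, 0, 0, 0, 0, 1], delta=0.42))  # True
print(decoding_condition([0, 0, 1], [0, 0, 0, 0, 0, 1], delta=0.43))  # False
```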

Capacity Achieving Distribution. Fix an integer D > 0. Let λ_i = 1/(H(D)(i−1)) for i = 2, 3, ..., D+1, where H(D) = Σ_{j=1}^{D} 1/j, and ρ_i = e^{−α} α^{i−1}/(i−1)! for i = 1, 2, 3, .... The average degree of the left nodes is a_λ = H(D)(D+1)/D ≈ log(D); the average degree of the right nodes is a_ρ = α e^α/(e^α − 1), with α chosen so that a_ρ = a_λ/β. Intuition: the Poisson distribution is natural if all the edges from the left uniformly choose the right nodes, and this distribution is preserved when edges are successively removed from the graph. The heavy-tail distribution produces some message nodes of high degree that get decoded first and remove many edges from the graph.

Capacity Achieving Distribution. Note that ρ(x) = e^{α(x−1)} and λ(x) ≈ −log(1−x)/H(D). For the above choice of ρ(x) and λ(x), it is easy to verify that ρ(1 − δλ(x)) > 1 − x for all x in (0, 1] whenever δ < β/(1 + 1/D). Let D = 1/ε. It follows that a β(1−ε) fraction of erasures can be corrected by this rate (1−β) code. The average degree, roughly log(D), implies that the number of operations in decoding is proportional to n log(1/ε).
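
A sketch constructing the heavy-tail/Poisson pair for a given D and β and checking the condition numerically (illustrative names; α is found by bisection from the edge-balance relation a_ρ = a_λ/β stated above):

```python
import math

def heavy_tail_poisson(D, beta):
    """Return the left edge fractions lambda_i and the Poisson parameter alpha."""
    H = sum(1 / j for j in range(1, D + 1))              # harmonic sum H(D)
    lam = {i: 1 / (H * (i - 1)) for i in range(2, D + 2)}
    a_lam = H * (D + 1) / D                              # average left degree
    target = a_lam / beta                                # required average right degree
    lo, hi = 1e-9, 100.0                                 # bisect a*e^a/(e^a - 1) = target
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * math.exp(mid) / (math.exp(mid) - 1) < target:
            lo = mid
        else:
            hi = mid
    return lam, (lo + hi) / 2

def holds(lam, alpha, delta, grid=5000):
    """Check rho(1 - delta*lambda(x)) > 1 - x with rho(x) = exp(alpha*(x-1))."""
    lam_poly = lambda x: sum(f * x**(i - 1) for i, f in lam.items())
    return all(math.exp(-alpha * delta * lam_poly(x)) > 1 - x
               for x in (j / grid for j in range(1, grid + 1)))

lam, alpha = heavy_tail_poisson(D=100, beta=0.5)
print(holds(lam, alpha, delta=0.99 * 0.5 / (1 + 1 / 100)))  # True: delta < beta/(1+1/D)
```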

Linear Programming Approach. Fix (λ_1, λ_2, ..., λ_n) and δ; the objective is to find (ρ_1, ρ_2, ..., ρ_m) for some fixed m. Let x_i = i/N for i = 1, 2, ..., N. We have the following constraints: ρ(1 − δλ(x_i)) ≥ 1 − x_i; ρ_i ≥ 0; and ρ(1) = 1. Minimize Σ_{i=1}^{N} (ρ(1 − δλ(x_i)) − (1 − x_i)). The solution for ρ(x) is feasible if the inequality holds for all x in (0, 1]. Once a feasible solution has been found, the optimal δ is found by a binary search. An iterative approach is suggested that uses the dual condition δλ(1 − ρ(1 − y)) < y for all y in (0, 1].
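
A hedged sketch of this LP using scipy.optimize.linprog (illustrative throughout: the grid size N, maximum right degree m, and the example λ are assumptions, not the paper's values):

```python
import numpy as np
from scipy.optimize import linprog

def find_rho(lam_poly, delta, m=30, N=200):
    """Find edge fractions rho_1..rho_m with rho(1 - delta*lam(x_i)) >= 1 - x_i."""
    xs = np.arange(1, N + 1) / N                 # grid x_i = i/N
    y = 1 - delta * lam_poly(xs)                 # arguments of rho on the grid
    A = np.vander(y, m, increasing=True)         # A[i, j] = y_i**j, coeff of rho_{j+1}
    c = A.sum(axis=0)                            # minimize sum_i rho(1 - delta*lam(x_i))
    res = linprog(c,
                  A_ub=-A, b_ub=-(1 - xs),       # rho(1 - delta*lam(x_i)) >= 1 - x_i
                  A_eq=np.ones((1, m)), b_eq=[1.0],   # rho(1) = 1
                  bounds=[(0, None)] * m)        # rho_j >= 0
    return res.x if res.success else None        # None => infeasible: lower delta

# example: left-regular lambda(x) = x^2, erasure probability delta = 0.4
rho = find_rho(lambda x: x**2, delta=0.4)
```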

Practical Implementations. [Figure: Tornado Z code cascade G_0, G_1, G_2 with layer sizes 640K, 320K, 160K, 160K.] Rate 1/2 code; takes 640,000 packets (each 256 bytes) as input. Only three cascade stages have been used. G_0 and G_1 use the heavy-tail/Poisson distribution as noted. G_2 cannot use a standard quadratic-time code; its degree distribution is obtained through linear programming. On a 200 MHz Pentium machine, the decoding operation takes 1.73 seconds.

Issues. The assumption of independent erasures is critical in the design of Tornado codes, so deep interleaving and very long block lengths are necessary. High latency is incurred in encoding and decoding operations, since both must be delayed by at least one block size. Heavy memory usage: decoding each block of Tornado Z requires 32 MB of RAM. Since they are block codes, they have to be optimized for a particular rate; the number of encoding symbols is fixed once the input block length and rate are fixed.

Luby Transform Codes. Features: the k input symbols can be recovered from any set of k + O(√k log^2(k/δ)) encoding symbols with probability 1 − δ. Encoding time: O(log(k/δ)) per encoding symbol. Decoding time: O(k log(k/δ)). These codes are rateless: the number of distinct encoding symbols that can be generated is extremely large, and encoding symbols are generated on the fly. The construction does not make use of the channel erasure probability and hence can optimally serve heterogeneous receivers.

Encoding of LT Codes. Fix a degree distribution ρ(d). To produce each encoding symbol: generate the degree D ~ ρ(d); for each of the D edges, randomly pick one input symbol node; compute the XOR of all the D neighbors and assign this value to the encoding symbol. How does the decoder know the neighbors of an encoding symbol it receives? This information can be explicitly included as an overhead in each packet, or pseudo-randomness can be exploited to duplicate the encoding process at the receiver; the receiver has to be given the seed and/or keys associated with the process.
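
A minimal sketch of this encoder (hypothetical names; a seed shared with the receiver lets it regenerate each symbol's degree and neighbor set, as described above):

```python
import random

def lt_encode_symbol(data, sample_degree, seed):
    """Produce one LT encoding symbol from data (a list of equal-size ints)."""
    rng = random.Random(seed)           # receiver reruns this with the same seed
    d = sample_degree(rng)              # generate the degree D ~ rho(d)
    neighbors = rng.sample(range(len(data)), d)   # pick D distinct input symbols
    value = 0
    for i in neighbors:                 # XOR of all D neighbors
        value ^= data[i]
    return neighbors, value
```

Returning the neighbor list mirrors the packet-overhead option; under the shared-seed option only the value needs to be sent.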

Decoding of LT Codes. The decoding process is virtually the same as that of Tornado codes. At the start, release all encoding symbols of degree 1; their neighbors are now covered and form a ripple. In each subsequent step, one message symbol from the ripple is processed: it is removed as a neighbor of the encoding symbols, and if any encoding symbol now has degree one, it is released; if its neighbor is not already in the ripple, it gets added to the ripple. The process ends when the ripple is empty. If some message symbols remain unprocessed, this is a decoding failure.
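
A sketch of the ripple-driven decoder (illustrative; each received symbol is a (neighbors, value) pair such as the encoder sketch above produces):

```python
def lt_decode(symbols, k):
    """Peel LT symbols; symbols is a list of (neighbor_list, xor_value) pairs."""
    nbrs = [set(s) for s, _ in symbols]
    vals = [v for _, v in symbols]
    where = [[] for _ in range(k)]            # input i -> encoding symbols using i
    for j, s in enumerate(nbrs):
        for i in s:
            where[i].append(j)
    message = [None] * k
    ripple = {}                               # covered, not yet processed: i -> value
    for j, s in enumerate(nbrs):              # release all degree-one symbols
        if len(s) == 1:
            ripple[next(iter(s))] = vals[j]
    while ripple:
        i, v = ripple.popitem()               # process one symbol from the ripple
        message[i] = v
        for j in where[i]:                    # remove i as a neighbor everywhere
            if i in nbrs[j]:
                nbrs[j].discard(i)
                vals[j] ^= v
                if len(nbrs[j]) == 1:         # newly released encoding symbol
                    i2 = next(iter(nbrs[j]))
                    if message[i2] is None and i2 not in ripple:
                        ripple[i2] = vals[j]
    if any(v is None for v in message):
        raise ValueError("decoding failure: the ripple emptied early")
    return message
```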

LT Analysis 1 (ρ(1) = 1). How many encoding symbols (each of degree 1) will guarantee that all message symbol nodes are covered with probability > 1 − δ? Answer: k log(k/δ). The probability that a given message node is not covered is (1 − 1/k)^{k log(k/δ)} ≤ δ/k; by the union bound, the desired result follows. But k log(k/δ) encoding symbols is unacceptable. Moreover, since all edges are randomly incident on message nodes, k log(k/δ) edges are required to cover all the nodes in any case.
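
The calculation behind this answer, written out (logs are natural):

```latex
\Pr[\text{a fixed node uncovered}]
  = \Bigl(1 - \tfrac{1}{k}\Bigr)^{k \log(k/\delta)}
  \le e^{-\log(k/\delta)} = \frac{\delta}{k},
\qquad
\Pr[\text{some node uncovered}] \le k \cdot \frac{\delta}{k} = \delta .
```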

LT Analysis 2. Suppose L input symbols remain unprocessed during a decoding step. Any encoding symbol is equally likely to get released, independent of all other encoding symbols, and if an encoding symbol is released, it is equally likely to cover any of the L symbols. Define q(i, L) = probability that an encoding symbol of degree i is released when L input symbols remain unprocessed. Then q(1, k) = 1; q(i, L) = [i(i−1) · L · ∏_{j=0}^{i−3} (k − (L+1) − j)] / [∏_{j=0}^{i−1} (k − j)] for i = 2, ..., k and L = k − i + 1, ..., 1; and q(i, L) = 0 otherwise.

LT Analysis 3. Ideal Soliton distribution: ρ(1) = 1/k and ρ(i) = 1/(i(i−1)) for i = 2, ..., k. Let r(L) be the probability that an encoding symbol is released when L input symbols remain unprocessed: r(L) = Σ_i ρ(i) q(i, L) = 1/k for every L (the products telescope). Thus at each step, with k encoding symbols collected, we expect one encoding symbol to be released; the size of the ripple at each step is one.
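
A quick check that the ideal soliton weights sum to one (the sum telescopes):

```latex
\sum_{i=1}^{k} \rho(i)
  = \frac{1}{k} + \sum_{i=2}^{k} \frac{1}{i(i-1)}
  = \frac{1}{k} + \sum_{i=2}^{k}\Bigl(\frac{1}{i-1} - \frac{1}{i}\Bigr)
  = \frac{1}{k} + 1 - \frac{1}{k} = 1 .
```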

Properties of the Soliton Distribution. At each step one encoding symbol is released, so only k encoding symbols are needed on average to retrieve the k input symbols. The expected number of edges in the graph is k log(k). The ideal soliton distribution thus compresses the number of encoding symbols to the minimum possible value while keeping the number of edges in the graph minimal. However, the ideal soliton distribution does not work well in practice: since the expected size of the ripple is one, it is extremely sensitive to small variations.

Robust Soliton Distribution. Maintain the size of the ripple at a larger value R ≈ c √k log(k/δ). Define the following distribution: τ(i) = R/(ik) for i = 1, ..., k/R − 1; τ(i) = R log(R/δ)/k for i = k/R; τ(i) = 0 for i = k/R + 1, ..., k. Intuition: the value of τ(1) is chosen so that R encoding symbols are expected to be released initially, generating a ripple of size R. When L input symbols remain unprocessed, the most probable symbols to be released have degree k/L.

Robust Soliton Distribution (contd.). When L = R, we require that all the unprocessed symbols be covered; this is ensured by choosing τ(k/R) = R log(R/δ)/k. The probability that any covered symbol gets included in the ripple is (L−R)/L, so we need L/(L−R) releases to expect the size of the ripple to remain the same. Thus the fraction of encoding symbols of degree i = k/L should be proportional to (L/(L−R)) · 1/(i(i−1)) = k/(i(i−1)(k − iR)) = 1/(i(i−1)) + R/((i−1)(k − iR)) ≈ ρ(i) + τ(i).

Robust Soliton Distribution. The robust soliton distribution is given by μ(i) = (τ(i) + ρ(i))/ς, where ς = Σ_i (τ(i) + ρ(i)). One can show that: the average number of encoding symbols needed to recover the message symbols is k + O(√k log^2(k/δ)); decoding takes time proportional to O(k log(k/δ)) and encoding takes time O(log(k/δ)) per symbol; and the probability that the decoding algorithm fails to recover the message symbols is less than δ.
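
A sketch constructing the ideal and robust soliton distributions from the definitions above (c and delta are free parameters here; the names are illustrative):

```python
import math

def ideal_soliton(k):
    rho = [0.0] * (k + 1)                    # rho[i] = weight of degree i (1-indexed)
    rho[1] = 1.0 / k
    for i in range(2, k + 1):
        rho[i] = 1.0 / (i * (i - 1))
    return rho

def robust_soliton(k, c=0.1, delta=0.05):
    R = c * math.sqrt(k) * math.log(k / delta)   # target ripple size
    pivot = int(round(k / R))
    tau = [0.0] * (k + 1)
    for i in range(1, pivot):
        tau[i] = R / (i * k)                     # tau(i) = R/(ik) for i < k/R
    tau[pivot] = R * math.log(R / delta) / k     # the spike at i = k/R
    rho = ideal_soliton(k)
    Z = sum(r + t for r, t in zip(rho, tau))     # normalizer (the slide's sigma)
    return [(r + t) / Z for r, t in zip(rho, tau)]

mu = robust_soliton(k=10_000)
print(sum(mu))                                   # ~1.0
```

Sampling a degree for the encoder sketch earlier amounts to drawing from mu, e.g. with random.choices(range(len(mu)), weights=mu).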

Conclusions. Tornado codes achieve linear-time encoding and decoding but cannot solve the heterogeneous user case. LT codes can simultaneously serve heterogeneous users, but require O(k log k) time. Raptor codes (2000) achieve the best of both worlds and will be discussed next week.