An Introduction to Algorithmic Coding Theory


An Introduction to Algorithmic Coding Theory. M. Amin Shokrollahi, Bell Laboratories.

Part 1: Codes

A puzzle: What do the following problems have in common?

Problem 1: Information Transmission. (Figure: a MESSAGE passes through a scrambled channel alphabet and arrives as MASSAGE?)

Problem 2: Football Pool Problem. (Figure: a betting slip of Bundesliga matches, e.g. Bayern München vs. 1. FC Kaiserslautern, each to be tipped 1, 0, or 2.)

Problem 3: In-Memory Database Systems. (Figure: user memory with malloc'd regions next to code memory; the database becomes corrupted.)

Problem 4: Bulk Data Distribution

What do they have in common? They can all be solved using (algorithmic) coding theory.

Basic Idea of Coding. Add redundancy to be able to correct! Objectives: add as little redundancy as possible; correct as many errors as possible. (Figure: MESSAGE encoded, transmitted, and received as MASSAGE.)

Codes. A code of block length n over the alphabet GF(q) is a set of vectors in GF(q)^n. If the set forms a vector space over GF(q), then the code is called linear: an [n,k]_q-code.

The Encoding Problem. Map source words to codewords; efficiency matters!

Linear Codes: Generator Matrix. A k x n generator matrix maps a length-k source word to a length-n codeword; encoding costs O(n^2).
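Generator-matrix encoding is a single matrix-vector product over GF(2). A minimal sketch; the systematic [7,4] matrix below is an illustrative choice, not one taken from the slides:

```python
# A minimal sketch of generator-matrix encoding over GF(2).
# G is an illustrative systematic [7,4] generator matrix (k = 4, n = 7).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(source, G):
    """Multiply a length-k source word by the k x n generator matrix mod 2."""
    n = len(G[0])
    return [sum(s * row[j] for s, row in zip(source, G)) % 2 for j in range(n)]

codeword = encode([1, 0, 1, 1], G)
```

Because G is systematic, the first four codeword symbols repeat the source word and the last three are parity bits.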

Linear Codes: Parity-Check Matrix. Membership of a codeword is tested with the parity-check matrix in O(n^2), after O(n^3) preprocessing.

The Decoding Problem: Maximum Likelihood Decoding. (Figure: a received word such as "a x c d" is compared against every codeword in a list; the Hamming distances to the candidates are tallied, and the closest codeword wins.)

Hamming Distance. The Hamming distance between two vectors of equal dimension is the number of positions at which they differ. An [n,k,d]_q-code is an [n,k]_q-code with minimum distance d.
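The definition translates directly into code; the two strings below are the usual textbook illustration:

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length sequences differ."""
    assert len(u) == len(v)
    return sum(a != b for a, b in zip(u, v))

d = hamming_distance("karolin", "kathrin")  # differ in positions 3, 4, 5
```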

Maximum Likelihood Decoding. Given the received word, find a codeword that has least Hamming distance to it. Intractable in general.

Worst-Case Error Correction. An [n,k,d]_q-code is capable of correcting up to e = (d-1)/2 errors, in whatever locations! (Figure: Hamming balls of radius e around the codewords are disjoint, since distinct codewords are at distance at least d.)

Errors in Known Locations: Erasures. An [n,k,d]_q-code is capable of correcting up to d-1 erasures, in whatever locations, if the locations are known.

Projective plane over GF(2). (Figure: the seven points and seven lines of the Fano plane, labeled 1 through 7.) Minimum distance = 4.

Hamming Code: a [7,4,3]_2-code. (Figure: the seven bit positions arranged on the Fano plane.)

A Solution to the Football Match Problem. The [4,2,3]_3 Hamming code, given by a 2 x 4 generator matrix over GF(3). (The slide lists its nine codewords over {0, 1, 2}.)

A Solution to the Football Match Problem. This Hamming code is perfect: Hamming balls of radius 1 fill the space GF(3)^4: 9 * sum_{i=0}^{1} C(4,i) * 2^i = 9 * (1 + 4*2) = 81 = 3^4. Any vector in GF(3)^4 has Hamming distance at most 1 to a codeword. (Figure: a tip sheet is always within one fixed game of a codeword.)
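The sphere-packing count above can be checked mechanically (the parameters are those of the [4,2,3]_3 code on this slide):

```python
from math import comb

# Sphere-packing count for the [4,2,3]_3 Hamming code: 9 codewords,
# Hamming balls of radius 1 in GF(3)^4 (each of the 4 positions can be
# changed to one of q-1 = 2 other symbols).
ball = sum(comb(4, i) * 2**i for i in range(2))  # 1 + 4*2 = 9 vectors per ball
total = 9 * ball
assert total == 3**4  # the balls exactly fill the space: the code is perfect
```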

Bounds. How can we prove optimality of codes? (Fix any two of the three parameters n, k, d and maximize the third.) 1. Hamming bound: sum_{i=0}^{e} C(n,i)(q-1)^i <= q^{n-k}. Equality: perfect codes. 2. Singleton bound: d + k <= n + 1. Equality: MDS codes. 3. Other, more refined bounds...

Perfect Codes. Have been completely classified by van Lint and Tietäväinen. Essentially: Hamming codes and Golay codes. Not desirable in communication scenarios.

MDS Codes. Not classified completely. Open problem: given dimension k and field size q, determine the maximum block length of an MDS code. (Conjecturally q + 1 or q + 2.) MDS codes are desirable in practice in worst-case scenarios if efficient encoding and decoding are available. Prototype: Reed-Solomon codes.

Reed-Solomon Codes: Applications. 1. Satellite communication, 2. Hard disks, 3. Compact Discs, Digital Versatile Discs, Digital Audio Tapes, 4. Wireless communication, 5. Secret sharing, 6. ...

Reed-Solomon Codes: Definitions. Choose n distinct elements x_1, ..., x_n in GF(q). The Reed-Solomon code is the image of the map GF(q)[x]_{<k} -> GF(q)^n, f -> (f(x_1), ..., f(x_n)). Block length = n. Dimension? Minimum distance?

Reed-Solomon Codes: Parameters. Theorem: a nonzero polynomial of degree m over a field has at most m zeros in that field.

Reed-Solomon Codes: Dimension. For the map GF(q)[x]_{<k} -> GF(q)^n, f -> (f(x_1), ..., f(x_n)): the kernel is trivial if k <= n, since a nonzero polynomial of degree < k has at most k-1 zeros. Dimension: k (if k <= n).

Reed-Solomon Codes: Minimum Distance. The maximal number of zeros in a nonzero codeword is k-1, since we evaluate polynomials of degree at most k-1. So the minimum distance is at least n-k+1, hence equal to n-k+1 by the Singleton bound: an MDS code!

Reed-Solomon Codes: Encoding. f -> (f(x_1), ..., f(x_n)) is easy to compute: O(n^2) using the naive algorithm, O(n log^2(n) log log(n)) using fast algorithms.
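The naive encoder is just polynomial evaluation at each point. A sketch over the toy prime field GF(11); the field, the evaluation points, and the message are illustrative choices:

```python
# Naive Reed-Solomon encoding over GF(11) by Horner evaluation, O(nk).
p = 11
xs = [1, 2, 3, 4, 5, 6, 7, 8]   # n = 8 distinct elements of GF(11)
f = [3, 2]                      # message: coefficients of f(x) = 3 + 2x, k = 2

def rs_encode(coeffs, points, p):
    """Evaluate the message polynomial at every evaluation point."""
    def horner(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        return acc
    return [horner(x) for x in points]

codeword = rs_encode(f, xs, p)
```

Any k of the n symbols determine f by interpolation, which is exactly the MDS property of the previous slide.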

Reed-Solomon Codes: Decoding. No efficient maximum likelihood decoding known. Concentrate on bounded-distance decoding: the number of agreements between transmitted and received word is at least (n+k)/2, i.e. the number of disagreements is at most (n-k)/2.

Welch-Berlekamp Algorithm. Transmitted word: (f(x_1), ..., f(x_n)). Received word: (y_1, ..., y_n). Number of agreements >= (n+k)/2. Find f!

Welch-Berlekamp Algorithm. Step 1: Find g(x) in GF(q)[x]_{<(n+k)/2} and h(x) in GF(q)[x]_{<=(n-k)/2}, not both zero, such that for i = 1, ..., n: g(x_i) - y_i h(x_i) = 0. (Solving a system of linear equations!) Step 2: Then f = g/h!

Welch-Berlekamp Algorithm: Proof. Set H(x) := g(x) - f(x)h(x). The degree of H(x) is < (n+k)/2. If y_i = f(x_i), then H(x_i) = 0, so H(x) has at least (n+k)/2 zeros. Hence H(x) is identically zero, and f(x) = g(x)/h(x).

Welch-Berlekamp Algorithm: Running Time. Step 1: solving a homogeneous n x (n+1) system of equations, O(n^3); can be reduced to O(n^2) (Welch-Berlekamp, 1983; displacement method, Olshevsky-Shokrollahi, 1999). Step 2: polynomial division, O(n^2).
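The two steps can be sketched end to end over a toy field. Everything below (GF(11), n = 8, k = 2, the message, and the three error positions) is an illustrative choice; Step 1 uses plain O(n^3) Gaussian elimination, not the fast variants cited on the slide:

```python
# Sketch of Welch-Berlekamp decoding over GF(11); parameters are illustrative.
P = 11
XS = [1, 2, 3, 4, 5, 6, 7, 8]   # evaluation points, n = 8
K = 2                           # dimension

def nullspace_vector(rows, ncols, p=P):
    """Gaussian elimination mod p; return one nonzero kernel vector."""
    m = [r[:] for r in rows]
    pivots, r = {}, 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(m)) if m[i][c] % p), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        inv = pow(m[r][c], p - 2, p)           # modular inverse via Fermat
        m[r] = [v * inv % p for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] % p:
                f = m[i][c]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[r])]
        pivots[c] = r
        r += 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0] * ncols
    v[free] = 1
    for c, row in pivots.items():
        v[c] = (-m[row][free]) % p
    return v

def poly_divmod(num, den, p=P):
    """Polynomial division over GF(p); returns (quotient, remainder)."""
    num = num[:]
    while den and den[-1] == 0:
        den = den[:-1]
    q = [0] * max(len(num) - len(den) + 1, 1)
    inv = pow(den[-1], p - 2, p)
    for i in range(len(num) - len(den), -1, -1):
        q[i] = num[i + len(den) - 1] * inv % p
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - q[i] * d) % p
    return q, num

def welch_berlekamp(ys, n=len(XS), k=K, p=P):
    dg = (n + k) // 2          # g has dg coefficients (deg g < (n+k)/2)
    dh = (n - k) // 2 + 1      # h has dh coefficients (deg h <= (n-k)/2)
    # Step 1: one linear condition g(x_i) - y_i h(x_i) = 0 per position.
    rows = [[pow(x, j, p) for j in range(dg)] +
            [(-y * pow(x, j, p)) % p for j in range(dh)]
            for x, y in zip(XS, ys)]
    sol = nullspace_vector(rows, dg + dh)
    g, h = sol[:dg], sol[dg:]
    # Step 2: f = g/h; the division is exact whenever e <= (n-k)/2.
    f, rem = poly_divmod(g, h)
    assert all(c == 0 for c in rem)
    return f[:k]

# Codeword of f(x) = 3 + 2x with 3 of the 8 positions corrupted:
received = [6, 7, 9, 1, 2, 4, 0, 8]
recovered = welch_berlekamp(received)
```

The homogeneous system has 9 unknowns and 8 equations, so a nontrivial solution always exists; the proof on the previous slide guarantees that any such solution yields the same f.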

Welch-Berlekamp Algorithm: Generalization. Has been generalized to more than (n-k)/2 errors (list decoding: Sudan, 1997; Guruswami-Sudan, 1999). Step 2 then requires factorization of bivariate polynomials, which can be done efficiently (Gao-Shokrollahi, 1999; Olshevsky-Shokrollahi, 1999).

A Solution to the In-Memory Database Problem. (Figure: the user-memory database is augmented with redundant code memory, so corrupted malloc'd regions can be recovered.)

Reed-Solomon Codes: Generalization. Disadvantage of RS codes: GF(q) must be large to accommodate many evaluation points, so long codes are impossible. Instead, interpret GF(q) as the affine line over itself and generalize to more complicated algebraic curves. These lead to the best known codes in terms of minimum distance, dimension, and block length, and the above algorithms can be generalized to these algebraic-geometric codes.

Probabilistic Methods

Channels. (Figure: an input alphabet, an output alphabet, and transition probabilities such as 0.2, 0.3, 0.5 on the edges between them.)

Entropy and Mutual Information. X and Y are discrete random variables on alphabets X and Y with distributions p(x) and p(y); p(x,y) is their joint distribution. Entropy of X: H(X) = -sum_{x in X} p(x) log p(x). Mutual information: I(X;Y) = sum_{y in Y} sum_{x in X} p(x,y) log [p(x,y) / (p(x)p(y))].

Entropy and Mutual Information. H(X) is the amount of uncertainty of the random variable X. I(X;Y) is the reduction in the uncertainty of X due to the knowledge of Y.
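Both definitions compute directly from a joint distribution. The example below is made up: X is a fair bit and Y is X passed through a BSC with crossover probability 0.1:

```python
from math import log2

# Joint distribution of (X, Y): fair input, crossover probability 0.1.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

def entropy(dist):
    """H = -sum p log2 p over the support of the distribution."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X;Y) = sum p(x,y) log2[p(x,y) / (p(x)p(y))]."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

H = entropy({0: 0.5, 1: 0.5})    # 1 bit of uncertainty in the fair input
I = mutual_information(joint)    # 1 - h(0.1), about 0.531 bits recovered
```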

Capacity. The capacity of a channel with input alphabet X, output alphabet Y, and probability transition matrix p(y|x) is C = max_{p(x)} I(X;Y), where the maximum is over all possible input distributions p(x).

Examples of Capacity: the BEC. (Figure: 0 and 1 pass through unchanged with probability 1-p and are erased to E with probability p.) Capacity = 1 - p.

Examples of Capacity: the BSC. (Figure: each bit is flipped with probability p and preserved with probability 1-p.) Capacity = 1 + p log_2(p) + (1-p) log_2(1-p).
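The two capacity formulas are one-liners and worth checking at the extremes:

```python
from math import log2

def bec_capacity(p):
    """Binary erasure channel: C = 1 - p."""
    return 1 - p

def bsc_capacity(p):
    """Binary symmetric channel: C = 1 + p*log2(p) + (1-p)*log2(1-p)."""
    if p in (0, 1):
        return 1.0          # a deterministic flip can be undone
    return 1 + p * log2(p) + (1 - p) * log2(1 - p)
```

At p = 1/2 the BSC output is independent of the input and the capacity drops to 0; a rate-1/2 BSC code can at best tolerate crossover probability about 0.11.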

Capacity and Communication. Shannon's Coding Theorem, 1948: consider a channel with capacity C. For any rate R < C there exists a sequence of codes of rate R such that the probability of error of maximum likelihood decoding for these codes approaches zero as the block length approaches infinity. The condition R <= C is necessary and sufficient.

Problems. How to find such sequences of codes? (Random codes, concatenated codes, ...) How to decode efficiently? This has been open for almost 50 years: Low-Density Parity-Check codes.

Part 2: Low-Density Parity-Check Codes

Low-Density Parity-Check Codes. Gallager 1963; Zyablov 1971; Zyablov-Pinsker 1976; Tanner 1981; Turbo codes 1993 (Berrou-Glavieux-Thitimajshima).

Sipser-Spielman, Spielman 1995; MacKay-Neal, MacKay 1995; Luby-Mitzenmacher-Shokrollahi-Spielman-Stemann 1997; Luby-Mitzenmacher-Shokrollahi-Spielman 1998; Richardson-Urbanke 1999; Richardson-Shokrollahi-Urbanke 1999.

Code Construction. Codes are constructed from sparse bipartite graphs.

Code Construction. Any binary linear code has a graphical representation: message bits on the left, and a check node on the right for each parity equation (e.g. a + b + c = 0). But not every code can be represented by a sparse graph.

Parameters. With n message nodes and r check nodes: Rate >= (n - r)/n = 1 - (average left degree)/(average right degree).

Dual Construction. (Figure: source bits a, ..., h on the left; each redundant bit on the right is the parity of its neighbors.) Encoding time is proportional to the number of edges.

Algorithmic Issues. Encoding? Linear time for the dual construction; quadratic time (after preprocessing) for the Gallager construction. More later! Decoding? Depends on the channel and on the fraction of errors.

Decoding on a BSC: Flipping. (Figure: a bit whose checks are mostly unsatisfied is flipped; satisfied checks are distinguished from unsatisfied ones.)

Decoding on a BSC: Gallager Algorithm A (message passing). Message node with received bit b and incoming check messages x, y, z, u: m = x if x = y = z = u, else m = b. Check node with incoming messages x, y, z, u: m = x XOR y XOR z XOR u.
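The two update rules are tiny functions; the degree-4 shape matches the slide's picture, but the rules work for any number of incoming messages:

```python
from functools import reduce

def message_node_update(b, incoming):
    """Gallager A at a message node: forward the received bit b unless all
    incoming check messages agree, in which case forward their common value."""
    if all(m == incoming[0] for m in incoming):
        return incoming[0]
    return b

def check_node_update(incoming):
    """Gallager A at a check node: send the parity (XOR) of the inputs."""
    return reduce(lambda a, b: a ^ b, incoming)
```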

Decoding on a BSC: Belief Propagation. Message node with channel value b: m = x + y + z + u + b. Check node, via the hyperbolic transform: m = x * y * z * u, where (a,b) * (c,d) := (a + c, b + d mod 2) on (log-magnitude, sign) pairs. Messages are log-likelihood ratios.
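The slide's hyperbolic (sign, log-magnitude) combination at the check node is, to my understanding, equivalent to the standard tanh rule on LLRs, which is easier to write down directly; a sketch:

```python
from math import tanh, atanh

def variable_update(channel_llr, incoming):
    """Variable-to-check message: channel LLR plus the other incoming LLRs."""
    return channel_llr + sum(incoming)

def check_update(incoming):
    """Check-to-variable message via the tanh rule: 2*atanh(prod tanh(m/2)).
    Equivalent to adding log-magnitudes and XOR-ing signs, as on the slide."""
    prod = 1.0
    for m in incoming:
        prod *= tanh(m / 2)
    return 2 * atanh(prod)
```

A very confident incoming message (large |LLR|) leaves the check output dominated by the least reliable input, which is the intuition behind the min-sum approximation.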

Optimality of Belief Propagation. Belief propagation is bit-optimal if the graph has no loops: it maximizes the posterior P(c_m = b | y) = sum_{c in C: c_m = b} P(c | y).

Performance on a (3,6)-graph. Shannon limit: 11%. Flipping algorithm: ?%. Gallager A: 4%. Gallager B: 4% (6.27%). Erasure decoder: 7%. Belief propagation: 8.7% (10.8%).

The Binary Erasure Channel (BEC). (Figure: each bit passes through unchanged with probability 1-p and is erased to E with probability p.)

Decoding on a BEC (Luby-Mitzenmacher-Shokrollahi-Spielman-Stemann). Message node: m = the known value if any of b, x, y, z, u is not an erasure, else m = erasure. Check node: m = x XOR y XOR z XOR u if none of x, y, z, u is an erasure, else m = erasure.

Decoding on a BEC. Phase 1: direct recovery. (Figure: the known code bits are forwarded to their checks; positions a, c, f, g, h are still erased.)

Decoding on a BEC. Phase 2: substitution. (Figure: a check with a single erased neighbor recovers that bit as the XOR of the known ones.)

Example. (Figure, panels (a)-(f): the two phases alternate until complete recovery.)
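The two phases amount to a peeling process: repeatedly find a check with exactly one erased neighbor and solve for it. A sketch; the parity checks and erasure pattern below are illustrative, not the graph from the figure:

```python
def peel(checks, received):
    """Erasure (peeling) decoder. checks: tuples of bit indices that must
    XOR to 0; received: list of bits, with None marking an erasure."""
    bits = list(received)
    progress = True
    while progress:
        progress = False
        for check in checks:
            unknown = [i for i in check if bits[i] is None]
            if len(unknown) == 1:              # the substitution step
                i = unknown[0]
                bits[i] = 0
                for j in check:
                    if j != i:
                        bits[i] ^= bits[j]
                progress = True
    return bits

checks = [(0, 1, 2, 4), (0, 1, 3, 5), (0, 2, 3, 6)]   # [7,4]-style parities
decoded = peel(checks, [1, 0, 1, 1, None, None, 1])   # two erased positions
```

If the loop stalls while erasures remain, the erased positions form a stopping set and the decoder fails; the analysis below quantifies when that happens.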

The (Inverse) Problem. Have: fast decoding algorithms. Want: design codes that can correct many errors using these algorithms. Focus on the BEC in the following.

Experiments. Choose regular graphs. A (d,k)-regular graph has rate at least 1 - d/k, and can correct at most a d/k-fraction of erasures. Choose a random (d,k)-graph and let p_0 := the maximum fraction of erasures the algorithm can correct. What are these numbers?
d = 3, k = 6: d/k = 0.5, p_0 = 0.429
d = 4, k = 8: d/k = 0.5, p_0 = 0.383
d = 5, k = 10: d/k = 0.5, p_0 = 0.34
d = 3, k = 9: d/k = 0.33, p_0 = 0.282
d = 4, k = 12: d/k = 0.33, p_0 = 0.2572

A Theorem (Luby, Mitzenmacher, Shokrollahi, Spielman, Stemann, 1997). A randomly chosen (d,k)-graph can correct a p_0-fraction of erasures with high probability if and only if p_0 (1 - (1-x)^{k-1})^{d-1} < x for x in (0, p_0).
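The condition can be evaluated numerically to estimate the threshold; a bisection sketch (grid resolution and bisection depth are arbitrary accuracy choices):

```python
# Estimate the largest p0 satisfying p0*(1-(1-x)**(k-1))**(d-1) < x
# on (0, p0), by bisection over p0 with the condition checked on a grid.
def threshold(d, k, grid=2000):
    lo, hi = 0.0, 1.0
    for _ in range(40):                       # bisection on p0
        p = (lo + hi) / 2
        xs = (p * (i + 1) / grid for i in range(grid))
        if all(p * (1 - (1 - x) ** (k - 1)) ** (d - 1) < x for x in xs):
            lo = p                            # condition holds everywhere
        else:
            hi = p
    return lo

p36 = threshold(3, 6)   # close to the 0.429 quoted in the experiments
```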


Analysis: (3,6)-graphs. Expand the neighborhoods of the message nodes.

Analysis: (3,6)-graphs. Let p_i be the probability that a message node is still erased after the i-th iteration. (Figure: the outgoing erasure probability p_{i+1} at a message node in terms of the incoming p_i through a check.) Recursion: p_{i+1} = p_0 (1 - (1 - p_i)^5)^2.

Successful Decoding. Condition: p_0 (1 - (1 - p_i)^5)^2 < p_i.
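Iterating the (3,6) recursion p_{i+1} = p_0(1 - (1 - p_i)^5)^2 shows the threshold behavior directly; the starting points 0.42 and 0.44 straddle the 0.429 value quoted in the experiments (iteration count is an arbitrary choice):

```python
# Density evolution for the (3,6) graph on the BEC.
def evolve(p0, iters=2000):
    p = p0
    for _ in range(iters):
        p = p0 * (1 - (1 - p) ** 5) ** 2
    return p

low = evolve(0.42)    # below the threshold: the erasure probability dies out
high = evolve(0.44)   # above it: stuck at a nonzero fixed point
```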

Analysis: (3,6)-graphs. Making the arguments exact: the neighborhood is tree-like with high probability (standard argument); the argument above works for the expected fraction of erasures at the l-th round; and the real value is sharply concentrated around the expected value p_l (edge-exposure martingale, Azuma's inequality).

The General Case. Let lambda_i and rho_i be the fractions of edges of degree i on the left and the right hand side, respectively. Let lambda(x) := sum_i lambda_i x^{i-1} and rho(x) := sum_i rho_i x^{i-1}. The condition for successful decoding at erasure probability p is then p lambda(1 - rho(1 - x)) < x for all x in (0, p).

Belief Propagation (Richardson-Urbanke, 1999). Let f_l be the density of the probability distribution of the messages passed from the check nodes to the message nodes at round l of the algorithm, and P the density of the error distribution (in log-likelihood representation). For a (d,k)-regular graph, Gamma(f_{l+1}) = [Gamma(P * f_l^{*(d-1)})]^{*(k-1)}, where * denotes convolution and Gamma is the hyperbolic change of measure Gamma(f)(y) := f(ln coth(y/2))/sinh(y).

We want f_l to converge to a delta function at infinity (vanishing error probability). This gives rise to high-dimensional optimization algorithms.

Achieving Capacity. Want to design codes that can recover (asymptotically) from a fraction 1 - R of erasures. Want lambda and rho such that p lambda(1 - rho(1 - x)) < x for all x in (0, p), with p arbitrarily close to 1 - R, where R = 1 - (int_0^1 rho(x)dx)/(int_0^1 lambda(x)dx).

Tornado Codes. Extremely irregular graphs provide, for any rate R, sequences of codes which come arbitrarily close to the capacity of the erasure channel! Degree structure? Choose a design parameter D and set lambda(x) := (1/H(D)) (x + x^2/2 + ... + x^D/D) and rho(x) := exp(mu(x - 1)), where H(D) = 1 + 1/2 + ... + 1/D and mu = H(D)/(1 - 1/(D+1)).

Tornado Codes: Left Degree Distribution. (Figure.)

Right-Regular Codes (Shokrollahi, 1999). Graphs that are regular on the right; the degrees on the left are related to the Taylor expansion of (1 - x)^{1/m}. These are the only known examples of LDPC codes that achieve capacity on a nontrivial channel using a linear-time decoding algorithm.

Other Channels? For a density function f, let lambda(f) := sum_i lambda_i f^{*(i-1)} and rho(f) := sum_i rho_i f^{*(i-1)}. Want P such that f_l converges, where Gamma(f_{l+1}) = rho(Gamma(P * lambda(f_l))).

Conditions on the Density Functions (Richardson-Shokrollahi-Urbanke, 1999). Consistency: if the channel is symmetric, then the density functions f_l satisfy f(x) = f(-x) e^x. Fixed-point theorem: if P_err(f_i) = P_err(f_j) for i < j, then f_i = f_j is a fixed point of the iteration.

Conditions on the Density Functions. Stability: let r := -lim_{n -> inf} (1/n) log P_err(P^{*n}). If lambda_2 rho'(1) > e^r, then P_err(f_l) > epsilon for some fixed epsilon and all l. If lambda_2 rho'(1) < e^r, then the fixed point is stable. Here P_err(f) := int_{-inf}^0 f(x)dx is the error probability.

Stability. Erasure channel with erasure probability p: lambda_2 rho'(1) <= 1/p. BSC with crossover probability p: lambda_2 rho'(1) <= 1/(2 sqrt(p(1-p))). AWGN channel with variance sigma^2: lambda_2 rho'(1) <= e^{1/(2 sigma^2)}.
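For the Tornado degree distributions given earlier, the erasure-channel stability product simplifies nicely; a quick check (D = 10 is an arbitrary design choice):

```python
# BEC stability check lambda_2 * rho'(1) <= 1/p for the Tornado distributions
# lambda(x) = (1/H(D))(x + x^2/2 + ... + x^D/D), rho(x) = exp(mu*(x-1)).
D = 10
H = sum(1 / i for i in range(1, D + 1))   # H(D) = 1 + 1/2 + ... + 1/D
mu = H / (1 - 1 / (D + 1))                # rho'(1) = mu
lambda2 = 1 / H                           # fraction of edges of left degree 2
product = lambda2 * mu                    # simplifies to (D+1)/D

def stable(p):
    """True when the BEC stability condition holds at erasure probability p."""
    return product <= 1 / p
```

The product collapses to (D+1)/D regardless of D, so stability only fails for erasure probabilities above D/(D+1).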

Stability for the Erasure Channel (Shokrollahi, 1999). (Figure: plots of p lambda(1 - rho(1 - x)) - x for a stable and an unstable case.)

Flatness: Higher Stability Conditions (Shokrollahi, 2000). Let (lambda_m(x), rho_m(x)) be a capacity-achieving sequence of degree distributions. Then (1 - R) lambda_m(1 - rho_m(1 - x)) - x converges uniformly to the zero function on the interval [0, 1 - R]. No equivalent is known for other channels.

Flatness: Higher Stability Conditions. (Figure: p lambda(1 - rho(1 - x)) plotted against x.)

Capacity Achieving. No sequences of capacity-achieving degree distributions are known for channels other than the erasure channel. Conjecture: they exist! (Figure: bit error probability P_b versus E_b/N_0 for an irregular LDPC code, a turbo code, and a (3,6)-regular LDPC code, all with n = 10^6, plotted against the Shannon limit.)

Applications to Computer Networks. Distribution of bulk data to a large number of clients. Want: fully reliable delivery, low network overhead, and support for a vast number of receivers with heterogeneous characteristics; users want to access data at times of their choosing, and these access times overlap.

A Solution. (Figure: a broadcast server continuously transmits encoded packets.)

A Solution. A client joins the multicast group until enough of the encoding has been received, and then decodes to obtain the original data. (Figure: amount of encoding received over time.) Digital Fountain, http://www.dfountain.com.

Open Problems. Asymptotic theory: 1. Classification of capacity-achieving sequences for the erasure channel. 2. Capacity-achieving sequences for other channels. 3. Exponentially small error probabilities for the decoder (instead of polynomially small). Explicit constructions: 1. Constructions using finite geometries. 2. Constructions using Reed-Solomon codes. 3. Algebraic constructions. Graphs with loops. Short codes.

Algorithmic issues: 1. Design and analysis of new decoding algorithms. 2. Design of new encoders. Applications: packet-based wireless networks. Randomness: use of randomness in other areas (random convolutional codes?).