Lecture 4: Codes based on Concatenation

Error-Correcting Codes (Spring 2016), Rutgers University
Swastik Kopparty                     Scribe: Aditya Potukuchi and Meng-Tsung Tsai

1 Overview

In the last lecture, we studied codes based on univariate polynomials, also known as Reed-Solomon codes, and on multivariate polynomials, also known as Reed-Muller codes. The Reed-Solomon codes had rate $R = 1 - \delta$, which is optimal (given by the Singleton bound), but required alphabets of large size ($|\Sigma| \ge n$). Multivariate polynomials in a constant number of variables, however, gave codes of positive rate and distance, and required an alphabet size of only $n^{1/m}$, where $m$ is the number of variables. While this is not as bad as the Reed-Solomon codes in this respect, it is still a growing function of $n$. One interesting special case of Reed-Muller codes: over an alphabet of size 2, with $m = \log n$ variables and degree $d = 1$, we get codes of size $|C| = 2n$ with distance $\delta = 1/2$, known as the Hadamard code.

We will first study a few basic properties of the Hadamard code, and then move on to concatenation of codes, which gives us codes of constant rate and distance over a binary alphabet. We study concatenated codes under both error models, i.e., worst-case and random. In the latter case, we shall study constructions of codes that meet Shannon capacity.

2 The Hadamard Code

Recall that the extended Hamming code is nothing but the Hamming code with one extra linear constraint, namely $\sum_{i \in [n]} x_i = 0$. This is a code of size $\Theta(2^n/n)$ with distance 4.

Claim 1. The Hadamard code is the dual of the extended Hamming code.

Proof. The Hadamard code is obtained by evaluating all degree-1 polynomials over $\mathbb{F}_2$ in $m$ variables at every point $x \in \{0,1\}^m$. Its generator matrix therefore has as columns all $\{0,1\}$ vectors of length $m+1$ that start with 1:

$$ G = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 0 & 0 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 0 & 1 & \cdots & 1 \end{pmatrix}. $$

This is exactly the parity check matrix of the extended Hamming code of length $2^m$. (Problem 3 in Homework 1.)
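To make this concrete, here is a minimal Python sketch (ours, not from the lecture; all function names are made up) that builds this generator matrix and checks that distinct codewords disagree on at least half of the $2^m$ positions:

    from itertools import product

    def hadamard_generator(m):
        # Columns of G are (1, x) for every x in {0,1}^m, so G is (m+1) x 2^m.
        cols = [(1,) + x for x in product((0, 1), repeat=m)]
        return [[col[i] for col in cols] for i in range(m + 1)]

    def encode(G, msg):
        # Codeword = msg * G over F_2; msg has m+1 coordinates.
        return [sum(m_i * c_i for m_i, c_i in zip(msg, col)) % 2
                for col in zip(*G)]

    m = 3
    G = hadamard_generator(m)
    codewords = [tuple(encode(G, msg)) for msg in product((0, 1), repeat=m + 1)]
    for u in codewords:
        for v in codewords:
            if u != v:
                # Relative distance 1/2: distinct codewords differ in
                # at least 2^(m-1) of the 2^m positions.
                assert sum(a != b for a, b in zip(u, v)) >= 2 ** (m - 1)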

3 Construction of Concatenated Codes

Given a long code over a large alphabet and a (possibly) short code over a small alphabet, we perform an operation called concatenation to get a long code over a small alphabet with decent parameters, i.e., with $R > 0$ and $\delta > 0$. This idea is due to Forney (1966). The strategy is the following: we use an outer code, which is the long code over the large alphabet, and encode each symbol of this code using the inner code. We shall call these encoded symbols the blocks of the code.

3.1 Choice of Outer and Inner Codes

1. The outer code is a Reed-Solomon code $C$ over $\mathbb{F}_{2^t}$ with rate $R$. These are codes of length $2^t$ constructed using polynomials of degree $R \cdot 2^t$.

2. The inner code is a binary code $C_1 \subseteq \mathbb{F}_2^a$ with dimension $t$, so $|C_1| = 2^t$. Let $r := t/a$ denote the rate of $C_1$. We require that $\mathrm{dist}(C_1) \ge H^{-1}(1-r)$, and we can find such a linear code by brute force search (Problem 2 in Homework 1).

3.2 Parameters of the Resulting Code

The size of the code is given by the outer code, i.e., $|C| = (2^t)^{R \cdot 2^t + 1}$. Since the outer codewords have relative minimum distance $1 - R$, and in each block where they differ the inner codewords have relative distance $H^{-1}(1-r)$, the distance of the resulting code is $(1-R) \cdot H^{-1}(1-r)$. Here $H^{-1}$ is the inverse of the entropy function, which takes values between $0$ and $1/2$. The length of a codeword in this code is $a \cdot 2^t$ bits, so the rate is

$$ \frac{t \cdot R \cdot 2^t}{a \cdot 2^t} = \frac{tR}{a} = rR. $$

3.3 Construction Time

There are three main steps in the construction of these codes: finding the inner code via brute force, constructing the generator matrix of the outer code, and substituting the generator matrix of the inner code into the generator matrix of the outer code. Finding the inner code takes $\mathrm{poly}(2^a) = \mathrm{poly}(2^t)$ time if $r$ is a constant (what about the case when $r$ is not a constant?). The remaining steps take time $\mathrm{poly}(2^t)$, so the whole process takes $\mathrm{poly}(2^t)$ time.
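The concatenation operation itself is mechanical. Below is a short Python sketch (ours; the outer encoder and the inner codebook are assumed to be given, e.g., a Reed-Solomon encoder and a brute-force-found inner code):

    def concat_encode(message, outer_encode, inner_codebook):
        # message:        a list of symbols from {0, ..., 2^t - 1}
        # outer_encode:   encoder of the outer (Reed-Solomon) code
        # inner_codebook: dict mapping each of the 2^t symbols to its
        #                 length-a binary inner codeword
        outer_word = outer_encode(message)       # length 2^t, large alphabet
        bits = []
        for symbol in outer_word:                # one block per outer symbol
            bits.extend(inner_codebook[symbol])  # a bits per block
        return bits                              # total length a * 2^t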

3.4 Decoding of Concatenated Codes from a Constant Fraction of Errors

We essentially import the known algorithm for decoding Reed-Solomon codes, with a slight modification, to correct some constant fraction $\varphi$ of errors. The decoding algorithm consists of the following two main steps:

Step 1: Decode each block of the inner code by brute force.

Step 2: Use the Berlekamp-Welch algorithm to decode the resulting string over the large alphabet to the nearest Reed-Solomon codeword.

Claim 2. The above algorithm decodes the concatenated code from a $\frac{1}{4}(1-R)\,H^{-1}(1-r)$ fraction of errors.

Proof. If the total fraction of errors is at most $\frac{1}{4}(1-R)\,H^{-1}(1-r)$, then the fraction of blocks that have more than a $\frac{1}{2}H^{-1}(1-r)$ fraction of errors is at most $\frac{1-R}{2}$ (by Markov's inequality). Hence, after the first step of the algorithm, at most a $\frac{1-R}{2}$ fraction of the blocks are decoded incorrectly. Since this is within the decoding radius of the Reed-Solomon code, the second step of the algorithm decodes the resulting string correctly.

One can correct a $\frac{1}{2}(1-R)\,H^{-1}(1-r)$ fraction of errors using a more carefully designed algorithm that keeps track of the errors corrected so far. A sketch of the basic two-step decoder appears after Section 4.

4 The story so far (in the Hamming model)

Although we proved the existence of codes with rate $R = 1 - H(\delta)$, we are far from being able to construct them, even in polynomial time. The new line on the plot shows that fact.

[Figure: rate $R$ versus relative distance $\delta$, showing the upper bound, the non-explicit lower bound $R = 1 - H(\delta)$, and the new, weaker explicit lower bound achieved by concatenated codes.]
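Before moving to the random error model, here is the promised Python sketch of the two-step decoder of Section 3.4 (ours; the outer decoder rs_decode, e.g., Berlekamp-Welch, is assumed to be given, and inner decoding is brute-force nearest-codeword search):

    def hamming_distance(u, v):
        return sum(a != b for a, b in zip(u, v))

    def concat_decode(received, a, inner_codebook, rs_decode):
        # received:       corrupted concatenated codeword, a list of bits
        # inner_codebook: dict mapping symbols to length-a inner codewords
        # rs_decode:      decoder for the outer Reed-Solomon code (assumed)
        symbols = []
        for i in range(0, len(received), a):
            block = received[i:i + a]
            # Step 1: brute-force nearest inner codeword for this block.
            nearest = min(inner_codebook,
                          key=lambda s: hamming_distance(inner_codebook[s], block))
            symbols.append(nearest)
        # Step 2: decode the symbol string to the nearest RS codeword.
        return rs_decode(symbols)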

5 Random Error Model: Codes that Meet Shannon Capacity

Let $\mathrm{BSC}(p)$ be the binary symmetric channel in which every bit transmitted through the channel is flipped, i.e., from 0 to 1 or from 1 to 0, with probability $p$. Shannon's theorem tells us that for every $R < C = 1 - H(p)$ (where $C$ is the Shannon capacity) there exist an encoder $E : \{0,1\}^{Rn} \to \{0,1\}^n$ and a decoder $D : \{0,1\}^n \to \{0,1\}^{Rn}$ such that the failure probability over $\mathrm{BSC}(p)$ is $e^{-\delta n}$ for some constant $\delta > 0$ (depending on $C - R$). Formally, for every message $m$,

$$ \Pr_{z \in \{0,1\}^n,\ \Pr[z_i = 1] = p} \left[ D(E(m) \oplus z) \ne m \right] \le e^{-\delta n}. $$

The existence of such a pair $E, D$ is guaranteed by a probabilistic argument (the proof can be found in the notes for Lecture 2), which says nothing about an efficient construction. Here, we try to construct such a code efficiently.

5.1 A First Attempt

As a first attempt, we try the following code:

1. By brute force, find a code of length $\Theta(\log n)$ that has rate $r = C - \varepsilon$ and for which decoding to the nearest codeword has failure probability at most $1/n^2$. The existence of such a code is guaranteed by Shannon's theorem.

2. Break the message into $\Theta\left(\frac{n}{\log n}\right)$ blocks of size $\Theta(\log n)$, and encode each block using the code above.

The union bound tells us that the probability that some block is decoded incorrectly is at most $\Theta\left(\frac{n}{\log n}\right) \cdot \frac{1}{n^2}$, which is $\Theta\left(\frac{1}{n \log n}\right)$ for a sufficiently good code found in step 1 above. (Note that the expected number of incorrectly decoded blocks per transmission is $\Theta\left(\frac{1}{n \log n}\right)$, so after a (not very) large number of queries, it is very likely that some block has been incorrectly decoded.)
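A quick numeric illustration of why this first attempt falls short (the per-block failure probability $1/n^2$ is the choice from step 1): the union bound shrinks only polynomially in $n$, not exponentially.

    import math

    for n in (2 ** 10, 2 ** 16, 2 ** 20):
        blocks = n / math.log2(n)          # Theta(n / log n) blocks
        per_block = 1.0 / n ** 2           # failure probability of one block
        union_bound = blocks * per_block   # ~ 1 / (n log n), only polynomial
        print(f"n = 2^{round(math.log2(n))}: failure <= {union_bound:.2e}")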

While this seems quite good, our goal is to construct a code that has an exponentially small probability of incorrect decoding (as guaranteed by Shannon's theorem). For this purpose, we have the following solutions.

6 The Real Solution

Solution 1 is a code that has exponentially small error probability, and can be constructed, encoded, and decoded in $2^{O(\log^2 n)}$ (quasi-polynomial) time. The idea is to use concatenated codes: we use Reed-Solomon codes as outer codes, and codes that meet Shannon capacity as inner codes. The main idea behind this approach is that although some blocks may be decoded incorrectly, for the decoding to truly err, a lot of blocks (a number dictated by the outer code) must be decoded incorrectly, which is unlikely.

One of the main ingredients in this recipe is a linear code of length $t$ that meets Shannon capacity.

Exercise: For every $\varepsilon > 0$ and $0 < p < 1/2$, there exist linear codes of rate at least $1 - H(p) - \varepsilon$ that have exponentially small probability of incorrect decoding. [Hint: Choose an $m \times n$ generator matrix with each entry chosen uniformly and independently at random. Proceed exactly as in Shannon's theorem, and bound both kinds of errors, i.e., the one where there are too many bit flips, and the one where there is more than one codeword near the corrupted codeword. Note that in the latter case, it is enough to bound the error probability for one codeword (say $0$), because in a linear code, all codewords look the same. This makes the analysis of this error easier than in Shannon's theorem.]

The next step is to find such a code.

Lemma 1. It takes $2^{O(t^2)}$ time to find a linear code of length $t$ and rate $C - \varepsilon/2$ with exponentially small error probability.

Proof. One can enumerate all possible linear codes, each specified by the span of $t$ vectors, and pick a feasible one. The number of possible linear codes is thus bounded by $(2^t)^t = 2^{t^2}$, and the time taken to verify the probability of bad decoding for each candidate is bounded by $O(2^{2t})$.
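The brute-force search behind Lemma 1 can be sketched as follows (ours, with toy parameters in mind: the error probability is computed exactly by summing over all $2^t$ noise patterns, so this is only feasible for very small $t$):

    from itertools import product

    def decoding_error_prob(codewords, p):
        # Failure probability of nearest-codeword decoding of the zero
        # codeword over BSC(p). By linearity, the same bound holds for every
        # codeword; ties are pessimistically counted as failures.
        t = len(codewords[0])
        total = 0.0
        for noise in product((0, 1), repeat=t):
            w = sum(noise)
            if any(sum(n != c for n, c in zip(noise, cw)) <= w
                   for cw in codewords if any(cw)):
                total += p ** w * (1 - p) ** (t - w)
        return total

    def find_good_code(t, k, p, target):
        # Enumerate all k x t generator matrices over F_2 (at most 2^(t^2)
        # of them, since k <= t) and return the first code whose decoding
        # error probability is below the target.
        for rows in product(product((0, 1), repeat=t), repeat=k):
            codewords = [tuple(sum(msg[i] * rows[i][j] for i in range(k)) % 2
                               for j in range(t))
                         for msg in product((0, 1), repeat=k)]
            if len(set(codewords)) < 2 ** k:
                continue  # rows not linearly independent; skip
            if decoding_error_prob(codewords, p) <= target:
                return codewords
        return None

    # For example, find_good_code(4, 1, 0.05, 0.02) returns the code
    # {0000, 0111}, whose decoding error probability is about 0.007.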

6.1 Choice of Outer and Inner Codes

1. For the outer code, we use the Reed-Solomon code over $\mathbb{F}_{2^t}$ of rate $1 - \varepsilon/2$ and length $l$.

2. For the inner code, we use the code $C_1$, which has length $t$, rate $C - \varepsilon/2$, and probability of incorrect decoding at most $e^{-\delta t}$. Setting $t = \frac{2}{\delta} \log n$, we get that the probability of incorrect decoding is at most $1/n^2$. We also get $l = \frac{\delta n}{2 \log n}$.

6.2 Parameters of the Code

The size of the code is, again, given by the outer code: $|C| = (2^t)^{(1 - \varepsilon/2) l}$. The rate of this code, as in the previous case, is $(C - \varepsilon/2)(1 - \varepsilon/2) > C - \varepsilon$.

To compute the failure probability, we note that $C_1$ has failure probability over $\mathrm{BSC}(p)$ bounded by $1/n^2$. For the concatenated code to err over $\mathrm{BSC}(p)$, more than an $\varepsilon/4$ fraction of the blocks must be decoded incorrectly, which happens with probability at most

$$ \binom{l}{\varepsilon l/4} \left(\frac{1}{n^2}\right)^{(\varepsilon/4) l} \le 2^{H(\varepsilon/4) l} \cdot 2^{-(2 \log n)(\varepsilon/4) l} = 2^{-l \left( 2(\varepsilon/4)\log n - H(\varepsilon/4) \right)} \le 2^{-l (\varepsilon/4) \log n} $$

$$ = 2^{-\frac{\varepsilon}{8} \delta n} \quad \left(\text{using } l = \frac{\delta n}{2 \log n}\right) = e^{-\delta' n} \text{ for some constant } \delta' > 0, $$

where the last inequality on the first line holds for sufficiently large $n$.

6.3 Construction Time

Since we have efficient algorithms to construct the outer Reed-Solomon code, as in the previous case, the total running time is dominated by the construction of the inner code $C_1$, which takes $2^{O(\log^2 n)}$ time by Lemma 1.

Remark: In terms of running time, the main bottleneck is finding codes of length $\Theta(\log n)$ that meet Shannon capacity. It is natural to attempt to bring the construction time down to polynomial by replacing $C_1$ with a code of smaller length, say $\sqrt{\log n}$. The problem then occurs in the outer code, where we need an alphabet size of at least $l$.

6.4 Decoding

To decode the concatenated code, we first decode each block of the inner code by finding the nearest codeword by brute force. This takes time $O\left(2^t \cdot \frac{\delta n}{2 \log n}\right) = \mathrm{poly}(n)$. Then, we use the Berlekamp-Welch algorithm to decode the outer code. In total, decoding requires $\mathrm{poly}(n)$ time.

7 Other Solutions

Solution 2 is a code that meets Shannon capacity $C$ and can be constructed, encoded, and decoded in $\mathrm{poly}(n, 2^{1/\varepsilon})$ time, a.k.a. the Forney code. This code has rate $C - \varepsilon$, and its failure probability over $\mathrm{BSC}(p)$ is at most $e^{-\delta n}$. The idea is the same as in Solution 1, except for the following two changes:

1. Replace the Reed-Solomon code with a code over an $f(\varepsilon)$-size alphabet such that the code has rate $1 - \varepsilon/2$ and can be efficiently decoded from up to an $r(\varepsilon) > 0$ fraction of errors.

2. Replace the inner code with a code of length $O(\log f(\varepsilon))$ with rate $C - \varepsilon/2$ such that the failure probability over $\mathrm{BSC}(p)$ is at most $q(\varepsilon)$, requiring that $q(\varepsilon) < r(\varepsilon)/5$.

Solution 3 is the polar code, which meets Shannon capacity $C$ and can be constructed, encoded, and decoded in $\mathrm{poly}(n, 1/\varepsilon)$ time.