: Error Correcting Codes. October 2017 Lecture 1

Similar documents
: Error Correcting Codes. November 2017 Lecture 2

Lecture 12: Reed-Solomon Codes

Lecture B04 : Linear codes and singleton bound

Outline. MSRI-UP 2009 Coding Theory Seminar, Week 2. The definition. Link to polynomials

Solutions of Exam Coding Theory (2MMC30), 23 June (1.a) Consider the 4 4 matrices as words in F 16

Lecture 19 : Reed-Muller, Concatenation Codes & Decoding problem

MATH32031: Coding Theory Part 15: Summary

Lecture 12: November 6, 2017

Error Correcting Codes Questions Pool

Arrangements, matroids and codes

1 Vandermonde matrices

7.1 Definitions and Generator Polynomials

Linear Cyclic Codes. Polynomial Word 1 + x + x x 4 + x 5 + x x + x

Lecture Introduction. 2 Linear codes. CS CTT Current Topics in Theoretical CS Oct 4, 2012

EE 229B ERROR CONTROL CODING Spring 2005

Section 3 Error Correcting Codes (ECC): Fundamentals

Error-Correcting Codes

MATH Examination for the Module MATH-3152 (May 2009) Coding Theory. Time allowed: 2 hours. S = q

Error Correcting Codes: Combinatorics, Algorithms and Applications Spring Homework Due Monday March 23, 2009 in class

Hamming codes and simplex codes ( )

G Solution (10 points) Using elementary row operations, we transform the original generator matrix as follows.

MATH3302 Coding Theory Problem Set The following ISBN was received with a smudge. What is the missing digit? x9139 9

Know the meaning of the basic concepts: ring, field, characteristic of a ring, the ring of polynomials R[x].

ELEC 519A Selected Topics in Digital Communications: Information Theory. Hamming Codes and Bounds on Codes

Mathematics Department

Linear Cyclic Codes. Polynomial Word 1 + x + x x 4 + x 5 + x x + x f(x) = q(x)h(x) + r(x),

Coding Theory and Applications. Solved Exercises and Problems of Cyclic Codes. Enes Pasalic University of Primorska Koper, 2013

The BCH Bound. Background. Parity Check Matrix for BCH Code. Minimum Distance of Cyclic Codes

Lecture 3 Small bias with respect to linear tests

CS151 Complexity Theory. Lecture 9 May 1, 2017

Orthogonal Arrays & Codes

Questions Pool. Amnon Ta-Shma and Dean Doron. January 2, Make sure you know how to solve. Do not submit.

Chapter 6 Reed-Solomon Codes. 6.1 Finite Field Algebra 6.2 Reed-Solomon Codes 6.3 Syndrome Based Decoding 6.4 Curve-Fitting Based Decoding

Error Correction Review

Coding Theory: Linear-Error Correcting Codes Anna Dovzhik Math 420: Advanced Linear Algebra Spring 2014

6.895 PCP and Hardness of Approximation MIT, Fall Lecture 3: Coding Theory

Lecture 3: Error Correcting Codes

3. Coding theory 3.1. Basic concepts

Error-correcting Pairs for a Public-key Cryptosystem

Lecture 4: Codes based on Concatenation

Lecture 03: Polynomial Based Codes

Lecture Introduction. 2 Formal Definition. CS CTT Current Topics in Theoretical CS Oct 30, 2012

The extended coset leader weight enumerator

Math 512 Syllabus Spring 2017, LIU Post

Decoding Reed-Muller codes over product sets

ERROR CORRECTING CODES

Notes 10: List Decoding Reed-Solomon Codes and Concatenated codes

Ma/CS 6b Class 25: Error Correcting Codes 2

Vector spaces. EE 387, Notes 8, Handout #12

Code-Based Cryptography Error-Correcting Codes and Cryptography

Error Detection and Correction: Hamming Code; Reed-Muller Code

Lecture 14: Hamming and Hadamard Codes

6.S897 Algebra and Computation February 27, Lecture 6

EE512: Error Control Coding

Linear Codes, Target Function Classes, and Network Computing Capacity

MATH 291T CODING THEORY

Reed-Solomon codes. Chapter Linear codes over finite fields

Skew cyclic codes: Hamming distance and decoding algorithms 1

New algebraic decoding method for the (41, 21,9) quadratic residue code

: Coding Theory. Notes by Assoc. Prof. Dr. Patanee Udomkavanich October 30, upattane

Berlekamp-Massey decoding of RS code

Lecture 12. Block Diagram

Lecture 17: Perfect Codes and Gilbert-Varshamov Bound

List Decoding of Noisy Reed-Muller-like Codes

ELEC-E7240 Coding Methods L (5 cr)

Binary Linear Codes G = = [ I 3 B ] , G 4 = None of these matrices are in standard form. Note that the matrix 1 0 0

x n k m(x) ) Codewords can be characterized by (and errors detected by): c(x) mod g(x) = 0 c(x)h(x) = 0 mod (x n 1)

Fault Tolerance & Reliability CDA Chapter 2 Cyclic Polynomial Codes


Cyclic Redundancy Check Codes

Notes for the Hong Kong Lectures on Algorithmic Coding Theory. Luca Trevisan. January 7, 2007

MATH 291T CODING THEORY

Algebraic Geometry Codes. Shelly Manber. Linear Codes. Algebraic Geometry Codes. Example: Hermitian. Shelly Manber. Codes. Decoding.

MT361/461/5461 Error Correcting Codes: Preliminary Sheet

Chapter 3 Linear Block Codes

Proof: Let the check matrix be

Bounding the number of affine roots

Hamming Codes 11/17/04

Information redundancy

Lecture 2 Linear Codes

Finite Fields and Error-Correcting Codes

} has dimension = k rank A > 0 over F. For any vector b!

The Golay codes. Mario de Boer and Ruud Pellikaan

Can You Hear Me Now?

ECEN 604: Channel Coding for Communications

Generalized Reed-Solomon Codes

Generator Matrix. Theorem 6: If the generator polynomial g(x) of C has degree n-k then C is an [n,k]-cyclic code. If g(x) = a 0. a 1 a n k 1.

Coding Theory. Ruud Pellikaan MasterMath 2MMC30. Lecture 11.1 May

Coding problems for memory and storage applications

Communications II Lecture 9: Error Correction Coding. Professor Kin K. Leung EEE and Computing Departments Imperial College London Copyright reserved

Algebra for error control codes

Physical Layer and Coding

Finite fields, randomness and complexity. Swastik Kopparty Rutgers University

g(x) = 1 1 x = 1 + x + x2 + x 3 + is not a polynomial, since it doesn t have finite degree. g(x) is an example of a power series.

CSCI-B609: A Theorist s Toolkit, Fall 2016 Oct 4. Theorem 1. A non-zero, univariate polynomial with degree d has at most d roots.

CYCLIC SIEVING FOR CYCLIC CODES

MATH/MTHE 406 Homework Assignment 2 due date: October 17, 2016

Galois geometries contributing to coding theory

00000, 01101, 10110, Find the information rate and the minimum distance of the binary code 0000, 0101, 0011, 0110, 1111, 1010, 1100, 1001.

Asymptotically good sequences of codes and curves

Transcription:

03683072: Error Correcting Codes. October 2017 Lecture 1 First Definitions and Basic Codes Amnon Ta-Shma and Dean Doron 1 Error Correcting Codes Basics Definition 1. An (n, K, d) q code is a subset of F n q of size K where every two distinct codewords have Hamming distance at least d. d is called the code s distance. An [n, k, d] q code is a linear subspace of F n q of dimension k that is an (n, 2 k, d) q code. Claim 2. Let C be a linear code of distance d. Then: 1. d equals the minimum weight of the nonzero codewords of C. 2. C can detect d 1 errors and correct d 1 2 errors. The two most common ways to present an [n, k] q code are with either an n k generating matrix or an n (n k) parity-check matrix. Definition 3. A generator matrix for an [n, k] q code C is any matrix over F q whose columns form a basis for C. That is, G is a generating matrix for C if Image(G) = { Gx x F k q} = C. Definition 4. An n (n k) matrix H over F q is a parity-check matrix of an [n, k] q code C if { y F n q H T y = 0 } = C. Claim 5. Let C be an [n, k] q code with a generating matrix G. H is a parity-check matrix for G iff H T G = 0 (0 here is the zero matrix) and rank(h) = n k. In particular, if G is of the form ( ) Ik then H = A ( ) A T is a parity-check matrix for C. Claim 6. Let C be an [n, k] q code with a generating matrix G and parity-check matrix H. Then the (minimum) distance of C is the minimum number d such that every d 1 rows of H are linearly independent while there exist d rows that are linearly dependent. Clearly, H is a generator matrix of some code. This code is called the dual of C and denoted C. Stated differently: Definition 7. Let C be an [n, k] q code. The dual code of C is C = { y F n q x, c = 0 for all c C }. Notice that C is an [n, n k] q code. I n k 1

1.1 The Hamming Code Our first example is the Hamming [7, 4] 2 code. It encodes four bits into seven bits by adding three parity bits. Thus, it can detect and correct single-bit errors. A generating matrix is: 1 0 0 0 0 1 0 0 0 0 1 0 G = 0 0 0 1 1 1 0 1 1 0 1 1 0 1 1 1 And a parity-check matrix is: 1 1 0 1 0 1 0 1 1 H = 1 1 1 1 0 0 0 1 0 0 0 1 The parity-check matrix of a general Hamming code over F q is constructed by listing all rows of length r that are pairwise independent over F q. For F r 2, this corresponds to simply listing all nonzero binary vectors of length r. This clearly gives a [2 r 1, 2 r r 1, 3] 2 code. Write d = 2e + 1. An [n, k, d] code C is said to be perfect if for every possible word w there is a unique codeword c C in which at most e letters of c differ from the corresponding bits of w. That is, if e ( n i=0 i) (q 1) i = q n k. Clearly, the binary [n = 2 r 1, k = 2 r r 1, d = 3] 2 Hamming code satisfies 1 + n = 2 r = 2 n k, so it is perfect. For every code, perfect or not, linear or not, q k e i=0 ( n i) (q 1) i q n, because the balls of radius e around codewords must be disjoint (why?). In a perfect code these balls perfectly cover the whole space F n q. Thus, these codes (if exist) are perfectly optimal and have the maximum possible number of codewords among all codes of length n and distance d. To decode the [n = 2 r 1, k = 2 r r 1, 3] 2 Hamming code we introduce the notion of syndrome decoding. A syndrome of y F n 2 is simply s = HT y, so the codewords of C are precisely the vectors whose syndrome is 0. Also, c 1 c 2 C if and only if H T c 1 = H T c 2. Consider the following procedure for the Hamming code. Given a received word y F n 2 : 1. Compute s = H T y F n k 2. If s = 0, y is a codeword and return y. 2. Otherwise, there must be e F n 2 of weight 1 such that s = HT e (why?). If we let e j F n 2 denote the vector with 1 at the j th coordinate and zero otherwise (1 j n), then H T e j is the j th column of H T. As the columns of H T are all distinct, we compare the syndrome s to the columns of H T, and find the column j where they are equal. We then output c = y e j. 2

If we enumerate the n = 2 r 1 rows of H (i.e., the columns of H T ) such that they count in binary from 1 to n, then the value of s in binary immediately tells how to correct the error: value 0 means no error, and value j (1 j n) means c = y e j. Decoding here is especially easy as we only need to correct one error. Decoding general codes is much more challenging. 1.2 The Hadamard Code The Hadamard code is a binary code that maps messages of length k to codewords of length 2 k in the following way. For a message x {0, 1} k, every coordinate y {0, 1} k is computed by x, y (modulo 2), so Had(x) = ( x, y ) y {0,1} k. Check that the Hadamard code is linear. We claim the distance of the Hadamard code is 2 k 1. Furthermore, we claim every nonzero codeword has weight exactly 2 k 1. To see this, let x {0, 1} k be a nonzero message. Then, the weight of Had(x) is given by Pr y [ x, y = 1]. Prove that this value is exactly 1 2. We remark that because there are so few codewords, decoding can be done in polynomial time by going brute force over all the 2 k = n codewords. To summarize, the Hadamard code is [n = 2 k, k, 2 k 1 ] 2 code. Thus, the Hadamard code has high distance (relative distance=distance/length=1/2) but very low dimension, compared with the [n = 2 r 1, k = 2 r r 1, d = 3] 2 Hamming code that has very high dimension but very low distance. What we shall seek next are codes that lie in between, for example codes with constant relative distance (i.e., distance/length) and constant relative rate (which is, dimension/length). 2 Basic Algebra 1. Groups, Fields. 2. F 2, F p, F 7, F p. 3. Rings, Euclidean rings, Unique factorization domains. 4. The ring of polynomials. GCD. Division with reminder. 5. F p k = F p [x] mod E(x) for an irreducible E of degree k. 6. Generators. 7. F 4, F 4, F 8, F 8. 8. For p F[x], p(a) = 0 implies x a p. 9. For p F[x] of degree k, p has at most k roots in F. 3

3 The Reed-Solomon Code Fix a field F q of size q with a generator α of F q. The code RS : F k q F n q corresponds to evaluating all degree k 1 polynomial (whose coefficients are given to us as the input message) on all nonzero field elements. 1 That is, n = q 1 and RS(a 0,..., a k 1 ) = ( p a (α 0 ), p a (α 1 ),..., p a (α n 1 ) ), where p a (x) = k 1 i=0 a ix i. Every nonzero polynomial of degree k 1 can have at most k 1 zeros in F q, so the weight of every nonzero codeword is at least d = n (k 1) = n k + 1. Thus, the RS code is an [n, k, n k + 1] q code. By inspection, the n k generating matrix is given by (α 0 ) 0 (α 0 ) 1 (α 0 ) k 1 (α 1 ) 0 (α 1 ) 1 (α 1 ) k 1 G =.... (α n 1 ) 0 (α n 2 ) 1 (α n 1 ) k 1 so for 0 i n 1 and 0 j k 1, G[i, j] = α ij. Now, consider the n (n k) matrix (α 0 ) 1 (α 0 ) 2 (α 0 ) n k (α 1 ) 1 (α 1 ) 2 (α 1 ) n k H =.... (α n 1 ) 1 (α n 1 ) 2 (α n 1 ) n k so for 0 i n 1 and 1 j n k, H[i, j] = α ij. H is a Vandermonde matrix with the right dimensions, so in order to prove that H is indeed the parity-check matrix of the RS code, it is sufficient to prove that H T G = 0 (n k) k. We have that n 1 n 1 ( (H T G)[a, b] = H[k, a]g[k, b] = α a+b) k. k=0 a ranges from 1 to n k and b ranges from 0 to k 1, so 1 a + b n 1 and the above sums to zero. Remark 8. An [n, k, d] q code that satisfies d = n k + 1 (like the RS code) is called an MDS (maximum distance separable) code. Such codes meet the Singleton bound d n k + 1 which we will see later, and thus have an optimal number of codewords for the given n and d. Remark 9. With 0 as an additional evaluation point, the code becomes an [q, k, q k + 1] q code. We encourage the reader to find natural generating and parity check matrices for it. Also verify that a dual of such a code is of the same type. 1 It is also common to take the zero element as well, however, for reasons that will become clear soon we will not use the zero element as an evaluation point. k=0 4

4 The Reed-Muller code We will follow the references in the syllabus (Sudan s Lecture 4). For calculating the distance, we will use the following lemmas: For a field size larger than the degree, we will use Lemma 10 (Schwartz-Zippel). A nonzero polynomial f F q [x 1,..., x n ] of total degree d is zero on at most d q fraction of the points in Fn q. For a field size smaller than the degree, we will use: Lemma 11 ([2], Theorem 19 of Lecture 2.5). If f F q [x 1,..., x n ] has individual degree at mot l < q in each variable and has total degree d = lk + r with r < l, then it is nonzero on at least ( ) k ( ) 1 l q 1 r q fraction of points in F n q. 5 The Hermitian Code Up until now, our code s domains were the q-ary hypercube. We can instead pick a subset S F m q with some nice geometric properties. We start with the trace and norm functions. Definition 12. The functions Tr, N : F q r F q are given by Tr(x) = r 1 i=0 xqi and N(x) = r 1 i=0 xqi. The image of both functions is in F q (this follows easily F q = {α F q r α q = α}, and using α qr = α for every α F q r. Check!). It is easy to check that the trace is additive whereas the norm is multiplicative. Also, Claim 13. Tr : F q r F q is a perfect q r 1 -to-1 map, and N : (F r q) F q is a perfect r 1 i=0 qi -to-1 map. The Hermitian code lies on the curve { } S = (x, y) F 2 q Tr(x) = N(y) 2 = { (x, y) F q 2 x q + x = y q+1} and its message space is given by bivariate polynomials over F q 2 with total degree at most r q, so k = ( ) r+2 2 r 2 2. That is, similar to a RM code, we evaluate low-degree, bi-variate polynomials but this time on a curve. We shall determine the length of the code, S, and the distance of the code. Claim 14. S = q 3. Proof. For every choice of y (there are q 2 such choices), let γ = N(y). Then, there are q choices of x F q 2 satisfying Tr(x) = γ, as Tr is a perfect q-to-1 map. To determine the distance, we will use Bézout s theorem. 5

Theorem 15 (Bézout, see e.g., Chapter 7 of [1]). If F is a field and f, g F[x, y] have no common factors over the algebraic closure, then {(α, β) f(α, β) = g(α, β) = 0} deg(f) deg(g). Lemma 16. The distance d of the code is at least q 3 r(q + 1). Proof. Let f F q 2[x, y] be of degree at most r. Define g(x, y) = Tr(x) N(y), so S is the zeros of g. Since deg(g) = q + 1 > r and g is irreducible, g and f have no common factors. By Bézout s theorem, the number of zero evaluation points (that is, (α, β) so that f(α, β) = g(α, β) = 0) is at most deg(f) deg(g) r(q + 1), so d n r(q + 1). In conclusion, the code is a [ ] q 3, r2 2, q3 r(q + 1) q 2 code. References [1] H. Hilton. Plane algebraic curves. The Clarendon press. [2] Madhu Sudan. Algorithmic introduction to coding theory lecture notes, September 2001. 6