
03683072: Error Correcting Codes. October 2017
Lecture 1
First Definitions and Basic Codes
Amnon Ta-Shma and Dean Doron

1 Error Correcting Codes Basics

Definition 1. An $(n, K, d)_q$ code is a subset of $\mathbb{F}_q^n$ of size $K$ in which every two distinct codewords have Hamming distance at least $d$. The parameter $d$ is called the code's distance. An $[n, k, d]_q$ code is a linear subspace of $\mathbb{F}_q^n$ of dimension $k$ that is an $(n, q^k, d)_q$ code.

Claim 2. Let $C$ be a linear code of distance $d$. Then:
1. $d$ equals the minimum weight of the nonzero codewords of $C$.
2. $C$ can detect $d - 1$ errors and correct $\lfloor (d-1)/2 \rfloor$ errors.

The two most common ways to present an $[n, k]_q$ code are with either an $n \times k$ generating matrix or an $n \times (n-k)$ parity-check matrix.

Definition 3. A generator matrix for an $[n, k]_q$ code $C$ is any matrix over $\mathbb{F}_q$ whose columns form a basis for $C$. That is, an $n \times k$ matrix $G$ is a generating matrix for $C$ if $\mathrm{Image}(G) = \{ Gx \mid x \in \mathbb{F}_q^k \} = C$.

Definition 4. An $n \times (n-k)$ matrix $H$ over $\mathbb{F}_q$ is a parity-check matrix of an $[n, k]_q$ code $C$ if $\{ y \in \mathbb{F}_q^n \mid H^T y = 0 \} = C$.

Claim 5. Let $C$ be an $[n, k]_q$ code with a generating matrix $G$. Then $H$ is a parity-check matrix for $C$ iff $H^T G = 0$ (0 here is the zero matrix) and $\mathrm{rank}(H) = n - k$. In particular, if $G$ is of the form

$$G = \begin{pmatrix} I_k \\ A \end{pmatrix} \quad \text{then} \quad H = \begin{pmatrix} -A^T \\ I_{n-k} \end{pmatrix}$$

is a parity-check matrix for $C$.

Claim 6. Let $C$ be an $[n, k]_q$ code with a generating matrix $G$ and parity-check matrix $H$. Then the (minimum) distance of $C$ is the minimum number $d$ such that every $d - 1$ rows of $H$ are linearly independent while there exist $d$ rows that are linearly dependent.

Clearly, $H$ is a generator matrix of some code. This code is called the dual of $C$ and is denoted $C^\perp$. Stated differently:

Definition 7. Let $C$ be an $[n, k]_q$ code. The dual code of $C$ is $C^\perp = \{ y \in \mathbb{F}_q^n \mid \langle y, c \rangle = 0 \text{ for all } c \in C \}$.

Notice that $C^\perp$ is an $[n, n-k]_q$ code.
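To make Claims 2 and 5 concrete, the following is a minimal Python sketch that is not part of the original notes. It assumes a prime field $\mathbb{F}_p$ (so that arithmetic is just integers mod $p$) and a hypothetical $2 \times 2$ matrix `A` over $\mathbb{F}_3$; the helper names `systematic_pair` and `min_distance` are illustrative. It builds $G = (I_k ; A)$ and $H = (-A^T ; I_{n-k})$, checks that $H^T G = 0$, and finds the distance as the minimum weight of a nonzero codeword by brute force.

```python
from itertools import product

def mat_mul_mod(X, Y, p):
    """Multiply matrices X (a x b) and Y (b x c), reducing entries mod p."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % p
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def identity(size):
    return [[int(i == j) for j in range(size)] for i in range(size)]

def systematic_pair(A, p):
    """Given an (n-k) x k matrix A over F_p, return G = (I_k ; A), H = (-A^T ; I_{n-k})."""
    m, k = len(A), len(A[0])
    G = identity(k) + [row[:] for row in A]
    H = [[(-A[i][j]) % p for i in range(m)] for j in range(k)] + identity(m)
    return G, H

def min_distance(G, p):
    """Minimum weight over all nonzero codewords Gx (Claim 2, item 1)."""
    k = len(G[0])
    best = len(G)
    for x in product(range(p), repeat=k):
        if any(x):
            c = [sum(g * xi for g, xi in zip(row, x)) % p for row in G]
            best = min(best, sum(1 for v in c if v))
    return best

p = 3
A = [[1, 1], [1, 2]]            # hypothetical example with n = 4, k = 2
G, H = systematic_pair(A, p)
HtG = mat_mul_mod(transpose(H), G, p)   # should be the 2 x 2 zero matrix
d = min_distance(G, p)                  # distance of the resulting [4, 2]_3 code
```

The brute-force search enumerates all $p^k$ messages, so it is only meant for toy parameters like these.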

1.1 The Hamming Code

Our first example is the Hamming $[7, 4]_2$ code. It encodes four bits into seven bits by adding three parity bits, and it can detect and correct single-bit errors. A generating matrix is:

$$G = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{pmatrix}$$

And a parity-check matrix is:

$$H = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

The parity-check matrix of a general Hamming code over $\mathbb{F}_q$ is constructed by listing, as rows, a maximal set of vectors of length $r$ that are pairwise linearly independent over $\mathbb{F}_q$. For $\mathbb{F}_2^r$, this corresponds to simply listing all nonzero binary vectors of length $r$. This clearly gives a $[2^r - 1, 2^r - r - 1, 3]_2$ code.

Write $d = 2e + 1$. An $[n, k, d]$ code $C$ is said to be perfect if for every possible word $w$ there is a unique codeword $c \in C$ in which at most $e$ letters of $c$ differ from the corresponding letters of $w$. That is, if

$$\sum_{i=0}^{e} \binom{n}{i} (q-1)^i = q^{n-k}.$$

Clearly, the binary $[n = 2^r - 1, k = 2^r - r - 1, d = 3]_2$ Hamming code satisfies $1 + n = 2^r = 2^{n-k}$, so it is perfect.

For every code with $q^k$ codewords, perfect or not, linear or not,

$$q^k \cdot \sum_{i=0}^{e} \binom{n}{i} (q-1)^i \le q^n,$$

because the balls of radius $e$ around codewords must be disjoint (why?). In a perfect code these balls exactly cover the whole space $\mathbb{F}_q^n$. Thus, perfect codes (when they exist) are optimal: they have the maximum possible number of codewords among all codes of length $n$ and distance $d$.

To decode the $[n = 2^r - 1, k = 2^r - r - 1, 3]_2$ Hamming code we introduce the notion of syndrome decoding. The syndrome of $y \in \mathbb{F}_2^n$ is simply $s = H^T y$, so the codewords of $C$ are precisely the vectors whose syndrome is 0. Also, for any two words $c_1, c_2 \in \mathbb{F}_2^n$, $c_1 - c_2 \in C$ if and only if $H^T c_1 = H^T c_2$.

Consider the following procedure for the Hamming code. Given a received word $y \in \mathbb{F}_2^n$:
1. Compute $s = H^T y \in \mathbb{F}_2^{n-k}$. If $s = 0$, then $y$ is a codeword; return $y$.
2. Otherwise, there must be $e \in \mathbb{F}_2^n$ of weight 1 such that $s = H^T e$ (why?). If we let $e_j \in \mathbb{F}_2^n$ denote the vector with 1 at the $j$-th coordinate and zero elsewhere ($1 \le j \le n$), then $H^T e_j$ is the $j$-th column of $H^T$. As the columns of $H^T$ are all distinct, we compare the syndrome $s$ to the columns of $H^T$ and find the column $j$ where they are equal. We then output $c = y - e_j$.
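The two-step procedure above can be sketched in Python for the $[7, 4]_2$ code of this section. This is an illustrative sketch rather than the authors' implementation: the function names are assumptions, `H` holds the rows of the $7 \times 3$ parity-check matrix, and `A` holds the three parity rows at the bottom of $G$.

```python
from itertools import product

# Rows of the 7 x 3 parity-check matrix H of the [7, 4]_2 Hamming code.
H = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# The bottom three rows of G = (I_4 ; A).
A = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]

def encode(x):
    """Compute Gx over F_2: the four message bits followed by three parity bits."""
    return list(x) + [sum(a * b for a, b in zip(row, x)) % 2 for row in A]

def syndrome(y):
    """Compute s = H^T y over F_2."""
    return [sum(H[i][j] * y[i] for i in range(len(y))) % 2
            for j in range(len(H[0]))]

def decode(y):
    """Return the codeword obtained by correcting at most one bit error in y."""
    s = syndrome(y)
    if not any(s):
        return list(y)          # step 1: zero syndrome, y is a codeword
    j = H.index(s)              # step 2: row j of H (column j of H^T) equals s
    c = list(y)
    c[j] ^= 1                   # over F_2, subtracting e_j flips bit j
    return c

def flip(c, j):
    """Return a copy of c with bit j flipped (a single-bit error)."""
    return c[:j] + [c[j] ^ 1] + c[j + 1:]

# Check every single-bit error on every one of the 16 codewords.
all_recovered = all(
    decode(flip(encode(x), j)) == encode(x)
    for x in product([0, 1], repeat=4)
    for j in range(7)
)
```

Note that the positions here are 0-indexed, whereas the text numbers coordinates from 1 to $n$.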

If we enumerate the $n = 2^r - 1$ rows of $H$ (i.e., the columns of $H^T$) such that they count in binary from 1 to $n$, then the value of $s$ in binary immediately tells how to correct the error: value 0 means no error, and value $j$ ($1 \le j \le n$) means $c = y - e_j$.

Decoding here is especially easy as we only need to correct one error. Decoding general codes is much more challenging.

1.2 The Hadamard Code

The Hadamard code is a binary code that maps messages of length $k$ to codewords of length $2^k$ in the following way. For a message $x \in \{0, 1\}^k$, the coordinate indexed by $y \in \{0, 1\}^k$ is computed as $\langle x, y \rangle$ (modulo 2), so

$$\mathrm{Had}(x) = \big( \langle x, y \rangle \big)_{y \in \{0,1\}^k}.$$

Check that the Hadamard code is linear.

We claim the distance of the Hadamard code is $2^{k-1}$. Furthermore, we claim every nonzero codeword has weight exactly $2^{k-1}$. To see this, let $x \in \{0, 1\}^k$ be a nonzero message. Then, the relative weight of $\mathrm{Had}(x)$, that is, its weight divided by $2^k$, is given by $\Pr_y[\langle x, y \rangle = 1]$ for uniformly random $y$. Prove that this value is exactly $\frac{1}{2}$.

We remark that because there are so few codewords, decoding can be done in time polynomial in $n$ by going brute force over all the $2^k = n$ codewords.

To summarize, the Hadamard code is an $[n = 2^k, k, 2^{k-1}]_2$ code. Thus, the Hadamard code has high distance (relative distance = distance/length = 1/2) but very low dimension, compared with the $[n = 2^r - 1, k = 2^r - r - 1, d = 3]_2$ Hamming code, which has very high dimension but very low distance. What we shall seek next are codes that lie in between, for example codes with constant relative distance (i.e., distance/length) and constant relative rate (i.e., dimension/length).

2 Basic Algebra

1. Groups, Fields.
2. $\mathbb{F}_2$, $\mathbb{F}_p$, $\mathbb{F}_7^*$, $\mathbb{F}_p^*$.
3. Rings, Euclidean rings, Unique factorization domains.
4. The ring of polynomials. GCD. Division with remainder.
5. $\mathbb{F}_{p^k} = \mathbb{F}_p[x] \bmod E(x)$ for an irreducible $E$ of degree $k$.
6. Generators.
7. $\mathbb{F}_4$, $\mathbb{F}_4^*$, $\mathbb{F}_8$, $\mathbb{F}_8^*$.
8. For $p \in \mathbb{F}[x]$, $p(a) = 0$ implies $(x - a) \mid p$.
9. For $p \in \mathbb{F}[x]$ of degree $k$, $p$ has at most $k$ roots in $\mathbb{F}$.
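As a small illustration of items 5 to 7 in the list above, here is a hedged Python sketch of $\mathbb{F}_8 = \mathbb{F}_2[x] \bmod E(x)$. The choice $E(x) = x^3 + x + 1$ (irreducible over $\mathbb{F}_2$ because it has no root in $\mathbb{F}_2$) and the encoding of field elements as 3-bit integers are assumptions made for this example, and `gf8_mul` is a hypothetical helper name.

```python
def gf8_mul(a, b, modulus=0b1011):
    """Multiply two elements of F_8 = F_2[x] / (x^3 + x + 1).

    Elements are integers 0..7; bit i is the coefficient of x^i.
    """
    result = 0
    for _ in range(3):
        if b & 1:
            result ^= a        # addition in F_2[x] is bitwise XOR
        b >>= 1
        a <<= 1                # multiply a by x
        if a & 0b1000:         # degree reached 3: reduce modulo E(x)
            a ^= modulus
    return result

# Successive powers x^0, x^1, ..., x^6 of the element x (encoded as 0b010).
powers = []
elem = 1
for _ in range(7):
    powers.append(elem)
    elem = gf8_mul(elem, 0b010)
# After the loop, elem holds x^7.
```

The seven powers are distinct and $x^7 = 1$, so $x$ generates $\mathbb{F}_8^*$. Since $|\mathbb{F}_8^*| = 7$ is prime, every element of $\mathbb{F}_8^*$ other than 1 is a generator.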

3 The Reed-Solomon Code

Fix a field $\mathbb{F}_q$ of size $q$ with a generator $\alpha$ of $\mathbb{F}_q^*$. The code $\mathrm{RS} : \mathbb{F}_q^k \to \mathbb{F}_q^n$ corresponds to evaluating every polynomial of degree at most $k - 1$ (whose coefficients are given to us as the input message) on all nonzero field elements.¹ That is, $n = q - 1$ and

$$\mathrm{RS}(a_0, \ldots, a_{k-1}) = \big( p_a(\alpha^0), p_a(\alpha^1), \ldots, p_a(\alpha^{n-1}) \big), \quad \text{where } p_a(x) = \sum_{i=0}^{k-1} a_i x^i.$$

Every nonzero polynomial of degree at most $k - 1$ can have at most $k - 1$ zeros in $\mathbb{F}_q$, so the weight of every nonzero codeword is at least $d = n - (k - 1) = n - k + 1$. Thus, the RS code is an $[n, k, n - k + 1]_q$ code.

By inspection, the $n \times k$ generating matrix is given by

$$G = \begin{pmatrix} (\alpha^0)^0 & (\alpha^0)^1 & \cdots & (\alpha^0)^{k-1} \\ (\alpha^1)^0 & (\alpha^1)^1 & \cdots & (\alpha^1)^{k-1} \\ \vdots & \vdots & \ddots & \vdots \\ (\alpha^{n-1})^0 & (\alpha^{n-1})^1 & \cdots & (\alpha^{n-1})^{k-1} \end{pmatrix}$$

so for $0 \le i \le n - 1$ and $0 \le j \le k - 1$, $G[i, j] = \alpha^{ij}$. Now, consider the $n \times (n - k)$ matrix

$$H = \begin{pmatrix} (\alpha^0)^1 & (\alpha^0)^2 & \cdots & (\alpha^0)^{n-k} \\ (\alpha^1)^1 & (\alpha^1)^2 & \cdots & (\alpha^1)^{n-k} \\ \vdots & \vdots & \ddots & \vdots \\ (\alpha^{n-1})^1 & (\alpha^{n-1})^2 & \cdots & (\alpha^{n-1})^{n-k} \end{pmatrix}$$

so for $0 \le i \le n - 1$ and $1 \le j \le n - k$, $H[i, j] = \alpha^{ij}$. $H$ is a Vandermonde matrix with the right dimensions, so in order to prove that $H$ is indeed the parity-check matrix of the RS code, it is sufficient to prove that $H^T G = 0_{(n-k) \times k}$. We have that

$$(H^T G)[a, b] = \sum_{i=0}^{n-1} H[i, a] \, G[i, b] = \sum_{i=0}^{n-1} \big( \alpha^{a+b} \big)^i.$$

Here $a$ ranges from 1 to $n - k$ and $b$ ranges from 0 to $k - 1$, so $1 \le a + b \le n - 1$. Since $\alpha$ has order $n$, the element $\beta = \alpha^{a+b}$ satisfies $\beta \ne 1$ and $\beta^n = 1$, so the geometric sum equals $(\beta^n - 1)/(\beta - 1) = 0$.

Remark 8. An $[n, k, d]_q$ code that satisfies $d = n - k + 1$ (like the RS code) is called an MDS (maximum distance separable) code. Such codes meet the Singleton bound $d \le n - k + 1$, which we will see later, and thus have an optimal number of codewords for the given $n$ and $d$.

Remark 9. With 0 as an additional evaluation point, the code becomes a $[q, k, q - k + 1]_q$ code. We encourage the reader to find natural generating and parity-check matrices for it. Also verify that the dual of such a code is of the same type.

¹ It is also common to take the zero element as well; however, for reasons that will become clear soon, we will not use the zero element as an evaluation point.
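The construction in this section can be checked numerically on a toy instance. The following Python sketch is an illustration under stated assumptions, not the authors' code: it takes the prime field $\mathbb{F}_7$ with generator $\alpha = 3$ and $k = 3$ (so $n = 6$), and the names `rs_encode`, `HtG`, and `min_weight` are hypothetical. It builds $G$ and $H$ entry by entry from $\alpha^{ij}$, verifies $H^T G = 0$, and finds the minimum weight of a nonzero codeword by brute force.

```python
from itertools import product

q, k = 7, 3            # assumed toy parameters: the prime field F_7, dimension 3
alpha = 3              # 3 generates F_7^*: its powers are 1, 3, 2, 6, 4, 5
n = q - 1

points = [pow(alpha, i, q) for i in range(n)]   # alpha^0, ..., alpha^(n-1)

def rs_encode(msg):
    """Evaluate p_a(x) = sum_i a_i x^i at every evaluation point, over F_q."""
    return [sum(a * pow(x, i, q) for i, a in enumerate(msg)) % q for x in points]

# G[i][j] = alpha^(i*j) for 0 <= j <= k-1.
G = [[pow(alpha, i * j, q) for j in range(k)] for i in range(n)]
# H[i][j] = alpha^(i*j) for 1 <= j <= n-k (stored with a 0-based column index).
H = [[pow(alpha, i * j, q) for j in range(1, n - k + 1)] for i in range(n)]

# H^T G, which should be the (n-k) x k zero matrix.
HtG = [[sum(H[i][a] * G[i][b] for i in range(n)) % q for b in range(k)]
       for a in range(n - k)]

# Minimum weight over all nonzero messages, to compare with n - k + 1.
min_weight = min(sum(1 for v in rs_encode(m) if v)
                 for m in product(range(q), repeat=k) if any(m))
```

Restricting to a prime field keeps the arithmetic to integers mod $q$; a prime-power field would need a polynomial representation of the field elements instead.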

4 The Reed-Muller Code

We will follow the references in the syllabus (Sudan's Lecture 4). For calculating the distance, we will use the following lemmas. For a field size larger than the degree, we will use:

Lemma 10 (Schwartz-Zippel). A nonzero polynomial $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ of total degree $d$ is zero on at most a $\frac{d}{q}$ fraction of the points in $\mathbb{F}_q^n$.

For a field size smaller than the degree, we will use:

Lemma 11 ([2], Theorem 19 of Lecture 2.5). If a nonzero $f \in \mathbb{F}_q[x_1, \ldots, x_n]$ has individual degree at most $l < q$ in each variable and has total degree $d = lk + r$ with $r < l$, then it is nonzero on at least a

$$\left( 1 - \frac{l}{q} \right)^k \left( 1 - \frac{r}{q} \right)$$

fraction of the points in $\mathbb{F}_q^n$.

5 The Hermitian Code

Up until now, the domains of our codes were the $q$-ary hypercube. We can instead pick a subset $S \subseteq \mathbb{F}_q^m$ with some nice geometric properties. We start with the trace and norm functions.

Definition 12. The functions $\mathrm{Tr}, N : \mathbb{F}_{q^r} \to \mathbb{F}_q$ are given by

$$\mathrm{Tr}(x) = \sum_{i=0}^{r-1} x^{q^i} \quad \text{and} \quad N(x) = \prod_{i=0}^{r-1} x^{q^i}.$$

The image of both functions is in $\mathbb{F}_q$ (this follows easily from $\mathbb{F}_q = \{ \alpha \in \mathbb{F}_{q^r} \mid \alpha^q = \alpha \}$, using $\alpha^{q^r} = \alpha$ for every $\alpha \in \mathbb{F}_{q^r}$. Check!). It is easy to check that the trace is additive whereas the norm is multiplicative. Also,

Claim 13. $\mathrm{Tr} : \mathbb{F}_{q^r} \to \mathbb{F}_q$ is a perfect $q^{r-1}$-to-1 map, and $N : \mathbb{F}_{q^r}^* \to \mathbb{F}_q^*$ is a perfect $\left( \sum_{i=0}^{r-1} q^i \right)$-to-1 map.

The Hermitian code lies on the curve

$$S = \left\{ (x, y) \in \mathbb{F}_{q^2}^2 \mid \mathrm{Tr}(x) = N(y) \right\} = \left\{ (x, y) \in \mathbb{F}_{q^2}^2 \mid x^q + x = y^{q+1} \right\},$$

where the trace and norm are those of $\mathbb{F}_{q^2}$ over $\mathbb{F}_q$. From here on, $r$ denotes a degree bound: the message space is given by bivariate polynomials over $\mathbb{F}_{q^2}$ with total degree at most $r \le q$, so $k = \binom{r+2}{2} \ge \frac{r^2}{2}$. That is, similar to an RM code, we evaluate low-degree bivariate polynomials, but this time on a curve. We shall determine the length of the code, $|S|$, and the distance of the code.

Claim 14. $|S| = q^3$.

Proof. For every choice of $y$ (there are $q^2$ such choices), let $\gamma = N(y)$. Then, there are $q$ choices of $x \in \mathbb{F}_{q^2}$ satisfying $\mathrm{Tr}(x) = \gamma$, as $\mathrm{Tr}$ is a perfect $q$-to-1 map.

To determine the distance, we will use Bézout's theorem.
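Before turning to the distance, Claim 14 can be sanity-checked by direct enumeration in the smallest odd case. The sketch below is an illustration, not part of the notes: it assumes $q = 3$ and models $\mathbb{F}_9$ as $\mathbb{F}_3[i]$ with $i^2 = -1$ (valid because $-1$ is not a square modulo 3), storing $a + bi$ as the pair `(a, b)`. All function names are hypothetical.

```python
from itertools import product

q = 3  # assumed toy case; F_{q^2} = F_9 is modelled as F_3[i] with i^2 = -1

def add(u, v):
    """Add two elements of F_9 coordinate-wise modulo 3."""
    return ((u[0] + v[0]) % q, (u[1] + v[1]) % q)

def mul(u, v):
    """Multiply u = a + b*i and v = c + d*i in F_9, using i^2 = -1."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % q, (a * d + b * c) % q)

def power(u, e):
    """Compute u^e by repeated multiplication (e is small here)."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

def trace(x):
    return add(power(x, q), x)      # Tr(x) = x^q + x

def norm(y):
    return power(y, q + 1)          # N(y) = y^(q+1)

field = list(product(range(q), repeat=2))        # all 9 elements of F_9
S = [(x, y) for x in field for y in field if trace(x) == norm(y)]
num_points = len(S)                              # compare with q^3
```

Both `trace` and `norm` return pairs whose second coordinate is 0, reflecting that their images lie in $\mathbb{F}_3$.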

Theorem 15 (Bézout, see e.g., Chapter 7 of [1]). If $\mathbb{F}$ is a field and $f, g \in \mathbb{F}[x, y]$ have no common factors over the algebraic closure, then

$$\left| \{ (\alpha, \beta) \mid f(\alpha, \beta) = g(\alpha, \beta) = 0 \} \right| \le \deg(f) \cdot \deg(g).$$

Lemma 16. The distance $d$ of the code is at least $q^3 - r(q + 1)$.

Proof. Let $f \in \mathbb{F}_{q^2}[x, y]$ be a nonzero polynomial of degree at most $r$. Define $g(x, y) = \mathrm{Tr}(x) - N(y)$, so $S$ is the set of zeros of $g$. Since $\deg(g) = q + 1 > r$ and $g$ is irreducible, $g$ and $f$ have no common factors. By Bézout's theorem, the number of zero evaluation points (that is, $(\alpha, \beta)$ such that $f(\alpha, \beta) = g(\alpha, \beta) = 0$) is at most $\deg(f) \cdot \deg(g) \le r(q + 1)$, so $d \ge n - r(q + 1)$.

In conclusion, the code is a $\left[ q^3, \frac{r^2}{2}, q^3 - r(q + 1) \right]_{q^2}$ code, with the dimension and the distance understood as lower bounds.

References

[1] H. Hilton. Plane Algebraic Curves. The Clarendon Press.
[2] Madhu Sudan. Algorithmic Introduction to Coding Theory, lecture notes, September 2001.