Error-Correcting Codes

HMC Algebraic Geometry Final Project

Dmitri Skjorshammer

December 14, 2010

1 Introduction

Transmission of information takes place over noisy channels; this is the case in satellite transmission and in data storage. For this reason, it is useful to encode information so that errors can be detected and corrected. For our purposes, we will assume all information to be encoded consists of words of fixed length $k$ over a fixed alphabet, and all encoded messages are divided into strings called codewords of fixed block length $n$ over the same alphabet. To formalize this, we introduce the finite field $\mathbb{F}_q$, a field with $q$ elements, and the vector space $\mathbb{F}_q^n$ of $n$-tuples of elements of $\mathbb{F}_q$. The design of electronic circuitry requires us to work with $\{0,1\}$, and since we can represent the letters of any alphabet as strings of $0$s and $1$s, our alphabet can be identified with the elements of $\mathbb{F}_{2^r}$. Since the words to be encoded have fixed length $k$, we can think of them as elements of $\mathbb{F}_{2^r}^k$ and the codewords as elements of $\mathbb{F}_{2^r}^n$. Thus, if our alphabet is $\{000, 001, 010, 100, 011, 101, 110, 111\}$, then the word 010 101 111 can be identified with $(010, 101, 111) \in \mathbb{F}_8^3$. The encoding process is a one-to-one function $E : \mathbb{F}_{2^r}^k \to \mathbb{F}_{2^r}^n$. The image $C = E(\mathbb{F}_{2^r}^k)$ is called the code. With this map we associate a decoding operation $D : \mathbb{F}_q^n \to \mathbb{F}_q^k$ such that $D \circ E$ is the identity on $\mathbb{F}_q^k$.

2 Linear Codes

If a code forms a vector subspace of $\mathbb{F}_q^n$ of dimension $k$, we call it a linear code. We can then think of the encoding process $E$ as a linear transformation represented by a matrix $G$, called the generator matrix. Since $E$ maps a $k$-dimensional space to an $n$-dimensional space, we view generator matrices as $k \times n$ matrices and the strings in $\mathbb{F}_q^k$ as row vectors $w$; a word is encoded by multiplying on the right by the matrix, $w \mapsto wG$.
2.1 Hamming distance

To study the error-correcting capabilities of codes, we define the Hamming distance between $x$ and $y$ as
\[
d(x, y) = |\{\, i,\ 1 \le i \le n : x_i \ne y_i \,\}|.
\]
Intuitively, this is the number of positions in which $x$ and $y$ differ. This is indeed a distance function: (1) $d(x, y) \ge 0$, and $d(x, x) = 0$ since all symbols agree; (2) interchanging the strings does not change the count of differing positions, so $d(x, y) = d(y, x)$; (3) for any $x, y, z$, every position where $x$ and $y$ differ is a position where $z$ differs from $x$ or from $y$, so passing through an intermediary can only add differing positions and the triangle inequality $d(x, y) \le d(x, z) + d(z, y)$ holds.
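The distance function above is straightforward to compute; the following is a minimal sketch (the function name `hamming_distance` is our own, and codewords are assumed to be represented as equal-length strings or sequences of symbols).

```python
def hamming_distance(x, y):
    """Number of positions in which x and y differ (the Hamming distance)."""
    assert len(x) == len(y), "both words must have the same block length n"
    return sum(1 for xi, yi in zip(x, y) if xi != yi)
```

For example, `hamming_distance("010", "111")` counts two differing positions, and the distance from any word to itself is 0.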
Now, given $x \in \mathbb{F}_q^n$, denote by $B_r(x)$ the closed ball of radius $r$:
\[
B_r(x) = \{\, y \in \mathbb{F}_q^n : d(y, x) \le r \,\}.
\]
Intuitively, this is the set of elements differing from $x$ in at most $r$ positions.

Theorem 2.1. Let $C \subseteq \mathbb{F}_q^n$ be a code such that $d(x, y) \ge d$ for all distinct $x, y \in C$, where $d \ge 1$. Then any $d - 1$ or fewer errors in a received word can be detected. If $d \ge 2t + 1$ for some $t \ge 1$, then $t$ or fewer errors can be corrected.

Proof. Let $x \in C$ and suppose $x'$ is a corruption of $x$. If at least one but no more than $d - 1$ symbols are changed, then $1 \le d(x, x') \le d - 1$, so $x'$ cannot be another codeword, since distinct codewords are at distance at least $d$; thus the error is detected. Now suppose $d \ge 2t + 1$. For any $z \in \mathbb{F}_q^n$ and distinct $x, y \in C$, the triangle inequality gives
\[
d(x, z) + d(z, y) \ge d(x, y) \ge 2t + 1,
\]
which implies $d(x, z) > t$ or $d(y, z) > t$; hence $B_t(x) \cap B_t(y) = \emptyset$ for distinct codewords $x, y$. So if a codeword $x$ is corrupted by an error vector $e$ with at most $t$ nonzero entries, the received word $x' = x + e$ lies in $B_t(x)$ and in no other ball $B_t(y)$ with $y \in C$; therefore $x$ is the unique codeword within distance $t$ of $x'$ and can be recovered.

2.2 Decoding

2.2.1 Nearest Neighbor Decoding

There are many ways to decode information; the simplest is the nearest neighbor decoding algorithm. Given a received word $x'$, we decode it as $D(x') = E^{-1}(x)$, where $x \in C$ is chosen so that $d(x', x)$ is minimized. If at most $t$ errors occurred and $d \ge 2t + 1$, then by the proof above $x'$ lies in $B_t(x)$ for exactly one codeword $x$, so the algorithm chooses the correct codeword, i.e. the nearest neighbor. Note that since we must search through all of $C$, this operation takes $O(q^k)$ running time.

2.2.2 Syndrome Decoding

The syndrome decoding algorithm is a faster algorithm that makes use of the vector space structure inherent in $C$. Before we can understand the machinery, we introduce the parity check matrix. The subspace $C \subseteq \mathbb{F}_q^n$ is the set of solutions of a system of $n - k$ independent linear equations in $n$ variables.
The matrix of coefficients of such a system is called a parity check matrix $H$.

Example 2.2. Consider the alphabet $\mathbb{F}_2$ with $n = 4$ and $k = 2$. Let $C$ be the code with generator matrix
\[
G = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}.
\]
There are precisely four elements of $C$, since the words of length $k = 2$ are $(0,0), (1,0), (0,1), (1,1)$:
\[
(0,0)G = (0,0,0,0), \quad (1,0)G = (1,1,1,1), \quad (0,1)G = (1,0,1,0), \quad (1,1)G = (0,1,0,1).
\]
The parity check matrix is
\[
H = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 1 & 1 \\ 1 & 0 \end{pmatrix},
\]
and it is easily verified that $xH = 0$ for all $x \in C$.
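Example 2.2 can be checked mechanically; here is a sketch over $\mathbb{F}_2$ (the helper `vec_mat_f2` is our own name for row-vector-times-matrix arithmetic mod 2).

```python
# Generator and parity check matrices of Example 2.2 over F_2.
G = [[1, 1, 1, 1],
     [1, 0, 1, 0]]
H = [[1, 1],
     [1, 0],
     [1, 1],
     [1, 0]]

def vec_mat_f2(v, M):
    """Row vector v times matrix M, with all arithmetic mod 2."""
    return tuple(sum(v[i] * M[i][j] for i in range(len(v))) % 2
                 for j in range(len(M[0])))

# The four codewords wG for all words w of length k = 2.
codewords = [vec_mat_f2(w, G) for w in [(0, 0), (1, 0), (0, 1), (1, 1)]]

# Every codeword lies in the kernel of H, i.e. xH = 0.
assert all(vec_mat_f2(x, H) == (0, 0) for x in codewords)
```

Running this confirms the four codewords listed in the example and the parity check relation $xH = 0$.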
Now suppose $x = wG$ is a codeword for $w \in \mathbb{F}_q^k$, and $e \in \mathbb{F}_q^n$ is an error introduced in the transmission of $x$, so that the received word is $x' = x + e$. Then
\[
x'H = (x + e)H = xH + eH = 0 + eH = eH,
\]
so $x'H$ depends only on $e$. The elements $s = eH \in \mathbb{F}_q^{n-k}$ are called syndromes. Two words have the same syndrome exactly when they lie in the same coset of $C$, so for efficient processing the syndromes are computed before the decoding process, and a coset representative with the smallest number of nonzero entries is chosen as the coset leader. The algorithm works as follows. If $x' \in \mathbb{F}_q^n$ is received, the syndrome $s = x'H$ is computed and looked up to find the coset leader $l$ associated to $s$. If $l$ is unique, replace $x'$ by $x = x' - l$, which is an element of $C$ since $x'$ and $l$ lie in the same coset. If $l$ is not unique, report an error. If at most $t$ errors occurred in $x'$ (with $d \ge 2t + 1$), then $x$ is the unique codeword closest to the received word, and $E^{-1}(x)$ is returned. Unlike the nearest neighbor decoding algorithm, this requires only $O(q^{n-k})$ running time, one lookup per syndrome. While this running time is better, it is still slow in general; in fact, decoding arbitrary linear codes is known to be NP-hard, so our present algorithms are relatively fast.

3 Reed-Solomon Codes

Before we introduce algebraic geometry into coding theory, we introduce a special subclass of linear codes called the Reed-Solomon codes. These codes serve as a nice connection between linear codes and geometric Goppa codes. Choose a finite field $\mathbb{F}_q$ and let $\alpha$ be a generator of the multiplicative group $(\mathbb{F}_q)^*$. Fix $k < q$ and let
\[
P_{k-1} = \Big\{ \sum_{i=0}^{k-1} a_i t^i : a_i \in \mathbb{F}_q \Big\}
\]
be the vector space of polynomials of degree at most $k - 1$ in $\mathbb{F}_q[t]$. The idea is that codewords are created by evaluating polynomials in $P_{k-1}$ at the $q - 1$ nonzero elements of $\mathbb{F}_q$. We define the Reed-Solomon code as
\[
RS(k, q) = \{\, (f(1), f(\alpha), \ldots, f(\alpha^{q-2})) \in \mathbb{F}_q^{q-1} : f \in P_{k-1} \,\}.
\]
Since $P_{k-1}$ is a vector space, so is $RS(k, q)$, and we can think of the encoding process as a linear transformation.
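The syndrome decoding procedure described in Section 2.2.2 can be sketched as follows, reusing the matrices of Example 2.2. Note that this particular code has minimum distance $d = 2$, so a single-bit error produces a nonzero syndrome whose coset leader is not unique: the sketch therefore exercises the "report an error" branch rather than performing a correction. (The names `leaders` and `syndrome_decode` are our own.)

```python
from itertools import product

# Matrices from Example 2.2, over F_2.
G = [[1, 1, 1, 1], [1, 0, 1, 0]]
H = [[1, 1], [1, 0], [1, 1], [1, 0]]

def vec_mat_f2(v, M):
    """Row vector v times matrix M, with all arithmetic mod 2."""
    return tuple(sum(v[i] * M[i][j] for i in range(len(v))) % 2
                 for j in range(len(M[0])))

# Precompute, for each syndrome s = eH, every minimal-weight coset leader.
leaders = {}
for e in product((0, 1), repeat=4):
    s = vec_mat_f2(e, H)
    best = leaders.setdefault(s, [])
    if not best or sum(e) < sum(best[0]):
        leaders[s] = [e]          # strictly lighter: new unique leader
    elif sum(e) == sum(best[0]):
        best.append(e)            # tie: the leader is no longer unique

def syndrome_decode(x):
    """Subtract (= add, over F_2) the coset leader for the syndrome of x.
    Returns the codeword if the leader is unique, else None (error reported)."""
    ls = leaders[vec_mat_f2(x, H)]
    if len(ls) > 1:
        return None
    return tuple((xi + li) % 2 for xi, li in zip(x, ls[0]))
```

A received codeword (syndrome zero, leader the zero vector) decodes to itself; a word with one flipped bit yields an ambiguous coset leader and the algorithm reports an error, exactly as one expects for a code with $d = 2$.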
In other words, the generator matrix $G$ is well-defined and can be obtained by taking the basis $\{1, t, t^2, \ldots, t^{k-1}\}$ and evaluating each basis element at $1, \alpha, \ldots, \alpha^{q-2}$.

Example 3.1. The Reed-Solomon code over $\mathbb{F}_8$ with $k = 3$ can be computed by choosing the basis $\{1, t, t^2\}$. Since $\alpha^7 = 1$, this gives three defining linear maps
\[
1 \mapsto (1, 1, 1, 1, 1, 1, 1), \qquad
t \mapsto (1, \alpha, \alpha^2, \alpha^3, \alpha^4, \alpha^5, \alpha^6), \qquad
t^2 \mapsto (1, \alpha^2, \alpha^4, \alpha^6, \alpha, \alpha^3, \alpha^5),
\]
which define the rows of the generator matrix.

Beyond being linear, Reed-Solomon codes have the additional property of being cyclic codes: a cyclic permutation of the symbols of a codeword yields another codeword. Their practicality manifests itself in a variety of places; good burst-error correcting capabilities and efficient decoding algorithms are some of them.

4 Geometric Goppa Codes

We finish this exposition by connecting coding theory to algebraic geometry via algebraic geometric codes, also known as Goppa codes. In order to understand Goppa codes, we need to understand Riemann-Roch spaces.
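The construction above can be sketched in code. To keep the arithmetic self-contained we assume the prime field $\mathbb{F}_7$ instead of the $\mathbb{F}_8$ of Example 3.1 (so field operations are ordinary arithmetic mod 7), and take $\alpha = 3$, a generator of $(\mathbb{F}_7)^*$; the function name `rs_encode` is our own.

```python
# Reed-Solomon sketch over the prime field F_7 with k = 3 and alpha = 3.
q, k, alpha = 7, 3, 3

# The q - 1 evaluation points 1, alpha, alpha^2, ..., alpha^(q-2).
points = [pow(alpha, i, q) for i in range(q - 1)]

def rs_encode(word):
    """Evaluate f(t) = word[0] + word[1]*t + ... + word[k-1]*t^(k-1)
    at every nonzero field element, giving a codeword of RS(k, q)."""
    assert len(word) == k
    return tuple(sum(a * pow(p, i, q) for i, a in enumerate(word)) % q
                 for p in points)

# Rows of the generator matrix: the basis monomials 1, t, t^2
# evaluated at the q - 1 points.
Gmat = [rs_encode(tuple(1 if j == i else 0 for j in range(k)))
        for i in range(k)]
```

A quick check of the cyclic property: shifting a codeword one place corresponds to replacing $f(t)$ by $f(\alpha t)$, which has the same degree and so still lies in $P_{k-1}$.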
4.1 Riemann-Roch spaces

A divisor is a linear combination of codimension-one subvarieties of an algebraic variety. For our purposes we work with plane curves $C$, so that our variety has dimension 1, and a divisor $D$ is a finite linear combination (with integer coefficients) of rational points on the curve $C$:
\[
D = \sum_{P \in C} D_P P.
\]
Given some fixed divisor $D$ on a plane, non-singular curve $C$, we define the Riemann-Roch space, denoted by $L(D)$, as
\[
L(D) := \{\, f \in \mathbb{F}(C) : \mathrm{ord}_P(f) + D_P \ge 0 \ \text{for all } P \in C \,\},
\]
where $\mathrm{ord}_P(f)$ is the order of vanishing of $f$ at $P$ (negative when $f$ has a pole at $P$); note that we restrict our attention to projective curves. Here $\mathbb{F}(C)$ is the field of fractions of the coordinate ring $\mathbb{F}[C] = \mathbb{F}[x, y]/I(C)$. (To see that this is simply the quotient by the defining polynomial $h$, observe that $\langle h \rangle$ is a radical ideal, so by the Nullstellensatz $I(C) = I(V(\langle h \rangle)) = \sqrt{\langle h \rangle} = \langle h \rangle$.)

What structure does $L(D)$ admit? That of a vector space. Identity: consider $0 \in \mathbb{F}(C)$; by convention $\mathrm{ord}_P(0) = \infty$ for all $P \in C$, so $0 \in L(D)$. Commutativity and associativity are inherited from $\mathbb{F}(C)$. Inverse: suppose $f \in L(D)$; then $\mathrm{ord}_P(f) + D_P \ge 0$ for all points $P \in C$, and since $\mathrm{ord}_P(f) = \mathrm{ord}_P(-f)$, we get $-f \in L(D)$. Scaling by a nonzero constant does not affect the order of $f$, and closure under addition follows from $\mathrm{ord}_P(f + g) \ge \min(\mathrm{ord}_P(f), \mathrm{ord}_P(g))$. The Riemann-Roch space describes certain properties of the corresponding Riemann surface, which allows us to apply the Riemann-Roch theorem to conclude that $L(D)$ is finite-dimensional.

4.2 Goppa Codes

We are now fortunate enough to have a finite-dimensional vector space at our fingertips. To show that it does indeed define a linear code, we introduce the notion of the support of a divisor, which is simply
\[
\mathrm{supp}(D) := \{\, P : D_P \ne 0 \,\}.
\]
To extend $L(D)$ to a linear code, we apply the same ideas that we considered in constructing Reed-Solomon codes. Let $C$ be a non-singular, projective plane curve and let $P_1, \ldots, P_n$ be $n$ distinct rational points on the curve $C$.
Choose a divisor $D$ whose support is disjoint from that of the divisor $B = P_1 + \cdots + P_n$. Intuitively, the support of $B$ consists of all $n$ rational points (since each coefficient is nonzero), so requiring the supports to be disjoint means the coefficients $D_{P_i}$ vanish for $i = 1, \ldots, n$; in particular, every $f \in L(D)$ is regular at each $P_i$. The Goppa code of $B$ and $D$, denoted $G(C, B, D)$, is defined as
\[
G(C, B, D) := \{\, (f(P_1), \ldots, f(P_n)) : f \in L(D) \,\}.
\]
The astute reader will note that this is the image of the linear map $\varphi : L(D) \to \mathbb{F}_q^n$ given by $f \mapsto (f(P_1), \ldots, f(P_n))$, so $G(C, B, D)$ is indeed a linear code.

4.3 Applications

The advantage of Goppa codes is that they allow the construction and analysis of codes using the vast tools of algebraic geometry. Thanks to the Riemann-Roch theorem, for instance, the dimension and bounds on the minimum distance of the code can be read off from the divisor $D$ and the genus of $C$. This is just the basics, and the reader is encouraged to continue exploring the powerful applications of algebraic geometry to coding theory.
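To tie the two constructions together, here is a standard worked illustration, not taken from the discussion above: the Reed-Solomon codes of Section 3 arise as Goppa codes on the projective line.

```latex
% Take C = \mathbb{P}^1 over \mathbb{F}_q, with B = P_1 + \cdots + P_{q-1}
% the sum of the points t = 1, \alpha, \ldots, \alpha^{q-2}, and
% D = (k-1)P_\infty, whose support {P_\infty} is disjoint from supp(B).
% A rational function with poles only at infinity, of order at most k-1,
% is exactly a polynomial of degree at most k-1:
\[
  L\big((k-1)P_\infty\big) = P_{k-1},
  \qquad
  \dim L\big((k-1)P_\infty\big) = k,
\]
% so evaluating at P_1, \ldots, P_{q-1} recovers the Reed-Solomon code:
\[
  G\big(\mathbb{P}^1,\; B,\; (k-1)P_\infty\big) = RS(k, q).
\]
```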