IITM-CS6845: Theory Toolkit    February 08, 2012
Lecture 19: Reed-Muller, Concatenation Codes & Decoding Problem
Lecturer: Jayalal Sarma    Scribe: Dinesh K
Theme: Error correcting codes

In the previous lecture, we saw an important bound on the code parameters (minimum distance $d$, code size $k$), namely the Hamming bound. We also came up with generalisations of Hamming codes and showed that they achieve the Hamming bound. We explored the notion of the dual of a code and obtained the dual of the Hamming code, called the Hadamard code. We then saw another bound, the Singleton bound, and asked for codes that achieve it. This led to Reed-Solomon codes, and we argued that they meet the Singleton bound.

In this lecture, we shall see more on Reed-Solomon codes, concatenation codes, Reed-Muller codes (where we shall visit the Schwartz-Zippel lemma), the unique decoding problem in coding theory, and the Berlekamp-Welch decoding algorithm for Reed-Solomon codes.

1 Singleton bound and Reed-Solomon codes

Before heading on, let us try to gain more intuition on the relation between the code parameters, distance $d$ and code size $k$. The Singleton bound says that for any $(n, k, d)$ code,
\[ d \le n - k + 1. \]
Dividing by $n$ on both sides,
\[ \frac{d}{n} \le 1 - \frac{k}{n} + \frac{1}{n}, \qquad \text{that is,} \qquad \delta \le 1 - r + \frac{1}{n}, \]
where $\delta = d/n$ is the relative minimum distance (error correcting ability) and $r = k/n$ is the rate (efficiency of the code). Attempting to improve the rate costs us minimum distance and vice versa. Thus the Singleton bound clearly brings out the trade-off involved in selecting the parameters. Recall,

Definition 1 (Reed-Solomon codes). An $(n, k, n-k+1)$ code over the alphabet $\mathbb{F}_q$ is a Reed-Solomon code if there is a set $S = \{\alpha_1, \alpha_2, \ldots, \alpha_n\} \subseteq \mathbb{F}_q$ of distinct points such that each message $m = (m_0, m_1, \ldots, m_{k-1})$ is associated with the polynomial $p_m(x) = \sum_{i=0}^{k-1} m_i x^i$, and the codeword of $m$ is the $n$-tuple obtained by evaluating $p_m(x)$ at the points of $S$. Hence
\[ C_{RS} = \left\{ (p_m(\alpha_i))_{i \in \{1, 2, \ldots, n\}} \;\middle|\; m = (m_0, m_1, \ldots, m_{k-1}) \in \mathbb{F}_q^k \right\}. \]
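The definition above can be tried out on a toy instance. The following is a minimal sketch, assuming a prime field $\mathbb{F}_7$ (so that field arithmetic is just arithmetic modulo 7) and the evaluation points $1, \ldots, 6$; the function name `rs_encode` and these parameters are illustrative choices and do not come from the lecture. It encodes messages by polynomial evaluation and finds the minimum weight of a non-zero codeword by brute force, which equals the minimum distance because the code is linear.

```python
from itertools import product

def rs_encode(message, points, q):
    """Evaluate p_m(x) = sum_i m_i x^i at every point, with arithmetic mod the prime q."""
    codeword = []
    for alpha in points:
        value = 0
        # Horner's rule, starting from the highest-degree coefficient m_{k-1}.
        for coeff in reversed(message):
            value = (value * alpha + coeff) % q
        codeword.append(value)
    return codeword

q = 7                          # assumed toy field F_7
points = [1, 2, 3, 4, 5, 6]    # n = 6 distinct evaluation points
n, k = len(points), 3

# Minimum weight over all non-zero messages (343 messages in total).
min_weight = min(
    sum(1 for symbol in rs_encode(m, points, q) if symbol != 0)
    for m in product(range(q), repeat=k)
    if any(m)
)
print(min_weight, n - k + 1)   # both are 4 for these parameters
```

For these parameters the brute-force minimum distance agrees with $n - k + 1$, as the Singleton bound with equality predicts.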
We saw that the code achieves the Singleton bound $d = n - k + 1$ and hence has good rate and minimum distance. But there are still some disadvantages, outlined below.

1. The codes are not defined over the binary alphabet. This makes computer implementations difficult.
2. The field size must be at least $n$. This is necessary since the $n$ evaluation points in $S$ are distinct elements of $\mathbb{F}_q$, so $q \ge |S| = n$. Hence larger fields are required for larger codes.

So the next attempt is to achieve (or attempt to achieve) the Singleton bound

1. even on binary alphabets, which is essentially done through code concatenation,
2. with smaller fields, resulting in Reed-Muller codes.

2 Concatenation codes

The trouble with Reed-Solomon codes is that the alphabet is not binary. A natural thought to overcome this is to represent the alphabet symbols themselves in binary. In this case, we would like to see whether the Singleton bound is still met. Since the field must have size at least $n$, each symbol of $\mathbb{F}_q$ requires at least $\log n$ bits to be represented in binary. Hence, using $\log n$ bits per symbol, the modified RS code $(n', k', d')$ has
\[ n' = n \log n, \qquad k' = k \log n. \]
The whole process can be seen as follows.

[Figure: a message in $\{0,1\}^{k \log n}$ is read as an element of $\mathbb{F}_q^k$, encoded by the RS code into $\mathbb{F}_q^n$, and written out as a word in $\{0,1\}^{n \log n}$; two distinct RS codewords differ in at least $d$ positions.]

So what is the distance $d'$? Since two RS codewords must differ in at least $d$ positions, and the binary representations at each such position must differ in at least one bit, $d' \ge d$. At the same time, the quantity $n' - k' + 1 = (n - k)\log n + 1$ has grown significantly. Hence we have
\[ n - k + 1 = d \le d' \le n' - k' + 1. \]
Essentially, the gap has increased. One can still argue that the bound $d' \ge d$ is weak, but it is possible to construct examples where it is tight; this is left as an exercise.

So we are not achieving much by converting to binary, since the distance has not improved. Why not use another code, one that has good minimum distance and is defined over the binary alphabet, to perform this transformation? Such an encoding, where two codes (the outer and the inner code) are used in conjunction, is called a concatenation code. Here, the inner code is generally a Reed-Solomon code. The encoding scheme is shown as follows.

[Figure: the encoding scheme. A message in $\Sigma^{kK}$ is read as $K$ symbols of $\Gamma$ and encoded by $C_{inner} : \Gamma^K \to \Gamma^N$; each of the $N$ resulting symbols of $\Gamma$ is read as a block in $\Sigma^k$ and encoded by $C_{outer} : \Sigma^k \to \Sigma^n$, giving a codeword in $\Sigma^{nN}$.]

Definition 2 (Concatenation codes [For66]). Let $C_{outer} : \Sigma^k \to \Sigma^n$ be an $(n, k, d)$ code and $C_{inner} : \Gamma^K \to \Gamma^N$ be an $(N, K, D)$ code, where the symbols of $\Gamma$ are identified with the blocks in $\Sigma^k$. Then $C_{outer} \circ C_{inner}$ is the concatenation code from $\Sigma^{kK}$ to $\Sigma^{nN}$ such that for a message
\[ m = (m_1, m_2, \ldots, m_K) \in \Sigma^{kK}, \qquad m_i \in \Sigma^k = \Gamma, \]
the codeword is
\[ (C_{outer}(m'_1), C_{outer}(m'_2), \ldots, C_{outer}(m'_N)) \in \Sigma^{nN}, \quad \text{where} \quad (m'_1, m'_2, \ldots, m'_N) = C_{inner}(m_1, m_2, \ldots, m_K) \in \Gamma^N. \]
Thus, we get an $(nN, kK, D')$ code. (Many texts use the two names the other way round and call the code over the larger alphabet $\Gamma$, which is applied first, the outer code.)
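A small concrete instance may help to fix the construction. The following is a minimal sketch under stated assumptions: the code over the larger alphabet is a $[3, 2, 2]$ Reed-Solomon code over $\mathbb{F}_4$ (so $N = 3$, $K = 2$, $D = 2$), the binary code is the $[3, 2, 2]$ parity code (so $n = 3$, $k = 2$, $d = 2$), the large-alphabet code is applied first, and the binary code is then applied to every resulting symbol. The multiplication table, the function names, and the choice of codes are illustrative and are not taken from the lecture.

```python
from itertools import product

# GF(4) = {0, 1, 2, 3}, elements read as 2-bit polynomials modulo x^2 + x + 1.
# Addition is bitwise XOR; multiplication is given by this table.
GF4_MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

def encode_large_alphabet(symbols):
    """[3, 2, 2] Reed-Solomon code over GF(4): evaluate m0 + m1*x at x = 1, 2, 3."""
    m0, m1 = symbols
    return [m0 ^ GF4_MUL[m1][x] for x in (1, 2, 3)]

def encode_binary(symbol):
    """[3, 2, 2] binary parity code applied to the two bits of one GF(4) symbol."""
    hi, lo = symbol >> 1, symbol & 1
    return [hi, lo, hi ^ lo]

def concatenated_encode(bits):
    """Map 4 message bits to 9 code bits."""
    # Read the 4 bits as 2 symbols of GF(4).
    symbols = [2 * bits[0] + bits[1], 2 * bits[2] + bits[3]]
    return [b for s in encode_large_alphabet(symbols) for b in encode_binary(s)]

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

codewords = [concatenated_encode(m) for m in product((0, 1), repeat=4)]
min_distance = min(
    hamming_distance(u, v)
    for i, u in enumerate(codewords)
    for v in codewords[i + 1:]
)
print(min_distance)   # 4, which equals d * D = 2 * 2 here
```

The result is a binary code of length $nN = 9$ carrying $kK = 4$ message bits, and the brute-force minimum distance equals the product of the two distances in this instance.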
Claim 3. $D' \ge dD$ for an $(nN, kK, D')$ concatenation code.

Proof. The first encoding produces tuples over $\Gamma$ that differ in at least $D$ coordinates, and for each such coordinate the second encoding produces blocks that differ in at least $d$ coordinates. Hence together there will be a difference in at least $dD$ coordinates.

But we are still at a loss with respect to the Singleton bound: even though the inner code has $D = N - K + 1$, we do not know of a binary code that achieves the Singleton bound. Hence $d \le n - k + 1$ and
\[ dD \le (N - K + 1)(n - k + 1) \le nN - kK + 1, \]
where the last inequality holds because the difference of the two sides is $(k-1)(N-K) + (K-1)(n-k) \ge 0$. The advantage is that concatenation still helps in alphabet reduction.

3 Reed Muller Codes

Let us get back to our efforts of reducing the field size. Note that for Reed-Solomon codes, we need at least $n$ elements in the field, since the code tuples must be of length $n$. So the question is: can $n$-tuples still be obtained with a smaller field? The answer is yes, and this can be achieved by moving to a multivariate polynomial. We shall see how to reduce the field size through the following example of a 2-variate polynomial.

Let $p(x, y)$ be a bivariate polynomial of degree $t$. Hence,
\[ p(x, y) = \sum_{\substack{0 \le i, j \le t \\ i + j \le t}} m_{ij}\, x^i y^j. \]
We shall interpret the coefficients $m_{ij}$ of the monomials $x^i y^j$ of this polynomial as the message.

Exercise 1. For a bivariate polynomial of degree $t$, show that the number of monomials, given by $|\{(i, j) \mid 0 \le i, j \le t,\ i + j \le t\}|$, is $\binom{t+2}{2}$. In general, show that for a $v$-variate polynomial of degree $t$, the number of monomials is $\binom{t+v}{v}$.

Hence the message length is $k = \binom{t+2}{2}$, that is, $k = \frac{(t+2)(t+1)}{2} > \frac{t^2}{2}$. Hence
\[ t < \sqrt{2k}. \tag{1} \]
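The count in Exercise 1 and inequality (1) can be checked numerically for small parameters. The following is a minimal sketch that enumerates exponent tuples directly; the function name `count_monomials` and the parameter ranges are illustrative assumptions, and the check is of course not a proof of the exercise.

```python
from itertools import product
from math import comb

def count_monomials(t, v):
    """Count exponent tuples (e_1, ..., e_v) with every e_i >= 0 and e_1 + ... + e_v <= t."""
    return sum(1 for exps in product(range(t + 1), repeat=v) if sum(exps) <= t)

# Compare the direct count with the closed form binom(t + v, v).
for v in range(1, 4):
    for t in range(0, 6):
        assert count_monomials(t, v) == comb(t + v, v)

# Inequality (1) for the bivariate case: t < sqrt(2k), i.e. t^2 < 2k.
for t in range(1, 10):
    k = count_monomials(t, 2)
    assert t * t < 2 * k

print(count_monomials(3, 2))   # 10 monomials of total degree at most 3 in x and y
```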
Now, to generate $n$ evaluations of $p(x, y)$, the set $S$ needs to have only $\sqrt{n}$ elements, thereby reducing the field size. Hence for a message $m = (m_1, m_2, \ldots, m_k)$, where $k = \binom{t+2}{2}$,
\[ C(m) = (p(\alpha, \beta))_{(\alpha, \beta) \in S^2}. \]

Exercise 2. Argue that the resulting code is linear.

Now let us find the minimum distance of the Reed-Muller code and see how far off we are from the Singleton bound. Since the code is linear (by Exercise 2), the minimum distance equals the weight of a minimum weight non-zero codeword. Hence it is sufficient to find the minimum number of non-zero entries in any non-zero $n$-tuple that is a codeword. To obtain this bound we use the following lemma.

Lemma 4 (Schwartz-Zippel). Let $p \not\equiv 0$ be a polynomial in $v$ variables $(x_1, x_2, \ldots, x_v)$ of degree $t > 0$ defined over a field $\mathbb{F}$. Let $S$ be a non-empty finite subset of $\mathbb{F}$. Then, for $a$ chosen uniformly at random from $S^v$,
\[ \Pr_{a \in S^v}[p(a) = 0] \le \frac{t}{|S|}. \]

Applying the above lemma with $v = 2$ and $|S| = \sqrt{n}$, and using (1), we have
\[ \Pr_{a \in S^2}[p(a) = 0] \le \frac{\text{total degree}}{|S|} < \frac{\sqrt{2k}}{\sqrt{n}}. \]
Hence,
\[ \text{number of zeros in any non-zero codeword} < \frac{\sqrt{2k}}{\sqrt{n}} \cdot |S|^2 = \frac{\sqrt{2k}}{\sqrt{n}} \cdot n = \sqrt{2kn}. \]
Therefore the number of non-zero entries is $> n - \sqrt{2kn}$. By linearity of the code, the minimum distance satisfies $d > n - \sqrt{2kn}$. Also, $d \le n - k + 1$. Hence,
\[ n - \sqrt{2kn} < d \le n - k + 1. \]
This inequality points to the trade-off between the distance and the field size for the Reed-Muller code construction.

3.1 Proof of Schwartz-Zippel lemma

The proof is by induction on $v$.

Base case. For $v = 1$, we have a non-zero univariate polynomial $p$ of degree $t$, which can have at most $t$ distinct roots. Hence
\[ \Pr_{a \in S}[p(a) = 0] = \frac{|\{a \mid a \in S,\ p(a) = 0\}|}{|S|} \le \frac{t}{|S|}. \]
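Induction case. Suppose the result holds for any non-zero polynomial $p$ of degree $t$ in $v = l - 1$ variables, say $(x_2, x_3, \ldots, x_l)$, with $l > 1$. That is,
\[ \Pr_{a \in S^{l-1}}[p(a) = 0] \le \frac{t}{|S|}. \]
Now consider a non-zero polynomial $q$ of degree $t$ in $v = l$ variables, say $(x_1, x_2, \ldots, x_l)$.

Observation 5. Let $e$ be the degree of $q$ in the variable $x_1$. The polynomial $q$ can be expressed as
\[ q = x_1^{e}\, p_1(x_2, x_3, \ldots, x_l) + x_1^{e-1}\, p_2(x_2, x_3, \ldots, x_l) + \cdots + x_1^{0}\, p_{e+1}(x_2, x_3, \ldots, x_l), \]
where $p_1 \not\equiv 0$ has degree at most $t - e$.

Now fix a uniformly random $a = (a_1, a_2, \ldots, a_l) \in S^l$, and consider the following two events.

1. It can be that for all $j$, $p_j(a_2, a_3, \ldots, a_l) = 0$. Let this be the event $A$. Since $A$ implies $p_1(a_2, \ldots, a_l) = 0$, the inductive hypothesis gives $\Pr[A] \le \frac{t - e}{|S|}$.
2. It can be that the polynomial $q$ evaluates to 0, which happens when $a$ is a root of $q$. Let this be the event $B$. If $A$ does not occur, then $q(x_1, a_2, \ldots, a_l)$ is a non-zero univariate polynomial of degree at most $e$ in $x_1$, so by the base case $\Pr[B \mid \bar{A}] \le \frac{e}{|S|}$.

We now have
\begin{align*}
\Pr[B] &= \Pr[B \cap A] + \Pr[B \cap \bar{A}] \\
       &= \Pr[A] \cdot \Pr[B \mid A] + \Pr[\bar{A}] \cdot \Pr[B \mid \bar{A}] \\
       &\le \frac{t - e}{|S|} \cdot 1 + \frac{e}{|S|} \cdot \Pr[\bar{A}] && \text{[by the inductive hypothesis and the base case]} \\
       &\le \frac{t - e}{|S|} + \frac{e}{|S|} = \frac{t}{|S|} && [\Pr[\bar{A}] \le 1]
\end{align*}

Now let us consider the general case, where we have a degree $t$ polynomial in $v$ variables. By the Schwartz-Zippel lemma, the number of zeros is at most $\frac{t}{|S|} \cdot |S|^v = t\,|S|^{v-1}$. Hence the minimum distance satisfies $d \ge n - t\,|S|^{v-1}$. Since $|S|^v = n$, we have $|S| = n^{1/v}$. Hence
\[ d \ge n - t \cdot n^{\frac{v-1}{v}}, \]
and $t$ and $k$ are related (by Exercise 1) as $k = \binom{t+v}{v}$. Here again we can see the trade-off: the bound can be written as $d \ge n\left(1 - t/n^{1/v}\right)$, so for a fixed codeword length $n$ and degree $t$, increasing $v$ (to reduce the field size) weakens the guarantee on the minimum distance.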
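The distance bound $d \ge n - t\,|S|^{v-1}$ can be checked by brute force on a toy bivariate code. The following is a minimal sketch, assuming the prime field $\mathbb{F}_5$ with $S = \mathbb{F}_5$ and total degree $t = 2$ (so $n = 25$ and $k = 6$); the names `rm_encode`, `monomial_values` and `weight` and all parameter choices are illustrative and not part of the lecture. It enumerates every non-zero message, so it is only feasible for very small parameters.

```python
from itertools import product

q, t = 5, 2   # assumed toy parameters: field F_5 (arithmetic mod 5), total degree at most 2
exponents = [(i, j) for i in range(t + 1) for j in range(t + 1) if i + j <= t]
points = list(product(range(q), repeat=2))   # S = F_q, so n = |S|^2 = 25
n, k = len(points), len(exponents)

# monomial_values[p][c] = x^i * y^j mod q for the point p = (x, y) and exponent pair c = (i, j).
monomial_values = [
    [pow(x, i, q) * pow(y, j, q) % q for (i, j) in exponents]
    for (x, y) in points
]

def rm_encode(message):
    """Evaluate the polynomial whose coefficients are `message` at every point of S^2."""
    return [sum(m * val for m, val in zip(message, row)) % q for row in monomial_values]

def weight(word):
    return sum(1 for symbol in word if symbol != 0)

# Minimum weight over all non-zero messages equals the minimum distance (the code is linear).
min_weight = min(weight(rm_encode(m)) for m in product(range(q), repeat=k) if any(m))
bound = n - t * q ** (2 - 1)   # n - t * |S|^(v-1) with v = 2

print(n, k, min_weight, bound)   # 25 6 15 15
```

For these parameters the Schwartz-Zippel bound is met with equality, for instance by the polynomial $(x - 1)(x - 2)$, which vanishes on 10 of the 25 points.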
3.2 Example of a Reed-Muller code

Let the polynomial be $p(x_1, x_2, \ldots, x_k) = m_1 x_1 + m_2 x_2 + \cdots + m_k x_k$, where $m = (m_1, m_2, \ldots, m_k) \in \mathbb{F}_2^k$ is the message. The encoding would be
\[ C(m) = (p(a_1, a_2, \ldots, a_k))_{(a_1, a_2, \ldots, a_k) \in \mathbb{F}_2^k}. \]
Effectively we obtain all possible linear combinations of the message bits, which is nothing but a $[2^k, k, 2^{k-1}]$ Hadamard code. Hence Hadamard codes can be seen as a special case of Reed-Muller codes.

4 Decoding Problem for Linear Codes

Suppose we are given an $[n, k, d]_q$ code defined over the field $\mathbb{F}_q$. The decoding problem is to find the codeword that was most likely to have been sent. Stated formally,

Definition 6 (Decoding Problem).
Input: Received word $y \in \mathbb{F}_q^n$.
Output: Unique $x \in \mathbb{F}_q^k$ such that $C(x)$ is the nearest codeword to $y$.

We saw that if the number of errors is $< \frac{d}{2}$ for minimum distance $d$, then it is theoretically possible to decode by searching for the nearest neighbour codeword in the ball of radius $\frac{d}{2}$ around $y$ in $\mathbb{F}_q^n$. But this requires us to search through all the words in the ball, of which there are $\sum_{i=0}^{d/2} \binom{n}{i}(q-1)^i$ (that is, $\sum_{i=0}^{d/2} \binom{n}{i}$ in the binary case), and this can take time exponential in $n$. In fact, the general problem of nearest neighbour decoding on linear codes is NP-complete [BMVT78]. (Recall that special cases like Hamming codes did have an efficient decoding mechanism.)

4.1 Reed-Solomon Decoding

We shall introduce the Berlekamp-Welch decoding algorithm for decoding Reed-Solomon codes. For a Reed-Solomon code defined over $\mathbb{F}_q$ with distinct evaluation points $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{F}_q$:

Input: $(y_1, y_2, \ldots, y_n) \in \mathbb{F}_q^n$, the received word.
Output: $m = (m_0, m_1, \ldots, m_{k-1}) \in \mathbb{F}_q^k$ such that $y_i = p_m(\alpha_i)$ for every position $i \in [n]$ that is not in error (fewer than $\frac{d}{2}$ positions are in error), where $p_m(x) = \sum_{i=0}^{k-1} m_i x^i$.

We use the notion of an error locating polynomial, defined as follows.
Definition 7 (Error locating polynomial). $E(x)$ is designed to have $E(\alpha_i) = 0$ whenever there is an error at position $i$. That is,
\[ E(\alpha_i) = 0 \;\Longleftarrow\; p(\alpha_i) \ne y_i. \]
Since there can be at most $d/2$ errors, the degree of this polynomial can be taken to be at most $d/2$. In the next lecture we shall see in more detail how this polynomial is put to use for decoding.

References

[BMVT78] E. Berlekamp, R. McEliece, and H. Van Tilborg. On the inherent intractability of certain coding problems. IEEE Transactions on Information Theory, 24(3):384-386, 1978.

[For66] G. D. Forney, Jr. Concatenated codes. MIT Press, Cambridge, MA, 1966.