Ma/CS 6b Class 24: Error-Correcting Codes
By Adam Sheffer

Communicating Over a Noisy Channel
Problem. We wish to transmit a message composed of 0s and 1s, but noise might accidentally flip some bits.
"I love you" → "I love lou"
Naïve Approach: Parity Bit
At the end of the message, add 0 if the number of 1s in the message is even; otherwise, add 1. Either way, the message is sent with an even number of 1s. When receiving a message, if the number of 1s is odd, we know that something went wrong.
Problem. What if two bits were flipped?

Check Digit
A similar idea is used to verify credit card numbers (works with MasterCard, Visa, and others). Double alternating digits, starting with the first digit in the sequence. Sum up all of the digits (a doubled value above 9 contributes the sum of its two digits, e.g. 14 contributes 1 + 4 = 5). If you did not get a multiple of 10, something is wrong with your card.
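The check-digit procedure can be sketched in a few lines of Python. This is a minimal sketch of the standard Luhn scheme, not code from the slides; note that for 16-digit card numbers, doubling alternating digits starting from the first digit is the same as doubling every second digit counting from the right.

```python
def luhn_valid(card_number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(card_number)):
        d = int(ch)
        if i % 2 == 1:      # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9      # equivalent to summing the two digits of d
        total += d
    return total % 10 == 0
```

A single mistyped digit always changes the total mod 10, so it is detected, but (as with the parity bit) some two-digit errors cancel out.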
Error Correction
What if, in addition to detecting a bit flip, we want to also be able to correct it? We can send every bit three times, but this is very inefficient! Instead, we can break the message into 4-tuples of bits. We put each 4-tuple in a 2×2 matrix and add a parity bit for each row and each column:

  Bits of the message:  1 1 | 0
                        0 1 | 1   ← row parity bits
                        ----+
  Column parity bits:   1 0

This detects and corrects a single bit flip, while only doubling the size of the transmission.

Formalizing the Problem
V_n is the set of binary sequences of length n. For example, V_3 = {000, 001, 010, 011, 100, 101, 110, 111}.
Codes of length n are subsets of V_n. For example, we might have 000 = up, 110 = down, 011 = left, 101 = right. This code is {000, 110, 011, 101} ⊆ V_3. Sequences that are in the code are called codewords (or just words).
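The 2×2 parity scheme can be made concrete. A minimal sketch (the names encode4 and correct4 are ours, not from the slides): the four message bits a, b, c, d are laid out as a 2×2 matrix, and a parity bit is appended for each row and each column.

```python
def encode4(bits):
    """[a, b, c, d] -> [a, b, c, d, row1, row2, col1, col2] parity bits."""
    a, b, c, d = bits
    return [a, b, c, d, (a + b) % 2, (c + d) % 2, (a + c) % 2, (b + d) % 2]

def correct4(word):
    """Recover the 4 message bits, fixing at most one flipped bit.

    A flipped message bit violates exactly one row parity and one column
    parity; their crossing locates the bit. A flipped parity bit violates
    only one check, so the message bits are already correct.
    """
    a, b, c, d, r1, r2, c1, c2 = word
    row_err = [(a + b + r1) % 2, (c + d + r2) % 2]  # 1 marks a bad row
    col_err = [(a + c + c1) % 2, (b + d + c2) % 2]  # 1 marks a bad column
    word = list(word)
    if 1 in row_err and 1 in col_err:
        i, j = row_err.index(1), col_err.index(1)
        word[2 * i + j] ^= 1  # flip the bit at the bad row/column crossing
    return word[:4]
```

Encoding the message 1 1 / 0 1 from the slide gives exactly the eight transmitted bits shown there.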
Detection and Correction
Consider the code {000, 110, 011, 101} ⊆ V_3. This code can detect if a single bit was flipped in a word. If we receive the sequence 100, we know that some error happened, but we cannot recover the original message.
Consider the code {000000, 111000, 001110, 110011} ⊆ V_6. If a single bit was flipped, we can detect and correct it. If we receive 000001, we know that the original message was 000000.

Distance Between Words
Given a, b ∈ V_n, the distance δ(a, b) is the number of bits in which a and b differ. For example, δ(010101, 000111) = 2.
Basic properties of the distance:
  δ(a, b) = 0 iff a = b.
  δ(a, b) = δ(b, a) for every a, b ∈ V_n.
  δ(a, b) ≤ δ(a, c) + δ(c, b) for every a, b, c ∈ V_n (the triangle inequality).
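The distance and its properties are easy to check by machine. A minimal sketch (the name dist is ours, not from the slides):

```python
def dist(a: str, b: str) -> int:
    """delta(a, b): number of positions where the bit strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))
```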
Distance of a Code
Given a code C ⊆ V_n, the distance of C is
  δ = min{ δ(a, b) : a, b ∈ C, a ≠ b }.
If C is a code with distance d:
  We are guaranteed to detect any errors, as long as no more than d − 1 bits were flipped.
  We can correct any errors, as long as fewer than d/2 bits were flipped (by finding the closest codeword to the sequence we received).

Example
We consider again the code C = {000000, 111000, 001110, 110011} ⊆ V_6. The distance of C is 3. Thus, we can detect any error of up to 2 bit flips, and we can correct any error of one bit flip.
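Both the code distance and nearest-codeword decoding can be sketched directly (our own helper names; brute force, fine for small codes):

```python
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of positions where a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def code_distance(C) -> int:
    """delta(C): minimum distance over all pairs of distinct codewords."""
    return min(hamming(a, b) for a, b in combinations(C, 2))

def decode(received: str, C) -> str:
    """Correct fewer than delta/2 flips by picking the closest codeword."""
    return min(C, key=lambda w: hamming(w, received))

C = ["000000", "111000", "001110", "110011"]
```

With distance 3, one flip moves the received sequence closer to the original codeword than to any other, so decode recovers it.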
Hamming Distance
The distance δ(a, b) is called the Hamming distance. According to Urban Dictionary, hamming is the act of throwing or placing slices of meat on cars, houses, etc. Unfortunately, the distance notion is unrelated; it is named after Richard Hamming, one of the pioneers of error-correcting codes.

A Blast from the Past: Algebra
Consider the set of binary sequences V_n. For a, b ∈ V_n, denote by a + b the addition mod 2 of every two corresponding bits. For example,
  11011001 + 11101110 = 00110111.
Then V_n under the + operation is a group!
A Group
  Closure. For every a, b ∈ V_n, we have a + b ∈ V_n.
  Associativity. For every a, b, c ∈ V_n, we have (a + b) + c = a + (b + c).
  Identity. There exists 0_n = 00⋯0 ∈ V_n such that for every a ∈ V_n, we have a + 0_n = 0_n + a = a.
  Inverse. For every a ∈ V_n, we have a + a = 0_n. So each element is its own inverse.

Linear Codes
A code C ⊆ V_n is linear if for every a, b ∈ C, we have a + b ∈ C. Notice that C is then a subgroup of V_n:
  Closure. Implied by the assumption.
  Associativity. Inherited from the associativity of V_n.
  Identity. If a ∈ C, then 0_n = a + a ∈ C.
  Inverse. Every element is its own inverse.
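Since every element is its own inverse, closure under + is the only subgroup condition that needs checking. A minimal sketch (our own helper names, not from the slides):

```python
def xor(a: str, b: str) -> str:
    """Bitwise addition mod 2 of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def is_linear(C) -> bool:
    """A code is linear iff it is closed under +; by the argument above,
    closure alone already makes C a subgroup of V_n."""
    return all(xor(a, b) in C for a in C for b in C)
```

For example, {0000, 0101, 1011, 1110} is linear, while the earlier four-word code in V_6 is not: 111000 + 001110 = 110110, which is not a codeword.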
Recall: Lagrange's Theorem
Theorem. If G is a group of finite order n and H is a subgroup of G of order m, then m | n.
Let C ⊆ V_n be a linear code. Since |V_n| = 2^n, we get |C| = 2^k for some 0 ≤ k ≤ n. We say that k is the dimension of C.

Our Requirements
Let C ⊆ V_n be a linear code of dimension k. For C to be efficient, k has to be relatively large with respect to n. We also want the number e of bit flips that can be corrected to be large.
Theorem. 2^(n−k) ≥ 1 + (n choose 1) + (n choose 2) + ⋯ + (n choose e).
Proof
Let a ∈ C. The number of elements of V_n that can be obtained by flipping exactly r bits of a is (n choose r). The number of sequences of V_n that can be obtained by flipping at most e bits of a is
  m = 1 + (n choose 1) + (n choose 2) + ⋯ + (n choose e).
For any distinct a, b ∈ C, we cannot obtain the same sequence d by flipping at most e bits in a and at most e bits in b, since otherwise we would not be able to correct d. Thus these |C| sets of sequences are pairwise disjoint, so 2^k · m = |C| · m ≤ |V_n| = 2^n, or
  1 + (n choose 1) + (n choose 2) + ⋯ + (n choose e) ≤ 2^(n−k).

Example
Problem. How many bit flips can a linear code in V_17 of dimension 10 correct?
We have 2^(n−k) ≥ 1 + (n choose 1) + ⋯ + (n choose e). In this case n = 17 and k = 10, so 2^(n−k) = 2^7 = 128. Now
  128 > 1 + (17 choose 1) = 18, but
  128 < 1 + (17 choose 1) + (17 choose 2) = 154.
Thus, correcting one bit might be possible, but not two.
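The computation in the example generalizes to any n and k. A minimal sketch (the function name is ours, not from the slides):

```python
from math import comb

def max_correctable(n: int, k: int) -> int:
    """Largest e allowed by 2**(n-k) >= 1 + C(n,1) + ... + C(n,e).

    This is only a necessary condition: the bound failing to rule out e
    does not guarantee that a code correcting e errors actually exists.
    """
    e, total = 0, 1                      # total = 1 + C(n,1) + ... + C(n,e)
    while total + comb(n, e + 1) <= 2 ** (n - k):
        e += 1
        total += comb(n, e)
    return e
```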
Distance Properties
For any a, b ∈ V_n, we have a + b = a − b (since every element is its own inverse). In either case, the i-th bit of the result is 1 iff a_i ≠ b_i.
For any a, b, c ∈ V_n, we have δ(a, b) = δ(a + c, b + c). Indeed, the distance is the number of bits that differ between a and b. If c_i = 0, then a_i and b_i remain unchanged; if c_i = 1, then both a_i and b_i are flipped. Either way, the number of differing bits remains unchanged.

The Weight of a Word
Consider a code C ⊆ V_n. For a word a ∈ C, the weight w(a) is the number of 1s in a.
For every a ∈ C, we have δ(a, 0_n) = w(a).
For a, b ∈ C, we have δ(a, b) = δ(a + b, b + b) = δ(a + b, 0_n) = w(a + b).
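The identity δ(a, b) = w(a + b) is easy to confirm on examples. A minimal sketch (our own helper names):

```python
def weight(a: str) -> int:
    """w(a): the number of 1s in a."""
    return a.count("1")

def delta(a: str, b: str) -> int:
    """Hamming distance between equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

def plus(a: str, b: str) -> str:
    """Bitwise sum mod 2 (equal to the difference, as noted above)."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))
```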
The Minimum Weight
Theorem. Let C ⊆ V_n be a linear code, and let w_min be the minimum weight of a nonzero word in C. Then the distance δ of C is w_min.
This allows us to compute δ more efficiently!
Proof. Take z ∈ C such that w(z) = w_min. We have δ ≤ δ(z, 0_n) = w(z) = w_min. Conversely, take a, b ∈ C such that δ = δ(a, b). By linearity, a + b ∈ C, and it is nonzero since a ≠ b. We have δ = δ(a, b) = w(a + b) ≥ w_min.

Parity-Check Matrices
Consider a, b ∈ V_n and an m × n matrix H with entries in {0, 1}. If Ha = Hb = 0_m, then H(a + b) = Ha + Hb = 0_m. Thus, the set C = {a ∈ V_n : Ha = 0_m} is a linear code. The matrix H is the parity-check matrix of C (or simply the check matrix).
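The theorem turns a quadratic pairwise search into a single pass over the codewords. A minimal sketch (the function name is ours; valid only for linear codes):

```python
def distance_of_linear_code(C) -> int:
    """delta(C) via the minimum-weight theorem: for a linear code, the
    distance equals the minimum weight of a nonzero codeword, so 2**k - 1
    weight computations suffice instead of examining every pair."""
    return min(w.count("1") for w in C if "1" in w)
```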
Example
Problem. Find the code C whose check matrix is
  H = ( 1 0 1 0 )
      ( 0 1 1 1 ).
Solution. We need to solve Hx = (0, 0)^T for x = (x_1, x_2, x_3, x_4)^T. That is,
  x_1 + x_3 = 0 and x_2 + x_3 + x_4 = 0.
By going over all possible values of x_3 and x_4, we find C = {0000, 0101, 1011, 1110}.

The previous example is a special case of the following. For 1 ≤ r < n, we consider a code C ⊆ V_n defined by an r × n matrix of the form
  H = ( 1 0 0 ⋯ 0  b_{1,1} ⋯ b_{1,n−r} )
      ( 0 1 0 ⋯ 0  b_{2,1} ⋯ b_{2,n−r} )
      ( 0 0 1 ⋯ 0  b_{3,1} ⋯ b_{3,n−r} )
      (       ⋱                        )
      ( 0 0 0 ⋯ 1  b_{r,1} ⋯ b_{r,n−r} ).
A vector x = (x_1, x_2, …, x_n) is in the code iff
  x_1 = b_{1,1} x_{r+1} + ⋯ + b_{1,n−r} x_n,
  ⋮
  x_r = b_{r,1} x_{r+1} + ⋯ + b_{r,n−r} x_n.
Thus, we have |C| = 2^(n−r).
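Going over all values of the free bits can be done mechanically. A minimal sketch (our own function name; brute force over all 2^n vectors, so only suitable for small n):

```python
def code_from_check_matrix(H):
    """All x in V_n with Hx = 0 (arithmetic mod 2). H is a list of rows."""
    n = len(H[0])
    code = []
    for v in range(2 ** n):
        # Bits of v, most significant first, as the candidate vector x.
        x = [(v >> (n - 1 - i)) & 1 for i in range(n)]
        if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H):
            code.append("".join(map(str, x)))
    return code
```

On the example matrix this reproduces the code found above, with |C| = 2^(4−2) = 4 words.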
Standard Form
We consider matrices of the form
  H = ( 1 0 0 ⋯ 0  b_{1,1} ⋯ b_{1,n−r} )
      ( 0 1 0 ⋯ 0  b_{2,1} ⋯ b_{2,n−r} )
      (       ⋱                        )
      ( 0 0 0 ⋯ 1  b_{r,1} ⋯ b_{r,n−r} ).
The order of the columns does not matter, since reordering the columns corresponds to reordering the bits in the codewords. The above order is called standard form.

Correcting Errors with Linear Codes
For 1 ≤ r < n, we consider a linear code C ⊆ V_n defined by the r × n standard-form matrix H.
Theorem. If no column of H consists entirely of zeros and no two columns of H are identical, then C can correct at least one error.
Proof
To show that C can correct one error, it suffices to show that its distance δ is at least 3. To prove δ ≥ 3, it suffices to show that every nonzero word in C has weight at least 3.
Assume for contradiction that there exists a word a ∈ C with w(a) = 1. That is, a has a single nonzero bit, say a_i. Since a ∈ C, we have Ha = 0_r. This implies that the i-th column of H is all zeros. Contradiction!

Proof (cont.)
We proved that C contains no words of weight 1. It remains to show that there are no words of weight 2. Assume for contradiction that there exists a word a ∈ C with w(a) = 2. That is, a has two nonzero bits, say a_i and a_j. Since a ∈ C, we have Ha = 0_r. This implies that the i-th and j-th columns of H are identical. Contradiction!
Concluding the Proof
We showed:
  Since H has no zero columns, C contains no words of weight 1.
  Since H has no identical columns, C contains no words of weight 2.
Thus, the minimum weight of a nonzero word in C is at least 3, implying that the distance of C is at least 3. That is, we can correct at least one bit flip.

Code Size
Problem. We are interested in a code with a check matrix H that has r rows and contains no zero columns and no identical columns. What is the maximum number of columns in H?
There are 2^r distinct columns of r entries. After ignoring the all-zeros column, we remain with 2^r − 1 possible columns. For example, when r = 3, we have
  H = ( 1 0 0 1 1 0 1 )
      ( 0 1 0 1 0 1 1 )
      ( 0 0 1 0 1 1 1 ).
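The proof's conclusion can be checked by brute force for the r = 3 matrix above. A minimal sketch (our own function name, assuming the 3 × 7 matrix with all seven distinct nonzero columns):

```python
# The 3-row check matrix above, with all 2**3 - 1 = 7 nonzero columns.
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]

def min_nonzero_weight(H):
    """Minimum weight of a nonzero x with Hx = 0 (mod 2), by brute force."""
    n = len(H[0])
    best = n + 1
    for v in range(1, 2 ** n):          # skip v = 0, the zero word
        x = [(v >> i) & 1 for i in range(n)]
        if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H):
            best = min(best, sum(x))
    return best
```

By the minimum-weight theorem, the returned value is the distance of the code; here it comes out as exactly 3, matching the bound from the proof.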
Hamming Codes
Hamming codes are a family of linear codes: for every r ≥ 2, we have a code that corresponds to the check matrix H_r having r rows and n = 2^r − 1 distinct nonzero columns. We assume that the matrix is in standard form. A vector x = (x_1, x_2, …, x_n) is in the code iff
  x_1 = b_{1,1} x_{r+1} + ⋯ + b_{1,n−r} x_n,
  ⋮
  x_r = b_{r,1} x_{r+1} + ⋯ + b_{r,n−r} x_n.
Choosing x_{r+1}, …, x_n uniquely determines x_1, …, x_r, so there are 2^(n−r) words in the code.

The End
Hamming worked at Bell Labs (which is credited with the development of radio astronomy, the transistor, the laser, information theory, the UNIX operating system, and the C and C++ programming languages). In 1960, Hamming predicted that one day half of Bell Labs' budget would be spent on computing. None of his colleagues thought that it would ever be so high.
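Decoding with the r = 3 Hamming code can be sketched via the syndrome (our own code, assuming the 3 × 7 standard-form check matrix with all distinct nonzero columns): Hx is the zero vector exactly for codewords, and when a single bit was flipped it equals the column of H at the flipped position.

```python
H = [
    [1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 1],
]

def syndrome(x):
    """Hx mod 2; the zero vector exactly when x is a codeword."""
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

def correct_one(x):
    """Fix at most one flipped bit: a nonzero syndrome matches the column
    of H at the flipped position (the columns are distinct and nonzero)."""
    s = syndrome(x)
    x = list(x)
    if any(s):
        i = list(zip(*H)).index(tuple(s))  # which column equals the syndrome
        x[i] ^= 1
    return x
```

For example, (1, 1, 0, 1, 0, 0, 0) is a codeword; flipping any one of its bits yields a word that correct_one restores.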