MATH/MTHE 406 Homework Assignment 2
Due date: October 17, 2016

Notation: We will use the notations $x_1 x_2 \cdots x_n$ and also $(x_1, x_2, \ldots, x_n)$ to denote a vector $x \in F^n$, where $F$ is a finite field.

1. [20=6+5+9] Let $C$ be the linear code over $F_7$ with parity check matrix
\[
H = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 4 & 5 & 6 \end{bmatrix}.
\]
(a) Find the dimension and the minimum distance of the code.
(b) Construct a systematic generator matrix for $C$.
(c) Suppose that $y = 111111$ is the word received at the channel output. Find all the codewords that can be produced as output by a minimum-distance decoder applied to $y$.

Solutions: We perform row operations on $H$ to bring it into a form that has an identity matrix in its last two columns. Note that row operations do not change the nullspace of the matrix. In other words, if $H'$ is any matrix obtained from $H$ via some sequence of row operations, then $\mathrm{nullspace}(H') = \mathrm{nullspace}(H) = C$. Performing the appropriate row operations (modulo 7), we obtain
\[
H' = \begin{bmatrix} 5 & 4 & 3 & 2 & 1 & 0 \\ 3 & 4 & 5 & 6 & 0 & 1 \end{bmatrix}.
\]
$H'$ is also a parity check matrix for $C$.

(a) $\mathrm{rank}(H) = \mathrm{rank}(H') = 2$, and hence $\dim(C) = n - \mathrm{rank}(H) = 6 - 2 = 4$. All columns of $H'$ are non-zero, and no column is a multiple of any other column, so $C$ has minimum distance at least 3. It is easy to find three columns of $H'$ that are linearly dependent; for example, $(-1)\,\mathrm{col}_1 + 5\,\mathrm{col}_5 + 3\,\mathrm{col}_6 = 0$. Thus, there exist codewords of weight 3 in $C$, e.g., $600053$ (recall that $-1 = 6$ in $F_7$). Hence the minimum distance of $C$ is 3. So $C$ is a $[6, 4, 3]$ linear code.

(b) $H'$ is in the form $[A \mid I_2]$, so $G = [I_4 \mid -A^T]$ is a generator matrix for $C$, and $G$ is in systematic form. More precisely,
\[
G = \begin{bmatrix}
1 & 0 & 0 & 0 & 2 & 4 \\
0 & 1 & 0 & 0 & 3 & 3 \\
0 & 0 & 1 & 0 & 4 & 2 \\
0 & 0 & 0 & 1 & 5 & 1
\end{bmatrix}
\]
is a systematic generator matrix for $C$.
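The claims in parts (a) and (b) are small enough to confirm by exhaustive search. The following Python sketch is only a cross-check, not part of the intended solution; the names `H_PRIME`, `syndrome` and `encode` are illustrative choices. It compares the nullspaces of $H$ and $H'$ with the row space of $G$ and computes the minimum weight of a non-zero codeword.

```python
from itertools import product

P = 7  # field size; all arithmetic below is modulo 7

H = [[1, 1, 1, 1, 1, 1],
     [1, 2, 3, 4, 5, 6]]
H_PRIME = [[5, 4, 3, 2, 1, 0],
           [3, 4, 5, 6, 0, 1]]
G = [[1, 0, 0, 0, 2, 4],
     [0, 1, 0, 0, 3, 3],
     [0, 0, 1, 0, 4, 2],
     [0, 0, 0, 1, 5, 1]]


def syndrome(matrix, word):
    """Return matrix * word^T over F_7 as a tuple."""
    return tuple(sum(a * b for a, b in zip(row, word)) % P for row in matrix)


def encode(message):
    """Return message * G over F_7 as a tuple of length 6."""
    return tuple(sum(m * g for m, g in zip(message, col)) % P for col in zip(*G))


# All codewords generated by G (one per message in F_7^4).
codewords = {encode(m) for m in product(range(P), repeat=4)}

# Nullspaces of H and H', found by exhaustive search over F_7^6.
all_words = list(product(range(P), repeat=6))
null_H = {w for w in all_words if syndrome(H, w) == (0, 0)}
null_H_prime = {w for w in all_words if syndrome(H_PRIME, w) == (0, 0)}

# Minimum Hamming weight over the non-zero codewords.
min_distance = min(sum(1 for s in c if s != 0) for c in codewords if any(c))

print(len(codewords), null_H == null_H_prime == codewords, min_distance)
```

The script prints `2401 True 3`, which agrees with $|C| = 7^4$ and the $[6, 4, 3]$ parameters found above.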
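(c) We will use $H$ to compute syndromes. The syndrome of $y = 111111$ is $[6\ 0]^T$. We need to identify the least-weight words in $F_7^6$ that yield the syndrome $[6\ 0]^T$, as these are the least-weight vectors in the coset of $C$ within which $y$ lies.

It is easy to see that there are no weight-1 words $e$ such that $He^T = [6\ 0]^T$. This is because no column of $H$ is of the form $[\alpha\ 0]^T$.

To find all weight-2 words $e$ such that $He^T = [6\ 0]^T$, we must determine all possible ways in which a linear combination of some two columns of $H$ can yield $[6\ 0]^T$. Consider any pair of distinct columns $[1\ i]^T$ and $[1\ j]^T$, $1 \le i < j \le 6$. We need
\[
\begin{bmatrix} 1 & 1 \\ i & j \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}
= \begin{bmatrix} 6 \\ 0 \end{bmatrix},
\]
and thus, using $6 = -1$ in $F_7$,
\[
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}
= \begin{bmatrix} 1 & 1 \\ i & j \end{bmatrix}^{-1} \begin{bmatrix} 6 \\ 0 \end{bmatrix}
= (j-i)^{-1} \begin{bmatrix} j & -1 \\ -i & 1 \end{bmatrix} \begin{bmatrix} 6 \\ 0 \end{bmatrix}
= \begin{bmatrix} -j(j-i)^{-1} \\ i(j-i)^{-1} \end{bmatrix}.
\]
This shows that for any pair $i, j$ with $1 \le i < j \le 6$, if we define $e^{(i,j)} = [e_1\ e_2\ \cdots\ e_6]$ as
\[
e_k = \begin{cases}
-j(j-i)^{-1} & \text{if } k = i, \\
i(j-i)^{-1} & \text{if } k = j, \\
0 & \text{otherwise,}
\end{cases}
\]
then $H (e^{(i,j)})^T = [6\ 0]^T$. Moreover, this exhausts all weight-2 solutions to the syndrome equation, since for each pair of columns the coefficients $\alpha, \beta$ are uniquely determined. It follows that the codewords of $C$ closest in Hamming distance to $y = 111111$ are the $\binom{6}{2} = 15$ words $y - e^{(i,j)}$, $1 \le i < j \le 6$. An explicit list of these codewords is

301111, 614111, 011311, 411161, 511115,
146111, 131011, 151151, 161114, 115511,
110131, 113110, 111641, 111416, 111103.

2. [20] Determine the dimension and minimum distance of the linear code $C$ over $F_{11}$ with parity check matrix $H$ as follows:
\[
H = \begin{bmatrix}
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
1 & 2^2 & 3^2 & 4^2 & 5^2 & 6^2 & 7^2 & 8^2 & 9^2 & 10^2
\end{bmatrix}.
\]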
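As a computational cross-check for this problem, the following Python sketch tests the two facts that determine the minimum distance: every pair of columns of $H$ is linearly independent over $F_{11}$, and some three columns are linearly dependent. It assumes that column $j$ of $H$ is $(j, j^2)$ reduced modulo 11; the names `columns`, `det2` and `weight3_codeword` are illustrative only.

```python
from itertools import combinations, product

P = 11  # arithmetic is modulo 11

# Assumption: column j of H is (j, j^2) mod 11 for j = 1, ..., 10.
columns = [(j % P, (j * j) % P) for j in range(1, 11)]


def det2(u, v):
    """Determinant of the 2x2 matrix with columns u and v, modulo 11."""
    return (u[0] * v[1] - u[1] * v[0]) % P


# Every pair of columns independent <=> no codeword of weight 1 or 2.
all_pairs_independent = all(det2(u, v) != 0 for u, v in combinations(columns, 2))


def weight3_codeword(i, j, k):
    """Search for non-zero a, b with a*col_i + b*col_j + col_k = 0 (1-based indices)."""
    ci, cj, ck = columns[i - 1], columns[j - 1], columns[k - 1]
    for a, b in product(range(1, P), repeat=2):
        if all((a * ci[t] + b * cj[t] + ck[t]) % P == 0 for t in range(2)):
            word = [0] * 10
            word[i - 1], word[j - 1], word[k - 1] = a, b, 1
            return tuple(word)
    return None


print(all_pairs_independent, weight3_codeword(1, 2, 3))
```

The script prints `True (3, 8, 1, 0, 0, 0, 0, 0, 0, 0)`: no two columns are dependent, and $3\,\mathrm{col}_1 + 8\,\mathrm{col}_2 + \mathrm{col}_3 = 0$ gives a codeword of weight 3.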
Solutions: The code $C$ has parity check matrix $H$. Note first that $\mathrm{rank}(H) = 2$, as it can readily be checked that the second row is not a multiple of the first row. Hence $\dim(C) = 10 - 2 = 8$.

To determine the minimum distance, we identify the smallest number of linearly dependent columns of $H$. Any $2 \times 2$ submatrix of $H$ is of the form
\[
B = \begin{bmatrix} i & j \\ i^2 & j^2 \end{bmatrix}
\]
with $0 < i < j < 11$. We evaluate $\det(B)$ to be $ij(j-i)$, which is always non-zero over $F_{11}$ as it is the product of non-zero elements of $F_{11}$. Hence, any pair of columns of $H$ is linearly independent over $F_{11}$. This means that there are no codewords in $C$ of weight 1 or 2. Thus, the code $C$ has minimum distance at least 3. On the other hand, any three columns of $H$ must be linearly dependent, as the matrix has rank 2. Hence, there exist codewords of weight 3, which shows that the minimum distance of the code is equal to 3.

3. [21=7+7+7] Let $C$ be an $[n, k, d]$ binary linear code. Let $C'$ denote the code obtained by taking only those codewords of $C$ whose last coordinate is 0, and deleting that last coordinate. To be precise,
\[
C' = \{ c_1 c_2 \cdots c_{n-1} : c_1 c_2 \cdots c_{n-1} 0 \in C \}.
\]
The code $C'$ is said to be obtained by shortening $C$ at the last coordinate.
(a) Show that a parity check matrix for $C'$ is obtained by deleting the last column from a parity check matrix for $C$.
(b) Show that $C'$ is an $[n-1, k', d']$ linear code, where $k'$ is either $k-1$ or $k$, and $d' \ge d$.
(c) Show by means of an example that $d' > d$ is possible.

Solutions: While this problem specifies $C$ to be a binary linear code, there is nothing special about the binary part. Therefore, this solution assumes simply that $C$ is a linear code over some finite field $F$.

(a) Let $H$ be a parity check matrix for $C$, so that $C = \ker(H)$. Let $H'$ be obtained from $H$ by deleting the last column. We want to show that $C' = \ker(H')$, i.e., that $x' = x_1 x_2 \cdots x_{n-1} \in F^{n-1}$ is in $C'$ if and only if $H'(x')^T = 0$. Note that no matter what the last column of $H$ is, we always have
\[
H' [x_1\ x_2\ \cdots\ x_{n-1}]^T = 0 \iff H [x_1\ x_2\ \cdots\ x_{n-1}\ 0]^T = 0.
\]
Hence, for any $x_1 x_2 \cdots x_{n-1} \in F^{n-1}$, the following equivalences hold:
\[
x_1 x_2 \cdots x_{n-1} \in C'
\iff x_1 x_2 \cdots x_{n-1} 0 \in C = \ker(H)
\iff x_1 x_2 \cdots x_{n-1} \in \ker(H').
\]
Thus, $C' = \ker(H')$, which is what we needed to show.

(b) As shown above, $C'$ is the kernel (nullspace) of some matrix. Hence, it is a linear code. It clearly has length $n-1$. Let $H$ and $H'$ be as in the solution to part (a) above. Note that $\mathrm{rank}(H) = n - k$, and $\mathrm{rank}(H') = (n-1) - \dim(C')$. Now, the rank of a matrix is equal to the maximum number of linearly independent columns. $H'$ is obtained by deleting a column from $H$, and so $\mathrm{rank}(H')$ is equal to either $\mathrm{rank}(H) = n-k$ or $\mathrm{rank}(H) - 1 = n-k-1$. Hence, $\dim(C') = (n-1) - \mathrm{rank}(H')$ is either $(n-1) - (n-k) = k-1$ or $(n-1) - (n-k-1) = k$.

The minimum distance of a code is equal to the least number of linearly dependent columns in a parity check matrix for the code. Columns of $H'$ are also columns of $H$. Hence, linearly dependent columns in $H'$ are also linearly dependent columns in $H$ (but the converse need not be true). It follows that
\[
d' = d_{\min}(C') \ge d_{\min}(C) = d.
\]

(c) Consider the binary length-$n$ code
\[
C = \{ 00\cdots00,\ 00\cdots01,\ 11\cdots10,\ 11\cdots11 \}.
\]
It is readily checked that the sum of any two codewords in $C$ is also in $C$, and hence, the code is linear. It is an $[n, 2, 1]$ code. Verify that $C' = \{ 00\cdots0,\ 11\cdots1 \}$. This has minimum distance $n-1$, which is obviously greater than $d_{\min}(C) = 1$ whenever $n > 2$.

4. [20=5+5+5+5] Let $C$ be an $[n, k]$ binary linear code.
(a) Show that either all the codewords of $C$ have even weight, or exactly half have even weight and half have odd weight.
(b) Show that either all the codewords of $C$ begin with a 0, or exactly half begin with a 0 and half with a 1.
(c) Show that the sum of the weights of all codewords of $C$ is at most $n 2^{k-1}$.
(d) If $d$ is the minimum distance of $C$, show that
\[
d \le \frac{n 2^{k-1}}{2^k - 1}.
\]
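The four claims of this problem can be sanity-checked on a small example by enumerating all codewords. The sketch below is an illustration under stated assumptions, not a proof: `G_EXAMPLE` is a hypothetical generator matrix of a $[7, 4]$ binary code chosen only for this check, and `span` and `check_claims` are illustrative helper names.

```python
from itertools import product

# Hypothetical example: a systematic generator matrix of a [7, 4] binary code.
G_EXAMPLE = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def span(G):
    """Return the set of all binary codewords generated by the rows of G."""
    n = len(G[0])
    return {
        tuple(sum(m * row[i] for m, row in zip(msg, G)) % 2 for i in range(n))
        for msg in product((0, 1), repeat=len(G))
    }


def check_claims(G):
    """Evaluate claims (a)-(d) of Problem 4 on the code generated by G."""
    code = span(G)
    n, size = len(G[0]), len(code)          # size = 2^k
    even = sum(1 for c in code if sum(c) % 2 == 0)
    lead0 = sum(1 for c in code if c[0] == 0)
    total = sum(sum(c) for c in code)       # sum of all codeword weights
    d = min(sum(c) for c in code if any(c)) # minimum distance of a linear code
    return {
        "a": even in (size, size // 2),
        "b": lead0 in (size, size // 2),
        "c": 2 * total <= n * size,           # total <= n * 2^(k-1)
        "d": 2 * d * (size - 1) <= n * size,  # d <= n * 2^(k-1) / (2^k - 1)
    }


print(check_claims(G_EXAMPLE))
```

Inequalities (c) and (d) are tested in integer form (both sides multiplied by 2, and by $2^k - 1$ for (d)) to avoid floating-point division. For this example code all four entries come out `True`.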
Solutions: We will use the following theorem.

Let $P \subseteq C$ be a subset of a binary linear code $C$ satisfying the following two conditions:
(i) if $x, y \in P$, or if $x, y \notin P$, then $x + y \in P$;
(ii) if $x \in P$, but $y \notin P$, then $x + y \notin P$.
Then, either $P = C$, or exactly half the codewords are in $P$ while the other half are not.

(a) Let $P$ be the set of all codewords of even weight. We have only to show that $P$ satisfies the two conditions of the above theorem. This is readily verified from the following fact.

Fact: For any binary vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$,
\[
w(x + y) = w(x) + w(y) - 2 w(x \ast y),
\]
where $x \ast y$ is the vector $(x_1 y_1, x_2 y_2, \ldots, x_n y_n)$, which has ones only in the positions where both $x$ and $y$ do.

Proof: Note that $w(x) - w(x \ast y)$ is the number of positions $i$ such that $x_i = 1$ but $y_i = 0$. Similarly, $w(y) - w(x \ast y)$ is the number of positions $i$ such that $y_i = 1$ but $x_i = 0$. Therefore, $w(x) - w(x \ast y) + w(y) - w(x \ast y) = w(x) + w(y) - 2 w(x \ast y)$ is equal to the Hamming distance $d(x, y)$ between $x$ and $y$. However, $d(x, y) = w(x + y)$.

It is clear from the above fact that if $w(x)$ and $w(y)$ are both even or both odd, then $w(x + y)$ is even. On the other hand, if $w(x)$ is even, while $w(y)$ is odd, then $w(x + y)$ is odd. Thus, the set $P$ of even weight codewords satisfies the conditions (i) and (ii) of the theorem. So the conclusion of the theorem applies to $P$.

(b) Let $P$ be the set of all codewords that begin with a 0. It is easily verified that $P$ satisfies the conditions of the theorem, and hence the result follows.

(c) $C$ is an $[n, k]$ binary linear code, so it has $2^k$ codewords. For $i \in \{1, 2, \ldots, n\}$, define $P_i$ to be the set of all codewords $c = c_1 c_2 \cdots c_n$ such that $c_i = 0$, i.e., $P_i$ is the set of all codewords that have a 0 in their $i$-th position. Correspondingly, let $\overline{P}_i$ be the set of all codewords that have a 1 in their $i$-th position. Note that $P_i$ satisfies the conditions of the theorem above. Hence either $|\overline{P}_i| = 0$ or $|P_i| = |\overline{P}_i| = \frac{1}{2}|C| = 2^{k-1}$. In any case, $|\overline{P}_i| \le 2^{k-1}$.
Let $w(C)$ denote the sum of the weights of all the codewords of the code. If we write out all the codewords of the code, one below the other, in the form of a $2^k \times n$ matrix, then $w(C)$ is simply the sum of the entries of this matrix. Note that the sum of the entries can be obtained by first summing along each column, and then adding up the column sums. Clearly, the sum along the $i$-th column is simply the number of codewords with a 1 in their $i$-th position, which is precisely $|\overline{P}_i|$. Hence
\[
w(C) = \sum_{i=1}^{n} |\overline{P}_i| \le \sum_{i=1}^{n} 2^{k-1} = n 2^{k-1}.
\]

(d) $w(C)$ is the sum of the weights of all the non-zero codewords in $C$, since the zero codeword does not contribute any weight to $w(C)$. The weight of any non-zero codeword is at least $d$, since this is the minimum distance of $C$, and $C$ is linear. There are $2^k - 1$ non-zero codewords in $C$, each of weight at least $d$, and hence
\[
w(C) \ge d(2^k - 1).
\]
From the above two inequalities, we obtain $d(2^k - 1) \le n 2^{k-1}$, which upon rearrangement yields the desired inequality
\[
d \le \frac{n 2^{k-1}}{2^k - 1}.
\]

5. [20] We would like to construct a binary linear code of length 10 that can correct all single-error patterns, as well as all double-error patterns in which both 1s occur within the first five positions (e.g., 1100000000, 1000100000 and 0001100000, but not 1000010000 or 0000011000). Other double-error patterns and all higher-order error patterns need not be correctable. Construct a code with the largest possible dimension that meets the given requirement. You must justify the fact that the dimension of the code you construct is indeed the largest possible under the conditions of the problem.

Solution: There are 10 single-error patterns, and $\binom{5}{2} = 10$ double-error patterns in which both errors are in the first five positions. In order to be correctable, these 20 error patterns must give rise to distinct non-zero syndromes. The number of distinct non-zero syndromes that can be obtained from the parity check matrix of a binary $[10, k]$ linear code is $2^{10-k} - 1$. So we must have $2^{10-k} - 1 \ge 20$. Therefore, $k \le 10 - \log_2 21 \approx 5.6$, which shows that the maximum dimension that such a code can have is $k = 5$.
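The counting bound above is easy to reproduce numerically. The following short Python sketch (variable names are illustrative) counts the error patterns that must be corrected and searches for the largest $k$ satisfying $2^{10-k} - 1 \ge 20$.

```python
from math import comb

n = 10
# 10 single-error patterns plus C(5, 2) double-error patterns in the first five positions.
patterns = n + comb(5, 2)

# Largest k for which a [10, k] code has enough distinct non-zero syndromes.
k_max = max(k for k in range(n + 1) if 2 ** (n - k) - 1 >= patterns)

print(patterns, k_max)
```

The script prints `20 5`: with $k = 5$ there are 31 non-zero syndromes, while $k = 6$ would leave only 15.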
To construct a code that meets the requirements of the problem, we need to construct a parity check matrix with 10 distinct non-zero columns, such that the sum of any two of the first five columns is distinct from the sum of any other pair of columns from the first five, and the sum of any two of the first five columns is also distinct from any of the ten columns of the matrix. It may be verified that the following construction does the trick.
\[
H = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}
\]
Clearly all columns of the matrix are non-zero and distinct, so the code can correct all single-error patterns. The sum of any two of the first five columns gives a vector of weight 2, which is distinct from any of the columns of $H$, and it is also clear that distinct pairs of columns (from the first five) have distinct sums. Thus, any double-error pattern with both errors in the first five positions gives rise to a syndrome that is distinct from any syndrome due to a single-error pattern, as well as from any syndrome due to another such double-error pattern.

Let $C$ be the binary linear code with parity check matrix $H$ as above. If we choose the 21 to-be-corrected error patterns (including the no-error pattern) as the coset leaders of their respective cosets, then a syndrome decoder for $C$ will have the required error-correction properties. Note that $C$ has dimension $k = 10 - \mathrm{rank}(H) = 5$. As shown above, this is the maximum dimension that such a code can have.
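The properties claimed for this parity check matrix can be confirmed by direct computation. The Python sketch below is a cross-check only; `H_ROWS` transcribes the matrix above, and `syndrome`, `patterns` and the other names are illustrative. It computes the syndromes of the 20 target error patterns, tests that they are distinct and non-zero, and counts the codewords to recover the dimension.

```python
from itertools import combinations, product

H_ROWS = [
    [1, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 1, 1, 0],
]
N = 10


def syndrome(error_positions):
    """Syndrome of the error pattern with ones at the given 0-based positions."""
    return tuple(sum(row[p] for p in error_positions) % 2 for row in H_ROWS)


# 10 single errors, then the 10 double errors confined to the first five positions.
patterns = [(p,) for p in range(N)] + list(combinations(range(5), 2))
syndromes = [syndrome(e) for e in patterns]
zero = (0,) * len(H_ROWS)
all_distinct_nonzero = len(set(syndromes)) == len(patterns) and zero not in syndromes

# Count codewords by exhaustive search over F_2^10; the code has 2^k of them.
num_codewords = sum(
    1
    for w in product((0, 1), repeat=N)
    if all(sum(r * x for r, x in zip(row, w)) % 2 == 0 for row in H_ROWS)
)
k = num_codewords.bit_length() - 1

print(all_distinct_nonzero, num_codewords, k)
```

The script prints `True 32 5`, consistent with the 20 correctable patterns having distinct non-zero syndromes and with $k = 5$.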