IERG60 Coding for Distributed Storage Systems
Lecture - 05//06
Berlekamp-Massey decoding of RS code
Lecturer: Kenneth Shum    Scribe: Bowen Zhang

Berlekamp-Massey algorithm

We recall some notation from an earlier lecture. Let $(S_1, S_2, S_3, \ldots, S_n)$ be an arbitrary sequence of elements in a field $K$. We say that this sequence is described by a linear feedback shift register (LFSR) of length $L$ if there exist $L$ field elements $\Lambda_1, \Lambda_2, \ldots, \Lambda_L$ such that $S_i$ can be obtained as a linear function of the previous $L$ elements,
$$S_i = -\Lambda_1 S_{i-1} - \Lambda_2 S_{i-2} - \cdots - \Lambda_L S_{i-L} \tag{1}$$
for $i = L+1, L+2, \ldots, n$. We specify the linear feedback shift register by a pair $(L, \Lambda(Z))$, where $L$ denotes the length of the LFSR and $\Lambda(Z)$ is the feedback polynomial defined as
$$\Lambda(Z) := 1 + \Lambda_1 Z + \Lambda_2 Z^2 + \cdots + \Lambda_L Z^L.$$
We note that the degree of $\Lambda(Z)$ is less than or equal to $L$, and the constant term is equal to 1. (The degree is strictly less than $L$ if $\Lambda_L = 0$.) For notational convenience, we define $\Lambda_0 := 1$, so that (1) can be written compactly as
$$0 = \sum_{j=0}^{L} \Lambda_j S_{i-j}$$
for $i = L+1, \ldots, n$.

Figure 1: Example of a linear feedback shift register.

We want to find the shortest LFSR that generates $(S_1, S_2, S_3, \ldots, S_n)$. The length of the shortest LFSR that generates $(S_1, S_2, S_3, \ldots, S_n)$ is called the linear complexity of $(S_1, S_2, S_3, \ldots, S_n)$. For $M = 1, 2, \ldots, n$, let $L_M$ be the linear complexity of the first $M$ terms $(S_1, S_2, \ldots, S_M)$, and let
$$\Lambda^{(M)}(Z) = \sum_{j=0}^{L_M} \Lambda^{(M)}_j Z^j$$
be the corresponding feedback polynomial.

Caveat: There may be more than one LFSR that achieves the shortest length.
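The definition of an LFSR $(L, \Lambda(Z))$ can be checked numerically. The following is a minimal sketch, assuming the field is a prime field GF(p) and that the sequence is stored 0-based (so `seq[0]` plays the role of $S_1$). The function names `lfsr_generates` and `linear_complexity_bruteforce` are hypothetical helpers, not part of the lecture, and the exhaustive search is only meant for very short sequences.

```python
from itertools import product


def lfsr_generates(seq, L, lam, p):
    """Check whether the LFSR (L, lam) over GF(p) generates seq.

    seq[0] plays the role of S_1, and lam = [1, Λ_1, ..., Λ_L]
    lists the feedback coefficients (a layout assumed for this sketch).
    """
    if len(lam) != L + 1 or lam[0] % p != 1:
        raise ValueError("lam must have L + 1 entries and constant term 1")
    # Require sum_{j=0}^{L} Λ_j S_{i-j} = 0 for i = L+1, ..., n (1-based i).
    for i in range(L, len(seq)):  # 0-based position of S_{i+1}
        if sum(lam[j] * seq[i - j] for j in range(L + 1)) % p != 0:
            return False
    return True


def linear_complexity_bruteforce(seq, p):
    """Length of the shortest LFSR generating seq, by exhaustive search."""
    for L in range(len(seq) + 1):
        for tail in product(range(p), repeat=L):
            if lfsr_generates(seq, L, [1, *tail], p):
                return L


# S_i = S_{i-1} + S_{i-2} over GF(2), i.e. Λ(Z) = 1 + Z + Z^2 with L = 2.
seq = [1, 1, 0, 1, 1, 0, 1, 1, 0]
print(lfsr_generates(seq, 2, [1, 1, 1], 2))
print(linear_complexity_bruteforce(seq, 2))
```

The search always terminates, because an LFSR of length $n$ imposes no constraint on a sequence of $n$ terms. Its cost grows like $p^L$, which is why an iterative method is preferable.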
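We have the following chain of inequalities
$$L_1 \le L_2 \le L_3 \le \cdots \le L_n, \tag{2}$$
because, if $(L_M, \Lambda^{(M)}(Z))$ is the shortest LFSR that generates the first $M$ terms, it also generates the first $M-1$ terms. The non-decreasing sequence of integers in (2) is usually called the linear complexity profile of $(S_1, S_2, S_3, \ldots, S_n)$.

The $M$-th term produced by $(L_{M-1}, \Lambda^{(M-1)}(Z))$ may not be equal to the desired value $S_M$. We let
$$\hat{S}_M := -\sum_{j=1}^{L_{M-1}} \Lambda^{(M-1)}_j S_{M-j},$$
$$\Delta_M := S_M - \hat{S}_M = \sum_{j=0}^{L_{M-1}} \Lambda^{(M-1)}_j S_{M-j},$$
so that $\Delta_M$ is the difference between $S_M$ and $\hat{S}_M$. If $\Delta_M = 0$, then the LFSR $(L_{M-1}, \Lambda^{(M-1)}(Z))$ correctly computes the $M$-th term $S_M$, and we can let $(L_M, \Lambda^{(M)}(Z)) = (L_{M-1}, \Lambda^{(M-1)}(Z))$. When $\Delta_M \ne 0$, we have shown in an earlier lecture the following.

Theorem 1. Suppose that $(L_i, \Lambda^{(i)}(Z))$ is the shortest LFSR that produces $(S_1, S_2, \ldots, S_i)$, for $i = 1, 2, \ldots, M-1$. If $\Delta_M \ne 0$, then
$$L_M \ge \max(L_{M-1},\, M - L_{M-1}).$$

The Berlekamp-Massey algorithm computes the linear complexity profile and the corresponding feedback polynomials of a sequence of elements $(S_1, \ldots, S_n)$ from a field $K$. The algorithm computes $L_1, L_2, \ldots$ iteratively. In each step, we consider two cases. If $\Delta_M = 0$, set $L_M = L_{M-1}$ and $\Lambda^{(M)}(Z) = \Lambda^{(M-1)}(Z)$. If $\Delta_M \ne 0$, then the feedback polynomial $\Lambda^{(M)}(Z)$ is obtained by
$$\Lambda^{(M)}(Z) = \Lambda^{(M-1)}(Z) + \Lambda^{(\mu-1)}(Z)\, Z^{e} \alpha,$$
where $\mu < M$, $e \in \mathbb{Z}$, and $\alpha \in K$. The values of $\mu$, $e$ and $\alpha$ are obtained by the following theorem.

Theorem 2. Suppose $(L_i, \Lambda^{(i)}(Z))$ is the shortest LFSR for $(S_1, \ldots, S_i)$, satisfying
$$L_i = \begin{cases} L_{i-1} & \text{if } \Delta_i = 0, \\ \max(L_{i-1},\, i - L_{i-1}) & \text{if } \Delta_i \ne 0, \end{cases}$$
for $i = 1, 2, 3, \ldots, M-1$. If $\mu < M$ satisfies
$$L_{\mu-1} < L_\mu = L_{\mu+1} = \cdots = L_{M-1}$$
and $\Delta_\mu \ne 0$, then we can find an LFSR $(L_M, \Lambda^{(M)}(Z))$ that generates $(S_1, \ldots, S_M)$, with length
$$L_M = \begin{cases} L_{M-1} & \text{if } \Delta_M = 0, \\ \max(L_{M-1},\, M - L_{M-1}) & \text{if } \Delta_M \ne 0. \end{cases}$$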
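Proof. If $\Delta_M = 0$, we set $\Lambda^{(M)}(Z) = \Lambda^{(M-1)}(Z)$ and $L_M = L_{M-1}$. Suppose that $\Delta_M \ne 0$. Set
$$\Lambda^{(M)}(Z) = \Lambda^{(M-1)}(Z) - \frac{\Delta_M}{\Delta_\mu} Z^{M-\mu} \Lambda^{(\mu-1)}(Z). \tag{3}$$
We remark that we do not have division by zero in (3), because $\Delta_\mu$ is non-zero by the assumption in the theorem. Let
$$D = \deg \Lambda^{(M)}(Z) \le \max(L_{M-1},\, M - \mu + L_{\mu-1})$$
be the degree of $\Lambda^{(M)}(Z)$ defined in (3).

We check that the LFSR specified by the feedback polynomial $\Lambda^{(M)}(Z)$ can generate $(S_1, \ldots, S_M)$. By the assumptions in the theorem, we have
$$\sum_{i=0}^{L_{\mu-1}} \Lambda^{(\mu-1)}_i S_{j-i} = \begin{cases} 0 & \text{if } j = L_{\mu-1}+1, L_{\mu-1}+2, \ldots, \mu-1, \\ \Delta_\mu & \text{if } j = \mu, \end{cases}$$
and
$$\sum_{i=0}^{L_{M-1}} \Lambda^{(M-1)}_i S_{j-i} = \begin{cases} 0 & \text{if } j = L_{M-1}+1, L_{M-1}+2, \ldots, M-1, \\ \Delta_M & \text{if } j = M. \end{cases}$$
With the notations $\Lambda^{(M-1)}_i = 0$ for $i = L_{M-1}+1, L_{M-1}+2, \ldots, D$ and $\Lambda^{(\mu-1)}_i = 0$ for $i = L_{\mu-1}+1, L_{\mu-1}+2, \ldots, D$, we can write the recursion as
$$\sum_{i=0}^{D} \Lambda^{(M)}_i S_{j-i} = \sum_{i=0}^{D} \Lambda^{(M-1)}_i S_{j-i} - \frac{\Delta_M}{\Delta_\mu} \sum_{i=0}^{D} \Lambda^{(\mu-1)}_i S_{j-M+\mu-i}. \tag{4}$$

If $j = M$, then the two summations on the right-hand side of (4) are equal to
$$\sum_{i=0}^{D} \Lambda^{(M-1)}_i S_{M-i} = \Delta_M \quad\text{and}\quad \sum_{i=0}^{D} \Lambda^{(\mu-1)}_i S_{M-M+\mu-i} = \sum_{i=0}^{D} \Lambda^{(\mu-1)}_i S_{\mu-i} = \Delta_\mu,$$
respectively. Therefore, the right-hand side of (4) is equal to $\Delta_M - (\Delta_M/\Delta_\mu)\Delta_\mu = 0$.

Now suppose that $j < M$. The first summation on the right-hand side of (4) is equal to zero for $j > L_{M-1}$, and the second summation is equal to zero if $j - M + \mu > L_{\mu-1}$. By the hypotheses that $\Delta_\mu \ne 0$ and $L_{\mu-1} < L_\mu$, we have $L_\mu = \mu - L_{\mu-1}$. Hence the second summation in (4) is equal to zero when
$$j > M - \mu + L_{\mu-1} = M - L_\mu = M - L_{\mu+1} = \cdots = M - L_{M-1}.$$
We conclude that $\sum_{i=0}^{D} \Lambda^{(M)}_i S_{j-i}$ is equal to zero for $\max(L_{M-1},\, M - L_{M-1}) < j \le M$. We can set $L_M = \max(L_{M-1},\, M - L_{M-1})$. The degree of $\Lambda^{(M)}$ is no more than
$$\max(L_{M-1},\, M - \mu + L_{\mu-1}) = \max(L_{M-1},\, M - L_{M-1}) = L_M.$$
The polynomial $\Lambda^{(M)}(Z)$ therefore specifies an LFSR with length no more than $L_M$, and the LFSR $(L_M, \Lambda^{(M)}(Z))$ generates $(S_1, S_2, \ldots, S_M)$.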
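The update in (3) can be tried on a small case. The sketch below assumes a prime field GF(p), stores polynomials as coefficient lists with the lowest degree first, and uses the hypothetical helper names `discrepancy` and `update`. The GF(5) sequence $(1, 2, 4, 1)$ is an illustrative choice, not taken from the lecture.

```python
def discrepancy(seq, lam, M, p):
    """Δ_M = sum_j lam[j] * S_{M-j} over GF(p); seq[0] plays the role of S_1."""
    return sum(lam[j] * seq[M - 1 - j] for j in range(min(len(lam), M))) % p


def update(lam_prev, lam_mu, d_M, d_mu, shift, p):
    """Equation (3): lam_prev - (d_M / d_mu) * Z^shift * lam_mu, shift = M - µ."""
    coef = d_M * pow(d_mu, -1, p) % p  # Δ_M / Δ_µ; modular inverse needs Python 3.8+
    out = list(lam_prev) + [0] * max(0, shift + len(lam_mu) - len(lam_prev))
    for i, c in enumerate(lam_mu):
        out[shift + i] = (out[shift + i] - coef * c) % p
    return out


# GF(5), S = (1, 2, 4, 1): the LFSR (1, 1 + 3Z) generates S_1..S_3 but fails at M = 4.
seq, p = [1, 2, 4, 1], 5
lam3 = [1, 3]
d4 = discrepancy(seq, lam3, 4, p)      # Δ_4
lam4 = update(lam3, [1], d4, 1, 3, p)  # µ = 1, Λ^(0)(Z) = 1, Δ_1 = S_1 = 1
print(d4, lam4, discrepancy(seq, lam4, 4, p))
```

Here $L_0 = 0 < L_1 = L_2 = L_3 = 1$, so $\mu = 1$ and the shift is $M - \mu = 3$. The updated polynomial has degree 3, matching $L_4 = \max(1, 4 - 1) = 3$, and its discrepancy at $M = 4$ is zero.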
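Given a sequence $(S_1, S_2, \ldots, S_n)$, we find the smallest index $m$ such that $S_m \ne 0$. We initialize the algorithm by
$$L_j = 0,\quad \Lambda^{(j)}(Z) = 1, \quad\text{for } j = 0, 1, \ldots, m-1,$$
and
$$L_m = m,\quad \Lambda^{(m)}(Z) = 1.$$
The LFSRs $(L_i, \Lambda^{(i)}(Z))$ satisfy the conditions in Theorem 2 with $\Delta_m = S_m \ne 0$. The algorithm continues with repeated applications of Theorem 2 for $M = m+1, m+2, \ldots, n$.

Example. Determine the linear complexity profile of the following sequence over $\mathbb{F}_2$. This sequence is ultimately periodic:
$$S_1 = 0,\quad S_2 = 1,\quad S_3 = 0,\quad\text{and } S_i = 1 \text{ for } i \ge 4.$$

Initialization. The first non-zero element occurs at $S_2 = 1$. Let $(L_0, \Lambda^{(0)}(Z)) = (L_1, \Lambda^{(1)}(Z)) = (0, 1)$, and $(L_2, \Lambda^{(2)}(Z)) = (2, 1)$. We have $\Delta_2 = 1$.

$M = 3$. As $\Delta_3 = 0$, we set $L_3 = L_2$ and $\Lambda^{(3)}(Z) = \Lambda^{(2)}(Z)$.

$M = 4$. $\Delta_4 = 1$. Set $L_4 = \max(L_3, 4 - L_3) = \max(2, 4 - 2) = 2$, and
$$\Lambda^{(4)}(Z) = \Lambda^{(3)}(Z) + Z^2 \Lambda^{(1)}(Z) = 1 + Z^2.$$

$M = 5$. $\Delta_5 = 1$. Set $L_5 = \max(L_4, 5 - L_4) = \max(2, 5 - 2) = 3$, and (with $\mu = 2$, so the shift is $M - \mu = 3$)
$$\Lambda^{(5)}(Z) = \Lambda^{(4)}(Z) + Z^3 \Lambda^{(1)}(Z) = 1 + Z^2 + Z^3.$$

$M = 6$. Because $\Delta_6 = 0$, $(L_6, \Lambda^{(6)}(Z)) = (L_5, \Lambda^{(5)}(Z))$.

$M = 7$. $\Delta_7 = 1$. Set $L_7 = \max(L_6, 7 - L_6) = \max(3, 7 - 3) = 4$, and
$$\Lambda^{(7)}(Z) = \Lambda^{(6)}(Z) + Z^2 \Lambda^{(4)}(Z) = (1 + Z^2 + Z^3) + Z^2(1 + Z^2) = 1 + Z^3 + Z^4.$$

$M = 8$. $\Delta_8 = 1$. Set $L_8 = \max(L_7, 8 - L_7) = \max(4, 8 - 4) = 4$, and
$$\Lambda^{(8)}(Z) = \Lambda^{(7)}(Z) + Z \Lambda^{(6)}(Z) = (1 + Z^4 + Z^3) + Z(1 + Z^2 + Z^3) = 1 + Z.$$

$M \ge 9$. $\Delta_M = 0$, and $(L_M, \Lambda^{(M)}(Z)) = (L_8, \Lambda^{(8)}(Z)) = (4, 1 + Z)$.

The calculations are summarized in the following table.
$$\begin{array}{c|c|c|c|l}
i & S_i & \Delta_i & L_i & \Lambda^{(i)}(Z) \\ \hline
0 & - & - & 0 & 1 \\
1 & 0 & 0 & 0 & 1 \\
2 & 1 & 1 & 2 & 1 \\
3 & 0 & 0 & 2 & 1 \\
4 & 1 & 1 & 2 & 1 + Z^2 \\
5 & 1 & 1 & 3 & 1 + Z^2 + Z^3 \\
6 & 1 & 0 & 3 & 1 + Z^2 + Z^3 \\
7 & 1 & 1 & 4 & 1 + Z^3 + Z^4 \\
8 & 1 & 1 & 4 & 1 + Z \\
9 & 1 & 0 & 4 & 1 + Z
\end{array}$$
The linear complexity profile is $0, 2, 2, 2, 3, 3, 4, 4, 4, \ldots$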
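The iteration described above can be sketched in a few lines. This is a minimal illustration, assuming a prime field GF(p) and coefficient lists with the lowest degree first. The function name `berlekamp_massey` and its return format are choices made for this sketch. It follows the initialization used in these notes ($L_m = m$, $\Lambda^{(m)}(Z) = 1$) rather than other common variants, so the intermediate polynomials may differ from those produced by other implementations.

```python
def berlekamp_massey(seq, p):
    """Return (profile, lam): profile[M-1] = L_M, lam = coefficients of Λ^(n)(Z).

    seq[0] plays the role of S_1; arithmetic is in GF(p) with p prime.
    """
    n = len(seq)
    # Smallest 1-based index m with S_m != 0 (None for the all-zero sequence).
    m = next((i + 1 for i, s in enumerate(seq) if s % p != 0), None)
    if m is None:
        return [0] * n, [1]
    profile = [0] * (m - 1) + [m]
    lam, L = [1], m                            # (L_m, Λ^(m)(Z)) = (m, 1)
    lam_mu, d_mu, mu = [1], seq[m - 1] % p, m  # Λ^(µ-1)(Z), Δ_µ, µ with µ = m
    for M in range(m + 1, n + 1):
        # Δ_M = sum_j Λ^(M-1)_j S_{M-j}
        d = sum(lam[j] * seq[M - 1 - j] for j in range(min(len(lam), M))) % p
        if d != 0:
            coef = d * pow(d_mu, -1, p) % p    # Δ_M / Δ_µ (Python 3.8+)
            shift = M - mu
            new = lam + [0] * max(0, shift + len(lam_mu) - len(lam))
            for i, c in enumerate(lam_mu):     # equation (3)
                new[shift + i] = (new[shift + i] - coef * c) % p
            if 2 * L < M:                      # length grows to M - L_{M-1}
                lam_mu, d_mu, mu = lam, d, M
                L = M - L
            lam = new
        profile.append(L)
    while len(lam) > 1 and lam[-1] == 0:       # drop trailing zero coefficients
        lam.pop()
    return profile, lam


# The example sequence over GF(2): S = 0, 1, 0, 1, 1, 1, 1, 1, 1.
profile, lam = berlekamp_massey([0, 1, 0, 1, 1, 1, 1, 1, 1], 2)
print(profile)  # [0, 2, 2, 2, 3, 3, 4, 4, 4]
print(lam)      # [1, 1], i.e. 1 + Z
```

The pair `(lam_mu, d_mu)` is only refreshed when the length strictly increases, which is exactly the condition $L_{\mu-1} < L_\mu$ in Theorem 2.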
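Decoding RS Codes

First we define some notation. Fix $n$ distinct elements $\alpha_1, \ldots, \alpha_n$ in $\mathbb{F}_q$. Let
$$g(Z) := (Z - \alpha_1)(Z - \alpha_2) \cdots (Z - \alpha_n),$$
$$g_i(Z) := \frac{g(Z)}{Z - \alpha_i} = (Z - \alpha_1)(Z - \alpha_2) \cdots (Z - \alpha_{i-1})(Z - \alpha_{i+1})(Z - \alpha_{i+2}) \cdots (Z - \alpha_n).$$
We first prove the following useful lemma.

Lemma 3. We have
$$\sum_{i=1}^{n} \frac{\alpha_i^{\,j}}{g_i(\alpha_i)} = \begin{cases} 0 & \text{if } j = 0, 1, \ldots, n-2, \\ 1 & \text{if } j = n-1. \end{cases}$$

Proof. We use the notation
$$V(\alpha_1, \ldots, \alpha_n) := \begin{vmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_1^{n-1} & \alpha_2^{n-1} & \cdots & \alpha_n^{n-1} \end{vmatrix} = \prod_{j > i} (\alpha_j - \alpha_i)$$
for the determinant of a Vandermonde matrix. Suppose that we replace the last row of the above Vandermonde matrix by $[\alpha_i^{\,j}]_{i=1,\ldots,n}$, for some $j$ between 0 and $n-1$,
$$\delta_j := \begin{vmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_1^{n-2} & \alpha_2^{n-2} & \cdots & \alpha_n^{n-2} \\ \alpha_1^{\,j} & \alpha_2^{\,j} & \cdots & \alpha_n^{\,j} \end{vmatrix}.$$
Expand the determinant along the last row. We get
$$\begin{aligned}
\delta_j &= \sum_{i=1}^{n} (-1)^{n+i} \alpha_i^{\,j}\, V(\alpha_1, \ldots, \alpha_{i-1}, \alpha_{i+1}, \ldots, \alpha_n) \\
&= \sum_{i=1}^{n} (-1)^{n+i} \alpha_i^{\,j}\, \frac{V(\alpha_1, \alpha_2, \ldots, \alpha_n)}{(\alpha_n - \alpha_i) \cdots (\alpha_{i+1} - \alpha_i)(\alpha_i - \alpha_{i-1}) \cdots (\alpha_i - \alpha_1)} \\
&= V(\alpha_1, \alpha_2, \ldots, \alpha_n) \sum_{i=1}^{n} \frac{\alpha_i^{\,j}}{g_i(\alpha_i)}.
\end{aligned}$$
If $j = n-1$, then the determinant $\delta_j$ is equal to $V(\alpha_1, \ldots, \alpha_n)$, and we get $\sum_{i=1}^{n} \alpha_i^{n-1}/g_i(\alpha_i) = 1$. For $j = 0, 1, 2, \ldots, n-2$, we have $\delta_j = 0$, because there are two repeated rows in the determinant $\delta_j$. Hence $\sum_{i=1}^{n} \alpha_i^{\,j}/g_i(\alpha_i) = 0$ for $j = 0, 1, 2, \ldots, n-2$.

We consider an $(n, k)$ Reed-Solomon code with the following $k \times n$ generator matrix
$$G = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \vdots & \vdots & & \vdots \\ \alpha_1^{k-1} & \alpha_2^{k-1} & \cdots & \alpha_n^{k-1} \end{bmatrix} \tag{5}$$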
where the $\alpha_i$'s are distinct nonzero elements in a finite field $\mathbb{F}_q$. In the following, we need the assumption that all $\alpha_i$'s are non-zero, so that $\alpha_i^{-1}$ exists in $\mathbb{F}_q$ for all $i$. A list of $k$ message symbols, $m_1$ to $m_k$, is encoded to a codeword by multiplying $(m_1, \ldots, m_k) \cdot G$.

From Lemma 3, we can write down a parity-check matrix as
$$H = \begin{bmatrix}
\frac{1}{g_1(\alpha_1)} & \frac{1}{g_2(\alpha_2)} & \cdots & \frac{1}{g_n(\alpha_n)} \\
\frac{\alpha_1}{g_1(\alpha_1)} & \frac{\alpha_2}{g_2(\alpha_2)} & \cdots & \frac{\alpha_n}{g_n(\alpha_n)} \\
\vdots & \vdots & & \vdots \\
\frac{\alpha_1^{n-k-1}}{g_1(\alpha_1)} & \frac{\alpha_2^{n-k-1}}{g_2(\alpha_2)} & \cdots & \frac{\alpha_n^{n-k-1}}{g_n(\alpha_n)}
\end{bmatrix}
= \begin{bmatrix}
1 & 1 & \cdots & 1 \\
\alpha_1 & \alpha_2 & \cdots & \alpha_n \\
\vdots & \vdots & & \vdots \\
\alpha_1^{n-k-1} & \alpha_2^{n-k-1} & \cdots & \alpha_n^{n-k-1}
\end{bmatrix}
\operatorname{diag}\!\left( \frac{1}{g_1(\alpha_1)}, \ldots, \frac{1}{g_n(\alpha_n)} \right).$$
We note that the $g_i(\alpha_i)$ are nonzero for all $i$, because the elements $\alpha_1, \ldots, \alpha_n$ are distinct.

The following is a syndrome-based method for decoding an RS code.

Step (1) Calculate the syndromes by multiplying the received vector $y$ by the transpose of the parity-check matrix $H$ above. The syndromes are the components of $yH^T = (s_1, s_2, \ldots, s_{d-1})$, where $d = n - k + 1$ is the minimum distance.

Step (2) Obtain the shortest linear feedback shift register $(L, \Lambda(Z))$ that generates the syndrome sequence $s_1, s_2, \ldots, s_{d-1}$.

Step (3) If the number of errors $t$ is less than or equal to $(d-1)/2$, then the feedback polynomial $\Lambda(Z)$ has degree $t$ and $t$ distinct roots. There is an error at location $i$ if and only if $\Lambda(\alpha_i^{-1}) = 0$. In the case when $\Lambda(Z)$ has no root in $\mathbb{F}_q$, or the number of distinct roots of $\Lambda(Z)$ is strictly less than the degree of $\Lambda(Z)$, we can declare that there are more than $(d-1)/2$ errors, and stop the decoding procedure.

Step (4) After locating the errors, the error values can be calculated by solving a system of linear equations.

Exercise

In the last step of the decoding procedure of an RS code, we need to determine the error values. Suppose that we have already determined the locations of the errors, and they are $i_1 < i_2 < \cdots < i_t$ for some integer $t$ less than or equal to $(d-1)/2$. Let $e$ be the error vector, defined as the difference between the received vector and the transmitted codeword. Let the $i_j$-th component of $e$ be $e_{i_j}$, for $j = 1, \ldots, t$. Show that the error values $e_{i_j}$ satisfy the following system of linear equations:
$$\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_t \end{bmatrix}
= \begin{bmatrix}
1 & 1 & \cdots & 1 \\
\alpha_{i_1} & \alpha_{i_2} & \cdots & \alpha_{i_t} \\
\vdots & \vdots & & \vdots \\
\alpha_{i_1}^{t-1} & \alpha_{i_2}^{t-1} & \cdots & \alpha_{i_t}^{t-1}
\end{bmatrix}
\begin{bmatrix} e_{i_1}/g_{i_1}(\alpha_{i_1}) \\ e_{i_2}/g_{i_2}(\alpha_{i_2}) \\ \vdots \\ e_{i_t}/g_{i_t}(\alpha_{i_t}) \end{bmatrix}. \tag{6}$$

After obtaining the correct codeword, we also need to decode the message symbols $m_1, \ldots, m_k$. A naive method is to solve a $k \times k$ system of linear equations. Since the RS code is MDS, we can arbitrarily pick $k$ coded symbols and solve for the $k$ message symbols. This requires $O(k^3)$ steps. Show that the following procedure can also produce the message symbols.
Input: a valid codeword $c = (c_1, c_2, \ldots, c_n)$ in the row space of the matrix $G$ in (5).
Output: a message vector $(m_1, \ldots, m_k)$ such that $(m_1, \ldots, m_k) \cdot G = c$.

Step 0. Let $x \leftarrow c$.
Step 1. $l \leftarrow k$.
Step 2. Compute the $l$-th message symbol $m_l$ by taking the inner product
$$m_l \leftarrow x \cdot \left( \frac{\alpha_1^{n-l}}{g_1(\alpha_1)}, \frac{\alpha_2^{n-l}}{g_2(\alpha_2)}, \ldots, \frac{\alpha_n^{n-l}}{g_n(\alpha_n)} \right).$$
Step 3. $x \leftarrow x - m_l\, (\alpha_1^{l-1}, \alpha_2^{l-1}, \ldots, \alpha_n^{l-1})$.
Step 4. $l \leftarrow l - 1$.
Step 5. While $l \ge 1$, go back to Step 2; otherwise return $(m_1, \ldots, m_k)$ and stop.

This method computes the message symbols in the order $m_k, m_{k-1}, \ldots, m_1$. The while-loop is repeated $k$ times, and we need to perform $O(n)$ field operations in each of Steps 2 and 3, assuming the vectors $(\alpha_i^{n-l}/g_i(\alpha_i))_i$ and $(\alpha_i^{l-1})_i$ are available. The overall computational complexity is $O(kn)$.
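The message-recovery procedure above can be exercised numerically. The following is a minimal sketch, assuming a prime field GF(p) so that integer arithmetic modulo p suffices. The helper names `g_i_at_alpha_i`, `encode` and `recover_message` are hypothetical. For clarity the sketch recomputes powers of $\alpha_i$ with `pow` in every round, so it does not attain the $O(kn)$ operation count as written.

```python
def g_i_at_alpha_i(alphas, i, p):
    """g_i(α_i) = product over k != i of (α_i - α_k), reduced mod p."""
    out = 1
    for k, a in enumerate(alphas):
        if k != i:
            out = out * (alphas[i] - a) % p
    return out


def encode(msg, alphas, p):
    """(m_1, ..., m_k) · G, where row l of G is (α_1^(l-1), ..., α_n^(l-1))."""
    return [sum(m * pow(a, l, p) for l, m in enumerate(msg)) % p for a in alphas]


def recover_message(c, alphas, k, p):
    """Steps 0-5: peel off m_k, m_{k-1}, ..., m_1 from a valid codeword c."""
    n = len(alphas)
    inv_g = [pow(g_i_at_alpha_i(alphas, i, p), -1, p) for i in range(n)]
    x = list(c)                                   # Step 0
    msg = [0] * k
    for l in range(k, 0, -1):                     # Steps 1, 4, 5
        # Step 2: m_l = x · (α_i^(n-l) / g_i(α_i))_i
        m_l = sum(x[i] * pow(alphas[i], n - l, p) * inv_g[i] for i in range(n)) % p
        msg[l - 1] = m_l
        # Step 3: x <- x - m_l (α_i^(l-1))_i
        x = [(x[i] - m_l * pow(alphas[i], l - 1, p)) % p for i in range(n)]
    return msg


# A (6, 3) code over GF(7) with α_i = i (an illustrative choice).
p, alphas, msg = 7, [1, 2, 3, 4, 5, 6], [3, 0, 5]
c = encode(msg, alphas, p)
print(recover_message(c, alphas, 3, p) == msg)
```

Step 2 works because of Lemma 3: after the higher-order terms have been removed, $x_i = \sum_{l' \le l} m_{l'} \alpha_i^{l'-1}$, and only the term with $l' = l$ contributes the exponent $n - 1$ to the sum.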