Reed-Solomon Error Correcting Codes


1 Reed-Solomon Error Correcting Codes
Ohad Rodeh
(Reed-Solomon Error Correcting Codes, p. 1/22)

2 Introduction
Reed-Solomon codes are used for error correction.
The code was invented in 1960 by Irving S. Reed and Gustave Solomon, at the MIT Lincoln Laboratory.
This lecture describes a particular application to storage.
It is based on: "A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems", James S. Plank, University of Tennessee.

3 Problem definition
n storage devices holding data: D_1, ..., D_n. Each holds k bytes.
m storage devices holding checksums: C_1, ..., C_m. Each holds k bytes.
The checksums are computed from the data.
The goal: if any m devices fail, they can be reconstructed from the surviving devices.

4 Constructing the checksums
The value of each checksum device is computed from the data devices using a function F.

5 Words
The RS-RAID scheme breaks the data into words of size w bits.
The coding scheme works on stripes whose width is w.

6 Simplifying the problem
From here on we focus on a single stripe.
The data words are d_1, d_2, ..., d_n; the checksum words are c_1, c_2, ..., c_m.
c_i = F_i(d_1, d_2, ..., d_n)
If d_j changes to d_j', compute: c_i' = G_{i,j}(d_j, d_j', c_i).
When data devices fail, we re-compute the data from the available devices, and then re-compute the failed checksum devices from the data devices.

7 Example, RAID-4 as RS-RAID
Set m = 1. This describes n+1 parity:
c_1 = F(d_1, d_2, ..., d_n) = d_1 ⊕ d_2 ⊕ ... ⊕ d_n
If data device j fails, then:
d_j = d_1 ⊕ ... ⊕ d_{j-1} ⊕ d_{j+1} ⊕ ... ⊕ d_n ⊕ c_1
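The n+1 parity scheme can be sketched in a few lines of Python (an illustrative sketch; the helper names are mine, not from the slides):

```python
# Sketch of n+1 parity (the m = 1 case). Each "device" is a byte string;
# the parity device is the XOR of all data devices, and a single failed
# data device is the XOR of the survivors with the parity.

def xor_bytes(a, b):
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))

def parity(devices):
    # c1 = d1 XOR d2 XOR ... XOR dn
    acc = bytes(len(devices[0]))  # all-zero accumulator
    for d in devices:
        acc = xor_bytes(acc, d)
    return acc

def recover(survivors, c1):
    # dj = d1 XOR ... XOR d(j-1) XOR d(j+1) XOR ... XOR dn XOR c1
    acc = c1
    for d in survivors:
        acc = xor_bytes(acc, d)
    return acc

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
c1 = parity(data)
rebuilt = recover([data[0], data[2]], c1)  # suppose device 2 failed
```

The same XOR that builds the parity also rebuilds a lost device, which is why a single function suffices for both directions.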

8 Restating the problem
There are n data words d_1, ..., d_n, all of size w bits. We will:
1. Define the functions F and G used to calculate the parity words c_1, ..., c_m
2. Describe how to recover after losing up to m devices
Three parts to the solution:
1. Using a Vandermonde matrix to compute checksums
2. Using Gaussian elimination to recover from failures
3. Using Galois fields to perform arithmetic

9 Vandermonde matrix
Each checksum word c_i is a linear combination of the data words: F D = C.
F is a Vandermonde matrix (f_{i,j} = j^{i-1}), so its rows are linearly independent.
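As a toy illustration of C = F D with Vandermonde entries f_{i,j} = j^(i-1) (a sketch of mine, computed over the small prime field GF(13) so that ordinary modular arithmetic suffices; the slides work over GF(2^w) instead):

```python
# Checksums as a Vandermonde linear combination of the data words,
# with f_{i,j} = j^(i-1), over the prime field GF(13). Illustrative only.
P = 13
n, m = 4, 2
D = [3, 1, 4, 1]                                         # data words
F = [[pow(j, i, P) for j in range(1, n + 1)] for i in range(m)]
C = [sum(f * d for f, d in zip(row, D)) % P for row in F]
# Row 0 is [1, 1, 1, 1] (a plain sum); row 1 is [1, 2, 3, 4].
```

Note that the first checksum row reduces to simple parity, matching the RAID-4 special case.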

10 Changing a single word
If the data in device j changes from d_j to d_j', then
c_i' = G_{i,j}(d_j, d_j', c_i) = c_i + f_{i,j}(d_j' - d_j)
This makes changing a single word reasonably efficient.

11 Recovering from failures
In the matrix shown on the slide (figure omitted), even after removing m rows the matrix remains invertible, so the lost devices can be recovered by Gaussian elimination.

12 Galois fields
We want to use bytes as our data and checksum elements; therefore, we need a field with 256 elements.
A field GF(n) is a set of n elements closed under addition and multiplication.
Every element has an additive inverse, and every nonzero element has a multiplicative inverse.

13 Primes
If n = p is a prime, then Z_p = {0, ..., p-1} is a field under addition and multiplication modulo p. For example, GF(2) = {0, 1} with addition/multiplication modulo 2.
If n is not a prime, then addition/multiplication modulo n over the set {0, 1, ..., n-1} do not form a field:
1. For example, take n = 4, with elements {0, 1, 2, 3}
2. The element 2 does not have a multiplicative inverse
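A quick check of the claim (a throwaway snippet, not part of the scheme):

```python
# In Z_4 the element 2 has no multiplicative inverse, so arithmetic mod 4
# is not a field; in Z_5 (prime) every nonzero element has one.
no_inverse_mod4 = not any((2 * x) % 4 == 1 for x in range(4))
all_invertible_mod5 = all(
    any((a * x) % 5 == 1 for x in range(1, 5)) for a in range(1, 5)
)
```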

14 GF(2^w)
Therefore, we cannot simply use the elements {0, ..., 2^w - 1} with addition/multiplication modulo 2^w; we need to use Galois fields.
We use the Galois field GF(2^w).
The elements of GF(2^w) are polynomials whose coefficients are zero or one.
Arithmetic in GF(2^w) is polynomial arithmetic modulo a primitive polynomial of degree w.
A primitive polynomial is one that cannot be factored.

15 GF(4)
GF(4) contains the elements {0, 1, x, x+1}.
The primitive polynomial is q(x) = x^2 + x + 1.
Arithmetic:
x + (x + 1) = 1
x · x = x^2 mod (x^2 + x + 1) = x + 1
x · (x + 1) = x^2 + x mod (x^2 + x + 1) = 1
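The GF(4) arithmetic above can be checked mechanically by storing each polynomial as a 2-bit mask (bit i holds the coefficient of x^i), a representation sketch of my own:

```python
# GF(4) elements as 2-bit masks: 0 -> 0b00, 1 -> 0b01, x -> 0b10, x+1 -> 0b11.
Q = 0b111  # q(x) = x^2 + x + 1

def gf4_mul(a, b):
    prod = 0
    for i in range(2):       # carry-less (polynomial) multiply
        if (b >> i) & 1:
            prod ^= a << i
    if prod & 0b100:         # reduce: x^2 = x + 1 (mod q)
        prod ^= Q
    return prod

add = 0b10 ^ 0b11            # x + (x+1) = 1  (addition is XOR)
sq = gf4_mul(0b10, 0b10)     # x * x = x + 1
pr = gf4_mul(0b10, 0b11)     # x * (x+1) = 1
```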

16 Efficient arithmetic
It is important to implement arithmetic over GF(2^w) efficiently.
Elements of GF(2^w) are mapped onto the integers 0, ..., 2^w - 1: the i-th bit of the word equals the coefficient of x^i in r(x).
Addition is performed with XOR.
How do we perform efficient multiplication? Polynomial multiplication is too slow.
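For reference, the "slow" multiplication is a carry-less multiply followed by reduction modulo the primitive polynomial. A sketch for w = 4 with q(x) = x^4 + x + 1, a standard primitive polynomial for GF(16) (the function name is mine):

```python
W, Q = 4, 0b10011  # q(x) = x^4 + x + 1

def gf_mul_slow(a, b):
    prod = 0
    for i in range(W):                   # polynomial (carry-less) product
        if (b >> i) & 1:
            prod ^= a << i
    for shift in range(W - 2, -1, -1):   # reduce bits 2W-2 .. W below degree W
        if prod & (1 << (W + shift)):
            prod ^= Q << shift
    return prod

r = gf_mul_slow(0b0010, 0b1000)  # x * x^3 = x^4 = x + 1 (mod q)
```

Each multiplication costs a loop per bit plus a reduction loop, which is exactly the overhead the log tables below avoid.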

17 Primitive polynomial
A primitive polynomial q(x) is one that cannot be factored.
If q(x) is primitive, then x generates all the nonzero elements of GF(2^w).

18 Multiplication
Multiplication (and division) is done using logarithms: a · b = x^(log_x(a) + log_x(b))
Build two tables:
1. gflog[]: maps an integer to its logarithm
2. gfilog[]: maps an integer to its inverse logarithm
To multiply two integers a and b:
1. Compute their logarithms
2. Add
3. Compute the inverse logarithm
For division: subtract instead of add.
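The table construction and the three multiplication steps can be sketched for GF(16) as follows (my own sketch; table names follow the slides, and a bitwise polynomial multiply is used once, only to fill the tables):

```python
# gflog/gfilog for GF(16) with q(x) = x^4 + x + 1 and generator x = 0b10.
W, Q = 4, 0b10011
NW = 1 << W                              # 16

def slow_mul(a, b):                      # reference multiply, table-filling only
    prod = 0
    for i in range(W):
        if (b >> i) & 1:
            prod ^= a << i
    for shift in range(W - 2, -1, -1):
        if prod & (1 << (W + shift)):
            prod ^= Q << shift
    return prod

gflog = [0] * NW
gfilog = [0] * (NW - 1)
e = 1
for p in range(NW - 1):
    gfilog[p] = e                        # gfilog[p] = x^p
    gflog[e] = p                         # gflog[x^p] = p
    e = slow_mul(e, 0b10)                # next power of x

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return gfilog[(gflog[a] + gflog[b]) % (NW - 1)]

def gf_div(a, b):                        # b must be nonzero
    if a == 0:
        return 0
    return gfilog[(gflog[a] - gflog[b]) % (NW - 1)]
```

Because x is primitive, gfilog enumerates every nonzero element exactly once, so every nonzero integer has a logarithm.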

19 GF(16)
Example tables for GF(16) and example computations (shown as figures on the slide).

20 GF(16), cont. (further example computations, shown as figures on the slide).

21 Summary
Choose a convenient value for w; w = 8 is a good choice.
Set up the tables gflog and gfilog.
Set up the Vandermonde matrix; arithmetic is over GF(2^w).
Use F to maintain the checksums.
If any device fails, use F to reconstruct the data.

22 Conclusions
We presented Reed-Solomon error correcting codes.
Advantages:
1. Conceptually simple
2. Can work with any n and m
Disadvantage: computationally more expensive than XOR-based codes.

23 Reed Solomon error correction - Wikipedia, the free encyclopedia 2/16/10 5:00 PM

Reed-Solomon error correction
From Wikipedia, the free encyclopedia

Reed-Solomon error correction is an error-correcting code that works by oversampling a polynomial constructed from the data. The polynomial is evaluated at several points, and these values are sent or recorded. Sampling the polynomial more often than is necessary makes the polynomial over-determined. As long as it receives "many" of the points correctly, the receiver can recover the original polynomial even in the presence of a "few" bad points.

Reed-Solomon codes are used in a wide variety of commercial applications, most prominently in CDs, DVDs and Blu-ray Discs, in data transmission technologies such as DSL and WiMAX, in broadcast systems such as DVB and ATSC, and in computer applications such as RAID 6 systems.

Contents
1 Overview
2 Definition
2.1 Overview
2.2 Mathematical formulation
2.3 Remarks
2.4 Reed-Solomon codes as BCH codes
2.5 Equivalence of the two formulations
3 Properties of Reed-Solomon codes
4 History
5 Applications
5.1 Data storage
5.2 Data transmission
5.3 Mail encoding
5.4 Satellite transmission
6 Sketch of the error correction algorithm
7 See also
8 References
9 External links

Overview

Reed-Solomon codes are block codes. This means that a fixed block of input data is processed into a fixed block of output data. In the case of the most commonly used RS code, (255, 223), 223 Reed-Solomon input symbols (each eight bits long) are encoded into 255 output symbols.

Solomon_error_correction Page 1 of 10

24
Most RS error-correcting code schemes are systematic. This means that some portion of the output codeword contains the input data in its original form.

A Reed-Solomon symbol size of eight bits forces the longest codeword length to be 255 symbols. The standard (255, 223) Reed-Solomon code is capable of correcting up to 16 Reed-Solomon symbol errors in each codeword. Since each symbol is actually eight bits, this means that the code can correct up to 16 short bursts of error.

The Reed-Solomon code, like the convolutional code, is a transparent code. This means that if the channel symbols have been inverted somewhere along the line, the decoders will still operate. The result will be the complement of the original data. However, the Reed-Solomon code loses its transparency when the code is shortened (see below). The "missing" bits in a shortened code need to be filled by either zeros or ones, depending on whether the data is complemented or not. (To put it another way, if the symbols are inverted, then the zero fill needs to be inverted to a ones fill.) For this reason it is mandatory that the sense of the data (i.e., true or complemented) be resolved before Reed-Solomon decoding.

Definition

Overview

The key idea behind a Reed-Solomon code is that the data encoded is first visualized as a polynomial. The code relies on a theorem from algebra that states that any k distinct points uniquely determine a univariate polynomial of degree at most k - 1. The sender determines a degree k - 1 polynomial, over a finite field, that represents the k data points. The polynomial is then "encoded" by its evaluation at various points, and these values are what is actually sent. During transmission, some of these values may become corrupted. Therefore, more than k points are actually sent. As long as sufficient values are received correctly, the receiver can deduce what the original polynomial was, and hence decode the original data. In the same sense that one can correct a curve by interpolating past a gap, a Reed-Solomon code can bridge a series of errors in a block of data to recover the coefficients of the polynomial that drew the original curve.

Mathematical formulation

Given a finite field F and polynomial ring F[x], let n and k be chosen such that k <= n <= |F|. Pick n distinct elements of F, denoted {x_1, ..., x_n}. Then, the codebook C is created from the tuples of values obtained by evaluating every polynomial (over F) of degree less than k at each x_i; that is,

C = { (p(x_1), ..., p(x_n)) : p in F[x], deg(p) < k }.
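A toy version of the oversampling idea (my sketch; it uses the prime field GF(13) so plain modular arithmetic works, whereas practical deployments use GF(2^m)): k = 3 data symbols define a degree-<3 polynomial, it is evaluated at n = 5 points, and any 3 correct values pin the polynomial down again.

```python
P = 13
xs = [1, 2, 3, 4, 5]

def evaluate(coeffs, x):
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

def lagrange_eval(points, x):
    # Interpolate through `points` and evaluate at x, all mod P.
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

data = [7, 2, 5]                               # polynomial coefficients
codeword = [evaluate(data, x) for x in xs]     # n = 5 transmitted values
# Recover a "lost" fifth value from the first three received values:
recovered = lagrange_eval(list(zip(xs, codeword))[:3], 5)
```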

25
C is an [n, k, n-k+1] code; in other words, it is a linear code of length n (over F) with dimension k and minimum Hamming distance n - k + 1. A Reed-Solomon code is a code of the above form, subject to the additional requirement that the set {x_1, ..., x_n} must be the set of all non-zero elements of the field F (and therefore, n = |F| - 1).

Remarks

For practical uses of Reed-Solomon codes, it is common to use a finite field F with 2^m elements. In this case, each symbol can be represented as an m-bit value. The sender sends the data points as encoded blocks, and the number of symbols in the encoded block is n = 2^m - 1. Thus a Reed-Solomon code operating on 8-bit symbols has n = 2^8 - 1 = 255 symbols per block. (This is a very popular value because of the prevalence of byte-oriented computer systems.) The number k, with k < n, of data symbols in the block is a design parameter. A commonly used code encodes k = 223 eight-bit data symbols plus 32 eight-bit parity symbols in an n = 255-symbol block; this is denoted as a (n, k) = (255, 223) code, and is capable of correcting up to 16 symbol errors per block.

The set {x_1, ..., x_n} of non-zero elements of a finite field can be written as {1, α, α^2, ..., α^(n-1)}, where α is a primitive nth root of unity. It is customary to encode the values of a Reed-Solomon code in this order. Since α^n = 1, and since for every polynomial p(x) the function p(αx) is also a polynomial of the same degree, it follows that a Reed-Solomon code is cyclic.

Reed-Solomon codes as BCH codes

Reed-Solomon codes are a special case of a larger class of codes called BCH codes. An efficient error correction algorithm for BCH codes (and therefore Reed-Solomon codes) was discovered by Berlekamp in 1968. To see that Reed-Solomon codes are special BCH codes, it is useful to give the following alternate definition of Reed-Solomon codes.
Given a finite field F of size q [1], let n = q - 1 and let α be a primitive nth root of unity in F. Also let 1 <= k <= n be given. The Reed-Solomon code for these parameters has code word (f_0, f_1, ..., f_(n-1)) if and only if α, α^2, ..., α^(n-k) are roots of the polynomial

p(x) = f_0 + f_1 x + ... + f_(n-1) x^(n-1).

26
With this definition, it is immediately seen that a Reed-Solomon code is a polynomial code, and in particular a BCH code. The generator polynomial g(x) is the minimal polynomial with roots α, α^2, ..., α^(n-k), and the code words are exactly the polynomials that are divisible by g(x).

Equivalence of the two formulations

At first sight, the above two definitions of Reed-Solomon codes seem very different. In the first definition, code words are values of polynomials, whereas in the second, they are coefficients. Moreover, the polynomials in the first definition are required to be of small degree, whereas those in the second definition are required to have specific roots.

The equivalence of the two definitions is proved using the discrete Fourier transform. This transform, which exists in all finite fields as well as the complex numbers, establishes a duality between the coefficients of polynomials and their values. This duality can be approximately summarized as follows: Let p(x) and q(x) be two polynomials of degree less than n. If the values of p(x) are the coefficients of q(x), then (up to a scalar factor and reordering), the values of q(x) are the coefficients of p(x). For this to make sense, the values must be taken at locations x = α^i, for i = 0, ..., n-1, where α is a primitive nth root of unity.

To be more precise, let p(x) = v_0 + v_1 x + ... + v_(n-1) x^(n-1) and q(x) = f_0 + f_1 x + ... + f_(n-1) x^(n-1), and assume p(x) and q(x) are related by the discrete Fourier transform. Then the coefficients and values of p(x) and q(x) are related as follows: for all i, f_i = p(α^i) and q(α^i) = n·v_(n-i).

Using these facts, we have: (f_0, ..., f_(n-1)) is a code word of the Reed-Solomon code according to the first definition
if and only if p(x) is of degree less than k (because f_0, ..., f_(n-1) are the values of p(x)),
if and only if v_i = 0 for i = k, ..., n-1,
if and only if q(α^i) = 0 for i = 1, ..., n-k (because q(α^i) = n·v_(n-i)),
if and only if (f_0, ..., f_(n-1)) is a code word of the Reed-Solomon code according to the second definition.
This shows that the two definitions are equivalent.

Properties of Reed-Solomon codes

27
The error-correcting ability of any Reed-Solomon code is determined by n - k, the measure of redundancy in the block. If the locations of the errored symbols are not known in advance, then a Reed-Solomon code can correct up to (n - k)/2 erroneous symbols, i.e., it can correct half as many errors as there are redundant symbols added to the block. Sometimes error locations are known in advance (e.g., side information in demodulator signal-to-noise ratios); these are called erasures. A Reed-Solomon code (like any MDS code) is able to correct twice as many erasures as errors, and any combination of errors and erasures can be corrected as long as the relation 2E + S <= n - k is satisfied, where E is the number of errors and S is the number of erasures in the block.

The properties of Reed-Solomon codes make them especially well-suited to applications where errors occur in bursts. This is because it does not matter to the code how many bits in a symbol are in error: if multiple bits in a symbol are corrupted, it only counts as a single error. Conversely, if a data stream is not characterized by error bursts or drop-outs but by random single-bit errors, a Reed-Solomon code is usually a poor choice.

Designers are not required to use the "natural" sizes of Reed-Solomon code blocks. A technique known as shortening can produce a smaller code of any desired size from a larger code. For example, the widely used (255, 223) code can be converted to a (160, 128) code by padding the unused portion of the block (usually the beginning) with 95 binary zeroes and not transmitting them. At the decoder, the same portion of the block is loaded locally with binary zeroes. The Delsarte-Goethals-Seidel theorem [2] illustrates an example of an application of shortened Reed-Solomon codes.
In 1999 Madhu Sudan and Venkatesan Guruswami at MIT published "Improved Decoding of Reed-Solomon and Algebraic-Geometry Codes", introducing an algorithm that allowed for the correction of errors beyond half the minimum distance of the code. It applies to Reed-Solomon codes and more generally to algebraic geometry codes. This algorithm produces a list of codewords (it is a list-decoding algorithm) and is based on interpolation and factorization of polynomials over GF(2^m) and its extensions.

History

The code was invented in 1960 by Irving S. Reed and Gustave Solomon, who were then members of MIT Lincoln Laboratory. Their seminal article was entitled "Polynomial Codes over Certain Finite Fields." When it was written, digital technology was not advanced enough to implement the concept. The first application, in 1982, of RS codes in mass-produced products was the compact disc, where two interleaved RS codes are used. An efficient decoding algorithm for large-distance RS codes was developed by Elwyn Berlekamp and James Massey in the late 1960s. Today RS codes are used in hard disk drives, DVDs, telecommunication, and digital broadcast protocols.

Applications

Data storage

28
Reed-Solomon coding is very widely used in mass storage systems to correct the burst errors associated with media defects.

Reed-Solomon coding is a key component of the compact disc. It was the first use of strong error correction coding in a mass-produced consumer product, and DAT and DVD use similar schemes. In the CD, two layers of Reed-Solomon coding separated by a 28-way convolutional interleaver yields a scheme called Cross-Interleaved Reed-Solomon Coding (CIRC). The first element of a CIRC decoder is a relatively weak inner (32, 28) Reed-Solomon code, shortened from a (255, 251) code with 8-bit symbols. This code can correct up to 2 byte errors per 32-byte block. More importantly, it flags as erasures any uncorrectable blocks, i.e., blocks with more than 2 byte errors. The decoded 28-byte blocks, with erasure indications, are then spread by the deinterleaver to different blocks of the (28, 24) outer code. Thanks to the deinterleaving, an erased 28-byte block from the inner code becomes a single erased byte in each of 28 outer code blocks. The outer code easily corrects this, since it can handle up to 4 such erasures per block.

The result is a CIRC that can completely correct error bursts up to 4000 bits, or about 2.5 mm on the disc surface. This code is so strong that most CD playback errors are almost certainly caused by tracking errors that cause the laser to jump track, not by uncorrectable error bursts. [3]

Another product which incorporates Reed-Solomon coding is Nintendo's e-Reader. This is a video-game delivery system which uses a two-dimensional "barcode" printed on trading cards. The cards are scanned using a device which attaches to Nintendo's Game Boy Advance game system.

Reed-Solomon error correction is also used in parchive files which are commonly posted accompanying multimedia files on USENET.
The distributed online storage service Wuala also makes use of Reed-Solomon coding when breaking up files.

Data transmission

Specialized forms of Reed-Solomon codes, specifically Cauchy-RS and Vandermonde-RS, can be used to overcome the unreliable nature of data transmission over erasure channels. The encoding process assumes an RS(N, K) code, which generates N codewords of length N symbols, each storing K symbols of data, that are then sent over an erasure channel. Any combination of K codewords received at the other end is enough to reconstruct all of the N codewords. The code rate is generally set to 1/2 unless the channel's erasure likelihood can be adequately modelled and is seen to be less. Consequently, N is usually 2K, meaning that at least half of all the codewords sent must be received in order to reconstruct all of the codewords sent.

Reed-Solomon codes are also used in xDSL systems and CCSDS's Space Communications Protocol Specifications as a form of Forward Error Correction.

Mail encoding

Paper bar codes such as PostBar, MaxiCode, Datamatrix and QR Code use Reed-Solomon error correction to allow correct reading even if a portion of the bar code is damaged.

29
Satellite transmission

One significant application of Reed-Solomon coding was to encode the digital pictures sent back by the Voyager space probe. Voyager introduced Reed-Solomon coding concatenated with convolutional codes, a practice that has since become very widespread in deep space and satellite (e.g., direct digital broadcasting) communications.

Viterbi decoders tend to produce errors in short bursts. Correcting these burst errors is a job best done by short or simplified Reed-Solomon codes. Modern versions of concatenated Reed-Solomon/Viterbi-decoded convolutional coding were and are used on the Mars Pathfinder, Galileo, Mars Exploration Rover and Cassini missions, where they perform close to the ultimate limit imposed by the Shannon capacity. These concatenated codes are now being replaced by more powerful turbo codes, where the transmitted data does not need to be decoded immediately.

Sketch of the error correction algorithm

The following is a sketch of the main idea behind the error correction algorithm for Reed-Solomon codes. By definition, a code word of a Reed-Solomon code is given by the sequence of values of a low-degree polynomial over a finite field. A key fact for the error correction algorithm is that the values and the coefficients of a polynomial are related by the discrete Fourier transform.

The purpose of a Fourier transform is to convert a signal from the time domain to the frequency domain or vice versa. In the case of the Fourier transform over a finite field, the frequency domain signal corresponds to the coefficients of a polynomial, and the time domain signal corresponds to the values of the same polynomial. As shown in Figures 1 and 2, an isolated value in the frequency domain corresponds to a smooth wave in the time domain. The wavelength depends on the location of the isolated value.
Conversely, as shown in Figures 3 and 4, an isolated value in the time domain corresponds to a smooth wave in the frequency domain.

30
(Figures 1-4 omitted.)

In a Reed-Solomon code, the frequency domain is divided into two regions as shown in Figure 5: a left (low-frequency) region of length k, and a right (high-frequency) region of length n - k. A data word is then embedded into the left region (corresponding to the k coefficients of a polynomial of degree at most k - 1), while the right region is filled with zeros. The result is Fourier transformed into the time domain, yielding a code word that is composed only of low frequencies. In the absence of errors, a code word can be decoded by reverse Fourier transforming it back into the frequency domain.

Now consider a code word containing a single error, as shown in red in Figure 6. The effect of this error in the frequency domain is a smooth, single-frequency wave in the right region, called the syndrome of the error. The error location can be determined by determining the frequency of the syndrome signal. Similarly, if two or more errors are introduced in the code word, the syndrome will be a signal composed of two or more frequencies, as shown in Figure 7. As long as it is possible to determine the frequencies of which the syndrome is composed, it is possible to determine the error locations. Notice that the error locations depend only on the frequencies of these waves, whereas the error magnitudes depend on their amplitudes and phase.

The problem of determining the error locations has therefore been reduced to the problem of finding, given a sequence of n - k values, the smallest set of elementary waves into which these values can be decomposed. It is known from digital signal processing that this problem is equivalent to finding the roots of the minimal

31
polynomial of the sequence, or equivalently, of finding the shortest linear feedback shift register (LFSR) for the sequence. The latter problem can either be solved inefficiently by solving a system of linear equations, or more efficiently by the Berlekamp-Massey algorithm.

(Figures 5-7 omitted.)

See also

Forward error correction
BCH code
Low-density parity-check code
Chien search
Datamatrix
Tornado codes
Finite ring

References

1. ^ Lidl, Rudolf; Pilz, Günter (1999). Applied Abstract Algebra (2nd ed.). Wiley.
2. ^ "Kissing Numbers, Sphere Packings, and Some Unexpected Proofs", Notices of the American Mathematical Society, Volume 51, Issue 8, 2004/09. Explains the Delsarte-Goethals-Seidel theorem as used in the context of the error correcting code for compact disc.
3. ^ K.A.S. Immink, "Reed-Solomon Codes and the Compact Disc", in S.B. Wicker and V.K. Bhargava (eds.), Reed-Solomon Codes and Their Applications, IEEE Press, 1994.
F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes, New York: North-Holland Publishing Company, 1977.
Irving S. Reed and Xuemin Chen, Error-Control Coding for Data Networks, Boston: Kluwer Academic Publishers, 1999.
MIT OpenCourseWare, Principles of Digital Communication II, Lecture 10 on Reed-Solomon Codes.

32
External links

Schifra Open Source C++ Reed-Solomon Codec
A collection of links to books, online articles and source code
Henry Minsky's RSCode library, Reed-Solomon encoder/decoder
A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems
A thesis on algebraic soft-decoding of Reed-Solomon codes (it explains the basics as well)
Matlab implementation of errors-and-erasures Reed-Solomon decoding
BBC R&D White Paper WHP031

Retrieved from Wikipedia, "Reed-Solomon error correction". Categories: Error detection and correction. This page was last modified on 29 January 2010 at 06:24. Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply.

33 List Decoding of Reed Solomon Codes (p. 1/30)
Madhu Sudan, MIT CSAIL

34 Background: Reliable Transmission of Information

35 The Problem of Information Transmission
Sender → Noisy Channel → Receiver
"We are not ready" / "We are now ready"
When information is digital, reliability is critical. Need to understand errors, and correct them.

38 Shannon (1948)
Model noise by a probability distribution.
Example: Binary symmetric channel (BSC). Parameter p ∈ [0, 1/2]. Channel transmits bits. With probability 1 - p the bit is transmitted faithfully, and with probability p the bit is flipped (independent of all other events).
Shannon's architecture:
Sender encodes k bits into n bits. Transmits the n-bit string on the channel. Receiver decodes n bits into k bits.
Rate of channel usage = k/n.

39 Shannon's theorem
Every channel (in a broad class) has a capacity s.t. transmitting at rate below capacity is feasible, and above capacity is infeasible.
Example: the binary symmetric channel BSC(p) has capacity 1 - H(p), where H(p) is the binary entropy function.
p = 0 implies capacity = 1.
p = 1/2 implies capacity = 0.
p < 1/2 implies capacity > 0.
Example: q-ary symmetric channel (p): on input σ ∈ F_q the receiver receives (independently) σ', where σ' = σ w.p. 1 - p, and σ' is uniform over F_q \ {σ} w.p. p. Capacity is positive if p < 1 - 1/q.
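The BSC capacity formula 1 - H(p) is easy to compute directly (a small sketch; function names are mine):

```python
# Capacity of the binary symmetric channel, 1 - H(p), with H the binary
# entropy function, matching the statements above.
import math

def H(p):
    if p == 0 or p == 1:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    return 1.0 - H(p)

def qsc_capacity_positive(p, q):
    # The q-ary symmetric channel has positive capacity iff p < 1 - 1/q.
    return p < 1 - 1 / q
```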

40 Constructive versions
Shannon's theory was non-constructive: decoding takes exponential time.
[Elias 55] gave polytime algorithms to achieve positive rate on every channel of positive capacity.
[Forney 66] achieved any rate < capacity with polynomial time algorithms (and exponentially small error).
Modern results (following [Spielman 96]) lead to linear time algorithms.

41 Hamming (1950)
Modelled errors adversarially.
Focussed on the image of the encoding function (the "Code").
Introduced a metric (Hamming distance) on the range of the encoding function: d(x, y) = # coordinates i such that x_i ≠ y_i.
Noticed that for adversarial error (and guaranteed error recovery), the distance of the Code is important: Δ(C) = min_{x ≠ y ∈ C} d(x, y).
A code of distance d corrects ⌊(d - 1)/2⌋ errors.
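These definitions translate directly into code (an illustrative sketch using the length-3 repetition code):

```python
# Hamming distance and minimum distance of a code, per the definitions above.
from itertools import combinations

def hamming(x, y):
    return sum(1 for a, b in zip(x, y) if a != b)

def min_distance(code):
    return min(hamming(x, y) for x, y in combinations(code, 2))

# The length-3 binary repetition code has distance 3,
# so it corrects (3 - 1) // 2 = 1 error.
C = ["000", "111"]
d = min_distance(C)
```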

42 Contrast between Shannon & Hamming
[Sha48]: Channel probabilistic. E.g., flips each bit independently w.p. p. Tightly analyzed for many cases, e.g., q-SC(p). Corrects many errors, but the channel is restricted: it may be too weak to capture some scenarios, and needs a very accurate channel model.
[Ham50]: Channel flips bits adversarially. A safer model, and good codes are known, but too pessimistic: can only decode if p < 1/2 for any alphabet. Fewer errors; more general errors.
Which model is correct? Depends on the application. Crudely: small q → Shannon; large q → Hamming.
Today: new models of error-correction + algorithms. List-decoding: a relaxed notion of decoding. More errors; strong (enough) errors.

49 Reed-Solomon Codes

50 Motivation: [Singleton] Bound
Suppose C ⊆ F_q^n has q^k codewords. How large can its distance be?
Bound: Δ(C) ≤ n - k + 1.
Proof: Project the code to its first k - 1 coordinates. By the Pigeonhole Principle, two codewords collide, i.e., agree on all of these k - 1 coordinates. These two codewords thus disagree in at most n - k + 1 coordinates.
Surely we can do better? Actually, no! [Reed-Solomon] codes match this bound!

Reed-Solomon Codes
Messages are polynomials. Encoding is evaluation at x_1, ..., x_n.
[Figure: a polynomial determined by message values m_1, ..., m_4, evaluated at points x_1, ..., x_9.]
n > degree: injective. n ≫ degree: redundant.

Reed-Solomon Codes (formally)
Let F_q be a finite field. A code is specified by k, n, and α_1, ..., α_n ∈ F_q.
Message: (c_0, ..., c_k) ∈ F_q^{k+1}, the coefficients of a degree-k polynomial p(x) = c_0 + c_1·x + ... + c_k·x^k.
Encoding: p ↦ ⟨p(α_1), ..., p(α_n)⟩ (k + 1 letters to n letters).
A degree-k polynomial has at most k roots, so the distance is d = n − k.
These are the Reed-Solomon codes. They match the [Singleton] bound! Commonly used (CDs, DVDs, etc.).
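The definition can be exercised directly. Below is a toy sketch over the prime field GF(13); the field size, degree bound, and evaluation points are illustrative choices, not from the slides. A brute-force scan confirms the minimum distance n − k, matching the Singleton bound for this dimension.

```python
from itertools import product

# Toy Reed-Solomon encoder over the prime field GF(13).
q = 13                      # field size (prime, so arithmetic is mod q)
k = 1                       # degree bound: messages are k + 1 = 2 symbols
alphas = list(range(5))     # n = 5 distinct evaluation points
n = len(alphas)

def encode(msg):
    """Evaluate p(x) = msg[0] + msg[1]*x + ... at each alpha (mod q)."""
    return [sum(c * pow(a, i, q) for i, c in enumerate(msg)) % q for a in alphas]

# Brute-force the minimum distance over all pairs of distinct codewords.
codewords = [encode(m) for m in product(range(q), repeat=k + 1)]
dist = min(sum(x != y for x, y in zip(u, v))
           for i, u in enumerate(codewords) for v in codewords[i + 1:])
print(dist, n - k)  # → 4 4: distance equals n - k
```

Two distinct lines agree in at most one evaluation point, so any two codewords differ in at least n − k = 4 of the 5 coordinates.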

List-Decoding of Reed-Solomon Codes

Reed-Solomon Decoding
Restatement of the problem:
Input: n points (α_i, y_i) ∈ F_q^2; agreement parameter t.
Output: all degree-k polynomials p(x) such that p(α_i) = y_i for at least t values of i.
We use k = 1 for illustration, i.e., we want all lines (y − ax − b = 0) that pass through t out of the n points.
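For k = 1 the restated problem has an obvious brute-force solution, useful as a reference point before the efficient algorithm on the following slides. The field size and the two planted lines below are illustrative choices.

```python
# Brute-force list decoding for degree-1 polynomials (lines) over GF(q):
# return every line y = a*x + b agreeing with at least t of the points.
q = 13
points = [(x, (2 * x + 3) % q) for x in range(5)] + \
         [(x, (5 * x + 1) % q) for x in range(5, 10)]  # two planted lines

def list_decode_lines(points, t):
    hits = []
    for a in range(q):
        for b in range(q):
            agree = sum((a * x + b) % q == y for x, y in points)
            if agree >= t:
                hits.append((a, b))
    return hits

print(list_decode_lines(points, 5))  # → [(2, 3), (5, 1)]
```

Both planted lines are recovered, and nothing else: any other line meets each planted line in at most one x-coordinate, so it agrees with at most 2 of the 10 points.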

Algorithm Description [S. 96]
n = 14 points; want all lines through at least 5 points.
Find a degree-4 polynomial Q(x, y) ≢ 0 such that Q(α_i, y_i) = 0 for all 14 points.
Here Q(x, y) = y^4 − x^4 − y^2 + x^2. Let us plot all zeroes of Q...
Both relevant lines emerge! Formally, Q(x, y) factors as (x^2 + y^2 − 1)(y + x)(y − x).
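The worked example can be sanity-checked numerically: Q and its claimed factorization agree as polynomials, so every zero of a factor (the unit circle and the two lines) is a zero of Q.

```python
import random

# Q(x, y) = y^4 - x^4 - y^2 + x^2 and its factorization from the slide.
Q = lambda x, y: y**4 - x**4 - y**2 + x**2
factored = lambda x, y: (x * x + y * y - 1) * (y + x) * (y - x)

# Two quartics agreeing on many points agree identically; random spot
# checks over the integers already make the identity convincing.
for _ in range(1000):
    x, y = random.randint(-50, 50), random.randint(-50, 50)
    assert Q(x, y) == factored(x, y)

# Points on the lines y = x and y = -x vanish on Q:
assert Q(7, 7) == 0 and Q(7, -7) == 0
print("factorization verified")
```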

What Happened?
1. Why did a degree-4 curve exist? Counting argument: degree 4 gives enough degrees of freedom to pass through any 14 points.
2. Why did all the relevant lines emerge/factor out? A line l that intersects a degree-4 curve Q in 5 points must be a factor of Q.

Generally
Lemma 1: A Q with deg_x(Q), deg_y(Q) ≤ D ≈ √n exists passing through any n points.
Lemma 2: If a Q with deg_x(Q), deg_y(Q) ≤ D and the curve y = p(x), with deg(p) ≤ d, intersect in more than (D + 1)·d points, then y − p(x) divides Q.
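The counting behind Lemma 1 is easy to reproduce. The helper name below is illustrative; it simply finds the smallest degree with more unknowns than constraints.

```python
import math

def min_D(n):
    """Smallest D such that (D + 1)^2 unknowns exceed n constraints."""
    D = 0
    while (D + 1) ** 2 <= n:
        D += 1
    return D

# A bivariate Q with deg_x, deg_y <= D has (D+1)^2 coefficients, so the n
# homogeneous constraints Q(alpha_i, y_i) = 0 admit a nonzero solution
# whenever (D+1)^2 > n:
print(min_D(14))        # → 3: 16 unknowns versus 14 constraints

# The worked example instead bounded the total degree: monomials x^i y^j
# with i + j <= 4 number C(6, 2) = 15 > 14, so a total-degree-4 Q exists too.
print(math.comb(6, 2))  # → 15
```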

Efficient algorithm?
1. Q can be found by solving a system of linear equations.
2. Fast algorithms for factorization of bivariate polynomials exist ('83-'85) [Kaltofen; Chistov & Grigoriev; Lenstra; von zur Gathen & Kaltofen].
Immediate application:
Theorem: Can list-decode Reed-Solomon codes from n − 2·√((k + 1)·n) errors.
With some fine-tuning of parameters:
Theorem [S. 96]: Can list-decode Reed-Solomon codes from a (1 − √(2R))-fraction of errors.
This does not meet the combinatorial bounds, though!
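Step 1, finding Q by linear algebra, can be carried out concretely on a toy instance. The prime field, the planted lines, and the helper names below are illustrative choices; the point is that the interpolation matrix has more columns (monomials) than rows (points), so a nonzero null vector always exists.

```python
# Set up Q(a_i, y_i) = 0 as a linear system over the monomials x^i y^j
# with i + j <= D, and solve it by Gaussian elimination mod p.
p, D = 13, 4
pts = [(x, (2 * x + 3) % p) for x in range(5)] + \
      [(x, (5 * x + 1) % p) for x in range(5, 10)]   # two planted lines
monos = [(i, j) for i in range(D + 1) for j in range(D + 1 - i)]  # 15 monomials

# One row per point, one column per monomial.
A = [[pow(x, i, p) * pow(y, j, p) % p for (i, j) in monos] for x, y in pts]

def nullspace_vector(A, p):
    """Return a nonzero v with A v = 0 (mod p), via reduced row echelon form."""
    rows, cols = [r[:] for r in A], len(A[0])
    pivots, r = {}, 0
    for c in range(cols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], p - 2, p)          # inverse mod prime p
        rows[r] = [v * inv % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        pivots[c] = r
        r += 1
    free = next(c for c in range(cols) if c not in pivots)  # 15 monomials > 10 points
    v = [0] * cols
    v[free] = 1
    for c, row in pivots.items():
        v[c] = (-rows[row][free]) % p
    return v

coeffs = nullspace_vector(A, p)
Qval = lambda x, y: sum(c * pow(x, i, p) * pow(y, j, p)
                        for c, (i, j) in zip(coeffs, monos)) % p
assert any(coeffs) and all(Qval(x, y) == 0 for x, y in pts)
print("nonzero Q found vanishing on all", len(pts), "points")
```

Factoring the resulting Q (step 2) is where the cited bivariate-factorization algorithms come in; it is not reimplemented here.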

Improved List-Decoding

Going Further: Example 2 [Guruswami + S. 98]
n = 11 points; want all lines through 4 points.
Fitting a degree-4 curve Q as earlier doesn't work. Why? The correct answer has 5 lines, and a degree-4 curve can't have 5 factors!
Instead, fit a degree-7 polynomial Q(x, y) passing through each point twice. Q(x, y) = (margin too small). Plot all zeroes...
All relevant lines emerge!

Where was the gain?
Requiring Q to pass through each point twice effectively doubles the number of intersections between Q and each line, so the number of intersections is now 8. On the other hand, the number of constraints goes up from 11 to 33, forcing the degree used to go up to 7 (from 4). But now the degree is less than the number of intersections! We can pass through each point twice with less than twice the degree. Letting the intersection multiplicity go to infinity gives a decoding algorithm for up to a (1 − √R)-fraction of errors.
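This bookkeeping can be checked mechanically with the standard multiplicity-counting formulas (the helper name is illustrative): vanishing to multiplicity m at a point imposes C(m+1, 2) constraints, a total-degree-D bivariate Q has C(D+2, 2) coefficients, and a line through t of the points meets Q with multiplicity at least m·t, factoring out once m·t > D.

```python
from math import comb

def works(n, t, m):
    """Smallest usable total degree D, or None if the argument fails."""
    constraints = n * comb(m + 1, 2)      # m = 2 gives 3 constraints per point
    D = 0
    while comb(D + 2, 2) <= constraints:  # need more unknowns than constraints
        D += 1
    return D if m * t > D else None       # line factors out only if m*t > D

print(works(11, 4, 1))  # → None: degree 4, only 4 intersections; fails
print(works(11, 4, 2))  # → 7: degree 7, 8 intersections > 7; succeeds
```

With m = 1 the method needs degree 4 but gets only 4 intersections per line; with m = 2 the degree rises to 7 while the intersections rise to 8, exactly the gain described above.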

Summary
Can correct errors in Reed-Solomon codes well beyond the half-the-distance (Hamming) barrier!
Matches the best known combinatorial bounds on list-decodability.
Open question: correct more errors, or show that this leads to exponentially large lists!
Techniques: the polynomial method, and the method of multiplicities!

The Polynomial Method
Goal: understand combinatorial parameters of some algebraically nice set. E.g.:
What is the minimum number of points in the union of l sets, where each set is t points from a degree-k polynomial?
What is the minimum number of points in a set K ⊆ F_q^n such that K contains a line in every direction?
Method: Fit a low-degree polynomial Q to the set K. Infer that Q is zero on points outside K, due to algebraic niceness. Infer a lower bound on the degree of Q (due to the abundance of zeroes). Transfer this to a bound on the combinatorial parameter of interest.

Kakeya Sets
Definition: K ⊆ F_q^n is a Kakeya set if it contains a line in every direction.
Question: How small can K be?
Bounds (till 2007): for all K, |K| ≥ q^{n/2}; there exist K with |K| ≤ (q/2)^n [Mockenhaupt & Tao]. In particular, even the exponent of q was unknown!
[Dvir 08]'s breakthrough: for all K, |K| ≥ q^n / n!
Subsequently [Dvir, Kopparty, Saraf, S.]: for all K, |K| ≥ (q/2)^n.
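For intuition, the Kakeya question can be solved exhaustively in the tiny case F_3^2 (an illustration only; the bounds above concern the asymptotic regime of large q and n). There are q + 1 = 4 directions, and the smallest set containing a full line in each turns out to use 7 of the 9 points.

```python
from itertools import combinations, product

q = 3
points = list(product(range(q), repeat=2))
directions = [(1, 0), (0, 1), (1, 1), (1, 2)]  # one per parallel class

def line(a, d):
    """The q points of the line through a in direction d."""
    return frozenset(((a[0] + t * d[0]) % q, (a[1] + t * d[1]) % q)
                     for t in range(q))

def is_kakeya(S):
    """Does S contain a full line in every direction?"""
    S = set(S)
    return all(any(line(a, d) <= S for a in points) for d in directions)

# Search subsets in increasing size; the first hit is the minimum.
best = next(len(S) for k in range(1, q * q + 1)
            for S in combinations(points, k) if is_kakeya(S))
print(best)  # → 7
```

Four lines in distinct directions meet pairwise, and in F_3^2 their six pairwise intersections cannot all be distinct, which is why 6 points never suffice.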

Polynomial Method and Kakeya Sets
[Dvir 08]'s analysis: Fit a low-degree polynomial Q to K. (Interpolation: the degree is not too high if K is not large.) Show that the homogeneous part of Q is zero at y if a line in direction y is contained in K. Conclude that the homogeneous part is zero too often!
[Saraf + S.], [Dvir + Kopparty + Saraf + S.]: Fit Q to vanish many times at each point of K. This yields better bounds!

Conclusions
Importance of the model of error.
Virtues of relaxing some notions (e.g., list-decoding vs. unique-decoding).
New algorithmic insights: these can be useful outside the context of list-decoding (e.g., the [Koetter-Vardy] soft-decision decoder).
Central open question: constructive list-decodable binary codes of rate 1 − H(ρ) correcting a ρ-fraction of errors!! The corresponding question for large alphabets was resolved by [Parvaresh-Vardy 05, Guruswami-Rudra 06].
New (?) mathematical insights.
Challenge: apply the existing insights to other practical settings.

Thank You!!

Reed-Solomon Codes
by Bernard Sklar

Introduction
In 1960, Irving Reed and Gus Solomon published a paper in the Journal of the Society for Industrial and Applied Mathematics [1]. This paper described a new class of error-correcting codes that are now called Reed-Solomon (R-S) codes. These codes have great power and utility, and are today found in many applications, from compact disc players to deep-space applications. This article is an attempt to describe the paramount features of R-S codes and the fundamentals of how they work.
Reed-Solomon codes are nonbinary cyclic codes with symbols made up of m-bit sequences, where m is any positive integer having a value greater than 2. R-S (n, k) codes on m-bit symbols exist for all n and k for which

0 < k < n < 2^m + 2 (1)

where k is the number of data symbols being encoded, and n is the total number of code symbols in the encoded block. For the most conventional R-S (n, k) code,

(n, k) = (2^m − 1, 2^m − 1 − 2t) (2)

where t is the symbol-error correcting capability of the code, and n − k = 2t is the number of parity symbols. An extended R-S code can be made up with n = 2^m or n = 2^m + 1, but not any further.
Reed-Solomon codes achieve the largest possible code minimum distance for any linear code with the same encoder input and output block lengths. For nonbinary codes, the distance between two codewords is defined (analogous to Hamming distance) as the number of symbols in which the sequences differ. For Reed-Solomon codes, the code minimum distance is given by [2]

d_min = n − k + 1 (3)

The code is capable of correcting any combination of t or fewer errors, where t can be expressed as [3]

t = ⌊(d_min − 1)/2⌋ = ⌊(n − k)/2⌋ (4)

where ⌊x⌋ means the largest integer not to exceed x. Equation (4) illustrates that for the case of R-S codes, correcting t symbol errors requires no more than 2t parity symbols. Equation (4) lends itself to the following intuitive reasoning. One can say that the decoder has n − k redundant symbols to spend, which is twice the amount of correctable errors. For each error, one redundant symbol is used to locate the error, and another redundant symbol is used to find its correct value. The erasure-correcting capability, ρ, of the code is

ρ = d_min − 1 = n − k (5)

Simultaneous error-correction and erasure-correction capability can be expressed as follows:

2α + γ < d_min = n − k + 1 (6)

where α is the number of symbol-error patterns that can be corrected and γ is the number of symbol-erasure patterns that can be corrected. An advantage of nonbinary codes such as a Reed-Solomon code can be seen by the following comparison. Consider a binary (n, k) = (7, 3) code. The entire n-tuple space contains 2^n = 2^7 = 128 n-tuples, of which 2^k = 2^3 = 8 (or 1/16 of the n-tuples) are codewords. Next, consider a nonbinary (n, k) = (7, 3) code where each symbol is composed of m = 3 bits. The n-tuple space amounts to 2^{nm} = 2^{21} = 2,097,152 n-tuples, of which 2^{km} = 2^9 = 512 (or 1/4096 of the n-tuples) are codewords. When dealing with nonbinary symbols, each made up of m bits, only a small fraction (i.e., 2^{km} of the large number 2^{nm}) of possible n-tuples are codewords. This fraction decreases with increasing values of m. The important point here is that when a small fraction of the n-tuple space is used for codewords, a large d_min can be created. Any linear code is capable of correcting n − k symbol erasure patterns if the n − k erased symbols all happen to lie on the parity symbols.
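Equations (3) through (5) can be tabulated for the codes discussed later in the article; the helper name below is illustrative.

```python
def rs_capability(n, k):
    """Minimum distance, error capability, and erasure capability of R-S (n, k)."""
    d_min = n - k + 1          # Eq. (3)
    t = (d_min - 1) // 2       # Eq. (4): floor((n - k)/2) correctable errors
    rho = d_min - 1            # Eq. (5): n - k correctable erasures
    return d_min, t, rho

print(rs_capability(255, 247))  # → (9, 4, 8): the burst-noise example's code
print(rs_capability(31, 23))    # → (9, 4, 8): the t = 4 code from Figure 2
```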
However, R-S codes have the remarkable property that they are able to correct any set of n − k symbol erasures within the block. R-S codes can be designed to have any redundancy. However, the complexity of a high-speed implementation increases with

redundancy. Thus, the most attractive R-S codes have high code rates (low redundancy).

Reed-Solomon Error Probability
The Reed-Solomon (R-S) codes are particularly useful for burst-error correction; that is, they are effective for channels that have memory. Also, they can be used efficiently on channels where the set of input symbols is large. An interesting feature of the R-S code is that as many as two information symbols can be added to an R-S code of length n without reducing its minimum distance. This extended R-S code has length n + 2 and the same number of parity-check symbols as the original code. The R-S decoded symbol-error probability, P_E, in terms of the channel symbol-error probability, p, can be written as follows [4]:

P_E ≤ (1/(2^m − 1)) · Σ_{j = t+1}^{2^m − 1} j · C(2^m − 1, j) · p^j · (1 − p)^{(2^m − 1) − j} (7)

where t is the symbol-error correcting capability of the code, and the symbols are made up of m bits each. The bit-error probability can be upper-bounded by the symbol-error probability for specific modulation types. For MFSK modulation with M = 2^m, the relationship between P_B and P_E is as follows [3]:

P_B / P_E = 2^{m−1} / (2^m − 1) (8)

Figure 1 shows P_B versus the channel symbol-error probability p, plotted from Equations (7) and (8) for various t-error-correcting, 32-ary orthogonal Reed-Solomon codes with n = 31 (thirty-one 5-bit symbols per code block). Figure 2 shows P_B versus E_b/N_0 for such a coded system using 32-ary MFSK modulation and noncoherent demodulation over an AWGN channel [4]. For R-S codes, error probability is an exponentially decreasing function of block length, n, and decoding complexity is proportional to a small power of the block length [2]. The R-S codes are sometimes used in a concatenated arrangement. In such a system, an inner convolutional decoder first provides some error control by operating on soft-decision demodulator outputs; the convolutional decoder then presents hard-decision data to the outer Reed-Solomon decoder, which further reduces the probability of error.
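A numerical sketch of these error-probability relations follows. The formulas are the standard bounds cited as Equations (7) and (8); the function names and the operating point are illustrative, and the published curves themselves come from reference [4].

```python
from math import comb

def symbol_error_prob(p, m, t):
    """Bound on decoded symbol-error probability P_E for a t-error-correcting
    R-S code with 2^m - 1 symbols per block (channel symbol-error prob. p)."""
    N = 2**m - 1
    return sum(j * comb(N, j) * p**j * (1 - p)**(N - j)
               for j in range(t + 1, N + 1)) / N

def bit_error_prob(P_E, m):
    """P_B from P_E for 2^m-ary orthogonal (MFSK) signaling, per Eq. (8)."""
    return P_E * 2**(m - 1) / (2**m - 1)

# An R-S (31, 23) code (m = 5, t = 4) at channel symbol-error probability 1e-2:
P_E = symbol_error_prob(p=1e-2, m=5, t=4)
print(bit_error_prob(P_E, m=5))
```

Increasing t shrinks the sum's range, so the decoded error probability falls as the correction capability grows, matching the ordering of the curves in Figure 1.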

Figure 1: P_B versus p for 32-ary orthogonal signaling and n = 31, t-error-correcting Reed-Solomon coding [4].

Figure 2: Bit-error probability versus E_b/N_0 performance of several n = 31, t-error-correcting Reed-Solomon coding systems with 32-ary MFSK modulation over an AWGN channel [4].

Why R-S Codes Perform Well Against Burst Noise
Consider an (n, k) = (255, 247) R-S code, where each symbol is made up of m = 8 bits (such symbols are typically referred to as bytes). Since n − k = 8, Equation (4) indicates that this code can correct any four symbol errors in a block of 255. Imagine the presence of a noise burst, lasting for 25 bit durations and disturbing one block of data during transmission, as illustrated in Figure 3.

Figure 3: Data block disturbed by a 25-bit noise burst.

In this example, notice that a burst of noise that lasts for a duration of 25 contiguous bits must disturb exactly four symbols. The R-S decoder for the (255, 247) code will correct any four-symbol errors without regard to the type of damage suffered by the symbols. In other words, when a decoder corrects a byte, it replaces the incorrect byte with the correct one, whether the error was caused by one bit being corrupted or by all eight bits being corrupted. Thus, if a symbol is wrong, it might as well be wrong in all of its bit positions. This gives an R-S code a tremendous burst-noise advantage over binary codes, even allowing for the interleaving of binary codes. In this example, if the 25-bit noise disturbance had occurred in a random fashion rather than as a contiguous burst, it should be clear that many more than four symbols would be affected (as many as 25 symbols might be disturbed). Of course, that would be beyond the capability of the (255, 247) code.

R-S Performance as a Function of Size, Redundancy, and Code Rate
For a code to successfully combat the effects of noise, the noise duration has to represent a relatively small percentage of the codeword. To ensure that this happens most of the time, the received noise should be averaged over a long period of time, reducing the effect of a freak streak of bad luck. Hence, error-correcting

codes become more efficient (error performance improves) as the code block size increases, making R-S codes an attractive choice whenever long block lengths are desired [5]. This is seen in the family of curves in Figure 4, where the rate of the code is held at a constant 7/8, while its block size increases from n = 32 symbols (with m = 5 bits per symbol) to n = 256 symbols (with m = 8 bits per symbol). Thus, the block size increases from 160 bits to 2048 bits.
As the redundancy of an R-S code increases (lower code rate), its implementation grows in complexity (especially for high-speed devices). Also, the bandwidth expansion must grow for any real-time communications application. However, the benefit of increased redundancy, just like the benefit of increased symbol size, is the improvement in bit-error performance, as can be seen in Figure 5, where the code length n is held at a constant 64, while the number of data symbols decreases from k = 60 to k = 4 (the redundancy increases from 4 symbols to 60 symbols). Figure 5 represents transfer functions (output bit-error probability versus input channel symbol-error probability) of hypothetical decoders. Because there is no system or channel in mind (only an output-versus-input of a decoder), you might get the idea that the improved error performance versus increased redundancy is a monotonic function that will continually provide system improvement even as the code rate approaches zero. However, this is not the case for codes operating in a real-time communication system. As the rate of a code varies from minimum to maximum (0 to 1), it is interesting to observe the effects shown in Figure 6. Here, the performance curves are plotted for BPSK modulation and an R-S (31, k) code for various channel types. Figure 6 reflects a real-time communication system, where the price paid for error-correction coding is bandwidth expansion by a factor equal to the inverse of the code rate. The curves plotted show clear optimum code rates that minimize the required E_b/N_0 [6]. The optimum code rate is about 0.6 to 0.7 for a Gaussian channel, 0.5 for a Rician-fading channel (with the ratio of direct to reflected received signal power, K = 7 dB), and 0.3 for a Rayleigh-fading channel. Why is there an E_b/N_0 degradation for very large rates (small redundancy) and very low rates (large redundancy)? It is easy to explain the degradation at high rates compared to the optimum rate. Any code generally provides a coding-gain benefit; thus, as the code rate approaches unity (no coding), the system will suffer worse error performance. The degradation at low code rates is more subtle, because in a real-time communication system using both modulation and coding, there are two mechanisms at work. One mechanism works to improve error performance, and the other works to

degrade it. The improving mechanism is the coding; the greater the redundancy, the greater will be the error-correcting capability of the code. The degrading mechanism is the energy reduction per channel symbol (compared to the data symbol) that stems from the increased redundancy (and the faster signaling in a real-time communication system). The reduced symbol energy causes the demodulator to make more errors. Eventually, the second mechanism wins out, and thus at very low code rates the system experiences error-performance degradation. Let's see if we can corroborate the error performance versus code rate in Figure 6 with the curves in Figure 2. The figures are really not directly comparable, because the modulation is BPSK in Figure 6 and 32-ary MFSK in Figure 2. However, perhaps we can verify that R-S error performance versus code rate exhibits the same general curvature with MFSK modulation as it does with BPSK. In Figure 2, the error performance over an AWGN channel improves as the symbol error-correcting capability, t, increases from t = 1 to t = 4; the t = 1 and t = 4 cases correspond to R-S (31, 29) and R-S (31, 23) with code rates of 0.94 and 0.74, respectively. However, at t = 8, which corresponds to R-S (31, 15) with code rate 0.48, the error performance at P_B = 10^-5 degrades by about 0.5 dB of E_b/N_0 compared to the t = 4 case. From Figure 2, we can conclude that if we were to plot error performance versus code rate, the curve would have the same general shape as it does in Figure 6. Note that this manifestation cannot be gleaned from Figure 1, since that figure represents a decoder transfer function, which provides no information about the channel and the demodulation. Therefore, of the two mechanisms at work in the channel, the Figure 1 transfer function only presents the output-versus-input benefits of the decoder, and displays nothing about the loss of energy as a function of lower code rate.
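Returning to the burst-noise example of Figure 3: the claim that a 25-bit burst disturbs exactly four 8-bit symbols, regardless of where it starts within a symbol, can be checked directly.

```python
def symbols_hit(start_bit, burst_len=25, bits_per_symbol=8):
    """Number of symbols touched by a contiguous burst starting at start_bit."""
    first = start_bit // bits_per_symbol
    last = (start_bit + burst_len - 1) // bits_per_symbol
    return last - first + 1

# Every alignment of the burst within a symbol gives the same count:
assert {symbols_hit(s) for s in range(8)} == {4}
print("a 25-bit burst always disturbs exactly 4 symbols")
```

Three symbols cover only 24 bits, so at least four are hit; and even at the worst offset the 25 bits fit inside four consecutive bytes.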

Figure 4: Reed-Solomon rate-7/8 decoder performance as a function of symbol size.

Figure 5: Reed-Solomon (64, k) decoder performance as a function of redundancy.

Figure 6: BPSK plus Reed-Solomon (31, k) decoder performance as a function of code rate.

Finite Fields
In order to understand the encoding and decoding principles of nonbinary codes, such as Reed-Solomon (R-S) codes, it is necessary to venture into the area of finite fields known as Galois fields (GF). For any prime number, p, there exists a finite field, denoted GF(p), that contains p elements. It is possible to extend GF(p) to a field of p^m elements, called an extension field of GF(p) and denoted by GF(p^m), where m is a nonzero positive integer. Note that GF(p^m) contains as a subset the elements of GF(p). Symbols from the extension field GF(2^m) are used in the construction of Reed-Solomon (R-S) codes. The binary field GF(2) is a subfield of the extension field GF(2^m), in much the same way as the real number field is a subfield of the complex number field.

Besides the numbers 0 and 1, there are additional unique elements in the extension field that will be represented with a new symbol, α. Each nonzero element in GF(2^m) can be represented by a power of α. An infinite set of elements, F, is formed by starting with the elements {0, 1, α} and generating additional elements by progressively multiplying the last entry by α, which yields the following:

F = {0, 1, α, α^2, ..., α^j, ...} = {0, α^0, α^1, α^2, ..., α^j, ...} (9)

To obtain the finite set of elements of GF(2^m) from F, a condition must be imposed on F so that it may contain only 2^m elements and is closed under multiplication. The condition that closes the set of field elements under multiplication is characterized by the irreducible polynomial shown below:

α^(2^m − 1) + 1 = 0

or, equivalently,

α^(2^m − 1) = 1 = α^0 (10)

Using this polynomial constraint, any field element that has a power equal to or greater than 2^m − 1 can be reduced to an element with a power less than 2^m − 1, as follows:

α^(2^m + n) = α^(2^m − 1) · α^(n+1) = α^(n+1) (11)

Thus, Equation (10) can be used to form the finite sequence F* from the infinite sequence F, as follows:

F* = {0, 1, α, α^2, ..., α^(2^m − 2), α^(2^m − 1), α^(2^m), ...}
   = {0, α^0, α^1, α^2, ..., α^(2^m − 2), α^0, α^1, α^2, ...} (12)

Therefore, it can be seen from Equation (12) that the elements of the finite field GF(2^m) are as follows:

GF(2^m) = {0, α^0, α^1, α^2, ..., α^(2^m − 2)} (13)

Addition in the Extension Field GF(2^m)
Each of the 2^m elements of the finite field GF(2^m) can be represented as a distinct polynomial of degree m − 1 or less. The degree of a polynomial is the value of its highest-order exponent. We denote each of the nonzero elements of GF(2^m) as a polynomial, a_i(X), where at least one of the m coefficients of a_i(X) is nonzero. For i = 0, 1, 2, ..., 2^m − 2,

α^i = a_i(X) = a_{i,0} + a_{i,1}·X + a_{i,2}·X^2 + ... + a_{i,m−1}·X^(m−1) (14)

Consider the case of m = 3, where the finite field is denoted GF(2^3). Figure 7 shows the mapping (developed later) of the seven elements {α^i} and the zero element in terms of the basis elements {X^0, X^1, X^2} described by Equation (14). Since Equation (10) indicates that α^0 = α^7, there are seven nonzero elements, or a total of eight elements, in this field. Each row in the Figure 7 mapping comprises a sequence of binary values representing the coefficients a_{i,0}, a_{i,1}, and a_{i,2} in Equation (14). One of the benefits of using extension-field elements {α^i} in place of binary elements is the compact notation that facilitates the mathematical representation of nonbinary encoding and decoding processes. Addition of two elements of the finite field is then defined as the modulo-2 sum of each of the polynomial coefficients of like powers:

α^i + α^j = (a_{i,0} + a_{j,0}) + (a_{i,1} + a_{j,1})·X + ... + (a_{i,m−1} + a_{j,m−1})·X^(m−1) (15)

Figure 7: Mapping field elements in terms of basis elements for GF(8) with f(X) = 1 + X + X^3.
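The Figure 7 mapping can be generated programmatically from the primitive polynomial f(X) = 1 + X + X^3. This is a small sketch; representing a field element as a bitmask of its polynomial coefficients is an implementation choice, not part of the article.

```python
M = 3
PRIM = 0b1011   # coefficient bits of f(X) = 1 + X + X^3 (bit j = coeff. of X^j)

def times_alpha(poly):
    """Multiply a field element (coefficient bitmask) by alpha, reducing by f."""
    poly <<= 1                # multiply by X
    if poly & (1 << M):       # degree reached M: substitute alpha^3 = 1 + alpha
        poly ^= PRIM
    return poly

elem, table = 1, {}
for i in range(2**M - 1):
    table[i] = elem           # alpha^i as coefficients (a0, a1, a2)
    elem = times_alpha(elem)

for i, e in table.items():
    print(f"alpha^{i} -> (a0, a1, a2) =", tuple((e >> j) & 1 for j in range(M)))

# Addition (Eq. (15)) is coefficient-wise XOR: alpha + alpha^2 = X + X^2 = alpha^4.
assert table[1] ^ table[2] == table[4]
# The powers wrap around, as in Equation (10): alpha^7 = alpha^0 = 1.
assert times_alpha(table[6]) == table[0] == 1
```

The printed rows reproduce the Figure 7 table: the seven nonzero elements run through every nonzero 3-bit coefficient pattern exactly once.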

A Primitive Polynomial Is Used to Define the Finite Field

A class of polynomials called primitive polynomials is of interest because such functions define the finite fields GF(2^m) that in turn are needed to define R-S codes. The following condition is necessary and sufficient to guarantee that a polynomial is primitive. An irreducible polynomial f(X) of degree m is said to be primitive if the smallest positive integer n for which f(X) divides X^n + 1 is n = 2^m - 1. Note that the statement A divides B means that A divided into B yields a nonzero quotient and a zero remainder. Polynomials will usually be shown low order to high order. Sometimes, it is convenient to follow the reverse format (for example, when performing polynomial division).

Example 1: Recognizing a Primitive Polynomial

Based on the definition of a primitive polynomial given above, determine whether the following irreducible polynomials are primitive.

a. 1 + X + X^4
b. 1 + X + X^2 + X^3 + X^4

Solution

a. We can verify whether this degree m = 4 polynomial is primitive by determining whether it divides X^n + 1 = X^(2^m - 1) + 1 = X^15 + 1, but does not divide X^n + 1 for values of n in the range of 1 ≤ n < 15. It is easy to verify that 1 + X + X^4 divides X^15 + 1 [3], and after repeated computations it can be verified that 1 + X + X^4 will not divide X^n + 1 for any n in the range of 1 ≤ n < 15. Therefore, 1 + X + X^4 is a primitive polynomial.

b. It is simple to verify that the polynomial 1 + X + X^2 + X^3 + X^4 divides X^15 + 1. Testing to see whether it will divide X^n + 1 for some n that is less than 15 yields the fact that it also divides X^5 + 1. Thus, although 1 + X + X^2 + X^3 + X^4 is irreducible, it is not primitive.

The Extension Field GF(2^3)

Consider an example involving a primitive polynomial and the finite field that it defines. Table 1 contains a listing of some primitive polynomials.
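The divisibility test used in Example 1 can be carried out mechanically. Here is a small sketch (my own helper names, not from the article) that represents GF(2) polynomials as Python integers, bit i holding the coefficient of X^i:

```python
# Test whether an irreducible binary polynomial is primitive: find the smallest
# n for which it divides X^n + 1 and compare against 2^m - 1.

def gf2_mod(a, b):
    """Remainder of a(X) divided by b(X); both are GF(2)[X] polys as ints."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)   # cancel the leading term
    return a

def smallest_n(f):
    """Smallest n >= 1 such that f(X) divides X^n + 1, or None."""
    m = f.bit_length() - 1
    for n in range(1, 2 ** m):
        if gf2_mod((1 << n) | 1, f) == 0:
            return n
    return None

def is_primitive(f):
    return smallest_n(f) == 2 ** (f.bit_length() - 1) - 1
```

Running it on the two polynomials of Example 1 confirms the stated result: 1 + X + X^4 (0b10011) first divides X^n + 1 at n = 15, while 1 + X + X^2 + X^3 + X^4 (0b11111) already divides X^5 + 1.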
We choose the first one shown, f(X) = 1 + X + X^3, which defines a finite field GF(2^m), where the degree of the polynomial is m = 3. Thus, there are 2^m = 2^3 = 8 elements in the field defined by f(X). Solving for the roots of f(X) means that the values of X that

correspond to f(X) = 0 must be found. The familiar binary elements, 1 and 0, do not satisfy (are not roots of) the polynomial f(X) = 1 + X + X^3, since f(1) = 1 and f(0) = 1 (using modulo-2 arithmetic). Yet, a fundamental theorem of algebra states that a polynomial of degree m must have precisely m roots. Therefore for this example, f(X) = 0 must yield three roots. Clearly a dilemma arises, since the three roots do not lie in the same finite field as the coefficients of f(X). Therefore, they must lie somewhere else; the roots lie in the extension field, GF(2^3). Let α, an element of the extension field, be defined as a root of the polynomial f(X). Therefore, it is possible to write the following:

f(α) = 1 + α + α^3 = 0
α^3 = -1 - α   (16)

Since in the binary field +1 = -1, α^3 can be represented as follows:

α^3 = 1 + α   (17)

Thus, α^3 is expressed as a weighted sum of α-terms having lower orders. In fact all powers of α can be so expressed. For example, consider α^4, where we obtain

α^4 = α α^3 = α (1 + α) = α + α^2   (18a)

Now, consider α^5, where

α^5 = α α^4 = α (α + α^2) = α^2 + α^3   (18b)

From Equation (17), we obtain

α^5 = 1 + α + α^2   (18c)

Now, for α^6, using Equation (18c), we obtain

α^6 = α α^5 = α (1 + α + α^2) = α + α^2 + α^3 = 1 + α^2   (18d)

And for α^7, using Equation (18d), we obtain

α^7 = α α^6 = α (1 + α^2) = α + α^3 = 1 = α^0   (18e)

Note that α^7 = α^0, and therefore the eight finite field elements of GF(2^3) are

{0, α^0, α^1, α^2, α^3, α^4, α^5, α^6}   (19)
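The substitution steps of Equations (17) through (18e) can be mechanized. In this sketch (not from the article) each element is a 3-bit integer, bit i holding the coefficient of X^i, so that α = 0b010 and α^3 = 1 + α = 0b011:

```python
# Generate the elements of GF(2^3) exactly as Equations (17)-(18e) do: start
# from alpha^0 = 1 and repeatedly multiply by alpha, substituting X^3 = 1 + X
# whenever the degree reaches 3.

PRIM = 0b1011                 # f(X) = 1 + X + X^3 as a bit mask
powers = [1]                  # alpha^0
for _ in range(7):            # alpha^1 .. alpha^7
    v = powers[-1] << 1       # multiply by alpha (a shift multiplies by X)
    if v & 0b1000:            # an X^3 term appeared: replace it with 1 + X
        v ^= PRIM
    powers.append(v)
# powers[:7] are the seven distinct nonzero elements of GF(8);
# powers[7] == powers[0], confirming alpha^7 = alpha^0 (Equation 18e)
```

The list produced, [1, 2, 4, 3, 6, 7, 5], is exactly the basis-element mapping of Figure 7 read as binary numbers.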

Table 1 Some Primitive Polynomials

m                               m
3   1 + X + X^3                 14  1 + X + X^6 + X^10 + X^14
4   1 + X + X^4                 15  1 + X + X^15
5   1 + X^2 + X^5               16  1 + X + X^3 + X^12 + X^16
6   1 + X + X^6                 17  1 + X^3 + X^17
7   1 + X^3 + X^7               18  1 + X^7 + X^18
8   1 + X^2 + X^3 + X^4 + X^8   19  1 + X + X^2 + X^5 + X^19
9   1 + X^4 + X^9               20  1 + X^3 + X^20
10  1 + X^3 + X^10              21  1 + X^2 + X^21
11  1 + X^2 + X^11              22  1 + X + X^22
12  1 + X + X^4 + X^6 + X^12    23  1 + X^5 + X^23
13  1 + X + X^3 + X^4 + X^13    24  1 + X + X^2 + X^7 + X^24

The mapping of field elements in terms of basis elements, described by Equation (14), can be demonstrated with the linear feedback shift register (LFSR) circuit shown in Figure 8. The circuit generates (with m = 3) the 2^m - 1 nonzero elements of the field, and thus summarizes the findings of Figure 7 and Equations (17) through (19). Note that in Figure 8 the circuit feedback connections correspond to the coefficients of the polynomial f(X) = 1 + X + X^3, just like for binary cyclic codes [3]. By starting the circuit in any nonzero state, say 1 0 0, and performing a right-shift at each clock time, it is possible to verify that each of the field elements shown in Figure 7 (except the all-zeros element) will cyclically appear in the stages of the shift register. Two arithmetic operations, addition and multiplication, can be defined for this GF(2^3) finite field. Addition is shown in Table 2, and multiplication is shown in Table 3 for the nonzero elements only. The rules of addition follow from Equations (17) through (18e), and can be verified by noticing in Figure 7 that the sum of any field elements can be obtained by adding (modulo-2) the respective coefficients of their basis elements. The multiplication rules in Table 3 follow the usual procedure, in which the product of the field elements is obtained by adding their exponents modulo-(2^m - 1), or for this case, modulo-7.
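The Figure 8 circuit can be simulated directly. The sketch below (my own realization; the exact tap placement in the figure may be drawn differently, but the state sequence, multiplication by α, is the same) holds the three stages as coefficients of 1, X, and X^2:

```python
# Simulate a right-shifting LFSR for f(X) = 1 + X + X^3.  One shift multiplies
# the state polynomial by alpha: X*(x0 + x1 X + x2 X^2) = x2 X^3 + x0 X + x1 X^2,
# and X^3 is replaced by 1 + X.

def lfsr_cycle(start, steps):
    x0, x1, x2 = start
    states = [(x0, x1, x2)]
    for _ in range(steps):
        x0, x1, x2 = x2, x0 ^ x2, x1   # feedback from the last stage
        states.append((x0, x1, x2))
    return states

states = lfsr_cycle((1, 0, 0), 7)
# the first 7 states are the 7 distinct nonzero elements; the 8th returns to 1 0 0
```

Starting from 1 0 0, the register visits every nonzero element of Figure 7 exactly once per period, which is also the basis of the primitivity test described below.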

Figure 8 Extension field elements can be represented by the contents of a binary linear feedback shift register (LFSR) formed from a primitive polynomial.

Table 2 Addition Table

      α^0  α^1  α^2  α^3  α^4  α^5  α^6
α^0    0   α^3  α^6  α^1  α^5  α^4  α^2
α^1   α^3   0   α^4  α^0  α^2  α^6  α^5
α^2   α^6  α^4   0   α^5  α^1  α^3  α^0
α^3   α^1  α^0  α^5   0   α^6  α^2  α^4
α^4   α^5  α^2  α^1  α^6   0   α^0  α^3
α^5   α^4  α^6  α^3  α^2  α^0   0   α^1
α^6   α^2  α^5  α^0  α^4  α^3  α^1   0

Table 3 Multiplication Table

      α^0  α^1  α^2  α^3  α^4  α^5  α^6
α^0   α^0  α^1  α^2  α^3  α^4  α^5  α^6
α^1   α^1  α^2  α^3  α^4  α^5  α^6  α^0
α^2   α^2  α^3  α^4  α^5  α^6  α^0  α^1
α^3   α^3  α^4  α^5  α^6  α^0  α^1  α^2
α^4   α^4  α^5  α^6  α^0  α^1  α^2  α^3
α^5   α^5  α^6  α^0  α^1  α^2  α^3  α^4
α^6   α^6  α^0  α^1  α^2  α^3  α^4  α^5

A Simple Test to Determine Whether a Polynomial Is Primitive

There is another way of defining a primitive polynomial that makes its verification relatively easy. For an irreducible polynomial to be a primitive polynomial, at least one of its roots must be a primitive element. A primitive element is one that when raised to higher-order exponents will yield all the nonzero elements in the field. Since the field is a finite field, the number of such elements is finite.

Example 2: A Primitive Polynomial Must Have at Least One Primitive Element

Find the m = 3 roots of f(X) = 1 + X + X^3, and verify that the polynomial is primitive by checking that at least one of the roots is a primitive element. What are the roots? Which ones are primitive?

Solution

The roots will be found by enumeration. Clearly, α^0 = 1 is not a root because f(α^0) = 1. Now, use Table 2 to check whether α^1 is a root. Since

f(α) = 1 + α + α^3 = 1 + α^0 = 0

α is therefore a root. Now check whether α^2 is a root:

f(α^2) = 1 + α^2 + α^6 = 1 + α^0 = 0

Hence, α^2 is a root. Now check whether α^3 is a root.

f(α^3) = 1 + α^3 + α^9 = 1 + α^3 + α^2 = 1 + α^5 = α^4 ≠ 0

Hence, α^3 is not a root. Is α^4 a root?

f(α^4) = 1 + α^4 + α^12 = 1 + α^4 + α^5 = 1 + α^0 = 0

Yes, it is a root. Hence, the roots of f(X) = 1 + X + X^3 are α, α^2, and α^4. It is not difficult to verify that starting with any of these roots and generating higher-order exponents yields all of the seven nonzero elements in the field. Hence, each of the roots is a primitive element. Since our verification requires that at least one root be a primitive element, the polynomial is primitive.

A relatively simple method to verify whether a polynomial is primitive can be described in a manner that is related to this example. For any given polynomial under test, draw the LFSR, with the feedback connections corresponding to the polynomial coefficients as shown by the example of Figure 8. Load into the circuit registers any nonzero setting, and perform a right-shift with each clock pulse. If the circuit generates each of the nonzero field elements within one period, the polynomial that defines this GF(2^m) field is a primitive polynomial.
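The enumeration in Example 2 reduces to one line once the power table of Figure 7 is available. A sketch (helper names are mine), using the bit representation from the earlier listings:

```python
# Check Example 2 numerically: evaluate f(alpha^i) = 1 + alpha^i + alpha^(3i)
# for every exponent i.  Addition is XOR; exponents are reduced mod 7.

EXP = [1, 2, 4, 3, 6, 7, 5]   # EXP[i] = alpha^i in bits (coefficients of 1, X, X^2)

roots = [i for i in range(7) if 1 ^ EXP[i] ^ EXP[(3 * i) % 7] == 0]
# roots == [1, 2, 4]: the roots of f are alpha, alpha^2 and alpha^4
```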

Reed-Solomon Encoding

Equation (2), repeated below as Equation (20), expresses the most conventional form of Reed-Solomon (R-S) codes in terms of the parameters n, k, t, and any positive integer m > 2.

(n, k) = (2^m - 1, 2^m - 1 - 2t)   (20)

where n - k = 2t is the number of parity symbols, and t is the symbol-error correcting capability of the code. The generating polynomial for an R-S code takes the following form:

g(X) = g_0 + g_1 X + g_2 X^2 + ... + g_(2t-1) X^(2t-1) + X^2t   (21)

The degree of the generator polynomial is equal to the number of parity symbols. R-S codes are a subset of the Bose, Chaudhuri, and Hocquenghem (BCH) codes; hence, it should be no surprise that this relationship between the degree of the generator polynomial and the number of parity symbols holds, just as for BCH codes. Since the generator polynomial is of degree 2t, there must be precisely 2t successive powers of α that are roots of the polynomial. We designate the roots of g(X) as α, α^2, ..., α^2t. It is not necessary to start with the root α; starting with any power of α is possible. Consider as an example the (7, 3) double-symbol-error correcting R-S code. We describe the generator polynomial in terms of its 2t = n - k = 4 roots, as follows:

g(X) = (X - α)(X - α^2)(X - α^3)(X - α^4)
     = [X^2 - (α + α^2)X + α^3][X^2 - (α^3 + α^4)X + α^7]
     = (X^2 - α^4 X + α^3)(X^2 - α^6 X + α^0)
     = X^4 - (α^4 + α^6)X^3 + (α^3 + α^10 + α^0)X^2 - (α^9 + α^4)X + α^3
     = X^4 - α^3 X^3 + α^0 X^2 - α^1 X + α^3

Following the low order to high order format, and changing negative signs to positive, since in the binary field +1 = -1, g(X) can be expressed as follows:

g(X) = α^3 + α^1 X + α^0 X^2 + α^3 X^3 + X^4   (22)
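The expansion leading to Equation (22) can be reproduced with a few lines of polynomial arithmetic over GF(8). A sketch (helper names are mine), with polynomials stored as coefficient lists, low order first:

```python
# Recover Equation (22) by expanding
# g(X) = (X + alpha)(X + alpha^2)(X + alpha^3)(X + alpha^4) over GF(8)
# (minus equals plus in a field of characteristic 2).

EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def gmul(a, b):
    """Multiply field elements by adding exponents mod 7 (Table 3's rule)."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gmul(a, b)    # addition in GF(2^m) is XOR
    return out

g = [1]
for i in range(1, 5):
    g = poly_mul(g, [EXP[i], 1])        # multiply in the factor (X + alpha^i)

# g == [3, 2, 1, 3, 1]:
# alpha^3 + alpha^1 X + alpha^0 X^2 + alpha^3 X^3 + X^4, matching Equation (22)
```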

Encoding in Systematic Form

Since R-S codes are cyclic codes, encoding in systematic form is analogous to the binary encoding procedure [3]. We can think of shifting a message polynomial, m(X), into the rightmost k stages of a codeword register and then appending a parity polynomial, p(X), by placing it in the leftmost n - k stages. Therefore we multiply m(X) by X^(n-k), thereby manipulating the message polynomial algebraically so that it is right-shifted n - k positions. Next, we divide X^(n-k) m(X) by the generator polynomial g(X), which is written in the following form:

X^(n-k) m(X) = q(X) g(X) + p(X)   (23)

where q(X) and p(X) are quotient and remainder polynomials, respectively. As in the binary case, the remainder is the parity. Equation (23) can also be expressed as follows:

p(X) = X^(n-k) m(X) modulo g(X)   (24)

The resulting codeword polynomial, U(X), can be written as

U(X) = p(X) + X^(n-k) m(X)   (25)

We demonstrate the steps implied by Equations (24) and (25) by encoding the following three-symbol message:

α^1 α^3 α^5

with the (7, 3) R-S code whose generator polynomial is given in Equation (22). We first multiply (upshift) the message polynomial α^1 + α^3 X + α^5 X^2 by X^(n-k) = X^4, yielding α^1 X^4 + α^3 X^5 + α^5 X^6. We next divide this upshifted message polynomial by the generator polynomial in Equation (22), α^3 + α^1 X + α^0 X^2 + α^3 X^3 + X^4. Polynomial division with nonbinary coefficients is more tedious than its binary counterpart, because the required operations of addition (subtraction) and multiplication (division) must follow the rules in Tables 2 and 3, respectively. It is left as an exercise for the reader to verify that this polynomial division results in the following remainder (parity) polynomial.

p(X) = α^0 + α^2 X + α^4 X^2 + α^6 X^3

Then, from Equation (25), the codeword polynomial can be written as follows:

U(X) = α^0 + α^2 X + α^4 X^2 + α^6 X^3 + α^1 X^4 + α^3 X^5 + α^5 X^6
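The polynomial division left as an exercise can be checked mechanically. A sketch (division routine and names are mine) that computes p(X) = X^4 m(X) mod g(X) as in Equation (24):

```python
# Systematic encoding of the message alpha^1 + alpha^3 X + alpha^5 X^2 with the
# (7, 3) code of Equation (22).  Coefficient lists are low order first.

EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def gmul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def poly_mod(a, b):
    """Remainder of a(X) / b(X) over GF(8)."""
    a = a[:]
    while len(a) >= len(b) and any(a):
        if a[-1] == 0:
            a.pop()
            continue
        shift = len(a) - len(b)
        scale = EXP[(LOG[a[-1]] - LOG[b[-1]]) % 7]   # leading-term quotient
        for i, c in enumerate(b):
            a[shift + i] ^= gmul(scale, c)
        a.pop()                                      # leading term is now zero
    return a + [0] * (len(b) - 1 - len(a))

g = [3, 2, 1, 3, 1]                # Equation (22)
m = [2, 3, 7]                      # alpha^1, alpha^3, alpha^5
p = poly_mod([0, 0, 0, 0] + m, g)  # X^4 m(X) mod g(X)
# p == [1, 4, 6, 5]: alpha^0 + alpha^2 X + alpha^4 X^2 + alpha^6 X^3
```

The remainder agrees with the parity polynomial stated in the text.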

Systematic Encoding with an (n - k)-Stage Shift Register

Using circuitry to encode a three-symbol sequence in systematic form with the (7, 3) R-S code described by g(X) in Equation (22) requires the implementation of a linear feedback shift register (LFSR) circuit, as shown in Figure 9. It can easily be verified that the multiplier terms in Figure 9, taken from left to right, correspond to the coefficients of the polynomial in Equation (22) (low order to high order). This encoding process is the nonbinary equivalent of cyclic encoding [3]. Here, corresponding to Equation (20), the (7, 3) R-S nonzero codewords are made up of 2^m - 1 = 7 symbols, and each symbol is made up of m = 3 bits.

Figure 9 LFSR encoder for a (7, 3) R-S code.

Here the example is nonbinary, so that each stage in the shift register of Figure 9 holds a 3-bit symbol. In the case of binary codes, the coefficients labeled g_1, g_2, and so on are binary. Therefore, they take on values of 1 or 0, simply dictating the presence or absence of a connection in the LFSR. However in Figure 9, since each coefficient is specified by 3 bits, it can take on one of eight values. The nonbinary operation implemented by the encoder of Figure 9, forming codewords in a systematic format, proceeds in the same way as the binary one. The steps can be described as follows:

1. Switch 1 is closed during the first k clock cycles to allow shifting the message symbols into the (n - k)-stage shift register.

2. Switch 2 is in the down position during the first k clock cycles in order to allow simultaneous transfer of the message symbols directly to an output register (not shown in Figure 9).

3. After transfer of the kth message symbol to the output register, switch 1 is opened and switch 2 is moved to the up position.

4. The remaining (n - k) clock cycles clear the parity symbols contained in the shift register by moving them to the output register.

5. The total number of clock cycles is equal to n, and the contents of the output register is the codeword polynomial p(X) + X^(n-k) m(X), where p(X) represents the parity symbols and m(X) the message symbols in polynomial form.

We use the same symbol sequence that was chosen as a test message earlier:

α^1 α^3 α^5

where the rightmost symbol is the earliest symbol, and the rightmost bit is the earliest bit. The operational steps during the first k = 3 shifts of the encoding circuit of Figure 9 are as follows:

INPUT QUEUE    CLOCK CYCLE    REGISTER CONTENTS        FEEDBACK
α^1 α^3 α^5         0          0    0    0    0          α^5
α^1 α^3             1          α^1  α^6  α^5  α^1        α^0
α^1                 2          α^3  0    α^2  α^2        α^4
-                   3          α^0  α^2  α^4  α^6        -

After the third clock cycle, the register contents are the four parity symbols, α^0, α^2, α^4, and α^6, as shown. Then, switch 1 of the circuit is opened, switch 2 is toggled to the up position, and the parity symbols contained in the register are shifted to the output. Therefore the output codeword, U(X), written in polynomial form, can be expressed as follows:

U(X) = Σ u_n X^n,  n = 0, ..., 6

U(X) = α^0 + α^2 X + α^4 X^2 + α^6 X^3 + α^1 X^4 + α^3 X^5 + α^5 X^6
     = (100) + (001)X + (011)X^2 + (101)X^3 + (010)X^4 + (110)X^5 + (111)X^6   (26)
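The clock-by-clock table can be reproduced in software. This sketch (my own rendering of the standard (n - k)-stage systematic encoder: feedback is the input plus the rightmost stage, scaled into each stage by g_0..g_3) regenerates the register contents above:

```python
# Simulate the Figure 9 encoder for the (7, 3) R-S code.  Symbols are GF(8)
# ints; the message enters highest-order symbol first: alpha^5, alpha^3, alpha^1.

EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def gmul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

g = [3, 2, 1, 3]                 # multiplier taps g0..g3 = alpha^3, alpha^1, alpha^0, alpha^3
reg = [0, 0, 0, 0]
history = []
for sym in [7, 3, 2]:            # alpha^5, alpha^3, alpha^1 (earliest first)
    fb = sym ^ reg[3]            # feedback = input symbol + rightmost stage
    reg = [gmul(fb, g[0]),
           reg[0] ^ gmul(fb, g[1]),
           reg[1] ^ gmul(fb, g[2]),
           reg[2] ^ gmul(fb, g[3])]
    history.append(reg)
# after 3 clocks the register holds the parity alpha^0, alpha^2, alpha^4, alpha^6
```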

The process of verifying the contents of the register at various clock cycles is somewhat more tedious than in the binary case. Here, the field elements must be added and multiplied by using Table 2 and Table 3, respectively. The roots of a generator polynomial, g(X), must also be the roots of the codeword generated by g(X), because a valid codeword is of the following form:

U(X) = m(X) g(X)   (27)

Therefore, an arbitrary codeword, when evaluated at any root of g(X), must yield zero. It is of interest to verify that the codeword polynomial in Equation (26) does indeed yield zero when evaluated at the four roots of g(X). In other words, this means checking that

U(α) = U(α^2) = U(α^3) = U(α^4) = 0

Evaluating each term independently yields the following:

U(α) = α^0 + α^3 + α^6 + α^9 + α^5 + α^8 + α^11
     = α^0 + α^3 + α^6 + α^2 + α^5 + α^1 + α^4
     = α^1 + α^0 + α^6 + α^4
     = α^3 + α^3 = 0

U(α^2) = α^0 + α^4 + α^8 + α^12 + α^9 + α^13 + α^17
       = α^0 + α^4 + α^1 + α^5 + α^2 + α^6 + α^3
       = α^5 + α^6 + α^0 + α^3
       = α^1 + α^1 = 0

U(α^3) = α^0 + α^5 + α^10 + α^15 + α^13 + α^18 + α^23
       = α^0 + α^5 + α^3 + α^1 + α^6 + α^4 + α^2
       = α^4 + α^0 + α^3 + α^2
       = α^5 + α^5 = 0

U(α^4) = α^0 + α^6 + α^12 + α^18 + α^17 + α^23 + α^29
       = α^0 + α^6 + α^5 + α^4 + α^3 + α^2 + α^1
       = α^2 + α^0 + α^5 + α^1
       = α^6 + α^6 = 0

This demonstrates the expected result that a codeword evaluated at any root of g(X) must yield zero.

Reed-Solomon Decoding

Earlier, a test message encoded in systematic form using a (7, 3) R-S code resulted in a codeword polynomial described by Equation (26). Now, assume that during transmission this codeword becomes corrupted so that two symbols are received in error. (This number of errors corresponds to the maximum error-correcting capability of the code.) For this seven-symbol codeword example, the error pattern, e(X), can be described in polynomial form as follows:

e(X) = Σ e_n X^n,  n = 0, ..., 6   (28)

For this example, let the double-symbol error be such that

e(X) = 0 + 0X + 0X^2 + α^2 X^3 + α^5 X^4 + 0X^5 + 0X^6
     = (000) + (000)X + (000)X^2 + (001)X^3 + (111)X^4 + (000)X^5 + (000)X^6   (29)

In other words, one parity symbol has been corrupted with a 1-bit error (seen as α^2), and one data symbol has been corrupted with a 3-bit error (seen as α^5). The received corrupted-codeword polynomial, r(X), is then represented by the sum of the transmitted-codeword polynomial and the error-pattern polynomial as follows:

r(X) = U(X) + e(X)   (30)

Following Equation (30), we add U(X) from Equation (26) to e(X) from Equation (29) to yield r(X), as follows:

r(X) = (100) + (001)X + (011)X^2 + (100)X^3 + (101)X^4 + (110)X^5 + (111)X^6
     = α^0 + α^2 X + α^4 X^2 + α^0 X^3 + α^6 X^4 + α^3 X^5 + α^5 X^6   (31)

In this example, there are four unknowns: two error locations and two error values. Notice an important difference between the nonbinary decoding of r(X) that we are faced with in Equation (31) and binary decoding; in binary decoding, the decoder only needs to find the error locations [3]. Knowledge that there is an error at a particular location dictates that the bit must be flipped from 1 to 0 or vice versa. But here, the nonbinary symbols require that we not only learn the error locations, but also determine the correct symbol values at those locations. Since there are four unknowns in this example, four equations are required for their solution.

Syndrome Computation

The syndrome is the result of a parity check performed on r to determine whether r is a valid member of the codeword set [3]. If in fact r is a member, the syndrome S has value 0. Any nonzero value of S indicates the presence of errors. Similar to the binary case, the syndrome S is made up of n - k symbols, {S_i} (i = 1, ..., n - k). Thus, for this (7, 3) R-S code, there are four symbols in every syndrome vector; their values can be computed from the received polynomial, r(X). Note how the computation is facilitated by the structure of the code, given by Equation (27) and rewritten below:

U(X) = m(X) g(X)

From this structure it can be seen that every valid codeword polynomial U(X) is a multiple of the generator polynomial g(X). Therefore, the roots of g(X) must also be the roots of U(X). Since r(X) = U(X) + e(X), then r(X) evaluated at each of the roots of g(X) should yield zero only when it is a valid codeword. Any errors will result in one or more of the computations yielding a nonzero result. The computation of a syndrome symbol can be described as follows:

S_i = r(X) evaluated at X = α^i = r(α^i),  i = 1, ..., n - k   (32)

where r(X) contains the postulated two-symbol errors as shown in Equation (29). If r(X) were a valid codeword, it would cause each syndrome symbol S_i to equal 0. For this example, the four syndrome symbols are found as follows:

S_1 = r(α) = α^0 + α^3 + α^6 + α^3 + α^10 + α^8 + α^11
    = α^0 + α^3 + α^6 + α^3 + α^3 + α^1 + α^4 = α^3   (33)

S_2 = r(α^2) = α^0 + α^4 + α^8 + α^6 + α^14 + α^13 + α^17
    = α^0 + α^4 + α^1 + α^6 + α^0 + α^6 + α^3 = α^5   (34)

S_3 = r(α^3) = α^0 + α^5 + α^10 + α^9 + α^18 + α^18 + α^23
    = α^0 + α^5 + α^3 + α^2 + α^4 + α^4 + α^2 = α^6   (35)

S_4 = r(α^4) = α^0 + α^6 + α^12 + α^12 + α^22 + α^23 + α^29
    = α^0 + α^6 + α^5 + α^5 + α^1 + α^2 + α^1 = 0   (36)
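The four evaluations of Equations (33) through (36) are a single loop in software. A sketch (helper names are mine):

```python
# Syndrome computation for the received word of Equation (31):
# S_i = r(alpha^i) for i = 1..4.  Any nonzero S_i flags an error.

EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def poly_eval(coeffs, i):
    """Evaluate a GF(8) polynomial (low order first) at X = alpha^i."""
    acc = 0
    for n, c in enumerate(coeffs):
        if c:
            acc ^= EXP[(LOG[c] + n * i) % 7]
    return acc

r = [1, 4, 6, 1, 5, 3, 7]   # alpha^0, alpha^2, alpha^4, alpha^0, alpha^6, alpha^3, alpha^5
S = [poly_eval(r, i) for i in range(1, 5)]
# S == [3, 7, 5, 0], i.e. alpha^3, alpha^5, alpha^6, 0 -- Equations (33)-(36)
```

Evaluating the uncorrupted codeword of Equation (26) the same way gives four zeros, as the structure of Equation (27) requires.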

The results confirm that the received codeword contains an error (which we inserted), since S ≠ 0.

Example 3: A Secondary Check on the Syndrome Values

For the (7, 3) R-S code example under consideration, the error pattern is known, since it was chosen earlier. An important property of codes when describing the standard array is that each element of a coset (row) in the standard array has the same syndrome [3]. Show that this property is also true for the R-S code by evaluating the error polynomial e(X) at the roots of g(X) to demonstrate that it must yield the same syndrome values as when r(X) is evaluated at the roots of g(X). In other words, it must yield the same values obtained in Equations (33) through (36).

Solution

S_i = r(X) evaluated at X = α^i = r(α^i),  i = 1, 2, ..., n - k

S_i = [U(X) + e(X)] evaluated at X = α^i = U(α^i) + e(α^i)

S_i = r(α^i) = U(α^i) + e(α^i) = 0 + e(α^i)

From Equation (29),

e(X) = α^2 X^3 + α^5 X^4

Therefore,

S_1 = e(α^1) = α^5 + α^9 = α^5 + α^2 = α^3

S_2 = e(α^2) = α^8 + α^13 = α^1 + α^6 = α^5

S_3 = e(α^3) = α^11 + α^17 = α^4 + α^3 = α^6

S_4 = e(α^4) = α^14 + α^21 = α^0 + α^0 = 0

These results confirm that the syndrome values are the same, whether obtained by evaluating e(X) at the roots of g(X), or r(X) at the roots of g(X).

Error Location

Suppose there are ν errors in the codeword at locations X^j1, X^j2, ..., X^jν. Then, the error polynomial e(X) shown in Equations (28) and (29) can be written as follows:

e(X) = e_j1 X^j1 + e_j2 X^j2 + ... + e_jν X^jν   (37)

The indices 1, 2, ..., ν refer to the first, second, ..., νth errors, and the index j refers to the error location. To correct the corrupted codeword, each error value e_jl and its location X^jl, where l = 1, 2, ..., ν, must be determined. We define an error locator number as β_l = α^jl. Next, we obtain the n - k = 2t syndrome symbols by substituting α^i into the received polynomial for i = 1, 2, ..., 2t:

S_1 = r(α) = e_j1 β_1 + e_j2 β_2 + ... + e_jν β_ν

S_2 = r(α^2) = e_j1 β_1^2 + e_j2 β_2^2 + ... + e_jν β_ν^2   (38)
...
S_2t = r(α^2t) = e_j1 β_1^2t + e_j2 β_2^2t + ... + e_jν β_ν^2t

There are 2t unknowns (t error values and t locations), and 2t simultaneous equations. However, these 2t simultaneous equations cannot be solved in the usual way because they are nonlinear (as some of the unknowns have exponents). Any technique that solves this system of equations is known as a Reed-Solomon decoding algorithm. Once a nonzero syndrome vector (one or more of its symbols are nonzero) has been computed, that signifies that an error has been received. Next, it is necessary to learn the location of the error or errors. An error-locator polynomial, σ(X), can be defined as follows:

σ(X) = (1 + β_1 X)(1 + β_2 X) ... (1 + β_ν X)
     = 1 + σ_1 X + σ_2 X^2 + ... + σ_ν X^ν   (39)

The roots of σ(X) are 1/β_1, 1/β_2, ..., 1/β_ν. The reciprocals of the roots of σ(X) are the error-location numbers of the error pattern e(X). Then, using autoregressive modeling techniques [7], we form a matrix from the syndromes, where the first t syndromes are used to predict the next syndrome. That is,

| S_1      S_2      S_3      ...  S_(t-1)   S_t      |  | σ_t     |     | S_(t+1)  |
| S_2      S_3      S_4      ...  S_t       S_(t+1)  |  | σ_(t-1) |     | S_(t+2)  |
|  .         .        .             .         .      |  |   .     |  =  |    .     |   (40)
| S_(t-1)  S_t      S_(t+1)  ...  S_(2t-3)  S_(2t-2) |  | σ_2     |     | S_(2t-1) |
| S_t      S_(t+1)  S_(t+2)  ...  S_(2t-2)  S_(2t-1) |  | σ_1     |     | S_(2t)   |
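For the (7, 3) code the model of Equation (40) is a 2 x 2 system, which can be solved directly with the syndrome values just computed. A sketch (Cramer-style solve; helper names are mine):

```python
# Solve [S1 S2; S2 S3][sigma2; sigma1] = [S3; S4] over GF(8) for the
# error-locator coefficients.

EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def gmul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

def ginv(a):
    return EXP[(-LOG[a]) % 7]

S1, S2, S3, S4 = 3, 7, 5, 0                     # alpha^3, alpha^5, alpha^6, 0
det = gmul(S1, S3) ^ gmul(S2, S2)               # determinant = alpha^5
d = ginv(det)
sigma2 = gmul(d, gmul(S3, S3) ^ gmul(S2, S4))   # Cramer's rule, column 1
sigma1 = gmul(d, gmul(S1, S4) ^ gmul(S2, S3))   # Cramer's rule, column 2
sigma = [1, sigma1, sigma2]
# sigma1 == 5 (alpha^6), sigma2 == 1 (alpha^0):
# sigma(X) = 1 + alpha^6 X + X^2
```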

We apply the autoregressive model of Equation (40) by using the largest dimensioned matrix that has a nonzero determinant. For the (7, 3) double-symbol-error correcting R-S code, the matrix size is 2 x 2, and the model is written as follows:

| S_1  S_2 |  | σ_2 |     | S_3 |
| S_2  S_3 |  | σ_1 |  =  | S_4 |   (41)

| α^3  α^5 |  | σ_2 |     | α^6 |
| α^5  α^6 |  | σ_1 |  =  |  0  |   (42)

To solve for the coefficients σ_1 and σ_2 of the error-locator polynomial, σ(X), we first take the inverse of the matrix in Equation (42). The inverse of a matrix [A] is found as follows:

Inv [A] = cofactor [A] / det [A]

Therefore,

det | α^3  α^5 | = α^3 α^6 + α^5 α^5 = α^9 + α^10 = α^2 + α^3 = α^5   (43)
    | α^5  α^6 |

cofactor | α^3  α^5 |  =  | α^6  α^5 |   (44)
         | α^5  α^6 |     | α^5  α^3 |

Inv | α^3  α^5 |  =  | α^6  α^5 | / α^5  =  α^(-5) | α^6  α^5 |  =  α^2 | α^6  α^5 |
    | α^5  α^6 |     | α^5  α^3 |                  | α^5  α^3 |        | α^5  α^3 |

              =  | α^8  α^7 |  =  | α^1  α^0 |   (45)
                 | α^7  α^5 |     | α^0  α^5 |

Safety Check

If the inversion was performed correctly, the multiplication of the original matrix by the inverted matrix should yield an identity matrix.

| α^3  α^5 |  | α^1  α^0 |     | α^4 + α^5   α^3 + α^10 |     | 1  0 |
| α^5  α^6 |  | α^0  α^5 |  =  | α^6 + α^6   α^5 + α^11 |  =  | 0  1 |   (46)

Continuing from Equation (42), we begin our search for the error locations by solving for the coefficients of the error-locator polynomial, σ(X).

| σ_2 |     | α^1  α^0 |  | α^6 |     | α^7 |     | α^0 |
| σ_1 |  =  | α^0  α^5 |  |  0  |  =  | α^6 |  =  | α^6 |   (47)

From Equations (39) and (47), we represent σ(X) as shown below.

σ(X) = α^0 + σ_1 X + σ_2 X^2 = α^0 + α^6 X + α^0 X^2   (48)

The roots of σ(X) are the reciprocals of the error locations. Once these roots are located, the error locations will be known. In general, the roots of σ(X) may be one or more of the elements of the field. We determine these roots by exhaustive

testing of the σ(X) polynomial with each of the field elements, as shown below. Any element X that yields σ(X) = 0 is a root, and allows us to locate an error.

σ(α^0) = α^0 + α^6 + α^0 = α^6 ≠ 0
σ(α^1) = α^0 + α^7 + α^2 = α^2 ≠ 0
σ(α^2) = α^0 + α^8 + α^4 = α^6 ≠ 0
σ(α^3) = α^0 + α^9 + α^6 = 0   => ERROR
σ(α^4) = α^0 + α^10 + α^8 = 0  => ERROR
σ(α^5) = α^0 + α^11 + α^10 = α^2 ≠ 0
σ(α^6) = α^0 + α^12 + α^12 = α^0 ≠ 0

As seen in Equation (39), the error locations are at the inverses of the roots of the polynomial. Therefore σ(α^3) = 0 indicates that one root exists at 1/β_l = α^3. Thus, β_l = 1/α^3 = α^4. Similarly, σ(α^4) = 0 indicates that another root exists at 1/β_l' = α^4. Thus, β_l' = 1/α^4 = α^3, where l and l' refer to the first, second, ..., νth error. Therefore, in this example, there are two symbol errors, so that the error polynomial is of the following form:

e(X) = e_j1 X^j1 + e_j2 X^j2   (49)

The two errors were found at locations α^3 and α^4. Note that the indexing of the error-location numbers is completely arbitrary. Thus, for this example, we can designate the β_l = α^jl values as β_1 = α^j1 = α^3 and β_2 = α^j2 = α^4.

Error Values

An error had been denoted e_jl, where the index j refers to the error location and the index l identifies the lth error. Since each error value is coupled to a particular location, the notation can be simplified by denoting e_jl simply as e_l. Preparing to determine the error values e_1 and e_2 associated with locations β_1 = α^3 and β_2 = α^4,

any of the four syndrome equations can be used. From Equation (38), let's use S_1 and S_2.

S_1 = r(α) = e_1 β_1 + e_2 β_2
S_2 = r(α^2) = e_1 β_1^2 + e_2 β_2^2   (50)

We can write these equations in matrix form as follows:

| β_1    β_2   |  | e_1 |     | S_1 |
| β_1^2  β_2^2 |  | e_2 |  =  | S_2 |   (51)

| α^3  α^4 |  | e_1 |     | α^3 |
| α^6  α^8 |  | e_2 |  =  | α^5 |   (52)

To solve for the error values e_1 and e_2, the matrix in Equation (52) is inverted in the usual way, yielding

Inv | α^3  α^4 |  =  | α^8  α^4 | / (α^3 α^8 + α^4 α^6)  =  | α^1  α^4 | / (α^4 + α^3)
    | α^6  α^8 |     | α^6  α^3 |                           | α^6  α^3 |

              =  α^(-6) | α^1  α^4 |  =  α^1 | α^1  α^4 |  =  | α^2  α^5 |   (53)
                        | α^6  α^3 |         | α^6  α^3 |     | α^0  α^4 |

Now, we solve Equation (52) for the error values, as follows:

| e_1 |     | α^2  α^5 |  | α^3 |     | α^5 + α^10 |     | α^5 + α^3 |     | α^2 |
| e_2 |  =  | α^0  α^4 |  | α^5 |  =  | α^3 + α^9  |  =  | α^3 + α^2 |  =  | α^5 |   (54)
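The whole tail of the decoder, root search, error values, and correction, fits in a short script. A sketch (helper names are mine) that reproduces Equations (48) through (54) and then repairs r(X):

```python
# Locate the roots of sigma(X) by trying every field element (a Chien-style
# exhaustive search), invert them to get the error locations, solve
# Equation (52) for the error values, and correct the received word.

EXP = [1, 2, 4, 3, 6, 7, 5]
LOG = {v: i for i, v in enumerate(EXP)}

def gmul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 7]

sigma = [1, 5, 1]                        # 1 + alpha^6 X + X^2, Equation (48)
root_exps = [i for i in range(7)
             if sigma[0] ^ gmul(sigma[1], EXP[i])
                ^ gmul(sigma[2], gmul(EXP[i], EXP[i])) == 0]
# root_exps == [3, 4]; the error locations are the inverses: j = -i mod 7
j1, j2 = sorted((-i) % 7 for i in root_exps)    # j1 = 3, j2 = 4

b1, b2 = EXP[j1], EXP[j2]                # beta1 = alpha^3, beta2 = alpha^4
S1, S2 = 3, 7                            # alpha^3, alpha^5
det = gmul(b1, gmul(b2, b2)) ^ gmul(b2, gmul(b1, b1))
dinv = EXP[(-LOG[det]) % 7]
e1 = gmul(dinv, gmul(gmul(b2, b2), S1) ^ gmul(b2, S2))   # alpha^2
e2 = gmul(dinv, gmul(b1, S2) ^ gmul(gmul(b1, b1), S1))   # alpha^5

r = [1, 4, 6, 1, 5, 3, 7]
U_hat = r[:]
U_hat[j1] ^= e1                          # correct position 3
U_hat[j2] ^= e2                          # correct position 4
# U_hat == [1, 4, 6, 5, 2, 3, 7], the transmitted codeword of Equation (26)
```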

Correcting the Received Polynomial with Estimates of the Error Polynomial

From Equations (49) and (54), the estimated error polynomial is formed, to yield the following:

ê(X) = e_1 X^j1 + e_2 X^j2 = α^2 X^3 + α^5 X^4   (55)

The demonstrated algorithm repairs the received polynomial, yielding an estimate of the transmitted codeword, and ultimately delivers a decoded message. That is,

Û(X) = r(X) + ê(X) = U(X) + e(X) + ê(X)   (56)

r(X) = (100) + (001)X + (011)X^2 + (100)X^3 + (101)X^4 + (110)X^5 + (111)X^6
ê(X) = (000) + (000)X + (000)X^2 + (001)X^3 + (111)X^4 + (000)X^5 + (000)X^6
Û(X) = (100) + (001)X + (011)X^2 + (101)X^3 + (010)X^4 + (110)X^5 + (111)X^6
     = α^0 + α^2 X + α^4 X^2 + α^6 X^3 + α^1 X^4 + α^3 X^5 + α^5 X^6   (57)

Since the message symbols constitute the rightmost k = 3 symbols, the decoded message is

α^1 α^3 α^5

which is exactly the test message that was chosen earlier for this example. For further reading on R-S coding, see the collection of papers in reference [8].

Conclusion

In this article, we examined Reed-Solomon (R-S) codes, a powerful class of nonbinary block codes, particularly useful for correcting burst errors. Because coding efficiency increases with code length, R-S codes have a special attraction. They can be configured with long block lengths (in bits) with less decoding time

than other codes of similar lengths. This is because the decoder logic works with symbol-based rather than bit-based arithmetic. Hence, for 8-bit symbols, the arithmetic operations would all be at the byte level. This increases the complexity of the logic, compared with binary codes of the same length, but it also increases the throughput.

References

[1] Reed, I. S. and Solomon, G., "Polynomial Codes Over Certain Finite Fields," SIAM Journal of Applied Math., vol. 8, 1960, pp.
[2] Gallager, R. G., Information Theory and Reliable Communication (New York: John Wiley and Sons, 1968).
[3] Sklar, B., Digital Communications: Fundamentals and Applications, Second Edition (Upper Saddle River, NJ: Prentice-Hall, 2001).
[4] Odenwalder, J. P., Error Control Coding Handbook, Linkabit Corporation, San Diego, CA, July 15,
[5] Berlekamp, E. R., Peile, R. E., and Pope, S. P., "The Application of Error Control to Communications," IEEE Communications Magazine, vol. 25, no. 4, April 1987, pp.
[6] Hagenauer, J., and Lutz, E., "Forward Error Correction Coding for Fading Compensation in Mobile Satellite Channels," IEEE JSAC, vol. SAC-5, no. 2, February 1987, pp.
[7] Blahut, R. E., Theory and Practice of Error Control Codes (Reading, MA: Addison-Wesley, 1983).
[8] Wicker, S. B. and Bhargava, V. K., ed., Reed-Solomon Codes and Their Applications (Piscataway, NJ: IEEE Press, 1983).

About the Author

Bernard Sklar is the author of Digital Communications: Fundamentals and Applications, Second Edition (Prentice-Hall, 2001, ISBN ).

Reed-Solomon codes

In these notes we examine Reed-Solomon codes, from a computer science point of view. Reed-Solomon codes can be used as both error-correcting and erasure codes. In the error-correcting setting, we wish to transmit a sequence of numbers over a noisy communication channel. The channel noise might cause the data sent to arrive corrupted. In the erasure setting, the channel might fail to send our message. For both cases, we handle the problem of noise by sending additional information beyond the original message. The data sent is an encoding of the original message. If the noise is small enough, the additional information will allow the original message to be recovered, through a decoding process.

Encoding

Let us suppose that we wish to transmit a sequence of numbers b_0, b_1, ..., b_{d-1}. To simplify things, we will assume that these numbers are in GF(p), i.e., our arithmetic is all done modulo p. In practice, we want to reduce everything to bits, bytes, and words, so we will later discuss how to compute over fields more conducive to this setting, namely fields of the form GF(2^r). Our encoding will be a longer sequence of numbers e_0, e_1, ..., e_{n-1}, where we require that p > n. We derive the e sequence from our original b sequence by using the b sequence to define a polynomial P, which we evaluate at n points. There are several ways to do this; here are two straightforward ones:

Let P(x) = b_0 + b_1 x + b_2 x^2 + ... + b_{d-1} x^{d-1}. This representation is convenient since it requires no computation to define the polynomial. Our encoding would consist of the values P(0), P(1), ..., P(n-1). (Actually, our encoding could be the evaluation of P at any set of n points; we choose this set for convenience. Notice that we need p > n, or we can't choose n distinct points!)

Let P(x) = c_0 + c_1 x + c_2 x^2 + ... + c_{d-1} x^{d-1} be such that P(0) = b_0, P(1) = b_1, ..., P(d-1) = b_{d-1}. Our encoding e_0, e_1, ..., e_{n-1} would consist of the values P(0), P(1), ..., P(n-1).
Although this representation requires computing the appropriate polynomial P, it has the advantage that our original message is actually sent as part of the encoding. A code with this property is called a systematic code. Systematic codes can be useful; for example, if there happen to be no erasures or errors, we might immediately be able to get our message on the other end. The polynomial P(x) can be found by using a technique known as Lagrange interpolation. We know that there is a unique polynomial of degree d-1 that passes through d points. (For example, two points uniquely determine a line.) Given d points (a_0, b_0), ..., (a_{d-1}, b_{d-1}), it is easy to check that

P(x) = Σ_{j=0}^{d-1} b_j Π_{k ≠ j} (x - a_k) / (a_j - a_k)

is a polynomial of degree d-1 that passes through the points. (To check this, note that Π_{k ≠ j} (x - a_k) / (a_j - a_k) is 1 when x = a_j, and 0 when x = a_k for k ≠ j.)

Note that in either case, the encoded message is just a set of values obtained from a polynomial. The important point is not which actual polynomial we use (as long as the sender and receiver agree!), but just that we use a polynomial that is uniquely determined by the data values.
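The Lagrange formula above translates directly into code. A sketch over GF(p), with p = 929 as an arbitrary example prime (any prime larger than n works) and a made-up 3-symbol message:

```python
# Lagrange interpolation over GF(p): evaluate at x the unique polynomial of
# degree < d through d given points.  Used both to build a systematic encoding
# and to decode from erasures.

P_MOD = 929   # an arbitrary prime > n for this illustration

def interpolate(points, x):
    total = 0
    for j, (aj, bj) in enumerate(points):
        num, den = 1, 1
        for k, (ak, _) in enumerate(points):
            if k != j:
                num = num * (x - ak) % P_MOD
                den = den * (aj - ak) % P_MOD
        # pow(den, -1, P_MOD) is the modular inverse (Python 3.8+)
        total = (total + bj * num * pow(den, -1, P_MOD)) % P_MOD
    return total

# systematic encoding: fit P through (0, b0), (1, b1), (2, b2), extend to n = 7
msg = [3, 1, 4]
base = list(enumerate(msg))
encoding = [interpolate(base, i) for i in range(7)]

# erasure decoding: any d = 3 surviving values recover the message
survivors = [(4, encoding[4]), (6, encoding[6]), (2, encoding[2])]
recovered = [interpolate(survivors, i) for i in range(3)]
# recovered == [3, 1, 4]
```

Since the first three encoded values are P(0), P(1), P(2), the message itself rides along in the encoding, which is exactly the systematic property described above.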

We will also have to assume that the sender and receiver agree on a system so that when the encoded value e_i arrives at the receiver, the receiver knows it corresponds to the value P(i). If all the information is sent and arrives in a fixed order, this of course is not a problem. In the case of erasures, we must assume that when an encoded value is missing, we know that it is missing. For example, when information is sent over a network, the value i is usually derived from a packet header; hence when we receive a value e_i we know which number i it corresponds to.

Decoding

Let us now consider what must be done at the receiver's end. The receiver must determine the polynomial from the received values; once the polynomial is determined, the receiver can determine the original message values. (In the first approach above, the coefficients of the polynomial are the message values; in the second, given the polynomial, the message is determined by computing P(0), P(1), etc.)

Let us first consider the easier case, where there are erasures but no errors. Suppose that just d (correct) values e_{j_1}, e_{j_2}, ..., e_{j_d} arrive at the receiver. No matter which d values arrive, the receiver can determine the polynomial, just by using Lagrange interpolation! Note that the polynomial the receiver computes must match P, since there is a unique polynomial of degree d-1 passing through d points.

What if there are errors instead of erasures? (By the way, if there are both erasures and errors, notice that we can treat an erasure as an error, just by filling the erased value with a random value!) This is much harder. When there were only erasures, the receiver knew which values it had received and that they were all correct. Here the receiver has obtained n values f_0, f_1, ..., f_{n-1}, but has no idea which ones are correct.
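The erasure case above can be sketched directly from the Lagrange formula; GF(31), the sample polynomial P(x) = 5 + x + 2x^2, and the chosen surviving points are illustrative choices of ours:

```python
# Erasure decoding sketch: any d surviving (point, value) pairs determine the
# unique degree-(d-1) polynomial via Lagrange interpolation.
# Illustration: GF(31) and P(x) = 5 + x + 2x^2, so d = 3.
p = 31

def interpolate(points, x):
    """Evaluate at x the unique polynomial through the (a_j, b_j) pairs, mod p."""
    total = 0
    for j, (aj, bj) in enumerate(points):
        num = den = 1
        for k, (ak, _) in enumerate(points):
            if k != j:
                num = num * (x - ak) % p
                den = den * (aj - ak) % p
        # divide by den via Fermat's little theorem: den^(p-2) = den^(-1) mod p
        total = (total + bj * num * pow(den, p - 2, p)) % p
    return total

# Suppose only the values at x = 3, 4, 6 survive: P(3)=26, P(4)=10, P(6)=21.
survivors = [(3, 26), (4, 10), (6, 21)]
print([interpolate(survivors, x) for x in range(3)])   # -> [5, 8, 15] = P(0..2)
```

Because the degree-(d-1) polynomial through d points is unique, the interpolated polynomial agrees with P everywhere, no matter which d values survived.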
We show, however, that as long as at most k received values are in error (that is, f_j ≠ e_j at most k times), the original message can be determined whenever k ≤ (n - d)/2. Notice that we can make n as large as we like. If we know a bound on the error rate in advance, we can choose n accordingly. By making n sufficiently large, we can deal with error rates of up to 50%. (Of course, recall that we need p > n.)

The decoding algorithm we cover is due to Berlekamp and Welch. An important concept for the decoding is an error polynomial. An error polynomial E(x) satisfies E(i) = 0 if f_i ≠ e_i. That is, the polynomial E marks the positions where we have received erroneous values. Without loss of generality, we can assume that E has degree k and is monic (that is, the leading coefficient is 1), since we can choose E to be Π_{i : f_i ≠ e_i} (x - i) if there are k errors, and throw extra factors (x - i), for positions i where f_i equals e_i, into the product if there are fewer than k errors.

Now consider the following interesting (and somewhat magical) setup: we claim that there exist a polynomial V(x) of degree at most d - 1 + k and a monic polynomial W(x) of degree at most k satisfying V(i) = f_i · W(i) for all i. Namely, we can let W(x) = E(x) and V(x) = P(x)E(x). It is easy to check that V(i) and W(i) are both 0 when f_i ≠ e_i, and are both e_i E(i) otherwise; in either case V(i) = f_i W(i). So, if someone handed us these polynomials V and W, we could find P(x), since P(x) = V(x)/W(x).

There are two things left to check. First, we need to show how to find polynomials V and W satisfying V(i) = f_i W(i). Second, we need to check that when we find these polynomials, we don't somehow find a wrong pair of polynomials that do not satisfy V(x)/W(x) = P(x). For example, a priori we could find a polynomial W that was different from E!

First, we show that we can find V and W efficiently. Let v_0, v_1, ..., v_{d+k-1} be the coefficients of V and w_0, w_1, ..., w_k be the coefficients of W. Note we can assume w_k = 1.
Then the equations V(i) = f_i W(i) give n linear equations in the coefficients of V and W, so that we have n equations and d + 2k ≤ n unknowns. Hence a pair V and W can be found by solving a set of linear equations.
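The whole procedure can be sketched concretely; the notes give no code, so the following is our own illustrative implementation over GF(31), using Gaussian elimination modulo p to solve the linear system and polynomial long division to recover P = V/W:

```python
# Sketch of the Berlekamp-Welch procedure described above, over GF(31).
# We solve the n equations V(i) = f_i * W(i) for the d + 2k unknown
# coefficients (w_k = 1 is fixed), then compute P(x) = V(x) / W(x).
p = 31

def solve_mod(rows, rhs):
    """Gauss-Jordan elimination mod p; returns one solution (free vars -> 0)."""
    n, m = len(rows), len(rows[0])
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    pivots, r = [], 0
    for c in range(m):
        piv = next((i for i in range(r, n) if aug[i][c] % p), None)
        if piv is None:
            continue
        aug[r], aug[piv] = aug[piv], aug[r]
        inv = pow(aug[r][c], p - 2, p)          # modular inverse of the pivot
        aug[r] = [x * inv % p for x in aug[r]]
        for i in range(n):
            if i != r and aug[i][c] % p:
                f = aug[i][c]
                aug[i] = [(x - f * y) % p for x, y in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    sol = [0] * m
    for row, c in zip(aug, pivots):
        sol[c] = row[-1]
    return sol

def berlekamp_welch(f, d, k):
    n = len(f)
    rows, rhs = [], []
    for i in range(n):                          # one equation per point i
        row = [pow(i, j, p) for j in range(d + k)]              # v_0..v_{d+k-1}
        row += [-f[i] * pow(i, j, p) % p for j in range(k)]     # w_0..w_{k-1}
        rows.append(row)
        rhs.append(f[i] * pow(i, k, p) % p)     # the fixed w_k = 1 term
    sol = solve_mod(rows, rhs)
    V, W = sol[:d + k], sol[d + k:] + [1]
    Q = [0] * d                                 # long division: Q = V / W
    for i in range(d - 1, -1, -1):
        Q[i] = V[i + k]
        for j in range(k + 1):
            V[i + j] = (V[i + j] - Q[i] * W[j]) % p
    assert not any(V), "division was not exact"
    return Q

# P(x) = 5 + x + 2x^2 (d = 3) evaluated at n = 7 points, with k = 1 error:
received = [(5 + x + 2 * x * x) % p for x in range(7)]
received[2] = 0                                  # corrupt one position
print(berlekamp_welch(received, d=3, k=1))       # -> [5, 1, 2]
```

Reading off one solution with free variables set to zero is safe because, as shown next, every solution of the system yields the same ratio V(x)/W(x).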

Since we know a solution V and W exists, the set of linear equations will have a solution. But it could also have many solutions. However, any solution we obtain will satisfy V(x)/W(x) = P(x). To see this, suppose we have two pairs of solutions (V_1, W_1) and (V_2, W_2). Clearly V_1(i) W_2(i) f_i = V_2(i) W_1(i) f_i for every point i, since V_1(i) = f_i W_1(i) and V_2(i) = f_i W_2(i). If f_i does not equal 0, then V_1(i) W_2(i) = V_2(i) W_1(i) by cancellation. And if f_i does equal 0, then V_1(i) W_2(i) = V_2(i) W_1(i) still holds, since then V_1(i) = V_2(i) = 0. But this means that the polynomials V_1 W_2 and V_2 W_1 must be equal, since they agree on n points and each has degree at most d + 2k - 1 < n. And if these polynomials are equal, then V_1(x)/W_1(x) = V_2(x)/W_2(x). Since any solution (V, W) yields the same ratio V(x)/W(x), this ratio must always equal P(x)!

Arithmetic in GF(2^r)

In practice, we want our Reed-Solomon codes to be very efficient. In this regard, working in GF(p) for some prime p is inconvenient, for several reasons. Suppose it is most convenient to work in blocks of 8 bits. If we work in GF(251), we are not using all the possibilities for our eight bits. Besides being wasteful, this is problematic if our data (which may come from text, compressed data, etc.) contains a block of eight bits corresponding to the number 252!

It is therefore more natural to work in a field with 2^r elements, i.e. GF(2^r). Arithmetic in this field is done by finding an irreducible (prime) polynomial π(x) of degree r, and doing all arithmetic in Z_2[x] modulo π(x). That is, all coefficients are taken modulo 2, arithmetic is done modulo π(x), and π(x) must not factor over GF(2). For example, for GF(2^8), an irreducible polynomial is π(x) = x^8 + x^6 + x^5 + x + 1. A byte can naturally be thought of as a polynomial in the field: letting the least significant bit represent x^0, and the i-th least significant bit represent x^i, the byte 10010010 represents the polynomial x^7 + x^4 + x.
Adding in GF(2^r) is easy: since all coefficients are modulo 2, we can just XOR two bytes together. For example,

10010010 + 10101010 = (x^7 + x^4 + x) + (x^7 + x^5 + x^3 + x) = x^5 + x^4 + x^3 = 00111000.

Moreover, subtracting is just the same as adding! Multiplication is slightly harder, since we work modulo π(x). As an example, (x^4 + x)(x^4 + x^2) = x^8 + x^6 + x^5 + x^3. However, we must reduce this so that it fits into a byte. As we work modulo π(x), we have that π(x) = 0, or x^8 = x^6 + x^5 + x + 1. Hence

(x^4 + x)(x^4 + x^2) = x^8 + x^6 + x^5 + x^3 = (x^6 + x^5 + x + 1) + x^6 + x^5 + x^3 = x^3 + x + 1,

and hence 00010010 · 00010100 = 00001011. Rather than computing these products on the fly, all possible pairs can be precomputed once at the beginning, and all multiplications are then done by a lookup in the multiplication table. Hence by using memory and preprocessing, one can work in GF(2^8) and still obtain great speed. Reed-Solomon codes work exactly the same over GF(2^r) as they do over GF(p), since in both cases the main requirement, namely that a polynomial of degree d-1 be uniquely defined by d points, is satisfied.
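The byte arithmetic just described can be sketched as follows, using the text's irreducible polynomial π(x) = x^8 + x^6 + x^5 + x + 1; the function names are ours:

```python
# Byte arithmetic in GF(2^8) with the irreducible polynomial from the text,
# pi(x) = x^8 + x^6 + x^5 + x + 1 (bit pattern 1 0110 0011 = 0x163).
IRRED = 0x163

def gf_add(a, b):
    """Addition (and subtraction) is coefficient-wise mod 2, i.e. XOR."""
    return a ^ b

def gf_mul(a, b):
    """Carry-less polynomial product, then reduction modulo pi(x)."""
    prod = 0
    for i in range(8):                  # schoolbook multiply over GF(2)
        if (b >> i) & 1:
            prod ^= a << i
    for i in range(prod.bit_length() - 1, 7, -1):
        if (prod >> i) & 1:             # x^i present with i >= 8: reduce
            prod ^= IRRED << (i - 8)    # subtract x^(i-8) * pi(x)
    return prod

# The worked examples from the text:
assert gf_add(0b10010010, 0b10101010) == 0b00111000   # (x^7+x^4+x)+(x^7+x^5+x^3+x)
assert gf_mul(0b00010010, 0b00010100) == 0b00001011   # (x^4+x)(x^4+x^2) = x^3+x+1

# The lookup-table idea: precompute all 256 x 256 products once.
MUL = [[gf_mul(a, b) for b in range(256)] for a in range(256)]
assert MUL[0x12][0x14] == 0x0B
```

The full product table takes 64 KB; real implementations often use smaller log/antilog tables instead, trading a table lookup or two per multiplication for far less memory.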

reed-solomon codes 2/16/10 4:50 PM Reed-Solomon Codes: An introduction to Reed-Solomon codes: principles, architecture and implementation

1. Introduction

Reed-Solomon codes are block-based error-correcting codes with a wide range of applications in digital communications and storage. Reed-Solomon codes are used to correct errors in many systems, including:

Storage devices (including tape, Compact Disc, DVD, barcodes, etc.)
Wireless or mobile communications (including cellular telephones, microwave links, etc.)
Satellite communications
Digital television / DVB
High-speed modems such as ADSL, xDSL, etc.

A typical system is shown here: [figure: data -> Reed-Solomon encoder -> channel or storage -> Reed-Solomon decoder -> data]

The Reed-Solomon encoder takes a block of digital data and adds extra "redundant" bits. Errors occur during transmission or storage for a number of reasons (for example noise or interference, scratches on a CD, etc.). The Reed-Solomon decoder processes each block and attempts to correct errors and recover the original data. The number and type of errors that can be corrected depend on the characteristics of the Reed-Solomon code.

2. Properties of Reed-Solomon codes

Reed-Solomon codes are a subset of BCH codes and are linear block codes. A Reed-Solomon code is specified as RS(n,k) with s-bit symbols. This means that the encoder takes k data symbols of s bits each and adds parity symbols to make an n-symbol codeword. There are n - k parity symbols of s bits each. A Reed-Solomon decoder can correct up to t symbols that contain errors in a codeword, where 2t = n - k. A typical Reed-Solomon codeword consists of the k data symbols followed by the n - k parity symbols (this is known as a systematic code because the data is left unchanged and the parity symbols are appended).

Example: A popular Reed-Solomon code is RS(255,223) with 8-bit symbols. Each codeword contains 255 bytes, of which 223 bytes are data and 32 bytes are parity. For this code:

n = 255, k = 223, s = 8
2t = 32, t = 16

The decoder can correct any 16 symbol errors in the codeword: i.e. errors in up to 16 bytes anywhere in the codeword can be automatically corrected.

Given a symbol size s, the maximum codeword length (n) for a Reed-Solomon code is n = 2^s - 1. For example, the maximum length of a code with 8-bit symbols (s = 8) is 255 bytes.

Reed-Solomon codes may be shortened by (conceptually) making a number of data symbols zero at the encoder, not transmitting them, and then re-inserting them at the decoder.

Example: The (255,223) code described above can be shortened to (200,168). The encoder takes a block of 168 data bytes, (conceptually) adds 55 zero bytes, creates a (255,223) codeword and transmits only the 168 data bytes and 32 parity bytes.

The amount of processing "power" required to encode and decode Reed-Solomon codes is related to the number of parity symbols per codeword. A large value of t means that a large number of errors can be corrected, but requires more computational power than a small value of t.

Symbol Errors

One symbol error occurs when 1 bit in a symbol is wrong or when all the bits in a symbol are wrong.

Example: RS(255,223) can correct 16 symbol errors. In the worst case, 16 bit errors may occur, each in a separate symbol (byte), so that the decoder corrects 16 bit errors. In the best case, 16 complete byte errors occur, so that the decoder corrects 16 x 8 bit errors.

Reed-Solomon codes are particularly well suited to correcting burst errors (where a series of bits in the codeword are received in error).

Decoding

Reed-Solomon algebraic decoding procedures can correct errors and erasures. An erasure occurs when the position of an erred symbol is known. A decoder can correct up to t errors or up to 2t erasures. Erasure information can often be supplied by the demodulator in a digital communication system, i.e. the demodulator "flags" received symbols that are likely to contain errors.
When a codeword is decoded, there are three possible outcomes:

1. If 2s + r < 2t (s errors, r erasures), then the original transmitted codeword will always be recovered; OTHERWISE
2. The decoder will detect that it cannot recover the original codeword and indicate this fact; OR
3. The decoder will mis-decode and recover an incorrect codeword without any indication.

The probability of each of the three possibilities depends on the particular Reed-Solomon code and on the number and distribution of errors.

Coding Gain

The advantage of using Reed-Solomon codes is that the probability of an error remaining in the decoded data is (usually) much lower than the probability of an error if Reed-Solomon is not used. This is often described as coding gain.

Example: A digital communication system is designed to operate at a Bit Error Ratio (BER) of 10^-9, i.e. no more than 1 in 10^9 bits are received in error. This can be achieved by boosting the power of the transmitter or by adding Reed-Solomon (or another type of Forward Error Correction). Reed-Solomon allows the system to achieve the target BER with a lower transmitter output power. The power "saving" given by Reed-Solomon (in decibels) is the coding gain.

3. Architectures for encoding and decoding Reed-Solomon codes

Reed-Solomon encoding and decoding can be carried out in software or in special-purpose hardware.

Finite (Galois) Field Arithmetic

Reed-Solomon codes are based on a specialist area of mathematics known as Galois fields or finite fields. A finite field has the property that arithmetic operations (+, -, x, / etc.) on field elements always have a result in the field. A Reed-Solomon encoder or decoder needs to carry out these arithmetic operations, which require special hardware or software functions to implement.

Generator Polynomial

A Reed-Solomon codeword is generated using a special polynomial. All valid codewords are exactly divisible by the generator polynomial. The general form of the generator polynomial is

g(x) = (x - a^i)(x - a^(i+1)) ... (x - a^(i+2t-1))

and the codeword is constructed using

c(x) = g(x).i(x)

where g(x) is the generator polynomial, i(x) is the information block, c(x) is a valid codeword, and a is referred to as a primitive element of the field.
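To keep a sketch self-contained without building GF(2^8) machinery, the generator-polynomial construction and the divisibility property can be illustrated over a small prime field GF(29) (a stand-in for the GF(2^8) used in practice); the primitive element, t, and the information block are arbitrary choices of ours:

```python
# Generator-polynomial encoding sketched over GF(29), a small-prime stand-in
# for the GF(2^8) used in practice. a = 2 is primitive mod 29; t = 2.
p, a, t = 29, 2, 2

def poly_mul(u, v):
    """Multiply two coefficient lists (lowest degree first), mod p."""
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] = (out[i + j] + ui * vj) % p
    return out

def poly_eval(poly, x):
    return sum(c * pow(x, i, p) for i, c in enumerate(poly)) % p

# g(x) = (x - a)(x - a^2) ... (x - a^(2t)): degree 2t, roots a^1 .. a^(2t)
g = [1]
for i in range(1, 2 * t + 1):
    g = poly_mul(g, [-pow(a, i, p) % p, 1])

info = [3, 7, 1]              # information polynomial i(x)
c = poly_mul(g, info)         # codeword c(x) = g(x) . i(x)

# Every valid codeword is exactly divisible by g(x), so it vanishes at the
# roots of g; a corrupted word generally does not.
roots = [pow(a, i, p) for i in range(1, 2 * t + 1)]
print([poly_eval(c, r) for r in roots])       # -> [0, 0, 0, 0]
r_bad = c[:]
r_bad[2] = (r_bad[2] + 5) % p                 # inject one symbol error
print([poly_eval(r_bad, r) for r in roots])   # -> [20, 22, 1, 4], all nonzero
```

These evaluations at the roots of g(x) are exactly the syndromes used by the decoder architecture described below: zero for a clean codeword, nonzero in the presence of errors.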
Example: Generator for RS(255,249) (here n - k = 6, so g(x) has six factors).

3.1 Encoder architecture

The 2t parity symbols in a systematic Reed-Solomon codeword are given by the remainder p(x) = i(x) . x^(n-k) mod g(x). The following diagram shows an architecture for a systematic RS(255,249) encoder:

Each of the 6 registers holds a symbol (8 bits). The arithmetic operators carry out finite field addition or multiplication on a complete symbol.

3.2 Decoder architecture

A general architecture for decoding Reed-Solomon codes is shown in the following diagram. Key:

r(x)  received codeword
Si    syndromes
L(x)  error locator polynomial
Xi    error locations
Yi    error magnitudes
c(x)  recovered codeword
v     number of errors

The received codeword r(x) is the original (transmitted) codeword c(x) plus errors: r(x) = c(x) + e(x). A Reed-Solomon decoder attempts to identify the position and magnitude of up to t errors (or 2t erasures) and to correct the errors or erasures.

Syndrome Calculation

This is a similar calculation to a parity calculation. A Reed-Solomon codeword has 2t syndromes that depend only on errors (not on the transmitted codeword). The syndromes can be calculated by substituting the 2t roots of the generator polynomial g(x) into r(x).

Finding the Symbol Error Locations


More information

Introduction to Wireless & Mobile Systems. Chapter 4. Channel Coding and Error Control Cengage Learning Engineering. All Rights Reserved.

Introduction to Wireless & Mobile Systems. Chapter 4. Channel Coding and Error Control Cengage Learning Engineering. All Rights Reserved. Introduction to Wireless & Mobile Systems Chapter 4 Channel Coding and Error Control 1 Outline Introduction Block Codes Cyclic Codes CRC (Cyclic Redundancy Check) Convolutional Codes Interleaving Information

More information

Introduction to Low-Density Parity Check Codes. Brian Kurkoski

Introduction to Low-Density Parity Check Codes. Brian Kurkoski Introduction to Low-Density Parity Check Codes Brian Kurkoski kurkoski@ice.uec.ac.jp Outline: Low Density Parity Check Codes Review block codes History Low Density Parity Check Codes Gallager s LDPC code

More information

An Introduction to (Network) Coding Theory

An Introduction to (Network) Coding Theory An to (Network) Anna-Lena Horlemann-Trautmann University of St. Gallen, Switzerland April 24th, 2018 Outline 1 Reed-Solomon Codes 2 Network Gabidulin Codes 3 Summary and Outlook A little bit of history

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 6

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 6 CS 70 Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 6 Error Correcting Codes We will consider two situations in which we wish to transmit information on an unreliable channel. The

More information

MATH3302 Coding Theory Problem Set The following ISBN was received with a smudge. What is the missing digit? x9139 9

MATH3302 Coding Theory Problem Set The following ISBN was received with a smudge. What is the missing digit? x9139 9 Problem Set 1 These questions are based on the material in Section 1: Introduction to coding theory. You do not need to submit your answers to any of these questions. 1. The following ISBN was received

More information

16.36 Communication Systems Engineering

16.36 Communication Systems Engineering MIT OpenCourseWare http://ocw.mit.edu 16.36 Communication Systems Engineering Spring 2009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 16.36: Communication

More information

4F5: Advanced Communications and Coding

4F5: Advanced Communications and Coding 4F5: Advanced Communications and Coding Coding Handout 4: Reed Solomon Codes Jossy Sayir Signal Processing and Communications Lab Department of Engineering University of Cambridge jossy.sayir@eng.cam.ac.uk

More information

MATH3302. Coding and Cryptography. Coding Theory

MATH3302. Coding and Cryptography. Coding Theory MATH3302 Coding and Cryptography Coding Theory 2010 Contents 1 Introduction to coding theory 2 1.1 Introduction.......................................... 2 1.2 Basic definitions and assumptions..............................

More information

Assume that the follow string of bits constitutes one of the segments we which to transmit.

Assume that the follow string of bits constitutes one of the segments we which to transmit. Cyclic Redundancy Checks( CRC) Cyclic Redundancy Checks fall into a class of codes called Algebraic Codes; more specifically, CRC codes are Polynomial Codes. These are error-detecting codes, not error-correcting

More information

Some error-correcting codes and their applications

Some error-correcting codes and their applications Chapter 14 Some error-correcting codes and their applications J. D. Key 1 14.1 Introduction In this chapter we describe three types of error-correcting linear codes that have been used in major applications,

More information

Coding for loss tolerant systems

Coding for loss tolerant systems Coding for loss tolerant systems Workshop APRETAF, 22 janvier 2009 Mathieu Cunche, Vincent Roca INRIA, équipe Planète INRIA Rhône-Alpes Mathieu Cunche, Vincent Roca The erasure channel Erasure codes Reed-Solomon

More information

Lecture 8: Shannon s Noise Models

Lecture 8: Shannon s Noise Models Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 2007) Lecture 8: Shannon s Noise Models September 14, 2007 Lecturer: Atri Rudra Scribe: Sandipan Kundu& Atri Rudra Till now we have

More information

Design and Implementation of Reed-Solomon Decoder using Decomposed Inversion less Berlekamp-Massey Algorithm by

Design and Implementation of Reed-Solomon Decoder using Decomposed Inversion less Berlekamp-Massey Algorithm by Design and Implementation of Reed-Solomon Decoder using Decomposed Inversion less Berlekamp-Massey Algorithm by Hazem Abd Elall Ahmed Elsaid A Thesis Submitted to the Faculty of Engineering at Cairo University

More information

REED-SOLOMON ENCODING AND DECODING

REED-SOLOMON ENCODING AND DECODING Bachelor's Thesis Degree Programme in Information Technology 2011 León van de Pavert REED-SOLOMON ENCODING AND DECODING A Visual Representation i Bachelor's Thesis Abstract Turku University of Applied

More information

Belief propagation decoding of quantum channels by passing quantum messages

Belief propagation decoding of quantum channels by passing quantum messages Belief propagation decoding of quantum channels by passing quantum messages arxiv:67.4833 QIP 27 Joseph M. Renes lempelziv@flickr To do research in quantum information theory, pick a favorite text on classical

More information

Coding theory: Applications

Coding theory: Applications INF 244 a) Textbook: Lin and Costello b) Lectures (Tu+Th 12.15-14) covering roughly Chapters 1,9-12, and 14-18 c) Weekly exercises: For your convenience d) Mandatory problem: Programming project (counts

More information

Lecture 9: List decoding Reed-Solomon and Folded Reed-Solomon codes

Lecture 9: List decoding Reed-Solomon and Folded Reed-Solomon codes Lecture 9: List decoding Reed-Solomon and Folded Reed-Solomon codes Error-Correcting Codes (Spring 2016) Rutgers University Swastik Kopparty Scribes: John Kim and Pat Devlin 1 List decoding review Definition

More information

Block Codes :Algorithms in the Real World

Block Codes :Algorithms in the Real World Block Codes 5-853:Algorithms in the Real World Error Correcting Codes II Reed-Solomon Codes Concatenated Codes Overview of some topics in coding Low Density Parity Check Codes (aka Expander Codes) -Network

More information

Cyclic Redundancy Check Codes

Cyclic Redundancy Check Codes Cyclic Redundancy Check Codes Lectures No. 17 and 18 Dr. Aoife Moloney School of Electronics and Communications Dublin Institute of Technology Overview These lectures will look at the following: Cyclic

More information

Math 512 Syllabus Spring 2017, LIU Post

Math 512 Syllabus Spring 2017, LIU Post Week Class Date Material Math 512 Syllabus Spring 2017, LIU Post 1 1/23 ISBN, error-detecting codes HW: Exercises 1.1, 1.3, 1.5, 1.8, 1.14, 1.15 If x, y satisfy ISBN-10 check, then so does x + y. 2 1/30

More information

Lecture Introduction. 2 Formal Definition. CS CTT Current Topics in Theoretical CS Oct 30, 2012

Lecture Introduction. 2 Formal Definition. CS CTT Current Topics in Theoretical CS Oct 30, 2012 CS 59000 CTT Current Topics in Theoretical CS Oct 30, 0 Lecturer: Elena Grigorescu Lecture 9 Scribe: Vivek Patel Introduction In this lecture we study locally decodable codes. Locally decodable codes are

More information

ECEN 655: Advanced Channel Coding

ECEN 655: Advanced Channel Coding ECEN 655: Advanced Channel Coding Course Introduction Henry D. Pfister Department of Electrical and Computer Engineering Texas A&M University ECEN 655: Advanced Channel Coding 1 / 19 Outline 1 History

More information

EECS Components and Design Techniques for Digital Systems. Lec 26 CRCs, LFSRs (and a little power)

EECS Components and Design Techniques for Digital Systems. Lec 26 CRCs, LFSRs (and a little power) EECS 150 - Components and esign Techniques for igital Systems Lec 26 CRCs, LFSRs (and a little power) avid Culler Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~culler

More information

6.895 PCP and Hardness of Approximation MIT, Fall Lecture 3: Coding Theory

6.895 PCP and Hardness of Approximation MIT, Fall Lecture 3: Coding Theory 6895 PCP and Hardness of Approximation MIT, Fall 2010 Lecture 3: Coding Theory Lecturer: Dana Moshkovitz Scribe: Michael Forbes and Dana Moshkovitz 1 Motivation In the course we will make heavy use of

More information

Chapter 6 Lagrange Codes

Chapter 6 Lagrange Codes Chapter 6 Lagrange Codes 6. Introduction Joseph Louis Lagrange was a famous eighteenth century Italian mathematician [] credited with minimum degree polynomial interpolation amongst his many other achievements.

More information

Top Ten Algorithms Class 6

Top Ten Algorithms Class 6 Top Ten Algorithms Class 6 John Burkardt Department of Scientific Computing Florida State University... http://people.sc.fsu.edu/ jburkardt/classes/tta 2015/class06.pdf 05 October 2015 1 / 24 Our Current

More information

Coding Theory and Applications. Solved Exercises and Problems of Cyclic Codes. Enes Pasalic University of Primorska Koper, 2013

Coding Theory and Applications. Solved Exercises and Problems of Cyclic Codes. Enes Pasalic University of Primorska Koper, 2013 Coding Theory and Applications Solved Exercises and Problems of Cyclic Codes Enes Pasalic University of Primorska Koper, 2013 Contents 1 Preface 3 2 Problems 4 2 1 Preface This is a collection of solved

More information

Lecture 12: November 6, 2017

Lecture 12: November 6, 2017 Information and Coding Theory Autumn 017 Lecturer: Madhur Tulsiani Lecture 1: November 6, 017 Recall: We were looking at codes of the form C : F k p F n p, where p is prime, k is the message length, and

More information

Notes 3: Stochastic channels and noisy coding theorem bound. 1 Model of information communication and noisy channel

Notes 3: Stochastic channels and noisy coding theorem bound. 1 Model of information communication and noisy channel Introduction to Coding Theory CMU: Spring 2010 Notes 3: Stochastic channels and noisy coding theorem bound January 2010 Lecturer: Venkatesan Guruswami Scribe: Venkatesan Guruswami We now turn to the basic

More information

Channel Coding I. Exercises SS 2017

Channel Coding I. Exercises SS 2017 Channel Coding I Exercises SS 2017 Lecturer: Dirk Wübben Tutor: Shayan Hassanpour NW1, Room N 2420, Tel.: 0421/218-62387 E-mail: {wuebben, hassanpour}@ant.uni-bremen.de Universität Bremen, FB1 Institut

More information

Communications II Lecture 9: Error Correction Coding. Professor Kin K. Leung EEE and Computing Departments Imperial College London Copyright reserved

Communications II Lecture 9: Error Correction Coding. Professor Kin K. Leung EEE and Computing Departments Imperial College London Copyright reserved Communications II Lecture 9: Error Correction Coding Professor Kin K. Leung EEE and Computing Departments Imperial College London Copyright reserved Outline Introduction Linear block codes Decoding Hamming

More information

Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities

Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities Sector-Disk Codes and Partial MDS Codes with up to Three Global Parities Junyu Chen Department of Information Engineering The Chinese University of Hong Kong Email: cj0@alumniiecuhkeduhk Kenneth W Shum

More information

Channel Coding I. Exercises SS 2017

Channel Coding I. Exercises SS 2017 Channel Coding I Exercises SS 2017 Lecturer: Dirk Wübben Tutor: Shayan Hassanpour NW1, Room N 2420, Tel.: 0421/218-62387 E-mail: {wuebben, hassanpour}@ant.uni-bremen.de Universität Bremen, FB1 Institut

More information

An Introduction to (Network) Coding Theory

An Introduction to (Network) Coding Theory An Introduction to (Network) Coding Theory Anna-Lena Horlemann-Trautmann University of St. Gallen, Switzerland July 12th, 2018 1 Coding Theory Introduction Reed-Solomon codes 2 Introduction Coherent network

More information

Lecture 19 : Reed-Muller, Concatenation Codes & Decoding problem

Lecture 19 : Reed-Muller, Concatenation Codes & Decoding problem IITM-CS6845: Theory Toolkit February 08, 2012 Lecture 19 : Reed-Muller, Concatenation Codes & Decoding problem Lecturer: Jayalal Sarma Scribe: Dinesh K Theme: Error correcting codes In the previous lecture,

More information

Lecture 12: Reed-Solomon Codes

Lecture 12: Reed-Solomon Codes Error Correcting Codes: Combinatorics, Algorithms and Applications (Fall 007) Lecture 1: Reed-Solomon Codes September 8, 007 Lecturer: Atri Rudra Scribe: Michel Kulhandjian Last lecture we saw the proof

More information

Coding Techniques for Data Storage Systems

Coding Techniques for Data Storage Systems Coding Techniques for Data Storage Systems Thomas Mittelholzer IBM Zurich Research Laboratory /8 Göttingen Agenda. Channel Coding and Practical Coding Constraints. Linear Codes 3. Weight Enumerators and

More information

Convolutional Coding LECTURE Overview

Convolutional Coding LECTURE Overview MIT 6.02 DRAFT Lecture Notes Spring 2010 (Last update: March 6, 2010) Comments, questions or bug reports? Please contact 6.02-staff@mit.edu LECTURE 8 Convolutional Coding This lecture introduces a powerful

More information

3x + 1 (mod 5) x + 2 (mod 5)

3x + 1 (mod 5) x + 2 (mod 5) Today. Secret Sharing. Polynomials Polynomials. Secret Sharing. Share secret among n people. Secrecy: Any k 1 knows nothing. Roubustness: Any k knows secret. Efficient: minimize storage. A polynomial P(x)

More information

Introduction to Convolutional Codes, Part 1

Introduction to Convolutional Codes, Part 1 Introduction to Convolutional Codes, Part 1 Frans M.J. Willems, Eindhoven University of Technology September 29, 2009 Elias, Father of Coding Theory Textbook Encoder Encoder Properties Systematic Codes

More information

Making Error Correcting Codes Work for Flash Memory

Making Error Correcting Codes Work for Flash Memory Making Error Correcting Codes Work for Flash Memory Part I: Primer on ECC, basics of BCH and LDPC codes Lara Dolecek Laboratory for Robust Information Systems (LORIS) Center on Development of Emerging

More information

IN this paper, we will introduce a new class of codes,

IN this paper, we will introduce a new class of codes, IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 44, NO 5, SEPTEMBER 1998 1861 Subspace Subcodes of Reed Solomon Codes Masayuki Hattori, Member, IEEE, Robert J McEliece, Fellow, IEEE, and Gustave Solomon,

More information

A Public Key Encryption Scheme Based on the Polynomial Reconstruction Problem

A Public Key Encryption Scheme Based on the Polynomial Reconstruction Problem A Public Key Encryption Scheme Based on the Polynomial Reconstruction Problem Daniel Augot and Matthieu Finiasz INRIA, Domaine de Voluceau F-78153 Le Chesnay CEDEX Abstract. The Polynomial Reconstruction

More information