A Brief Tour of FEC for Serial Link Systems
Prof. Shu Lin (U.C. Davis), Dr. Cathy Liu (Avago), Dr. Michael Steinberger (SiSoft)
Outline
- Introduction
- Finite Fields and Vector Spaces
- Linear Block Codes
- Cyclic Codes
- Important Classes of Cyclic Codes
- FEC Applications to Serial Link Systems
- System Performance Estimation with FEC
- Error Correlation Study
Block Diagram of a Serial Link
Information Source -> Source Encoder -> u -> Channel Encoder -> v -> Modulator -> Channel (noise added) -> Demodulator -> r -> Channel Decoder -> v̂ -> Source Decoder -> Destination
Encoding
- The message u = (u_0, u_1, ..., u_{k-1}) consists of k bits (or m-bit symbols).
- The codeword v = (v_0, v_1, ..., v_{n-1}) consists of n >= k bits (or symbols).
- An (n, k) block code adds n - k check bits or symbols to each message.
Decoding
The channel decoder receives r = (r_0, r_1, ..., r_{n-1}) and estimates the transmitted codeword v = (v_0, v_1, ..., v_{n-1}).
- Hard decision: each r_j is sliced to 0 or 1. With crossover probability p, this is the binary symmetric channel: P(r_j = 1 | v_j = 0) = P(r_j = 0 | v_j = 1) = p.
- Soft decision: the demodulator output is quantized to more than two levels, e.g. a binary-input, 8-ary-output discrete channel with transition probabilities P(r_j = i | v_j), i = 0, 1, ..., 7.
Optimal Decoding
Optimum decoding rule: minimize P(E), which for each received r means maximizing P(v̂ = v | r).
Optimal decoder concept:
1. Compute P(v | r) = P(v) P(r | v) / P(r) for every possible codeword v.
2. Choose v̂ to be the codeword with the largest posterior probability.
Maximum likelihood decoder (valid when all messages are equally likely!):
1. Compute P(r | v_j) for every possible codeword v_j.
2. Choose v̂ = v_k, the codeword with the largest likelihood.
3. Output the message u_k corresponding to v_k.
Signal to Noise Ratio
- E_b: energy per bit, a measure of signal energy.
- N_0: noise spectral density, a measure of noise energy.
- E_b / N_0: a unit-less measure of signal-to-noise ratio.
- Noise limited channel.
[Plot: BER vs. E_b/N_0 for uncoded and coded transmission; the horizontal gap between the two curves at a given BER is the coding gain at that BER.]
Shannon Capacity Limit
Assume:
- Occupied bandwidth W (use the most restrictive definition possible)
- Transmitted power P_s
- Data rate R_channel
Then define the channel capacity**
    C = W log2( 1 + P_s / (W N_0) )
- R_channel > C: the theoretical minimum BER is nonzero.
- R_channel < C: the theoretical minimum BER is 0.
It is hard to approach the Shannon limit without using FEC.
** Wozencraft and Jacobs, Principles of Communication Engineering, p. 323, John Wiley and Sons, Inc., copyright 1965.
High Speed Serial Channel: A Different Animal
- Dispersion limited, not noise limited.
- Hard decision decoding, not soft decision decoding.
Binary Arithmetic
Addition (XOR):          Multiplication (AND):
  0 + 0 = 0                0 · 0 = 0
  0 + 1 = 1                0 · 1 = 0
  1 + 0 = 1                1 · 0 = 0
  1 + 1 = 0                1 · 1 = 1
NOTE: Subtraction = Addition (?!)
The set {0, 1} with these operations is the Galois field GF(2).
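As a quick sanity check, the GF(2) rules above map directly onto bitwise operators. This Python sketch (illustrative, not from the slides) shows addition as XOR, multiplication as AND, and subtraction coinciding with addition:

```python
# GF(2) arithmetic on the symbols {0, 1}.
def gf2_add(a, b):
    return a ^ b   # addition is XOR

def gf2_mul(a, b):
    return a & b   # multiplication is AND

# Subtraction equals addition: the x solving x + b = a is a ^ b.
def gf2_sub(a, b):
    return a ^ b

assert [gf2_add(a, b) for a, b in [(0,0), (0,1), (1,0), (1,1)]] == [0, 1, 1, 0]
assert [gf2_mul(a, b) for a, b in [(0,0), (0,1), (1,0), (1,1)]] == [0, 0, 0, 1]
assert gf2_sub(1, 1) == gf2_add(1, 1) == 0
```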
Finite (Galois) Fields: Unique Arithmetic
For any positive prime integer p and positive (non-zero) integer m, define p^m symbols with:
- Addition that maps back into the same p^m symbols.
- Multiplication that maps back into the same p^m symbols.
This is the field GF(p^m). It has the familiar algebraic properties of the real or complex numbers, so we can define vectors and polynomials over it.
For p = 2, GF(2^m) consists of m-bit symbols (especially m = 1), and its arithmetic can be implemented using linear feedback shift registers. m > 1 is essential for tolerating bursts of errors.
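To make GF(2^m) concrete, here is a hedged Python sketch of multiplication in GF(2^3): symbols are 3-bit integers, and products are reduced modulo a primitive polynomial (X^3 + X + 1 is one common choice; the slides do not specify one). The shift-and-reduce loop mirrors the linear feedback shift register implementation mentioned above.

```python
def gf8_mul(a, b, m=3, poly=0b1011):
    """Multiply two GF(2^3) symbols, reducing by poly = X^3 + X + 1."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a        # add (XOR) the current shift of a
        b >>= 1
        a <<= 1                # multiply a by X
        if a & (1 << m):       # degree reached m: reduce modulo poly
            a ^= poly
    return result

# The nonzero symbols form a cyclic multiplicative group of order 2^3 - 1 = 7:
powers, x = [], 1
for _ in range(7):
    powers.append(x)
    x = gf8_mul(x, 0b010)      # repeatedly multiply by alpha = X
assert sorted(powers) == [1, 2, 3, 4, 5, 6, 7]
```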
Vectors
V_n: n-tuples (a_0, a_1, ..., a_{n-1}) with a_i ∈ GF(2^m).
- Addition: (a_0, ..., a_{n-1}) + (b_0, ..., b_{n-1}) = (a_0 + b_0, ..., a_{n-1} + b_{n-1})
- Scalar multiplication: c (a_0, ..., a_{n-1}) = (c a_0, ..., c a_{n-1})
- Inner product: <a, b> = a_0 b_0 + a_1 b_1 + ... + a_{n-1} b_{n-1}
- Orthogonal: <a, b> = 0
- Subspace V_k (a subset of V_n): a ∈ V_k and b ∈ V_k imply a + b ∈ V_k.
The concept of subspaces is critical to error correction coding.
Binary Linear Block Code
V_n: (a_0, a_1, ..., a_{n-1}) with a_i ∈ GF(2).
Decompose V_n into orthogonal subspaces V_k and V_{n-k} (orthogonal: <p, q> = 0 for all p ∈ V_k, q ∈ V_{n-k}), so that any r ∈ V_n can be written r = p + q with p ∈ V_k, q ∈ V_{n-k}.
- The codewords of an (n, k) linear block code form V_k.
- Linear: V_k contains all linear combinations of its codewords.
- Block: each codeword is a block of n bits or n m-bit symbols.
- Rate = k / n.
Encoding
An (n, k) linear block code C = V_k is generated by k basis vectors g_0, g_1, ..., g_{k-1}. These vectors can be organized into the generator matrix

    G = [ g_0      ]   [ g_{0,0}   g_{0,1}   ...  g_{0,n-1}   ]
        [ g_1      ] = [ g_{1,0}   g_{1,1}   ...  g_{1,n-1}   ]
        [ ...      ]   [ ...                                  ]
        [ g_{k-1}  ]   [ g_{k-1,0} g_{k-1,1} ...  g_{k-1,n-1} ]

The message u = (u_0, u_1, ..., u_{k-1}) produces the codeword
    v = u G = u_0 g_0 + u_1 g_1 + ... + u_{k-1} g_{k-1},    v ∈ V_k.
Linear Systematic Block Code
Systematic: v = (v_0, v_1, ..., v_{n-k-1}, u_0, u_1, ..., u_{k-1})
- The last k positions carry the unaltered message.
- The first n - k positions are the parity check: linear combinations of the message bits/symbols, used to check for errors (and correct them if possible).
Linear Block Code + Systematic = Linear Systematic Block Code.
Parity-Check Matrix
V_{n-k} is generated by n - k basis vectors h_0, h_1, ..., h_{n-k-1}. These vectors can be organized into the parity-check matrix

    H = [ h_0        ]   [ h_{0,0}     ...  h_{0,n-1}     ]
        [ h_1        ] = [ h_{1,0}     ...  h_{1,n-1}     ]
        [ ...        ]   [ ...                            ]
        [ h_{n-k-1}  ]   [ h_{n-k-1,0} ...  h_{n-k-1,n-1} ]

Since v ∈ V_k, <v, h_i> = 0 for all i (we said the subspaces were orthogonal), so
    v H^T = (0, 0, ..., 0)    (the parity constraint)
Linear Systematic Block Code Example (k = 3, n = 6)
Messages (u_0, u_1, u_2) and codewords (v_0, v_1, v_2, v_3, v_4, v_5):
    (000) -> (000000)    (001) -> (110001)
    (100) -> (011100)    (101) -> (101101)
    (010) -> (101010)    (011) -> (011011)
    (110) -> (110110)    (111) -> (000111)
Matrices for Example Code

    G = [ g_0 ]   [ 0 1 1 1 0 0 ]        H = [ 1 0 0 0 1 1 ]
        [ g_1 ] = [ 1 0 1 0 1 0 ]            [ 0 1 0 1 0 1 ]
        [ g_2 ]   [ 1 1 0 0 0 1 ]            [ 0 0 1 1 1 0 ]

Parity equations: v_0 = u_1 + u_2, v_1 = u_0 + u_2, v_2 = u_0 + u_1; message bits: v_3 = u_0, v_4 = u_1, v_5 = u_2.
Suppose the message is u = (101):
    v = u G = 1·g_0 + 0·g_1 + 1·g_2
            = (011100) + (000000) + (110001)
            = (101101)
The first three bits are the parity; the last three are u.
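The v = uG computation can be checked in a few lines of Python (an illustrative sketch using the example G above):

```python
G = [[0, 1, 1, 1, 0, 0],   # g0
     [1, 0, 1, 0, 1, 0],   # g1
     [1, 1, 0, 0, 0, 1]]   # g2

def encode(u, G):
    """v = u G over GF(2): XOR together the rows selected by message bits."""
    v = [0] * len(G[0])
    for ui, g in zip(u, G):
        if ui:
            v = [vj ^ gj for vj, gj in zip(v, g)]
    return v

# Message (101) -> codeword (101101), matching the worked example.
assert encode([1, 0, 1], G) == [1, 0, 1, 1, 0, 1]
# Message (100) -> g0 itself, matching the codeword table.
assert encode([1, 0, 0], G) == [0, 1, 1, 1, 0, 0]
```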
Error Detection
- v: transmitted codeword; r: received vector (hard decision).
- r = v (so r ∈ V_k): correct transmission.
- r ≠ v, r ∉ V_k: detectable error.
- r ≠ v, r ∈ V_k: undetectable error.
Syndrome: s = r H^T = (s_0, s_1, ..., s_{n-k-1})
- s = r H^T = 0: correct transmission or undetectable error.
- s = r H^T ≠ 0: detectable error.
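Syndrome computation for the same example code can be sketched as follows (illustrative code using the H matrix of the example, not from the slides):

```python
H = [[1, 0, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 1, 1, 0]]

def syndrome(r, H):
    """s = r H^T over GF(2): one inner product per row of H."""
    return [sum(rj & hj for rj, hj in zip(r, h)) % 2 for h in H]

v = [1, 0, 1, 1, 0, 1]              # a valid codeword
assert syndrome(v, H) == [0, 0, 0]  # parity constraint satisfied

r = v[:]
r[1] ^= 1                           # inject a single bit error
assert syndrome(r, H) == [0, 1, 0]  # nonzero syndrome: error detected
```

Note that the syndrome of a single-bit error equals the corresponding column of H, which is what makes error location possible.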
Error Correction
1. Compute the syndrome s of the received vector r to detect errors.
2. Identify the locations of the errors (the hardest part).
3. Correct the errors.
Identifying Error Locations
Error vector:
    e = r + v = (e_0, e_1, ..., e_{n-1}) = (r_0 + v_0, r_1 + v_1, ..., r_{n-1} + v_{n-1})
(Remember we noted that subtraction = addition? Here it is.)
- e_j = 0 if r_j = v_j
- e_j = 1 if r_j ≠ v_j (an error location)
Given an estimated error vector ê, the estimated transmitted codeword is v̂ = r + ê, with v̂ ∈ V_k.
Choose ê to minimize the number of error locations needed to make v̂ a valid codeword.
Error Correction Capability
Hamming distance d(v, w), for v, w ∈ V_n defined over GF(p^m): the number of symbol locations where v and w differ. (Here we apply it to C defined over GF(2).)
Minimum distance of an (n, k) linear block code C:
    d_min(C) = min { d(v, w) : v, w ∈ C, v ≠ w }
Error-correction capability:
    t = floor( (d_min(C) - 1) / 2 )
A code with these parameters is called an (n, k, d_min) linear block code.
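For the (6, 3) example code listed earlier, d_min and t can be computed by brute force (illustrative sketch):

```python
codewords = ["000000", "011100", "101010", "110110",
             "110001", "101101", "011011", "000111"]

def d(v, w):
    """Hamming distance: number of positions where v and w differ."""
    return sum(a != b for a, b in zip(v, w))

d_min = min(d(v, w) for v in codewords for w in codewords if v != w)
t = (d_min - 1) // 2

assert d_min == 3   # so this is a (6, 3, 3) code
assert t == 1       # it corrects any single error
```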
Cyclic Codes
For a = (a_0, a_1, ..., a_{n-1}) with a_i ∈ GF(2^m), define the right shift operator
    a^(1) = (a_{n-1}, a_0, ..., a_{n-2})
A code C is cyclic if v ∈ C implies v^(1) ∈ C.
- Encoding and syndrome computation can be implemented using shift registers with simple feedback.
- The inherent algebraic structure enables many implementation options.
Polynomials using GF(2^m)
For each vector (a_0, a_1, a_2, ..., a_{n-1}) there is a corresponding polynomial
    a(X) = a_0 + a_1 X + a_2 X^2 + ... + a_{n-1} X^{n-1}
(allowing for the unique arithmetic of GF(2^m)). These polynomials have the same arithmetic and algebraic properties as polynomials with real or complex coefficients.
Polynomial version of the right shift operator:
1. Multiply by X:
    X a(X) = a_0 X + a_1 X^2 + a_2 X^3 + ... + a_{n-2} X^{n-1} + a_{n-1} X^n
2. Divide by X^n + 1:
    X a(X) = a_{n-1} + a_0 X + a_1 X^2 + ... + a_{n-2} X^{n-1} + a_{n-1} (X^n + 1)
3. Keep the remainder:
    X a(X) = a^(1)(X) + a_{n-1} (X^n + 1)
Remember: in GF(2^m), subtraction = addition.
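The equivalence between the vector right shift and "multiply by X modulo X^n + 1" is easy to verify (illustrative sketch; coefficients stored lowest degree first):

```python
def right_shift(a):
    """Vector form: (a0, ..., a_{n-1}) -> (a_{n-1}, a0, ..., a_{n-2})."""
    return a[-1:] + a[:-1]

def x_times_mod(a):
    """Polynomial form: X * a(X) reduced modulo X^n + 1 over GF(2)."""
    shifted = [0] + a          # step 1: multiply by X
    overflow = shifted.pop()   # coefficient of X^n
    shifted[0] ^= overflow     # steps 2-3: X^n = 1 + (X^n + 1), keep remainder
    return shifted

a = [1, 0, 1, 1, 0, 0, 1]
assert x_times_mod(a) == right_shift(a) == [1, 1, 0, 1, 1, 0, 0]
```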
Generator Polynomial
    g(X) = 1 + g_1 X + g_2 X^2 + ... + g_{n-k-1} X^{n-k-1} + X^{n-k}
- g(X) has degree n - k and a non-zero constant term.
- g(X) is a factor of X^n + 1.
A polynomial v(X) of degree less than n is a code polynomial of C if and only if it has the form
    v(X) = a(X) g(X),    with degree of a(X) < k
In principle, a(X) could be a message; however, the resulting code would not be systematic. g(X) is the generator polynomial for the code C.
Systematic Encoding
1. Right-shift the message by n - k symbols (that is, multiply by X^{n-k}):
    X^{n-k} u(X) = u_0 X^{n-k} + u_1 X^{n-k+1} + ... + u_{k-1} X^{n-1}
2. Fill in the parity-check symbols in a way that creates a codeword. Divide by g(X):
    X^{n-k} u(X) = a(X) g(X) + b(X),    with degree of b(X) < n - k
Then
    v(X) = b(X) + X^{n-k} u(X) = a(X) g(X)
is a codeword: b(X) is the parity check and X^{n-k} u(X) carries the message.
Systematic Encoding Circuit
[Figure: feedback shift register with taps g_1, g_2, ..., g_{n-k-1}, registers b_0, b_1, ..., b_{n-k-1}, and a gate. The message u is shifted in, then the parity-check symbols are shifted out to complete the codeword v.]
Example (7,4) Cyclic Code
    g(X) = X^3 + X + 1,    X^7 + 1 = (X^4 + X^2 + X + 1) g(X)
Message  Codeword   Code Polynomial
(0000)   (0000000)  0 = 0 · g(X)
(1000)   (1101000)  1 + X + X^3 = g(X)
(0100)   (0110100)  X + X^2 + X^4 = X g(X)
(1100)   (1011100)  1 + X^2 + X^3 + X^4 = (1 + X) g(X)
(0010)   (1110010)  1 + X + X^2 + X^5 = (1 + X^2) g(X)
(1010)   (0011010)  X^2 + X^3 + X^5 = X^2 g(X)
(0110)   (1000110)  1 + X^4 + X^5 = (1 + X + X^2) g(X)
(1110)   (0101110)  X + X^3 + X^4 + X^5 = (X + X^2) g(X)
(0001)   (1010001)  1 + X^2 + X^6 = (1 + X + X^3) g(X)
(1001)   (0111001)  X + X^2 + X^3 + X^6 = (X + X^3) g(X)
(0101)   (1100101)  1 + X + X^4 + X^6 = (1 + X^3) g(X)
(1101)   (0001101)  X^3 + X^4 + X^6 = X^3 g(X)
(0011)   (0100011)  X + X^5 + X^6 = (X + X^2 + X^3) g(X)
(1011)   (1001011)  1 + X^3 + X^5 + X^6 = (1 + X + X^2 + X^3) g(X)
(0111)   (0010111)  X^2 + X^4 + X^5 + X^6 = (X^2 + X^3) g(X)
(1111)   (1111111)  1 + X + X^2 + X^3 + X^4 + X^5 + X^6 = (1 + X^2 + X^3) g(X)
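The systematic encoding procedure of the previous slides can be sketched in Python and checked against this table (illustrative code; g(X) = 1 + X + X^3, coefficients stored lowest degree first):

```python
def gf2_poly_rem(dividend, divisor):
    """Remainder of polynomial division over GF(2), low degree first."""
    r = list(dividend)
    d = len(divisor) - 1
    for i in range(len(r) - 1, d - 1, -1):
        if r[i]:                          # cancel the leading term
            for j, c in enumerate(divisor):
                r[i - d + j] ^= c
    return r[:d]

def systematic_encode(u, g, n):
    shifted = [0] * (n - len(u)) + u      # X^{n-k} u(X)
    b = gf2_poly_rem(shifted, g)          # parity-check polynomial b(X)
    return b + u                          # v(X) = b(X) + X^{n-k} u(X)

g = [1, 1, 0, 1]   # g(X) = 1 + X + X^3

# Entries from the table above:
assert systematic_encode([1, 0, 0, 0], g, 7) == [1, 1, 0, 1, 0, 0, 0]
assert systematic_encode([0, 1, 0, 1], g, 7) == [1, 1, 0, 0, 1, 0, 1]
assert systematic_encode([1, 1, 1, 1], g, 7) == [1, 1, 1, 1, 1, 1, 1]
```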
Error Detection and Correction
Divide r(X) by g(X):
    r(X) = a(X) g(X) + s(X)
(This requires a feedback shift register circuit with n - k flip-flops.)
- If s(X) = 0, assume that r(X) = v(X): transmission was correct.
- If s(X) ≠ 0, an error definitely occurred: locate and correct the error(s).
Example Error Detection Circuit
[Figure: syndrome computation circuit for g(X) = X^3 + X + 1: a 3-stage feedback shift register with a gated input.]
Important Classes of Cyclic Codes
- Hamming codes: low error correction capability t; random errors.
- Bose-Chaudhuri-Hocquenghem (BCH) codes: random errors (e.g., a satellite channel).
- Reed-Solomon codes: random and burst errors.
- Fire codes: burst errors (e.g., a scratch in a CD).
- Low Density Parity Check (LDPC) codes.
Hamming Codes
For any positive integer m >= 3, there exists a (2^m - 1, 2^m - m - 1, 3) Hamming code with minimum distance 3. This code is capable of correcting a single error at any location over a block of 2^m - 1 bits. Decoding is simple.
BCH Codes
For any positive integer m >= 3 and t < 2^{m-1}, there exists a binary cyclic BCH code with the following parameters:
- Length: n = 2^m - 1
- Number of parity-check bits: n - k <= mt
- Minimum distance: d_min >= 2t + 1
This code is capable of correcting t or fewer random errors over a span of 2^m - 1 bit positions, and hence is called a t-error-correcting BCH code.
Reed Solomon (RS) Codes
For any q that is a power of a prime and any t with 1 <= t < q, there exists an RS code with code symbols from a finite field GF(q) of order q with the following parameters:
- Length: n = q - 1
- Dimension: k = q - 2t - 1
- Number of parity-check symbols: n - k = 2t
- Minimum distance: d_min = 2t + 1
If q = 2^m, then each symbol consists of m consecutive bits. The code corrects up to t symbols in a block, which makes it effective for correcting random errors and for correcting bursts of errors (when multiple errors occur in a single symbol).
The most commonly used RS code is the (255, 239, 17) code, used in optical and satellite communications and in data storage systems.
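The RS parameter relationships above are simple to tabulate (illustrative sketch; the (255, 239, 17) code is the m = 8, t = 8 instance, and the serial link codes discussed later are shortened versions of the GF(2^10) code):

```python
def rs_params(m, t):
    """(n, k, d_min) of a full-length RS code over GF(2^m) correcting t symbols."""
    q = 2 ** m
    n = q - 1                  # length
    k = n - 2 * t              # dimension: 2t parity-check symbols
    return n, k, 2 * t + 1     # d_min = 2t + 1

assert rs_params(8, 8) == (255, 239, 17)      # the classic RS(255, 239)
assert rs_params(10, 7) == (1023, 1009, 15)   # shortened to RS(528, 514) in KR4
assert rs_params(10, 15) == (1023, 993, 31)   # shortened to RS(544, 514) in KP4
```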
Fire Codes
Optimized for bursts of errors (errors all occur in isolated windows of length l). Decoding is very simple; it is called error trapping decoding.
Low Density Parity Checking (LDPC)
An LDPC code over GF(q), a finite field with q elements, is a q-ary linear block code given by the null space of a sparse parity-check matrix H over GF(q). An LDPC code is said to be regular if its parity-check matrix H has a constant number of ones in its columns, say γ, and a constant number of ones in its rows, say ρ; γ and ρ are typically quite small compared to n. LDPC codes are currently the most promising coding technique for approaching the Shannon capacities (or limits) for a wide range of channels.
FEC Applications to Communication Systems
FEC is used in many dispersion and noise limited systems:
- Single burst error correction: OIF CEI-P Fire code (1604, 1584) and 10GBASE-R QC code (2112, 2080) for 10G serial link systems.
- Reed Solomon (RS) codes, widely applied in telecommunication systems such as 100GBASE-KR4 and KP4.
- Turbo codes in deep space satellite communications.
- Low Density Parity Check (LDPC) codes in 10GBASE-T, DVB, WiMAX, disk drive read channels, and the NASA standard code (8176, 7156) used in the NASA Landsat (near earth satellite communications) and Interface Region Imaging Spectrograph (IRIS) missions.
- Trellis coded modulation (TCM) in 1000BASE-T.
FEC Applications to Serial Link Systems
Recently adopted FEC:
- Fire code (1604, 1584): OIF CEI-P
- QC code (2112, 2080): 10GBASE-KR
- RS(528, 514, 7) over GF(2^10): 100GBASE-KR4
- RS(544, 514, 15) over GF(2^10): 100GBASE-KP4
Applying FEC to a serial link system needs to consider:
- Coding gain
- Coding overhead
- Encoder/decoder latency
- Encoder/decoder complexity
Coding Gain => BER
- DFE error propagation causes long burst errors and degrades coding gain.
- Hence, burst error correcting FEC such as RS codes is preferred.
- For the 100GBASE-KP4 serial link system, RS(544, 514, 15) relaxes the raw (pre-FEC) BER requirement from 1e-15 to 1e-6.
Coding Overhead => Higher Link Rate
[Plot: BER vs. SNR (10-17 dB) for the uncoded channel and for RS(n, 514, t) codes with t = 1, 2, 3, 4, ..., 16; higher t gives steeper waterfall curves.]
There is a trade-off between coding gain and the channel loss due to coding overhead:
- KR4 absorbs the RS(528, 514, 7) overhead; the link rate remains 25.8 Gb/s.
- KP4 RS(544, 514, 15) increases the link rate from 25.8 Gb/s to 27.2 Gb/s.
Encoding/Decoding Time => Link Latency
Encoder: normally takes relatively small latency.
Decoder stages (p is the parallel level of processing in a design):
- Syndrome Computation (SC): n/p cycles
- Key Equation Solver (KES): 2t cycles
- Chien Search and Forney (CSnF): n/p + (1~2) cycles
Total: about 50 ns - 200 ns for KR4 and KP4 FEC at 100 Gb/s.
Encoder/Decoder Complexity => Cost (Area + Power)
Decoder complexity is normally proportional to t:
Code               t    Gates      Relative Area
KR4 RS(528, 514)   7    100-150k   1
KP4 RS(544, 514)   15   200-350k   2-2.5
Approximate area breakdown: syndrome computation 20%, KES 35-55%, Chien search and Forney 25-45%. Power scales with area.
System Performance Estimation with FEC
- Random error: BSC model
- Burst error: Gilbert model
- Multinomial statistical model
- PDA and Importance Sampling
Random and Gilbert Burst Error Models
Random error model: binary symmetric channel with AWGN noise,
    BER_pre = Q( sqrt(SNR) ) = (1/2) erfc( sqrt(SNR/2) )
Gilbert burst error model: the probability of a successive error with a 1-tap DFE (feedback tap b_1, main cursor b_0) is
    p_ep = (1/4) [ erfc( (1 - 2 b_1/b_0) sqrt(SNR/2) ) + erfc( (1 + 2 b_1/b_0) sqrt(SNR/2) ) ]
Probability of a burst error of length k:
    p(bl = k) = (1 - p_ep) p_ep^{k-1},    k >= 1
Then the post-FEC BER can be calculated from p(bl) and the error correction capability of the code.
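A hedged Python sketch of the model above (the SNR value and the tap ratio b_1/b_0 used below are illustrative assumptions, not from the slides):

```python
import math

def pre_fec_ber(snr):
    """BER_pre = Q(sqrt(SNR)) = 0.5 * erfc(sqrt(SNR / 2))."""
    return 0.5 * math.erfc(math.sqrt(snr / 2))

def dfe_propagation_prob(snr, b1_over_b0):
    """p_ep for a 1-tap DFE: average of the two slicer error rates seen
    when the fed-back decision is wrong (threshold shifted by +/- 2 b1/b0)."""
    s = math.sqrt(snr / 2)
    return 0.25 * (math.erfc((1 - 2 * b1_over_b0) * s)
                   + math.erfc((1 + 2 * b1_over_b0) * s))

def burst_length_pmf(p_ep, k):
    """p(bl = k) = (1 - p_ep) * p_ep^(k-1): geometric burst lengths."""
    return (1 - p_ep) * p_ep ** (k - 1)

p_ep = dfe_propagation_prob(snr=16.0, b1_over_b0=0.25)
assert p_ep > pre_fec_ber(16.0)            # propagation makes errors likelier
total = sum(burst_length_pmf(p_ep, k) for k in range(1, 100))
assert abs(total - 1.0) < 1e-9             # burst lengths form a valid pmf
```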
Generic FEC Model - Multinomial Distribution
A generic FEC model based on the multinomial distribution can be used to calculate post-FEC BER performance.
- Assume that errors are caused by independent random or burst error events at the output of SerDes detection.
- Let w_i (i = 1, 2, 3, 4, ...) be the probability of an i-byte error event (m bits per byte).
- The codeword failure probability is then a multinomial sum over the event counts, where k = k1 + k2 + k3 + k4 and each k_i ranges from 1 to an upper limit such as t + 1.
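As a simplified special case of this model (only single-symbol error events, independent across symbol positions), the codeword failure probability reduces to a binomial tail; this hedged sketch uses KP4-like parameters as an illustration:

```python
from math import comb

def codeword_failure_prob(n, t, p_sym):
    """P(more than t of the n symbols are in error), assuming independent
    symbol errors with probability p_sym each (single-event special case)."""
    return sum(comb(n, j) * p_sym ** j * (1 - p_sym) ** (n - j)
               for j in range(t + 1, n + 1))

# KP4-like parameters: n = 544 symbols, t = 15, symbol error probability 1e-4.
p_fail = codeword_failure_prob(544, 15, 1e-4)
assert 0.0 < p_fail < 1e-20   # the code drives the failure rate far down
```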
Error Correlation Study
Goals:
- BER ~ 10^-5
- Vary correlation to data pattern
- Vary DFE error propagation
- Determine error correlation
Setup: PRBS source (variable shift register length) at 12.5 Gb/s, channel, CTLE + DFE equalization, additive white Gaussian noise.
Error Correlation Study Method
Time domain simulation: 500 million bits simulated for each case. For each bit error, the following were recorded:
- Bit time
- Surrounding data pattern
- Previous bit errors less than 64 bits from the current error (the autocorrelation function of the error process)
Bit error log for analysis probe ARX1_Probe (time, bit number, surrounding pattern):
    4.27E-05  533466   1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 1 1 1 1
    7.34E-05  917369   0 1 1 0 1 0 1 1 0 0 1 0 1 1 1 1 1 0 1
    9.85E-05  1231038  0 1 1 1 0 1 1 1 1 1 1 0 1 0 1 0 0 0 0
    1.00E-04  1254936  0 1 1 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1
    1.07E-04  1343202  0 1 1 1 0 0 0 0 0 1 0 0 1 0 1 0 1 0 1
    2.29E-04  2862468  0 1 1 1 1 0 0 0 1 0 1 0 1 0 1 1 1 0 0
    2.94E-04  3676209  1 1 1 1 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0
    3.09E-04  3858573  0 1 1 0 1 1 1 1 0 0 0 1 1 0 0 1 1 1 0
    3.14E-04  3920436  1 1 1 0 0 0 1 1 1 0 0 0 1 1 0 0 0 0 0
    3.38E-04  4220051  1 1 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1 1
    3.64E-04  4545431  0 1 1 1 1 0 0 0 0 0 1 0 0 0 1 1 0 1 1
    4.92E-04  6153288  1 1 1 0 1 1 0 0 1 1 1 1 0 0 0 1 0 0 1
    6.10E-04  7624993  1 1 1 0 0 1 1 1 0 1 0 1 1 1 0 1 1 1 1
Error Correlation Example
PRBS63, CTLE + DFE (DFE fully adaptive), Rx noise 50 mV rms:
- Total errors: 1169
- Minimum bits between errors: 1703
- Maximum bits between errors: 3237475
There should have been at least twenty errors at distance equal to one. Where did they go?
Pattern Correlation
Q: To what extent were the bits adjacent to the errored bit correlated with the errored bit?
[Plot: correlation coefficient (+1 = perfectly correlated, 0 = uncorrelated, -1 = perfectly anti-correlated) vs. relative bit position from -80 to +20, with the errored bit at position 0.]
Error Spacing vs. Equalization
Q: To what extent are errors grouped close together? A: It depends.
With increasing pattern dependence (all cases PRBS63):
Case                          CTLE+DFE+Noise   CTLE+minimal DFE   DFE only
Total errors                  1169             4815               2987
Minimum bits between errors   1703             70                 2
Maximum bits between errors   3237475          928258             1728641
DFE Error Autocorrelation vs. Data Pattern
[Plots: error autocorrelation vs. bit distance (1-63) for PRBS28, PRBS31, PRBS39, PRBS47, and PRBS63, with total errors of 2987, 2863, 1305, 929, and 454 respectively.]
- Lots of errors close together.
- Significant pattern dependence.
- DFE error propagation is not the primary impairment.
Importance Sampling
What if you already knew which data patterns (events) were likely to cause errors? Those are the only ones you'd bother to simulate. With N total patterns, of which M are the critical patterns actually simulated,
    P_err = (M / N) P_IS_err
* If you choose the wrong data patterns, your results are going to be worthless.
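A toy sketch of the scaling rule above (the function names and the trivial one-pattern error predicate are hypothetical stand-ins for a full link simulation):

```python
import random

def is_error(pattern):
    """Toy stand-in for a full link simulation: only one pattern fails."""
    return pattern == "nasty"

def importance_sampled_error_prob(critical_patterns, n_total, trials=1000):
    """Simulate only the M critical patterns, then scale by M / N,
    assuming the other N - M patterns never cause errors."""
    m = len(critical_patterns)
    errors = sum(is_error(random.choice(critical_patterns))
                 for _ in range(trials))
    return (m / n_total) * (errors / trials)

# One critical pattern out of N = 4096 possibilities:
p_err = importance_sampled_error_prob(["nasty"], n_total=4096)
assert abs(p_err - 1 / 4096) < 1e-12
```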
Distortion Analysis
Invent some really nasty data patterns, then interleave them.
[Figure: a sampled pulse response and a set of interleaved candidate patterns (x = don't-care bit); the fixed bit positions constrain the pattern set so that M/N = 2^-12.]
Serial Channel Error Correlation Study: Preliminary Conclusions
- Pattern dependence appears to be the primary impairment.
- Use distortion analysis to identify the critical data patterns.
- Use importance sampling to simulate only the critical data patterns and yet get unbiased results for the real system.
- Should we have codes designed specifically for high speed serial channels?