Low-complexity, Low-memory EMS algorithm for non-binary LDPC codes

Similar documents
Low-Complexity Decoding for Non-Binary LDPC Codes in High Order Fields

Extended MinSum Algorithm for Decoding LDPC Codes over GF (q)

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

A note on the multiplication of sparse matrices

Feature Extraction Techniques

A Simple Regression Problem

A Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product.

Block designs and statistics

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials

Interactive Markov Models of Evolutionary Algorithms

Non-binary Hybrid LDPC Codes: structure, decoding and optimization

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

Non-Parametric Non-Line-of-Sight Identification 1

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel

Convolutional Codes. Lecture Notes 8: Trellis Codes. Example: K=3,M=2, rate 1/2 code. Figure 95: Convolutional Encoder

CS Lecture 13. More Maximum Likelihood

Symbolic Analysis as Universal Tool for Deriving Properties of Non-linear Algorithms Case study of EM Algorithm

In this chapter, we consider several graph-theoretic and probabilistic models

Multi-Scale/Multi-Resolution: Wavelet Transform

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines

Sharp Time Data Tradeoffs for Linear Inverse Problems

Pattern Recognition and Machine Learning. Artificial Neural networks

Randomized Recovery for Boolean Compressed Sensing

Decoding Algorithms for Nonbinary LDPC Codes over GF(q)

On Constant Power Water-filling

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

Fixed-to-Variable Length Distribution Matching

Intelligent Systems: Reasoning and Recognition. Artificial Neural Networks

Impact of Imperfect Channel State Information on ARQ Schemes over Rayleigh Fading Channels

Ch 12: Variations on Backpropagation

Status of Knowledge on Non-Binary LDPC Decoders

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

Error Exponents in Asynchronous Communication

A remark on a success rate model for DPA and CPA

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics

COS 424: Interacting with Data. Written Exercises

The Transactional Nature of Quantum Information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

Combining Classifiers

Polygonal Designs: Existence and Construction

Pattern Recognition and Machine Learning. Artificial Neural networks

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Figure 1: Equivalent electric (RC) circuit of a neurons membrane

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

3.8 Three Types of Convergence

Boosting with log-loss

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Lec 05 Arithmetic Coding

The Weierstrass Approximation Theorem

lecture 36: Linear Multistep Mehods: Zero Stability

NBN Algorithm Introduction Computational Fundamentals. Bogdan M. Wilamoswki Auburn University. Hao Yu Auburn University

Topic 5a Introduction to Curve Fitting & Linear Regression

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

ARTICLE IN PRESS. Murat Hüsnü Sazlı a,,canişık b. Syracuse, NY 13244, USA

Data-Driven Imaging in Anisotropic Media

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Ph 20.3 Numerical Solution of Ordinary Differential Equations

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models

Elliptic Curve Scalar Point Multiplication Algorithm Using Radix-4 Booth s Algorithm

Physically Based Modeling CS Notes Spring 1997 Particle Collision and Contact

Optimal Resource Allocation in Multicast Device-to-Device Communications Underlaying LTE Networks

EE5900 Spring Lecture 4 IC interconnect modeling methods Zhuo Feng

Analyzing Simulation Results

Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Time-Varying Jamming Links

List Scheduling and LPT Oliver Braun (09/05/2017)

The accelerated expansion of the universe is explained by quantum field theory.

SPECTRUM sensing is a core concept of cognitive radio

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Optical Properties of Plasmas of High-Z Elements

Homework 3 Solutions CSE 101 Summer 2017

An Adaptive UKF Algorithm for the State and Parameter Estimations of a Mobile Robot

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

Necessity of low effective dimension

P016 Toward Gauss-Newton and Exact Newton Optimization for Full Waveform Inversion

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013).

Automated Frequency Domain Decomposition for Operational Modal Analysis

Effective joint probabilistic data association using maximum a posteriori estimates of target states

Testing equality of variances for multiple univariate normal populations

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Statistical Logic Cell Delay Analysis Using a Current-based Model

On the Use of A Priori Information for Sparse Signal Approximations

A method to determine relative stroke detection efficiencies from multiplicity distributions

A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine. (1900 words)

REDUCTION OF FINITE ELEMENT MODELS BY PARAMETER IDENTIFICATION

On the Design of MIMO-NOMA Downlink and Uplink Transmission

ZISC Neural Network Base Indicator for Classification Complexity Estimation

When Short Runs Beat Long Runs

Upper and Lower Bounds on the Capacity of Wireless Optical Intensity Channels

Kinematics and dynamics, a computational approach

An Approximate Model for the Theoretical Prediction of the Velocity Increase in the Intermediate Ballistics Period

Distributed Subgradient Methods for Multi-agent Optimization

Sparse beamforming in peer-to-peer relay networks Yunshan Hou a, Zhijuan Qi b, Jianhua Chenc

Research Article Rapidly-Converging Series Representations of a Mutual-Information Integral

Detection and Estimation Theory

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Kinetic Theory of Gases: Elementary Ideas

Transcription:

Low-coplexity, Low-eory EMS algorith for non-binary LDPC codes Adrian Voicila,David Declercq, François Verdier ETIS ENSEA/CP/CNRS MR-85 954 Cergy-Pontoise, (France) Marc Fossorier Dept. Electrical Engineering niv. Hawaii at Manoa Honolulu, HI 968, (SA), Pascal rard STMicroelectronics Crolles, (France) Abstract In this paper, we propose a new ipleentation of the EMS decoder for non binary LDPC codes presented in [7]. A particularity of the new algorith is that it takes into accounts the eory proble of the non binary LDPC decoders, together with a significant coplexity reduction per decoding iteration. The key feature of our decoder is to truncate the vector essages of the decoder to a liited nuber n of values in order to reduce the eory requireents. sing the truncated essages, we propose an efficient ipleentation of the EMS decoder which reduces the order of coplexity to O(n log n ), which starts to be reasonable enough to copete with binary decoders. The perforance of the low coplexity algorith with proper copensation are quite good with respect to the iportant coplexity reduction, which is shown both with a siulated density evolution approach and actual FER siulations. I. INTRODCTION It is now well known that binary low density parity check (LDPC) codes achieve rates close to the channel capacity for very long codeword lengths, and ore and ore LDPC solutions are proposed in standards (DVB, WIMAX, etc). In ters of perforance, binary LDPC codes start to show their weaknesses when the code size is sall or oderate, and higher order odulation is used for transission. For these cases, non binary LDPC (NB-LDPC) codes designed in high order Galois fields GF(q) have shown great interest [], [], [], [4]. However, the perforance gain brought by using LDPC codes over GF(q) coes together with a significant increase of the decoding coplexity. NB-LDPC codes are decoded with essage passing algoriths as the belief propagation (BP) decoder, but the size of the essages varies in the order q of the field. Therefore, a straightforward ipleentation of the BP decoder has coplexity in O(q ). A Fourier doain ipleentation of the BP is possible like in the binary case, reducing the coplexity to O(q log q), but this ipleentation is only convenient for essages expressed in the probability doain. In [7], the authors have proposed a decoding algorith, called extended in-su (EMS) algorith, which uses only a liited nuber n of reliabilities in the essages at the input of the check node in order to reduce the coputational burden of the check node update. With n q, the coplexity of a check node varies in O(n.q), using LDR representations of the essages, without sacrificing too uch in perforance, even for high order field codes, up to GF(56). This coplexity is still very high and one should try to reduce it even ore in order for NB-LDPC codes to copete with binary codes in ters of perforance/coplexity trade off. Moreover, the larger aount of eory required to store the essages of size q is an issue that has not been addressed in [7]. In this paper, we propose several iproveents of the EMS algorith that allows us to reduce both the eory and the coplexity of the decoder. We keep the basic idea of using only n q values for the coputation of essages, but we extend the principle to all the essages in the Tanner graph, that is both at the check nodes and the data nodes input. Moreover, we propose to store only n reliabilities instead of q in each essage. The truncation of essages fro q to n values has to be done in an efficient way in order to reduce its ipact on the perforance of the code. The technique that we propose is described in details in section III, together with an efficient offset correction to copensate for the perforance loss. sing the truncated essages representation, and a recursive ipleentation of the check node update, we propose a new ipleentation of the EMS decoder whose coplexity is doinated by O(n log n ), with n q, which is an iportant coplexity reduction copared to all existing ethods [5], [6], [7]. Our new algorith is depicted in section IV and a study of its coplexity/perforance trade off is ade in section V. We conclude the paper by siulation results that deonstrate that our low coplexity decoder still perfors very close to the BP decoder that we use as benchark. II. PRELIMINARIES A non binary LDPC code is defined by a very sparse rando parity check atrix H whose coponents belong to a finite field GF(q). Decoding algoriths of LDPC codes are iterative essage passing decoders based on a factor (or Tanner) graph representation of the atrix H. In general, an LDPC code has a factor graph consisting of N variable nodes with various connexion degrees and M parity check nodes with also various degrees. To siplify our notations, we will only present the decoder equations for isolated nodes with given degrees, and we denote d v the degree of the sybol node and d c the degree of the check node. In order to apply the decoder to irregular LDPC codes, siply let d v (resp. d c ) vary with the sybol (resp. check) index. A single parity check equation involving d c variable nodes (codeword sybols) c t is of the for: d c t= h t c t = in GF(q) ()

where h t is a nonzero value of the parity atrix H. An iportant difference between non binary and binary LDPC decoders is that the forer s essages that circulate on the factor graph are ultidiensional vectors, rather than scalar values. Like the binary decoders, however, there are two possible representations for the essages : probability weights vectors or log-density-ratio (LDR) vectors. The use of the LDR for for essages has been advised by any authors who proposed practical LDPC decoders. Indeed, LDR values which represent real reliability easures on the bits or the sybols are less sensitive to quantization errors due to finite precision coding of the essages [8]. Moreover, LDR easures operate in the logarith doain, which avoids coplicated operations (in ters of hardware ipleentation) like ultiplications or divisions. The following notation will be used for a LDR vector of a rando variable z GF (q): L(z) =[L[]...L[q ]] T Fig.. code where L[i] = log P (z = α i) () P (z = α ) with P (z = α i ) the probability that the rando variable z takes on the values α i GF (q) and L[] =, L[i] R. For exaple, the log-likelihood-ratio (LLR) vector essage at the channel output is denoted L ch = {L ch [k] k={,...,q } } defined by q- ters of the type (), the values of the probability weights P (z = α i ) depending on the transission channel statistics. The decoding algorith that we propose is independent of the channel, and we just assue that a deodulator provides a LLR vector L ch that initializes the decoder (the BI-AWGN channel is used in our siulations, seesectionvi). The GF(q)-LDPC iterative decoding algoriths are charactern v v v V p v h v h v v p h v p c n V c p n p c n v p Factor graph structure of a parity check node for a non-binary LDPC ized by three ain steps corresponding to the different nodes depicted in figure Fig. : (i) the variable node update, (ii) the perutation of the essages due to nonzeros values in the atrix H and (iii) the check node update which is the bottleneck of the decoder coplexity since the BP operation at the check node is a convolution of the input essages, which akes the coputational coplexity grow in O(q ). We use the following notations for the essages in the graph (see Fig. ). Let {V piv} i={,...,dv } be the set of essages entering a variable node of degree d v, and { vpi } i={,...,dv } be the output essages for this variable node. The index pv indicates that the essage coes fro a perutation node to a variable node, and vp is for the other direction. We define siilarly the essages { pic} i={,...,dc } ({V cpi } i={,...,dc }) at the input (output) of a check node. In [7], the EMS algorith reduces the coplexity of the check node update by considering only the n largest values of the essages at the input of the check node. However, the output essages of the check node are still coposed of q values. As a consequence, the EMS coplexity of a single parity check node varies in O(n.q) and all essages in the graph are stored with their full representation of q real values, which iplies a high eory storage coplexity. In this paper, we present a new ipleentation of the EMS algorith whose ain originality is to store exactly n q values in all vector essages vp, V cp. This has the advantage of reducing the eory requireents, but also to further reduce the coputational coplexity with a proper algorith which is described in details in section IV. Let us first present the way we truncate the essages fro q to n values and discuss its ipact on the error correction perforance of the decoder. III. STRCTRE AND COMPENSATION OF THE TRNCATED MESSAGES The vector essages V cp and vp are now liited to only n entries which are assued to be the largest reliability values of the corresponding rando variable. Moreover, the values in a essage are sorted in decreasing order. That way, V cp [] is the axiu value and V cp [n ] is the iniu value in V cp. We need to associate to the vectors V cp, vp of the additional vectors β Vcp and β vp (of size n ) which store the eleents α k GF (q), associated to the largest LDR values of vectors V cp and vp. For exaple, vp [k] is the LDR value that corresponds to the sybol value β vp [k] GF (q). Although interesting in ters of eory and coputation reduction, the truncation of essages obviously looses potentially valuable inforation which leads to perforance degradation on the error rate curves. This loss of perforance could be itigated by using a proper copensation of the inforation that has been truncated. Because our ain concern is the developent of low coplexity decoders, we have chosen to copensate the q n truncated values with a single scalar value γ, which is the siplest odel one can use. The following definition is used for a copensated essage: Definition Let A be any essage in the graph which represents a LDR vector of size q. A truncated version B of A is coposed of the n largest values of A sorted in decreasing order, plus an additional (n +)-th value γ A, whose goal is to copensate for the inforation loss due to the truncation of q n values. The copensated-truncated essage B has then (n +) coponents, and the value γ A is seen as a constant real value that replaces the q n issing reliabilities. A full representation of the truncated essage B would then be: B =[B[],...,B[n ],γ A,...,γ A ] T

This eans in particular that γ A B[n ]. Let us first analyze a possible solution to copute the value of γ A using noralization of probability essages. We consider P A the probability doain representation of the LDR vector A P A [k] =P (z = α k )=P A []e A[k] k = {,...,q } and let P B be the vector of with values P B [k] =P (z = β B [k]) = P A []e B[k] k = {...,n } Reeber that A is unsorted while B is sorted, which explains the difference in these two definitions. Because P A is a probability weight vector, we have: q P A [k] = k= n k= P B [k] < () A clever way to fix a good value on the scalar copensation γ A is to assue that the truncated essage should represent a probability weight vector with a su equal to one, so that n j= P B [j]+(q n )P γa =is satisfied. With P γa the probability weight associated, with LDR value γ A, that is P γa = P A []e γa. The noralization of vector P B is then: n (q n )P γa = P A [] P γa P A [] = log P γ A P A [] = log q and finally γ A = log i= j= P A[] n j= e B[j] q n n e A[i] q i=,a[i] / B j= e A[i] e B[j] e B[j] log(q n ) log(q n ) (4) As a first reark we note that the coputation of the additional ter requires the q n ignored values of vector A, and the coputation of a non linear function. The non linear function can be expressed in ters of the ax (x,x ) operator, used in any papers [6], and in order to siplify the equation (4), we approxiate this operator by: ax (x,x ) = log (e x + e x ) ax(x,x ) (5) Equation (4) becoes: γ A = ax i=,a[i] / B (A[i]) log(q n ) ax ) i=,a[i] / B B[n ] log(q n ) (6) where B[n ] is the largest value aong the (q n ) ignored values of vector A. sing the approxiation (6) we obtain a siple coputational forula for the suppleentary ter γ A, since we just need to truncate the LDR vector A with its (n +) largest values instead of its n largest values. On the other hand, this approxiation introduces a degradation of the error perforance of the decoder. The approxiation (5) is well known to overestiate the values of the LDR essages [9]. In principle, the copensation of the over-estiation should be different for each essage since the accuracy of approxiation (5) depends on the values it is applied to. An adaptive copensation would be obviously too coplicated with regards to our goal of proposing a low coplexity algorith. We have then chosen to copensate globally the over-estiation of the additional ter γ A with a single scalar offset, constant for all essages in the graph and also constant for all decoding iterations: γ A = B[n ] ln(q n ) offset = B[n ] Offset (7) There are several ways of optiizing the value of a global offset correction in essage passing decoders. We have chosen to follow the technique proposed in [7], which consists of iniizing the decoding threshold of the LDPC code, coputed with siulated density evolution. Because of the lack of space, we do not discuss in this paper the optiization of the global offset, and we recall that estiated density evolution is just used as a criterion to choose the correction factor and not to copute accurate thresholds. IV. DESCRIPTION OF THE ALGORITHM A. Decoding steps with essages of q We now present the steps of the EMS decoder that uses copensated-truncated essages of. We assue that the LLR vectors of the received sybols are known at the variable nodes, either stored in an external eory or coputed on the fly fro the channel easureents. sing the notations of Fig., the basic steps of the algorith are: ) Initialization: the n largest values of the LLR vectors are copied in the graph on the { vpi } i {,...,dv } essages. ) Variable-node update: the output vector essage { vpi } i {,...,dv } associated to a variable node v passed to a check node c is coputed given all the inforation propagated fro all adjacent check nodes and the channel, except this check node itself. ) Perutation step: this step perutes the essages according to the nonzero values of H (see ()). In our algorith, it just odifies the indices vectors and not the essage values: β pi c [k] =h i.β vpi [k] k {,...,n } (8) where the ultiplication is perfored in GF (q). 4) Check-node update: for each check node, the values {V cpi [k]} i {,...,dc },k {,...,n } sent fro check a node to a perutation node are defined as the probabilities (expressed in LDR forat) that the parity-check equation is satisfied if the variable node v is assued to be equal to β Vpi v [k]. 5) Inverse perutation step: this is the perutation step fro check nodes to sybol nodes, so it is identical to step ), but in the reverse order.

For steps () and (4) a recursive ipleentation (Fig.(a)) iniizes the nuber of eleentary updates needed for both check node and variable node updates. Lets put first soe notations on the recursive ipleentation, whose principle is to decopose the parity check and variable node into several eleentary steps (Fig.(b)). One eleentary step assues only two input essages and one output essage. The decoposition of the check nodes (variable nodes) with degree d c 4 (d v ) iplies therefore the coputation of interediate essages denoted I (Fig.(a)), which are assued to be stored also with n values. The output V cpi ( vpi ) of a check node (variable node) is then - in ost cases - coputed fro the cobination of an input essage (respectively V) and an interediate essage I. Backward Forward p c p c I V (b) Vc p V c p p c p c I F I B p c I F V cp I B p c p 4 c p c 5 h v h v h v h v h v 4 5 4 5 (a) p c 4 V cp 5 p c 5 V cp 4 Fig.. The recursive structure of a degree d c =5check-node (a); The eleentary step (b) B. Variable node: eleentary step Let assue that an eleentary step describing the variable node update has V and I as input essages and as output essage. The vectors V, I and of are sorted in decreasing order. We note also by β V, β I and β their associated index vectors. sing the BP equations in the log-doain for the variable node update [6], the goal of an eleentary step is to copute the output vector containing the n largest values aong the n candidates (9) (stored in an internal vector essage T). The processing of the eleentary step in the case of a variable node update is described by: T [k] =V [k]+y T[n +k] =γ V +I[k] k {,...,n } (9) with { I[l] if βi [l] =β Y = V [k] γ I if β I [l] / β V k, l {,...,n } The copensation value γ is used when the required sybol index is not present in an input essage. Whenever the V input corresponds to the LLR channel vector, the γ V ter (9) is substituted by L ch [β I [k]]. C. Low coplexity ipleentation of a check node eleentary step The this section describes in details the algorith that we propose for an eleentary coponent of the check node (Fig.(b)). This step is the bottleneck of the algorith coplexity and we discuss its ipleentation in details in the rest of the paper. The check node eleentary step has and I as input essages and V as output essage. All these vectors are of are sorted in decreasing order. Siilar to the variable node update, we note also by β, β I and β V their associated index vectors. Following the EMS algorith presented in [7], we define S(β V [i]) as the set of all the possible sybol cobinations which satisfy the parity equation β V [i] β [j] β I [p] =. With these notations, the output essage values are obtained with: V [i] = ax ([j]+i[p]) i {,...,n } () S(β V [i]) Just as in the variable node update (step () in the previous section), when a required index is not present in the truncated vector or I, its copensated value γ is used in equation (). Without a particular strategy, the coputation coplexity of an eleentary step is doinated by O(n ). We propose a low coputational strategy to ski the two sorted vectors and I, that provide a iniu nuber of operations to process the n sorted values of the output vector V. The ain coponent of our algorith is a sorter of size n, which is used to fill the output essage. For the clarity of presentation, we use a virtual atrix M built fro the vectors and I (cf. Fig.), each eleent of M being of the for M[i, p] =[j]+i[p]. This atrix contains the n candidates to update the output vector V. The goal of our algorith is to explore in a efficient way M in order to copute iteratively its n largest values, using the fact that M is build fro sorted essages. For instance, we reark that the n largest values of M are located in the upper part of the anti diagonal of the atrix. The basic steps of the algorith are: j I Fig.. p M SORTER S[] Diagra of the low coplexity algorith ) Initialization: the values of the first colun are introduced in the sorter. ) Output: the largest value is coputed. ) Test: does the associated GF(q) index of the output value already exists in the output vector. Yes: no action No: the value is oved in the vector V 4) Evolution: The right neighbor - with regard to the M atrix - of the filled value is introduced in the sorter. 5) Go to step In order to insure that all values of the output vector V correspond to different sybols α V GF (q), we can not stop the algorith after only n steps, because it is possible that V i

aong the coputed values after n steps, two or ore can correspond to the sae α V. Let us define K as the nuber of necessary steps so that all the n values of the output vector are coputed. The paraeter K is used to indicate the coputational coplexity of our new EMS ipleentation. We note that K [n, n ]. Of course, the value of K depends on the LDR that copose the input vectors and I, and a strictly valid ipleentation of the eleentary step should take into account the possibility of the worst case. However, we have found that K is ost of the tie quite sall. As an exaple, Fig. 4 shows the discrete epirical probability density function of K for a regular GF(56)-LDPC code, n =and a signal to noise ratio in the waterfall region of the code. As a atter of fact, the distribution of K has an exponential p(x).8.7.6.5.4.....5..5 5 4 45 5 4 45 5 55 6 65 x Fig. 4. Discrete probability density function of K shape and decreases very rapidly, e.g. prob(k n +4)=.986. Based on this observation, it sees natural to consider that the bad situations with large K are sufficiently rare so that they do not really ipact on the decoder perforance. We have verified this clai by siulations of density evolution and found that using K ax = n does not change the value of the decoding threshold for various LDPC code paraeters. Note that with K ax = n, soeties the output vector V could be filled with less than n values and in those cases, we fill the rest of the vector with a constant value equal to the additional ter γ V. The worst case scenario for the coplexity of an eleentary step is then to O(K ax log n )=O(n log n ), which corresponds to the nuber of ax operations needed to insert K ax eleents into a sorted list of. In the next section, we study in details the coplexity of our new ipleentation of the EMS. V. COMPLEXITY AND MEMORY EVALATION OF THE ALGORITHM The coputational coplexity of a single parity node and a single variable node are indicated in table I in ters of their connexion degree d c (resp. d v ). This coplexity applies both for regular and irregular non binary LDPC codes, the local value of the connexion degree following the connectivity profile of the code. This coplexity assues the use of truncated essages of TABLE I COMPTATIONAL COMPLEXITY OF THE MESSAGE PDATES WITH THE EMS ALGORITHM AND MESSAGES OF SIZE n Coplexity check node No. ax (d c )K ax log n No. real add (d c )(K ax + n ) No. add over GF(q) (d c )(K ax + n ) Coplexity variable node No. ax (d v +(d v ))n log (n ) No. real add (d v +(d v ))n, even for the channel LLR vector L ch and the ipleentation of the check node update presented in this paper. Note that we indicated the worst case coplexity for the check node with K = K ax and that the average coplexity is often less than that. The coplexity associated with the update of vectors at the variable node output is obtained with a recursive ipleentation of the variable node, which is used only for connexion degrees d v. As a results, the coplexity of our decoding algorith is doinated by O(n log (n )) in the case K ax = n for both parity and variable nodes coputation. Interestingly, the coplexity of a check node respectively of a variable node are soewhat balanced, which is a nice property that should help an efficient hardware ipleentation based on a generic processor odel. Moreover, one can reark that the coplexity of the decoder does not depend on q, the order of the field in which the code is considered. Let us again stress the fact that the coplexity of our decoder varies in the order of O(n log (n )) and with n q, which is a great coputational reduction copared to existing solutions [5], [6], [7]. The eory space requireents of the decoder is coposed by two independent eory coponents, that correspond to the channel essages L ch and to the extrinsic essages, V with their associated index vectors β. Storing each LDR value on Nbits bits in finite precision would therefore require a total nuber of n N d v (Nbits + log q) bits. So, the eory storage depends linearly on n, which was the initial constraint that we put on the essages. Since n is the key paraeter of our algorith that tune the coplexity and the eory of the decoder, we now need to study for which values of n the perforance loss is sall or negligible. In order to give a first answer to this question, we have ade an asyptotic threshold analysis of the ipact of n on the threshold value. For a rate R =.5 LDPC code with paraeters (d v =,d c =4), Fig.5 plot the estiated threshold in (E b /N ) db of our algorith for different values of n and two different field orders GF(64) and GF(56). As expected, the thresholds becoe better as n increases, and can approach closely the threshold of BP with uch less coplexity. The BP thresholds are equal to δ =.58dB for the GF(64) code and δ =.5dB for the GF(56) code [7]. We can use the plots on Fig.5 as first indication for choosing the field order of the LDPC code that corresponds to a given coplexity/perforance trade off. Note however

Coplexity / Elentary step 8 6 4 n =64 n = n =48 n =4 n = n =6 n =4 n = GF(64) GF(56) n =6 n =8.55.6.65.7.75.8.85.9.95.5 Threshold Value (db) Fig. 5. Estiated decoding threshold vs. Coplexity that this asyptotic study has to be balanced with the girth properties of finite length codes, since it has been identified in [], [] that ultra-sparse LDPC codes in high order fields and with high girth have excellent perforance. VI. EXPERIMENTAL RESLTS In this section, we present the siulation results of our low coplexity EMS algorith, copared with the BP algorith. In order to ake a fair coparison, floating point ipleentation have been used for both algoriths, and we will discuss in a future paper the effect of quantization on the EMS decoder perforance. We focus on a regular GF(q)-LDPC code over high order fields, of rate R =/ (d v =,d c =4). In Fig.6, we have reported the frae error rate (FER) of short code of length N b = 848 equivalent bits, at convergence. We denote by EMS n GF(q) the EMS decoder over the field GF(q) the waterfall region and perfors even better than the BP decoder in the error floor region. This behavior is now well known in the literature and coes fro the fact that for sall code lengths, an EMS algorith corrected by an offset is less sensitive to pseudo-codewords than the BP. Note that with this exaple, there is no advantage of using a GF(56) code in ters of perforance/coplexity trade off since EMS GF(64) has better perforance than EMS GF(56) for all considered (E b /N ), although they share the sae coplexity. Our low coplexity algorith is however quite robust since the coplexity reduction fro q = 56 to n =is a lot higher than fro q =64to n =, which was a drawback of the solutions proposed in [5], [6]. The sae kind of behavior have been observed for other code and decoder paraeters. VII. CONCLSION We have presented in this paper a general low coplexity decoding algorith for non binary LDPC codes, using log-density-ratio as essages. The ain originality of the proposed algorith is to truncate the vector essages to a fixed nuber of values n q, in order to solve the coplexity proble and to reduce the eory requireents of the non binary LDPC decoders. We have also shown that by using a correction ethod for the essages, our EMS decoding algorith can approach the perforance of the BP decoder and even in sae cases beat the BP decoder. The coplexity of the proposed algorith is doinated by O(n log (n )). For values of n providing near-bp error perforance, this coplexity is saller than the coplexity of the BP-FFT decoder. The proposed low coplexity, low eory EMS decoding algorith then becoes a good candidate for the hardware ipleentation of non binary LDPC decoders. Since its coplexity and its eory space has been greatly reduced copared to the other suboptial decoding algoriths for the non binary LDPC codes and the perforance degradation is sall or negligible. FER 4 5 6 7 EMS GF(64) n=6 BP GF(64) BP GF(56) EMS GF(56) n= EMS GF(64) n= 8..4.6.8..4.6.8 E b/n (in db) Fig. 6. Coparison between BP and EMS decoding algoriths, floating point ipleentation with paraeter n. Let us first copare our low coplexity decoder to the BP decoder. For the code over GF(64), the EMS 6 GF(64) is the less coplex algorith presented. It perfors within.5db of the BP decoder in the waterfall region. The EMS GF(64) algorith has.5db perforance loss in REFERENCES [] M. Davey and D.J.C. MacKay, Low Density Parity Check Codes over GF(q), IEEE Coun. Lett., vol., pp. 65-67, June 998. [] X.-Y. Hu and E. Eleftheriou, Binary Representation of Cycle Tanner- Graph GF( q ) Codes, The Proc. IEEE Intern. Conf. on Coun., Paris, France, pp. 58-5, June 4. [] C. Poulliat,M. Fossorier and D. Declercq, Design of non binary LDPC codes using their binary iage: algebraic properties, ISIT 6, Seattle, SA, July 6. [4] A. Bennatan and David Burshtein, Design and Analysis of Nonbinary LDPC Codes for Arbitrary Discrete-Meoryless Channels, IEEE Trans. on Infor. Theory, vol. 5, no., pp. 549-58, Feb. 6. [5] H. Song and J.R. Cruz, Reduced-Coplexity Decoding of Q-ary LDPC Codes for Magnetic Recording, IEEE Trans. Magn., vol. 9, pp. 8-87, Mar.. [6] H. Wyeersch, H. Steenda and M. Moeneclaey, Log-Doain Decoding of LDPC Codes over GF(q), The Proc. IEEE Intern. Conf. on Coun., Paris, France, June 4, pp. 77-776. [7] D. Declercq and M. Fossorier, Decoding Algoriths for Nonbinary LDPC Codes over GF(q), to appear in IEEE Trans. on Coun., 7. [8] L. Ping and W.K. Leung, Decoding low density parity check codes with finite quantization bits, IEEE Coun. Lett., pp.6-64, February. [9] J. Chen and M. Fossorier, Density Evolution for Two Iproved BP- Based Decoding Algoriths of LDPC Codes, IEEE Coun. Lett., vol. 6, pp. 8-, May.