Advanced Hardware Architecture for Soft Decoding Reed-Solomon Codes

Stefan Scholl, Norbert Wehn
Microelectronic Systems Design Research Group, TU Kaiserslautern, Germany

Overview
Soft decoding for the RS(255,239) code
New hardware architecture
Goal: large FER gain over hard decision decoding
Algorithm based on information set decoding
Complexity evaluation on a Virtex 5 FPGA

Motivation
RS / BCH decoder hardware
Application areas: wireless, wired, storage
Examples: VDSL, NASA / CCSDS, optical (G.709)
Widely used code: RS(255,239) or its shortened versions

Decoding Algorithms for Reed-Solomon
Hard Decoding
  Algorithm: algebraic decoding (standard method)
  Complexity very low: first chip implementations in the 1970s/80s
  Progress in microelectronics allows for more complexity today!
Soft Decoding
  Improved error correction
  Possible gain: up to 3 dB (depends on length and code rate)
  Algorithms: Chase decoding, information set decoding, adaptive belief propagation, Kötter-Vardy
We consider the widely used RS(255,239), but RS(255,239) seems to be challenging.

State-of-the-art Soft Decoder Hardware
Real and complete hardware implementations for RS(255,239)
Paper                          Year  Algorithm             Gain (over HDD)
Low gain hardware (< 0.5 dB)
An (PhD thesis, MIT)           2010  Low complexity Chase  0.45 dB
Hsu et al. (ESSCIRC)           2011  Chase                 0.35 dB
Garcia-Herrero et al. (CSSP)   2011  Low complexity Chase  0.3 dB
Medium gain hardware (0.5 to 1 dB)
Kan et al. (ISTC)              2008  Adaptive BP           0.75 dB
Heloir et al. (NEWCAS)         2012  Stochastic Chase      0.7 dB
Scholl et al. (DATE)           2014  Information set       0.75 dB

State-of-the-art Hardware Implementations
Hard decision decoding (baseline)
Low gain: < 0.5 dB
Medium gain: 0.5 to 1 dB
High gain: > 1 dB. Literature shows that up to 2 dB gain should be possible. Not yet investigated!

Implemented Algorithm*
Information set decoding approach
[Figure: received bits ordered from most reliable to least reliable; binary image of the parity check matrix H, diagonalized by Gaussian elimination; syndrome]
Syndrome weight:
  Small: only errors in the least reliable part
  Large: at least one error in the most reliable part
Order 1 processing: tentatively flip each most reliable bit (here: 1912)
Order 2 processing: tentatively flip all combinations of 2 most reliable bits (~2 million cases)
Can be seen as a low complexity variant of ordered-statistics decoding
*A. Ahmed, R. Koetter, and N. R. Shanbhag. Performance analysis of the adaptive parity check matrix based soft-decision decoding algorithm, 2004.
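The syndrome-weight test and order 1 processing can be illustrated in software. The following is a minimal sketch on a toy (7,4) Hamming code, not on RS(255,239). The function names, the assumption that the columns are already sorted so that the last n-k columns (the least reliable positions) form the identity, and the selection rule (lowest residual syndrome weight) are choices made for this illustration and are not taken from the slides.

```python
from typing import List, Optional, Tuple

def syndrome(H: List[List[int]], r: List[int]) -> List[int]:
    """Binary syndrome H * r^T over GF(2)."""
    return [sum(h & b for h, b in zip(row, r)) % 2 for row in H]

def order1_decode(H: List[List[int]], r: List[int]) -> Tuple[List[int], Optional[int], int]:
    """Order 0 plus order 1 processing on a diagonalized matrix.

    Assumption: the last m columns of H are the identity and belong to
    the least reliable bits; the first k columns are the most reliable bits.
    Returns (candidate codeword, flipped most reliable bit or None, residual weight).
    """
    m, n = len(H), len(H[0])
    k = n - m
    s = syndrome(H, r)
    best_s, best_flip, best_weight = s, None, sum(s)   # order 0: no flip
    for j in range(k):                                 # order 1: flip bit j
        s_j = [s[i] ^ H[i][j] for i in range(m)]       # syndrome after the flip
        if sum(s_j) < best_weight:
            best_s, best_flip, best_weight = s_j, j, sum(s_j)
    c = list(r)
    if best_flip is not None:
        c[best_flip] ^= 1
    for i in range(m):          # residual syndrome is the error pattern
        c[k + i] ^= best_s[i]   # in the least reliable (identity) part
    return c, best_flip, best_weight

# Toy (7,4) Hamming code in the form [P | I], all-zero codeword sent.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
received = [0, 0, 0, 1, 0, 0, 0]   # one error in a most reliable position
decoded, flipped, weight = order1_decode(H, received)
```

In this toy case the plain syndrome has weight 3, which hints at an error in the most reliable part; flipping bit 3 leaves a residual syndrome of weight 0, so that candidate is selected.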

Algorithm Improvements
We add further features for improvement (mostly from other literature):
Use a hard decision decoder (counters potential error floor)
Use three differently diagonalized parity check matrices (improves FER)
  Partial overlapping of diagonalized parts allows for a sophisticated architecture (complexity reduction)
Restrict order 2 processing to fairly reliable bits (250 out of 1912)
  Need to determine an additional group: fairly reliable (besides least and most reliable)
  Large reduction in the number of order 2 cases (by a factor of about 60)
Use approximate reliability sorting to enable parallelization (higher speed)
Overall loss due to complexity reduction: < 0.1 dB
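The case counts quoted on the slides can be checked with a few lines of arithmetic. This sketch assumes that order 2 processing counts unordered pairs of bits and that, in the restricted variant, both bits are drawn from the 250 fairly reliable positions; the variable names are illustrative.

```python
from math import comb

n_bits = 255 * 8                      # binary image of RS(255,239): 2040 bits
parity_bits = (255 - 239) * 8         # 128 parity bits = least reliable part
most_reliable = n_bits - parity_bits  # 1912 bits outside the diagonalized part

order2_full = comb(most_reliable, 2)  # all pairs of most reliable bits
order2_restricted = comb(250, 2)      # pairs among 250 fairly reliable bits
reduction = order2_full / order2_restricted

print(most_reliable, order2_full, order2_restricted, round(reduction, 1))
```

Under these assumptions the full order 2 search covers about 1.8 million pairs and the restricted search about 31 thousand, a reduction by a factor of roughly 59, in line with the "~2 million" and "factor 60" figures.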

Our New Hardware Architecture
Input: 2040 bit LLRs, 8 in parallel
Quantization: 6 bits
Output: 2040 bits (hard output), 8 in parallel
Implementation on Virtex 5 FPGA

Our Hardware Architecture: Sorting
Finds low and fairly reliable bits
Finds the 378 lowest out of 2040 LLRs
Shift register based insertion sort
8 sorters in parallel (approximate sorting)
Stores bit positions in four memories
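A behavioral model of a shift register based insertion sorter that keeps only the k lowest values might look as follows. It is a sketch under assumptions: the function name is hypothetical, the reliabilities are taken to be LLR magnitudes, a single sorter is modeled, and the 8-way parallel approximate sorting of the actual design is not reproduced.

```python
from typing import List, Sequence, Tuple

def k_lowest_positions(magnitudes: Sequence[int], k: int) -> List[int]:
    """Return the positions of the k smallest values, smallest first.

    The list `regs` models a chain of k registers kept in ascending order.
    Hardware would compare the incoming value with all registers in
    parallel; the loop below does the same comparison sequentially.
    """
    regs: List[Tuple[int, int]] = []          # (magnitude, bit position)
    for pos, mag in enumerate(magnitudes):
        insert_at = len(regs)
        for i, (stored, _) in enumerate(regs):
            if mag < stored:                  # first register holding a larger value
                insert_at = i
                break
        if insert_at < k:
            regs.insert(insert_at, (mag, pos))  # larger entries shift by one
            if len(regs) > k:
                regs.pop()                      # largest value drops out of the chain
    return [pos for _, pos in regs]

lowest = k_lowest_positions([5, 1, 4, 2, 3], k=3)
```

For the decoder on the slides the corresponding call would use 2040 input values and k = 378, which matches 128 least reliable plus 250 fairly reliable positions.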

Our Hardware Architecture: Gaussian Elimination / Diagonalization
Original matrix stored in memory
Diagonalization on the fly
Diagonalization column-wise
2 phases: setup and elimination
Saves ~70% hardware over state-of-the-art diagonalizations (e.g. systolic arrays)
Three diagonalizations: exploit overlapping
[Figure: pipelined array eliminator. Columns of the original matrix enter, pass adder stages (+) with fixed pivot positions (P), and leave as columns of the eliminated matrix.]

Our Hardware Architecture: Correction Unit
Performs order 1 and order 2 processing
Parallelized order 2 processing
In 1 clock cycle: 1x order 1 and 6x order 2
3 instances (for 3 matrices)
Selects best results for output

Our Hardware Architecture: Syndrome Calculation
Required: syndrome of the diagonalized matrix
Strategy:
  First: calculate the syndrome using the original matrix
  Second: diagonalize the syndrome in the Gaussian elimination
Advantage: allows use of Galois field operations (much faster)
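The reason this strategy works is that row operations applied to H and to the syndrome together keep them consistent: if H' = T H, then T s is the syndrome of the same received word with respect to H'. The sketch below demonstrates this on a toy 3 x 7 binary matrix by carrying the syndrome as an extra column through a row-wise elimination. It is a software illustration with hypothetical function names; the original-matrix syndrome is computed here by a plain binary matrix product rather than by the Galois field operations mentioned on the slide.

```python
from typing import List, Sequence, Tuple

def syndrome(H: Sequence[Sequence[int]], r: Sequence[int]) -> List[int]:
    """Binary syndrome H * r^T over GF(2)."""
    return [sum(h & b for h, b in zip(row, r)) % 2 for row in H]

def diagonalize_with_syndrome(
    H: Sequence[Sequence[int]], s: Sequence[int], pivot_cols: Sequence[int]
) -> Tuple[List[List[int]], List[int]]:
    """Turn the pivot columns into unit columns; apply the same row ops to s."""
    rows = [list(row) + [bit] for row, bit in zip(H, s)]   # augmented [H | s]
    for p, col in enumerate(pivot_cols):
        src = next((i for i in range(p, len(rows)) if rows[i][col]), None)
        if src is None:
            raise ValueError(f"column {col} cannot serve as a pivot column")
        rows[p], rows[src] = rows[src], rows[p]
        for i in range(len(rows)):
            if i != p and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[p])]  # add row p
    return [row[:-1] for row in rows], [row[-1] for row in rows]

H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
r = [1, 0, 0, 1, 0, 1, 0]                 # arbitrary hard-decision word
s = syndrome(H, r)                        # syndrome w.r.t. the original matrix
H_diag, s_diag = diagonalize_with_syndrome(H, s, pivot_cols=[4, 5, 6])
```

After the call, `s_diag` equals the syndrome that would be obtained by multiplying `r` with `H_diag` directly, so the received word never has to be multiplied with the diagonalized matrix.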

FPGA Implementations
State-of-the-art soft decoders for RS(255,239), gain > 0.5 dB
                                Kan et al.    Scholl et al.    Heloir et al.  This work
Algorithm                       Adaptive BP   Information set  Stoch. Chase   Information set
Chip                            Stratix II    Virtex 5         Virtex 5       Virtex 5
Flipflops                       n/a           42,000           143,000        70,200
Look-up tables                  43,700        13,700           117,000        32,400
Throughput                      4 Mbit/s      800 Mbit/s       50 Mbit/s      300 Mbit/s
Communications gain over HDD    0.75 dB       0.75 dB          0.7 dB         1.3 dB
M. Kan et al. Hardware implementation of soft-decision decoding for Reed-Solomon code. In Proc. 5th Int. Turbo Codes and Related Topics Symp., 2008.
S. Scholl and N. Wehn. Hardware Implementation of a Reed-Solomon Soft Decoder based on Information Set Decoding. DATE '14, 2014.
R. Heloir, C. Leroux, S. Hemati, M. Arzel, and W. J. Gross. Stochastic Chase decoder for Reed-Solomon codes. IEEE NEWCAS, 2012.

Comparison: FER
[Figure: FER comparison plot; one curve is labeled "This work"]

Summary & Outlook
Summary
  Proposed new RS soft decoder hardware for RS(255,239)
  Based on information set decoding
  Implementation with currently best FER: gain of 1.3 dB over HDD
  New high gain architecture, besides low and medium gain
  Acceptable complexity
Future Challenges
  Improving implementation efficiency
  Architectures for the requirements of specific applications
Approach applicable to every linear code

Thank you for your attention! Questions?

Our New Binary Gaussian Elimination
Basic operation: adding rows onto other rows to form unit columns
For our hardware: two-phase approach
  1. Setup: configures addition patterns
  2. Elimination: performs actual elimination
Architecture: column-by-column processing with a pipelined array
[Figure: pipelined array. Columns from the original matrix pass adder stages (+) with fixed pivot positions (P) and leave as columns of the eliminated matrix.]
S. Scholl, C. Stumm, and N. Wehn. Hardware Implementations of Gaussian Elimination over GF(2) for Channel Decoding Algorithms. IEEE AFRICON 2013.

Comparison, 128 x 2040 matrix
Design example: Reed-Solomon (255,239) code, binary matrix size 128 x 2040
Implementation on a Xilinx FPGA chip (Virtex 7)
Architecture                  Look-up tables  Flipflops    Throughput
SMITH*                        780k*           260k*
Systolic array                82k             99k          219k matrices/s
Proposed                      17k             33k          272k matrices/s
Proposed vs. systolic array   -80% saving     -67% saving  +25% increase
* estimated
Efficient Gaussian elimination is the key for efficient soft decoding!