Decoding linear codes via systems solving: complexity issues


Decoding linear codes via systems solving: complexity issues
Stanislav Bulygin (joint work with Ruud Pellikaan)
University of Kaiserslautern
June 19, 2008

Outline of the talk
- Introduction: codes and the decoding problem
- Quadratic system method
- Complexity issues: Macaulay matrix, extended linearization, some estimates

Introduction: Codes recap

Let F_q be a field with q elements. A linear code C is a linear subspace of F_q^n endowed with the Hamming metric.
- Hamming distance between x, y ∈ F_q^n: d(x, y) = #{i : x_i ≠ y_i}.
- Hamming weight of x ∈ F_q^n: wt(x) = #{i : x_i ≠ 0}.
- Minimum distance of the code C: d(C) := min {d(x, y) : x, y ∈ C, x ≠ y}.
A code C of dimension k and minimum distance d is denoted [n, k, d]. A matrix G whose rows are the basis vectors of C is a generator matrix. A matrix H such that Hc^T = 0 for all c ∈ C is a check matrix.
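The definitions above are easy to make concrete. The following sketch (in Python, over F_2 for simplicity; purely illustrative and not part of the talk) computes Hamming distance, weight, and the minimum distance of the [7, 4, 3] Hamming code by brute force:

```python
from itertools import product

def hamming_distance(x, y):
    """d(x, y) = #{i : x_i != y_i}."""
    return sum(1 for a, b in zip(x, y) if a != b)

def hamming_weight(x):
    """wt(x) = #{i : x_i != 0}."""
    return sum(1 for a in x if a != 0)

def span_f2(generators):
    """All F_2-linear combinations of the generator rows."""
    n = len(generators[0])
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
                  for i in range(n))
            for coeffs in product([0, 1], repeat=len(generators))}

def minimum_distance(code):
    """For a linear code, d(C) is the minimum nonzero weight."""
    return min(hamming_weight(c) for c in code if any(c))

# Generator rows of the [7, 4, 3] Hamming code (one standard choice).
G = [(1, 0, 0, 0, 0, 1, 1),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1, 0),
     (0, 0, 0, 1, 1, 1, 1)]
C = span_f2(G)
print(len(C), minimum_distance(C))  # 16 codewords, minimum distance 3
```

Brute-force enumeration is only feasible for toy parameters, but it matches the definitions one-to-one.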

Introduction: Decoding problem

Complete decoding: Given y ∈ F_q^n and a code C ⊆ F_q^n, so that y is at distance d(y, C) from the code, find c ∈ C with d(y, c) = d(y, C).
Bounded decoding up to half the minimum distance: under the additional assumption d(y, C) ≤ (d(C) − 1)/2, a codeword with the above property is unique.

Introduction: Decoding via systems solving

One distinguishes between two concepts:
- Generic decoding: solve some system S(C) and obtain closed formulas F. Evaluating these formulas at data specific to a received word y should yield a solution to the decoding problem. For example, for f ∈ F: f(syndrome(y), x) = poly(x), and the roots of poly(x) = 0 yield the error positions; such an f is a general error-locator polynomial.
- Online decoding: solve some system S(C, y); its solutions should solve the decoding problem.

Computational effort:
- Generic decoding. Preprocessing: very hard. Decoding: relatively simple.
- Online decoding. Preprocessing: none. Decoding: hard.

Quadratic system method: Unknown syndrome

Let b_1, ..., b_n be a basis of F_q^n and let B be the n × n matrix with b_1, ..., b_n as rows. The unknown syndrome u(B, e) of a word e w.r.t. B is the column vector u(B, e) = Be^T with entries u_i(B, e) = b_i · e for i = 1, ..., n.

Structure constants: For two vectors x, y ∈ F_q^n define x ∗ y = (x_1 y_1, ..., x_n y_n). Then b_i ∗ b_j is a linear combination of b_1, ..., b_n, so there are constants μ_l^{ij} ∈ F_q such that b_i ∗ b_j = Σ_{l=1}^n μ_l^{ij} b_l. The elements μ_l^{ij} ∈ F_q are the structure constants of the basis b_1, ..., b_n.

MDS matrix: Let B_s be the s × n matrix with b_1, ..., b_s as rows (so B = B_n). Then b_1, ..., b_n is an ordered MDS basis and B an MDS matrix if all the s × s submatrices of B_s have rank s, for all s = 1, ..., n.
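The structure constants can be computed by expressing each coordinatewise product b_i ∗ b_j back in the basis. A minimal brute-force sketch over F_2 with an assumed toy basis (over a larger F_q one would instead solve the linear system x·B = b_i ∗ b_j directly):

```python
from itertools import product

def star(x, y):
    """Coordinatewise (Schur) product x * y over F_2."""
    return tuple(a * b % 2 for a, b in zip(x, y))

# An invertible basis of F_2^3 (toy choice; not an MDS basis,
# but structure constants exist for any basis).
B = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]

def coords(v):
    """Coefficients of v in the basis B, found by exhaustive search
    over the 2^3 coefficient vectors (unique since B is invertible)."""
    n = len(B)
    for c in product([0, 1], repeat=n):
        if tuple(sum(c[l] * B[l][i] for l in range(n)) % 2
                 for i in range(n)) == v:
            return c

# mu[(i, j)] holds the coefficient vector (mu^{ij}_1, ..., mu^{ij}_n)
# with b_i * b_j = sum_l mu^{ij}_l b_l.
mu = {(i, j): coords(star(B[i], B[j]))
      for i in range(3) for j in range(3)}
print(mu[(1, 2)])  # b_2 * b_3 = (1,1,0) = b_2, so (0, 1, 0)
```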

Quadratic system method: Check matrix

Let C be an F_q-linear code with parameters [n, k, d]; w.l.o.g. n ≤ q. Let H be a check matrix of C with rows h_1, ..., h_{n−k}. One can express h_i = Σ_{j=1}^n a_{ij} b_j for some a_{ij} ∈ F_q. In other words, H = AB, where A is the (n − k) × n matrix with entries a_{ij}.

Known syndrome: Let y = c + e be a received word with c ∈ C and e an error vector. The syndromes of y and e w.r.t. H are equal and known: s_i(y) := h_i · y = h_i · e = s_i(e). They can be expressed in the unknown syndromes of e w.r.t. B: s_i(y) = s_i(e) = Σ_{j=1}^n a_{ij} u_j(e), since h_i = Σ_{j=1}^n a_{ij} b_j and b_j · e = u_j(e).
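The key fact that the syndrome of y = c + e equals the syndrome of e (since Hc^T = 0 for codewords) can be checked directly. A sketch over F_2, using an assumed standard check matrix of the [7, 4, 3] Hamming code:

```python
# Check matrix of the [7, 4, 3] Hamming code (one standard choice).
H = [(0, 1, 1, 1, 1, 0, 0),
     (1, 0, 1, 1, 0, 1, 0),
     (1, 1, 0, 1, 0, 0, 1)]

def syndrome(H, v):
    """Syndrome of v w.r.t. H over F_2: (h_1 . v, ..., h_{n-k} . v)."""
    return tuple(sum(h[i] * v[i] for i in range(len(v))) % 2 for h in H)

c = (1, 0, 0, 0, 0, 1, 1)   # a codeword, so syndrome(H, c) = 0
e = (0, 0, 0, 1, 0, 0, 0)   # single-bit error in position 4
y = tuple((a + b) % 2 for a, b in zip(c, e))

print(syndrome(H, y) == syndrome(H, e))  # True: syndromes coincide
```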

Quadratic system method: Quadratic system

Linear forms: Let B be an MDS matrix with structure constants μ_l^{ij}. Define the linear forms U_{ij} in the variables U_1, ..., U_n by U_{ij} = Σ_{l=1}^n μ_l^{ij} U_l.

The quadratic system:
- The ideal J(y) in F_q[U_1, ..., U_n] is generated by Σ_{l=1}^n a_{jl} U_l − s_j(y) for j = 1, ..., r.
- The ideal I(t, U, V) in F_q[U_1, ..., U_n, V_1, ..., V_t] is generated by Σ_{j=1}^t U_{ij} V_j − U_{i,t+1} for i = 1, ..., n.
- Let J(t, y) be the ideal in F_q[U_1, ..., U_n, V_1, ..., V_t] generated by J(y) and I(t, U, V).

Quadratic system method: Main result

Let B be an MDS matrix with structure constants μ_l^{ij}, and let H be a check matrix of the code C such that H = AB as above. Let y = c + e be a received word, with c ∈ C the codeword sent and e the error vector. Suppose that wt(e) ≠ 0 and wt(e) ≤ (d(C) − 1)/2. Let t be the smallest positive integer such that J(t, y) has a solution (u, v) over F_q. Then:
- wt(e) = t, and the solution is unique with u = u(e);
- the reduced Gröbner basis G of the ideal J(t, y) w.r.t. any monomial ordering is
  { U_i − u_i(e), i = 1, ..., n;  V_j − v_j, j = 1, ..., t },
  where (u(e), v) is the unique solution.

Quadratic system method: Features

- No field equations.
- The same result holds for complete decoding.
- The solution lies in the field F_q.
- The equations are at most quadratic.
- After solving J(t, y), decoding is simple: B^{−1} u(B, e) = B^{−1} B e^T = e^T.
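The final step B^{−1} u(B, e) = e^T is plain linear algebra. A toy sketch over F_2 with an assumed lower-triangular basis matrix B, recovering e from its unknown syndrome by forward substitution:

```python
# Assumed toy basis matrix B (lower triangular with unit diagonal,
# so inversion reduces to forward substitution over F_2).
B = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
e = (0, 1, 0)

# The unknown syndrome u(B, e) = B e^T.
u = tuple(sum(B[i][j] * e[j] for j in range(3)) % 2 for i in range(3))

# Forward substitution: solve B x = u row by row.
rec = [0, 0, 0]
for i in range(3):
    rec[i] = (u[i] - sum(B[i][j] * rec[j] for j in range(i))) % 2

print(tuple(rec) == e)  # True: the error vector is recovered
```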

Quadratic system method: Analysis

From J(y) one can express n − k of the U-variables via the k others. Substituting these into I(t, U, V) yields a system of n quadratic equations in k + t variables, i.e. an overdetermined system. It becomes easier to solve when:
- with k and t constant, n increases;
- with n and t constant, k decreases.

Simulations: For example, for random binary codes with n = 120 and k = 10, ..., 40, one can correct 5–20 errors in 1000 sec. via computing the reduced Gröbner basis in SINGULAR or MAGMA.

Diagonal representation (joint with S. Ovsienko)

Our system is equivalent to
  H X^T = s,
  X_i Y_i = 0, i = 1, ..., n,
  Ĥ_t Y^T = ŝ_t,
where H is a check matrix of the code C, s a known syndrome, X = (X_1, ..., X_n) and Y = (Y_1, ..., Y_n) are new variables, Ĥ_t is a check matrix of a code with generator matrix B_t, and ŝ_t is the syndrome of the vector b_{t+1} w.r.t. Ĥ_t.

Macaulay matrix

As above, one obtains a system Sys with n quadratic equations in k + t variables, w.l.o.g. X_1, ..., X_k, Y_1, ..., Y_t. The monomials that appear in the system are X_i Y_j (1 ≤ i ≤ k, 1 ≤ j ≤ t), X_1, ..., X_k, and Y_1, ..., Y_t. The total number of monomials appearing in the system is kt + k + t = (k + 1)(t + 1) − 1. One can consider the Macaulay matrix of Sys: rows are indexed by the equations, columns by the monomials. Denote this matrix by M(Sys).

Linearization: If n ≥ kt + k + t and M(Sys) has full rank, one can find the X_i's by applying Gaussian elimination to M(Sys).
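Linearization reduces everything to Gaussian elimination over F_q: each monomial is treated as a fresh variable and the Macaulay matrix is row-reduced. A generic mod-2 elimination sketch (the matrix below is a toy stand-in, not an actual Macaulay matrix of a decoding system):

```python
def solve_f2(rows, rhs):
    """Solve A x = b over F_2 by Gaussian elimination on the augmented
    matrix; returns one solution (free variables set to 0)."""
    m, n = len(rows), len(rows[0])
    A = [list(r) + [b] for r, b in zip(rows, rhs)]
    pivots = []
    r = 0
    for col in range(n):
        # Find a pivot row for this column.
        piv = next((i for i in range(r, m) if A[i][col]), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        # Eliminate the column everywhere else (reduced row echelon form).
        for i in range(m):
            if i != r and A[i][col]:
                A[i] = [(a + b) % 2 for a, b in zip(A[i], A[r])]
        pivots.append(col)
        r += 1
    x = [0] * n
    for i, col in enumerate(pivots):
        x[col] = A[i][n]
    return x

A = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
b = (1, 0, 1)
print(solve_f2(A, b))  # [1, 0, 0]
```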

Macaulay matrix is full rank

Let C be a random [n, k] code over F_q, defined e.g. by a random full-rank (n − k) × n check matrix H, and let e be a random error vector over F_q of weight t. Let Sys = Sys(n, k, t) be the corresponding system as above. Then the probability that M(Sys) has full rank tends to 1 as n tends to infinity. Experiments suggest that already for small values of n (e.g. 20–30) the statement holds with quite high probability.

Idea of the proof: Degeneracy of M(Sys) is reduced to a condition relating e_l + C_l and the code B̃_{t+1}. Here e_l and C_l are the vector e and the code C, resp., restricted to some l positions from {1, ..., n}, and B̃_{t+1} is a code equivalent to the code B_{t+1} restricted to the same l positions as before.

Extended linearization

One can try to go further and apply extended linearization. Consider the binary case, so X_i^2 = X_i for all i. Multiply the system Sys by all monomials in X_1, ..., X_k of degree at most s, s < k. The system obtained in this way, call it Sys_s, has n(1 + (k choose 1) + ... + (k choose s)) equations and C_s := C_{s−1} + (k choose s+1)(t + 1) monomials. Denote (k choose 0) + (k choose 1) + ... + (k choose s) =: f(k, s). If we assume that M(Sys_s) has full rank, then, whenever
  n(1 + (k choose 1) + ... + (k choose s)) = n·f(k, s) ≥ C_s = (t + 1)·f(k, s + 1) − 1,
successful application of Gaussian elimination to M(Sys_s) is possible.
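The counting condition n·f(k, s) ≥ (t + 1)·f(k, s + 1) − 1 can be evaluated directly. A sketch that finds the smallest s satisfying it for sample parameters (this only checks the count; it assumes, as above, that M(Sys_s) has full rank):

```python
from math import comb

def f(k, s):
    """f(k, s) = sum_{i=0}^{s} binom(k, i)."""
    return sum(comb(k, i) for i in range(s + 1))

# Sample parameters (illustrative choice, not a claim from the talk).
n, k, t = 120, 40, 10

# Smallest s with n * f(k, s) >= (t + 1) * f(k, s + 1) - 1.
s = next(s for s in range(k)
         if n * f(k, s) >= (t + 1) * f(k, s + 1) - 1)
print(s)  # 3 for these parameters
```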

Complexity coefficient

If an algorithm A has complexity O(2^{αn}) in the length n of the input, then we say that α is a complexity coefficient of the algorithm A; denote it CC_A.

Minimum distance of a random binary code: Let C be a binary random code with parameters [n, Rn, δ₀n], where 0 < R < 1 is given. If n → ∞, then almost all codes have δ₀ = H^{−1}(1 − R), where H(·) is the binary entropy function: H(x) = −x log_2 x − (1 − x) log_2(1 − x), 0 < x < 1.
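Since H is strictly increasing on (0, 1/2], its inverse there can be computed numerically. A bisection sketch evaluating the relative minimum distance H^{−1}(1 − R) at R = 1/2:

```python
from math import log2

def H(x):
    """Binary entropy function on (0, 1)."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

def H_inv(y, tol=1e-12):
    """Inverse of H restricted to (0, 1/2], found by bisection."""
    lo, hi = 1e-15, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

R = 0.5
delta = H_inv(1 - R)
print(round(delta, 4))  # H^{-1}(0.5) is approximately 0.11
```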

Estimating the degree of extended linearization

If k = Rn, t = δn, s = λn, with 0 < R, δ, λ < 1, then it is sufficient to take λ = δR for extended linearization to work.

Upper bound for complexity: The upper bound on the complexity coefficient with the extended linearization as above is CC_QED(R) = ω·R·H(δ), where ω is the exponent of Gaussian elimination, H is the binary entropy function, and δ ≤ δ_e = H^{−1}(1 − R)/2 is such that t = δn is the number of errors that occurred.

Comparing complexities

Let t = δ_e n, so the full error-correcting capacity is used. [Plot omitted: x-axis is the information rate R, y-axis is the complexity coefficient.]

Comparing complexities II

Relax the requirement on t a little: take t = δn with δ = δ_e/2. [Plot omitted: same axes as before.]

Comparing complexities III

Relax a little more: take t = δn with δ = δ_e/4. [Plot omitted: same axes as before.]

Highest degree in GB computation I

In the table below we compare the highest degrees occurring when computing a GB w.r.t. degrevlex with the std command in SINGULAR. For every code, the left column shows the degree for the diagonal system and the right one for the initial system.

no. of err.  [120,40]  [120,30]  [120,20]  [120,10]  [150,10]
     2         4  4      3  3      3  3      3  3      3  3
     3         7  6      5  5      3  3      3  3      3  3
     4         6  5      5  6      3  3      3  3      3  3
     5         -  -      6  5      6  6      3  3      3  3
     6         -  -      5  5      6  5      3  3      3  3
     7         -  -      5  5      5  5      3  3      3  3
     8         -  -      -  -      5  5      3  3      3  3
     9         -  -      -  -      5  5      3  3      3  3
    10         -  -      -  -      5  5      3  3      3  3

Highest degree in GB computation I (continued)

no. of err.  [120,40]  [120,30]  [120,20]  [120,10]  [150,10]
    11         -  -      -  -     33 19      6  6      3  3
    12         -  -      -  -      -  -      6  5      3  3
    13         -  -      -  -      -  -      5  5      4  4
    14         -  -      -  -      -  -      5  5      5  6
    15         -  -      -  -      -  -      5  5      6  5
    16         -  -      -  -      -  -      5  5      5  5
    17         -  -      -  -      -  -      5  5      6  5
    18         -  -      -  -      -  -      5  5      5  5
    19         -  -      -  -      -  -      5  5      5  5
    20         -  -      -  -      -  -      6  6      5  5
    21         -  -      -  -      -  -      6  5      5  5
    22         -  -      -  -      -  -      -  -      5  5
    23         -  -      -  -      -  -      -  -      5  5
    24         -  -      -  -      -  -      -  -      5  5

Highest degree in GB computation II

In the next table we show the degrees that appear during GB computation with the F4 implementation of MAGMA (GroebnerBasis command).

no. of err.  [120,40]  [120,30]  [120,20]  [120,10]  [150,10]
     2          3         3         3         3         3
     3          3         3         3         3         3
     4          4         3         3         3         3
     5          -         4         3         3         3
     6          -         4         3         3         3
     7          -         4         3         3         3
     8          -         -         4         3         3
     9          -         -         4         3         3
    10          -         -         4         3         3
    11          -         -         4         3         3

Highest degree in GB computation II (continued)

no. of err.  [120,40]  [120,30]  [120,20]  [120,10]  [150,10]
    12          -         -         -         3         3
    13          -         -         -         4         3
    14          -         -         -         4         3
    15          -         -         -         4         3
    16          -         -         -         4         4
    17          -         -         -         4         4
    18          -         -         -         4         4
    19          -         -         -         4         4
    20          -         -         -         4         4
    21          -         -         -         4         4
    22          -         -         -         -         4
    23          -         -         -         -         4
    24          -         -         -         -         4

Highest degree in GB computation III

We also present the degree estimate obtained with extended linearization as above.

no. of err.  [120,40]  [120,30]  [120,20]  [120,10]  [150,10]
     2          3         2         2         2         2
     3          3         3         2         2         2
     4          3         3         2         2         2
     5          -         3         3         2         2
     6          -         3         3         2         2
     7          -         4         3         2         2
     8          -         -         3         2         2
     9          -         -         3         2         2
    10          -         -         3         2         2
    11          -         -         4         3         2
    12          -         -         -         3         2

Highest degree in GB computation III (continued)

no. of err.  [120,40]  [120,30]  [120,20]  [120,10]  [150,10]
    13          -         -         -         3         2
    14          -         -         -         3         3
    15          -         -         -         3         3
    16          -         -         -         3         3
    17          -         -         -         3         3
    18          -         -         -         3         3
    19          -         -         -         3         3
    20          -         -         -         3         3
    21          -         -         -         3         3
    22          -         -         -         -         3
    23          -         -         -         -         3
    24          -         -         -         -         3

Highest degree in GB computation IV

Although the highest degree of the polynomials occurring during GB computations is an important parameter for complexity, it is not the only one. For example, in our experiments with slimgb we never obtained degree larger than 3, and still the running time was much worse than those of std or F4.

Comparing with different random systems

Consider different types of random systems:
- R_1 is a system of n quadratic equations that has the same monomials as Sys, but the corresponding coefficients are taken at random from F_q. We require that R_1 has a unique solution in F_q.
- R_2 is a system with the same properties as R_1, but the requirement of a unique solution is dropped.
- R_3 is a fully random system of n quadratic equations, i.e. it has all possible monomials of degree at most 2, with the corresponding coefficients random from F_q.
Note that R_2 and R_3 do not have solutions in general.

Experiments

Based on experimental evidence we conjecture the following relations between the complexities of solving Sys, R_1, R_2, and R_3 with general methods:
  Compl(Sys) ≤ Compl(R_1) ≤ Compl(R_2) ≤ Compl(R_3).

Semi-regular sequences: Solving R_3-systems is related to the semi-regular sequences introduced by M. Bardet et al., for which complexity estimates for the F5 algorithm are available. These results are not applicable in our situation, since the quadratic homogeneous part of our systems has positive dimension, whereas those results are valid only for the zero-dimensional case.

Further research

Possible directions of research:
- Complexity analysis of solving, e.g. via the analysis of R_2 systems.
- Taking the special form of the Macaulay matrix into account in the complexity analysis.
- Algorithmic questions connected with the existence of generalized Newton identities for arbitrary linear codes.
- Decoding algorithms of polynomial complexity for some classes of codes, following the idea of cyclic codes.