Attacks in code based cryptography: a survey, new results and open problems

Attacks in code based cryptography: a survey, new results and open problems J.-P. Tillich Inria, team-project SECRET April 9, 2018

1. Code based cryptography introduction Difficult problem in coding theory Problem 1. [Decoding] Input: n, r, t with r < n, parity-check matrix H F r n q, s F r q Question:? e such that { He = s e t where e = hamming weight of e = #{i 1, n, e i 0}. Problem N P -complete 1/52

The dual problem introduction Code C def = { c F n q : Hc = 0 } dim C = n r = k Input: t, C subspace of dim k of F n q, y F n q Question:? c C such that y c t. H(y }{{ c) } = Hy = s e y = the word that we want to decode e = y c = the error we want to find 2/52

A long-studied problem introduction Correct. t errors in a code of length n and dim. k has cost Õ(2α(k n, t n )n ) Author(s) Year max α(r, τ) R,τ Prange 1962 0.1207 Stern 1988 0.1164 Dumer 1991 0.1162 Bernstein, Lange, Peters 2011 May, Meurer and Thomae 2011 0.1114 Becker, Joux, May, Meurer 2012 0.1019 May, Ozerov 2015 0.0966 Both, May 2017 0.0953 Both, May 2018 0.0885 3/52

Complexities collapse when t = o(n) introduction [CantoTorres, Sendrier, 2016] complexity 2 log(1 R)t(1+o(1)) when t = o(n) and where R = k/n 4/52

Code-based cryptography introduction Code C def = {c F n q : Hc = 0} Take a code that has an efficient decoding algorithm Public key: random parity-check matrix of the code H rand = QH where Q is a random invertible matrix in F r r q Private key: trapdoor to the efficient decoding algorithm 5/52

Two approaches introduction Pick up your favorite code (that has an efficient decoder) Choose a code/scheme with a reduction to decoding a generic linear code 6/52

History introduction 1978 McEliece: binary Goppa codes 1986 Niederreiter variant based on GRS codes 1991 Gabidulin, Paramonov, Tretjakov: Gabidulin codes 1994 Sidelnikov: Reed-Muller codes 1996 Janwa-Moreno: algebraic geometric codes 199* a zillion propositions with LDPC codes 2003 Alekhnovich: Alekhnovich system 2005 Berger-Loidreau: subcodes of GRS codes 2006 Wieschebrink, GRS codes + random columns in the generator matrix 7/52

2008 Baldi-Bodrato-Chiaraluce: LDPC based MDPC codes 2010 Bernstein, Lange, Peters: non-binary wild Goppa codes 2012 Misoczki-Tillich-Barreto-Sendrier: MDPC codes 2012 Löndahl-Johansson: convolutional codes 2013 Gaborit, Murat, Ruatta, Zémor: LRPC codes 2014 Shrestha, Kim: polar codes 2014 Hooshmand, Shooshtari, Eghlidos, Aref: subcodes of polar codes 8/52

Code based NIST submissions in Hamming metric Algebraic codes binary Goppa codes m=1 DAGS m=3 m=2 BIG QUAKE Classic McEliece NTS KEM RLCE KEM pqsigrm Reed Muller related 9/52

Code based NIST submissions in Hamming metric Non-algebraic codes BIKE HQC LEDAkem LEDApkc Lepton QC-MDPC RaCoSS 10/52

Code based NIST submissions in the rank metric Edon-K LAKE LOCKER McNie Ourobouros-R RankSign RQC 11/52

2. The main cryptanalytic techniques for attacking the key Finding small weight codewords in C or in C that reveal the underlying structure Algebraic attacks Product considerations Folding techniques Computing the hull C C 12/52

3. Product considerations product 13/52

Square code attacks product Definition 1. [Componentwise product] Given two vectors a = (a 1,..., a n ) and b = (b 1,..., b n ) F n q, we denote by a b the componentwise product a b def = (a 1 b 1,..., a n b n ) Definition 2. [Product of codes & square code] The star product code denoted by A B of A and B is the vector space spanned by all products a b where a and b range over A and B respectively. When B = A, A A is called the square code of A and is rather denoted by A 2. 14/52

Dimension of the square code product A and B codes with respective bases (a i ) and (b j ). 1. dim(a B) dim(a) dim(b) (generated by the a i b j s) ( ) dim(a) + 1 2. dim(a 2 ) (generated by the a i a j s with i j) 2 15/52

Generalized Reed-Solomon (GRS) codes product Definition 3. [Generalized Reed-Solomon code] Let k and n be integers such that 1 k < n q where q is a power of a prime number. The generalized Reed-Solomon code GRS k (x, y) of dimension k is associated to a pair (x, y) F n q F n q where x is an n-tuple of distinct elements of F q and the entries y i are arbitrary nonzero elements in F q. GRS k (x, y) is defined as: GRS k (x, y) def = { } (y 1 p(x 1 ),..., y n p(x n )) : p F q [X], deg p < k. x is the support and y the multiplier. 16/52

GRS codes, alternant codes product A GRS code corrects n k 2 errors. Definition 1. Let x (F q m) n, y (F q m) n be as in the definition of GRS codes. The alternant code Alt r (x, y) is defined by Proposition 1. Alt r (x, y) def = GRS } r {{ (x, y) } (F q ) n GRS n r (x,y ) dim Alt r (x, y) n mr d min Alt r (x, y) r + 1 17/52

product What is wrong with generalized Reed-Solomon codes? When C is a random code of length n, with high probability [Cascudo, Cramer, Mirandola, Zémor] {( ) } dim(c) + 1 dim(c 2 ) = min, n 2 When C is a generalized Reed-Solomon code dim(c 2 ) = min {2 dim(c) 1, n} 18/52

The explanation product c = (y 1 p(x 1 ),..., y n p(x n )), c = (y 1 q(x 1 ),..., y n q(x n )) GRS k (x, y) where p and q are two polynomials of degree at most k 1. c c = ( y1p(x 2 1 )q(x 1 ),..., ynp(x 2 n )q(x n ) ) = ( y1r(x 2 1 ),..., ynr(x 2 n ) ) where r is a polynomial of degree 2k 2. = c c GRS 2k 1 (x, y 2 ) 19/52

product The Wieschebrink attack on the Berger-Loidreau cryptosystem known: a subcode C GRS k (x, y) unknown: x and y. If the codimension of C is small enough C C = GRS k (x, y) GRS k (x, y) = GRS 2k 1 (x, y ) The Wieschebrink attack 1. Compute C C = GRS 2k 1 (x, y ) 2. Recover x and y by using the Sidelnikov-Shestakov algorithm. 20/52

Filtration attack product [Couvreur, Otmani, T 2014]: m = 2. Attack on wild Goppa codes when 21/52

A filtration for GRS codes product A new attack on McEliece based on GRS codes. known : C 0 = GRS k (x, y) unknown : x,y. C 0 = GRS k (x, y) C 1 = GRS k 1 (x, y) C k 1 = GRS 1 (x, y) The point: C k 1 = {αy, α F q } y known x by solving a linear system. 22/52

The fundamental induction product C i C i 2 = C i 1 C i 1 C i C i 2 = GRS k i (x, y) GRS k i+2 (x, y) = GRS 2k 2i+1 (x, y y) = GRS k i+1 (x, y) GRS k i+1 (x, y) = C i 1 C i 1 23/52

The picture product Alternant codes GRS codes m=1 m=3 m=2 Goppa codes wild Goppa codes binary Goppa codes 24/52

Code based NIST submissions in Hamming metric Algebraic codes binary Goppa codes m=1 DAGS m=3 m=2 BIG QUAKE Classic McEliece NTS KEM RLCE KEM pqsigrm Reed Muller related 25/52

4. Folding operation, the Origami attack folding 26/52

Origami attack folding Related to Gentry attack on NTRU-composite Applies to codes with a non trivial permutation group For σ S n, c σ C σ def = (c σ(i) ) i 1,n def = {c σ : c C} σ is a permutation automorphism of C iff C σ = C 27/52

Examples folding Parity-check matrix has a block form H = with blocks of some size l of the form B (11)... B (1n ). B (ij).... B (r n ) B (r 1) B (ij) = a 0 a 1 a l 1 a l 1 a 0 a l 2........ a 1 a 2 a 0 B(ij) = a 0 a 1 a 2 a 3 a 1 a 0 a 3 a 2 a 2 a 3 a 0 a 1 a 3 a 2 a 1 a 0 quasi-cyclic case B (ij) s,t = a t s (mod l) quasidyadic case B (ij) s,t = a t s 28/52

Folding folding Folding x = w.r. to σ adding the coordinates in a same orbit of σ σ = (123)(456)(678) x = (x } 1, x {{ 2, x } 3,..., } x 7, x {{ 8, x } 9 ) orbit orbit x σ = (x 1 + x 2 + x 3,..., x 7 + x 8 + x 8 ) C σ def = {c σ : c C}. 29/52

Why is this an interesting operation? folding Orbits of σ of size l Code gets smaller C = code of length n dim. k C σ = code of length n/l and dim. k l Words do not increase their weight c = w c σ w 30/52

folding Folding quasi- alternant codes/ Goppa codes [Faugère, Otmani, Perret, Portzamparc, T 2014] Folding the dual of a Q*-alternant or Q*-Goppa code dual of an alternant or a Goppa code [Barelli-Couvreur 2017] Folding a Q*-alternant or a Q*-Goppa code alternant or a Goppa code 31/52

Message attacks folding { He = s { e t H σ (e σ ) = (s σ ) e σ t We recover e σ (say = e 0 ) and then solve the much easier problem H σ e = s e t e σ = e 0 32/52

5. Algebraic attacks algebraic Alternant code Alt r (x, y) parity-check matrix H of the form H = y 1 y 2...... y n y 1 x 1. y 2 x 2......... y n x n... y j x i j....... y 1 x1 r 1 y 2 x2 r 1...... y n x r 1 n Goppa code Gop(x, Γ) = Alt deg Γ (x, 1 Γ(x) ). 33/52

Algebraic attacks algebraic G = (g ij ) i 1,k j 1,n Unknowns: y 1,..., y n, x 1,..., x n 2n unknowns Algebraic system generator matrix of C = Alt r (x, y). n j=1 GH = 0 g ij y j x a j = 0 (i, a) 1, k 0, r 1 k r equations 34/52

When was this successful? algebraic [Faugère,Otmani,Perret,T 2010-2015] Q*-alternant of Q*-Goppa codes [Faugère,Perret,Portzamparc 2014] Wild Goppa codes for certain parameters 35/52

Rank Metric rank Difficult problem in coding theory Problem 2. [Decoding] Input: n, r, t integers, r < n, parity-check matrix H F q m r n, syndrome s F r q Question:? e such that (i) He = s, (ii) e t where e R = rank weight of e. Randomized reduction to N P -complete problems. 36/52

Rank metric rank (β 1... β m ) basis of F q m over F q x = (x 1,..., x n ) F n q m Mat(x) = where x j = m i=1 x ijβ i. x 11 x 12 x 1n x 21 x 22 x 2n.... Fm n q x m1 x m2... x mn Rank metric = viewing an element of F n q m as an m n matrix. x y r def = Rank (Mat(x) Mat(y)). 37/52

Complexity of the best known algorithms Algebraic attacks (MinRank) Combinatorial attacks Õ ( q t(k+1) m) when m = n. 38/52

LRPC codes [Gaborit, Murat, Ruatta, Zémor 2013] Definition 4. An LRPC code over F q m of weight d is a code that admits an (n k) n parity-check matrix H with entries h ij that span an F q space of dimension d. x r = dim x 1,..., x n Fq all rows of H have weight d. Correct t errors when td n k. 39/52

RankSign Secret key H where H = [ H R ] P with H = (n k) n parity-check matrix of an LRPC code over F q m R = random (n k) t matrix over F q m P = (n + t) (n + t) invertible matrix over F q P isometry xp r = x r. LRPC code of weight d codewords of weight d + t in the dual code. 40/52

Attack on RankSign rank [Debris-Alazard, T 2018] Looking for low weight codewords in the dual code? 41/52

Attack on RankSign rank [Debris-Alazard, T 2018] Looking for low weight codewords in the code itself Product trick 42/52

Getting rid of R rank If there is a low weight codeword c LRPC in C LRPC low weight codeword c = (c LRPC, 0 t )(P 1 ) in the public code of parity-check matrix H pub = QH = [ H R ] P H pub c = Hpub P 1 (c LRPC, 0 t ) = Q [ H R ] P P 1 (c LRPC, 0 t ) = Q [ H R ] (c LRPC, 0 t ) = QHc LRPC ( R F (n k) t q m ) = 0 (c LRPC belongs to the code of parity-check matrix H) 43/52

Product trick rank F F q -space of dimension d generated by the entries of H parity-check of the [n, k] LRPC code C LRPC. U and V two subspaces of F q m, U V def = uv : u U, v V Fq. Lemma 1. It there exists an F q -subspace F of F q m such that (n k) dim(f F ) < n dim F. Then there exist nonzero codewords in the LRPC code of weight dim F. 44/52

Proof A codeword c of the LRPC code satisfies n i 1, n k H i,j c j = 0. (1) j=1 If its entries are in F then n j=1 H i,jc j F F unknowns coordinates c ij of c j in F = f 1,..., f d Fq : c j = i 1,d c ij f i rank # equations = (n k) dim F F #unknowns = n dim F 45/52

Consequence on RankSign rank Necessary condition for RankSign to work n = (n k)d Problem: typically dim F F = dim F dim F and therefore n dim F = n d = (n k)d d = (n k) dim F F F = f 1,..., f d Fq F def = f 1, f 2 Fq F F = x i x j : i 1, d, j 1, 2 Fq dim F F = 2d 1< dim F dim F codewords in C LRPC of weight 2 46/52

Consequence on LRPC in general? rank No direct attack on LRPC codes without the additional condition n = (n k)d 47/52

Conclusion conclusion Up to now all distinguishers of the public parity-check matrix / random matrix with the exception of high rate alternant/goppa codes. [Faugère,Gauthier,Otmani,Perret,T 2011], [Márquez-Corbella, Pellikaan 2012], when r is sufficiently small dim ( Alt r (x, y) Alt r (x, y) ) unusually small The problem, when x, y F n q m Alt r (x, y) = {(y j p(x j )) : deg p < n r} F n q {( ) } Alt r (x, y) = Tr Fq m F q (y j p(x j ) : deg p < r 48/52

Other open problems conclusion improving algebraic attacks in the rank metric Polynomial time attacks on Reed-Muller codes? other families of codes (MDPC,... )? 49/52

What about alternant/goppa codes? We have square Alt r (x, y) = GRS r (x, y) F n q = GRS n r (x, y ) F n q Alt r (x, y) 2 Alt 2r n+1 (x, y ) and Fact 1. dim Alt r (x, y) n mr. To distinguish we need 2r n + 1 > 0 = r n/2, however m > 1 = n mr 0. 50/52

square A miracle when m = 2 in the case of wild Goppa codes Theorem 1. [Couvreur, Otmani, Tillich] wild Goppa code (here r = (q 1)r ) When Alt r (x, y) is a Alt r (x, y) n 2r + r (r 2) and for r close to n/2 we may have wild Goppa codes of small dimension such that 2r n + 1 > 0 51/52

Shortening trick for other dimensions square A shortened alternant code is still an alternant code of the same degree r as the original alternant code. Leads to a distinguisher of wild Goppa codes when m = 2 Leads to an attack of the McEliece scheme based on wild Goppa codes when m = 2. First time that there is an attack working in polynomial time on a McEliece scheme based on Goppa codes. 52/52