A distinguisher for high-rate McEliece Cryptosystems

JC Faugère (INRIA, SALSA project), A Otmani (Université Caen - INRIA, SECRET project), L Perret (INRIA, SALSA project), J-P Tillich (INRIA, SECRET project)
May 28th, 2010

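1 Algebraic approach for attacking the McEliece cryptosystem

Let $x = (x_1, \dots, x_n) \in \mathbb{F}_{q^m}^n$ with $x_i \neq x_j$ if $i \neq j$, and $y = (y_1, \dots, y_n) \in \mathbb{F}_{q^m}^n$ with $y_i \neq 0$. For any $t < n$, let

\[
H \stackrel{\text{def}}{=}
\begin{pmatrix}
y_1 & y_2 & \cdots & y_n \\
y_1 x_1 & y_2 x_2 & \cdots & y_n x_n \\
\vdots & \vdots & & \vdots \\
y_1 x_1^{t-1} & y_2 x_2^{t-1} & \cdots & y_n x_n^{t-1}
\end{pmatrix}
\]

Definition 1. An alternant code is the kernel of an $H$ of this type:

\[
\mathcal{A}_t(x, y) = \{ v \in \mathbb{F}_q^n \mid H v^T = 0 \}
\]

Goppa code: there exists a polynomial $\Gamma$ of degree $t$ such that $y_i = \Gamma(x_i)^{-1}$.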

Decoding alternant and Goppa codes

Proposition 1 [decoding alternant codes]. $t/2$ errors can be decoded in polynomial time as long as $x$ and $y$ are known.

Proposition 2 [the special case of binary Goppa codes]. In the case of a binary Goppa code ($q = 2$), $t$ errors can be decoded in polynomial time if $x$ and $\Gamma$ are known.

The problem

What is known: a basis of the code, i.e. the rows of a generator matrix $G = (g_{ij})$ of size $k \times n$.

What we also know:

\[
H G^T = 0 \tag{1}
\]

What we want to find: $H$, that is, $x$ and $y$ in the case of an alternant code, and $x$ and $\Gamma$ in the special case of a binary Goppa code.

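The algebraic system

$H G^T = 0$ translates to

\[
\begin{cases}
g_{1,1} Y_1 + \cdots + g_{1,n} Y_n = 0 \\
\qquad \vdots \\
g_{k,1} Y_1 + \cdots + g_{k,n} Y_n = 0 \\
g_{1,1} Y_1 X_1 + \cdots + g_{1,n} Y_n X_n = 0 \\
\qquad \vdots \\
g_{k,1} Y_1 X_1 + \cdots + g_{k,n} Y_n X_n = 0 \\
\qquad \vdots \\
g_{1,1} Y_1 X_1^{t-1} + \cdots + g_{1,n} Y_n X_n^{t-1} = 0 \\
\qquad \vdots \\
g_{k,1} Y_1 X_1^{t-1} + \cdots + g_{k,n} Y_n X_n^{t-1} = 0
\end{cases}
\tag{2}
\]

where the $g_{i,j}$'s are known coefficients in $\mathbb{F}_q$ and $k \geq n - tm$.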
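As a sanity check of relation (1) and system (2), the following minimal sketch builds $H$ for a toy instance and verifies that every row of a generator matrix cancels the $t$ groups of equations. It assumes $m = 1$ and a prime $q$, so that field arithmetic is plain integer arithmetic mod $q$; the parameter values and the helper names `parity_check` and `nullspace_mod_p` are illustrative choices, not taken from the slides.

```python
# Toy check of H G^T = 0, assuming m = 1 and q prime (arithmetic mod q).
# Needs Python 3.8+ for pow(a, -1, p).
q, n, t = 7, 6, 2          # illustrative parameters, not from the slides
x = [0, 1, 2, 3, 4, 5]     # pairwise distinct elements of F_q
y = [1, 2, 3, 1, 5, 6]     # nonzero elements of F_q


def parity_check(x, y, t, q):
    """Rows (y_j * x_j^d)_j for d = 0..t-1, reduced mod q."""
    return [[(yj * pow(xj, d, q)) % q for xj, yj in zip(x, y)]
            for d in range(t)]


def nullspace_mod_p(H, p):
    """Basis of {v : H v^T = 0 mod p}, via reduced row echelon form."""
    rows = [r[:] for r in H]
    ncols = len(rows[0])
    pivots, r = [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] % p), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [(v * inv) % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for fc in (c for c in range(ncols) if c not in pivots):
        v = [0] * ncols
        v[fc] = 1                      # one free coordinate set to 1
        for i, pc in enumerate(pivots):
            v[pc] = (-rows[i][fc]) % p # pivot coordinates are then forced
        basis.append(v)
    return basis


H = parity_check(x, y, t, q)
G = nullspace_mod_p(H, q)              # rows of G span the code
# residuals[i][d] = sum_j g_{i,j} y_j x_j^d mod q: the left-hand sides of (2)
residuals = [[sum(g * h for g, h in zip(row, h_row)) % q for h_row in H]
             for row in G]
print(len(G), all(v == 0 for res in residuals for v in res))
```

In the attack setting only `G` is given, and the entries of `x` and `y` become the unknowns $X_j$ and $Y_j$ of system (2).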

Freedom of choice in (2)

Proposition 3. Theoretically, the system has $2n$ unknowns, but we can take arbitrary values for one $Y_i$ and for three $X_i$'s (as long as these three values are pairwise distinct).

Applications

When the number of unknowns is small, e.g. for the Berger-Cayrel-Gaborit-Otmani proposal at AfricaCrypt 09 (based on quasi-cyclic alternant codes) or the Misoczki-Barreto variant at SAC 09 (based on quasi-dyadic Goppa codes), the algebraic system can be solved by (dedicated) Gröbner basis techniques.

This breaks all parameters proposed in these articles, with the exception of binary dyadic codes [Faugère-Otmani-Perret-Tillich; Eurocrypt 2010].

Related to [Leander-Gauthier Umana; SCC2010].

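2 A naive attack

Wlog we can assume that $G$ is systematic in its first $k$ positions:

\[
G = \begin{pmatrix} I_k & P \end{pmatrix}
\]

where $I_k$ is the $k \times k$ identity matrix and $P$ is a $k \times (n - k)$ matrix, with $n - k = mt$.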

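Step 1: expressing the $Y_i X_i^d$'s in terms of the $Y_j X_j^d$'s for $j \in \{k+1, \dots, n\}$

Write $P = (p_{ij})_{1 \leq i \leq k,\; k+1 \leq j \leq n}$. We can rewrite (2) as

\[
\begin{cases}
Y_i = \sum_{j=k+1}^{n} p_{i,j} Y_j \\
Y_i X_i = \sum_{j=k+1}^{n} p_{i,j} Y_j X_j \\
\qquad \vdots \\
Y_i X_i^{t-1} = \sum_{j=k+1}^{n} p_{i,j} Y_j X_j^{t-1}
\end{cases}
\tag{3}
\]

for all $i \in \{1, \dots, k\}$.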

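Step 2: exploiting $Y_i \,(Y_i X_i^2) = (Y_i X_i)^2$

\[
\begin{cases}
Y_i = \sum_{j=k+1}^{n} p_{i,j} Y_j \\
Y_i X_i = \sum_{j=k+1}^{n} p_{i,j} Y_j X_j \\
Y_i X_i^2 = \sum_{j=k+1}^{n} p_{i,j} Y_j X_j^2
\end{cases}
\tag{4}
\]

Substituting (4) into the identity gives

\[
\Big( \sum_{j=k+1}^{n} p_{i,j} Y_j \Big) \Big( \sum_{j=k+1}^{n} p_{i,j} Y_j X_j^2 \Big) = \Big( \sum_{j=k+1}^{n} p_{i,j} Y_j X_j \Big)^2
\]

In characteristic 2, the terms $p_{i,j}^2 Y_j^2 X_j^2$ cancel on both sides and the cross terms of the square vanish, which leaves

\[
\sum_{j=k+1}^{n} \sum_{j' > j} p_{i,j}\, p_{i,j'} \big( Y_j Y_{j'} X_{j'}^2 + Y_{j'} Y_j X_j^2 \big) = 0
\]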

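Step 3: linearization

\[
Z_{jj'} \stackrel{\text{def}}{=} Y_j Y_{j'} X_{j'}^2 + Y_{j'} Y_j X_j^2,
\qquad
\sum_{j=k+1}^{n} \sum_{j' > j} p_{i,j}\, p_{i,j'} Z_{jj'} = 0
\]

Number of unknowns: $\binom{n-k}{2} \approx \frac{m^2 t^2}{2}$. Number of equations: $k = n - mt$.

Does this reveal the $Z_{jj'}$ when $n - mt \geq \frac{m^2 t^2}{2}$? This condition holds for the Courtois-Finiasz-Sendrier scheme, which has to choose small values of $t$, e.g. $n = 2^{21}$, $t = 10$, $m = 21$.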
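The counting argument of Step 3 can be checked numerically for the Courtois-Finiasz-Sendrier example. This is a small sketch that assumes $k = n - mt$ exactly; the function name `linearization_counts` is a hypothetical helper introduced here for illustration.

```python
from math import comb


def linearization_counts(n, m, t):
    """Return (unknowns, equations) of the linearized system, assuming k = n - m*t."""
    unknowns = comb(m * t, 2)   # one Z_{jj'} per pair k < j < j' <= n
    equations = n - m * t       # one linear equation per row of G
    return unknowns, equations


# Example parameters quoted in the slides: n = 2^21, t = 10, m = 21.
unknowns, equations = linearization_counts(n=2**21, m=21, t=10)
print(unknowns, equations, equations >= unknowns)
```

With these parameters the system has far more equations than unknowns, which is why one might hope that linearization determines the $Z_{jj'}$.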

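This approach always fails

$D_{\text{alternant}}$, resp. $D_{\text{Goppa}}$: dimension of the linear solution space when $G$ is the generator matrix of an alternant code, resp. a Goppa code.

Experimental fact 1. Let $D_{\text{rand}} \stackrel{\text{def}}{=} \binom{mt}{2} - k$ (taken to be 0 when this quantity is negative, as in Table 1). With high probability,

\[
D_{\text{alternant}} = \max\left( D_{\text{rand}},\; \frac{m(t-1)}{2} \left( (2l+1)t - 2\,\frac{q^{l+1} - 1}{q - 1} \right) \right)
\quad \text{for } l \stackrel{\text{def}}{=} \lfloor \log_q (t-1) \rfloor
\]

\[
D_{\text{Goppa}} = D_{\text{alternant}} = \max\left( D_{\text{rand}},\; \frac{m(t-1)(t-2)}{2} \right)
\quad \text{for } t < q - 1
\]

\[
D_{\text{Goppa}} = \max\left( D_{\text{rand}},\; \frac{mt}{2} \left( (2l+1)t - 2q^{l} + 2q^{l-1} - 1 \right) \right)
\quad \text{for } t \geq q - 1
\]

with $l$ such that $q^{l} - 2q^{l-1} + q^{l-2} < t \leq q^{l+1} - 2q^{l} + q^{l-1}$.

The second argument of each maximum is the quantity tabulated as $T_{\text{alternant}}$, resp. $T_{\text{Goppa}}$, in Table 1.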

Table 1: q = 2 and m = 10

t                 3     4     5     6     7     8     9    10    11
binom(mt, 2)    435   780  1225  1770  2415  3160  4005  4950  5995
k               994   984   974   964   954   944   934   924   914
D_rand            0     0   251   806  1461  2216  3071  4026  5081
D_alternant      30    90   251   806  1461  2216  3071  4026  5081
T_alternant      30    90   220   400   630   910  1320  1800  2350
D_Goppa         180   380   700  1110  1610  2216  3071  4026  5081
T_Goppa         180   380   700  1110  1610  2200  2970  3850  4840
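The closed-form expressions of Experimental fact 1 can be evaluated directly. The sketch below recomputes the rows of Table 1 for $q = 2$, $m = 10$; it assumes a full-length code, $n = q^m$ and $k = n - mt$ (consistent with the $k$ row of the table), and the function names are hypothetical helpers, not part of the original talk.

```python
from math import comb


def t_alternant(q, m, t):
    """Second argument of the max in D_alternant, with l = floor(log_q(t - 1))."""
    l = 0
    while q ** (l + 1) <= t - 1:
        l += 1
    inner = (2 * l + 1) * t - 2 * ((q ** (l + 1) - 1) // (q - 1))
    return m * (t - 1) * inner // 2


def t_goppa(q, m, t):
    """Second argument of the max in D_Goppa."""
    if t < q - 1:
        return m * (t - 1) * (t - 2) // 2
    # Smallest l with t <= q^(l+1) - 2 q^l + q^(l-1) = q^(l-1) (q - 1)^2;
    # minimality of l gives the strict lower bound on t.
    l = 1
    while t > q ** (l - 1) * (q - 1) ** 2:
        l += 1
    inner = (2 * l + 1) * t - 2 * q ** l + 2 * q ** (l - 1) - 1
    return m * t * inner // 2


def table_row(q, m, t):
    """All quantities of Table 1 for one value of t (assumes n = q^m, k = n - m*t)."""
    k = q ** m - m * t
    d_rand = max(0, comb(m * t, 2) - k)
    ta, tg = t_alternant(q, m, t), t_goppa(q, m, t)
    return {"k": k, "D_rand": d_rand,
            "D_alternant": max(d_rand, ta), "T_alternant": ta,
            "D_Goppa": max(d_rand, tg), "T_Goppa": tg}


rows = {t: table_row(2, 10, t) for t in range(3, 12)}
for t, row in rows.items():
    print(t, row)
```

For $t \geq 8$ the computed $D_{\text{Goppa}}$ coincides with $D_{\text{rand}}$, which matches the table.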

3 A distinguisher

Whenever $D_{\text{Goppa}}$ or $D_{\text{alternant}}$ differs from $D_{\text{rand}}$, the dimension of the solution space distinguishes the code from a random one.

Table 2: t_min = smallest degree of the Goppa polynomial Γ for which we cannot distinguish a binary Goppa code from a random binary linear code when n = 2^m

m        9   10   11   12   13   14   15   16   17   18   19   20   21
t_min    8    8   11   16   20   26   34   47   62   85  114  157  213

An explanation for the distinguisher

We have used the identity $Y_i \cdot Y_i X_i^2 = (Y_i X_i)^2$. Any identity of the form

\[
Y_i X_i^a \cdot Y_i X_i^b = Y_i X_i^c \cdot Y_i X_i^d
\]

with $a, b, c, d \in \{0, 1, \dots, t-1\}$ such that $a + b = c + d$ would do the same job:

\[
Z^{a,b,c,d}_{jj'} \stackrel{\text{def}}{=} Y_j X_j^a\, Y_{j'} X_{j'}^b + Y_{j'} X_{j'}^a\, Y_j X_j^b + Y_j X_j^c\, Y_{j'} X_{j'}^d + Y_{j'} X_{j'}^c\, Y_j X_j^d
\]

\[
\sum_{j=k+1}^{n} \sum_{j' > j} p_{i,j}\, p_{i,j'} Z^{a,b,c,d}_{jj'} = 0
\]
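The expansion behind these equations can be checked numerically. The following sketch works in characteristic 2 and verifies, for random values, that the product of sums with exponents $(a, b)$ plus the product of sums with exponents $(c, d)$ equals $\sum_{j<j'} p_j p_{j'} Z^{a,b,c,d}_{jj'}$ whenever $a + b = c + d$. The field $\mathbb{F}_{2^4}$ with modulus $z^4 + z + 1$, the sizes, and all function names are illustrative assumptions; the values are random and do not come from an actual code.

```python
import random

M, MOD = 4, 0b10011   # GF(2^4) = F_2[z]/(z^4 + z + 1), an illustrative choice


def gf_mul(a, b):
    """Multiply two elements of GF(2^M), given as integers below 2^M."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MOD
    return r


def term(Y, X, j, e):
    """Y_j * X_j^e in GF(2^M)."""
    r = Y[j]
    for _ in range(e):
        r = gf_mul(r, X[j])
    return r


def check_identity(p, Y, X, a, b, c, d):
    """Compare the product side with the sum of p_j p_j' Z_jj' for one binary row p."""
    assert a + b == c + d
    support = [j for j, pj in enumerate(p) if pj]   # p_j in F_2

    def lin(e):
        s = 0
        for j in support:
            s ^= term(Y, X, j, e)                   # addition in char 2 is XOR
        return s

    product_side = gf_mul(lin(a), lin(b)) ^ gf_mul(lin(c), lin(d))
    z_side = 0
    for u, j in enumerate(support):
        for jp in support[u + 1:]:
            z_side ^= (gf_mul(term(Y, X, j, a), term(Y, X, jp, b))
                       ^ gf_mul(term(Y, X, jp, a), term(Y, X, j, b))
                       ^ gf_mul(term(Y, X, j, c), term(Y, X, jp, d))
                       ^ gf_mul(term(Y, X, jp, c), term(Y, X, j, d)))
    return product_side == z_side


random.seed(0)
size = 8
X = random.sample(range(1 << M), size)                    # pairwise distinct
Y = [random.randrange(1, 1 << M) for _ in range(size)]    # nonzero
rows = [[random.randrange(2) for _ in range(size)] for _ in range(20)]
exponents = [(0, 2, 1, 1), (0, 3, 1, 2), (1, 3, 2, 2)]
ok = all(check_identity(p, Y, X, *e) for p in rows for e in exponents)
print(ok)
```

For a genuine code, relations (3) force the product side to vanish, so the sum over the $Z^{a,b,c,d}_{jj'}$ vanishes as well; this is the linear equation stated above.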

Conclusion

Combinatorial explanation of the distinguisher in the alternant case.

Partial combinatorial explanation in the Goppa case.

A slightly better distinguisher can be obtained by taking the subcode of codewords of even weight.

Does the distinguisher lead to an attack?

The approach requires $k/n$ very close to 1.

Should very high rates be avoided in a McEliece-like scheme?