Objectives. McBits: fast constant-time code-based cryptography. Set new speed records for public-key cryptography. (to appear at CHES 2013)

1 McBits: fast constant-time code-based cryptography (to appear at CHES 2013)
D. J. Bernstein, University of Illinois at Chicago & Technische Universiteit Eindhoven
Joint work with: Tung Chou, Technische Universiteit Eindhoven; Peter Schwabe, Radboud University Nijmegen

6 Objectives
Set new speed records for public-key cryptography
... at a high security level
... including protection against quantum computers
... including full protection against cache-timing attacks, branch-prediction attacks, etc.
... using code-based crypto with a solid track record
... all of the above at once.

10 Examples of the competition
Some cycle counts on h9ivy (Intel Core i5-3210M, Ivy Bridge) from bench.cr.yp.to:
mceliece encrypt (2008 Biswas–Sendrier, $2^{80}$)
gls254 DH (binary elliptic curve; CHES 2013)
kumfp127g DH (hyperelliptic; Eurocrypt 2013)
curve25519 DH (conservative elliptic curve)
mceliece decrypt
ronald1024 decrypt

17 New decoding speeds
(n, t) = (4096, 41); security: Ivy Bridge cycles. Talk will focus on this case. (Decryption is slightly slower: includes hash, cipher, MAC.)
(n, t) = (2048, 32); $2^{80}$ security: Ivy Bridge cycles.
All load/store addresses and all branch conditions are public. Eliminates cache-timing attacks etc.
Similar improvements for CFS.

23 Constant-time fanaticism
The extremist's approach to eliminate timing attacks: handle all secret data using only bit operations: XOR (^), AND (&), etc.
We take this approach.
How can this be competitive in speed? Are you really simulating field multiplication with hundreds of bit operations instead of simple log tables?
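
To make "only bit operations" concrete, here is a minimal sketch of a branch-free select and a branch-free zero test on 32-bit words. The helper names `ct_select` and `ct_is_zero` are hypothetical, not taken from the talk. Python is used only to show the logic: Python integers give no timing guarantees, so a real implementation would use a language with fixed-width words.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit register with Python integers

def ct_select(bit, a, b):
    """Return a if bit == 1 else b, with no branch and no table lookup."""
    mask = (-bit) & MASK32                  # all ones when bit == 1, zero when bit == 0
    return (a & mask) | (b & ~mask & MASK32)

def ct_is_zero(x):
    """Return 1 if the 32-bit word x is zero, else 0, without comparing or branching."""
    x &= MASK32
    top = (x | ((-x) & MASK32)) >> 31       # top bit is 1 exactly when x != 0
    return (top ^ 1) & 1
```

Neither function has a control-flow path or memory address that depends on `bit` or `x`, which is the property the slide asks for.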

28 Yes, we are.
Not as slow as it sounds! On a typical 32-bit CPU, the XOR instruction is actually 32-bit XOR, operating in parallel on vectors of 32 bits.
Low-end smartphone CPU: 128-bit XOR every cycle.
Ivy Bridge: 256-bit XOR every cycle, or three 128-bit XORs.
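
The point about word-wide XOR can be checked directly. The sketch below packs 32 independent one-bit inputs into one word and compares a single word XOR against 32 separate bit XORs. The helpers `pack` and `unpack` and the sample inputs are illustrative assumptions.

```python
def pack(bits):
    """Pack a list of 0/1 values into one integer; bit i of the word is bits[i]."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word

def unpack(word, width):
    """Inverse of pack: list the low `width` bits of word."""
    return [(word >> i) & 1 for i in range(width)]

a_bits = [(i * 7 + 3) % 2 for i in range(32)]   # 32 independent one-bit inputs
b_bits = [(i // 3) % 2 for i in range(32)]

one_at_a_time = [x ^ y for x, y in zip(a_bits, b_bits)]   # 32 separate XORs
all_at_once = unpack(pack(a_bits) ^ pack(b_bits), 32)     # one word XOR
```

Both lists agree, so one 32-bit XOR really does the work of 32 one-bit XORs. With 128-bit or 256-bit registers the same instruction covers 128 or 256 lanes.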

34 Not immediately obvious that this bitslicing saves time for, e.g., multiplication in $\mathbb{F}_{2^{12}}$.
But quite obvious that it saves time for addition in $\mathbb{F}_{2^{12}}$.
Typical decoding algorithms have add, mult roughly balanced.
Coming next: how to save many adds and most mults. Nice synergy with bitslicing.
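
A sketch of what bitsliced arithmetic in $\mathbb{F}_{2^{12}}$ can look like, under stated assumptions: the field is represented modulo $x^{12} + x^3 + 1$ (an assumed choice; the slides do not name the polynomial), 64 lanes are processed at once, and all function names are hypothetical. Word `j` of a bitsliced value holds bit `j` of every lane. The multiplier is plain schoolbook, not the optimized circuit from the paper.

```python
M = 12            # field is F_{2^12}
POLY = 0x1009     # x^12 + x^3 + 1, an assumed irreducible polynomial
LANES = 64        # field elements processed in parallel

def gf_mul(a, b):
    """Reference (non-bitsliced, not constant-time) multiplication of two elements."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    for k in range(2 * M - 2, M - 1, -1):    # clear degrees 22 down to 12
        if (r >> k) & 1:
            r ^= POLY << (k - M)
    return r

def to_bitsliced(elems):
    """Word j collects bit j of every element; lane i belongs to elems[i]."""
    return [sum(((e >> j) & 1) << i for i, e in enumerate(elems)) for j in range(M)]

def from_bitsliced(words, lanes=LANES):
    return [sum(((words[j] >> i) & 1) << j for j in range(M)) for i in range(lanes)]

def bs_add(a, b):
    """Field addition: 12 word XORs add all lanes at once."""
    return [x ^ y for x, y in zip(a, b)]

def bs_mul(a, b):
    """Schoolbook field multiplication: 144 word ANDs plus XORs, then reduction."""
    t = [0] * (2 * M - 1)
    for i in range(M):
        for j in range(M):
            t[i + j] ^= a[i] & b[j]
    for k in range(2 * M - 2, M - 1, -1):    # x^k = x^(k-9) + x^(k-12) mod POLY
        t[k - 9] ^= t[k]
        t[k - 12] ^= t[k]
    return t[:M]
```

Addition costs 12 word operations for 64 lanes, which is why the saving is obvious. Multiplication costs a few hundred word operations for 64 lanes, so whether it beats table lookups depends on the lane count and on how much the circuit can be optimized.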

40 The additive FFT
Fix $n = 4096 = 2^{12}$, $t = 41$.
Big final decoding step is to find all roots in $\mathbb{F}_{2^{12}}$ of $f = c_{41} x^{41} + \cdots + c_0 x^0$.
For each $\alpha \in \mathbb{F}_{2^{12}}$, compute $f(\alpha)$ by Horner's rule: 41 adds, 41 mults.
Or use Chien search: compute $c_i g^i$, $c_i g^{2i}$, $c_i g^{3i}$, etc. Cost per point: again 41 adds, 41 mults.
Our cost: 6.01 adds, 2.09 mults.
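
The baseline cost can be reproduced with a short sketch of Horner's rule that counts operations. The field polynomial $x^{12} + x^3 + 1$ and the names `horner`, `find_roots` and `poly_from_roots` are assumptions made for illustration. This is the naive method the talk improves on, not the additive FFT.

```python
M = 12
POLY = 0x1009     # x^12 + x^3 + 1, an assumed irreducible polynomial

def gf_mul(a, b):
    """Multiply two elements of F_{2^12} (reference code, not constant-time)."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    for k in range(2 * M - 2, M - 1, -1):
        if (r >> k) & 1:
            r ^= POLY << (k - M)
    return r

def horner(coeffs, alpha):
    """Evaluate f(alpha) for coeffs = [c_0, ..., c_t]; return (value, adds, mults)."""
    acc, adds, mults = coeffs[-1], 0, 0
    for c in reversed(coeffs[:-1]):
        acc = gf_mul(acc, alpha) ^ c         # one mult and one add per coefficient
        adds += 1
        mults += 1
    return acc, adds, mults

def find_roots(coeffs):
    """Try every field element: n evaluations of t adds and t mults each."""
    return [a for a in range(1 << M) if horner(coeffs, a)[0] == 0]

def poly_from_roots(roots):
    """Build the coefficients of prod (x + r), lowest degree first."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                           # x * p(x)
        scaled = [gf_mul(c, r) for c in coeffs] + [0]    # r * p(x)
        coeffs = [s ^ t for s, t in zip(shifted, scaled)]
    return coeffs
```

For a degree-41 polynomial, `horner` reports 41 adds and 41 mults per point, matching the slide.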

45 Asymptotics: normally $t \in \Theta(n/\lg n)$, so Horner's rule costs $\Theta(nt) = \Theta(n^2/\lg n)$.
Wait a minute. Didn't we learn in school that FFT evaluates an $n$-coeff polynomial at $n$ points using $n^{1+o(1)}$ operations? Isn't this better than $n^2/\lg n$?
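
As a rough worked comparison for the talk's parameters, assuming the quoted figures (41 adds and 41 mults for Horner, 6.01 adds and 2.09 mults for the authors' method) are averages per evaluation point over all $n = 4096$ points:

```latex
\begin{align*}
\text{Horner, mults (and adds):}\quad & n \cdot t = 4096 \cdot 41 = 167\,936 \\
\text{quoted cost, mults:}\quad & 4096 \cdot 2.09 \approx 8\,561 \\
\text{quoted cost, adds:}\quad & 4096 \cdot 6.01 \approx 24\,617
\end{align*}
```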

49 Standard radix-2 FFT:
Want to evaluate $f = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$ at all the $n$th roots of 1.
Write $f$ as $f_0(x^2) + x f_1(x^2)$.
Observe big overlap between $f(\alpha) = f_0(\alpha^2) + \alpha f_1(\alpha^2)$ and $f(-\alpha) = f_0(\alpha^2) - \alpha f_1(\alpha^2)$.
$f_0$ has $n/2$ coeffs; evaluate at $(n/2)$nd roots of 1 by same idea recursively. Similarly $f_1$.
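
The recursion can be sketched over a small prime field so that all arithmetic is exact. The choices $p = 17$, $n = 8$ and $\omega = 2$ (an element of order 8 modulo 17) are illustrative assumptions, as is the function name `fft`.

```python
def fft(coeffs, omega, p):
    """Evaluate f at omega^0, ..., omega^(n-1) modulo p, where n = len(coeffs)
    is a power of 2 and omega has multiplicative order exactly n."""
    n = len(coeffs)
    if n == 1:
        return coeffs[:]
    omega_sq = omega * omega % p
    even = fft(coeffs[0::2], omega_sq, p)    # f0 at the (n/2)nd roots of 1
    odd = fft(coeffs[1::2], omega_sq, p)     # f1 at the (n/2)nd roots of 1
    out = [0] * n
    w = 1                                    # w runs through omega^k
    for k in range(n // 2):
        out[k] = (even[k] + w * odd[k]) % p              # f(w)  = f0(w^2) + w f1(w^2)
        out[k + n // 2] = (even[k] - w * odd[k]) % p     # f(-w) = f0(w^2) - w f1(w^2)
        w = w * omega % p
    return out

values = fft([3, 1, 4, 1, 5, 9, 2, 6], 2, 17)
```

Each level does $n/2$ multiplications and $n$ additions, and there are $\lg n$ levels. The second output line relies on $\omega^{n/2} = -1$, which is the pairing of $\alpha$ with $-\alpha$ described above.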

53 Useless in char 2: $\alpha = -\alpha$.
Standard workarounds are painful. FFT considered impractical.
1988 Wang–Zhu, independently 1989 Cantor: additive FFT in char 2. Still quite expensive.
1996 von zur Gathen–Gerhard: some improvements.
2010 Gao–Mateer: much better additive FFT.
We use Gao–Mateer, plus some new improvements.

57 Gao and Mateer evaluate $f = c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}$ on a size-$n$ $\mathbb{F}_2$-linear space.
Main idea: Write $f$ as $f_0(x^2 + x) + x f_1(x^2 + x)$.
Big overlap between $f(\alpha) = f_0(\alpha^2 + \alpha) + \alpha f_1(\alpha^2 + \alpha)$ and $f(\alpha + 1) = f_0(\alpha^2 + \alpha) + (\alpha + 1) f_1(\alpha^2 + \alpha)$.
Twist to ensure $1 \in$ space. Then $\{\alpha^2 + \alpha\}$ is a size-$(n/2)$ $\mathbb{F}_2$-linear space. Apply same idea recursively.
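
The two facts the recursion rests on can be checked numerically: $\alpha$ and $\alpha + 1$ share the value $\alpha^2 + \alpha$, and the map $\alpha \mapsto \alpha^2 + \alpha$ halves the size of the space. The sketch below does this for a cubic, where the split into $f_0$ and $f_1$ can be written out by hand. The field polynomial and all function names are illustrative assumptions; the general split for higher degrees is not shown.

```python
M = 12
POLY = 0x1009     # x^12 + x^3 + 1, an assumed irreducible polynomial

def gf_mul(a, b):
    """Multiply two elements of F_{2^12} (reference code, not constant-time)."""
    r = 0
    for i in range(M):
        if (b >> i) & 1:
            r ^= a << i
    for k in range(2 * M - 2, M - 1, -1):
        if (r >> k) & 1:
            r ^= POLY << (k - M)
    return r

def eval_poly(coeffs, a):
    """Horner evaluation, coefficients given lowest degree first."""
    acc = 0
    for c in reversed(coeffs):
        acc = gf_mul(acc, a) ^ c
    return acc

def s(a):
    """The map a -> a^2 + a; it sends a and a + 1 to the same value."""
    return gf_mul(a, a) ^ a

def split_cubic(c):
    """For f = c0 + c1 x + c2 x^2 + c3 x^3 return (f0, f1)
    such that f = f0(x^2 + x) + x * f1(x^2 + x)."""
    c0, c1, c2, c3 = c
    f0 = [c0, c2 ^ c3]                # f0(y) = c0 + (c2 + c3) y
    f1 = [c1 ^ c2 ^ c3, c3]           # f1(y) = (c1 + c2 + c3) + c3 y
    return f0, f1

def eval_pair(c, a):
    """Return (f(a), f(a + 1)) from one evaluation each of f0 and f1."""
    f0, f1 = split_cubic(c)
    y = s(a)
    u, v = eval_poly(f0, y), eval_poly(f1, y)
    return u ^ gf_mul(a, v), u ^ gf_mul(a ^ 1, v)
```

In this representation, adding the field element 1 is `a ^ 1`, so `eval_pair` returns two values of `f` for the price of one evaluation of each half.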

61 Gao and Mateer evaluate f = c 0 + c 1 x + + c n 1x n 1 on a size-n F 2 -linear space. Main idea: Write f as f 0 (x 2 + x) + xf 1 (x 2 + x). Big overlap between f() = f 0 ( 2 + ) + f 1 ( 2 + ) and f( + 1) = f 0 ( 2 + ) + ( + 1)f 1 ( 2 + ). Twist to ensure 1 2 space. Then 2 + is a size-(n=2) F 2 -linear space. Apply same idea recursively. We generalize to f = c 0 + c 1 x + + c t x t for any t < n. ) several optimizations, not all of which are automated by simply tracking zeros. For t = 0: copy c 0. For t 2 f1; 2g: f 1 is a constant. Instead of multiplying this constant by each, multiply only by generators and compute subset sums.
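
The subset-sum trick can be sketched as follows, assuming GF(2^4) with reduction polynomial x^4 + x + 1 and a linearly independent list of generators; the function names are hypothetical. Multiplication by a constant is $\mathbb{F}_2$-linear, so the product with any point of the space is the XOR of the products with the generators that make up that point.

```python
# GF(2^4) with reduction polynomial x^4 + x + 1 (illustrative assumption).
MOD, M = 0b10011, 4

def gf_mul(a, b):
    """Multiply two field elements: shift-and-add with reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r

def products_by_subset_sums(c, basis):
    """Map every alpha in the F_2-span of `basis` to c * alpha.

    Uses len(basis) field multiplications; the rest is XOR.
    `basis` is assumed to be linearly independent over F_2.
    """
    gen_products = [gf_mul(c, b) for b in basis]
    points, prods = [0], [0]
    for b, cb in zip(basis, gen_products):
        # Doubling step: every new point is an old point plus the new generator.
        points = points + [p ^ b for p in points]
        prods = prods + [q ^ cb for q in prods]
    return dict(zip(points, prods))
```

For a space of size n this replaces n field multiplications by about log2(n) multiplications plus n XORs.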

71 Syndrome computation
Initial decoding step: compute
$s_0 = r_1 + r_2 + \cdots + r_n$,
$s_1 = r_1 \alpha_1 + r_2 \alpha_2 + \cdots + r_n \alpha_n$,
$s_2 = r_1 \alpha_1^2 + r_2 \alpha_2^2 + \cdots + r_n \alpha_n^2$,
...,
$s_t = r_1 \alpha_1^t + r_2 \alpha_2^t + \cdots + r_n \alpha_n^t$.
$r_1, r_2, \ldots, r_n$ are received bits scaled by Goppa constants.
Typically precompute matrix mapping bits to syndrome. Not as slow as Chien search but still $n^{2+o(1)}$ and huge secret key.
Compare to multipoint evaluation:
$f(\alpha_1) = c_0 + c_1 \alpha_1 + \cdots + c_t \alpha_1^t$,
$f(\alpha_2) = c_0 + c_1 \alpha_2 + \cdots + c_t \alpha_2^t$,
...,
$f(\alpha_n) = c_0 + c_1 \alpha_n + \cdots + c_t \alpha_n^t$.
Matrix for syndrome computation is transpose of matrix for multipoint evaluation.
Amazing consequence: syndrome computation is as few ops as multipoint evaluation. Eliminate precomputed matrix.
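
The transpose relation can be made concrete with a small sketch, assuming GF(2^4) with reduction polynomial x^4 + x + 1 and explicit matrices. The names are hypothetical, and the explicit matrix is only for illustration: it is exactly the precomputed object the slides aim to eliminate.

```python
# GF(2^4) with reduction polynomial x^4 + x + 1 (illustrative assumption).
MOD, M = 0b10011, 4

def gf_mul(a, b):
    """Multiply two field elements: shift-and-add with reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r

def gf_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def eval_matrix(alphas, t):
    """Row i is (1, alpha_i, alpha_i^2, ..., alpha_i^t)."""
    return [[gf_pow(a, j) for j in range(t + 1)] for a in alphas]

def transpose(mat):
    return [list(col) for col in zip(*mat)]

def mat_vec(mat, vec):
    out = []
    for row in mat:
        acc = 0
        for m, v in zip(row, vec):
            acc ^= gf_mul(m, v)
        out.append(acc)
    return out

def multipoint_eval(coeffs, alphas):
    """(f(alpha_1), ..., f(alpha_n)) for f = c_0 + c_1 x + ... + c_t x^t."""
    return mat_vec(eval_matrix(alphas, len(coeffs) - 1), coeffs)

def syndrome(r, alphas, t):
    """(s_0, ..., s_t) with s_j = sum_i r_i * alpha_i^j."""
    return mat_vec(transpose(eval_matrix(alphas, t)), r)
```

Both functions use the same matrix, once as is and once transposed, which is why a fast algorithm for one can be turned into a fast algorithm for the other.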

75 Transposition principle: If a linear algorithm computes a matrix $M$ then reversing edges and exchanging inputs/outputs computes the transpose of $M$.
1956 Bordewijk; independently 1957 Lupanov for Boolean matrices.
1973 Fiduccia analysis: preserves number of mults; preserves number of adds plus number of nontrivial outputs.
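
A toy illustration of the principle, under simplifying assumptions: the "linear algorithm" is a straight-line program over $\mathbb{F}_2$ whose only operation is `x[dst] ^= x[src]` on a shared state vector, so every wire is both an input and an output. This model and the function names are hypothetical and much simpler than the circuits handled in the talk.

```python
def run(ops, x):
    """Apply each (dst, src) operation in order: x[dst] ^= x[src]."""
    x = list(x)
    for dst, src in ops:
        x[dst] ^= x[src]
    return x

def transpose_program(ops):
    """Reverse the order of the operations and the direction of each edge."""
    return [(src, dst) for dst, src in reversed(ops)]

def matrix_of(ops, n):
    """Matrix computed by the program: column j is the output on unit vector j."""
    cols = [run(ops, [int(i == j) for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]
```

Each operation corresponds to the matrix $I + E_{dst,src}$, whose transpose is $I + E_{src,dst}$. Transposing a product reverses the order of its factors, so the reversed program computes the transposed matrix with the same number of operations.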

82 We built transposing compiler producing C code. Too many variables for $m = 13$; gcc ran out of memory.
Used qhasm register allocator to optimize the variables. Worked, but not very quickly.
Wrote faster register allocator. Still excessive code size.
Built new interpreter, allowing some code compression. Still big; still some overhead.

86 Better solution: stared at additive FFT, wrote down transposition with same loops etc. Small code, no overhead.
Speedups of additive FFT translate easily to transposed algorithm.
Further savings: merged first stage with scaling by Goppa constants.

88 Secret permutation
Additive FFT $\Rightarrow$ f ... field elements in a ...
This is not the ord... needed in code-bas...
Must apply a secre... part of the secret k...
Same issue for syn...
Solution: Batcher ...
Almost done with ... Beneš network.
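
The slide text is cut off, so the following is only a guess at the general technique: applying a permutation by sorting (key, value) pairs with a sorting network, whose sequence of compare-exchange positions does not depend on the data. The sketch uses Batcher's odd-even merge sort for a power-of-two number of items. All names are hypothetical, and Python itself gives no timing guarantees, so this shows the data flow rather than a constant-time implementation.

```python
def oddeven_merge(lo, hi, r):
    """Comparators merging the two sorted halves of indices lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        yield from [(i, i + r) for i in range(lo + r, hi - r, step)]
    else:
        yield (lo, lo + r)

def oddeven_merge_sort(lo, hi):
    """Comparators sorting indices lo..hi (inclusive); the size must be a power of 2."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)
        yield from oddeven_merge_sort(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)

def sort_network(n):
    return list(oddeven_merge_sort(0, n - 1))

def compare_exchange(keys, vals, i, j):
    """Swap positions i and j when keys[i] > keys[j], using a mask instead of a branch."""
    mask = -int(keys[i] > keys[j])          # 0 or -1 (all ones)
    dk = (keys[i] ^ keys[j]) & mask
    dv = (vals[i] ^ vals[j]) & mask
    keys[i] ^= dk
    keys[j] ^= dk
    vals[i] ^= dv
    vals[j] ^= dv

def apply_secret_permutation(values, perm):
    """Return out with out[perm[i]] == values[i]; values are non-negative ints."""
    keys, vals = list(perm), list(values)
    for i, j in sort_network(len(keys)):
        compare_exchange(keys, vals, i, j)
    return vals
```

The list of comparator positions depends only on the length, never on `perm`, which is the property that matters when the permutation is part of the secret key.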


More information

3. Focus on secure cryptosystems. How do we know what a quantum computer will do?

3. Focus on secure cryptosystems. How do we know what a quantum computer will do? Cryptographic readiness levels, and the impact of quantum computers Daniel J. Bernstein How is crypto developed? How confident are we that crypto is secure? Many stages of research from cryptographic design

More information

P a g e 5 1 of R e p o r t P B 4 / 0 9

P a g e 5 1 of R e p o r t P B 4 / 0 9 P a g e 5 1 of R e p o r t P B 4 / 0 9 J A R T a l s o c o n c l u d e d t h a t a l t h o u g h t h e i n t e n t o f N e l s o n s r e h a b i l i t a t i o n p l a n i s t o e n h a n c e c o n n e

More information

P a g e 3 6 of R e p o r t P B 4 / 0 9

P a g e 3 6 of R e p o r t P B 4 / 0 9 P a g e 3 6 of R e p o r t P B 4 / 0 9 p r o t e c t h um a n h e a l t h a n d p r o p e r t y fr om t h e d a n g e rs i n h e r e n t i n m i n i n g o p e r a t i o n s s u c h a s a q u a r r y. J

More information

RSA Key Extraction via Low- Bandwidth Acoustic Cryptanalysis. Daniel Genkin, Adi Shamir, Eran Tromer

RSA Key Extraction via Low- Bandwidth Acoustic Cryptanalysis. Daniel Genkin, Adi Shamir, Eran Tromer RSA Key Extraction via Low- Bandwidth Acoustic Cryptanalysis Daniel Genkin, Adi Shamir, Eran Tromer Mathematical Attacks Input Crypto Algorithm Key Output Goal: recover the key given access to the inputs

More information

Toward Secure Implementation of McEliece Decryption

Toward Secure Implementation of McEliece Decryption Toward Secure Implementation of McEliece Decryption Mariya Georgieva & Frédéric de Portzamparc Gemalto & LIP6, 13/04/2015 1 MCELIECE PUBLIC-KEY ENCRYPTION 2 DECRYPTION ORACLE TIMING ATTACKS 3 EXTENDED

More information

Geometric Predicates P r og r a m s need t o t es t r ela t ive p os it ions of p oint s b a s ed on t heir coor d ina t es. S im p le exa m p les ( i

Geometric Predicates P r og r a m s need t o t es t r ela t ive p os it ions of p oint s b a s ed on t heir coor d ina t es. S im p le exa m p les ( i Automatic Generation of SS tag ed Geometric PP red icates Aleksandar Nanevski, G u y B lello c h and R o b ert H arp er PSCICO project h ttp: / / w w w. cs. cm u. ed u / ~ ps ci co Geometric Predicates

More information

Code-based Cryptography

Code-based Cryptography a Hands-On Introduction Daniel Loebenberger Ηράκλειο, September 27, 2018 Post-Quantum Cryptography Various flavours: Lattice-based cryptography Hash-based cryptography Code-based

More information

Ball-collision decoding

Ball-collision decoding Ball-collision decoding Christiane Peters Technische Universiteit Eindhoven joint work with Daniel J. Bernstein and Tanja Lange Oberseminar Cryptography and Computer Algebra TU Darmstadt November 8, 200

More information

From 5-pass MQ-based identification to MQ-based signatures

From 5-pass MQ-based identification to MQ-based signatures From 5-pass MQ-based identification to MQ-based signatures Ming-Shing Chen 1,2, Andreas Hülsing 3, Joost Rijneveld 4, Simona Samardjiska 5, Peter Schwabe 4 National Taiwan University 1 / Academia Sinica

More information

Code-based cryptography

Code-based cryptography Code-based graphy Laboratoire Hubert Curien, UMR CNRS 5516, Bâtiment F 18 rue du professeur Benoît Lauras 42000 Saint-Etienne France pierre.louis.cayrel@univ-st-etienne.fr June 4th 2013 Pierre-Louis CAYREL

More information

e.g Chou software: Ed25519 signature scheme = on Intel Sandy Bridge (2011), EdDSA using conservative

e.g Chou software: Ed25519 signature scheme = on Intel Sandy Bridge (2011), EdDSA using conservative Modern ECC signatures 1 Many papers have explored 2 2011 Bernstein Duif Lange Curve25519/Ed25519 speed. Schwabe Yang: e.g. 2015 Chou software: Ed25519 signature scheme = on Intel Sandy Bridge (2011), EdDSA

More information

Computational algebraic number theory tackles lattice-based cryptography

Computational algebraic number theory tackles lattice-based cryptography Computational algebraic number theory tackles lattice-based cryptography Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Moving to the left Moving to the right

More information

Public Key 9/17/2018. Symmetric Cryptography Review. Symmetric Cryptography: Shortcomings (1) Symmetric Cryptography: Analogy

Public Key 9/17/2018. Symmetric Cryptography Review. Symmetric Cryptography: Shortcomings (1) Symmetric Cryptography: Analogy Symmetric Cryptography Review Alice Bob Public Key x e K (x) y d K (y) x K K Instructor: Dr. Wei (Lisa) Li Department of Computer Science, GSU Two properties of symmetric (secret-key) crypto-systems: The

More information

A Smart Card Implementation of the McEliece PKC

A Smart Card Implementation of the McEliece PKC A Smart Card Implementation of the McEliece PKC Falko Strenzke 1 1 FlexSecure GmbH, Germany, strenzke@flexsecure.de 2 Cryptography and Computeralgebra, Department of Computer Science, Technische Universität

More information

Short generators without quantum computers: the case of multiquadratics

Short generators without quantum computers: the case of multiquadratics Short generators without quantum computers: the case of multiquadratics Daniel J. Bernstein & Christine van Vredendaal University of Illinois at Chicago Technische Universiteit Eindhoven 19 January 2017

More information

Software implementation of Koblitz curves over quadratic fields

Software implementation of Koblitz curves over quadratic fields Software implementation of Koblitz curves over quadratic fields Thomaz Oliveira 1, Julio López 2 and Francisco Rodríguez-Henríquez 1 1 Computer Science Department, Cinvestav-IPN 2 Institute of Computing,

More information

Public-key cryptography and the Discrete-Logarithm Problem. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J.

Public-key cryptography and the Discrete-Logarithm Problem. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Public-key cryptography and the Discrete-Logarithm Problem Tanja Lange Technische Universiteit Eindhoven with some slides by Daniel J. Bernstein Cryptography Let s understand what our browsers do. Schoolbook

More information

MASCOT: Faster Malicious Arithmetic Secure Computation with Oblivious Transfer

MASCOT: Faster Malicious Arithmetic Secure Computation with Oblivious Transfer MASCOT: Faster Malicious Arithmetic Secure Computation with Oblivious Transfer Tore Frederiksen Emmanuela Orsini Marcel Keller Peter Scholl Aarhus University University of Bristol 31 May 2016 Secure Multiparty

More information

Introduction to Algorithms

Introduction to Algorithms Lecture 1 Introduction to Algorithms 1.1 Overview The purpose of this lecture is to give a brief overview of the topic of Algorithms and the kind of thinking it involves: why we focus on the subjects that

More information

AES side channel attacks protection using random isomorphisms

AES side channel attacks protection using random isomorphisms Rostovtsev A.G., Shemyakina O.V., St. Petersburg State Polytechnic University AES side channel attacks protection using random isomorphisms General method of side-channel attacks protection, based on random

More information

Cryptographical Security in the Quantum Random Oracle Model

Cryptographical Security in the Quantum Random Oracle Model Cryptographical Security in the Quantum Random Oracle Model Center for Advanced Security Research Darmstadt (CASED) - TU Darmstadt, Germany June, 21st, 2012 This work is licensed under a Creative Commons

More information

Thanks to: University of Illinois at Chicago NSF DMS Alfred P. Sloan Foundation

Thanks to: University of Illinois at Chicago NSF DMS Alfred P. Sloan Foundation Building circuits for integer factorization D. J. Bernstein Thanks to: University of Illinois at Chicago NSF DMS 0140542 Alfred P. Sloan Foundation I want to work for NSA as an independent contractor.

More information

Signatures and DLP. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Bernstein

Signatures and DLP. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Bernstein Signatures and DLP Tanja Lange Technische Universiteit Eindhoven with some slides by Daniel J. Bernstein ECDSA Users can sign messages using Edwards curves. Take a point P on an Edwards curve modulo a

More information

Integer factorization, part 1: the Q sieve. part 2: detecting smoothness. D. J. Bernstein

Integer factorization, part 1: the Q sieve. part 2: detecting smoothness. D. J. Bernstein Integer factorization, part 1: the Q sieve Integer factorization, part 2: detecting smoothness D. J. Bernstein The Q sieve factors by combining enough -smooth congruences ( + ). Enough log. Plausible conjecture:

More information

On the correct use of the negation map in the Pollard rho method

On the correct use of the negation map in the Pollard rho method On the correct use of the negation map in the Pollard rho method Daniel J. Bernstein 1, Tanja Lange 2, and Peter Schwabe 2 1 Department of Computer Science University of Illinois at Chicago, Chicago, IL

More information

From NewHope to Kyber. Peter Schwabe April 7, 2017

From NewHope to Kyber. Peter Schwabe   April 7, 2017 From NewHope to Kyber Peter Schwabe peter@cryptojedi.org https://cryptojedi.org April 7, 2017 In the past, people have said, maybe it s 50 years away, it s a dream, maybe it ll happen sometime. I used

More information

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition Joppe W. Bos, Craig Costello, Huseyin Hisil, Kristin Lauter CHES 2013 Motivation - I Group DH ECDH (F p1, ) (E F p2, +)

More information

A Fast Provably Secure Cryptographic Hash Function

A Fast Provably Secure Cryptographic Hash Function A Fast Provably Secure Cryptographic Hash Function Daniel Augot, Matthieu Finiasz, and Nicolas Sendrier Projet Codes, INRIA Rocquencourt BP 15, 78153 Le Chesnay - Cedex, France [DanielAugot,MatthieuFiniasz,NicolasSendrier]@inriafr

More information

Side-channel analysis in code-based cryptography

Side-channel analysis in code-based cryptography 1 Side-channel analysis in code-based cryptography Tania RICHMOND IMATH Laboratory University of Toulon SoSySec Seminar Rennes, April 5, 2017 Outline McEliece cryptosystem Timing Attack Power consumption

More information

OH BOY! Story. N a r r a t iv e a n d o bj e c t s th ea t e r Fo r a l l a g e s, fr o m th e a ge of 9

OH BOY! Story. N a r r a t iv e a n d o bj e c t s th ea t e r Fo r a l l a g e s, fr o m th e a ge of 9 OH BOY! O h Boy!, was or igin a lly cr eat ed in F r en ch an d was a m a jor s u cc ess on t h e Fr en ch st a ge f or young au di enc es. It h a s b een s een by ap pr ox i ma t ely 175,000 sp ect at

More information

A new zero-knowledge code based identification scheme with reduced communication

A new zero-knowledge code based identification scheme with reduced communication A new zero-knowledge code based identification scheme with reduced communication Carlos Aguilar, Philippe Gaborit, Julien Schrek Université de Limoges, France. {carlos.aguilar,philippe.gaborit,julien.schrek}@xlim.fr

More information

Errors, Eavesdroppers, and Enormous Matrices

Errors, Eavesdroppers, and Enormous Matrices Errors, Eavesdroppers, and Enormous Matrices Jessalyn Bolkema September 1, 2016 University of Nebraska - Lincoln Keep it secret, keep it safe Public Key Cryptography The idea: We want a one-way lock so,

More information

8 Elliptic Curve Cryptography

8 Elliptic Curve Cryptography 8 Elliptic Curve Cryptography 8.1 Elliptic Curves over a Finite Field For the purposes of cryptography, we want to consider an elliptic curve defined over a finite field F p = Z/pZ for p a prime. Given

More information

Coset Decomposition Method for Decoding Linear Codes

Coset Decomposition Method for Decoding Linear Codes International Journal of Algebra, Vol. 5, 2011, no. 28, 1395-1404 Coset Decomposition Method for Decoding Linear Codes Mohamed Sayed Faculty of Computer Studies Arab Open University P.O. Box: 830 Ardeya

More information

Calcul d indice et courbes algébriques : de meilleures récoltes

Calcul d indice et courbes algébriques : de meilleures récoltes Calcul d indice et courbes algébriques : de meilleures récoltes Alexandre Wallet ENS de Lyon, Laboratoire LIP, Equipe AriC Alexandre Wallet De meilleures récoltes dans le calcul d indice 1 / 35 Today:

More information

On the Security of Some Cryptosystems Based on Error-correcting Codes

On the Security of Some Cryptosystems Based on Error-correcting Codes On the Security of Some Cryptosystems Based on Error-correcting Codes Florent Chabaud * Florent.Chabaud~ens.fr Laboratoire d'informatique de FENS ** 45, rue d'ulm 75230 Paris Cedex 05 FRANCE Abstract.

More information

Lecture 1: Perfect Secrecy and Statistical Authentication. 2 Introduction - Historical vs Modern Cryptography

Lecture 1: Perfect Secrecy and Statistical Authentication. 2 Introduction - Historical vs Modern Cryptography CS 7880 Graduate Cryptography September 10, 2015 Lecture 1: Perfect Secrecy and Statistical Authentication Lecturer: Daniel Wichs Scribe: Matthew Dippel 1 Topic Covered Definition of perfect secrecy One-time

More information

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication

CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication CS/COE 1501 cs.pitt.edu/~bill/1501/ Integer Multiplication Integer multiplication Say we have 5 baskets with 8 apples in each How do we determine how many apples we have? Count them all? That would take

More information

Me n d e l s P e a s Exer c i se 1 - Par t 1

Me n d e l s P e a s Exer c i se 1 - Par t 1 !! Me n d e l s P e a s Exer c i se 1 - Par t 1 TR UE - BR E E D I N G O R G A N I S M S Go a l In this exercise you will use StarGenetics, a genetics experiment simulator, to understand the concept of

More information

CSE 241 Class 1. Jeremy Buhler. August 24,

CSE 241 Class 1. Jeremy Buhler. August 24, CSE 41 Class 1 Jeremy Buhler August 4, 015 Before class, write URL on board: http://classes.engineering.wustl.edu/cse41/. Also: Jeremy Buhler, Office: Jolley 506, 314-935-6180 1 Welcome and Introduction

More information

arxiv: v2 [cs.cr] 14 Feb 2018

arxiv: v2 [cs.cr] 14 Feb 2018 Code-based Key Encapsulation from McEliece s Cryptosystem Edoardo Persichetti arxiv:1706.06306v2 [cs.cr] 14 Feb 2018 Florida Atlantic University Abstract. In this paper we show that it is possible to extend

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms Autumn 2018-2019 Outline 1 Algorithm Analysis (contd.) Outline Algorithm Analysis (contd.) 1 Algorithm Analysis (contd.) Growth Rates of Some Commonly Occurring Functions

More information

2

2 Computer System AA rc hh ii tec ture( 55 ) 2 INTRODUCTION ( d i f f e r e n t r e g i s t e r s, b u s e s, m i c r o o p e r a t i o n s, m a c h i n e i n s t r u c t i o n s, e t c P i p e l i n e E

More information

FOR SALE T H S T E., P R I N C E AL BER T SK

FOR SALE T H S T E., P R I N C E AL BER T SK FOR SALE 1 50 1 1 5 T H S T E., P R I N C E AL BER T SK CHECK OUT THIS PROPERTY ON YOUTUBE: LIVINGSKY CONDOS TOUR W W W. LIV IN G S K YC O N D O S. C A Th e re is ou tstan d in g val ue in these 52 re

More information

The Hash Function JH 1

The Hash Function JH 1 The Hash Function JH 1 16 January, 2011 Hongjun Wu 2,3 wuhongjun@gmail.com 1 The design of JH is tweaked in this report. The round number of JH is changed from 35.5 to 42. This new version may be referred

More information

45nm 1GHz Samsung Exynos more than 15 million sold.

45nm 1GHz Samsung Exynos more than 15 million sold. Smartphone/tablet CPUs 1 Apple A4 also appeared 2 ipad 1 (2010) was the in iphone 4 (2010). first popular tablet: 45nm 1GHz Samsung Exynos more than 15 million sold. 3110 in Samsung Galaxy S (2010) ipad

More information

The HMAC brawl. Daniel J. Bernstein University of Illinois at Chicago

The HMAC brawl. Daniel J. Bernstein University of Illinois at Chicago The HMAC brawl Daniel J. Bernstein University of Illinois at Chicago 2012.02.19 Koblitz Menezes Another look at HMAC : : : : Third, we describe a fundamental flaw in Bellare s 2006 security proof for HMAC,

More information

Optimized Interpolation Attacks on LowMC

Optimized Interpolation Attacks on LowMC Optimized Interpolation Attacks on LowMC Itai Dinur 1, Yunwen Liu 2, Willi Meier 3, and Qingju Wang 2,4 1 Département d Informatique, École Normale Supérieure, Paris, France 2 Dept. Electrical Engineering

More information

McEliece and Niederreiter Cryptosystems That Resist Quantum Fourier Sampling Attacks

McEliece and Niederreiter Cryptosystems That Resist Quantum Fourier Sampling Attacks McEliece and Niederreiter Cryptosystems That Resist Quantum Fourier Sampling Attacks Hang Dinh Indiana Uniersity South Bend joint work with Cristopher Moore Uniersity of New Mexico Alexander Russell Uniersity

More information

Define Efficiency. 2: Analysis. Efficiency. Measuring efficiency. CSE 417: Algorithms and Computational Complexity. Winter 2007 Larry Ruzzo

Define Efficiency. 2: Analysis. Efficiency. Measuring efficiency. CSE 417: Algorithms and Computational Complexity. Winter 2007 Larry Ruzzo CSE 417: Algorithms and Computational 2: Analysis Winter 2007 Larry Ruzzo Define Efficiency Runs fast on typical real problem instances Pro: sensible, bottom-line-oriented Con: moving target (diff computers,

More information

Fixed Argument Pairings

Fixed Argument Pairings craig.costello@qut.edu.au Queensland University of Technology LatinCrypt 2010 Puebla, Mexico Joint work with Douglas Stebila Pairings A mapping e : G 1 G 2 G T : P G 1, Q G 2 and e(p, Q) G T : groups are

More information

Diophantine equations via weighted LLL algorithm

Diophantine equations via weighted LLL algorithm Cryptanalysis of a public key cryptosystem based on Diophantine equations via weighted LLL algorithm Momonari Kudo Graduate School of Mathematics, Kyushu University, JAPAN Kyushu University Number Theory

More information

CBEAM: Ecient Authenticated Encryption from Feebly One-Way φ Functions

CBEAM: Ecient Authenticated Encryption from Feebly One-Way φ Functions CBEAM: Ecient Authenticated Encryption from Feebly One-Way φ Functions Author: Markku-Juhani O. Saarinen Presented by: Jean-Philippe Aumasson CT-RSA '14, San Francisco, USA 26 February 2014 1 / 19 Sponge

More information

You separate binary numbers into columns in a similar fashion. 2 5 = 32

You separate binary numbers into columns in a similar fashion. 2 5 = 32 RSA Encryption 2 At the end of Part I of this article, we stated that RSA encryption works because it s impractical to factor n, which determines P 1 and P 2, which determines our private key, d, which

More information