Software implementation of ECC

Size: px
Start display at page:

Download "Software implementation of ECC"

Transcription

1 Software implementation of ECC Radboud University, Nijmegen, The Netherlands June 4, 2015 Summer school on real-world crypto and privacy Šibenik, Croatia

2 Software implementation of (H)ECC Radboud University, Nijmegen, The Netherlands June 4, 2015 Summer school on real-world crypto and privacy Šibenik, Croatia

3 The ECDLP Definition Given two points P and Q on an elliptic curve, such that Q P, find an integer k such that kp = Q. 3

4 The ECDLP Definition Given two points P and Q on an elliptic curve, such that Q P, find an integer k such that kp = Q. Efficient ECC First idea: user needs to compute kp, so make that fast 3

5 The ECDLP Definition Given two points P and Q on an elliptic curve, such that Q P, find an integer k such that kp = Q. Efficient ECC First idea: user needs to compute kp, so make that fast Actual situation is more complex: Keypair generation: Compute kp for fixed P, don t leak information about scalar k 3

6 The ECDLP Definition Given two points P and Q on an elliptic curve, such that Q P, find an integer k such that kp = Q. Efficient ECC First idea: user needs to compute kp, so make that fast Actual situation is more complex: Keypair generation: Compute kp for fixed P, don t leak information about scalar k DH common-key computation: Compute kp for variable P, don t leak information about scalar k 3

7 The ECDLP Definition Given two points P and Q on an elliptic curve, such that Q P, find an integer k such that kp = Q. Efficient ECC First idea: user needs to compute kp, so make that fast Actual situation is more complex: Keypair generation: Compute kp for fixed P, don t leak information about scalar k DH common-key computation: Compute kp for variable P, don t leak information about scalar k Signature verification needs k1p 1 + k 2P 2, ok to leak information about (public) scalars k 1 and k 2 3

8 The ECDLP Definition Given two points P and Q on an elliptic curve, such that Q P, find an integer k such that kp = Q. Efficient ECC First idea: user needs to compute kp, so make that fast Actual situation is more complex: Keypair generation: Compute kp for fixed P, don t leak information about scalar k DH common-key computation: Compute kp for variable P, don t leak information about scalar k Signature verification needs k1p 1 + k 2P 2, ok to leak information about (public) scalars k 1 and k 2 3

9 The ECC implementation pyramid Scalar multiplication ECC add/double Finite-field arithmetic Big-integer or polynomial arithmetic 4

10 Why I don t like the pyramid... Pyramid levels are not independent Interactions through all levels, relevant for Correctness, Security, and Performance 5

11 Why I don t like the pyramid... Pyramid levels are not independent Interactions through all levels, relevant for Correctness, Security, and Performance Plan for today: demonstrate these dependencies 5

12 Why I don t like the pyramid... Pyramid levels are not independent Interactions through all levels, relevant for Correctness, Security, and Performance Plan for today: demonstrate these dependencies Fix target architecture: AMD64 (aka x86_64, aka x64) Fix target microarchitecture: Intel Sandy Bridge and Ivy Bridge 5

13 Let s start with 255-bit integers typedef struct{ unsigned long long a[4]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; } 6

14 Let s start with 255-bit integers typedef struct{ unsigned long long a[4]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; } What s wrong about this? 6

15 Let s start with 255-bit integers typedef struct{ unsigned long long a[4]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; } What s wrong about this? This performs arithmetic on a vector of 4 independent 64-bit integers (modulo 2 64 ) 6

16 Let s start with 255-bit integers typedef struct{ unsigned long long a[4]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; } What s wrong about this? This performs arithmetic on a vector of 4 independent 64-bit integers (modulo 2 64 ) This is not the same as arithmetic on 256-bit integers Need to ripple the carries of all additions! 6

17 Radix-2 51 representation Radix-2 64 representation works and is sometimes a good choice Highly depends on the efficiency of handling carries 7

18 Radix-2 51 representation Radix-2 64 representation works and is sometimes a good choice Highly depends on the efficiency of handling carries Example 1: Intel Nehalem can do 3 additions every cycle, but only 1 addition with carry every two cycles (carries cost a factor of 6!) 7

19 Radix-2 51 representation Radix-2 64 representation works and is sometimes a good choice Highly depends on the efficiency of handling carries Example 1: Intel Nehalem can do 3 additions every cycle, but only 1 addition with carry every two cycles (carries cost a factor of 6!) Example 2: When using vector arithmetic, carries are typically lost (expensive to recompute) 7

20 Radix-2 51 representation Radix-2 64 representation works and is sometimes a good choice Highly depends on the efficiency of handling carries Example 1: Intel Nehalem can do 3 additions every cycle, but only 1 addition with carry every two cycles (carries cost a factor of 6!) Example 2: When using vector arithmetic, carries are typically lost (expensive to recompute) Let s get rid of the carries, represent A as (a 0, a 1, a 2, a 3, a 4 ) with A = 4 a i 2 51 i i=0 This is called radix-2 51 representation 7

21 Radix-2 51 representation Radix-2 64 representation works and is sometimes a good choice Highly depends on the efficiency of handling carries Example 1: Intel Nehalem can do 3 additions every cycle, but only 1 addition with carry every two cycles (carries cost a factor of 6!) Example 2: When using vector arithmetic, carries are typically lost (expensive to recompute) Let s get rid of the carries, represent A as (a 0, a 1, a 2, a 3, a 4 ) with A = 4 a i 2 51 i i=0 This is called radix-2 51 representation Multiple ways to write the same integer A, for example A = 2 52 : (2 52, 0, 0, 0, 0) (0, 2, 0, 0, 0) 7

22 Addition of two bigint255 typedef struct{ unsigned long long a[5]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; r->a[4] = x->a[4] + y->a[4]; } 8

23 Addition of two bigint255 typedef struct{ unsigned long long a[5]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; r->a[4] = x->a[4] + y->a[4]; } 8

24 Addition of two bigint255 typedef struct{ unsigned long long a[5]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; r->a[4] = x->a[4] + y->a[4]; } This works as long as all coefficients are in [0,..., ] 8

25 Addition of two bigint255 typedef struct{ unsigned long long a[5]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] + y->a[0]; r->a[1] = x->a[1] + y->a[1]; r->a[2] = x->a[2] + y->a[2]; r->a[3] = x->a[3] + y->a[3]; r->a[4] = x->a[4] + y->a[4]; } This works as long as all coefficients are in [0,..., ] When starting with 51-bit coefficients, we can do quite a few additions before we have to carry 8

26 Subtraction of two bigint255 typedef struct{ signed long long a[5]; } bigint255; void bigint255_sub(bigint255 *r, const bigint255 *x, const bigint255 *y) { r->a[0] = x->a[0] - y->a[0]; r->a[1] = x->a[1] - y->a[1]; r->a[2] = x->a[2] - y->a[2]; r->a[3] = x->a[3] - y->a[3]; r->a[4] = x->a[4] - y->a[4]; } Slightly update our bigint255 definition to work with signed 64-bit integers 9

27 Carrying in radix-2 51 With many additions, coefficients may grow larger than 63 bits They grow even faster in multiplication 10

28 Carrying in radix-2 51 With many additions, coefficients may grow larger than 63 bits They grow even faster in multiplication Eventually we have to carry en bloc: signed long long carry = r.a[0] >> 51; r.a[1] += carry; carry <<= 51; r.a[0] -= carry; 10

29 Carrying in radix-2 51 With many additions, coefficients may grow larger than 63 bits They grow even faster in multiplication Eventually we have to carry en bloc: signed long long carry = r.a[0] >> 51; r.a[1] += carry; carry <<= 51; r.a[0] -= carry; Similar for all higher coefficients... 10

30 Big integers and polynomials Addition/Subtraction code would look exactly the same for 5-coefficient polynomial addition 11

31 Big integers and polynomials Addition/Subtraction code would look exactly the same for 5-coefficient polynomial addition This is no coincidence: We actually perform arithmetic in Z[x] Inputs to addition/subtraction are 5-coefficient polynomials 11

32 Big integers and polynomials Addition/Subtraction code would look exactly the same for 5-coefficient polynomial addition This is no coincidence: We actually perform arithmetic in Z[x] Inputs to addition/subtraction are 5-coefficient polynomials Nice thing about arithmetic Z[x]: no carries! 11

33 Big integers and polynomials Addition/Subtraction code would look exactly the same for 5-coefficient polynomial addition This is no coincidence: We actually perform arithmetic in Z[x] Inputs to addition/subtraction are 5-coefficient polynomials Nice thing about arithmetic Z[x]: no carries! To go from Z[x] to Z, evaluate at the radix (this is a ring homomorphism) Carrying means evaluating at the radix 11

34 Big integers and polynomials Addition/Subtraction code would look exactly the same for 5-coefficient polynomial addition This is no coincidence: We actually perform arithmetic in Z[x] Inputs to addition/subtraction are 5-coefficient polynomials Nice thing about arithmetic Z[x]: no carries! To go from Z[x] to Z, evaluate at the radix (this is a ring homomorphism) Carrying means evaluating at the radix Thinking of multiprecision integers as polynomials is very powerful for efficient arithmetic 11

35 Using floating-point limbs Now we can also use floats for our coefficients An IEEE-754 floating-point number has value ( 1) s (1.b m 1 b m 2... b 0 ) 2 e t with b i {0, 1} 12

36 Using floating-point limbs Now we can also use floats for our coefficients An IEEE-754 floating-point number has value ( 1) s (1.b m 1 b m 2... b 0 ) 2 e t with b i {0, 1} For double-precision floats: s {0, 1} sign bit m = 52 mantissa bits e {1,..., 2046} exponent t =

37 Using floating-point limbs Now we can also use floats for our coefficients An IEEE-754 floating-point number has value ( 1) s (1.b m 1 b m 2... b 0 ) 2 e t with b i {0, 1} For double-precision floats: s {0, 1} sign bit m = 52 mantissa bits e {1,..., 2046} exponent t = 1023 For single-precision floats: s {0, 1} sign bit m = 23 mantissa bits e {1,..., 254} exponent t =

38 Using floating-point limbs Now we can also use floats for our coefficients An IEEE-754 floating-point number has value ( 1) s (1.b m 1 b m 2... b 0 ) 2 e t with b i {0, 1} For double-precision floats: s {0, 1} sign bit m = 52 mantissa bits e {1,..., 2046} exponent t = 1023 For single-precision floats: s {0, 1} sign bit m = 23 mantissa bits e {1,..., 254} exponent t = 127 Exponent = 0 used to represent 0 12

39 Using floating-point limbs Now we can also use floats for our coefficients An IEEE-754 floating-point number has value ( 1) s (1.b m 1 b m 2... b 0 ) 2 e t with b i {0, 1} For double-precision floats: s {0, 1} sign bit m = 52 mantissa bits e {1,..., 2046} exponent t = 1023 For single-precision floats: s {0, 1} sign bit m = 23 mantissa bits e {1,..., 254} exponent t = 127 Exponent = 0 used to represent 0 Any number that can be represented like this, will be precise Other numbers will be rounded, according to a rounding mode 12

40 Addition typedef struct{ double a[12]; } bigint255; void bigint255_add(bigint255 *r, const bigint255 *x, const bigint255 *y) { int i; for(i=0;i<12;i++) r->a[i] = x->a[i] + y->a[i]; } 13

41 Subtraction typedef struct{ double a[12]; } bigint255; void bigint255_sub(bigint255 *r, const bigint255 *x, const bigint255 *y) { int i; for(i=0;i<12;i++) r->a[i] = x->a[i] - y->a[i]; } 14

42 Carrying For carrying integers we used a right shift (discard lowest bits) 15

43 Carrying For carrying integers we used a right shift (discard lowest bits) For floating-point numbers we can use multiplication by the inverse of the radix Example: Radix 2 22, multiply by 2 22 This does not cut off lowest bits, need to round 15

44 Carrying For carrying integers we used a right shift (discard lowest bits) For floating-point numbers we can use multiplication by the inverse of the radix Example: Radix 2 22, multiply by 2 22 This does not cut off lowest bits, need to round Some processors have efficient rounding instructions, e.g., vroundpd 15

45 Carrying For carrying integers we used a right shift (discard lowest bits) For floating-point numbers we can use multiplication by the inverse of the radix Example: Radix 2 22, multiply by 2 22 This does not cut off lowest bits, need to round Some processors have efficient rounding instructions, e.g., vroundpd Otherwise (for double-precision): add constant subtract constant This will round the number to an integer according to the rounding mode (to nearest, towards zero, away from zero, or truncate) 15

46 Why would you want this? ECC is typically bottlenecked by speed of multiplier Intel Sandy Bridge, Ivy Bridge: One multiplication per cycle 16

47 Why would you want this? ECC is typically bottlenecked by speed of multiplier Intel Sandy Bridge, Ivy Bridge: One multiplication per cycle Four (vectorized) double-precision multiplications per cycle 16

48 Why would you want this? ECC is typically bottlenecked by speed of multiplier Intel Sandy Bridge, Ivy Bridge: One multiplication per cycle Four (vectorized) double-precision multiplications per cycle Four (vectorized) double-precision additions in the same cycle 16

49 Why would you want this? ECC is typically bottlenecked by speed of multiplier Intel Sandy Bridge, Ivy Bridge: One multiplication per cycle Four (vectorized) double-precision multiplications per cycle Four (vectorized) double-precision additions in the same cycle Operations on 256-bit vector registers introduced with AVX 16

50 Why would you want this? ECC is typically bottlenecked by speed of multiplier Intel Sandy Bridge, Ivy Bridge: One multiplication per cycle Four (vectorized) double-precision multiplications per cycle Four (vectorized) double-precision additions in the same cycle Operations on 256-bit vector registers introduced with AVX Integer operations on those registers introduced only with AVX2 Sandy Bridge and Ivy Bridge don t have AVX2 16

51 Vectorizing EC scalar multiplication Computing multiple scalar multiplications Changes the rules of the game Increases size of active data set 17

52 Vectorizing EC scalar multiplication Computing multiple scalar multiplications Changes the rules of the game Increases size of active data set Parallelism inside multiprecision arithmetic Addition (in redundant representation) is trivially vectorized Vectorizing multiplication needs many shuffles Vectorization eats up instruction-level parallelism 17

53 Vectorizing EC scalar multiplication Computing multiple scalar multiplications Changes the rules of the game Increases size of active data set Parallelism inside multiprecision arithmetic Addition (in redundant representation) is trivially vectorized Vectorizing multiplication needs many shuffles Vectorization eats up instruction-level parallelism Parallelism inside EC arithmetic Vectorize independent multiplications in EC addition May still need some shuffles (after each block of operations) Efficiency depends on EC formulas 17

54 Example: Montgomery ladder step function ladderstep(x Q P, X P, Z P, X Q, Z Q ) t 1 X P + Z P t 6 t 2 1 t 2 X P Z P t 7 t 2 2 t 5 t 6 t 7 t 3 X Q + Z Q t 4 X Q Z Q t 8 t 4 t 1 t 9 t 3 t 2 X P +Q (t 8 + t 9 ) 2 Z P +Q x Q P (t 8 t 9 ) 2 X [2]P t 6 t 7 Z [2]P t 5 (t 7 + ((A + 2)/4) t 5 ) return (X [2]P, Z [2]P, X P +Q, Z P +Q ) end function 18

55 Example: Montgomery ladder step function ladderstep(x Q P, X P, Z P, X Q, Z Q ) t 1 X P + Z P ; t 2 X P Z P ; t 3 X Q + Z Q ; t 4 X Q Z Q t 6 t 1 t 1 ; t 7 t 2 t 2 ; t 8 t 4 t 1 ; t 9 t 3 t 2 t 10 ((A + 2)/4) t 6 t 11 ((A + 2)/4 1) t 7 t 5 t 6 t 7 ; t 4 t 10 t 11 ; t 1 t 8 t 9 ; t 0 t 8 + t 9 Z [2]P t 5 t 4 ; X P +Q t 2 0; X [2]P t 6 t 7 ; t 2 t 1 t 1 Z P +Q x Q P t 2 return (X [2]P, Z [2]P, X P +Q, Z P +Q ) end function 18

56 A better candidate: Kummer surfaces Think of a Kummer surface as the Jacobian of a hyperelliptic curve modulo negation 19

57 A better candidate: Kummer surfaces Think of a Kummer surface as the Jacobian of a hyperelliptic curve modulo negation Easier way to think about it: Group modulo negation Map from group to Kummer surface by rational map X Elements represented projectively as (x : y : z : t) (x : y : z : t) = (rx : ry : rz : rt) for any r 0 Efficient doubling and efficient differential addition 19

58 A better candidate: Kummer surfaces Think of a Kummer surface as the Jacobian of a hyperelliptic curve modulo negation Easier way to think about it: Group modulo negation Map from group to Kummer surface by rational map X Elements represented projectively as (x : y : z : t) (x : y : z : t) = (rx : ry : rz : rt) for any r 0 Efficient doubling and efficient differential addition Ladderstep: gets as input X(P ) = (x 2 : y 2 : z 2 : t 2 ), X(Q) = (x 3 : y 3 : z 3 : t 3 ), and X(Q P ) = (x 1 : y 1 : z 1 : t 1 ) Computes X(2P ) = (x4 : y 4 : z 4 : t 4) Computes X(P + Q) = (x5 : y 5 : z 5 : t 5) 19

59 A better candidate: Kummer surfaces Think of a Kummer surface as the Jacobian of a hyperelliptic curve modulo negation Easier way to think about it: Group modulo negation Map from group to Kummer surface by rational map X Elements represented projectively as (x : y : z : t) (x : y : z : t) = (rx : ry : rz : rt) for any r 0 Efficient doubling and efficient differential addition Ladderstep: gets as input X(P ) = (x 2 : y 2 : z 2 : t 2 ), X(Q) = (x 3 : y 3 : z 3 : t 3 ), and X(Q P ) = (x 1 : y 1 : z 1 : t 1 ) Computes X(2P ) = (x4 : y 4 : z 4 : t 4) Computes X(P + Q) = (x5 : y 5 : z 5 : t 5) Coordinates are elements of a (large) finite field 19

60 A better candidate: Kummer surfaces Think of a Kummer surface as the Jacobian of a hyperelliptic curve modulo negation Easier way to think about it: Group modulo negation Map from group to Kummer surface by rational map X Elements represented projectively as (x : y : z : t) (x : y : z : t) = (rx : ry : rz : rt) for any r 0 Efficient doubling and efficient differential addition Ladderstep: gets as input X(P ) = (x 2 : y 2 : z 2 : t 2 ), X(Q) = (x 3 : y 3 : z 3 : t 3 ), and X(Q P ) = (x 1 : y 1 : z 1 : t 1 ) Computes X(2P ) = (x4 : y 4 : z 4 : t 4) Computes X(P + Q) = (x5 : y 5 : z 5 : t 5) Coordinates are elements of a (large) finite field For same security level, underlying field has half the size as for ECC Example: Choose 128-bit field for 128 bits of security 19

61 Arithmetic on the Kummer surface x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 a b a c A2 D 2 a d x 3 H y 3 z 3 t 3 H x1 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 10M + 9S + 6m ladder formulas 20

62 Arithmetic on the Kummer surface x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 a b a c A2 D 2 a d x 3 H y 3 z 3 t 3 H x1 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 10M + 9S + 6m ladder formulas x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 a b a c A2 D 2 a d x 3 H y 3 z 3 t 3 H A2 B 2 A2 C 2 A2 D 2 x1 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 7M + 12S + 9m ladder formulas 20

63 The squared Kummer surface In fact, we use arithmetic on a different, squared surface Each point (x : y : z : t) on the original surface corresponds to (x 2 : y 2 : z 2 : t 2 ) on the squared surface No operation-count advantages Easier to construct squared surface with small constants In the following rename (x 2 : y 2 : z 2 : t 2 ) to (x : y : z : t) 21

64 Arithmetic on the squared Kummer surface x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 A2 D 2 x 3 H y 3 z 3 t 3 H a2 b a2 2 c a2 2 d x1 2 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 10M + 9S + 6m ladder formulas 22

65 Arithmetic on the squared Kummer surface x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 A2 D 2 x 3 H y 3 z 3 t 3 H a2 b a2 2 c a2 2 d x1 2 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 A2 D 2 x 3 H y 3 z 3 t 3 H A2 B 2 A2 C 2 A2 D 2 a2 b a2 2 c a2 2 d x1 2 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 10M + 9S + 6m ladder formulas 7M + 12S + 9m ladder formulas 22

66 Arithmetic on the (original) Kummer surface x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 a b a c A2 D 2 a d x 3 H y 3 z 3 t 3 H x1 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 10M + 9S + 6m ladder formulas x 2 H H y 2 A2 B 2 z 2 A2 C 2 t 2 a b a c A2 D 2 a d x 3 H y 3 z 3 t 3 H A2 B 2 A2 C 2 A2 D 2 x1 y 1 x1 z 1 x1 t 1 x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 7M + 12S + 9m ladder formulas 23

67 A suitable Kummer surface Formulas for efficient Kummer surface arithmetic known for a while Originally proposed by Chudnovsky, Chudnovsky, M + 9S + 6m formulas by Gaudry, M + 12S + 9m formulas by Bernstein,

68 A suitable Kummer surface Formulas for efficient Kummer surface arithmetic known for a while Originally proposed by Chudnovsky, Chudnovsky, M + 9S + 6m formulas by Gaudry, M + 12S + 9m formulas by Bernstein, 2006 Problem: find cryptographically secure surface with small constants 24

69 A suitable Kummer surface Formulas for efficient Kummer surface arithmetic known for a while Originally proposed by Chudnovsky, Chudnovsky, M + 9S + 6m formulas by Gaudry, M + 12S + 9m formulas by Bernstein, 2006 Problem: find cryptographically secure surface with small constants Gaudry, Schost, 2012: suitable (squared) surface: Defined over the field F (1 : a 2 /b 2 : a 2 /c 2 : a 2 /d 2 ) = ( 114 : 57 : 66 : 418) (1 : A 2 /B 2 : A 2 /C 2 : A 2 /D 2 ) = ( 833 : 2499 : 1617 : 561) 24

70 A suitable Kummer surface Formulas for efficient Kummer surface arithmetic known for a while Originally proposed by Chudnovsky, Chudnovsky, M + 9S + 6m formulas by Gaudry, M + 12S + 9m formulas by Bernstein, 2006 Problem: find cryptographically secure surface with small constants Gaudry, Schost, 2012: suitable (squared) surface: Defined over the field F (1 : a 2 /b 2 : a 2 /c 2 : a 2 /d 2 ) = ( 114 : 57 : 66 : 418) (1 : A 2 /B 2 : A 2 /C 2 : A 2 /D 2 ) = ( 833 : 2499 : 1617 : 561) Finding this surface cost CPU hours The same surface has been used by Bos, Costello, Hisil, and Lauter (Eurocrypt 2013) 24

71 Representing elements of F Represent an element A in radix-2 127/6 Write A as a 0, a 1, a 2, a 3, a 4, a 5, where a0 is a small multiple of 2 0 a1 is a small multiple of 2 22 a2 is a small multiple of 2 43 a3 is a small multiple of 2 64 a4 is a small multiple of 2 85 a5 is a small multiple of

72 Multiplication Consider multiplication of A and B with reduction mod Make use of the fact that With radix 2 127/6 we obtain: r 0 = a 0 b a 1 b a 2 b a 3 b a 4 b a 5 b 1 r 1 = a 0 b 1 + a 1 b a 2 b a 3 b a 4 b a 5 b 2 r 2 = a 0 b 2 + a 1 b 1 + a 2 b a 3 b a 4 b a 5 b 3 r 3 = a 0 b 3 + a 1 b 2 + a 2 b 1 + a 3 b a 4 b a 5 b 4 r 4 = a 0 b 4 + a 1 b 3 + a 2 b 2 + a 3 b 1 + a 4 b a 5 b 5 r 5 = a 0 b 5 + a 1 b 4 + a 2 b 3 + a 3 b 2 + a 4 b 1 + a 5 b 0 26

73 Multiplication Consider multiplication of A and B with reduction mod Make use of the fact that With radix 2 127/6 we obtain: r 0 = a 0 b a 1 b a 2 b a 3 b a 4 b a 5 b 1 r 1 = a 0 b 1 + a 1 b a 2 b a 3 b a 4 b a 5 b 2 r 2 = a 0 b 2 + a 1 b 1 + a 2 b a 3 b a 4 b a 5 b 3 r 3 = a 0 b 3 + a 1 b 2 + a 2 b 1 + a 3 b a 4 b a 5 b 4 r 4 = a 0 b 4 + a 1 b 3 + a 2 b 2 + a 3 b 1 + a 4 b a 5 b 5 r 5 = a 0 b 5 + a 1 b 4 + a 2 b 3 + a 3 b 2 + a 4 b 1 + a 5 b 0 Obviously, we always perform this whole thing 4 in parallel 26

74 Multiplication Consider multiplication of A and B with reduction mod Make use of the fact that With radix 2 127/6 we obtain: r 0 = a 0 b a 1 b a 2 b a 3 b a 4 b a 5 b 1 r 1 = a 0 b 1 + a 1 b a 2 b a 3 b a 4 b a 5 b 2 r 2 = a 0 b 2 + a 1 b 1 + a 2 b a 3 b a 4 b a 5 b 3 r 3 = a 0 b 3 + a 1 b 2 + a 2 b 1 + a 3 b a 4 b a 5 b 4 r 4 = a 0 b 4 + a 1 b 3 + a 2 b 2 + a 3 b 1 + a 4 b a 5 b 5 r 5 = a 0 b 5 + a 1 b 4 + a 2 b 3 + a 3 b 2 + a 4 b 1 + a 5 b 0 Obviously, we always perform this whole thing 4 in parallel Obviously, we specialize squaring 26

75 Multiplication Consider multiplication of A and B with reduction mod Make use of the fact that With radix 2 127/6 we obtain: r 0 = a 0 b a 1 b a 2 b a 3 b a 4 b a 5 b 1 r 1 = a 0 b 1 + a 1 b a 2 b a 3 b a 4 b a 5 b 2 r 2 = a 0 b 2 + a 1 b 1 + a 2 b a 3 b a 4 b a 5 b 3 r 3 = a 0 b 3 + a 1 b 2 + a 2 b 1 + a 3 b a 4 b a 5 b 4 r 4 = a 0 b 4 + a 1 b 3 + a 2 b 2 + a 3 b 1 + a 4 b a 5 b 5 r 5 = a 0 b 5 + a 1 b 4 + a 2 b 3 + a 3 b 2 + a 4 b 1 + a 5 b 0 Obviously, we always perform this whole thing 4 in parallel Obviously, we specialize squaring Obviously, we specialize multiplications by small constants 26

76 The Hadamard transform x y z t Only shuffeling operation in Kummer arithmetic AVX has limited shuffeling across left and right half Plain Hadamard turns out to be expensive 27

77 The Hadamard transform x y z t Permuted and negated Hadamard Only shuffeling operation in Kummer arithmetic AVX has limited shuffeling across left and right half Plain Hadamard turns out to be expensive Allow generalized Hadamard to output permuted vector Self-inverting permutation cleans after two generalized Hadamards 27

78 The Hadamard transform x y z t Permuted and negated Hadamard Only shuffeling operation in Kummer arithmetic AVX has limited shuffeling across left and right half Plain Hadamard turns out to be expensive Allow generalized Hadamard to output permuted vector Self-inverting permutation cleans after two generalized Hadamards Allow generalized Hadamard to negate vector entries Clean negations by multiplication by negated constants 27

79 Arithmetic on the squared Kummer surface x 2 y 2 z 2 x 3 y 3 z 3 H H x x x x t t z z t 2 (A 2 /D 2 ) ( A 2 /C 2 ) (A 2 /B 2 ) t y y (a 2 /b 2 ) y z z z ( a 2 /c 2 ) z y y t y t (a 2 /d 2 ) t H H x x x x t t z z t 3 (A 2 /D 2 ) ( A 2 /C 2 ) (A 2 /B 2 ) t y y (x 1 /y 1 ) y z z z ( x 1 /z 1 ) z y y y t t (x 1 /t 1 ) x 4 y 4 z 4 t 4 x 5 y 5 z 5 t 5 t 28

80 Looking back... Fastest computation units are vector units Choose (H)ECC with efficiently vectorizable formulas 29

81 Looking back... Fastest computation units are vector units Choose (H)ECC with efficiently vectorizable formulas Formulas dictate the scalar multiplication algorithm 29

82 Looking back... Fastest computation units are vector units Choose (H)ECC with efficiently vectorizable formulas Formulas dictate the scalar multiplication algorithm Choose representation of field elements for fast reduction 29

83 Looking back... Fastest computation units are vector units Choose (H)ECC with efficiently vectorizable formulas Formulas dictate the scalar multiplication algorithm Choose representation of field elements for fast reduction Adjust formulas according to fast shuffle instructions 29

84 Looking back... Fastest computation units are vector units Choose (H)ECC with efficiently vectorizable formulas Formulas dictate the scalar multiplication algorithm Choose representation of field elements for fast reduction Adjust formulas according to fast shuffle instructions Optimizations go through all levels of the pyramid! 29

85 Results 128-bit secure, constant-time scalar multiplication arch cycles open g source of software Sandy yes 1 Bernstein Duif Lange Schwabe Yang CHES 2011 Sandy ? no 1 Hamburg Sandy ? no 1 Longa Sica Asiacrypt 2012 Sandy yes 2 Bos Costello Hisil Lauter Eurocrypt 2013 Sandy yes 1 Oliveira López Aranha Rodríguez- Henríquez CHES 2013 Sandy 96000? no 1 Faz-Hernández Longa Sánchez CT- RSA 2014 Sandy 92000? no 1 Faz-Hernández Longa Sánchez July 2014 Sandy yes 2 new (our results) 30

86 Results 128-bit secure, constant-time scalar multiplication arch cycles open g source of software Ivy yes 1 Bernstein Duif Lange Schwabe Yang CHES 2011 Ivy ? yes 1 Costello Hisil Smith Eurocrypt 2014 Ivy yes 2 Bos Costello Hisil Lauter Eurocrypt 2013 Ivy yes 1 Oliveira López Aranha Rodríguez- Henríquez CHES 2013 Ivy 92000? no 1 Faz-Hernández Longa Sánchez CT- RSA 2014 Ivy 89000? no 1 Faz-Hernández Longa Sánchez July 2014 Ivy yes 2 new (our results) 30

87 More results Also optimized for Intel Haswell arch cycles open g source of software Haswell yes 1 Bernstein Duif Lange Schwabe Yang CHES 2011 Haswell yes 2 Bos Costello Hisil Lauter Eurocrypt 2013 Haswell no 1 Oliveira López Aranha Rodríguez-Henríquez CHES 2013 Haswell yes 2 new (our results) 31

88 Even more results Also optimized for ARM Cortex-A8 arch cycles open g source of software A8-slow yes 1 Bernstein Schwabe CHES 2012 A8-slow yes 2 new (our result) A8-fast yes 1 Bernstein Schwabe CHES 2012 A8-fast yes 2 new (our result) 32

89 Resources online Paper: Daniel J. Bernstein, Chitchanok Chuengsatiansup, Tanja Lange, Peter Schwabe. Kummer strikes back: new DH speed records. Software: Included in SUPERCOP, subdirectory crypto_scalarmult/kummer/ 33

(which is not the same as: hyperelliptic-curve cryptography and elliptic-curve cryptography)

(which is not the same as: hyperelliptic-curve cryptography and elliptic-curve cryptography) Hyper-and-elliptic-curve cryptography (which is not the same as: hyperelliptic-curve cryptography and elliptic-curve cryptography) Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit

More information

Kummer strikes back: new DH speed records

Kummer strikes back: new DH speed records Kummer strikes back: new D speed records Daniel J. Bernstein 1,2, Chitchanok Chuengsatiansup 2, Tanja Lange 2, and Peter Schwabe 3 1 Department of Computer Science, University of Illinois at Chicago Chicago,

More information

µkummer: efficient hyperelliptic signatures and key exchange on microcontrollers

µkummer: efficient hyperelliptic signatures and key exchange on microcontrollers µkummer: efficient hyperelliptic signatures and key exchange on microcontrollers Joost Renes 1 Peter Schwabe 1 Benjamin Smith 2 Lejla Batina 1 1 Digital Security Group, Radboud University, The Netherlands

More information

Advanced Constructions in Curve-based Cryptography

Advanced Constructions in Curve-based Cryptography Advanced Constructions in Curve-based Cryptography Benjamin Smith Team GRACE INRIA and Laboratoire d Informatique de l École polytechnique (LIX) Summer school on real-world crypto and privacy Sibenik,

More information

Fast Cryptography in Genus 2

Fast Cryptography in Genus 2 Fast Cryptography in Genus 2 Joppe W. Bos, Craig Costello, Huseyin Hisil and Kristin Lauter EUROCRYPT 2013 Athens, Greece May 27, 2013 Fast Cryptography in Genus 2 Recall that curves are much better than

More information

Hyperelliptic Curve Cryptography

Hyperelliptic Curve Cryptography Hyperelliptic Curve Cryptography A SHORT INTRODUCTION Definition (HEC over K): Curve with equation y 2 + h x y = f x with h, f K X Genus g deg h(x) g, deg f x = 2g + 1 f monic Nonsingular 2 Nonsingularity

More information

Efficient and Secure Algorithms for GLV-Based Scalar Multiplication and Their Implementation on GLV-GLS Curves

Efficient and Secure Algorithms for GLV-Based Scalar Multiplication and Their Implementation on GLV-GLS Curves Efficient and Secure Algorithms for GLV-Based Scalar Multiplication and Their Implementation on GLV-GLS Curves SESSION ID: CRYP-T07 Patrick Longa Microsoft Research http://research.microsoft.com/en-us/people/plonga/

More information

Curve41417: Karatsuba revisited

Curve41417: Karatsuba revisited Curve41417: Karatsuba revisited Chitchanok Chuengsatiansup Technische Universiteit Eindhoven September 25, 2014 Joint work with Daniel J. Bernstein and Tanja Lange Chitchanok Chuengsatiansup Curve41417:

More information

Hyperelliptic-curve cryptography. D. J. Bernstein University of Illinois at Chicago

Hyperelliptic-curve cryptography. D. J. Bernstein University of Illinois at Chicago Hyperelliptic-curve cryptography D. J. Bernstein University of Illinois at Chicago Thanks to: NSF DMS 0140542 NSF ITR 0716498 Alfred P. Sloan Foundation Two parts to this talk: 1. Elliptic curves; modern

More information

Fast point multiplication algorithms for binary elliptic curves with and without precomputation

Fast point multiplication algorithms for binary elliptic curves with and without precomputation Fast point multiplication algorithms for binary elliptic curves with and without precomputation Thomaz Oliveira 1 Diego F. Aranha 2 Julio López 2 Francisco Rodríguez-Henríquez 1 1 CINVESTAV-IPN, Mexico

More information

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition Joppe W. Bos, Craig Costello, Huseyin Hisil, Kristin Lauter CHES 2013 Motivation - I Group DH ECDH (F p1, ) (E F p2, +)

More information

Post-Snowden Elliptic Curve Cryptography. Patrick Longa Microsoft Research

Post-Snowden Elliptic Curve Cryptography. Patrick Longa Microsoft Research Post-Snowden Elliptic Curve Cryptography Patrick Longa Microsoft Research Joppe Bos Craig Costello Michael Naehrig NXP Semiconductors Microsoft Research Microsoft Research June 2013 the Snowden leaks the

More information

Chapter 4 Number Representations

Chapter 4 Number Representations Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point

More information

A new algorithm for residue multiplication modulo

A new algorithm for residue multiplication modulo A new algorithm for residue multiplication modulo 2 521 1 Shoukat Ali and Murat Cenk Institute of Applied Mathematics Middle East Technical University, Ankara, Turkey shoukat.1983@gmail.com mcenk@metu.edu.tr

More information

Software implementation of Koblitz curves over quadratic fields

Software implementation of Koblitz curves over quadratic fields Software implementation of Koblitz curves over quadratic fields Thomaz Oliveira 1, Julio López 2 and Francisco Rodríguez-Henríquez 1 1 Computer Science Department, Cinvestav-IPN 2 Institute of Computing,

More information

qdsa: Small and Secure Digital Signatures with Curve-based Diffie-Hellman Key Pairs

qdsa: Small and Secure Digital Signatures with Curve-based Diffie-Hellman Key Pairs qdsa: Small and Secure Digital Signatures with Curve-based Diffie-Hellman Key Pairs Joost Renes 1 Benjamin Smith 2 1 Radboud University 2 INRIA and Laboratoire d Informatique de l École polytechnique 15

More information

Fast Cryptography in Genus 2

Fast Cryptography in Genus 2 Fast Cryptography in Genus 2 Joppe W. Bos 1, Craig Costello 1, Huseyin Hisil 2, and Kristin Lauter 1 1 Microsoft Research, Redmond, USA 2 Yasar University, Izmir, Turkey Abstract. In this paper we highlight

More information

McBits: Fast code-based cryptography

McBits: Fast code-based cryptography McBits: Fast code-based cryptography Peter Schwabe Radboud University Nijmegen, The Netherlands Joint work with Daniel Bernstein, Tung Chou December 17, 2013 IMA International Conference on Cryptography

More information

A gentle introduction to elliptic curve cryptography

A gentle introduction to elliptic curve cryptography A gentle introduction to elliptic curve cryptography Craig Costello Summer School on Real-World Crypto and Privacy June 5, 2017 Šibenik, Croatia Part 1: Motivation Part 2: Elliptic Curves Part 3: Elliptic

More information

Faster arithmetic for number-theoretic transforms

Faster arithmetic for number-theoretic transforms University of New South Wales 7th October 2011, Macquarie University Plan for talk 1. Review number-theoretic transform (NTT) 2. Discuss typical butterfly algorithm 3. Improvements to butterfly algorithm

More information

Selecting Elliptic Curves for Cryptography Real World Issues

Selecting Elliptic Curves for Cryptography Real World Issues Selecting Elliptic Curves for Cryptography Real World Issues Michael Naehrig Cryptography Research Group Microsoft Research UW Number Theory Seminar Seattle, 28 April 2015 Elliptic Curve Cryptography 1985:

More information

ECM at Work. Joppe W. Bos 1 and Thorsten Kleinjung 2. 1 Microsoft Research, Redmond, USA

ECM at Work. Joppe W. Bos 1 and Thorsten Kleinjung 2. 1 Microsoft Research, Redmond, USA ECM at Work Joppe W. Bos 1 and Thorsten Kleinjung 2 1 Microsoft Research, Redmond, USA 2 Laboratory for Cryptologic Algorithms, EPFL, Lausanne, Switzerland 1 / 18 Security assessment of public-key cryptography

More information

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition Joppe W. Bos 1, Craig Costello 1, Huseyin Hisil 2, and Kristin Lauter 1 1 Microsoft Research, Redmond, USA 2 Yasar University,

More information

Ed448-Goldilocks, a new elliptic curve

Ed448-Goldilocks, a new elliptic curve Ed448-Goldilocks, a new elliptic curve Mike Hamburg Abstract Many papers have proposed elliptic curves which are faster and easier to implement than the NIST prime-order curves. Most of these curves have

More information

Faster ECC over F 2. School of Computer and Communication Sciences EPFL, Switzerland 2 CertiVox Labs.

Faster ECC over F 2. School of Computer and Communication Sciences EPFL, Switzerland 2 CertiVox Labs. Faster ECC over F 2 521 1 Robert Granger 1 and Michael Scott 2 1 Laboratory for Cryptologic Algorithms School of Computer and Communication Sciences EPFL, Switzerland robbiegranger@gmail.com 2 CertiVox

More information

An improved compression technique for signatures based on learning with errors

An improved compression technique for signatures based on learning with errors An improved compression technique for signatures based on learning with errors Shi Bai and Steven D. Galbraith Department of Mathematics, University of Auckland. CT-RSA 2014 1 / 22 Outline Introduction

More information

Elliptic Curve Cryptography and Security of Embedded Devices

Elliptic Curve Cryptography and Security of Embedded Devices Elliptic Curve Cryptography and Security of Embedded Devices Ph.D. Defense Vincent Verneuil Institut de Mathématiques de Bordeaux Inside Secure June 13th, 2012 V. Verneuil - Elliptic Curve Cryptography

More information

Faster implementation of scalar multiplication on Koblitz curves

Faster implementation of scalar multiplication on Koblitz curves Faster implementation of scalar multiplication on Koblitz curves Diego F. Aranha 1, Armando Faz-Hernández 2, Julio López 3, and Francisco Rodríguez-Henríquez 2 1 Departament of Computer Science, University

More information

FourQNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors

FourQNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors FourQNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors Patrick Longa Microsoft Research, USA plonga@microsoft.com Abstract. We present a high-speed, high-security implementation of the

More information

Two is the fastest prime: lambda coordinates for binary elliptic curves

Two is the fastest prime: lambda coordinates for binary elliptic curves Noname manuscript No. (will be inserted by the editor) Two is the fastest prime: lambda coordinates for binary elliptic curves Thomaz Oliveira Julio López Diego F. Aranha Francisco Rodríguez-Henríquez

More information

Hybrid Binary-Ternary Joint Sparse Form and its Application in Elliptic Curve Cryptography

Hybrid Binary-Ternary Joint Sparse Form and its Application in Elliptic Curve Cryptography Hybrid Binary-Ternary Joint Sparse Form and its Application in Elliptic Curve Cryptography Jithra Adikari, Student Member, IEEE, Vassil Dimitrov, and Laurent Imbert Abstract Multi-exponentiation is a common

More information

CRYPTOGRAPHIC COMPUTING

CRYPTOGRAPHIC COMPUTING CRYPTOGRAPHIC COMPUTING ON GPU Chen Mou Cheng Dept. Electrical Engineering g National Taiwan University January 16, 2009 COLLABORATORS Daniel Bernstein, UIC, USA Tien Ren Chen, Army Tanja Lange, TU Eindhoven,

More information

Double-base scalar multiplication revisited

Double-base scalar multiplication revisited Double-base scalar multiplication revisited Daniel J. Bernstein 1,2, Chitchanok Chuengsatiansup 1, and Tanja Lange 1 1 Department of Mathematics and Computer Science Technische Universiteit Eindhoven P.O.

More information

Faster ECC over F 2. (feat. PMULL)

Faster ECC over F 2. (feat. PMULL) Faster ECC over F 2 571 (feat. PMULL) Hwajeong Seo 1 Institute for Infocomm Research (I2R), Singapore hwajeong84@gmail.com Abstract. In this paper, we show efficient elliptic curve cryptography implementations

More information

ECM at Work. Joppe W. Bos and Thorsten Kleinjung. Laboratory for Cryptologic Algorithms EPFL, Station 14, CH-1015 Lausanne, Switzerland 1 / 14

ECM at Work. Joppe W. Bos and Thorsten Kleinjung. Laboratory for Cryptologic Algorithms EPFL, Station 14, CH-1015 Lausanne, Switzerland 1 / 14 ECM at Work Joppe W. Bos and Thorsten Kleinjung Laboratory for Cryptologic Algorithms EPFL, Station 14, CH-1015 Lausanne, Switzerland 1 / 14 Motivation The elliptic curve method for integer factorization

More information

On the complexity of computing discrete logarithms in the field F

On the complexity of computing discrete logarithms in the field F On the complexity of computing discrete logarithms in the field F 3 6 509 Francisco Rodríguez-Henríquez CINVESTAV-IPN Joint work with: Gora Adj Alfred Menezes Thomaz Oliveira CINVESTAV-IPN University of

More information

Twisted Hessian curves

Twisted Hessian curves Twisted Hessian curves Daniel J. Bernstein 1,2, Chitchanok Chuengsatiansup 1, David Kohel 3, and Tanja Lange 1 1 Department of Mathematics and Computer Science Technische Universiteit Eindhoven P.O. Box

More information

Fast, twist-secure elliptic curve cryptography from Q-curves

Fast, twist-secure elliptic curve cryptography from Q-curves Fast, twist-secure elliptic curve cryptography from Q-curves Benjamin Smith Team GRACE INRIA Saclay Île-de-France Laboratoire d Informatique de l École polytechnique (LIX) ECC #17, Leuven September 16,

More information

The factorization of RSA D. J. Bernstein University of Illinois at Chicago

The factorization of RSA D. J. Bernstein University of Illinois at Chicago The factorization of RSA-1024 D. J. Bernstein University of Illinois at Chicago Abstract: This talk discusses the most important tools for attackers breaking 1024-bit RSA keys today and tomorrow. The same

More information

Daniel J. Bernstein University of Illinois at Chicago. means an algorithm that a quantum computer can run.

Daniel J. Bernstein University of Illinois at Chicago. means an algorithm that a quantum computer can run. Quantum algorithms 1 Daniel J. Bernstein University of Illinois at Chicago Quantum algorithm means an algorithm that a quantum computer can run. i.e. a sequence of instructions, where each instruction

More information

Side-channel attacks and countermeasures for curve based cryptography

Side-channel attacks and countermeasures for curve based cryptography Side-channel attacks and countermeasures for curve based cryptography Tanja Lange Technische Universiteit Eindhoven tanja@hyperelliptic.org 28.05.2007 Tanja Lange SCA on curves p. 1 Overview Elliptic curves

More information

McBits: fast constant-time code-based cryptography. (to appear at CHES 2013)

McBits: fast constant-time code-based cryptography. (to appear at CHES 2013) McBits: fast constant-time code-based cryptography (to appear at CHES 2013) D. J. Bernstein University of Illinois at Chicago & Technische Universiteit Eindhoven Joint work with: Tung Chou Technische Universiteit

More information

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition

High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition High-Performance Scalar Multiplication using 8-Dimensional GLV/GLS Decomposition Joppe W. Bos 1, Craig Costello 1, Huseyin Hisil 2, and Kristin Lauter 1 1 Microsoft Research, Redmond, USA 2 Yasar University,

More information

Speeding Up the Fixed-Base Comb Method for Faster Scalar Multiplication on Koblitz Curves

Speeding Up the Fixed-Base Comb Method for Faster Scalar Multiplication on Koblitz Curves Speeding Up the Fixed-Base Comb Method for Faster Scalar Multiplication on Koblitz Curves Christian Hanser and Christian Wagner Institute for Applied Information Processing and Communications (IAIK), Graz

More information

Co-Z Addition Formulæ and Binary Ladders on Elliptic Curves. Raveen Goundar Marc Joye Atsuko Miyaji

Co-Z Addition Formulæ and Binary Ladders on Elliptic Curves. Raveen Goundar Marc Joye Atsuko Miyaji Co-Z Addition Formulæ and Binary Ladders on Elliptic Curves Raveen Goundar Marc Joye Atsuko Miyaji Co-Z Addition Formulæ and Binary Ladders on Elliptic Curves Raveen Goundar Marc Joye Atsuko Miyaji Elliptic

More information

Faster Compact DiffieHellman: Endomorphisms on the x-line

Faster Compact DiffieHellman: Endomorphisms on the x-line Faster Compact DiffieHellman: Endomorphisms on the x-line Craig Costello craigco@microsoft.com Microsoft Resesarch Redmond Seattle, USA Hüseyin Hışıl huseyin.hisil@yasar.edu.tr Computer Eng. Department

More information

Elliptic Curves I. The first three sections introduce and explain the properties of elliptic curves.

Elliptic Curves I. The first three sections introduce and explain the properties of elliptic curves. Elliptic Curves I 1.0 Introduction The first three sections introduce and explain the properties of elliptic curves. A background understanding of abstract algebra is required, much of which can be found

More information

Signatures and DLP-I. Tanja Lange Technische Universiteit Eindhoven

Signatures and DLP-I. Tanja Lange Technische Universiteit Eindhoven Signatures and DLP-I Tanja Lange Technische Universiteit Eindhoven How to compute ap Use binary representation of a to compute a(x; Y ) in blog 2 ac doublings and at most that many additions. E.g. a =

More information

Arithmétique et Cryptographie Asymétrique

Arithmétique et Cryptographie Asymétrique Arithmétique et Cryptographie Asymétrique Laurent Imbert CNRS, LIRMM, Université Montpellier 2 Journée d inauguration groupe Sécurité 23 mars 2010 This talk is about public-key cryptography Why did mathematicians

More information

Four-Dimensional GLV Scalar Multiplication

Four-Dimensional GLV Scalar Multiplication Four-Dimensional GLV Scalar Multiplication ASIACRYPT 2012 Beijing, China Patrick Longa Microsoft Research Francesco Sica Nazarbayev University Elliptic Curve Scalar Multiplication A (Weierstrass) elliptic

More information

Twisted Edwards Curves Revisited

Twisted Edwards Curves Revisited A version of this paper appears in Advances in Cryptology - ASIACRYPT 2008, LNCS Vol. 5350, pp. 326 343. J. Pieprzyk ed., Springer-Verlag, 2008. Twisted Edwards Curves Revisited Huseyin Hisil, Kenneth

More information

On the correct use of the negation map in the Pollard rho method

On the correct use of the negation map in the Pollard rho method On the correct use of the negation map in the Pollard rho method Daniel J. Bernstein 1, Tanja Lange 2, and Peter Schwabe 2 1 Department of Computer Science University of Illinois at Chicago, Chicago, IL

More information

Pairings at High Security Levels

Pairings at High Security Levels Pairings at High Security Levels Michael Naehrig Eindhoven University of Technology michael@cryptojedi.org DoE CRYPTODOC Darmstadt, 21 November 2011 Pairings are efficient!... even at high security levels.

More information

Toward High Performance Matrix Multiplication for Exact Computation

Toward High Performance Matrix Multiplication for Exact Computation Toward High Performance Matrix Multiplication for Exact Computation Pascal Giorgi Joint work with Romain Lebreton (U. Waterloo) Funded by the French ANR project HPAC Séminaire CASYS - LJK, April 2014 Motivations

More information

A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography over Binary Finite Fields GF(2 m )

A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography over Binary Finite Fields GF(2 m ) A Simple Architectural Enhancement for Fast and Flexible Elliptic Curve Cryptography over Binary Finite Fields GF(2 m ) Stefan Tillich, Johann Großschädl Institute for Applied Information Processing and

More information

Elliptic curves. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Bernstein

Elliptic curves. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Bernstein Elliptic curves Tanja Lange Technische Universiteit Eindhoven with some slides by Daniel J. Bernstein Diffie-Hellman key exchange Pick some generator. Diffie-Hellman key exchange Pick some generator. Diffie-Hellman

More information

An introduction to supersingular isogeny-based cryptography

An introduction to supersingular isogeny-based cryptography An introduction to supersingular isogeny-based cryptography Craig Costello Summer School on Real-World Crypto and Privacy June 8, 2017 Šibenik, Croatia Towards quantum-resistant cryptosystems from supersingular

More information

Edwards Curves and the ECM Factorisation Method

Edwards Curves and the ECM Factorisation Method Edwards Curves and the ECM Factorisation Method Peter Birkner Eindhoven University of Technology CADO Workshop on Integer Factorization 7 October 2008 Joint work with Daniel J. Bernstein, Tanja Lange and

More information

Signatures and DLP. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Bernstein

Signatures and DLP. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Bernstein Signatures and DLP Tanja Lange Technische Universiteit Eindhoven with some slides by Daniel J. Bernstein ECDSA Users can sign messages using Edwards curves. Take a point P on an Edwards curve modulo a

More information

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 9. Datapath Design Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 2, 2017 ECE Department, University of Texas at Austin

More information

Public-key cryptography and the Discrete-Logarithm Problem. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J.

Public-key cryptography and the Discrete-Logarithm Problem. Tanja Lange Technische Universiteit Eindhoven. with some slides by Daniel J. Public-key cryptography and the Discrete-Logarithm Problem Tanja Lange Technische Universiteit Eindhoven with some slides by Daniel J. Bernstein Cryptography Let s understand what our browsers do. Schoolbook

More information

New Strategy for Doubling-Free Short Addition-Subtraction Chain

New Strategy for Doubling-Free Short Addition-Subtraction Chain Applied Mathematics & Information Sciences 2(2) (2008), 123 133 An International Journal c 2008 Dixie W Publishing Corporation, U. S. A. New Strategy for Doubling-Free Short Addition-Subtraction Chain

More information

Message Authentication Codes (MACs)

Message Authentication Codes (MACs) Message Authentication Codes (MACs) Tung Chou Technische Universiteit Eindhoven, The Netherlands October 8, 2015 1 / 22 About Me 2 / 22 About Me Tung Chou (Tony) 2 / 22 About Me Tung Chou (Tony) Ph.D.

More information

Affine Precomputation with Sole Inversion in Elliptic Curve Cryptography

Affine Precomputation with Sole Inversion in Elliptic Curve Cryptography Affine Precomputation with Sole Inversion in Elliptic Curve Cryptography Erik Dahmen, 1 Katsuyuki Okeya, 2 and Daniel Schepers 1 1 Technische Universität Darmstadt, Fachbereich Informatik, Hochschulstr.10,

More information

Edwards coordinates for elliptic curves, part 1

Edwards coordinates for elliptic curves, part 1 Edwards coordinates for elliptic curves, part 1 Tanja Lange Technische Universiteit Eindhoven tanja@hyperelliptic.org joint work with Daniel J. Bernstein 19.10.2007 Tanja Lange http://www.hyperelliptic.org/tanja/newelliptic/

More information

Outline. Computer Arithmetic for Cryptography in the Arith Group. LIRMM Montpellier Laboratory of Computer Science, Robotics, and Microelectronics

Outline. Computer Arithmetic for Cryptography in the Arith Group. LIRMM Montpellier Laboratory of Computer Science, Robotics, and Microelectronics Outline Computer Arithmetic for Cryptography in the Arith Group Arnaud Tisserand LIRMM, CNRS Univ. Montpellier 2 Arith Group Crypto Puces Porquerolles, April 16 18, 2007 Introduction LIRMM Laboratory Arith

More information

Non-generic attacks on elliptic curve DLPs

Non-generic attacks on elliptic curve DLPs Non-generic attacks on elliptic curve DLPs Benjamin Smith Team GRACE INRIA Saclay Île-de-France Laboratoire d Informatique de l École polytechnique (LIX) ECC Summer School Leuven, September 13 2013 Smith

More information

Elliptic and Hyperelliptic Curves: a Practical Security Comparison"

Elliptic and Hyperelliptic Curves: a Practical Security Comparison Elliptic and Hyperelliptic Curves: a Practical Security Comparison Joppe W. Bos (Microsoft Research), Craig Costello (Microsoft Research),! Andrea Miele (EPFL) 1/13 Motivation and Goal(s)! Elliptic curves

More information

Fast FPGA implementations of Diffie-Hellman on the Kummer surface of a genus-2 curve

Fast FPGA implementations of Diffie-Hellman on the Kummer surface of a genus-2 curve Fast FPGA implementations of Diffie-Hellman on the Kummer surface of a genus-2 curve Philipp Koppermann 1, Fabrizio De Santis 2, Johann Heyszl 1 and Georg Sigl 1,3 1 Fraunhofer Research Institution AISEC,

More information

An Algorithm for Inversion in GF(2 m ) Suitable for Implementation Using a Polynomial Multiply Instruction on GF(2)

An Algorithm for Inversion in GF(2 m ) Suitable for Implementation Using a Polynomial Multiply Instruction on GF(2) An Algorithm for Inversion in GF2 m Suitable for Implementation Using a Polynomial Multiply Instruction on GF2 Katsuki Kobayashi, Naofumi Takagi, and Kazuyoshi Takagi Department of Information Engineering,

More information

Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography

Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography Security Issues in Cloud Computing Modern Cryptography II Asymmetric Cryptography Peter Schwabe October 21 and 28, 2011 So far we assumed that Alice and Bob both have some key, which nobody else has. How

More information

New Composite Operations and Precomputation Scheme for Elliptic Curve Cryptosystems over Prime Fields

New Composite Operations and Precomputation Scheme for Elliptic Curve Cryptosystems over Prime Fields New Composite Operations and Precomputation Scheme for Elliptic Curve Cryptosystems over Prime Fields Patrick Longa 1 and Ali Miri 2 1 Department of Electrical and Computer Engineering University of Waterloo,

More information

ECC security vs. ECDL security. Hyper-and-elliptic-curve cryptography

ECC security vs. ECDL security. Hyper-and-elliptic-curve cryptography Hyper-and-elliptic-curve cryptography (which is not the same as: hyperelliptic-curve cryptography and elliptic-curve cryptography) Daniel J. Bernstein University of Illinois at Chicago & Technische Universiteit

More information

Computing the RSA Secret Key is Deterministic Polynomial Time Equivalent to Factoring

Computing the RSA Secret Key is Deterministic Polynomial Time Equivalent to Factoring Computing the RSA Secret Key is Deterministic Polynomial Time Equivalent to Factoring Alexander May Faculty of Computer Science, Electrical Engineering and Mathematics University of Paderborn 33102 Paderborn,

More information

CPE 776:DATA SECURITY & CRYPTOGRAPHY. Some Number Theory and Classical Crypto Systems

CPE 776:DATA SECURITY & CRYPTOGRAPHY. Some Number Theory and Classical Crypto Systems CPE 776:DATA SECURITY & CRYPTOGRAPHY Some Number Theory and Classical Crypto Systems Dr. Lo ai Tawalbeh Computer Engineering Department Jordan University of Science and Technology Jordan Some Number Theory

More information

Models of Elliptic Curves

Models of Elliptic Curves Models of Elliptic Curves Daniel J. Bernstein Tanja Lange University of Illinois at Chicago and Technische Universiteit Eindhoven djb@cr.yp.to tanja@hyperelliptic.org 26.03.2009 D. J. Bernstein & T. Lange

More information

Fast Point Multiplication on Elliptic Curves Without Precomputation

Fast Point Multiplication on Elliptic Curves Without Precomputation Published in J. von zur Gathen, J.L. Imaña, and Ç.K. Koç, Eds, Arithmetic of Finite Fields (WAIFI 2008), vol. 5130 of Lecture Notes in Computer Science, pp. 36 46, Springer, 2008. Fast Point Multiplication

More information

Formal verification of IA-64 division algorithms

Formal verification of IA-64 division algorithms Formal verification of IA-64 division algorithms 1 Formal verification of IA-64 division algorithms John Harrison Intel Corporation IA-64 overview HOL Light overview IEEE correctness Division on IA-64

More information

Computing Elliptic Curve Discrete Logarithms with the Negation Map

Computing Elliptic Curve Discrete Logarithms with the Negation Map Computing Elliptic Curve Discrete Logarithms with the Negation Map Ping Wang and Fangguo Zhang School of Information Science and Technology, Sun Yat-Sen University, Guangzhou 510275, China isszhfg@mail.sysu.edu.cn

More information

Modular Multiplication in GF (p k ) using Lagrange Representation

Modular Multiplication in GF (p k ) using Lagrange Representation Modular Multiplication in GF (p k ) using Lagrange Representation Jean-Claude Bajard, Laurent Imbert, and Christophe Nègre Laboratoire d Informatique, de Robotique et de Microélectronique de Montpellier

More information

Selecting Elliptic Curves for Cryptography: An Eciency and Security Analysis

Selecting Elliptic Curves for Cryptography: An Eciency and Security Analysis Selecting Elliptic Curves for Cryptography: An Eciency and Security Analysis Joppe W. Bos, Craig Costello, Patrick Longa and Michael Naehrig Microsoft Research, USA Abstract. We select a set of elliptic

More information

Power Consumption Analysis. Arithmetic Level Countermeasures for ECC Coprocessor. Arithmetic Operators for Cryptography.

Power Consumption Analysis. Arithmetic Level Countermeasures for ECC Coprocessor. Arithmetic Operators for Cryptography. Power Consumption Analysis General principle: measure the current I in the circuit Arithmetic Level Countermeasures for ECC Coprocessor Arnaud Tisserand, Thomas Chabrier, Danuta Pamula I V DD circuit traces

More information

Elliptic curve cryptography in a post-quantum world: the mathematics of isogeny-based cryptography

Elliptic curve cryptography in a post-quantum world: the mathematics of isogeny-based cryptography Elliptic curve cryptography in a post-quantum world: the mathematics of isogeny-based cryptography Andrew Sutherland MIT Undergraduate Mathematics Association November 29, 2018 Creating a shared secret

More information

Complement Arithmetic

Complement Arithmetic Complement Arithmetic Objectives In this lesson, you will learn: How additions and subtractions are performed using the complement representation, What is the Overflow condition, and How to perform arithmetic

More information

Four-Dimensional Gallant-Lambert-Vanstone Scalar Multiplication

Four-Dimensional Gallant-Lambert-Vanstone Scalar Multiplication Four-Dimensional Gallant-Lambert-Vanstone Scalar Multiplication Patrick Longa and Francesco Sica 2 Microsoft Research, USA plonga@microsoft.com 2 Nazarbayev University, Kazakhstan francesco.sica@nu.edu.kz

More information

Efficient Computation of Tate Pairing in Projective Coordinate Over General Characteristic Fields

Efficient Computation of Tate Pairing in Projective Coordinate Over General Characteristic Fields Efficient Computation of Tate Pairing in Projective Coordinate Over General Characteristic Fields Sanjit Chatterjee, Palash Sarkar and Rana Barua Cryptology Research Group Applied Statistics Unit Indian

More information

Computing Machine-Efficient Polynomial Approximations

Computing Machine-Efficient Polynomial Approximations Computing Machine-Efficient Polynomial Approximations N. Brisebarre, S. Chevillard, G. Hanrot, J.-M. Muller, D. Stehlé, A. Tisserand and S. Torres Arénaire, LIP, É.N.S. Lyon Journées du GDR et du réseau

More information

Co-Z Addition Formulæ and Binary Lad Elliptic Curves. Goundar, Raveen Ravinesh; Joye, Marc Author(s) Atsuko

Co-Z Addition Formulæ and Binary Lad Elliptic Curves. Goundar, Raveen Ravinesh; Joye, Marc Author(s) Atsuko JAIST Reposi https://dspace.j Title Co-Z Addition Formulæ and Binary Lad Elliptic Curves Goundar, Raveen Ravinesh; Joye, Marc Author(s) Atsuko Citation Lecture Notes in Computer Science, 6 79 Issue Date

More information

OPEN SOURCE IS NOT ENOUGH

OPEN SOURCE IS NOT ENOUGH OPEN SOURCE IS NOT ENOUGH ATTACKING THE EC-PACKAGE OF BOUNCYCASTLE VERSION 1.x 132 DANIEL MALL AND QING ZHONG Abstract. BouncyCastle is an open source Crypto provider written in Java which supplies classes

More information

On the Optimal Pre-Computation of Window τ NAF for Koblitz Curves

On the Optimal Pre-Computation of Window τ NAF for Koblitz Curves On the Optimal Pre-Computation of Window τ NAF for Koblitz Curves William R. Trost and Guangwu Xu Abstract Koblitz curves have been a nice subject of consideration for both theoretical and practical interests.

More information

Subquadratic space complexity multiplier for a class of binary fields using Toeplitz matrix approach

Subquadratic space complexity multiplier for a class of binary fields using Toeplitz matrix approach Subquadratic space complexity multiplier for a class of binary fields using Toeplitz matrix approach M A Hasan 1 and C Negre 2 1 ECE Department and CACR, University of Waterloo, Ontario, Canada 2 Team

More information

An Algorithm to Enhance Elliptic Curves Scalar Multiplication Combining MBNR with Point Halving

An Algorithm to Enhance Elliptic Curves Scalar Multiplication Combining MBNR with Point Halving Applied Mathematical Sciences, Vol. 4, 2010, no. 26, 1259-1272 An Algorithm to Enhance Elliptic Curves Scalar Multiplication Combining MBNR with Point Halving Abdulwahed M. Ismail 1, Mohamad Rushdan MD

More information

Speeding up characteristic 2: I. Linear maps II. The Å(Ò) game III. Batching IV. Normal bases. D. J. Bernstein University of Illinois at Chicago

Speeding up characteristic 2: I. Linear maps II. The Å(Ò) game III. Batching IV. Normal bases. D. J. Bernstein University of Illinois at Chicago Speeding up characteristic 2: I. Linear maps II. The Å(Ò) game III. Batching IV. Normal bases D. J. Bernstein University of Illinois at Chicago NSF ITR 0716498 Part I. Linear maps Consider computing 0

More information

Solving Systems of Modular Equations in One Variable: How Many RSA-Encrypted Messages Does Eve Need to Know?

Solving Systems of Modular Equations in One Variable: How Many RSA-Encrypted Messages Does Eve Need to Know? Solving Systems of Modular Equations in One Variable: How Many RSA-Encrypted Messages Does Eve Need to Know? Alexander May, Maike Ritzenhofen Faculty of Mathematics Ruhr-Universität Bochum, 44780 Bochum,

More information

A gentle introduction to elliptic curve cryptography

A gentle introduction to elliptic curve cryptography A gentle introduction to elliptic curve cryptography Craig Costello Tutorial at SPACE 2016 December 15, 2016 CRRao AIMSCS, Hyderabad, India Part 1: Diffie-Hellman key exchange Part 2: Elliptic Curves Part

More information

SUFFIX PROPERTY OF INVERSE MOD

SUFFIX PROPERTY OF INVERSE MOD IEEE TRANSACTIONS ON COMPUTERS, 2018 1 Algorithms for Inversion mod p k Çetin Kaya Koç, Fellow, IEEE, Abstract This paper describes and analyzes all existing algorithms for computing x = a 1 (mod p k )

More information

ECE260: Fundamentals of Computer Engineering

ECE260: Fundamentals of Computer Engineering Data Representation & 2 s Complement James Moscola Dept. of Engineering & Computer Science York College of Pennsylvania Based on Computer Organization and Design, 5th Edition by Patterson & Hennessy Data

More information

On Generating Coset Representatives of P GL 2 (F q ) in P GL 2 (F q 2)

On Generating Coset Representatives of P GL 2 (F q ) in P GL 2 (F q 2) On Generating Coset Representatives of P GL 2 F q in P GL 2 F q 2 Jincheng Zhuang 1,2 and Qi Cheng 3 1 State Key Laboratory of Information Security Institute of Information Engineering Chinese Academy

More information

ACCELERATING THE SCALAR MULTIPLICATION ON GENUS 2 HYPERELLIPTIC CURVE CRYPTOSYSTEMS

ACCELERATING THE SCALAR MULTIPLICATION ON GENUS 2 HYPERELLIPTIC CURVE CRYPTOSYSTEMS ACCELERATING THE SCALAR MULTIPLICATION ON GENUS 2 HYPERELLIPTIC CURVE CRYPTOSYSTEMS by Balasingham Balamohan A thesis submitted to the Faculty of Graduate and Postdoctoral Studies in partial fulfillment

More information

8 Elliptic Curve Cryptography

8 Elliptic Curve Cryptography 8 Elliptic Curve Cryptography 8.1 Elliptic Curves over a Finite Field For the purposes of cryptography, we want to consider an elliptic curve defined over a finite field F p = Z/pZ for p a prime. Given

More information