Fully Deterministic ECM

Similar documents
1 The Fundamental Theorem of Arithmetic. A positive integer N has a unique prime power decomposition. Primality Testing. and. Integer Factorisation

FACTORIZATION WITH GENUS 2 CURVES

Elliptic Curves Spring 2013 Lecture #12 03/19/2013

COMP4109 : Applied Cryptography

Edwards Curves and the ECM Factorisation Method

Course 2BA1: Trinity 2006 Section 9: Introduction to Number Theory and Cryptography

Integer factorization, part 1: the Q sieve. part 2: detecting smoothness. D. J. Bernstein

ECM at Work. Joppe W. Bos 1 and Thorsten Kleinjung 2. 1 Microsoft Research, Redmond, USA

Course MA2C02, Hilary Term 2013 Section 9: Introduction to Number Theory and Cryptography

A Beginner s Guide To The General Number Field Sieve

Factoring Algorithms Pollard s p 1 Method. This method discovers a prime factor p of an integer n whenever p 1 has only small prime factors.

Integers and Division

2x 1 7. A linear congruence in modular arithmetic is an equation of the form. Why is the solution a set of integers rather than a unique integer?

Congruence of Integers

Another Attempt to Sieve With Small Chips Part II: Norm Factorization

Introduction to Elliptic Curve Cryptography. Anupam Datta

CRYPTOGRAPHIC COMPUTING

Arithmétique et Cryptographie Asymétrique

HOMEWORK 11 MATH 4753

Chapter 9 Mathematics of Cryptography Part III: Primes and Related Congruence Equations

RSA Cryptosystem and Factorization

AN ALGEBRAIC PROOF OF RSA ENCRYPTION AND DECRYPTION

Mathematics for Cryptography

Lecture 6: Cryptanalysis of public-key algorithms.,

ELLIPTIC CURVES AND INTEGER FACTORIZATION

Integer factorization, part 1: the Q sieve. D. J. Bernstein

Mathematical Foundations of Public-Key Cryptography

Number Theory. Modular Arithmetic

ECM at Work. Joppe W. Bos and Thorsten Kleinjung. Laboratory for Cryptologic Algorithms EPFL, Station 14, CH-1015 Lausanne, Switzerland 1 / 14

Iterated Encryption and Wiener s attack on RSA

Basic Algorithms in Number Theory

6. ELLIPTIC CURVE CRYPTOGRAPHY (ECC)

part 2: detecting smoothness part 3: the number-field sieve

A field F is a set of numbers that includes the two numbers 0 and 1 and satisfies the properties:

2x 1 7. A linear congruence in modular arithmetic is an equation of the form. Why is the solution a set of integers rather than a unique integer?

The Elliptic Curve Method and Other Integer Factorization Algorithms. John Wright

Mathematical Foundations of Cryptography

Lecture Notes. Advanced Discrete Structures COT S

NUMBER SYSTEMS. Number theory is the study of the integers. We denote the set of integers by Z:

4 Number Theory and Cryptography

Algorithms for integer factorization and discrete logarithms computation

Algorithms. Shanks square forms algorithm Williams p+1 Quadratic Sieve Dixon s Random Squares Algorithm

Finding small factors of integers. Speed of the number-field sieve. D. J. Bernstein University of Illinois at Chicago

Factorization & Primality Testing

Lenstra s elliptic curve factorization algorithm

Discrete Structures Lecture Primes and Greatest Common Divisor

8 Elliptic Curve Cryptography

Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry. Spring 2006

Elliptic curves: Theory and Applications. Day 3: Counting points.

Elliptic curves and modularity

Applied Cryptography and Computer Security CSE 664 Spring 2018

Introduction to Information Security

SM9 identity-based cryptographic algorithms Part 1: General

Public-key Cryptography: Theory and Practice

Encryption: The RSA Public Key Cipher

CS March 17, 2009

ALGEBRA. 1. Some elementary number theory 1.1. Primes and divisibility. We denote the collection of integers

The RSA Cryptosystem: Factoring the public modulus. Debdeep Mukhopadhyay

MATH 501 Discrete Mathematics. Lecture 6: Number theory. German University Cairo, Department of Media Engineering and Technology.

MATH 115, SUMMER 2012 LECTURE 4 THURSDAY, JUNE 21ST

Number Theory: Applications. Number Theory Applications. Hash Functions II. Hash Functions III. Pseudorandom Numbers

One can use elliptic curves to factor integers, although probably not RSA moduli.

CPSC 467b: Cryptography and Computer Security

1. multiplication is commutative and associative;

Exercises Exercises. 2. Determine whether each of these integers is prime. a) 21. b) 29. c) 71. d) 97. e) 111. f) 143. a) 19. b) 27. c) 93.

Projective space. There are some situations when this approach seems to break down; for example with an equation like f(x; y) =y 2 (x 3 5x +3) the lin

LARGE PRIME NUMBERS (32, 42; 4) (32, 24; 2) (32, 20; 1) ( 105, 20; 0).

Basic elements of number theory

Basic elements of number theory

CHAPTER 3. Congruences. Congruence: definitions and properties

The RSA Cipher and its Algorithmic Foundations

Introduction to Arithmetic Geometry Fall 2013 Lecture #24 12/03/2013

Notes. Number Theory: Applications. Notes. Number Theory: Applications. Notes. Hash Functions I

The factorization of RSA D. J. Bernstein University of Illinois at Chicago

Commutative Rings and Fields

Euler s ϕ function. Carl Pomerance Dartmouth College

3 The fundamentals: Algorithms, the integers, and matrices

CSC 5930/9010 Modern Cryptography: Number Theory

Fields in Cryptography. Çetin Kaya Koç Winter / 30

COMP239: Mathematics for Computer Science II. Prof. Chadi Assi EV7.635

Parametrizations for Families of ECM-Friendly Curves

Experience in Factoring Large Integers Using Quadratic Sieve

Math 109 HW 9 Solutions

Homework #2 solutions Due: June 15, 2012

Improving Lenstra s Elliptic Curve Method

An integer p is prime if p > 1 and p has exactly two positive divisors, 1 and p.

MATH 2112/CSCI 2112, Discrete Structures I Winter 2007 Toby Kenney Homework Sheet 5 Hints & Model Solutions

Part IA Numbers and Sets

Part V. Chapter 19. Congruence of integers

Summary Slides for MATH 342 June 25, 2018

ECM using Edwards curves

CPSC 467b: Cryptography and Computer Security

Implementing the Elliptic Curve Method of Factoring in Reconfigurable Hardware

Factoring. there exists some 1 i < j l such that x i x j (mod p). (1) p gcd(x i x j, n).

a the relation arb is defined if and only if = 2 k, k

What is The Continued Fraction Factoring Method

All variables a, b, n, etc are integers unless otherwise stated. Each part of a problem is worth 5 points.

Number Theory. CSS322: Security and Cryptography. Sirindhorn International Institute of Technology Thammasat University CSS322. Number Theory.

Algebra. Modular arithmetic can be handled mathematically by introducing a congruence relation on the integers described in the above example.

Elliptic Curve Method for Integer Factorization on Parallel Architectures

Transcription:

Fully Deterministic ECM Iram Chelli LORIA (CNRS) - CACAO Supervisor: P. Zimmermann September 23, 2009

Introduction The Elliptic Curve Method (ECM) is currently the best-known general-purpose factorization algorithm for finding small prime factors in numbers having more than 200 digits. Heuristic expected running time to find a factor p of a number n O(L(p) 2+o(1) M(log(n))) where L(p) = e log(p) log(log(p)) and M(log(n)) is the complexity of multiplications modulo n. The complexity of ECM is dominated by the size of the smallest factor p of n rather than the size of the number n to be factored.

Introduction ECM is a randomized algorithm. Monte-Carlo algorithm. In some applications, one would like to be able to be certain to remove all prime factors up to a given bound B: this is what we call a deterministic factoring algorithm. Ex: the relation collection phase of the Number Field Sieve (NFS) to help identify smooth integers.

Introduction The General Number Field Sieve (GNFS) is the best currently known algorithm for factoring hard integers, i.e., integers that have no small prime factors or other properties that would make them easy to factor. In particular, the modulus of keys used in the RSA asymmetric encryption system is chosen to be a hard composite integer. It is a general factorization algorithm: its run-time depends only on the size of the input number but not on the size or other properties of the prime factors. As factorization of the modulus reveals the private key, the cost of factoring the modulus with GNFS serves as an upper bound on the cost of breaking RSA.

Table of contents 1 Contribution 2 E(F p ) 3 ECM 4 σ-chains 5 Best parameters 6 p < B = 2 32 7 Optimizations for FDECM 8 Results

Contribution We present a FDECM algorithm allowing to remove with certainty all prime factors less than B = 2 32 from a composite input number n. Designed to be the most implementation-independent possible. 203280221 primes below B : trial-division and GCD both are unpractical.

Contribution With traditional ECM it costs a few hundred random curves. It is quite fast but no guarantee that all factors will be found. With FDECM: even faster! We have optimized it to under a hundred well-chosen elliptic curves, which can be very fast in an optimized ECM implementation with optimized B 1 and B 2 smoothness bounds. This is the first detailed description of a fully deterministic ECM algorithm.

Curve equations - Definition Curve equations - Definition Elliptic curve equation in Weierstrass form: y 2 = x 3 + Ax + B. The curve is defined as the set of points whose coordinates verify the equation, plus a point at "infinity". Elliptic curves in Weierstrass form are not very efficient in terms of computational cost. Elliptic curve equation in Montgomery s form: by 2 = x 3 + a x 2 + x. Montgomery s form can be obtained from Weierstrass form by the change of variables: X = (3x + a)/3b, Y = y/b, A = (3 a 2 )/3b 2, B = (2a 3 /9 a)/3b 2.

Curve equations - Definition Elliptic curve defined by y 2 = x 3 + 1.

Curve equations - Definition Elliptic curves in projective coordinates In homogenous form we have the equation by 2 z = x 3 + a x 2 z + xz 2. Points on the curve are represented in projective coordinates (x : y : z), disregarding the y-coordinate simply noting P = (x :: z). Avoid inversions which are computationally costly. A point (x : y : z) in projective coordinates, z 0, can be converted to (x/z, y/z) in affine coordinates and conversely.

Group structure Group structure The set E(K ) of points of an elliptic curve E has an abelian group structure. Point O at infinity is the neutral element Inverse of a point P = (x, y) of E is P = (x, y) E. Group law can be observed geometrically: geometrical construction of the sum of two points on an elliptic curve.

Group structure Geometrical group law view

Point addition laws Point doubling and addition laws In order to work on elliptic curves, we need to be able to compute point additions. Addition law for Curves in Montgomery s form require that the difference of the two points be known. For an elliptic curve in Montgomery s form: Let P = (x P :: z P ) and Q = (x Q :: z Q ) be two distinct points and let P Q = (x P Q :: z P Q ) be their difference. Their sum is: x P+Q = 4z P Q (x P x Q z P z Q ) 2, z P+Q = 4x P Q (x P z Q z P x Q ) 2. The doubling is: x 2P = (x 2 P z2 P )2, z 2P = 4x P z P [(x P z P ) 2 + 4d x P z P ], where d = (a + 2)/4 and a is the coefficient in the elliptic curve equation.

Point addition laws Double and add The ECM algorithm needs computation of point kp from a given integer k and point P. Except for the doubling, we have no multiplication by an integer formula. Point np is computed by means of repeated addings and doublings from the binary expansion of integer n. k = 100 = (1100100) 2 100P = 2(2(P + 2(2(2(P + 2P))))).

Point addition laws Algorithm DOUBLE AND ADD (Right to Left): INPUT: Point P from E and n 1 OUTPUT: Point R = np 1. Set Q = P and R = O (point at infinity) 2. loop while n > 0 if n = 1 (mod 2) set R = R + Q set Q = 2Q and n = n/2 3. return R (which equals np)

Point addition laws Lucas chains Special case of addition chains in which the sum of two terms can appear if and only if their difference also appears. These chains are used when working in Montgomery s form. Montgomery s PRAC is an heuristic algorithm designed to find short Lucas chains. Non-unicity: Many different Lucas chains exist.

Heuristics of the algorithm The ECM method The number to be factored being n, we work over Z/nZ as if it were a field. The only operation that might fail is ring inversion which is calculated using the Euclidean algorithm. If an inversion fails, then n is not coprime with that number and we find a factor of n by computing their gcd.

Smoothness criteria Smoothness criteria Let n be a positive integer, with prime factorization: n = m i=1 pα i i and let B be a positive integer. n is B-smooth if all of its prime factors p i verify p i B. It is B-powersmooth if for all i we have, p α i i B. n is (B 1, B 2 )-smooth if it is B 1 -powersmooth for all but it s largest prime factor p m, and we have α m = 1 and p m B 2. 1108233 = 3 2 7 3 359 is (7 3,359)-smooth.

ECM algorithm ECM algorithm We choose a random nonsingular elliptic curve E over Z/nZ, where n is the number to factor, and a point P on it. We seek a positive integer k such that [k]p = O modulo an (unknown) prime divisor p of n but not modulo n. Then, in the computation of Q = [k]p mod n the attempted inversion of an element not coprime to n reveals the factor p by simple computation of gcd(x Q,n). For this to happen k needs to be a multiple of g p = #E (F p ) which is also unknown.

Stage 1 Stage 1 Stage 1 will compute Q = [k]p where k = π B 1 π log(b 1)/ log(π) = lcm (1, 2,..., B 1 ). It is successful if the order g of the curve is B 1 -powersmooth. The main cost of stage 1 is the elliptic curve arithmetic.

Stage 2 Stage 2 Stage 2 is an improvement of the initial ECM algorithm. When the order g is not B 1 -smooth because of a single prime factor q above B 1. A bound B 2 > B 1 is set so that we have a chance that this prime factor q is below B2. It is successful if g is (B 1, B 2 )-smooth. Prime q will then be found in stage 2.

Brent-Suyama s parametrization Brent-Suyama s parametrization It is simple and of widespread use: allows for easy reproduction of the results obtained. This elliptic curve parametrization uses a single integer (or rational value) σ > 5. From this parameter we then compute: u = σ 2 5, v = 4σ, x 0 = u 3 (mod n), z 0 = v 3 (mod n), a = (v u) 3 (3u + v)/(4u 3 v) 2 (mod n), b = u/z 0 and, y 0 = (σ 2 1)(σ 2 25)(σ 4 25). It yields an elliptic curve in Montgomery s form: by 2 z = x 3 + a x 2 z + xz 2.

Brent-Suyama s parametrization Brent-Suyama s parametrization Suyama s and Montgomery s form parametrizations both ensure 12 divides g over F q. This known factor increases the probability of success of ECM for fixed given parameters.

Using ECM with prime input numbers Using ECM with prime input numbers Normally, ECM is run on a composite input number n that we are trying to factor. Here what we want to determine is whether a given prime p will be found by ECM when it is used on an input number n such that p n, but the cofactor is undetermined. We use ECM in an unusual way: input number p is prime, and ECM returns p if g = #E (F p ) is (B1, B2)-smooth and fails if not. If for a given σ a prime p is found by ECM with given (B1, B2), then it will always be found when the same curve (with same parameters) is run on a number n such that p n!

The influence of B 1, B 2 bounds The influence of B 1, B 2 bounds The choice of B 1 and B 2 bounds impacts both the running time of ECM and the number of primes found by the algorithm. Too high: ECM gets slower, higher probability to return composites or the input number itself. Too low fewer primes are found: we will need to run more curves. We want to minimize the running time while simultaneously maximizing the number of primes found.

The influence of B 1, B 2 bounds B 1, B 2 values minimizing time over found primes ratio. 2 27 : B 1 = 140, B 2 = 7600, time(sec.): 29.59 hits: 247667 Ratio: 119µs. 2 28 : B 1 = 140, B 2 = 9200, time(sec.): 31.44 hits: 208562 Ratio: 151µs. 2 29 : B 1 = 220, B 2 = 10400, time(sec.): 40.6 hits: 211617 Ratio: 192µs. 2 30 : B 1 = 220, B 2 = 8800, time(sec.): 38.71 hits: 163260 Ratio: 237µs. 2 31 : B 1 = 260, B 2 = 11600, time(sec.):48.63 hits: 161424 Ratio: 301µs. 2 32 : B 1 = 260, B 2 = 11600, time(sec.):48.83 hits: 130144 Ratio: 375µs.

ECM implementation Optimizations and consequences on deterministic behavior Methodology: Making ECM implementation-independant ECM factorisation program ecm_check may find primes p for which E(p) is not B1-B2 smooth. Ex: For B = 2 27, B 1 = 315 and B 2 = 5355, 117564 additional primes found by ecm_check. This is about 4% extra primes. We check the σ-chains constructed with ecm-check with Magma considering smoothness of the curve order g = #E(F p ) for every prime found p to make sure it will indeed be found by any relatively standard ECM implementation.

ECM implementation Optimizations and consequences on deterministic behavior Methodology: Making ECM implementation-independant Improvement: considering starting point order instead of the curve order. This is a divisor of the curve order: it has significantly higher probability to be smooth with the same B 1, B 2 parameters. Without sacrificing to generality because the starting point P is fully determined by the choice of σ. Let g the order of the starting point P. The multiplier e is precomputed from B 1. End of stage one: point ep has order g = g/(g, e).

ECM implementation Optimizations and consequences on deterministic behavior Methodology: Making ECM implementation-independant If g = 1 then stage one was successful. Otherwise if g is prime and B1 < g B2 then it will be found in stage two. Implemented as a Magma script: outputs primes for which the starting point order does not verify any of the above propreties.

Dividing to conquer: working in smaller intervals. Finding all primes up to B = 2 32. We work in intervals of same bitlength which are more convenient to handle. After the first one, the size of each interval is approximately the double of the preceding. FDECM running time then also would approximately double for each interval. The cost-per-prime is not fixed: increases with the size of the input factor and size of B 1 B 2 smoothness bounds.

Building σ-chains for sets of non-consecutive primes Building σ-chains for sets of non-consecutive primes We now consider the case where the primes we are looking for are not consecutive. Primes for which an algebraic polynomial f (x) has a root mod p, which is a subset of all possible primes. It is of interest in NFS: given an algebraic polynomial f (x), the primes appearing in the factorization of f (a/b) are those for which f (x) has a root mod p. Ex: The polynomial that was used in the factoring of RSA200 by GNFS.

Building σ-chains for sets of non-consecutive primes Practical case: The RSA200 subset of primes The polynomial is : f (x) = X5 x 5 + X4 x 4 +... + X0. The coefficients are : X5 = 374029011720, X 4 = 2711065637795630118, X 3 = 19400071943177513865892714, X 2 = 33803470609202413094680462360399, X 1 = 120887311888241287002580512992469303610, X 0 = 38767203000799321189782959529938771195170960. This polynomial has a root for a subset of 128740271 elements of the set of 203280221 primes less than 2 32.

Outstanding σ values Outstanding σ values σ = 11 appears to perform noticeably better than random curves for finding primes. Other curves of the infinite family of σ = 11 perform similarly with an above-average performance: they have rational sigmas. Another infinite family of curves, those for σ = 9/4 appear to show the same type of behaviour. First family appears to favour primes p congruent to 1 mod 4 whereas the second favours primes in the congruence class 3 mod 4.

Results achieved A 124-element integer σ-chain which allows to find all primes p < 2 32. With optimized 26-element rational chain: All primes up to 2 32 are now found with 99 curves, and 97 for the RSA200 subset.

Switching from ECM to gcd or trial division and when Switching from ECM to gcd or trial division and when Last third of the curves invariantly accounts for a very small proportion,i.e., less than 1/1000 th, of the primes found. Less costly to take a gcd of the number we want to factor with the product of those primes or simply trial-divide by the remaining primes after ECM has been performed. Which is faster? when is it best to switch from ECM to gcd or trial division? Depends on the size of the input number to factor: we have run tests on 100, 1000, 10000-digit primes (GMP). Testing platform: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz.

Conclusion We have successfully built two optimized σ-chains below 100 elements in length. The first one guarantees to find all primes below 2 32 and has 99 elements and the second one guarantees to find all primes of the subset RSA200 and has 97 elements. FDECM algorithm s running time is approximately: 0.41 seconds for a 100-digit input number, 7.51 seconds for a 1000-digit input number, and 294 seconds for a 10000-digit input number, which is extremely small comparatively to trial-division for which times are 8.09 seconds, 56.3 seconds, and 5549 seconds respectively. The σ-chains can still be further-optimised to be reduced in length, since there exists in theory a huge super-σ that would find all primes below 2 32 in a single curve!