NUMBER THEORY FOR CRYPTOGRAPHY

CHAPTER 4. NUMBER THEORY FOR CRYPTOGRAPHY

INSTITÚID TEICNEOLAÍOCHTA CHEATHARLACH
INSTITUTE OF TECHNOLOGY CARLOW

Contents
1 Number Theory for Cryptography
  1.1 Linear Diophantine Equation
  1.2 Linear Congruences
  1.3 Primes and Prime Factorization
  1.4 The Euler Phi Function
  1.5 Some Special Congruences
  1.6 Public-Key Cryptography
    1.6.1 The RSA Algorithm
    1.6.2 Digital Signatures
    1.6.3 Diffie-Hellman Key Exchange
    1.6.4 The Knapsack Cryptosystem

1 Number Theory for Cryptography

Theorem 1 Let $a$ and $b$ be integers, not both zero. Then $a$ and $b$ are relatively prime if and only if there exist integers $x$ and $y$ such that
$$ax + by = 1$$

Theorem 2 If $\gcd(a, b) = d$, then
$$\gcd\left(\frac{a}{d}, \frac{b}{d}\right) = 1$$

Theorem 3 If $a \mid c$ and $b \mid c$, with $\gcd(a, b) = 1$, then $ab \mid c$.

Theorem 4 (Euclid's theorem) If $a \mid bc$, with $\gcd(a, b) = 1$, then $a \mid c$.

1.1 Linear Diophantine Equation

A Diophantine equation is any equation in one or more unknowns which is to be solved in the integers. The simplest type of Diophantine equation is the linear Diophantine equation in two unknowns (which we will consider):
$$ax + by = c$$
where $a, b, c$ are integers and $a, b$ are not both zero. A solution of this equation is a pair of integers $x_0, y_0$ which, when substituted into the equation, satisfy it, i.e.,
$$ax_0 + by_0 = c$$
The name honors the mathematician Diophantus, who initiated the study of such equations around 250 AD.

Theorem 5 The linear Diophantine equation $ax + by = c$ has a solution if and only if $\gcd(a, b) \mid c$. If $x_0, y_0$ is any particular solution of this equation, then all other solutions are given by
$$x = x_0 + t \cdot \frac{b}{d}, \qquad y = y_0 - t \cdot \frac{a}{d}$$
where $d = \gcd(a, b)$ and $t \in \mathbb{Z}$.

Proof To establish the second assertion of the theorem, let us suppose that a solution $x_0, y_0$ of the given equation is known. If $x, y$ is any other solution, then
$$ax_0 + by_0 = c = ax + by$$
which is equivalent to
$$a(x - x_0) = b(y_0 - y)$$
From Theorem 2 there exist relatively prime integers $r$ and $s$ such that $a = dr$, $b = ds$. Substituting these values into the last equation and cancelling the common factor $d$, we find that
$$r(x - x_0) = s(y_0 - y)$$
We now have that $r \mid s(y_0 - y)$ with $\gcd(r, s) = 1$. Using Euclid's theorem, it must be the case that $r \mid (y_0 - y)$, or in other words, $y_0 - y = rt$ for some integer $t$. Substituting, we obtain
$$x - x_0 = st$$
This leads us to the formulas
$$x = x_0 + st = x_0 + t \cdot \frac{b}{d}, \qquad y = y_0 - rt = y_0 - t \cdot \frac{a}{d}$$
where $d = \gcd(a, b)$ and $t \in \mathbb{Z}$.

It is easy to see that these values satisfy the Diophantine equation, regardless of the choice of the integer $t$, since
$$ax + by = a\left(x_0 + t \cdot \frac{b}{d}\right) + b\left(y_0 - t \cdot \frac{a}{d}\right) = (ax_0 + by_0) + \left(\frac{ab}{d} - \frac{ab}{d}\right)t = c + 0 \cdot t = c$$
Thus there are infinitely many solutions of the given equation, one for each value of $t$.

Example Consider the linear Diophantine equation
$$172x + 20y = 1000$$
Applying Euclid's Algorithm to the evaluation of $\gcd(172, 20)$, we find that
$$172 = 8(20) + 12$$
$$20 = 1(12) + 8$$
$$12 = 1(8) + 4$$
$$8 = 2(4) + 0$$
Hence $\gcd(172, 20) = 4$. Since $4 \mid 1000$, a solution to this equation exists. To obtain the integer 4 as a linear combination of $a = 172$ and $b = 20$, we work backwards through the above calculation as follows:
$$a = 8b + 12 \;\Rightarrow\; 12 = a - 8b$$
$$b = (a - 8b) + 8 \;\Rightarrow\; 8 = 9b - a$$
$$a - 8b = (9b - a) + 4 \;\Rightarrow\; 4 = 2a - 17b$$
Multiplying the last equation, $4 = 2a - 17b$, by 250 yields
$$500a - 4250b = 1000$$
Comparing this equation with the equation we are asked to solve yields an initial solution $x_0 = 500$ and $y_0 = -4250$. All other solutions are expressed by
$$x = 500 + 5t, \qquad y = -4250 - 43t$$
for some $t \in \mathbb{Z}$.
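The back-substitution in the example above can be automated. The following is a minimal Python sketch, not part of the original notes; the function names `extended_gcd` and `solve_diophantine` are illustrative, and the code assumes positive `a` and `b`.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g (a, b > 0 assumed)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each step mirrors one line of Euclid's Algorithm, while also
        # tracking the remainder as a combination of a and b.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_diophantine(a, b, c):
    """Solve a*x + b*y = c.

    Returns (x0, y0, dx, dy) meaning x = x0 + t*dx, y = y0 - t*dy for any
    integer t (Theorem 5), or None when gcd(a, b) does not divide c.
    """
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return None
    k = c // d
    return x * k, y * k, b // d, a // d


x0, y0, dx, dy = solve_diophantine(172, 20, 1000)
# Search a range of t for solutions with both unknowns positive.
positive = [(x0 + t * dx, y0 - t * dy)
            for t in range(-200, 1)
            if x0 + t * dx > 0 and y0 - t * dy > 0]
print(x0, y0, dx, dy)
print(positive)
```

For 172x + 20y = 1000 this reproduces the particular solution and step sizes found by hand, and the search over t recovers the positive solutions.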

Note If asked to produce all positive integer solutions to the equation, if any exist, we must determine $t$ such that $x > 0$ and $y > 0$. If $500 + 5t > 0$ then $t > -100$. If $-4250 - 43t > 0$ then $t < -\frac{4250}{43} \approx -98.8$. Since $t$ must be an integer, we are forced to conclude that $t = -99$. Thus our Diophantine equation has a unique positive solution $x = 5$, $y = 7$ corresponding to the value $t = -99$.

Example Consider the linear Diophantine equation
$$578x + 832y = 14932$$
Applying Euclid's Algorithm to the evaluation of $\gcd(578, 832)$, we find that
$$832 = 1(578) + 254$$
$$578 = 2(254) + 70$$
$$254 = 3(70) + 44$$
$$70 = 1(44) + 26$$
$$44 = 1(26) + 18$$
$$26 = 1(18) + 8$$
$$18 = 2(8) + 2$$
$$8 = 4(2) + 0$$
Hence $\gcd(578, 832) = 2$. Since $2 \mid 14932$, a solution to this equation exists. To obtain the integer 2 as a linear combination of $a = 578$ and $b = 832$, we work backwards through the above calculation as follows:
$$b = a + 254 \;\Rightarrow\; 254 = -a + b$$
$$a = 2(b - a) + 70 \;\Rightarrow\; 70 = 3a - 2b$$
$$b - a = 3(3a - 2b) + 44 \;\Rightarrow\; 44 = -10a + 7b$$
$$3a - 2b = (-10a + 7b) + 26 \;\Rightarrow\; 26 = 13a - 9b$$
$$-10a + 7b = (13a - 9b) + 18 \;\Rightarrow\; 18 = -23a + 16b$$
$$13a - 9b = (-23a + 16b) + 8 \;\Rightarrow\; 8 = 36a - 25b$$
$$-23a + 16b = 2(36a - 25b) + 2 \;\Rightarrow\; 2 = -95a + 66b$$

Multiplying the last equation, $2 = -95a + 66b$, by 7466 yields
$$7466(66)b - 7466(95)a = 14932$$
Comparing this equation with the equation we are asked to solve yields an initial solution $x_0 = -7466(95)$ and $y_0 = 7466(66)$. All other solutions are expressed by
$$x = -95(7466) + 416t, \qquad y = 66(7466) - 289t$$
for some $t \in \mathbb{Z}$.

Exercise
1. Determine all solutions in the integers (the general solution) of the following Diophantine equation: $1521x + 632y = 13293$
2. Determine all solutions in the positive integers of the following Diophantine equation: $123x + 360y = 99$

1.2 Linear Congruences

We consider linear congruences and their solution because of the importance they hold in cryptography.

Definition Let $a, b, n \in \mathbb{Z}$ with $n > 0$. An equation of the form
$$ax \equiv b \pmod{n}$$
is called a linear congruence, and a solution of such an equation is an integer $x_0$ such that $ax_0 \equiv b \pmod{n}$.

Note If $x_0$ is a solution of $ax \equiv b \pmod{n}$ and if $x_1 \equiv x_0 \pmod{n}$, then
$$ax_1 \equiv ax_0 \equiv b \pmod{n}$$
so $x_1$ is also a solution. Hence, if one member of a congruence class modulo $n$ is a solution, then all members of this class are solutions.

The following theorem will allow us to decide if a linear congruence has a solution and, furthermore, tell how many congruence classes modulo $n$ provide solutions.

Theorem 6 The linear congruence $ax \equiv b \pmod{n}$ has a solution if and only if $\gcd(a, n) \mid b$. If $d = \gcd(a, n)$ and $d \mid b$, then it has $d$ distinct congruence classes modulo $n$ as solutions.

We can easily solve linear congruences using the algebra of congruences as follows:
$$4x - 3 \equiv 13 \pmod{7}$$
$$4x \equiv 16 \pmod{7}$$
$$x \equiv 4 \pmod{7}$$
Hence the congruence class 4 modulo 7 provides solutions to the linear congruence $4x - 3 \equiv 13 \pmod{7}$. (Dividing by 4 in the last step is allowed because $\gcd(4, 7) = 1$.)

Alternatively, we could define the inverse of an integer modulo $n$ and use an inverse to solve a linear congruence.

Definition Given any integer $a$ with $\gcd(a, n) = 1$, a solution of
$$ax \equiv 1 \pmod{n}$$
is called an inverse of $a$ modulo $n$.

Remark Let $a^{-1}$ be the inverse of $a$ modulo $n$, i.e., $a a^{-1} \equiv 1 \pmod{n}$. To solve $ax \equiv b \pmod{n}$ we multiply both sides by $a^{-1}$:
$$a^{-1} a x \equiv a^{-1} b \pmod{n}$$
$$x \equiv a^{-1} b \pmod{n}$$
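The inverse-based method in the Remark above can be sketched in a few lines of Python. This is an illustration only: the function name `solve_with_inverse` is hypothetical, and the code relies on the built-in `pow(a, -1, n)`, which computes a modular inverse in Python 3.8 and later and raises `ValueError` when gcd(a, n) is not 1.

```python
def solve_with_inverse(a, b, n):
    """Solve a*x ≡ b (mod n) when gcd(a, n) = 1, using x ≡ a^(-1) * b (mod n)."""
    a_inv = pow(a, -1, n)      # inverse of a modulo n (Python 3.8+)
    return (a_inv * b) % n


# The congruence 4x ≡ 16 (mod 7) from the worked example above.
x = solve_with_inverse(4, 16, 7)
print(x, (4 * x) % 7 == 16 % 7)
```

When gcd(a, n) is greater than 1 this shortcut does not apply directly, and the conversion to a linear Diophantine equation described in the notes is needed instead.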

Exercise Find all solutions of the following linear congruences
i $2x \equiv 5 \pmod{7}$
ii $17x \equiv 14 \pmod{21}$
iii $34x \equiv 60 \pmod{98}$
iv $36x \equiv 12 \pmod{102}$
If no solution exists, justify your answer.

Remark We can solve a basic linear congruence using the algebra of congruences as outlined above. However, the solution of the linear congruence
$$ax \equiv b \pmod{n}$$
is identical to the solution (for $x$) of the linear Diophantine equation
$$ax = ny + b$$
This is true since
$$ax \equiv b \pmod{n} \iff n \mid (ax - b) \iff ax - b = yn,\ y \in \mathbb{Z} \iff ax = ny + b$$
This is an important remark because we can now solve a linear congruence by converting it to its equivalent linear equation form and using the technique outlined in Section 1.1. This will be our procedure when the congruence contains large integers.

Example Consider the following linear congruence
$$128x \equiv 833 \pmod{1001}$$
Converting to its equivalent linear form we get
$$128x \equiv 833 \pmod{1001} \iff 1001 \mid (128x - 833) \iff 128x - 833 = 1001y,\ y \in \mathbb{Z} \iff 128x - 1001y = 833$$

So the equivalent linear Diophantine equation is
$$128x - 1001y = 833$$
Applying Euclid's Algorithm to the evaluation of $\gcd(128, 1001)$, we find that
$$1001 = 7(128) + 105$$
$$128 = 1(105) + 23$$
$$105 = 4(23) + 13$$
$$23 = 1(13) + 10$$
$$13 = 1(10) + 3$$
$$10 = 3(3) + 1$$
$$3 = 3(1) + 0$$
Hence $\gcd(128, 1001) = 1$. Since $1 \mid 833$, a solution to this equation (linear congruence) exists. Furthermore, there is just 1 congruence class modulo 1001 which provides solutions. Recall Theorem 6.

To obtain the integer 1 as a linear combination of $a = 128$ and $b = 1001$, we work back through the above calculation as follows:
$$b = 7a + 105 \;\Rightarrow\; 105 = b - 7a$$
$$a = 1(b - 7a) + 23 \;\Rightarrow\; 23 = 8a - b$$
$$b - 7a = 4(8a - b) + 13 \;\Rightarrow\; 13 = 5b - 39a$$
$$8a - b = 1(5b - 39a) + 10 \;\Rightarrow\; 10 = 47a - 6b$$
$$5b - 39a = 1(47a - 6b) + 3 \;\Rightarrow\; 3 = 11b - 86a$$
$$47a - 6b = 3(11b - 86a) + 1 \;\Rightarrow\; 1 = 305a - 39b$$
Multiplying the last equation, $1 = 305a - 39b$, by 833 yields
$$833(305)a - 833(39)b = 833$$
Comparing this equation with the equation we are asked to solve yields an initial solution for $x_0$ as
$$x_0 = 833(305)$$
So the solution of the linear congruence is
$$x \equiv 833(305) \pmod{1001}$$
$$x \equiv 254065 \pmod{1001}$$
$$x \equiv 812 \pmod{1001}$$

Example Consider the following linear congruence
$$980x \equiv 1500 \pmod{1600}$$
Converting to its equivalent linear form we get
$$980x \equiv 1500 \pmod{1600} \iff 1600 \mid (980x - 1500) \iff 980x - 1500 = 1600y,\ y \in \mathbb{Z} \iff 980x - 1600y = 1500$$
So the equivalent linear Diophantine equation is
$$980x - 1600y = 1500$$
Applying Euclid's Algorithm to the evaluation of $\gcd(980, 1600)$, we find that
$$1600 = 1(980) + 620$$
$$980 = 1(620) + 360$$
$$620 = 1(360) + 260$$
$$360 = 1(260) + 100$$
$$260 = 2(100) + 60$$
$$100 = 1(60) + 40$$
$$60 = 1(40) + 20$$
$$40 = 2(20) + 0$$
Hence $\gcd(980, 1600) = 20$. Since $20 \mid 1500$, a solution to this equation (linear congruence) exists. Furthermore, there are 20 distinct congruence classes modulo 1600 which provide solutions. Recall Theorem 6.

To obtain the integer 20 as a linear combination of $a = 980$ and $b = 1600$, we work back through the above calculation as follows:
$$b = a + 620 \;\Rightarrow\; 620 = b - a$$
$$a = (b - a) + 360 \;\Rightarrow\; 360 = 2a - b$$
$$b - a = (2a - b) + 260 \;\Rightarrow\; 260 = 2b - 3a$$
$$2a - b = (2b - 3a) + 100 \;\Rightarrow\; 100 = 5a - 3b$$
$$2b - 3a = 2(5a - 3b) + 60 \;\Rightarrow\; 60 = 8b - 13a$$
$$5a - 3b = (8b - 13a) + 40 \;\Rightarrow\; 40 = 18a - 11b$$
$$8b - 13a = (18a - 11b) + 20 \;\Rightarrow\; 20 = 19b - 31a$$

Multiplying the last equation, $20 = 19b - 31a$, by 75 yields
$$1425b - 2325a = 1500$$
Comparing this equation with the equation we are asked to solve yields an initial solution for $x_0$ as
$$x_0 = -2325$$
So a solution to the linear congruence is
$$x \equiv -2325 \pmod{1600}$$
$$x \equiv 875 \pmod{1600}$$
In this example there are 20 congruence classes modulo 1600 that provide solutions to the linear congruence, since $\gcd(a, n) = 20$. The 20 congruence classes are given by
$$x = 875 + 80t$$
where $t \in \{0, 1, 2, 3, \ldots, 16, 17, 18, 19\}$. Hence
$$x \equiv 875, 955, 1035, 1115, \ldots \pmod{1600}$$

Note Recall that the Diophantine equation $ax + by = c$ has a solution if and only if $\gcd(a, b) \mid c$. If $x_0, y_0$ is any particular solution of this equation, then all other solutions are given by
$$x = x_0 + t \cdot \frac{b}{d}, \qquad y = y_0 - t \cdot \frac{a}{d}$$
where $d = \gcd(a, b)$ and $t \in \mathbb{Z}$. Similarly, the linear congruence $ax \equiv b \pmod{n}$ has a solution if and only if $\gcd(a, n) \mid b$. If $x_0$ is a particular solution of this congruence, then all other solutions are given as
$$x = x_0 + t \cdot \frac{n}{d}$$
where $d = \gcd(a, n)$ and $t \in \mathbb{Z}$.
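The Note above translates into a short routine that lists every solution class. The sketch below is illustrative rather than the notes' own method: instead of back-substitution it divides the congruence through by d and uses Python's built-in modular inverse (`pow(a, -1, n)`, Python 3.8+). The name `solve_linear_congruence` is hypothetical.

```python
from math import gcd


def solve_linear_congruence(a, b, n):
    """Return all solutions of a*x ≡ b (mod n) in the range 0..n-1.

    The list is empty when gcd(a, n) does not divide b (Theorem 6).
    """
    d = gcd(a, n)
    if b % d != 0:
        return []
    # Dividing through by d gives a congruence whose coefficient is
    # invertible modulo n/d.
    a1, b1, n1 = a // d, b // d, n // d
    x0 = (pow(a1, -1, n1) * b1) % n1
    # The d solution classes modulo n are x0, x0 + n/d, x0 + 2n/d, ...
    return [x0 + t * n1 for t in range(d)]


solutions = solve_linear_congruence(980, 1500, 1600)
print(len(solutions), 875 in solutions)
print(solve_linear_congruence(128, 833, 1001))
```

The list starts from the smallest non-negative solution rather than from 875, but it describes the same 20 classes modulo 1600.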

Exercise Find the general solution of the following linear congruences (by converting to their equivalent linear equation form)
i $7x \equiv 4 \pmod{12}$
ii $140x \equiv 133 \pmod{301}$
iii $103x \equiv 444 \pmod{999}$
iv $987x \equiv 610 \pmod{1597}$

1.3 Primes and Prime Factorization

Definition A prime number is an integer $p$ greater than one with the property that 1 and $p$ are the only positive integers that divide $p$.
$$P = \{2, 3, 5, 7, 11, 13, 17, 19, \ldots\}$$

Definition An integer greater than one that is not a prime number is said to be a composite number.

Theorem 7 (The Fundamental Theorem of Arithmetic) Every composite number greater than one factors uniquely as a product of prime numbers.

The prime factorizations of the integers from 2 to 99 are shown below. The entry in a given row and column is the factorization of (row label + column label); exponents are written with ^ and products with a dot.

      +0       +1     +2       +3     +4       +5      +6       +7     +8       +9
 0    .        .      2        3      2^2      5       2.3      7      2^3      3^2
10    2.5      11     2^2.3    13     2.7      3.5     2^4      17     2.3^2    19
20    2^2.5    3.7    2.11     23     2^3.3    5^2     2.13     3^3    2^2.7    29
30    2.3.5    31     2^5      3.11   2.17     5.7     2^2.3^2  37     2.19     3.13
40    2^3.5    41     2.3.7    43     2^2.11   3^2.5   2.23     47     2^4.3    7^2
50    2.5^2    3.17   2^2.13   53     2.3^3    5.11    2^3.7    3.19   2.29     59
60    2^2.3.5  61     2.31     3^2.7  2^6      5.13    2.3.11   67     2^2.17   3.23
70    2.5.7    71     2^3.3^2  73     2.37     3.5^2   2^2.19   7.11   2.3.13   79
80    2^4.5    3^4    2.41     83     2^2.3.7  5.17    2.43     3.29   2^3.11   89
90    2.3^2.5  7.13   2^2.23   3.31   2.47     5.19    2^5.3    97     2.7^2    3^2.11

This product of primes representation is called canonical form. For example
$$720 = 2^4 \cdot 3^2 \cdot 5$$
To factorise a composite number into its prime factors, the method is simply to divide the given integer by the smallest prime 2 until the integer is no longer divisible by 2. Then divide by the next prime 3 until the integer is no longer divisible by 3, next divide by 5 until the integer is no longer divisible by 5, and so on, dividing by larger and larger primes until we reach 1. We can illustrate this method as follows:
$$720 = 2 \cdot 360 = 2 \cdot 2 \cdot 180 = 2 \cdot 2 \cdot 2 \cdot 90 = 2 \cdot 2 \cdot 2 \cdot 2 \cdot 45 = 2 \cdot 2 \cdot 2 \cdot 2 \cdot 3 \cdot 15 = 2 \cdot 2 \cdot 2 \cdot 2 \cdot 3 \cdot 3 \cdot 5$$
Hence we have that $720 = 2^4 \cdot 3^2 \cdot 5$. Also, for example:
$$1000 = 2 \cdot 500 = 2 \cdot 2 \cdot 250 = 2 \cdot 2 \cdot 2 \cdot 125 = 2 \cdot 2 \cdot 2 \cdot 5 \cdot 25 = 2 \cdot 2 \cdot 2 \cdot 5 \cdot 5 \cdot 5$$
Hence we have that $1000 = 2^3 \cdot 5^3$.

Successive division factorises a composite integer into its unique prime factors. This method is adequate for composite numbers of reasonable size, but it is not an efficient method in terms of computer time. We now consider a further method of prime factorisation, known as Pollard rho-factorisation.
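Before moving on, the successive-division method described above can be sketched in Python. This is an illustrative sketch with a hypothetical function name, `trial_division`. It differs from the description in two small ways: it tries every integer divisor in turn rather than only primes (a composite divisor can never divide the remaining number, because its prime factors have already been removed), and it stops once the trial divisor exceeds the square root of what remains.

```python
def trial_division(n):
    """Return the prime factorization of n > 1 as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        # Divide out p for as long as it divides the remaining number.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(trial_division(720))
print(trial_division(1000))
```

The running time grows with the size of the second-largest prime factor, which is why this approach becomes impractical for integers with two large prime factors.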

In 1974, John Pollard proposed a method that is remarkably successful in finding moderate-sized factors (up to about 20 digits) of composite numbers that were formerly difficult to factorise.

Consider a large odd integer $n$ that is known to be composite. The first step in Pollard's factorisation method is to choose a fairly simple polynomial of degree at least 2 with integer coefficients, such as a quadratic polynomial
$$f(x) = x^2 + a, \qquad a \neq 0, -2$$
Then, starting with some initial $x_0$, a random-looking sequence $x_1, x_2, x_3, \ldots$ is created from the recursive relation
$$x_{k+1} \equiv f(x_k) \pmod{n}, \qquad k = 0, 1, 2, \ldots$$
that is, the successive iterates $x_1 = f(x_0)$, $x_2 = f(f(x_0))$, $x_3 = f(f(f(x_0)))$, ... are computed modulo $n$. Now simply compare $x_k$ with earlier $x_j$, calculating $\gcd(x_k - x_j, n)$ until a nontrivial greatest common divisor occurs. The divisor obtained in this way is not necessarily the smallest factor of $n$, and indeed it may not even be prime. The possibility exists that when a greatest common divisor greater than 1 is found, it may turn out to be equal to $n$ itself, i.e., $x_k \equiv x_j \pmod{n}$. Although this happens only rarely, one remedy is to repeat the computation with either a new value of $x_0$ or a different polynomial $f(x)$.

We can illustrate the method simply with the integer $n = 2189$. If we choose $x_0 = 1$ and $f(x) = x^2 + 1$, the recursive sequence will be
$$x_1 = 2,\quad x_2 = 5,\quad x_3 = 26,\quad x_4 = 677,\quad x_5 = 829, \ldots$$
Comparing different $x_k$, we find that
$$\gcd(x_5 - x_3, 2189) = \gcd(803, 2189) = 11$$
and so a divisor of 2189 is 11. Hence the prime factorisation is $2189 = 11 \cdot 199$.

As $k$ increases, the task of computing $\gcd(x_k - x_j, n)$ for each $j < k$ becomes very time consuming. We shall see that it is often more efficient to reduce the number of steps by looking only at cases in which $k = 2j$. The following example will illustrate the method more clearly.

Example To factor $n = 30623$ using Pollard's method, let us take $x_0 = 3$ as a starting value and $f(x) = x^2 - 1$ as the polynomial. The sequence of integers that $x_k$ generates is
$$8, 63, 3968, 4801, 21104, 28526, 18319, 18926, \ldots$$
Making the comparison of $x_{2k}$ with $x_k$, we get
$$x_2 - x_1 = 63 - 8 = 55, \qquad \gcd(55, 30623) = 1$$
$$x_4 - x_2 = 4801 - 63 = 4738, \qquad \gcd(4738, 30623) = 1$$
$$x_6 - x_3 = 28526 - 3968 = 24558, \qquad \gcd(24558, 30623) = 1$$
$$x_8 - x_4 = 18926 - 4801 = 14125, \qquad \gcd(14125, 30623) = 113$$
The desired factorisation is $30623 = 113 \cdot 271$.

Example To factor $n = 8051$ using Pollard's method, let us take $x_0 = 2$ as a starting value and $f(x) = x^2 + 1$ as the polynomial. The sequence of integers that $x_k$ generates is
$$5, 26, 677, 7474, 2839, 871, \ldots$$
Making the comparison of $x_{2k}$ with $x_k$, we get
$$x_2 - x_1 = 26 - 5 = 21, \qquad \gcd(21, 8051) = 1$$
$$x_4 - x_2 = 7474 - 26 = 7448, \qquad \gcd(7448, 8051) = 1$$
$$x_6 - x_3 = 871 - 677 = 194, \qquad \gcd(194, 8051) = 97$$
The desired factorisation is $8051 = 97 \cdot 83$.

Remark The polynomial $f(x)$ should be chosen so that the probability is high that a suitably large number of integers $x_i$ are generated before they repeat. Empirical evidence indicates that the polynomial $f(x) = x^2 + 1$ performs well for this test. Furthermore, the initial value $x_0 = 2$ is often used.

This method is called Pollard's rho-method. To understand why it is called this, consider the following diagram.

[Figure: rho-shaped diagram of the sequence modulo 97. The tail starts at $x_0 = 2$ and enters the loop $x_1 = 5$, $x_2 = 26$, $x_3 = 677 \equiv 95 \pmod{97}$, $x_4 = 7474 \equiv 5 \pmod{97}$.]

Because this figure resembles the Greek letter ρ, this factoring method is popularly known as Pollard's rho-method. The diagram shows the periodic behaviour of the sequence
$$5, 26, 677, 7474, 2839, 871, \ldots$$
when reduced modulo 97, with $x_0 = 2$, i.e. the sequence
$$2, 5, 26, 95, 5, 26, 95, \ldots$$
The part of this sequence that occurs before the periodicity is the tail of the rho (ρ), and the loop is the periodic part. It is worth observing that because $x_3 \equiv x_6 \equiv 95 \pmod{97}$, the length of the period is $6 - 3 = 3$.

Pollard's rho-method has proven to be practical for the factorization of integers with moderately large prime factors. In practice, the first attempt to factor a large integer is to do trial division by small primes, say by all primes less than 10,000. Next, Pollard's rho-method is used to look for prime factors of intermediate size (up to $10^{15}$). Only after trial division by small primes and Pollard's rho-method have failed are the really big guns brought in, such as the quadratic sieve or the elliptic curve method.

Exercise Use Pollard's rho-method to factorize the following integers: 299, 1003, 8051

Solution: $299 = 13 \cdot 23$, $1003 = 17 \cdot 59$, $8051 = 83 \cdot 97$
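The comparison of $x_{2k}$ with $x_k$ used in the worked examples can be sketched in Python as follows. This is a minimal illustration, not a production factoring routine; the function name `pollard_rho` and its parameters `x0` and `c` (the constant in $f(x) = x^2 + c$) are illustrative choices.

```python
from math import gcd


def pollard_rho(n, x0=2, c=1):
    """Look for a nontrivial divisor of n by comparing x_2k with x_k.

    Uses f(x) = x^2 + c (mod n). Returns a divisor d with 1 < d < n, or
    None if the gcd jumps straight to n, in which case the notes suggest
    retrying with a different x0 or polynomial.
    """
    def f(x):
        return (x * x + c) % n

    slow = fast = x0
    while True:
        slow = f(slow)           # x_k
        fast = f(f(fast))        # x_2k
        d = gcd(abs(fast - slow), n)
        if d == n:
            return None
        if d > 1:
            return d


print(pollard_rho(8051))              # x0 = 2, f(x) = x^2 + 1
print(pollard_rho(30623, x0=3, c=-1)) # x0 = 3, f(x) = x^2 - 1
```

Only two sequence values are kept in memory at any time, which is the practical advantage of restricting the comparisons to the case k = 2j.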

1.4 The Euler Phi Function

Definition Let $n$ be a positive integer. The Euler phi-function $\varphi(n)$ is defined to be the number of positive integers not exceeding $n$ that are relatively prime to $n$.

The table displays the values of $\varphi(n)$ for $1 \le n \le 12$.

n       1   2   3   4   5   6   7   8   9   10   11   12
φ(n)    1   1   2   2   4   2   6   4   6   4    10   4

Theorem 8 The function $\varphi(n)$ is a multiplicative function. So, if $m, n \in \mathbb{Z}$ with $\gcd(m, n) = 1$, then
$$\varphi(m \cdot n) = \varphi(m) \cdot \varphi(n)$$
So, for example,
$$\varphi(30) = \varphi(5) \cdot \varphi(6) = 4 \cdot 2 = 8$$

Theorem 9 If $p$ is prime, then
$$\varphi(p) = p - 1$$
Conversely, if $p$ is a positive integer with $\varphi(p) = p - 1$, then $p$ is prime.

So, if $n = p \cdot q$ where $p$ and $q$ are distinct prime numbers, we now have
$$\varphi(n) = \varphi(p) \cdot \varphi(q) = (p - 1)(q - 1)$$
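The definition of $\varphi(n)$ and the table above can be checked by direct counting. The short Python sketch below is for illustration only (the name `phi_by_definition` is hypothetical) and is far too slow for integers of cryptographic size.

```python
from math import gcd


def phi_by_definition(n):
    """Count the integers k with 1 <= k <= n and gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print([phi_by_definition(n) for n in range(1, 13)])
# Multiplicativity (Theorem 8) for the coprime pair 5 and 6:
print(phi_by_definition(30) == phi_by_definition(5) * phi_by_definition(6))
```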

Example For $p = 29$ and $q = 53$, $n = 29 \cdot 53 = 1537$ and
$$\varphi(1537) = \varphi(29) \cdot \varphi(53) = (29 - 1)(53 - 1) = 28 \cdot 52 = 1456$$
There are 1456 positive integers not exceeding 1537 that are relatively prime to 1537.

Theorem 10 Let $p$ be a prime and $a$ a positive integer. Then
$$\varphi(p^a) = p^a - p^{a-1}$$

Example For example
$$\varphi(5^3) = 5^3 - 5^2 = 100$$
$$\varphi(2^{10}) = 2^{10} - 2^9 = 512$$
$$\varphi(11^2) = 11^2 - 11 = 110$$

Combining Theorem 8 and Theorem 10, we have the following:

Theorem 11 Let $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$ be the prime-power factorization of the positive integer $n$. Then
$$\varphi(n) = n \left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_k}\right)$$

Example For example
$$\varphi(100) = \varphi(2^2 \cdot 5^2) = 100 \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{5}\right) = 40$$
$$\varphi(360) = \varphi(2^3 \cdot 3^2 \cdot 5) = 360 \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{5}\right) = 96$$
$$\varphi(720) = \varphi(2^4 \cdot 3^2 \cdot 5) = 720 \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{5}\right) = 192$$
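The product formula of Theorem 11 can be evaluated without fractions: multiplying by $(1 - 1/p)$ is the same as subtracting one $p$-th of the running value. The following Python sketch illustrates this; the name `euler_phi` is hypothetical, and the prime factors are found by simple trial division, so it is only suitable for small inputs.

```python
def euler_phi(n):
    """Compute phi(n) using phi(n) = n * product over primes p | n of (1 - 1/p)."""
    result = n
    p = 2
    while p * p <= n:
        if n % p == 0:
            # Remove every copy of p, then apply the factor (1 - 1/p) once.
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        # One prime factor larger than the square root may remain.
        result -= result // n
    return result


for value in (100, 360, 720, 1537):
    print(value, euler_phi(value))
```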

Exercise Calculate $\varphi(1001)$, $\varphi(5040)$, $\varphi(36{,}000)$

[Portrait: Leonhard Euler (1707-1783)]

1.5 Some Special Congruences

Theorem 12 (Fermat's Little Theorem) If $p$ is prime and $a$ is a positive integer with $p \nmid a$, then
$$a^{p-1} \equiv 1 \pmod{p}$$

Fermat's Little Theorem may be stated in a more general way with the requirement $p \nmid a$ dropped.

Theorem 13 If $p$ is prime and $a$ is a positive integer, then
$$a^p \equiv a \pmod{p}$$

Remark It is important to note that the converse of Fermat's Little Theorem is not true, i.e., if $a^n \equiv a \pmod{n}$ for some integer $a$, then $n$ need not be prime. So, for example, it can be shown that
$$2^{341} \equiv 2 \pmod{341}$$
however 341 is not a prime number, since $341 = 11 \cdot 31$.
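The Remark above is easy to check numerically with Python's built-in three-argument `pow`, which performs modular exponentiation. The helper name `satisfies_fermat` in this small sketch is hypothetical.

```python
def satisfies_fermat(n, a):
    """Check whether a^n ≡ a (mod n), the congruence of Theorem 13."""
    return pow(a, n, n) == a % n


print(satisfies_fermat(13, 2))    # 13 is prime, so the congruence must hold
print(satisfies_fermat(341, 2))   # holds although 341 = 11 * 31 is composite
print(satisfies_fermat(341, 3))   # a different base exposes 341 as composite
```

So the congruence holding for a single base does not establish primality, although a base for which it fails does prove that n is composite.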

Fermat's little theorem tells us how to work with certain congruences involving exponents when the modulus is prime. But is there a theorem that allows us to work with similar congruences where the modulus is composite? The Swiss mathematician Leonhard Euler published a proof of Fermat's little theorem in 1736. In 1760, Euler managed to find a natural generalization of the congruence in Fermat's theorem that holds for composite integers. The theorem uses the Euler phi function $\varphi(n)$ defined in Section 1.4.

Theorem 14 (Euler's Theorem) If $n$ is a positive integer and $a$ is an integer with $\gcd(a, n) = 1$, then
$$a^{\varphi(n)} \equiv 1 \pmod{n}$$

Remark The mathematician Pierre de Fermat (1601-1665) is more recently known for his famous last theorem, which is based on a simple statement relating to a property of right-angled triangles. In a right-angled triangle, the sum of the squares of the lengths of the sides containing the right angle is equal to the square of the hypotenuse, i.e.
$$a^2 + b^2 = c^2$$

[Figure: right-angled triangle ABC with sides a and b containing the right angle and hypotenuse c.]

This statement is known as Pythagoras' Theorem. Three positive integers $a$, $b$ and $c$ such that $a^2 + b^2 = c^2$ are called a Pythagorean triple. For example
$$(3, 4, 5),\ (5, 12, 13),\ (6, 8, 10),\ (8, 15, 17),\ (9, 12, 15)$$
are all solutions of the equation $a^2 + b^2 = c^2$.

In the early 1600s, Fermat, a French lawyer and mathematician, posed the following question: if the power of 2 in the above equation was replaced by 3, could there be found three non-zero integers $a$, $b$ and $c$ that satisfy the equation $a^3 + b^3 = c^3$? The same question could be asked if the power was increased to 4, then to 5, and so on to any integer $n > 2$:
$$a^3 + b^3 = c^3$$
$$a^4 + b^4 = c^4$$
$$\vdots$$
$$a^n + b^n = c^n$$
Fermat stated that no matter how hard you try you will never find non-zero integer solutions to these equations. This famous statement became known as Fermat's Last Theorem, which was not proved until 1994, by the British mathematician Andrew Wiles. Wiles devoted seven years of his life to proving the famous theorem, which may have generated more attempts at proofs than any other theorem.

[Portrait: Pierre de Fermat (1601-1665)]

Fermat's Last Theorem states that
$$a^n + b^n = c^n$$
has no non-zero integer solutions for $a$, $b$ and $c$ when $n > 2$. Fermat stated his theorem in 1637 when he wrote: "I have a truly marvelous proof of this proposition which this margin is too narrow to contain." Today, however, we believe that Fermat had no such proof.

1.6 Public-Key Cryptography

With the increasing quantity of digital information being stored and communicated via telephone lines, microwaves or satellites, organizations in both the public and commercial sectors need to protect this information when it is being transmitted. Cryptography is the science of making communications unintelligible to all except authorized parties.

In the language of cryptography, where codes are called ciphers, the information to be concealed is called plaintext. After transformation to a secret form, a message is called ciphertext. The process of converting from plaintext to ciphertext is called encryption, while the reverse process of changing from ciphertext back to plaintext is called decryption.

Let $A = \{A, B, C, D, \ldots\}$ be the alphabet. The encryption function $f$ is given as
$$f : A \to A, \qquad f(a_1 a_2 \ldots a_n) = f(a_1) f(a_2) \ldots f(a_n)$$
The encryption of $A$ is a one-to-one function of $A$ onto itself. To encrypt a word we encrypt one letter at a time, where
$$A = 0,\ B = 1,\ C = 2,\ D = 3,\ E = 4,\ \ldots,\ Y = 24,\ Z = 25$$

The cryptosystems we have discussed previously (year 2) are all examples of private-key or symmetric cryptosystems, where the encryption and decryption keys are easily found from each other. The disadvantage of each of the cryptosystems so far is that the secret key used by the person encrypting the message must also be transmitted in order for the receiver to decrypt the message. To avoid transmitting the secret key, a new type of cryptosystem, called a public-key cryptosystem, was invented in the 1970s. In this type of cryptosystem, encryption keys can be made public, but in doing so certain private information is withheld (in the case of the RSA algorithm the private information is withheld by the receiver).
The security of this cryptosystem is attributed to the unrealistically large amount of computer time that is required to find the decryption key from the encryption key without the private information being known.

There are several widely used public-key cryptosystems. We will consider the RSA algorithm in detail; other public-key systems include the Rabin public-key system and the ElGamal public-key system. The security of these systems rests on two computationally difficult problems: the factorization of composite integers into their prime factors, and finding discrete logarithms.

Although public-key cryptosystems have many advantages, they are not extensively used for general-purpose encryption. The reason is that encrypting and decrypting in these cryptosystems require too much time and memory on most computers, generally several orders of magnitude more than required for private-key cryptosystems. However, public-key cryptosystems are used extensively to encrypt keys for private-key cryptosystems so that the private key can be transmitted securely. They are also used in a wide variety of cryptographic protocols, such as digital signatures.

1.6.1 The RSA Algorithm

The RSA algorithm, invented by Ronald Rivest, Adi Shamir and Leonard Adleman in the 1970s and patented by them in 1983, is a public-key cryptosystem similar to exponential ciphers. Recall again the Euler phi function. The ingredients of the RSA algorithm are as follows:
- two primes $p$ and $q$ of 100 digits or more;
- $n = pq$;
- $\varphi(n) = \varphi(p) \cdot \varphi(q) = (p - 1)(q - 1)$;
- a random number $e < \varphi(n)$ such that $\gcd(e, \varphi(n)) = 1$;
- $d$, the inverse of $e$ modulo $\varphi(n)$.

The procedure to apply this method of cryptography is as follows. The sender and receiver make contact and agree to transmit a message using this method. The receiver chooses the primes $p$ and $q$, multiplies them and places the composite number $n = p \cdot q$ in a public directory. The receiver alone knows the actual prime factors of $n$ and does not tell anyone their value. The receiver also generates $e < \varphi(n)$ such that $\gcd(e, \varphi(n)) = 1$. Notice again that the receiver alone can calculate $\varphi(n)$, since only the receiver knows the values of $p$ and $q$. So the integers $n$ and $e$ are placed in a public directory; no other information is placed in the public directory.

Now the sender takes the integers $n$ and $e$ from the public directory. To encrypt the message the sender first translates each letter of the message into its two-digit numerical equivalent:
$$A = 00,\ B = 01,\ C = 02,\ D = 03,\ E = 04,\ \ldots,\ Y = 24,\ Z = 25$$
Then the sender groups the resulting numbers into blocks of length four. Ciphertext blocks are formed using
$$C \equiv P^e \pmod{n}$$

Once the ciphertext has been transmitted, how does the receiver decode it? To decipher a ciphertext block the receiver must determine the deciphering key $d$ such that
$$ed \equiv 1 \pmod{\varphi(n)}$$
i.e., $d$ is the inverse of $e$ modulo $\varphi(n)$, which exists since $\gcd(e, \varphi(n)) = 1$.
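The sender's side of the procedure (letters to two-digit codes, blocks of four digits, then $C \equiv P^e \pmod{n}$) can be sketched in Python. This is an illustration under stated assumptions: the function names are hypothetical, the toy public key n = 2747, e = 13 is the one used in the worked example of these notes and is far too small for real use, and padding an odd-length message with the letter X is an assumption taken from that example.

```python
def text_to_blocks(message):
    """Translate letters to A=00..Z=25 and group them into 4-digit blocks."""
    letters = [ch for ch in message.upper() if "A" <= ch <= "Z"]
    if len(letters) % 2 == 1:
        letters.append("X")          # assumed padding letter
    codes = [ord(ch) - ord("A") for ch in letters]
    # Two letters per block: e.g. G=06, R=17 gives the block 0617.
    return [100 * codes[i] + codes[i + 1] for i in range(0, len(codes), 2)]


def rsa_encrypt(blocks, e, n):
    """Encrypt each plaintext block P as C = P^e mod n."""
    return [pow(p, e, n) for p in blocks]


n, e = 2747, 13                       # toy public key, for illustration only
blocks = text_to_blocks("GREETINGS")
cipher = rsa_encrypt(blocks, e, n)
print([f"{b:04d}" for b in blocks])
print([f"{c:04d}" for c in cipher])
```

Blocks of four digits work here because the largest possible block, 2525, is smaller than n = 2747; each block must be less than n for decryption to recover it uniquely.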

It is the receiver alone who can determine $d$, since only he or she knows the prime factors $p$ and $q$ that allow $\varphi(n)$ to be determined. Although the composite integer $n$ is in the public domain, the factorization of composite numbers with large prime factors is prohibitively time consuming. Note that if $p$ and $q$ are primes of 100 digits each, then $n$ will be a 200-digit integer. Using the fastest factorization known, thousands of years of computer time are required to factorize an integer of this size.

In summary, for the RSA algorithm:
- Setup: primes $p, q$; $n = p \cdot q$; $\gcd(e, \varphi(n)) = 1$.
- Plaintext ($P$) to ciphertext ($C$): $C \equiv P^e \pmod{n}$.
- Ciphertext ($C$) to plaintext ($P$): $P \equiv C^d \pmod{n}$, where $ed \equiv 1 \pmod{\varphi(n)}$.

Now raising $C$ to the $d$th power will recover the plaintext blocks:
$$C^d \equiv (P^e)^d \pmod{n}$$
$$C^d \equiv P^{ed} \pmod{n}$$
Now
$$ed \equiv 1 \pmod{\varphi(n)} \iff \varphi(n) \mid (ed - 1) \iff ed - 1 = k \cdot \varphi(n) \iff ed = k \cdot \varphi(n) + 1$$
Therefore
$$C^d \equiv P^{k \cdot \varphi(n) + 1} \pmod{n}$$
$$C^d \equiv (P^{\varphi(n)})^k \cdot P \pmod{n}$$
$$C^d \equiv P \pmod{n}$$

This is due to Euler's theorem, which states that if $n$ is a positive integer and $a$ is an integer with $\gcd(a, n) = 1$, then
$$a^{\varphi(n)} \equiv 1 \pmod{n}$$
Here the theorem is applied with $a = P$, so the argument assumes $\gcd(P, n) = 1$. Hence the plaintext blocks are recovered using
$$P \equiv C^d \pmod{n}$$

As we have remarked, the security of the RSA cryptosystem depends on the difficulty of factorizing large integers into their prime factors $p$ and $q$. A few extra precautions should be taken in choosing the primes $p$ and $q$ to prevent the use of rapid techniques to factor $n = pq$. For example, both $p - 1$ and $q - 1$ should have large prime factors, $\gcd(p - 1, q - 1)$ should be small, and $p$ and $q$ should have decimal expansions differing in length by a few digits.

Example Let $p = 41$ and $q = 67$ be two primes chosen by the receiver, so that $n = 41 \cdot 67 = 2747$. Also
$$\varphi(2747) = \varphi(41) \cdot \varphi(67) = (41 - 1)(67 - 1) = 40 \cdot 66 = 2640$$
Now the receiver places the following integers in a public directory:
$$n = 2747, \qquad e = 13$$
Note that $\gcd(e, \varphi(n)) = \gcd(13, 2640) = 1$. The sender locates the public keys $n$ and $e$ and encrypts a message using
$$C \equiv P^e \pmod{n}$$
i.e.
$$C \equiv P^{13} \pmod{2747}$$
Say this cipher produced the following ciphertext:
$$2206\ \ 0755\ \ 0436\ \ 1165\ \ 1737$$
How will the receiver decode the message?

He or she will decipher the ciphertext message using the congruence
$$P \equiv C^d \pmod{2747}$$
To do this the receiver must determine the deciphering key $d$ such that
$$ed \equiv 1 \pmod{\varphi(n)}$$
i.e., $d$ is the inverse of $e$ modulo $\varphi(n)$, which exists since $\gcd(e, \varphi(n)) = 1$. We can do so as follows:
$$13d \equiv 1 \pmod{2640} \iff 2640 \mid (13d - 1) \iff 13d - 1 = 2640k \iff 13d - 2640k = 1$$
Applying Euclid's Algorithm to the evaluation of $\gcd(2640, 13)$, we find that
$$2640 = 203(13) + 1$$
$$13 = 13(1) + 0$$
Hence $\gcd(2640, 13) = 1$. Since $1 \mid 1$, a solution to this equation exists. Recall Theorem 5.

To obtain the integer 1 as a linear combination of $a = 2640$ and $b = 13$, we work backwards through the above calculation as follows:
$$a = 203b + 1 \;\Rightarrow\; 1 = a - 203b$$
Now
$$a - 203b = 1$$
Comparing this equation with the equation $13d - 2640k = 1$ that we are asked to solve yields an initial solution $d = -203$ and $k = -1$. Since $-203 \equiv 2437 \pmod{2640}$, 2437 is an inverse of 13 modulo 2640. Therefore
$$P \equiv C^{2437} \pmod{2747}$$

Now each block of four is decrypted using this congruence as follows:

P ≡ 2206^2437 (mod 2747) ≡ 0617    GR
P ≡ 0755^2437 (mod 2747) ≡ 0404    EE
P ≡ 0436^2437 (mod 2747) ≡ 1908    TI
P ≡ 1165^2437 (mod 2747) ≡ 1306    NG
P ≡ 1737^2437 (mod 2747) ≡ 1823    SX

The plaintext message was GREETINGS (the final X pads the last block).

Note that these calculations cannot be done on a calculator. They can be easily performed using the computer algebra software WOLFRAM ALPHA.

Exercise What ciphertext is produced when the RSA cipher, with public keys $n = 2627$ and $e = 7$, is used to encrypt the message LIFE IS A DREAM?

Exercise If the ciphertext produced by an RSA cipher, with public keys $n = 2881$ and $e = 5$, is
$$0504\ \ 1874\ \ 0347\ \ 0515\ \ 2088\ \ 2356\ \ 0736\ \ 0468$$
what is the plaintext message?

Exercise Aisling's public key for RSA is $(n, e) = (65, 11)$.
i Which two primes did Aisling use?
ii Find Aisling's private (deciphering) key $d$.
iii Sarah wishes to send the numerical message 4 to Aisling. What numerical message does Aisling receive?
iv Aisling receives the numerical message 30 from Sarah. What numerical message did Sarah send?
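The decryption in the worked example above (p = 41, q = 67, e = 13) can be reproduced with Python's built-in `pow`, which handles both the modular inverse (Python 3.8+) and modular exponentiation. This is a sketch with toy parameters only; the helper name `blocks_to_text` is hypothetical.

```python
def blocks_to_text(blocks):
    """Turn 4-digit blocks back into letters using A=00, ..., Z=25."""
    letters = []
    for block in blocks:
        first, second = divmod(block, 100)
        letters.append(chr(first + ord("A")))
        letters.append(chr(second + ord("A")))
    return "".join(letters)


p, q, e = 41, 67, 13                  # toy values from the worked example
n = p * q
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                   # deciphering key: inverse of e mod phi(n)

ciphertext = [2206, 755, 436, 1165, 1737]
plaintext = [pow(c, d, n) for c in ciphertext]

print(d)
print([f"{block:04d}" for block in plaintext])
print(blocks_to_text(plaintext))
```

Only someone who knows p and q can compute phi, and therefore d, in this direct way; that is exactly the private information withheld by the receiver.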

Exercise Aisling's public key for RSA is $(n, e) = (91, 11)$.
i Which two primes did Aisling use?
ii Find Aisling's private (deciphering) key $d$.
iii Sarah wishes to send the numerical message 5 to Aisling. What numerical message does Aisling receive?
iv Aisling receives the numerical message 4 from Sarah. What numerical message did Sarah send?

1.6.2 Digital Signatures

When we receive an electronic message, how do we know that it has come from the supposed sender? We need a digital signature that can tell us that the message must have originated with the party who supposedly sent it. The RSA cryptosystem can be used to send signed messages. When signatures are used, the recipient of a message is sure that the message came from the sender, and can convince an impartial judge that only the sender could be the source of the message. This authentication is needed for electronic mail, electronic banking, and electronic stock market transactions.

To see how the RSA cryptosystem can be used to send signed messages, suppose that individual $i$ wishes to send a signed message to individual $j$. The first thing that individual $i$ does to a plaintext block $P$ is to compute
$$S = D_{k_i}(P) \equiv P^{d_i} \pmod{n_i}$$
where $(d_i, n_i)$ is the decryption key for individual $i$, which only individual $i$ knows. Individual $i$ then encrypts $S$ by forming
$$C = E_{k_j}(S) \equiv S^{e_j} \pmod{n_j}$$
When $n_j < n_i$, individual $i$ splits $S$ into blocks of size less than $n_j$ and encrypts each block using the encrypting transformation $E_{k_j}$.

For decrypting, individual $j$ first uses the private decrypting transformation $D_{k_j}$ to recover $S$, because
$$D_{k_j}(C) = D_{k_j}(E_{k_j}(S)) = S$$
To find the plaintext message $P$, supposedly sent by individual $i$, individual $j$ next uses the public encrypting transformation $E_{k_i}$, because
$$E_{k_i}(S) = E_{k_i}(D_{k_i}(P)) = P$$
Here, we have used the identity $E_{k_i}(D_{k_i}(P)) = P$, which follows from the fact that
$$E_{k_i}(D_{k_i}(P)) \equiv (P^{d_i})^{e_i} \equiv P^{d_i e_i} \equiv P \pmod{n_i}$$
because
$$d_i e_i \equiv 1 \pmod{\varphi(n_i)}$$
The combination of the plaintext block $P$ and the signed version $S$ convinces individual $j$ that the message actually came from individual $i$. Also, individual $i$ cannot deny sending the message, because no one other than individual $i$ could have produced the signed message $S$ from the original message $P$.

1.6.3 Diffie-Hellman Key Exchange

1.6.4 The Knapsack Cryptosystem