Lecture 6: RSA

Johan Håstad, transcribed by Martin Lindkvist

2006-01-31, 2006-02-02 and 2006-02-07

1 Introduction

In an ordinary cryptosystem, encryption uses a key $K$, and decryption is performed by reversing each step of the encryption, so it uses the same key $K$. Could there be another way, where one key, $E$, is used to encrypt and another key, $D$, is used to decrypt the message? In this case decryption cannot be done by reversing each step of the encryption, and hence its correctness has to depend on some mathematical insight.

2 Fermat's little theorem

This theorem states that if $p$ is a prime then $a^{p-1} \equiv 1 \pmod{p}$ for $1 \le a \le p-1$. For example, $p = 7$ and $a = 2$ gives $2^6 = 64 \equiv 1 \pmod{7}$. Note that this does not imply that 7 is a prime; it should only be taken as evidence that 7 might be prime.

3 Public Key Encryption

So we are heading for a technique where we could publish $p, e$ to encrypt and keep $d$ secret for decryption. To be able to do this we have three requirements that need to be fulfilled:

1. Easy to create keys.
2. Easy to encrypt/decrypt.
3. Hard to decipher given the public key $(p, e)$.
Let us try an example relying on Fermat's little theorem. Set $ed = 2(p-1) + 1 = 2p - 1$ where $p$ is a prime $\approx 2^{1024}$. Then we could get the encryption $C$ of the message $M$ by $C = M^e \pmod{p}$ and we could decrypt it with $C^d$ because $C^d = M^{ed} = M^{2(p-1)+1} = M \pmod{p}$. This would not be so great though, because if we know $e$ and $p$ then a simple division is sufficient to find $d$. So with this method our third requirement is not fulfilled. Let us be more liberal and only require that $ed \equiv 1 \pmod{p-1}$. Decryption still works and, at first sight, it is not obvious how to compute $d$ from $e$ and $p$. Before we discuss this construction let us take a detour.

4 How to find primes?

The method for Public Key Encryption that we decided to use above is based on large primes. Therefore it is necessary that we know a good way of finding large primes. The Miller-Rabin primality test works in time $O((\log p)^3)$ and determines, with a very small probability of error, whether $p$ is a prime or not. It is very similar to the simpler Fermat primality test.

4.1 The Fermat primality test

Fermat's little theorem states that if $p$ is a prime then $a^{p-1} \equiv 1 \pmod{p}$ for $1 \le a \le p-1$. So if we want to test whether $p$ is a prime we can choose random $a$'s in the interval and see if the equality holds. If it does hold for many $a$'s then we can be pretty sure that $p$ is a prime. This works for almost all numbers. The Miller-Rabin test is a slight extension that does work for all numbers, but we do not give the details here.

4.2 How to compute $a^{p-1}$

This is not trivial for big primes but there is a shortcut. Suppose $p \approx 2^{1024}$ and $a > 2$. How big is $a^{p-1}$? It is as big as $2^{2^{1024}}$ and has about $\frac{3}{10} \cdot 2^{1024}$ decimal digits, which is far too big to write down. But we now make an important observation. Of the powers

$a, a^2, a^3, a^4, \ldots, a^{p-1}$ (all mod $p$)

not all are needed in order to determine $a^{p-1}$. For example $a^4 = (a^2)^2 \pmod{p}$, which saves us some work, and $a^8 = (a^4)^2 \pmod{p}$ saves us even more, and so on. Since every intermediate result is reduced mod $p$, the numbers never grow larger than $p^2$. It turns out that we only need about $2 \log_2 p$ multiplications (mod $p$) to compute $a^{p-1} \pmod{p}$.
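The repeated-squaring idea and the Fermat test of Section 4.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `power_mod` and `fermat_test` and the choice of 20 rounds are ours, not part of the lecture, and Python's built-in `pow(a, k, n)` already does the same job as `power_mod`.

```python
import random


def power_mod(a: int, k: int, n: int) -> int:
    """Compute a**k mod n by repeated squaring, reducing mod n after every step."""
    result = 1
    base = a % n
    while k > 0:
        if k & 1:                      # current bit of the exponent is 1
            result = (result * base) % n
        base = (base * base) % n       # a, a^2, a^4, a^8, ... (all mod n)
        k >>= 1
    return result


def fermat_test(p: int, rounds: int = 20) -> bool:
    """Return False if p is certainly composite, True if p passed every round.

    'rounds' is an illustrative parameter; a True answer is only evidence
    that p is prime, as explained in Section 4.1.
    """
    if p < 4:
        return p in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, p - 1)  # random a with 2 <= a <= p - 2
        if power_mod(a, p - 1, p) != 1:
            return False
    return True


print(power_mod(2, 6, 7))   # the example from Section 2
print(fermat_test(67))
```

The loop in `power_mod` runs once per bit of the exponent and does at most two multiplications per bit, which matches the bound of about $2 \log_2 p$ multiplications.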
5 Returning to encryption

Let us return to the suggestion above of using $e$ and $d$ such that $ed \equiv 1 \pmod{p-1}$. It turns out that it is easy to compute $d$ from $e$ and $p$. We use the Euclidean algorithm, which computes the GCD (Greatest Common Divisor). With the extended Euclidean algorithm we, apart from the GCD, also get useful cofactors. We run GCD$(e, p-1)$, which will tell us that the greatest common divisor is 1, but we also get two integers $x$ and $y$ such that $xe + y(p-1) = 1$, and we can set $d = x$.

A small example will probably make this clearer. Given $p = 67$ and $e = 17$, compute $d$. We have $p - 1 = 67 - 1 = 66$. The Euclidean algorithm gives us:

\[
\begin{aligned}
66 - 3 \cdot 17 &= 15 \\
17 - 15 &= 2 \\
15 - 7 \cdot 2 &= 1
\end{aligned}
\]

Walking back up:

\[
1 = 15 - 7 \cdot 2 = 15 - 7(17 - 15) = 8 \cdot 15 - 7 \cdot 17 = 8(66 - 3 \cdot 17) - 7 \cdot 17 = 8 \cdot 66 - 31 \cdot 17
\]

so $d = -31 \equiv 35 \pmod{66}$, and indeed $35 \cdot 17 = 595 \equiv 1 \pmod{66}$.

To make it more difficult to find $d$ we work modulo composite numbers instead of modulo primes and get the following description of the famous RSA encryption scheme:

1. Find primes $p$ and $q$ ($\approx 2^{1024}$).
2. Choose $e$ with GCD$(e, (p-1)(q-1)) = 1$.
3. Compute $d$ where $de \equiv 1 \pmod{(p-1)(q-1)}$ by using the extended Euclidean algorithm.
4. Publish $N$ and $e$ ($N = pq$, but $p$ and $q$ must of course be kept secret).

To encrypt the message $M$ to the ciphertext $C$ we take $M$ to the power of $e$. For decryption we take $C$ to the power of $d$ to get back to $M$:
Encryption: $C = M^e \pmod{N}$

Decryption: $M = C^d \pmod{N}$

To see that the decryption is correct, assume that the answer is $M'$ and remember that $ed \equiv 1 \pmod{p-1}$, since $p-1$ divides $(p-1)(q-1)$. This implies that $ed = 1 + k(p-1)$ for some integer $k$ and hence

\[
M' = C^d = M^{ed} = M^{k(p-1)+1} = M \cdot (M^{p-1})^k = M \pmod{p}
\]

by Fermat's little theorem (if $p$ divides $M$, both sides are simply 0 mod $p$), and similarly mod $q$. This implies that $M' - M$ is divisible by $p$ and $q$ and hence by $N$, and we conclude that $M' = M \pmod{N}$ and decryption is correct.

When using RSA for long messages we encrypt block by block, where a block $M_i$ satisfies $1 \le M_i < N$ and thus has about as many bits as $N$. In practice RSA is only used to encrypt a key that can then be used to encrypt the message with a symmetric cipher (AES, for example). This is because RSA is much slower.

5.1 Security of RSA

The security of this cipher basically depends on two things:

1. How hard is factoring? If we find $p$ and $q$ we can surely find $d$.
2. Do we really need to factor in order to break RSA?

There exist several ways of factoring $N$, and they are not all as fast as we would like them to be. Suppose $N \approx 2^{512}$, which is about 155 decimal digits.

1. Trial Division. Works in time $\sqrt{N}$, which for our example gives about $2^{256}$ operations, and that is very inefficient.
2. Pollard-ρ. Works in time $\sqrt{p}$ where $p$ is the smallest prime factor. This takes about $2^{128}$ operations in our example, which is also too inefficient.
3. Quadratic Sieve. Works in time $2^{c\sqrt{\log N \log\log N}}$. This is not enough for 512 bits either, but it works for 130-140 digits.
4. Number Field Sieve. Works in time $2^{c(\log N)^{1/3}(\log\log N)^{2/3}}$. With this algorithm we would find the factors of a 512-bit integer in about a week with pretty good computer power. At the time of this lecture, the official world record is the factoring of a number with 200 digits using this algorithm.
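Of the methods listed above, Pollard-ρ is short enough to sketch. The version below is a minimal illustration, assuming the common textbook variant with the iteration $x \mapsto x^2 + c \pmod{n}$ and Floyd's cycle detection; the function name, the starting value 2 and the default $c = 1$ are our choices, not something specified in the lecture. It is only practical for small factors, in line with the $\sqrt{p}$ running time.

```python
from math import gcd
from typing import Optional


def pollard_rho(n: int, c: int = 1) -> Optional[int]:
    """Try to find a nontrivial factor of a composite n.

    Returns a factor, or None if this attempt fails (in which case one
    would retry with a different constant c).
    """
    if n % 2 == 0:
        return 2

    def step(x: int) -> int:
        return (x * x + c) % n

    x = y = 2
    d = 1
    while d == 1:
        x = step(x)             # moves one step per iteration
        y = step(step(y))       # moves two steps per iteration
        d = gcd(abs(x - y), n)  # a collision mod p shows up as gcd > 1
    return d if d != n else None


factor = pollard_rho(8051)      # 8051 = 83 * 97
print(factor, 8051 // factor)
```

For a 512-bit RSA modulus both prime factors have about 256 bits, so this loop would need on the order of $2^{128}$ iterations, which is why the method is listed as too inefficient.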
Quantum computers are very good at factorization (roughly $(\log N)^3$ operations), so if they were to become reality that would be a real threat to RSA.

As for the other question about the security of RSA, whether we really need to factor $N$ to decrypt RSA, nobody knows the answer. What we do know is that if we want to find $d$ then this is essentially as hard as factoring. Let us briefly see why this is the case by giving a procedure that factors $N$ given $d$. We know that $ed - 1$ is a multiple of $(p-1)(q-1)$ and hence, by Fermat's little theorem applied mod $p$ and mod $q$, $a^{ed-1} \equiv 1 \pmod{N}$ for any $a$ coprime to $N$. Now write $ed - 1 = 2^t U$ where $U$ is odd. Consider the sequence

\[
a^U, a^{2U}, a^{4U}, \ldots, a^{2^t U}. \qquad (1)
\]

It ends in a one and each number is the square of the previous number. Now the equation $x^2 \equiv 1 \pmod{N}$ has only the solutions $\pm 1$ when $N$ is an odd prime. However, if $N = pq$ then we have four solutions, as

\[
x^2 \equiv 1 \pmod{p} \iff x \equiv \pm 1 \pmod{p}, \qquad x^2 \equiv 1 \pmod{q} \iff x \equiv \pm 1 \pmod{q},
\]

and we can combine the two pairs in any way we want. In particular, if $N = 15$ then the four solutions are 1, 4, 11 and 14. For example $4 \equiv 1 \pmod{3}$ and $4 \equiv -1 \pmod{5}$. For one of the interesting solutions (i.e. not $\pm 1$) it turns out that GCD$(N, x-1)$ gives a nontrivial factor of $N$. One can prove that, for a randomly chosen $a$, the above sequence (1) contains such an interesting solution with probability at least one half.

5.2 How to choose e (and d)?

We have two alternatives:

- Choose $e$ with GCD$(e, (p-1)(q-1)) = 1$ and calculate $d$, or
- Choose $d$ with GCD$(d, (p-1)(q-1)) = 1$ and calculate $e$.

Small numbers give fast calculations, as computing $C = M^e$ takes about $\log e$ operations, and thus it might be tempting to use a small $e$ (or $d$). Having $d$ really small is clearly bad as it can be guessed. One can even prove that a mid-size $d$ is bad as well: for $d$ as large as $N^{1/4}$, $d$ can be efficiently found from the continued fraction expansion of the number $e/N$. We skip the details. Having $e$ really small might be slightly dangerous in some situations, but no one knows how to find $M$ from $M^3 \pmod{N}$ if $M$ is chosen randomly. A weakness with this arises if we have small messages.
If $M$ is small (for example a symmetric key), say $M \approx 2^{128}$, then $M^3 \approx 2^{384} < N$, so $M^3 \pmod{N} = M^3$, and cube roots are simple to calculate over the integers.
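The small-message weakness is easy to demonstrate. The sketch below is an illustration under stated assumptions: `integer_cube_root` is our own helper, `N` is just an odd 512-bit number standing in for an RSA modulus (the attack never uses its factors, so we do not claim it is a product of two primes), and the 128-bit value `M` is an arbitrary stand-in for a symmetric key.

```python
def integer_cube_root(n: int) -> int:
    """Return the largest integer r with r**3 <= n, using binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    # Invariant from here on: lo**3 <= n < hi**3
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical stand-in values, chosen only for illustration.
N = 2**511 + 1                          # 512-bit stand-in for a modulus
M = 0x0123456789ABCDEF0123456789ABCDEF  # a message below 2**128
e = 3

C = pow(M, e, N)          # "encryption"; no reduction happens since M**3 < N
recovered = integer_cube_root(C)
print(recovered == M)
```

The attacker needs neither $d$ nor the factors of $N$ here, which is one reason the padding discussed in the next section matters.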
5.3 Weaknesses

There are a few known weaknesses in RSA; here are some of them:

1. If we have an encryption of $M$ we can easily create an encryption of $2M$, since $(2M)^e = 2^e M^e$.
2. We can guess what the message is, encrypt it ourselves, and see if we were right.

Both of these problems can be solved with padding. A fixed padding solves the first weakness and a random padding solves the second.

In practice there is another way to attack RSA. By timing the decryption we can get some information. We can definitely compute the number of 1's in $d$, and we can even compute exactly what $d$ is. A couple of thousand decryptions are enough to compute $d$. The cure for this is to put some dummy operations into the decryption implementation. A similar attack is to monitor the power used by a device doing decryption, but this problem is also solved with dummy operations.

6 Chinese Remainder Theorem (CRT)

This theorem states that if $N = \prod_{i=1}^{r} p_i$ where the $p_i$ are distinct primes (or at least pairwise co-prime), then the system $x \equiv x_i \pmod{p_i}$, $i = 1, 2, \ldots, r$, is uniquely and efficiently solvable by a number $x$ modulo $N$. Let us be slightly more explicit when we have two factors, i.e. $N = p_1 p_2$, and we want to solve

\[
x \equiv x_1 \pmod{p_1}, \qquad x \equiv x_2 \pmod{p_2}.
\]

We claim that we can find the solution as $x = U_1 x_1 + U_2 x_2 \pmod{N}$, where

\[
U_1 \equiv \begin{cases} 1 \pmod{p_1} \\ 0 \pmod{p_2} \end{cases}
\qquad
U_2 \equiv \begin{cases} 0 \pmod{p_1} \\ 1 \pmod{p_2}. \end{cases}
\]
To see that this is correct let us check the equation modulo $p_1$. We have

\[
x = x_1 U_1 + x_2 U_2 = x_1 \cdot 1 + x_2 \cdot 0 = x_1 \pmod{p_1}
\]

and equality modulo $p_2$ can be checked in a similar way. To find $U_1$ and $U_2$ we use the extended Euclidean algorithm computing GCD$(p_1, p_2)$, which gives us numbers $a$ and $b$ such that $1 = a p_1 + b p_2$, and we can set $U_2 = a p_1$ and $U_1 = b p_2$. All these operations are in fact extremely efficient and in particular are much faster than an RSA encryption or decryption.

With help from the CRT we can speed up the decryption of RSA as follows. As a first idea we can compute the result mod $p$ and mod $q$ separately, i.e. compute $C^d \pmod{p}$ and $C^d \pmod{q}$ and merge the results. This requires twice as many operations, as we need to compute two exponentiations. However, as partial results need only be calculated modulo $p$ and modulo $q$ respectively, these operations are done with numbers of only half as many bits, and hence each multiplication costs only a fourth of what it costs for full size numbers. As the CRT is almost for free we gain a factor of about 2 in running time.

We can be even smarter and calculate better decryption exponents. When computing the result mod $p$ we can use an exponent $d_1$ such that $C^{d_1} = M \pmod{p}$, i.e. it is enough that $e d_1 \equiv 1 \pmod{p-1}$, and hence $d_1$ need only be as large as $p$. Similarly we compute a decryption exponent $d_2$ such that $e d_2 \equiv 1 \pmod{q-1}$. We get that $d_1$ and $d_2$ have half as many bits as $d$, and we gain an additional factor of two.
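To tie the pieces together, here is a small Python sketch of the extended Euclidean algorithm, the two-factor CRT with $U_1$ and $U_2$, and CRT-based RSA decryption with the short exponents $d_1$ and $d_2$. It is an illustration only: the function names are ours, and the key ($p = 61$, $q = 53$, $e = 17$) is a toy choice far too small for real use.

```python
from typing import Tuple


def extended_gcd(a: int, b: int) -> Tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def inverse_mod(e: int, m: int) -> int:
    """Return d with e*d = 1 (mod m); requires gcd(e, m) = 1."""
    g, x, _ = extended_gcd(e, m)
    if g != 1:
        raise ValueError("e has no inverse modulo m")
    return x % m


def crt_pair(x1: int, x2: int, p1: int, p2: int) -> int:
    """Solve x = x1 (mod p1), x = x2 (mod p2) for co-prime p1, p2."""
    _, a, b = extended_gcd(p1, p2)   # 1 = a*p1 + b*p2
    u1 = b * p2                      # 1 mod p1, 0 mod p2
    u2 = a * p1                      # 0 mod p1, 1 mod p2
    return (u1 * x1 + u2 * x2) % (p1 * p2)


# Toy key, for illustration only.
p, q, e = 61, 53, 17
N = p * q
d = inverse_mod(e, (p - 1) * (q - 1))   # full decryption exponent
d1 = inverse_mod(e, p - 1)              # short exponent used mod p
d2 = inverse_mod(e, q - 1)              # short exponent used mod q


def decrypt_crt(c: int) -> int:
    """Decrypt c with two half-size exponentiations merged by the CRT."""
    return crt_pair(pow(c, d1, p), pow(c, d2, q), p, q)


M = 1234
C = pow(M, e, N)
print(inverse_mod(17, 66))              # the worked example from Section 5
print(pow(C, d, N), decrypt_crt(C))     # both decryptions should agree
```

In a real implementation $U_1$, $U_2$, $d_1$ and $d_2$ would be computed once at key generation and stored with the private key, so that each decryption costs only the two short exponentiations plus a few multiplications.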