Lecture 7: ElGamal and Discrete Logarithms

Johan Håstad, transcribed by Johan Linde

2006-02-07

1 The discrete logarithm problem

Recall that a generator $g$ of a group $G$ of order $n$ is an element of order $n$, so that the set $\langle g \rangle = \{ g^i \mid 0 \le i \le n-1 \}$ contains every element of $G$. If $g$ is a generator, the other generators are given by $g^r$ where $\gcd(r, n) = 1$.

We will work with the integers modulo $p$ for a prime $p$. The corresponding multiplicative group does not contain 0 and hence its order is $p - 1$. We will also be interested in subgroups of this group, and in such a case the order is always a factor of $p - 1$.

For example, 2 is a generator of the group of positive integers modulo 11 under multiplication, with the following sequence of powers:

\[ 1,\ 2,\ 4,\ 8,\ 16 \equiv 5,\ 10,\ 20 \equiv 9,\ 18 \equiv 7,\ 14 \equiv 3,\ 6,\ 12 \equiv 1 \]

Of course, not all elements are generators. For example, if $p = 11$ as above and we try $g = 4$:

\[ 1,\ 4,\ 16 \equiv 5,\ 20 \equiv 9,\ 36 \equiv 3,\ 12 \equiv 1 \]

We are generally interested in multiplicative groups over positive integers modulo a prime. Primes of the form $p = 1 + 2q$ where $q$ is prime are especially interesting in cryptographic applications. Then, if $g$ is a generator, the other generators are given by $g^r$ where $r$ is odd and $r \ne q$. The other elements, except $g^q$ (which in fact equals $-1$) and the identity, generate the subgroup with $q$ elements given by the even powers of $g$, and this is also a commonly used subgroup in cryptography.

1.1 The discrete logarithm problem (DLOG)

Given $p$, $g$, and $y$, the discrete logarithm problem is to find $x$ such that $g^x = y \bmod p$, or written another way, calculate $x = \log_{g,p} y$. It is easy to
compute $y$ from $p$, $g$, and $x$, but no efficient way of calculating $x$ from $p$, $g$, and $y$ is known.

2 The ElGamal cryptosystem

Take a large prime $p$ (preferably of the form $1 + 2q$ where $q$ is prime) and a generator $g$, randomly choose $x$ with $1 \le x \le p - 1$, and calculate $y = g^x \bmod p$. Publish $y$, $p$, and $g$.

2.1 Encryption

Pick a random $r$ and let

\[ \alpha = g^r \bmod p, \qquad \beta = m\, y^r \bmod p. \]

The ciphertext is given by $(\alpha, \beta)$.

2.2 Decryption

Decryption can be done if $x$ is known in addition to $\alpha$ and $\beta$, as follows:

\[ \frac{\beta}{\alpha^x} = \frac{m\, y^r}{(g^r)^x} = \frac{m\, (g^x)^r}{(g^r)^x} = \frac{m\, g^{rx}}{g^{rx}} = m, \]

where all operations are taken modulo $p$.

2.3 Security

We have a way to encrypt and a way to decrypt. But is it secure? Obviously, if we can calculate discrete logarithms efficiently, ElGamal can be broken. It is unknown whether the converse is true; it might be possible to break ElGamal without computing discrete logarithms.

3 Algorithms for the discrete logarithm problem

3.1 The naive algorithm

The naive way to calculate the discrete logarithm is simply to calculate $g, g^2, g^3, \ldots$ until $y$ is found. This is very inefficient since the time taken
is proportional to $p$.

3.2 The baby-step / giant-step algorithm

An improvement over the naive algorithm is to first calculate $y, yg, yg^2, yg^3, \ldots, yg^a$, where $a = \lceil \sqrt{p} \rceil$, and put the values in a hash table. Then calculate $g^a, g^{2a}, g^{3a}, \ldots, g^{a^2}$, and look in the table for a collision. In the case of a collision, $g^i y = g^{ja}$ gives $y = g^{ja - i}$. The time taken is clearly proportional to $\sqrt{p}$.

Are we guaranteed to find a collision? The answer is yes. To see this, rewrite $x$ as $x = a x_1 + x_2$, where $0 \le x_1, x_2 < a$. From $y = g^x$ we get $y = g^{a x_1 + x_2}$, and multiplying both sides by $g^{a - x_2}$ we have $y g^{a - x_2} = g^{(x_1 + 1)a}$. The left hand side is one of the values in the hash table, and the right hand side is one of the other values.

3.3 The Pohlig-Hellman algorithm

Assume that $p - 1$ has a factorization with only small primes, i.e. $p - 1 = \prod_{i=1}^{r} q_i$, where every $q_i$ is small. The idea is to find $x$ modulo each of the $q_i$ separately and then use the Chinese Remainder Theorem to find $x$.

Suppose $x \equiv x_i \pmod{q_i}$, i.e. that $x = x_i + a_i q_i$ for some integer $a_i$. We have that $g^x = y$ and hence

\[ y^{(p-1)/q_i} = g^{(x_i + a_i q_i)(p-1)/q_i} = g^{x_i (p-1)/q_i + a_i (p-1)} = g^{x_i (p-1)/q_i} \pmod{p} \]

as $g^{p-1} = 1$.

Given a number of the form $g^{(p-1)a/q_i}$, $0 \le a < q_i$, how do we find $a$? Naively, as there are only $q_i$ possibilities, we can compute $g^{b(p-1)/q_i}$ for $0 \le b < q_i$ and compare. Looking more closely, we see that we are back at the discrete logarithm problem in a group of order $q_i$, and we can apply the baby-step/giant-step algorithm, getting an algorithm that runs in time about $\sqrt{q_i}$.

3.4 The best known algorithm

The best known algorithm is a variant of the number field sieve, the algorithm used for integer factorization, and has almost the same time complexity, $2^{c (\log p)^{1/3} (\log \log p)^{2/3}}$. The constants for this algorithm are worse and not as much effort has been spent solving large instances. However, a $p$ with 512 bits must be considered insecure, while 1024 bits is probably beyond reach for the moment.
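As a concrete illustration of Sections 3.2 and 3.3, here is a minimal Python sketch of baby-step/giant-step and of Pohlig-Hellman on top of it. The function names `bsgs` and `pohlig_hellman` and the argument `prime_factors` are our own choices, not part of the lecture. The sketch assumes that $p - 1$ is a product of distinct primes, as in the factorization above, that those primes are supplied by the caller, and that $g$ is a generator. It needs Python 3.8 or later for `math.isqrt` and for modular inverses via `pow`.

```python
from math import isqrt


def bsgs(g, y, p, n=None):
    """Return x with g^x = y (mod p) and 0 <= x < n, or None if there is none.

    n must be a multiple of the order of g; it defaults to p - 1.
    """
    if n is None:
        n = p - 1
    a = isqrt(n - 1) + 1  # a = ceil(sqrt(n)), so a * a >= n

    # Baby steps: store y * g^i for i = 0, ..., a in a hash table.
    table = {}
    value = y % p
    for i in range(a + 1):
        table.setdefault(value, i)
        value = value * g % p

    # Giant steps: g^a, g^(2a), ..., g^(a*a); a collision gives x = j*a - i.
    step = pow(g, a, p)
    value = 1
    for j in range(1, a + 1):
        value = value * step % p
        if value in table:
            return (j * a - table[value]) % n
    return None


def pohlig_hellman(g, y, p, prime_factors):
    """Discrete log of y to base g mod p, for a generator g.

    Assumption: prime_factors lists distinct primes whose product is p - 1.
    """
    n = p - 1
    x, modulus = 0, 1
    for q in prime_factors:
        e = n // q
        # g^e has order q, and y^e = (g^e)^(x mod q).
        x_q = bsgs(pow(g, e, p), pow(y, e, p), p, q)
        # Chinese Remainder Theorem: merge x mod modulus with x mod q.
        t = (x_q - x) * pow(modulus, -1, q) % q
        x += modulus * t
        modulus *= q
    return x


if __name__ == "__main__":
    # Toy example from Section 1: 2 is a generator modulo 11 and 2^6 = 9.
    print(bsgs(2, 9, 11))
    print(pohlig_hellman(2, 9, 11, [2, 5]))
```

With the toy parameters $p = 11$, $g = 2$, $y = 9$, both calls recover $x = 6$. These parameters are for illustration only and are far too small to offer any security.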

4 Diffie-Hellman key exchange

Assume that Alice and Bob want to share a key $K$ via a possibly insecure communication channel. Alice, who knows $a$, sends $g^a$ to Bob, and Bob, who knows $b$, sends $g^b$ to Alice. If we let the key be $g^{ab} \bmod p$, then both Alice and Bob can calculate $K$ since $(g^b)^a = (g^a)^b = g^{ab} = K$, where all calculations are mod $p$, and $g$ is a generator.

The ElGamal encryption scheme can be seen as a version of DH that has $b$ fixed and where the common key is used as a multiplicative one-time pad.

If ElGamal is used in the entire $\mathbb{Z}_p^*$, i.e. $1 \le m \le p - 1$, then an eavesdropper can get information about one bit, namely whether $\log_{g,p} m$ is even or odd, in the following way. Let $y = g^x$.

If $x$ is even, i.e. $x = 2x'$, then

\[ y^{(p-1)/2} = g^{x(p-1)/2} = g^{2x'(p-1)/2} = g^{x'(p-1)} = (g^{p-1})^{x'} = 1 \bmod p. \]

If $x$ is odd, i.e. $x = 2x' + 1$, then

\[ y^{(p-1)/2} = g^{x(p-1)/2} = g^{(2x'+1)(p-1)/2} = g^{x'(p-1) + (p-1)/2} = g^{(p-1)/2} = -1 \bmod p. \]

So

\[ y^{(p-1)/2} = \begin{cases} 1 & \text{if } x \text{ even} \\ -1 & \text{if } x \text{ odd} \end{cases} \]

Applying this test to the public key, to $\alpha$ and to $\beta$ reveals the parities of the secret exponent, of $r$, and of $\log_{g,p} \beta$, which together determine the parity of $\log_{g,p} m$.

That is the motivation to instead use $p = 1 + 2q$ with a $g$ that generates a subgroup of order $q$. The message $m$ is assumed to have even DLOG (if that is not the case, random bits can be used to correct it). DH can then use the same $g$.

It is unknown whether DH is hard given that DLOG is hard. The security of the above key exchange protocol relies on the assumption that the following problems are difficult (of course, CDH is at least as hard as DDH):

4.1 Computational Diffie-Hellman (CDH)

Given $g^a$ and $g^b$, find $g^{ab}$.

4.2 Decision Diffie-Hellman (DDH)

Given either $(g^a, g^b, g^{ab})$ or $(g^a, g^b, g^c)$ for random $a$, $b$ and $c$, decide whether you have been given a triple of the first kind or the second kind.
DDH, loosely speaking, says that no efficient algorithm can do better than guessing on this problem, i.e. cannot be correct with probability significantly higher than 50%. Put differently, DDH says that we cannot recognize the correct answer to CDH even if it is given to us.
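
To make Sections 2 and 4 concrete, the following Python sketch implements ElGamal key generation, encryption and decryption, the Diffie-Hellman shared key, and the one-bit leak described in Section 4. All function names (`keygen`, `encrypt`, `decrypt`, `dh_shared`, `dlog_is_even`, `message_dlog_is_even`) are our own and purely illustrative. The sketch assumes that $g$ generates the whole multiplicative group modulo $p$ and uses toy parameters, so it should not be read as a secure implementation. It needs Python 3.8 or later for negative exponents in `pow`.

```python
import secrets


def keygen(p, g):
    """Return (x, y): secret x with 1 <= x <= p - 1 and public y = g^x mod p."""
    x = secrets.randbelow(p - 1) + 1
    return x, pow(g, x, p)


def encrypt(p, g, y, m):
    """ElGamal encryption: alpha = g^r, beta = m * y^r, all mod p."""
    r = secrets.randbelow(p - 1) + 1
    return pow(g, r, p), m * pow(y, r, p) % p


def decrypt(p, x, alpha, beta):
    """ElGamal decryption: m = beta / alpha^x mod p."""
    return beta * pow(alpha, -x, p) % p


def dh_shared(p, their_public, my_secret):
    """Diffie-Hellman: (g^b)^a = (g^a)^b = g^(ab) mod p."""
    return pow(their_public, my_secret, p)


def dlog_is_even(p, v):
    """v^((p-1)/2) is 1 if log_g v is even and -1 if it is odd (g a generator)."""
    return pow(v, (p - 1) // 2, p) == 1


def message_dlog_is_even(p, y, alpha, beta):
    """Eavesdropper's view: parity of log m from public values only.

    log beta = log m + r*x, and r*x is even unless both r and x are odd.
    """
    rx_even = dlog_is_even(p, y) or dlog_is_even(p, alpha)
    return dlog_is_even(p, beta) == rx_even


if __name__ == "__main__":
    p, g = 11, 2  # toy parameters from Section 1, not secure
    x, y = keygen(p, g)
    alpha, beta = encrypt(p, g, y, 7)
    print(decrypt(p, x, alpha, beta))            # recovers the message 7
    print(message_dlog_is_even(p, y, alpha, beta))  # parity of log_2 7 mod 11
```

Since $2^7 \equiv 7 \pmod{11}$, the discrete logarithm of the message 7 is odd, and the eavesdropper's test reports this without knowing $x$ or $r$. Restricting messages to the subgroup of order $q$, as suggested in Section 4, makes this test return the same answer for every message.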