COMPULSORY READINGS



According to the author of the module, the compulsory readings do not infringe known copyright.


10. COMPULSORY READINGS

Reading #1:
Complete Reference: Elementary Number Theory, by W. Edwin Clark, University of South Florida, 2003 (File name on CD: Elem_number_theory_Clarke)
Abstract/Rationale: A complete open-source textbook in number theory. The complete text is provided as a readable computer file. Specific page references are given in the learning activities to direct the student to activities, readings and exercises.

Reading #2:
Complete Reference: Elementary Number Theory, by William Stein, Harvard University, 2005 (File name on CD: Number_Theory_Stein)
Abstract/Rationale: A complete open-source textbook in number theory. The complete text is provided as a readable computer file. Specific page references are given in the learning activities to direct the student to activities, readings and exercises.

Reading #3:
Complete Reference: MIT Open Courseware, Theory of Numbers, Spring 2003, Prof. Martin Olsson (Folder name on CD: MIT_Theory_of_Numbers)
Abstract/Rationale: A collection of lecture notes from Number Theory lectures at MIT in Cambridge, Massachusetts, USA. Each lecture addresses a specific number theory topic to supplement the learning materials.

Reading #1: MacTutor History of Mathematics (visited )
Complete reference:

Abstract: MacTutor is a must-read for anyone interested in the history of number theory. It gives accounts of how theorems, propositions, corollaries and lemmas have daunted mathematicians over the centuries. Fermat's Last Theorem is a good illustration: the statement is simple enough for a grade three pupil to understand, yet its proof eluded mathematicians for over 300 years, starting in 1637.
Rationale: The history of mathematics as approached in MacTutor not only gives the historical aspects of number theory but also challenges learners to prove theorems, propositions, lemmas and corollaries that have not yet been proved. The learner comes to appreciate the challenges of proof by approaches such as induction and contradiction. The reference is therefore suitable for the variety of mathematical approaches that every number theory learner needs in order to consolidate their knowledge of abstract mathematics.

Reading #2: Wolfram MathWorld (visited )
Complete reference:
Abstract: This reference provides much-needed reading material in number theory. Learners are advised to check critically and follow the given proofs of lemmas. In addition, the reference has a number of illustrations that support the learner by presenting different approaches.
Rationale: The reference enables learners to analyse number theory through abstract approaches that many learners fail to visualise. By reading through it, the learner will appreciate the technical inferences drawn from the lemmas, corollaries, theorems and propositions that are used in the various proofs.

Reading #3: Wikipedia (visited )
Complete reference:
Abstract: Wikipedia should be the learner's closest source of reference in number theory. It is a very powerful resource that all learners should consult in order to understand abstract mathematics. Moreover, it gives the learner access to various arguments that have puzzled mathematicians over the centuries.
Rationale: It gives definitions, explanations and examples that learners may not find in other resources. Because Wikipedia is frequently updated, it gives the learner recent approaches, abstract arguments and illustrations, and it refers to other sources so that the learner can find other proposed approaches in number theory.

Reading(s) #1

Elementary Number Theory
W. Edwin Clark
Department of Mathematics
University of South Florida
Revised June 2, 2003

Copyleft 2002 by W. Edwin Clark. Copyleft means that unrestricted redistribution and modification are permitted, provided that all copies and derivatives retain the same permissions. Specifically, no commercial use of these notes or any revisions thereof is permitted.


Preface

Number theory is concerned with properties of the integers:

$$\dots, -4, -3, -2, -1, 0, 1, 2, 3, 4, \dots$$

The great mathematician Carl Friedrich Gauss called this subject arithmetic and of it he said: "Mathematics is the queen of sciences and arithmetic the queen of mathematics."

At first blush one might think that of all areas of mathematics certainly arithmetic should be the simplest, but it is a surprisingly deep subject.

We assume that students have some familiarity with basic set theory and calculus, but very little of this nature will be needed. To a great extent the book is self-contained. It requires only a certain amount of mathematical maturity. And, hopefully, the student's level of mathematical maturity will increase as the course progresses.

Before the course is over students will be introduced to the symbolic programming language Maple, which is an excellent tool for exploring number theoretic questions.

If you wish to see other books on number theory, take a look in the QA 241 area of the stacks in our library. One may also obtain much interesting and current information about number theory from the internet. See particularly the websites listed in the Bibliography. The websites by Chris Caldwell [2] and by Eric Weisstein [11] are especially recommended. To see what is going on at the frontier of the subject, you may take a look at some recent issues of the Journal of Number Theory, which you will find in our library.

Here are some examples of outstanding unsolved problems in number theory. Some of these will be discussed in this course. A solution to any one of these problems would make you quite famous (at least among mathematicians). Many of these problems concern prime numbers. A prime number is an integer greater than 1 whose only positive factors are 1 and the integer itself.

1. (Goldbach's Conjecture) Every even integer $n > 2$ is the sum of two primes.
2. (Twin Prime Conjecture) There are infinitely many twin primes. [If $p$ and $p + 2$ are primes we say that $p$ and $p + 2$ are twin primes.]
3. Are there infinitely many primes of the form $n^2 + 1$?
4. Are there infinitely many primes of the form $2^n - 1$? Primes of this form are called Mersenne primes.
5. Are there infinitely many primes of the form $2^{2^n} + 1$? Primes of this form are called Fermat primes.
6. ($3n+1$ Conjecture) Consider the function $f$ defined for positive integers $n$ as follows: $f(n) = 3n + 1$ if $n$ is odd and $f(n) = n/2$ if $n$ is even. The conjecture is that the sequence $f(n), f(f(n)), f(f(f(n))), \dots$ always contains 1 no matter what the starting value of $n$ is.
7. Are there infinitely many primes whose digits in base 10 are all ones? Numbers whose digits are all ones are called repunits.
8. Are there infinitely many perfect numbers? [An integer is perfect if it is the sum of its proper divisors.]
9. Is there a fast algorithm for factoring large integers? [A truly fast algorithm for factoring would have important implications for cryptography and data security.]
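The function in problem 6 is easy to experiment with on a computer. The following is a minimal Python sketch; the name `collatz_steps` and the cutoff `max_steps` are illustrative assumptions, not part of the text. It can only test individual starting values and says nothing about the conjecture in general.

```python
def f(n):
    """The map from problem 6: 3n + 1 if n is odd, n / 2 if n is even."""
    return 3 * n + 1 if n % 2 == 1 else n // 2


def collatz_steps(n, max_steps=10_000):
    """Count how many applications of f are needed before 1 appears
    in the sequence f(n), f(f(n)), ...

    Returns None if 1 is not reached within max_steps applications
    (max_steps is an arbitrary cutoff chosen for this sketch).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    for steps in range(1, max_steps + 1):
        n = f(n)
        if n == 1:
            return steps
    return None


if __name__ == "__main__":
    for start in (1, 2, 6, 7):
        print(start, collatz_steps(start))
```

Note that the sequence in the conjecture starts at $f(n)$, not at $n$, so the starting value 1 itself needs several steps before 1 reappears.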

Famous Quotations Related to Number Theory

Two quotations from G. H. Hardy:

In the first quotation Hardy is speaking of the famous Indian mathematician Ramanujan. This is the source of the often made statement that Ramanujan knew each integer personally.

"I remember once going to see him when he was lying ill at Putney. I had ridden in taxi cab number 1729 and remarked that the number seemed to me rather a dull one, and that I hoped it was not an unfavorable omen. 'No,' he replied, 'it is a very interesting number; it is the smallest number expressible as the sum of two cubes in two different ways.'"

"Pure mathematics is on the whole distinctly more useful than applied. For what is useful above all is technique, and mathematical technique is taught mainly through pure mathematics."

Two quotations by Leopold Kronecker:

"God has made the integers, all the rest is the work of man."

The original quotation in German was "Die ganze Zahl schuf der liebe Gott, alles Übrige ist Menschenwerk." More literally, the translation is "The whole number, created the dear God, everything else is man's work." Note in particular that Zahl is German for number. This is the reason that today we use $\mathbb{Z}$ for the set of integers.

"Number theorists are like lotus-eaters: having once tasted of this food they can never give it up."

A quotation by contemporary number theorist William Stein:

"A computer is to a number theorist, like a telescope is to an astronomer. It would be a shame to teach an astronomy class without touching a telescope; likewise, it would be a shame to teach this class without telling you how to look at the integers through the lens of a computer."


Contents

Preface
1 Basic Axioms for Z
2 Proof by Induction
3 Elementary Divisibility Properties
4 The Floor and Ceiling of a Real Number
5 The Division Algorithm
6 Greatest Common Divisor
7 The Euclidean Algorithm
8 Bezout's Lemma
9 Blankinship's Method
10 Prime Numbers
11 Unique Factorization
12 Fermat Primes and Mersenne Primes
13 The Functions σ and τ
14 Perfect Numbers and Mersenne Primes

15 Congruences
16 Divisibility Tests for 2, 3, 5, 9, 11
17 Divisibility Tests for 7 and 13
18 More Properties of Congruences
19 Residue Classes
20 Z_m and Complete Residue Systems
21 Addition and Multiplication in Z_m
22 The Groups U_m
23 Two Theorems of Euler and Fermat
24 Probabilistic Primality Tests
25 The Base b Representation of n
26 Computation of a^N mod m
27 The RSA Scheme
A Rings and Groups

Chapter 1

Basic Axioms for Z

Since number theory is concerned with properties of the integers, we begin by setting up some notation and reviewing some basic properties of the integers that will be needed later:

$\mathbb{N} = \{1, 2, 3, \dots\}$ (the natural numbers or positive integers)
$\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, 3, \dots\}$ (the integers)
$\mathbb{Q} = \left\{ \frac{n}{m} \;\middle|\; n, m \in \mathbb{Z} \text{ and } m \ne 0 \right\}$ (the rational numbers)
$\mathbb{R}$ = the real numbers

Note that $\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R}$. I assume a knowledge of the basic rules of high school algebra which apply to $\mathbb{R}$ and therefore to $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Q}$. By this I mean things like $ab = ba$ and $ab + ac = a(b + c)$. I will not list all of these properties here. However, below I list some particularly important properties of $\mathbb{Z}$ that will be needed. I call them axioms since we will not prove them in this course.

Some Basic Axioms for Z

1. If $a, b \in \mathbb{Z}$, then $a + b$, $a - b$ and $ab \in \mathbb{Z}$. ($\mathbb{Z}$ is closed under addition, subtraction and multiplication.)
2. If $a \in \mathbb{Z}$ then there is no $x \in \mathbb{Z}$ such that $a < x < a + 1$.
3. If $a, b \in \mathbb{Z}$ and $ab = 1$, then either $a = b = 1$ or $a = b = -1$.
4. Laws of Exponents: For $n, m$ in $\mathbb{N}$ and $a, b$ in $\mathbb{R}$ we have

(a) $(a^n)^m = a^{nm}$
(b) $(ab)^n = a^n b^n$
(c) $a^n a^m = a^{n+m}$.
These rules hold for all $n, m \in \mathbb{Z}$ if $a$ and $b$ are not zero.

5. Properties of Inequalities: For $a, b, c$ in $\mathbb{R}$ the following hold:
(a) (Transitivity) If $a < b$ and $b < c$, then $a < c$.
(b) If $a < b$ then $a + c < b + c$.
(c) If $a < b$ and $0 < c$ then $ac < bc$.
(d) If $a < b$ and $c < 0$ then $bc < ac$.
(e) (Trichotomy) Given $a$ and $b$, one and only one of the following holds: $a = b$, $a < b$, $b < a$.

6. The Well-Ordering Property for $\mathbb{N}$: Every non-empty subset of $\mathbb{N}$ contains a least element.

7. The Principle of Mathematical Induction: Let $P(n)$ be a statement concerning the integer variable $n$. Let $n_0$ be any fixed integer. $P(n)$ is true for all integers $n \ge n_0$ if one can establish both of the following statements:
(a) $P(n)$ is true if $n = n_0$.
(b) Whenever $P(n)$ is true for $n_0 \le n \le k$ then $P(n)$ is true for $n = k + 1$.

We use the usual conventions:
1. $a \le b$ means $a < b$ or $a = b$,
2. $a > b$ means $b < a$, and
3. $a \ge b$ means $b \le a$.

Important Convention. Since in this course we will be almost exclusively concerned with integers we shall assume from now on (unless otherwise stated) that all lower case roman letters $a, b, \dots, z$ are integers.

Chapter 2

Proof by Induction

In this section, I list a number of statements that can be proved by use of the Principle of Mathematical Induction. I will refer to this principle as PMI or, simply, induction. A sample proof is given below. The rest will be given in class, hopefully by students.

A sample proof using induction: I will give two versions of this proof. In the first proof I explain in detail how one uses the PMI. The second proof is less pedagogical and is the type of proof I expect students to construct. I call the statement I want to prove a proposition. It might also be called a theorem, lemma or corollary depending on the situation.

Proposition 2.1. If $n \ge 5$ then $2^n > 5n$.

Proof #1. Here we use the Principle of Mathematical Induction. Note that PMI has two parts which we denote by PMI (a) and PMI (b). We let $P(n)$ be the statement $2^n > 5n$. For $n_0$ we take 5. We could write simply:

$P(n) = $ "$2^n > 5n$" and $n_0 = 5$.

Note that $P(n)$ represents a statement, usually an inequality or an equation but sometimes a more complicated assertion. Now if $n = 4$ then $P(n)$ becomes the statement $2^4 > 5 \cdot 4$, which is false! But if $n = 5$, $P(n)$ is the statement $2^5 > 5 \cdot 5$, or $32 > 25$, which is true, and we have established PMI (a).

Now to prove PMI (b) we begin by assuming that $P(n)$ is true for $5 \le n \le k$. That is, we assume

(2.1) $2^n > 5n$ for $5 \le n \le k$.

The assumption (2.1) is called the induction hypothesis. We want to use it to prove that $P(n)$ holds when $n = k + 1$. So here's what we do. By (2.1), letting $n = k$, we have $2^k > 5k$. Multiply both sides by two and we get

(2.2) $2^{k+1} > 10k$.

Note that we are trying to prove $2^{k+1} > 5(k + 1)$. Now $5(k + 1) = 5k + 5$, so if we can show $10k \ge 5k + 5$ we can use (2.2) to complete the proof. Now $10k = 5k + 5k$ and $k \ge 5$ by (2.1), so $k \ge 1$ and hence $5k \ge 5$. Therefore

$10k = 5k + 5k \ge 5k + 5 = 5(k + 1)$.

Thus $2^{k+1} > 10k \ge 5(k + 1)$, so

(2.3) $2^{k+1} > 5(k + 1)$,

that is, $P(n)$ holds when $n = k + 1$. So assuming the induction hypothesis (2.1) we have proved (2.3). Thus we have established PMI (b).

We have established that parts (a) and (b) of PMI hold for this particular $P(n)$ and $n_0$. So the PMI tells us that $P(n)$ holds for $n \ge 5$. That is, $2^n > 5n$ holds for $n \ge 5$.

I now give a more streamlined proof.

Proposition 2.2. If $n \ge 5$ then $2^n > 5n$.

Proof #2. We prove the proposition by induction on the variable $n$. If $n = 5$ we have $2^5 > 5 \cdot 5$, or $32 > 25$, which is true. Assume $2^n > 5n$ for $5 \le n \le k$ (the induction hypothesis). Taking $n = k$ we have

$2^k > 5k$.

Multiplying both sides by 2 gives

$2^{k+1} > 10k$.

Now $10k = 5k + 5k$ and $k \ge 5$, so $k \ge 1$ and therefore $5k \ge 5$. Hence

$10k = 5k + 5k \ge 5k + 5 = 5(k + 1)$.

It follows that

$2^{k+1} > 10k \ge 5(k + 1)$

and therefore

$2^{k+1} > 5(k + 1)$.

Hence by PMI we conclude that $2^n > 5n$ for $n \ge 5$.

The 8 major parts of a proof by induction:

1. First state what proposition you are going to prove. Precede the statement by Proposition, Theorem, Lemma, Corollary, Fact, or To Prove:.
2. Write Proof or Pf. at the very beginning of your proof.
3. Say that you are going to use induction (some proofs do not use induction!) and, if it is not obvious from the statement of the proposition, identify clearly $P(n)$, the statement to be proved, the variable $n$ and the starting value $n_0$. Even though this is usually clear, sometimes these things may not be obvious. And, of course, the variable need not be $n$. It could be represented in many different ways.
4. Prove that $P(n)$ holds when $n = n_0$.
5. Assume that $P(n)$ holds for $n_0 \le n \le k$. This assumption will be referred to as the induction hypothesis.

6. Use the induction hypothesis and anything else that is known to be true to prove that $P(n)$ holds when $n = k + 1$.
7. Conclude that since the conditions of the PMI have been met, $P(n)$ holds for $n \ge n_0$.
8. Write QED or $\square$ or // or something to indicate that you have completed your proof.

Exercise 2.1. Prove that $2^n > 6n$ for $n \ge 5$.

Exercise 2.2. Prove that $1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$ for $n \ge 1$.

Exercise 2.3. Prove that if $0 < a < b$ then $0 < a^n < b^n$ for all $n \in \mathbb{N}$.

Exercise 2.4. Prove that $n! < n^n$ for $n \ge 2$.

Exercise 2.5. Prove that if $a$ and $r$ are real numbers and $r \ne 1$, then for $n \ge 1$

$$a + ar + ar^2 + \cdots + ar^n = \frac{a(r^{n+1} - 1)}{r - 1}.$$

This can be written as follows:

$$a(r^{n+1} - 1) = (r - 1)(a + ar + ar^2 + \cdots + ar^n).$$

An important special case of which is

$$(r^{n+1} - 1) = (r - 1)(1 + r + r^2 + \cdots + r^n).$$

Exercise 2.6. Prove that $1 + 2 + 2^2 + \cdots + 2^n = 2^{n+1} - 1$ for $n \ge 1$.

Exercise 2.7. Prove that $\underbrace{11 \cdots 1}_{n \text{ 1's}} = \frac{10^n - 1}{9}$ for $n \ge 1$.

Exercise 2.8. Prove that $1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$ if $n \ge 1$.

Exercise 2.9. Prove that if $n \ge 12$ then $n$ can be written as a sum of 4's and 5's. For example, $23 = 5 + 5 + 5 + 4 + 4 = 3 \cdot 5 + 2 \cdot 4$. [Hint. In this case it will help to do the cases $n = 12, 13, 14$, and $15$ separately. Then use induction to handle $n \ge 16$.]

Exercise 2.10. (a) For $n \ge 1$, the triangular number $t_n$ is the number of dots in a triangular array that has $n$ rows with $i$ dots in the $i$-th row. Find a formula for $t_n$, $n \ge 1$. (b) For each $n \ge 1$, let $s_n$ be the number of dots in a square array that has $n$ rows with $n$ dots in each row. Find a formula for $s_n$. The numbers $s_n$ are usually called squares.

Exercise 2.11. Find the first 10 triangular numbers and the first 10 squares. Which of the triangular numbers in your list are also squares? Can you find the next triangular number which is a square?

Exercise 2.12. Some propositions that can be proved by induction can also be proved without induction. Prove Exercises 2.2 and 2.5 without induction. [Hints: For 2.2 write $s = 1 + 2 + \cdots + (n - 1) + n$. Directly under this equation write $s = n + (n - 1) + \cdots + 2 + 1$. Add these equations to obtain $2s = n(n + 1)$. Solve for $s$. For Exercise 2.5 write $p = a + ar + ar^2 + \cdots + ar^n$. Then multiply both sides of this equation by $r$ to get a new equation with $rp$ as the left hand side. Subtract these two equations to obtain $pr - p = ar^{n+1} - a$. Now solve for $p$.]
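Before attempting an induction proof it can be reassuring to spot-check the statement numerically. The sketch below does this for Proposition 2.1 and for the identities in Exercises 2.2, 2.6 and 2.8. The helper name `failures` and the upper limit 200 are arbitrary choices made for this illustration. A finite check of this kind is not a proof; it can only expose a false statement.

```python
def failures(predicate, start, stop):
    """Return every n with start <= n <= stop for which predicate(n) is false."""
    return [n for n in range(start, stop + 1) if not predicate(n)]


def prop_2_1(n):
    # Proposition 2.1: 2^n > 5n (claimed for n >= 5).
    return 2**n > 5 * n


def ex_2_2(n):
    # Exercise 2.2: 1 + 2 + ... + n = n(n + 1)/2.
    return sum(range(1, n + 1)) == n * (n + 1) // 2


def ex_2_6(n):
    # Exercise 2.6: 1 + 2 + 2^2 + ... + 2^n = 2^(n+1) - 1.
    return sum(2**i for i in range(n + 1)) == 2 ** (n + 1) - 1


def ex_2_8(n):
    # Exercise 2.8: 1^2 + 2^2 + ... + n^2 = n(n + 1)(2n + 1)/6.
    return sum(i * i for i in range(1, n + 1)) == n * (n + 1) * (2 * n + 1) // 6


if __name__ == "__main__":
    print(failures(prop_2_1, 5, 200))  # empty list if no counterexample is found
    print(failures(prop_2_1, 1, 4))    # the statement fails below n0 = 5
    for check in (ex_2_2, ex_2_6, ex_2_8):
        print(check.__name__, failures(check, 1, 200))
```

Running the first check from $n = 1$ instead of $n = 5$ shows why the starting value $n_0$ matters.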


Chapter 3

Elementary Divisibility Properties

Definition 3.1. $d \mid n$ means there is an integer $k$ such that $n = dk$. $d \nmid n$ means that $d \mid n$ is false.

Note that $a \mid b \ne a/b$. Recall that $a/b$ represents the fraction $\frac{a}{b}$.

The expression $d \mid n$ may be read in any of the following ways:
1. $d$ divides $n$.
2. $d$ is a divisor of $n$.
3. $d$ is a factor of $n$.
4. $n$ is a multiple of $d$.

Thus, the following five statements are equivalent, that is, they are all different ways of saying the same thing:
1. $2 \mid 6$.
2. 2 divides 6.
3. 2 is a divisor of 6.
4. 2 is a factor of 6.
5. 6 is a multiple of 2.

Definitions will play an important role in this course. Students should learn all definitions and be able to state them precisely. An alternative way to state the definition of $d \mid n$ is as follows.

Definition 3.2. $d \mid n \iff n = dk$ for some $k$.

or maybe

Definition 3.3. $d \mid n$ iff $n = dk$ for some $k$.

Keep in mind that we are assuming that all letters $a, b, \dots, z$ represent integers. Otherwise we would have to add this fact to our definitions. One might also see the following definition sometimes.

Definition 3.4. $d \mid n$ if $n = dk$ for some $k$.

Note that $\iff$, iff, and if and only if, all mean the same thing. In definitions such as Definition 3.4, if is interpreted to mean if and only if. It should be emphasized that all the above definitions are acceptable. Take your pick. But be careful about making up your own definitions.

Theorem 3.1 (Divisibility Properties). If $n$, $m$, and $d$ are integers then the following statements hold:
1. $n \mid n$ (everything divides itself)
2. $d \mid n$ and $n \mid m \implies d \mid m$ (transitivity)
3. $d \mid n$ and $d \mid m \implies d \mid an + bm$ for all $a$ and $b$ (linearity property)
4. $d \mid n \implies ad \mid an$ (multiplication property)
5. $ad \mid an$ and $a \ne 0 \implies d \mid n$ (cancellation property)
6. $1 \mid n$ (one divides everything)
7. $n \mid 1 \implies n = \pm 1$ (1 and $-1$ are the only divisors of 1)
8. $d \mid 0$ (everything divides zero)
9. $0 \mid n \implies n = 0$ (zero divides only zero)
10. If $d$ and $n$ are positive and $d \mid n$ then $d \le n$ (comparison property)

Exercise 3.1. Prove each of the properties 1 through 10 in Theorem 3.1.

Definition 3.5. If $c = as + bt$ for some integers $s$ and $t$ we say that $c$ is a linear combination of $a$ and $b$.

Thus, statement 3 in Theorem 3.1 says that if $d$ divides $a$ and $b$, then $d$ divides all linear combinations of $a$ and $b$. In particular, $d$ divides $a + b$ and $a - b$. This will turn out to be a useful fact.

Exercise 3.2. Prove that if $d \mid a$ and $d \mid b$ then $d \mid a - b$.

Exercise 3.3. Prove that if $a \in \mathbb{Z}$ then the only positive divisor of both $a$ and $a + 1$ is 1.
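Definition 3.1 translates directly into a small test. The following Python sketch is one possible rendering; the function names `divides` and `linearity_holds` are hypothetical, chosen for this illustration. It treats $d = 0$ separately, following properties 8 and 9 of Theorem 3.1, because the remainder operator is undefined for a zero divisor. The second function spot-checks the linearity property over a small range of coefficients, which illustrates the property without proving it.

```python
def divides(d, n):
    """Return True if d | n, i.e. n = d*k for some integer k (Definition 3.1)."""
    if d == 0:
        # 0 | n only when n = 0 (Theorem 3.1, property 9).
        return n == 0
    return n % d == 0


def linearity_holds(d, n, m, bound=10):
    """Check d | a*n + b*m for all |a|, |b| <= bound, assuming d | n and d | m."""
    if not (divides(d, n) and divides(d, m)):
        raise ValueError("the property assumes d | n and d | m")
    return all(
        divides(d, a * n + b * m)
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
    )


if __name__ == "__main__":
    print(divides(2, 6), divides(4, 6), divides(7, 0), divides(0, 7))
    print(linearity_holds(3, 12, 21))
```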


Chapter 4

The Floor and Ceiling of a Real Number

Here we define the floor, a.k.a. the greatest integer, and the ceiling, a.k.a. the least integer, functions. Kenneth Iverson introduced this notation and the terms floor and ceiling in the early 1960s, according to Donald Knuth [6], who has done a lot to popularize the notation. Now this notation is standard in most areas of mathematics.

Definition 4.1. If $x$ is any real number we define

$\lfloor x \rfloor$ = the greatest integer less than or equal to $x$
$\lceil x \rceil$ = the least integer greater than or equal to $x$

$\lfloor x \rfloor$ is called the floor of $x$ and $\lceil x \rceil$ is called the ceiling of $x$. The floor $\lfloor x \rfloor$ is sometimes denoted $[x]$ and called the greatest integer function. But I prefer the notation $\lfloor x \rfloor$. Here are a few simple examples:

$\lfloor 3.1 \rfloor = 3$ and $\lceil 3.1 \rceil = 4$
$\lfloor 3 \rfloor = 3$ and $\lceil 3 \rceil = 3$
$\lfloor -3.1 \rfloor = -4$ and $\lceil -3.1 \rceil = -3$

From now on we mostly concentrate on the floor $\lfloor x \rfloor$. For a more detailed treatment of both the floor and ceiling see the book Concrete Mathematics [5]. According to the definition of $\lfloor x \rfloor$ we have

(4.1) $\lfloor x \rfloor = \max\{n \in \mathbb{Z} \mid n \le x\}$

Note also that if $n$ is an integer we have:

(4.2) $n = \lfloor x \rfloor \iff n \le x < n + 1$.

From this it is clear that

$\lfloor x \rfloor \le x$ holds for all $x$,

and

$\lfloor x \rfloor = x \iff x \in \mathbb{Z}$.

We need the following lemma to prove our next theorem.

Lemma 4.1. For all $x \in \mathbb{R}$

$x - 1 < \lfloor x \rfloor \le x$.

Proof. Let $n = \lfloor x \rfloor$. Then by (4.2) we have $n \le x < n + 1$. This gives immediately that $\lfloor x \rfloor \le x$, as already noted above. It also gives $x < n + 1$, which implies that $x - 1 < n$, that is, $x - 1 < \lfloor x \rfloor$.

Exercise 4.1. Sketch the graph of the function $f(x) = \lfloor x \rfloor$ for $-3 \le x \le 3$.

Exercise 4.2. Find $\lfloor \pi \rfloor$, $\lceil \pi \rceil$, $\lfloor \sqrt{2} \rfloor$, $\lceil \sqrt{2} \rceil$, $\lfloor -\pi \rfloor$, $\lceil -\pi \rceil$, $\lfloor -\sqrt{2} \rfloor$, and $\lceil -\sqrt{2} \rceil$.

Definition 4.2. Recall that the decimal representation of a positive integer $a$ is given by $a = a_{n-1} a_{n-2} \cdots a_1 a_0$ where

(4.3) $a = a_{n-1} 10^{n-1} + a_{n-2} 10^{n-2} + \cdots + a_1 10 + a_0$

and the digits $a_{n-1}, a_{n-2}, \dots, a_1, a_0$ are in the set $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$ with $a_{n-1} \ne 0$. In this case we say that the integer $a$ is an $n$ digit number or that $a$ is $n$ digits long.

Exercise 4.3. Prove that $a \in \mathbb{N}$ is an $n$ digit number where $n = \lfloor \log(a) \rfloor + 1$. Here log means logarithm to base 10. Hint: Show that if (4.3) holds with $a_{n-1} \ne 0$ then $10^{n-1} \le a < 10^n$. Then apply the log to all terms of this inequality.

Exercise 4.4. Use the previous exercise to determine the number of digits in the decimal representation of the number [...]. Recall that $\log(x^y) = y \log(x)$ when $x$ and $y$ are positive.
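The examples after Definition 4.1 and the inequality in the hint to Exercise 4.3 can be tried out with Python's standard `math.floor` and `math.ceil`. The sketch below is illustrative only; the name `num_digits` is an assumption of this example. It finds the digit count by locating $n$ with $10^{n-1} \le a < 10^n$ using exact integer arithmetic, rather than evaluating $\lfloor \log(a) \rfloor + 1$ in floating point, since a floating-point logarithm can round incorrectly for values just below a power of 10.

```python
import math


def num_digits(a):
    """Number of decimal digits of a positive integer a.

    Finds the n with 10^(n-1) <= a < 10^n, as in the hint to Exercise 4.3,
    using only integer comparisons.
    """
    if a < 1:
        raise ValueError("a must be a positive integer")
    n = 1
    while 10**n <= a:
        n += 1
    return n


if __name__ == "__main__":
    # The examples following Definition 4.1.
    for x in (3.1, 3, -3.1):
        print(x, math.floor(x), math.ceil(x))

    # Lemma 4.1: x - 1 < floor(x) <= x.
    for x in (-2.5, -1.0, 0.25, 7.9):
        assert x - 1 < math.floor(x) <= x

    print(num_digits(12345))
```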

Chapter 5

The Division Algorithm

The goal of this section is to prove the following important result.

Theorem 5.1 (The Division Algorithm). If $a$ and $b$ are integers and $b > 0$ then there exist unique integers $q$ and $r$ satisfying the two conditions:

(5.1) $a = bq + r$ and $0 \le r < b$.

In this situation $q$ is called the quotient and $r$ is called the remainder when $a$ is divided by $b$. Note that there are two parts to this result. One part is the EXISTENCE of integers $q$ and $r$ satisfying (5.1) and the second part is the UNIQUENESS of the integers $q$ and $r$ satisfying (5.1).

Proof. Given $b > 0$ and any $a$ define

$q = \left\lfloor \frac{a}{b} \right\rfloor$
$r = a - bq$

Clearly we have $a = bq + r$. But we need to prove that $0 \le r < b$. By Lemma 4.1 we have

$\frac{a}{b} - 1 < \left\lfloor \frac{a}{b} \right\rfloor \le \frac{a}{b}$.

Now multiply all terms of this inequality by $-b$. Since $b$ is positive, $-b$ is negative so the direction of the inequality is reversed, giving us:

$b - a > -b \left\lfloor \frac{a}{b} \right\rfloor \ge -a$.

If we add $a$ to all sides of the inequality and replace $\lfloor a/b \rfloor$ by $q$ we obtain

$b > a - bq \ge 0$.

Since $r = a - bq$ this gives us the desired result $0 \le r < b$.

We still have to prove that $q$ and $r$ are uniquely determined. To do this we assume that

$a = bq_1 + r_1$ and $0 \le r_1 < b$,

and

$a = bq_2 + r_2$ and $0 \le r_2 < b$.

We must show that $r_1 = r_2$ and $q_1 = q_2$. If $r_1 \ne r_2$, without loss of generality we can assume that $r_2 > r_1$. Subtracting these two equations we obtain

$0 = a - a = (bq_1 + r_1) - (bq_2 + r_2) = b(q_1 - q_2) + (r_1 - r_2)$.

This implies that

(5.2) $r_2 - r_1 = b(q_1 - q_2)$.

This implies that $b \mid r_2 - r_1$. By Theorem 3.1(10) this implies that $b \le r_2 - r_1$. But since $0 \le r_1 < r_2 < b$ we have $r_2 - r_1 < b$, which contradicts $b \le r_2 - r_1$. So we must conclude that $r_1 = r_2$. Now from (5.2) we have $0 = b(q_1 - q_2)$. Since $b > 0$ this tells us that $q_1 - q_2 = 0$, that is, $q_1 = q_2$. This completes the proof of the uniqueness of $r$ and $q$ in (5.1).

Definition 5.1. An integer $n$ is even if $n = 2k$ for some $k$, and is odd if $n = 2k + 1$ for some $k$.

Exercise 5.1. Prove using the Division Algorithm that every integer is either even or odd, but never both.

Definition 5.2. By the parity of an integer we mean whether it is even or odd.

Exercise 5.2. Prove $n$ and $n^2$ always have the same parity. That is, $n$ is even if and only if $n^2$ is even.

Exercise 5.3. Find the $q$ and $r$ of the Division Algorithm for the following values of $a$ and $b$:
1. Let $b = 3$ and $a = 0, 1, -1, 10, [...]$.
2. Let $b = 345$ and $a = 0, 1, -1, 344, 7863, [...]$.

Exercise 5.4. Devise a method for solving problems like those in the previous exercise for large positive values of $a$ and $b$ using a calculator. Illustrate by using $a = [...]$ and $b = 123$. Hint: If $a = bq + r$ and $0 \le r < b$ then $\frac{a}{b} = q + \frac{r}{b}$ and so $\frac{r}{b}$ is the fractional part of the decimal number $\frac{a}{b}$. So $q$ is what you get when you drop the fractional part. Once you have $q$ you can solve $a = bq + r$ for $r$.

Sometimes a problem in number theory can be solved by dividing the integers into various classes depending on their remainders when divided by some number $b$. For example, this is helpful in solving the following two problems.

Exercise 5.5. Show that for all integers $n$ the number $n^3 - n$ always has 3 as a factor. (Consider the three cases: $n = 3k$, $n = 3k + 1$, $n = 3k + 2$.)

Exercise 5.6. Show that the product of any three consecutive integers has 6 as a factor. (How many cases should you use here?)

Definition 5.3. For $b > 0$ define $a \bmod b = r$ where $r$ is the remainder given by the Division Algorithm when $a$ is divided by $b$, that is, $a = bq + r$ and $0 \le r < b$.

For example $23 \bmod 7 = 2$ since $23 = 7 \cdot 3 + 2$ and $-4 \bmod 5 = 1$ since $-4 = 5 \cdot (-1) + 1$.

Note that some calculators and most programming languages have a function often denoted by MOD(a, b) or mod(a, b) whose value is what we have just defined as $a \bmod b$. When this is the case the values $r$ and $q$ in the Division Algorithm for given $a$ and $b > 0$ are given by

$r = a \bmod b$
$q = \frac{a - (a \bmod b)}{b}$

If also the floor function is available we have

$r = a \bmod b$
$q = \lfloor a/b \rfloor$

Exercise 5.7. Prove that if $b > 0$ then $b \mid a \iff a \bmod b = 0$.

Exercise 5.8. Prove that if $b \ne 0$ then $b \mid a \iff a/b \in \mathbb{Z}$.

Exercise 5.9. Calculate the following:
1. $0 \bmod [...]$
2. $[...] \bmod [...]$
3. $[...] \bmod [...]$
4. $[...] \bmod [...]$
5. $(-7) \bmod 3$
6. $(-3) \bmod 7$
7. $(-5) \bmod 5$

Exercise 5.10. Use the Division Algorithm to prove the following more general version: If $b \ne 0$ then for any $a$ there exist unique $q$ and $r$ such that

(5.3) $a = bq + r$ and $0 \le r < |b|$.

Hint: Recall that $|b|$ is $b$ if $b \ge 0$ and is $-b$ if $b < 0$. We know the statement holds if $b > 0$, so we only need to consider the case when $b < 0$. If $b$ is negative then $-b$ is positive, so we can apply the Division Algorithm to $a$ and $-b$. Note that $a$ as well as $q$ can be any integers. This exercise may come in handy later.
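The construction in the proof of Theorem 5.1, $q = \lfloor a/b \rfloor$ and $r = a - bq$, can be sketched in Python as follows. The function name `division_algorithm` is an assumption of this example. For $b > 0$, Python's integer floor division `//` computes $\lfloor a/b \rfloor$ exactly, and its `%` operator then agrees with $a \bmod b$ as in Definition 5.3. For $b < 0$ the sketch follows the hint to Exercise 5.10 by dividing by $|b|$ and negating the quotient, because Python's own `%` returns a remainder with the sign of the divisor in that case.

```python
def division_algorithm(a, b):
    """Return (q, r) with a = b*q + r and 0 <= r < |b|, for any integer b != 0."""
    if b == 0:
        raise ValueError("b must be nonzero")
    divisor = abs(b)
    q = a // divisor       # floor(a / |b|), computed exactly on integers
    r = a - divisor * q    # 0 <= r < |b| by the proof of Theorem 5.1
    if b < 0:
        q = -q             # a = |b|*q + r = b*(-q) + r
    return q, r


if __name__ == "__main__":
    # The two examples given after Definition 5.3.
    print(division_algorithm(23, 7))   # 23 = 7*3 + 2
    print(division_algorithm(-4, 5))   # -4 = 5*(-1) + 1

    # Exhaustive check of both conditions on a small range.
    for a in range(-50, 51):
        for b in range(-10, 11):
            if b != 0:
                q, r = division_algorithm(a, b)
                assert a == b * q + r and 0 <= r < abs(b)
```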

Chapter 6

Greatest Common Divisor

Definition 6.1. Let $a, b \in \mathbb{Z}$. If $a \ne 0$ or $b \ne 0$, we define $\gcd(a, b)$ to be the largest integer $d$ such that $d \mid a$ and $d \mid b$. We define $\gcd(0, 0) = 0$.

Discussion. If $e \mid a$ and $e \mid b$ we call $e$ a common divisor of $a$ and $b$. Let

$C(a, b) = \{e : e \mid a \text{ and } e \mid b\}$,

that is, $C(a, b)$ is the set of all common divisors of $a$ and $b$. Note that since everything divides 0, $C(0, 0) = \mathbb{Z}$, so there is no largest common divisor of 0 with 0. This is why we must define $\gcd(0, 0) = 0$.

Example 6.1.

$C(18, 30) = \{-1, 1, -2, 2, -3, 3, -6, 6\}$.

So $\gcd(18, 30) = 6$.

Lemma 6.1. If $e \mid a$ then $-e \mid a$.

Proof. If $e \mid a$ then $a = ek$ for some $k$. Then $a = (-e)(-k)$. Since $-e$ and $-k$ are also integers, $-e \mid a$.

Lemma 6.2. If $a \ne 0$, the largest positive integer that divides $a$ is $|a|$.

Proof. Recall that

$|a| = \begin{cases} a & \text{if } a \ge 0 \\ -a & \text{if } a < 0. \end{cases}$

First note that $|a|$ actually divides $a$: If $a > 0$, since we know $a \mid a$ we have $|a| \mid a$. If $a < 0$, $|a| = -a$. In this case $a = (-a)(-1) = |a|(-1)$, so $|a|$ is a factor of $a$. So, in either case $|a|$ divides $a$, and in either case $|a| > 0$, since $a \ne 0$. Now suppose $d \mid a$ and $d$ is positive. Then $a = dk$ for some $k$, so $-a = d(-k)$. So $d \mid |a|$. So by Theorem 3.1 (10) we have $d \le |a|$.

The following lemma shows that in computing gcd's we may restrict ourselves to the case where both integers are positive.

Lemma 6.3. $\gcd(a, b) = \gcd(|a|, |b|)$.

Proof. If $a = 0$ and $b = 0$, we have $|a| = a$ and $|b| = b$. So $\gcd(a, b) = \gcd(|a|, |b|)$. Suppose one of $a$ or $b$ is not 0. Note that $d \mid a \iff d \mid |a|$. See Exercise 6.1. It follows that $C(a, b) = C(|a|, |b|)$. So the largest common divisor of $a$ and $b$ is also the largest common divisor of $|a|$ and $|b|$.

Exercise 6.1. Prove that $d \mid a \iff d \mid |a|$. [Hint: recall that $|a| = a$ if $a \ge 0$ and $|a| = -a$ if $a < 0$. So you need to consider two cases.]

Lemma 6.4. $\gcd(a, b) = \gcd(b, a)$.

Proof. Clearly $C(a, b) = C(b, a)$. It follows that the largest integer in $C(a, b)$ is the largest integer in $C(b, a)$, that is, $\gcd(a, b) = \gcd(b, a)$.

Lemma 6.5. If $a \ne 0$ and $b \ne 0$, then $\gcd(a, b)$ exists and satisfies

$0 < \gcd(a, b) \le \min\{|a|, |b|\}$.

Proof. Note that $\gcd(a, b)$ is the largest integer in the set $C(a, b)$ of common divisors of $a$ and $b$. Since $1 \mid a$ and $1 \mid b$ we know that $1 \in C(a, b)$. So the largest common divisor must be at least 1 and is therefore positive. On the other hand $d \in C(a, b) \implies d \mid |a|$ and $d \mid |b|$, so $d$ is no larger than $|a|$ and no larger than $|b|$. So $d$ is at most the smaller of $|a|$ and $|b|$. Hence $\gcd(a, b) \le \min\{|a|, |b|\}$.

Example 6.2. From the above lemmas we have

$\gcd(48, 732) = \gcd(-48, 732) = \gcd(-48, -732) = \gcd(48, -732)$.

We also know that

$0 < \gcd(48, 732) \le 48$.

Since if $d = \gcd(48, 732)$ then $d \mid 48$, to find $d$ we may check only which positive divisors of 48 also divide 732.

Exercise 6.2. Find $\gcd(48, 732)$ using Example 6.2.

Exercise 6.3. Find $\gcd(a, b)$ for each of the following values of $a$ and $b$:
(1) a = b, b = 14
(2) a = 1, b = [...]
(3) a = 0, b = 78
(4) a = 2, b = [...]
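Definition 6.1 can be applied literally for small numbers by listing the set $C(a, b)$ and taking its largest element. The Python sketch below does this; the names `common_divisors` and `gcd_by_definition` are hypothetical, chosen for this illustration. It relies on the comparison property (Theorem 3.1 (10)) to bound the search: a nonzero common divisor cannot exceed the larger of $|a|$ and $|b|$ in absolute value. This is far slower than the method of the next chapter and is meant only to make the definition concrete.

```python
def common_divisors(a, b):
    """Return the set C(a, b) of common divisors, for a and b not both zero."""
    if a == 0 and b == 0:
        raise ValueError("C(0, 0) is all of Z, so it cannot be listed")
    bound = max(abs(a), abs(b))
    return {
        e
        for e in range(-bound, bound + 1)
        if e != 0 and a % e == 0 and b % e == 0
    }


def gcd_by_definition(a, b):
    """gcd(a, b) as in Definition 6.1: the largest element of C(a, b)."""
    if a == 0 and b == 0:
        return 0  # defined separately, since C(0, 0) has no largest element
    return max(common_divisors(a, b))


if __name__ == "__main__":
    print(sorted(common_divisors(18, 30)))  # the set from Example 6.1
    print(gcd_by_definition(18, 30))
```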


Chapter 7

The Euclidean Algorithm

Unlike the Division Algorithm, the Euclidean Algorithm really is an algorithm. It provides a method to compute $\gcd(a, b)$. Since, as already noted, $\gcd(0, 0) = 0$, $\gcd(a, b) = \gcd(|a|, |b|)$, and $\gcd(a, b) = \gcd(b, a)$, it suffices to give a method to compute $\gcd(a, b)$ when $a \ge b \ge 0$.

Lemma 7.1. If $a > 0$, then $\gcd(a, 0) = a$.

Proof. Since every integer divides 0, $C(a, 0)$ is just the set of divisors of $a$. By Lemma 6.2 the largest divisor of $a$ is $|a|$. Since $a > 0$, $|a| = a$. This shows that $\gcd(a, 0) = a$.

Remark 7.1. So we are now reduced to the problem of finding $\gcd(a, b)$ when $a \ge b > 0$.

Exercise 7.1. Prove that if $a > 0$ then $\gcd(a, a) = a$.

Now having done Exercise 7.1 we only need to consider the case $a > b > 0$.

Lemma 7.2. Let $a > b > 0$. If $a = bq + r$, then

$\gcd(a, b) = \gcd(b, r)$.

Proof. It suffices to show that $C(a, b) = C(b, r)$, that is, the common divisors of $a$ and $b$ are the same as the common divisors of $b$ and $r$. To show this, first let $d \mid a$ and $d \mid b$. Note that $r = a - bq$, which is a linear combination of $a$ and $b$. So by Theorem 3.1(3) $d \mid r$. Thus $d \mid b$ and $d \mid r$. Next assume $d \mid b$ and $d \mid r$. Using Theorem 3.1(3) again and the fact that $a = bq + r$ is a linear combination of $b$ and $r$, we have $d \mid a$. So $d \mid a$ and $d \mid b$. We have thus shown that $C(a, b) = C(b, r)$. So $\gcd(a, b) = \gcd(b, r)$.

Remark 7.2. The Euclidean Algorithm is the process of using Lemmas 7.2 and 7.1 to compute $\gcd(a, b)$ when $a > b > 0$. Rather than give a precise statement of the algorithm I will give an example to show how it goes.

Example 7.1. Let's compute $\gcd(803, 154)$.

$\gcd(803, 154) = \gcd(154, 33)$ since $803 = 154 \cdot 5 + 33$
$\gcd(154, 33) = \gcd(33, 22)$ since $154 = 33 \cdot 4 + 22$
$\gcd(33, 22) = \gcd(22, 11)$ since $33 = 22 \cdot 1 + 11$
$\gcd(22, 11) = \gcd(11, 0)$ since $22 = 11 \cdot 2 + 0$
$\gcd(11, 0) = 11$.

Hence $\gcd(803, 154) = 11$.

Remark 7.3. Note that we have formed the gcd of 803 and 154 without factoring 803 and 154. This method is generally much faster than factoring and can find gcd's when factoring is not feasible.

Exercise 7.2. Let $a > b > 0$. Show that $\gcd(a, b) = \gcd(b, a \bmod b)$.

Remark 7.4. So if your calculator can compute $a \bmod b$ you may use it when executing the Euclidean Algorithm.

Exercise 7.3. Find $\gcd(a, b)$ using the Euclidean Algorithm for each of the values below:
(1) a = 37, b = 60
(2) a = 793, b = 3172
(3) a = 25174, b = [...]
(4) a = 377, b = 233
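The text deliberately gives an example rather than a precise statement of the algorithm. One possible way to write the process down is the Python sketch below, which repeats the replacement $\gcd(a, b) = \gcd(b, a \bmod b)$ from Exercise 7.2 until the second argument is 0 and then applies Lemma 7.1. The function name `euclid_gcd` and the recorded list of steps are assumptions of this illustration, not the author's formulation.

```python
def euclid_gcd(a, b):
    """Compute gcd(a, b) by repeated division, returning (gcd, steps).

    Each step is a tuple (a, b, q, r) with a = b*q + r and 0 <= r < b.
    """
    a, b = abs(a), abs(b)      # Lemma 6.3: gcd(a, b) = gcd(|a|, |b|)
    if a < b:
        a, b = b, a            # Lemma 6.4: gcd(a, b) = gcd(b, a)
    steps = []
    while b > 0:
        q, r = divmod(a, b)    # quotient and remainder of the Division Algorithm
        steps.append((a, b, q, r))
        a, b = b, r            # Lemma 7.2: gcd(a, b) = gcd(b, r)
    return a, steps            # Lemma 7.1: gcd(a, 0) = a


if __name__ == "__main__":
    d, steps = euclid_gcd(803, 154)
    for a, b, q, r in steps:
        print(f"{a} = {b}*{q} + {r}")
    print("gcd =", d)
```

Run on the numbers of Example 7.1, the printed lines reproduce the four divisions shown there.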

Chapter 8 Bezout's Lemma

Lemma 8.1 (Bezout's Lemma). For all integers $a$ and $b$ there exist integers $s$ and $t$ such that
\[ \gcd(a, b) = sa + tb. \]

Proof. If $a = b = 0$ then $s$ and $t$ may be anything since $\gcd(0, 0) = 0 = s \cdot 0 + t \cdot 0$. So we may assume that $a \ne 0$ or $b \ne 0$. Let
\[ J = \{na + mb : n, m \in \mathbb{Z}\}. \]
Note that $J$ contains $a$, $-a$, $b$ and $-b$ since
\[
\begin{aligned}
a &= 1 \cdot a + 0 \cdot b \\
-a &= (-1) \cdot a + 0 \cdot b \\
b &= 0 \cdot a + 1 \cdot b \\
-b &= 0 \cdot a + (-1) \cdot b.
\end{aligned}
\]
Since $a \ne 0$ or $b \ne 0$ one of the elements $a, -a, b, -b$ is positive. So we can say that $J$ contains some positive integers. Let $S$ denote the set of positive integers in $J$. That is,
\[ S = \{na + mb : na + mb > 0,\ n, m \in \mathbb{Z}\}. \]
By the Well-Ordering Property for $\mathbb{N}$, $S$ contains a smallest positive integer, call it $d$. Let's show that $d = \gcd(a, b)$. Note that since $d \in S$ we have

$d = sa + tb$ for some integers $s$ and $t$. Note also that $d > 0$. Let $e = \gcd(a, b)$. Then $e \mid a$ and $e \mid b$, so by Theorem 3.1 (3) $e \mid sa + tb$, that is $e \mid d$. Since $e$ and $d$ are positive, by Theorem 3.1 (10) we have $e \le d$. So if we can show that $d$ is a common divisor of $a$ and $b$ we will know that $e = d$.

To show $d \mid a$ using the Division Algorithm we write $a = dq + r$ where $0 \le r < d$. Now
\[ r = a - dq = a - (sa + tb)q = (1 - sq)a + (-tq)b. \]
Hence $r \in J$. If $r > 0$ then $r \in S$. But this cannot be since $r < d$ and $d$ is the smallest integer in $S$. So we must have $r = 0$. That is, $a = dq$. Hence $d \mid a$. By a similar argument we can show that $d \mid b$. Thus $d$ is indeed a common divisor of $a$ and $b$, so $d \le e$ because $e$ is the greatest common divisor. Together with $e \le d$ this gives $d = e = \gcd(a, b)$. As noted already $d = sa + tb$, so the lemma is proved.

Example 8.1. $1 = \gcd(2, 3)$ and we have $1 = (-1) \cdot 2 + 1 \cdot 3$. Also we have $1 = 2 \cdot 2 + (-1) \cdot 3$. So the numbers $s$ and $t$ in Bezout's Lemma are not uniquely determined. In fact, as we will see later, there are infinitely many choices for $s$ and $t$ for each pair $a, b$.

Remark 8.1. The above proof is an existence proof. It asserts the existence of $s$ and $t$, but gives no clue about how to actually calculate them. We will give an algorithm in the next chapter for finding $s$ and $t$.

Chapter 9 Blankinship's Method

In an article in the August-September 1963 issue of the American Mathematical Monthly, W. A. Blankinship¹ gave a simple method to produce the integers $s$ and $t$ in Bezout's Lemma and at the same time produce $\gcd(a, b)$: Given $a > b > 0$ we start with the array
\[ \begin{bmatrix} a & 1 & 0 \\ b & 0 & 1 \end{bmatrix} \]
Then we continue to add multiples of one row to another row, alternating choice of rows until we reach an array of the form
\[ \begin{bmatrix} 0 & x_1 & x_2 \\ d & y_1 & y_2 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} d & y_1 & y_2 \\ 0 & x_1 & x_2 \end{bmatrix} \]
Then $d = \gcd(a, b) = y_1 a + y_2 b$. [The goal is to get a 0 in the first column.]

Examples 9.1. First take $a = 35$, $b = 15$.
\[ \begin{bmatrix} 35 & 1 & 0 \\ 15 & 0 & 1 \end{bmatrix} \]
Note $35 = 15 \cdot 2 + 5$, hence $35 + (-2) \cdot 15 = 5$.

¹ Thanks to Chris Miller for bringing this method to my attention.

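So we multiply row 2 by $-2$ and add it to row 1, getting
\[ \begin{bmatrix} 5 & 1 & -2 \\ 15 & 0 & 1 \end{bmatrix} \]
Now $3 \cdot 5 = 15$ or $15 + (-3) \cdot 5 = 0$, so we multiply row 1 by $-3$ and add it to row 2, getting
\[ \begin{bmatrix} 5 & 1 & -2 \\ 0 & -3 & 7 \end{bmatrix} \]
Now we can say that $\gcd(35, 15) = 5$ and
\[ 5 = 1 \cdot 35 + (-2) \cdot 15. \]

Let's now consider a more complicated example: Take $a = 1876$, $b = 365$.
\[ \begin{bmatrix} 1876 & 1 & 0 \\ 365 & 0 & 1 \end{bmatrix} \]
Now $1876 = 365 \cdot 5 + 51$ so we add $-5$ times the second row to the first row, getting:
\[ \begin{bmatrix} 51 & 1 & -5 \\ 365 & 0 & 1 \end{bmatrix} \]
Now $365 = 51 \cdot 7 + 8$, so we add $-7$ times row 1 to row 2, getting:
\[ \begin{bmatrix} 51 & 1 & -5 \\ 8 & -7 & 36 \end{bmatrix} \]
Now $51 = 8 \cdot 6 + 3$, so we add $-6$ times row 2 to row 1, getting:
\[ \begin{bmatrix} 3 & 43 & -221 \\ 8 & -7 & 36 \end{bmatrix} \]
Now $8 = 3 \cdot 2 + 2$, so we add $-2$ times row 1 to row 2, getting:
\[ \begin{bmatrix} 3 & 43 & -221 \\ 2 & -93 & 478 \end{bmatrix} \]
Then $3 = 2 \cdot 1 + 1$, so we add $-1$ times row 2 to row 1, getting:
\[ \begin{bmatrix} 1 & 136 & -699 \\ 2 & -93 & 478 \end{bmatrix} \]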

Finally, $2 = 1 \cdot 2$ so if we add $-2$ times row 1 to row 2 we get:
\[ \begin{bmatrix} 1 & 136 & -699 \\ 0 & -365 & 1876 \end{bmatrix} \tag{$*$} \]
This tells us that $\gcd(1876, 365) = 1$ and
\[ 1 = 136 \cdot 1876 + (-699) \cdot 365. \tag{$**$} \]
Note that it was not necessary to compute the last two entries $-365$ and $1876$ in $(*)$. It is a good idea however to check that equation $(**)$ holds. In this case we have:
\[ 136 \cdot 1876 + (-699) \cdot 365 = 255136 - 255135 = 1. \]
So it is correct.

Why Blankinship's Method works: Note that just looking at what happens in the first column you see that we are just doing the Euclidean Algorithm, so when one element in column 1 is 0, the other is, in fact, the gcd. Note that at the start we have
\[ \begin{bmatrix} a & 1 & 0 \\ b & 0 & 1 \end{bmatrix} \]
and
\[
\begin{aligned}
a &= 1 \cdot a + 0 \cdot b \\
b &= 0 \cdot a + 1 \cdot b.
\end{aligned}
\]
One can show that at every intermediate step
\[ \begin{bmatrix} a_1 & x_1 & x_2 \\ b_1 & y_1 & y_2 \end{bmatrix} \]
we always have
\[
\begin{aligned}
a_1 &= x_1 a + x_2 b \\
b_1 &= y_1 a + y_2 b,
\end{aligned}
\]
and the result follows. I will omit the details.
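The row operations of Blankinship's method translate directly into code. The following is a minimal Python sketch, not the author's implementation; the name `blankinship` is hypothetical. Each row is kept as a list `[c, x, y]` satisfying the invariant $c = xa + yb$ described above, and instead of alternating which row is modified, the sketch swaps the rows after each step.

```python
def blankinship(a, b):
    """Return (d, s, t) with d = gcd(a, b) = s*a + t*b, assuming a > b > 0."""
    row1 = [a, 1, 0]   # a = 1*a + 0*b
    row2 = [b, 0, 1]   # b = 0*a + 1*b
    while row2[0] != 0:
        q = row1[0] // row2[0]
        # Add -q times row2 to row1; every entry keeps the invariant c = x*a + y*b.
        row1 = [u - q * v for u, v in zip(row1, row2)]
        # Swap so that row1 always holds the larger first entry.
        row1, row2 = row2, row1
    return tuple(row1)


print(blankinship(35, 15))      # (5, 1, -2)
print(blankinship(1876, 365))   # (1, 136, -699)
```

Both printed results agree with the two worked examples of this chapter.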

Exercise 9.1. Use Blankinship's method to compute the $s$ and $t$ in Bezout's Lemma for each of the following values of $a$ and $b$.
(1) a = 267, b = 112
(2) a = 216, b = 135
(3) a = 11312, b =

Exercise 9.2. Show that if $1 = as + bt$ then $\gcd(a, b) = 1$.

Exercise 9.3. Find integers $a, b, d, s, t$ such that all of the following hold:
(1) $a > 0$, $b > 0$,
(2) $d = sa + tb$, and
(3) $d \ne \gcd(a, b)$.

Note that $d$ in Exercise 9.3 cannot be 1 by Exercise 9.2.

Chapter 10 Prime Numbers

Definition 10.1. An integer $p$ is prime if $p \ge 2$ and the only positive divisors of $p$ are 1 and $p$. An integer $n$ is composite if $n \ge 2$ and $n$ is not prime.

Remark 10.1. The number 1 is neither prime nor composite.

Lemma 10.1. An integer $n \ge 2$ is composite if and only if there are integers $a$ and $b$ such that $n = ab$, $1 < a < n$, and $1 < b < n$.

Proof. Let $n \ge 2$. If $n$ is composite there is a positive integer $a$ such that $a \ne 1$, $a \ne n$ and $a \mid n$. This means that $n = ab$ for some $b$. Since $n$ and $a$ are positive so is $b$. Hence $1 \le a$ and $1 \le b$. By Theorem 3.1(10) $a \le n$ and $b \le n$. Since $a \ne 1$ and $a \ne n$ we have $1 < a < n$. If $b = 1$ then $a = n$, which is not possible, so $b \ne 1$. If $b = n$ then $a = 1$, which is also not possible. So $1 < b < n$. Conversely, if $n = ab$ with $1 < a < n$, then $a$ is a positive divisor of $n$ other than 1 and $n$, so $n$ is not prime and is therefore composite.

Lemma 10.2. If $n > 1$, there is a prime $p$ such that $p \mid n$.

Proof. Assume there is some integer $n > 1$ which has no prime divisor. Let $S$ denote the set of all such integers. By the Well-Ordering Property there is a smallest such integer, call it $m$. Now $m > 1$ and has no prime divisor. So $m$ cannot be prime. Hence $m$ is composite. Therefore by Lemma 10.1
\[ m = ab, \quad 1 < a < m, \quad 1 < b < m. \]
Since $1 < a < m$, $a$ is not in the set $S$. So $a$ must have a prime divisor, call it $p$. Then $p \mid a$ and $a \mid m$ so by Theorem 3.1, $p \mid m$. This contradicts the fact that $m$ has no prime divisor. So the set $S$ must be empty and this proves the lemma.

Theorem 10.1 (Euclid's Theorem). There are infinitely many prime numbers.

Proof. Assume, by way of contradiction, that there are only a finite number of prime numbers, say:
\[ p_1, p_2, \dots, p_n. \]
Define
\[ N = p_1 p_2 \cdots p_n + 1. \]
Since $p_1 \ge 2$, clearly $N \ge 3$. So by Lemma 10.2 $N$ has a prime divisor $p$. By assumption $p = p_i$ for some $i = 1, \dots, n$. Let $a = p_1 \cdots p_n$. Note that
\[ a = p_i (p_1 p_2 \cdots p_{i-1} p_{i+1} \cdots p_n), \]
so $p_i \mid a$. Now $N = a + 1$ and by assumption $p_i \mid a + 1$. So by Exercise 3.2 $p_i \mid (a + 1) - a$, that is $p_i \mid 1$. By Basic Axiom 3 in Chapter 1 this implies that $p_i = 1$. This contradicts the fact that primes are $> 1$. It follows that the assumption that there are only finitely many primes is not true.

Exercise 10.1. Use the idea of the above proof to show that if $q_1, q_2, \dots, q_n$ are primes there is a prime $q \notin \{q_1, \dots, q_n\}$. Hint: Take $N = q_1 \cdots q_n + 1$. By Lemma 10.2 there is a prime $q$ such that $q \mid N$. Prove that $q \notin \{q_1, \dots, q_n\}$.

Exercise 10.2. Let $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, ... and, in general, $p_i$ = the $i$-th prime. Prove or disprove that
\[ p_1 p_2 \cdots p_n + 1 \]
is prime for all $n \ge 1$. [Hint: If $n = 1$ we have $2 + 1 = 3$ is prime. If $n = 2$ we have $2 \cdot 3 + 1 = 7$ is prime. If $n = 3$ we have $2 \cdot 3 \cdot 5 + 1 = 31$ is prime. Try the next few values of $n$. You may want to use the next theorem to check primality.]

Theorem 10.2. If $n > 1$ is composite then $n$ has a prime divisor $p \le \sqrt{n}$.

Proof. Let $n > 1$ be composite. Then $n = ab$ where $1 < a < n$ and $1 < b < n$. We claim that $a \le \sqrt{n}$ or $b \le \sqrt{n}$. If not, then $a > \sqrt{n}$ and $b > \sqrt{n}$. Hence $n = ab > \sqrt{n} \sqrt{n} = n$. This implies $n > n$, a contradiction. So $a \le \sqrt{n}$ or $b \le \sqrt{n}$. Suppose $a \le \sqrt{n}$. Since $1 < a$, by Lemma 10.2 there is a prime $p$ such that $p \mid a$. Hence, by Theorem 3.1, since $a \mid n$ we have $p \mid n$. Also by Theorem 3.1, since $p \mid a$ we have $p \le a \le \sqrt{n}$.
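Theorem 10.2 suggests a simple primality test: if no integer $d$ with $2 \le d \le \sqrt{n}$ divides $n$, then $n$ has no prime divisor $\le \sqrt{n}$ and so cannot be composite. A minimal Python sketch follows; the name `is_prime` is illustrative only. As a simplifying assumption it tries every integer $d$ in that range rather than only the primes, which does some redundant work but gives the same answer.

```python
from math import isqrt


def is_prime(n):
    """Trial division up to floor(sqrt(n)), in the spirit of Theorem 10.2."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False   # d is a divisor with 1 < d < n, so n is composite
    return True


print(is_prime(97))    # True
print(is_prime(100))   # False
```

For $n = 97$ the loop only tries $d = 2, \dots, 9$, since $\lfloor \sqrt{97} \rfloor = 9$.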

Remark 10.2. We can use Theorem 10.2 to help decide whether or not an integer is prime: To check whether or not $n > 1$ is prime we need only try to divide it by all primes $p \le \sqrt{n}$. If none of these primes divides $n$ then $n$ must be prime.

Example 10.1. Consider the number 97. Note that $\sqrt{97} < \sqrt{100} = 10$. The primes $\le 10$ are 2, 3, 5, and 7. One easily checks that $97 \bmod 2 = 1$, $97 \bmod 3 = 1$, $97 \bmod 5 = 2$, $97 \bmod 7 = 6$. So none of the primes 2, 3, 5, 7 divide 97 and 97 is prime by Theorem 10.2.

Exercise 10.3. By using Theorem 10.2, as in the above example, determine the primality¹ of the following integers: 143, 221, 199, 223,

¹ This means determine whether or not each number is prime.

Definition 10.2. Let $x \in \mathbb{R}$, $x > 0$. $\pi(x)$ denotes the number of primes $p$ such that $p \le x$.

For example, since the only primes $p \le 10$ are 2, 3, 5, and 7 we have $\pi(10) = 4$.

Here is a table of values of $\pi(10^i)$ for $i = 2, \dots, 10$. I also include known approximations to $\pi(x)$. Note that the formulas for the approximations do not give integer values, but for the table I have rounded each to the nearest integer. The values in the table were computed using Maple.

[Table: columns $x$, $\pi(x)$, $\dfrac{x}{\ln(x)}$, $\dfrac{x}{\ln(x) - 1}$, $\displaystyle\int_2^x \frac{dt}{\ln(t)}$; the numerical entries are not recoverable.]

You may judge for yourself which approximations appear to be the best. This table has been continued up to $10^{21}$, but people are still working on finding

the value of $\pi(10^{22})$. Of course, the approximations are easy to compute with Maple but the exact value of $\pi(10^{22})$ is difficult to find.

The above approximations are based on the so-called Prime Number Theorem, first conjectured by Gauss in 1793 but not proved till over 100 years later by Hadamard and de la Vallée Poussin.

Theorem 10.3 (The Prime Number Theorem).
\[ \pi(x) \sim \frac{x}{\ln(x)} \tag{$*$} \]

Remark 10.3. $(*)$ means that
\[ \lim_{x \to \infty} \frac{\pi(x)}{x / \ln(x)} = 1. \]

Although there are infinitely many primes there are long stretches of consecutive integers containing no primes.

Theorem 10.4. For any positive integer $n$ there is an integer $a$ such that the $n$ consecutive integers
\[ a,\ a + 1,\ a + 2,\ \dots,\ a + (n - 1) \]
are all composite.

Proof. Given $n \ge 1$ let $a = (n + 1)! + 2$. We claim that all the numbers
\[ a + i, \quad 0 \le i \le n - 1, \]
are composite. Since $n + 1 \ge 2$, clearly $2 \mid (n + 1)!$ and $2 \mid 2$. Hence $2 \mid (n + 1)! + 2$. Since $(n + 1)! + 2 > 2$, $(n + 1)! + 2$ is composite. Consider
\[ a + i = (n + 1)! + i + 2 \]
where $0 \le i \le n - 1$, so $2 \le i + 2 \le n + 1$. Thus $i + 2 \mid (n + 1)!$ and $i + 2 \mid i + 2$. Therefore $i + 2 \mid a + i$. Now $a + i > i + 2 > 1$, so $a + i$ is composite.

Exercise 10.4. Use the Prime Number Theorem and a calculator to approximate the number of primes $\le 10^8$. Note $\ln(10^8) = 8 \ln(10)$.

Exercise 10.5. Find 10 consecutive composite numbers.
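Both the prime counting function and the construction in the proof of Theorem 10.4 are easy to experiment with. The sketch below is in Python rather than the Maple used in the text, and the names `prime_count` and `composite_run` are hypothetical. It counts primes with the sieve of Eratosthenes (a technique not discussed in the text, chosen here only for convenience), compares the count with the rounded approximation $x / \ln(x)$, and lists the run of composites starting at $(n + 1)! + 2$.

```python
from math import factorial, log


def prime_count(x):
    """pi(x): the number of primes p <= x, computed with a sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = [True] * (x + 1)
    sieve[0] = sieve[1] = False
    p = 2
    while p * p <= x:
        if sieve[p]:
            for multiple in range(p * p, x + 1, p):
                sieve[multiple] = False
        p += 1
    return sum(sieve)


def composite_run(n):
    """The n consecutive integers a, a+1, ..., a+(n-1) with a = (n+1)! + 2."""
    a = factorial(n + 1) + 2
    return [a + i for i in range(n)]


for x in (10, 100, 1000, 10000):
    # exact count next to the approximation x / ln(x), rounded to the nearest integer
    print(x, prime_count(x), round(x / log(x)))

print(composite_run(4))   # [122, 123, 124, 125]
```

In the run for $n = 4$, the entry $a + i$ is divisible by $i + 2$: $122 = 2 \cdot 61$, $123 = 3 \cdot 41$, $124 = 4 \cdot 31$, $125 = 5 \cdot 25$.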

Exercise 10.6. Prove that 2 is the only even prime number. (Joke: Hence it is said that 2 is the oddest prime.)

Exercise 10.7. Prove that if $a$ and $n$ are positive integers such that $n \ge 2$ and $a^n - 1$ is prime then $a$ must be 2. [Hint: By Exercise 2.4,
\[ 1 + x + x^2 + \cdots + x^{n-1} = \frac{x^n - 1}{x - 1} \]
if $x \ne 1$ and $n \ge 1$, that is,
\[ x^n - 1 = (x - 1)\left(1 + x + x^2 + \cdots + x^{n-1}\right).] \]

Exercise 10.8. (a) Is $2^n - 1$ always prime if $n \ge 2$? Explain. (b) Is $2^n - 1$ always prime if $n$ is prime? Explain.

Exercise 10.9. Show that if $p$ and $q$ are primes and $p \mid q$, then $p = q$.


Chapter 11 Unique Factorization

Our goal in this chapter is to prove the following fundamental theorem.

Theorem 11.1 (The Fundamental Theorem of Arithmetic). Every integer $n > 1$ can be written uniquely in the form
\[ n = p_1 p_2 \cdots p_s, \]
where $s$ is a positive integer and $p_1, p_2, \dots, p_s$ are primes satisfying
\[ p_1 \le p_2 \le \cdots \le p_s. \]

Remark 11.1. If $n = p_1 p_2 \cdots p_s$ where each $p_i$ is prime, we call this the prime factorization of $n$. Theorem 11.1 is sometimes stated as follows: Every integer $n > 1$ can be expressed as a product $n = p_1 p_2 \cdots p_s$, for some positive integer $s$, where each $p_i$ is prime and this factorization is unique except for the order of the primes $p_i$.

Note for example that
\[ 600 = 2 \cdot 2 \cdot 2 \cdot 3 \cdot 5 \cdot 5 = 2 \cdot 3 \cdot 2 \cdot 5 \cdot 2 \cdot 5 = 3 \cdot 5 \cdot 2 \cdot 2 \cdot 2 \cdot 5 = \text{etc.} \]
Perhaps the nicest way to write the prime factorization of 600 is
\[ 600 = 2^3 \cdot 3 \cdot 5^2. \]

In general it is clear that $n > 1$ can be written uniquely in the form
\[ n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}, \quad \text{some } s \ge 1, \tag{$*$} \]
where $p_1 < p_2 < \cdots < p_s$ and $a_i \ge 1$ for all $i$. Sometimes $(*)$ is written
\[ n = \prod_{i=1}^{s} p_i^{a_i}. \]
Here $\prod$ stands for product, just as $\sum$ stands for sum.

To prove Theorem 11.1 we need to first establish a few lemmas.

Lemma 11.1. If $a \mid bc$ and $\gcd(a, b) = 1$ then $a \mid c$.

Proof. Since $\gcd(a, b) = 1$, by Bezout's Lemma there are $s, t$ such that
\[ 1 = as + bt. \]
If we multiply both sides by $c$ we get
\[ c = cas + cbt = a(cs) + (bc)t. \]
By assumption $a \mid bc$. Clearly $a \mid a(cs)$ so, by Theorem 3.1, $a$ divides the linear combination $a(cs) + (bc)t = c$.

Definition 11.1. We say that $a$ and $b$ are relatively prime if $\gcd(a, b) = 1$.

So we may restate Lemma 11.1 as follows: If $a \mid bc$ and $a$ is relatively prime to $b$ then $a \mid c$.

Example 11.1. It is not true generally that when $a \mid bc$ then $a \mid b$ or $a \mid c$. For example, $6 \mid 4 \cdot 9$, but $6 \nmid 4$ and $6 \nmid 9$. Note that Lemma 11.1 doesn't apply here since $\gcd(6, 4) \ne 1$ and $\gcd(6, 9) \ne 1$.

Lemma 11.2 (Euclid's Lemma). If $p$ is a prime and $p \mid ab$, then $p \mid a$ or $p \mid b$.

Proof. Assume that $p \mid ab$. If $p \mid a$ we are done. Suppose $p \nmid a$. Let $d = \gcd(p, a)$. Note that $d > 0$ and $d \mid p$ and $d \mid a$. Since $d \mid p$ we have $d = 1$ or $d = p$. If $d \ne 1$ then $d = p$. But this says that $p \mid a$, which we assumed was not true. So we must have $d = 1$. Hence $\gcd(p, a) = 1$ and $p \mid ab$. So by Lemma 11.1, $p \mid b$.

Lemma 11.3. Let $p$ be prime. Let $a_1, a_2, \dots, a_n$, $n \ge 1$, be integers. If $p \mid a_1 a_2 \cdots a_n$, then $p \mid a_i$ for at least one $i \in \{1, 2, \dots, n\}$.

Proof. We use induction on $n$. The result is clear if $n = 1$. Assume that the lemma holds for $n$ such that $1 \le n \le k$. Let's show it holds for $n = k + 1$. So assume $p$ is a prime and $p \mid a_1 a_2 \cdots a_k a_{k+1}$. Let $a = a_1 a_2 \cdots a_k$ and $b = a_{k+1}$. Then $p \mid a$ or $p \mid b$ by Lemma 11.2. If $p \mid a = a_1 \cdots a_k$, by the induction hypothesis, $p \mid a_i$ for some $i \in \{1, \dots, k\}$. If $p \mid b = a_{k+1}$ then $p \mid a_{k+1}$. So we can say $p \mid a_i$ for some $i \in \{1, 2, \dots, k + 1\}$. So the lemma holds for $n = k + 1$. Hence by PMI it holds for all $n \ge 1$.

Lemma 11.4 (Existence Part of Theorem 11.1). If $n > 1$ then there exist primes $p_1, \dots, p_s$ for some $s \ge 1$ such that
\[ n = p_1 p_2 \cdots p_s \]
and $p_1 \le p_2 \le \cdots \le p_s$.

Proof. Proof by induction on $n$, with starting value $n = 2$: If $n = 2$ then since 2 is prime we can take $p_1 = 2$, $s = 1$. Assume the lemma holds for $n$ such that $2 \le n \le k$. Let's show it holds for $n = k + 1$. If $k + 1$ is prime we can take $s = 1$ and $p_1 = k + 1$ and we are done. If $k + 1$ is composite we can write $k + 1 = ab$ where $1 < a < k + 1$ and $1 < b < k + 1$. By the induction hypothesis there are primes $p_1, \dots, p_u$ and $q_1, \dots, q_v$ such that
\[ a = p_1 \cdots p_u \quad \text{and} \quad b = q_1 \cdots q_v. \]
This gives us
\[ k + 1 = ab = p_1 p_2 \cdots p_u q_1 q_2 \cdots q_v, \]
that is, $k + 1$ is a product of primes. Let $s = u + v$. By reordering and relabeling where necessary we have
\[ k + 1 = p_1 p_2 \cdots p_s \]
where $p_1 \le p_2 \le \cdots \le p_s$. So the lemma holds for $n = k + 1$. Hence by PMI, it holds for all $n > 1$.

Lemma 11.5 (Uniqueness Part of Theorem 11.1). Let
\[ n = p_1 p_2 \cdots p_s \quad \text{for some } s \ge 1, \]

and
\[ n = q_1 q_2 \cdots q_t \quad \text{for some } t \ge 1, \]
where $p_1, \dots, p_s, q_1, \dots, q_t$ are primes satisfying
\[ p_1 \le p_2 \le \cdots \le p_s \quad \text{and} \quad q_1 \le q_2 \le \cdots \le q_t. \]
Then $t = s$ and $p_i = q_i$ for $i = 1, 2, \dots, t$.

Proof. Our proof is by induction on $s$. Suppose $s = 1$. Then $n = p_1$ is prime and we have
\[ p_1 = n = q_1 q_2 \cdots q_t. \]
If $t > 1$, this contradicts the fact that $p_1$ is prime. So $t = 1$ and we have $p_1 = q_1$, as desired. Now assume the result holds for all $s$ such that $1 \le s \le k$. We want to show that it holds for $s = k + 1$. So assume
\[ n = p_1 p_2 \cdots p_k p_{k+1} \quad \text{and} \quad n = q_1 q_2 \cdots q_t \]
where $p_1 \le p_2 \le \cdots \le p_{k+1}$ and $q_1 \le q_2 \le \cdots \le q_t$. Clearly $p_{k+1} \mid n$ so $p_{k+1} \mid q_1 \cdots q_t$. So by Lemma 11.3 $p_{k+1} \mid q_i$ for some $i \in \{1, 2, \dots, t\}$. It follows from Exercise 10.9 that $p_{k+1} = q_i$. Hence $p_{k+1} = q_i \le q_t$. By a similar argument $q_t \mid n$ so $q_t \mid p_1 \cdots p_{k+1}$ and $q_t = p_j$ for some $j$. Hence $q_t = p_j \le p_{k+1}$. This shows that
\[ p_{k+1} \le q_t \le p_{k+1}, \]
so $p_{k+1} = q_t$. Note that
\[ p_1 p_2 \cdots p_k p_{k+1} = q_1 q_2 \cdots q_{t-1} q_t. \]
Since $p_{k+1} = q_t$ we can cancel this prime from both sides and we have
\[ p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_{t-1}. \]
Now by the induction hypothesis $k = t - 1$ and $p_i = q_i$ for $i = 1, \dots, t - 1$. Thus we have $k + 1 = t$ and $p_i = q_i$ for $i = 1, 2, \dots, t$. So the lemma holds for $s = k + 1$ and by the PMI, it holds for all $s \ge 1$.

Now the proof of Theorem 11.1 follows immediately from Lemmas 11.4 and 11.5.

Remark 11.2. If $a$ and $b$ are positive integers we can find primes $p_1, \dots, p_k$ and integers $a_1, \dots, a_k, b_1, \dots, b_k$, each $\ge 0$, such that
\[
\begin{cases}
a = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} \\
b = p_1^{b_1} p_2^{b_2} \cdots p_k^{b_k}
\end{cases}
\tag{$**$}
\]
For example, if $a = 600$ and $b = 252$ we have
\[
\begin{aligned}
600 &= 2^3 \cdot 3^1 \cdot 5^2 \cdot 7^0 \\
252 &= 2^2 \cdot 3^2 \cdot 5^0 \cdot 7^1.
\end{aligned}
\]
It follows that
\[ \gcd(600, 252) = 2^2 \cdot 3^1 \cdot 5^0 \cdot 7^0 = 12. \]
In general, if $a$ and $b$ are given by $(**)$ we have
\[ \gcd(a, b) = p_1^{\min(a_1, b_1)} p_2^{\min(a_2, b_2)} \cdots p_k^{\min(a_k, b_k)}. \]
This gives one way to calculate the gcd provided you can factor both numbers. But generally speaking factorization is very difficult! On the other hand, the Euclidean algorithm is relatively fast.

Exercise 11.1. Find the prime factorizations of 1147 and 1716 by trying all primes $p \le \sqrt{1147}$ ($p \le \sqrt{1716}$) in succession.
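The min-of-exponents description of the gcd in Remark 11.2 can be checked on the pair 600 and 252 with a short Python sketch. The names `factorize` and `gcd_from_factorizations` are hypothetical, and the factorization is done by plain trial division, which is only practical for small inputs.

```python
def factorize(n):
    """Prime factorization of n >= 1 by trial division, as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever remains is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_from_factorizations(a, b):
    """gcd(a, b) as the product of p**min(a_i, b_i) over the primes dividing a."""
    fa, fb = factorize(a), factorize(b)
    g = 1
    for p, exponent in fa.items():
        g *= p ** min(exponent, fb.get(p, 0))   # a missing prime has exponent 0
    return g


print(factorize(600))                      # {2: 3, 3: 1, 5: 2}
print(factorize(252))                      # {2: 2, 3: 2, 7: 1}
print(gcd_from_factorizations(600, 252))   # 12
```

Primes that divide only one of the two numbers contribute exponent 0 to the minimum, so iterating over the primes of $a$ alone is enough.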


Chapter 12 Fermat Primes and Mersenne Primes

Finding large primes and proving that they are indeed prime is not easy. One way to find large primes is to look at numbers that have some special form, for example, numbers of the form $a^n + 1$ or $a^n - 1$. It is easy to rule out some values of $a$ and $n$. For example we have:

Theorem 12.1. Let $a > 1$ and $n > 1$. Then
(1) $a^n - 1$ is prime $\Rightarrow$ $a = 2$ and $n$ is prime
(2) $a^n + 1$ is prime $\Rightarrow$ $a$ is even and $n = 2^k$ for some $k \ge 1$.

Proof of (1). We know from Exercise 2.5, page 6, that
\[ a^n - 1 = (a - 1)(a^{n-1} + \cdots + a + 1). \tag{$*$} \]
Note that if $a > 2$ and $n > 1$ then $a - 1 > 1$ and $a^{n-1} + \cdots + a + 1 \ge a + 1 > 3$, so both factors in $(*)$ are $> 1$ and $a^n - 1$ is not prime. Hence if $a^n - 1$ is prime we must have $a = 2$. Now suppose $2^n - 1$ is prime. We claim that $n$ is prime. If not, $n = st$ where $1 < s < n$, $1 < t < n$. Then
\[ 2^n - 1 = 2^{st} - 1 = (2^s)^t - 1 \]
is prime. But we just showed that if $a^n - 1$ is prime we must have $a = 2$. So we must have $2^s = 2$. Hence $s = 1$, $t = n$, contradicting $1 < s$. So $n$ is not composite. Hence $n$ must be prime. This proves (1).

Proof of (2). From $(*)$ in the proof of (1) we have
\[ a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + a + 1). \tag{$*$} \]
Replace $a$ by $-a$ in $(*)$ and we get
\[ (-a)^n - 1 = (-a - 1)\left((-a)^{n-1} + (-a)^{n-2} + \cdots + (-a) + 1\right). \tag{$**$} \]
If $n$ is odd, then $n - 1$ is even, $n - 2$ is odd, ..., etc., and we have $(-a)^n = -a^n$, $(-a)^{n-1} = a^{n-1}$, $(-a)^{n-2} = -a^{n-2}$, ..., etc. So $(**)$ yields
\[ -(a^n + 1) = -(a + 1)\left(a^{n-1} - a^{n-2} + \cdots - a + 1\right). \]
Multiplying both sides by $-1$ we get
\[ a^n + 1 = (a + 1)\left(a^{n-1} - a^{n-2} + \cdots - a + 1\right) \]
when $n$ is odd. If $n \ge 2$ we have $1 < a + 1 < a^n + 1$. This shows that if $n$ is odd and $a > 1$, $a^n + 1$ is not prime.

Suppose $n = 2^s t$ where $t$ is odd. Then if $a^n + 1$ is prime we have $(a^{2^s})^t + 1$ is prime. But by what we just showed this cannot be prime if $t$ is odd and $t \ge 2$. So we must have $t = 1$ and $n = 2^s$. Also $a^n + 1$ prime implies that $a$ is even, since if $a$ is odd so is $a^n$. Then $a^n + 1$ would be even. The only even prime is 2. But since we assume $a > 1$ we have $a \ge 2$, so $a^n + 1 \ge 3$ and an even $a^n + 1$ cannot be prime.

Definition 12.1. A number of the form $M_n = 2^n - 1$, $n \ge 2$, is said to be a Mersenne number. If $M_n$ is prime, it is called a Mersenne prime. A number of the form $F_n = 2^{(2^n)} + 1$, $n \ge 0$, is called a Fermat number. If $F_n$ is prime, it is called a Fermat prime.

One may prove that $F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$ and $F_4 = 65537$ are primes. As $n$ increases the numbers $F_n = 2^{(2^n)} + 1$ increase in size very rapidly, and are not easy to check for primality. It is known that $F_n$ is composite for many values of $n \ge 5$. This includes all $n$ such that $5 \le n \le 30$ and a large number of other values of $n$ including (the largest one I know of). It is now conjectured that $F_n$ is composite for $n \ge 5$. So Fermat's original thought that $F_n$ is prime for $n \ge 0$ seems to be pretty far from reality.

Exercise 12.1. Use Maple to factor $F_5$. [Go to any campus computer lab. Click or double-click on the Maple icon or ask the lab assistant where it is located. When the window comes up, type at the prompt > the following:

    > ifactor(2^32 + 1);

Hit the return key and you will get the answer.]

$M_3 = 2^3 - 1 = 7$ is a Mersenne prime and $M_4 = 2^4 - 1 = 15$ is a Mersenne number which is not a prime. At first it was thought that $M_p = 2^p - 1$ is prime whenever $p$ is prime. But $M_{11} = 2^{11} - 1 = 2047 = 23 \cdot 89$ is not prime.

Over the years people have continued to work on the problem of determining for which primes $p$, $M_p = 2^p - 1$ is prime. To date 39 Mersenne primes have been found. It is known that $2^p - 1$ is prime if $p$ is one of the following 39 primes: 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689, 9941, 11213, 19937, 21701, 23209, 44497, 86243, 110503, 132049, 216091, 756839, 859433, 1257787, 1398269, 2976221, 3021377, 6972593, 13466917.

The largest one, $M_{13466917} = 2^{13466917} - 1$, was found on November 14, 2001. The decimal representation of this number has 4,053,946 digits. It was found by the team of Michael Cameron, George Woltman, Scott Kurowski et al, as a part of the Great Internet Mersenne Prime Search (GIMPS); see Chris Caldwell's page for more about this. This prime could be the 39th Mersenne prime (in order of size), but we will only know this for sure when GIMPS completes testing all exponents below this one. You can find the link to Chris Caldwell's page on the class syllabus on my homepage.

Later we show the connection between Mersenne primes and perfect numbers.

Lemma 12.1. If $M_n$ is prime, then $n$ is prime.

Proof. This is immediate from Theorem 12.1 (1).

The most basic question about Mersenne primes is: Are there infinitely many Mersenne primes?

Exercise 12.2. Determine which Mersenne numbers $M_n$ are prime when $2 \le n \le 12$. You may use Maple for this exercise. The Maple command for determining whether or not an integer n is prime is

    isprime(n);

The following primality test for Mersenne numbers makes it easier to check whether or not $M_p$ is prime when $p$ is a large prime.

Theorem 12.2 (The Lucas-Lehmer Mersenne Prime Test). Let $p$ be an odd prime. Define the sequence
\[ r_1, r_2, r_3, \dots, r_{p-1} \]
by the rules
\[ r_1 = 4 \]
and for $k \ge 2$,
\[ r_k = (r_{k-1}^2 - 2) \bmod M_p. \]
Then $M_p$ is prime if and only if $r_{p-1} = 0$.

[The proof of this is not easy. One place to find a proof is the book A Selection of Problems in the Theory of Numbers by W. Sierpinski, Pergamon Press, 1964.]

Example 12.1. Let $p = 5$. Then $M_p = M_5 = 31$.
\[
\begin{aligned}
r_1 &= 4 \\
r_2 &= (4^2 - 2) \bmod 31 = 14 \bmod 31 = 14 \\
r_3 &= (14^2 - 2) \bmod 31 = 194 \bmod 31 = 8 \\
r_4 &= (8^2 - 2) \bmod 31 = 62 \bmod 31 = 0.
\end{aligned}
\]
Hence by the Lucas-Lehmer test, $M_5 = 31$ is prime.

Exercise 12.3. Show using the Lucas-Lehmer test that $M_7 = 127$ is prime.

Remark 12.1. Note that the Lucas-Lehmer test for $M_p = 2^p - 1$ takes only $p - 1$ steps. On the other hand, if one attempts to prove $M_p$ prime by testing all primes $\le \sqrt{M_p}$ one must consider about $2^{p/2}$ steps. This is MUCH larger than $p$ in general.
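The recurrence in Theorem 12.2 is short enough to run directly. Below is a minimal Python sketch (the function name `lucas_lehmer` is illustrative, not from the text). It assumes, as the theorem does, that $p$ is an odd prime; it does not check this.

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test: True iff M_p = 2**p - 1 is prime, for an odd prime p."""
    m = 2 ** p - 1
    r = 4                       # r_1 = 4
    for _ in range(p - 2):      # computes r_2, r_3, ..., r_{p-1}
        r = (r * r - 2) % m
    return r == 0


print(lucas_lehmer(5))    # True:  M_5 = 31 is prime
print(lucas_lehmer(11))   # False: M_11 = 2047 = 23 * 89
```

For $p = 5$ the loop passes through the values 14, 8, 0, matching Example 12.1.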

Chapter 13 The Functions σ and τ

Definition 13.1. For $n > 0$ define:
$\tau(n)$ = the number of positive divisors of $n$,
$\sigma(n)$ = the sum of the positive divisors of $n$.

Example 13.1. $12 = 3 \cdot 2^2$ has positive divisors 1, 2, 3, 4, 6, 12. Hence
\[ \tau(12) = 6 \]
and
\[ \sigma(12) = 1 + 2 + 3 + 4 + 6 + 12 = 28. \]

Definition 13.2. A positive divisor $d$ of $n$ is said to be a proper divisor of $n$ if $d < n$. We denote the sum of all proper divisors of $n$ by $\sigma^*(n)$.

Note that if $n \ge 2$ then
\[ \sigma^*(n) = \sigma(n) - n. \]

Example 13.2. $\sigma^*(12) = 16$.

Definition 13.3. $n > 1$ is perfect if $\sigma^*(n) = n$.

Example 13.3. The proper divisors of 6 are 1, 2 and 3. So $\sigma^*(6) = 6$. Therefore 6 is perfect.

Exercise 13.1. Prove that 28 is perfect.

The next theorem shows a simple way to compute $\sigma(n)$ and $\tau(n)$ from the prime factorization of $n$.

Theorem 13.1. Let
\[ n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}, \quad r \ge 1, \]
where $p_1 < p_2 < \cdots < p_r$ are primes and $e_i \ge 0$ for each $i \in \{1, 2, \dots, r\}$. Then
\[ \tau(n) = (e_1 + 1)(e_2 + 1) \cdots (e_r + 1) \tag{1} \]
\[ \sigma(n) = \left(\frac{p_1^{e_1 + 1} - 1}{p_1 - 1}\right) \left(\frac{p_2^{e_2 + 1} - 1}{p_2 - 1}\right) \cdots \left(\frac{p_r^{e_r + 1} - 1}{p_r - 1}\right). \tag{2} \]

Before proving this let's look at an example. Take $n = 72 = 8 \cdot 9 = 2^3 \cdot 3^2$. The theorem says
\[ \tau(72) = (3 + 1)(2 + 1) = 12 \]
\[ \sigma(72) = \left(\frac{2^4 - 1}{2 - 1}\right) \left(\frac{3^3 - 1}{3 - 1}\right) = 15 \cdot 13 = 195. \]

[Proof of Theorem 13.1 (1)] From the Fundamental Theorem of Arithmetic every positive factor $d$ of $n$ will have its prime factors coming from those of $n$. Hence $d \mid n$ iff $d = p_1^{f_1} p_2^{f_2} \cdots p_r^{f_r}$ where for each $i$: $0 \le f_i \le e_i$. That is, for each $f_i$ we can choose a value in the set of $e_i + 1$ numbers $\{0, 1, 2, \dots, e_i\}$. So, in all, there are $(e_1 + 1)(e_2 + 1) \cdots (e_r + 1)$ choices for the exponents $f_1, f_2, \dots, f_r$. So (1) holds.

[Proof of (2)] We first establish two lemmas.

Lemma 13.1. Let $n = ab$ where $a > 0$, $b > 0$ and $\gcd(a, b) = 1$. Then $\sigma(n) = \sigma(a)\sigma(b)$.

Proof. Since $a$ and $b$ have only 1 as a common factor, using the Fundamental Theorem of Arithmetic it is easy to see that $d \mid ab \Leftrightarrow d = d_1 d_2$ where $d_1 \mid a$

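and $d_2 \mid b$. That is, the divisors of $ab$ are products of the divisors of $a$ and the divisors of $b$. Let
\[ 1, a_1, \dots, a_s \]
denote the divisors of $a$ and let
\[ 1, b_1, \dots, b_t \]
denote the divisors of $b$. Then
\[
\begin{aligned}
\sigma(a) &= 1 + a_1 + a_2 + \cdots + a_s, \\
\sigma(b) &= 1 + b_1 + b_2 + \cdots + b_t.
\end{aligned}
\]
The divisors of $n = ab$ can be listed as follows:
\[
\begin{array}{lllll}
1, & b_1, & b_2, & \dots, & b_t, \\
a_1 \cdot 1, & a_1 b_1, & a_1 b_2, & \dots, & a_1 b_t, \\
a_2 \cdot 1, & a_2 b_1, & a_2 b_2, & \dots, & a_2 b_t, \\
\vdots \\
a_s \cdot 1, & a_s b_1, & a_s b_2, & \dots, & a_s b_t.
\end{array}
\]
It is important to note that since $\gcd(a, b) = 1$, $a_i b_j = a_k b_l$ implies that $a_i = a_k$ and $b_j = b_l$. That is, there are no repetitions in the above array. If we sum each row we get
\[
\begin{aligned}
1 + b_1 + \cdots + b_t &= \sigma(b) \\
a_1 \cdot 1 + a_1 b_1 + \cdots + a_1 b_t &= a_1 \sigma(b) \\
&\ \,\vdots \\
a_s \cdot 1 + a_s b_1 + \cdots + a_s b_t &= a_s \sigma(b).
\end{aligned}
\]
By adding these partial sums together we get
\[
\begin{aligned}
\sigma(n) &= \sigma(b) + a_1 \sigma(b) + a_2 \sigma(b) + \cdots + a_s \sigma(b) \\
&= (1 + a_1 + a_2 + \cdots + a_s)\sigma(b) \\
&= \sigma(a)\sigma(b).
\end{aligned}
\]
This proves the lemma.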

Lemma 13.2. If $p$ is a prime and $k \ge 0$ we have
\[ \sigma(p^k) = \frac{p^{k+1} - 1}{p - 1}. \]

Proof. Since $p$ is prime, the divisors of $p^k$ are $1, p, p^2, \dots, p^k$. Hence
\[ \sigma(p^k) = 1 + p + p^2 + \cdots + p^k = \frac{p^{k+1} - 1}{p - 1}, \]
as desired.

Proof of Theorem 13.1 (2) (continued). Let $n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$. Our proof is by induction on $r$. If $r = 1$, $n = p_1^{e_1}$ and the result follows from Lemma 13.2. Suppose the result is true when $1 \le r \le k$. Consider now the case $r = k + 1$. That is, let
\[ n = p_1^{e_1} \cdots p_k^{e_k} p_{k+1}^{e_{k+1}} \]
where the primes $p_1, \dots, p_k, p_{k+1}$ are distinct and $e_i \ge 0$. Let $a = p_1^{e_1} \cdots p_k^{e_k}$, $b = p_{k+1}^{e_{k+1}}$. Clearly $\gcd(a, b) = 1$. So by Lemma 13.1 we have $\sigma(n) = \sigma(a)\sigma(b)$. By the induction hypothesis and by Lemma 13.2
\[ \sigma(a) = \left(\frac{p_1^{e_1 + 1} - 1}{p_1 - 1}\right) \cdots \left(\frac{p_k^{e_k + 1} - 1}{p_k - 1}\right), \qquad \sigma(b) = \frac{p_{k+1}^{e_{k+1} + 1} - 1}{p_{k+1} - 1}, \]
and it follows that
\[ \sigma(n) = \left(\frac{p_1^{e_1 + 1} - 1}{p_1 - 1}\right) \cdots \left(\frac{p_k^{e_k + 1} - 1}{p_k - 1}\right) \left(\frac{p_{k+1}^{e_{k+1} + 1} - 1}{p_{k+1} - 1}\right). \]
So the result holds for $r = k + 1$. By PMI it holds for $r \ge 1$.

Exercise 13.2. Find $\sigma(n)$ and $\tau(n)$ for the following values of $n$.
(1) n = 900
(2) n = 496
(3) n = 32

(4) n = 128
(5) n = 1024

Exercise 13.3. Determine which (if any) of the numbers in Exercise 13.2 are perfect.

Exercise 13.4. Does Lemma 13.1 hold if we replace $\sigma$ by $\sigma^*$? [Hint: The answer is no, but find explicit numbers $a$ and $b$ such that the result fails yet $\gcd(a, b) = 1$.]
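The formulas of Theorem 13.1 can be turned into a small Python sketch for checking hand computations. The helper names `factorize`, `tau`, `sigma` and `sigma_star` are hypothetical, and the factorization uses simple trial division, so this is only meant for small $n$.

```python
def factorize(n):
    """Prime factorization of n >= 1 by trial division, as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau(n):
    """Number of positive divisors: product of (e_i + 1), Theorem 13.1 (1)."""
    result = 1
    for e in factorize(n).values():
        result *= e + 1
    return result


def sigma(n):
    """Sum of positive divisors: product of (p^(e+1) - 1)/(p - 1), Theorem 13.1 (2)."""
    result = 1
    for p, e in factorize(n).items():
        result *= (p ** (e + 1) - 1) // (p - 1)   # exact: this is 1 + p + ... + p^e
    return result


def sigma_star(n):
    """Sum of the proper divisors of n, for n >= 2."""
    return sigma(n) - n


print(tau(72), sigma(72))   # 12 195
print(sigma_star(6))        # 6, so 6 is perfect
```

The printed values agree with the worked example $n = 72$ and with Example 13.3.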


Chapter 14 Perfect Numbers and Mersenne Primes

If you do a search for perfect numbers up to 10,000 you will find only the following perfect numbers:
\[
\begin{aligned}
6 &= 2 \cdot 3, \\
28 &= 2^2 \cdot 7, \\
496 &= 2^4 \cdot 31, \\
8128 &= 2^6 \cdot 127.
\end{aligned}
\]
Note that $2^2 = 4$, $2^3 = 8$, $2^5 = 32$, $2^7 = 128$ so we have:
\[
\begin{aligned}
6 &= 2(2^2 - 1), \\
28 &= 2^2(2^3 - 1), \\
496 &= 2^4(2^5 - 1), \\
8128 &= 2^6(2^7 - 1).
\end{aligned}
\]
Note also that $2^2 - 1$, $2^3 - 1$, $2^5 - 1$, $2^7 - 1$ are Mersenne primes. One might conjecture that all perfect numbers follow this pattern. We discuss to what extent this is known to be true. We start with the following result.

Theorem 14.1. If $2^p - 1$ is a Mersenne prime, then $2^{p-1}(2^p - 1)$ is perfect.

Proof. Write $q = 2^p - 1$ and let $n = 2^{p-1} q$. Since $q$ is odd and prime, by Theorem 13.1 (2) we have
\[ \sigma(n) = \sigma(2^{p-1} q) = \left(\frac{2^p - 1}{2 - 1}\right) \left(\frac{q^2 - 1}{q - 1}\right) = (2^p - 1)(q + 1) = (2^p - 1) 2^p = 2n. \]
That is, $\sigma(n) = 2n$ and $n$ is perfect.
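Theorem 14.1 can be checked numerically for small exponents. The Python sketch below uses hypothetical helper names (`is_prime`, `proper_divisor_sum`, `perfect_from_mersenne`) and brute-force divisor sums, so it is only an illustration for small $p$, not an efficient search.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def proper_divisor_sum(n):
    """Sum of the proper divisors of n, computed by brute force."""
    return sum(d for d in range(1, n) if n % d == 0)


def perfect_from_mersenne(max_p):
    """Numbers 2^(p-1) * (2^p - 1) for 2 <= p <= max_p with 2^p - 1 prime."""
    return [2 ** (p - 1) * (2 ** p - 1)
            for p in range(2, max_p + 1)
            if is_prime(2 ** p - 1)]


candidates = perfect_from_mersenne(7)
print(candidates)   # [6, 28, 496, 8128]
print(all(proper_divisor_sum(n) == n for n in candidates))   # True
```

The list reproduces the four perfect numbers below 10,000 given at the start of this chapter.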

Now we show that all even perfect numbers have the conjectured form.

Theorem 14.2. If $n$ is even and perfect then there is a Mersenne prime $2^p - 1$ such that $n = 2^{p-1}(2^p - 1)$.

Proof. Let $n$ be even and perfect. Since $n$ is even, $n = 2m$ for some $m$. We take out as many powers of 2 as possible, obtaining
\[ n = 2^k q, \quad k \ge 1, \quad q \text{ odd}. \tag{$*$} \]
Since $n$ is perfect $\sigma^*(n) = n$, that is, $\sigma(n) = 2n$. Since $q$ is odd, $\gcd(2^k, q) = 1$, so by Lemmas 13.1 and 13.2:
\[ \sigma(n) = \sigma(2^k)\sigma(q) = (2^{k+1} - 1)\sigma(q). \]
So we have
\[ 2^{k+1} q = 2n = \sigma(n) = (2^{k+1} - 1)\sigma(q), \]
hence
\[ 2^{k+1} q = (2^{k+1} - 1)\sigma(q). \tag{$**$} \]
Now $\sigma^*(q) = \sigma(q) - q$, so
\[ \sigma(q) = \sigma^*(q) + q. \]
Putting this in $(**)$ we get
\[ 2^{k+1} q = (2^{k+1} - 1)(\sigma^*(q) + q) \]
or
\[ 2^{k+1} q = (2^{k+1} - 1)\sigma^*(q) + 2^{k+1} q - q, \]
which implies
\[ \sigma^*(q)(2^{k+1} - 1) = q. \tag{$***$} \]
In other words, $\sigma^*(q)$ is a divisor of $q$. Since $k \ge 1$ we have $2^{k+1} - 1 \ge 2^2 - 1 = 3$. So $\sigma^*(q)$ is a proper divisor of $q$. But $\sigma^*(q)$ is the sum of all proper divisors of $q$. This can only happen if $q$ has only one proper divisor. This means that $q$ must be prime and $\sigma^*(q) = 1$. Then $(***)$ shows that $q = 2^{k+1} - 1$. So $q$ must be a Mersenne prime and $k + 1 = p$ is prime. So $n = 2^{p-1}(2^p - 1)$, as desired.

Corollary 14.1. There is a 1–1 correspondence between even perfect numbers and Mersenne primes.

Three Open Questions:
1. Are there infinitely many even perfect numbers?
2. Are there infinitely many Mersenne primes?
3. Are there any odd perfect numbers?

So far no one has found a single odd perfect number. It is known that if an odd perfect number exists, it must be >

Remark 14.1. Some think that Euclid's knowledge that $2^{p-1}(2^p - 1)$ is perfect when $2^p - 1$ is prime may have been his motivation for defining prime numbers.


Chapter 15 Congruences

Definition 15.1. Let $m \ge 0$. We write $a \equiv b \pmod{m}$ if $m \mid a - b$, and we say that $a$ is congruent to $b$ modulo $m$. Here $m$ is said to be the modulus of the congruence. The notation $a \not\equiv b \pmod{m}$ means that it is false that $a \equiv b \pmod{m}$.

Examples 15.1.
(1) $25 \equiv 1 \pmod{4}$ since $4 \mid 24$
(2) $25 \not\equiv 2 \pmod{4}$ since $4 \nmid 23$
(3) $1 \equiv -3 \pmod{4}$ since $4 \mid 4$
(4) $a \equiv b \pmod{1}$ for all $a, b$ since 1 divides everything.
(5) $a \equiv b \pmod{0} \Leftrightarrow a = b$ for all $a, b$ since 0 divides only 0.

Remark 15.1. As you see, the cases $m = 1$ and $m = 0$ are not very interesting so mostly we will only be interested in the case $m \ge 2$.

WARNING. Do not confuse the use of mod in Definition 15.1 with that of Definition 5.3. We shall see that the two uses of mod are related, but have different meanings: Recall
\[ a \bmod b = r \]
where $r$ is the remainder given by the Division Algorithm when $a$ is divided by $b$

and by Definition 15.1
\[ a \equiv b \pmod{m} \text{ means } m \mid a - b. \]

Example 15.2. $25 \equiv 5 \pmod{4}$ is true, since $4 \mid 20$, but $25 = 5 \bmod 4$ is false, since the latter means $25 = 1$.

Remark 15.2. The mod in $a \equiv b \pmod{m}$ defines a binary relation, whereas the mod in $a \bmod b$ is a binary operation.

More terminology: Expressions such as
\[ x = 2, \qquad 4^2 = 16, \qquad x^2 + 2x = \sin(x) + 3 \]
are called equations. By analogy, expressions such as
\[ x \equiv 2 \pmod{16}, \qquad 25 \equiv 5 \pmod{5}, \qquad x^3 + 2x \equiv 6x \pmod{27} \]
are called congruences.

Before discussing further the analogy between equations and congruences, we show the relationship between the two different definitions of mod.

Theorem 15.1. For $m > 0$ and for all $a, b$:
\[ a \equiv b \pmod{m} \Leftrightarrow a \bmod m = b \bmod m. \]

Proof. "$\Rightarrow$" Assume that $a \equiv b \pmod{m}$. Let $r_1 = a \bmod m$ and $r_2 = b \bmod m$. We want to show that $r_1 = r_2$. By definition we have
(1) $m \mid a - b$,
(2) $a = mq_1 + r_1$, $0 \le r_1 < m$, and

(3) $b = mq_2 + r_2$, $0 \le r_2 < m$.

From (1) we obtain
\[ a - b = mt \]
for some $t$. Hence
\[ a = mt + b. \]
Using (2) and (3) we see that
\[ a = mq_1 + r_1 = m(q_2 + t) + r_2. \]
Since $0 \le r_1 < m$ and $0 \le r_2 < m$, by the uniqueness part of the Division Algorithm we obtain $r_1 = r_2$, as desired.

"$\Leftarrow$" Assume that $a \bmod m = b \bmod m$. We must show that $a \equiv b \pmod{m}$. Let $r = a \bmod m = b \bmod m$; then by definition we have
\[ a = mq_1 + r, \quad 0 \le r < m, \]
and
\[ b = mq_2 + r, \quad 0 \le r < m. \]
Hence
\[ a - b = m(q_1 - q_2). \]
This shows that $m \mid a - b$ and hence $a \equiv b \pmod{m}$, as desired.

Exercise 15.1. Prove that for all $m > 0$ and for all $a$:
\[ a \equiv a \bmod m \pmod{m}. \]

Exercise 15.2. Using Definition 15.1 show that the following congruences are true: ... (mod 3), ... (mod 3), $-1 \equiv 17 \pmod{3}$, $33 \equiv 0 \pmod{3}$.

Exercise 15.3. Use Theorem 15.1 to show that the congruences in Exercise 15.2 are valid.
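The two meanings of mod, and the equivalence between them in Theorem 15.1, can be explored numerically. The Python sketch below is an illustration only (the names `congruent` and `check_theorem_15_1` are hypothetical). It relies on the fact that for $m > 0$ Python's `%` operator returns the remainder $r$ with $0 \le r < m$, including for negative $a$, which matches the Division Algorithm.

```python
def congruent(a, b, m):
    """a ≡ b (mod m) in the sense of Definition 15.1, for m > 0: m divides a - b."""
    return (a - b) % m == 0


def check_theorem_15_1(bound, max_m):
    """Compare both notions of mod for all |a|, |b| <= bound and 1 <= m <= max_m."""
    for m in range(1, max_m + 1):
        for a in range(-bound, bound + 1):
            for b in range(-bound, bound + 1):
                if congruent(a, b, m) != (a % m == b % m):
                    return False
    return True


print(congruent(25, 1, 4))         # True:  4 divides 24
print(congruent(25, 2, 4))         # False: 4 does not divide 23
print(25 % 4)                      # 1, the binary operation "25 mod 4"
print(check_theorem_15_1(20, 7))   # True
```

A finite check like this is of course not a proof, but it is a quick way to catch a misreading of the definitions.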

Exercise 15.4. (a) Show that $a$ is even $\Leftrightarrow a \equiv 0 \pmod{2}$ and $a$ is odd $\Leftrightarrow a \equiv 1 \pmod{2}$. (b) Show that $a$ is even $\Leftrightarrow a \bmod 2 = 0$ and $a$ is odd $\Leftrightarrow a \bmod 2 = 1$.

Exercise 15.5. Show that if $m > 0$ and $a$ is any integer, there is a unique integer $r \in \{0, 1, 2, \dots, m - 1\}$ such that $a \equiv r \pmod{m}$.

Exercise 15.6. Find integers $a$ and $b$ such that $0 < a < 15$, 0 … 0, then $a \equiv b \pmod{m} \Rightarrow a \equiv b \pmod{d}$.

The next two theorems show that congruences and equations share many similar properties.

Theorem 15.2 (Congruence is an equivalence relation). For all $a$, $b$, $c$ and $m > 0$ we have
(1) $a \equiv a \pmod{m}$ [reflexivity]
(2) $a \equiv b \pmod{m} \Rightarrow b \equiv a \pmod{m}$ [symmetry]
(3) $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m} \Rightarrow a \equiv c \pmod{m}$ [transitivity]

Proof of (1). $a - a = 0 = 0 \cdot m$, so $m \mid a - a$. Hence $a \equiv a \pmod{m}$.

Proof of (2). If $a \equiv b \pmod{m}$, then $m \mid a - b$. Hence $a - b = mq$. Hence $b - a = m(-q)$, so $m \mid b - a$. Hence $b \equiv a \pmod{m}$.

Proof of (3). If $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$ then $m \mid a - b$ and $m \mid b - c$. By the linearity property $m \mid (a - b) + (b - c)$. That is, $m \mid a - c$. Hence $a \equiv c \pmod{m}$.

Recall that a polynomial is an expression of the form
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]
Here we will assume that the coefficients $a_n, \dots, a_0$ are integers and $x$ also represents an integer variable. Here, of course, $n \ge 0$ and $n$ is an integer.

Theorem 15.3. If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then
(1) $a \pm c \equiv b \pm d \pmod{m}$
(2) $ac \equiv bd \pmod{m}$
(3) $a^n \equiv b^n \pmod{m}$ for all $n \ge 1$
(4) $f(a) \equiv f(b) \pmod{m}$ for all polynomials $f(x)$ with integer coefficients.

Proof of (1). Since $a - c = a + (-c)$, it suffices to prove only the $+$ case. By assumption $m \mid a - b$ and $m \mid c - d$. By linearity, $m \mid (a - b) + (c - d)$, that is, $m \mid (a + c) - (b + d)$. Hence $a + c \equiv b + d \pmod{m}$.

Proof of (2). Since $m \mid a - b$ and $m \mid c - d$, by linearity $m \mid c(a - b) + b(c - d)$. Now $c(a - b) + b(c - d) = ca - bd$, hence $m \mid ca - bd$, and so $ca \equiv bd \pmod{m}$, as desired.

Proof of (3). We prove $a^n \equiv b^n \pmod{m}$ by induction on $n$. If $n = 1$, the result is true by our assumption that $a \equiv b \pmod{m}$. Assume it holds for $n = k$. Then we have $a^k \equiv b^k \pmod{m}$. This, together with $a \equiv b \pmod{m}$, using (2) above, gives $a a^k \equiv b b^k \pmod{m}$. Hence $a^{k+1} \equiv b^{k+1} \pmod{m}$. So it holds for all $n \ge 1$, by the PMI.

Proof of (4). Let $f(x) = c_n x^n + \cdots + c_1 x + c_0$. We prove by induction on $n$ that if $a \equiv b \pmod{m}$ then

$c_n a^n + \cdots + c_0 \equiv c_n b^n + \cdots + c_0 \pmod{m}.$

If $n = 0$ we have $c_0 \equiv c_0 \pmod{m}$ by Theorem 15.2 (1). Assume the result holds for $n = k$. Then we have

(*) $c_k a^k + \cdots + c_1 a + c_0 \equiv c_k b^k + \cdots + c_1 b + c_0 \pmod{m}.$

By part (3) above we have $a^{k+1} \equiv b^{k+1} \pmod{m}$. Since $c_{k+1} \equiv c_{k+1} \pmod{m}$, using (2) above we have

(**) $c_{k+1} a^{k+1} \equiv c_{k+1} b^{k+1} \pmod{m}.$

Now we can apply Theorem 15.3 (1) to (*) and (**) to obtain

$c_{k+1} a^{k+1} + c_k a^k + \cdots + c_0 \equiv c_{k+1} b^{k+1} + c_k b^k + \cdots + c_0 \pmod{m}.$

So by the PMI, the result holds for $n \ge 0$.

Before continuing to develop properties of congruences, we give the following example to show one way that congruences can be useful.

Example. (This example was taken from [1] Introduction to Analytic Number Theory, by Tom Apostol.) The first five Fermat numbers

$F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$, $F_4 = 65{,}537$

are primes. We show using congruences, without explicitly calculating $F_5$, that $F_5 = 2^{32} + 1$ is divisible by 641 and is therefore not prime:

$2^2 = 4$
$2^4 = (2^2)^2 = 4^2 = 16$
$2^8 = (2^4)^2 = 16^2 = 256$
$2^{16} = (2^8)^2 = 256^2 = 65{,}536$

So we have $65{,}536 \equiv 154 \pmod{641}$, that is, $2^{16} \equiv 154 \pmod{641}$. By Theorem 15.3 (3):

$(2^{16})^2 \equiv (154)^2 \pmod{641}.$

That is, $2^{32} \equiv 23{,}716 \pmod{641}$. Since $23{,}716 \equiv 640 \pmod{641}$ and $640 \equiv -1 \pmod{641}$, we have $2^{32} \equiv -1 \pmod{641}$ and hence $2^{32} + 1 \equiv 0 \pmod{641}$. So $641 \mid 2^{32} + 1$, as claimed. Clearly $2^{32} + 1 \ne 641$, so $2^{32} + 1$ is composite. Of course, if you already did Exercise 12.1 (p. 44) you will already know that

$2^{32} + 1 = 4{,}294{,}967{,}297 = (641)(6{,}700{,}417)$

and that 641 and 6,700,417 are indeed primes. Note that 641 is the 116th prime, so if you used trial division you would have had to divide by 115 primes before reaching one that divides $2^{32} + 1$, and that assumes that you have a list of the first 116 primes.

Theorem 15.4. If $m > 0$ and $a \equiv r \pmod{m}$ where $0 \le r < m$, then $a \bmod m = r$.

Exercise. Prove Theorem 15.4. [Hint: The Division Algorithm may be useful.]

Exercise. Find the value of each of the following (without using Maple!).
(1) $2^{32} \bmod 7$
(2) [...] $\bmod 7$
(3) $3^{35} \bmod 7$
[Hint: Use Theorem 15.4 and the ideas used in the example on page 62.]

Exercise. Let $\gcd(m_1, m_2) = 1$. Prove that
(15.1) $a \equiv b \pmod{m_1}$ and $a \equiv b \pmod{m_2}$
if and only if
(15.2) $a \equiv b \pmod{m_1 m_2}$.
[Hint. Use Lemma 11.1, page 38.]
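The repeated-squaring computation in the Fermat-number example can be reproduced mechanically. Below is a small Python sketch (the function name is a made-up helper, not from the text) that squares and reduces at every step, so no number larger than $640^2$ ever appears.

```python
def two_to_power_of_two_mod(k: int, m: int) -> int:
    """Return 2**(2**k) mod m by squaring k times, reducing at each step."""
    r = 2 % m
    for _ in range(k):
        r = (r * r) % m
    return r


residue = two_to_power_of_two_mod(5, 641)   # 2**32 mod 641
print(residue)                              # 640, i.e. -1 modulo 641
print((residue + 1) % 641 == 0)             # True: 641 divides 2**32 + 1
```

The intermediate residues are 4, 16, 256, 154 and 640, matching the hand computation above.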


Chapter 16
Divisibility Tests for 2, 3, 5, 9, 11

Recall from Definition 4.2 on page 14 that the decimal representation of the positive integer $a$ is given by

(1) $a = a_{n-1} a_{n-2} \cdots a_1 a_0$

when $a = a_{n-1} 10^{n-1} + a_{n-2} 10^{n-2} + \cdots + a_1 10 + a_0$ and $0 \le a_i \le 9$ for $i = 0, 1, \dots, n-1$.

Theorem 16.1. Let the decimal representation of $a$ be given by (1), then
(a) $a \bmod 2 = a_0 \bmod 2$,
(b) $a \bmod 5 = a_0 \bmod 5$,
(c) $a \bmod 3 = (a_{n-1} + \cdots + a_0) \bmod 3$,
(d) $a \bmod 9 = (a_{n-1} + \cdots + a_0) \bmod 9$,
(e) $a \bmod 11 = (a_0 - a_1 + a_2 - a_3 + \cdots) \bmod 11$.

Before proving this theorem, let's give some examples.

[...] $\bmod 2 = 7 \bmod 2 = 1$
[...] $\bmod 5 = 7 \bmod 5 = 2$
[...] $\bmod 3 = ($[...]$) \bmod 3 = 17 \bmod 3 = 8 \bmod 3 = 2$

[...] $\bmod 9 = ($[...]$) \bmod 9 = 17 \bmod 9 = 8 \bmod 9 = 8$
[...] $\bmod 11 = ($[...]$) \bmod 11 = 5 \bmod 11 = 5$.

Proof of Theorem 16.1. Consider the polynomial

$f(x) = a_{n-1} x^{n-1} + \cdots + a_1 x + a_0.$

Note that $10 \equiv 0 \pmod{2}$. So by Theorem 15.3 (4)

$a_{n-1} 10^{n-1} + \cdots + a_1 10 + a_0 \equiv a_{n-1} 0^{n-1} + \cdots + a_1 0 + a_0 \pmod{2}.$

That is, $a \equiv a_0 \pmod{2}$. This, together with Theorem 15.1, proves part (a). Since $10 \equiv 0 \pmod{5}$, the proof of part (b) is similar.

Note that $10 \equiv 1 \pmod{3}$, so applying Theorem 15.3 (4) again, we have

$a_{n-1} 10^{n-1} + \cdots + a_1 10 + a_0 \equiv a_{n-1} 1^{n-1} + \cdots + a_1 1 + a_0 \pmod{3}.$

That is, $a \equiv a_{n-1} + \cdots + a_1 + a_0 \pmod{3}$. This, using Theorem 15.1, proves part (c). Since $10 \equiv 1 \pmod{9}$, the proof of part (d) is similar.

Now $10 \equiv -1 \pmod{11}$, so

$a_{n-1} 10^{n-1} + \cdots + a_1 10 + a_0 \equiv a_{n-1} (-1)^{n-1} + \cdots + a_1 (-1) + a_0 \pmod{11}.$

That is, $a \equiv a_0 - a_1 + a_2 - \cdots \pmod{11}$, and by Theorem 15.1 we are done.

Remark. Note that $m \mid a \Leftrightarrow a \bmod m = 0$, so from Theorem 16.1 we obtain immediately the following corollary.

Corollary. Let $a$ be given by (1), p. 65. Then
(a) $2 \mid a \Leftrightarrow a_0 = 0, 2, 4, 6$ or $8$
(b) $5 \mid a \Leftrightarrow a_0 = 0$ or $5$
(c) $3 \mid a \Leftrightarrow 3 \mid a_0 + a_1 + \cdots + a_{n-1}$
(d) $9 \mid a \Leftrightarrow 9 \mid a_0 + a_1 + \cdots + a_{n-1}$
(e) $11 \mid a \Leftrightarrow 11 \mid a_0 - a_1 + a_2 - a_3 + \cdots$.

Note that in applying (c), (d) and (e) we can use the fact that $(a + m) \bmod m = a \bmod m$ to cast out 3's (for (c)) and 9's (for (d)). Here's an example of casting out 9's:

$1487 \bmod 9 = (1 + 4 + 8 + 7) \bmod 9 = (9 + 4 + 7) \bmod 9 = (4 + 7) \bmod 9 = (2 + 9) \bmod 9 = 2 \bmod 9 = 2.$

So $1487 \bmod 9 = 2$. Note that if $0 \le r < m$ then $r \bmod m = r$.

Exercise 16.1. Let $a =$ [...]. Find $a \bmod m$ for $m = 2, 3, 5, 9$ and $11$.
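The digit rules of Theorem 16.1 translate directly into code. The sketch below is an illustration in Python, with hypothetical helper names; it computes $a \bmod m$ from the decimal digits alone and can be compared against the built-in remainder.

```python
def digits(a: int) -> list[int]:
    """Decimal digits a_0, a_1, ... of a >= 0, least significant first."""
    return [int(ch) for ch in reversed(str(a))]


def mod_by_digits(a: int, m: int) -> int:
    """a mod m for m in {2, 3, 5, 9, 11}, using only the digits of a."""
    ds = digits(a)
    if m in (2, 5):                      # parts (a), (b): last digit only
        return ds[0] % m
    if m in (3, 9):                      # parts (c), (d): digit sum
        return sum(ds) % m
    if m == 11:                          # part (e): alternating sum a_0 - a_1 + ...
        return sum(d if i % 2 == 0 else -d for i, d in enumerate(ds)) % 11
    raise ValueError("only m in {2, 3, 5, 9, 11} is covered")


print(mod_by_digits(1487, 9))   # 2, as in the casting-out-9's example
```

Taking the digit sum once is enough here because the final `% m` does the remaining "casting out".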

Exercise 16.2. Let $a = a_n \cdots a_1 a_0$ be the decimal representation of $a$. Then prove
(a) $a \bmod 10 = a_0$.
(b) $a \bmod 100 = a_1 a_0$.
(c) $a \bmod 1000 = a_2 a_1 a_0$.

Exercise 16.3. Prove that if $b$ is a positive square, i.e., $b = a^2$, $a > 0$, then the least significant digit of $b$ is one of 0, 1, 4, 5, 6, 9. [Hint: $b \bmod 10$ is the least significant digit of $b$. Write $a = a_{n-1} \cdots a_0$. Then $a \equiv a_0 \pmod{10}$, so $a^2 \equiv a_0^2 \pmod{10}$. For each digit $a_0 \in \{0, 1, 2, \dots, 9\}$ find $a_0^2 \bmod 10$. Use Theorem 15.4, among other results.]

Exercise 16.4. Are any of the following numbers squares? Explain. 10, 11, 16, 19, 24, 25, 272, 2983, 11007, [...]

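Chapter 17
Divisibility Tests for 7 and 13

Theorem 17.1. Let $a = a_r a_{r-1} \cdots a_1 a_0$ be the decimal representation of $a$. Then
(a) $7 \mid a \Leftrightarrow 7 \mid a_r \cdots a_1 - 2a_0$.
(b) $13 \mid a \Leftrightarrow 13 \mid a_r \cdots a_1 - 9a_0$.
[Here $a_r \cdots a_1 = \frac{a - a_0}{10} = a_r 10^{r-1} + \cdots + a_2 10 + a_1$.]

Before proving this theorem we illustrate it with two examples. [...] since $7 \nmid 12$ we have [...]; since $6 \cdot 13 = 78$, we have [...]. So, by Theorem 17.1 (b), [...].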

Proof of 17.1 (a). Let $c = a_r \cdots a_1$. So we have $a = 10c + a_0$. Hence $-2a = -20c - 2a_0$. Now $1 \equiv -20 \pmod{7}$, so we have

$-2a \equiv c - 2a_0 \pmod{7}.$

It follows from Theorem 15.1 that

$-2a \bmod 7 = (c - 2a_0) \bmod 7.$

Hence, $7 \mid -2a \Leftrightarrow 7 \mid c - 2a_0$. Since $\gcd(7, 2) = 1$ we have $7 \mid -2a \Leftrightarrow 7 \mid a$. Hence $7 \mid a \Leftrightarrow 7 \mid c - 2a_0$, which is what we wanted to prove.

Proof of 17.1 (b). (This has a similar proof to that for 17.1 (a) and is left for the interested reader.)

Exercise 17.1. Use Theorem 17.1 (a) to determine which of the following are divisible by 7: (a) 6994 (b) 6993

Exercise 17.2. In the notation of Theorem 17.1, show that $a \bmod 7$ need not be equal to $(a_r \cdots a_1 - 2a_0) \bmod 7$.
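Theorem 17.1 can be applied repeatedly to shrink a number until divisibility is obvious. The following Python sketch is one way to do that under stated assumptions: the function name and the stopping threshold of 100 are choices made for this illustration, and the absolute value is taken at each step, which does not affect divisibility.

```python
def divisible_by(a: int, p: int) -> bool:
    """Test p | a for p in {7, 13} by repeatedly replacing a = 10c + a0
    with |c - k*a0|, where k = 2 for p = 7 and k = 9 for p = 13."""
    k = {7: 2, 13: 9}[p]
    a = abs(a)
    while a >= 100:                 # below 100 we just check directly
        c, a0 = divmod(a, 10)       # c = a_r ... a_1, a0 = last digit
        a = abs(c - k * a0)
    return a % p == 0


print(divisible_by(6993, 7), divisible_by(6994, 7))
```

As Exercise 17.2 points out, the reduction preserves divisibility but not the remainder, so the sketch returns only a yes/no answer.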

Chapter 18
More Properties of Congruences

Theorem 18.1. Let $m \ge 2$. If $a$ and $m$ are relatively prime, there exists a unique integer $a^*$ such that $aa^* \equiv 1 \pmod{m}$ and $0 < a^* < m$.

We call $a^*$ the inverse of $a$ modulo $m$. Note that we do not denote $a^*$ by $a^{-1}$ since this might cause some confusion. Of course, if $c \equiv a^* \pmod{m}$ then $ac \equiv 1 \pmod{m}$, so $a^*$ is not unique unless we specify that $0 < a^* < m$.

Proof. If $\gcd(a, m) = 1$, then by Bezout's Lemma there exist $s$ and $t$ such that

$as + mt = 1.$

Hence $as - 1 = m(-t)$, that is, $m \mid as - 1$ and so $as \equiv 1 \pmod{m}$. Let $a^* = s \bmod m$. Then $a^* \equiv s \pmod{m}$, so $aa^* \equiv 1 \pmod{m}$ and clearly $0 < a^* < m$.

To show uniqueness assume that $ac \equiv 1 \pmod{m}$ and $0 < c < m$. Then $ac \equiv aa^* \pmod{m}$. So if we multiply both sides of this congruence on the left by $c$ and use the fact that $ca \equiv 1 \pmod{m}$ we obtain $c \equiv a^* \pmod{m}$. It follows from Exercise 15.5 that $c = a^*$.

Remark. From the above proof we see that Blankinship's Method may be used to compute the inverse of $a$ when it exists, but for small $m$ we may often find $a^*$ by trial and error. For example, if $m = 15$ take $a = 2$. Then we can check each element $0, 1, 2, \dots, 14$:

$2 \cdot 0 \equiv 0 \pmod{15}$
$2 \cdot 1 \equiv 2 \pmod{15}$
$2 \cdot 2 \equiv 4 \pmod{15}$
$2 \cdot 3 \equiv 6 \pmod{15}$
$2 \cdot 4 \equiv 8 \pmod{15}$
$2 \cdot 5 \equiv 10 \pmod{15}$
$2 \cdot 6 \equiv 12 \pmod{15}$
$2 \cdot 7 \equiv 14 \pmod{15}$
$2 \cdot 8 = 16 \equiv 1 \pmod{15}$

So we can take $2^* = 8$.

Exercise 18.1. Show that the inverse of 2 modulo 7 is not the inverse of 2 modulo 15.

Theorem 18.2. Let $m > 0$. If $ab \equiv 1 \pmod{m}$ then both $a$ and $b$ are relatively prime to $m$.

Proof. If $ab \equiv 1 \pmod{m}$, then $m \mid ab - 1$. So $ab - 1 = mt$ for some $t$. Hence, $ab + m(-t) = 1$. By Exercise 9.2 on page 30, this implies that $\gcd(a, m) = 1$ and $\gcd(b, m) = 1$, as claimed.

Corollary. $a$ has an inverse modulo $m$ if and only if $a$ and $m$ are relatively prime.

Theorem 18.3 (Cancellation). Let $m > 0$ and assume that $\gcd(c, m) = 1$. Then

(*) $ca \equiv cb \pmod{m} \Rightarrow a \equiv b \pmod{m}.$

Proof. If $\gcd(c, m) = 1$, there is an integer $c^*$ such that $c^* c \equiv 1 \pmod{m}$. Now since $c^* \equiv c^* \pmod{m}$ and $ca \equiv cb \pmod{m}$, by Theorem 15.3, p. 61,

$c^* ca \equiv c^* cb \pmod{m}.$

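But $c^* c \equiv 1 \pmod{m}$, so $c^* ca \equiv a \pmod{m}$ and $c^* cb \equiv b \pmod{m}$. By reflexivity and transitivity this yields $a \equiv b \pmod{m}$.

Exercise 18.2. Find specific positive integers $a$, $b$, $c$ and $m$ such that $c \not\equiv 0 \pmod{m}$, $\gcd(c, m) > 0$, and $ca \equiv cb \pmod{m}$, but $a \not\equiv b \pmod{m}$.

Although (*) above is not generally true when $\gcd(c, m) > 1$, we do have the following more general kinds of cancellation:

Theorem 18.4. If $c > 0$, $m > 0$ then $a \equiv b \pmod{m} \Leftrightarrow ca \equiv cb \pmod{cm}$.

Exercise 18.3. Prove Theorem 18.4.

Theorem 18.5. Let $m > 0$ and let $d = \gcd(c, m)$. Then

$ca \equiv cb \pmod{m} \Rightarrow a \equiv b \pmod{\tfrac{m}{d}}.$

Proof. Since $d = \gcd(c, m)$ we can write $c = d\left(\tfrac{c}{d}\right)$ and $m = d\left(\tfrac{m}{d}\right)$. Then $\gcd\left(\tfrac{c}{d}, \tfrac{m}{d}\right) = 1$. Now rewriting $ca \equiv cb \pmod{m}$ we have

$d\,\tfrac{c}{d}\,a \equiv d\,\tfrac{c}{d}\,b \pmod{d\,\tfrac{m}{d}}.$

Since $m > 0$, $d > 0$, so by Theorem 18.4 we have

$\tfrac{c}{d}\,a \equiv \tfrac{c}{d}\,b \pmod{\tfrac{m}{d}}.$

Now since $\gcd\left(\tfrac{c}{d}, \tfrac{m}{d}\right) = 1$, by Theorem 18.3

$a \equiv b \pmod{\tfrac{m}{d}}.$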
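The proof of Theorem 18.1 is constructive: once Bezout coefficients $s$, $t$ with $as + mt = 1$ are known, the inverse is $s \bmod m$. The Python sketch below uses the usual iterative extended Euclidean algorithm to find them; this is a stand-in for, not a transcription of, Blankinship's Method mentioned in the Remark, and the function names are illustrative.

```python
def bezout(a: int, m: int) -> tuple[int, int, int]:
    """Return (g, s, t) with g = gcd(a, m) and a*s + m*t = g, for a, m >= 0."""
    old_r, r = a, m
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a: int, m: int) -> int:
    """The inverse of a modulo m in the range 0 < a* < m (requires m >= 2)."""
    g, s, _ = bezout(a % m, m)
    if g != 1:
        raise ValueError("a and m are not relatively prime, so no inverse exists")
    return s % m


print(inverse_mod(2, 15))   # 8, agreeing with the trial-and-error search
print(inverse_mod(2, 7))    # 4, so the inverse depends on the modulus
```

In Python 3.8 and later the built-in `pow(a, -1, m)` returns the same value, which gives an independent check.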

Theorem 18.6. If $m > 0$ and $a \equiv b \pmod{m}$ we have $\gcd(a, m) = \gcd(b, m)$.

Proof. Since $a \equiv b \pmod{m}$ we have $a - b = mt$ for some $t$. So we can write

(1) $a = mt + b$ and (2) $b = m(-t) + a$.

Let $d = \gcd(m, a)$ and $e = \gcd(m, b)$. Since $e \mid m$ and $e \mid b$, from (1) $e \mid a$, so $e$ is a common divisor of $m$ and $a$. Hence $e \le d$. Using (2) we see similarly that $d \le e$. So $d = e$.

Corollary. Let $m > 0$. Let $a \equiv b \pmod{m}$. Then $a$ has an inverse modulo $m$ if and only if $b$ does.

Proof. Immediate from Theorems 18.1, 18.2 and 18.6.

Exercise 18.4. Determine whether or not each of the following is true. Give reasons in each case.
(1) $x \equiv 3 \pmod{7} \Rightarrow \gcd(x, 7) = 1$
(2) $\gcd(68019, 3) = 3$
(3) $12x \equiv 15 \pmod{35} \Rightarrow 4x \equiv 5 \pmod{7}$
(4) $x \equiv 6 \pmod{12} \Rightarrow \gcd(x, 12) = 6$
(5) $3x \equiv 3y \pmod{17} \Rightarrow x \equiv y \pmod{17}$
(6) $5x \equiv y \pmod{6} \Rightarrow 15x \equiv 3y \pmod{18}$
(7) $12x \equiv 12y \pmod{15} \Rightarrow x \equiv y \pmod{5}$
(8) $x \equiv 73 \pmod{75} \Rightarrow x \bmod 75 = 73$
(9) $x \equiv 73 \pmod{75}$ and $0 \le x < 75 \Rightarrow x = 73$
(10) There is no integer $x$ such that $12x \equiv 7 \pmod{33}$.

Chapter 19
Residue Classes

Definition 19.1. Let $m > 0$ be given. For each integer $a$ we define

(1) $[a] = \{x : x \equiv a \pmod{m}\}.$

In other words, $[a]$ is the set of all integers that are congruent to $a$ modulo $m$. We call $[a]$ the residue class of $a$ modulo $m$. Some people call $[a]$ the congruence class or equivalence class of $a$ modulo $m$.

Theorem 19.1. For $m > 0$ we have

(2) $[a] = \{mq + a \mid q \in \mathbb{Z}\}.$

Proof. $x \in [a] \Leftrightarrow x \equiv a \pmod{m} \Leftrightarrow m \mid x - a \Leftrightarrow x - a = mq$ for some $q \in \mathbb{Z} \Leftrightarrow x = mq + a$ for some $q \in \mathbb{Z}$. So (2) follows from the definition (1).

Note that $[a]$ really depends on $m$ and it would be more accurate to write $[a]_m$ instead of $[a]$, but this would be too cumbersome. Nevertheless it should be kept clearly in mind that $[a]$ depends on some understood value of $m$.

Remark. Two alternative ways to write (2) are

(3) $[a] = \{mq + a \mid q = 0, \pm 1, \pm 2, \dots\}$

or

(4) $[a] = \{\dots, -2m + a, -m + a, a, m + a, 2m + a, \dots\}.$

Exercise 19.1. Show that if $m = 2$ then $[1]$ is the set of all odd integers and $[0]$ is the set of all even integers. Show also that $\mathbb{Z} = [0] \cup [1]$ and $[0] \cap [1] = \emptyset$.

Exercise 19.2. Show that if $m = 3$, then $[0]$ is the set of integers divisible by 3, $[1]$ is the set of integers whose remainder when divided by 3 is 1, and $[2]$ is the set of integers whose remainder when divided by 3 is 2. Show also that $\mathbb{Z} = [0] \cup [1] \cup [2]$ and $[0] \cap [1] = [0] \cap [2] = [1] \cap [2] = \emptyset$.

Theorem 19.2. For a given modulus $m > 0$ we have: $[a] = [b] \Leftrightarrow a \equiv b \pmod{m}$.

Proof. ($\Rightarrow$) Assume $[a] = [b]$. Note that since $a \equiv a \pmod{m}$ we have $a \in [a]$. Since $[a] = [b]$ we have $a \in [b]$. By definition of $[b]$ this gives $a \equiv b \pmod{m}$, as desired.

($\Leftarrow$) Assume $a \equiv b \pmod{m}$. We must prove that the sets $[a]$ and $[b]$ are equal. To do this we prove that every element of $[a]$ is in $[b]$ and vice versa. Let $x \in [a]$. Then $x \equiv a \pmod{m}$. Since $a \equiv b \pmod{m}$, by transitivity $x \equiv b \pmod{m}$, so $x \in [b]$. Conversely, if $x \in [b]$, then $x \equiv b \pmod{m}$. By symmetry, since $a \equiv b \pmod{m}$, $b \equiv a \pmod{m}$, so again by transitivity $x \equiv a \pmod{m}$ and $x \in [a]$. This proves that $[a] = [b]$.

Theorem 19.3. Given $m > 0$. For every $a$ there is a unique $r$ such that $[a] = [r]$ and $0 \le r < m$.

Proof. Let $r = a \bmod m$. Then by Exercise 15.1 (p. 59) we have $a \equiv r \pmod{m}$. By definition of $a \bmod m$ we have $0 \le r < m$. Since $a \equiv r \pmod{m}$, by Theorem 19.2, $[a] = [r]$. To prove that $r$ is unique, suppose also $[a] = [r']$ where $0 \le r' < m$. By Theorem 19.2 this implies that $a \equiv r' \pmod{m}$. This, together with $0 \le r' < m$, implies by Theorem 15.4 that $r' = a \bmod m = r$.

Theorem 19.4. Given $m > 0$, there are exactly $m$ distinct residue classes modulo $m$, namely, $[0], [1], [2], \dots, [m-1]$.

Proof. By Theorem 19.3 we know that every residue class $[a]$ is equal to one of the residue classes $[0], [1], \dots, [m-1]$. So there are no residue classes not in this list. These residue classes are distinct by the uniqueness part of Theorem 19.3: if $0 \le r_1 < m$ and $0 \le r_2 < m$ and $[r_1] = [r_2]$, then we must have $r_1 = r_2$.

Exercise 19.3. Given the modulus $m > 0$ show that $[a] = [a + m]$ and $[a] = [a - m]$ for all $a$.

Exercise 19.4. For any $m > 0$, show that if $x \in [a]$ then $[a] = [x]$.

Definition 19.2. Any element $x \in [a]$ is said to be a representative of the residue class $[a]$. By Exercise 19.4, if $x$ is a representative of $[a]$ then $[x] = [a]$, that is, any element of a residue class may be used to represent it.

Exercise 19.5. For any $m > 0$, show that if $[a] \cap [b] \ne \emptyset$ then $[a] = [b]$.

Exercise 19.6. For any $m > 0$, show that if $[a] \ne [b]$ then $[a] \cap [b] = \emptyset$.

Exercise 19.7. Let $m = 2$. Show that

$[0] = [2] = [4] = [32] = [-2] = [-32]$

and

$[1] = [3] = [-3] = [31] = [-31].$


Chapter 20
$\mathbb{Z}_m$ and Complete Residue Systems

Throughout this section we assume a fixed modulus $m > 0$.

Definition 20.1. We define

$\mathbb{Z}_m = \{[a] \mid a \in \mathbb{Z}\},$

that is, $\mathbb{Z}_m$ is the set of all residue classes modulo $m$. We call $\mathbb{Z}_m$ the ring of integers modulo $m$. In the next chapter we shall show how to add and multiply residue classes. This makes $\mathbb{Z}_m$ into a ring. See Appendix A for the definition of ring. Often we drop the word "ring" and just call $\mathbb{Z}_m$ the integers modulo $m$.

From Theorem 19.4,

$\mathbb{Z}_m = \{[0], [1], \dots, [m-1]\}$

and since no two of the residue classes $[0], [1], \dots, [m-1]$ are equal we see that $\mathbb{Z}_m$ has exactly $m$ elements. By Exercise 19.4, if we choose

$a_0 \in [0], a_1 \in [1], \dots, a_{m-1} \in [m-1]$

then

$[a_0] = [0], [a_1] = [1], \dots, [a_{m-1}] = [m-1].$

So we also have

$\mathbb{Z}_m = \{[a_0], [a_1], \dots, [a_{m-1}]\}.$

Example. If $m = 4$ we have, for example,

$8 \in [0]$, $5 \in [1]$, $-6 \in [2]$, $11 \in [3]$.

And hence $\mathbb{Z}_4 = \{[8], [5], [-6], [11]\}$.

Definition 20.2. A set of $m$ integers $\{a_0, a_1, \dots, a_{m-1}\}$ is called a complete residue system modulo $m$ if

$\mathbb{Z}_m = \{[a_0], [a_1], \dots, [a_{m-1}]\}.$

Remark. A complete residue system modulo $m$ is sometimes called a complete set of representatives for $\mathbb{Z}_m$.

Example. By Theorem 19.4, p. 76, for $m > 0$, $\{0, 1, 2, \dots, m-1\}$ is a complete residue system modulo $m$.

Example. From the above discussion it is clear that for each $m > 0$ there are infinitely many distinct complete residue systems modulo $m$. For example, here are some examples of complete residue systems modulo 5:
1. $\{0, 1, 2, 3, 4\}$
2. $\{0, 1, 2, -2, -1\}$
3. $\{10, -9, 12, 8, 14\}$
4. $\{0 + 5n_1, 1 + 5n_2, 2 + 5n_3, 3 + 5n_4, 4 + 5n_5\}$ where $n_1, n_2, n_3, n_4, n_5$ may be any integers.

Definition 20.3. The set $\{0, 1, \dots, m-1\}$ is called the set of least nonnegative residues modulo $m$.

Theorem 20.1. Let $m > 0$ be given.
(1) If $m = 2k$, then $\{0, 1, 2, \dots, k-1, k, -(k-1), \dots, -2, -1\}$ is a complete residue system modulo $m$.
(2) If $m = 2k + 1$, then $\{0, 1, 2, \dots, k, -k, \dots, -2, -1\}$ is a complete residue system modulo $m$.

Proof of (1). Since if $m = 2k$

$\mathbb{Z}_m = \{[0], [1], \dots, [k], [k+1], \dots, [k+i], \dots, [k+k-1]\},$

it suffices to note that by Exercise 19.3 we have

$[k+i] = [k+i-2k] = [-k+i] = [-(k-i)].$

So

$[k+1] = [-(k-1)], [k+2] = [-(k-2)], \dots, [k+k-1] = [-1],$

as desired.

Proof of (2). In this case

$[k+i] = [-(2k+1)+k+i] = [-k+i-1] = [-(k-i+1)]$

so

$[k+1] = [-k], [k+2] = [-(k-1)], \dots, [2k] = [-1],$

as desired.

Definition 20.4. The complete residue system modulo $m$ given in Theorem 20.1 is called the least absolute residue system modulo $m$.

Remark. If one chooses in each residue class $[a]$ the smallest nonnegative integer one obtains the least nonnegative residue system. If one chooses in each residue class $[a]$ an element of smallest possible absolute value one obtains the least absolute residue system.

Exercise 20.1. Find both the least nonnegative residue system and the least absolute residues for each of the moduli given below. Also, in each case find a third complete residue system different from these two.

$m = 3$, $m = 4$, $m = 5$, $m = 6$, $m = 7$, $m = 8$.
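Definition 20.2 and Theorem 20.1 are easy to experiment with. The Python sketch below (helper names are invented for this illustration) tests whether a list of integers is a complete residue system and computes the least absolute residue of an integer, choosing $k$ rather than $-k$ when $m = 2k$, as in Theorem 20.1 (1).

```python
def is_complete_residue_system(xs: list[int], m: int) -> bool:
    """True when xs has m elements hitting every residue class modulo m once."""
    return len(xs) == m and {x % m for x in xs} == set(range(m))


def least_absolute_residue(a: int, m: int) -> int:
    """The representative of [a] of smallest absolute value."""
    r = a % m                       # least nonnegative residue
    return r if r <= m // 2 else r - m


def least_absolute_system(m: int) -> list[int]:
    return [least_absolute_residue(r, m) for r in range(m)]


print(is_complete_residue_system([10, -9, 12, 8, 14], 5))
print(least_absolute_system(4), least_absolute_system(5))
```

For $m = 4$ this lists $0, 1, 2, -1$ and for $m = 5$ it lists $0, 1, 2, -2, -1$, in the order used in the theorem.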


Chapter 21
Addition and Multiplication in $\mathbb{Z}_m$

In this chapter we show how to define addition and multiplication of residue classes modulo $m$. With respect to these binary operations $\mathbb{Z}_m$ is a ring as defined in Appendix A.

Definition 21.1. For $[a], [b] \in \mathbb{Z}_m$ we define

$[a] + [b] = [a + b]$ and $[a][b] = [ab]$.

Example. For $m = 5$ we have

$[2] + [3] = [5]$, and $[2][3] = [6]$.

Note that since $5 \equiv 0 \pmod{5}$ and $6 \equiv 1 \pmod{5}$ we have $[5] = [0]$ and $[6] = [1]$, so we can also write

$[2] + [3] = [0]$
$[2][3] = [1]$.

Since a residue class can have many representatives, it is important to check that the rules given in Definition 21.1 do not depend on the representatives chosen. For example, when $m = 5$ we know that

$[7] = [2]$ and $[11] = [21]$

so we should have

$[7] + [11] = [2] + [21]$ and $[7][11] = [2][21]$.

In this case we can check that

$[7] + [11] = [18]$ and $[2] + [21] = [23]$.

Now $18 \equiv 23 \pmod{5}$ since $5 \mid 23 - 18$. Hence $[18] = [23]$, as desired. Also $[7][11] = [77]$ and $[2][21] = [42]$. Then $77 - 42 = 35$ and $5 \mid 35$, so $77 \equiv 42 \pmod{5}$ and hence $[77] = [42]$, as desired.

Theorem 21.1. For any modulus $m > 0$, if $[a] = [b]$ and $[c] = [d]$ then

$[a] + [c] = [b] + [d]$ and $[a][c] = [b][d]$.

Proof. (This follows immediately from Theorem 15.3 (p. 61) and Theorem 19.2 (p. 76).)

Exercise 21.1. Prove Theorem 21.1.

When performing addition and multiplication in $\mathbb{Z}_m$ using the rules in Definition 21.1, due to Theorem 21.1, we may at any time replace $[a]$ by $[a']$ if $a \equiv a' \pmod{m}$. This will sometimes make calculations easier.

Example. Take $m = 151$. Then $150 \equiv -1 \pmod{151}$ and $149 \equiv -2 \pmod{151}$, so

$[150][149] = [-1][-2] = [2]$

and

$[150] + [149] = [-1] + [-2] = [-3] = [148]$

since $148 \equiv -3 \pmod{151}$.

When working with $\mathbb{Z}_m$ it is often useful to write all residue classes in the least nonnegative residue system, as we do in constructing the following addition and multiplication tables for $\mathbb{Z}_4$.

 +  | [0] [1] [2] [3]
----+----------------
[0] | [0] [1] [2] [3]
[1] | [1] [2] [3] [0]
[2] | [2] [3] [0] [1]
[3] | [3] [0] [1] [2]

 ·  | [0] [1] [2] [3]
----+----------------
[0] | [0] [0] [0] [0]
[1] | [0] [1] [2] [3]
[2] | [0] [2] [0] [2]
[3] | [0] [3] [2] [1]

Recall that by Exercise 15.1 (p. 59) we have for all $a$ and $m > 0$

$a \equiv a \bmod m \pmod{m}.$

So using residue classes modulo $m$ this gives

$[a] = [a \bmod m].$

Hence,

$[a] + [b] = [(a + b) \bmod m]$
$[a][b] = [(ab) \bmod m]$

So if $a$ and $b$ are in the set $\{0, 1, \dots, m-1\}$, these equations give us a way to obtain representations of the sum and product of $[a]$ and $[b]$ in the same set. This leads to an alternative way to define $\mathbb{Z}_m$ and addition and multiplication in $\mathbb{Z}_m$. For clarity we will use different notation.

Definition 21.2. For $m > 0$ define

$J_m = \{0, 1, 2, \dots, m-1\}$

and for $a, b \in J_m$ define

$a \oplus b = (a + b) \bmod m$
$a \odot b = (ab) \bmod m.$

Remark. $J_m$ with $\oplus$ and $\odot$ as defined is isomorphic to $\mathbb{Z}_m$ with addition and multiplication given by Definition 21.1. [Students taking Elementary Abstract Algebra will learn a rigorous definition of the term isomorphic. For now, we take isomorphic to mean "has the same form."] The addition and multiplication tables for $J_4$ are:

 ⊕ | 0 1 2 3
---+--------
 0 | 0 1 2 3
 1 | 1 2 3 0
 2 | 2 3 0 1
 3 | 3 0 1 2

 ⊙ | 0 1 2 3
---+--------
 0 | 0 0 0 0
 1 | 0 1 2 3
 2 | 0 2 0 2
 3 | 0 3 2 1

Exercise 21.2. Prove that for every modulus $m > 0$ we have for all $a, b \in J_m$

$[a] + [b] = [a \oplus b]$, and $[a][b] = [a \odot b]$.

Exercise 21.3. Construct addition and multiplication tables for $J_5$.

Exercise 21.4. Without doing it, tell how to obtain addition and multiplication tables for $\mathbb{Z}_5$ from the work in Exercise 21.3.

Example. Let's solve the congruence

(1) $272x \equiv 901 \pmod{9}$.

Using residue classes modulo 9 we see that (1) is equivalent to

(2) $[272x] = [901]$

which is equivalent to

(3) $[272][x] = [901]$

which is equivalent to

(4) $[2][x] = [1]$.

Now we know $[x] \in \{[0], [1], \dots, [8]\}$, so by trial and error we see that $x = 5$ is a solution.
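The operations on $J_m$ from Definition 21.2 and the trial-and-error solution of $272x \equiv 901 \pmod{9}$ can both be sketched in a few lines of Python. The function names below are illustrative only; the solver simply tries every element of $J_m$, exactly as the example does by hand.

```python
def add_table(m: int) -> list[list[int]]:
    """Row a, column b holds (a + b) mod m for a, b in J_m."""
    return [[(a + b) % m for b in range(m)] for a in range(m)]


def mul_table(m: int) -> list[list[int]]:
    """Row a, column b holds (a * b) mod m for a, b in J_m."""
    return [[(a * b) % m for b in range(m)] for a in range(m)]


def solve_linear_congruence(c: int, d: int, m: int) -> list[int]:
    """All x in J_m with c*x ≡ d (mod m), found by exhaustive search."""
    return [x for x in range(m) if (c * x - d) % m == 0]


for row in mul_table(4):
    print(row)
print(solve_linear_congruence(272, 901, 9))   # [5]
```

Exhaustive search is fine for small moduli; for large $m$ one would instead use an inverse of $c$ modulo $m$ when it exists.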

Chapter 22
The Groups $U_m$

Definition 22.1. Let $m > 0$. A residue class $[a] \in \mathbb{Z}_m$ is called a unit if there is another residue class $[b] \in \mathbb{Z}_m$ such that $[a][b] = [1]$. In this case $[a]$ and $[b]$ are said to be inverses of each other in $\mathbb{Z}_m$.

Theorem 22.1. Let $m > 0$. A residue class $[a] \in \mathbb{Z}_m$ is a unit if and only if $\gcd(a, m) = 1$.

Proof. Let $[a]$ be a unit. Then there is some $[b]$ such that $[a][b] = [1]$. Hence $[ab] = [1]$, so $ab \equiv 1 \pmod{m}$. So by Theorem 18.2, p. 72, $\gcd(a, m) = 1$.

To prove the converse, let $\gcd(a, m) = 1$. Then by Theorem 18.1, page 71, there is an integer $a^*$ such that $aa^* \equiv 1 \pmod{m}$. Hence, $[aa^*] = [1]$. So $[a][a^*] = [aa^*] = [1]$, and we can take $b = a^*$.

Note that from Theorem 18.6 we see that if $[a] = [b]$ (i.e., $a \equiv b \pmod{m}$) then $\gcd(a, m) = 1 \Leftrightarrow \gcd(b, m) = 1$. So in checking whether or not a residue class is a unit we can use any representative of the class.

Exercise 22.1. Show that $[1]$ and $[m-1]$ are always units in $\mathbb{Z}_m$. Hint: $[m-1] = [-1]$.

Definition 22.2. The set of all units in $\mathbb{Z}_m$ is denoted by $U_m$ and is called the group of units of $\mathbb{Z}_m$. See Appendix A for the definition of a group.

Theorem 22.2. Let $m > 0$, then

$U_m = \{[i] \mid 1 \le i \le m \text{ and } \gcd(i, m) = 1\}.$

Proof. We know that if $[a] \in \mathbb{Z}_m$ then $[a] = [i]$ where $0 \le i \le m-1$. If $m = 1$ then $\mathbb{Z}_m = \mathbb{Z}_1 = \{[0]\} = \{[1]\}$ and since $[1][1] = [1]$, $[1]$ is a unit, $U_1 = \{[1]\}$ and the theorem holds. If $m \ge 2$, then $\gcd(i, m) = 1$ can only happen if $1 \le i \le m-1$, since $\gcd(0, m) = \gcd(m, m) = m \ne 1$. So the theorem follows from Theorem 22.1 and the above remarks.

Theorem 22.3. ($U_m$ is a group [1] under multiplication.)
(1) If $[a], [b] \in U_m$ then $[a][b] \in U_m$.
(2) For all $[a], [b], [c]$ in $U_m$ we have $([a][b])[c] = [a]([b][c])$.
(3) $[1][a] = [a][1] = [a]$ for all $[a] \in U_m$.
(4) For each $[a] \in U_m$ there is a $[b] \in U_m$ such that $[a][b] = [1]$.
(5) For all $[a], [b] \in U_m$ we have $[a][b] = [b][a]$.

Exercise 22.2. Prove Theorem 22.3.

Example. Using Theorem 22.2 we see that

$U_{15} = \{[1], [2], [4], [7], [8], [11], [13], [14]\} = \{[1], [2], [4], [7], [-7], [-4], [-2], [-1]\}.$

Note that using absolute least residues modulo 15 simplifies multiplication somewhat. Rather than write out the entire multiplication table, we just find the inverse of each element of $U_{15}$:

$[1][1] = [1]$
$[2][-7] = [2][8] = [1]$
$[4][4] = [1]$
$[7][-2] = [7][13] = [1]$
$[-4][-4] = [11][11] = [1]$
$[-1][-1] = [14][14] = [1]$.

Exercise 22.3. Find the elements of $U_7$ in both least nonnegative and absolute least residue form and find the inverse of each element, as in the example above.

[1] Actually (1)-(4) are all that is required for $U_m$ to be a group. Property (5) says that $U_m$ is an Abelian group. See Appendix A.
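Theorem 22.2 gives a direct recipe for listing $U_m$, and inverses can then be found by searching within that list. The Python sketch below follows that recipe for $U_{15}$; the function names are made up for this illustration, and classes are represented by their least positive representatives.

```python
from math import gcd


def units(m: int) -> list[int]:
    """Representatives i with 1 <= i <= m and gcd(i, m) = 1 (Theorem 22.2)."""
    return [i for i in range(1, m + 1) if gcd(i, m) == 1]


def inverse_table(m: int) -> dict[int, int]:
    """Map each unit a to the unit b with a*b ≡ 1 (mod m)."""
    u = units(m)
    return {a: next(b for b in u if (a * b - 1) % m == 0) for a in u}


print(units(15))
print(inverse_table(15))
```

The table pairs 2 with 8 and 7 with 13, while 1, 4, 11 and 14 are their own inverses, in agreement with the example above.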

Definition 22.3. If $X$ is a set, the number of elements in $X$ is denoted by $|X|$.

Example. $|\{1\}| = 1$, $|\{0, 1, 3, 9\}| = 4$, $|\mathbb{Z}_m| = m$ if $m > 0$.

Definition 22.4. If $m \ge 1$, $\varphi(m) = |\{i \in \mathbb{Z} \mid 1 \le i \le m \text{ and } \gcd(i, m) = 1\}|$. The function $\varphi$ is called the Euler phi function or the Euler totient function.

Corollary. If $m > 0$, $|U_m| = \varphi(m)$.

Note that
$U_1 = \{[1]\}$ so $\varphi(1) = 1$
$U_2 = \{[1]\}$ so $\varphi(2) = 1$
$U_3 = \{[1], [2]\}$ so $\varphi(3) = 2$
$U_4 = \{[1], [3]\}$ so $\varphi(4) = 2$
$U_5 = \{[1], [2], [3], [4]\}$ so $\varphi(5) = 4$
$U_6 = \{[1], [5]\}$ so $\varphi(6) = 2$
$U_7 = \{[1], [2], [3], [4], [5], [6]\}$ so $\varphi(7) = 6$.

Generally $\varphi(m)$ is not easy to calculate. However, the following theorems show that once the prime factorization of $m$ is given, computing $\varphi(m)$ is easy.

Theorem 22.4. If $a > 0$ and $b > 0$ and $\gcd(a, b) = 1$, then $\varphi(ab) = \varphi(a)\varphi(b)$.

Theorem 22.5. If $p$ is prime and $n > 0$ then $\varphi(p^n) = p^n - p^{n-1}$.

Theorem 22.6. Let $p_1, p_2, \dots, p_k$ be distinct primes and let $n_1, n_2, \dots, n_k$ be positive integers, then

$\varphi\left(p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}\right) = \left(p_1^{n_1} - p_1^{n_1 - 1}\right) \cdots \left(p_k^{n_k} - p_k^{n_k - 1}\right).$

Before discussing the proofs of these three theorems, let's illustrate their use:

$\varphi(12) = \varphi(2^2 \cdot 3) = (2^2 - 2^1)(3^1 - 3^0) = 2 \cdot 2 = 4$
$\varphi(9000) = \varphi(2^3 \cdot 3^2 \cdot 5^3) = (2^3 - 2^2)(3^2 - 3^1)(5^3 - 5^2) = 4 \cdot 6 \cdot 100 = 2400.$

Note that if $p$ is any prime then $\varphi(p) = p - 1$.

I will sketch a proof of Theorem 22.4 in Exercise 22.6 below. Now I give the proof of Theorem 22.5.

Proof of Theorem 22.5. We want to count the number of elements in the set $A = \{1, 2, \dots, p^n\}$ that are relatively prime to $p^n$. Let $B$ be the set of elements of $A$ that have a factor $> 1$ in common with $p^n$. Note that if $b \in B$ and $\gcd(b, p^n) = d > 1$, then $d$ is a factor of $p^n$ and $d > 1$, so $d$ has $p$ as a factor. Hence $b = pk$, for some $k$, and $p \le b \le p^n$, so $p \le kp \le p^n$. It follows that $1 \le k \le p^{n-1}$. That is,

$B = \{p, 2p, 3p, \dots, kp, \dots, p^{n-1} p\}.$

We are interested in the number of elements of $A$ not in $B$. Since $|A| = p^n$ and $|B| = p^{n-1}$, this number is $p^n - p^{n-1}$. That is, $\varphi(p^n) = p^n - p^{n-1}$.

The proof of Theorem 22.6 follows from Theorems 22.4 and 22.5. The proof is by induction on $n$ and is quite similar to the proof of Theorem 13.1 (2) on page 50, so I omit the details.

Exercise 22.4. Find the sets $U_m$, for $8 \le m \le 20$. Note that $|U_m| = \varphi(m)$. Use Theorem 22.6 to calculate $\varphi(m)$ and check that you have the right number of elements for each set $U_m$, $8 \le m \le 20$.

Exercise 22.5. Show that if $m = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}$ where $p_1, \dots, p_k$ are distinct primes and each $n_i \ge 1$, then

$\varphi(m) = m \left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_k}\right).$
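The two ways of obtaining $\varphi(m)$, counting as in Definition 22.4 and multiplying prime-power factors as in Theorem 22.6, can be compared directly. The Python sketch below does so; the trial-division factorizer is a simple helper written for this illustration and is only suitable for small $m$.

```python
from math import gcd


def phi_by_count(m: int) -> int:
    """Count the i with 1 <= i <= m and gcd(i, m) = 1."""
    return sum(1 for i in range(1, m + 1) if gcd(i, m) == 1)


def prime_factorization(m: int) -> dict[int, int]:
    """Map each prime p dividing m to its exponent, by trial division."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:                       # whatever is left is a single prime
        factors[m] = factors.get(m, 0) + 1
    return factors


def phi_by_formula(m: int) -> int:
    """Product of p**n - p**(n-1) over the prime powers p**n in m."""
    result = 1
    for p, n in prime_factorization(m).items():
        result *= p**n - p**(n - 1)
    return result


print(phi_by_formula(12), phi_by_formula(9000))   # 4 2400
```

For $m = 1$ the factorization is empty and the product is 1, which agrees with $\varphi(1) = 1$.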

Exercise 22.6. Let $a$ and $b$ be relatively prime positive integers. Write $n = ab$. Define the mapping $f$ by the rule

$f([x]_n) = ([x]_a, [x]_b).$

Here we denote the residue class of $x$ modulo $m$ by $[x]_m$. First illustrate each of the following for the special case $a = 3$ and $b = 5$. Then prove each in general. (The proof is difficult and is optional.)
1. $f : \mathbb{Z}_n \to \mathbb{Z}_a \times \mathbb{Z}_b$ is one-to-one and onto. (This is called the Chinese Remainder Theorem.)
2. $f : U_n \to U_a \times U_b$ is also a one-to-one, onto mapping.
3. Conclude from (2) that $\varphi(ab) = \varphi(a)\varphi(b)$.
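The special case $a = 3$, $b = 5$ of the exercise above can be illustrated by tabulating the map on representatives. The Python sketch below is only an illustration of that case, not a proof; residue classes are represented by least nonnegative residues and the variable names are invented here.

```python
from math import gcd


def residue_map(a: int, b: int) -> dict[int, tuple[int, int]]:
    """x in {0, ..., ab-1} mapped to (x mod a, x mod b)."""
    return {x: (x % a, x % b) for x in range(a * b)}


a, b = 3, 5
f = residue_map(a, b)

# Part 1: every pair (r, s) is hit exactly once.
all_pairs = {(r, s) for r in range(a) for s in range(b)}
is_bijection = set(f.values()) == all_pairs and len(set(f.values())) == len(f)

# Part 2: units modulo 15 correspond exactly to pairs of units.
units_n = [x for x in range(a * b) if gcd(x, a * b) == 1]
unit_pairs = {(r, s) for r in range(a) for s in range(b)
              if gcd(r, a) == 1 and gcd(s, b) == 1}
units_biject = {f[x] for x in units_n} == unit_pairs and len(units_n) == len(unit_pairs)

print(is_bijection, units_biject, len(units_n))
```

Counting both sides of the second bijection gives $\varphi(15) = \varphi(3)\varphi(5) = 2 \cdot 4$ in this case.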


Chapter 23
Two Theorems of Euler and Fermat

Fermat's Big Theorem or, as it is also called, Fermat's Last Theorem states that $x^n + y^n = z^n$ has no solutions in positive integers $x$, $y$, $z$ when $n > 2$. This was proved by Andrew Wiles in 1995, over 350 years after it was first mentioned by Fermat. The theorem that concerns us in this chapter is Fermat's Little Theorem. This theorem is much easier to prove, but has more far-reaching consequences for applications to cryptography and secure transmission of data on the Internet. The first theorem below is a generalization of Fermat's Little Theorem due to Euler.

Theorem 23.1 (Euler's Theorem). If $m > 0$ and $a$ is relatively prime to $m$ then

$a^{\varphi(m)} \equiv 1 \pmod{m}.$

Theorem 23.2 (Fermat's Little Theorem). If $p$ is prime and $a$ is relatively prime to $p$ then

$a^{p-1} \equiv 1 \pmod{p}.$

Let's look at some examples. Take $m = 12$, then

$\varphi(m) = \varphi(2^2 \cdot 3) = (2^2 - 2)(3 - 1) = 4.$

The positive integers $a < m$ with $\gcd(a, m) = 1$ are 1, 5, 7 and 11.

$1^4 \equiv 1 \pmod{12}$ is clear.
$5^2 \equiv 1 \pmod{12}$ since $12 \mid 5^2 - 1 = 24$, so $(5^2)^2 \equiv 1^2 \pmod{12}$, that is, $5^4 \equiv 1 \pmod{12}$.
Now $7 \equiv -5 \pmod{12}$ and since 4 is even, $7^4 \equiv 5^4 \pmod{12}$, so $7^4 \equiv 1 \pmod{12}$.
$11 \equiv -1 \pmod{12}$ and again since 4 is even we have $11^4 \equiv (-1)^4 \pmod{12}$, so $11^4 \equiv 1 \pmod{12}$.

So we have verified Theorem 23.1 for the single case $m = 12$.

Exercise 23.1. Verify that Theorem 23.2 holds if $p = 5$ by direct calculation as in the above example.

Definition 23.1. (Powers of residue classes.) If $[a] \in U_m$ define $[a]^1 = [a]$ and for $n > 1$, $[a]^n = [a][a] \cdots [a]$ where there are $n$ copies of $[a]$ on the right.

Theorem 23.3. If $[a] \in U_m$, then $[a]^n \in U_m$ for $n \ge 1$ and $[a]^n = [a^n]$.

Proof. We prove that $[a]^n = [a^n] \in U_m$ for $n \ge 1$ by induction on $n$. If $n = 1$, $[a]^1 = [a] = [a^1]$ and by assumption $[a] \in U_m$. Suppose $[a]^k = [a^k] \in U_m$ for some $k \ge 1$. Then

$[a]^{k+1} = [a]^k [a]$
$= [a^k][a]$ by the induction hypothesis
$= [a^k a]$ by Definition 21.1, p. 83
$= [a^{k+1}]$ since $a^k a = a^{k+1}$.

So by the PMI, the theorem holds for $n \ge 1$.

Note that for fixed $m > 0$, if $\gcd(a, m) = 1$ then $[a] \in U_m$. And using Theorem 23.3 we have

$a^n \equiv 1 \pmod{m} \Leftrightarrow [a^n] = [1] \Leftrightarrow [a]^n = [1].$

It follows that Euler's Theorem (Theorem 23.1) is equivalent to the following theorem.

Theorem 23.4. If $m > 0$ and $[a] \in U_m$ then $[a]^{\varphi(m)} = [1]$.

A proof of Theorem 23.4 is outlined in the following exercise.

Exercise 23.2 (Optional). Let $U_m = \{X_1, X_2, \dots, X_{\varphi(m)}\}$. Here we write $X_i$ for a residue class in $U_m$ to simplify notation.
1. Show that if $X \in U_m$ then $\{XX_1, XX_2, \dots, XX_{\varphi(m)}\} = U_m$.
2. Show that if $X \in U_m$ then $XX_1 XX_2 \cdots XX_{\varphi(m)} = X_1 X_2 \cdots X_{\varphi(m)}$.
3. Let $A = X_1 X_2 \cdots X_{\varphi(m)}$. Show that if $X \in U_m$ then $X^{\varphi(m)} A = A$.
4. Conclude from (3) that $X^{\varphi(m)} = [1]$ and hence Theorem 23.4 is true.

Also Theorem 23.4 is an easy consequence of Lagrange's Theorem, which students who take (or have taken) a course in abstract algebra will learn about (or will already know).

Exercise 23.3. Show that Fermat's Little Theorem follows from Euler's Theorem.

Exercise 23.4. Show that if $p$ is prime then $a^p \equiv a \pmod{p}$ for all integers $a$. Hint: Consider two cases: I. $\gcd(a, p) = 1$ and II. $\gcd(a, p) > 1$. Note that in the second case $p \mid a$.

Exercise 23.5. Let $m > 0$. Let $\gcd(a, m) = 1$. Show that $a^{\varphi(m) - 1}$ is an inverse for $a$ modulo $m$. (See Theorem 18.1, p. 71.)
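Euler's Theorem and the inverse formula of the last exercise on this page, $a^{\varphi(m)-1}$, can both be checked numerically for small moduli. The following Python sketch is such a check under simple assumptions: `phi` counts directly from the definition, and the built-in three-argument `pow` performs the modular exponentiation.

```python
from math import gcd


def phi(m: int) -> int:
    """Euler's phi function, counted straight from the definition."""
    return sum(1 for i in range(1, m + 1) if gcd(i, m) == 1)


def euler_holds(m: int) -> bool:
    """Check a**phi(m) ≡ 1 (mod m) for every a in 1..m coprime to m."""
    e = phi(m)
    return all(pow(a, e, m) == 1 % m for a in range(1, m + 1) if gcd(a, m) == 1)


def inverse_by_euler(a: int, m: int) -> int:
    """a**(phi(m) - 1) mod m, an inverse of a modulo m when gcd(a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("a and m must be relatively prime")
    return pow(a, phi(m) - 1, m)


print([pow(a, 4, 12) for a in (1, 5, 7, 11)])          # [1, 1, 1, 1]
print({a: inverse_by_euler(a, 7) for a in range(1, 7)})
```

The comparison uses `1 % m` so that the degenerate modulus $m = 1$, where every residue is 0, is handled consistently.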

Exercise 23.6. For all $a \in \{1, 2, 3, 4, 5, 6\}$ find the inverse $a^*$ of $a$ modulo 7 by use of Exercise 23.5. Choose $a^*$ in each case so that $1 \le a^* \le 6$.

Example. Note that Fermat's Little Theorem can be used to simplify the computation of $a^n \bmod p$ where $p$ is prime. Recall that if $a^n \equiv r \pmod{p}$ where $0 \le r < p$, then $a^n \bmod p = r$. We can do two things to simplify the computation:
(1) Replace $a$ by $a \bmod p$.
(2) Replace $n$ by $n \bmod (p - 1)$.

Suppose we want to calculate [...]$^{7865435} \bmod 11$. Note that [...] $\pmod{11}$, that is, [...] $\equiv 2 \pmod{11}$. Since $\gcd(2, 11) = 1$ we have $2^{10} \equiv 1 \pmod{11}$. Now $7865435 = (786543) \cdot 10 + 5$, so

$2^{7865435} \equiv 2^{(786543) \cdot 10 + 5} \equiv \left(2^{10}\right)^{786543} 2^5 \equiv 1^{786543} \cdot 2^5 \equiv 2^5 \pmod{11},$

and $2^5 = 32 \equiv 10 \pmod{11}$. Hence, [...]$^{7865435} \equiv 2^{7865435} \equiv 10 \pmod{11}$. It follows that [...]$^{7865435} \bmod 11 = 10$.

Exercise 23.7. Use the technique in the above example to calculate [...] $\bmod 13$. [Here you cannot use the mod 11 trick, of course.]
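The two simplifications in the example (reduce the base modulo $p$, reduce the exponent modulo $p - 1$) can be written as a short function. This is a Python sketch with an invented name; the exponent reduction is justified by Fermat's Little Theorem only when the base is not divisible by $p$, so that case is handled separately. The base 2 is used below because the example reduces its base to 2 modulo 11.

```python
def power_mod_prime(a: int, n: int, p: int) -> int:
    """a**n mod p for a prime p and n >= 0, using the two reductions."""
    a %= p                          # (1) replace a by a mod p
    if a == 0:                      # p | a: Fermat's theorem does not apply
        return 0 if n > 0 else 1 % p
    return pow(a, n % (p - 1), p)   # (2) replace n by n mod (p - 1)


print(power_mod_prime(2, 7865435, 11))   # 10, as in the example
```

Here $7865435 \bmod 10 = 5$, so the call reduces to $2^5 \bmod 11$.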

Chapter 24
Probabilistic Primality Tests

According to Fermat's Little Theorem, if $p$ is prime and $1 \le a \le p - 1$, then $a^{p-1} \equiv 1 \pmod{p}$. The converse is also true in the following sense:

Theorem 24.1. If $m \ge 2$ and for all $a$ such that $1 \le a \le m - 1$ we have

$a^{m-1} \equiv 1 \pmod{m}$

then $m$ must be prime.

Proof. If the hypothesis holds, then for all $a$ with $1 \le a \le m - 1$, we know that $a$ has an inverse modulo $m$, namely, $a^{m-2}$ is an inverse for $a$ modulo $m$. By Theorem 18.2, this says that for $1 \le a \le m - 1$, $\gcd(a, m) = 1$. But if $m$ were not prime, then we would have $m = ab$ with $1 < a < m$, $1 < b < m$. Then $\gcd(a, m) = a > 1$, a contradiction. So $m$ must be prime.

Using the above theorem to check that $p$ is prime we would have to check that $a^{p-1} \equiv 1 \pmod{p}$ for $a = 1, 2, 3, \dots, p - 1$. This is a lot of work. Suppose we just know that $2^{m-1} \equiv 1 \pmod{m}$ for some $m > 2$. Must $m$ be prime? Unfortunately, the answer is no. The smallest composite $m$ satisfying $2^{m-1} \equiv 1 \pmod{m}$ is $m = 341$.

Exercise 24.1. Use Maple (or do it by hand and/or calculator) to verify that $2^{340} \equiv 1 \pmod{341}$ and that 341 is not prime.

The moral is that even if $2^{m-1} \equiv 1 \pmod{m}$, the number $m$ need not be prime. On the other hand, consider the case of $m = 63$. Note that $2^6 = 64 \equiv 1 \pmod{63}$. Hence, $2^6 \equiv 1 \pmod{63}$. Raising both sides to the 10th power we have $2^{60} \equiv 1 \pmod{63}$. Then multiplying both sides by $2^2$ we get $2^{62} \equiv 4 \pmod{63}$. Since $4 \not\equiv 1 \pmod{63}$ we have $2^{62} \not\equiv 1 \pmod{63}$. This tells us that 63 is not prime, without factoring 63. We emphasize that in general if $2^{m-1} \not\equiv 1 \pmod{m}$ then we can be sure that $m$ is not prime.

FACT. There are 455,052,511 odd primes $p \le 10^{10}$, all of which satisfy $2^{p-1} \equiv 1 \pmod{p}$. There are only 14,884 composite numbers $2 < m \le 10^{10}$ that satisfy $2^{m-1} \equiv 1 \pmod{m}$. Thus, if $2 < m \le 10^{10}$ and $m$ satisfies $2^{m-1} \equiv 1 \pmod{m}$, the probability $m$ is prime is

$\frac{455{,}052{,}511}{455{,}052{,}511 + 14{,}884} \approx 0.99997.$

In other words, if you find that $2^{m-1} \equiv 1 \pmod{m}$, then it is highly likely (but not a certainty) that $m$ is prime, at least when $m \le 10^{10}$. Thus the following Maple procedure will almost always give the correct answer:

    > is_prob_prime:=proc(n)
        if n <=1 or Power(2,n-1) mod n <> 1 then
          return "not prime";
        else
          return "probably prime";
        end if;
      end proc:

Note that the Maple command Power(a,n-1) mod n is an efficient way to compute $a^{n-1} \bmod n$. We discuss this in more detail later. The procedure is_prob_prime(n) just defined returns "probably prime" if $2^{n-1} \bmod n = 1$ and "not prime" if $n \le 1$ or if $2^{n-1} \bmod n \ne 1$. If the answer is "not prime" (and $n \ne 2$), then we know definitely that $n$ is not prime. If the answer is "probably prime", we know that there is a very small probability that $n$ is not prime.

In practice, there are better probabilistic primality tests than that mentioned above. For more details see, for example, Elementary Number Theory, Fourth Edition, by Kenneth Rosen.

The built-in Maple procedure isprime is a very sophisticated probabilistic primality test. The command isprime(n) returns false if n is not prime and returns true if n is probably prime. So far no one has found an integer n for which isprime(n) gives the wrong answer.

One might ask what happens if we use 3 instead of 2 in the above probabilistic primality test. Or, better yet, what if we evaluate $a^{m-1} \bmod m$ for several different values of $a$? Consider the following data:

The number of primes $\le 10^6$ is 78,498.

The number of composite numbers $m \le 10^6$ such that $2^{m-1} \equiv 1 \pmod{m}$ is 245.

The number of composite numbers $m \le 10^6$ such that $2^{m-1} \equiv 1 \pmod{m}$ and $3^{m-1} \equiv 1 \pmod{m}$ is 66.

The number of composite numbers $m \le 10^6$ such that $a^{m-1} \equiv 1 \pmod{m}$ for all $a \in \{2, 3, 5, 7, 11, 13, 17, 19, 31, 37, 41\}$ is 0.

Thus, we have the following result: If $1 < m \le 10^6$ and $a^{m-1} \equiv 1 \pmod{m}$ for all $a \in \{2, 3, 5, 7, 11, 13, 17, 19, 31, 37, 41\}$, then $m$ is prime.

The above results for $m \le 10^6$ were found using Maple. If $m > 10^6$ and $a^{m-1} \equiv 1 \pmod{m}$ for all $a \in \{2, 3, 5, 7, 11, 13, 17, 19, 31, 37, 41\}$, it is highly likely, but not certain, that $m$ is prime. Actually the primality test isprime that is built into Maple uses a somewhat different idea.

Exercise. Use Maple to show that

(1) $3^{90} \equiv 1 \pmod{91}$, but 91 is not prime.

(2) $2^{m-1} \equiv 1 \pmod{m}$ and $3^{m-1} \equiv 1 \pmod{m}$ for $m = 1105$, but 1105 is not prime.

[Hints. Note that $a^n \equiv 1 \pmod{m} \iff a^n \bmod m = 1$. In Maple, $3^{90}$ is written 3^90 and $3^{90} \bmod 91$ is written 3^90 mod 91. A faster way to compute $a^n \bmod m$ in Maple is to use the command Power(a,n) mod m. Recall that ifactor(m) is the command to factor m.]
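The idea of testing several bases at once can be explored on a small scale with a short Python sketch. The helper names `passes_fermat`, `is_prime_trial` and `fermat_liars` are hypothetical, introduced only for this illustration, and the bound 2000 is chosen to keep the run fast rather than to reproduce the $10^6$ counts quoted above.

```python
def passes_fermat(m: int, bases) -> bool:
    """True if a**(m-1) is congruent to 1 mod m for every base a given."""
    return all(pow(a, m - 1, m) == 1 for a in bases)


def is_prime_trial(m: int) -> bool:
    """Slow but certain primality check by trial division."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def fermat_liars(limit: int, bases) -> list:
    """Composite m <= limit that nevertheless pass the test for all bases."""
    return [m for m in range(3, limit + 1)
            if not is_prime_trial(m) and passes_fermat(m, bases)]


if __name__ == "__main__":
    print(fermat_liars(2000, [2]))      # composites fooling base 2
    print(fermat_liars(2000, [2, 3]))   # composites fooling bases 2 and 3
```

Adding a second base shrinks the list of misleading composites, which is the effect the data for $m \le 10^6$ illustrates.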

Chapter 25

The Base b Representation of n

Definition. Let $b \ge 2$ and $n > 0$. We write
$$(1)\qquad n = [a_k, a_{k-1}, \ldots, a_1, a_0]_b$$
if and only if for some $k \ge 0$
$$n = a_k b^k + a_{k-1} b^{k-1} + \cdots + a_1 b + a_0$$
where $a_i \in \{0, 1, \ldots, b-1\}$ for $i = 0, 1, \ldots, k$. $[a_k, a_{k-1}, \ldots, a_1, a_0]_b$ is called a base $b$ representation of $n$.

Remark. Base $b$ is called binary if $b = 2$, ternary if $b = 3$, octal if $b = 8$, decimal if $b = 10$, and hexadecimal if $b = 16$. If $b$ is understood, especially if $b = 10$, we write $a_k a_{k-1} \cdots a_1 a_0$ in place of $[a_k, a_{k-1}, \ldots, a_1, a_0]_{10}$. In the case of $b = 16$, which is used frequently in computer science, the digits 10, 11, 12, 13, 14 and 15 are replaced by A, B, C, D, E and F, respectively. For a fixed base $b \ge 2$, the numbers $a_i \in \{0, 1, 2, \ldots, b-1\}$ in equation (1) are called the digits of the base $b$ representation of $n$. In the binary case $a_i \in \{0, 1\}$ and the $a_i$'s are called bits (binary digits).

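Here are a few examples:

(1) $267 = [5, 3, 1]_7$ since $267 = 5 \cdot 7^2 + 3 \cdot 7 + 1$.

(2) $147 = [1, 0, 0, 1, 0, 0, 1, 1]_2$ since $147 = 1 \cdot 2^7 + 0 \cdot 2^6 + 0 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2 + 1$.

(3) $4879 = [4, 8, 7, 9]_{10}$ since $4879 = 4 \cdot 10^3 + 8 \cdot 10^2 + 7 \cdot 10 + 9$.

(4) $10705679 = [A, 3, 5, B, 0, F]_{16}$ since $10705679 = 10 \cdot 16^5 + 3 \cdot 16^4 + 5 \cdot 16^3 + 11 \cdot 16^2 + 0 \cdot 16 + 15$.

(5) $107056791 = [107, 56, 791]_{1000}$ since $107056791 = 107 \cdot 1000^2 + 56 \cdot 1000 + 791$.

Theorem. If $b \ge 2$, then every $n > 0$ has a unique base $b$ representation of the form $n = [a_k, \ldots, a_1, a_0]_b$ with $a_k > 0$.

Proof. Apply repeatedly the Division Algorithm as follows:
$$\begin{aligned}
n &= b q_0 + r_0, & 0 \le r_0 < b,\\
q_0 &= b q_1 + r_1, & 0 \le r_1 < b,\\
q_1 &= b q_2 + r_2, & 0 \le r_2 < b,\\
&\;\;\vdots\\
q_{k-1} &= b q_k + r_k, & 0 \le r_k < b.
\end{aligned}$$
Since $b \ge 2$, as long as the quotients are positive we have
$$n > q_0 > q_1 > \cdots > q_k.$$
Since this cannot go on forever we eventually obtain $q_l = 0$ for some $l$. Then we have $q_{l-1} = b \cdot 0 + r_l$. I claim that $n = [r_l, r_{l-1}, \ldots, r_0]_b$ if $l$ is the smallest integer such that $q_l = 0$. To see this, note that $n = b q_0 + r_0$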

and $q_0 = b q_1 + r_1$. Hence
$$n = b(b q_1 + r_1) + r_0 = b^2 q_1 + b r_1 + r_0.$$
Continuing in this way we find that
$$n = b^{l+1} q_l + b^l r_l + \cdots + b r_1 + r_0.$$
And, since $q_l = 0$ we have
$$(*)\qquad n = b^l r_l + \cdots + b r_1 + r_0,$$
which shows that $n = [r_l, \ldots, r_1, r_0]_b$.

To see that this representation is unique, note that from $(*)$ we have
$$n = b\left(b^{l-1} r_l + \cdots + r_1\right) + r_0, \qquad 0 \le r_0 < b.$$
By the Division Algorithm it follows that $r_0$ is uniquely determined by $n$, as is the quotient $q = b^{l-1} r_l + \cdots + r_1$. A similar argument shows that $r_1$ is uniquely determined. Continuing in this way we see that all the digits $r_l, r_{l-1}, \ldots, r_0$ are uniquely determined.

Example. (1) We find the base 7 representation of 1,749.
$$\begin{aligned}
1749 &= 249 \cdot 7 + 6\\
249 &= 35 \cdot 7 + 4\\
35 &= 5 \cdot 7 + 0\\
5 &= 0 \cdot 7 + 5
\end{aligned}$$
Hence $1749 = [5, 0, 4, 6]_7$.

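(2) We find the base 12 representation of 19,151.
$$\begin{aligned}
19{,}151 &= 1{,}595 \cdot 12 + 11\\
1{,}595 &= 132 \cdot 12 + 11\\
132 &= 11 \cdot 12 + 0\\
11 &= 0 \cdot 12 + 11
\end{aligned}$$
Hence $19{,}151 = [11, 0, 11, 11]_{12}$.

(3) Find the base 10 representation of 1,203.
$$\begin{aligned}
1203 &= 120 \cdot 10 + 3\\
120 &= 12 \cdot 10 + 0\\
12 &= 1 \cdot 10 + 2\\
1 &= 0 \cdot 10 + 1
\end{aligned}$$
Hence $1203 = [1, 2, 0, 3]_{10}$.

(4) Find the base 2 (binary) representation of 137.
$$\begin{aligned}
137 &= 2 \cdot 68 + 1\\
68 &= 2 \cdot 34 + 0\\
34 &= 2 \cdot 17 + 0\\
17 &= 2 \cdot 8 + 1\\
8 &= 2 \cdot 4 + 0\\
4 &= 2 \cdot 2 + 0\\
2 &= 2 \cdot 1 + 0\\
1 &= 2 \cdot 0 + 1
\end{aligned}$$
Hence $137 = [1, 0, 0, 0, 1, 0, 0, 1]_2$.

Exercise. Generalize the following observations:
$$3 = [1, 1]_2,\quad 7 = [1, 1, 1]_2,\quad 15 = [1, 1, 1, 1]_2,\quad 31 = [1, 1, 1, 1, 1]_2,\quad 63 = [1, 1, 1, 1, 1, 1]_2.$$
Prove your generalization. [HINT: See Exercise 2.5 on page 6.]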
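The repeated-division procedure used in these examples is easy to mechanize. The following Python sketch is an illustration under stated assumptions, not part of the original text: the function name `to_base` is hypothetical, and digits are returned most significant first to match the $[a_k, \ldots, a_0]_b$ notation.

```python
def to_base(n: int, b: int) -> list:
    """Digits [a_k, ..., a_1, a_0] of n in base b, by repeated division."""
    if n <= 0 or b < 2:
        raise ValueError("need n > 0 and b >= 2")
    digits = []
    while n > 0:
        n, r = divmod(n, b)   # n = b*q + r with 0 <= r < b; continue with q
        digits.append(r)      # remainders come out least significant first
    return digits[::-1]       # reverse so the leading digit is first


if __name__ == "__main__":
    print(to_base(1749, 7))
    print(to_base(19151, 12))
    print(to_base(137, 2))
```

The loop stops exactly when the quotient reaches 0, which corresponds to the smallest $l$ with $q_l = 0$ in the proof of the theorem.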

Exercise. Generalize the following observation:
$$8 = [2, 2]_3,\quad 26 = [2, 2, 2]_3,\quad 80 = [2, 2, 2, 2]_3,\quad 242 = [2, 2, 2, 2, 2]_3.$$
Prove your generalization. [HINT: See Exercise 2.5 on page 6.]

Exercise. Generalize Exercises 25.1 and 25.2 to an arbitrary base $b \ge 2$.

Remark. To find the binary representation of a small number, the following method is often easier than the above method: Given $n > 0$ let $2^{n_1}$ be the largest power of 2 satisfying $2^{n_1} \le n$. Let $2^{n_2}$ be the largest power of 2 satisfying
$$2^{n_2} \le n - 2^{n_1}.$$
Let $2^{n_3}$ be the largest power of 2 satisfying
$$2^{n_3} \le n - 2^{n_1} - 2^{n_2}.$$
Note that at this point we have
$$0 \le n - (2^{n_1} + 2^{n_2} + 2^{n_3}) < n - (2^{n_1} + 2^{n_2}) < n - 2^{n_1} < n.$$
Continuing in this way, eventually we get
$$0 = n - (2^{n_1} + 2^{n_2} + \cdots + 2^{n_k}).$$
Then $n = 2^{n_1} + 2^{n_2} + \cdots + 2^{n_k}$, and this gives the binary representation of $n$.

Example. Take $n = 137$. Note that $2^1 = 2$, $2^2 = 4$, $2^3 = 8$, $2^4 = 16$, $2^5 = 32$, $2^6 = 64$, $2^7 = 128$, and $2^8 = 256$. Using the above method we compute:
$$137 - 2^7 = 137 - 128 = 9, \qquad 9 - 2^3 = 9 - 8 = 1, \qquad 1 - 2^0 = 0.$$
So we have $137 = 128 + 9 = 128 + 8 + 1$, that is, $137 = 2^7 + 2^3 + 2^0$. So $137 = [1, 0, 0, 0, 1, 0, 0, 1]_2$.
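The subtract-the-largest-power method of the remark can be sketched in a few lines of Python. The function name `binary_exponents` is hypothetical, and using `int.bit_length` to find the largest power of 2 not exceeding $n$ is an implementation choice made for this sketch, not something taken from the text.

```python
def binary_exponents(n: int) -> list:
    """Exponents n_1 > n_2 > ... > n_k with n = 2**n_1 + ... + 2**n_k."""
    if n <= 0:
        raise ValueError("need n > 0")
    exponents = []
    while n > 0:
        e = n.bit_length() - 1   # largest e with 2**e <= n
        exponents.append(e)
        n -= 1 << e              # subtract that power of 2 and repeat
    return exponents


if __name__ == "__main__":
    print(binary_exponents(137))
```

Each exponent in the returned list marks a position where the binary representation has a 1; all other positions hold 0.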

Exercise. Show how to use both methods to find the binary representation of 455.

Exercise. Make a vertical list of the binary representation of the integers 1 to 16.

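Chapter 26

Computation of $a^N \bmod m$

Let's first consider the question: What is the smallest number of multiplications required to compute $a^N$ where $N$ is any positive integer? Suppose we want to calculate $2^8$. One way is to perform the following 7 multiplications:
$$\begin{aligned}
2^2 &= 2 \cdot 2 = 4\\
2^3 &= 2^2 \cdot 2 = 4 \cdot 2 = 8\\
2^4 &= 2^3 \cdot 2 = 8 \cdot 2 = 16\\
2^5 &= 2^4 \cdot 2 = 16 \cdot 2 = 32\\
2^6 &= 2^5 \cdot 2 = 32 \cdot 2 = 64\\
2^7 &= 2^6 \cdot 2 = 64 \cdot 2 = 128\\
2^8 &= 2^7 \cdot 2 = 128 \cdot 2 = 256
\end{aligned}$$
But we can do it in only 3 multiplications:
$$\begin{aligned}
2^2 &= 2 \cdot 2 = 4\\
2^4 &= (2^2)^2 = 4 \cdot 4 = 16\\
2^8 &= (2^4)^2 = 16 \cdot 16 = 256
\end{aligned}$$
In general, using the method
$$a^2 = a \cdot a,\quad a^3 = a^2 \cdot a,\quad a^4 = a^3 \cdot a,\quad \ldots,\quad a^n = a^{n-1} \cdot a$$
requires $n - 1$ multiplications to compute $a^n$.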

On the other hand if $n = 2^k$ then we can compute $a^n$ by successive squaring with only $k$ multiplications:
$$\begin{aligned}
a^2 &= a \cdot a\\
a^{2^2} &= (a^2)^2 = a^2 \cdot a^2\\
a^{2^3} &= (a^{2^2})^2 = a^{2^2} \cdot a^{2^2}\\
&\;\;\vdots\\
a^{2^k} &= (a^{2^{k-1}})^2 = a^{2^{k-1}} \cdot a^{2^{k-1}}
\end{aligned}$$
Note that the fact that
$$2^k = 2^{k-1} \cdot 2 = 2^{k-1} + 2^{k-1}$$
together with the Laws of Exponents
$$(a^n)^m = a^{nm} \quad\text{and}\quad a^n a^m = a^{n+m}$$
is what makes this method work. Note that if $n = 2^k$ then $k$ is generally a lot smaller than $n - 1$. For example, $1024 = 2^{10}$ and 10 is quite a bit smaller than 1023.

If $n$ is not a power of 2 we can use the following method to compute $a^n$.

The Binary Method for Exponentiation. Let $n$ be a positive integer. Let $x$ be any real number. This is a method for computing $x^n$.

Step 1. Find the binary representation for $n$:
$$n = [a_r, a_{r-1}, \ldots, a_0]_2.$$

Step 2. Compute the powers $x^2, x^{2^2}, x^{2^3}, \ldots, x^{2^r}$ by successive squaring as shown above.

Step 3. Compute the product
$$x^n = x^{a_r 2^r}\, x^{a_{r-1} 2^{r-1}} \cdots x^{a_1 2}\, x^{a_0}.$$
[Note each $a_i$ is 0 or 1, so all needed factors were obtained in Step 2.]

Example. Let's compute $3^{15}$. Note that $15 = 2^3 + 2^2 + 2 + 1 = [1, 1, 1, 1]_2$. So this takes care of Step 1. For Step 2, we note that
$$\begin{aligned}
3^2 &= 3 \cdot 3 = 9\\
3^{2^2} &= 9 \cdot 9 = 81\\
3^{2^3} &= 81 \cdot 81 = 6561
\end{aligned}$$
So $3^{15} = 3^{2^3} \cdot 3^{2^2} \cdot 3^2 \cdot 3$. For this we need 3 multiplications:
$$\begin{aligned}
3^2 \cdot 3 &= 9 \cdot 3 = 27\\
3^{2^2} \cdot (3^2 \cdot 3) &= 81 \cdot 27 = 2187\\
3^{2^3} \cdot \left(3^{2^2} \cdot 3^2 \cdot 3\right) &= 6561 \cdot 2187 = 14348907
\end{aligned}$$
So we have $3^{15} = 14348907$. Note that we have used just 6 multiplications, which is less than the 14 it would take if we used the naive method. Let's not forget that some additional effort was needed to compute the binary representation of 15, but not much.

Theorem. Computing $x^n$ using the binary method requires $\lfloor \log_2(n) \rfloor$ applications of the Division Algorithm and at most $2\lfloor \log_2(n) \rfloor$ multiplications.

Proof. If $n = [a_r, \ldots, a_0]_2$, $a_r = 1$, then $n = 2^r + \cdots + a_1 2 + a_0$. Hence
$$(*)\qquad 2^r \le n \le 2^r + 2^{r-1} + \cdots + 2 + 1 = 2^{r+1} - 1 < 2^{r+1}.$$
Since $\log_2(2^x) = x$ and when $0 < a < b$ we have $\log_2(a) < \log_2(b)$, we have from $(*)$ that
$$\log_2(2^r) \le \log_2(n) < \log_2\left(2^{r+1}\right)$$

or $r \le \log_2(n) < r + 1$. Hence $r = \lfloor \log_2(n) \rfloor$. Note that $r$ is the number of times we need to apply the Division Algorithm to obtain the binary representation $n = [a_r, \ldots, a_0]_2$, $a_r = 1$. To compute the powers $x, x^2, x^{2^2}, \ldots, x^{2^r}$ by successive squaring requires $r = \lfloor \log_2(n) \rfloor$ multiplications and similarly to compute the product
$$x^{2^r}\, x^{a_{r-1} 2^{r-1}} \cdots x^{a_1 2}\, x^{a_0}$$
requires $r$ multiplications. So after obtaining the binary representation we need at most $2r = 2\lfloor \log_2(n) \rfloor$ multiplications.

Use of a calculator to compute $\lfloor \log_2(x) \rfloor$: To find $\log_2(x)$ one may use the formula
$$\log_2(x) = \frac{1}{\ln(2)} \ln(x) \quad\text{or}\quad \lfloor \log_2(x) \rfloor = \left\lfloor \frac{\ln(x)}{\ln(2)} \right\rfloor$$
where $\ln(x)$ is the natural logarithm of $x$. For small values of $x$ it is sometimes faster to use the fact that $r = \lfloor \log_2(x) \rfloor$ is equivalent to $2^r \le x < 2^{r+1}$, that is, $r$ is the largest integer such that $2^r \le x$. The Maple command for $\log_2(x)$ is log[2](x).

Note that if we count an application of the Division Algorithm and a multiplication as the same, the above tells us that we need at most $3\lfloor \log_2(n) \rfloor$ operations to compute $x^n$. So, for example, if $n = 10^6$, then it is easy to see that $3\lfloor \log_2(n) \rfloor = 57$. So we may compute $x^{1,000,000}$ with only 57 operations.

Exercise. Calculate $3\lfloor \log_2(n) \rfloor$ for $n = 2{,}000{,}000$.

Exercise. Use the binary method to compute

Exercise. Approximately how many operations would be required to compute $2^n$ when n = ? Explain.

Exercise. Note that 6 multiplications are used to compute $3^{15}$ using the binary method. Show that one can compute $3^{15}$ with fewer than 6 multiplications. [You will have to experiment.]
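To make the multiplication count concrete, here is a minimal Python sketch of the three-step binary method that also tallies how many multiplications it performs. The function name `binary_power` and the returned pair `(value, multiplications)` are assumptions made for this illustration; the bits are read with Python's built-in `bin` instead of applying the Division Algorithm by hand.

```python
def binary_power(x, n: int):
    """Compute x**n by the binary method; return (value, multiplications)."""
    if n <= 0:
        raise ValueError("need n > 0")
    bits = [int(c) for c in bin(n)[2:]]   # Step 1: [a_r, ..., a_0]
    r = len(bits) - 1
    mults = 0

    # Step 2: successive squaring gives x^(2^0), x^(2^1), ..., x^(2^r).
    powers = [x]
    for _ in range(r):
        powers.append(powers[-1] * powers[-1])
        mults += 1

    # Step 3: multiply together the powers x^(2^i) for which a_i = 1.
    result = None
    for i in range(r + 1):
        if bits[r - i]:                   # bits[r - i] is the digit a_i
            if result is None:
                result = powers[i]
            else:
                result = result * powers[i]
                mults += 1
    return result, mults


if __name__ == "__main__":
    print(binary_power(3, 15))
    print(binary_power(2, 8))
```

For $3^{15}$ the sketch performs 3 squarings and 3 further products, matching the count of 6 in the example, and the total never exceeds $2\lfloor \log_2(n) \rfloor$.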

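Computing $a^n \bmod m$. We use the binary method for exponentiation with the added trick that after every multiplication we reduce modulo $m$, that is, we divide by $m$ and take the remainder. This keeps the products from getting too big.

Example. We compute $3^{15} \bmod 10$:
$$\begin{aligned}
3^2 &= 3 \cdot 3 = 9 \equiv 9 \pmod{10}\\
3^4 &= 9 \cdot 9 = 81 \equiv 1 \pmod{10}\\
3^8 &\equiv 1 \cdot 1 \equiv 1 \pmod{10}\\
3^{15} &= 3^8 \cdot 3^4 \cdot 3^2 \cdot 3 \equiv 1 \cdot 1 \cdot 9 \cdot 3 = 27 \equiv 7 \pmod{10}.
\end{aligned}$$
Note that $3^{15} \equiv 7 \pmod{10}$ implies that $3^{15} \bmod 10 = 7$. [Recall that on page 109 we calculated that $3^{15} = 14348907$, which is clearly congruent to 7 mod 10, but the multiplications were not so easy.]

Example. Let's find $2^{644} \bmod 645$. It is easy to see that
$$644 = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]_2.$$
That is, $644 = 2^9 + 2^7 + 2^2 = 512 + 128 + 4$. Now by successive squaring and reducing modulo 645 we get
$$\begin{aligned}
2^2 &= 2 \cdot 2 = 4 \equiv 4 \pmod{645}\\
2^4 &\equiv 4^2 = 16 \pmod{645}\\
2^8 &\equiv 16^2 = 256 \pmod{645}\\
2^{16} &\equiv 256^2 = 65{,}536 \equiv 391 \pmod{645}\\
2^{32} &\equiv 391^2 = 152{,}881 \equiv 16 \pmod{645}\\
2^{64} &\equiv 16^2 = 256 \pmod{645}\\
2^{128} &\equiv 256^2 = 65{,}536 \equiv 391 \pmod{645}\\
2^{256} &\equiv 391^2 = 152{,}881 \equiv 16 \pmod{645}\\
2^{512} &\equiv 16^2 = 256 \pmod{645}.
\end{aligned}$$
Now $644 = 512 + 128 + 4$, hence
$$2^{644} = 2^{512} \cdot 2^{128} \cdot 2^4 \equiv 256 \cdot 391 \cdot 16 \pmod{645}.$$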

So
$$256 \cdot 391 = 100{,}096 \equiv 121 \pmod{645}$$
and
$$121 \cdot 16 = 1936 \equiv 1 \pmod{645},$$
so we have $2^{644} \equiv 1 \pmod{645}$. Hence $2^{644} \bmod 645 = 1$.

Exercise. Calculate mod 10.

Exercise. Calculate mod 100.

Exercise. If you multiplied out $2^{517}$, how many decimal digits would you obtain? [See Exercise 4.3 on page 14.]

Exercise. Note that on page 96 we calculated mod 11 with very few multiplications. Why can we not use that method to compute mod 12?
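The reduce-after-every-multiplication idea can be sketched in Python as follows. This is an illustration, not the text's procedure verbatim: the function name `power_mod` is hypothetical, and the loop reads the bits of the exponent from least significant to most significant, a common variant that avoids storing the list of squares. Python's built-in `pow(a, n, m)` computes the same value.

```python
def power_mod(a: int, n: int, m: int) -> int:
    """Compute a**n mod m by repeated squaring, reducing at every step."""
    if n < 0 or m <= 0:
        raise ValueError("need n >= 0 and m > 0")
    result = 1 % m            # handles m == 1, where everything is 0
    square = a % m            # holds a^(2^i) mod m on the i-th pass
    while n > 0:
        if n & 1:             # current binary digit of n is 1
            result = (result * square) % m
        square = (square * square) % m
        n >>= 1               # move to the next binary digit
    return result


if __name__ == "__main__":
    print(power_mod(3, 15, 10))
    print(power_mod(2, 644, 645))
```

Because every intermediate product is reduced, no number larger than $(m-1)^2$ ever appears, regardless of how large the exponent is.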

Chapter 27

The RSA Scheme

In this chapter we discuss the basis of the so-called RSA scheme. This is the most important example of a public key cryptographic scheme. The RSA scheme is due to R. Rivest, A. Shamir and L. Adleman [1] and was discovered by them in 1977. We show how to implement it in more detail later using Maple. Here we give the number-theoretic underpinning of the scheme.

We assume that the message we wish to send has been converted to an integer in the set
$$J_m = \{0, 1, 2, \ldots, m-1\}$$
where $m$ is some positive integer to be determined. Generally this is a large integer. We will require two functions:
$$E : J_m \to J_m \quad (E \text{ for encipher})$$
and
$$D : J_m \to J_m \quad (D \text{ for decipher}).$$
To be able to use $D$ to decipher what $E$ has enciphered we need to have $D(E(x)) = x$ for all $x \in J_m$. To show how $m$, $E$, and $D$ are chosen we first prove a lemma:

Lemma. Let $p$ and $q$ be any two distinct primes and let $m = pq$. Let $e$ and $d$ be any two positive integers which are inverses of each other modulo $\varphi(m)$. Then
$$x^{ed} \equiv x \pmod{m}$$
for all $x$.

[1] A copy of the paper "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" may be downloaded from

Proof. By Theorem 22.6, $\varphi(m) = (p-1)(q-1)$. Since $ed \equiv 1 \pmod{\varphi(m)}$ we have
$$ed - 1 = k\varphi(m) = k(p-1)(q-1)$$
for some $k$. Note $k > 0$ unless $ed = 1$, in which case the lemma is obvious. So we have
$$(*)\qquad ed = k\varphi(m) + 1 = k(p-1)(q-1) + 1$$
for some $k > 0$. Now by Fermat's Little Theorem, if $\gcd(x, p) = 1$ we have
$$x^{p-1} \equiv 1 \pmod{p}$$
and raising both sides of the congruence to the power $(q-1)k$ we obtain
$$x^{(p-1)(q-1)k} \equiv 1 \pmod{p}$$
and multiplying both sides by $x$ we have
$$x^{(p-1)(q-1)k+1} \equiv x \pmod{p}.$$
That is, by $(*)$,
$$(**)\qquad x^{ed} \equiv x \pmod{p}.$$
Now we proved $(**)$ when $\gcd(x, p) = 1$, but if $\gcd(x, p) = p$ it is obvious since then $x \equiv 0 \pmod{p}$. So in all cases $(**)$ holds. A similar argument proves that for all $x$
$$x^{ed} \equiv x \pmod{q}.$$
So by Exercise 15.11, page 63, since $\gcd(p, q) = 1$ we have
$$x^{ed} \equiv x \pmod{m}$$
for all $x$.

Theorem. Let $J_m = \{0, 1, 2, \ldots, m-1\}$ and define $E : J_m \to J_m$ by
$$E(x) = x^e \bmod m$$
and $D : J_m \to J_m$ by
$$D(x) = x^d \bmod m.$$
Then $E$ and $D$ are inverses of each other if $m$, $e$ and $d$ are as in Lemma 27.1.

Proof. It suffices to show that $D(E(x)) = x$ for all $x \in J_m$. Let $x \in J_m$ and let $E(x) = x^e \bmod m = r_1$. Also let $D(r_1) = r_1^d \bmod m = r_2$. We must show that $r_2 = x$. Since $x^e \bmod m = r_1$ we know that $x^e \equiv r_1 \pmod{m}$. Hence $x^{ed} \equiv r_1^d \pmod{m}$. We also know that $r_1^d \equiv r_2 \pmod{m}$. Hence $x^{ed} \equiv r_2 \pmod{m}$. By Lemma 27.1, $x^{ed} \equiv x \pmod{m}$, so we have $x \equiv r_2 \pmod{m}$. Since both $x$ and $r_2$ are in $J_m$ we have by Exercise 15.5 that $x = r_2$. This completes the proof.

More details on the use of the RSA scheme will be given in the Maple worksheets which are available from the course website which may be reached from my home page:
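A toy instance makes the theorem easy to check by machine. The Python sketch below is purely illustrative: the primes 61 and 53 and the exponent 17 are small values chosen for this example (far too small for real use), the names `E`, `D` and `PHI` simply mirror the notation above, and `pow(e, -1, PHI)` for the modular inverse assumes Python 3.8 or later.

```python
P, Q = 61, 53                  # two distinct primes (toy-sized, an assumption)
M = P * Q                      # modulus m = pq
PHI = (P - 1) * (Q - 1)        # phi(m) = (p-1)(q-1)
E_EXP = 17                     # enciphering exponent e with gcd(e, phi(m)) = 1
D_EXP = pow(E_EXP, -1, PHI)    # deciphering exponent d, inverse of e mod phi(m)


def E(x: int) -> int:
    """Encipher: E(x) = x^e mod m."""
    return pow(x, E_EXP, M)


def D(x: int) -> int:
    """Decipher: D(x) = x^d mod m."""
    return pow(x, D_EXP, M)


if __name__ == "__main__":
    message = 65
    cipher = E(message)
    print(message, cipher, D(cipher))
```

Checking `D(E(x)) == x` for every `x` in `range(M)`, including the values that share a factor with $m$, confirms on this example that the lemma needs no coprimality assumption on $x$.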


Appendix A

Rings and Groups

The material in this appendix is optional reading. However, for the sake of completeness we state here the definition of a ring and the definition of a group. If you are interested in learning more you might take the course Elementary Abstract Algebra. Having had this course should make it a little easier to understand the ideas in abstract algebra and vice versa. For more details you may download the free book Elementary Abstract Algebra from my homepage: Alternatively, look in almost any book whose title contains the words Abstract Algebra or Modern Algebra. Look for one with Introductory or Elementary in the title.

Definition A.1. A ring is an ordered triple $(R, +, \cdot)$ where $R$ is a set and $+$ and $\cdot$ are binary operations on $R$ satisfying the following properties:

A1 $a + (b + c) = (a + b) + c$ for all $a, b, c$ in $R$.
A2 $a + b = b + a$ for all $a, b$ in $R$.
A3 There is an element $0 \in R$ satisfying $a + 0 = a$ for all $a$ in $R$.
A4 For every $a \in R$ there is an element $b \in R$ such that $a + b = 0$.
M1 $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for all $a, b, c$ in $R$.
D1 $a \cdot (b + c) = a \cdot b + a \cdot c$ for all $a, b, c$ in $R$.

D2 $(b + c) \cdot a = b \cdot a + c \cdot a$ for all $a, b, c$ in $R$.

Thus, to describe a ring one must specify three things:
1. a set,
2. a binary operation on the set called multiplication,
3. a binary operation on the set called addition.
Then, one must verify that the properties above are satisfied.

Example A.1. Here are some examples of rings. The two binary operations $+$ and $\cdot$ are in each case the ones that you are familiar with.
1. $(\mathbb{R}, +, \cdot)$, the ring of real numbers.
2. $(\mathbb{Q}, +, \cdot)$, the ring of rational numbers.
3. $(\mathbb{Z}, +, \cdot)$, the ring of integers.
4. $(\mathbb{Z}_n, +, \cdot)$, the ring of integers modulo $n$.
5. $(M_n(\mathbb{R}), +, \cdot)$, the ring of all $n \times n$ matrices over $\mathbb{R}$.

Definition A.2. A group is an ordered pair $(G, *)$ where $G$ is a set and $*$ is a binary operation on $G$ satisfying the following properties:
1. $x * (y * z) = (x * y) * z$ for all $x, y, z$ in $G$.
2. There is an element $e \in G$ satisfying $e * x = x$ and $x * e = x$ for all $x$ in $G$.
3. For each element $x$ in $G$ there is an element $y$ in $G$ satisfying $x * y = e$ and $y * x = e$.

Definition A.3. A group $(G, *)$ is said to be Abelian if $x * y = y * x$ for all $x, y \in G$.

Thus, to describe a group one must specify two things:
1. a set, and
2. a binary operation on the set.

Then, one must verify that the binary operation is associative, that there is an identity in the set, and that every element in the set has an inverse.

Example A.2. Here are some examples of groups. The binary operations are in each case the ones that you are familiar with.
1. $(\mathbb{Z}, +)$ is a group with identity 0. The inverse of $x \in \mathbb{Z}$ is $-x$.
2. $(\mathbb{Q}, +)$ is a group with identity 0. The inverse of $x \in \mathbb{Q}$ is $-x$.
3. $(\mathbb{R}, +)$ is a group with identity 0. The inverse of $x \in \mathbb{R}$ is $-x$.
4. $(\mathbb{Q} - \{0\}, \cdot)$ is a group with identity 1. The inverse of $x \in \mathbb{Q} - \{0\}$ is $x^{-1}$.
5. $(\mathbb{R} - \{0\}, \cdot)$ is a group with identity 1. The inverse of $x \in \mathbb{R} - \{0\}$ is $x^{-1}$.
6. $(\mathbb{Z}_n, +)$ is a group with identity 0. The inverse of $x \in \mathbb{Z}_n$ is $n - x$ if $x \ne 0$; the inverse of 0 is 0.
7. $(U_n, \cdot)$ is a group with identity $[1]$. The inverse of $[a] \in U_n$ was shown to exist in an earlier chapter.
8. $(\mathbb{R}^n, +)$ where $+$ is vector addition. The identity is the zero vector $(0, 0, \ldots, 0)$ and the inverse of the vector $x = (x_1, x_2, \ldots, x_n)$ is the vector $-x = (-x_1, -x_2, \ldots, -x_n)$.
9. $(M_n(\mathbb{R}), +)$. This is the group of all $n \times n$ matrices over $\mathbb{R}$ and $+$ is matrix addition.
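For a finite set, the verification described above can be carried out by brute force. The Python sketch below is only an illustration under stated assumptions: the names `is_group` and `units` are hypothetical, elements of $\mathbb{Z}_n$ and $U_n$ are represented by the integers $0, \ldots, n-1$ rather than by residue classes, and a closure check is included because the operation must map back into the set.

```python
from itertools import product
from math import gcd


def is_group(elements, op) -> bool:
    """Check closure, associativity, identity and inverses exhaustively."""
    elems = list(elements)
    members = set(elems)
    # The operation must be a binary operation on the set (closure).
    if any(op(x, y) not in members for x, y in product(elems, repeat=2)):
        return False
    # Property 1: associativity.
    if any(op(x, op(y, z)) != op(op(x, y), z)
           for x, y, z in product(elems, repeat=3)):
        return False
    # Property 2: a two-sided identity e.
    identities = [e for e in elems
                  if all(op(e, x) == x and op(x, e) == x for x in elems)]
    if not identities:
        return False
    e = identities[0]
    # Property 3: every element has a two-sided inverse.
    return all(any(op(x, y) == e and op(y, x) == e for y in elems)
               for x in elems)


def units(n: int) -> list:
    """Representatives of U_n: integers in 1..n-1 coprime to n (n >= 2)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


if __name__ == "__main__":
    print(is_group(range(6), lambda x, y: (x + y) % 6))
    print(is_group(units(12), lambda x, y: (x * y) % 12))
    print(is_group(range(1, 12), lambda x, y: (x * y) % 12))
```

The last call fails because, for instance, $2 \cdot 6 \equiv 0 \pmod{12}$ leaves the set $\{1, \ldots, 11\}$, which is why only the units are kept in $U_n$.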


Bibliography

[1] Tom Apostol, Introduction to Analytic Number Theory, Springer-Verlag, New York-Heidelberg,
[2] Chris Caldwell, The Primes Pages,
[3] W. Edwin Clark, Number Theory Links,
[4] Earl Fife and Larry Husch, Number Theory (Mathematics Archives),
[5] Ronald Graham, Donald Knuth, and Oren Patashnik, Concrete Mathematics, Addison-Wesley,
[6] Donald Knuth, The Art of Computer Programming, Vols I and II, Addison-Wesley,
[7] The Math Forum, Number Theory Sites
[8] Oystein Ore, Number Theory and its History, Dover Publications,
[9] Carl Pomerance and Richard Crandall, Prime Numbers: A Computational Perspective, Springer-Verlag,
[10] Kenneth A. Rosen, Elementary Number Theory (Fourth Edition), Addison-Wesley,
[11] Eric Weisstein, World of Mathematics, Number Theory Section,

Reading #2

Elementary Number Theory
William Stein
October 2005

To my students and my wife, Clarita Lefthand.

Contents

Preface

1 Prime Numbers
   Prime Factorization
   The Sequence of Prime Numbers
   Exercises

2 The Ring of Integers Modulo n
   Congruences Modulo n
   The Chinese Remainder Theorem
   Quickly Computing Inverses and Huge Powers
   Finding Primes
   The Structure of (Z/pZ)*
   Exercises

3 Public-Key Cryptography
   The Diffie-Hellman Key Exchange
   The RSA Cryptosystem
   Attacking RSA
   Exercises

4 Quadratic Reciprocity
   Statement of the Quadratic Reciprocity Law
   Euler's Criterion

   First Proof of Quadratic Reciprocity
   A Proof of Quadratic Reciprocity Using Gauss Sums
   Finding Square Roots
   Exercises

5 Continued Fractions
   Finite Continued Fractions
   Infinite Continued Fractions
   The Continued Fraction of e
   Quadratic Irrationals
   Recognizing Rational Numbers
   Sums of Two Squares
   Exercises

6 Elliptic Curves
   The Definition
   The Group Structure on an Elliptic Curve
   Integer Factorization Using Elliptic Curves
   Elliptic Curve Cryptography
   Elliptic Curves Over the Rational Numbers
   Exercises

7 Computational Number Theory
   Prime Numbers
   The Ring of Integers Modulo n
   Public-Key Cryptography
   Quadratic Reciprocity
   Continued Fractions
   Elliptic Curves
   Exercises

Answers and Hints

References


Preface

This is a textbook about prime numbers, congruences, basic public-key cryptography, quadratic reciprocity, continued fractions, elliptic curves, and number theory algorithms. We assume the reader has some familiarity with groups, rings, and fields, and for Chapter 7 some programming experience. This book grew out of an undergraduate course that the author taught at Harvard University in 2001 and

Notation and Conventions. We let $\mathbb{N} = \{1, 2, 3, \ldots\}$ denote the natural numbers, and use the standard notation $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ for the rings of integer, rational, real, and complex numbers, respectively. In this book we will use the words proposition, theorem, lemma, and corollary as follows. Usually a proposition is a less important or less fundamental assertion, a theorem a deeper culmination of ideas, a lemma something that we will use later in this book to prove a proposition or theorem, and a corollary an easy consequence of a proposition, theorem, or lemma.

Acknowledgements. Brian Conrad and Ken Ribet made a large number of clarifying comments and suggestions throughout the book. Baurzhan Bektemirov, Lawrence Cabusora, and Keith Conrad read drafts of this book and made many comments. Frank Calegari used the course when teaching Math 124 at Harvard, and he and his students provided much feedback. Noam Elkies made comments and suggested Exercise 4.5. Seth Kleinerman wrote a version of Section 5.3 as a class project. Samit Dasgupta, George Stephanides, Kevin Stern, and Heidi Williams all suggested corrections. I also benefited from conversations with Henry Cohn and David Savitt. I used Emacs, LaTeX, and Python in the preparation of this book.

1 Prime Numbers

In Section 1.1 we describe how the integers are built out of the prime numbers 2, 3, 5, 7, 11, .... In Section 1.2 we discuss theorems about the set of prime numbers, starting with Euclid's proof that this set is infinite, then explore the distribution of primes via the prime number theorem and the Riemann Hypothesis (without proofs).

1.1 Prime Factorization

Primes

The set of natural numbers is
$$\mathbb{N} = \{1, 2, 3, 4, \ldots\},$$
and the set of integers is
$$\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.$$

Definition (Divides). If $a, b \in \mathbb{Z}$ we say that $a$ divides $b$, written $a \mid b$, if $ac = b$ for some $c \in \mathbb{Z}$. In this case we say $a$ is a divisor of $b$. We say that $a$ does not divide $b$, written $a \nmid b$, if there is no $c \in \mathbb{Z}$ such that $ac = b$.

For example, we have $2 \mid 6$. Also, all integers divide 0, and 0 divides only 0. However, 3 does not divide 7 in $\mathbb{Z}$.

Remark. The notation $b \,\vdots\, a$ for "$b$ is divisible by $a$" is common in Russian literature on number theory.

Definition (Prime and Composite). An integer $n > 1$ is prime if the only positive divisors of $n$ are 1 and $n$. We call $n$ composite if $n$ is not prime. The number 1 is neither prime nor composite. The first few primes of $\mathbb{N}$ are

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, ...,

and the first few composites are

4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 25, 26, 27, 28, 30, 32, 33, 34, ....

Remark. J. H. Conway argues in [Con97, viii] that $-1$ should be considered a prime, and in the 1914 table [Leh14], Lehmer considers 1 to be a prime. In this book we consider neither $-1$ nor 1 to be prime.

Every natural number is built, in a unique way, out of prime numbers:

Theorem (Fundamental Theorem of Arithmetic). Every natural number can be written as a product of primes uniquely up to order.

Note that primes are the products with only one factor and 1 is the empty product.

Remark. Theorem 1.1.5, which we will prove in Section 1.1.4, is trickier to prove than you might first think. For example, unique factorization fails in the ring
$$\mathbb{Z}[\sqrt{-5}] = \{a + b\sqrt{-5} : a, b \in \mathbb{Z}\} \subset \mathbb{C},$$
where 6 factors into irreducible elements in two different ways:
$$2 \cdot 3 = 6 = (1 + \sqrt{-5}) \cdot (1 - \sqrt{-5}).$$

The Greatest Common Divisor

We will use the notion of greatest common divisor of two integers to prove that if $p$ is a prime and $p \mid ab$, then $p \mid a$ or $p \mid b$. Proving this is the key step in our proof of Theorem 1.1.5.

Definition (Greatest Common Divisor). Let
$$\gcd(a, b) = \max\{d \in \mathbb{Z} : d \mid a \text{ and } d \mid b\},$$
unless both $a$ and $b$ are 0, in which case $\gcd(0, 0) = 0$.

For example, $\gcd(1, 2) = 1$, $\gcd(6, 27) = 3$, and for any $a$, $\gcd(0, a) = \gcd(a, 0) = |a|$. If $a \ne 0$, the greatest common divisor exists because if $d \mid a$ then $d \le |a|$, and there are only $|a|$ positive integers $\le |a|$. Similarly, the gcd exists when $b \ne 0$.

Lemma. For any integers $a$ and $b$ we have
$$\gcd(a, b) = \gcd(b, a) = \gcd(\pm a, \pm b) = \gcd(a, b - a) = \gcd(a, b + a).$$

Proof. We only prove that $\gcd(a, b) = \gcd(a, b - a)$, since the other cases are proved in a similar way. Suppose $d \mid a$ and $d \mid b$, so there exist integers $c_1$ and $c_2$ such that $dc_1 = a$ and $dc_2 = b$. Then $b - a = dc_2 - dc_1 = d(c_2 - c_1)$, so $d \mid b - a$. Thus $\gcd(a, b) \le \gcd(a, b - a)$, since the set over which we are taking the max for $\gcd(a, b)$ is a subset of the set for $\gcd(a, b - a)$. The same argument with $a$ replaced by $-a$ and $b$ replaced by $b - a$ shows that $\gcd(a, b - a) = \gcd(-a, b - a) \le \gcd(-a, b) = \gcd(a, b)$, which proves that $\gcd(a, b) = \gcd(a, b - a)$.

Lemma. Suppose $a, b, n \in \mathbb{Z}$. Then $\gcd(a, b) = \gcd(a, b - an)$.

Proof. By repeated application of Lemma 1.1.8, we have
$$\gcd(a, b) = \gcd(a, b - a) = \gcd(a, b - 2a) = \cdots = \gcd(a, b - an).$$

Assume for the moment that we have already proved Theorem 1.1.5. A natural (and naive!) way to compute $\gcd(a, b)$ is to factor $a$ and $b$ as a product of primes using Theorem 1.1.5; then the prime factorization of $\gcd(a, b)$ can be read off from that of $a$ and $b$. For example, if $a = 2261$ and $b = 1275$, then $a = 7 \cdot 17 \cdot 19$ and $b = 3 \cdot 5^2 \cdot 17$, so $\gcd(a, b) = 17$. It turns out that the greatest common divisor of two integers, even huge numbers (millions of digits), is surprisingly easy to compute using the gcd algorithm given below, which computes $\gcd(a, b)$ without factoring $a$ or $b$.

To motivate that algorithm, we compute $\gcd(2261, 1275)$ in a different way. First, we recall a helpful fact.

Proposition. Suppose that $a$ and $b$ are integers with $b \ne 0$. Then there exist unique integers $q$ and $r$ such that $0 \le r < |b|$ and $a = bq + r$.

Proof. For simplicity, assume that both $a$ and $b$ are positive (we leave the general case to the reader). Let $Q$ be the set of all nonnegative integers $n$ such that $a - bn$ is nonnegative. Then $Q$ is nonempty because $0 \in Q$ and $Q$ is bounded because $a - bn < 0$ for all $n > a/b$. Let $q$ be the largest element of $Q$. Then $r = a - bq < b$, otherwise $q + 1$ would also be in $Q$. Thus $q$ and $r$ satisfy the existence conclusion.

To prove uniqueness, suppose for the sake of contradiction that $q'$ and $r' = a - bq'$ also satisfy the conclusion but that $q' \ne q$. Then $q' \in Q$ since $r' = a - bq' \ge 0$, so $q' < q$ and we can write $q' = q - m$ for some $m > 0$. But then $r' = a - bq' = a - b(q - m) = a - bq + bm = r + bm \ge b$ since $r \ge 0$, a contradiction.
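The proposition covers negative $a$ and $b$ as well, which the proof leaves to the reader. A small Python sketch shows one way to obtain $q$ and $r$ with $0 \le r < |b|$ in every case. The function name `division_algorithm` is hypothetical, and the adjustment step is needed only because Python's `divmod` returns a remainder with the sign of the divisor.

```python
def division_algorithm(a: int, b: int):
    """Return (q, r) with a == b*q + r and 0 <= r < abs(b)."""
    if b == 0:
        raise ZeroDivisionError("b must be nonzero")
    q, r = divmod(a, b)      # Python guarantees a == b*q + r, r has sign of b
    if r < 0:                # only possible when b < 0
        q, r = q + 1, r - b  # shift by one multiple of b to make r >= 0
    return q, r


if __name__ == "__main__":
    for a, b in [(2261, 1275), (7, -2), (-7, 2), (-7, -2)]:
        print(a, b, division_algorithm(a, b))
```

In each case the pair returned is the unique one promised by the proposition, so any other correct implementation must produce the same output.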

For us an algorithm is a finite sequence of instructions that can be followed to perform a specific task, such as a sequence of instructions in a computer program, which must terminate on any valid input. The word algorithm is sometimes used more loosely (and sometimes more precisely) than defined here, but this definition will suffice for us.

Algorithm (Division Algorithm). Suppose $a$ and $b$ are integers with $b \ne 0$. This algorithm computes integers $q$ and $r$ such that $0 \le r < |b|$ and $a = bq + r$.

We will not describe the actual steps of this algorithm, since it is just the familiar long division algorithm.

We use the division algorithm repeatedly to compute $\gcd(2261, 1275)$. Dividing 2261 by 1275 we find that
$$2261 = 1 \cdot 1275 + 986,$$
so $q = 1$ and $r = 986$. Notice that if a natural number $d$ divides both 2261 and 1275, then $d$ divides their difference 986 and $d$ still divides 1275. On the other hand, if $d$ divides both 1275 and 986, then it has to divide their sum 2261 as well! We have made progress:
$$\gcd(2261, 1275) = \gcd(1275, 986).$$
This equality also follows by repeated application of Lemma 1.1.8. Repeating, we have
$$1275 = 1 \cdot 986 + 289,$$
so $\gcd(1275, 986) = \gcd(986, 289)$. Keep going:
$$\begin{aligned}
986 &= 3 \cdot 289 + 119\\
289 &= 2 \cdot 119 + 51\\
119 &= 2 \cdot 51 + 17.
\end{aligned}$$
Thus $\gcd(2261, 1275) = \cdots = \gcd(51, 17)$, which is 17 because $17 \mid 51$. Thus
$$\gcd(2261, 1275) = 17.$$
Aside from some tedious arithmetic, that computation was systematic, and it was not necessary to factor any integers (which is something we do not know how to do quickly if the numbers involved have hundreds of digits).

Algorithm (Greatest Common Divisor). Given integers $a, b$, this algorithm computes $\gcd(a, b)$.

1. [Assume $a > b \ge 0$] We have $\gcd(a, b) = \gcd(|a|, |b|) = \gcd(|b|, |a|)$, so we may replace $a$ and $b$ by their absolute values and hence assume $a, b \ge 0$. If $a = b$ output $a$ and terminate. Swapping if necessary we assume $a > b$.

2. [Quotient and Remainder] Using the Division Algorithm, write $a = bq + r$, with $0 \le r < b$ and $q \in \mathbb{Z}$.

3. [Finished?] If $r = 0$ then $b \mid a$, so we output $b$ and terminate.

4. [Shift and Repeat] Set $a \leftarrow b$ and $b \leftarrow r$, then go to step 2.

Proof. Lemmas 1.1.8 and 1.1.9 imply that $\gcd(a, b) = \gcd(b, r)$ so the gcd does not change in step 4. Since the remainders form a decreasing sequence of nonnegative integers, the algorithm terminates.

An implementation of this algorithm appears later in the book.

Example. Set $a = 15$ and $b = 6$.
$$\begin{aligned}
15 &= 6 \cdot 2 + 3 & \gcd(15, 6) &= \gcd(6, 3)\\
6 &= 3 \cdot 2 + 0 & \gcd(6, 3) &= \gcd(3, 0) = 3
\end{aligned}$$
Note that we can just as easily do an example that is ten times as big, an observation that will be important in the proof of the lemma on $\gcd(an, bn)$ below.

Example. Set $a = 150$ and $b = 60$.
$$\begin{aligned}
150 &= 60 \cdot 2 + 30 & \gcd(150, 60) &= \gcd(60, 30)\\
60 &= 30 \cdot 2 + 0 & \gcd(60, 30) &= \gcd(30, 0) = 30
\end{aligned}$$

Lemma. For any integers $a, b, n$, we have
$$\gcd(an, bn) = \gcd(a, b) \cdot |n|.$$

Proof. The idea is to follow the example above; we step through Euclid's algorithm for $\gcd(an, bn)$ and note that at every step the equation is the equation from Euclid's algorithm for $\gcd(a, b)$ but multiplied through by $n$. For simplicity, assume that both $a$ and $b$ are positive. We will prove the lemma by induction on $a + b$. The statement is true in the base case when $a + b = 2$, since then $a = b = 1$. Now assume $a, b$ are arbitrary with $a \ge b$. Let $q$ and $r$ be such that $a = bq + r$ and $0 \le r < b$. Then by Lemmas 1.1.8 and 1.1.9, we have $\gcd(a, b) = \gcd(b, r)$. Multiplying $a = bq + r$ by $n$ we see that $an = bnq + rn$, so $\gcd(an, bn) = \gcd(bn, rn)$. Then
$$b + r = b + (a - bq) = a - b(q - 1) \le a < a + b,$$
so by induction $\gcd(bn, rn) = \gcd(b, r) \cdot |n|$. Since $\gcd(a, b) = \gcd(b, r)$, this proves the lemma.

Lemma. Suppose $a, b, n \in \mathbb{Z}$ are such that $n \mid a$ and $n \mid b$. Then $n \mid \gcd(a, b)$.

Proof. Since $n \mid a$ and $n \mid b$, there are integers $c_1$ and $c_2$, such that $a = nc_1$ and $b = nc_2$. By the previous lemma, $\gcd(a, b) = \gcd(nc_1, nc_2) = |n| \gcd(c_1, c_2)$, so $n$ divides $\gcd(a, b)$.
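The four steps of the gcd algorithm translate almost directly into code. The Python sketch below is one possible rendering, not the book's own implementation: the function name `euclid_gcd` is hypothetical, and the explicit swap in step 1 is omitted because the first pass of the loop performs it automatically when $a < b$.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor by repeated division with remainder."""
    a, b = abs(a), abs(b)      # step 1: gcd(a, b) = gcd(|a|, |b|)
    while b != 0:              # step 3: stop once the remainder is 0
        r = a % b              # step 2: a = b*q + r with 0 <= r < b
        a, b = b, r            # step 4: shift (a, b) <- (b, r) and repeat
    return a                   # gcd(a, 0) = a, and gcd(0, 0) = 0


if __name__ == "__main__":
    print(euclid_gcd(2261, 1275))
    print(euclid_gcd(15, 6))
    print(euclid_gcd(150, 60))
```

Multiplying both inputs by 10 multiplies the result by 10, as in the pair of examples with (15, 6) and (150, 60).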

At this point it would be natural to formally analyze the complexity of the gcd algorithm. We will not do this, because the main reason we introduced the algorithm is that it will allow us to prove Theorem 1.1.5, and we have not chosen to formally analyze the complexity of the other algorithms in this book. For an extensive analysis of the complexity of the algorithm, see [Knu98, §4.5.3].

With the gcd algorithm, we can prove that if a prime divides the product of two numbers, then it has got to divide one of them. This result is the key to proving that prime factorization is unique.

Theorem (Euclid). Let $p$ be a prime and $a, b \in \mathbb{N}$. If $p \mid ab$ then $p \mid a$ or $p \mid b$.

You might think this theorem is intuitively obvious, but that might be because the fundamental theorem of arithmetic (Theorem 1.1.5) is deeply ingrained in your intuition. Yet this theorem will be needed in our proof of the fundamental theorem of arithmetic.

Proof. If $p \mid a$ we are done. If $p \nmid a$ then $\gcd(p, a) = 1$, since only 1 and $p$ divide $p$. By the lemma $\gcd(an, bn) = \gcd(a, b) \cdot |n|$, we have $\gcd(pb, ab) = b$. Since $p \mid pb$ and, by hypothesis, $p \mid ab$, it follows from the preceding lemma that
$$p \mid \gcd(pb, ab) = b.$$

Numbers Factor as Products of Primes

In this section, we prove that every natural number factors as a product of primes. Then we discuss the difficulty of finding such a decomposition in practice. We will wait until Section 1.1.4 to prove that factorization is unique.

As a first example, let $n = 1275$. The sum of the digits of $n$ is divisible by 3, so $n$ is divisible by 3 (see Proposition 2.1.3), and we have $n = 3 \cdot 425$. The number 425 is divisible by 5, since its last digit is 5, and we have $1275 = 3 \cdot 5 \cdot 85$. Again, dividing 85 by 5, we have $1275 = 3 \cdot 5^2 \cdot 17$, which is the prime factorization of 1275. Generalizing this process proves the following proposition:

Proposition. Every natural number is a product of primes.

Proof. Let $n$ be a natural number. If $n = 1$, then $n$ is the empty product of primes. If $n$ is prime, we are done. If $n$ is composite, then $n = ab$ with $a, b < n$. By induction, $a$ and $b$ are products of primes, so $n$ is also a product of primes.

Two questions immediately arise: (1) is this factorization unique, and (2) how quickly can we find such a factorization? Addressing (1), what if

149 1.1 Prime Factorization 11 we had done something differently when breaking apart 1275 as a product of primes? Could the primes that show up be different? Let s try: we have 1275 = Now 255 = 5 51 and 51 = 17 3, and again the factorization is the same, as asserted by Theorem above. We will prove uniqueness of the prime factorization of any integer in Section Regarding (2), there are algorithms for integer factorization; e.g., in Sections 6.3 and we will study and implement some of them. It is a major open problem to decide how fast integer factorization algorithms can be. Open Problem Is there an algorithm which can factor any integer n in polynomial time? (See below for the meaning of polynomial time.) By polynomial time we mean that there is a polynomial f(x) such that for any n the number of steps needed by the algorithm to factor n is less than f(log 10 (n)). Note that log 10 (n) is an approximation for the number of digits of the input n to the algorithm. Peter Shor [Sho97] devised a polynomial time algorithm for factoring integers on quantum computers. We will not discuss his algorithm further, except to note that in 2001 IBM researchers built a quantum computer that used Shor s algorithm to factor 15 (see [LMG + 01, IBM01]). You can earn money by factoring certain large integers. Many cryptosystems would be easily broken if factoring certain large integers were easy. Since nobody has proven that factoring integers is difficult, one way to increase confidence that factoring is difficult is to offer cash prizes for factoring certain integers. For example, until recently there was a $10000 bounty on factoring the following 174-digit integer (see [RSA]): This number is known as RSA-576 since it has 576 digits when written in binary (see Section for more on binary numbers). It was factored at the German Federal Agency for Information Technology Security in December 2003 (see [Wei03]): The previous RSA challenge was the 155-digit number

It was factored on 22 August 1999 by a group of sixteen researchers in four months on a cluster of 292 computers (see [ACD+99]). They found that RSA-155 is the product of two 78-digit primes. The next RSA challenge is RSA-640, and its factorization was worth $20000 until November 2005, when it was factored by F. Bahr, M. Boehm, J. Franke, and T. Kleinjung. This factorization took 5 months. (This team also factored a 663-bit RSA challenge integer.) The smallest currently open challenge is RSA-704, worth $30000.

These RSA numbers were factored using an algorithm called the number field sieve (see [LL93]), which is the best-known general purpose factorization algorithm. A description of how the number field sieve works is beyond the scope of this book. However, the number field sieve makes extensive use of the elliptic curve factorization method, which we will describe in a later section.

The Fundamental Theorem of Arithmetic

We are ready to prove Theorem 1.1.5 using the following idea. Suppose we have two factorizations of n. Using Euclid's theorem we cancel common primes from each factorization, one prime at a time. At the end, we discover that the factorizations must consist of exactly the same primes. The technical details are given below.

Proof. If n = 1, then the only factorization is the empty product of primes, so suppose n > 1. By the proposition that every natural number is a product of primes, there exist primes p_1, ..., p_d such that
n = p_1 p_2 ··· p_d.
Suppose that
n = q_1 q_2 ··· q_m
is another expression of n as a product of primes. Since
p_1 | n = q_1 (q_2 ··· q_m),
Euclid's theorem implies that p_1 = q_1 or p_1 | q_2 ··· q_m. By induction, we see that p_1 = q_i for some i. Now cancel p_1 and q_i, and repeat the above argument. Eventually, we find that, up to order, the two factorizations are the same.

1.2 The Sequence of Prime Numbers

This section is concerned with three questions:
1. Are there infinitely many primes?
2. Given a, b ∈ Z, are there infinitely many primes of the form ax + b?
3. How are the primes spaced along the number line?

We first show that there are infinitely many primes, then state Dirichlet's theorem that if gcd(a, b) = 1, then ax + b is a prime for infinitely many values of x. Finally, we discuss the Prime Number Theorem, which asserts that there are asymptotically x/log(x) primes less than x, and we make a connection between this asymptotic formula and the Riemann Hypothesis.

There Are Infinitely Many Primes

Each number on the left in the following table is prime. We will see soon that this pattern does not continue indefinitely, but something similar works.
3 = 2 + 1
7 = 2 · 3 + 1
31 = 2 · 3 · 5 + 1
211 = 2 · 3 · 5 · 7 + 1
2311 = 2 · 3 · 5 · 7 · 11 + 1

Theorem (Euclid). There are infinitely many primes.

Proof. Suppose that p_1, p_2, ..., p_n are n distinct primes. We construct a prime p_{n+1} not equal to any of p_1, ..., p_n as follows. If
N = p_1 p_2 p_3 ··· p_n + 1, (1.2.1)
then by the proposition that every natural number is a product of primes there is a factorization
N = q_1 q_2 ··· q_m
with each q_i prime and m ≥ 1. If q_1 = p_i for some i, then p_i | N. Because of (1.2.1), we also have p_i | N − 1, so p_i | 1 = N − (N − 1), which is a contradiction. Thus the prime p_{n+1} = q_1 is not in the list p_1, ..., p_n, and we have constructed our new prime.

For example,
2 · 3 · 5 · 7 · 11 · 13 + 1 = 30031 = 59 · 509.
Multiplying together the first 6 primes and adding 1 doesn't produce a prime, but it produces an integer that is merely divisible by a new prime.

Joke (Hendrik Lenstra). There are infinitely many composite numbers.

Proof. To obtain a new composite number, multiply together the first n composite numbers and don't add 1.

Enumerating Primes

The Sieve of Eratosthenes is an efficient way to enumerate all primes up to n. The sieve works by first writing down all numbers up to n, noting that 2 is prime, and crossing off all multiples of 2. Next, note that the first number not crossed off is 3, which is prime, and cross off all multiples of 3, etc. Repeating this process, we obtain a list of the primes up to n. Formally, the algorithm is as follows:

Algorithm (Sieve of Eratosthenes). Given a positive integer n, this algorithm computes a list of the primes up to n.
1. [Initialize] Let X = [3, 5, ...] be the list of all odd integers between 3 and n. Let P = [2] be the list of primes found so far.
2. [Finished?] Let p be the first element of X. If p > √n, append each element of X to P and terminate. Otherwise append p to P. (The test must be strict: if n = p^2, then n itself is still in X and has to be crossed off.)
3. [Cross Off] Set X equal to the sublist of elements in X that are not divisible by p. Go to step 2.
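The three steps of the sieve translate almost line for line into Python. The following is a minimal sketch, not the book's implementation; the name primes_up_to is our own, and we assume the stopping test in step 2 is read strictly, as p · p > n, so that a perfect square such as n = 25 is crossed off rather than appended.

```python
def primes_up_to(n: int) -> list[int]:
    """List the primes <= n using the list-based sieve described above."""
    if n < 2:
        return []
    P = [2]                           # primes found so far
    X = list(range(3, n + 1, 2))      # odd candidates between 3 and n
    while X:
        p = X[0]
        if p * p > n:                 # p > sqrt(n): everything left is prime
            P.extend(X)
            break
        P.append(p)
        X = [x for x in X if x % p != 0]   # cross off multiples of p
    return P


print(primes_up_to(40))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
```

Rebuilding the list X at every step mirrors the algorithm as stated; a version that marks entries in a fixed array does the same work with less copying.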

For example, to list the primes ≤ 40 using the sieve, we proceed as follows. First P = [2] and
X = [3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39].
We append 3 to P and cross off all multiples of 3 to obtain the new list
X = [5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37].
Next we append 5 to P, obtaining P = [2, 3, 5], and cross off the multiples of 5, to obtain
X = [7, 11, 13, 17, 19, 23, 29, 31, 37].
Because 7 > √40, we append X to P and find that the primes less than 40 are
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37.

Proof of the algorithm. The part of the algorithm that is not clear is that when the first element a of X satisfies a > √n, then each element of X is prime. To see this, suppose m is in X, so √n < m ≤ n, and m is divisible by no prime that is ≤ √n. Write m = ∏ p_i^{e_i} with the p_i distinct primes. If p_i > √n for each i and m is not itself prime, then m is a product of at least two primes that each exceed √n, so m > n, a contradiction. Thus either m is prime or some p_i is ≤ √n, and the latter contradicts our assumptions on m.

An implementation of this algorithm is given later in the book.

The Largest Known Prime

Though Euclid's theorem implies that there are infinitely many primes, it still makes sense to ask the question "What is the largest known prime?"

A Mersenne prime is a prime of the form 2^q − 1. According to [Cal] the largest known prime as of July 2004 is a Mersenne prime p with about 7.2 million decimal digits, so writing it out would fill over 10 books the size of this book. Euclid's theorem implies that there definitely is a prime bigger than this 7.2 million digit p. Deciding whether or not a number is prime is interesting, both as a motivating problem and for applications to cryptography, as we will see in Section 2.4 and Chapter 3.

Primes of the Form ax + b

Next we turn to primes of the form ax + b, where a and b are fixed integers with a > 1 and x varies over the natural numbers N. We assume that gcd(a, b) = 1, because otherwise there is no hope that ax + b is prime infinitely often. For example, 2x + 2 = 2(x + 1) is only prime if x = 0, and is not prime for any other x ∈ N.

Proposition. There are infinitely many primes of the form 4x − 1.

Why might this be true? We list numbers of the form 4x − 1:
3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, ...
Of these, 3, 7, 11, 19, 23, 31, 43, and 47 are prime. It is plausible that primes would continue to appear indefinitely.

Proof. Suppose p_1, p_2, ..., p_n are distinct primes of the form 4x − 1. Consider the number
N = 4 p_1 p_2 ··· p_n − 1.
Then p_i ∤ N for any i. Since N is odd, every prime p | N is of the form 4x + 1 or 4x − 1. Moreover, not every prime p | N is of the form 4x + 1; if they all were, then N would be of the form 4x + 1. Thus there is a p | N that is of the form 4x − 1. Since p ≠ p_i for any i, we have found a new prime of the form 4x − 1. We can repeat this process indefinitely, so the set of primes of the form 4x − 1 cannot be finite.

Note that this proof does not work if 4x − 1 is replaced by 4x + 1, since a product of primes of the form 4x − 1 can be of the form 4x + 1.

Example. Set p_1 = 3, p_2 = 7. Then
N = 4 · 3 · 7 − 1 = 83
is a prime of the form 4x − 1. Next
N = 4 · 3 · 7 · 83 − 1 = 6971,
which is again a prime of the form 4x − 1. Again:
N = 4 · 3 · 7 · 83 · 6971 − 1 = 48601811 = 61 · 796751.
This time 61 is a prime, but it is of the form 4x + 1 = 4 · 15 + 1. However, 796751 is prime and 796751 = 4 · 199188 − 1. We are unstoppable:
N = 4 · 3 · 7 · 83 · 6971 · 796751 − 1 = 5591 · 6926049421.
This time the small prime, 5591, is of the form 4x − 1 and the large one is of the form 4x + 1.

Theorem (Dirichlet). Let a and b be integers with gcd(a, b) = 1. Then there are infinitely many primes of the form ax + b.

Proofs of this theorem typically use tools from advanced number theory, and are beyond the scope of this book (see e.g., [FT93, VIII.4]).
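The construction in the proof of the proposition on primes of the form 4x − 1 can be carried out mechanically. The sketch below is our own illustration: the helper names prime_factors and new_prime_3_mod_4 are hypothetical, and factoring is done by plain trial division, which is only practical because the numbers in the example are small.

```python
from math import prod


def prime_factors(n: int) -> list[int]:
    """Prime factors of n >= 2 in increasing order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)   # whatever is left is prime
    return factors


def new_prime_3_mod_4(primes: list[int]) -> int:
    """Given primes of the form 4x - 1, return a new one dividing 4*p_1*...*p_n - 1."""
    N = 4 * prod(primes) - 1
    # N has remainder 3 mod 4, so at least one prime factor has remainder 3 mod 4.
    return min(p for p in prime_factors(N) if p % 4 == 3)


known = [3, 7]
for _ in range(4):
    known.append(new_prime_3_mod_4(known))
print(known)   # [3, 7, 83, 6971, 796751, 5591]
```

Each pass reproduces one step of the example: 83 and 6971 are themselves prime, the third N factors as 61 · 796751, and the fourth yields the small factor 5591.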

TABLE 1.1. Values of π(x)

How Many Primes are There?

We saw above that there are infinitely many primes. In order to get a sense for just how many primes there are, we consider a few warm-up questions. Then we consider some numerical evidence and state the prime number theorem, which gives an asymptotic answer to our question, and connect this theorem with a form of the Riemann Hypothesis. Our discussion of counting primes in this section is very cursory; for more details, read Crandall and Pomerance's excellent book [CP01, 1.1.5].

The following vague discussion is meant to motivate a precise way to measure the number of primes. How many natural numbers are even? Answer: Half of them. How many natural numbers are of the form 4x − 1? Answer: One fourth of them. How many natural numbers are perfect squares? Answer: Zero percent of all natural numbers, in the sense that the proportion of perfect squares among all natural numbers converges to 0. More precisely,
lim_{x→∞} #{n ∈ N : n ≤ x and n is a perfect square} / x = 0,
since the numerator is roughly √x and lim_{x→∞} √x / x = 0. Likewise, it is an easy consequence of the Prime Number Theorem below that zero percent of all natural numbers are prime (see Exercise 1.4).

We are thus led to ask another question: How many positive integers ≤ x are perfect squares? Answer: roughly √x. In the context of primes, we ask,

Question. How many natural numbers ≤ x are prime?

Let
π(x) = #{p ∈ N : p ≤ x is a prime}.
For example,
π(6) = #{2, 3, 5} = 3.
Some values of π(x) are given in Table 1.1, and Figures 1.1 and 1.2 contain graphs of π(x). These graphs look like straight lines, which maybe bend down slightly.

Gauss had a lifelong love of enumerating primes. Eventually he computed π(x) for a large value of x, though the author doesn't know whether or not Gauss got the right answer. Gauss conjectured the following asymptotic formula for π(x), which was later proved independently by Hadamard and de la Vallée Poussin in 1896 (but will not be proved in this book):

FIGURE 1.1. Graph of π(x) for x < 1000; the marked points include (100, 25), (200, 46), (900, 154), and (1000, 168).

TABLE 1.2. Comparison of π(x) and x/(log(x) − 1)

Theorem (Prime Number Theorem). The function π(x) is asymptotic to x/log(x), in the sense that
lim_{x→∞} π(x) / (x/log(x)) = 1.

We do nothing more here than motivate this deep theorem with a few further numerical observations. The theorem implies that
lim_{x→∞} π(x)/x = lim_{x→∞} 1/log(x) = 0,
so for any a,
lim_{x→∞} π(x) / (x/(log(x) − a)) = lim_{x→∞} [ π(x)/(x/log(x)) − a·π(x)/x ] = 1.
Thus x/(log(x) − a) is also asymptotic to π(x) for any a. See [CP01, 1.1.5] for a discussion of why a = 1 is the best choice. Table 1.2 compares π(x) and x/(log(x) − 1) for several values of x.

As of 2004, record computations of π(x) had reached very large values of x; one such computation reportedly took ten months on a 350 MHz Pentium II (see [GS02] for more details).
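The comparison between π(x) and x/(log(x) − 1) is easy to reproduce for small x. The sketch below is our own illustration (the function name prime_pi is an assumption); it counts primes with a simple array sieve and prints the approximation next to the exact count. No particular printed values are claimed for the approximation column.

```python
from math import log


def prime_pi(x: int) -> int:
    """Number of primes p <= x, counted with a simple sieve."""
    if x < 2:
        return 0
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= x:
        if is_prime[p]:
            for m in range(p * p, x + 1, p):
                is_prime[m] = False   # m is a proper multiple of p
        p += 1
    return sum(is_prime)


for x in (100, 1000, 10000):
    approx = x / (log(x) - 1)         # natural logarithm, as in the theorem
    print(x, prime_pi(x), round(approx, 1))
```

The exact counts agree with the points marked on the graph of π(x), for instance π(100) = 25 and π(1000) = 168.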

FIGURE 1.2. Graphs of π(x) for two ranges of x.

For the reader familiar with complex analysis, we mention a connection between π(x) and the Riemann Hypothesis. The Riemann zeta function ζ(s) is a complex analytic function on C \ {1} that extends the function defined on a right half plane by Σ_{n=1}^{∞} n^{−s}. The Riemann Hypothesis is the conjecture that the zeros in C of ζ(s) with positive real part lie on the line Re(s) = 1/2. This conjecture is one of the Clay Math Institute million dollar millennium prize problems [Cla].

According to [CP01, 1.4.1], the Riemann Hypothesis is equivalent to the conjecture that
Li(x) = ∫_2^x 1/log(t) dt
is a good approximation to π(x), in the following precise sense:

Conjecture (Equivalent to the Riemann Hypothesis). For all x ≥ 2.01,
|π(x) − Li(x)| ≤ √x · log(x).

If x = 2, then π(2) = 1 and Li(2) = 0, but √2 · log(2) = 0.9802..., so the inequality does not hold for all x ≥ 2, but 2.01 is big enough. We will do nothing more to explain this conjecture, and settle for one numerical example.

Example. Let x = […]. Then
π(x) = […],
Li(x) = […],
|π(x) − Li(x)| = […],
√x · log(x) = […],
x/(log(x) − 1) = […].

One of the best popular articles on the prime number theorem and the Riemann hypothesis is [Zag75].

1.3 Exercises

1.1 Compute the greatest common divisor gcd(455, 1235) by hand.
1.2 Use the Sieve of Eratosthenes to make a list of all primes up to […].
1.3 Prove that there are infinitely many primes of the form 6x − 1.
1.4 Use the Prime Number Theorem to deduce that lim_{x→∞} π(x)/x = 0.

2 The Ring of Integers Modulo n

This chapter is about the ring Z/nZ of integers modulo n. First we discuss when linear equations modulo n have a solution, then introduce the Euler ϕ function and prove Fermat's Little Theorem and Wilson's theorem. Next we prove the Chinese Remainder Theorem, which addresses simultaneous solubility of several linear equations modulo coprime moduli. With these theoretical foundations in place, in Section 2.3 we introduce algorithms for doing interesting computations modulo n, including computing large powers quickly, and solving linear equations. We finish with a very brief discussion of finding prime numbers using arithmetic modulo n.

2.1 Congruences Modulo n

In this section we define the ring Z/nZ of integers modulo n, introduce the Euler ϕ-function, and relate it to the multiplicative order of certain elements of Z/nZ.

If a, b ∈ Z and n ∈ N, we say that a is congruent to b modulo n if n | a − b, and write a ≡ b (mod n). Let nZ = (n) be the ideal of Z generated by n.

Definition (Integers Modulo n). The ring of integers modulo n is the quotient ring Z/nZ of equivalence classes of integers modulo n. It is equipped with its natural ring structure:
(a + nZ) + (b + nZ) = (a + b) + nZ
(a + nZ) · (b + nZ) = (a · b) + nZ.

Example. For example,
Z/3Z = {{..., −3, 0, 3, ...}, {..., −2, 1, 4, ...}, {..., −1, 2, 5, ...}}.

We use the notation Z/nZ because Z/nZ is the quotient of the ring Z by the ideal nZ of multiples of n. Because Z/nZ is the quotient of a ring by an ideal, the ring structure on Z induces a ring structure on Z/nZ. We often let a or a (mod n) denote the equivalence class a + nZ of a. If p is a prime, then Z/pZ is a field (see Exercise 2.11).

We call the natural reduction map Z → Z/nZ, which sends a to a + nZ, reduction modulo n. We also say that a is a lift of a + nZ. Thus, e.g., 7 is a lift of 1 mod 3, since 7 + 3Z = 1 + 3Z.

We can use the fact that arithmetic in Z/nZ is well defined to derive tests for divisibility by n (see Exercise 2.7).

Proposition 2.1.3. A number n ∈ Z is divisible by 3 if and only if the sum of the digits of n is divisible by 3.

Proof. Write
n = a + 10b + 100c + ···,
where the digits of n are a, b, c, etc. Since 10 ≡ 1 (mod 3),
n = a + 10b + 100c + ··· ≡ a + b + c + ··· (mod 3),
from which the proposition follows.

Linear Equations Modulo n

In this section, we are concerned with how to decide whether or not a linear equation of the form ax ≡ b (mod n) has a solution modulo n. Algorithms for computing solutions to ax ≡ b (mod n) are the topic of Section 2.3.

First we prove a proposition that gives a criterion under which one can cancel a quantity from both sides of a congruence.

Proposition (Cancellation). If gcd(c, n) = 1 and ac ≡ bc (mod n), then a ≡ b (mod n).

Proof. By definition n | ac − bc = (a − b)c. Since gcd(n, c) = 1, it follows from the fundamental theorem of arithmetic that n | a − b, so a ≡ b (mod n), as claimed.

When a has a multiplicative inverse a′ in Z/nZ (i.e., aa′ ≡ 1 (mod n)) then the equation ax ≡ b (mod n) has a unique solution x ≡ a′b (mod n) modulo n. Thus, it is of interest to determine the units in Z/nZ, i.e., the elements which have a multiplicative inverse.

We will use complete sets of residues to prove that the units in Z/nZ are exactly the a ∈ Z/nZ such that gcd(ã, n) = 1 for any lift ã of a to Z (it doesn't matter which lift).

Definition (Complete Set of Residues). We call a subset R ⊂ Z of size n whose reductions modulo n are pairwise distinct a complete set of residues modulo n. In other words, a complete set of residues is a choice of representative for each equivalence class in Z/nZ.

For example, R = {0, 1, 2, ..., n − 1} is a complete set of residues modulo n. When n = 5, R = {0, 1, −1, 2, −2} is a complete set of residues.

Lemma 2.1.6. If R is a complete set of residues modulo n and a ∈ Z with gcd(a, n) = 1, then aR = {ax : x ∈ R} is also a complete set of residues modulo n.

Proof. If ax ≡ ax′ (mod n) with x, x′ ∈ R, then the cancellation proposition implies that x ≡ x′ (mod n). Because R is a complete set of residues, this implies that x = x′. Thus the elements of aR have distinct reductions modulo n. It follows, since #aR = n, that aR is a complete set of residues modulo n.

Proposition 2.1.7 (Units). If gcd(a, n) = 1, then the equation ax ≡ b (mod n) has a solution, and that solution is unique modulo n.

Proof. Let R be a complete set of residues modulo n, so there is a unique element of R that is congruent to b modulo n. By Lemma 2.1.6, aR is also a complete set of residues modulo n, so there is a unique element ax ∈ aR that is congruent to b modulo n, and we have ax ≡ b (mod n).

Algebraically, this proposition asserts that if gcd(a, n) = 1, then the map Z/nZ → Z/nZ given by left multiplication by a is a bijection.

Example. Consider the equation 2x ≡ 3 (mod 7), and the complete set R = {0, 1, 2, 3, 4, 5, 6} of coset representatives. We have
2R = {0, 2, 4, 6, 8 ≡ 1, 10 ≡ 3, 12 ≡ 5},
so 2 · 5 ≡ 3 (mod 7).

When gcd(a, n) ≠ 1, then the equation ax ≡ b (mod n) may or may not have a solution. For example, 2x ≡ 1 (mod 4) has no solution, but 2x ≡ 2 (mod 4) does, and in fact it has more than one mod 4 (x = 1 and x = 3). Generalizing Proposition 2.1.7, we obtain the following more general criterion for solvability.

Proposition (Solvability). The equation ax ≡ b (mod n) has a solution if and only if gcd(a, n) divides b.

Proof. Let g = gcd(a, n). If there is a solution x to the equation ax ≡ b (mod n), then n | (ax − b). Since g | n and g | a, it follows that g | b.

Conversely, suppose that g | b. Then n | (ax − b) if and only if
(n/g) | ((a/g)·x − b/g).
Thus ax ≡ b (mod n) has a solution if and only if (a/g)·x ≡ b/g (mod n/g) has a solution. Since gcd(a/g, n/g) = 1, Proposition 2.1.7 implies this latter equation does have a solution.

In Chapter 4 we will study quadratic reciprocity, which gives a nice criterion for whether or not a quadratic equation modulo n has a solution.

Fermat's Little Theorem

The group of units (Z/nZ)^* of the ring Z/nZ will be of great interest to us. Each element of this group has an order, and Lagrange's theorem from group theory implies that each element of (Z/nZ)^* has order that divides the order of (Z/nZ)^*. In elementary number theory this fact goes by the moniker Fermat's Little Theorem, and we reprove it from basic principles in this section.

Definition (Order of an Element). Let n ∈ N and x ∈ Z and suppose that gcd(x, n) = 1. The order of x modulo n is the smallest m ∈ N such that
x^m ≡ 1 (mod n).

To show that the definition makes sense, we verify that such an m exists. Consider x, x^2, x^3, ... modulo n. There are only finitely many residue classes modulo n, so we must eventually find two integers i, j with i < j such that
x^j ≡ x^i (mod n).
Since gcd(x, n) = 1, the cancellation proposition implies that we can cancel x's and conclude that
x^{j−i} ≡ 1 (mod n).

Definition (Euler's phi-function). For n ∈ N, let
ϕ(n) = #{a ∈ N : a ≤ n and gcd(a, n) = 1}.
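Both the solvability criterion and the definition of order on this page can be checked by brute force for small moduli. The following sketch is our own illustration; the names solutions and order_mod are hypothetical, and the exhaustive search is only meant for small n.

```python
from math import gcd


def solutions(a: int, b: int, n: int) -> list[int]:
    """All x in {0, ..., n-1} with a*x congruent to b modulo n (exhaustive search)."""
    return [x for x in range(n) if (a * x - b) % n == 0]


def order_mod(x: int, n: int) -> int:
    """Smallest m >= 1 with x**m congruent to 1 modulo n; needs gcd(x, n) == 1."""
    if gcd(x, n) != 1:
        raise ValueError("x must be coprime to n")
    m, power = 1, x % n
    while power != 1 % n:
        power = (power * x) % n
        m += 1
    return m


print(solutions(2, 1, 4))   # []      since gcd(2, 4) = 2 does not divide 1
print(solutions(2, 2, 4))   # [1, 3]  since gcd(2, 4) = 2 divides 2
print(solutions(2, 3, 7))   # [5]
print(order_mod(2, 7))      # 3, because 2**3 = 8 = 7 + 1
```

The first two calls reproduce the examples 2x ≡ 1 (mod 4) and 2x ≡ 2 (mod 4) from the text, and the third reproduces 2x ≡ 3 (mod 7).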

For example,
ϕ(1) = #{1} = 1,
ϕ(2) = #{1} = 1,
ϕ(5) = #{1, 2, 3, 4} = 4,
ϕ(12) = #{1, 5, 7, 11} = 4.
Also, if p is any prime number then
ϕ(p) = #{1, 2, ..., p − 1} = p − 1.
In Section 2.2.1, we will prove that ϕ is a multiplicative function. This will yield an easy way to compute ϕ(n) in terms of the prime factorization of n.

Theorem (Fermat's Little Theorem). If gcd(x, n) = 1, then
x^{ϕ(n)} ≡ 1 (mod n).

Proof. As mentioned above, Fermat's Little Theorem has the following group-theoretic interpretation. The set of units in Z/nZ is a group
(Z/nZ)^* = {a ∈ Z/nZ : gcd(a, n) = 1},
which has order ϕ(n). The theorem then asserts that the order of an element of (Z/nZ)^* divides the order ϕ(n) of (Z/nZ)^*. This is a special case of the more general fact (Lagrange's theorem) that if G is a finite group and g ∈ G, then the order of g divides the cardinality of G.

We now give an elementary proof of the theorem. Let
P = {a : 1 ≤ a ≤ n and gcd(a, n) = 1}.
In the same way that we proved Lemma 2.1.6, we see that the reductions modulo n of the elements of xP are the same as the reductions of the elements of P. Thus
∏_{a ∈ P} (xa) ≡ ∏_{a ∈ P} a (mod n),
since the products are over the same numbers modulo n. Now cancel the a's on both sides to get
x^{#P} ≡ 1 (mod n),
as claimed.
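The key step of the elementary proof, that multiplying P by x only permutes it modulo n, can be observed directly. The sketch below is our own illustration (units and phi are assumed names); it computes ϕ(n) by counting, as in the definition, which is slow but transparent.

```python
from math import gcd


def units(n: int) -> list[int]:
    """The set P = {a : 1 <= a <= n and gcd(a, n) = 1}, as a sorted list."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]


def phi(n: int) -> int:
    """Euler's phi-function, computed straight from the definition."""
    return len(units(n))


n, x = 12, 5
P = units(n)
print(P)                                  # [1, 5, 7, 11]
print(sorted((x * a) % n for a in P))     # [1, 5, 7, 11]  (xP is P again mod n)
print(pow(x, phi(n), n))                  # 1
```

Python's built-in three-argument pow performs the modular exponentiation; the last line is the statement of the theorem for x = 5 and n = 12.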

Wilson's Theorem

The following characterization of prime numbers, from the 1770s, is called Wilson's Theorem, though it was first proved by Lagrange.

Proposition (Wilson's Theorem). An integer p > 1 is prime if and only if (p − 1)! ≡ −1 (mod p).

For example, if p = 3, then (p − 1)! = 2 ≡ −1 (mod 3). If p = 17, then
(p − 1)! = 20922789888000 ≡ −1 (mod 17).
But if p = 15, then
(p − 1)! = 87178291200 ≡ 0 (mod 15),
so 15 is composite. Thus Wilson's theorem could be viewed as a primality test, though, from a computational point of view, it is probably the least efficient primality test since computing (n − 1)! takes so many steps.

Proof. The statement is clear when p = 2, so henceforth we assume that p > 2. We first assume that p is prime and prove that (p − 1)! ≡ −1 (mod p). If a ∈ {1, 2, ..., p − 1} then the equation
ax ≡ 1 (mod p)
has a unique solution a′ ∈ {1, 2, ..., p − 1}. If a = a′, then a^2 ≡ 1 (mod p), so p | a^2 − 1 = (a − 1)(a + 1), so p | (a − 1) or p | (a + 1), so a ∈ {1, p − 1}. We can thus pair off the elements of {2, 3, ..., p − 2}, each with its inverse. Thus
2 · 3 ··· (p − 2) ≡ 1 (mod p).
Multiplying both sides by p − 1 proves that (p − 1)! ≡ −1 (mod p).

Next we assume that (p − 1)! ≡ −1 (mod p) and prove that p must be prime. Suppose not, so that p ≥ 4 is a composite number. Let l be a prime divisor of p. Then l < p, so l | (p − 1)!. Also, by assumption,
l | p | ((p − 1)! + 1).
This is a contradiction, because a prime cannot divide a number a and also divide a + 1, since it would then have to divide (a + 1) − a = 1.

Example. We illustrate the key step in the above proof in the case p = 17. We have
2 · 3 ··· 15 = (2 · 9) · (3 · 6) · (4 · 13) · (5 · 7) · (8 · 15) · (10 · 12) · (14 · 11) ≡ 1 (mod 17),
where we have paired up the numbers a, b for which ab ≡ 1 (mod 17).
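Wilson's theorem can be turned into a (very slow) primality test in a few lines. The sketch below is our own illustration; the name is_prime_wilson is hypothetical, and the running product is reduced modulo p at every step so that the numbers stay small.

```python
from math import factorial


def is_prime_wilson(p: int) -> bool:
    """Primality test from Wilson's theorem: (p-1)! is congruent to -1 mod p."""
    if p < 2:
        return False
    f = 1
    for k in range(2, p):
        f = (f * k) % p          # f is (k!) mod p
    return f == p - 1            # p - 1 represents -1 modulo p


print([p for p in range(2, 30) if is_prime_wilson(p)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(factorial(16) % 17, factorial(14) % 15)   # 16 0
```

The loop performs about p multiplications, which is why this test is of theoretical rather than practical interest.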

2.2 The Chinese Remainder Theorem

In this section we prove the Chinese Remainder Theorem, which gives conditions under which a system of linear equations is guaranteed to have a solution. In the 4th century a Chinese mathematician asked the following:

Question. There is a quantity whose number is unknown. Repeatedly divided by 3, the remainder is 2; by 5 the remainder is 3; and by 7 the remainder is 2. What is the quantity?

In modern notation, the question asks us to find a positive integer solution to the following system of three equations:
x ≡ 2 (mod 3)
x ≡ 3 (mod 5)
x ≡ 2 (mod 7)
The Chinese Remainder Theorem asserts that a solution exists, and the proof gives a method to find one. (See Section 2.3 for the necessary algorithms.)

Theorem (Chinese Remainder Theorem). Let a, b ∈ Z and n, m ∈ N such that gcd(n, m) = 1. Then there exists x ∈ Z such that
x ≡ a (mod m),
x ≡ b (mod n).
Moreover x is unique modulo mn.

Proof. If we can solve for t in the equation
a + tm ≡ b (mod n),
then x = a + tm will satisfy both congruences. To see that we can solve, subtract a from both sides and use Proposition 2.1.7 together with our assumption that gcd(n, m) = 1 to see that there is a solution.

For uniqueness, suppose that x and y solve both congruences. Then z = x − y satisfies z ≡ 0 (mod m) and z ≡ 0 (mod n), so m | z and n | z. Since gcd(n, m) = 1, it follows that nm | z, so x ≡ y (mod nm).

Algorithm (Chinese Remainder Theorem). Given coprime integers m and n and integers a and b, this algorithm finds an integer x such that x ≡ a (mod m) and x ≡ b (mod n).
1. [Extended GCD] Use the extended Euclidean algorithm below to find integers c, d such that cm + dn = 1.
2. [Answer] Output x = a + (b − a)cm and terminate.

Proof. Since c ∈ Z, we have x ≡ a (mod m), and using that cm + dn = 1, we have a + (b − a)cm ≡ a + (b − a) ≡ b (mod n).

Now we can answer the question posed at the start of this section. First, we use the Chinese Remainder Theorem to find a solution to the pair of equations
x ≡ 2 (mod 3),
x ≡ 3 (mod 5).
Set a = 2, b = 3, m = 3, n = 5. Step 1 is to find a solution to t · 3 ≡ 3 − 2 (mod 5). A solution is t = 2. Then x = a + tm = 2 + 2 · 3 = 8. Since any x′ with x′ ≡ x (mod 15) is also a solution to those two equations, we can solve all three equations by finding a solution to the pair of equations
x ≡ 8 (mod 15)
x ≡ 2 (mod 7).
Again, we find a solution to t · 15 ≡ 2 − 8 (mod 7). A solution is t = 1, so
x = a + tm = 8 + 15 = 23.
Note that there are other solutions. Any x′ ≡ x (mod 3 · 5 · 7) is also a solution; e.g., 23 + 3 · 5 · 7 = 128.

Multiplicative Functions

Definition (Multiplicative Function). A function f : N → Z is multiplicative if, whenever m, n ∈ N and gcd(m, n) = 1, we have
f(mn) = f(m) · f(n).

Recall that the Euler ϕ-function is
ϕ(n) = #{a : 1 ≤ a ≤ n and gcd(a, n) = 1}.

Lemma. Suppose that m, n ∈ N and gcd(m, n) = 1. Then the map
ψ : (Z/mnZ)^* → (Z/mZ)^* × (Z/nZ)^* (2.2.1)
defined by
ψ(c) = (c mod m, c mod n)
is a bijection.

Proof. We first show that ψ is injective. If ψ(c) = ψ(c′), then m | c − c′ and n | c − c′, so nm | c − c′ because gcd(n, m) = 1. Thus c = c′ as elements of (Z/mnZ)^*.

Next we show that ψ is surjective. Given a and b with gcd(a, m) = 1 and gcd(b, n) = 1, the Chinese Remainder Theorem implies that there exists c with c ≡ a (mod m) and c ≡ b (mod n). We may assume that 1 ≤ c ≤ nm, and since gcd(a, m) = 1 and gcd(b, n) = 1, we must have gcd(c, nm) = 1. Thus ψ(c) = (a, b).
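The two-congruence step used to answer the question above can be sketched as follows. This is an illustration under our own assumptions: the name crt_pair is hypothetical, and instead of running the extended Euclidean algorithm by hand we obtain the integer c with cm ≡ 1 (mod n) from Python's built-in pow(m, -1, n), which requires Python 3.8 or later.

```python
def crt_pair(a: int, b: int, m: int, n: int) -> int:
    """Return x with x = a (mod m) and x = b (mod n), for coprime m and n.

    The result is reduced to the range 0 <= x < m*n.
    """
    c = pow(m, -1, n)                  # c*m is congruent to 1 modulo n
    return (a + (b - a) * c * m) % (m * n)


x = crt_pair(2, 3, 3, 5)               # x = 2 (mod 3), x = 3 (mod 5)
print(x)                               # 8
x = crt_pair(x, 2, 15, 7)              # x = 8 (mod 15), x = 2 (mod 7)
print(x)                               # 23
```

Applying the pairwise step twice, first with moduli 3 and 5 and then with 15 and 7, recovers the answer 23 found above.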

Proposition (Multiplicativity of ϕ). The function ϕ is multiplicative.

Proof. The map ψ of the preceding lemma is a bijection, so the set on the left in (2.2.1) has the same size as the product set on the right in (2.2.1). Thus
ϕ(mn) = ϕ(m) · ϕ(n).

The proposition is helpful in computing ϕ(n), at least if we assume we can compute the factorization of n (a later section discusses a connection between factoring n and computing ϕ(n)). For example,
ϕ(12) = ϕ(2^2) · ϕ(3) = 2 · 2 = 4.
Also, for n ≥ 1, we have
ϕ(p^n) = p^n − p^n/p = p^n − p^{n−1} = p^{n−1}(p − 1), (2.2.2)
since ϕ(p^n) is the number of positive integers up to p^n minus the number of those that are divisible by p.

2.3 Quickly Computing Inverses and Huge Powers

This section is about how to solve the equation ax ≡ 1 (mod n) when we know it has a solution, and how to efficiently compute a^m (mod n). We also discuss a simple probabilistic primality test that relies on our ability to compute a^m (mod n) quickly. All three of these algorithms are of fundamental importance to the cryptography algorithms of Chapter 3.

How to Solve ax ≡ 1 (mod n)

Suppose a, n ∈ N with gcd(a, n) = 1. Then by Proposition 2.1.7 the equation ax ≡ 1 (mod n) has a unique solution. How can we find it?

Proposition (Extended Euclidean representation). Suppose a, b ∈ Z and let g = gcd(a, b). Then there exist x, y ∈ Z such that
ax + by = g.

Remark. If e = cg is a multiple of g, then cax + cby = cg = e, so e = (cx)a + (cy)b can also be written in terms of a and b.

Proof of the proposition. Let g = gcd(a, b). Then gcd(a/g, b/g) = 1, so by Proposition 2.1.7 the equation
(a/g) · x ≡ 1 (mod b/g) (2.3.1)
has a solution x ∈ Z. Multiplying (2.3.1) through by g yields ax ≡ g (mod b), so there exists y such that b · (−y) = ax − g. Then ax + by = g, as required.

Given a, b and g = gcd(a, b), our proof of the proposition gives a way to explicitly find x, y such that ax + by = g, assuming one knows an algorithm to solve linear equations modulo n. Since we do not know such an algorithm, we now discuss a way to explicitly find x and y. This algorithm will in fact enable us to solve linear equations modulo n: to solve ax ≡ 1 (mod n) when gcd(a, n) = 1, use the algorithm below to find x and y such that ax + ny = 1. Then ax ≡ 1 (mod n).

Suppose a = 5 and b = 7. The steps of the gcd algorithm to compute gcd(5, 7) are as follows. We solve each step for its remainder, because this clarifies the subsequent back substitution we will use to find x and y.
7 = 1 · 5 + 2, so 2 = 7 − 5
5 = 2 · 2 + 1, so 1 = 5 − 2 · 2 = 5 − 2(7 − 5) = 3 · 5 − 2 · 7
On the right, we have back-substituted in order to write each partial remainder as a linear combination of a and b. In the last step, we obtain gcd(a, b) as a linear combination of a and b, as desired.

That example was not too complicated, so we try another one. Let a = 130 and b = 61. We have
130 = 2 · 61 + 8, so 8 = 130 − 2 · 61
61 = 7 · 8 + 5, so 5 = −7 · 130 + 15 · 61
8 = 1 · 5 + 3, so 3 = 8 · 130 − 17 · 61
5 = 1 · 3 + 2, so 2 = −15 · 130 + 32 · 61
3 = 1 · 2 + 1, so 1 = 23 · 130 − 49 · 61
Thus x = 23 and y = −49 is a solution to 130x + 61y = 1.

Algorithm (Extended Euclidean Algorithm). Suppose a and b are integers and let g = gcd(a, b). This algorithm finds g, x and y such that ax + by = g. We describe only the steps when a > b ≥ 0, since one can easily reduce to this case.
1. [Initialize] Set x = 1, y = 0, r = 0, s = 1.
2. [Finished?] If b = 0, set g = a and terminate.

169 2.3 Quickly Computing Inverses and Huge Powers [Quotient and Remainder] Use Algorithm to write a = qb+c with 0 c < b. 4. [Shift] Set (a, b, r, s, x, y) = (b, c, x qr, y qs, r, s) and go to step 2. Proof. This algorithm is the same as Algorithm , except that we keep track of extra variables x, y, r, s, so it terminates and when it terminates d = gcd(a, b). We omit the rest of the inductive proof that the algorithm is correct, and instead refer the reader to [Knu97, 1.2.1] which contains a detailed proof in the context of a discussion of how one writes mathematical proofs. Algorithm (Inverse Modulo n). Suppose a and n are integers and gcd(a, n) = 1. This algorithm finds an x such that ax 1 (mod n). 1. [Compute Extended GCD] Use Algorithm to compute integers x, y such that ax + ny = gcd(a, n) = [Finished] Output x. Proof. Reduce ax+ny = 1 modulo n to see that x satisfies ax 1 (mod n). See Section for implementations of Algorithms and Example Solve 17x 1 (mod 61). First, we use Algorithm to find x, y such that 17x + 61y = 1: 61 = = = = = = = = Thus ( 5) = 1 so x = 18 is a solution to 17x 1 (mod 61) How to Compute a m (mod n) Let a and n be integers, and m a nonnegative integer. In this section we describe an efficient algorithm to compute a m (mod n). For the cryptography applications in Chapter 3, m will have hundreds of digits. The naive approach to computing a m (mod n) is to simply compute a m = a a a (mod n) by repeatedly multiplying by a and reducing modulo m. Note that after each arithmetic operation is completed, we reduce the result modulo n so that the sizes of the numbers involved do not get too large. Nonetheless, this algorithm is horribly inefficient because it takes m 1 multiplications, which is huge if m has hundreds of digits. A much more efficient algorithm for computing a m (mod n) involves writing m in binary, then expressing a m as a product of expressions a 2i, for

various $i$. These latter expressions can be computed by repeatedly squaring $a^{2^i}$. This more clever algorithm is not simpler, but it is vastly more efficient, since the number of operations needed grows with the number of binary digits of $m$, whereas with the naive algorithm above the number of operations is $m - 1$.

Algorithm 2.3.6 (Write a number in binary). Let $m$ be a nonnegative integer. This algorithm writes $m$ in binary, so it finds $\varepsilon_i \in \{0, 1\}$ such that $m = \sum_{i=0}^{r} \varepsilon_i 2^i$.

1. [Initialize] Set $i = 0$.
2. [Finished?] If $m = 0$, terminate.
3. [Digit] If $m$ is odd, set $\varepsilon_i = 1$, otherwise $\varepsilon_i = 0$. Increment $i$.
4. [Divide by 2] Set $m = \lfloor m/2 \rfloor$, the greatest integer $\le m/2$. Go to step 2.

Algorithm 2.3.7 (Compute Power). Let $a$ and $n$ be integers and $m$ a nonnegative integer. This algorithm computes $a^m$ modulo $n$.

1. [Write in Binary] Write $m$ in binary using Algorithm 2.3.6, so $a^m = \prod_{\varepsilon_i = 1} a^{2^i} \pmod{n}$.
2. [Compute Powers] Compute $a$, $a^2$, $a^{2^2} = (a^2)^2$, $a^{2^3} = (a^{2^2})^2$, etc., up to $a^{2^r}$, where $r + 1$ is the number of binary digits of $m$.
3. [Multiply Powers] Multiply together the $a^{2^i}$ such that $\varepsilon_i = 1$, always working modulo $n$.

Implementations of Algorithms 2.3.6 and 2.3.7 are given later in the book.

Example. We can compute the last 2 digits of $6^{91}$ by finding $6^{91} \pmod{100}$. Make a table whose first column, labeled $i$, contains 0, 1, 2, etc. The second column, labeled $m$, is obtained by dividing the entry above it by 2 and taking the integer part of the result. The third column, labeled $\varepsilon_i$, records whether or not the entry in the second column is odd. The fourth column is computed by squaring, modulo $n = 100$, the entry above it.

i    m    ε_i    6^(2^i) mod 100
0    91   1      6
1    45   1      36
2    22   0      96
3    11   1      16
4    5    1      56
5    2    0      36
6    1    1      96

We have

$$6^{91} \equiv 6^{2^6} \cdot 6^{2^4} \cdot 6^{2^3} \cdot 6^{2} \cdot 6 \equiv 96 \cdot 56 \cdot 16 \cdot 36 \cdot 6 \equiv 56 \pmod{100}.$$

That is easier than multiplying 6 by itself 91 times.
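The two algorithms above can be sketched in Python as follows. This is a minimal illustration rather than the book's own implementation; the function names `binary_digits` and `power_mod` are chosen here for the example.

```python
def binary_digits(m):
    """Return the binary digits of m, least significant first (the eps_i)."""
    digits = []
    while m > 0:
        digits.append(m % 2)   # eps_i is 1 exactly when m is odd
        m //= 2                # replace m by floor(m / 2)
    return digits


def power_mod(a, m, n):
    """Compute a^m mod n by repeated squaring."""
    result = 1 % n
    square = a % n             # holds a^(2^i) mod n at step i
    for eps in binary_digits(m):
        if eps == 1:
            result = (result * square) % n
        square = (square * square) % n
    return result


print(binary_digits(91))       # [1, 1, 0, 1, 1, 0, 1]
print(power_mod(6, 91, 100))   # 56
```

The loop performs one squaring per binary digit of m, so the work grows with the number of digits of m rather than with m itself. Python's built-in `pow(a, m, n)` computes the same value.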

Remark. Alternatively, we could simplify the computation using Euler's theorem, with some care: since $\gcd(6, 100) = 2$, the theorem does not apply modulo 100 itself. It does apply modulo 25, where $\varphi(25) = 5^2 - 5 = 20$ gives $6^{20} \equiv 1 \pmod{25}$, and $6^k \equiv 0 \pmod{4}$ for every $k \ge 2$. Since $91 = 11 + 4 \cdot 20$, we have $6^{91} \equiv 6^{11} \pmod{100}$.

2.4 Finding Primes

Theorem (Pseudoprimality). An integer $p > 1$ is prime if and only if for every $a \not\equiv 0 \pmod{p}$,

$$a^{p-1} \equiv 1 \pmod{p}.$$

Proof. If $p$ is prime, then the statement follows from the earlier proposition (Fermat's little theorem). If $p$ is composite, then there is a divisor $a$ of $p$ with $a \ne 1, p$. If $a^{p-1} \equiv 1 \pmod{p}$, then $p \mid a^{p-1} - 1$. Since $a \mid p$, we have $a \mid a^{p-1} - 1$, hence $a \mid 1$, a contradiction.

Suppose $n \in \mathbb{N}$. Using this theorem and Algorithm 2.3.7, we can either quickly prove that $n$ is not prime, or convince ourselves that $n$ is likely prime (but not quickly prove that $n$ is prime). For example, if $2^{n-1} \not\equiv 1 \pmod{n}$, then we have proved that $n$ is not prime. On the other hand, if $a^{n-1} \equiv 1 \pmod{n}$ for a few $a$, it seems likely that $n$ is prime, and we loosely refer to such a number that seems prime for several bases as a pseudoprime.

There are composite numbers $n$ (called Carmichael numbers) with the amazing property that $a^{n-1} \equiv 1 \pmod{n}$ for all $a$ with $\gcd(a, n) = 1$. The first Carmichael number is 561, and it is a theorem that there are infinitely many such numbers ([AGP94]).

Example. Is $p = 323$ prime? We compute $2^{322} \pmod{323}$. Making a table as above, we have

i    m     ε_i    2^(2^i) mod 323
0    322   0      2
1    161   1      4
2    80    0      16
3    40    0      256
4    20    0      290
5    10    0      120
6    5     1      188
7    2     0      137
8    1     1      35

Thus

$$2^{322} \equiv 4 \cdot 188 \cdot 35 \equiv 157 \pmod{323},$$

so 323 is not prime, though this computation gives no information about how 323 factors as a product of primes. In fact, one finds that $323 = 17 \cdot 19$.
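The pseudoprimality test can be tried directly with Python's built-in modular exponentiation. The following is a small sketch; the helper name `fermat_witnesses` and the default choice of bases are assumptions made for this illustration, not part of the text.

```python
def fermat_witnesses(n, bases=(2, 5, 7, 13)):
    """Return the bases a with a^(n-1) not congruent to 1 mod n.

    A nonempty result proves that n is composite. An empty result only
    suggests that n may be prime (n is a pseudoprime to these bases).
    """
    return [a for a in bases if a % n != 0 and pow(a, n - 1, n) != 1]


print(pow(2, 322, 323))          # 157, so 323 is composite
print(fermat_witnesses(97))      # [] since 97 is prime
print(fermat_witnesses(561))     # [] although 561 = 3 * 11 * 17 is composite
```

The last line shows why the test is only a heuristic: the Carmichael number 561 passes for every base coprime to it.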

It is possible to easily prove that a large number is composite, but the proof does not easily yield a factorization. For example, if a large $n$ satisfies $2^{n-1} \not\equiv 1 \pmod{n}$, then $n$ is composite, even though the computation exhibits no factor of $n$.

Another practical primality test is the Miller-Rabin test, which has the property that each time it is run on a number $n$ it either correctly asserts that the number is definitely not prime, or that it is probably prime, and the probability of correctness goes up with each successive call. If Miller-Rabin is called $m$ times on $n$ and in each case claims that $n$ is probably prime, then one can in a precise sense bound the probability that $n$ is composite in terms of $m$. A precise statement, a proof of correctness, and an implementation of Miller-Rabin are given later in the book (see the listing in Chapter 7).

Until recently it was an open problem to give an algorithm (with proof) that decides whether or not any integer is prime in time bounded by a polynomial in the number of digits of the integer. Agrawal, Kayal, and Saxena found the first polynomial-time primality test (see [AKS02]). We will not discuss their algorithm further, because for our applications to cryptography Miller-Rabin or pseudoprimality tests will be sufficient.

2.5 The Structure of $(\mathbb{Z}/p\mathbb{Z})^*$

This section is about the structure of the group $(\mathbb{Z}/p\mathbb{Z})^*$ of units modulo a prime number $p$. The main result is that this group is always cyclic. We will use this result later in Chapter 4 in our proof of quadratic reciprocity.

Definition (Primitive root). A primitive root modulo an integer $n$ is an element of $(\mathbb{Z}/n\mathbb{Z})^*$ of order $\varphi(n)$.

We will prove that there is a primitive root modulo every prime $p$. Since the unit group $(\mathbb{Z}/p\mathbb{Z})^*$ has order $p - 1$, this implies that $(\mathbb{Z}/p\mathbb{Z})^*$ is a cyclic group, a fact that will be extremely useful, since it completely determines the structure of $(\mathbb{Z}/p\mathbb{Z})^*$ as an abelian group.

If $n$ is an odd prime power, then there is a primitive root modulo $n$ (see Exercise 2.25), but there is no primitive root modulo the prime power $2^3$, and hence none mod $2^n$ for $n \ge 3$ (see Exercise 2.24).

The next section, on polynomials over $\mathbb{Z}/p\mathbb{Z}$, is the key input to our proof that $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic; there we show that for every divisor $d$ of $p - 1$ there are exactly $d$ elements of $(\mathbb{Z}/p\mathbb{Z})^*$ whose order divides $d$. We then use this result in the section on existence of primitive roots to produce an element of $(\mathbb{Z}/p\mathbb{Z})^*$ of order $q^r$ when $q^r$ is a prime power that exactly divides $p - 1$ (i.e., $q^r$ divides $p - 1$, but $q^{r+1}$ does not divide $p - 1$), and multiply together these elements to obtain an element of $(\mathbb{Z}/p\mathbb{Z})^*$ of order $p - 1$.

Polynomials over $\mathbb{Z}/p\mathbb{Z}$

The polynomial $x^2 - 1$ has four roots in $\mathbb{Z}/8\mathbb{Z}$, namely 1, 3, 5, and 7. In contrast, the following proposition shows that a polynomial of degree $d$ over a field, such as $\mathbb{Z}/p\mathbb{Z}$, can have at most $d$ roots.

Proposition 2.5.2 (Root Bound). Let $f \in k[x]$ be a nonzero polynomial over a field $k$. Then there are at most $\deg(f)$ elements $\alpha \in k$ such that $f(\alpha) = 0$.

Proof. We prove the proposition by induction on $\deg(f)$. The cases in which $\deg(f) \le 1$ are clear. Write $f = a_n x^n + \cdots + a_1 x + a_0$. If $f(\alpha) = 0$ then

$$\begin{aligned}
f(x) &= f(x) - f(\alpha) \\
     &= a_n(x^n - \alpha^n) + \cdots + a_1(x - \alpha) + a_0(1 - 1) \\
     &= (x - \alpha)\bigl(a_n(x^{n-1} + \cdots + \alpha^{n-1}) + \cdots + a_2(x + \alpha) + a_1\bigr) \\
     &= (x - \alpha)\, g(x),
\end{aligned}$$

for some polynomial $g(x) \in k[x]$. Next suppose that $f(\beta) = 0$ with $\beta \ne \alpha$. Then $(\beta - \alpha) g(\beta) = 0$, so, since $\beta - \alpha \ne 0$, we have $g(\beta) = 0$. By our inductive hypothesis, $g$ has at most $n - 1$ roots, so there are at most $n - 1$ possibilities for $\beta$. It follows that $f$ has at most $n$ roots.

Proposition 2.5.3. Let $p$ be a prime number and let $d$ be a divisor of $p - 1$. Then $f = x^d - 1 \in (\mathbb{Z}/p\mathbb{Z})[x]$ has exactly $d$ roots in $\mathbb{Z}/p\mathbb{Z}$.

Proof. Let $e = (p - 1)/d$. We have

$$x^{p-1} - 1 = (x^d)^e - 1 = (x^d - 1)\bigl((x^d)^{e-1} + (x^d)^{e-2} + \cdots + 1\bigr) = (x^d - 1)\, g(x),$$

where $g \in (\mathbb{Z}/p\mathbb{Z})[x]$ and $\deg(g) = de - d = p - 1 - d$. Fermat's little theorem implies that $x^{p-1} - 1$ has exactly $p - 1$ roots in $\mathbb{Z}/p\mathbb{Z}$, since every nonzero element of $\mathbb{Z}/p\mathbb{Z}$ is a root! By Proposition 2.5.2, $g$ has at most $p - 1 - d$ roots and $x^d - 1$ has at most $d$ roots. Since a root of $(x^d - 1) g(x)$ is a root of either $x^d - 1$ or $g(x)$, and $x^{p-1} - 1$ has $p - 1$ roots, $g$ must have exactly $p - 1 - d$ roots and $x^d - 1$ must have exactly $d$ roots, as claimed.

We pause to reemphasize that the analogue of Proposition 2.5.3 is false when $p$ is replaced by a composite integer $n$, since a root mod $n$ of a product of two polynomials need not be a root of either factor. For example, $f = x^2 - 1 \in (\mathbb{Z}/15\mathbb{Z})[x]$ has the four roots 1, 4, 11, and 14.
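The contrast between prime and composite moduli is easy to check by brute force. The sketch below (the function name `roots_of_x_d_minus_1` is chosen for this illustration) lists the roots of x^d - 1 modulo n by trying every residue.

```python
def roots_of_x_d_minus_1(d, n):
    """Return all x in Z/nZ with x^d = 1, found by exhaustive search."""
    return [x for x in range(n) if pow(x, d, n) == 1 % n]


# Modulo the prime 13, x^d - 1 has exactly d roots for each d dividing 12.
for d in (1, 2, 3, 4, 6, 12):
    print(d, roots_of_x_d_minus_1(d, 13))

# Modulo composite numbers the root bound can fail.
print(roots_of_x_d_minus_1(2, 8))    # [1, 3, 5, 7]
print(roots_of_x_d_minus_1(2, 15))   # [1, 4, 11, 14]
```

Exhaustive search is only practical for small moduli; it is used here to confirm the statements above, not as an efficient method.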

Existence of Primitive Roots

Recall that the order of an element $x$ in a finite group is the smallest $m \ge 1$ such that $x^m = 1$. In this section, we prove that $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic by using the results of the previous section to produce an element of $(\mathbb{Z}/p\mathbb{Z})^*$ of order $d$ for each prime power divisor $d$ of $p - 1$, and then we multiply these together to obtain an element of order $p - 1$.

We will use the following lemma to assemble elements of each order dividing $p - 1$ to produce an element of order $p - 1$.

Lemma. Suppose $a, b \in (\mathbb{Z}/n\mathbb{Z})^*$ have orders $r$ and $s$, respectively, and that $\gcd(r, s) = 1$. Then $ab$ has order $rs$.

Proof. This is a general fact about commuting elements of any group; our proof only uses that $ab = ba$ and nothing special about $(\mathbb{Z}/n\mathbb{Z})^*$. Since

$$(ab)^{rs} = a^{rs} b^{rs} = 1,$$

the order of $ab$ is a divisor of $rs$. Write this divisor as $r_1 s_1$ where $r_1 \mid r$ and $s_1 \mid s$. Raise both sides of the equation

$$a^{r_1 s_1} b^{r_1 s_1} = (ab)^{r_1 s_1} = 1$$

to the power $r_2 = r/r_1$ to obtain

$$a^{r_1 r_2 s_1} b^{r_1 r_2 s_1} = 1.$$

Since $a^{r_1 r_2 s_1} = (a^{r_1 r_2})^{s_1} = 1$, we have $b^{r_1 r_2 s_1} = 1$, so $s \mid r_1 r_2 s_1$. Since $\gcd(s, r_1 r_2) = \gcd(s, r) = 1$, it follows that $s \mid s_1$, hence $s = s_1$. Similarly $r = r_1$, so the order of $ab$ is $rs$.

Theorem (Primitive Roots). There is a primitive root modulo any prime $p$. In particular, the group $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic.

Proof. The theorem is true if $p = 2$, since 1 is a primitive root, so we may assume $p > 2$. Write $p - 1$ as a product of distinct prime powers $q_i^{n_i}$:

$$p - 1 = q_1^{n_1} q_2^{n_2} \cdots q_r^{n_r}.$$

By Proposition 2.5.3, the polynomial $x^{q_i^{n_i}} - 1$ has exactly $q_i^{n_i}$ roots, and the polynomial $x^{q_i^{n_i - 1}} - 1$ has exactly $q_i^{n_i - 1}$ roots. There are $q_i^{n_i} - q_i^{n_i - 1} = q_i^{n_i - 1}(q_i - 1)$ elements $a \in \mathbb{Z}/p\mathbb{Z}$ such that $a^{q_i^{n_i}} = 1$ but $a^{q_i^{n_i - 1}} \ne 1$; each of these elements has order $q_i^{n_i}$. Thus for each $i = 1, \ldots, r$, we can choose an $a_i$ of order $q_i^{n_i}$. Then, using the lemma repeatedly, we see that

$$a = a_1 a_2 \cdots a_r$$

has order $q_1^{n_1} \cdots q_r^{n_r} = p - 1$, so $a$ is a primitive root modulo $p$.

Example. We illustrate the proof of the theorem when $p = 13$. We have

$$p - 1 = 12 = 2^2 \cdot 3.$$

The polynomial $x^4 - 1$ has roots $\{1, 5, 8, 12\}$ and $x^2 - 1$ has roots $\{1, 12\}$, so we may take $a_1 = 5$. The polynomial $x^3 - 1$ has roots $\{1, 3, 9\}$, and we set $a_2 = 3$. Then $a = 5 \cdot 3 = 15 \equiv 2$ is a primitive root. To verify this, note that the successive powers of 2 (mod 13) are

2, 4, 8, 3, 6, 12, 11, 9, 5, 10, 7, 1.

Example. The theorem is false if, e.g., $p$ is replaced by a power of 2 bigger than 4. For example, the four elements of $(\mathbb{Z}/8\mathbb{Z})^*$ each have order dividing 2, but $\varphi(8) = 4$.

Theorem (Primitive Roots mod $p^n$). Let $p^n$ be a power of an odd prime. Then there is a primitive root modulo $p^n$.

The proof is left as Exercise 2.25.

Proposition (Number of primitive roots). If there is a primitive root modulo $n$, then there are exactly $\varphi(\varphi(n))$ primitive roots modulo $n$.

Proof. The primitive roots modulo $n$ are the generators of $(\mathbb{Z}/n\mathbb{Z})^*$, which by assumption is cyclic of order $\varphi(n)$. Thus they are in bijection with the generators of any cyclic group of order $\varphi(n)$. In particular, the number of primitive roots modulo $n$ is the same as the number of elements of $\mathbb{Z}/\varphi(n)\mathbb{Z}$ with additive order $\varphi(n)$. An element of $\mathbb{Z}/\varphi(n)\mathbb{Z}$ has additive order $\varphi(n)$ if and only if it is coprime to $\varphi(n)$. There are $\varphi(\varphi(n))$ such elements, as claimed.

Example. There are $\varphi(\varphi(17)) = \varphi(16) = 2^4 - 2^3 = 8$ primitive roots mod 17, namely 3, 5, 6, 7, 10, 11, 12, 14. The $\varphi(\varphi(9)) = \varphi(6) = 2$ primitive roots modulo 9 are 2 and 5. There are no primitive roots modulo 8, even though $\varphi(\varphi(8)) = \varphi(4) = 2 > 0$.

Artin's Conjecture

Conjecture (Emil Artin). Suppose $a \in \mathbb{Z}$ is not $-1$ or a perfect square. Then there are infinitely many primes $p$ such that $a$ is a primitive root modulo $p$.

There is no single integer $a$ such that Artin's conjecture is known to be true. For any given $a$, Pieter Moree [Mor93] proved that there are infinitely many $p$ such that the order of $a$ is divisible by the largest prime factor of $p - 1$. Hooley [Hoo67] proved that the Generalized Riemann Hypothesis implies Artin's conjecture.
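Both the count of primitive roots and Artin's conjecture can be explored numerically. The sketch below relies on the standard criterion that a is a primitive root modulo a prime p exactly when a^((p-1)/q) is not 1 modulo p for every prime q dividing p - 1. The function names are chosen for this illustration, and trial division limits it to small inputs.

```python
def prime_factors(n):
    """Return the set of prime divisors of n, by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(a, p):
    """For a prime p, test whether a generates the units modulo p."""
    if a % p == 0:
        return False
    return all(pow(a, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def primes_up_to(x):
    """Return all primes <= x using the sieve of Eratosthenes."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def artin_ratio(a, x):
    """Fraction of primes p <= x for which a is a primitive root."""
    primes = primes_up_to(x)
    return sum(1 for p in primes if is_primitive_root(a, p)) / len(primes)


print([a for a in range(1, 17) if is_primitive_root(a, 17)])
# [3, 5, 6, 7, 10, 11, 12, 14]
print(artin_ratio(2, 10000))
```

The ratio printed on the last line is numerical evidence only; it says nothing about whether infinitely many such primes exist.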

Remark. Artin conjectured more precisely that if $N(x, a)$ is the number of primes $p \le x$ such that $a$ is a primitive root modulo $p$, then $N(x, a)$ is asymptotic to $C(a)\pi(x)$, where $C(a)$ is a positive constant that depends only on $a$ and $\pi(x)$ is the number of primes up to $x$.

Computing Primitive Roots

The theorem on the existence of primitive roots does not suggest an efficient algorithm for finding them. To actually find a primitive root mod $p$ in practice, we try $a = 2$, then $a = 3$, etc., until we find an $a$ that has order $p - 1$. Computing the order of an element of $(\mathbb{Z}/p\mathbb{Z})^*$ requires factoring $p - 1$, which we do not know how to do quickly in general, so finding a primitive root modulo $p$ for large $p$ seems to be a difficult problem. An implementation of this algorithm for finding a primitive root is given later in the book.

Algorithm (Primitive Root). Given a prime $p$, this algorithm computes the smallest positive integer $a$ that generates $(\mathbb{Z}/p\mathbb{Z})^*$.

1. [p = 2?] If $p = 2$ output 1 and terminate. Otherwise set $a = 2$.
2. [Prime Divisors] Compute the prime divisors $p_1, \ldots, p_r$ of $p - 1$ (see Section 7.1.3).
3. [Generator?] If for every $p_i$ we have $a^{(p-1)/p_i} \not\equiv 1 \pmod{p}$, then $a$ is a generator of $(\mathbb{Z}/p\mathbb{Z})^*$, so output $a$ and terminate.
4. [Try next] Set $a = a + 1$ and go to step 3.

Proof. Let $a \in (\mathbb{Z}/p\mathbb{Z})^*$. The order of $a$ is a divisor $d$ of the order $p - 1$ of the group $(\mathbb{Z}/p\mathbb{Z})^*$. Write $d = (p - 1)/n$, for some divisor $n$ of $p - 1$. If $a$ is not a generator of $(\mathbb{Z}/p\mathbb{Z})^*$, then $n > 1$, so since $n \mid (p - 1)$ there is a prime divisor $p_i$ of $p - 1$ such that $p_i \mid n$. Then

$$a^{(p-1)/p_i} = \bigl(a^{(p-1)/n}\bigr)^{n/p_i} \equiv 1 \pmod{p}.$$

Conversely, if $a$ is a generator, then $a^{(p-1)/p_i} \not\equiv 1 \pmod{p}$ for any $p_i$. Thus the algorithm terminates with step 3 if and only if the $a$ under consideration is a primitive root. By the theorem on the existence of primitive roots there is at least one primitive root, so the algorithm terminates.

2.6 Exercises

2.1 Compute the following gcd's using the Euclidean algorithm: gcd(15, 35), gcd(247, 299), gcd(51, 897), gcd(136, 304).

2.2 Use Algorithm 2.3.3 to find $x, y \in \mathbb{Z}$ such that $2261x + 1275y = 17$.

2.3 Prove that if $a$ and $b$ are integers and $p$ is a prime, then $(a + b)^p \equiv a^p + b^p \pmod{p}$. You may assume that the binomial coefficient $\frac{p!}{r!(p - r)!}$ is an integer.

2.4 (a) Prove that if $x, y$ is a solution to $ax + by = d$, then for all $c \in \mathbb{Z}$,

$$x' = x + c \cdot \frac{b}{d}, \qquad y' = y - c \cdot \frac{a}{d} \qquad (2.6.1)$$

is also a solution to $ax + by = d$.
(b) Find two distinct solutions to $2261x + 1275y = 17$.
(c) Prove that all solutions are of the form (2.6.1) for some $c$.

2.5 Let $f(x) = x^2 + ax + b \in \mathbb{Z}[x]$ be a quadratic polynomial with integer coefficients and positive leading coefficient, e.g., $f(x) = x^2 + x + 6$. Formulate a conjecture about when the set $\{f(n) : n \in \mathbb{Z} \text{ and } f(n) \text{ is prime}\}$ is infinite. Give numerical evidence that supports your conjecture.

2.6 Find four complete sets of residues modulo 7, where the $i$th set satisfies the $i$th condition: (1) nonnegative, (2) odd, (3) even, (4) prime.

2.7 Find rules in the spirit of the earlier proposition on divisibility by 3 for divisibility of an integer by 5, 9, and 11, and prove each of these rules using arithmetic modulo a suitable $n$.

2.8 (*) The following problem is from the 1998 Putnam Competition. Define a sequence of decimal integers $a_n$ as follows: $a_1 = 0$, $a_2 = 1$, and $a_{n+2}$ is obtained by writing the digits of $a_{n+1}$ immediately followed by those of $a_n$. For example, $a_3 = 10$, $a_4 = 101$, and $a_5 = 10110$. Determine the $n$ such that $a_n$ is a multiple of 11, as follows:
(a) Find the smallest integer $n > 1$ such that $a_n$ is divisible by 11.
(b) Prove that $a_n$ is divisible by 11 if and only if $n \equiv 1 \pmod{6}$.

2.9 Find an integer $x$ such that $37x \equiv 1 \pmod{101}$.

2.10 What is the order of 2 modulo 17?

2.11 Let $p$ be a prime. Prove that $\mathbb{Z}/p\mathbb{Z}$ is a field.

2.12 Find an $x \in \mathbb{Z}$ such that $x \equiv -4 \pmod{17}$ and $x \equiv 3 \pmod{23}$.

2.13 Prove that if $n > 4$ is composite then $(n - 1)! \equiv 0 \pmod{n}$.

2.14 For what values of $n$ is $\varphi(n)$ odd?

2.15 (a) Prove that $\varphi$ is multiplicative as follows. Suppose $m, n$ are positive integers and $\gcd(m, n) = 1$. Show that the natural map $\psi : \mathbb{Z}/mn\mathbb{Z} \to \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}$ is an injective homomorphism of rings, hence bijective by counting, then look at unit groups.
(b) Prove conversely that if $\gcd(m, n) > 1$ then the natural map $\psi : \mathbb{Z}/mn\mathbb{Z} \to \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}$ is not an isomorphism.

2.16 Seven competitive math students try to share a huge hoard of stolen math books equally between themselves. Unfortunately, six books are left over, and in the fight over them, one math student is expelled. The remaining six math students, still unable to share the math books equally since two are left over, again fight, and another is expelled. When the remaining five share the books, one book is left over, and it is only after yet another math student is expelled that an equal sharing is possible. What is the minimum number of books which allow this to happen?

2.17 Show that if $p$ is a positive integer such that both $p$ and $p^2 + 2$ are prime, then $p = 3$.

2.18 Let $\varphi : \mathbb{N} \to \mathbb{N}$ be the Euler $\varphi$ function.
(a) Find all natural numbers $n$ such that $\varphi(n) = 1$.
(b) Do there exist natural numbers $m$ and $n$ such that $\varphi(mn) \ne \varphi(m) \cdot \varphi(n)$?

2.19 Find a formula for $\varphi(n)$ directly in terms of the prime factorization of $n$.

2.20 Find all four solutions to the equation $x^2 - 1 \equiv 0 \pmod{35}$.

2.21 Prove that for any positive integer $n$ the fraction $(12n + 1)/(30n + 2)$ is in reduced form.

2.22 Suppose $a$ and $b$ are positive integers.
(a) Prove that $\gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a, b)} - 1$.
(b) Does it matter if 2 is replaced by an arbitrary prime $p$?
(c) What if 2 is replaced by an arbitrary positive integer $n$?

2.23 For every positive integer $b$, show that there exists a positive integer $n$ such that the polynomial $x^2 - 1 \in (\mathbb{Z}/n\mathbb{Z})[x]$ has at least $b$ roots.

2.24 (a) Prove that there is no primitive root modulo $2^n$ for any $n \ge 3$.
(b) (*) Prove that $(\mathbb{Z}/2^n\mathbb{Z})^*$ is generated by $-1$ and 5.

2.25 Let $p$ be an odd prime.
(a) (*) Prove that there is a primitive root modulo $p^2$. (Hint: Use that if $a, b$ have orders $n, m$, with $\gcd(n, m) = 1$, then $ab$ has order $nm$.)
(b) Prove that for any $n$, there is a primitive root modulo $p^n$.
(c) Explicitly find a primitive root modulo 125.

2.26 (*) In terms of the prime factorization of $n$, characterize the integers $n$ such that there is a primitive root modulo $n$.


3 Public-Key Cryptography

The author recently watched a TV show (not movie!) called La Femme Nikita about a woman named Nikita who is forced to be an agent for a shady anti-terrorist organization called Section One. Nikita has strong feelings for fellow agent Michael, and she most trusts Walter, Section One's ex-biker gadgets and explosives expert. Often Nikita's worst enemies are her superiors and coworkers at Section One. A synopsis for a season three episode is as follows:

PLAYING WITH FIRE

On a mission to secure detonation chips from a terrorist organization's heavily armed base camp, Nikita is captured as a hostage by the enemy. Or so it is made to look. Michael and Nikita have actually created the scenario in order to secretly rendezvous with each other. The ruse works, but when Birkoff [Section One's master hacker] accidentally discovers encrypted messages between Michael and Nikita sent with Walter's help, Birkoff is forced to tell Madeline. Suspecting that Michael and Nikita may be planning a coup d'état, Operations and Madeline use a second team of operatives to track Michael and Nikita's next secret rendezvous... killing them if necessary.


FIGURE 3.1. Diffie and Hellman (photos from [Sin99])

What sort of encryption might Walter have helped them to use? I let my imagination run free, and this is what I came up with. After being captured at the base camp, Nikita is given a phone by her captors, in hopes that she'll use it and they'll be able to figure out what she is really up to. Everyone is eagerly listening in on her calls.

Remark. In this book we will assume available a method for producing random integers. Methods for generating random integers are involved and interesting, but we will not discuss them in this book. For an in-depth treatment of random numbers, see [Knu98, Ch. 3].

Nikita remembers a conversation with Walter about a public-key cryptosystem called the Diffie-Hellman key exchange. She remembers that it allows two people to agree on a secret key in the presence of eavesdroppers. Moreover, Walter mentioned that though Diffie-Hellman was the first ever public-key exchange system, it is still in common use today (e.g., in OpenSSH protocol version 2).

Nikita pulls out her handheld computer and phone, calls up Michael, and they do the following, which is wrong (try to figure out what is wrong as you read it).

1. Together they choose a big prime number $p$ and a number $g$ with $1 < g < p$.
2. Nikita secretly chooses an integer $n$.
3. Michael secretly chooses an integer $m$.
4. Nikita tells Michael $ng \pmod{p}$.
5. Michael tells $mg \pmod{p}$ to Nikita.
6. The secret key is $s = nmg \pmod{p}$, which both Nikita and Michael can easily compute.

Here's a very simple example with small numbers that illustrates what Michael and Nikita do. (They really used much larger numbers.)

1. $p = 97$, $g = 5$
2. $n = 31$
3. $m = 95$
4. $ng \equiv 58 \pmod{97}$
5. $mg \equiv 87 \pmod{97}$
6. $s = nmg \equiv 78 \pmod{97}$

Nikita and Michael are foiled because everyone easily figures out $s$:

1. Everyone knows $p$, $g$, $ng \pmod{p}$, and $mg \pmod{p}$.
2. Using Algorithm 2.3.3, anyone can easily find $a, b \in \mathbb{Z}$ such that $ag + bp = 1$, which exist because $\gcd(g, p) = 1$.
3. Then $a \cdot ng \equiv n \pmod{p}$, so everyone knows Nikita's secret key $n$, and hence can easily compute the shared secret $s$.

To taunt her, Nikita's captors give her a paragraph from a review of Diffie and Hellman's 1976 paper "New Directions in Cryptography" [DH76]:

"The authors discuss some recent results in communications theory [...] The first [method] has the feature that an unauthorized eavesdropper will find it computationally infeasible to decipher the message [...] They propose a couple of techniques for implementing the system, but the reviewer was unconvinced."
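The attack on the flawed scheme can be sketched in a few lines of Python. The function `xgcd` follows the steps of the extended Euclidean algorithm (Algorithm 2.3.3); its name and the variable names below are chosen for this illustration, and the eavesdropper is assumed to see only p, g, ng mod p, and mg mod p.

```python
def xgcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), for a, b >= 0."""
    x, y, r, s = 1, 0, 0, 1
    while b != 0:
        q, c = divmod(a, b)                  # a = q*b + c with 0 <= c < b
        a, b, r, s, x, y = b, c, x - q * r, y - q * s, r, s
    return a, x, y


# Public values overheard by everyone.
p, g = 97, 5
ng, mg = 58, 87

# Find a with a*g + b*p = 1, so that a is an inverse of g modulo p.
_, a, b = xgcd(g, p)
assert a * g + b * p == 1

n = (a * ng) % p          # Nikita's secret: a*n*g = n (mod p)
s = (n * mg) % p          # the "shared secret" n*m*g (mod p)
print(n, s)               # 31 78
print(xgcd(130, 61))      # (1, 23, -49)
```

The scheme fails because multiplication modulo p is easy to undo: dividing by g takes one run of the extended Euclidean algorithm.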

3.1 The Diffie-Hellman Key Exchange

As night darkens Nikita's cell, she reflects on what has happened. Upon realizing that she mis-remembered how the system works, she phones Michael and they do the following:

1. Together Michael and Nikita choose a 200-digit integer $p$ that is likely to be prime (see Section 2.4), and choose a number $g$ with $1 < g < p$.
2. Nikita secretly chooses an integer $n$.
3. Michael secretly chooses an integer $m$.
4. Nikita computes $g^n \pmod{p}$ on her handheld computer and tells Michael the resulting number over the phone.
5. Michael tells Nikita $g^m \pmod{p}$.
6. The shared secret key is then $s \equiv (g^n)^m \equiv (g^m)^n \equiv g^{nm} \pmod{p}$, which both Nikita and Michael can compute.

Here is a simplified example that illustrates what they did, and that involves only relatively simple arithmetic.

1. $p = 97$, $g = 5$
2. $n = 31$
3. $m = 95$
4. $g^n \equiv 7 \pmod{p}$
5. $g^m \equiv 39 \pmod{p}$
6. $s \equiv (g^n)^m \equiv 14 \pmod{p}$

The Discrete Log Problem

Nikita communicates with Michael by encrypting everything using their agreed upon secret key. In order to understand the conversation, the eavesdropper needs $s$, but it takes a long time to compute $s$ given only $p$, $g$, $g^n$, and $g^m$. One way would be to compute $n$ from knowledge of $g$ and $g^n$; this is possible, but appears to be computationally infeasible, in the sense that it would take too long to be practical.
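The exchange in the simplified example can be replayed with Python's built-in modular exponentiation. This is a toy sketch: the secret exponents n = 31 and m = 95 are the values consistent with the numbers 7, 39, and 14 in the example, and a real exchange would use a prime with hundreds of digits.

```python
p, g = 97, 5            # public parameters
n, m = 31, 95           # secret exponents of Nikita and Michael

from_nikita = pow(g, n, p)     # Nikita announces g^n mod p
from_michael = pow(g, m, p)    # Michael announces g^m mod p

# Each side raises the number it received to its own secret exponent.
key_nikita = pow(from_michael, n, p)
key_michael = pow(from_nikita, m, p)

print(from_nikita, from_michael)   # 7 39
print(key_nikita, key_michael)     # 14 14
```

Both parties arrive at the same key because (g^m)^n and (g^n)^m are both equal to g^(nm) modulo p.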

Let $a$, $b$, and $n$ be real numbers with $a, b > 0$ and $n \ge 0$. Recall that the "log to the base $b$" function is characterized by

$$\log_b(a) = n \text{ if and only if } a = b^n.$$

We use the $\log_b$ function in algebra to solve the following problem: Given a base $b$ and a power $a$ of $b$, find an exponent $n$ such that

$$a = b^n.$$

That is, given $a = b^n$ and $b$, find $n$.

Example. The number $a = 19683$ is the $n$th power of $b = 3$ for some $n$. With a calculator we quickly find that

$$n = \log_3(19683) = \log(19683)/\log(3) = 9.$$

A calculator can quickly compute an approximation for $\log(x)$ by computing a partial sum of an appropriate rapidly-converging infinite series (at least for $x$ in a certain range).

The discrete log problem is the analogue of this problem but in a finite group:

Problem 3.1.2 (Discrete Log Problem). Let $G$ be a finite abelian group, e.g., $G = (\mathbb{Z}/p\mathbb{Z})^*$. Given $b \in G$ and a power $a$ of $b$, find a positive integer $n$ such that $b^n = a$.

As far as we know, finding discrete logarithms when $p$ is large is difficult in practice. Over the years, many people have been very motivated to try. For example, if Nikita's captors could efficiently solve Problem 3.1.2, then they could read the messages she exchanges with Michael. Unfortunately, we have no formal proof that computing discrete logarithms on a classical computer is difficult. Also, Peter Shor [Sho97] showed that if one could build a sufficiently complicated quantum computer, it could solve the discrete logarithm problem in time bounded by a polynomial function of the number of digits of $\#G$.

It is easy to give an inefficient algorithm that solves the discrete log problem. Simply try $b^1$, $b^2$, $b^3$, etc., until we find an exponent $n$ such that $b^n = a$. For example, suppose $a = 18$, $b = 5$, and $p = 23$. Working modulo 23 we have

$$b^1 = 5,\ b^2 = 2,\ b^3 = 10,\ \ldots,\ b^{12} = 18,$$

so $n = 12$.
When $p$ is large, computing the discrete log this way soon becomes impractical, because increasing the number of digits of the modulus makes the computation take vastly longer.

Perhaps part of the reason that computing discrete logarithms is difficult is that the logarithm in the real numbers is continuous, but the (minimum) logarithm of a number mod $n$ bounces around seemingly at random. We illustrate this exotic behavior in Figure 3.2.
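The inefficient algorithm described above is short enough to write out. The sketch below (the function name `discrete_log` is chosen for this illustration) tries successive powers of b modulo p until it hits a, and returns None if a is not a power of b.

```python
def discrete_log(b, a, p):
    """Return the smallest n >= 1 with b^n = a (mod p), or None."""
    power = 1
    for n in range(1, p):
        power = (power * b) % p      # power is now b^n mod p
        if power == a % p:
            return n
    return None


print(discrete_log(5, 18, 23))       # 12

# The logs of consecutive residues show no obvious pattern.
print([discrete_log(5, a, 97) for a in range(1, 11)])
```

The loop may run through nearly p steps, so its cost grows exponentially in the number of digits of p, which is why this approach is hopeless for a 200-digit modulus.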

FIGURE 3.2. Graphs of the continuous log and of the discrete log modulo 97. Which looks easier to compute?

3.1.2 Realistic Diffie-Hellman Example

In this section we present an example that uses bigger numbers. First we prove a proposition that we can use to choose a prime $p$ in such a way that it is easy to find a $g \in (\mathbb{Z}/p\mathbb{Z})^*$ with order $p - 1$. We have already seen in Section 2.5 that for every prime $p$ there exists an element $g$ of order $p - 1$, and we gave an algorithm for finding a primitive root for any prime. The significance of the proposition below is that it suggests an algorithm for finding a primitive root that is easier to use in practice when $p$ is large, because it does not require factoring $p - 1$. Of course, one could also just use a random $g$ for Diffie-Hellman; it is not essential that $g$ generates $(\mathbb{Z}/p\mathbb{Z})^*$.

Proposition. Suppose $p$ is a prime such that $(p - 1)/2$ is also prime. Then the elements of $(\mathbb{Z}/p\mathbb{Z})^*$ have order either 1, 2, $(p - 1)/2$, or $p - 1$.

Proof. Since $p$ is prime, the group $(\mathbb{Z}/p\mathbb{Z})^*$ has order $p - 1$. By assumption, the prime factorization of $p - 1$ is $2 \cdot ((p - 1)/2)$. Let $a \in (\mathbb{Z}/p\mathbb{Z})^*$. Then by Fermat's little theorem, $a^{p-1} = 1$, so the order of $a$ is a divisor of $p - 1$, which proves the proposition.

Given a prime $p$ with $(p - 1)/2$ prime, find an element of order $p - 1$ as follows. If 2 has order $p - 1$ we are done. If not, 2 has order $(p - 1)/2$, since 2 doesn't have order either 1 or 2. In that case $(p - 1)/2$ is an odd prime and $-1$ has order 2, so by the lemma on orders of products in Section 2.5, $-2$ has order $p - 1$.

For a realistic example, start from a large prime $p$ for which $(p - 1)/2$ is not prime. We keep adding 2 to $p$ and testing pseudoprimality using Section 2.4 until we reach a pseudoprime $q$ such that $(q - 1)/2$ is also pseudoprime. We find that 2 has order $(q - 1)/2$, so $g = -2$ has order $q - 1$ and is hence a generator of $(\mathbb{Z}/q\mathbb{Z})^*$, at least assuming that $q$ is really prime.

Nikita and Michael each generate a secret random number, $n$ and $m$ respectively. Nikita sends $g^n \in (\mathbb{Z}/q\mathbb{Z})^*$ to Michael, and Michael sends $g^m \in (\mathbb{Z}/q\mathbb{Z})^*$ to Nikita. They agree on the secret key $g^{nm} \in (\mathbb{Z}/q\mathbb{Z})^*$.

Remark. A computer implementation of the Diffie-Hellman key exchange is given later in the book.
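The search for a suitable prime and generator can be sketched as follows. This is an illustration on small numbers under stated assumptions: it uses trial division where the text uses pseudoprimality tests, and the function names `is_prime`, `next_safe_prime`, and `generator_for_safe_prime` are invented for the example.

```python
def is_prime(n):
    """Trial-division primality test (fine for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def next_safe_prime(start):
    """Return the smallest prime p >= start with (p - 1) / 2 also prime."""
    p = max(start, 5)
    while not (is_prime(p) and is_prime((p - 1) // 2)):
        p += 1
    return p


def generator_for_safe_prime(p):
    """For a prime p >= 5 with (p - 1) / 2 prime, return 2 or -2 mod p."""
    if pow(2, (p - 1) // 2, p) != 1:
        return 2            # the order of 2 is p - 1
    return p - 2            # 2 has order (p - 1) / 2, so -2 has order p - 1


p = next_safe_prime(100)
print(p, generator_for_safe_prime(p))
print(generator_for_safe_prime(23))    # 21, that is, -2 mod 23
```

No factorization of p - 1 is needed beyond the known factor 2, which is the point of the proposition.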


FIGURE 3.3. The Man in the Middle Attack. (Diagram: Nikita and The Man share $g^{nt} \pmod{p}$; The Man and Michael share $g^{mt} \pmod{p}$.)

3.1.3 The Man in the Middle Attack

After their first system was broken, instead of talking on the phone, Michael and Nikita can now only communicate via text messages. One of her captors, The Man, is watching each of the transmissions; moreover, he can intercept messages and send false messages. When Nikita sends a message to Michael announcing $g^n \pmod{p}$, The Man intercepts this message, and sends his own number $g^t \pmod{p}$ to Michael. He treats Michael's announcement of $g^m \pmod{p}$ the same way, passing $g^t \pmod{p}$ on to Nikita. Eventually, Michael and The Man agree on the secret key $g^{tm} \pmod{p}$, and Nikita and The Man agree on the key $g^{tn} \pmod{p}$. When Nikita sends a message to Michael she unwittingly uses the secret key $g^{tn} \pmod{p}$; The Man then intercepts it, decrypts it, changes it, and re-encrypts it using the key $g^{tm} \pmod{p}$, and sends it on to Michael. This is bad because now The Man can read every message sent between Michael and Nikita, and moreover, he can change them in transmission in subtle ways.

One way to get around this attack is to use a digital signature scheme based on the RSA cryptosystem. We will not discuss digital signatures further in this book, but will discuss RSA in the next section.
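A toy simulation makes the key mismatch concrete. The parameters p = 97 and g = 5 come from the earlier simplified example; the secret exponents and The Man's exponent t = 20 are hypothetical values picked only for this sketch.

```python
p, g = 97, 5
n, m = 31, 95      # secrets of Nikita and Michael (illustrative values)
t = 20             # The Man's secret (hypothetical)

# Honest announcements, both intercepted by The Man.
nikita_says = pow(g, n, p)
michael_says = pow(g, m, p)

# The Man forwards his own number g^t in both directions.
michael_hears = pow(g, t, p)
nikita_hears = pow(g, t, p)

# Keys computed by the two victims.
key_nikita = pow(nikita_hears, n, p)      # g^(tn)
key_michael = pow(michael_hears, m, p)    # g^(tm)

# The Man can compute both keys from the intercepted numbers.
man_key_with_nikita = pow(nikita_says, t, p)
man_key_with_michael = pow(michael_says, t, p)

print(key_nikita == man_key_with_nikita)      # True
print(key_michael == man_key_with_michael)    # True
print(key_nikita == key_michael)              # False for these values
```

Nikita and Michael each hold a key shared with The Man rather than with each other, so he can decrypt and re-encrypt every message without either of them noticing.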

3.2 The RSA Cryptosystem

The Diffie-Hellman key exchange has drawbacks. As discussed in Section 3.1.3, it is susceptible to the man in the middle attack. This section is about the RSA public-key cryptosystem of Rivest, Shamir, and Adleman [RSA78], which is an alternative to Diffie-Hellman that is more flexible in some ways.

We first describe the RSA cryptosystem, then discuss several ways to attack it. It is important to be aware of such weaknesses, in order to avoid foolish mistakes when implementing RSA. We barely scratch the surface here of the many possible attacks on specific implementations of RSA or other cryptosystems.

How RSA works

The fundamental idea behind RSA is to try to construct a trap-door or one-way function on a set $X$, that is, an invertible function

$$E : X \to X$$

such that it is easy for Nikita to compute $E^{-1}$, but extremely difficult for anybody else to do so.

Here is how Nikita makes a one-way function $E$ on the set of integers modulo $n$.

1. Using a method hinted at in Section 2.4, Nikita picks two large primes $p$ and $q$, and lets $n = pq$.
2. It is then easy for Nikita to compute $\varphi(n) = \varphi(p) \cdot \varphi(q) = (p - 1) \cdot (q - 1)$.
3. Nikita next chooses a random integer $e$ with $1 < e < \varphi(n)$ and $\gcd(e, \varphi(n)) = 1$.
4. Nikita uses the inverse algorithm from Section 2.3 to find a solution $x = d$ to the equation $ex \equiv 1 \pmod{\varphi(n)}$.
5. Finally, Nikita defines a function $E : \mathbb{Z}/n\mathbb{Z} \to \mathbb{Z}/n\mathbb{Z}$ by $E(x) = x^e \in \mathbb{Z}/n\mathbb{Z}$. Anybody can compute $E$ fairly quickly using the repeated-squaring algorithm from Section 2.3.
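The five steps above can be sketched with toy numbers. The primes 17 and 19 and the exponent 95 are the small values used in this chapter's worked example; the function names are chosen for this illustration, and `pow(e, -1, phi)` assumes Python 3.8 or later. Real keys use primes with hundreds of digits.

```python
from math import gcd


def make_keys(p, q, e):
    """Return ((n, e), d) for distinct primes p, q and a valid exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi and gcd(e, phi) == 1):
        raise ValueError("e must satisfy 1 < e < phi(n) and gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)          # solves e*d = 1 (mod phi(n))
    return (n, e), d


def encrypt(message, public_key):
    n, e = public_key
    return pow(message, e, n)    # E(x) = x^e in Z/nZ


def decrypt(cipher, public_key, d):
    n, _ = public_key
    return pow(cipher, d, n)     # inverse of E: y^d in Z/nZ


public, d = make_keys(17, 19, 95)
c = encrypt(24, public)
print(public, d)                 # (323, 95) 191
print(c, decrypt(c, public, d))  # 294 24
```

Only the pair (n, e) is published; d stays with Nikita, since computing it this way requires knowing phi(n).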

Nikita's public key is the pair of integers $(n, e)$, which is just enough information for people to easily compute $E$. Nikita knows a number $d$ such that $ed \equiv 1 \pmod{\varphi(n)}$, so, as we will see, she can quickly compute $E^{-1}$.

To send Nikita a message, proceed as follows. Encode your message, in some way, as a sequence of numbers modulo $n$ (see Section 3.2.2)

$$m_1, \ldots, m_r \in \mathbb{Z}/n\mathbb{Z},$$

then send

$$E(m_1), \ldots, E(m_r)$$

to Nikita. (Recall that $E(m) = m^e$ for $m \in \mathbb{Z}/n\mathbb{Z}$.)

When Nikita receives $E(m_i)$, she finds each $m_i$ by using that $E^{-1}(m) = m^d$, a fact that follows from the following proposition.

Proposition (Decryption key). Let $n$ be an integer that is a product of distinct primes and let $d, e \in \mathbb{N}$ be such that $p - 1 \mid de - 1$ for each prime $p \mid n$. Then $a^{de} \equiv a \pmod{n}$ for all $a \in \mathbb{Z}$.

Proof. Since $n \mid a^{de} - a$ if and only if $p \mid a^{de} - a$ for each prime divisor $p$ of $n$, it suffices to prove that $a^{de} \equiv a \pmod{p}$ for each prime divisor $p$ of $n$. If $\gcd(a, p) \ne 1$, then $a \equiv 0 \pmod{p}$, so $a^{de} \equiv a \pmod{p}$. If $\gcd(a, p) = 1$, then Fermat's little theorem asserts that $a^{p-1} \equiv 1 \pmod{p}$. Since $p - 1 \mid de - 1$, we have $a^{de-1} \equiv 1 \pmod{p}$ as well. Multiplying both sides by $a$ shows that $a^{de} \equiv a \pmod{p}$.

Thus to decrypt $E(m_i)$ Nikita computes

$$E(m_i)^d = (m_i^e)^d = m_i.$$

An implementation of RSA is given later in the book.

3.2.2 Encoding a Phrase in a Number

In order to use the RSA cryptosystem to encrypt messages, it is necessary to encode them as a sequence of numbers of size less than $n = pq$. We now describe a simple way to do this. A slightly more general encoding, which includes extra randomness so that plain text encodes differently each time, is implemented later in the book.

Suppose $s$ is a sequence of capital letters and spaces, and that $s$ does not begin with a space. We encode $s$ as a number in base 27 as follows: a single space corresponds to 0, the letter A to 1, B to 2, ..., Z to 26. Thus "RUN NIKITA" is a number written in base 27:

$$\text{RUN NIKITA} = 27^9 \cdot 18 + 27^8 \cdot 21 + 27^7 \cdot 14 + 27^6 \cdot 0 + 27^5 \cdot 14 + 27^4 \cdot 9 + 27^3 \cdot 11 + 27^2 \cdot 9 + 27 \cdot 20 + 1 = 143338425831991 \text{ (in decimal)}.$$
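The base 27 encoding and its inverse can be written in a few lines. This is a sketch with names (`ALPHABET`, `encode`, `decode`) invented for the illustration; it treats the first character as the most significant digit, as in the expansion of "RUN NIKITA" above.

```python
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"   # space = 0, A = 1, ..., Z = 26


def encode(s):
    """Read s as a base-27 numeral, first character most significant."""
    value = 0
    for ch in s:
        value = value * 27 + ALPHABET.index(ch)
    return value


def decode(value):
    """Recover the string by repeatedly dividing by 27."""
    letters = []
    while value > 0:
        value, remainder = divmod(value, 27)
        letters.append(ALPHABET[remainder])
    return "".join(reversed(letters))


number = encode("RUN NIKITA")
print(number)            # 143338425831991
print(decode(number))    # RUN NIKITA
```

Decoding produces the letters last to first, which is why the list is reversed before joining. A leading space would be lost, which is the reason for the assumption that s does not begin with a space.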

To recover the letters from the decimal number, repeatedly divide by 27 and read off the letter corresponding to each remainder:

143338425831991 = 5308830586370 · 27 + 1    "A"
5308830586370 = 196623355050 · 27 + 20    "T"
196623355050 = 7282346483 · 27 + 9    "I"
7282346483 = 269716536 · 27 + 11    "K"
269716536 = 9989501 · 27 + 9    "I"
9989501 = 369981 · 27 + 14    "N"
369981 = 13703 · 27 + 0    " "
13703 = 507 · 27 + 14    "N"
507 = 18 · 27 + 21    "U"
18 = 0 · 27 + 18    "R"

If $27^k \le n$, then any sequence of $k$ letters can be encoded as above using a positive integer $\le n$. Thus if we can encrypt integers of size at most $n$, then we must break our message up into blocks of size at most $\log_{27}(n)$.

3.2.3 Examples

So the arithmetic is easy to follow, we use small primes $p$ and $q$ and encrypt the single letter "X" using the RSA cryptosystem.

1. Choose $p$ and $q$: Let $p = 17$, $q = 19$, so $n = pq = 323$.
2. Compute $\varphi(n)$:
\[ \varphi(n) = \varphi(p \cdot q) = \varphi(p) \cdot \varphi(q) = (p - 1)(q - 1) = pq - p - q + 1 = 323 - 17 - 19 + 1 = 288. \]
3. Randomly choose an $e < 288$: We choose $e = 95$.
4. Solve $95x \equiv 1 \pmod{288}$. Using the GCD algorithm, we find that $d = 191$ solves the equation.

The public key is $(323, 95)$, so the encryption function is $E(x) = x^{95}$, and the decryption function is $D(x) = x^{191}$.

Next, we encrypt the letter "X". It is encoded as the number 24, since X is the 24th letter of the alphabet. We have
\[ E(24) = 24^{95} = 294 \in \mathbf{Z}/323\mathbf{Z}. \]

To decrypt, we compute $E^{-1}$:
\[ E^{-1}(294) = 294^{191} = 24 \in \mathbf{Z}/323\mathbf{Z}. \]

This next example illustrates RSA but with bigger numbers. Let p = , q = . Then n = p · q = and $\varphi(n) = (p - 1)(q - 1)$ = . Using a pseudo-random number generator on a computer, the author randomly chose the integer e = . Then d = . Since $\log_{27}(n) \approx 38.04$, we can encode then encrypt single blocks of up to 38 letters. Let's encrypt "RUN NIKITA", which encodes as $m = 143338425831991$. We have $E(m) = m^e$ = .

Remark. In practice one usually chooses $e$ to be small, since that does not seem to reduce the security of RSA, and makes the key size smaller. For example, the OpenSSL documentation about their implementation of RSA states that "The exponent is an odd number, typically 3, 17 or 65537."

3.3 Attacking RSA

Suppose Nikita's public key is $(n, e)$ and her decryption key is $d$, so $ed \equiv 1 \pmod{\varphi(n)}$. If somehow we compute the factorization $n = pq$, then we can compute $\varphi(n) = (p - 1)(q - 1)$ and hence compute $d$. Thus if we can factor $n$ then we can break the corresponding RSA public-key cryptosystem.

3.3.1 Factoring n Given ϕ(n)

Suppose $n = pq$. Given $\varphi(n)$, it is very easy to compute $p$ and $q$. We have
\[ \varphi(n) = (p - 1)(q - 1) = pq - (p + q) + 1, \]
so we know both $pq = n$ and $p + q = n + 1 - \varphi(n)$. Thus we know the polynomial
\[ x^2 - (p + q)x + pq = (x - p)(x - q) \]
whose roots are $p$ and $q$. These roots can be found using the quadratic formula.

Example. The number n = pq = is a product of two primes, and $\varphi(n)$ = . We have
f = $x^2 - (n + 1 - \varphi(n))x + n$ = $x^2$ − x + = (x − )(x − ),
where the factorization step is easily accomplished using the quadratic formula:
\[ \frac{-b + \sqrt{b^2 - 4ac}}{2a} \] = = .
We conclude that n = .

3.3.2 When p and q are Close

Suppose that $p$ and $q$ are close to each other. Then it is easy to factor $n$ using a factorization method of Fermat. Suppose $n = pq$ with $p > q$, say. Then
\[ n = \left(\frac{p + q}{2}\right)^2 - \left(\frac{p - q}{2}\right)^2. \]
Since $p$ and $q$ are close,
\[ s = \frac{p - q}{2} \]
is small,
\[ t = \frac{p + q}{2} \]
is only slightly larger than $\sqrt{n}$, and $t^2 - n = s^2$ is a perfect square. So we just try
\[ t = \lceil \sqrt{n} \rceil, \quad t = \lceil \sqrt{n} \rceil + 1, \quad t = \lceil \sqrt{n} \rceil + 2, \ldots \]

until $t^2 - n$ is a perfect square $s^2$. (Here $\lceil x \rceil$ denotes the least integer $n \ge x$.) Then
\[ p = t + s, \quad q = t - s. \]

Example. Suppose n = . Then $\sqrt{n}$ = . If t = , then $\sqrt{t^2 - n}$ = . If t = , then $\sqrt{t^2 - n}$ = . If t = , then $\sqrt{t^2 - n} = 804 \in \mathbf{Z}$. Thus $s = 804$. We find that p = t + s = and q = t − s = .

3.3.3 Factoring n Given d

In this section, we show that finding the decryption key $d$ for an RSA cryptosystem is, in practice, at least as difficult as factoring $n$. We give a probabilistic algorithm that, given a decryption key, determines the factorization of $n$.

Consider an RSA cryptosystem with modulus $n$ and encryption key $e$. Suppose we somehow find an integer $d$ such that
\[ a^{ed} \equiv a \pmod{n} \]
for all $a$. Then $m = ed - 1$ satisfies $a^m \equiv 1 \pmod{n}$ for all $a$ that are coprime to $n$. As we saw in Section 3.3.1, knowing $\varphi(n)$ leads directly to a factorization of $n$. Unfortunately, knowing $d$ does not seem to lead easily to a factorization of $n$. However, there is a probabilistic procedure that, given an $m$ such that $a^m \equiv 1 \pmod{n}$, will find a factorization of $n$ with high probability (we will not analyze the probability here).

Algorithm (Probabilistic Algorithm to Factor n). Let $n = pq$ be the product of two distinct odd primes, and suppose $m$ is an integer such that $a^m \equiv 1 \pmod{n}$ for all $a$ coprime to $n$. This probabilistic algorithm factors $n$ with high probability. In the steps below, $a$ always denotes an integer coprime to $n = pq$.

1. [Divide out powers of 2] If $a^{m/2} \equiv 1 \pmod{n}$ for several randomly chosen $a$, set $m = m/2$, and go to step 1; otherwise let $a$ be such that $a^{m/2} \not\equiv 1 \pmod{n}$.
2. [Compute GCDs] Compute $g = \gcd(a^{m/2} - 1, n)$.
3. [Terminate?] If $g$ is a proper divisor of $n$, output $g$ and terminate. Otherwise go to step 1 and choose a different $a$.

In step 1, note that $m$ is even since $(-1)^m \equiv 1 \pmod{n}$, so it makes sense to consider $m/2$. It is not practical to determine whether or not $a^{m/2} \equiv 1 \pmod{n}$ for all $a$, because it would require doing a computation for too

many $a$. Instead, we try a few random $a$; if $a^{m/2} \equiv 1 \pmod{n}$ for the $a$ we check, we divide $m$ by 2. Also note that if there exists even a single $a$ such that $a^{m/2} \not\equiv 1 \pmod{n}$, then half the $a$ have this property, since then $a \mapsto a^{m/2}$ is a surjective homomorphism $(\mathbf{Z}/n\mathbf{Z})^* \to \{\pm 1\}$ and the kernel has index 2.

Proposition 2.5.3 implies that if $x^2 \equiv 1 \pmod{p}$ then $x \equiv \pm 1 \pmod{p}$. In step 2, since $(a^{m/2})^2 \equiv 1 \pmod{n}$, we also have $(a^{m/2})^2 \equiv 1 \pmod{p}$ and $(a^{m/2})^2 \equiv 1 \pmod{q}$, so $a^{m/2} \equiv \pm 1 \pmod{p}$ and $a^{m/2} \equiv \pm 1 \pmod{q}$. Since $a^{m/2} \not\equiv 1 \pmod{n}$, there are three possibilities for these signs, so with probability 2/3, one of the following two possibilities occurs:

1. $a^{m/2} \equiv +1 \pmod{p}$ and $a^{m/2} \equiv -1 \pmod{q}$
2. $a^{m/2} \equiv -1 \pmod{p}$ and $a^{m/2} \equiv +1 \pmod{q}$.

The only other possibility is that both signs are $-1$. In the first case,
\[ p \mid a^{m/2} - 1 \quad \text{but} \quad q \nmid a^{m/2} - 1, \]
so $\gcd(a^{m/2} - 1, pq) = p$, and we have factored $n$. Similarly, in the second case, $\gcd(a^{m/2} - 1, pq) = q$, and we again factor $n$.

Example. Somehow we discover that the RSA cryptosystem with n = and e = has decryption key d = . We use this information and the algorithm above to factor $n$. If m = ed − 1 = , then $\varphi(pq) \mid m$, so $a^m \equiv 1 \pmod{n}$ for all $a$ coprime to $n$. For each $a \le 20$ we find that $a^{m/2} \equiv 1 \pmod{n}$, so we replace $m$ by m/2 = . Again, we find with this new $m$ that for each $a \le 20$, $a^{m/2} \equiv 1 \pmod{n}$, so we replace $m$ by . Yet again, for each $a \le 20$, $a^{m/2} \equiv 1 \pmod{n}$, so we replace $m$ by . This is enough, since $2^{m/2}$ ≡ (mod n). Then
gcd($2^{m/2} - 1$, n) = gcd( , ) = ,
and we have found a factor of $n$. Dividing, we find that n = .
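The three attacks of Section 3.3 can be sketched in Python as follows. This is a hedged illustration, not the book's implementation: the function names are hypothetical, the random trials are seeded only to make runs repeatable, and the sample inputs (the toy modulus 323 and the number 23360947609, a product of two close primes) are values assumed for the sketch.

```python
from math import gcd, isqrt
import random

def factor_from_phi(n, phi):
    """Section 3.3.1: p and q are the roots of x^2 - (n + 1 - phi)x + n."""
    s = n + 1 - phi                  # s = p + q
    disc = s * s - 4 * n             # discriminant, equal to (p - q)^2
    r = isqrt(disc)
    if r * r != disc:
        raise ValueError("phi is not consistent with n = p*q")
    return (s - r) // 2, (s + r) // 2

def fermat_factor(n):
    """Section 3.3.2: try t = ceil(sqrt(n)), ... until t^2 - n is a square."""
    t = isqrt(n)
    if t * t < n:
        t += 1
    while True:
        s2 = t * t - n
        s = isqrt(s2)
        if s * s == s2:
            return t - s, t + s
        t += 1

def factor_from_d(n, e, d, trials=20, seed=0):
    """Section 3.3.3: probabilistic factoring of n = p*q from a decryption key."""
    rng = random.Random(seed)
    m = e * d - 1
    while m % 2 == 0:
        half = m // 2
        sample = (rng.randrange(2, n - 1) for _ in range(trials))
        bases = [a for a in sample if gcd(a, n) == 1]
        witnesses = [a for a in bases if pow(a, half, n) != 1]
        if not witnesses:
            m = half                 # step 1: a^(m/2) = 1 for every a tried
            continue
        for a in witnesses:          # steps 2 and 3
            g = gcd(pow(a, half, n) - 1, n)
            if 1 < g < n:
                return tuple(sorted((g, n // g)))
        # every witness had both signs -1; retry with fresh random a
    raise ValueError("no factor found; m became odd")

from_phi = factor_from_phi(323, 288)
close = fermat_factor(23360947609)
from_d = factor_from_d(323, 95, 191)
```

The first two functions are deterministic. The third can, in principle, halve `m` once too often if every sampled base happens to satisfy $a^{m/2} \equiv 1$, which is why it reports failure rather than looping forever.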

3.3.4 Further Remarks

If one were to implement an actual RSA cryptosystem, there are many additional tricks and ideas to keep in mind. For example, one can add some extra random letters to each block of text, so that a given string will encrypt differently each time it is encrypted. This makes it more difficult for an attacker who knows the encrypted and plaintext versions of one message to gain information about subsequent encrypted messages. For an example implementation that incorporates this randomness, see Listing . In any particular implementation, there might be attacks that would be devastating in practice, but which wouldn't require factoring the RSA modulus.

RSA is in common use, e.g., it is used in OpenSSH protocol version 1.

We will consider the ElGamal cryptosystem in Sections . It has a similar flavor to RSA, but is more flexible in some ways.

3.4 Exercises

3.1 This problem concerns encoding phrases as numbers using the encoding of Section 3.2.2. What is the longest that an arbitrary sequence of letters (no spaces) can be if it must fit in a number that is less than $10^{20}$?

3.2 Suppose Michael creates an RSA cryptosystem with a very large modulus $n$ for which the factorization of $n$ cannot be found in a reasonable amount of time. Suppose that Nikita sends messages to Michael by representing each alphabetic character as an integer between 0 and 26 (A corresponds to 1, B to 2, etc., and a space to 0), then encrypts each number separately using Michael's RSA cryptosystem. Is this method secure? Explain your answer.

3.3 For any $n \in \mathbf{N}$, let $\sigma(n)$ be the sum of the divisors of $n$; for example, $\sigma(6) = 1 + 2 + 3 + 6 = 12$ and $\sigma(10) = 1 + 2 + 5 + 10 = 18$. Suppose that $n = pqr$ with $p$, $q$, and $r$ distinct primes. Devise an efficient algorithm that, given $n$, $\varphi(n)$ and $\sigma(n)$, computes the factorization of $n$. For example, if $n = 105$, then $p = 3$, $q = 5$, and $r = 7$, so the input to the algorithm would be
\[ n = 105, \quad \varphi(n) = 48, \quad \sigma(n) = 192, \]
and the output would be 3, 5, and 7.
For computational exercises about cryptosystems, see the exercises for Chapter 7.

4 Quadratic Reciprocity

The linear equation $ax \equiv b \pmod{n}$ has a solution if and only if $\gcd(a, n)$ divides $b$ (see Proposition 2.1.9). This chapter is about some amazing mathematics motivated by the search for a criterion for whether or not a quadratic equation $ax^2 + bx + c \equiv 0 \pmod{n}$ has a solution. In many cases, the Chinese Remainder Theorem and the quadratic formula reduce this question to the key question of whether a given integer $a$ is a perfect square modulo a prime $p$.

The quadratic reciprocity law of Gauss provides a precise answer to the following question: For which primes $p$ is the image of $a$ in $(\mathbf{Z}/p\mathbf{Z})^*$ a perfect square? Amazingly, the answer depends only on the reduction of $p$ modulo $4a$.

There are over a hundred proofs of the quadratic reciprocity law (see [Lem] for a long list). We give two proofs. The first, which we give in Section 4.3, is completely elementary and involves keeping track of integer points in intervals. It is satisfying because one can understand every detail without much abstraction, but it is unsatisfying because it is difficult to conceptualize what is going on. In sharp contrast, our second proof, which we give in Section 4.4, is more abstract and uses a conceptual development of properties of Gauss sums. You should read Sections 4.1 and 4.2, then at least one of Section 4.3 or Section 4.4, depending on your taste and how much abstract algebra you know.

In Section 4.5, we return to the computational question of actually finding square roots and solving quadratic equations in practice.

4.1 Statement of the Quadratic Reciprocity Law

In this section we state the quadratic reciprocity law.

Definition 4.1.1 (Quadratic Residue). Fix a prime $p$. An integer $a$ not divisible by $p$ is a quadratic residue modulo $p$ if $a$ is a square modulo $p$; otherwise, $a$ is a quadratic nonresidue.

The quadratic reciprocity theorem connects the question of whether or not $a$ is a quadratic residue modulo $p$ to the question of whether $p$ is a quadratic residue modulo each of the prime divisors of $a$. To express it precisely, we introduce some new notation.

Definition 4.1.2 (Legendre Symbol). Let $p$ be an odd prime and let $a$ be an integer coprime to $p$. Set
\[ \left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if } a \text{ is a quadratic residue, and} \\ -1 & \text{otherwise.} \end{cases} \]
We call this symbol the Legendre Symbol.

This notation is well entrenched in the literature, even though it is also the notation for "$a$ divided by $p$"; be careful not to confuse the two.

Since $\left(\frac{a}{p}\right)$ only depends on $a \pmod{p}$, it makes sense to define $\left(\frac{a}{p}\right)$ for $a \in \mathbf{Z}/p\mathbf{Z}$ to be $\left(\frac{\tilde{a}}{p}\right)$ for any lift $\tilde{a}$ of $a$ to $\mathbf{Z}$.

Lemma 4.1.3. The map $\psi : (\mathbf{Z}/p\mathbf{Z})^* \to \{\pm 1\}$ given by $\psi(a) = \left(\frac{a}{p}\right)$ is a surjective group homomorphism.

Proof. By Theorem 2.5.5, $G = (\mathbf{Z}/p\mathbf{Z})^*$ is a cyclic group of order $p - 1$. Because $p$ is odd, $G$ has even order, so the subgroup $H$ of squares of elements of $G$ has index 2 in $G$. Since $\left(\frac{a}{p}\right) = 1$ if and only if $a \in H$, we see that $\psi$ is the composition $G \to G/H \cong \{\pm 1\}$, where we identify the nontrivial element of $G/H$ with $-1$.

Remark 4.1.4. We could also prove that $\psi$ is surjective without using that $(\mathbf{Z}/p\mathbf{Z})^*$ is cyclic, as follows. If $a \in (\mathbf{Z}/p\mathbf{Z})^*$ is a square, say $a \equiv b^2 \pmod{p}$, then $a^{(p-1)/2} = b^{p-1} \equiv 1 \pmod{p}$, so $a$ is a root of $f = x^{(p-1)/2} - 1$. By Proposition 2.5.2, the polynomial $f$ has at most $(p - 1)/2$ roots. Thus there must be an $a \in (\mathbf{Z}/p\mathbf{Z})^*$ that is not a root of $f$, and for that $a$, we have $\psi(a) = \left(\frac{a}{p}\right) = -1$, and trivially $\psi(1) = 1$, so the map $\psi$ is surjective. Note

that this argument does not prove that $\psi$ is a homomorphism, though it can be extended to one that does.

TABLE 4.1. When is 5 a square modulo p? (Columns: $p$, $\left(\frac{5}{p}\right)$, $p \bmod 5$.)

The symbol $\left(\frac{a}{p}\right)$ only depends on the residue class of $a$ modulo $p$, so making a table of values $\left(\frac{a}{5}\right)$ for many values of $a$ would be easy. Would it be easy to make a table of $\left(\frac{5}{p}\right)$ for many $p$? Probably, since there is a simple pattern in Table 4.1. It appears that $\left(\frac{5}{p}\right)$ depends only on the congruence class of $p$ modulo 5. More precisely, $\left(\frac{5}{p}\right) = 1$ if and only if $p \equiv 1, 4 \pmod{5}$, i.e., $\left(\frac{5}{p}\right) = 1$ if and only if $p$ is a square modulo 5.

Based on similar observations, in the 18th century various mathematicians found a conjectural explanation for the mystery suggested by Table 4.1. Finally, on April 8, 1796, at the age of 18, Gauss proved the following theorem.

Theorem 4.1.5 (Gauss's Quadratic Reciprocity Law). Suppose $p$ and $q$ are distinct odd primes. Then
\[ \left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \left(\frac{q}{p}\right). \]
Also
\[ \left(\frac{-1}{p}\right) = (-1)^{(p-1)/2} \quad \text{and} \quad \left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod{8} \\ -1 & \text{if } p \equiv \pm 3 \pmod{8}. \end{cases} \]

We will give two proofs of Gauss's formula relating $\left(\frac{p}{q}\right)$ to $\left(\frac{q}{p}\right)$. The first elementary proof is in Section 4.3, and the second more algebraic proof is in Section 4.4.

In our example Gauss's theorem implies that
\[ \left(\frac{5}{p}\right) = (-1)^{2 \cdot \frac{p-1}{2}} \left(\frac{p}{5}\right) = \left(\frac{p}{5}\right) = \begin{cases} +1 & \text{if } p \equiv 1, 4 \pmod{5} \\ -1 & \text{if } p \equiv 2, 3 \pmod{5}. \end{cases} \]
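The pattern behind Table 4.1 and the statement of the theorem are easy to check numerically for small primes. The sketch below is illustrative only: `legendre` is a hypothetical brute-force helper that lists the squares modulo $p$, and the bound 60 is an arbitrary choice.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by listing the squares mod p."""
    a %= p
    if a == 0:
        return 0
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a in squares else -1

def odd_primes_below(bound):
    """Odd primes less than bound, by trial division."""
    return [n for n in range(3, bound)
            if all(n % k for k in range(2, isqrt(n) + 1))]

primes = odd_primes_below(60)

# Rows (p, (5/p), p mod 5), in the spirit of Table 4.1.
table = [(p, legendre(5, p), p % 5) for p in primes if p != 5]

# Check the reciprocity law for every pair of distinct odd primes below 60.
reciprocity_holds = all(
    legendre(p, q) == (-1) ** (((p - 1) // 2) * ((q - 1) // 2)) * legendre(q, p)
    for p in primes for q in primes if p != q
)
```

A finite check like this proves nothing, but it makes the dependence of $\left(\frac{5}{p}\right)$ on $p \bmod 5$ visible at a glance.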

As an application, the following example illustrates how to answer questions like "is $a$ a square modulo $b$" using Theorem 4.1.5.

Example 4.1.6. Is 69 a square modulo the prime 389? We have
\[ \left(\frac{69}{389}\right) = \left(\frac{3 \cdot 23}{389}\right) = \left(\frac{3}{389}\right) \cdot \left(\frac{23}{389}\right) = (-1) \cdot (-1) = 1. \]
Here
\[ \left(\frac{3}{389}\right) = \left(\frac{389}{3}\right) = \left(\frac{2}{3}\right) = -1, \]
and
\[ \left(\frac{23}{389}\right) = \left(\frac{389}{23}\right) = \left(\frac{21}{23}\right) = \left(\frac{-2}{23}\right) = \left(\frac{-1}{23}\right) \left(\frac{2}{23}\right) = (-1)^{\frac{23-1}{2}} \cdot 1 = -1. \]
Thus 69 is a square modulo 389.

Though we know that 69 is a square modulo 389, we don't know an explicit $x$ such that $x^2 \equiv 69 \pmod{389}$! This is reminiscent of how one can prove that certain numbers are composite without knowing a factorization.

Remark 4.1.7. The Jacobi symbol is an extension of the Legendre symbol to composite moduli. For more details, see Exercise 4.8.

4.2 Euler's Criterion

Let $p$ be an odd prime and $a$ an integer not divisible by $p$. Euler used the existence of primitive roots to show that $\left(\frac{a}{p}\right)$ is congruent to $a^{(p-1)/2}$ modulo $p$. We will use this fact repeatedly below in both proofs of Theorem 4.1.5.

Proposition 4.2.1 (Euler's Criterion). We have $\left(\frac{a}{p}\right) = 1$ if and only if
\[ a^{(p-1)/2} \equiv 1 \pmod{p}. \]

Proof. The map $\varphi : (\mathbf{Z}/p\mathbf{Z})^* \to (\mathbf{Z}/p\mathbf{Z})^*$ given by $\varphi(a) = a^{(p-1)/2}$ is a group homomorphism, since powering is a group homomorphism of any abelian group. Let $\psi : (\mathbf{Z}/p\mathbf{Z})^* \to \{\pm 1\}$ be the homomorphism $\psi(a) = \left(\frac{a}{p}\right)$ of Lemma 4.1.3. If $a \in \ker(\psi)$, then $a = b^2$ for some $b \in (\mathbf{Z}/p\mathbf{Z})^*$, so
\[ \varphi(a) = a^{(p-1)/2} = (b^2)^{(p-1)/2} = b^{p-1} = 1. \]
Thus $\ker(\psi) \subset \ker(\varphi)$. By Lemma 4.1.3, $\ker(\psi)$ has index 2 in $(\mathbf{Z}/p\mathbf{Z})^*$, so either $\ker(\varphi) = \ker(\psi)$ or $\varphi = 1$. If $\varphi = 1$, the polynomial $x^{(p-1)/2} - 1$

has $p - 1$ roots in the field $\mathbf{Z}/p\mathbf{Z}$, which contradicts Proposition 2.5.2, so $\ker(\varphi) = \ker(\psi)$, which proves the proposition.

From a computational point of view, Corollary 4.2.2 provides a convenient way to compute $\left(\frac{a}{p}\right)$. See Section for an implementation.

Corollary 4.2.2. The equation $x^2 \equiv a \pmod{p}$ has no solution if and only if $a^{(p-1)/2} \equiv -1 \pmod{p}$. Thus $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$.

Proof. This follows from Proposition 4.2.1 and the fact that the polynomial $x^2 - 1$ has no roots besides $+1$ and $-1$ (which follows from Proposition 2.5.3).

As additional computational motivation for the value of Corollary 4.2.2, note that to evaluate $\left(\frac{a}{p}\right)$ using Theorem 4.1.5 would not be practical if $a$ and $p$ are both very large, because it would require factoring $a$. However, Corollary 4.2.2 provides a method for evaluating $\left(\frac{a}{p}\right)$ without factoring $a$.

Example 4.2.3. Suppose $p = 11$. By squaring each element of $(\mathbf{Z}/11\mathbf{Z})^*$, we see that the squares modulo 11 are $\{1, 3, 4, 5, 9\}$. We compute $a^{(p-1)/2} = a^5$ for each $a \in (\mathbf{Z}/11\mathbf{Z})^*$ and get
\[ 1^5 = 1, \; 2^5 = -1, \; 3^5 = 1, \; 4^5 = 1, \; 5^5 = 1, \; 6^5 = -1, \; 7^5 = -1, \; 8^5 = -1, \; 9^5 = 1, \; 10^5 = -1. \]
Thus the $a$ with $a^5 = 1$ are $\{1, 3, 4, 5, 9\}$, just as Proposition 4.2.1 predicts.

Example 4.2.4. We determine whether or not 3 is a square modulo the prime p = . Using a computer we find that
\[ 3^{(p-1)/2} \equiv -1 \pmod{p}. \]
Thus 3 is not a square modulo $p$. This computation wasn't difficult, but it would have been tedious by hand. The law of quadratic reciprocity provides a way to answer this question, which could easily be carried out by hand:
\[ \left(\frac{3}{p}\right) = (-1)^{\frac{3-1}{2} \cdot \frac{p-1}{2}} \left(\frac{p}{3}\right) = (-1) \cdot \left(\frac{1}{3}\right) = -1. \]

4.3 First Proof of Quadratic Reciprocity

Our first proof of quadratic reciprocity is elementary. The proof involves keeping track of integer points in intervals. Proving Gauss's lemma is the

first step; this lemma computes $\left(\frac{a}{p}\right)$ in terms of the number of integers of a certain type that lie in a certain interval. Next we prove Lemma 4.3.2, which controls how the parity of the number of integer points in an interval changes when an endpoint of the interval is changed. Then we prove that $\left(\frac{a}{p}\right)$ depends only on $p$ modulo $4a$ by applying Gauss's lemma and keeping careful track of intervals as they are rescaled and their endpoints are changed. Finally, in Section 4.3.2 we use some basic algebra to deduce the quadratic reciprocity law using the tools we've just developed. Our proof follows the one given in [Dav99] closely.

Lemma 4.3.1 (Gauss's Lemma). Let $p$ be an odd prime and let $a$ be an integer $\not\equiv 0 \pmod{p}$. Form the numbers
\[ a, \; 2a, \; 3a, \; \ldots, \; \frac{p-1}{2} a \]
and reduce them modulo $p$ to lie in the interval $\left(-\frac{p}{2}, \frac{p}{2}\right)$. Let $\nu$ be the number of negative numbers in the resulting set. Then
\[ \left(\frac{a}{p}\right) = (-1)^{\nu}. \]

Proof. In defining $\nu$, we expressed each number in
\[ S = \left\{ a, 2a, \ldots, \frac{p-1}{2} a \right\} \]
as congruent to a number in the set
\[ \left\{ 1, -1, 2, -2, \ldots, \frac{p-1}{2}, -\frac{p-1}{2} \right\}. \]
No number $1, 2, \ldots, \frac{p-1}{2}$ appears more than once, with either choice of sign, because if it did then either two elements of $S$ are congruent modulo $p$ or the sum of two elements of $S$ is 0 modulo $p$, and both events are impossible. Thus the resulting set must be of the form
\[ T = \left\{ \varepsilon_1 \cdot 1, \; \varepsilon_2 \cdot 2, \; \ldots, \; \varepsilon_{(p-1)/2} \cdot \frac{p-1}{2} \right\}, \]
where each $\varepsilon_i$ is either $+1$ or $-1$. Multiplying together the elements of $S$ and of $T$, we see that
\[ (1a) \cdot (2a) \cdot (3a) \cdots \left(\frac{p-1}{2} a\right) \equiv (\varepsilon_1 \cdot 1) \cdot (\varepsilon_2 \cdot 2) \cdots \left(\varepsilon_{(p-1)/2} \cdot \frac{p-1}{2}\right) \pmod{p}, \]
so
\[ a^{(p-1)/2} \equiv \varepsilon_1 \cdot \varepsilon_2 \cdots \varepsilon_{(p-1)/2} \pmod{p}. \]
The lemma then follows from Proposition 4.2.1, since $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$.
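Euler's criterion and Gauss's lemma give two independent ways to compute the same symbol, so they can be compared numerically. In this sketch the names `euler_criterion` and `gauss_nu` are hypothetical, and $a$ is assumed not divisible by the odd prime $p$.

```python
def euler_criterion(a, p):
    """(a/p) as a^((p-1)/2) mod p, with the residue p - 1 reported as -1."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_nu(a, p):
    """Count k in 1..(p-1)/2 for which k*a reduces into the interval (-p/2, 0)."""
    nu = 0
    for k in range(1, (p - 1) // 2 + 1):
        r = (k * a) % p              # representative in [0, p)
        if r > p // 2:               # then r - p lies in (-p/2, 0)
            nu += 1
    return nu

# The quadratic residues modulo 11, found by Euler's criterion.
residues_mod_11 = [a for a in range(1, 11) if euler_criterion(a, 11) == 1]

# Gauss's lemma predicts (a/p) = (-1)^nu; compare the two for a sample prime.
p = 389
agree = all((-1) ** gauss_nu(a, p) == euler_criterion(a, p) for a in range(1, p))
```

Euler's criterion costs one modular exponentiation, whereas counting $\nu$ takes about $p/2$ steps, so the lemma is a tool for proofs rather than for computation.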

4.3.1 Euler's Proposition

For rational numbers $a, b \in \mathbf{Q}$, let
\[ (a, b) \cap \mathbf{Z} = \{ x \in \mathbf{Z} : a \le x \le b \} \]
be the set of integers between $a$ and $b$. The following lemma will help us to keep track of how many integers lie in certain intervals.

Lemma 4.3.2. Let $a, b \in \mathbf{Q}$. Then for any integer $n$,
\[ \#\left((a, b) \cap \mathbf{Z}\right) \equiv \#\left((a, b + 2n) \cap \mathbf{Z}\right) \pmod{2} \]
and
\[ \#\left((a, b) \cap \mathbf{Z}\right) \equiv \#\left((a - 2n, b) \cap \mathbf{Z}\right) \pmod{2}, \]
provided that each interval involved in the congruence is nonempty.

Note that if one of the intervals is empty, then the statement may be false; e.g., if $(a, b) = (-1/2, 1/2)$ and $n = -1$ then $\#((a, b) \cap \mathbf{Z}) = 1$ but $\#((a, b - 2) \cap \mathbf{Z}) = 0$.

Proof. Let $\lceil x \rceil$ denote the least integer $\ge x$. If $n > 0$, then
\[ (a, b + 2n) = (a, b) \cup [b, b + 2n), \]
where the union is disjoint. There are $2n$ integers,
\[ \lceil b \rceil, \; \lceil b \rceil + 1, \; \ldots, \; \lceil b \rceil + 2n - 1, \]
in the interval $[b, b + 2n)$, so the first congruence of the lemma is true in this case. We also have
\[ (a, b - 2n) = (a, b) \text{ minus } [b - 2n, b) \]
and $[b - 2n, b)$ contains exactly $2n$ integers, so the lemma is also true when $n$ is negative. The statement about $\#\left((a - 2n, b) \cap \mathbf{Z}\right)$ is proved in a similar manner.

Once we have proved the following proposition, it will be easy to deduce the quadratic reciprocity law.

Proposition 4.3.3 (Euler). Let $p$ be an odd prime and let $a$ be a positive integer with $p \nmid a$. If $q$ is a prime with $q \equiv \pm p \pmod{4a}$, then $\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$.

Proof. We will apply Lemma 4.3.1 to compute $\left(\frac{a}{p}\right)$. Let
\[ S = \left\{ a, 2a, 3a, \ldots, \frac{p-1}{2} a \right\} \]

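and
\[ I = \left(\frac{1}{2}p, \, p\right) \cup \left(\frac{3}{2}p, \, 2p\right) \cup \cdots \cup \left(\left(b - \frac{1}{2}\right)p, \, bp\right), \]
where $b = \frac{1}{2}a$ or $\frac{1}{2}(a - 1)$, whichever is an integer.

We check that every element of $S$ that reduces to something in the interval $\left(-\frac{p}{2}, 0\right)$ lies in $I$. This is clear if $b = \frac{1}{2}a$, since then $bp = \frac{1}{2}ap > \frac{p-1}{2}a$. If $b = \frac{1}{2}(a - 1)$, then $bp + \frac{p}{2} = \frac{1}{2}ap > \frac{p-1}{2}a$, so $\left(\left(b - \frac{1}{2}\right)p, bp\right)$ is the last interval that could contain an element of $S$ that reduces to $\left(-\frac{p}{2}, 0\right)$. Note that the integer endpoints of $I$ are not in $S$, since those endpoints are divisible by $p$, but no element of $S$ is divisible by $p$. Thus, by Lemma 4.3.1,
\[ \left(\frac{a}{p}\right) = (-1)^{\#(S \cap I)}. \]

To compute $\#(S \cap I)$, first rescale by $a$ to see that
\[ \#(S \cap I) = \#\left(\mathbf{Z} \cap \frac{1}{a} I\right), \]
where
\[ \frac{1}{a} I = \left(\frac{p}{2a}, \frac{p}{a}\right) \cup \left(\frac{3p}{2a}, \frac{2p}{a}\right) \cup \cdots \cup \left(\frac{(2b-1)p}{2a}, \frac{bp}{a}\right). \]
Write $p = 4ac + r$, and let
\[ J = \left(\frac{r}{2a}, \frac{r}{a}\right) \cup \left(\frac{3r}{2a}, \frac{2r}{a}\right) \cup \cdots \cup \left(\frac{(2b-1)r}{2a}, \frac{br}{a}\right). \]
The only difference between $\frac{1}{a}I$ and $J$ is that the endpoints of intervals are changed by addition of an even integer. By Lemma 4.3.2,
\[ \nu = \#\left(\mathbf{Z} \cap \frac{1}{a} I\right) \equiv \#(\mathbf{Z} \cap J) \pmod{2}. \]
Thus $\left(\frac{a}{p}\right) = (-1)^{\nu}$ depends only on $r$, i.e., only on $p$ modulo $4a$. Thus if $q \equiv p \pmod{4a}$, then $\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$.

If $q \equiv -p \pmod{4a}$, then the only change in the above computation is that $r$ is replaced by $4a - r$. This changes $\frac{1}{a}I$ into
\[ K = \left(2 - \frac{r}{2a}, \, 4 - \frac{r}{a}\right) \cup \left(6 - \frac{3r}{2a}, \, 8 - \frac{2r}{a}\right) \cup \cdots \cup \left(4b - 2 - \frac{(2b-1)r}{2a}, \, 4b - \frac{br}{a}\right). \]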

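Thus $K$ is the same as $-\frac{1}{a}I$, except even integers have been added to the endpoints. By Lemma 4.3.2,
\[ \#(K \cap \mathbf{Z}) \equiv \#\left(\left(\frac{1}{a} I\right) \cap \mathbf{Z}\right) \pmod{2}, \]
so $\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$, which completes the proof.

The following more careful analysis in the special case when $a = 2$ helps illustrate the proof of the above proposition, and the result is frequently useful in computations. For an alternative proof of the proposition, see Exercise 4.5.

Proposition 4.3.4 (Legendre symbol of 2). Let $p$ be an odd prime. Then
\[ \left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv \pm 1 \pmod{8} \\ -1 & \text{if } p \equiv \pm 3 \pmod{8}. \end{cases} \]

Proof. When $a = 2$, the set $S = \left\{a, 2a, \ldots, 2 \cdot \frac{p-1}{2}\right\}$ is
\[ \{2, 4, 6, \ldots, p - 1\}. \]
We must count the parity of the number of elements of $S$ that lie in the interval $I = \left(\frac{p}{2}, p\right)$. Writing $p = 8c + r$, we have
\[ \#(I \cap S) = \#\left(\frac{1}{2} I \cap \mathbf{Z}\right) = \#\left(\left(\frac{p}{4}, \frac{p}{2}\right) \cap \mathbf{Z}\right) = \#\left(\left(2c + \frac{r}{4}, \, 4c + \frac{r}{2}\right) \cap \mathbf{Z}\right) \equiv \#\left(\left(\frac{r}{4}, \frac{r}{2}\right) \cap \mathbf{Z}\right) \pmod{2}, \]
where the last congruence comes from Lemma 4.3.2. The possibilities for $r$ are 1, 3, 5, 7. When $r = 1$, the cardinality is 0, when $r = 3, 5$ it is 1, and when $r = 7$ it is 2.

4.3.2 Proof of Quadratic Reciprocity

It is now straightforward to deduce the quadratic reciprocity law.

First Proof of Theorem 4.1.5. First suppose that $p \equiv q \pmod{4}$. By swapping $p$ and $q$ if necessary, we may assume that $p > q$, and write $p - q = 4a$. Since $p = 4a + q$,
\[ \left(\frac{p}{q}\right) = \left(\frac{4a + q}{q}\right) = \left(\frac{4a}{q}\right) = \left(\frac{4}{q}\right) \left(\frac{a}{q}\right) = \left(\frac{a}{q}\right), \]
and
\[ \left(\frac{q}{p}\right) = \left(\frac{p - 4a}{p}\right) = \left(\frac{-4a}{p}\right) = \left(\frac{-1}{p}\right) \cdot \left(\frac{a}{p}\right). \]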

Proposition 4.3.3 implies that $\left(\frac{a}{q}\right) = \left(\frac{a}{p}\right)$, since $p \equiv q \pmod{4a}$. Thus
\[ \left(\frac{p}{q}\right) \cdot \left(\frac{q}{p}\right) = \left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}, \]
where the last equality is because $\frac{p-1}{2}$ is even if and only if $\frac{q-1}{2}$ is even.

Next suppose that $p \not\equiv q \pmod{4}$, so $p \equiv -q \pmod{4}$. Write $p + q = 4a$. We have
\[ \left(\frac{p}{q}\right) = \left(\frac{4a - q}{q}\right) = \left(\frac{a}{q}\right), \quad \text{and} \quad \left(\frac{q}{p}\right) = \left(\frac{4a - p}{p}\right) = \left(\frac{a}{p}\right). \]
Since $p \equiv -q \pmod{4a}$, Proposition 4.3.3 implies that $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$. Since $(-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} = 1$, the proof is complete.

4.4 A Proof of Quadratic Reciprocity Using Gauss Sums

In this section we present a beautiful proof of Theorem 4.1.5 using algebraic identities satisfied by sums of roots of unity. The objects we introduce in the proof are of independent interest, and provide a powerful tool to prove higher-degree analogues of quadratic reciprocity. (For more on higher reciprocity see [IR90]. See also Section 6 of [IR90], on which the proof below is modeled.)

Definition 4.4.1 (Root of Unity). An $n$th root of unity is a complex number $\zeta$ such that $\zeta^n = 1$. A root of unity $\zeta$ is a primitive $n$th root of unity if $n$ is the smallest positive integer such that $\zeta^n = 1$.

For example, $-1$ is a primitive second root of unity, and $\zeta = \frac{-1 + \sqrt{-3}}{2}$ is a primitive cube root of unity. More generally, for any $n \in \mathbf{N}$ the complex number
\[ \zeta_n = \cos(2\pi/n) + i \sin(2\pi/n) \]
is a primitive $n$th root of unity (this follows from the identity $e^{i\theta} = \cos(\theta) + i \sin(\theta)$). For the rest of this section, we fix an odd prime $p$ and the primitive $p$th root $\zeta = \zeta_p$ of unity.

Definition 4.4.2 (Gauss Sum). Fix an odd prime $p$. The Gauss sum associated to an integer $a$ is
\[ g_a = \sum_{n=0}^{p-1} \left(\frac{n}{p}\right) \zeta^{an}, \]
where $\zeta = \zeta_p = \cos(2\pi/p) + i \sin(2\pi/p)$.

FIGURE 4.1. Gauss sum $g_2$ for $p = 5$, with $\zeta = e^{2\pi i/5}$:
\[ g_2 = \left(\frac{0}{5}\right) + \left(\frac{1}{5}\right)\zeta^2 + \left(\frac{2}{5}\right)\zeta^4 + \left(\frac{3}{5}\right)\zeta + \left(\frac{4}{5}\right)\zeta^3 = -\sqrt{5}. \]

Note that $p$ is implicit in the definition of $g_a$. If we were to change $p$, then the Gauss sum $g_a$ associated to $a$ would be different. The definition of $g_a$ also depends on our choice of $\zeta$; we've chosen $\zeta = \zeta_p$, but could have chosen a different $\zeta$ and then $g_a$ could be different.

Figure 4.1 illustrates the Gauss sum $g_2$ for $p = 5$. The Gauss sum is obtained by adding the points on the unit circle, with signs as indicated, to obtain the real number $-\sqrt{5}$. This suggests the following proposition, whose proof will require some work.

Proposition 4.4.3 (Gauss sum). For any $a$ not divisible by $p$,
\[ g_a^2 = (-1)^{(p-1)/2} p. \]

In order to prove the proposition, we introduce a few lemmas.

Lemma 4.4.4. For any integer $a$,
\[ \sum_{n=0}^{p-1} \zeta^{an} = \begin{cases} p & \text{if } a \equiv 0 \pmod{p}, \\ 0 & \text{otherwise.} \end{cases} \]

Proof. If $a \equiv 0 \pmod{p}$, then $\zeta^a = 1$, so the sum equals the number of summands, which is $p$. If $a \not\equiv 0 \pmod{p}$, then we use the identity
\[ x^p - 1 = (x - 1)(x^{p-1} + \cdots + x + 1) \]
with $x = \zeta^a$. We have $\zeta^a \neq 1$, so $\zeta^a - 1 \neq 0$ and
\[ \sum_{n=0}^{p-1} \zeta^{an} = \frac{\zeta^{ap} - 1}{\zeta^a - 1} = \frac{1 - 1}{\zeta^a - 1} = 0. \]

Lemma 4.4.5. If $x$ and $y$ are arbitrary integers, then
\[ \sum_{n=0}^{p-1} \zeta^{(x-y)n} = \begin{cases} p & \text{if } x \equiv y \pmod{p}, \\ 0 & \text{otherwise.} \end{cases} \]
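Gauss sums are easy to evaluate in floating point, which gives a quick sanity check on the proposition before working through its proof. The sketch below is illustrative: `legendre` and `gauss_sum` are hypothetical helper names, the symbol is taken to be 0 when $p$ divides $n$, and comparisons use a small tolerance because the arithmetic is inexact.

```python
import cmath

def legendre(n, p):
    """Legendre symbol (n/p) via Euler's criterion, with (0/p) = 0."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def gauss_sum(a, p):
    """g_a = sum over n of (n/p) * zeta^(a*n), where zeta = exp(2*pi*i/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(n, p) * zeta ** (a * n) for n in range(p))

g2 = gauss_sum(2, 5)    # the Gauss sum of Figure 4.1, close to -sqrt(5)

# Largest deviation of g_a^2 from (-1)^((p-1)/2) * p over a few small primes.
worst = max(
    abs(gauss_sum(a, p) ** 2 - (-1) ** ((p - 1) // 2) * p)
    for p in (3, 5, 7, 11, 13)
    for a in range(1, p)
)
```

For $p \equiv 3 \pmod 4$ the square is negative, so the Gauss sum itself is purely imaginary up to rounding error.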

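Proof. This follows from Lemma 4.4.4 by setting $a = x - y$.

Lemma 4.4.6. We have $g_0 = 0$.

Proof. By definition
\[ g_0 = \sum_{n=0}^{p-1} \left(\frac{n}{p}\right). \qquad (4.4.1) \]
By Lemma 4.1.3, the map
\[ \left(\frac{\cdot}{p}\right) : (\mathbf{Z}/p\mathbf{Z})^* \to \{\pm 1\} \]
is a surjective homomorphism of groups. Thus half the elements of $(\mathbf{Z}/p\mathbf{Z})^*$ map to $+1$ and half map to $-1$ (the subgroup that maps to $+1$ has index 2). Since $\left(\frac{0}{p}\right) = 0$, the sum (4.4.1) is 0.

Lemma 4.4.7. For any integer $a$,
\[ g_a = \left(\frac{a}{p}\right) g_1. \]

Proof. When $a \equiv 0 \pmod{p}$ the lemma follows from Lemma 4.4.6, so suppose that $a \not\equiv 0 \pmod{p}$. Then
\[ \left(\frac{a}{p}\right) g_a = \left(\frac{a}{p}\right) \sum_{n=0}^{p-1} \left(\frac{n}{p}\right) \zeta^{an} = \sum_{n=0}^{p-1} \left(\frac{an}{p}\right) \zeta^{an} = \sum_{m=0}^{p-1} \left(\frac{m}{p}\right) \zeta^{m} = g_1. \]
Here we use that multiplication by $a$ is an automorphism of $\mathbf{Z}/p\mathbf{Z}$. Finally, multiply both sides by $\left(\frac{a}{p}\right)$ and use that $\left(\frac{a}{p}\right)^2 = 1$.

We have enough lemmas to prove Proposition 4.4.3.

Proof of Proposition 4.4.3. We evaluate the sum $\sum_{a=0}^{p-1} g_a g_{-a}$ in two different ways. By Lemma 4.4.7, since $a \not\equiv 0 \pmod{p}$ we have
\[ g_a g_{-a} = \left(\frac{a}{p}\right) g_1 \left(\frac{-a}{p}\right) g_1 = \left(\frac{-1}{p}\right) \left(\frac{a}{p}\right)^2 g_1^2 = (-1)^{(p-1)/2} g_1^2, \]
where the last step follows from Proposition 4.2.1 and that $\left(\frac{a}{p}\right) \in \{\pm 1\}$. Thus
\[ \sum_{a=0}^{p-1} g_a g_{-a} = (p - 1)(-1)^{(p-1)/2} g_1^2. \qquad (4.4.2) \]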

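On the other hand, by definition
\[ g_a g_{-a} = \sum_{n=0}^{p-1} \left(\frac{n}{p}\right) \zeta^{an} \cdot \sum_{m=0}^{p-1} \left(\frac{m}{p}\right) \zeta^{-am} = \sum_{n=0}^{p-1} \sum_{m=0}^{p-1} \left(\frac{n}{p}\right) \left(\frac{m}{p}\right) \zeta^{an} \zeta^{-am} = \sum_{n=0}^{p-1} \sum_{m=0}^{p-1} \left(\frac{n}{p}\right) \left(\frac{m}{p}\right) \zeta^{an - am}. \]
Let $\delta(n, m) = 1$ if $n \equiv m \pmod{p}$ and 0 otherwise. By Lemma 4.4.5,
\[ \sum_{a=0}^{p-1} g_a g_{-a} = \sum_{a=0}^{p-1} \sum_{n=0}^{p-1} \sum_{m=0}^{p-1} \left(\frac{n}{p}\right) \left(\frac{m}{p}\right) \zeta^{an - am} = \sum_{n=0}^{p-1} \sum_{m=0}^{p-1} \left(\frac{n}{p}\right) \left(\frac{m}{p}\right) \sum_{a=0}^{p-1} \zeta^{an - am} \]
\[ = \sum_{n=0}^{p-1} \sum_{m=0}^{p-1} \left(\frac{n}{p}\right) \left(\frac{m}{p}\right) p \, \delta(n, m) = \sum_{n=0}^{p-1} \left(\frac{n}{p}\right)^2 p = p(p - 1). \]
Equate (4.4.2) and the above equality, then cancel $(p - 1)$ to see that
\[ g_1^2 = (-1)^{(p-1)/2} p. \]
Since $a \not\equiv 0 \pmod{p}$, we have $\left(\frac{a}{p}\right)^2 = 1$, so by Lemma 4.4.7,
\[ g_a^2 = \left(\frac{a}{p}\right)^2 g_1^2 = g_1^2, \]
and the proposition is proved.

4.4.1 Proof of Quadratic Reciprocity

We are now ready to prove Theorem 4.1.5 using Gauss sums.

Proof. Let $q$ be an odd prime with $q \neq p$. Set $p^* = (-1)^{(p-1)/2} p$ and recall that Proposition 4.4.3 asserts that $p^* = g^2$, where $g = g_1 = \sum_{n=0}^{p-1} \left(\frac{n}{p}\right) \zeta^n$.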

Proposition 4.2.1 implies that
\[ (p^*)^{(q-1)/2} \equiv \left(\frac{p^*}{q}\right) \pmod{q}. \]
We have $g^{q-1} = (g^2)^{(q-1)/2} = (p^*)^{(q-1)/2}$, so multiplying both sides of the displayed equation by $g$ yields a congruence
\[ g^q \equiv g \left(\frac{p^*}{q}\right) \pmod{q}. \qquad (4.4.3) \]
But wait, what does this congruence mean, given that $g^q$ is not an integer? It means that the difference $g^q - g\left(\frac{p^*}{q}\right)$ lies in the ideal $(q)$ in the ring $\mathbf{Z}[\zeta]$ of all polynomials in $\zeta$ with coefficients in $\mathbf{Z}$.

The ring $\mathbf{Z}[\zeta]/(q)$ has characteristic $q$, so if $x, y \in \mathbf{Z}[\zeta]$, then $(x + y)^q \equiv x^q + y^q \pmod{q}$. Applying this to (4.4.3), we see that
\[ g^q = \left( \sum_{n=0}^{p-1} \left(\frac{n}{p}\right) \zeta^n \right)^q \equiv \sum_{n=0}^{p-1} \left(\frac{n}{p}\right)^q \zeta^{nq} \equiv \sum_{n=0}^{p-1} \left(\frac{n}{p}\right) \zeta^{nq} \equiv g_q \pmod{q}. \]
By Lemma 4.4.7,
\[ g^q \equiv g_q \equiv \left(\frac{q}{p}\right) g \pmod{q}. \]
Combining this with (4.4.3) yields
\[ \left(\frac{q}{p}\right) g \equiv \left(\frac{p^*}{q}\right) g \pmod{q}. \]
Since $g^2 = p^*$ and $p \neq q$, we can cancel $g$ from both sides to find that $\left(\frac{q}{p}\right) \equiv \left(\frac{p^*}{q}\right) \pmod{q}$. Since both residue symbols are $\pm 1$ and $q$ is odd, it follows that $\left(\frac{q}{p}\right) = \left(\frac{p^*}{q}\right)$. Finally, we note using Proposition 4.2.1 that
\[ \left(\frac{p^*}{q}\right) = \left(\frac{(-1)^{(p-1)/2} p}{q}\right) = \left(\frac{-1}{q}\right)^{(p-1)/2} \left(\frac{p}{q}\right) = (-1)^{\frac{q-1}{2} \cdot \frac{p-1}{2}} \cdot \left(\frac{p}{q}\right). \]

4.5 Finding Square Roots

We return in this section to the question of computing square roots. If $K$ is a field in which

$2 \neq 0$, and $a, b, c \in K$, with $a \neq 0$, then the solutions to the quadratic equation $ax^2 + bx + c = 0$ are
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

Now assume $K = \mathbf{Z}/p\mathbf{Z}$, with $p$ an odd prime. Using Theorem 4.1.5, we can decide whether or not $b^2 - 4ac$ is a perfect square in $\mathbf{Z}/p\mathbf{Z}$, and hence whether or not $ax^2 + bx + c = 0$ has a solution in $\mathbf{Z}/p\mathbf{Z}$. However, Theorem 4.1.5 says nothing about how to actually find a solution when there is one. Also, note that for this problem we do not need the full quadratic reciprocity law; in practice, to decide whether an element of $\mathbf{Z}/p\mathbf{Z}$ is a perfect square, Proposition 4.2.1 is quite fast, in view of Section 2.3.

Suppose $a \in \mathbf{Z}/p\mathbf{Z}$ is a nonzero quadratic residue. If $p \equiv 3 \pmod{4}$ then $b = a^{\frac{p+1}{4}}$ is a square root of $a$ because
\[ b^2 = a^{\frac{p+1}{2}} = a^{\frac{p-1}{2} + 1} = a^{\frac{p-1}{2}} \cdot a = \left(\frac{a}{p}\right) \cdot a = a. \]
We can compute $b$ in time polynomial in the number of digits of $p$ using the powering algorithm of Section 2.3.

We do not know a deterministic polynomial-time algorithm to compute a square root of $a$ when $p \equiv 1 \pmod{4}$. The following is a standard probabilistic algorithm to compute a square root of $a$, which works well in practice. Consider the quotient ring
\[ R = (\mathbf{Z}/p\mathbf{Z})[x]/(x^2 - a), \]
by which we mean the following. We have
\[ R = \{ u + v\alpha : u, v \in \mathbf{Z}/p\mathbf{Z} \} \]
with multiplication defined by
\[ (u + v\alpha)(z + w\alpha) = (uz + awv) + (uw + vz)\alpha. \]
Here $\alpha$ corresponds to the class of $x$ in the quotient ring. Let $b$ and $c$ be the square roots of $a$ in $\mathbf{Z}/p\mathbf{Z}$ (though we cannot easily compute $b$ and $c$ yet, we can consider them in order to deduce an algorithm to find them). We have ring homomorphisms $f : R \to \mathbf{Z}/p\mathbf{Z}$ and $g : R \to \mathbf{Z}/p\mathbf{Z}$ given by $f(u + v\alpha) = u + vb$ and $g(u + v\alpha) = u + vc$. Together these define a ring isomorphism
\[ \varphi : R \to \mathbf{Z}/p\mathbf{Z} \times \mathbf{Z}/p\mathbf{Z} \]
given by $\varphi(u + v\alpha) = (u + vb, u + vc)$. Choose in some way a random element $z$ of $(\mathbf{Z}/p\mathbf{Z})^*$, and define $u, v \in \mathbf{Z}/p\mathbf{Z}$ by
\[ u + v\alpha = (1 + z\alpha)^{\frac{p-1}{2}}, \]

where we compute $(1 + z\alpha)^{\frac{p-1}{2}}$ quickly using an analogue of the binary powering algorithm of Section 2.3. If $v = 0$ we try again with another random $z$. If $v \neq 0$ we can quickly find the desired square roots $b$ and $c$ as follows. The quantity $u + vb$ is a $(p - 1)/2$ power in $\mathbf{Z}/p\mathbf{Z}$, so it equals either 0, 1, or $-1$, so $b = -u/v$, $(1 - u)/v$, or $(-1 - u)/v$, respectively. Since we know $u$ and $v$ we can try each of $-u/v$, $(1 - u)/v$, and $(-1 - u)/v$ and see which is a square root of $a$. We implement this algorithm in Section .

Example. Continuing Example 4.1.6, we find a square root of 69 modulo 389. We apply the algorithm described above in the case $p \equiv 1 \pmod{4}$. We first choose the random $z = 24$ and find that $(1 + 24\alpha)^{194} = -1$. The coefficient of $\alpha$ in the power is 0, and we try again with $z = 51$. This time we have $(1 + 51\alpha)^{194} = 239\alpha = u + v\alpha$. The inverse of 239 in $\mathbf{Z}/389\mathbf{Z}$ is 153, so we consider the following three possibilities for a square root of 69:
\[ -\frac{u}{v} = 0, \qquad \frac{1 - u}{v} = 153, \qquad \frac{-1 - u}{v} = -153. \]
Thus 153 and $-153$ are the square roots of 69 in $\mathbf{Z}/389\mathbf{Z}$.

4.6 Exercises

4.1 Calculate the following by hand: $\left(\frac{3}{97}\right)$, ( ), ( ), and $\left(\frac{5!}{7}\right)$.

4.2 Use Theorem 4.1.5 to show that for $p \ge 5$ prime,
\[ \left(\frac{3}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1, 11 \pmod{12}, \\ -1 & \text{if } p \equiv 5, 7 \pmod{12}. \end{cases} \]

4.3 (*) Use that $(\mathbf{Z}/p\mathbf{Z})^*$ is cyclic to give a direct proof that $\left(\frac{-3}{p}\right) = 1$ when $p \equiv 1 \pmod{3}$. (Hint: There is a $c \in (\mathbf{Z}/p\mathbf{Z})^*$ of order 3. Show that $(2c + 1)^2 = -3$.)

4.4 (*) If $p \equiv 1 \pmod{5}$, show directly that $\left(\frac{5}{p}\right) = 1$ by the method of Exercise 4.3. (Hint: Let $c \in (\mathbf{Z}/p\mathbf{Z})^*$ be an element of order 5. Show that $(c + c^4)^2 + (c + c^4) - 1 = 0$, etc.)

4.5 (*) Let $p$ be an odd prime. In this exercise you will prove that $\left(\frac{2}{p}\right) = 1$ if and only if $p \equiv \pm 1 \pmod{8}$.

(a) Prove that
\[ x = \frac{1 - t^2}{1 + t^2}, \qquad y = \frac{2t}{1 + t^2} \]

is a parameterization of the set of solutions to $x^2 + y^2 \equiv 1 \pmod p$, in the sense that the solutions $(x, y)$ with $x, y \in \mathbb{Z}/p\mathbb{Z}$ are in bijection with the $t \in \mathbb{Z}/p\mathbb{Z} \cup \{\infty\}$ such that $1 + t^2 \not\equiv 0 \pmod p$. Here $t = \infty$ corresponds to the point $(-1, 0)$. (Hint: if $(x_1, y_1)$ is a solution, consider the line $y = t(x + 1)$ through $(x_1, y_1)$ and $(-1, 0)$, and solve for $x_1, y_1$ in terms of $t$.)

(b) Prove that the number of solutions to $x^2 + y^2 \equiv 1 \pmod p$ is $p + 1$ if $p \equiv 3 \pmod 4$ and $p - 1$ if $p \equiv 1 \pmod 4$.

(c) Consider the set $S$ of pairs $(a, b) \in (\mathbb{Z}/p\mathbb{Z})^* \times (\mathbb{Z}/p\mathbb{Z})^*$ such that $a + b = 1$ and $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right) = 1$. Prove that $\#S = (p + 1 - 4)/4$ if $p \equiv 3 \pmod 4$ and $\#S = (p - 1 - 4)/4$ if $p \equiv 1 \pmod 4$. Conclude that $\#S$ is odd if and only if $p \equiv \pm 1 \pmod 8$.

(d) The map $\sigma(a, b) = (b, a)$ that swaps coordinates is a bijection of the set $S$. It has exactly one fixed point if and only if there is an $a \in \mathbb{Z}/p\mathbb{Z}$ such that $2a = 1$ and $\left(\frac{a}{p}\right) = 1$. Also, prove that $2a = 1$ has a solution $a \in \mathbb{Z}/p\mathbb{Z}$ with $\left(\frac{a}{p}\right) = 1$ if and only if $\left(\frac{2}{p}\right) = 1$.

(e) Finish by showing that $\sigma$ has exactly one fixed point if and only if $\#S$ is odd, i.e., if and only if $p \equiv \pm 1 \pmod 8$.

Remark: The method of proof of this exercise can be generalized to give a proof of the full quadratic reciprocity law.

4.6 How many natural numbers $x < 2^{13}$ satisfy the equation $x^2 \equiv 5 \pmod{2^{13} - 1}$? You may assume that $2^{13} - 1$ is prime.

4.7 Find the natural number $x < 97$ such that $x \equiv 4^{48} \pmod{97}$. Note that 97 is prime.

4.8 In this problem we will formulate an analogue of quadratic reciprocity for a symbol like $\left(\frac{a}{q}\right)$, but without the restriction that $q$ be a prime. Suppose $n$ is a positive integer, which we factor as $\prod_{i=1}^{k} p_i^{e_i}$. We define the Jacobi symbol $\left(\frac{a}{n}\right)$ as follows:
$$\left(\frac{a}{n}\right) = \prod_{i=1}^{k} \left(\frac{a}{p_i}\right)^{e_i}.$$

(a) Give an example to show that $\left(\frac{a}{n}\right) = 1$ need not imply that $a$ is a perfect square modulo $n$.

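(b) (*) Let $n$ be odd and $a$ and $b$ be integers. Prove that the following holds:

i. $\left(\frac{a}{n}\right)\left(\frac{b}{n}\right) = \left(\frac{ab}{n}\right)$. (Thus $a \mapsto \left(\frac{a}{n}\right)$ induces a homomorphism from $(\mathbb{Z}/n\mathbb{Z})^*$ to $\{\pm 1\}$.)

ii. $\left(\frac{-1}{n}\right) \equiv n \pmod 4$.

iii. $\left(\frac{2}{n}\right) = 1$ if $n \equiv \pm 1 \pmod 8$ and $-1$ otherwise.

iv. $\left(\frac{a}{n}\right) = (-1)^{\frac{a-1}{2} \cdot \frac{n-1}{2}} \left(\frac{n}{a}\right)$.

4.9 (*) Prove that for any $n \in \mathbb{Z}$ the integer $n^2 + n + 1$ does not have any divisors of the form $6k - 1$.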
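As a companion to the square-root algorithm of Section 4.5, the following Python sketch works in the ring $R = \{u + v\alpha\}$ with $\alpha^2 = a$ and tries the three candidates $-u/v$, $(1-u)/v$, $(-1-u)/v$. The function names `ring_pow` and `sqrt_mod_p` are hypothetical, chosen for illustration; this is not the book's own implementation, and it assumes that $p$ is an odd prime.

```python
import random

def ring_pow(x, e, a, p):
    """Compute x**e in R = (Z/pZ)[alpha]/(alpha^2 - a); elements are pairs (u, v) = u + v*alpha."""
    def mul(s, t):
        u, v = s
        z, w = t
        # (u + v*alpha)(z + w*alpha) = (uz + a*w*v) + (uw + vz)*alpha
        return ((u * z + a * w * v) % p, (u * w + v * z) % p)

    result = (1, 0)
    while e:
        if e & 1:
            result = mul(result, x)
        x = mul(x, x)
        e >>= 1
    return result

def sqrt_mod_p(a, p):
    """Return a square root of a modulo the odd prime p, or None if a is not a square."""
    a %= p
    if a == 0:
        return 0
    if pow(a, (p - 1) // 2, p) != 1:      # a is not a quadratic residue
        return None
    if p % 4 == 3:                        # easy case: b = a^((p+1)/4)
        return pow(a, (p + 1) // 4, p)
    while True:                           # probabilistic case p = 1 (mod 4)
        z = random.randrange(1, p)
        u, v = ring_pow((1, z), (p - 1) // 2, a, p)
        if v == 0:
            continue                      # unlucky choice of z, try again
        v_inv = pow(v, p - 2, p)          # inverse of v by Fermat's little theorem
        for candidate in (-u * v_inv, (1 - u) * v_inv, (-1 - u) * v_inv):
            candidate %= p
            if candidate * candidate % p == a:
                return candidate

print(ring_pow((1, 51), 194, 69, 389))    # the power computed in the example with z = 51
print(sqrt_mod_p(69, 389))
```

Because the two square roots are negatives of each other, which one is returned depends on the random choice of $z$.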

5 Continued Fractions

A continued fraction is an expression of the form
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}.$$
In this book we will assume that the $a_i$ are real numbers and $a_i > 0$ for $i \geq 1$, and the expression may or may not go on indefinitely. More general notions of continued fractions have been extensively studied, but they are beyond the scope of this book. We will be most interested in the case when the $a_i$ are all integers.

We denote the continued fraction displayed above by
$$[a_0, a_1, a_2, \ldots].$$
For example,
$$[1, 2] = 1 + \frac{1}{2} = \frac{3}{2},$$
$$[3, 7, 15, 1, 292] = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292}}}} = \frac{103993}{33102} = 3.14159265301\ldots,$$

and
$$[2, 1, 2, 1, 1, 4, 1, 1, 6] = \frac{1264}{465} = 2.7182795698\ldots$$
The second two examples were chosen to foreshadow that continued fractions can be used to obtain good rational approximations to irrational numbers. Note that the first approximates $\pi$ and the second $e$.

Continued fractions have many applications. For example, they provide an algorithmic way to recognize a decimal approximation to a rational number. Continued fractions also suggest a sense in which $e$ might be less complicated than $\pi$ (see the example in Section 5.2 that expands $e$ and $\pi$, and Section 5.3).

In Section 5.1 we study continued fractions $[a_0, a_1, \ldots, a_n]$ of finite length and lay the foundations for our later investigations. In Section 5.2 we give the continued fraction procedure, which associates to a real number $x$ a sequence $a_0, a_1, \ldots$ of integers such that $x = \lim_{n \to \infty} [a_0, a_1, \ldots, a_n]$. We also prove that if $a_0, a_1, \ldots$ is any infinite sequence of positive integers, then the sequence $c_n = [a_0, a_1, \ldots, a_n]$ converges; more generally, we prove that if the $a_n$ are arbitrary positive real numbers and $\sum_{n=0}^{\infty} a_n$ diverges then $(c_n)$ converges. In Section 5.4, we prove that a continued fraction with $a_i \in \mathbb{N}$ is (eventually) periodic if and only if its value is a non-rational root of a quadratic polynomial, then discuss open questions concerning continued fractions of roots of irreducible polynomials of degree greater than 2. We conclude the chapter with applications of continued fractions to recognizing approximations to rational numbers (Section 5.5) and writing integers as sums of two squares (Section 5.6).

The reader is encouraged to read more about continued fractions in [HW79, Ch. X], [Khi63], [Bur89, §13.3], and [NZM91, Ch. 7].

5.1 Finite Continued Fractions

This section is about continued fractions of the form $[a_0, a_1, \ldots, a_m]$ for some $m \geq 0$. We give an inductive definition of numbers $p_n$ and $q_n$ such

that for all $n \leq m$
$$[a_0, a_1, \ldots, a_n] = \frac{p_n}{q_n}. \qquad (5.1.1)$$
We then give related formulas for the determinants of the $2 \times 2$ matrices
$$\begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} p_n & p_{n-2} \\ q_n & q_{n-2} \end{pmatrix},$$
which we will repeatedly use to deduce properties of the sequence of partial convergents $[a_0, \ldots, a_k]$. We will use the division (Euclidean) algorithm to prove that every rational number is represented by a continued fraction, as in (5.1.1).

Definition (Finite Continued Fraction). A finite continued fraction is an expression
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}},$$
where each $a_m$ is a real number and $a_m > 0$ for all $m \geq 1$.

Definition (Simple Continued Fraction). A simple continued fraction is a finite or infinite continued fraction in which the $a_i$ are all integers.

To get a feeling for continued fractions, observe that
$$[a_0] = a_0,$$
$$[a_0, a_1] = a_0 + \frac{1}{a_1} = \frac{a_0 a_1 + 1}{a_1},$$
$$[a_0, a_1, a_2] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2}} = \frac{a_0 a_1 a_2 + a_0 + a_2}{a_1 a_2 + 1}.$$
Also,
$$[a_0, a_1, \ldots, a_{n-1}, a_n] = \left[a_0, a_1, \ldots, a_{n-2}, a_{n-1} + \frac{1}{a_n}\right] = a_0 + \frac{1}{[a_1, \ldots, a_n]} = [a_0, [a_1, \ldots, a_n]].$$

5.1.1 Partial Convergents

Fix a finite continued fraction $[a_0, \ldots, a_m]$. We do not assume at this point that the $a_i$ are integers.

Definition (Partial convergents). For $0 \leq n \leq m$, the $n$th convergent of the continued fraction $[a_0, \ldots, a_m]$ is $[a_0, \ldots, a_n]$. These convergents for $n < m$ are also called partial convergents.

For each $n$ with $-2 \leq n \leq m$, define real numbers $p_n$ and $q_n$ as follows:
$$p_{-2} = 0, \quad p_{-1} = 1, \quad p_0 = a_0, \quad \ldots, \quad p_n = a_n p_{n-1} + p_{n-2}, \quad \ldots,$$
$$q_{-2} = 1, \quad q_{-1} = 0, \quad q_0 = 1, \quad \ldots, \quad q_n = a_n q_{n-1} + q_{n-2}, \quad \ldots.$$

Proposition (Partial Convergents). For $n \geq 0$ we have $[a_0, \ldots, a_n] = \dfrac{p_n}{q_n}$.

Proof. We use induction. The assertion is obvious when $n = 0, 1$. Suppose the proposition is true for all continued fractions of length $n - 1$. Then
$$\begin{aligned}
[a_0, \ldots, a_n] &= \left[a_0, \ldots, a_{n-2}, a_{n-1} + \frac{1}{a_n}\right] \\
&= \frac{\left(a_{n-1} + \frac{1}{a_n}\right) p_{n-2} + p_{n-3}}{\left(a_{n-1} + \frac{1}{a_n}\right) q_{n-2} + q_{n-3}} \\
&= \frac{(a_{n-1} a_n + 1) p_{n-2} + a_n p_{n-3}}{(a_{n-1} a_n + 1) q_{n-2} + a_n q_{n-3}} \\
&= \frac{a_n (a_{n-1} p_{n-2} + p_{n-3}) + p_{n-2}}{a_n (a_{n-1} q_{n-2} + q_{n-3}) + q_{n-2}} \\
&= \frac{a_n p_{n-1} + p_{n-2}}{a_n q_{n-1} + q_{n-2}} = \frac{p_n}{q_n}.
\end{aligned}$$

Proposition 5.1.5. For $n \geq 0$ we have
$$p_n q_{n-1} - q_n p_{n-1} = (-1)^{n-1} \qquad (5.1.2)$$
and
$$p_n q_{n-2} - q_n p_{n-2} = (-1)^n a_n. \qquad (5.1.3)$$
Equivalently,
$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = (-1)^{n-1} \cdot \frac{1}{q_n q_{n-1}}$$
and
$$\frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = (-1)^n \cdot \frac{a_n}{q_n q_{n-2}}.$$

Proof. The case for $n = 0$ is obvious from the definitions. Now suppose $n > 0$ and the statement is true for $n - 1$. Then
$$\begin{aligned}
p_n q_{n-1} - q_n p_{n-1} &= (a_n p_{n-1} + p_{n-2}) q_{n-1} - (a_n q_{n-1} + q_{n-2}) p_{n-1} \\
&= p_{n-2} q_{n-1} - q_{n-2} p_{n-1} \\
&= -(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) \\
&= -(-1)^{n-2} = (-1)^{n-1}.
\end{aligned}$$

This completes the proof of (5.1.2). For (5.1.3), we have
$$\begin{aligned}
p_n q_{n-2} - p_{n-2} q_n &= (a_n p_{n-1} + p_{n-2}) q_{n-2} - p_{n-2} (a_n q_{n-1} + q_{n-2}) \\
&= a_n (p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) \\
&= (-1)^n a_n.
\end{aligned}$$

Remark. Expressed in terms of matrices, the proposition asserts that the determinant of $\begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix}$ is $(-1)^{n-1}$, and of $\begin{pmatrix} p_n & p_{n-2} \\ q_n & q_{n-2} \end{pmatrix}$ is $(-1)^n a_n$.

Corollary (Convergents in lowest terms). If $[a_0, a_1, \ldots, a_m]$ is a simple continued fraction, so each $a_i$ is an integer, then the $p_n$ and $q_n$ are integers and the fraction $p_n / q_n$ is in lowest terms.

Proof. It is clear that the $p_n$ and $q_n$ are integers, from the formula that defines them. If $d$ is a positive divisor of both $p_n$ and $q_n$, then $d \mid (-1)^{n-1}$, so $d = 1$.

5.1.2 The Sequence of Partial Convergents

Let $[a_0, \ldots, a_m]$ be a continued fraction and for $n \leq m$ let
$$c_n = [a_0, \ldots, a_n] = \frac{p_n}{q_n}$$
denote the $n$th convergent. Recall that by definition of continued fraction, $a_n > 0$ for $n > 0$, which gives the partial convergents of a continued fraction additional structure. For example, the partial convergents of $[2, 1, 2, 1, 1, 4, 1, 1, 6]$ are
$$2, \; 3, \; 8/3, \; 11/4, \; 19/7, \; 87/32, \; 106/39, \; 193/71, \; 1264/465.$$
To make the size of these numbers clearer, we approximate them using decimals. We also underline every other number, to illustrate some extra structure.
$$\underline{2}, \; 3, \; \underline{2.66667}, \; 2.75000, \; \underline{2.71429}, \; 2.71875, \; \underline{2.71795}, \; 2.71831, \; \underline{2.71828}$$
The underlined numbers are smaller than all of the non-underlined numbers, and the sequence of underlined numbers is strictly increasing, whereas the non-underlined numbers strictly decrease.

We next prove that this extra structure is a general phenomenon.

Proposition (How convergents converge). The even indexed convergents $c_{2n}$ increase strictly with $n$, and the odd indexed convergents $c_{2n+1}$ decrease strictly with $n$. Also, the odd indexed convergents $c_{2n+1}$ are greater than all of the even indexed convergents $c_{2m}$.
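The recurrence for $p_n$ and $q_n$, the determinant identities (5.1.2) and (5.1.3), and the interlacing pattern of the convergents can all be checked numerically. The following Python sketch does so for $[2, 1, 2, 1, 1, 4, 1, 1, 6]$; the helper name `convergents` is hypothetical and the code is only an illustration, not the book's implementation.

```python
from fractions import Fraction

def convergents(a):
    """Return [(p_0, q_0), ..., (p_m, q_m)] for the continued fraction a = [a_0, ..., a_m]."""
    p_prev2, p_prev1 = 0, 1    # p_{-2}, p_{-1}
    q_prev2, q_prev1 = 1, 0    # q_{-2}, q_{-1}
    pairs = []
    for a_n in a:
        p_n = a_n * p_prev1 + p_prev2
        q_n = a_n * q_prev1 + q_prev2
        pairs.append((p_n, q_n))
        p_prev2, p_prev1 = p_prev1, p_n
        q_prev2, q_prev1 = q_prev1, q_n
    return pairs

a = [2, 1, 2, 1, 1, 4, 1, 1, 6]
pq = convergents(a)
c = [Fraction(p, q) for p, q in pq]

# Identity (5.1.2) for n >= 1 and identity (5.1.3) for n >= 2.
for n in range(1, len(a)):
    assert pq[n][0] * pq[n - 1][1] - pq[n][1] * pq[n - 1][0] == (-1) ** (n - 1)
for n in range(2, len(a)):
    assert pq[n][0] * pq[n - 2][1] - pq[n][1] * pq[n - 2][0] == (-1) ** n * a[n]

evens, odds = c[0::2], c[1::2]
print([str(x) for x in c])
print(all(x < y for x, y in zip(evens, evens[1:])))   # even convergents increase
print(all(x > y for x, y in zip(odds, odds[1:])))     # odd convergents decrease
print(max(evens) < min(odds))                         # every odd one exceeds every even one
```

Using `Fraction` keeps the convergents exact, so the comparisons are not affected by rounding.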

Proof. The $a_n$ are positive for $n \geq 1$, so the $q_n$ are positive. By Proposition 5.1.5, for $n \geq 2$,
$$c_n - c_{n-2} = (-1)^n \cdot \frac{a_n}{q_n q_{n-2}},$$
which proves the first claim.

Suppose for the sake of contradiction that there exist integers $r, m$ such that $c_{2m+1} < c_{2r}$. Proposition 5.1.5 implies that for $n \geq 1$,
$$c_n - c_{n-1} = (-1)^{n-1} \cdot \frac{1}{q_n q_{n-1}}$$
has sign $(-1)^{n-1}$, so for all $s \geq 0$ we have $c_{2s+1} > c_{2s}$. Thus it is impossible that $r = m$. If $r < m$, then by what we proved in the first paragraph, $c_{2m+1} < c_{2r} < c_{2m}$, a contradiction (with $s = m$). If $r > m$, then $c_{2r+1} < c_{2m+1} < c_{2r}$, which is also a contradiction (with $s = r$).

5.1.3 Every Rational Number is Represented

Proposition (Rational continued fractions). Every nonzero rational number can be represented by a simple continued fraction.

Proof. Without loss of generality we may assume that the rational number is $a/b$, with $b \geq 1$ and $\gcd(a, b) = 1$. The division algorithm gives:
$$\begin{aligned}
a &= b \cdot a_0 + r_1, & 0 < r_1 < b, \\
b &= r_1 \cdot a_1 + r_2, & 0 < r_2 < r_1, \\
&\;\;\vdots \\
r_{n-2} &= r_{n-1} \cdot a_{n-1} + r_n, & 0 < r_n < r_{n-1}, \\
r_{n-1} &= r_n \cdot a_n + 0.
\end{aligned}$$
Note that $a_i > 0$ for $i > 0$ (also $r_n = 1$ since $\gcd(a, b) = 1$). Rewrite the equations as follows:
$$\begin{aligned}
a/b &= a_0 + r_1/b = a_0 + 1/(b/r_1), \\
b/r_1 &= a_1 + r_2/r_1 = a_1 + 1/(r_1/r_2), \\
r_1/r_2 &= a_2 + r_3/r_2 = a_2 + 1/(r_2/r_3), \\
&\;\;\vdots \\
r_{n-1}/r_n &= a_n.
\end{aligned}$$
It follows that
$$\frac{a}{b} = [a_0, a_1, \ldots, a_n].$$
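The chain of divisions in this proof translates directly into a short loop. The sketch below is an illustration in Python, with the hypothetical function name `rational_cf`; it assumes $b \geq 1$ and collects the successive quotients $a_0, a_1, \ldots, a_n$.

```python
def rational_cf(a, b):
    """Return the partial quotients [a_0, ..., a_n] of a/b, assuming b >= 1."""
    quotients = []
    while b != 0:
        q, r = divmod(a, b)    # a = b*q + r with 0 <= r < b
        quotients.append(q)
        a, b = b, r            # continue with b / r
    return quotients

print(rational_cf(8, 3))
print(rational_cf(103993, 33102))
```

Since `divmod` rounds the quotient down, the same loop also handles negative numerators: only $a_0$ can be negative, and all later quotients are positive.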

The proof of the proposition above leads to an algorithm for computing the continued fraction of a rational number. See Section 7.5 for an implementation.

A nonzero rational number can be represented in exactly two ways; for example, $2 = [1, 1] = [2]$ (see Exercise 5.2).

5.2 Infinite Continued Fractions

This section begins with the continued fraction procedure, which associates to a real number $x$ a sequence $a_0, a_1, \ldots$ of integers. After giving several examples, we prove that $x = \lim_{n \to \infty} [a_0, a_1, \ldots, a_n]$ by proving that the odd and even partial convergents become arbitrarily close to each other. We also show that if $a_0, a_1, \ldots$ is any infinite sequence of positive integers, then the sequence $c_n = [a_0, a_1, \ldots, a_n]$ converges, and, more generally, if $a_n$ is an arbitrary sequence of positive reals such that $\sum_{n=0}^{\infty} a_n$ diverges then $(c_n)$ converges.

5.2.1 The Continued Fraction Procedure

Let $x \in \mathbb{R}$ and write
$$x = a_0 + t_0$$
with $a_0 \in \mathbb{Z}$ and $0 \leq t_0 < 1$. We call the number $a_0$ the floor of $x$, and we also sometimes write $a_0 = \lfloor x \rfloor$. If $t_0 \neq 0$, write
$$\frac{1}{t_0} = a_1 + t_1$$
with $a_1 \in \mathbb{N}$ and $0 \leq t_1 < 1$. Thus $t_0 = \frac{1}{a_1 + t_1} = [0, a_1 + t_1]$, which is a (non-simple) continued fraction expansion of $t_0$. Continue in this manner so long as $t_n \neq 0$, writing
$$\frac{1}{t_n} = a_{n+1} + t_{n+1}$$
with $a_{n+1} \in \mathbb{N}$ and $0 \leq t_{n+1} < 1$. We call this procedure, which associates to a real number $x$ the sequence of integers $a_0, a_1, a_2, \ldots$, the continued fraction process. We implement it on a computer in Section 7.5.

Example. Let $x = \frac{8}{3}$. Then $x = 2 + \frac{2}{3}$, so $a_0 = 2$ and $t_0 = \frac{2}{3}$. Then $\frac{1}{t_0} = \frac{3}{2} = 1 + \frac{1}{2}$, so $a_1 = 1$ and $t_1 = \frac{1}{2}$. Then $\frac{1}{t_1} = 2$, so $a_2 = 2$, $t_2 = 0$, and the sequence terminates. Notice that
$$\frac{8}{3} = [2, 1, 2],$$
so the continued fraction procedure produces the continued fraction of $\frac{8}{3}$.
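The procedure just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation referred to in Section 7.5: the function name `cf_procedure` is hypothetical, and the input is an exact `Fraction`, so an irrational number such as $e$ has to be replaced by a rational approximation (here the floating-point value `math.e`), which means only the first several partial quotients can be trusted.

```python
from fractions import Fraction
from math import e, floor

def cf_procedure(x, max_terms):
    """Apply the continued fraction procedure to the Fraction x, returning at most max_terms terms."""
    terms = []
    for _ in range(max_terms):
        a = floor(x)          # a_n is the floor of the current value
        terms.append(a)
        t = x - a             # t_n with 0 <= t_n < 1
        if t == 0:
            break             # the sequence terminates
        x = 1 / t             # 1/t_n = a_{n+1} + t_{n+1}
    return terms

print(cf_procedure(Fraction(8, 3), 10))
print(cf_procedure(Fraction(e), 9))   # Fraction(e) is the exact value of the float approximation
```

Working with `Fraction` rather than floats avoids rounding errors building up in the repeated inversions; the only error is the initial replacement of $e$ by its floating-point approximation.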

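Example. Let $x = \frac{1 + \sqrt{5}}{2}$. Then
$$x = 1 + \frac{-1 + \sqrt{5}}{2},$$
so $a_0 = 1$ and $t_0 = \frac{-1 + \sqrt{5}}{2}$. We have
$$\frac{1}{t_0} = \frac{2}{-1 + \sqrt{5}} = \frac{-2 - 2\sqrt{5}}{-4} = \frac{1 + \sqrt{5}}{2},$$
so again $a_1 = 1$ and $t_1 = \frac{-1 + \sqrt{5}}{2}$. Likewise, $a_n = 1$ for all $n$. As we will see below, the following exciting equality makes sense:
$$\frac{1 + \sqrt{5}}{2} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}.$$

Example. Suppose $x = e = 2.71828182\ldots$. Using the continued fraction procedure, we find that
$$a_0, a_1, a_2, \ldots = 2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, \ldots$$
For example, $a_0 = 2$ is the floor of $e$. Subtracting 2 and inverting, we obtain $1/0.718281828\ldots = 1.3922111\ldots$, so $a_1 = 1$. Subtracting 1 and inverting yields $1/0.3922111\ldots = 2.5496467\ldots$, so $a_2 = 2$. We will prove in Section 5.3 that the continued fraction of $e$ obeys a simple pattern.

The 5th partial convergent of the continued fraction of $e$ is
$$[a_0, a_1, a_2, a_3, a_4, a_5] = \frac{87}{32} = 2.71875,$$
which is a good rational approximation to $e$, in the sense that
$$\left| \frac{87}{32} - e \right| = 0.000468\ldots$$
Note that $0.000468\ldots < 1/32^2 = 0.000976\ldots$, which illustrates the bound in the corollary on convergence of continued fractions below.

Let's do the same thing with $\pi = 3.14159265358979\ldots$: applying the continued fraction procedure, we find that the continued fraction of $\pi$ is
$$a_0, a_1, a_2, \ldots = 3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, \ldots$$
The first few partial convergents are
$$3, \; \frac{22}{7}, \; \frac{333}{106}, \; \frac{355}{113}, \; \frac{103993}{33102}, \; \ldots$$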

These are good rational approximations to $\pi$; for example,
$$\frac{103993}{33102} = 3.14159265301\ldots$$

Notice that the continued fraction of $e$ exhibits a nice pattern (see Section 5.3 for a proof), whereas the continued fraction of $\pi$ exhibits no pattern that is obvious to the author. The continued fraction of $\pi$ has been extensively studied, and over 20 million terms have been computed. The data suggests that every positive integer appears infinitely often as a partial quotient. For much more about the continued fraction of $\pi$ or of any other sequence in this book, type the first few terms of the sequence into [Slo].

5.2.2 Convergence of Infinite Continued Fractions

Lemma 5.2.4. For every $n$ such that $a_n$ is defined, we have
$$x = [a_0, a_1, \ldots, a_n + t_n],$$
and if $t_n \neq 0$ then $x = [a_0, a_1, \ldots, a_n, \frac{1}{t_n}]$.

Proof. We use induction. The statements are both true when $n = 0$. If the second statement is true for $n - 1$, then
$$\begin{aligned}
x &= \left[a_0, a_1, \ldots, a_{n-1}, \frac{1}{t_{n-1}}\right] \\
&= [a_0, a_1, \ldots, a_{n-1}, a_n + t_n] \\
&= \left[a_0, a_1, \ldots, a_{n-1}, a_n, \frac{1}{t_n}\right].
\end{aligned}$$
Similarly, the first statement is true for $n$ if it is true for $n - 1$.

Theorem 5.2.5 (Continued Fraction Limit). Let $a_0, a_1, \ldots$ be a sequence of integers such that $a_n > 0$ for all $n \geq 1$, and for each $n \geq 0$, set $c_n = [a_0, a_1, \ldots, a_n]$. Then $\lim_{n \to \infty} c_n$ exists.

Proof. For any $m \geq n$, the number $c_n$ is a partial convergent of $[a_0, \ldots, a_m]$. By the proposition on how convergents converge, the even convergents $c_{2n}$ form a strictly increasing sequence and the odd convergents $c_{2n+1}$ form a strictly decreasing sequence. Moreover, the even convergents are all $\leq c_1$ and the odd convergents are all $\geq c_0$. Hence $\alpha_0 = \lim_{n \to \infty} c_{2n}$ and $\alpha_1 = \lim_{n \to \infty} c_{2n+1}$ both exist and $\alpha_0 \leq \alpha_1$. Finally, by Proposition 5.1.5,
$$|c_{2n} - c_{2n-1}| = \frac{1}{q_{2n} \cdot q_{2n-1}} \leq \frac{1}{2n(2n-1)} \to 0,$$
so $\alpha_0 = \alpha_1$.

We define
$$[a_0, a_1, \ldots] = \lim_{n \to \infty} c_n.$$

Example. We illustrate the theorem with $x = \pi$. As in the proof of Theorem 5.2.5, let $c_n$ be the $n$th partial convergent to $\pi$. The $c_n$ with $n$ odd converge down to $\pi$:
$$c_1 = 3.1428571\ldots, \quad c_3 = 3.1415929\ldots, \quad c_5 = 3.1415926539\ldots,$$
whereas the $c_n$ with $n$ even converge up to $\pi$:
$$c_2 = 3.1415094\ldots, \quad c_4 = 3.1415926530\ldots, \quad c_6 = 3.1415926534\ldots$$

Theorem. Let $a_0, a_1, a_2, \ldots$ be a sequence of real numbers such that $a_n > 0$ for all $n \geq 1$, and for each $n \geq 0$, set $c_n = [a_0, a_1, \ldots, a_n]$. Then $\lim_{n \to \infty} c_n$ exists if and only if the sum $\sum_{n=0}^{\infty} a_n$ diverges.

Proof. We only prove that if $\sum a_n$ diverges then $\lim_{n \to \infty} c_n$ exists. A proof of the converse can be found in [Wal48, Ch. 2, Thm. 6.1].

Let $q_n$ be the sequence of denominators of the partial convergents, as defined in Section 5.1.1, so $q_{-2} = 1$, $q_{-1} = 0$, and for $n \geq 0$,
$$q_n = a_n q_{n-1} + q_{n-2}.$$
As we saw in the proof of Theorem 5.2.5, the limit $\lim_{n \to \infty} c_n$ exists provided that the sequence $\{q_n q_{n-1}\}$ diverges to positive infinity.

For $n$ even,
$$\begin{aligned}
q_n &= a_n q_{n-1} + q_{n-2} \\
&= a_n q_{n-1} + a_{n-2} q_{n-3} + q_{n-4} \\
&= a_n q_{n-1} + a_{n-2} q_{n-3} + a_{n-4} q_{n-5} + q_{n-6} \\
&= a_n q_{n-1} + a_{n-2} q_{n-3} + \cdots + a_2 q_1 + q_0,
\end{aligned}$$
and for $n$ odd,
$$q_n = a_n q_{n-1} + a_{n-2} q_{n-3} + \cdots + a_1 q_0 + q_{-1}.$$
Since $a_n > 0$ for $n > 0$, the sequence $\{q_n\}$ is increasing, so $q_i \geq 1$ for all $i \geq 0$. Applying this fact to the above expressions for $q_n$, we see that for $n$ even
$$q_n \geq a_n + a_{n-2} + \cdots + a_2,$$
and for $n$ odd
$$q_n \geq a_n + a_{n-2} + \cdots + a_1.$$
If $\sum a_n$ diverges, then at least one of $\sum a_{2n}$ or $\sum a_{2n+1}$ must diverge. The above inequalities then imply that at least one of the sequences $\{q_{2n}\}$ or $\{q_{2n+1}\}$ diverges to infinity. Since $\{q_n\}$ is an increasing sequence, it follows that $\{q_n q_{n-1}\}$ diverges to infinity.

Example. Let $a_n = \frac{1}{n \log(n)}$ for $n \geq 2$ and $a_0 = a_1 = 0$. By the integral test, $\sum a_n$ diverges, so by the theorem above the continued fraction $[a_0, a_1, a_2, \ldots]$ converges. This convergence is very slow, since, e.g.,
$$[a_0, a_1, \ldots, a_{9999}] = \ldots$$
yet
$$[a_0, a_1, \ldots, a_{\ldots}] = \ldots$$

Theorem. Let $x \in \mathbb{R}$ be a real number. Then $x$ is the value of the (possibly infinite) simple continued fraction $[a_0, a_1, a_2, \ldots]$ produced by the continued fraction procedure.

Proof. If the sequence is finite then some $t_n = 0$ and the result follows by Lemma 5.2.4. Suppose the sequence is infinite. By Lemma 5.2.4,
$$x = \left[a_0, a_1, \ldots, a_n, \frac{1}{t_n}\right].$$
By the proposition on partial convergents (which we apply in a case when the partial quotients of the continued fraction are not integers!), we have
$$x = \frac{\frac{1}{t_n} \cdot p_n + p_{n-1}}{\frac{1}{t_n} \cdot q_n + q_{n-1}}.$$
Thus if $c_n = [a_0, a_1, \ldots, a_n]$, then
$$\begin{aligned}
x - c_n &= x - \frac{p_n}{q_n} \\
&= \frac{\frac{1}{t_n} p_n q_n + p_{n-1} q_n - \frac{1}{t_n} p_n q_n - p_n q_{n-1}}{q_n \left(\frac{1}{t_n} q_n + q_{n-1}\right)} \\
&= \frac{p_{n-1} q_n - p_n q_{n-1}}{q_n \left(\frac{1}{t_n} q_n + q_{n-1}\right)} \\
&= \frac{(-1)^n}{q_n \left(\frac{1}{t_n} q_n + q_{n-1}\right)}.
\end{aligned}$$
Thus
$$\begin{aligned}
|x - c_n| &= \frac{1}{q_n \left(\frac{1}{t_n} q_n + q_{n-1}\right)} \\
&< \frac{1}{q_n (a_{n+1} q_n + q_{n-1})} \\
&= \frac{1}{q_n \cdot q_{n+1}} \leq \frac{1}{n(n+1)} \to 0.
\end{aligned}$$

In the inequality we use that $a_{n+1}$ is the integer part of $\frac{1}{t_n}$, and is hence at most $\frac{1}{t_n}$; the inequality is strict because the sequence is infinite, so $t_{n+1} \neq 0$ and $a_{n+1} < \frac{1}{t_n}$.

This corollary follows from the proof of the above theorem.

Corollary (Convergence of continued fraction). Let $a_0, a_1, \ldots$ define a simple continued fraction, and let $x = [a_0, a_1, \ldots] \in \mathbb{R}$ be its value. Then for all $m$,
$$\left| x - \frac{p_m}{q_m} \right| < \frac{1}{q_m \cdot q_{m+1}}.$$

Proposition. If $x$ is a rational number then the sequence $a_0, a_1, \ldots$ produced by the continued fraction procedure terminates.

Proof. Let $[b_0, b_1, \ldots, b_m]$ be the continued fraction representation of $x$ that we obtain using the division algorithm, so the $b_i$ are the partial quotients at each step. If $m = 0$, then $x$ is an integer, so we may assume $m > 0$. Then
$$x = b_0 + 1/[b_1, \ldots, b_m].$$
If $[b_1, \ldots, b_m] = 1$ then $m = 1$ and $b_1 = 1$, which will not happen using the division algorithm, since it would give $[b_0 + 1]$ for the continued fraction of the integer $b_0 + 1$. Thus $[b_1, \ldots, b_m] > 1$, so in the continued fraction algorithm we choose $a_0 = b_0$ and $t_0 = 1/[b_1, \ldots, b_m]$. Repeating this argument enough times proves the claim.

5.3 The Continued Fraction of e

The continued fraction expansion of $e$ begins $[2, 1, 2, 1, 1, 4, 1, 1, 6, \ldots]$. The obvious pattern in fact does continue, as Euler proved in 1737 (see [Eul85]), and we will prove in this section. As an application, Euler gave a proof that $e$ is irrational by noting that its continued fraction is infinite.

The proof we give below draws heavily on the proof in [Coh], which describes a slight variant of a proof of Hermite (see [Old70]). The continued fraction representation of $e$ is also treated in the German book [Per57], but the proof requires substantial background from elsewhere in that text.

5.3.1 Preliminaries

First, we write the continued fraction of $e$ in a slightly different form. Instead of $[2, 1, 2, 1, 1, 4, \ldots]$, we can start the sequence of coefficients
$$[1, 0, 1, 1, 2, 1, 1, 4, \ldots]$$
to make the pattern the same throughout.
(Everywhere else in this chapter we assume that the partial quotients $a_n$ for $n \geq 1$ are positive, but

temporarily relax that condition here and allow $a_1 = 0$.) The numerators and denominators of the convergents given by this new sequence satisfy a simple recurrence. Using $r_i$ as a stand-in for $p_i$ or $q_i$, we have
$$\begin{aligned}
r_{3n} &= r_{3n-1} + r_{3n-2}, \\
r_{3n-1} &= r_{3n-2} + r_{3n-3}, \\
r_{3n-2} &= 2(n-1) r_{3n-3} + r_{3n-4}.
\end{aligned}$$
Our first goal is to collapse these three recurrences into one recurrence that only makes mention of $r_{3n}$, $r_{3n-3}$, and $r_{3n-6}$. We have
$$\begin{aligned}
r_{3n} &= r_{3n-1} + r_{3n-2} \\
&= (r_{3n-2} + r_{3n-3}) + (2(n-1) r_{3n-3} + r_{3n-4}) \\
&= (4n - 3) r_{3n-3} + 2 r_{3n-4}.
\end{aligned}$$
This same method of simplification also shows us that
$$r_{3n-3} = 2 r_{3n-7} + (4n - 7) r_{3n-6}.$$
To get rid of $2 r_{3n-4}$ in the first equation, we make the substitutions
$$\begin{aligned}
2 r_{3n-4} &= 2 (r_{3n-5} + r_{3n-6}) \\
&= 2 ((2(n-2) r_{3n-6} + r_{3n-7}) + r_{3n-6}) \\
&= (4n - 6) r_{3n-6} + 2 r_{3n-7}.
\end{aligned}$$
Substituting for $2 r_{3n-4}$ and then $2 r_{3n-7}$, we finally have the needed collapsed recurrence,
$$r_{3n} = 2(2n - 1) r_{3n-3} + r_{3n-6}.$$

5.3.2 Two Integral Sequences

We define the sequences $x_n = p_{3n}$, $y_n = q_{3n}$. Since the $3n$-convergents will converge to the same real number that the $n$-convergents do, $x_n / y_n$ also converges to the limit of the continued fraction. Each sequence $\{x_n\}$, $\{y_n\}$ will obey the recurrence relation derived in the previous section (where $z_n$ is a stand-in for $x_n$ or $y_n$):
$$z_n = 2(2n - 1) z_{n-1} + z_{n-2}, \quad \text{for all } n \geq 2. \qquad (5.3.1)$$
The two sequences can be found in Table 5.1. (The initial conditions $x_0 = 1$, $x_1 = 3$, $y_0 = y_1 = 1$ are taken straight from the first few convergents of the original continued fraction.) Notice that since we are skipping several convergents at each step, the ratio $x_n / y_n$ converges to $e$ very quickly.
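The collapsed recurrence (5.3.1) is easy to check against the three-step recurrence it came from. The Python sketch below is an illustration only, and the function names `e_sequences` and `modified_e_quotients` are hypothetical. It builds $x_n$ and $y_n$ from (5.3.1) and compares them with every third convergent of $[1, 0, 1, 1, 2, 1, 1, 4, \ldots]$.

```python
from math import e

def e_sequences(count):
    """Return the first `count` terms of x_n and y_n from z_n = 2(2n-1) z_{n-1} + z_{n-2}."""
    x, y = [1, 3], [1, 1]
    for n in range(2, count):
        x.append(2 * (2 * n - 1) * x[n - 1] + x[n - 2])
        y.append(2 * (2 * n - 1) * y[n - 1] + y[n - 2])
    return x[:count], y[:count]

def modified_e_quotients(length):
    """Partial quotients 1, 0, 1, 1, 2, 1, 1, 4, ...: a_{3k-2} = 2(k-1), a_{3k-1} = a_{3k} = 1."""
    a = [1]
    k = 1
    while len(a) < length:
        a.extend([2 * (k - 1), 1, 1])
        k += 1
    return a[:length]

x, y = e_sequences(5)

# Convergents p_i / q_i of the modified sequence, via the usual recurrence.
p_prev, p_cur, q_prev, q_cur = 0, 1, 1, 0
p, q = [], []
for a_i in modified_e_quotients(13):
    p_prev, p_cur = p_cur, a_i * p_cur + p_prev
    q_prev, q_cur = q_cur, a_i * q_cur + q_prev
    p.append(p_cur)
    q.append(q_cur)

print(x, y)
print(p[0::3] == x, q[0::3] == y)   # x_n = p_{3n} and y_n = q_{3n}
print(abs(x[4] / y[4] - e))         # already a small error after four steps
```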

TABLE 5.1. Convergents
$$\begin{array}{c|ccccc}
n & 0 & 1 & 2 & 3 & 4 \\ \hline
x_n & 1 & 3 & 19 & 193 & 2721 \\
y_n & 1 & 1 & 7 & 71 & 1001 \\
x_n / y_n & 1 & 3 & 2.714\ldots & 2.71830\ldots & 2.7182817\ldots
\end{array}$$

5.3.3 A Related Sequence of Integrals

Now, we define a sequence of real numbers $T_0, T_1, T_2, \ldots$ by the following integrals:
$$T_n = \int_0^1 \frac{t^n (t-1)^n}{n!} e^t \, dt.$$
Below, we compute the first two terms of this sequence explicitly. (When we compute $T_1$, we are doing the integration by parts $u = t(t-1)$, $dv = e^t \, dt$. Since the integral runs from 0 to 1, the boundary condition is 0 when evaluated at each of the endpoints. This vanishing will be helpful when we do the integral in the general case.)
$$T_0 = \int_0^1 e^t \, dt = e - 1,$$
$$\begin{aligned}
T_1 &= \int_0^1 t(t-1) e^t \, dt \\
&= -\int_0^1 ((t-1) + t) e^t \, dt \\
&= -\Big[(t-1) e^t\Big]_0^1 - \Big[t e^t\Big]_0^1 + 2 \int_0^1 e^t \, dt \\
&= -1 - e + 2(e - 1) = e - 3.
\end{aligned}$$
The reason that we defined this sequence now becomes apparent: $T_0 = y_0 e - x_0$ and $T_1 = y_1 e - x_1$. In general, it will be true that $T_n = y_n e - x_n$. We will now prove this fact.

It is clear that if the $T_n$ were to satisfy the same recurrence that the $x_i$ and $y_i$ do, in equation (5.3.1), then the above statement holds by induction. (The initial conditions are correct, as needed.) So we simplify $T_n$ by

integrating by parts twice in succession:
$$\begin{aligned}
T_n &= \int_0^1 \frac{t^n (t-1)^n}{n!} e^t \, dt \\
&= -\int_0^1 \frac{t^{n-1} (t-1)^n + t^n (t-1)^{n-1}}{(n-1)!} e^t \, dt \\
&= \int_0^1 \left( \frac{t^{n-2} (t-1)^n}{(n-2)!} + n \frac{t^{n-1} (t-1)^{n-1}}{(n-1)!} + n \frac{t^{n-1} (t-1)^{n-1}}{(n-1)!} + \frac{t^n (t-1)^{n-2}}{(n-2)!} \right) e^t \, dt \\
&= 2n T_{n-1} + \int_0^1 \frac{t^{n-2} (t-1)^{n-2}}{(n-2)!} (2t^2 - 2t + 1) e^t \, dt \\
&= 2n T_{n-1} + 2 \int_0^1 \frac{t^{n-1} (t-1)^{n-1}}{(n-2)!} e^t \, dt + \int_0^1 \frac{t^{n-2} (t-1)^{n-2}}{(n-2)!} e^t \, dt \\
&= 2n T_{n-1} + 2(n-1) T_{n-1} + T_{n-2} \\
&= 2(2n - 1) T_{n-1} + T_{n-2},
\end{aligned}$$
which is the desired recurrence.

Therefore $T_n = y_n e - x_n$. To conclude the proof, we consider the limit as $n$ approaches infinity:
$$\lim_{n \to \infty} \int_0^1 \frac{t^n (t-1)^n}{n!} e^t \, dt = 0,$$
by inspection, and therefore
$$\lim_{n \to \infty} \frac{x_n}{y_n} = \lim_{n \to \infty} \left( e - \frac{T_n}{y_n} \right) = e.$$
Therefore, the ratio $x_n / y_n$ approaches $e$, and the continued fraction expansion $[2, 1, 2, 1, 1, 4, 1, 1, \ldots]$ does in fact converge to $e$.

5.3.4 Extensions of the Argument

The method of proof of this section generalizes to show that the continued fraction expansion of $e^{1/n}$ is
$$[1, (n-1), 1, 1, (3n-1), 1, 1, (5n-1), 1, 1, (7n-1), \ldots]$$
for all $n \in \mathbb{N}$ (see Exercise 5.6).

5.4 Quadratic Irrationals

The main result of this section is that the continued fraction expansion of a number is eventually repeating if and only if the number is a quadratic

irrational. This can be viewed as an analogue for continued fractions of the familiar fact that the decimal expansion of $x$ is eventually repeating if and only if $x$ is rational. The proof that continued fractions of quadratic irrationals eventually repeat is surprisingly difficult and involves an interesting finiteness argument. The final part of this section emphasizes our striking ignorance about continued fractions of real roots of irreducible polynomials over $\mathbb{Q}$ of degree bigger than 2.

Definition (Quadratic Irrational). A real number $\alpha \in \mathbb{R}$ is a quadratic irrational if it is irrational and satisfies a quadratic polynomial with coefficients in $\mathbb{Q}$.

Thus, e.g., $(1 + \sqrt{5})/2$ is a quadratic irrational. Recall that
$$\frac{1 + \sqrt{5}}{2} = [1, 1, 1, \ldots].$$
The continued fraction of $\sqrt{2}$ is $[1, 2, 2, 2, 2, 2, \ldots]$, and the continued fraction of $\sqrt{389}$ is
$$[19, 1, 2, 1, 1, 1, 1, 2, 1, 38, 1, 2, 1, 1, 1, 1, 2, 1, 38, \ldots].$$
Does the $[1, 2, 1, 1, 1, 1, 2, 1, 38]$ pattern repeat over and over again?

5.4.1 Periodic Continued Fractions

Definition (Periodic Continued Fraction). A periodic continued fraction is a continued fraction $[a_0, a_1, \ldots, a_n, \ldots]$ such that
$$a_n = a_{n+h}$$
for some fixed positive integer $h$ and all sufficiently large $n$. We call the minimal such $h$ the period of the continued fraction.

Example. Consider the periodic continued fraction $[1, 2, 1, 2, \ldots] = [\overline{1, 2}]$. What does it converge to? We have
$$[\overline{1, 2}] = 1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}},$$
so if $\alpha = [\overline{1, 2}]$ then
$$\alpha = 1 + \cfrac{1}{2 + \cfrac{1}{\alpha}} = 1 + \frac{1}{\frac{2\alpha + 1}{\alpha}} = 1 + \frac{\alpha}{2\alpha + 1} = \frac{3\alpha + 1}{2\alpha + 1}.$$

Thus $2\alpha^2 - 2\alpha - 1 = 0$, so
$$\alpha = \frac{1 + \sqrt{3}}{2}.$$

Theorem (Periodic Characterization). An infinite simple continued fraction is periodic if and only if it represents a quadratic irrational.

Proof. ($\Longrightarrow$) First suppose that
$$[a_0, a_1, \ldots, a_n, \overline{a_{n+1}, \ldots, a_{n+h}}]$$
is a periodic continued fraction. Set $\alpha = [a_{n+1}, a_{n+2}, \ldots]$. Then
$$\alpha = [a_{n+1}, \ldots, a_{n+h}, \alpha],$$
so by the proposition on partial convergents
$$\alpha = \frac{\alpha p_{n+h} + p_{n+h-1}}{\alpha q_{n+h} + q_{n+h-1}}.$$
Here we use that $\alpha$ is the last partial quotient. Thus, $\alpha$ satisfies a quadratic equation with coefficients in $\mathbb{Q}$. Computing as in the example above and rationalizing the denominators, and using that the $a_i$ are all integers, shows that
$$[a_0, a_1, \ldots] = [a_0, a_1, \ldots, a_n, \alpha] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n + \cfrac{1}{\alpha}}}}$$
is of the form $c + d\alpha$, with $c, d \in \mathbb{Q}$, so $[a_0, a_1, \ldots]$ also satisfies a quadratic polynomial over $\mathbb{Q}$. The continued fraction procedure applied to the value of an infinite simple continued fraction yields that continued fraction back, so by the proposition that the procedure terminates for rational numbers, $\alpha \notin \mathbb{Q}$ because it is the value of an infinite continued fraction.

($\Longleftarrow$) Suppose $\alpha \in \mathbb{R}$ is an irrational number that satisfies a quadratic equation
$$a\alpha^2 + b\alpha + c = 0 \qquad (5.4.1)$$
with $a, b, c \in \mathbb{Z}$ and $a \neq 0$. Let $[a_0, a_1, \ldots]$ be the continued fraction expansion of $\alpha$. For each $n$, let
$$r_n = [a_n, a_{n+1}, \ldots],$$
so
$$\alpha = [a_0, a_1, \ldots, a_{n-1}, r_n].$$

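We will prove periodicity by showing that the set of $r_n$'s is finite. If we have shown finiteness, then there exist $n, h > 0$ such that $r_n = r_{n+h}$, so
$$\begin{aligned}
[a_0, \ldots, a_{n-1}, r_n] &= [a_0, \ldots, a_{n-1}, a_n, \ldots, a_{n+h-1}, r_{n+h}] \\
&= [a_0, \ldots, a_{n-1}, a_n, \ldots, a_{n+h-1}, r_n] \\
&= [a_0, \ldots, a_{n-1}, a_n, \ldots, a_{n+h-1}, a_n, \ldots, a_{n+h-1}, r_{n+h}] \\
&= [a_0, \ldots, a_{n-1}, \overline{a_n, \ldots, a_{n+h-1}}].
\end{aligned}$$

It remains to show there are only finitely many distinct $r_n$. We have
$$\alpha = \frac{p_n}{q_n} = \frac{r_n p_{n-1} + p_{n-2}}{r_n q_{n-1} + q_{n-2}},$$
where $p_n$ and $q_n$ are computed with $r_n$ in the role of the last partial quotient. Substituting this expression for $\alpha$ into the quadratic equation (5.4.1), we see that
$$A_n r_n^2 + B_n r_n + C_n = 0,$$
where
$$\begin{aligned}
A_n &= a p_{n-1}^2 + b p_{n-1} q_{n-1} + c q_{n-1}^2, \\
B_n &= 2a p_{n-1} p_{n-2} + b (p_{n-1} q_{n-2} + p_{n-2} q_{n-1}) + 2c q_{n-1} q_{n-2}, \\
C_n &= a p_{n-2}^2 + b p_{n-2} q_{n-2} + c q_{n-2}^2.
\end{aligned}$$
Note that $A_n, B_n, C_n \in \mathbb{Z}$, that $C_n = A_{n-1}$, and that
$$B_n^2 - 4 A_n C_n = (b^2 - 4ac)(p_{n-1} q_{n-2} - q_{n-1} p_{n-2})^2 = b^2 - 4ac.$$
Recall from the proof of the theorem on the value of the continued fraction procedure that
$$\left| \alpha - \frac{p_{n-1}}{q_{n-1}} \right| < \frac{1}{q_n q_{n-1}}.$$
Thus
$$|\alpha q_{n-1} - p_{n-1}| < \frac{1}{q_n} < \frac{1}{q_{n-1}},$$
so
$$p_{n-1} = \alpha q_{n-1} + \frac{\delta}{q_{n-1}} \quad \text{with } |\delta| < 1.$$
Hence
$$\begin{aligned}
A_n &= a \left( \alpha q_{n-1} + \frac{\delta}{q_{n-1}} \right)^2 + b \left( \alpha q_{n-1} + \frac{\delta}{q_{n-1}} \right) q_{n-1} + c q_{n-1}^2 \\
&= (a\alpha^2 + b\alpha + c) q_{n-1}^2 + 2a\alpha\delta + a \frac{\delta^2}{q_{n-1}^2} + b\delta \\
&= 2a\alpha\delta + a \frac{\delta^2}{q_{n-1}^2} + b\delta.
\end{aligned}$$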

Thus
$$|A_n| = \left| 2a\alpha\delta + a \frac{\delta^2}{q_{n-1}^2} + b\delta \right| < 2|a\alpha| + |a| + |b|.$$
Thus there are only finitely many possibilities for the integer $A_n$. Also,
$$|C_n| = |A_{n-1}| \quad \text{and} \quad |B_n| = \sqrt{b^2 - 4(ac - A_n C_n)},$$
so there are only finitely many triples $(A_n, B_n, C_n)$, and hence only finitely many possibilities for $r_n$ as $n$ varies, which completes the proof. (The proof above closely follows [HW79, Thm. 177].)

5.4.2 Continued Fractions of Algebraic Numbers of Higher Degree

Definition (Algebraic Number). An algebraic number is a root of a polynomial $f \in \mathbb{Q}[x]$.

Open Problem. Give a simple description of the complete continued fraction expansion of the algebraic number $\sqrt[3]{2}$. It begins
$$[1, 3, 1, 5, 1, 1, 4, 1, 1, 8, 1, 14, 1, 10, 2, 1, 4, 12, 2, 3, 2, 1, 3, 4, 1, 1, 2, 14, 3, 12, 1, 15, 3, 1, 4, 534, 1, 1, 5, 1, 1, \ldots]$$
The author does not see a pattern, and the 534 reduces his confidence that he will. Lang and Trotter (see [LT72]) analyzed many terms of the continued fraction of $\sqrt[3]{2}$ statistically, and their work suggests that $\sqrt[3]{2}$ has an unusual continued fraction; later work in [LT74] suggests that maybe it does not.

Khintchine (see [Khi63, pg. 59]):

"No properties of the representing continued fractions, analogous to those which have just been proved, are known for algebraic numbers of higher degree [as of 1963]. [...] It is of interest to point out that up till the present time no continued fraction development of an algebraic number of higher degree than the second is known [emphasis added]. It is not even known if such a development has bounded elements. Generally speaking the problems associated with the continued fraction expansion of algebraic numbers of degree higher than the second are extremely difficult and virtually unstudied."

Richard Guy (see [Guy94, pg. 260]):

"Is there an algebraic number of degree greater than two whose simple continued fraction has unbounded partial quotients? Does every such number have unbounded partial quotients?"

Baum and Sweet [BS76] answered the analogue of Richard Guy's question but with algebraic numbers replaced by elements of a field $K$ other than $\mathbb{Q}$. (The field $K$ is $\mathbb{F}_2((1/x))$, the field of Laurent series in the variable $1/x$ over the finite field with two elements. An element of $K$ is a polynomial in $x$ plus a formal power series in $1/x$.) They found an $\alpha$ of degree three over $K$ whose continued fraction has all terms of bounded degree, and other elements of various degrees greater than 2 over $K$ whose continued fractions have terms of unbounded degree.

5.5 Recognizing Rational Numbers

Suppose that somehow you can compute approximations to some rational number, and want to figure out what the rational number probably is. Computing the approximation to high enough precision to find a period in the decimal expansion is not a good approach, because the period can be huge (see below). A much better approach is to compute the simple continued fraction of the approximation, and truncate it before a large partial quotient $a_n$, then compute the value of the truncated continued fraction. This results in a rational number that has relatively small numerator and denominator, and is close to the approximation of the rational number, since the tail end of the continued fraction is at most $1/a_n$.

We begin with a contrived example, which illustrates how to recognize a rational number. Let
$$x = 9495/3847 = 2.468157005458799064\ldots$$
The continued fraction of the truncation $2.468157005458799064$ is
$$[2, 2, 7, 2, 1, 5, 1, 1, 1, 1, 1, 1, 328210621945, 2, 1, 1, 1, \ldots]$$
We have
$$[2, 2, 7, 2, 1, 5, 1, 1, 1, 1, 1, 1] = \frac{9495}{3847}.$$
Notice that no repetition is evident in the digits of $x$ given above, though we know that the decimal expansion of $x$ must be eventually periodic, since all decimal expansions of rational numbers are eventually periodic. In fact, the length of the period of the decimal expansion of $1/3847$ is 3846, which is the order of 10 modulo 3847 (see Exercise 5.7).
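The recognition strategy described here (expand, stop before a large partial quotient, evaluate what is left) can be sketched as follows in Python. The function name `recognize_rational` and the cutoff parameter `big` are hypothetical; what counts as a "large" partial quotient is a judgment call that depends on the precision of the approximation, and the default of 1000 is just an assumption for this example.

```python
from fractions import Fraction
from math import floor

def recognize_rational(approx, big=1000):
    """Expand the Fraction `approx` as a continued fraction and stop before
    the first partial quotient a_n (n >= 1) with a_n >= big; return the
    value of the truncated continued fraction."""
    p_prev, p = 1, floor(approx)     # p_{-1}, p_0
    q_prev, q = 0, 1                 # q_{-1}, q_0
    t = approx - p
    while t != 0:
        x = 1 / t
        a = floor(x)
        if a >= big:
            break                    # truncate before the large partial quotient
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        t = x - a
    return Fraction(p, q)

# The 18-digit truncation of the decimal expansion of 9495/3847.
print(recognize_rational(Fraction("2.468157005458799064")))
```

The decimal is passed as a string so that `Fraction` represents it exactly; converting through a float first would introduce its own rounding error.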
For a slightly less contrived application of this idea, suppose $f(x) \in \mathbb{Z}[x]$ is a polynomial with integer coefficients, and we know for some reason that one root of $f$ is a rational number. Then we can find that rational number by using Newton's method to approximate each root, and continued fractions to decide whether each root is a rational number (we can substitute the value of the continued fraction approximation into $f$ to see if it

is actually a root). One could also use the well-known rational root theorem, which asserts that any rational root $n/d$ of $f$, with $n, d \in \mathbb{Z}$ coprime, has the property that $n$ divides the constant term of $f$ and $d$ the leading coefficient of $f$. However, using that theorem to find $n/d$ would require factoring the constant and leading terms of $f$, which could be completely impractical if they have a few hundred digits (see Section 1.1.3). In contrast, Newton's method and continued fractions should quickly find $n/d$, assuming the degree of $f$ isn't too large.

For example, suppose $f = 3847x^2 - 14808904x + 36527265$. To apply Newton's method, let $x_0$ be a guess for a root of $f$. Then iterate using the recurrence
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$
Choosing $x_0 = 0$, approximations of the first two iterates are
$$x_1 = 2.46657\ldots$$
and
$$x_2 = 2.468157\ldots$$
The continued fractions of the approximations $x_1$ and $x_2$ are
$$[2, 2, 6, 1, 47, 2, 1, 4, 3, 1, 5, 8, 2, 3]$$
and
$$[2, 2, 7, 2, 1, 5, 1, 1, 1, 1, 1, 1, 103, 8, 1, 2, 3, \ldots].$$
Truncating the continued fraction of $x_2$ before 103 gives
$$[2, 2, 7, 2, 1, 5, 1, 1, 1, 1, 1, 1],$$
which evaluates to $9495/3847$, which is a rational root of $f$.

Another computational application of continued fractions, which we can only hint at, is that there are functions in certain parts of advanced number theory (that are beyond the scope of this book) that take rational values at certain points, and which can only be computed efficiently via approximations; using continued fractions as illustrated above to evaluate such functions is crucial.

5.6 Sums of Two Squares

In this section we apply continued fractions to prove the following theorem.

Theorem. A positive integer $n$ is a sum of two squares if and only if all prime factors $p \mid n$ such that $p \equiv 3 \pmod 4$ have even exponent in the prime factorization of $n$.
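Before the proof, the statement can be compared with a brute-force search on small cases. The Python sketch below is only an experiment (it proves nothing), and both function names are hypothetical. It assumes that $0$ is allowed as one of the two squares, so that for instance $9 = 3^2 + 0^2$ counts.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Brute force: is n = x^2 + y^2 for some integers x, y >= 0?"""
    for x in range(isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            return True
    return False

def satisfies_criterion(n):
    """Does every prime p = 3 (mod 4) occur in n to an even power?"""
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:         # only primes can divide here, since smaller factors are gone
            n //= d
            exponent += 1
        if d % 4 == 3 and exponent % 2 == 1:
            return False
        d += 1
    # Whatever remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3

print(all(is_sum_of_two_squares(n) == satisfies_criterion(n) for n in range(1, 2001)))
```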

We first consider some examples. Notice that $5 = 1^2 + 2^2$ is a sum of two squares, but 7 is not a sum of two squares. Since 2001 is divisible by 3 (because $2 + 1$ is divisible by 3), but not by 9 (since $2 + 1$ is not), the theorem above implies that 2001 is not a sum of two squares. The theorem also implies that is a sum of two squares.

Definition (Primitive). A representation $n = x^2 + y^2$ is primitive if $x$ and $y$ are coprime.

Lemma. If $n$ is divisible by a prime $p \equiv 3 \pmod 4$, then $n$ has no primitive representations.

Proof. Suppose $n$ has a primitive representation, $n = x^2 + y^2$, and let $p$ be any prime factor of $n$. Then $p \mid x^2 + y^2$ and $\gcd(x, y) = 1$, so $p \nmid x$ and $p \nmid y$. Since $\mathbb{Z}/p\mathbb{Z}$ is a field we may divide by $y^2$ in the equation $x^2 + y^2 \equiv 0 \pmod p$ to see that $(x/y)^2 \equiv -1 \pmod p$. Thus the quadratic residue symbol $\left(\frac{-1}{p}\right)$ equals $+1$. However, by Proposition 4.2.1, $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$, so $\left(\frac{-1}{p}\right) = 1$ if and only if $(p-1)/2$ is even, which is to say $p \equiv 1 \pmod 4$.

Proof of the theorem ($\Longrightarrow$). Suppose that $p \equiv 3 \pmod 4$ is a prime, that $p^r \mid n$ but $p^{r+1} \nmid n$ with $r$ odd, and that $n = x^2 + y^2$. Letting $d = \gcd(x, y)$, we have $x = dx'$, $y = dy'$, and $n = d^2 n'$ with $\gcd(x', y') = 1$ and $(x')^2 + (y')^2 = n'$. Because $r$ is odd, $p \mid n'$, so the lemma above implies that $\gcd(x', y') > 1$, a contradiction.

To prepare for our proof of ($\Longleftarrow$), we reduce the problem to the case when $n$ is prime. Write $n = n_1^2 n_2$ where $n_2$ has no prime factors $p \equiv 3 \pmod 4$. It suffices to show that $n_2$ is a sum of two squares, since

$(x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1 x_2 - y_1 y_2)^2 + (x_1 y_2 + x_2 y_1)^2$, (5.6.1)

so a product of two numbers that are sums of two squares is also a sum of two squares. Since $2 = 1^2 + 1^2$ is a sum of two squares, it suffices to show that any prime $p \equiv 1 \pmod 4$ is a sum of two squares.
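The statement of the theorem and identity (5.6.1) can be checked numerically on small inputs. The following is a minimal sketch, not taken from the book: the function names `is_sum_of_two_squares`, `criterion`, and `compose` are illustrative, and the factoring step uses plain trial division, which is only reasonable for small $n$.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Brute force: is n = x^2 + y^2 for some integers x, y >= 0?"""
    for x in range(isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            return True
    return False


def criterion(n):
    """Test from the theorem: every prime p = 3 (mod 4) divides n to an even power."""
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:  # strip all factors p and count them
            n //= p
            e += 1
        if p % 4 == 3 and e % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime that occurs to the first power.
    return not (n > 1 and n % 4 == 3)


def compose(x1, y1, x2, y2):
    """Identity (5.6.1): a representation of the product from two representations."""
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)
```

For instance, `compose(1, 2, 2, 3)` combines $5 = 1^2 + 2^2$ and $13 = 2^2 + 3^2$ into a representation of 65.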

Lemma 5.6.4. If $x \in \mathbb{R}$ and $n \in \mathbb{N}$, then there is a fraction $a/b$ in lowest terms such that $0 < b \le n$ and

$\left| x - \frac{a}{b} \right| \le \frac{1}{b(n+1)}$.

Proof. Consider the continued fraction $[a_0, a_1, \ldots]$ of $x$. By an earlier corollary, for each $m$,

$\left| x - \frac{p_m}{q_m} \right| < \frac{1}{q_m q_{m+1}}$.

Since $q_{m+1} \ge q_m + 1$ and $q_0 = 1$, either there exists an $m$ such that $q_m \le n < q_{m+1}$, or the continued fraction expansion of $x$ is finite and $n$ is larger than the denominator of the rational number $x$, in which case we take $a/b = x$ and are done. In the first case,

$\left| x - \frac{p_m}{q_m} \right| < \frac{1}{q_m q_{m+1}} \le \frac{1}{q_m (n+1)}$,

so $a/b = p_m/q_m$ satisfies the conclusion of the lemma.

Proof of the theorem ($\Longleftarrow$). As discussed above, it suffices to prove that any prime $p \equiv 1 \pmod 4$ is a sum of two squares. Since $p \equiv 1 \pmod 4$, $(-1)^{(p-1)/2} = 1$, so the formula $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$ implies that $-1$ is a square modulo $p$; i.e., there exists $r \in \mathbb{Z}$ such that $r^2 \equiv -1 \pmod p$. Lemma 5.6.4, with $n = \lfloor \sqrt{p} \rfloor$ and $x = -r/p$, implies that there are integers $a, b$ such that $0 < b < \sqrt{p}$ and

$\left| -\frac{r}{p} - \frac{a}{b} \right| \le \frac{1}{b(n+1)} < \frac{1}{b\sqrt{p}}$.

Letting $c = rb + pa$, we have that

$|c| < \frac{pb}{b\sqrt{p}} = \frac{p}{\sqrt{p}} = \sqrt{p}$,

so $0 < b^2 + c^2 < 2p$. But $c \equiv rb \pmod p$, so

$b^2 + c^2 \equiv b^2 + r^2 b^2 \equiv b^2 (1 + r^2) \equiv 0 \pmod p$.

Thus $b^2 + c^2 = p$.

Remark. Our proof of the theorem leads to an efficient algorithm to compute a representation of any $p \equiv 1 \pmod 4$ as a sum of two squares. See Listing for an implementation.
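The proof above can be followed step by step in code. The sketch below is an illustration under stated assumptions, not the book's listing: the names `sqrt_of_minus_one` and `two_squares` are hypothetical, $p$ is assumed to be a prime with $p \equiv 1 \pmod 4$, and the square root of $-1$ is found by searching for a quadratic nonresidue $g$ and taking $g^{(p-1)/4}$, a standard trick that the text does not spell out.

```python
from fractions import Fraction
from math import isqrt


def sqrt_of_minus_one(p):
    """Return r with r*r = -1 (mod p), assuming p is a prime with p = 1 (mod 4)."""
    for g in range(2, p):
        if pow(g, (p - 1) // 2, p) == p - 1:  # g is a quadratic nonresidue
            return pow(g, (p - 1) // 4, p)
    raise ValueError("no nonresidue found; is p prime?")


def two_squares(p):
    """Write p = b^2 + c^2 using the convergents of -r/p, as in the proof."""
    if p % 4 != 1:
        raise ValueError("expected a prime p = 1 (mod 4)")
    r = sqrt_of_minus_one(p)
    n = isqrt(p)  # n = floor(sqrt(p))
    x = Fraction(-r, p)
    a0 = x.numerator // x.denominator  # floor of x
    p_prev, q_prev = 1, 0  # convergent numerator/denominator with index -1
    p_cur, q_cur = a0, 1  # convergent with index 0
    rest = x - a0
    while rest != 0:
        x = 1 / rest
        ak = x.numerator // x.denominator
        rest = x - ak
        p_next, q_next = ak * p_cur + p_prev, ak * q_cur + q_prev
        if q_next > n:  # now q_cur <= n < q_next, as in Lemma 5.6.4
            break
        p_prev, q_prev, p_cur, q_cur = p_cur, q_cur, p_next, q_next
    a, b = p_cur, q_cur
    c = r * b + p * a
    return b, abs(c)
```

Only the denominators of the convergents need to be compared with $\lfloor \sqrt{p} \rfloor$, so the loop stops after at most as many steps as the Euclidean algorithm applied to $r$ and $p$.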

5.7 Exercises

5.1 If $c_n = p_n/q_n$ is the $n$th convergent of $[a_0, a_1, \ldots, a_n]$ and $a_0 > 0$, show that $[a_n, a_{n-1}, \ldots, a_1, a_0] = p_n/p_{n-1}$ and $[a_n, a_{n-1}, \ldots, a_2, a_1] = q_n/q_{n-1}$. (Hint: In the first case, notice that $\frac{p_n}{p_{n-1}} = a_n + \frac{p_{n-2}}{p_{n-1}} = a_n + \frac{1}{p_{n-1}/p_{n-2}}$.)

5.2 Show that every nonzero rational number can be represented in exactly two ways by a finite simple continued fraction. (For example, 2 can be represented by [1, 1] and [2], and 1/3 by [0, 3] and [0, 2, 1].)

5.3 Evaluate the infinite continued fraction [2, 1, 2, 1].

5.4 Determine the infinite continued fraction of

5.5 Let $a_0 \in \mathbb{R}$ and $a_1, \ldots, a_n$ and $b$ be positive real numbers. Prove that $[a_0, a_1, \ldots, a_n + b] < [a_0, a_1, \ldots, a_n]$ if and only if $n$ is odd.

5.6 (*) Extend the method presented in the text to show that the continued fraction expansion of $e^{1/k}$ is $[1, (k-1), 1, 1, (3k-1), 1, 1, (5k-1), 1, 1, (7k-1), \ldots]$ for all $k \in \mathbb{N}$.
(a) Compute $p_0$, $p_3$, $q_0$, and $q_3$ for the above continued fraction. Your answers should be in terms of $k$.
(b) Condense three steps of the recurrence for the numerators and denominators of the above continued fraction. That is, produce a simple recurrence for $r_{3n}$ in terms of $r_{3n-3}$ and $r_{3n-6}$ whose coefficients are polynomials in $n$ and $k$.
(c) Define a sequence of real numbers by $T_n(k) = \frac{1}{k^n} \int_0^{1/k} \frac{(kt)^n (kt-1)^n}{n!} e^t \, dt$.
i. Compute $T_0(k)$, and verify that it equals $q_0 e^{1/k} - p_0$.
ii. Compute $T_1(k)$, and verify that it equals $q_3 e^{1/k} - p_3$.

iii. Integrate $T_n(k)$ by parts twice in succession, as in Section 5.3, and verify that $T_n(k)$, $T_{n-1}(k)$, and $T_{n-2}(k)$ satisfy the recurrence produced in part 6b, for $n \ge 2$.
(d) Conclude that the continued fraction $[1, (k-1), 1, 1, (3k-1), 1, 1, (5k-1), 1, 1, (7k-1), \ldots]$ represents $e^{1/k}$.

5.7 Let $d$ be an integer that is coprime to 10. Prove that the decimal expansion of $1/d$ has period equal to the order of 10 modulo $d$. (Hint: For every positive integer $r$, we have $\frac{1}{10^r - 1} = \sum_{n \ge 1} 10^{-rn}$.)

5.8 Find a positive integer that has at least three different representations as the sum of two squares, disregarding signs and the order of the summands.

5.9 Show that if a natural number $n$ is the sum of two rational squares it is also the sum of two integer squares.

5.10 (*) Let $p$ be an odd prime. Show that $p \equiv 1, 3 \pmod 8$ if and only if $p$ can be written as $p = x^2 + 2y^2$ for some choice of integers $x$ and $y$.

5.11 Prove that of any four consecutive integers, at least one is not representable as a sum of two squares.


6 Elliptic Curves

We introduce elliptic curves and describe how to put a group structure on the set of points on an elliptic curve. We then apply elliptic curves to two cryptographic problems: factoring integers and constructing public-key cryptosystems. Elliptic curves are believed to provide good security with smaller key sizes, something that is very useful in many applications, e.g., if we are going to print an encryption key on a postage stamp, it is helpful if the key is short! Finally, we consider elliptic curves over the rational numbers, and briefly survey some of the key ways in which they arise in number theory.

6.1 The Definition

Definition (Elliptic Curve). An elliptic curve over a field $K$ is a curve defined by an equation of the form $y^2 = x^3 + ax + b$, where $a, b \in K$ and $-16(4a^3 + 27b^2) \ne 0$.

The condition that $-16(4a^3 + 27b^2) \ne 0$ implies that the curve has no singular points, which will be essential for the applications we have in mind (see Exercise 6.1).
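The nonvanishing condition in the definition is easy to test over a prime field. The following is a small sketch; the function names `discriminant` and `is_elliptic_curve` are illustrative and $p$ is assumed to be prime, so that $\mathbb{Z}/p\mathbb{Z}$ is a field.

```python
def discriminant(a, b, p):
    """The quantity -16(4a^3 + 27b^2), reduced modulo the prime p."""
    return (-16 * (4 * a**3 + 27 * b**2)) % p


def is_elliptic_curve(a, b, p):
    """True if y^2 = x^3 + a*x + b satisfies the definition over Z/pZ."""
    return discriminant(a, b, p) != 0
```

For example, `is_elliptic_curve(1, 0, 7)` accepts the curve $y^2 = x^3 + x$ over $\mathbb{Z}/7\mathbb{Z}$, while `is_elliptic_curve(0, 0, 7)` rejects the singular cubic $y^2 = x^3$.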

FIGURE 6.1. The Elliptic Curve $y^2 = x^3 + x$ over $\mathbb{Z}/7\mathbb{Z}$

In Section 6.2 we will put a natural abelian group structure on the set

$E(K) = \{(x, y) \in K \times K : y^2 = x^3 + ax + b\} \cup \{\mathcal{O}\}$

of $K$-rational points on an elliptic curve $E$ over $K$. Here $\mathcal{O}$ may be thought of as a point on $E$ at infinity. In Figure 6.1 we graph $y^2 = x^3 + x$ over the finite field $\mathbb{Z}/7\mathbb{Z}$, and in Figure 6.2 we graph $y^2 = x^3 + x$ over the field $K = \mathbb{R}$ of real numbers.

Remark. If $K$ has characteristic 2 (e.g., $K = \mathbb{Z}/2\mathbb{Z}$), then for any choice of $a, b$, the quantity $-16(4a^3 + 27b^2) \in K$ is 0, so according to the definition above there are no elliptic curves over $K$. There is a similar problem in characteristic 3. If we instead consider equations of the form

$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$,

we obtain a more general definition of elliptic curves, which correctly allows for elliptic curves in characteristic 2 and 3; these elliptic curves are popular in cryptography because arithmetic on them is often easier to efficiently implement on a computer.

6.2 The Group Structure on an Elliptic Curve

Let $E$ be an elliptic curve over a field $K$, given by an equation $y^2 = x^3 + ax + b$. We begin by defining a binary operation $+$ on $E(K)$.

Algorithm (Elliptic Curve Group Law). Given $P_1, P_2 \in E(K)$, this algorithm computes a third point $R = P_1 + P_2 \in E(K)$.

FIGURE 6.2. The Elliptic Curve $y^2 = x^3 + x$ over $\mathbb{R}$

1. [Is $P_i = \mathcal{O}$?] If $P_1 = \mathcal{O}$ set $R = P_2$ or if $P_2 = \mathcal{O}$ set $R = P_1$ and terminate. Otherwise write $(x_i, y_i) = P_i$.
2. [Negatives] If $x_1 = x_2$ and $y_1 = -y_2$, set $R = \mathcal{O}$ and terminate.
3. [Compute $\lambda$] Set $\lambda = (3x_1^2 + a)/(2y_1)$ if $P_1 = P_2$, and $\lambda = (y_1 - y_2)/(x_1 - x_2)$ otherwise.
4. [Compute Sum] Then $R = (\lambda^2 - x_1 - x_2, \; -\lambda x_3 - \nu)$, where $\nu = y_1 - \lambda x_1$ and $x_3 = \lambda^2 - x_1 - x_2$ is the $x$-coordinate of $R$.

Note that in Step 3 if $P_1 = P_2$, then $y_1 \ne 0$; otherwise, we would have terminated in the previous step. We implement this algorithm in a later section.

Theorem. The binary operation $+$ defined above endows the set $E(K)$ with an abelian group structure, in which $\mathcal{O}$ is the identity element.

Before discussing why the theorem is true, we reinterpret $+$ geometrically, so that it will be easier for us to visualize. We obtain the sum $P_1 + P_2$ by finding the third point $P_3$ of intersection between $E$ and the line $L$ determined by $P_1$ and $P_2$, then reflecting $P_3$ about the $x$-axis. (This description requires suitable interpretation in cases 1 and 2, and when $P_1 = P_2$.) This is illustrated in Figure 6.3, in which $(0, 2) + (1, 0) = (3, 4)$

on $y^2 = x^3 - 5x + 4$. To further clarify this geometric interpretation, we prove the following proposition.

Proposition (Geometric group law). Suppose $P_i = (x_i, y_i)$, $i = 1, 2$ are distinct points on an elliptic curve $y^2 = x^3 + ax + b$, and that $x_1 \ne x_2$. Let $L$ be the unique line through $P_1$ and $P_2$. Then $L$ intersects the graph of $E$ at exactly one other point

$Q = (\lambda^2 - x_1 - x_2, \; \lambda x_3 + \nu)$,

where $\lambda = (y_1 - y_2)/(x_1 - x_2)$ and $\nu = y_1 - \lambda x_1$.

Proof. The line $L$ through $P_1$, $P_2$ is $y = y_1 + (x - x_1)\lambda$. Substituting this into $y^2 = x^3 + ax + b$ we get

$(y_1 + (x - x_1)\lambda)^2 = x^3 + ax + b$.

Simplifying we get $f(x) = x^3 - \lambda^2 x^2 + \cdots = 0$, where we omit the coefficients of $x$ and the constant term since they will not be needed. Since $P_1$ and $P_2$ are in $L \cap E$, the polynomial $f$ has $x_1$ and $x_2$ as roots. By Proposition 2.5.2, the polynomial $f$ can have at most three roots. Writing $f = \prod (x - x_i)$ and equating terms, we see that $x_1 + x_2 + x_3 = \lambda^2$. Thus $x_3 = \lambda^2 - x_1 - x_2$, as claimed. Also, from the equation for $L$ we see that $y_3 = y_1 + (x_3 - x_1)\lambda = \lambda x_3 + \nu$, which completes the proof.

To prove the theorem means to show that $+$ satisfies the three axioms of an abelian group with $\mathcal{O}$ as identity element: existence of inverses, commutativity, and associativity. The existence of inverses follows immediately from the definition, since $(x, y) + (x, -y) = \mathcal{O}$. Commutativity is also clear from the definition of the group law, since in parts 1 to 3, the recipe is unchanged if we swap $P_1$ and $P_2$; in part 4 swapping $P_1$ and $P_2$ does not change the line determined by $P_1$ and $P_2$, so by the proposition above it does not change the sum $P_1 + P_2$.

It is more difficult to prove that $+$ satisfies the associative axiom, i.e., that $(P_1 + P_2) + P_3 = P_1 + (P_2 + P_3)$. This fact can be understood from at least three points of view. One is to reinterpret the group law geometrically (extending the proposition above to all cases), and thus transfer the problem to a question in plane geometry.
This approach is beautifully explained with exactly the right level of detail in [ST92, I.2]. Another approach is to use the formulas that define + to reduce associativity to checking specific algebraic identities; this is something that would be extremely tedious to do by hand, but can be done using a computer (also tedious). A third approach (see e.g. [Sil86] or [Har77]) is to develop a general theory of divisors on algebraic curves, from which associativity of the group law falls out as a natural corollary. The third approach is the best, because it opens up many new vistas; however we will not pursue it further because it is beyond the scope of this book.
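The four steps of the group law algorithm translate almost line for line into code. The sketch below works over $\mathbb{Z}/p\mathbb{Z}$ for a prime $p$; it is an illustration rather than the book's own implementation, the names `ec_add` and `ec_mul` are hypothetical, `None` stands in for the point $\mathcal{O}$, and points are assumed to be given with coordinates already reduced modulo $p$. The three-argument `pow` with exponent `-1` needs Python 3.8 or later.

```python
def ec_add(P, Q, a, p):
    """Add two points of y^2 = x^3 + a*x + b over Z/pZ (p prime)."""
    # Step 1: is one of the points O?
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    # Step 2: negatives add up to O.
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    # Step 3: slope of the tangent (P == Q) or of the chord (P != Q).
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y1 - y2) * pow(x1 - x2, -1, p) % p
    # Step 4: third intersection point, reflected about the x-axis.
    x3 = (lam * lam - x1 - x2) % p
    nu = (y1 - lam * x1) % p
    return (x3, (-lam * x3 - nu) % p)


def ec_mul(n, P, a, p):
    """Compute nP for n >= 0 by repeated doubling."""
    R = None
    while n > 0:
        if n & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        n >>= 1
    return R
```

Reducing the curve of Figure 6.3 modulo a prime such as 101, `ec_add((0, 2), (1, 0), -5, 101)` reproduces the sum $(3, 4)$ computed in the text. Note that the coefficient $b$ never enters the formulas; it only determines which points lie on the curve.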


FIGURE 6.3. The Group Law: $(1, 0) + (0, 2) = (3, 4)$ on $y^2 = x^3 - 5x + 4$

6.3 Integer Factorization Using Elliptic Curves

In 1987, Hendrik Lenstra published the landmark paper [Len87] that introduces and analyzes the Elliptic Curve Method (ECM), which is a powerful algorithm for factoring integers using elliptic curves. Lenstra's method is also described in [ST92, IV.4], [Dav99, VIII.5], and [Coh93, 10.3].

Lenstra's algorithm is well suited for finding medium sized factors of an integer $N$, which today means 10 to 20 decimal digits. ECM is not directly used for factoring RSA challenge numbers (see Section 1.1.3), but it is used on auxiliary numbers as a crucial step in the number field sieve, which is the best known algorithm for hunting for such factorizations. Also, implementation of ECM typically requires little memory.

6.3.1 Pollard's $(p - 1)$-Method

Lenstra's discovery of ECM was inspired by Pollard's $(p - 1)$-method, which we describe in this section.

Definition (Power smooth). Let $B$ be a positive integer. If $n$ is a positive integer with prime factorization $n = \prod p_i^{e_i}$, then $n$ is $B$-power smooth if $p_i^{e_i} \le B$ for all $i$.

Thus $30 = 2 \cdot 3 \cdot 5$ is $B$-power smooth for $B = 5, 7$, but $150 = 2 \cdot 3 \cdot 5^2$ is not 5-power smooth (it is $B = 25$-power smooth).

We will use the following algorithm in both the Pollard $p - 1$ and elliptic curve factorization methods.

Algorithm (Least Common Multiple of First $B$ Integers). Given a positive integer $B$, this algorithm computes the least common multiple of the positive integers up to $B$.
1. [Sieve] Using, e.g., the Sieve of Eratosthenes (Algorithm 1.2.3), compute a list $P$ of all primes $p \le B$.
2. [Multiply] Compute and output the product $\prod_{p \in P} p^{\lfloor \log_p(B) \rfloor}$.

Proof. Let $m = \operatorname{lcm}(1, 2, \ldots, B)$. Then

$\operatorname{ord}_p(m) = \max(\{\operatorname{ord}_p(n) : 1 \le n \le B\}) = \operatorname{ord}_p(p^r)$,

where $p^r$ is the largest power of $p$ that satisfies $p^r \le B$. Since $p^r \le B < p^{r+1}$, we have $r = \lfloor \log_p(B) \rfloor$.

We implement this algorithm in a later section.

Let $N$ be a positive integer that we wish to factor. We use the Pollard $(p - 1)$-method to look for a nontrivial factor of $N$ as follows. First we choose a positive integer $B$, usually with at most six digits. Suppose that there is a prime divisor $p$ of $N$ such that $p - 1$ is $B$-power smooth. We try to find $p$ using the following strategy. If $a > 1$ is an integer not divisible by $p$ then by Fermat's little theorem,

$a^{p-1} \equiv 1 \pmod p$.

Let $m = \operatorname{lcm}(1, 2, 3, \ldots, B)$, and observe that our assumption that $p - 1$ is $B$-power smooth implies that $p - 1 \mid m$, so

$a^m \equiv 1 \pmod p$.

Thus $p \mid \gcd(a^m - 1, N) > 1$. If $\gcd(a^m - 1, N) < N$ also then $\gcd(a^m - 1, N)$ is a nontrivial factor of $N$. If $\gcd(a^m - 1, N) = N$, then $a^m \equiv 1 \pmod{q^r}$ for every prime power divisor $q^r$ of $N$. In this case, repeat the above steps but with a smaller choice of $B$ or possibly a different choice of $a$. Also, it is a good idea to check from the start whether or not $N$ is a perfect power $M^r$, and if so replace $N$ by $M$. We formalize the algorithm as follows:

Algorithm (Pollard $p - 1$ Method). Given a positive integer $N$ and a bound $B$, this algorithm attempts to find a nontrivial factor $g$ of $N$. (Each prime $p \mid g$ is likely to have the property that $p - 1$ is $B$-power smooth.)
1. [Compute lcm] Use the least common multiple algorithm above to compute $m = \operatorname{lcm}(1, 2, \ldots, B)$.
2. [Initialize] Set $a = 2$.
3. [Power and gcd] Compute $x = a^m - 1 \pmod N$ and $g = \gcd(x, N)$.
4. [Finished?] If $g \ne 1$ or $N$, output $g$ and terminate.
5. [Try Again?] If $a < 10$ (say), replace $a$ by $a + 1$ and go to step 3. Otherwise terminate.

We implement this algorithm in a later section.

For fixed $B$, the algorithm often splits $N$ when $N$ is divisible by a prime $p$ such that $p - 1$ is $B$-power smooth. Approximately 15% of primes p in the interval from and are such that $p - 1$ is $10^6$-power smooth, so the Pollard method with $B = 10^6$ already fails nearly 85% of the time at finding 15-digit primes in this range (see also Exercise 7.14). We will not analyze Pollard's method further, since it was mentioned here only to set the stage for the elliptic curve factorization method.

The following examples illustrate the Pollard $(p - 1)$-method.

Example. In this example, Pollard works perfectly. Let $N = 5917$. We try to use the Pollard $p - 1$ method with $B = 5$ to split $N$. We have $m = \operatorname{lcm}(1, 2, 3, 4, 5) = 60$; taking $a = 2$ we have $2^{60} - 1 \equiv 3416 \pmod{5917}$ and $\gcd(2^{60} - 1, 5917) = \gcd(3416, 5917) = 61$, so 61 is a factor of 5917.

Example. In this example, we replace $B$ by a larger integer. Let N = With $B = 5$ and $a = 2$ we have $\gcd(2^{60} - 1, N) = 1$. With $B = 15$, we have $m = \operatorname{lcm}(1, 2, \ldots, 15) = 360360$ and $\gcd(2^{360360} - 1, N) = 2003$, so 2003 is a nontrivial factor of $N$.
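The least common multiple algorithm and the Pollard $p - 1$ algorithm can be sketched together in a few lines. This is a minimal illustration, not the implementation the book refers to: the names `lcm_upto` and `pollard` and the parameter `max_a` are assumptions, with `max_a = 10` mirroring the "a < 10 (say)" cutoff in step 5.

```python
from math import gcd


def lcm_upto(B):
    """lcm(1, 2, ..., B) as the product of the largest prime powers <= B."""
    sieve = [True] * (B + 1)
    m = 1
    for p in range(2, B + 1):
        if sieve[p]:  # p is prime
            for multiple in range(p * p, B + 1, p):
                sieve[multiple] = False
            q = p
            while q * p <= B:  # largest power of p not exceeding B
                q *= p
            m *= q
    return m


def pollard(N, B, max_a=10):
    """Try a = 2, 3, ..., max_a; return a nontrivial factor of N or None."""
    m = lcm_upto(B)
    for a in range(2, max_a + 1):
        g = gcd(pow(a, m, N) - 1, N)
        if g != 1 and g != N:
            return g
    return None
```

Using `pow(a, m, N)` keeps every intermediate value below $N$, which matters because $m$ grows roughly like $e^B$. With these definitions, `pollard(5917, 5)` returns the factor 61 found in the first example above.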

Example. In this example, we replace $B$ by a smaller integer. Let $N = 4331$. Suppose $B = 7$, so $m = \operatorname{lcm}(1, 2, \ldots, 7) = 420$, $2^{420} - 1 \equiv 0 \pmod{4331}$, and $\gcd(2^{420} - 1, 4331) = 4331$, so we do not obtain a factor of 4331. If we replace $B$ by 5, Pollard's method works: $\gcd(2^{60} - 1, 4331) = 61$, so we split 4331.

Example. In this example, $a = 2$ does not work, but $a = 3$ does. Let $N = 187$. Suppose $B = 15$, so $m = \operatorname{lcm}(1, 2, \ldots, 15) = 360360$, $2^{360360} - 1 \equiv 0 \pmod{187}$, and $\gcd(2^{360360} - 1, 187) = 187$, so we do not obtain a factor of 187. If we replace $a = 2$ by $a = 3$, then Pollard's method works: $\gcd(3^{360360} - 1, 187) = 11$. Thus $187 = 11 \cdot 17$.

6.3.2 Motivation for the Elliptic Curve Method

Fix a positive integer $B$. If $N = pq$ with $p$ and $q$ prime and $p - 1$ and $q - 1$ are not $B$-power smooth, then the Pollard $(p - 1)$-method is unlikely to work. For example, let $B = 20$ and suppose that $N = 59 \cdot 101 = 5959$. Note that neither $59 - 1 = 2 \cdot 29$ nor $101 - 1 = 4 \cdot 25$ is $B$-power smooth. With $m = \operatorname{lcm}(1, 2, 3, \ldots, 20) = 232792560$, we have $\gcd(2^m - 1, N) = 1$, so we do not find a factor of $N$.

As remarked above, the problem is that $p - 1$ is not 20-power smooth for either $p = 59$ or $p = 101$. However, notice that $p - 2 = 3 \cdot 19$ is 20-power smooth for $p = 59$. Lenstra's ECM replaces $(\mathbb{Z}/p\mathbb{Z})^*$, which has order $p - 1$, by the group of points on an elliptic curve $E$ over $\mathbb{Z}/p\mathbb{Z}$. It is a theorem that

$\#E(\mathbb{Z}/p\mathbb{Z}) = p + 1 \pm s$

for some nonnegative integer $s < 2\sqrt{p}$ (see e.g., [Sil86, V.1] for a proof). (Also every value of $s$ subject to this bound occurs, as one can see using complex multiplication theory.) For example, if $E$ is the elliptic curve $y^2 = x^3 + x + 54$ over $\mathbb{Z}/59\mathbb{Z}$ then by enumerating points one sees that $E(\mathbb{Z}/59\mathbb{Z})$ is cyclic of order 57. The set of numbers $59 + 1 \pm s$ for $s \le 15$ contains 14 numbers that are $B$-power smooth for $B = 20$ (see Exercise 7.14). Thus working with an elliptic curve gives us more flexibility. For example, $60 = 2^2 \cdot 3 \cdot 5$ is 5-power smooth and $70 = 2 \cdot 5 \cdot 7$ is 7-power smooth.

FIGURE 6.4. Hendrik Lenstra

6.3.3 Lenstra's Elliptic Curve Factorization Method

Algorithm 6.3.8 (Elliptic Curve Factorization Method).

Given a positive integer $N$ and a bound $B$, this algorithm attempts to find a nontrivial factor of $N$. Carry out the following steps:
1. [Compute lcm] Use the least common multiple algorithm given earlier to compute $m = \operatorname{lcm}(1, 2, \ldots, B)$.
2. [Choose Random Elliptic Curve] Choose a random $a \in \mathbb{Z}/N\mathbb{Z}$ such that $4a^3 + 27 \in (\mathbb{Z}/N\mathbb{Z})^*$. Then $P = (0, 1)$ is a point on the elliptic curve $y^2 = x^3 + ax + 1$ over $\mathbb{Z}/N\mathbb{Z}$.
3. [Compute Multiple] Attempt to compute $mP$ using an elliptic curve analogue of the repeated-squaring algorithm for computing powers. If at some point we cannot compute a sum of points because some denominator in step 3 of the group law algorithm is not coprime to $N$, we compute the gcd of this denominator with $N$. If this gcd is a nontrivial divisor, output it. If every denominator is coprime to $N$, output "Fail".

We implement this algorithm in a later section.

If the algorithm fails for one random elliptic curve, there is an option that is unavailable with Pollard's $(p - 1)$-method: we may repeat the above algorithm with a different elliptic curve. With Pollard's method we always work with the group $(\mathbb{Z}/N\mathbb{Z})^*$, but here we can try many groups $E(\mathbb{Z}/N\mathbb{Z})$ for many curves $E$. As mentioned above, the number of points on $E$ over $\mathbb{Z}/p\mathbb{Z}$ is of the form $p + 1 - t$ for some $t$ with $|t| < 2\sqrt{p}$; the algorithm thus has a chance if $p + 1 - t$ is $B$-power smooth for some $t$ with $|t| < 2\sqrt{p}$.

6.3.4 Examples

For simplicity, we use an elliptic curve of the form $y^2 = x^3 + ax + 1$, which has the point $P = (0, 1)$ already on it.

We factor $N = 5959$ using the elliptic curve method. Let $m = \operatorname{lcm}(1, 2, \ldots, 20) = 232792560 = 1101111000000010000111110000_2$,

where $x_2$ means $x$ is written in binary. First we choose $a = 1201$ at random and consider $y^2 = x^3 + 1201x + 1$ over $\mathbb{Z}/5959\mathbb{Z}$. Using the formula for $P + P$ from the group law algorithm implemented on a computer (see Section 7.6) we compute $2^i P = 2^i (0, 1)$ for $i \in B = \{4, 5, 6, 7, 8, 13, 21, 22, 23, 24, 26, 27\}$. Then $\sum_{i \in B} 2^i P = mP$. It turns out that during no step of this computation does a number not coprime to 5959 appear in any denominator, so we do not split $N$ using $a = 1201$. Next we try $a = 389$ and at some stage in the computation we add $P = (2051, 5273)$ and $Q = (637, 1292)$. When computing the group law explicitly we try to compute $\lambda = (y_1 - y_2)/(x_1 - x_2)$ in $(\mathbb{Z}/5959\mathbb{Z})^*$, but fail since $x_1 - x_2 = 1414$ and $\gcd(1414, 5959) = 101$. We thus find a nontrivial factor 101 of 5959.

For bigger examples and an implementation of the algorithm, see Section

6.3.5 A Heuristic Explanation

Let $N$ be a positive integer and for simplicity of exposition assume that $N = p_1 \cdots p_r$ with the $p_i$ distinct primes. It follows from the Chinese Remainder Theorem that there is a natural isomorphism

$f : (\mathbb{Z}/N\mathbb{Z})^* \to (\mathbb{Z}/p_1\mathbb{Z})^* \times \cdots \times (\mathbb{Z}/p_r\mathbb{Z})^*$.

When using Pollard's method, we choose an $a \in (\mathbb{Z}/N\mathbb{Z})^*$, compute $a^m$, then compute $\gcd(a^m - 1, N)$. This gcd is divisible exactly by the primes $p_i$ such that $a^m \equiv 1 \pmod{p_i}$. To reinterpret Pollard's method using the above isomorphism, let $(a_1, \ldots, a_r) = f(a)$. Then $(a_1^m, \ldots, a_r^m) = f(a^m)$, and the $p_i$ that divide $\gcd(a^m - 1, N)$ are exactly the $p_i$ such that $a_i^m = 1$. By Fermat's little theorem, these $p_i$ include the primes $p_j$ such that $p_j - 1$ is $B$-power smooth, where $m = \operatorname{lcm}(1, \ldots, B)$.

We will not define $E(\mathbb{Z}/N\mathbb{Z})$ when $N$ is composite, since this is not needed for the algorithm (where we assume that $N$ is prime and hope for a contradiction). However, for the remainder of this paragraph, we pretend that $E(\mathbb{Z}/N\mathbb{Z})$ is meaningful and describe a heuristic connection between Lenstra's and Pollard's methods.
The significant difference between Pollard's method and the elliptic curve method is that the isomorphism $f$ is replaced by an "isomorphism" (in quotes)

$g : E(\mathbb{Z}/N\mathbb{Z}) \to E(\mathbb{Z}/p_1\mathbb{Z}) \times \cdots \times E(\mathbb{Z}/p_r\mathbb{Z})$,

where $E$ is $y^2 = x^3 + ax + 1$, and the $a$ of Pollard's method is replaced by $P = (0, 1)$. We put the isomorphism in quotes to emphasize that we have not defined $E(\mathbb{Z}/N\mathbb{Z})$. When carrying out the elliptic curve factorization algorithm, we attempt to compute $mP$, and if some components of $g(Q)$ are $\mathcal{O}$, for some point $Q$ that appears during the computation, but others are nonzero, we find a nontrivial factor of $N$.
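One trial of the elliptic curve method, for a single choice of $a$, can be sketched as follows. This is an illustration under stated assumptions rather than the implementation the text refers to: the names `NonInvertible`, `inverse_mod`, `add`, `multiply`, and `ecm_trial` are hypothetical, `None` stands in for $\mathcal{O}$, and the multiple $mP$ is computed by double-and-add, so the intermediate points (and hence which factor, if any, turns up for a given $a$) may differ from the hand computation in the example above.

```python
from math import gcd


class NonInvertible(Exception):
    """Raised when a denominator is not coprime to N."""

    def __init__(self, divisor):
        super().__init__(divisor)
        self.divisor = divisor


def inverse_mod(d, N):
    g = gcd(d % N, N)
    if g != 1:
        raise NonInvertible(g)
    return pow(d, -1, N)


def add(P, Q, a, N):
    """Group law formulas on y^2 = x^3 + a*x + 1, carried out modulo N."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % N == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * inverse_mod(2 * y1, N) % N
    else:
        lam = (y1 - y2) * inverse_mod(x1 - x2, N) % N
    x3 = (lam * lam - x1 - x2) % N
    return (x3, (-lam * x3 - (y1 - lam * x1)) % N)


def multiply(m, P, a, N):
    """Double-and-add computation of mP; may raise NonInvertible."""
    R = None
    while m > 0:
        if m & 1:
            R = add(R, P, a, N)
        P = add(P, P, a, N)
        m >>= 1
    return R


def ecm_trial(N, a, B):
    """Return a nontrivial factor of N, or None if this curve fails."""
    g = gcd(4 * a**3 + 27, N)
    if g != 1:  # the curve is not allowed; g may already split N
        return g if g < N else None
    m = 1
    for k in range(2, B + 1):  # m = lcm(1, ..., B)
        m = m * k // gcd(m, k)
    try:
        multiply(m, (0, 1), a, N)
    except NonInvertible as err:
        return err.divisor if err.divisor < N else None
    return None
```

A caller would typically loop over several values of $a$, for example `next(f for a in range(1, 100) if (f := ecm_trial(5959, a, 20)))`, stopping at the first curve that yields a factor.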

6.4 Elliptic Curve Cryptography

In this section we discuss an analogue of Diffie-Hellman that uses an elliptic curve instead of $(\mathbb{Z}/p\mathbb{Z})^*$. The idea to use elliptic curves in cryptography was independently proposed by Neil Koblitz and Victor Miller in the mid 1980s. We then discuss the ElGamal elliptic curve cryptosystem.

6.4.1 Elliptic Curve Analogues of Diffie-Hellman

The Diffie-Hellman key exchange from Section 3.1 works well on an elliptic curve with no serious modification. Michael and Nikita agree on a secret key as follows:
1. Michael and Nikita agree on a prime $p$, an elliptic curve $E$ over $\mathbb{Z}/p\mathbb{Z}$, and a point $P \in E(\mathbb{Z}/p\mathbb{Z})$.
2. Michael secretly chooses a random $m$ and sends $mP$.
3. Nikita secretly chooses a random $n$ and sends $nP$.
4. The secret key is $nmP$, which both Michael and Nikita can compute.

Presumably, an adversary cannot compute $nmP$ without solving the discrete logarithm problem (see Problem and Section below) in $E(\mathbb{Z}/p\mathbb{Z})$. For well-chosen $E$, $P$, and $p$ experience suggests that the discrete logarithm problem in $E(\mathbb{Z}/p\mathbb{Z})$ is much more difficult than the discrete logarithm problem in $(\mathbb{Z}/p\mathbb{Z})^*$ (see Section for more on the elliptic curve discrete log problem).

6.4.2 The ElGamal Cryptosystem and Digital Rights Management

This section is about the ElGamal cryptosystem, which works well on an elliptic curve. This section draws on a paper by a computer hacker named Beale Screamer who cracked a Digital Rights Management (DRM) system.

The elliptic curve used in the DRM is an elliptic curve over the finite field $k = \mathbb{Z}/p\mathbb{Z}$, where p = In base 16 the number p is 89ABCDEF F7, which includes counting in hexadecimal, and digits of $e$, $\pi$, and $\sqrt{2}$. The elliptic curve E is y 2 = x x

We have #E(k) = , and the group $E(k)$ is cyclic with generator B = ( , ).

Our heroes Nikita and Michael share digital music when they are not out fighting terrorists. When Nikita installed the DRM software on her computer, it generated a private key n = , which it hides in bits and pieces of files. In order for Nikita to play Juno Reactor's latest hit juno.wma, her web browser contacts a web site that sells music. After Nikita sends her credit card number, that web site allows Nikita to download a license file that allows her audio player to unlock and play juno.wma.

As we will see below, the license file was created using the ElGamal public-key cryptosystem in the group $E(k)$. Nikita can now use her license file to unlock juno.wma. However, when she shares both juno.wma and the license file with Michael, he is frustrated because even with the license his computer still does not play juno.wma. This is because Michael's computer does not know Nikita's computer's private key (the integer $n$ above), so Michael's computer cannot decrypt the license file.

We now describe the ElGamal cryptosystem, which lends itself well to implementation in the group $E(\mathbb{Z}/p\mathbb{Z})$. To illustrate ElGamal, we describe how Nikita would set up an ElGamal cryptosystem that anyone could use to encrypt messages for her. Nikita chooses a prime $p$, an elliptic curve $E$ over $\mathbb{Z}/p\mathbb{Z}$, and a point $B \in E(\mathbb{Z}/p\mathbb{Z})$, and publishes $p$, $E$, and $B$. She also chooses a random integer $n$, which she keeps secret, and publishes $nB$. Her public key is the four-tuple $(p, E, B, nB)$.

Suppose Michael wishes to encrypt a message for Nikita. If the message is encoded as an element $P \in E(\mathbb{Z}/p\mathbb{Z})$, Michael computes a random integer $r$

and the points $rB$ and $P + r(nB)$ on $E(\mathbb{Z}/p\mathbb{Z})$. Then $P$ is encrypted as the pair $(rB, P + r(nB))$. To decrypt the encrypted message, Nikita multiplies $rB$ by her secret key $n$ to find $n(rB) = r(nB)$, then subtracts this from $P + r(nB)$ to obtain

$P = P + r(nB) - r(nB)$.

We implement this cryptosystem in a later section.

Remark. It also makes sense to construct an ElGamal cryptosystem in the group $(\mathbb{Z}/p\mathbb{Z})^*$.

Returning to our story, Nikita's license file is an encrypted message to her. It contains the pair of points $(rB, P + r(nB))$, where rB = ( , ) and P + r(nB) = ( , ). When Nikita's computer plays juno.wma, it loads the secret key n = into memory and computes n(rB) = ( , ). It then subtracts this from $P + r(nB)$ to obtain P = ( , ). The $x$-coordinate is the key that unlocks juno.wma.

If Nikita knew the private key $n$ that her computer generated, she could compute $P$ herself and unlock juno.wma and share her music with Michael. Beale Screamer found a weakness in the implementation of this system that allows Nikita to determine $n$, which is not a huge surprise since $n$ is stored on her computer after all.

6.4.3 The Elliptic Curve Discrete Logarithm Problem

Problem (Elliptic Curve Discrete Log Problem). Suppose $E$ is an elliptic curve over $\mathbb{Z}/p\mathbb{Z}$ and $P \in E(\mathbb{Z}/p\mathbb{Z})$. Given a multiple $Q$ of $P$, the elliptic curve discrete log problem is to find $n \in \mathbb{Z}$ such that $nP = Q$.

For example, let $E$ be the elliptic curve given by $y^2 = x^3 + x + 1$ over the field $\mathbb{Z}/7\mathbb{Z}$. We have

$E(\mathbb{Z}/7\mathbb{Z}) = \{\mathcal{O}, (2, 2), (0, 1), (0, 6), (2, 5)\}$.

If $P = (2, 2)$ and $Q = (0, 6)$, then $3P = Q$, so $n = 3$ is a solution to the discrete logarithm problem.

If $E(\mathbb{Z}/p\mathbb{Z})$ has order $p$ or $p \pm 1$ or is a product of reasonably small primes, then there are some methods for attacking the discrete log problem on $E$, which are beyond the scope of this book. It is thus important to be able to compute $\#E(\mathbb{Z}/p\mathbb{Z})$ efficiently, in order to verify that the elliptic curve one wishes to use for a cryptosystem doesn't have any obvious vulnerabilities. The naive algorithm to compute $\#E(\mathbb{Z}/p\mathbb{Z})$ is to try each value of $x \in \mathbb{Z}/p\mathbb{Z}$ and count how often $x^3 + ax + b$ is a perfect square mod $p$, but this is of no use when $p$ is large enough to be useful for cryptography. Fortunately, there is an algorithm due to Schoof, Elkies, and Atkin for computing $\#E(\mathbb{Z}/p\mathbb{Z})$ efficiently (polynomial time in the number of digits of $p$), but this algorithm is beyond the scope of this book.

Earlier we discussed the discrete log problem in $(\mathbb{Z}/p\mathbb{Z})^*$. There are general attacks called "index calculus attacks" on the discrete log problem in $(\mathbb{Z}/p\mathbb{Z})^*$ that are slow, but still faster than the known algorithms for solving the discrete log in a "general" group (one with no extra structure). For most elliptic curves, there is no known analogue of index calculus attacks on the discrete log problem. At present it appears that given $p$ the discrete log problem in $E(\mathbb{Z}/p\mathbb{Z})$ is much harder than the discrete log problem in the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^*$. This suggests that by using an elliptic curve-based cryptosystem instead of one based on $(\mathbb{Z}/p\mathbb{Z})^*$ one gets equivalent security with much smaller numbers, which is one reason why building cryptosystems using elliptic curves is attractive to some cryptographers.
For example, Certicom, a company that strongly supports elliptic curve cryptography, claims: "[Elliptic curve crypto] devices require less storage, less power, less memory, and less bandwidth than other systems. This allows you to implement cryptography in platforms that are constrained, such as wireless devices, handheld computers, smart cards, and thin-clients. It also provides a big win in situations where efficiency is important."

For an up-to-date list of elliptic curve discrete log challenge problems that Certicom sponsors, see [Cer]. For example, in April 2004 a specific cryptosystem was cracked that was based on an elliptic curve over $\mathbb{Z}/p\mathbb{Z}$, where $p$ has 109 bits. The first unsolved challenge problem involves an elliptic curve over $\mathbb{Z}/p\mathbb{Z}$, where $p$ has 131 bits, and the next challenge after that is one in which $p$ has 163 bits. Certicom claims at [Cer] that the 163-bit challenge problem is computationally infeasible.
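The toy curve $y^2 = x^3 + x + 1$ over $\mathbb{Z}/7\mathbb{Z}$ from this section is small enough to experiment with directly. The sketch below enumerates its points by the naive method, solves the discrete log problem by brute force, and runs one ElGamal round trip as described in this section. All function names are illustrative, `None` stands in for $\mathcal{O}$, and the random integer $r$ is passed in explicitly so the example is reproducible; a group of order 5 offers no security whatsoever.

```python
def add(P, Q, a, p):
    """Group law on y^2 = x^3 + a*x + b over Z/pZ (p prime)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y1 - y2) * pow(x1 - x2, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (-lam * x3 - (y1 - lam * x1)) % p)


def mul(n, P, a, p):
    """nP by repeated doubling."""
    R = None
    while n > 0:
        if n & 1:
            R = add(R, P, a, p)
        P = add(P, P, a, p)
        n >>= 1
    return R


def neg(P, p):
    return None if P is None else (P[0], -P[1] % p)


def affine_points(a, b, p):
    """Naive enumeration of the points other than O."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x**3 + a * x + b)) % p == 0]


def discrete_log(P, Q, a, p):
    """Smallest n >= 1 with nP = Q, found by trying n = 1, 2, 3, ..."""
    R, n = P, 1
    while R != Q:
        if R is None:
            raise ValueError("Q is not a multiple of P")
        R = add(R, P, a, p)
        n += 1
    return n


def elgamal_encrypt(M, r, B, nB, a, p):
    """Encrypt the point M as the pair (rB, M + r(nB))."""
    return (mul(r, B, a, p), add(M, mul(r, nB, a, p), a, p))


def elgamal_decrypt(ciphertext, n, a, p):
    """Recover M by subtracting n(rB) from the second component."""
    rB, masked = ciphertext
    return add(masked, neg(mul(n, rB, a, p), p), a, p)
```

With $B = (2, 2)$ and secret key $n = 3$, the public point is $nB = (0, 6)$, and `elgamal_decrypt(elgamal_encrypt((0, 1), 2, (2, 2), (0, 6), 1, 7), 3, 1, 7)` returns the message point `(0, 1)` again. The brute-force `discrete_log` takes time proportional to the answer, which is exactly why it is useless at cryptographic sizes.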


FIGURE 6.5. Louis J. Mordell

6.5 Elliptic Curves Over the Rational Numbers

Let $E$ be an elliptic curve defined over $\mathbb{Q}$. The following is a deep theorem about the group $E(\mathbb{Q})$.

Theorem (Mordell). The group $E(\mathbb{Q})$ is finitely generated. That is, there are points $P_1, \ldots, P_s \in E(\mathbb{Q})$ such that every element of $E(\mathbb{Q})$ is of the form $n_1 P_1 + \cdots + n_s P_s$ for integers $n_1, \ldots, n_s \in \mathbb{Z}$.

Mordell's theorem implies that it makes sense to ask whether or not we can compute $E(\mathbb{Q})$, where by "compute" we mean find a finite set $P_1, \ldots, P_s$ of points on $E$ that generate $E(\mathbb{Q})$ as an abelian group. There is a systematic approach to computing $E(\mathbb{Q})$ called "descent" (see e.g., [Cre97, Cre, Sil86]). It is widely believed that descent will always succeed, but nobody has yet proved that it does. Proving that descent works for all curves is one of the central open problems in number theory, and is closely related to the Birch and Swinnerton-Dyer conjecture (one of the Clay Math Institute's million dollar prize problems). The crucial difficulty amounts to deciding whether or not certain explicitly given curves have any rational points on them (these are curves that have points over $\mathbb{R}$ and modulo $n$ for all $n$).

The details of using descent to compute $E(\mathbb{Q})$ are beyond the scope of this book. In several places below we will simply assert that $E(\mathbb{Q})$ has a certain structure or is generated by certain elements. In each case, we computed $E(\mathbb{Q})$ using a computer implementation of this method.

6.5.1 The Torsion Subgroup of $E(\mathbb{Q})$ and the Rank

For any abelian group $G$, let $G_{\mathrm{tor}}$ be the subgroup of elements of finite order. If $E$ is an elliptic curve over $\mathbb{Q}$, then $E(\mathbb{Q})_{\mathrm{tor}}$ is a subgroup of $E(\mathbb{Q})$, which must be finite because of Mordell's theorem (see Exercise 6.6).

Elementary Number Theory

A revision by Jim Hefferon, St Michael's College, 2003-Dec, of notes by W. Edwin Clark, University of South Florida, 2002-Dec. LaTeX source compiled on January 5, 2004 by Jim Hefferon,