Pell's Equation and Fundamental Units Pell's equation was first introduced to me in the number theory class at Caltech that I never comleted. It was r

Pell's Equation and Fundamental Units Kaisa Taiale University of Minnesota Summer 000 1

Pell's Equation and Fundamental Units Pell's equation was first introduced to me in the number theory class at Caltech that I never comleted. It was resented in the context of solving iohantine equations - finding integral solutions to equations. It looks easy: find the smallest ositive integer solutions (x; y) to the equation x y = 1, with a non-square ositive integer. But althought there are several algorithms for finding solutions, a attern in the solutions has never been detected. Moreover, values of which one might exect to roduce slightly different solutions can at times vary dramatically. A story might illustrate: One day Iwas sitting haily at my comuter, answering email, when I got a note that said merely, The smallest non-trivial integer solution of x 94y =1isx = 14395 and y = 1064:" This intrigued me, for some reason, and I asked what some other solutions were. Iwas urged to ick anumber for, and I icked 151. Solving these equations using a somewhat-better-than-brute-force method took one minute for = 94 and three hours for = 151, finally resulting in x = 178148040 and y = 140634693. Before the solution was found I started to wonder why itwas taking so long, and why I'd want towait so long to hear those numbers, and whether or not there was any way to estimate the size of solutions so that I'd know whether or not I wanted to embark on any such searches in the future. In resonse, I was handed a aer by Yoshihiko Yamamoto, ublished in 1971. Titled, Real Quadratic Number Fields with Large Fundamental Units," it seemed to have nothing to do with Pell's equation. Uon the asking of some questions, like, What's a fundamental unit?" and some research, the connection began to get clearer. Although at first glance the fundamental unit has little to do with the equation I saw inmy number theory class it turns out that they are intimately connected. The fundamental unit ffl of the ring of algebraic integers in a real quadratic number field is a generator of the grou of units (mod ±1). For the subring Z[ ] (consisting of elements x + y with x; y Z) of the ring of integers in Q( ), ffl is equal to x + y where (x; y) is the smallest integer solution for Pell's equation. So finding bounds for the size of the fundamental unit is extremely useful, as when one knows ffl and one can find bounds for x and y. Yamamoto's aer does not address the general subject of finding bounds for ffl. In his introduction, he remarks uon the fact that there are many works dealing with the estimation of the fundamental unit, but that they rimarily deal with fundamental units with small orders of absolute value in comarison to their discriminants." His aer deals instead with constructing real quadratic number fields with comaratively large fundamental units. The essence of the work is that one can find a ositive constant c 1 such that log ffl>c 1 (log( ) for sufficiently large. After the introduction, the aer is divided into four arts. In the first, Yamamoto outlines the idea of reduced quadratic irrationals. In the second, he links these to reduced ideals, which he also defines. Only in the third section does he directly aroach the toic of bounds for ffl. The fourth consists of examles. In outlining Yamamoto's research I will roughly follow his format, adding exlanation or other materials wherever necessary. The first thing Yamamoto addresses is reduced quadratic irrationals. Begin with a quadratic equation, ax + bx + c = 0, a > 0 and gcd(a; b; c) = 1. Then let ff be a root of this equation and let be the discriminant, equal to b 4ac. Then ff is a real number, the root of a quadratic equation, and irrational, because the here is the same as the of Pell's equation. Thus the name quadratic irrational. There exists, too, a conjugate to ff, denoted ff 0. We call ff reduced if ff>1 and 0 >ff 0 > 1. Every quadratic irrational is equivalent to a reduced quadratic irrational, where equivalence between two quadratic irrationals ff and fi is defined by the relation ff = afi + b cfi + d : Here a; b; c; d are integers and ad bc = ±1. The equivalence of ff and fi means that they have the same discriminant.

In addition to satisfying the conditions rovided above, a quadratic irrational is deemed reduced if its continued fractional exansion is strictly eriodic. The continued fractional exansion is found as follows: ff = ff 1 ff i = a i + 1 ff i+1 Here a i is the largest integer not larger than ff i. If ff N+1 = ff the exansion has eriod N; if this is true for no N then ff is not eriodic. An examle: for = 1, 1 + 3 is a reduced quadratic irrational. Not only do the inequalities hold 1 + 3 > 0 and 0 > 1 3 > 1 but the eriodic exansion of 1 + 3is remarkably easy to find ff 1 =1+ 3=+ 1 1+ 3 ff = 1+ 3 1 =1+ 1+ 3 ff 3 = ff 1 The quadratic irrational numbers can be organized into sets based on the revious information. The most basic division is to grou them based on whether or not they are reduced. We will use A Λ = A Λ () to mean the grou of all quadratic irrationals and A = A() to mean the grou of reduced ones. Beyond this, one can use the continued fractional exansion of ff to divide A into cosets. The continued fractional exansion of ff results in the sequence of numbers ff i (i=1,,..., N), and each of these sequences makes u a coset of A with resect to the equivalence relation. Let A = A 1 [ A [ ::: [ A h be the equivalence class decomosition of A, then the number h of the cosets is equal to the ideal class number of the field F = Q( )if is the discriminant of F." To me, this is a very interesting result. I had never seen even the hrase class number" before reading this aer, and so I decided I should look it u. It turns out that the result just exlained above is similar to avery classic way of finding the class number - John Stillwell exlained class number in the introduction to irichlet's Lectures on Number Theory" as [t]he number of inequivalent forms with a given discriminant." He was talking about binary quadratic forms (ax +bxy + cy = n, n an integer). Equivalent forms are forms for which a certain change of variables (x 0 = ffx + fiy, y 0 = flx + ffiy, ffffi fifl = ±1, ff; fi; fl; ffi all integers) does not change the values a form takes. An examle of two inequivalent forms are those with = 5, x +5y and x +xy +3y. These two exressions do not take the same values. There are many arallels between this way of finding class number and the secific method used above. Most recisely, the class number is the cardinality oftheideal class grou, which is the quotient grou G=S where G is the grou of all non-zero fractional ideals, and S is the subgrou of all non-zero rincial fractional ideals. (These terms are defined later in this aer.) But another, erhas more intuitive, wayof thinking of class number is that it is very roughly the number of ways an algebraic integer can be factored in the ring of algebraic integers in the field Q( ). This brings us back to our real quadratic number field, and we can sto thinking about x, y, and changes of variable. It also makes more aarent the idea of class number as a measure of failure of unique factorization in a ring. This is intriguing idea, esecially for those who are just venturing into the world of number theory and have never thought of the imortance of unique factorization. Even more interesting is a simle relationshi that exists between class number and fundamental unit: hffl has a constant value. Yamamoto does not use this result. It is startling enough to this author, though, that it may beanavenue of further inquiry. But back towork. From ff we can also find ffl. It turns out that for ff A, ff = aff + b cff + d Then ad bc =( 1) N, where N is the eriod of the continued fractional exansion of ff, and ffl = cff + d. There are several other ways to find ffl from ff, though. Proosition 1. in Yamamoto's aer says, 3

Ifisequal to the discriminant of a real quadratic number field F and ffl is the fundamental unit of F, then Y ffffla i ff = ffl for any equivalence class A i (i =1; ; :::; h). Corollary 1.3 says that in addition to that, Q ffffla ff = fflh. The corollary follows fairly easily from the roosition, and both can be roven by inductive reasoning. The next section in Yamamoto's aer addresses the relationshi between reduced quadratic irrationals and what will be called reduced ideals. Although the first section was mathematically accessible using only high-school algebra, the second section required me to introduce myself to abstract algebra, as well as a few more things. The first thing Yamamoto brings u is!. Let! = +, and notice that 1 and! make u a basis of the ring R of all algebraic integers in F, where F = Q( ) is the real quadratic number field F with discriminant. Then every integral ideal I has the form [a; b + c!] with these conditions: a; b; c Z a>0;c>0;ac= N(R) a = b = 0 mod c and N(b + c!) =0modac a <b+ c! 0 < 0 where! 0 is the conjugate of!. Then ff=ff(i) = b+c! a. We call ff the quadratic irrational associated with the ideal I, and I is reduced if c=1 and ff(i) is a reduced quadratic irrational. A brief review of terms: An ideal I of a ring R is a subset of R for which, if r R, ri = fraja Ig and Ir = farja Ig. In addition, I is closed under multilication by elements from R. In other words, if any element in a ring is multilied by an element of one of the ring's ideals, their roduct is an element of the ideal. A quick examle is the ideal generated by in the ring of integers: multily any integer by two and it's obvious that the roduct is a member of the set of even integers. Yamamoto refers to integral ideals when seaking of these subsets; this is to distinguish them from fractional ideals. Although never exlicitly mentioned, fractional ideals are necessary in the definition of ideal class grou, for instance, so here is a definition: Let o be the ring of algebraic integers in Q( ). Then a fractional ideal of o in Q( ) is a non-zero finitely-generated o-module inside Q( ). (What does that last art mean? An o-module M is finitely generated if there is a finite list m 1 ;:::;m n of elements in M so that M = om 1 + + om n. A module is to a ring as a vector sace is to a field.) An integral ideal, then, is just a fractional ideal contained in o. Yamamoto also writes of rincial ideals rincial simly means that all the elements in the ideal are multiles of the generator. The main roosition of section two of the aer is this (Proosition.1): The ma I! ff(i) gives a bijection of the set of all reduced ideals to the set A = A() of all reduced quadratic irrationals with discriminant. And it induces a bijection of the ideal class grou of F to the set A 1 ;A ; :::; A h of the equivalence classes of A. Of slightly less theoretical imort, but extremely useful in the roof of the theorem the aer is aiming at, is Proosition.: An integral ideal I is reduced if(i)n(i) < and (ii) the conjugate ideal I 0 is relatively rime to I. 4

Once again, a definition of terms: If an ideal has the form [a; b + c!], then its conjugate has the form [a; b + c! 0 ]. For two ideals to be relatively rime means that only roducts of their elements are in their intersection. These two roositions are fairly significant. The technique of associating ideals with numbers is fairly common, and a good tool. Since numbers are often easier to deal with than ideals, a bijection of a set of ideals to a set of numbers can be a good way to turn an algebra roblem into a roblem with numbers rather than ideals. However, in this roof, we'll see that Theorem 3.1 deals with ideals, and that since the bijection exists the result holds for actual numbers. Section three of Yamamoto's aer brings all the revious information together into a roof of the theorem given at the beginning. What follows, in my aer, is essentially a commentary on and exlanation of the roof given in Yamamoto's note. Theorem 3.1. Let i (i = 1; ; :::; n) be rational rimes satisfying 1 < < < n. Assume that there exist infinitely many real quadratic number fields F satisfying the following condition (*): (*) Every i is decomosed inf into the roduct of two rincial ideals m i and m 0 i. Then there exists a ositive constant c 0 deending only on n and 1 ; ; ; n such that log ffl>c 0 (log ) n+1 holds for sufficiently large, where and ffl are the discriminant and the fundamental unit for F. Proof. Consider the ideals I of the form I = ny i=1 m ei i m0fi i Then I is a rincial integral ideal and reduced if(a)n(i) = e1+f1 1 en+fn n and (b) e 1 f 1 = = e n f n =0(Proosition.). < This is not hard to see: Proosition. says that an integral ideal I is reduced if N(I) <, and it is easy to see that (a) follows this exactly. Although slightly less obvious, (b) follows from art (ii) of Pro... Condition (b) makes it necessary that either e i or f i is equal to 0, for any i. This ensures that I and I 0 are always relatively rime to each other. Let I 1 ;I ; ;I t be the set of all reduced ideals obtained as above. Then the quadratic irrationals ff 1 ;ff ; ;ff t associated with them build a subset of the equivalence class A 1, say, corresonding to the rincial ideal class. The rincial ideal class is the collection of fractional ideals which are rincial. Most imortant in that definition is the fact that it is a subgrou of the grou of all fractional ideals. This allows us to understand the next sentence: So we get, from Proosition 1., ffl = Y ffffla 1 ff> ty i=1 ff i : Proosition 1. says that ffl is the roduct of all the ff in A 1. This roduct is larger than the roduct of all ff i (i =1; ; ;t) because the ff i make u a subset of A 1. Since all ff are larger than one by defintion, multilying more of them makes a bigger number. 5

On the other hand we have ff i = b i +! N(I i ) > ( )(e1+f1 1 en+fn n ); where I i =[N(I i );b i +!] is the canonical basis of I i. This basis can be found from section two by noting that if I has basis [a; b + c!], a condition on a is that N(I) =ac, and I is reduced if c=1. Thus the basis can be written as [N(I);b+!]. Hence we get the following inequality (3:1) ffl> 0Y = e1+f1 i en+fn n = ffl 0 : The roduct in (3.1) is taken over all integers e i and f i satisfying (a 0 ) (e 1 + f 1 ) log 1 + +(e n + f n ) log n < log( (b 0 ) e i 0;f i 0 and e i f i =0(i =1; ; ;n): ); Conditions (a 0 ) and (b 0 ) follow easily from conditions (a) and (b) given earlier in the roof for (a 0 ) use the laws for logarithms, for (b 0 ) notice that if ab =0,a; b both real numbers, a =0orb =0. The condition that e i and f i be ositive is new but not surrising. We have (3:) 0Y =( )t The number t equals to the cardinal of the set of n-tules (e 1 ;f 1 ; ;e n ;f n ) satisfying (a 0 ) and (b 0 ). Remember that t first came u as the number of ideals I obtained by multilying factors of rimes - check the beginning of the roof. Then it holds (3:3) t = n V P + O((log n 1 ); where V is the volume of the n-simlex in the n-dimensional euclidean sace R n ; =f(x 1 ; ;x n ) R n : x 1 0; ;x n 0; x 1 + + x n» log( and P = (log 1 )(log ) (log n ): ) Notice that (a 0 )isvery similar to the second descritor for the set, and that all the x i must be greater than or equal to 0. What Yamamoto is doing is saying that a arallel can be drawn between e i and f i and the x i, so that standard results for n-simlices can be alied to this roblem. It would robably be a good idea, at this oint, to exlain what an n-simlex is. A 1-simlex is a line. A -simlex is a triangle. A 3-simlex is a tetrahedron. The attern goes on like this, with n indicating the dimension of the shae with triangular 'sides.' The volume of the n-simlex aroximates the number of lattice oints inside - the number of e i and f i that satisfy the conditions given. Then the O(( )n 1 ) term is an error term. For sufficiently large, the n-simlex estimation is fairly accurate. When all the vertices of an n-simlex have coordinates (0; ; 1; ; 0), in R n, then the n-volume of the simlex is 1 n!. However, in this case the coordinates of the vertices x i have a condition imosed on them: 6

x 1 + + x n» (log ). Yamamoto combines these facts next. We have V = Z dx 1 dx n = 1 n! (log )n : Almost there... For the roduct of all denominators in the right side of (3.1), we have (3:4) log Π 0 ( e1+f1 1 en+fn n ) = = n P = Z 0X [(e1 + f 1 ) log 1 + +(e n + f n ) log n ] (x 1 + + x n )dx 1 dx n + O((log n n (n + 1)!P (log ) n+1 + O((log ) n ): ) ) This is a fairly easy-to-follow set of maniulations, using only logarithm laws, integration, and algebra. From (3.3) and (3.4), we get [3:45] log ffl 0 = log( = )t log Π 0 ( e1+f1 1 en+fn n ) n (n + 1)!P (log ) n+1 + O((log ) n ): Our theorem follows from this and (3.1). Recall that (3.1) says ffl > Q 0 = e 1 +f 1 en+fn i n n (n+1)!p = ffl 0 : Take the log of both sides, then substitute the exression in [3.45] for log ffl 0. is c 0 in the theorem, and for sufficiently large the error term in [3.45] is negligible. Thus Theorem 3.1 is roven. Theorem 3. is the last theorem in Yamamoto's aer, and I will include it for the sake of comleteness. Not much comment is needed; anything difficult to understand I leave, as an exercise, to the reader. Theorem 3. For the case n =, the assumtion of Theorem 3.1 is satisfied by the following F 's: F = Q( m k ) where we set = 1 and q =. m k =( k q + +1) 4 (k =1; ;:::); Proof We easily see that m k =1(mod ), k =( 1) (mod q) and m k =1(mod 4) (m k =1(mod 8) if =. Hence each one of and q is decomosed into the roduct of two distinct rime ideals in F (if m k is not a square). Set = mm 0 and q = nn 0. From the definition of m k, it holds (3:5) ( k q + +1) m k =4 (3:6) ( k q + +1) m k = 4 k q: 7

From (3.5), m and m 0 are both rincial (set m =( k q++1+ m k ), for examle.) From (3.6), either m k n or m 0k n is rincial. Since m k and m 0k are rincial, both n and n 0 are also rincial. So the condition (*) in Theorem 3.1 is satisfied. Finally, the infiniteness of the number of F 's given above is as follows. Set k =j (we consider the case where k is even), then m k = m j =( j q + +1) 4 = q 4j + nq( +1) j +( 1) : Since the diohantine equation y = q x 4 +q( +1)x +( 1) has only a finite number of rational integral solutions (x; y) for a fixed integer (Siegel's theorem), Q( m j ) reresents infinitely many real quadratic number fields F for j =1; ;:::. This comletes the roof. So the theorems about bounds for fundamental units in real quadratic number fields have been roven. I can find, if I want, a lower bound for the fundamental unit ffl in a real quadratic number field with a given large discriminant ; from this I can find some estimate for the solution to Pell's equation with the same. Let us return, then, to the original question: How long is it going to take to find that solution to x + 151y =1? How big is it going to be? And the answer is this: although we nowhave a method for finding a lower bound for ffl, the method itself takes imlementation. It is necessary to ascertain how many rimes smaller than 151 decomose in Q( 151) into the roduct of two rincial rime ideals. This is not the easiest of tasks I could write a comuter rogram to do it for me, but if I am willing to do that then I might as well write a comuter rogram to find the solutions to Pell's equation directly, and erhas use a more efficient algorithm so that it doesn't take three hours. So here my exloration stoed. My summer has ended, and so has the theoretical art of the question I investigated. Now it's all alication... 8