THE MORDELL-WEIL THEOREM FOR Q

THE MORDELL-WEIL THEOREM FOR Q NICOLAS FORD Abstract. An elliptic curve is a specific type of algebraic curve on which one may impose the structure of an abelian group with many desirable properties. The study of elliptic curves has far-reaching connections number theory, cryptography, analysis, and many other fields, and occupies a large area of current research. If an elliptic curve happens to be defined over the rationals, one may examine the subgroup of points with rational coordinates. The Mordell-Weil Theorem for Q states that this group is finitely generated. Contents 1. Preliminaries 1 2. Heights and the Descent Theorem 4 3. The Weak Mordell-Weil Theorem 4 4. Height over Q 7 References 8 1. Preliminaries We assume some familiarity with the theory of algebraic curves and algebraic number theory. As such, many definitions will be stated quickly and under the assumption that the reader has seen them before, and results that come purely from one of these fields will be stated without proof. Definition 1.1. Let K be an algebraically closed field. Recall that the projective plane over K is the set P 2 (K) of equivalence classes of nonzero points in K 3 under the relation (x, y, z) (λx, λy, λz) for each λ 0 K. The equivalence class of the point (x, y, z) is denoted [x : y : z]. Definition 1.2. For any homogeneous polynomial F in K[X, Y, Z] we have a set V (F ) defined by V (F ) = {[a : b : c] P 2 : F (a, b, c) = 0}. Note that this definition does not depend on the choice of representative for the point [a : b : c], because F (λa, λb, λc) = λ deg F F (a, b, c), since F is homogeneous. Such a set V (F ) is called the locus of F. If L is a subfield of K and the coefficients of F lie in L, we say V (F ) is defined over L. In this paper, we will be concerned only with curves of a specific type, namely those specified by a cubic polynomial with only one point on the line H = {[x : y : 0] : x, y K} at infinity. After a linear change of coordinates, we may assume 1

2 NICOLAS FORD that the point in question is [0 : 1 : 0]. The curve may then be given by an equation of the following form, called a Weierstrass equation: Y 2 Z + a 1 XY Z + a 3 Y Z 2 = X 3 + a 2 X 2 Z + a 4 XZ 2 + a 6 Z 3. Given a Weierstrass equation, we define the following quantities [4, 46]: b 2 = a 2 1 + 4a 2 b 4 = 2a 4 + a 1 a 3 b 6 = a 2 3 + 4a 6 b 8 = a 2 1a 6 + 4a 2 a 6 a 1 a 3 a 4 + a 2 a 2 3 a 2 4 = b 2 2b 8 8b 3 4 27b 2 6 + 9b 2 b 4 b 6. The quantity above is called the discriminant. Note that if the characteristic of K is not 2, we may simplify the Weierstrass equation by completing the square. Replacing Y with 1 2 (Y a 1X a 3 Z), one can easily check that the Weierstrass equation becomes Y 2 Z = 4X 3 + b 2 X 2 Z + 2b 4 XZ 2 + b 6. Definition 1.3. Recall that a point P V (F ) is called a singular point if F/ X(P ) = F/ Y (P ) = F/ Z(P ) = 0. Suppose F is given by a Weierstrass equation. It takes a straightforward but tedious calculation, which we omit here, to verify that every point of V (F ) is nonsingular if and only if 0. If this is the case, we call V (F ) an elliptic curve. The basepoint is the point O = [0 : 1 : 0]. The points of an elliptic curve can be made into an abelian group in a relatively simple way, and it is this group that will be the object of study in this paper. Before defining the group operation, we recall some definitions that will help in defining our operation properly. Definitions 1.4. Let E be an elliptic curve. The divisor group Div(E) of E is the free abelian group generated by the points of E. If D = m i=1 n ip i Div(E), we define deg(d) = m i=1 n i. We write D D if each entry in D is greater than or equal to the corresponding entry in D. Recall that if P is a nonsingular point on E, then the local ring O P at P (the ring of rational functions on E which are defined at P ) is a discrete valuation ring. Given a rational function f K(E) (where K(E) is the field of rational functions on E) we define div(f) = P P ord 2 P (f), where ord P is the valuation on K induced by O P. The image of div forms a subgroup of Div 0 (E) = {D Div(E) : deg D = 0}. The quotient of Div 0 (E) by this subgroup is called Pic 0 (E), and if two elements D and D of the divisor group differ by an element of the image of div, we write D D. For D Div(E), we write L(D) = {f K(E) : div(f) D}. This is a vector space over K. Proposition 1.5. Let E be an elliptic curve and let O E be the point [0 : 1 : 0]. Then for every D Div 0 (E), there is a unique point P such that D P O.

THE MORDELL-WEIL THEOREM FOR Q 3 Proof. A nonsingular plane cubic has genus 1, so we may apply the Riemann-Roch Theorem [2, 108] to get that dim L(D + O) = 1. Let f be a nonzero element of L(D + O). Then div(f) D O which, since deg(div(f)) = deg D = 0, gives us that div(f) = P D O for some P. Then D P O as desired. If D P O for some other P, then P P, so for some rational function f, div(f) = P P L(P ). But by the same reasoning as above, L(P ) is onedimensional, and it contains all the constant functions, so f must be constant, meaning div(f) = 0, so P = P. This gives a one-to-one correspondence between the elements of Pic 0 (E) and the points on E, and since Pic 0 (E) is already an abelian group, this gives us a group structure on E. If L is a subfield of K, we write E(L) for the set of points [x : y : z] of E for which, for some λ K, λx, λy, λz L. That is, E(L) is the set of points of E for which there exists a choice of coordinates using only elements of L. Remark 1.6. There is also a geometric way to think about the group law on an elliptic curve. Suppose P and Q are points on E. By Bézout s theorem [2, 57], the line going through P and Q intersects E in exactly one other point, say R. Write R = P Q. Then P +Q = O (P Q). To see this, let L be the line going through P and Q (or tangent at P if P = Q), and let L be the line going through P Q and O. It is easy to see by inspecting the Weierstrass equation that ord O (Z) = 3 and ord P (Z) = 0 for P O, so we get that Then div(l/z) = P + Q + (P Q) 3O, div(l /Z) = (P Q) + (O (P Q)) 2O. 0 div(l/l ) = P + Q O (O (P Q)), so we get that (P O) + (Q O) = (O (P Q)) O. Thinking of the group law in terms of lines allows us to derive an explicit formula for computing the sum of two points on an elliptic curve. The verification of the formula is long and would take us too far off course, but we transcribe it here. Proposition 1.7. [4, 58] Let P = [x 1 : y 1 : 1] and Q = [x 2 : y 2 : 1] be two nonidentity points on an elliptic curve E. If x 1 = x 2 and y 1 + y 2 + a 1 x 2 + a 3 = 0, then P = Q. Otherwise, if x 1 x 2, define and if x 1 = x 2, define λ = y 2 y 1 x 2 x 1, ν = y 1x 2 y 2 x 1 x 2 x 1, λ = 3x2 1 + 2a 2 x 1 + a 4 a 1 y 1 2y 1 + a 1 x 1 + a 3, ν = x3 1 + a 4 x 1 + 2a 6 a 3 y 1 2y 1 + a 1 x 1 + a 3. The line Y = λx + νz is the line L from the preceding discussion. Then if then [x 3 : y 3 : 1] = P + Q. x 3 = λ 2 + a 1 λ a 2 x 1 x 2, y 3 = (λ + a 1 )x 3 ν a 3,

4 NICOLAS FORD 2. Heights and the Descent Theorem The goal of this paper is to prove that, if E is an elliptic curve, then the group E(Q) is finitely generated. To do this, we will use a general result called the Descent Theorem, which relies on the existence of a function with certain properties which measures the size of an element of the group and uses information about the number of elements of small enough size to find a finite generating set. Theorem 2.1 (Descent). [4, 199] Suppose A is an abelian group, and h : A R is a function with the following properties: For each q A there is a constant c 1 depending on A and q such that h(p + q) 2h(p) + c 1 for all p. For some integer m 2 and some constant c 2 depending only on A, we have that h(mp) m 2 h(p) c 2 for all p. For every real number k, {p A : h(p) < k} is finite. If, in addition, A/mA is finite for the m above, then A is finitely generated. Proof. Pick representatives q 1,..., q r A for each of the cosets of A/mA, and pick some p A. Write p = mp 1 + q i1 for some i 1. Continuing inductively in this way, writing p n 1 = mp n + q in. From the first two properties of the height function, we have that h(p n ) 1 m 2 (h(mp n) + c 2 ) = 1 m 2 (h(p n 1 q in ) + c 2 ) 1 m 2 (2h(p n 1 + c 1 + c 2 ) where c 1 is the maximum of the c 1 s for all the q i s. Doing this repeatedly gives us that ( ) n 2 n h(p n ) h(p) + (c 1 + c 2 ) < m 2 ( 2 ) n h(p) + c 1 + c 2 m 2 m 2 2 2 n h(p) + (c 1 + c 2 )/2. i=1 2 n 1 m 2n So if n is big enough we can make h(p n ) < 1 + (c 1 + c 2 )/2. Since p is a linear combination of p n and the q i s and there are only finitely many elements α of height less than 1 + (c 1 + c 2 )/2, those α s and the q i s generate A. 3. The Weak Mordell-Weil Theorem In order to apply the Descent Theorem to the rational points on an elliptic curve, we need two things: a height function on Q satisfying the conditions of the theorem, and that E(Q)/mE(Q) is finite where m is the number used in the Descent Theorem. The second of these is the focus of this section. In fact, we will prove something stronger [4, 190]: Theorem 3.1 (Weak Mordell-Weil). If K/Q is a finite extension, and E is an elliptic curve in P 2 ( K) defined over K, then for each m 2, E(K)/mE(K) is finite.

THE MORDELL-WEIL THEOREM FOR Q 5 We first prove a series of reductions, and at the end we will take advantage of a finiteness result from algebraic number theory. (For a group G and a natural number m, we write G[m] for the set of elements g G with mg = 0.) Lemma 3.2. If L/K is a finite Galois extension and E(L)/mE(L) is finite, then so is E(K)/mE(K). Proof. Let Φ be the kernel of the obvious map from E(K)/mE(K) to E(L)/mE(L). For each P Φ, pick a Q P in E(L) with mq P = P, and define λ P (σ) = σ(q P ) Q P for each σ Gal(L/K). It is easy to check that λ P lands in E[m], that it is welldefined, and that if λ P = λ P, then P = P in E(K)/mE(K). So the map sending P to λ P is injective, but there are only finitely many maps from Gal(L/K) to E[m], so Φ is finite. (E[m] is the kernel of the multiplication-by-m map, which is representable by a rational function, so it must be finite.) So both the image and kernel of the aforementioned map from E(K)/mE(K) to E(L)/mE(L) are finite, so E(K)/mE(K) is finite. In light of this statement, it is enough to prove Weak Mordell-Weil under the assumption that E[m] is contained in E(K) (note that, as mentioned in the proof above, E[m] is finite): according to the explicit formulas for the group operation on an elliptic curve given in 1.7, membership in the kernel may be specified by a polynomial in x and y, and adjoining the roots of that polynomial gives us a finite extension of K with the desired property, to which we may apply 3.2. We now reduce the question of the finiteness of E(K)/mE(K) to a question about the finiteness of a certain field extension: Lemma 3.3. Let L be the composite of all extensions K(Q) over all Q with mq E(K). If L/K is a finite extension, then E(K)/mE(K) is finite. Proof. Define the map κ : E(K) Gal( K/K) E[m] as follows: for P E(K), pick Q E( K) with mq = P ; then κ(p, σ) = σ(q) Q. It is straightforward to check that this map is bilinear and independent of the choice of Q. The kernel of κ on the left is me(k), because if κ(p, σ) = 0 for all σ, then each σ fixes the Q with mq = P, so the entries of Q are in the fixed field of Gal( K/K), i.e., Q E(K), so P me(k). The kernel on the right is Gal( K/L): if σ fixes L, then the Q for each P is in L by definition, so κ(σ, P ) = O for each P, and if σ is in the kernel, then σ fixes every Q with mq E(K), so σ fixes L. This gives a perfect bilinear pairing E(K)/mE(K) Gal(L/K) E[m], so E(K)/mE(K) = Hom(Gal(L/K), E[m]), but, as mentioned above, E[m] is finite, so the lemma follows. Suppose v is a discrete valuation on K. Since E is an elliptic curve over K, we may consider it as an elliptic curve over the completion K v of K with respect to v. Replacing X and Y by u 2 X and u 3 Y respectively, one can easily see that each coefficient a i in the Weierstrass equation for E is replaced by u i a i. So by doing this with an appropriate u we can find a Weierstrass equation for E so that v(a i ) 0 for each i and v( ) is as small as possible. This is called a minimal Weierstrass equation for E. Given a minimal Weierstrass equation for E over K v we may reduce the coefficients with respect to the maximal ideal of K v to obtain a new elliptic curve Ẽv in P 2 (k v ), where k v is the residue field of K v.

6 NICOLAS FORD Definition 3.4. We say E has good reduction at v if Ẽv is nonsingular. Otherwise, we say E has bad reduction at v. In order to proceed we will need the following fact, whose proof would unfortunately take took long to include here. For a proof, see [4, 176]. Lemma 3.5. Suppose v is a discrete valuation over K, v(m) = 0, and E has good reduction at v. Then the map E(K)[m] Ẽv(k v ) given by the reduction described above is injective. Definition 3.6. If v is a valuation on K, L is an algebraic extension of K, and v is a place of L lying above v, then the inertia group of v over v is the subgroup I v /v of Gal(L v /K v ) which act trivially on the residue field k v of L v. We say that the extension L/K is unramified at v if the action of the inertia group of v over v is trivial for every v lying above v. Proposition 3.7. Let L be the field defined in 3.3. Then Gal(L/K) is abelian of exponent m. Let S be the set of valuations v on K for which v is Archimedean, v(m) 0, or E has bad reduction at v. Then if v is a valuation and v / S, L/K is unramified at v. Proof. From the proof of 3.3, there s an injective map from Gal(L/K) to Hom(E(K), E[m]), which proves that the Galois group is abelian of exponent m. Pick some v / S and some Q with mq E(K). It s enough to show that K(Q)/K is unramified at v, since L is the composite of all such extensions. Pick some place v of K(Q) lying above v. Since E has good reduction at v, it clearly also does at v. Let r be the reduction map E(K(Q)) Ẽv (k v ). Pick some σ in the inertia group of v /v. By definition, r(σ(q) Q) = σ(r(q)) r(q) = r(o). But m(σ(q) Q) = O as well, so σ(q) Q E[m] and it s in the kernel of the reduction map, so by 3.5, σ fixes Q. So, since K(Q) is fixed by every element of the inertia group of v /v for every v lying above v, we see that K(Q)/K is unramified at v. We have now established enough about the extension L/K to prove that it is finite. The proof will require a result from algebraic number theory, which we state here without proof: Theorem 3.8 (Dirichlet s S-Unit Theorem). Let K be a finite extension of Q, and let R be its ring of integers. If S is a finite set of valuations on R, define R S to be the ring {a K : v(a) 0 for all v / S}. Then the group of units R S is finitely generated. Theorem 3.9. If K/Q is a finite extension and S is a finite set of places on K containing all the non-archimedean absolute values, and m 2, let L/K be the maximal extension of K whose automorphism group over K is abelian of exponent m and which is unramified outside of S. Then L/K is a finite extension. Proof. First, note that if the theorem is true for some finite extension K of K (replacing S with the set S of places of K lying above S), then we would have that LK /K finite, so L/K would be as well. Therefore, we may assume K contains the m th roots of unity. We can also make S bigger, since this will only make L bigger. Since the class number of K is finite (since K is a finite extension of Q; this is a fundamental

THE MORDELL-WEIL THEOREM FOR Q 7 fact of algebraic number theory, cf. [3, 34]), we may add finitely many places to S to make the ring R S described above into a principal ideal domain. We may also throw in every v for which v(m) 0. By the primary theorem of Kummer theory (cf. [1, 626, 816]), because K contains the m th roots of unity, abelian extensions of K of exponent m are obtained by adjoining m th roots of elements of K. Pick some (normalized) valuation v / S. Since v(m) = 0, one may check that K v ( m a)/k v is unramified if and only if m v(a). So L is obtained by adjoining m th roots of elements of T S, where T S = {a K /(K ) m : m v(a) for all v / S}. We just need to check that T S is finite. There is a natural map R S T S. For any a K whose residue is in T S, we know that ar S is the m th power of some ideal of R S, since v(a) is a multiple of m for each v / S. So for some unit u, we have a = ub m for some b K, because R S is a principal ideal domain. Now, u R S, but our map sends u to the residue of a, so our map is surjective. And the kernel obviously contains the m th powers, so we have a surjection R S /(R S )m T S. By Dirichlet s S-Unit Theorem, R S is finitely generated, so that quotient is finite, so T S is finite, as desired. 4. Height over Q Our next task should be to find a height function on E(Q) with the properties we need. First we note that, since Q has characteristic 0, we may simplify our Weierstrass equation even further. It is left to the reader to check that in this case, with an appropriate change of coordinates, we may assume that the equation looks like Y 2 Z = X 3 + AXZ 2 + BZ 3 for some A and B. We define a height function as follows: Definition 4.1. For any rational number p/q with gcd(p, q) = 1, define H(p/q) = max{ p, q }. Then for any point in E(Q), define h x ([x : y : 1]) = log H(x) and h x ([0 : 1 : 0]) = 0. We now show that h x satisfies the conditions of the Descent Theorem: Proposition 4.2. [4, 202] (a) If P 0 E(Q) then there is a constant c 1 so that for all P E(Q), we have s. h x (P + P 0 ) 2h x (P ) + c 1 (b) There is a constant c 2 so that for all P E(Q), h x (2P ) 4h x (P ) c 2. (c) For every k, the set {P E(Q) : h x (P ) k} is constant. Proof. With appropriate final adjustments to c 1, we may assume that P, P 0 O and P ±P 0. Let P = [x : y : 1], P 0 = [x 0 : y 0 : 1] where x = a/d 2, y = b/d 3, x 0 = a 0 /d 2 0, y 0 = b 0 /d 3 0 in lowest terms. (It is easy to verify from the Weierstrass equation that they may be put into this form.) Using the addition formula given in 1.7 and the Weierstrass equation, one may verify that the x coordinate of P + P 0 (after normalizing with z = 1) is x(p + P 0 ) = (aa 0 + Ad 2 d 2 0)(ad 2 0 + a 0 d 2 ) + 2Bd 4 d 4 0 2bdb 0 d 0 (ad 2 0 a 0d 2 ) 2.

8 NICOLAS FORD Cancelling common factors would only make the height smaller, so we get that H(x(P + P 0 )) c max{ a 2, d 4, bd }. Now, H(x(P )) = max{ a, d 2 }, so except for the bd this is what we want. But from the Weierstrass equation, we get that b 2 = a 3 + Aad 4 + Bd 6, so b c max{ a 3/2, d 3 }, so bd won t affect the maximum. This proves (a). With appropriate final adjustments to c 2, we can force our height to be above those of the finitely many points of E(Q)[2], allowing us to assume that 2P O. Set P = [x : y : 1]. Using the addition formula, we get that the x coordinate of 2P is x(2p ) = x4 2Ax 2 8Bx + A 2 4x 3. + 4Ax + 4B Let F (X, Z) = X 4 2AX 2 Z 2 8BXZ 3 + A 2 Z 4, G(X, Z) = 4X 3 Z + 4AXZ 3 + 4BZ 4. Then writing x = a/b in lowest terms, it is clear that x(2p ) = F (a, b)/g(a, b). Let = 4A 3 +27B 2 be the discriminant of E. We leave it to the reader to check that there are cubic homogeneous polynomials f 1, g 1, f 2, g 2 so that f 1 F g 1 G = 4δZ 7, f 2 F + g 2 G = 4δX 7. (Check that F and G are relatively prime if 0.) Let δ = gcd(f (a, b), G(a, b)). From the equations just proved, it is clear that δ divides 4, so H(x(2P )) max{f (a, b), G(a, b)}/ 4. Using the homogeneity of the f i s and the g i s, the equations also tell us that max{ 4 a 7, 4 b 7 } 2c max{ a 3, b 3 } max{f (a, b), G(a, b)}, and cancelling and making the appropriate substitutions gives us that H(x(2P )) (2c) 1 H(x(P )) 4, which implies (b). The set {t Q : H(t) k} is obviously finite, and for a given rational there are at most two points on the curve with that number as an x value, so (c) is clear. Remark 4.3. This, together with Weak Mordell-Weil and the Descent Theorem, shows that E(Q) is finitely generated. Given the way Weak Mordell-Weil was stated and proved, it is natural to ask whether E(K) is finitely generated for any finite extension K over Q. The answer is yes, and one may construct a height function on such a K, though the construction and the proof that it satisfies the hypotheses of the Descent Theorem will be a bit more complicated than they were for Q. References [1] David S. Dummit and Richard M. Foote. Abstract Algebra: Third Edition. John Wiley and Sons. 2004. [2] William Fulton. Algebraic Curves. http://www.math.lsa.umich.edu/ wfulton/curvebook.pdf. 2008. [3] Jürgen Neurkirch. Algebraic Number Theory. Springer-Verlag. 1999. [4] Joseph H. Silverman. The Arithmetic of Elliptic Curves. Springer-Verlag. 1986.