HENSEL S LEMMA KEITH CONRAD

Similar documents
MATH 2710: NOTES FOR ANALYSIS

MA3H1 TOPICS IN NUMBER THEORY PART III

The Hasse Minkowski Theorem Lee Dicker University of Minnesota, REU Summer 2001

p-adic Measures and Bernoulli Numbers

RINGS OF INTEGERS WITHOUT A POWER BASIS

Elementary Analysis in Q p

Math 4400/6400 Homework #8 solutions. 1. Let P be an odd integer (not necessarily prime). Show that modulo 2,

Mobius Functions, Legendre Symbols, and Discriminants

Solvability and Number of Roots of Bi-Quadratic Equations over p adic Fields

δ(xy) = φ(x)δ(y) + y p δ(x). (1)

Introduction to Arithmetic Geometry Fall 2013 Lecture #10 10/8/2013

Math 104B: Number Theory II (Winter 2012)

ON THE LEAST SIGNIFICANT p ADIC DIGITS OF CERTAIN LUCAS NUMBERS

MATH 361: NUMBER THEORY EIGHTH LECTURE

By Evan Chen OTIS, Internal Use

x 2 a mod m. has a solution. Theorem 13.2 (Euler s Criterion). Let p be an odd prime. The congruence x 2 1 mod p,

Representing Integers as the Sum of Two Squares in the Ring Z n

We collect some results that might be covered in a first course in algebraic number theory.

MAZUR S CONSTRUCTION OF THE KUBOTA LEPOLDT p-adic L-FUNCTION. n s = p. (1 p s ) 1.

Practice Final Solutions

HOMEWORK # 4 MARIA SIMBIRSKY SANDY ROGERS MATTHEW WELSH

MATH 3240Q Introduction to Number Theory Homework 7

An Overview of Witt Vectors

RECIPROCITY LAWS JEREMY BOOHER

Algebraic number theory LTCC Solutions to Problem Sheet 2

On the Rank of the Elliptic Curve y 2 = x(x p)(x 2)

MA257: INTRODUCTION TO NUMBER THEORY LECTURE NOTES

CERIAS Tech Report The period of the Bell numbers modulo a prime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education

An Estimate For Heilbronn s Exponential Sum

p-adic Properties of Lengyel s Numbers

Series Handout A. 1. Determine which of the following sums are geometric. If the sum is geometric, express the sum in closed form.

DIRICHLET S THEOREM ON PRIMES IN ARITHMETIC PROGRESSIONS. 1. Introduction

6 Binary Quadratic forms

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS

Real Analysis 1 Fall Homework 3. a n.

MATH 371 Class notes/outline October 15, 2013

Excerpt from "Intermediate Algebra" 2014 AoPS Inc.

DISCRIMINANTS IN TOWERS

Small Zeros of Quadratic Forms Mod P m

t s (p). An Introduction

Quadratic Reciprocity

MAT 311 Solutions to Final Exam Practice

Pythagorean triples and sums of squares

Topic 7: Using identity types

Applicable Analysis and Discrete Mathematics available online at HENSEL CODES OF SQUARE ROOTS OF P-ADIC NUMBERS

arxiv:math/ v2 [math.nt] 21 Oct 2004

Chapter 7 Rational and Irrational Numbers

ANALYTIC NUMBER THEORY AND DIRICHLET S THEOREM

RAMANUJAN-NAGELL CUBICS

0.6 Factoring 73. As always, the reader is encouraged to multiply out (3

#A47 INTEGERS 15 (2015) QUADRATIC DIOPHANTINE EQUATIONS WITH INFINITELY MANY SOLUTIONS IN POSITIVE INTEGERS

Jacobi symbols and application to primality

Cryptanalysis of Pseudorandom Generators

The Fekete Szegő theorem with splitting conditions: Part I

QUADRATIC RECIPROCITY

The Euler Phi Function

SOME SUMS OVER IRREDUCIBLE POLYNOMIALS

MATH 361: NUMBER THEORY ELEVENTH LECTURE

(Workshop on Harmonic Analysis on symmetric spaces I.S.I. Bangalore : 9th July 2004) B.Sury

f(r) = a d n) d + + a0 = 0

A CONCRETE EXAMPLE OF PRIME BEHAVIOR IN QUADRATIC FIELDS. 1. Abstract

CONGRUENCE PROPERTIES OF TAYLOR COEFFICIENTS OF MODULAR FORMS

3 Properties of Dedekind domains

2 Asymptotic density and Dirichlet density

19th Bay Area Mathematical Olympiad. Problems and Solutions. February 28, 2017

PETER J. GRABNER AND ARNOLD KNOPFMACHER

ON FREIMAN S 2.4-THEOREM

On generalizing happy numbers to fractional base number systems

DIVISIBILITY CRITERIA FOR CLASS NUMBERS OF IMAGINARY QUADRATIC FIELDS

MATH 115, SUMMER 2012 LECTURE 12

MA3H1 Topics in Number Theory. Samir Siksek

Sets of Real Numbers

Why Proofs? Proof Techniques. Theorems. Other True Things. Proper Proof Technique. How To Construct A Proof. By Chuck Cusack

On Character Sums of Binary Quadratic Forms 1 2. Mei-Chu Chang 3. Abstract. We establish character sum bounds of the form.

POINTS ON CONICS MODULO p

ON THE NORM OF AN IDEMPOTENT SCHUR MULTIPLIER ON THE SCHATTEN CLASS

A CRITERION FOR POLYNOMIALS TO BE CONGRUENT TO THE PRODUCT OF LINEAR POLYNOMIALS (mod p) ZHI-HONG SUN

= 1 2x. x 2 a ) 0 (mod p n ), (x 2 + 2a + a2. x a ) 2

Weil s Conjecture on Tamagawa Numbers (Lecture 1)

Elementary theory of L p spaces

On the irreducibility of a polynomial associated with the Strong Factorial Conjecture

Ernie Croot 1. Department of Mathematics, Georgia Institute of Technology, Atlanta, GA Abstract

16 The Quadratic Reciprocity Law

FERMAT S LAST THEOREM

A review of the foundations of perfectoid spaces

Factorability in the ring Z[ 5]

QUIZ ON CHAPTER 4 - SOLUTIONS APPLICATIONS OF DERIVATIVES; MATH 150 FALL 2016 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100%

#A37 INTEGERS 15 (2015) NOTE ON A RESULT OF CHUNG ON WEIL TYPE SUMS

2 Asymptotic density and Dirichlet density

QUADRATIC RECIPROCITY

Practice Final Solutions

Intrinsic Approximation on Cantor-like Sets, a Problem of Mahler

Section 0.10: Complex Numbers from Precalculus Prerequisites a.k.a. Chapter 0 by Carl Stitz, PhD, and Jeff Zeager, PhD, is available under a Creative

Lecture 6. 2 Recurrence/transience, harmonic functions and martingales

QUADRATIC RESIDUES AND DIFFERENCE SETS

A FEW EQUIVALENCES OF WALL-SUN-SUN PRIME CONJECTURE

Number Theory Naoki Sato

Primes - Problem Sheet 5 - Solutions

THE THEORY OF NUMBERS IN DEDEKIND RINGS

MATH 210A, FALL 2017 HW 5 SOLUTIONS WRITTEN BY DAN DORE

Transcription:

HENSEL S LEMMA KEITH CONRAD 1. Introduction In the -adic integers, congruences are aroximations: for a and b in Z, a b mod n is the same as a b 1/ n. Turning information modulo one ower of into similar information modulo a higher ower of can be interreted as imroving an aroximation. Examle 1.1. The number 7 is a square mod 3: 7 1 2 mod 3. Although 7 1 2 mod 9, we can write 7 as a square mod 9 by relacing 1 with 1 + 3: 7 (1 + 3) 2 mod 9. Here are exressions of 7 as a square modulo further owers of 3: 7 (1 + 3 + 3 2 ) 2 mod 3 3, 7 (1 + 3 + 3 2 ) 2 mod 3 4, 7 (1 + 3 + 3 2 + 2 3 4 ) 2 mod 3 5,. 7 (1 + 3 + 3 2 + 2 3 4 + 2 3 7 + 3 8 + 3 9 + 2 3 10 ) 2 mod 3 11. If we can kee going indefinitely then 7 is a erfect square in Z 3 with square root 1 + 3 + 3 2 + 2 3 4 + 2 3 7 + 3 8 + 3 9 + 2 3 10 +. That we really can kee going indefinitely is justified by Hensel s lemma, which will rovide conditions under which the root of a olynomial mod can be lifted to a root in Z, such as the olynomial X 2 7 with = 3: its two roots mod 3 can both be lifted to square roots of 7 in Z 3. We will first give a basic version of Hensel s lemma, illustrate it with examles, and then give a stronger version that can be alied in cases where the basic version is inadequate. 2. A Basic Version of Hensel s lemma Theorem 2.1 (Hensel s lemma). If f(x) Z [X] and a Z satisfies 0 mod, f (a) 0 mod then there is a unique α Z such that f(α) = 0 and α a mod. Examle 2.2. Let f(x) = X 2 7. Then f(1) = 6 0 mod 3 and f (1) = 2 0 mod 3, so Hensel s lemma tells us there is a unique 3-adic integer α such that α 2 = 7 and α 1 mod 3. We saw aroximations to α in Examle 1.1, e.g., α 1 + 3 + 3 2 + 2 3 4 mod 3 5. Proof. We will rove by induction that for each n 1 there is an a n Z such that f(a n ) 0 mod n, a n a mod. The case n = 1 is trivial, using a 1 = a. If the inductive hyothesis holds for n, we seek a n+1 Z such that f(a n+1 ) 0 mod n+1, a n+1 a mod. 1

2 KEITH CONRAD Since f(a n+1 ) 0 mod n+1 f(a n+1 ) 0 mod n, each root of f(x) mod n+1 reduces to a root of f(x) mod n. By the inductive hyothesis there is a root a n mod n, so we seek a -adic integer a n+1 such that a n+1 a n mod n and f(a n+1 ) 0 mod n+1. Writing a n+1 = a n + n t n for some t n Z to be determined, can we make f(a n + n t n ) 0 mod n+1? To comute f(a n + n t n ) mod n+1, we use a olynomial identity: (2.1) f(x + Y ) = f(x) + f (X)Y + g(x, Y )Y 2 for some olynomial g(x, Y ) Z [X, Y ]. This formula comes from isolating the first two terms in the binomial theorem: writing f(x) = d i=0 c ix i we have f(x + Y ) = c i (X + Y ) i = c 0 + c i (X i + ix i 1 Y + g i (X, Y )Y 2 ), i=0 where g i (X, Y ) Z[X, Y ]. Thus f(x + Y ) = c i X i + ic i X i 1 Y + i=0 i=1 i=1 g i (X, Y )Y 2 = f(x) + f (X)Y + g(x, Y )Y 2, i=1 where g(x, Y ) = d i=1 c ig i (X, Y ) Z [X, Y ]. This gives us the desired identity. 1 To make (2.1) numerical, for all x and y in Z the number z := g(x, y) is in Z, so (2.2) x, y Z = f(x + y) = f(x) + f (x)y + zy 2, where z Z. In this formula set x = a n and y = n t n : (2.3) f(a n + n t n ) = f(a n ) + f (a n ) n t n + z 2n t 2 n f(a n ) + f (a n ) n t n mod n+1 since 2n n + 1. In f (a n ) n t n mod n+1, the factors f (a n ) and t n only matter mod since there is already a factor of n and the modulus is n+1. Recalling that a n a mod, we get f (a n ) n t n f (a) n t n mod n+1. Therefore from (2.3), f(a n + n t n ) 0 mod n+1 f(a n ) + f (a) n t n 0 mod n+1 f (a)t n f(a n )/ n mod, where the ratio f(a n )/ n is in Z since we assumed that f(a n ) 0 mod n. There is a solution for t n in the congruence mod since we assumed that f (a) 0 mod. Armed with this choice of t n and setting a n+1 = a n + n t n, we have f(a n+1 ) 0 mod n+1 and a n+1 a n mod n, so in articular a n+1 a mod. This comletes the induction. Starting with a 1 = a, our inductive argument has constructed a sequence a 1, a 2, a 3,... in Z such that f(a n ) 0 mod n and a n+1 a n mod n for all n. The second condition, a n+1 a n mod n, imlies that {a n } is a Cauchy sequence in Z. Let α be its limit in Z. We want to show f(α) = 0 and α a mod. From a n+1 a n mod n for all n we get a m a n mod n for all m > n, so α a n mod n by letting m. At n = 1 we get α a mod. For general n, α a n mod n = f(α) f(a n ) 0 mod n = f(α) 1 n. Since this estimate holds for all n, f(α) = 0. 1 The identity (2.1) is similar to Taylor s formula: f(x + h) = f(x) + f (x)h + (f (x)/2!)h 2 +. The catch is that terms in Taylor s formula have factorials in the denominator, which can require some extra care when reducing modulo owers of : think about f (x)/2! mod 2, for instance. What (2.1) essentially does is extract the first two terms of Taylor s formula and say that what remains has -adic integral coefficients, so (2.1) can be reduced mod, or mod n for all n 1.

HENSEL S LEMMA 3 It remains to show α is the unique root of f(x) in Z that is congruent to a mod. Suose f(β) = 0 and β a mod. To show β = α we will show β α mod n for all n. The case n = 1 is clear since α and β are both congruent to a mod. Suose n 1 and we know that β α mod n. Then β = α + n γ n with γ n Z, so a calculation similar to (2.3) imlies f(β) = f(α + n γ n ) f(α) + f (α) n γ n mod n+1. Both α and β are roots of f(x), so 0 f (α) n γ n mod n+1. Thus f (α)γ n 0 mod. Since f (α) f (a) 0 mod, we have γ n 0 mod, which imlies β α mod n+1. Remark 2.3. An argument similar to the last aragrah shows for all n 1 that f(x) has a unique root mod n that reduces to a mod. So in Theorem 2.1 we can think about the uniqueness of the lifting of the mod root in two ways: it has a unique lifting to a root in Z or it has a unique lifting to a root in Z/( n ) for all n 1. Here are five alications of Hensel s lemma. Examle 2.4. Let f(x) = X 3 2. We have f(3) 0 mod 5 and f (3) 0 mod 5, Therefore Hensel s lemma with initial aroximation a = 3 tells us there is a unique cube root of 2 in Z 5 that is congruent to 3 mod 5. Exlicitly, it is 3 + 2 5 2 + 2 5 3 + 3 5 4 +. Examle 2.5. Let f(x) = X 3 X 2. We have f(0) 0 mod 2 and f(1) 0 mod 2, while f (0) 1 mod 2 and f (1) 0 mod 2. Therefore Hensel s lemma with initial aroximation a = 0 imlies there is a unique α Z 2 such that f(α) = 0 and α 0 mod 2. Exlicitly, α = 2 + 2 2 + 2 4 + 2 7 +. Although 1 is a root of f(x) mod 2, it does not lift to a root in Z 2 since it doesn t even lift to a root mod 4: if f(β) = 0 and β 1 mod 2 then f(β) 0 mod 4 and β 1 or 3 mod 4, but f(1) 2 mod 4 and f(3) 2 mod 4. Examle 2.6. For each ositive integer nnot divisible by and each u 1 mod Z, u is an nth ower in Z. Aly Hensel s lemma to f(x) = X n u with initial aroximation a = 1: f(1) = 1 u 0 mod and f (1) = n 0 mod. Therefore there is a unique solution to α n = u in Z such that α 1 mod. Examle 1.1 is the case u = 7, = 3, and n = 2: 7 has a unique 3-adic square root that is 1 mod 3. Examle 2.7. For an odd rime, suose u Z is a square mod. We will show u is a square in Z. For examle, 2 is a square mod 7 since 2 3 2 mod 7, and it will follow that 2 is a square in Z 7. Write u a 2 mod, so a 0 mod. For the olynomial f(x) = X 2 u we have 0 mod and f (a) = 2a 0 mod, since is not 2, so Hensel s lemma tells us that f(x) has a root in Z that reduces to a mod, which means u is a square in Q. Conversely, if u Z is a -adic square, say u = v 2, then 1 = v 2, so v Z and u v 2 mod. Thus the elements of Z that are squares in Q are recisely those that reduce to squares mod. For examle, the nonzero squares mod 7 are 1, 2, and 4, so u Z 7 is a 7-adic square if and only if u 1, 2, or 4 mod 7. This result can have roblems when = 2 because 2a 0 mod 2. In fact the lifting of a square root mod 2 to a 2-adic square root really does have a roblem: 3 1 2 mod 2 but 3 is not a square in Z 2 since 3 is not a square mod 4 (the squares mod 4 are 0 and 1). And 3 is not a square in Q 2 either because a hyothetical square root in Q 2 would have to be in Z 2 : if α 2 = 3 in Q 2 then α 2 2 = 3 2 = 1, so α 2 = 1, and thus α Z 2 Z 2. Examle 2.8. For each integer k between 0 and 1, k k mod. Letting f(x) = X X, we have f(k) 0 mod and f (k) = k 1 1 1 0 mod. Hensel s lemma imlies that there is a unique ω k Z such that ω k = ω k and ω k k mod. For instance,

4 KEITH CONRAD ω 0 = 0 and ω 1 = 1. When > 2, ω 1 = 1. Other ω k for > 2 are more interesting. For = 5, ω k is a root of X 5 X = X(X 4 1) = X(X 1)(X + 1)(X 2 + 1). Thus ω 2 and ω 3 are square roots of 1 in Z 5 : ω 2 = 2 + 5 + 2 5 2 + 5 3 + 3 5 4 + 4 5 5 + 2 5 6 + 3 5 7 +, ω 3 = 3 + 3 5 + 2 5 2 + 3 5 3 + 5 4 + 2 5 6 + 5 7 +. The numbers ω k for 0 k 1 are distinct since they are already distinct when reduced mod, so X X = X(X 1 1) slits comletely in Z [X]. Its roots in Z are 0 and -adic ( 1)th roots of unity. 3. Roots of unity in Q via Hensel s Lemma Hensel s lemma is often considered to be a method of finding roots to olynomials, but that is just one asect: the existence of a root. There is also a uniqueness art to Hensel s lemma: it tells us there is a unique root within a certain distance of an aroximate root. We ll use the uniqueness to find all of the roots of unity in Q. Theorem 3.1. The roots of unity in Q are the ( 1)th roots of unity for odd and ±1 for = 2. Proof. If x n = 1 in Q then x n = 1, so x = 1. This means every root of unity in Q lies in Z. Therefore we work in Z right from the start. First let s consider roots of unity of order relatively rime to. Assume ζ 1 and ζ 2 are roots of unity in Z with order rime to. Letting m be the roduct of the orders of these roots of unity, they are both roots of f(x) = X m 1 and m is rime to. Since f (ζ i ) = mζi m 1 = 1, the uniqueness asect of Hensel s lemma imlies that the only root α of X m 1 satisfying α ζ 1 < 1 is ζ 1. So if ζ 2 ζ 1 mod Z then ζ 2 = ζ 1 : distinct roots of unity in Z having order rime to must be incongruent mod. In Examle 2.8 we found in each nonzero coset mod Z a root of X 1 1, and 1 is rime to. Therefore each congruence class mod Z contains a ( 1)th root of unity, so the only roots of unity of order rime to in Q are the roots of X 1 1. Now we consider roots of unity of -ower order. We will show the only th root of unity in Z is 1 for odd and the only 4th roots of unity in Z 2 are ±1. This imlies the only th ower roots of unity in Z are 1 for odd and ±1 for = 2. (For instance, if there were a nontrivial -th ower root of unity in Q for 2 then there would be a root of unity in Q of order, but we re going to show there aren t any of those.) We first consider odd and suose ζ = 1 in Z. Since ζ ζ mod Z, we have ζ 1 mod Z. For the olynomial f(x) = X 1 we have f (ζ) = ζ 1 = 1/, so the uniqueness in Hensel s lemma imlies that the ball {x Q : x ζ < f (ζ) } = {x Q : x ζ 1/ 2 } = ζ + 2 Z contains no th root of unity excet for ζ. We will now show that ζ 1 mod 2 Z, so 1 is in that ball and therefore ζ = 1. Write ζ = 1 + y, where y Z. Then 1 ( ) 1 = ζ = (1 + y) = 1 + (y) + (y) k + (y). k For 2 k 1, ( k) is divisible by, so all terms in the sum over 2 k 1 are divisible by 3. The last term (y) is also divisible by 3 (since 3). Therefore if we reduce the above equation modulo 3 we get k=2 1 1 + 2 y mod 3 = 0 2 y mod 3.

HENSEL S LEMMA 5 Therefore y is divisible by, so ζ 1 mod 2 and this forces ζ = 1. Now we turn to = 2. We want to show the only 4th roots of unity in Z 2 are ±1. This won t use Hensel s lemma. If ζ Z 2 is a 4th root of unity and ζ ±1 then ζ2 = 1, so ζ 2 1 mod 4Z 2. However, ζ Z 2 = ζ 1 or 3 mod 4Z 2 = ζ 2 1 mod 4Z 2 and 1 1 mod 4Z 2. For each rime, a root of unity is a (unique) roduct of a root of unity of -ower order and a root of unity of order rime to, so the only roots of unity in Q are the roots of X 1 1 for 2 and ±1 for = 2. 4. A stronger version of Hensel s Lemma The hyotheses of Theorem 2.1 are 0 mod and f (a) 0 mod. This means a mod is a simle root of f(x) mod. We will now discuss a more general version of Hensel s lemma than Theorem 2.1. It can be alied to cases where a mod is a multile root of f(x) mod : 0 mod and f (a) 0 mod. This will allow us to describe squares in Z 2 and, more generally, th owers in Z. Theorem 4.1 (Hensel s lemma). Let f(x) Z [X] and a Z satisfy < f (a) 2. There is a unique α Z such that f(α) = 0 and α a < f (a). Moreover, (1) α a = /f (a) < f (a), (2) f (α) = f (a). Since f (a) Z, f (a) 1. If f (a) = 1 then the hyotheses of Theorem 4.1 reduce to those of Theorem 2.1: saying < 1 and f (a) = 1 means 0 mod and f (a) 0 mod. Theorem 4.1 actually goes beyond the conclusions of Theorem 2.1 when the hyotheses of Theorem 2.1 hold, since we learn in Theorem 4.1 exactly how far away the root α is from the aroximate root a. But the main oint of Theorem 4.1 is that it allows for the ossibility that f (a) < 1, which isn t covered by Theorem 2.1 at all. We will rove Theorem 4.1 by two methods, in Sections 5 and 6. Here are some alications where the olynomial has a multile root mod. Examle 4.2. Let f(x) = X 4 7X 3 +2X 2 +2X +1. Then f(x) (X +1) 2 (X 2 +1) mod 3 and we notice 2 mod 3 is a double root. Since f(2) 3 = 1/27 and f (2) 3 = 42 3 = 1/3, the condition f(2) 3 < f (2) 2 3 holds, so there is a unique root α of f(x) in Z 3 such that α 2 3 < 1/3, i.e., α 2 mod 9. In fact there are two roots of f(x) in Z 3 : 2 + 3 + 2 3 2 + 2 3 3 + 2 3 4 + 2 3 6 + and 2 + 3 2 + 3 4 + 2 3 5 + 2 3 6 +. The second root reduces to 2 mod 9, and is α. The first root reduces to 5 mod 9, and its existence can be verified by checking f(5) 3 = 1/27 < f (5) 2 3 = 1/9. Examle 4.3. Let f(x) = X 3 10 and g(x) = X 3 5. We have f(x) (X 1) 3 mod 3 and g(x) (X 2) 3 mod 3: 1 is an aroximate 3-adic root of f(x) and 2 is an aroximate 3-adic root of g(x). We want to see if they can be refined to genuine 3-adic roots. The basic form of Hensel s lemma in Theorem 2.1 can t be used since the olynomials do not have simle roots mod 3. Instead we will try to use the stronger form of Hensel s lemma in Theorem 4.1. Since f(1) 3 = 1/9 and f (1) 3 = 1/3, we don t have f(1) 3 < f (1) 2 3, so Theorem 4.1 can t be used on f(x) with a = 1. However, f(4) 3 = 1/27 and f (4) 3 = 1/3, so we can

6 KEITH CONRAD use Theorem 4.1 on f(x) with a = 4: there is a unique root α of X 3 10 in Z 3 satisfying α 4 3 < 1/3, so α 4 mod 9. The exansion of α begins as 1 + 3 + 3 2 + 2 3 6 + 3 7 +. Turning to g(x), we have g(2) 3 = 1/3 and g (2) 3 = 1/9, so we can t aly Theorem 4.1 with a = 2. In fact there is no root of g(x) in Q 3. If there were a root α in Q 3 then α 3 = 5, so α 3 = 1, and thus α Z 3. Then α 3 5 mod 3 n, so 5 would be a cube modulo every ower of 3. But 5 is not a cube mod 9 (the only cubes mod 9 are 0, 1, and 8). Therefore the mod 3 root of X 3 5 does not lift to a 3-adic root of X 3 5. Theorem 4.4. If u Z 2 then u is a square in Q 2 if and only if u 1 mod 8Z 2. Proof. If u = v 2 in Q 2 then 1 = v 2 2, so v Z 2. In Z 2/8Z 2 = Z/8Z, the units are 1, 3, 5, and 7, whose squares are all congruent to 1 mod 8, so u = v 2 1 mod 8Z 2. To show, conversely, that all u Z 2 satisfying u 1 mod 8Z 2 are squares in Z 2, let f(x) = X2 u and use a = 1 in Theorem 4.1. We have f(1) 2 = 1 u 2 1/8 and f (1) 2 = 2 2 = 1/2, so f(1) 2 < f (1) 2 2. Therefore X2 u has a root in Z 2, so u is a square in Z 2. Theorem 4.5. If 2 and u Z, then u is a th ower in Q if and only if u is a th ower modulo 2. Theorem 4.5 is false for = 2: the criterion for an element of Z 2 to be a 2-adic square needs modulus 2 3, not modulus 2 2. For instance, 5 is a square mod 4 but 5 is not a 2-adic square since 5 1 mod 8. Theorem 4.5 exlains what we found in Examle 4.3: 10 is a 3-adic cube and 5 is not, since 10 mod 9 is a cube and 5 mod 9 is not a cube. Proof. If u = v in Q then 1 = v, so v Z : we only need to look for th roots of u in Z. Let f(x) = X u. In order to use Theorem 4.1 on f(x), we seek an a Z such that < f (a) 2. This means a u < a 1 2 = 1/ 2, or equivalently a u mod 3. So rovided u is a th ower modulo 3, Theorem 4.1 tells us that X u has a root in Z, so u is a th ower. The criterion in the theorem, however, has modulus 2 rather 3. We need to do some work to bootstra an aroximate th root from modulus 2 to modulus 3 in order for Theorem 4.1 to aly. Suose u a mod 2 for some a Z. Then a Z and u/a 1 mod 2. Write u/a 1 + 2 c mod 3, where 0 c 1. By the binomial theorem, ( ) (1 + c) = 1 + (c) + (c) k. k k=2 The terms for k 3 are obviously divisible by 3 and the term at k = 2 is ( ) 2 (c) 2 = 1 2 3 c 2, which is also divisible by 3 since > 2. Therefore (1 + c) 1 + 2 c mod 3, so u/a (1 + c) mod 3. Now we can write u a (1 + c) 1 mod 3. From Theorem 4.1, a -adic integer that is congruent to 1 mod 3 is a th ower (see the first aragrah again). Thus u/(a (1 + c) ) is a th ower, so u is a th ower. Remark 4.6. Hensel s lemma in Theorem 4.1 guarantees a root mod lifts uniquely to a root in Z under weaker conditions than for Hensel s lemma in Theorem 2.1, but it does not guarantee a unique lift mod n, unlike for Theorem 2.1 (Remark 2.3). For examle, for 2 the unique solution to x = 1 in Z is 1, but for n > 1 the congruence x 1 mod n has solutions: 1 + n 1 b mod n for 0 b 1. The unique solution 1 mod of x 1 mod lifts to solutions of x 1 mod n for each n > 1. (What haens if = 2?) This is consistent with a unique lift to a root in Z since 1 + n 1 b 1 as n.

HENSEL S LEMMA 7 5. First Proof of Theorem 4.1: Newton s Method Our first roof of Theorem 4.1 will use Newton s method and is a modification of [4, Cha. II, 2, Pro. 2]. Proof. As in Newton s method from real analysis, define a sequence {a n } in Q by a 1 = a and (5.1) a n+1 = a n f(a n) f (a n ) for n 1. Set t = /f (a) 2 < 1. We will show by induction on n that (i) a n 1, i.e., a n Z, (ii) f (a n ) = f (a 1 ), (iii) f(a n ) f (a 1 ) 2 t 2n 1. For n = 1 these conditions are all clear. Note in articular that we have equality in (iii) and f (a 1 ) 0 since f(a 1 ) < f (a 1 ) 2. For the inductive ste, we need two olynomial identities. The first one is f(x + Y ) = f(x) + f (X)Y + g(x, Y )Y 2 for some g(x, Y ) Z [X, Y ], which is derived in exactly the same way as in the roof of Theorem 2.1. Then (5.2) x, y Z = f(x + y) = f(x) + f (x)y + zy 2, where z Z. The second olynomial identity we need is that for all F (X) Z [X], F (X) F (Y ) = (X Y )G(X, Y ) for some G(X, Y ) Z [X, Y ]. This comes from X Y being a factor of X i Y i for all i 1. Writing F (X) = m i=0 b ix i, m F (X) F (Y ) = b i (X i Y i ) and we can factor X Y out of each term on the right. For x and y in Z, G(x, y) Z, so (5.3) x, y Z = F (x) F (y) = x y G(x, y) x y. Assume (i), (ii), and (iii) are true for n. To rove (i) for n + 1, first note a n+1 is defined since f (a n ) 0 by (ii). To rove (i) it suffices to show f(a n )/f (a n ) 1. Using (ii) and (iii) for n, we have f(a n )/f (a n ) = f(a n )/f (a 1 ) f (a 1 ) t 2n 1 1. To rove (ii) for n + 1, (iii) for n imlies f(a n ) < f (a 1 ) 2 since t < 1, so by (5.3) with F (X) = f (X), f (a n+1 ) f (a n ) a n+1 a n = f(a n) f (a n ) = f(a n) f (a 1 ) < f (a 1 ) so f (a n+1 ) = f (a 1 ). To rove (iii) for n + 1, we use (5.2) with x = a n and y = f(a n )/f (a n ): ( f(a n+1 ) = f(x + y) = f(a n ) + f (a n ) f(a ) ( ) n) f(an ) 2 ( ) f(an ) 2 f + z (a n ) f = z (a n ) f, (a n ) where z Z. Thus, by (iii) for n, f(a n+1 ) f(a n ) f (a n ) This comletes the induction. 2 i=1 = f(a n) 2 f (a 1 ) 2 f (a 1 ) 4 t 2n f (a 1 ) 2 = f(a 1 ) 2 t 2n.

8 KEITH CONRAD Now we show {a n } is Cauchy in Q. From the recursive definition of this sequence, (5.4) a n+1 a n = f(a n ) f (a n ) = f(a n) f f (a 1 ) t 2n 1, (a 1 ) where we used (ii) and (iii). Thus {a n } is Cauchy. Let α be its limit, so α 1 by (i), i.e., α Z. Letting n in (ii) and (iii) we get f (α) = f (a 1 ) = f (a) and f(α) = 0. To show α a = /f (a), it s true if = 0 since then a n = a 1 = a for all n, so α = a. If 0 then we will show a n a = /f (a) for all n 2 and let n. When n = 2 use the definition of a 2 in terms of a 1 = a. For all n 2, by (5.4) (5.5) a n+1 a n f (a 1 ) t 2n 1 f (a 1 ) t 2 < f (a 1 ) t = f (a) t = f (a), where t (0, 1) since 0. 2 If a n a = /f (a) then a n+1 a n < a n a by (5.5), so a n+1 a = (a n+1 a n ) + (a n a) = a n a = /f (a). The last thing to do is show α is the only root of f(x) in the ball {x Z : x a < f (a) }. This will not use anything about Newton s method. Assume f(β) = 0 and β a < f (a). Since α a < f (a) we have β α < f (a). Write β = α + h, so h Z. Then by (5.2), 0 = f(β) = f(α + h) = f(α) + f (α)h + zh 2 = f (α)h + zh 2 for some z Z. If h 0 then f (α) = zh, so f (α) h = β α < f (a). But f (α) = f (a), so we have a contradiction. Thus h = 0, i.e., β = α. Before we give a second roof of Theorem 4.1, it s worth noting that the a n s converge to α very raidly. From the inequality a n+1 a n f (a 1 ) t 2n 1 for all n 1 we obtain by the strong triangle inequality a m a n f (a 1 ) t 2n 1 for all m > n. Letting m, (5.6) α a n f (a 1 ) t 2n 1 = f (a) t 2n 1 = f 2 n 1 (a) f (a) 2. Since /f (a) 2 < 1, the exonent 2 n 1 tells us that the number of initial -adic digits in a n that agree with those in the limit α is at least doubling at each ste. Examle 5.1. Let f(x) = X 2 7 in Q 3 [X]. It has two roots in Z 3 : α 1 = 1 + 3 + 3 2 + 2 3 4 + 2 3 7 + 3 8 + 3 9 +, α 2 = 2 + 3 + 3 2 + 2 3 3 + 2 3 5 + 2 3 6 + 3 8 + 3 9 +. Starting with a 1 = 1, for which f(a 1 )/f (a 1 ) 2 3 = 1/3, Newton s recursion (5.1) has limit α where α a 1 3 < f (a 1 ) 3 = 1, so α a 1 mod 3. Thus a n α 1. For examle, a 4 = 977 368 = 1 + 3 + 32 + 2 3 4 + 2 3 7 + 3 9 + 3 10 +, which has the same 3-adic digits as α 1 u through terms including 3 7 (the first 8 digits). The estimate in (5.6) says α 1 a n 3 f (a 1 ) 3 (1/3) 2n 1 = (1/3) 2n 1 for all n. Using a comuter, this inequality is an equality for 1 n 10. Examle 5.2. Let f(x) = X 2 17 in Q 2 [X]. It has two roots in Z 2 : α 1 = 1 + 2 3 + 2 5 + 2 6 + 2 7 + 2 9 +, α 2 = 1 + 2 + 2 2 + 2 4 + 2 8 +. 2 The case = 0 was not covered in an earlier draft and this omission was found when this roof was formalized using the Lean theorem rover [5,. 8].

HENSEL S LEMMA 9 Using Newton s recursion (5.1) for f(x) with initial seed a Z 2, we need a2 17 2 < 2a 2 2, which is the same as a 2 17 mod 8, and this congruence works for all a Z 2. Therefore (5.1) with a 1 Z 2 converges to α 1 or α 2. Since f (a) 2 = 1/2 for a Z 2, (5.1) with a 1 = a has a limit α satisfying α a 2 < f (a) 2 = 1/2, so α a mod 4: if a 1 mod 4 then α = α 1, and if a 3 mod 4 then α = α 2. By (5.6), α a n 2 f (a) 2 ( /f (a) 2 2 ) 2n 1 = (1/2)(4 a 2 17 2 ) 2n 1. For a few choices of a, this inequality is an equality for 1 n 10: When a = 1, α 1 a n 2 = (1/2) 2n +1 for 1 n 10. When a = 3, α 2 a n 2 = (1/2) 2n 1 +1 for 1 n 10. When a = 5, α 1 a n 2 = (1/2) 2n 1 +1 for 1 n 10. Examle 5.3. Let s solve X 2 1 = 0 in Q 3. This might seem silly, since we know the solutions are ±1, but let s check how an initial aroximation affects the 3-adic limit. We use a 1 = 2. (If we used a 1 = 1 then a n = 1 for all n, which is not interesting.) When f(x) = X 2 1 the recursion for Newton s method is a n+1 = a n f(a n) f (a n ) = a n a2 n 1 2a n = a2 n + 1 2a n = 1 2 ( a n + 1 ). a n Since f(2) 3 = 1/3 < f (2) 2 3, Newton s recursion with a 1 = 2 converges in Q 3. What is the limit? Since a 1 1 mod 3, we have a n 1 mod 3 for all n by induction: if a n 1 mod 3 then a n+1 = (1/2)(a n + 1/a n ) (1/2)( 1 + 1/( 1)) (1/2)( 2) 1 mod 3. Thus lim n a n = 1 in Q 3. What makes this interesting is that in R all a n are ositive so the sequence of rational numbers {a n } converges to 1 in R and to 1 in Q 3. The table below illustrates the raid convergence in R and Q 3 (the 3-adic exansion of 1 is 2 = 2222...). n a n Decimal arox. 3-adic arox. 1 2 2.00000000 20000000 2 5/4 1.25000000 22020202 3 41/40 1.02500000 22220222 4 3281/3280 1.00030487 22222222 6. Second Proof of Theorem 4.1: Contraction Maings Newton s method roduces a sequence converging to a root of f(x) by iterating the function x f(x)/f (x) with initial value a 1 = a where < f (a) 2. A root can also be found with a different iteration ϕ(x) = x f(x)/f (a), where again < f (a) 2. The denominator is f (a), not f (x), so it doesn t change. We will show ϕ is a contraction maing on a suitable ball around a. Then the contraction maing theorem will imly ϕ has a (unique) fixed oint α in that ball. 3 The condition ϕ(α) = α says α f(α)/f (a) = α, so f(α) = 0 and we have a root of f(x). Filling in the details leads to the following second roof of Theorem 4.1. If you re not interested in a second roof, go to the next section. Proof. For r [0, 1] to be determined, set B a (r) = {x Q : x a r}, so B a (r) Z. Set ϕ(x) = x f(x) f (a). We seek an r such that ϕ mas B a (r) back to itself and is a contraction on that ball. To show ϕ is a contraction on some ball around a, we want to estimate ϕ(x) ϕ(y) = f(x) f(y) x y f (a) 3 There is an analogous method in real analysis to simlify Newton s method to the contraction maing theorem by fixing the denominator in the recursion. See [2, Thm. 1.2,. 164].

10 KEITH CONRAD for all x and y near a in order to make this λ x y for some λ < 1. Write f(x) as a olynomial in X a, say f(x) = d i=0 b i(x a) i. Then b 0 =, b 1 = f (a), and b i Z for i > 1. For all x and y in Q, so f(x) f(y) = b i ((x a) i (y a) i ) = f (a)(x y) + i=1 (6.1) ϕ(x) ϕ(y) = x y f(x) f(y) f (a) = 1 f (a) b i ((x a) i (y a) i ), i=2 b i ((x a) i (y a) i ). (If d 1 then the sum on the right is emty.) In the olynomial identity i=2 i 1 X i Y i = (X Y ) X i 1 j Y j for i 2 set X = x a and Y = y a. Then i 1 (x a) i (y a) i = x y (x a) i 1 j (y a) j j=0 j=0 x y max x a i 1 j y a j 0 j i 1 x y max( x a, y a ) i 1 x y max( x a, y a ) when x a and y a are both at most 1. Therefore from (6.1), so for λ [0, 1), x a, y a 1 = ϕ(x) ϕ(y) x y max( x a, y a ) f (a), (6.2) x a, y a λ f (a) = ϕ(x) ϕ(y) λ x y. If we can find λ [0, 1), erhas deending on a and f(x), such that (6.3) x a λ f (a) = ϕ(x) a λ f (a) then (6.2) will tell us that ϕ is a contraction maing on the closed ball around a of radius λ f (a). We will see that if < f (a) 2 then a choice for λ is /f (a) 2. In fact, we want to do more: show the condition < f (a) 2 arises naturally by trying to make (6.3) work for some unknown λ. For λ (0, 1), when x a λ f (a) we have ϕ(x) a λ f (a) x a f(x) f (a) λ f (a) f(x) f (a) λ f (a). Returning to the formula f(x) = d i=0 b i(x a) i, where b 0 = and b 1 = f (a), (6.4) f(x) f (a) = f (a) + (x a) + b i f (a) (x a)i. When x a λ f (a), which is less than f (a) 1, we have for i 2 that b i f (x a)i (a) x a 2 f λ 2 f (a) λ f (a), (a) i=2

so by (6.4) HENSEL S LEMMA 11 f(x) f (a) λ f (a) f (a) λ f (a) f (a) 2 λ. To make this occur for some λ < 1 is equivalent to requiring /f (a) 2 < 1. Therefore if < f (a) 2 and we set λ = /f (a) 2, the maing ϕ(x) = x f(x)/f (a) is a contraction on the closed ball around a with radius λ f (a) = /f (a) and contraction constant λ. Any closed ball in Q is comlete, so the contraction maing theorem imlies that the sequence {a n } defined recursively by a 1 = a and (6.5) a n+1 = ϕ(a n ) = a n f(a n) f (a) for n 1 converges to the unique fixed oint of ϕ in B a ( /f (a) ). By the definition of ϕ, a fixed oint of ϕ is the same thing as a zero of f(x), so there is a unique α in Q satisfying f(α) = 0 and α a /f (a). To finish this roof of Theorem 4.1 we need to show α a = /f (a), α is the unique root of f(x) in Z such that α a < f (a), and f (α) = f (a). α a = /f (a) : We have a 2 a = a 2 a 1 = f(a 1 )/f (a) = /f (a), and for all n a n+1 a n = ϕ n (a) ϕ n 1 (a) = ϕ n 1 (ϕ(a)) ϕ n 1 (a) λ n 1 a 2 a 1 < f (a). Then a n a = /f (a) for all n by induction, so α a = /f (a) by letting n. α is the unique root of f(x) such that α a < f (a) : This was roved at the end of the first roof of Theorem 4.1 without needing Newton s method. We won t rewrite the roof. f (α) = f (a) : Since a 1 = a, of course f (a 1 ) = f (a). If f (a n ) = f (a) for some n 1 then f (a n+1 ) f (a n ) a n+1 a n f (a) where the first inequality is from (5.3) and the second inequality is from a n and a n+1 both lying in the closed ball around a of radius /f (a). Thus f (a n+1 ) f (a n ) < f (a) = f (a n ), so f (a n+1 ) = f (a n ) = f (a). We ve shown f (a n ) = f (a) for all n, and letting n gives us f (α) = f (a). It s worthwhile to comare the recursions from Newton s method and from the contraction maing theorem: Newton s method: a n+1 = a n f(a n) f (a n ), with a 1 = a and < f (a) 2. Contraction maing: a n+1 = a n f(a n) f (a), with a 1 = a and < f (a) 2. The difference is the denominators f (a n ) and f (a), and this has a rofound effect on the rate of convergence. In Newton s method, (5.6) tells us (6.6) α a n f 2 n 1 (a) f (a) 2. To find an estimate on α a n from the contraction maing theorem, the recursion given by (6.5) imlies a n+1 a n = ϕ n 1 (a 2 ) ϕ n 1 (a 1 ) a 2 a 1 λ n 1 for all n 1, where

12 KEITH CONRAD a 2 a 1 = /f (a) and λ = /f (a) 2. If m > n the strong triangle inequality imlies a m a n max a j+1 a j a 2 a 1 λ n 1 = n j m 1 f (a) λ n 1. Letting m yields (6.7) α a n f (a) f (a) 2 Since the uer bound in (6.6) goes to 0 much faster than the uer bound in (6.7), we can anticiate that α a n would shrink to 0 faster when {a n } is roduced from Newton s recursion comared to the contraction maing recursion. In articular, when /f (a) 2 = 1/, the decay bound in (6.6) goes to 0 like (1/) 2n 1 while in (6.7) the decay bound goes to 0 like (1/) n, so we exect there to be a doubling of correct -adic digits at each ste in Newton s recursion while we get just one new correct -adic digit at each ste by the contraction maing recursion. Let s see an examle. Examle 6.1. Let f(x) = X 2 7 with a = 1, so /f (a) 3 = 1/3. The sequence {a n } roduced from the contraction maing recursion (6.5) with a 1 = 1 has a limit α and with a comuter we find α a n 3 = (1/3) n for 1 n 10. The sequence {a n } based on Newton s recursion in Examle 5.1 with a 1 = 1 has the same limit α, but a comuter tells us that α a n 3 = (1/3) 2n 1 for 1 n 10, which is much smaller than (1/3) n. n 1 7. Necessity of < f (a) 2 In Theorem 4.1 we have f (α) = f (a) 0, so Hensel s lemma roduces a simle root of f(x) in Z that is close to a. The criterion < f (a) 2 in Theorem 4.1 is not just a sufficient condition for there to be a simle root of f(x) near a but it is also more or less necessary, as the next theorem makes recise. Theorem 7.1. If f(x) Z [X] has a simle root α in Z, then for all a Z that are close enough to α we have f (a) = f (α) and < f (a) 2. In articular, these conditions hold when a α < f (α). Proof. We have f (α) f (a) α a < f (α), so f (a) = f (α). Also = f(α + (a α)) = f(α) + f (α)(a α) + z(a α) 2 = f (α)(a α) + (a α) 2 z for some z Z. Each term on the right side has absolute value less than f (α) 2, so also < f (α) 2 = f (a) 2. 8. Hensel s lemma beyond Q Both of our roofs of Hensel s lemma, as well as the roof of Theorem 7.1, work in every field comlete with resect to an absolute value satisfying the strong triangle inequality. Theorem 8.1. Let K be a field that is comlete with resect to an absolute value such that x + y max( x, y ) for all x and y in K. Set o = {x K : x 1}. If a olynomial f(x) has coefficients in o and some a o satisfies < f (a) 2 then there is a unique α o such that f(α) = 0 and α a < f (a). Moreover, α a = /f (a) < f (a) and f (α) = f (a). Conversely, if f(x) has a simle root α o, then for a K such that a α < f (α) we have f (a) = f (α) and < f (a) 2..

HENSEL S LEMMA 13 For a version of Hensel s lemma where Z is relaced by a comlete local ring (ossibly not an integral domain, so division is more delicate), see [3, Theorem 7.3]. For a form of Hensel s lemma dealing with zeros of several olynomials (or ower series) in several variables, see [1, Cha. III, 4.3]. References [1] N. Bourbaki, Commutative Algebra, Sringer-Verlag, New York, 1989. [2] C. H. Edwards, Jr., Advanced Calculus of Several Variables, Academic Press, New York, 1973. [3] D. Eisenbud, Commutative Algebra with a View to Algebraic Geometry, Sringer-Verlag, New York, 1995. [4] S. Lang, Algebraic Number Theory, 3rd ed., Sringer-Verlag, New York, 1994. [5] R. Y. Lewis, A Formal Proof of Hensel s Lemma over the -adic Integers, online at htt://roberty lewis.com/adics/adics.df.