Irreducible Polynomials over Finite Fields


Chapter 4. Irreducible Polynomials over Finite Fields

4.1 Construction of Finite Fields

As we will see, modular arithmetic aids in testing the irreducibility of polynomials and even in completely factoring polynomials in Z[x]. If we expect a polynomial f(x) to be irreducible, for example, it is not unreasonable to try to find a prime p such that f(x) is irreducible modulo p. If we can find such a prime p and p does not divide the leading coefficient of f(x), then f(x) is irreducible in Z[x] (see the example on page 4). It is the case that there exist polynomials which are irreducible in Z[x] but are reducible modulo every prime (see Exercise 4.1); but, as it turns out, one can show that such polynomials are rare, and verifying that a polynomial f(x) is irreducible by trying to find a prime p for which f(x) is irreducible modulo p will almost always work rather quickly (see Chapter 5). This is already strong motivation for looking into the idea of using modular arithmetic, but in this chapter we plan to explore other aspects of modular arithmetic as well.

We begin with a definition. Let p be a prime, and let f(x) ∈ Z[x]. Suppose further that f(x) ≢ 0 (mod p). We say that u(x) ≡ v(x) (mod p, f(x)), where u(x) and v(x) are in Z[x], if there exist g(x) and h(x) in Z[x] such that

u(x) = v(x) + f(x)g(x) + p h(x).

In other words, u(x) ≡ v(x) (mod p, f(x)) if u(x) − v(x) is in the ideal generated by p and f(x) in the ring Z[x]. One easily checks that if u(x) ≡ v(x) (mod p, f(x)) and v(x) ≡ w(x) (mod p, f(x)), then u(x) ≡ w(x) (mod p, f(x)). Suppose that u_1(x) ≡ v_1(x) (mod p, f(x)) and u_2(x) ≡ v_2(x) (mod p, f(x)). Then

u_1(x) ± u_2(x) ≡ v_1(x) ± v_2(x) (mod p, f(x)).
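Computing with these congruences is mechanical when f(x) is monic. The following sketch (an illustration, not from the book; polynomials are represented as coefficient lists with the constant term first) reduces u(x) modulo (p, f(x)) by first dividing by the monic f(x) over Z and then reducing the remainder's coefficients modulo p.

```python
def reduce_mod_p_f(u, f, p):
    """Return the representative of u(x) mod (p, f(x)) with degree < deg f
    and coefficients in {0, 1, ..., p-1}; f must be monic."""
    n = len(f) - 1                      # deg f
    u = list(u) + [0] * max(0, n - len(u))
    # Divide u by the monic f over Z, clearing top coefficients downward.
    for i in range(len(u) - 1, n - 1, -1):
        c = u[i]
        for j, fj in enumerate(f):
            u[i - n + j] -= c * fj
    return [c % p for c in u[:n]]

# Example: with f(x) = x^2 + 1 and p = 3, we have x^3 = x*f(x) - x,
# so x^3 is congruent to -x, i.e. 2x, mod (3, x^2 + 1).
print(reduce_mod_p_f([0, 0, 0, 1], [1, 0, 1], 3))  # -> [0, 2]
```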

Also, using

u_1(x)u_2(x) − v_1(x)v_2(x) = u_1(x)(u_2(x) − v_2(x)) + v_2(x)(u_1(x) − v_1(x)),

we easily see that u_1(x)u_2(x) ≡ v_1(x)v_2(x) (mod p, f(x)). We note that if u(x) ≡ v(x) (mod p), then u(x) ≡ v(x) (mod p, f(x)) by taking g(x) ≡ 0, and if u(x) ≡ v(x) (mod f(x)), then u(x) ≡ v(x) (mod p, f(x)) by taking h(x) ≡ 0. A further important and useful observation is that u(x) ≡ 0 (mod p, f(x)) if and only if f(x) is a factor of u(x) modulo p.

Let f(x) be monic. If u(x) ≡ v(x) (mod p, f(x)) where u(x) and v(x) are in Z[x], then there are polynomials g_0(x) and h_0(x) in Z[x] such that

u(x) − v(x) = f(x)g_0(x) + p h_0(x).

Recall (see Exercise 1.14(b)) that when dividing a polynomial in Z[x] by a monic polynomial in Z[x], the quotient and remainder will be in Z[x]. It follows that there are polynomials q(x) and r(x) in Z[x], with r(x) ≡ 0 or deg r < deg f, such that

h_0(x) = f(x)q(x) + r(x).

Taking g(x) = g_0(x) + p q(x) and h(x) = r(x), we deduce that if u(x) ≡ v(x) (mod p, f(x)), then there are polynomials g(x) and h(x) in Z[x], with h(x) ≡ 0 or deg h < deg f, such that

u(x) − v(x) = f(x)g(x) + p h(x).

A simple argument shows further that such g(x) and h(x) are unique given u(x), v(x), f(x), and p.

We will also make use of the following convention. Let p be a prime, and suppose f(x) = Σ_{j=0}^{n} a_j x^j ∈ Z[x] with f(x) ≢ 0 (mod p). Then we refer to the degree of f(x) modulo p as the largest integer k ≤ n for which p does not divide a_k. Thus, for example, 2x^3 + 3x^2 + 4 is a polynomial of degree 2 modulo 2. With the added condition that a_n = 1, we easily see that any g(x) ∈ Z[x] is congruent (mod p, f(x)) to one of the p^n polynomials of degree ≤ n − 1 with coefficients from {0, 1, ..., p − 1}. Also, these p^n polynomials are pairwise incongruent (mod p, f(x)). In other words, we can view these p^n polynomials as representatives of the p^n distinct residue classes (mod p, f(x)).

Consider now the possibility that a_n ≠ 1, and let k denote the degree of f(x) modulo p. Exercise 4.6 implies that arithmetic (mod p, f(x)) is the same as arithmetic (mod p, f_1(x)), where f_1(x) ≡ f(x) (mod p) and deg f_1(x) = k.
Exercise 4.5 further implies that arithmetic (mod p, f_1(x)) is the same as arithmetic (mod p, f_2(x)), where f_2(x) is an appropriate monic polynomial with deg f_2(x) = k. It follows that there are precisely p^k distinct residue classes (mod p, f(x)), with representatives given by the polynomials of degree ≤ k − 1 with coefficients from {0, 1, ..., p − 1}.

Theorem 4.1.1. Let p be a prime. If f(x) ∈ Z[x] is of degree n modulo p and f(x) is irreducible modulo p, then

x^(p^n) ≡ x (mod p, f(x)).

We clarify that in Theorem 4.1.1, as is usual, x^(p^n) denotes x raised to the power p^n (and not (x^p)^n). Before we prove this theorem, we consider an example. We show that f(x) = x^p − x − 1 is irreducible modulo p and hence irreducible over Z. Consider u(x)
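Theorem 4.1.1 lends itself to a direct computational check. The sketch below (an illustration, not the book's code; polynomials are coefficient lists over F_p, constant term first) verifies the congruence x^(p^n) ≡ x (mod p, f(x)) by square-and-multiply for the example polynomial f(x) = x^3 + x + 1, which is irreducible modulo 2.

```python
def polymulmod(u, v, f, p):
    """Multiply u(x)v(x) and reduce mod (p, f(x)); f monic of degree n."""
    n = len(f) - 1
    prod = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] = (prod[i + j] + ui * vj) % p
    for i in range(len(prod) - 1, n - 1, -1):   # clear degrees >= n
        c = prod[i]
        for j, fj in enumerate(f):
            prod[i - n + j] = (prod[i - n + j] - c * fj) % p
    return prod[:n]

def powmod(u, e, f, p):
    """Compute u(x)^e mod (p, f(x)) by square-and-multiply."""
    result = [1] + [0] * (len(f) - 2)
    while e:
        if e & 1:
            result = polymulmod(result, u, f, p)
        u = polymulmod(u, u, f, p)
        e >>= 1
    return result

# f(x) = x^3 + x + 1 is irreducible modulo 2, so x^(2^3) ≡ x (mod 2, f(x)).
p, f = 2, [1, 1, 0, 1]
x = [0, 1, 0]
print(powmod(x, p ** 3, f, p) == x)  # -> True
```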

Chapter 11. Computational Considerations

11.1 Berlekamp's Method and Hensel Lifting

We begin this chapter by discussing some important and classical methods for factoring polynomials in Z[x]. From this book's point of view, we are mainly interested in knowing whether a given polynomial in Z[x] is irreducible over Q. If it is reducible over Q, factoring methods will allow us to find a non-trivial factorization of it in Z[x].

A key to factoring techniques for polynomials in Z[x] is to make use of a factoring algorithm in F_p[x], where F_p as before is the field of arithmetic modulo p. We describe an approach due to Berlekamp (1984). This algorithm determines the factorization of a polynomial f(x) in F_p[x], where p is a prime, or more generally over finite fields. For simplicity, we suppose f(x) is monic and squarefree in F_p[x]. Let n = deg f(x). For w(x) ∈ Z[x], define w(x) modd (p, f(x)) as the unique g(x) ∈ Z[x] satisfying deg g < n, each coefficient of g(x) in the set {0, 1, ..., p − 1}, and g(x) ≡ w(x) (mod p, f(x)). Observe that we can view w(x) modd (p, f(x)) as also being in F_p[x]. Let A be the matrix whose jth column is derived from the coefficients of x^((j−1)p) modd (p, f(x)). Specifically, write

x^((j−1)p) modd (p, f(x)) = Σ_{i=1}^{n} a_{ij} x^{i−1}   for 1 ≤ j ≤ n.

Then we set A = (a_{ij})_{n×n}. Note that the first column consists of a one followed by n − 1 zeroes. In particular, (1, 0, 0, ..., 0) will be an eigenvector for A associated with the eigenvalue 1. We are interested in determining the complete set of eigenvectors associated with the eigenvalue 1. In other words, we would like to know the null space of B = A − I, where I represents the n × n identity matrix. It will be spanned by k = n − rank(B) linearly independent vectors, which can be determined by performing row operations on B. Suppose v = (b_1, b_2, ..., b_n) is one of these vectors, and set g(x) = Σ_{j=1}^{n} b_j x^{j−1}. Observe that g(x)^p ≡ g(x) (mod p, f(x)). Moreover, the g(x) with this property are precisely the g(x) with coefficients obtained from the components of vectors v in the null space of B.

Our first result in this chapter connects the factorization of f(x) in F_p[x] with the computation of greatest common divisors of f(x) and the polynomials g(x) − s, where s ∈ {0, 1, ..., p − 1}. These greatest common divisors must be computed in F_p[x], and we recall the discussion following Definition 1.3.1. For clarification in what follows, if u(x) and v(x) are in Z[x] or F_p[x], then we use the notation gcd_p(u(x), v(x)) to denote the greatest common divisor of u(x) and v(x) computed over the field F_p.

Theorem 11.1.1. Let f(x) be a monic polynomial in Z[x]. Suppose f(x) is squarefree in F_p[x]. Let g(x) be a polynomial with coefficients obtained from a vector in the null space of B = A − I as described above. Then

f(x) ≡ ∏_{s=0}^{p−1} gcd_p(g(x) − s, f(x))   (mod p).

Proof. Observe that

g(x)^p − g(x) ≡ ∏_{s=0}^{p−1} (g(x) − s)   (mod p).

Since g(x)^p ≡ g(x) (mod p, f(x)), we deduce that f(x) divides the product on the right in F_p[x]. Since the factors g(x) − s, for s ∈ {0, 1, ..., p − 1}, are pairwise relatively prime in F_p[x], we deduce that each monic irreducible factor of f(x) divides exactly one of the expressions gcd_p(g(x) − s, f(x)) appearing on the right. The result follows.
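Theorem 11.1.1 translates directly into an algorithm: build A, compute the null space of B = A − I over F_p, and split f(x) with the indicated greatest common divisors. The following self-contained sketch (an illustration, not the book's code; polynomials are coefficient lists with the constant term first, and columns are indexed from 0 rather than 1) carries this out for the example treated next, f(x) = x^7 + x^4 + x^3 + x + 1 with p = 2.

```python
def mulmod(u, v, f, p):
    """Multiply u(x)v(x) and reduce mod (p, f(x)); f monic of degree n."""
    n = len(f) - 1
    prod = [0] * (2 * n - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] = (prod[i + j] + ui * vj) % p
    for i in range(len(prod) - 1, n - 1, -1):
        c = prod[i]
        for j, fj in enumerate(f):
            prod[i - n + j] = (prod[i - n + j] - c * fj) % p
    return prod[:n]

def xpow(e, f, p):
    """x^e modd (p, f(x)), by repeated multiplication (e is small here)."""
    r = [1] + [0] * (len(f) - 2)
    x = [0, 1] + [0] * (len(f) - 3)
    for _ in range(e):
        r = mulmod(r, x, f, p)
    return r

def gcd_p(u, v, p):
    """Monic gcd of u(x) and v(x) computed over F_p."""
    def deg(w):
        d = len(w) - 1
        while d >= 0 and w[d] % p == 0:
            d -= 1
        return d
    u, v = [c % p for c in u], [c % p for c in v]
    while deg(v) >= 0:
        if deg(u) < deg(v):
            u, v = v, u
            continue
        du, dv = deg(u), deg(v)
        c = (u[du] * pow(v[dv], -1, p)) % p
        for j in range(dv + 1):
            u[du - dv + j] = (u[du - dv + j] - c * v[j]) % p
    inv = pow(u[deg(u)], -1, p)
    return [(c * inv) % p for c in u[: deg(u) + 1]]

def nullspace(B, p):
    """Basis of the null space of a square matrix B over F_p (row reduction)."""
    n = len(B)
    M = [row[:] for row in B]
    pivots, row = [], 0
    for col in range(n):
        pr = next((r for r in range(row, n) if M[r][col] % p), None)
        if pr is None:
            continue
        M[row], M[pr] = M[pr], M[row]
        inv = pow(M[row][col], -1, p)
        M[row] = [(a * inv) % p for a in M[row]]
        for r in range(n):
            if r != row and M[r][col]:
                M[r] = [(a - M[r][col] * b) % p for a, b in zip(M[r], M[row])]
        pivots.append(col)
        row += 1
    basis = []
    for col in range(n):
        if col not in pivots:
            v = [0] * n
            v[col] = 1
            for r, pc in enumerate(pivots):
                v[pc] = (-M[r][col]) % p
            basis.append(v)
    return basis

p, n = 2, 7
f = [1, 1, 0, 1, 1, 0, 0, 1]        # x^7 + x^4 + x^3 + x + 1
cols = [xpow(j * p, f, p) for j in range(n)]
B = [[(cols[j][i] - (i == j)) % p for j in range(n)] for i in range(n)]
factors = []
for g in nullspace(B, p):           # g encodes g(x) = g[0] + g[1]x + ...
    if all(c == 0 for c in g[1:]):  # constant g(x): only the trivial split
        continue
    for s in range(p):
        d = gcd_p([g[0] - s] + list(g[1:]), f, p)
        if 0 < len(d) - 1 < n:
            factors.append(d)
print(sorted(factors))              # -> [[1, 0, 1, 1], [1, 1, 1, 1, 1]]
```

The two factors printed are x^3 + x^2 + 1 and x^4 + x^3 + x^2 + x + 1, matching the worked example in the text.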
Observe that if g(x) is not a constant, then 1 ≤ deg(g(x) − s) < deg f(x) for each s, so the above claim implies we get a non-trivial factorization of f(x) in F_p[x]. On the other hand, f(x) will not necessarily be completely factored. One can completely factor f(x) by repeating the above procedure for each factor obtained from the claim; but it is simpler to use, and not difficult to show, that if one takes the product of the greatest common divisors of each factor of

f(x) obtained above with h(x) − s for 0 ≤ s ≤ p − 1, where h(x) is obtained from another of the k vectors spanning the null space of B, then one will obtain a new non-trivial factor of f(x) in F_p[x]. Continuing to use all k vectors will produce a complete factorization of f(x) in F_p[x].

As an example, we factor x^7 + x^4 + x^3 + x + 1 in F_2[x]. The matrices A and B are

A =
[1 0 0 0 0 1 0]
[0 0 0 0 1 1 1]
[0 1 0 0 1 0 0]
[0 0 0 0 0 0 1]
[0 0 1 0 1 0 1]
[0 0 0 0 1 0 1]
[0 0 0 1 0 1 0]

and

B =
[0 0 0 0 0 1 0]
[0 1 0 0 1 1 1]
[0 1 1 0 1 0 0]
[0 0 0 1 0 0 1]
[0 0 1 0 0 0 1]
[0 0 0 0 1 1 1]
[0 0 0 1 0 1 1]

Performing elementary row operations in F_2 on the matrix B, one can obtain the matrix

[0 1 0 0 0 0 0]
[0 0 1 0 0 0 1]
[0 0 0 1 0 0 1]
[0 0 0 0 1 0 1]
[0 0 0 0 0 1 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]

Consequently, the dimension of the null space is 2, and it is spanned by the two vectors (1, 0, 0, 0, 0, 0, 0) and (0, 0, 1, 1, 1, 0, 1). The first of these vectors corresponds to the polynomial g(x) = 1. Since this is a constant, Theorem 11.1.1 only leads to the trivial factorization for this choice of g(x). The second eigenvector gives g(x) = x^6 + x^4 + x^3 + x^2. As g(x), in this case, is not constant, we know that Theorem 11.1.1 must lead to a non-trivial factorization of f(x). In fact, we get gcd_2(f(x), g(x)) = x^3 + x^2 + 1 and gcd_2(f(x), g(x) − 1) = x^4 + x^3 + x^2 + x + 1, so that we can deduce

f(x) ≡ (x^3 + x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 2).

Next, we describe Hensel Lifting, which is a procedure for using the factorization of f(x) in F_p[x] (p a prime) to produce a factorization of f(x) modulo

p^k for an arbitrary positive integer k. Suppose that u(x) and v(x) are relatively prime polynomials in F_p[x] for which f(x) ≡ u(x)v(x) (mod p). We continue to view f(x) as being monic, for simplicity, so we take u(x) and v(x) also to be monic. Then Hensel Lifting will produce, for any positive integer k, monic polynomials u_k(x) and v_k(x) in Z[x] satisfying

u_k(x) ≡ u(x) (mod p),   v_k(x) ≡ v(x) (mod p),   and   f(x) ≡ u_k(x)v_k(x) (mod p^k).

When k = 1, it is clear how to choose u_k(x) and v_k(x). For k ≥ 1, we determine the values of u_{k+1}(x) and v_{k+1}(x) from the values of u_k(x) and v_k(x) as follows. We compute

w_k(x) ≡ (1/p^k)(f(x) − u_k(x)v_k(x))   (mod p).

Observe that deg w_k(x) < deg f(x) and

p^k w_k(x) ≡ f(x) − u_k(x)v_k(x)   (mod p^{k+1}).

Since u(x) and v(x) are relatively prime in F_p[x], we can find a(x) and b(x) in F_p[x] (depending on k) such that

a(x)u(x) + b(x)v(x) ≡ w_k(x)   (mod p).

By choosing t(x) ∈ F_p[x] appropriately and replacing a(x) with a(x) − t(x)v(x) and b(x) with b(x) + t(x)u(x), we see that we may suppose deg a(x) < deg v(x). The above equation then implies further that we may take deg b(x) < deg u(x). Setting

u_{k+1}(x) = u_k(x) + b(x)p^k   and   v_{k+1}(x) = v_k(x) + a(x)p^k,

we see that u_{k+1}(x) and v_{k+1}(x) are monic and

u_{k+1}(x)v_{k+1}(x) ≡ (u_k(x) + b(x)p^k)(v_k(x) + a(x)p^k)
  ≡ u_k(x)v_k(x) + p^k (a(x)u(x) + b(x)v(x))
  ≡ u_k(x)v_k(x) + p^k w_k(x)
  ≡ u_k(x)v_k(x) + (f(x) − u_k(x)v_k(x))
  ≡ f(x)   (mod p^{k+1}).

A complete factorization of f(x) modulo p^k can be obtained from a complete factorization of f(x) modulo p by modifying this idea. We do not elaborate on the best approach here, but note that such a factorization can be

achieved easily as follows. If f(x) is a product of r monic irreducible polynomials g_1(x), g_2(x), ..., g_r(x) modulo p, then one can factor f(x) modulo p^k by taking u(x) = g_1(x) and v(x) = g_2(x)g_3(x) ⋯ g_r(x) above. This will produce a factor u_k(x) modulo p^k that is congruent to u(x) modulo p and another factor v_k(x) congruent to v(x) modulo p. One can then replace the role of f(x) with v_k(x), which is g_2(x)g_3(x) ⋯ g_r(x) modulo p, and repeat the process, factoring v_k(x) modulo p^k as a polynomial which is congruent to g_2(x) modulo p times a polynomial congruent to g_3(x)g_4(x) ⋯ g_r(x) modulo p. Continuing in this manner, one gets a complete factorization of f(x) into a product of monic irreducible polynomials modulo p^k.

By way of example, setting f(x) = x^7 + x^4 + x^3 + x + 1, u(x) = x^3 + x^2 + 1, and v(x) = x^4 + x^3 + x^2 + x + 1, we recall that, in our previous example, we showed f(x) ≡ u(x)v(x) (mod 2). Then

w_1(x) ≡ (1/2)(f(x) − u(x)v(x)) ≡ x^6 + x^5 + x^4 + x^3 + x^2   (mod 2).

Taking a(x) = 0 and b(x) = x^2, we see that a(x)u(x) + b(x)v(x) ≡ w_1(x) (mod 2). Hence, we take

u_2(x) = u(x) + 2x^2 = x^3 + 3x^2 + 1

and

v_2(x) = v(x) = x^4 + x^3 + x^2 + x + 1.

We deduce that

f(x) ≡ (x^3 + 3x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 4).

Continuing in this manner, we obtain

f(x) ≡ (x^3 + 7x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 8),
f(x) ≡ (x^3 + 15x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 16),
f(x) ≡ (x^3 + 31x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 32),
f(x) ≡ (x^3 + 63x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 64).

Perhaps the above is enough for the reader to guess how f(x) factors in Z[x]. We return to this at the end of the next section.
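The pattern in the lifted factors can be confirmed directly: since 2^k − 1 ≡ −1 (mod 2^k), each displayed product agrees with f(x) modulo the corresponding power of 2. A quick sketch (an illustration, not the book's code; coefficient lists with the constant term first):

```python
def polymul(u, v):
    """Multiply two polynomials over Z, coefficient lists constant term first."""
    prod = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += ui * vj
    return prod

f = [1, 1, 0, 1, 1, 0, 0, 1]       # x^7 + x^4 + x^3 + x + 1
v = [1, 1, 1, 1, 1]                # x^4 + x^3 + x^2 + x + 1
results = []
for k in range(2, 7):
    u_k = [1, 0, 2 ** k - 1, 1]    # x^3 + (2^k - 1)x^2 + 1
    diff = [a - b for a, b in zip(f, polymul(u_k, v))]
    results.append(all(c % 2 ** k == 0 for c in diff))
print(results)                     # -> [True, True, True, True, True]
```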

11.2 An Inequality of Landau and an Approach of Zassenhaus

Landau's inequality gives an upper bound on the size of the factors of a given polynomial in Z[x]. For

(11.2.1)   f(x) = Σ_{j=0}^{n} a_j x^j = a_n ∏_{j=1}^{n} (x − α_j),

we recall the notations

‖f‖ = (Σ_{j=0}^{n} a_j^2)^{1/2}   and   M(f) = |a_n| ∏_{j=1}^{n} max{1, |α_j|},

the latter being the Mahler measure of the polynomial f(x). We make use of the following two easily established properties of Mahler measure:

(i) If g(x) and h(x) are in C[x], then M(gh) = M(g)M(h).
(ii) If g(x) is in Z[x], then M(g) ≥ 1.

For a fixed f(x) ∈ Z[x], we want an upper bound on ‖g‖, where g(x) is a factor of f(x) in Z[x].

Theorem 11.2.1. If f(x), g(x), and h(x) in Z[x] are such that f(x) = g(x)h(x), then

‖g‖ ≤ 2^{deg g} ‖f‖.

Proof. We begin by proving that for f(x) ∈ R[x],

(11.2.2)   M(f) ≤ ‖f‖ ≤ 2^{deg f} M(f).

For w(x) ∈ C[x], we use the reciprocal polynomial w̃(x) = x^{deg w} w(1/x). The coefficient of x^{deg w} in the expanded product w(x)w̃(x) is ‖w‖^2. For f(x) as in (11.2.1), we consider

w(x) = a_n ∏_{1≤j≤n, |α_j|>1} (ᾱ_j − x) ∏_{1≤j≤n, |α_j|≤1} (1 − ᾱ_j x).

Observe that

w̃(x) = a_n ∏_{1≤j≤n, |α_j|>1} (ᾱ_j x − 1) ∏_{1≤j≤n, |α_j|≤1} (x − ᾱ_j).

Hence, since the non-real roots of f(x) occur in conjugate pairs,

w(x)w̃(x) = a_n^2 ∏_{j=1}^{n} (x − α_j)(1 − α_j x) = f(x)f̃(x).

By comparing coefficients of x^n, we deduce that ‖w‖ = ‖f‖. Also, the definition of w(x) implies |w(0)| = M(f). Thus, writing w(x) = Σ_{j=0}^{n} c_j x^j, we obtain

M(f) = |c_0| ≤ (c_0^2 + c_1^2 + ⋯ + c_n^2)^{1/2} = ‖w‖ = ‖f‖,

establishing the first inequality in (11.2.2). For the second inequality, observe that for any k ∈ {1, 2, ..., n}, the product of any k of the α_j has absolute value ≤ M(f)/|a_n|. It follows that |a_{n−k}|/|a_n|, which is the absolute value of the sum of the products of the roots taken k at a time, is ≤ C(n, k) M(f)/|a_n|. Hence,

|a_{n−k}| ≤ C(n, k) M(f) = C(n, n−k) M(f).

The second inequality in (11.2.2) now follows from

‖f‖ = (Σ_{j=0}^{n} a_j^2)^{1/2} ≤ Σ_{j=0}^{n} |a_j| ≤ Σ_{j=0}^{n} C(n, j) M(f) = 2^n M(f).

Now, we make use of (11.2.2) and properties (i) and (ii) of Mahler measure above to deduce

‖g‖ ≤ 2^{deg g} M(g) ≤ 2^{deg g} M(g)M(h) = 2^{deg g} M(gh) = 2^{deg g} M(f) ≤ 2^{deg g} ‖f‖.

This establishes the theorem.

We explain a method for factoring a given f(x) ∈ Z[x] with the added assumptions that f(x) is monic and squarefree. This approach has its origins in a paper by Zassenhaus (1969). The latter assumption we can test by computing gcd(f, f′), which will give us a nontrivial factor of f(x) if f(x) is not squarefree. If f(x) is not monic, then one needs to add a little more to the ideas below, but not much.

Let B = 2^{⌊deg f / 2⌋} ‖f‖. Then if f(x) has a nontrivial factor g(x) in Z[x], it has such a factor of degree ≤ deg f / 2, so that by Theorem 11.2.1 we can use B as a bound on ‖g‖. Next, we find a prime p for which f(x) is squarefree modulo p. There are a variety of ways this can be done. There are only a finite number of primes for which f(x) is not squarefree modulo p; these primes divide the resultant R(f, f′). So one can compute R(f, f′) and avoid primes which divide R(f, f′). Alternatively, one can compute gcd_p(f(x), f′(x)) modulo p, or simply use Berlekamp's factoring algorithm, for successive primes until a squarefree factorization occurs. We choose a positive integer r as small as possible such that p^r > 2B. Then we factor f(x) modulo p by Berlekamp's algorithm and use Hensel lifting to

obtain the factorization of f(x) modulo p^r. Given our conditions on f(x), we can suppose all irreducible factors are monic, and we do so. Next, we can determine whether f(x) = g(x)h(x) for some monic g(x) and h(x) in Z[x] with ‖g‖ ≤ B as follows. We observe that the coefficients of g(x) are in [−B, B]. We use a residue system modulo p^r that includes this interval, namely (−p^r/2, p^r/2], and consider each factorization of f(x) modulo p^r, with coefficients in this residue system, as a product of two monic polynomials u(x) and v(x). Since f(x) = g(x)h(x), there must be some factorization where g(x) ≡ u(x) (mod p^r) and h(x) ≡ v(x) (mod p^r). On the other hand, the coefficients of g(x) and u(x) are all in (−p^r/2, p^r/2], so that the coefficients of g(x) − u(x) are each divisible by p^r and are each < p^r in absolute value. This implies g(x) = u(x). Thus, we can determine whether a factor g(x) exists as above by simply checking each monic factor of f(x) modulo p^r with coefficients in (−p^r/2, p^r/2].

Recall we factored f(x) = x^7 + x^4 + x^3 + x + 1 modulo various powers of 2 in the previous section. Using the above approach, we can deduce a factorization of f(x) in Z[x]. In the notation above, B = 2^3 ‖f‖ = 8√5 < 20. Since 2^6 = 64 > 2B, we can use the factorization of f(x) that we obtained modulo 64, namely

f(x) ≡ (x^3 + 63x^2 + 1)(x^4 + x^3 + x^2 + x + 1)   (mod 64).

We see that if f(x) is reducible, then it must have two factors, one congruent to x^3 + 63x^2 + 1 modulo 64 and one congruent to x^4 + x^3 + x^2 + x + 1 modulo 64. The first of these is of particular interest to us, as its degree is ≤ deg f / 2. If g(x) ∈ Z[x] divides f(x) and g(x) ≡ x^3 + 63x^2 + 1 (mod 64), then the arguments above imply that g(x) must equal the polynomial obtained by taking the coefficients on the right to be in the interval (−32, 32]. In other words, from

x^3 + 63x^2 + 1 ≡ x^3 − x^2 + 1   (mod 64),

we deduce g(x) = x^3 − x^2 + 1. To clarify, this means that if f(x) is reducible over Z, then g(x) will be a factor of f(x).
To establish that f(x) in fact has g(x) as a factor, we are left with checking whether g(x) divides f(x). In fact, we have the perhaps not unexpected factorization

f(x) = (x^3 − x^2 + 1)(x^4 + x^3 + x^2 + x + 1).
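The recovery step just described is easy to reproduce. The sketch below (an illustration, not the book's code; coefficient lists with the constant term first) computes the bound B, checks that the modulus 64 exceeds 2B, maps the coefficients of the mod-64 factor into the symmetric residue system (−32, 32], and verifies that the recovered g(x) times the other lifted factor reproduces f(x) in Z[x].

```python
import math

f = [1, 1, 0, 1, 1, 0, 0, 1]                 # x^7 + x^4 + x^3 + x + 1
norm_f = math.sqrt(sum(c * c for c in f))    # ||f|| = sqrt(5)
B = 2 ** ((len(f) - 1) // 2) * norm_f        # B = 2^3 * sqrt(5) < 20
print(2 ** 6 > 2 * B)                        # -> True, so modulus 64 suffices

# Map the coefficients of x^3 + 63x^2 + 1 (mod 64) into (-32, 32].
g = [c if c <= 32 else c - 64 for c in [1, 0, 63, 1]]
print(g)                                     # -> [1, 0, -1, 1], i.e. x^3 - x^2 + 1

# Confirm that g(x) times the other lifted factor reproduces f(x) in Z[x].
h = [1, 1, 1, 1, 1]                          # x^4 + x^3 + x^2 + x + 1
prod = [0] * (len(g) + len(h) - 1)
for i, gi in enumerate(g):
    for j, hj in enumerate(h):
        prod[i + j] += gi * hj
print(prod == f)                             # -> True
```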

11.3 Swinnerton-Dyer's Example

The algorithm just described for factoring a polynomial f(x) ∈ Z[x] of degree n can take time that is exponential in n for some, albeit rare, f(x). This has been illustrated by a nice example due to Swinnerton-Dyer. We formulate his example as follows. Let a_1, a_2, ..., a_m be arbitrary squarefree pairwise relatively prime integers > 1. Let S_m be the set of 2^m different m-tuples (ε_1, ..., ε_m) where each ε_j ∈ {1, −1}. We justify that the polynomial

(11.3.1)   f(x) = ∏_{(ε_1,...,ε_m) ∈ S_m} (x − ε_1 √a_1 − ⋯ − ε_m √a_m)

has the properties:

(i) The polynomial f(x) is in Z[x].
(ii) It is irreducible over the rationals.
(iii) It factors as a product of linear and quadratic polynomials modulo every prime p.

Some of the arguments below (in particular, the argument for (ii)) make some use of Galois theory. We elaborate on the details, but note in advance that some background in this direction is needed for the presentation given here.

One can deduce (i) by observing that the coefficients of f(x) are symmetric polynomials with integer coefficients in the roots of (x^2 − a_1)(x^2 − a_2) ⋯ (x^2 − a_m). Alternatively, an easy induction argument on m can be done as follows. For any squarefree positive integer a_1, we have

(x − √a_1)(x + √a_1) = x^2 − a_1.

Suppose f(x), as above, is in Z[x] whenever m = t, where t is a positive integer. Let a_1, a_2, ..., a_{t+1} be arbitrary squarefree pairwise relatively prime integers > 1. The induction hypothesis implies

f_t(x) = ∏_{(ε_1,...,ε_t) ∈ S_t} (x − ε_1 √a_1 − ⋯ − ε_t √a_t) ∈ Z[x].

By elementary symmetric functions associated with the two roots of the quadratic x^2 − a_{t+1}, we see that

f_{t+1}(x) = ∏_{(ε_1,...,ε_{t+1}) ∈ S_{t+1}} (x − ε_1 √a_1 − ⋯ − ε_{t+1} √a_{t+1}) = f_t(x + √a_{t+1}) f_t(x − √a_{t+1}) ∈ Z[x].
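The smallest interesting instance makes the construction concrete. Taking a_1 = 2 and a_2 = 3 (our choice of illustration, not from the book), the four roots ±√2 ± √3 give f(x) = x^4 − 10x^2 + 1. The sketch below builds f(x) numerically from its roots and rounds, confirming property (i); it then illustrates property (iii) by checking that x^(p^2) ≡ x (mod p, f(x)) for several primes p at which f(x) is squarefree modulo p, which is exactly the condition that every irreducible factor of f(x) modulo p has degree dividing 2. (Modulo 2 and 3 this particular f(x) is not squarefree, so those primes are skipped.)

```python
import itertools
import math

# Construct f(x) = prod (x - e1*sqrt(2) - e2*sqrt(3)) and round coefficients.
roots = [e1 * math.sqrt(2) + e2 * math.sqrt(3)
         for e1, e2 in itertools.product((1, -1), repeat=2)]
poly = [1.0]                          # coefficient list, constant term first
for r in roots:
    poly = [0.0] + poly               # multiply by x ...
    for i in range(len(poly) - 1):
        poly[i] -= r * poly[i + 1]    # ... then subtract r times the rest
f = [round(c) for c in poly]
print(f)                              # -> [1, 0, -10, 0, 1], i.e. x^4 - 10x^2 + 1

def xpow_mod(e, f, p):
    """x^e reduced mod (p, f(x)) for monic f of degree 4."""
    r = [1, 0, 0, 0]
    for _ in range(e):
        r = [0] + r                   # multiply by x
        c = r[4]
        r = [(r[i] - c * f[i]) % p for i in range(4)]
    return r

# For primes where f is squarefree mod p, all irreducible factors have
# degree dividing 2 exactly when x^(p^2) ≡ x (mod p, f(x)).
checks = [xpow_mod(p * p, f, p) == [0, 1, 0, 0] for p in (5, 7, 11, 13)]
print(checks)                         # -> [True, True, True, True]
```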

The use of elementary symmetric functions can be avoided by observing that f_t(x + √a_{t+1}) = u(x) + v(x)√a_{t+1} for some u(x) and v(x) in Z[x] and, consequently, f_t(x − √a_{t+1}) = u(x) − v(x)√a_{t+1}. Hence, the product of f_t(x + √a_{t+1}) and f_t(x − √a_{t+1}) is u(x)^2 − a_{t+1} v(x)^2 ∈ Z[x]. Thus, (i) holds.

To establish (ii), it suffices to show that the minimal polynomial for α_m = √a_1 + √a_2 + ⋯ + √a_m, that is, the monic irreducible polynomial in Q[x] that has α_m as a root, has all 2^m numbers of the form ε_1 √a_1 + ⋯ + ε_m √a_m, where (ε_1, ..., ε_m) ∈ S_m, as roots, and that these 2^m numbers are distinct. We begin, however, by first looking at the algebraic number field K_m = Q(√a_1, √a_2, ..., √a_m). Observe that K_m is the splitting field for the polynomial (x^2 − a_1)(x^2 − a_2) ⋯ (x^2 − a_m), and therefore forms a Galois extension over Q. We prove the following claim:

The number field K_m has degree 2^m over Q, and the 2^m elements of the Galois group Gal(K_m/Q) of K_m over Q are given by the mappings σ(√a_j) = ε_j √a_j, for 1 ≤ j ≤ m, where (ε_1, ..., ε_m) varies over the 2^m elements of S_m.

We proceed by induction on m to establish that the claim holds for each integer m ≥ 0 and for all choices of a_1, a_2, ..., a_m squarefree pairwise relatively prime integers > 1. We begin with m = 0 and m = 1. For m = 0, we have K_0 = Q, which has degree 1 over Q, implying Gal(K_0/Q) consists only of the identity element. Thus, the claim holds for m = 0. Suppose m = 1. Given that a_1 is squarefree and > 1, the quadratic x^2 − a_1 is irreducible over Q. Therefore, the number field K_1 = Q(√a_1) has degree 2 over Q. We deduce that Gal(K_1/Q) has exactly one non-identity element σ. As a root of x^2 − a_1 must be mapped to a root of x^2 − a_1 by σ, we see that σ(√a_1) = ±√a_1. Since σ is not the identity mapping, we deduce σ(√a_1) = −√a_1. This establishes what we want to start our induction, namely the claim for m = 1.

Suppose now that the claim holds for m ≤ t, where t is a positive integer, and let a_1, a_2, ..., a_{t+1} be arbitrary squarefree pairwise relatively prime integers > 1.
With the already established notation above, we have that the degree of K_t over Q is 2^t and that there is a σ ∈ Gal(K_t/Q) that satisfies σ(√a_1) = −√a_1

and σ(√a_j) = √a_j for 2 ≤ j ≤ t. We argue next that √a_{t+1} ∉ K_t. Assume otherwise. Let

S = { ∏_{j=2}^{t} (√a_j)^{η_j} : η_j ∈ {0, 1} for each j }.

Thus, S consists of 2^{t−1} elements. Every element of K_t can be expressed uniquely as a linear combination of the elements of S and the elements of S times √a_1, with coefficients from Q. In other words, each element of K_t can be written as

Σ_{s∈S} b(s) s + Σ_{s∈S} c(s) s √a_1

for exactly one choice of b(s) and c(s) in Q. Fix such b(s) and c(s) so that the above sum represents √a_{t+1}. We make use of the automorphism σ defined above. Observe that

σ( Σ_{s∈S} b(s) s + Σ_{s∈S} c(s) s √a_1 ) = Σ_{s∈S} b(s) s − Σ_{s∈S} c(s) s √a_1.

Since σ(√a_{t+1})^2 = σ(a_{t+1}) = a_{t+1}, we deduce that σ(√a_{t+1}) is either √a_{t+1} or −√a_{t+1}. In the former case, we use that

2√a_{t+1} = √a_{t+1} + σ(√a_{t+1}) = 2 Σ_{s∈S} b(s) s ∈ Q(√a_2, ..., √a_t).

In the latter case, we use that

2√a_{t+1} = √a_{t+1} − σ(√a_{t+1}) = 2 √a_1 Σ_{s∈S} c(s) s,

which implies √a_1 √a_{t+1} ∈ Q(√a_2, ..., √a_t). On the other hand, by the induction hypothesis, Q(√a_2, ..., √a_t) has degree 2^{t−1} over Q, and both the fields Q(√a_2, ..., √a_t, √a_{t+1}) and Q(√a_2, ..., √a_t, √(a_1 a_{t+1})) have degree 2^t over Q. This is a contradiction. Thus, √a_{t+1} ∉ K_t. In other words, x^2 − a_{t+1} is irreducible over K_t. Note that K_{t+1} is formed by adjoining a root of x^2 − a_{t+1} to K_t. Since the degree of K_t over Q is 2^t, we deduce that the degree of K_{t+1} over Q is 2^{t+1}. Observe that if σ is an automorphism of K_{t+1} that fixes Q, then its action on the elements of K_{t+1} is determined by

the values of σ(√a_j) for j ∈ {1, 2, ..., t + 1}. Also, for each j ∈ {1, 2, ..., t + 1}, the element √a_j in K_{t+1} must be mapped by σ to one of √a_j or −√a_j. Since the degree of K_{t+1} over Q is 2^{t+1}, there are 2^{t+1} elements of the Galois group of K_{t+1} over Q, and we see that they are precisely the 2^{t+1} mappings given in the claim with m = t + 1. Thus, the claim holds for m = t + 1, and the induction argument is complete.

We return to establishing (ii). We show next that

(11.3.2)   ε_1 √a_1 + ⋯ + ε_m √a_m ≠ ε′_1 √a_1 + ⋯ + ε′_m √a_m,

where (ε_1, ..., ε_m) and (ε′_1, ..., ε′_m) are distinct elements of S_m. Taking j maximal such that ε_j ≠ ε′_j, we see that if (11.3.2) does not hold, then √a_j ∈ Q(√a_1, ..., √a_{j−1}). The argument above showing that √a_{t+1} ∉ K_t implies a contradiction. Thus, the 2^m numbers ε_1 √a_1 + ⋯ + ε_m √a_m, with (ε_1, ..., ε_m) ∈ S_m, are distinct.

Observe that α_m ∈ K_m. If w(x) is the minimal polynomial for α_m, then each σ ∈ Gal(K_m/Q) must map α_m to a root of w(x). Fix (ε_1, ..., ε_m) ∈ S_m. From the claim, we know that there is a σ ∈ Gal(K_m/Q) satisfying

σ(α_m) = ε_1 √a_1 + ⋯ + ε_m √a_m.

Therefore, each root of f(x), given by (11.3.1), is a root of w(x). We deduce f(x) = w(x) and, hence, that f(x) is irreducible. This completes the proof of (ii).

To see (iii), we again use induction on m. We give two such induction arguments, the first more direct for someone accustomed to working in extensions with factorizations modulo a prime, and the second more self-contained given the contents of this book. The case m = 1 is clear, since in this case f(x), as defined in (11.3.1), is a quadratic. Suppose (iii) holds for m = t, where t is a positive integer. Fix a prime p. We make use of the notation in (11.3.1) with m = t + 1 and write

(11.3.3)   f(x) = g_j(x + √a_j) g_j(x − √a_j),

where j ∈ {1, 2, ..., t + 1} and g_j(x) is defined by

g_j(x) = ∏_{(ε_1,...,ε_t) ∈ S_t} (x − ε_1 √a_1 − ⋯ − ε_{j−1} √a_{j−1} − ε_j √a_{j+1} − ⋯ − ε_t √a_{t+1}).

The induction hypothesis implies that each g_j(x) factors modulo p as a product of linear and quadratic polynomials.
If some a_j is a square modulo p, then there is some integer b such that b^2 ≡ a_j (mod p) and, hence, f(x) ≡ g_j(x + b) g_j(x − b) (mod p). Since each of g_j(x + b) and g_j(x − b) factors as a product of linear and quadratic polynomials modulo p, we are through in this case. Now, suppose

no a_j is a square modulo p. Fix (ε_1, ..., ε_m) ∈ S_m and observe that when the product

(x + ε_1 √a_1 + ⋯ + ε_m √a_m)(x − ε_1 √a_1 − ⋯ − ε_m √a_m)

is expanded, the result is an expression in which each radicand is the product of two of the a_j. Since no a_j is a square modulo p, each such product of two of the a_j will be a square modulo p, since the product of two quadratic non-residues is a quadratic residue. This means that the above product can be expressed as a quadratic polynomial modulo p with coefficients from {0, 1, ..., p − 1}. Pairing the linear factors of f(x) appropriately then leads to the desired factorization modulo p.

For the second argument, we make use of Exercise 4.2(b) (also, see Exercise 8.4). We prove by induction that f(x), as defined in (11.3.1), is a product modulo each prime p of linear and quadratic monic polynomials, with the latter of the form x^2 + 2bx + c for some integers b and c (so that modulo 2 the middle coefficient is necessarily 0). We start the induction the same way, noting that the case m = 1 holds and supposing that such a factorization holds for m = t, where t is a positive integer, and for each prime p. We now fix a prime p and make use of the notation in (11.3.1) and in (11.3.3) with m = j = t + 1. The induction hypothesis implies there are monic polynomials u_1(x), u_2(x), ..., u_r(x) in Z[x], each of degree 1 or 2, and a polynomial v(x) ∈ Z[x] such that

g_{t+1}(x) = u_1(x)u_2(x) ⋯ u_r(x) + p v(x).

Hence,

f(x) = g_{t+1}(x + √a_{t+1}) g_{t+1}(x − √a_{t+1})
     = u_1(x + √a_{t+1}) u_1(x − √a_{t+1}) ⋯ u_r(x + √a_{t+1}) u_r(x − √a_{t+1}) + p w(x),

where w(x) is a polynomial with each coefficient a symmetric polynomial in √a_{t+1} and −√a_{t+1} with coefficients in Z. We deduce then that in fact w(x) ∈ Z[x]. By the induction hypothesis, we further may take each u_j(x) of one of the forms u_j(x) = x + b and u_j(x) = x^2 + 2bx + c, where b and c are integers. In the first case, we observe that

u_j(x + √a_{t+1}) u_j(x − √a_{t+1}) = (x + b)^2 − a_{t+1}.

Since this is a monic quadratic with even middle term, this is a factor of f(x) of a form we want.
In the case that $u_j(x) = x^2 + 2bx + c$, we set $d = c - b^2$ and write $u_j(x) = (x+b)^2 + d$. We deduce then that
$$u_j\big(x + \sqrt{a_{t+1}}\big)\,u_j\big(x - \sqrt{a_{t+1}}\big) = \Big(\big(x + b + \sqrt{a_{t+1}}\big)^2 + d\Big)\Big(\big(x + b - \sqrt{a_{t+1}}\big)^2 + d\Big) = \big((x+b)^2 - a_{t+1}\big)^2 + 2d\big((x+b)^2 + a_{t+1}\big) + d^2.$$
Observe that the above is of the form $h\big((x+b)^2\big)$, where $h(x)$ is a monic quadratic with an even coefficient for $x$ and a constant term equal to $a_{t+1}^2 + 2d\,a_{t+1} + d^2 = (a_{t+1} + d)^2$.

Modulo 2, we see that $h(x) \equiv (x + a_{t+1} + d)^2$ and, hence, $h\big((x+b)^2\big)$ factors as a product of two monic quadratics with the coefficient of $x$ equal to 0 for each quadratic. For odd primes $p$, Exercise 4.2(b) implies $h(x^2)$ and, hence, $h\big((x+b)^2\big)$ is reducible modulo $p$. Note that $k$ is a root of $h(x^2)$ modulo $p$ if and only if $-k$ is a root of $h(x^2)$ modulo $p$. Furthermore, 0 is a root of $h(x^2)$ if and only if $x^2$ is a factor of $h(x^2)$. Thus, $h\big((x+b)^2\big)$ factors as a product of linear and quadratic monic polynomials modulo $p$. Since any integer is congruent to twice an integer modulo an odd prime $p$, we can take the coefficient of $x$ appearing in any quadratic to be even. The induction argument is therefore complete.

11.4 The Lattice Basis Reduction Algorithm

Lenstra, Lenstra, and Lovász (1982) showed that it is possible to factor a polynomial $f(x) = \sum_{j=0}^{n} a_j x^j \in \mathbb{Z}[x]$ in polynomial time. If $n$ is the degree of $f(x)$ (so $a_n \ne 0$) and $H$ is the height of $f(x)$, that is, the maximum of $|a_j|$ for $0 \le j \le n$, then the quantity $n(\log_2 H + \log_2 n + 2)$ can be viewed as an upper bound on the length of the input polynomial $f(x)$. A polynomial time algorithm for factoring $f(x)$ corresponds to an algorithm that runs in time that is polynomial in $n$ and $\log H$. The previous factoring algorithm we described is not polynomial, as was seen from the example of Swinnerton-Dyer. The main problem there, which is notably atypical, is that the polynomial $f(x)$ can factor into many small irreducible factors modulo every prime $p$, causing us to have to consider exponentially many possibilities for the mod $p$ reduction of any nontrivial factor of $f(x)$. The algorithm of Lenstra, Lenstra, and Lovász, called the lattice basis reduction algorithm or the LLL-algorithm, is an approach for getting around having to consider all such mod $p$ reductions and thereby provides a polynomial time algorithm for factoring $f(x)$ over the rationals. To describe the lattice basis reduction algorithm, we turn now to some background on lattices.
Let $\mathbb{Q}^n$ denote the set of vectors $\langle a_1, a_2, \dots, a_n \rangle$ with $a_j \in \mathbb{Q}$. For $b = \langle a_1, a_2, \dots, a_n \rangle \in \mathbb{Q}^n$ and $b' = \langle a_1', a_2', \dots, a_n' \rangle \in \mathbb{Q}^n$, we define the usual dot product $b \cdot b'$ by
$$b \cdot b' = a_1 a_1' + a_2 a_2' + \cdots + a_n a_n'.$$
Also, we set
$$|b| = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}.$$
Further, we use $A^T$ to denote the transpose of a matrix $A$, so the rows and columns of $A$ are the same as the columns and rows of $A^T$, respectively. Let $b_1, \dots, b_n \in \mathbb{Q}^n$, and let $A = (b_1, \dots, b_n)$ be the $n \times n$ matrix with column vectors $b_1, \dots, b_n$. The lattice $L$ generated by $b_1, \dots, b_n$ is
$$L = L(A) = b_1 \mathbb{Z} + \cdots + b_n \mathbb{Z}.$$

We will be interested mainly in the case that $b_1, \dots, b_n$ are linearly independent; in this case, $b_1, \dots, b_n$ is called a basis for $L$. Observe that given $L$, the value of $|\det A|$ is the same regardless of the basis $b_1, \dots, b_n$ that is used to describe $L$. To see this, observe that if $b_1', \dots, b_n'$ is another basis for $L$, there are matrices $A$ and $B$ with integer entries such that
$$(b_1, \dots, b_n)\,A B = (b_1', \dots, b_n')\,B = (b_1, \dots, b_n).$$
Given that $b_1, \dots, b_n$ is a basis for $\mathbb{R}^n$, it follows that $AB$ is the identity matrix and $\det B = \pm 1$. The second equation above then implies
$$\big|\det(b_1', \dots, b_n')\big| = \big|\det(b_1, \dots, b_n)\big|.$$
We set $\det L$ to be this common value.

Next, we describe the Gram-Schmidt orthogonalization process. Define recursively
$$b_i^* = b_i - \sum_{j=1}^{i-1} \mu_{ij}\, b_j^*, \quad \text{for } 1 \le i \le n,$$
where
$$\mu_{ij} = \mu_{i,j} = \frac{b_i \cdot b_j^*}{b_j^* \cdot b_j^*}, \quad \text{for } 1 \le j < i \le n.$$
Then for each $i \in \{1, \dots, n\}$, the vectors $b_1^*, \dots, b_i^*$ span the same subspace of $\mathbb{R}^n$ as $b_1, \dots, b_i$. In other words,
$$\{a_1 b_1 + \cdots + a_i b_i : a_j \in \mathbb{R} \text{ for } 1 \le j \le i\} = \{a_1 b_1^* + \cdots + a_i b_i^* : a_j \in \mathbb{R} \text{ for } 1 \le j \le i\}.$$
Furthermore, the vectors $b_1^*, \dots, b_n^*$ are linearly independent (hence, non-zero) and pairwise orthogonal (i.e., for distinct $i$ and $j$, we have $b_i^* \cdot b_j^* = 0$). We leave verification of these facts as exercises.

We turn next to Hadamard's inequality. The value of $\det L$ can be viewed as the volume of the polyhedron with edges parallel to and the same length as $b_1, \dots, b_n$. As indicated by the above remarks, this volume is independent of the basis. Geometrically, it is apparent that
$$\det L \le |b_1|\,|b_2| \cdots |b_n|,$$
where "apparent" is limited somewhat to the dimensions we can think in. This is Hadamard's inequality. One can also use the vectors $b_j^*$ to provide a proof in any dimension as follows. One checks that
$$\det(b_1, \dots, b_n) = \det\Big(b_1^*,\ b_2^* + \mu_{21} b_1^*,\ \dots,\ b_n^* + \sum_{j=1}^{n-1} \mu_{nj} b_j^*\Big) = \det(b_1^*, \dots, b_n^*).$$
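The Gram-Schmidt recursion above is straightforward to carry out in exact rational arithmetic. The following Python sketch (the name `gram_schmidt` and the sample basis are our own illustration, not from the text) computes the $b_i^*$ and the coefficients $\mu_{ij}$ using `fractions.Fraction`, so the pairwise orthogonality can be verified exactly:

```python
from fractions import Fraction

def gram_schmidt(basis):
    """Return (bstar, mu): the orthogonalized vectors b*_i and the
    coefficients mu[i][j] = (b_i . b*_j) / (b*_j . b*_j) for j < i."""
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    bstar, mu = [], []
    for i, b in enumerate(basis):
        row = [dot(b, bstar[j]) / dot(bstar[j], bstar[j]) for j in range(i)]
        # b*_i = b_i - sum_{j < i} mu_ij b*_j
        v = [Fraction(x) for x in b]
        for j in range(i):
            v = [x - row[j] * y for x, y in zip(v, bstar[j])]
        mu.append(row)
        bstar.append(v)
    return bstar, mu

# The b*_i are pairwise orthogonal:
bstar, mu = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
assert all(sum(x * y for x, y in zip(bstar[i], bstar[j])) == 0
           for i in range(3) for j in range(i))
```

Exact fractions matter here: floating-point Gram-Schmidt would make the orthogonality and the later reducedness tests only approximate.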

Since $b_1, \dots, b_n$ is a basis for $L$, we deduce that
$$(\det L)^2 = \det\big((b_1^*, \dots, b_n^*)^T (b_1^*, \dots, b_n^*)\big) = \det \begin{pmatrix} |b_1^*|^2 & & 0 \\ & \ddots & \\ 0 & & |b_n^*|^2 \end{pmatrix} = \prod_{i=1}^{n} |b_i^*|^2.$$
Thus, $\det L = \prod_{i=1}^{n} |b_i^*|$. So it suffices to show $|b_i^*| \le |b_i|$. The orthogonality of the $b_i^*$'s implies
$$|b_i|^2 = \Big|b_i^* + \sum_{j=1}^{i-1} \mu_{ij} b_j^*\Big|^2 = |b_i^*|^2 + \sum_{j=1}^{i-1} \mu_{ij}^2 |b_j^*|^2.$$
The sum on the right above is clearly non-negative, so that $|b_i^*| \le |b_i|$ follows.

Hadamard's inequality provides an upper bound on the value of $\det L$. Hermite proved that there is a constant $c_n$ depending only on $n$ such that for some basis $b_1, \dots, b_n$ of $L$, we have
$$|b_1|\,|b_2| \cdots |b_n| \le c_n \det L.$$
It is known that $c_n \le n^n$. To clarify a point, Minkowski has shown that there exist $n$ linearly independent vectors $b_1, \dots, b_n$ in $L$ such that
$$|b_1|\,|b_2| \cdots |b_n| \le n^{n/2} \det L,$$
but $b_1, \dots, b_n$ is not necessarily a basis for $L$. Further, we note that the problem of finding a basis $b_1, \dots, b_n$ of $L$ for which $|b_1| \cdots |b_n|$ is minimal is known to be NP-hard. The problem of finding a vector $b \in L$ with $|b|$ minimal is not known to be NP-complete, but it may well be. In any case, no one knows a polynomial time algorithm for this problem. We note that Lagarias (1985) has, however, proved that the problem of finding a vector $b \in L$ which minimizes the maximal absolute value of a component is NP-hard. Observe that Hermite's result mentioned above implies that there is a constant $c_n'$, depending only on $n$, such that some non-zero $b \in L$ satisfies
$$|b| \le c_n' \sqrt[n]{\det L}.$$
It is possible for a lattice $L$ to contain a vector that is much shorter than this, but it is known that the best constant $c_n'$ for all lattices $L$ satisfies
$$\sqrt{\frac{n}{2e\pi}} \le c_n' \le \sqrt{\frac{n}{e\pi}}.$$
The vectors $b_j^*$ obtained from the Gram-Schmidt orthogonalization process can be used to obtain a lower bound for the shortest vector in a lattice $L$. More
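As a quick numerical sanity check of Hadamard's inequality, one can compare $|\det A|$ with the product of the norms $|b_i|$ for a small integer basis. The helper below is our own illustration (not from the text), using a naive Laplace-expansion determinant, which is fine for tiny examples:

```python
import math

def hadamard_holds(basis):
    """Check |det A| <= |b_1| |b_2| ... |b_n| for a square integer
    matrix whose rows are the basis vectors b_1, ..., b_n."""
    def det(m):  # Laplace expansion along the first row; small n only
        if len(m) == 1:
            return m[0][0]
        return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
                   for j in range(len(m)))
    vol = abs(det([list(b) for b in basis]))
    prod = math.prod(math.sqrt(sum(x * x for x in b)) for b in basis)
    return vol <= prod + 1e-9  # small tolerance for floating-point norms

assert hadamard_holds([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
assert hadamard_holds([(2, 1), (1, 3)])
```

Equality holds exactly when the basis vectors are pairwise orthogonal, as in a diagonal basis matrix.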

precisely, we have
(11.4.1) $\quad b \in L,\ b \ne 0 \implies |b| \ge \min\{|b_1^*|, |b_2^*|, \dots, |b_n^*|\}.$
To see this, express $b$ in the form
$$b = u_1 b_1 + \cdots + u_k b_k,$$
where each $u_j \in \mathbb{Z}$ and $u_k \ne 0$. Observe that the definition of the $b_j^*$ implies then that
$$b = v_1 b_1^* + \cdots + v_k b_k^*,$$
for some $v_j \in \mathbb{Q}$ with $v_k = u_k$. In particular, $v_k$ is a non-zero integer. We deduce that
$$|b|^2 = \big(v_1 b_1^* + \cdots + v_k b_k^*\big) \cdot \big(v_1 b_1^* + \cdots + v_k b_k^*\big) = v_1^2 |b_1^*|^2 + \cdots + v_k^2 |b_k^*|^2 \ge |b_k^*|^2,$$
from which (11.4.1) follows.

11.5 Reduced Bases and Factoring with LLL

We will want the following important definition.

Definition 11.5.1. Let $b_1, \dots, b_n$ be a basis for a lattice $L$ and $b_1^*, \dots, b_n^*$ the corresponding basis for $\mathbb{R}^n$ obtained from the Gram-Schmidt orthogonalization process, with $\mu_{ij}$ as defined before. Then $b_1, \dots, b_n$ is said to be reduced if both of the following hold:
(i) $|\mu_{ij}| \le \frac{1}{2}$ for $1 \le j < i \le n$;
(ii) $|b_i^* + \mu_{i,i-1} b_{i-1}^*|^2 \ge \frac{3}{4} |b_{i-1}^*|^2$ for $1 < i \le n$.

The main work of Lenstra, Lenstra, and Lovász (1982) establishes an algorithm that runs in polynomial time which constructs a reduced basis of $L$ from an arbitrary basis $b_1, \dots, b_n$ of $L$. Our main goal below is to explain how such a reduced basis can be used to factor a polynomial $f(x)$ in polynomial time. We will need to describe the related lattice and an initial basis for it. We begin, however, with some properties of reduced bases. Let $b_1, \dots, b_n$ be a reduced basis for a lattice $L$ and $b_1^*, \dots, b_n^*$ the corresponding basis for $\mathbb{R}^n$ obtained from the Gram-Schmidt orthogonalization process, with $\mu_{ij}$ as before. The argument for (11.4.1) can be modified to show that
(11.5.1) $\quad b \in L,\ b \ne 0 \implies |b_1| \le 2^{(n-1)/2}\, |b|.$
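Conditions (i) and (ii) of Definition 11.5.1 are mechanical to test once the $b_i^*$ and $\mu_{ij}$ are in hand. Here is a minimal Python sketch (the function name `is_reduced` and the example bases are our own, not from the text), again using exact rational arithmetic:

```python
from fractions import Fraction

def is_reduced(basis):
    """Check condition (i), |mu_ij| <= 1/2, and condition (ii), the
    inequality |b*_i + mu_{i,i-1} b*_{i-1}|^2 >= (3/4)|b*_{i-1}|^2,
    of Definition 11.5.1 for an integer basis."""
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    bstar, mu = [], []
    for i, b in enumerate(basis):  # Gram-Schmidt, exactly over Q
        row = [dot(b, bstar[j]) / dot(bstar[j], bstar[j]) for j in range(i)]
        v = [Fraction(x) for x in b]
        for j in range(i):
            v = [x - row[j] * y for x, y in zip(v, bstar[j])]
        mu.append(row)
        bstar.append(v)
    # (i) the size condition on the mu_ij
    if any(abs(mu[i][j]) > Fraction(1, 2)
           for i in range(len(basis)) for j in range(i)):
        return False
    # (ii) the condition on consecutive b*_i
    for i in range(1, len(basis)):
        w = [x + mu[i][i - 1] * y for x, y in zip(bstar[i], bstar[i - 1])]
        if dot(w, w) < Fraction(3, 4) * dot(bstar[i - 1], bstar[i - 1]):
            return False
    return True

assert is_reduced([(2, 0), (1, 2)])        # mu_21 = 1/2, condition (ii) holds
assert not is_reduced([(1, 0), (100, 1)])  # mu_21 = 100 violates (i)
```

The LLL algorithm itself alternates size-reduction steps (to enforce (i)) with swaps of consecutive vectors (when (ii) fails); the checker above only verifies the end state.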

In particular, the above inequality holds for the shortest vector $b \in L$. To prove (11.5.1), observe that (i) and (ii) imply
$$|b_i^*|^2 + \tfrac{1}{4} |b_{i-1}^*|^2 \ge |b_i^*|^2 + \mu_{i,i-1}^2 |b_{i-1}^*|^2 = |b_i^* + \mu_{i,i-1} b_{i-1}^*|^2 \ge \tfrac{3}{4} |b_{i-1}^*|^2.$$
Hence, $|b_i^*|^2 \ge \tfrac{1}{2} |b_{i-1}^*|^2$. We deduce that
(11.5.2) $\quad |b_i^*|^2 \ge \big(\tfrac{1}{2}\big)^{i-j} |b_j^*|^2 \quad \text{for } 1 \le j < i \le n.$
Defining $k$ as in the proof of (11.4.1) and following the argument there, we obtain $|b|^2 \ge |b_k^*|^2$. Hence,
$$|b|^2 \ge |b_k^*|^2 \ge \big(\tfrac{1}{2}\big)^{k-1} |b_1^*|^2 \ge \big(\tfrac{1}{2}\big)^{n-1} |b_1^*|^2 = \big(\tfrac{1}{2}\big)^{n-1} |b_1|^2,$$
where the last equation makes use of $b_1^* = b_1$. Thus, (11.5.1) follows.

Recall that
$$|b_i|^2 = |b_i^*|^2 + \sum_{j=1}^{i-1} \mu_{ij}^2 |b_j^*|^2.$$
From (i) and (11.5.2), we obtain
$$|b_i|^2 \le |b_i^*|^2 + \frac{1}{4} \sum_{j=1}^{i-1} |b_j^*|^2 \le |b_i^*|^2 + \frac{1}{4} \sum_{j=1}^{i-1} 2^{i-j} |b_i^*|^2 = \Big(1 + \frac{2^i - 2}{4}\Big) |b_i^*|^2 \le 2^{i-1} |b_i^*|^2.$$
Using (11.5.2) again, we deduce
(11.5.3) $\quad |b_j|^2 \le 2^{j-1} |b_j^*|^2 \le 2^{i-1} |b_i^*|^2 \quad \text{for } 1 \le j \le i \le n.$
We show now the following improvement of (11.5.1). Let $x_1, x_2, \dots, x_t$ be $t$ linearly independent vectors in $L$. Then
(11.5.4) $\quad |b_j| \le 2^{(n-1)/2} \max\{|x_1|, |x_2|, \dots, |x_t|\} \quad \text{for } 1 \le j \le t.$
For each $1 \le j \le t$, define a positive integer $m(j)$ and integers $u_{ji}$ by
$$x_j = \sum_{i=1}^{m(j)} u_{ji}\, b_i, \quad u_{j\,m(j)} \ne 0.$$
By reordering the $x_j$, we may suppose further that $m(1) \le m(2) \le \cdots \le m(t)$. The linear independence of the $x_j$ implies that $m(j) \ge j$ for $1 \le j \le t$. The proof of (11.4.1) implies here that $|x_j| \ge |b_{m(j)}^*|$ for $1 \le j \le t$.

From (11.5.3), we deduce
$$|b_j|^2 \le 2^{m(j)-1} |b_{m(j)}^*|^2 \le 2^{m(j)-1} |x_j|^2 \le 2^{n-1} |x_j|^2 \quad \text{for } 1 \le j \le t.$$
The inequality in (11.5.4) now follows.

Recall that $\det L = \prod_{i=1}^{n} |b_i^*|$. We obtain from (11.5.3) that
$$\prod_{i=1}^{n} |b_i|^2 \le \prod_{i=1}^{n} 2^{i-1} |b_i^*|^2 = 2^{n(n-1)/2} \prod_{i=1}^{n} |b_i^*|^2 = 2^{n(n-1)/2} (\det L)^2.$$
Thus, from Hadamard's inequality, we obtain
$$2^{-n(n-1)/4}\, |b_1|\,|b_2| \cdots |b_n| \le \det L \le |b_1'|\,|b_2'| \cdots |b_n'|$$
for any basis $b_1', \dots, b_n'$ of $L$. Recall that finding a basis $b_1', \dots, b_n'$ for which the product on the right is minimal is NP-hard. The above implies that a reduced basis is close to being such a basis. We also note that Hermite's inequality mentioned earlier is a consequence of the above inequality.

Suppose now that we want to factor a non-zero polynomial $f(x) \in \mathbb{Z}[x]$. Let $p$ be a prime, and consider a monic irreducible factor $h(x)$ of $f(x)$ modulo $p^k$, obtained say through Berlekamp's algorithm and Hensel lifting. Now, let $h_0(x)$ denote an irreducible factor of $f(x)$ in $\mathbb{Z}[x]$ such that $h_0(x)$ is divisible by $h(x)$ modulo $p^k$. Note that $h_0(x)$ being irreducible in $\mathbb{Z}[x]$ implies that the content of $h_0(x)$ (the greatest common divisor of its coefficients) is 1. Our goal here is to show how one can determine $h_0(x)$ without worrying about other factors of $f(x)$ modulo $p^k$, to avoid the difficulty suggested by Swinnerton-Dyer's example. We describe a lattice for this approach. Let $\ell = \deg h$. We need only consider the case that $\ell < n$. Fix an integer $m \in \{\ell, \ell+1, \dots, n-1\}$. We will successively consider such $m$, beginning with $\ell$ and working our way up, until we find $h_0(x)$. In the end, $m$ will correspond to the degree of $h_0(x)$; and if no such $h_0(x)$ is found for $\ell \le m \le n-1$, then we can deduce that $f(x)$ is irreducible. We associate with each polynomial
$$w(x) = a_m x^m + \cdots + a_1 x + a_0 \in \mathbb{Z}[x]$$
a vector $b = \langle a_0, a_1, \dots, a_m \rangle \in \mathbb{Z}^{m+1}$. Observe that $|b| = \|w(x)\|$. Let $L$ be the lattice in $\mathbb{Z}^{m+1}$ spanned by the vectors associated with
$$w_j(x) = \begin{cases} p^k x^{j-1} & \text{for } 1 \le j \le \ell \\ h(x)\, x^{j-\ell-1} & \text{for } \ell+1 \le j \le m+1. \end{cases}$$
It is not difficult to see that these vectors form a basis.
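The initial basis just described is easy to write down explicitly: its basis matrix is triangular, with $\ell$ diagonal entries equal to $p^k$ followed by diagonal entries equal to the leading coefficient 1 of the monic $h(x)$, which is one way to see that the vectors are independent. A Python sketch (the function name and the sample values are our own illustration):

```python
def factoring_lattice_basis(h, pk, m):
    """Rows are the coefficient vectors <a_0, ..., a_m> of the polynomials
    p^k x^(j-1) for 1 <= j <= l and h(x) x^(j-l-1) for l+1 <= j <= m+1,
    where h = [h_0, ..., h_l] lists the coefficients of the monic h(x)
    from the constant term up."""
    l = len(h) - 1                 # l = deg h
    assert l <= m and h[-1] == 1   # h is monic of degree at most m
    basis = []
    for j in range(l):             # p^k x^j for 0 <= j < l
        row = [0] * (m + 1)
        row[j] = pk
        basis.append(row)
    for j in range(m + 1 - l):     # h(x) x^j for 0 <= j <= m - l
        row = [0] * (m + 1)
        row[j:j + l + 1] = h
        basis.append(row)
    return basis

# Example: h(x) = x^2 + 3x + 2 modulo p^k = 5^3 = 125, with m = 3:
B = factoring_lattice_basis([2, 3, 1], 125, 3)
assert B == [[125, 0, 0, 0], [0, 125, 0, 0], [2, 3, 1, 0], [0, 2, 3, 1]]
```

Reading off the triangular shape, the determinant of this lattice is $p^{k\ell}$.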
Furthermore, the polynomials associated with the vectors in $L$ correspond precisely to the polynomials in $\mathbb{Z}[x]$ of degree $\le m$ that are divisible by $h(x)$ modulo $p^k$. In particular, if $m \ge \deg h_0$, the vector, say $b_0$, associated with $h_0(x)$ is in $L$. Observe that if $k$ is large enough and $\deg h_0 > \ell$, the coefficients of $h(x)$ are presumably

large and the value of $|b_0|$ is seemingly small. We will show that in fact if $k$ is large enough and $m = \deg h_0$ and $b_1, \dots, b_{m+1}$ is a reduced basis for $L$, then $b_1^{\#} = b_0$, where $b_1^{\#}$ corresponds to the vector obtained by dividing the components of $b_1$ by the greatest common divisor of these components (i.e., the polynomial associated with $b_1^{\#}$ is the polynomial associated with $b_1$ with its content removed).

The lattice $L$ seemingly has little to do with $f(x)$, as its definition only depends on $h(x)$. Fix $h_0(x)$ as above. We show that if $k$ is large enough, then $h_0(x)$ is the only irreducible polynomial in $\mathbb{Z}[x]$ which is associated with a short vector in $L$. For this purpose, suppose $g_0(x)$ is an irreducible polynomial in $\mathbb{Z}[x]$ divisible by $h(x)$ modulo $p^k$ but different from $h_0(x)$, and that $R$ is the resultant of $h_0(x)$ and $g_0(x)$. Note that since $h_0(x)$ and $g_0(x)$ are irreducible in $\mathbb{Z}[x]$, we have $R \ne 0$. The definition of the resultant implies that if $|R|$ is large, then $\|g_0(x)\|$ must be large (since we are viewing $h_0(x)$ as fixed). So suppose $|R|$ is not large. There are polynomials $u(x)$ and $v(x)$ in $\mathbb{Z}[x]$ such that
$$h_0(x)\,u(x) + g_0(x)\,v(x) = R.$$
We wish to take advantage now of the fact that the left-hand side above is divisible by $h(x)$ modulo $p^k$, but at the same time we want to keep in mind that unique factorization does not exist modulo $p^k$. Since $h(x)$ is monic of degree $\ell \ge 1$, the left-hand side is of the form $h(x)\,w(x)$ modulo $p^k$, where we can now easily deduce that every coefficient of $w(x)$ is divisible by $p^k$. This implies $p^k \mid R$. Hence, given $k$ is large, we deduce $|R|$ is large, giving us the desired conclusion that $\|g_0(x)\|$ is large.

The above argument does more. If $m = \deg h_0(x)$ and $b \in L$, then viewing $g_0(x)$ as the polynomial associated with $b$, we deduce from the above that either $\|g_0(x)\|$ is large or $R = 0$. In the latter case, since $h_0(x)$ is irreducible and $\deg g_0 \le m = \deg h_0$, we obtain that $b^{\#} = b_0$. How large is large? We take $b = b_1$ above (i.e., $g_0(x)$ is the polynomial associated with the first vector in a reduced basis).
Recall that Theorem 11.2.1 gives $\|h_0(x)\| \le 2^m \|f(x)\|$. On the other hand, by considering the vectors associated with $g_0(x)$ (that is, $b_1$) and $h_0(x)$ in $L \subseteq \mathbb{Z}^{m+1}$, we deduce from (11.5.1) that
$$\|g_0(x)\| \le 2^{m/2}\, \|h_0(x)\|.$$
Thus, $\|g_0(x)\| \le 2^{3m/2}\, \|f(x)\|$. We want this bound on $\|g_0(x)\|$ to assure that $R = 0$, so that $b^{\#} = b_0$. To see how large $p^k$ needs to be, we recall the Sylvester form of the resultant given by (2.2.1). We are interested in the resultant $R$ of $g_0(x)$ and $h_0(x)$, where

we may suppose that $\deg g_0 \le m$ and $\deg h_0 \le m$. From Hadamard's inequality and Theorem 11.2.1, we deduce
$$|R| \le \|g_0(x)\|^m\, \|h_0(x)\|^m \le \|g_0(x)\|^m\, \big(2^m \|f(x)\|\big)^m = 2^{m^2}\, \|g_0(x)\|^m\, \|f(x)\|^m.$$
Our upper bound on $\|g_0(x)\|$ now implies
$$|R| \le 2^{5m^2/2}\, \|f(x)\|^{2m}.$$
Hence, we see that if
$$p^k > 2^{5m^2/2}\, \|f(x)\|^{2m},$$
then the vector $b_1$ in a reduced basis for the lattice $L$, where $m = \deg h_0(x)$, is such that the polynomial corresponding to $b_1^{\#}$ is $h_0(x)$.

11.6 Sparse Polynomial Computations

In the previous sections of this chapter, we explored algorithms for factoring polynomials, with an interest in them as irreducibility tests for polynomials in $\mathbb{Z}[x]$. We began with an approach due to Zassenhaus (1969). An example due to Swinnerton-Dyer shows that this algorithm can take time that is exponential in the degree of the input polynomial $f(x)$. In general, this algorithm is nevertheless quite practical, as examples of $f(x)$ for which the running time of the algorithm is exponential in $\deg f$ are rare. We then explored an algorithm due to Lenstra, Lenstra, and Lovász (1982) which can be shown to have running time that is polynomial in $\deg f$ as well as the logarithm of the height of $f(x)$ (the maximum of the absolute values of the coefficients of $f(x)$). In this section, we describe an algorithm given by Filaseta, Granville, and Schinzel (2008) for determining whether a non-reciprocal polynomial $f(x)$ is irreducible. Take particular note that this algorithm requires the input polynomial $f(x)$ to be non-reciprocal. This algorithm is significant when a polynomial is sparse. The algorithm has running time that is almost linear in $\log \deg f$, but its dependence on the number of terms of $f(x)$ and the height of $f(x)$ is not so good. In particular, the dependence on the number of terms of $f(x)$ is worse than exponential. On the other hand, if one considers all polynomials in $\mathbb{Z}[x]$ with a fixed bound on the number of terms and a fixed bound on the height of $f(x)$, then the running time of the algorithm will just depend on $\deg f$ and will be close to linear in $\log \deg f$.
More precisely, we have the following result of Filaseta, Granville, and Schinzel (2008).

Theorem 11.6.1. Let $f(x) = \sum_{j=0}^{r} a_j x^{d_j} \in \mathbb{Z}[x]$ with each $a_j \ne 0$ and with $0 = d_0 < d_1 < \cdots < d_{r-1} < d_r = n$. Suppose $r \ge 1$ and $n \ge 16$. Let $H = H(f)$ denote the height of $f(x)$, so $H = \max_{0 \le j \le r}\{|a_j|\}$. Then there is a constant $c_1 = c_1(r, H)$ such that an algorithm

exists for determining whether a given non-reciprocal polynomial $f(x) \in \mathbb{Z}[x]$ as above is irreducible and that runs in time
$$O\big(c_1 \log n\, (\log\log n)^2\, \log\log\log n\big).$$

The algorithm for Theorem 11.6.1 also provides some information on the factorization of $f(x)$ in the case that $f(x)$ is reducible, with the same running time. Specifically, we have the following:

(i) If $f(x)$ has a cyclotomic factor, then the algorithm will detect this and output an $m \in \mathbb{Z}^{+}$ such that the cyclotomic polynomial $\Phi_m(x)$ divides $f(x)$.

(ii) If $f(x)$ does not have a cyclotomic factor but has a non-constant reciprocal factor, then the algorithm will produce such a factor. In fact, the algorithm will produce a reciprocal factor of $f(x)$ of maximal degree.

(iii) Otherwise, if $f(x)$ is reducible, then the algorithm outputs a complete factorization of $f(x)$ as a product of irreducible polynomials over $\mathbb{Q}$.

These can in fact be viewed as basic parts of the algorithm. First, we will check if $f(x)$ has a cyclotomic factor by making use of the algorithm for Theorem 6.9.1. If it does, the algorithm will produce $m$ as in (i) and stop. If it does not, then the algorithm will check if $f(x)$ has a non-cyclotomic non-constant reciprocal factor. If it does, then the algorithm will produce such a factor as in (ii) and stop. If it does not, then the algorithm will output a complete factorization of $f(x)$ as indicated in (iii). For (ii), we will make use of another algorithm for computing the greatest common divisor of two polynomials in $\mathbb{Z}[x]$, that is, for computing $\gcd_{\mathbb{Z}}(f(x), g(x))$ as defined in Section 10.2. Here, we will have the added condition that at least one of the two polynomials is not divisible by a cyclotomic polynomial, which in general can be checked by making use of the algorithm in Theorem 6.9.1. This result, also due to Filaseta, Granville, and Schinzel (2008), is as follows.

Theorem 11.6.2.
There is an algorithm which takes as input two polynomials $f(x)$ and $g(x)$ in $\mathbb{Z}[x]$, each of degree $\le n$ and height $\le H$ and having $\le r+1$ nonzero terms, with at least one of $f(x)$ and $g(x)$ free of cyclotomic factors, and outputs the value of $\gcd_{\mathbb{Z}}(f(x), g(x))$, and that runs in time $O(c_2 \log n)$ for some constant $c_2 = c_2(r, H)$.

In discussing the above theorems in this section, we describe the algorithms which lead to the proofs but do not address details of the running times. The reader should consult the work of Filaseta, Granville, and Schinzel (2008) for further details on the running time estimates. The proof of Theorem 11.6.2 relies heavily on a result of Bombieri and Zannier described in an appendix by the latter in Schinzel (2000b). Alternatively, one can use the later work of Bombieri, Masser, and Zannier (2007). As a consequence of either of these, we have the following.

Theorem 11.6.3. Let $F(x_1, \dots, x_k), G(x_1, \dots, x_k) \in \mathbb{Q}[x_1, \dots, x_k]$ be two coprime polynomials. There exists an effectively computable number $B(F, G)$ with the following property. If $u = (u_1, \dots, u_k) \in \mathbb{Z}^k$, $\xi \ne 0$ is algebraic, and
$$F\big(\xi^{u_1}, \dots, \xi^{u_k}\big) = G\big(\xi^{u_1}, \dots, \xi^{u_k}\big) = 0,$$
then either $\xi$ is a root of unity or there exists a nonzero vector $v \in \mathbb{Z}^k$ having components bounded in absolute value by $B(F, G)$ and orthogonal to $u$.

Our proof of Theorem 11.6.2 has similarities to an application of Theorem 11.6.3 by Schinzel (1999c, 2000b). In particular, we make use of the following lemma, which is Corollary 6 in Appendix E of Schinzel (2000b). A proof is given there.

Lemma 11.6.4. Let $\ell$ be a positive integer and $v \in \mathbb{Z}^{\ell}$ with $v$ nonzero. The lattice of vectors $u \in \mathbb{Z}^{\ell}$ orthogonal to $v$ has a basis $v_1, v_2, \dots, v_{\ell-1}$ such that the maximum absolute value of a component of any vector $v_j$ is bounded by $\ell/2$ times the maximum absolute value of a component of $v$.

We make use of the notation $O_{r,H}(\alpha(n))$ to denote a function which has absolute value bounded by $C\alpha(n)$ for some constant $C > 0$ and for $n$ sufficiently large. Another result of use to us is the following.

Lemma 11.6.5. There is an algorithm with the following property. Given an $r \times t$ integral matrix $M = (m_{ij})$ of rank $t \le r$ with $\max\{|m_{ij}|\} = O_{r,H}(1)$, and given an integral vector $d = (d_1, \dots, d_r)$ with $\max\{|d_j|\} = O_{r,H}(n)$, the algorithm determines whether there is an integral vector $v = (v_1, \dots, v_t)$ for which
$$\begin{pmatrix} d_1 \\ \vdots \\ d_r \end{pmatrix} = M \begin{pmatrix} v_1 \\ \vdots \\ v_t \end{pmatrix}$$
holds, and if such a $v$ exists, the algorithm outputs the solution vector $v$. Furthermore, $\max\{|v_j|\} = O_{r,H}(n)$ and the algorithm runs in time $O_{r,H}(\log n)$.

There are a variety of ways we can determine if $d = Mv$ has a solution, and to determine the solution if there is one, within the required time $O_{r,H}(\log n)$. We use Gaussian elimination to explain an approach. Again, we refer the reader to Filaseta, Granville, and Schinzel (2008) for details on the running time.
Performing elementary row operations on $M$, and multiplying by entries from the matrix as one proceeds so as to use only integer arithmetic, allows us to rewrite $M$ in the form of an $r \times t$ matrix $M' = (m'_{ij})$ with each $m'_{ij} \in \mathbb{Z}$ and the first $t$ rows of $M'$ forming a $t \times t$ diagonal matrix with nonzero integers along the diagonal. We perform the analogous row operations and integer multiplications on the vector $d = (d_1, d_2, \dots, d_r)$ to solve $d = Mv$ for $v$. We are thus left

with an equation of the form $d' = M'v$, where the entries of $M'$ are integers that are $O_{r,H}(1)$ and the components of $d' = (d'_1, d'_2, \dots, d'_r)$ are integers that are $O_{r,H}(n)$. For each $j \in \{1, 2, \dots, t\}$, we check if $d'_j \equiv 0 \pmod{m'_{jj}}$. If for some $j \in \{1, 2, \dots, t\}$ we have $d'_j \not\equiv 0 \pmod{m'_{jj}}$, then a solution to the original equation $d = Mv$, if it exists, must be such that $v_j \notin \mathbb{Z}$. In this case, an integral vector $v$ does not exist. Now, suppose instead that $d'_j \equiv 0 \pmod{m'_{jj}}$ for every $j \in \{1, 2, \dots, t\}$. Then we divide $d'_j$ by $m'_{jj}$ to determine the vector $v$. This vector may or may not be a solution to the equation $d = Mv$. We check whether it is by a direct computation. If it is not a solution to the equation $d = Mv$, then there are no solutions to the equation. Otherwise, $v$ is an integral vector satisfying $d = Mv$, and we output the vector $v$. This finishes our explanation for Lemma 11.6.5.

We also make use of the following notation. For a polynomial $F(x_1, \dots, x_r, x_1^{-1}, \dots, x_r^{-1})$ in the variables $x_1, \dots, x_r$ and their reciprocals $x_1^{-1}, \dots, x_r^{-1}$, we define
$$J F = x_1^{u_1} \cdots x_r^{u_r}\, F\big(x_1, \dots, x_r, x_1^{-1}, \dots, x_r^{-1}\big),$$
where each $u_j$ is an integer chosen as small as possible so that $J F$ is a polynomial in $x_1, \dots, x_r$. In the way of examples, if $F = x^2 + 4x^{-1}y + y^3$ and $G = 2xyw - x^2 z^{-3} w - 12w$, then $J F = x^3 + 4y + xy^3$ and $J G = 2xyz^3 - x^2 - 12z^3$. In particular, note that although $w$ is a variable in $G$, the polynomial $J G$ does not involve $w$. We call a multi-variable polynomial $F(x_1, \dots, x_r) \in \mathbb{Q}[x_1, \dots, x_r]$ reciprocal if
$$J F\big(x_1^{-1}, \dots, x_r^{-1}\big) = \pm F(x_1, \dots, x_r).$$
For example, $x_1 x_2 - x_1 - x_2 + 1$ and $x_1 x_2 - x_3 x_4$ are reciprocal. Note that this is consistent with our definition of a reciprocal polynomial $f(x) \in \mathbb{Z}[x]$.

For our proof of Theorem 11.6.2, we can suppose that $f(x)$ does not have a cyclotomic factor, and we do so. We consider only the case that $f(0)\,g(0) \ne 0$, as computing $\gcd_{\mathbb{Z}}(f(x), g(x))$ can easily be reduced to this case by initially removing an appropriate power of $x$ from each of $f(x)$ and $g(x)$.
This would need to be followed up by possibly multiplying by a power of $x$ after our gcd computation. We furthermore only consider the case that the content of $f(x)$ (that is, the greatest common divisor of its coefficients) and the content of $g(x)$ are both 1. Otherwise, we simply divide by the contents before proceeding and then multiply the final result by the greatest common divisor of the two contents.
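The divide-and-check procedure described for Lemma 11.6.5 can be sketched in code. The simplified illustration below (our own, not from the text) eliminates over $\mathbb{Q}$ with `fractions.Fraction` rather than performing the all-integer reduction, then tests integrality of the candidate solution and checks the leftover equations directly, mirroring the "direct computation" step above:

```python
from fractions import Fraction

def solve_integral(M, d):
    """Try to find an integral v with M v = d, where M is r x t of rank t.
    Simplified sketch: Gauss-Jordan elimination over Q, then an
    integrality test on the candidate solution."""
    r, t = len(M), len(M[0])
    # augmented system [M | d] over the rationals
    A = [[Fraction(x) for x in row] + [Fraction(dj)]
         for row, dj in zip(M, d)]
    row = 0
    for col in range(t):  # forward elimination with row swaps
        piv = next((i for i in range(row, r) if A[i][col] != 0), None)
        if piv is None:
            return None   # rank < t: not handled in this sketch
        A[row], A[piv] = A[piv], A[row]
        for i in range(r):
            if i != row and A[i][col] != 0:
                f = A[i][col] / A[row][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[row])]
        row += 1
    # leftover rows reduce to 0 = d'_i; any nonzero d'_i means no solution
    if any(A[i][t] != 0 for i in range(t, r)):
        return None
    v = [A[i][t] / A[i][i] for i in range(t)]  # pivot of row i is column i
    if any(x.denominator != 1 for x in v):
        return None       # a solution exists over Q, but not over Z
    return [int(x) for x in v]

assert solve_integral([[2, 0], [0, 3], [2, 3]], [4, 9, 13]) == [2, 3]
assert solve_integral([[2, 0], [0, 3], [2, 3]], [4, 9, 12]) is None
assert solve_integral([[2]], [3]) is None
```

For the $O_{r,H}(\log n)$ running time claimed in the lemma, the point is that $r$ and $t$ are bounded in terms of $r$ and $H$, so the elimination costs $O_{r,H}(1)$ arithmetic operations on integers with $O_{r,H}(\log n)$ bits.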