PRIMES OF THE FORM $p^2 + Ny^2$


XIAOYU HE

Contents
1. Introduction
1.1. Outline
2. An Intuitive Account of Sieve Theory
2.1. The Basics
2.2. The Brun Sieve
2.3. The Selberg Sieve
2.4. The Parity Problem
2.5. Bombieri's Asymptotic Sieve
2.6. The Asymptotic Sieve for Primes
3. Primes of the Form $p^2 + Ny^2$
3.1. The Value of $A(X)$
3.2. Vaughan's Identity
3.3. A Large Sieve Inequality
3.4. The Level of Distribution
3.5. The Bilinear Forms Condition
4. Acknowledgments
References

1. Introduction

"It is not surprising that he would have been [misled], unsuspicious as he presumably is of the diabolical malice inherent in the primes." (Littlewood to Hardy, about Ramanujan)

The representation of primes by integer polynomials is one of the chief preoccupations of number theory. The (closely related) questions are: which polynomials take prime values, and how frequent are these values? The Prime Number Theorem answers both questions for the polynomial $f(n) = n$; Dirichlet's theorem answers the first for linear polynomials of the form $an + b$. Already when we go to the simplest nontrivial quadratic polynomial $f(n) = n^2 + 1$ we find a major open problem, first proposed by Landau.

When $f$ is allowed to have two or more variables, the theory becomes extremely rich, tying in intimately with algebraic number theory. The easiest case $x^2 + y^2$ already depends on prime factorization in $\mathbb{Z}[i]$; counting such primes asymptotically is a special case of the Prime Number Theorem in arithmetic progressions. The representation of primes by many such special forms relates closely to prime factorization in number fields of the corresponding degree, and the ideal class group makes an appearance as soon as unique factorization breaks down.

The question is even richer and almost entirely intractable for nonhomogeneous forms. The representation of primes by the polynomial $f(x, y) = x^2 + y^3$, for example, is closely tied to the counting of elliptic curves with prime discriminant; as it stands, $x^2 + y^3$ is considered hopeless with our current technology. Special values of discriminant polynomials have great value in arithmetic statistics: it is an active area of research today to count squarefree values of discriminants, something much more tractable.

Analytic number theory brings (at least) two extremely disparate approaches to these problems. Historically, the first approach to prove the Prime Number Theorem worked via complex analysis, using zero-free regions of the Riemann zeta function and similar L-functions. The complex-analytic approach provides proofs of the Prime Number Theorem and many vast generalizations thereof, for example to arithmetic progressions or number fields. When these problems are tractable purely via L-functions, the estimates are often far superior to those obtainable by any other means.

The historically second approach is the sieve, which always begins with the purely combinatorial inclusion-exclusion formula. It took a long time for people to recognize the power of sieve theory; beyond its technical difficulty was a central problem with sieves, called the parity problem, which number theorists long thought rendered a large class of problems completely out of reach.

Classical sieve theory estimates the number of prime values of $f$ by reducing it to a much easier problem: how many values of $f$ lie in any given arithmetic progression? In modern language, being able to answer this question amounts to providing Type I estimates. Unfortunately, using Type I estimates alone is only enough to count almost-primes, numbers with at most 2 prime factors. As we will see, Type I estimates cannot distinguish between numbers with odd and even numbers of prime factors. In the end these classical sieves failed to provide asymptotic results, as the bounds they provide are off by at least, and often exactly, a factor of two.

The reason these sieves fail is that they cannot distinguish between numbers with an even or odd number of prime factors; this is called the parity problem. Here are three aspects of the parity problem.

(1) The Selberg sieve is off by a factor of 2 when trying to give an upper bound for prime numbers. If $\pi(X)$ is the number of primes up through $X$, then
\[ \pi(X) \le (2 + o(1)) \frac{X}{\log X}, \]
which secretly comes from the fact that we can only use the primes up to $\sqrt{X}$ to sieve, so the bound is more transparently written as
\[ \pi(X) \le (1 + o(1)) \frac{X}{\log \sqrt{X}}. \]

(2) It is much easier to count $k$-almost primes (products of up to $k$ primes) weighted by the generalized von Mangoldt functions (see Definition 10), as long as $k \ge 2$. For $k = 2$ the Selberg Symmetry Formula
\[ \sum_{p \le X} \log^2 p + 2 \sum_{\substack{p < q \\ pq \le X}} \log p \log q = 2X \log X + O(X) \]
is disappointingly easy to prove.

(3) Consider the function $f(n)$ which is 2 when $n$ has an even number of prime factors and 0 otherwise; from the perspective of Type I estimates, this sequence is indistinguishable from the constant function 1. In contrast, if $f(n)$ is 3 when the number of prime factors of $n$ is a multiple of three and 0 otherwise, the sequence is suddenly distinguishable! We will discuss this phenomenon in much more depth in Section 2.4.

Although the parity problem is now common knowledge, the most exact formulation remains in Selberg's original lecture notes on sieves [12]. Our goal is to fully highlight how surprising the parity problem is and also sketch why there is no corresponding mod $n$ problem for any $n > 2$.

Only recently has the parity problem been broken in a handful of cases by injecting a second kind of estimate, the so-called bilinear forms or Type II estimates. Fouvry and Iwaniec were first to do this in order to show that $x^2 + p^2$ represents infinitely many primes, when $p$ must also be prime. This method has been extended with some success, most notably to count primes of the forms $x^2 + y^4$ [8] and $x^3 + 2y^3$ [10].

A Type II estimate is a guarantee that about an equal number of values of $f$ have an odd or even number of prime factors. The motivation for a Type II estimate is a common general principle in mathematics: prove that the known obstructions to solving a problem are the only obstructions. The success of the new sieve shows that while the parity problem cannot be solved by Type I estimates on their own, it is the only obstruction to sieving for primes.

Using these techniques we contribute a new asymptotic count for primes of the form $p^2 + Ny^2$ for fixed $N > 0$. Let $\Lambda(n)$ be the von Mangoldt function, and define $g(n)$ to be the multiplicative function which on prime powers is

\[ g(p^\alpha) = \begin{cases} \dfrac{1 + \left(\frac{-N}{p}\right)}{p^\alpha} & p \text{ odd},\ p \nmid N, \\[1ex] \dfrac{1}{2} & p = 2,\ p \nmid N,\ \alpha = 1, \\[1ex] \dfrac{1 - \chi_4(N)}{2^\alpha} & p = 2,\ p \nmid N,\ \alpha \ge 2, \\[1ex] 0 & p \mid N, \end{cases} \]
where $\chi_4$ is the nontrivial Dirichlet character mod 4.

Theorem 1. For any $A, X > 0$,
\[ \sum_{x^2 + Ny^2 \le X} \Lambda(x) \Lambda(x^2 + Ny^2) = \frac{\pi H_N X}{4\sqrt{N}} + O_A\big(X (\log X)^{-A}\big), \]
where the implied constant depends only on $A$, and
\[ H_N = \prod_{p \nmid N} (1 - g(p))(1 - p^{-1})^{-1}. \]

We remark that the estimates we refer to as Type I and Type II have close analogies in the Hardy-Littlewood Circle Method. There Type I bounds refer to major arcs and Type II bounds refer to minor arcs. Although the analogy between the Asymptotic Sieve for Primes and the Circle Method is quite interesting, especially since both are related to Vaughan's identity, we will make no attempt to flesh out this connection.

1.1. Outline. The goal of this paper is twofold. First, we revisit the history of the sieve. We trace the development of the earliest sieves (Brun's combinatorial sieve and Selberg's sieve) that fall within the framework of exploiting only Type I estimates to their maximum potential. From here we move to how the parity problem came to be understood as the central obstruction to sieve theory, and how Bombieri perfected the art of getting asymptotic counts for almost-prime values using only Type I estimates. Only then can we talk about the injection of Type II estimates to count prime values themselves and the breakthrough of the Friedlander-Iwaniec Asymptotic Sieve for Primes.

Second, we extend the methods of Fouvry and Iwaniec to prove Theorem 1. While the proof does not directly apply the Asymptotic Sieve for Primes, it uses exactly the same tools and philosophy. When $\mathbb{Z}[\sqrt{-N}]$ is still a unique factorization domain, the generalization is straightforward. However, to deal with quadratic number fields with nontrivial class number, new tools must be introduced to prove both the Type I and Type II estimates required. As a consequence this section will prove much more technical than the first.

2. An Intuitive Account of Sieve Theory

A detailed and rigorous exposition of all of these results and much more can be found in Opera de Cribro by Iwaniec and Friedlander [6]. A good exposition of the prime-detecting sieve in particular can be found in Harman [9]. In contrast to these sources, we dispense with the full generality of sieve theory in favor of focusing on the core ideas of each sieve by trying to count primes with them. We define the prime-counting function as
\[ \pi(X) = \#\{p \text{ prime} : p \le X\}. \]

We have Atle Selberg to thank for the best ideas that follow. In particular, the powerful $\Lambda^2$ sieve, the Symmetry Formula, and the clarification of the parity problem are all due to Selberg.

Henceforth, we use the notations $A(X) \ll B(X)$ and $A(X) = O(B(X))$ interchangeably to mean that the two functions $A, B$ satisfy $B(X) \ge 0$ for all $X$ and
\[ \limsup_{X \to \infty} \frac{|A(X)|}{B(X)} < \infty. \]

The following table shows what each sieve proves about the asymptotic behavior of $\pi(X)$. We define additionally the function
\[ \psi_k(X) = \sum_{n \le X} \Lambda_k(n), \]
where $\Lambda_k$ is the generalized von Mangoldt function (see Definition 10). This function $\psi_k(X)$ counts the $k$-almost primes up through $X$ weighted in a special way; note that in general it does NOT weight numbers with $i$ prime factors the same, for each $1 \le i \le k$ (see Section 2.5.2).

Sieve                          Bound on Prime Counting Function
Eratosthenes                   $\pi(X) \ll X / \log\log X$
Brun's Pure Sieve              $\pi(X) \ll X \log\log X / \log X$
Selberg's Sieve                $\pi(X) \le (2 + o(1)) X / \log X$
Bombieri's Asymptotic Sieve    $\psi_k(X) = (k + o(1)) X \log^{k-1} X$, $k \ge 2$
Asymptotic Sieve for Primes    $\pi(X) = (1 + o(1)) X / \log X$

Remark 2. The above table can be misleading. The more advanced form of Brun's sieve does better, getting $\pi(X) \ll X / \log X$ with a weaker constant than Selberg's sieve. Also, the Asymptotic Sieve for Primes, while it successfully counts the prime numbers, assumes a much stronger hypothesis than its conclusion (a form of the Prime Number Theorem on arithmetic progressions) because it requires Type II information. It is doubly unsuccessful in this case because it achieves a much inferior error term than it assumes.

2.1. The Basics. Sieve theory starts with the tension between what is easy to count (numbers in arithmetic progressions) and what we want to count (prime numbers).

Fix a sequence of nonnegative reals $\{a_n\}$, which we can think of as the indicator sequence of a number-theoretically interesting set, and assume that the sequence is well-behaved in the sense that good estimates are available for the summatory functions
\[ A(X) = \sum_{n \le X} a_n \quad \text{and} \quad A_d(X) = \sum_{\substack{n \le X \\ d \mid n}} a_n \]
asymptotically as $X \to \infty$, as long as $d$ grows reasonably slowly compared to $X$. Usually the estimate takes the form
\[ A_d(X) = g(d) A(X) + r_d(X), \]
where $g$ is a multiplicative function and $r_d(X)$ is very small compared to $A(X)$.
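To make the decomposition $A_d(X) = g(d)A(X) + r_d(X)$ concrete, here is a minimal Python sketch for the simplest sequence $a_n = 1$, where $g(d) = 1/d$. The function names `summatory`, `remainder`, `ones`, and `density` are our own illustrative choices and do not come from the sources cited here; exact rational arithmetic is used so that the remainders are not blurred by rounding.

```python
from fractions import Fraction

def summatory(a, X, d=1):
    """A_d(X): the sum of a(n) over n <= X with d | n (d = 1 gives A(X))."""
    return sum(a(n) for n in range(d, X + 1, d))

def remainder(a, g, X, d):
    """r_d(X) = A_d(X) - g(d) * A(X)."""
    return summatory(a, X, d) - g(d) * summatory(a, X)

def ones(n):
    # The trivial sequence a_n = 1, whose sum over primes is pi(X).
    return 1

def density(d):
    # For a_n = 1 the natural density of multiples of d is 1/d.
    return Fraction(1, d)

X = 1000
worst = max(abs(remainder(ones, density, X, d)) for d in range(1, 101))
print(float(worst))  # largest |r_d(X)| over d <= 100
```

In this example $r_d(X)$ is minus the fractional part of $X/d$, so every remainder has absolute value below 1 while $A(X) = X$.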

Definition 3. A Type I estimate is a bound of the form
\[ \sum_{d \le D} |r_d(X)| \ll_C \frac{A(X)}{\log^C X} \]
which holds for every $C$. The function $D = D(X)$ is the level of distribution.

What we want to count is the amount of $\{a_n\}$ supported on the primes, that is,
\[ S(X) = \sum_{p \le X} a_p, \]
using only Type I estimates. Henceforth the variables $p, q$ are understood to range only through primes. Later on, it will be fruitful to consider as a proxy
\[ S(X) = \sum_{n \le X} a_n \Lambda(n), \]
where $\Lambda(n)$ is the von Mangoldt function, but from the perspective of the basic combinatorial sieve the first definition is more natural.

Our expectations cannot be too high; as we will see in Section 2.4, even with exact closed forms for $A(X)$ and $A_d(X)$ we can only estimate $S(X)$ to within a factor of two at best. We will concern ourselves only with deriving upper bounds for $S(X)$; there are many methods to convert upper bounds to lower bounds and vice versa, e.g. the Buchstab identities.

The idea is to take as small a linear combination of the $A_d(X)$ as possible such that each $a_n$ appears with a nonnegative coefficient and each $a_p$ has coefficient at least one. It is helpful to add an extra parameter $D$, which we call the level of distribution, and estimate
\[ S(D, X) = \sum_{D < p \le X} a_p, \]
so that we can use the small primes $p \le D$ to sieve. For example, if $D$ is very small compared to $X$, we can bound
\[ S(D, X) \le A(X) - \sum_{p \le D} A_p(X) + \sum_{p < q \le D} A_{pq}(X) - \sum_{p < q < r \le D} A_{pqr}(X) + \cdots \]
using the inclusion-exclusion principle. This bound is only good up to about $D \approx \log X$, just because so many error terms $r_d(X)$ accumulate in this expression when we let $d = p_1 p_2 \cdots p_k$ for any choice of distinct primes $p_i \le D$.

How bad is this? Let's use it to bound the total number of primes, so $a_n = 1$ for all $n$. We have $A_d(X) = X d^{-1} + O(1)$, so
\[ S(D, X) \le A(X) - \sum_{p \le D} A_p(X) + \sum_{p < q \le D} A_{pq}(X) - \cdots = X \Big( 1 - \sum_{p \le D} p^{-1} + \sum_{p < q \le D} p^{-1} q^{-1} - \cdots \Big) + O(2^{\pi(D)}), \]
where $\pi(D)$ is the number of primes up through $D$. The first term has a nice Euler product, so we get
\[ S(D, X) \le X \prod_{p \le D} (1 - p^{-1}) + O(2^{\pi(D)}). \]
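As a concrete check of the inclusion-exclusion bound, the following Python sketch evaluates the full alternating sum for $a_n = 1$, where $A_d(X) = \lfloor X/d \rfloor$ exactly. The helper names `primes_up_to` and `sifted_count` are ours and purely illustrative. The loop visits all $2^{\pi(D)}$ squarefree products of primes $p \le D$, which is exactly why the accumulated error term explodes as $D$ grows.

```python
from itertools import combinations
from math import prod

def primes_up_to(D):
    """All primes p <= D by trial division (fine for small D)."""
    return [p for p in range(2, D + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def sifted_count(X, D):
    """Number of n <= X with no prime factor <= D, via inclusion-exclusion."""
    ps = primes_up_to(D)
    total = 0
    for k in range(len(ps) + 1):
        for combo in combinations(ps, k):
            d = prod(combo)               # squarefree d dividing P_D
            total += (-1) ** k * (X // d)  # mu(d) * A_d(X)
    return total

X, D = 100, 7
main_term = X * prod(1 - 1 / p for p in primes_up_to(D))
print(sifted_count(X, D), main_term, 2 ** len(primes_up_to(D)))
```

For $X = 100$ and $D = 7$ the sum has 16 terms and counts the 22 integers up to 100 with no prime factor at most 7 (namely 1 and the 21 primes from 11 to 97), while the main term $X \prod_{p \le 7}(1 - p^{-1})$ is about 22.86.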

Using elementary methods, Mertens was able to estimate the product above as $O((\log D)^{-1})$, and so we have the bound
\[ \pi(X) \ll X (\log D)^{-1} + 2^{\pi(D)}. \]
The error term grows extremely quickly in $D$ and we can only pick $D \approx \log X$ before it overwhelms the main term.

Proposition 4. If $\pi$ is the prime counting function, then
\[ \pi(X) \ll \frac{X}{\log\log X}. \tag{2.1} \]

This is very far from the Prime Number Theorem but still nontrivial.

2.2. The Brun Sieve. Viggo Brun was the first person to treat the sieve as a competitive method for analytic number theory; the conventional wisdom was that bounds like (2.1) are the best that can be done. There are at least two sieves in the literature referred to as the Brun sieve; we will develop the simpler one, usually called Brun's pure sieve. Brun's sieve is notable for providing the first nontrivial bounds on the number of twin primes.

We return to the inclusion-exclusion formula
\[ S(D, X) \le A(X) - \sum_{p \le D} A_p(X) + \sum_{p < q \le D} A_{pq}(X) - \sum_{p < q < r \le D} A_{pqr}(X) + \cdots, \]
which we can rewrite conveniently as
\[ S(D, X) \le \sum_{d \mid P_D} \mu(d) A_d(X), \]
where $\mu$ is the usual Möbius function, and $P_D$ is the product of primes $p \le D$. But instead of expanding out the entire inclusion-exclusion formula, we truncate it at some stage $k$ and throw out all the terms with more than $k$ prime factors. In general if we truncate at a step with $k$ odd, we are under-counting, and if we truncate with $k$ even, we are over-counting; in the combinatorial literature these are known as Bonferroni's inequalities. The upshot is that the smallest terms, which also happen to be the most numerous, are thrown out.

In this particular instance we only care for upper bounds for $S(D, X)$, so we pick $k$ even. Write $\omega(d)$ for the number of (distinct) prime factors of $d$, so that by Bonferroni's inequality,
\[ S(D, X) \le \sum_{\substack{d \mid P_D \\ \omega(d) \le k}} \mu(d) A_d(X) = X \sum_{\substack{d \mid P_D \\ \omega(d) \le k}} \frac{\mu(d)}{d} + O(D^k), \]
happily noting that an error term that was once exponential in $D$ is now polynomial.

Now we add back in all the main terms with $\omega(d) > k$ that we just threw out, with the exception that they no longer have $O(1)$ error terms attached. This is done to get a nicer sum while adding something tiny:
\[ S(D, X) \le X \sum_{d \mid P_D} \frac{\mu(d)}{d} + O\Big( X \sum_{\substack{d \mid P_D \\ \omega(d) = k+1}} \frac{1}{d} \Big) + O(D^k). \]

Note that the error bound only needs the terms with $\omega(d) = k + 1$ exactly, by Bonferroni's inequalities again. The main term now has a nice Euler product form,
\[ \sum_{d \mid P_D} \frac{\mu(d)}{d} = \prod_{p \le D} \Big( 1 - \frac{1}{p} \Big), \]
and the first error term can be bounded by Rankin's trick, for a parameter $t > 1$ to be optimized later:
\[ \sum_{\substack{d \mid P_D \\ \omega(d) = k+1}} \frac{1}{d} = t^{-(k+1)} \sum_{\substack{d \mid P_D \\ \omega(d) = k+1}} \frac{t^{k+1}}{d} \le t^{-(k+1)} \sum_{d \mid P_D} \frac{t^{\omega(d)}}{d} = t^{-(k+1)} \prod_{p \le D} \Big( 1 + \frac{t}{p} \Big). \]
Putting this together we get a bound of the form
\[ S(D, X) \le X \prod_{p \le D} \Big( 1 - \frac{1}{p} \Big) + O\Big( X t^{-(k+1)} \prod_{p \le D} \Big( 1 + \frac{t}{p} \Big) \Big) + O(D^k). \tag{2.2} \]
Finally, to estimate the leftover products, we have
\[ \log \prod_{p \le D} \Big( 1 + \frac{t}{p} \Big) = \sum_{p \le D} \frac{t}{p} + O(1) = t \log\log D + O(1) \]
by an elementary estimate of Mertens. Thus, the bound (2.2) reduces to
\[ \pi(X) \ll X (\log D)^{-1} + X t^{-(k+1)} (\log D)^t + D^k, \]
and it remains to optimize the values of $D, t, k$ as functions of $X$ subject to the mild conditions that $D < X$ and $k$ is an even natural number. Close to optimal values turn out to be
\[ D = \exp\Big( C_1 \frac{\log X}{\log\log X} \Big), \qquad t = C_2, \qquad k = C_3 \log\log X, \]
with some suitable constants $C_i$ independent of $X$.

Proposition 5. If $\pi$ is the prime counting function, then
\[ \pi(X) \ll \frac{X \log\log X}{\log X}. \]

This is already close to the Prime Number Theorem. It is possible to optimize Brun's argument further by inhomogeneously truncating the inclusion-exclusion, cutting off terms $p_1 p_2 \cdots p_k$ depending not only on the number $k$ of them but on the sizes of the $p_i$ as well; this sieve is also due to Brun and leads to a correct order of magnitude bound on the prime counting function.
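The truncation step of Brun's pure sieve can be observed numerically. The sketch below, again for $a_n = 1$ so that $A_d(X) = \lfloor X/d \rfloor$, compares the inclusion-exclusion sum cut off at $\omega(d) \le k$ with the full sum. The names `primes_up_to` and `truncated_sieve` are hypothetical helpers introduced only for this illustration, and the parameters $X = 1000$, $D = 13$ are arbitrary small values.

```python
from itertools import combinations
from math import prod

def primes_up_to(D):
    """All primes p <= D by trial division (fine for small D)."""
    return [p for p in range(2, D + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def truncated_sieve(X, D, k):
    """Sum of mu(d) * floor(X / d) over d | P_D with omega(d) <= k."""
    ps = primes_up_to(D)
    total = 0
    for j in range(min(k, len(ps)) + 1):
        for combo in combinations(ps, j):
            total += (-1) ** j * (X // prod(combo))
    return total

X, D = 1000, 13
full = truncated_sieve(X, D, len(primes_up_to(D)))  # no truncation
for k in range(len(primes_up_to(D)) + 1):
    # Even k should overshoot the full sum, odd k should undershoot it.
    print(k, truncated_sieve(X, D, k), full)
```

Even values of $k$ give upper bounds for the full sum and odd values give lower bounds, as Bonferroni's inequalities predict, and the number of terms drops from $2^{\pi(D)}$ to at most about $\pi(D)^k$.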

2.3. The Selberg Sieve. The next evolution in sieve theory after Brun was the Selberg quadratic or $\Lambda^2$ sieve, which does away with the combinatorially motivated constraint that the coefficients of $A_d(X)$ should be in $\{0, \pm 1\}$ (such sieves are naturally called combinatorial sieves). Without this constraint we are left with a pure optimization problem: pick weights $\rho_d \in \mathbb{R}$ for $A_d(X)$ so that $\sum_{d \mid n} \rho_d \ge 0$ for every $n \le X$, and $\rho_1 = 1$, which together imply
\[ S(D, X) \le \sum_{d \le D} \rho_d A_d(X). \tag{2.3} \]
We can think of $\{\rho_d\}$ as the optimal projection of the Möbius function onto the space of functions supported on $[1, D]$. Selberg's insight was to guarantee the nonnegativity constraint by setting
\[ \sum_{d \mid n} \rho_d = \Big( \sum_{d \mid n} \lambda_d \Big)^2, \]
where the $\lambda_d$ are arbitrary real parameters with $\lambda_1 = 1$. In this situation, we can solve for
\[ \rho_d = \sum_{[d_1, d_2] = d} \lambda_{d_1} \lambda_{d_2}, \]
where $[d_1, d_2]$ is the lcm of $d_1, d_2$. Now the optimal choice of $\lambda_d$ is exhibited by a simple application of the Cauchy-Schwarz inequality.

Again, we begin with our example $a_n = 1$. Expanding out (2.3) with our choice of $\rho_d$,
\[ S(D, X) \le \sum_{d_1, d_2 \le D} \lambda_{d_1} \lambda_{d_2} \Big( \frac{X}{[d_1, d_2]} + O(1) \Big) \le X \sum_{d_1, d_2 \le D} \frac{1}{[d_1, d_2]} \lambda_{d_1} \lambda_{d_2} + O\Big( \Big( \sum_{d \le D} |\lambda_d| \Big)^2 \Big). \]
Thus it remains to minimize the quadratic form
\[ Q(\lambda) = \sum_{d_1, d_2 \le D} \frac{1}{[d_1, d_2]} \lambda_{d_1} \lambda_{d_2} \]
subject to the constraint $\lambda_1 = 1$. Now this quadratic form can be diagonalized by repeatedly completing the square:
\[ Q(\lambda) = \sum_{d_1, d_2 \le D} \gcd(d_1, d_2) \frac{\lambda_{d_1} \lambda_{d_2}}{d_1 d_2} = \sum_{d_1, d_2 \le D} \Big( \sum_{g \mid d_1,\ g \mid d_2} \varphi(g) \Big) \frac{\lambda_{d_1} \lambda_{d_2}}{d_1 d_2} = \sum_{g \le D} \varphi(g) \Big( \sum_{\substack{d \le D \\ g \mid d}} \frac{\lambda_d}{d} \Big)^2, \]
where $\varphi$ is the Euler totient function. Let
\[ x(g) = \sum_{\substack{d \le D \\ g \mid d}} \frac{\lambda_d}{d}, \]
so that we have essentially a free optimization of $Q$, subject to the single constraint
\[ \lambda_1 = \sum_{g \le D} \mu(g) x(g) = 1. \]
By Cauchy-Schwarz, the minimum is
\[ Q(\lambda) = \sum_{g \le D} \varphi(g) x(g)^2 \ge \frac{\big( \sum_{g \le D} \mu(g) x(g) \big)^2}{\sum_{g \le D} \mu(g)^2 \varphi(g)^{-1}} = \Big( \sum_{g \le D} \frac{\mu(g)^2}{\varphi(g)} \Big)^{-1}, \]
the minimum easily achieved using the equality condition of Cauchy-Schwarz. The inner sum is supported on squarefree numbers, and can be estimated by multiplicativity and by expanding as a geometric series
\[ \frac{1}{\varphi(p)} = \frac{1}{p} + \frac{1}{p^2} + \cdots, \]
so that
\[ \sum_{g \le D} \frac{\mu(g)^2}{\varphi(g)} \ge \sum_{n \le D} \frac{1}{n} = \log D + O(1), \]
and so we are left with
\[ S(D, X) \le (1 + o(1)) X (\log D)^{-1} + O\Big( \Big( \sum_{d \le D} |\lambda_d| \Big)^2 \Big). \]
The last thing to do is to figure out what the optimal values of $\lambda_d$ were; it is not difficult to see that they have magnitude at most 1, so the error term is $O(D^2)$. Picking $D = X^{1/2 - o(1)}$ is then optimal.

Proposition 6. If $\pi(X)$ is the prime counting function, then
\[ \pi(X) \le (2 + o(1)) \frac{X}{\log X}. \]

Not only do we get the correct order of magnitude, the bound is off by exactly a factor of two! This is the parity problem at work. Although the Selberg sieve is permanently handicapped by the parity problem, it achieves results of great uniformity. The most famous application is the Brun-Titchmarsh Theorem.

Theorem 7 (Brun-Titchmarsh). If $\pi(X; q, a)$ is the number of primes $p \le X$ satisfying $p \equiv a \pmod{q}$, then
\[ \pi(X; q, a) < \frac{(2 + o(1)) X}{\varphi(q) \log(X/q)}, \]
uniformly for all $q < X$.
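The Cauchy-Schwarz optimization in the Selberg sieve of Section 2.3 can be verified exactly for a small level. The sketch below is our own illustration, with hypothetical helper names such as `selberg_weights` and `quadratic_form`. It builds $x(g)$ from the equality case $x(g) \propto \mu(g)/\varphi(g)$, recovers $\lambda_d$ by Möbius inversion over multiples of $d$, and checks with rational arithmetic that $\lambda_1 = 1$ and that $Q(\lambda)$ equals $\big(\sum_{g \le D} \mu(g)^2/\varphi(g)\big)^{-1}$.

```python
from fractions import Fraction
from math import gcd

def prime_factors(n):
    """Prime factorization of n as a list of (prime, exponent) pairs."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            out.append((p, e))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def mobius(n):
    f = prime_factors(n)
    return 0 if any(e > 1 for _, e in f) else (-1) ** len(f)

def totient(n):
    result = n
    for p, _ in prime_factors(n):
        result = result // p * (p - 1)
    return result

def selberg_weights(D):
    """Return (lam, G): the optimal lambda_d for d <= D and G = sum mu(g)^2 / phi(g)."""
    G = sum(Fraction(mobius(g) ** 2, totient(g)) for g in range(1, D + 1))
    # Equality case of Cauchy-Schwarz: x(g) is proportional to mu(g) / phi(g).
    x = {g: Fraction(mobius(g), totient(g)) / G for g in range(1, D + 1)}
    # Invert x(g) = sum_{g | d} lam_d / d by Moebius inversion over multiples of d.
    lam = {d: d * sum(mobius(m // d) * x[m] for m in range(d, D + 1, d))
           for d in range(1, D + 1)}
    return lam, G

def quadratic_form(lam):
    """Q(lam) = sum over d1, d2 of lam_d1 * lam_d2 / lcm(d1, d2)."""
    return sum(lam[a] * lam[b] * Fraction(gcd(a, b), a * b)
               for a in lam for b in lam)

lam, G = selberg_weights(30)
print(lam[1], quadratic_form(lam) == 1 / G, float(G))
```

The level $D = 30$ is an arbitrary small choice so that the double sum in `quadratic_form` stays cheap; the same code works for any modest $D$.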

The Brun-Titchmarsh theorem was proved by Titchmarsh using the Brun sieve with a much weaker constant in place of 2, and refined to the above form by van Lint and Richert [13] using the Selberg sieve. For comparison, analytic methods involving L-functions can prove the Siegel-Walfisz theorem, the exact asymptotic for $\pi(X; q, a)$ with a much better error term, but it only holds up to $q \le (\log X)^C$ for fixed $C$.

2.4. The Parity Problem. Much more can be said about sieve theory relying only on Type I estimates of the form
\[ A_d(X) = g(d) A(X) + r_d(X). \]
Commonly, sieves are distinguished by two parameters: the level of distribution $D$ they require, and the sieve dimension $\kappa$ they are best suited for. The sieve dimension measures the average number of residue classes mod $p$ we are sieving out, defined as the constant $\kappa$ (which almost always exists) for which
\[ \sum_{p \le X} g(p) = \kappa \log\log X + O(1), \]
noting that the standard case $g(p) = 1/p$ is of dimension 1.

Rosser's linear sieve, or beta sieve, provides bounds of the correct order of magnitude in the case of sieve dimension 1. Sieves with finite dimension are known as small sieves. In contrast, the large sieve is a family of methods for dealing with the case that $g(p)$ is much larger than $1/p$, usually more like a constant; we will see an application of the large sieve inequality later on in Section 3.3. Selberg's sieve as presented above is unique in its generality, competitive with both small and large sieves.

None of these sieves can get asymptotic results because of the so-called parity problem, which tells us that Type I estimates alone cannot distinguish between primes and almost-primes, the products of two prime factors.

Remark 8. The exact statement is more delicate than this, as we can see from the Selberg sieve inequality $\pi(X) \le (2 + o(1)) X (\log X)^{-1}$. But there are
\[ \pi_2(X) \sim \frac{X \log\log X}{\log X} \]
2-almost primes up to $X$, so most of them are eliminated by the Selberg sieve above. A correct statement requires either throwing out almost primes with a small prime factor or weighting the almost-primes by $\Lambda_2$, the second von Mangoldt function
\[ \Lambda_2(n) = \sum_{d \mid n} \mu(d) \Big( \log \frac{n}{d} \Big)^2. \]

Here is an explicit instance of the parity problem which we steal from the excellent exposition of Ford [3]. Let $\lambda(n)$ be the Liouville function
\[ \lambda(n) = (-1)^{\Omega(n)}, \]
where $\Omega(n)$ is the number of prime factors of $n$, counting multiplicity. Define $a_n = 1 + \lambda(n)$, so that in particular $a_p = 0$ on all primes $p$, and $S(X) = 0$. On the other hand, with $A(X) = X$ and $g(d) = 1/d$, the remainder terms $r_d(X)$ are very small because numbers in any given arithmetic progression have an even or odd number of prime factors with equal probability. With the Riemann Hypothesis we can show $r_d(X) = O(\sqrt{X/d} \log(X/d))$, which is much smaller than needed for classical sieve methods. But $a_n = 1$ also has $A(X) = X$ and $g(d) = 1/d$, while $S(X) = \pi(X)$. Hence, it is impossible to distinguish with only this information between $a_n = 1$ and $a_n = 1 + \lambda(n)$, one of which contains all the primes and the other none of them.

One of the consequences of the parity problem is that these sieves have a hard time proving the existence of primes in any given sequence: the numbers they find could equally well be almost-primes. No nontrivial information about the number of primes in a sequence $a_n$ can be deduced from Type I information alone.

2.4.1. Perspective From L-functions. There is something very special going on with the parity problem: the Selberg sieve is fully capable of distinguishing primes from products $pqr$ of three primes, for example. In fact, from the sequence $a_n = 1 + \lambda(n)$ above we can see that the parity problem comes from the square root cancellation of $\mu(n)$ (which has basically the same summatory function as $\lambda(n)$). Because we are not prepared to introduce the whole subject of L-functions in detail, this section will work purely heuristically.

Dirichlet L-functions provide a general machinery for understanding growth rates of arithmetic functions; in particular, the summatory function of $\mu(n)$ can be related to a certain contour integral of the inverse of the Riemann zeta function
\[ \zeta(s)^{-1} = \sum_{n \ge 1} \frac{\mu(n)}{n^s}. \]
The Riemann Hypothesis predicts that $\zeta(s)$ has no zeroes $s$ with $\mathrm{Re}(s) > 1/2$, which implies that $\zeta(s)^{-1}$ has no poles past this point. Roughly speaking, a pole of $\zeta(s)^{-1}$ at $\rho$ contributes to $\sum_{n \le X} \mu(n)$ a term of the form
\[ \frac{1}{\zeta'(\rho)} \frac{X^\rho}{\rho}, \]
which has order $X^{\mathrm{Re}(\rho)}$. By general principles the asymptotics of this summatory function can be deduced from the set of zeroes in a bounded region, of which the Riemann zeta function has only finitely many. Thus, at this heuristic level, assuming the Riemann Hypothesis gives
\[ \sum_{n \le X} \mu(n) = O(X^{1/2} \log X). \]

Instead of $\mu(n)$, let $\epsilon_3$ be a primitive cube root of unity and consider the function
\[ \mu_3(n) = \begin{cases} \epsilon_3^\alpha & n = p_1 p_2 \cdots p_\alpha \text{ squarefree}, \\ 0 & n \text{ not squarefree}. \end{cases} \]
If $\sum_{n \le X} \mu_3(n)$ canceled at all significantly, then by a similar construction as in the previous section, we would have a mod 3 problem and expect all sieves to be off by at least a factor of 3. This is not the case; $\mu_3(n)$ cancels much more poorly than $\mu(n)$. We state a weaker form of the result of Selberg and Delange for simplicity.

Theorem 9. If $\mu_3(n)$ is as above, then there exists a constant $C \ne 0$ for which
\[ \sum_{n \le X} \mu_3(n) = \frac{(C + o(1)) X}{(\log X)^{1 - \epsilon_3}} \]
infinitely often.
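The contrast between $\mu$ and $\mu_3$ can be explored numerically, with the caveat that small ranges only hint at asymptotic behavior. The Python sketch below is a rough illustration under our own naming (`omega_table`, `class_counts`, `mertens`, and `mu3_sum` are hypothetical helpers). It counts squarefree $n \le X$ by $\omega(n) \bmod 3$ and assembles both summatory functions from those counts.

```python
import cmath

def omega_table(X):
    """For each n <= X: omega(n) if n is squarefree, and -1 otherwise."""
    omega = [0] * (X + 1)
    squarefree = [True] * (X + 1)
    for p in range(2, X + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for m in range(p, X + 1, p):
                omega[m] += 1
            for m in range(p * p, X + 1, p * p):
                squarefree[m] = False
    return [omega[n] if squarefree[n] else -1 for n in range(X + 1)]

def class_counts(X):
    """counts[j] = number of squarefree n <= X with omega(n) = j (mod 3)."""
    counts = [0, 0, 0]
    for w in omega_table(X)[1:]:
        if w >= 0:
            counts[w % 3] += 1
    return counts

def mertens(X):
    """Sum of mu(n) over n <= X."""
    return sum((-1) ** w for w in omega_table(X)[1:] if w >= 0)

def mu3_sum(X):
    """Sum of mu_3(n) over n <= X, using epsilon_3 = exp(2 pi i / 3)."""
    eps = cmath.exp(2j * cmath.pi / 3)
    c0, c1, c2 = class_counts(X)
    return c0 + c1 * eps + c2 * eps ** 2

for X in (10 ** 3, 10 ** 4, 10 ** 5):
    print(X, abs(mertens(X)), abs(mu3_sum(X)))
```

Comparing the two printed columns as $X$ grows gives a feel for how differently the two sums behave; the code makes no claim about the constant $C$ in Theorem 9.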

Here is a heuristic (from communications with Terry Tao and Zeb Brady). Write $\epsilon_3 = e^{2\pi i/3}$. The L-function of $\mu_3(n)$ has an Euler product,
$$L(s) = \sum_{n \ge 1} \frac{\mu_3(n)}{n^s} = \prod_p (1 + \epsilon_3 p^{-s}) \approx \prod_p (1 + p^{-s})^{\epsilon_3} \approx \zeta(s)^{\epsilon_3}.$$
Now complex exponentiation is multi-valued and there is no way to analytically continue this function past $\mathrm{Re}(s) = 1$; we must pick a branch cut past this, and the pole at $s = 1$ can be checked to contribute an oscillating main term that looks like $CX/(\log X)^{\epsilon_3}$. The same can be said for any modulus except 2 in place of 3; the Möbius function has significant cancellation in comparison, and this is the reason there is a parity problem.

2.4.2. The Way Forward. Since the parity problem was formulated, one of the major goals of sieve theory has been to prove that the parity problem is the only obstruction to asymptotic sieves, that is, sieves which give asymptotically correct bounds. There are two ways one can formulate this goal. Bombieri's way was to construct an asymptotic sieve using only Type I information which correctly counts almost primes. By relaxing the question to counting
$$S_2(X) = \sum_{n \le X} a_n \Lambda_2(n)$$
the parity problem goes away and sieve methods give asymptotic bounds correctly. We will describe Bombieri's Asymptotic Sieve in the next section.

Friedlander and Iwaniec were the first to break the parity problem altogether by injecting Type II information, creating their so-called Asymptotic Sieve for Primes. This sieve successfully counts prime values of $x^2 + y^4$ asymptotically, but requires a second condition, the bilinear forms condition, which is very difficult to prove in practice. Roughly speaking, the bilinear forms condition asks that $a_n$ never looks like $\mu(n)$ (or its cousin $\lambda(n)$) on any arithmetic progression.

2.5. Bombieri's Asymptotic Sieve. We first define the generalized von Mangoldt function to give the correct weighting on almost-primes.

Definition 10. The generalized von Mangoldt function $\Lambda_k(n)$ is defined by
$$\Lambda_k(n) = \sum_{d \mid n} \mu(d) \left(\log \frac{n}{d}\right)^k.$$

Of course the usual von Mangoldt function is just $\Lambda = \Lambda_1$. In fact Bombieri's actual result generalized this further to arbitrary Dirichlet convolutions of the $\Lambda_k$ above, but we will make do with $\Lambda_k$.
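Definition 10 is easy to test numerically. The following is a minimal brute-force sketch, not part of the original argument; the helper names `mobius`, `mangoldt` and `gen_mangoldt` are illustrative choices. It evaluates $\Lambda_k(n)$ straight from the definition, so that one can confirm $\Lambda_1 = \Lambda$ and see that $\Lambda_2$ vanishes on an integer with three distinct prime factors.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:          # p^2 divides the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def mangoldt(n):
    """Classical von Mangoldt function: log p if n = p^a (a >= 1), else 0."""
    for p in range(2, n + 1):
        if n % p == 0:              # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def gen_mangoldt(k, n):
    """Lambda_k(n) = sum over d | n of mu(d) * log(n/d)^k (Definition 10)."""
    return sum(mobius(d) * math.log(n // d) ** k
               for d in range(1, n + 1) if n % d == 0)

if __name__ == "__main__":
    print(gen_mangoldt(2, 6), 2 * math.log(2) * math.log(3))  # weight of pq
    print(gen_mangoldt(2, 30))                                # three primes
```

The computation is quadratic in $n$ and only meant for small inputs; floating-point cancellation means the zero values come out as tiny nonzero numbers.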

Note that $\Lambda_k(n)$ satisfies the convolution recurrence
$$\Lambda_{k+1}(n) = \Lambda_k(n) \log n + \sum_{d \mid n} \Lambda(d) \Lambda_k\!\left(\frac{n}{d}\right),$$
so by induction $\Lambda_k$ is supported on integers $n$ with at most $k$ distinct prime factors. Bombieri's asymptotic sieve gives asymptotic results, allowing us to estimate
$$S_k(X) = \sum_{n \le X} a_n \Lambda_k(n)$$
for any $k \ge 2$. Assume for simplicity sieve dimension 1, so that $g(p) \approx 1/p$ on average. All that is required is a fairly strong Type I estimate of the form
$$A_d(X) = g(d)A(X) + r_d(X), \qquad \sum_{d \le D} |r_d(X)| \ll_{C,D} \frac{A(X)}{(\log X)^C} \tag{2.4}$$
for every level of distribution $D = X^{1-\varepsilon}$, $\varepsilon > 0$.

Theorem 11. (Bombieri's Asymptotic Sieve) Assuming (2.4),
$$S_k(X) = (1 + o(1))\, k H A(X) (\log X)^{k-1},$$
where
$$H = \prod_p (1 - g(p))\left(1 - \frac{1}{p}\right)^{-1}.$$
(For a simple exposition of Bombieri's Asymptotic Sieve, see Opera de Cribro, Chapter 3.)

The product for $H$ converges to a nonzero constant iff $g$ has sieve dimension 1. Without the parity problem, we would expect this asymptotic to be true for $S(X) = S_1(X)$ as well.

2.5.1. The Selberg Symmetry Formula. In this section we prove the spiritual ancestor of Bombieri's Asymptotic Sieve, namely the Symmetry Formula of Selberg, which leads to the Erdős-Selberg elementary proof of the Prime Number Theorem. We follow the proof given by Balady [1].

Theorem 12. Let $k \ge 2$, and let $\Lambda_k(n)$ be the $k$-th von Mangoldt function. Then
$$\sum_{n \le X} \Lambda_k(n) = kX \log^{k-1} X + O(X \log^{k-2} X).$$

We first prove this for the case $k = 2$. Recall that $\Lambda_2(n)$ satisfies the identity
$$\Lambda_2(n) = \Lambda(n) \log n + \sum_{d \mid n} \Lambda(d) \Lambda\!\left(\frac{n}{d}\right),$$
so it weights primes $p$ by $(\log p)^2$ and almost primes $pq$ by $2 \log p \log q$. By the Prime Number Theorem, the sum of $\Lambda(n) \log n$ should be asymptotic to $X \log X$, so the Symmetry Formula implies that $\Lambda_2(n)$ counts primes and almost primes in approximately equal measure. The key lemma is an elementary cancellation estimate for sums of the Möbius function.
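The $k = 2$ case of Theorem 12 can be observed numerically. The sketch below is an illustration only, with hypothetical helper names; it uses the identity $\Lambda_2 = \Lambda \log + \Lambda * \Lambda$ recalled above to sum $\Lambda_2(n)$ over $n \le X$, and compares the result with the main term $2X \log X$. Because the error term is $O(X)$, the ratio approaches 1 only at rate $1/\log X$.

```python
import math

def mangoldt_table(X):
    """lam[n] = log p if n is a power of the prime p, else 0, for 0 <= n <= X."""
    lam = [0.0] * (X + 1)
    is_comp = [False] * (X + 1)
    for p in range(2, X + 1):
        if not is_comp[p]:
            for m in range(p * p, X + 1, p):
                is_comp[m] = True
            q = p
            while q <= X:           # mark every power of p
                lam[q] = math.log(p)
                q *= p
    return lam

def sum_lambda2(X):
    """Sum of Lambda_2(n) over n <= X, using Lambda_2 = Lambda*log + Lambda conv Lambda."""
    lam = mangoldt_table(X)
    total = sum(lam[n] * math.log(n) for n in range(2, X + 1))
    # psi[n] = sum of Lambda(m) for m <= n (Chebyshev's function)
    psi = [0.0] * (X + 1)
    for n in range(1, X + 1):
        psi[n] = psi[n - 1] + lam[n]
    # sum over a*b <= X of Lambda(a) * Lambda(b)
    total += sum(lam[a] * psi[X // a] for a in range(2, X + 1) if lam[a])
    return total

if __name__ == "__main__":
    for X in (10**3, 10**4, 10**5):
        S = sum_lambda2(X)
        print(X, S / (2 * X * math.log(X)), (S - 2 * X * math.log(X)) / X)
```

The last printed column is the normalised error $(S - 2X\log X)/X$, which should stay bounded as $X$ grows if the symmetry formula holds.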

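Lemma 13. If $\mu(n)$ is the Möbius function, then
$$\sum_{n \le X} \frac{\mu(n)}{n} = O(1),$$
$$\sum_{n \le X} \frac{\mu(n)}{n} \log \frac{X}{n} = O(1),$$
$$\sum_{n \le X} \frac{\mu(n)}{n} \log^2 \frac{X}{n} = 2 \log X + O(1).$$

Proof. The trick is to use the following transformation of the Möbius inversion formula, which Balady [1] uses but never writes down. Let $f, g$ be functions defined on $[1, \infty)$ for which
$$f(x) = \sum_{n \le x} g(x/n).$$
Then
$$\sum_{m \le x} \mu(m) f\!\left(\frac{x}{m}\right) = \sum_{m \le x} \mu(m) \sum_{n \le x/m} g\!\left(\frac{x}{mn}\right) = \sum_{k \le x} g\!\left(\frac{x}{k}\right) \sum_{m \mid k} \mu(m) = g(x).$$

We pick $g(x) \equiv 1$ first, so that $f(x) = \lfloor x \rfloor$. The identity becomes
$$\sum_{n \le X} \mu(n) \frac{X}{n} + O(X) = 1,$$
$$\sum_{n \le X} \frac{\mu(n)}{n} = O(1),$$
which is the first bound.

Next pick $g(x) = x$, so that
$$f(x) = \sum_{n \le x} \frac{x}{n} = x \log x + C_1 x + O(1)$$
by approximating the integral, where $C_1$ is Euler's constant, though its exact value is irrelevant for this purpose. After Möbius inversion, we get
$$\sum_{n \le X} \mu(n) \left( \frac{X}{n} \log \frac{X}{n} + C_1 \frac{X}{n} + O(1) \right) = X,$$
$$X \sum_{n \le X} \frac{\mu(n)}{n} \log \frac{X}{n} = O(X),$$
$$\sum_{n \le X} \frac{\mu(n)}{n} \log \frac{X}{n} = O(1),$$
using the previous bound.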
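The three sums in the statement of Lemma 13 can be tabulated directly. The following sketch is an illustration with hypothetical function names, not Balady's or the author's computation. It sieves the Möbius function and prints the three sums, the last with its main term $2 \log X$ subtracted, so that boundedness in $X$ can be checked by eye.

```python
import math

def mobius_sieve(X):
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= X."""
    mu = [1] * (X + 1)
    is_prime = [True] * (X + 1)
    for p in range(2, X + 1):
        if is_prime[p]:
            for m in range(p, X + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one sign flip per prime factor
            for m in range(p * p, X + 1, p * p):
                mu[m] = 0               # not squarefree
    return mu

def lemma13_sums(X):
    """The three weighted Moebius sums of Lemma 13 for the cutoff X."""
    mu = mobius_sieve(X)
    s1 = sum(mu[n] / n for n in range(1, X + 1))
    s2 = sum(mu[n] / n * math.log(X / n) for n in range(1, X + 1))
    s3 = sum(mu[n] / n * math.log(X / n) ** 2 for n in range(1, X + 1))
    return s1, s2, s3

if __name__ == "__main__":
    for X in (10**2, 10**3, 10**4, 10**5):
        s1, s2, s3 = lemma13_sums(X)
        print(X, s1, s2, s3 - 2 * math.log(X))
```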

Finally, pick $g(x) = x \log x$, so that
$$f(x) = \sum_{n \le x} \frac{x}{n} \log \frac{x}{n} = \frac{x}{2} \log^2 x + C_2 x \log x + C_3 x + O(\log x),$$
again by approximating the integral. Again $C_2, C_3$ are irrelevant constants. Möbius inversion gives
$$\sum_{n \le X} \mu(n) \left( \frac{X}{2n} \log^2 \frac{X}{n} + C_2 \frac{X}{n} \log \frac{X}{n} + C_3 \frac{X}{n} + O\!\left(\log \frac{X}{n}\right) \right) = X \log X,$$
$$\frac{1}{2} \sum_{n \le X} \frac{\mu(n)}{n} \log^2 \frac{X}{n} = \log X + O(1),$$
combining all the previous bounds, and the fact that the sum of $\log \frac{X}{n}$ over $n \le X$ is $O(X)$.

Now we can control sums of $\Lambda_2(n)$ by directly expanding the convolution.

Proof. (of Theorem 12). We expand
$$\sum_{n \le X} \Lambda_2(n) = \sum_{n \le X} \mu(n) \sum_{m \le X/n} \log^2 m$$
$$= \sum_{n \le X} \mu(n) \left( \frac{X}{n} \log^2 \frac{X}{n} + C_4 \frac{X}{n} \log \frac{X}{n} + C_5 \frac{X}{n} + O\!\left(\log^2 \frac{X}{n}\right) \right)$$
$$= 2X \log X + O(X),$$
as desired, by combining all the estimates in Lemma 13. It is in fact straightforward to continue the inductive application of Möbius inversion and prove, for each $k \ge 2$,
$$\sum_{n \le X} \frac{\mu(n)}{n} \log^k \frac{X}{n} = k \log^{k-1} X + O(\log^{k-2} X),$$
$$\sum_{n \le X} \Lambda_k(n) = kX \log^{k-1} X + O(X \log^{k-2} X).$$

The parity problem prevents us from finding such a simple computation of the Möbius sum for $k = 1$, since if we had any estimate of the form
$$\sum_{n \le X} \frac{\mu(n)}{n} \log \frac{X}{n} = 1 + o(1),$$
the Prime Number Theorem would immediately follow.

2.5.2. An Easy Misconception about the Generalized von Mangoldt Function. Looking at the formula
$$\sum_{n \le X} \Lambda_k(n) = kX \log^{k-1} X + O(X \log^{k-2} X),$$
we might expect that, just as in the $k = 2$ case, approximately $X \log^{k-1} X$ of the sum comes from the $j$-almost primes, for each $1 \le j \le k$. This is decidedly false in general, since it would break the parity barrier! For $k$ odd this would give an essentially sieve-theoretic way

to show that there are more $k$-almost primes with an odd number of prime factors than with an even number. For $k = 3$ we have
$$\Lambda_3(n) = \Lambda_2(n) \log n + \sum_{d \mid n} \Lambda_2(d) \Lambda\!\left(\frac{n}{d}\right),$$
so since $\Lambda_2(p) = \log^2 p$ and $\Lambda_2(pq) = 2 \log p \log q$, we get
$$\Lambda_3(p) = \log^3 p,$$
$$\Lambda_3(pq) = 3 \log p \log q \log pq,$$
$$\Lambda_3(pqr) = 6 \log p \log q \log r.$$
But the sum of $\log p \log q \log pq$ over almost-primes $pq \le X$ is approximately
$$\frac{1}{2} \log X \sum_{n \le X} \bigl( \Lambda_2(n) - \Lambda(n) \log n \bigr) = \frac{1}{2} X \log^2 X + O(X \log X),$$
and so the 2-almost primes $pq$ actually contribute $\frac{3}{2} X \log^2 X$ to the sum of $\Lambda_3(n)$, which is exactly half the weight of the sum, just as the parity problem predicts. The numbers $p$ contribute $X \log^2 X$ to the sum, the numbers $pq$ contribute $\frac{3}{2} X \log^2 X$, and the numbers $pqr$ contribute just $\frac{1}{2} X \log^2 X$.

2.6. The Asymptotic Sieve for Primes. Friedlander and Iwaniec [7] finally broke the parity barrier by injecting a very strong second condition, which they called the bilinear forms condition. The key tool used in their paper is Vaughan's identity, a simple combinatorial identity for the von Mangoldt function that separates what might be called its wavelengths. The goal of this section is to state the simplest formulation of Vaughan's identity that we will need for producing primes of the form $p^2 + Ny^2$. The identity [14] allows us to separate the main term, Type I error term, and Type II error term directly out of $S(X)$.

Lemma 14. (Vaughan's identity) Choose integers $y, z \ge 1$. For any $n > z$ we have
$$\Lambda(n) = \sum_{b \mid n,\, b \le y} \mu(b) \log \frac{n}{b} \;-\; \sum_{bc \mid n,\, b \le y,\, c \le z} \mu(b) \Lambda(c) \;+\; \sum_{bc \mid n,\, b > y,\, c > z} \mu(b) \Lambda(c),$$
and the right hand side is zero if $n \le z$.

Proof. We have
$$\Lambda(n) = \sum_{b \mid n} \mu(b) \log \frac{n}{b}$$
$$= \sum_{b \mid n,\, b \le y} \mu(b) \log \frac{n}{b} + \sum_{bc \mid n,\, b > y} \mu(b) \Lambda(c)$$
$$= \sum_{b \mid n,\, b \le y} \mu(b) \log \frac{n}{b} + \sum_{bc \mid n,\, b > y,\, c \le z} \mu(b) \Lambda(c) + \sum_{bc \mid n,\, b > y,\, c > z} \mu(b) \Lambda(c)$$
$$= \sum_{b \mid n,\, b \le y} \mu(b) \log \frac{n}{b} - \sum_{bc \mid n,\, b \le y,\, c \le z} \mu(b) \Lambda(c) + \sum_{bc \mid n,\, b > y,\, c > z} \mu(b) \Lambda(c),$$
since for any fixed $c \le z < n$, the whole second sum over $b$ is $\Lambda(c) \sum_{b \mid n/c} \mu(b) = 0$. If $n \le z$, then the first two sums cancel and the last is empty.
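Since Lemma 14 is a finite identity for each $n$, it can be verified mechanically. The sketch below is a brute-force check under the statement as given above, with illustrative function names. It evaluates the three sums on the right hand side for chosen cutoffs $y, z$ and compares the result with $\Lambda(n)$ when $n > z$ and with $0$ when $n \le z$.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def mangoldt(n):
    """log p if n is a power of the prime p, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def vaughan_rhs(n, y, z):
    """Right hand side of Vaughan's identity (Lemma 14) for cutoffs y, z."""
    divs = [d for d in range(1, n + 1) if n % d == 0]
    first = sum(mobius(b) * math.log(n // b) for b in divs if b <= y)
    second = sum(mobius(b) * mangoldt(c)
                 for b in divs if b <= y
                 for c in divs if c <= z and n % (b * c) == 0)
    third = sum(mobius(b) * mangoldt(c)
                for b in divs if b > y
                for c in divs if c > z and n % (b * c) == 0)
    return first - second + third

if __name__ == "__main__":
    y, z = 3, 4
    for n in range(1, 40):
        expected = mangoldt(n) if n > z else 0.0
        assert abs(vaughan_rhs(n, y, z) - expected) < 1e-9
    print("Vaughan's identity verified for n < 40 with y = 3, z = 4")
```

Varying `y` and `z` changes how the mass of $\Lambda(n)$ is split between the three sums, but not their total.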

From here, what needs to be done depends on the application. For example, Friedlander and Iwaniec designed their Asymptotic Sieve for Primes with the application of $x^2 + y^4$ in mind, and broke the last sum in Vaughan's identity further down into three sums depending on the ranges of $b$ and $c$. Heath-Brown also used Vaughan's identity to break the parity barrier for primes of the form $x^3 + 2y^3$, but the exact calculations were different; in fact, to the author's knowledge the exact formulation of the Asymptotic Sieve for Primes in Friedlander-Iwaniec [7] has only been applied to the single case of $x^2 + y^4$, despite the fact that the general method works in a variety of settings.

In general, the Asymptotic Sieve for Primes expects the first two terms, containing the small oscillations, to produce the main term of the sieve using only Type I estimates, whilst the last term must be bounded more delicately in terms of a bilinear forms condition (see Section 3.5 below). If $S(X) = \sum_{n \le X} a_n \Lambda(n)$ as before, then substituting Vaughan's identity this sum resolves as follows.

Lemma 15. Suppose $y, z \ge 1$ and $X > yz$. Then
$$S(X) = S(z) + A(X; y, z) + B(X; y, z),$$
where
$$A(X; y, z) = \sum_{b \le y} \mu(b) \Biggl( \sum_{b \mid n,\, n \le X} a_n \log \frac{n}{b} - \sum_{c \le z} \Lambda(c) \sum_{bc \mid n,\, n \le X} a_n \Biggr),$$
$$B(X; y, z) = \sum_{bd \le X,\, b > y} \mu(b)\, a_{bd} \Biggl( \sum_{c \mid d,\, c > z} \Lambda(c) \Biggr).$$

We will allow $y, z$ to remain indeterminate for now. The term $S(z)$ will be negligible, $A(X; y, z)$ will be the main term asymptotically after we prove the level of distribution, and bounding $B(X; y, z)$ reduces almost immediately to the bilinear forms condition.

3. Primes of the Form $p^2 + Ny^2$

Fix squarefree $N > 0$. Our goal is to count asymptotically the number of primes $q = p^2 + Ny^2$ where $p$ varies through primes and $y$ through all integers. Each $q$ is counted with multiplicity the number of times it occurs as $p^2 + Ny^2$. Write $\Lambda(n)$ for the von Mangoldt function. We will prove Theorem 1, and in particular that there are infinitely many primes of the form $p^2 + Ny^2$. This result generalizes the theorem of Fouvry and Iwaniec [5], which provides the case $N = 1$.

We follow their presentation closely, altering the computations where necessary to accommodate the general case. In fact, Fouvry and Iwaniec proved a more general theorem for the case $N = 1$, replacing $\Lambda(x)$ with any reasonable sequence of complex numbers $\lambda_x$. However, we chose to specialize to the case $\Lambda(x)$ as it leads to a number of simplifications in the ensuing calculations.

Friedlander and Iwaniec [7] provided a general sieve called the Asymptotic Sieve for Primes to count primes of the form $x^2 + y^4$ [8]. In this case applying that sieve is unnecessarily burdensome and provides a much poorer error term. We will be able to obtain an extremely

high level of distribution $X^{1-\varepsilon}$, avoiding many of the difficult computations required by them. Nevertheless our computations are closely related.

We begin with a sequence $\{a_n\}$, which for us is
$$a_n = \sum_{x^2 + Ny^2 = n,\ (x, Ny) = 1} \Lambda(x).$$
We add the condition $(x, Ny) = 1$ to simplify many of the ensuing computations. We wish to write $S(X) = \sum_{n \le X} a_n \Lambda(n)$ in terms of the much easier sums
$$A_d(X) = \sum_{n \le X,\ d \mid n} a_n.$$
We write $A(X) = A_1(X)$. Each $A_d(X)$ can be approximated by $g(d)A(X) + r_d(X)$, where $g(d)$ is the multiplicative function defined on prime powers as:
$$g(p^\alpha) = \begin{cases} \dfrac{1 + \left(\frac{-N}{p}\right)}{p^\alpha} & p \text{ odd},\ p \nmid N, \\[2mm] \dfrac{1}{2} & p = 2,\ p \nmid N,\ \alpha = 1, \\[2mm] \dfrac{1 - \chi_4(N)}{2^\alpha} & p = 2,\ p \nmid N,\ \alpha \ge 2, \\[2mm] 0 & p \mid N, \end{cases}$$
where $\chi_4$ is the nontrivial Dirichlet character mod 4. The remainder term $r_d(X)$ is relatively small. Write $F \ll G$ if $F = O(G)$. Classical sieve theory tells us that a remainder term bound (or level of distribution bound)
$$\sum_{d \le D} |r_d(X)| \ll A(X)(\log X)^{-A}, \tag{3.1}$$
with level of distribution $D$ large enough, is sufficient for estimating
$$S_2(X) = \sum_{n \le X} a_n \Lambda_2(n),$$
summing $a_n$ over 2-almost primes weighted by the second von Mangoldt function $\Lambda_2$.

The well-known parity problem in sieve theory prohibits estimating $S(X)$ directly from only the level of distribution (3.1). Using Vaughan's identity, Fouvry and Iwaniec are able to establish such estimates and break the parity problem with an additional bilinear forms condition:
$$\sum_{M < m \le 2M} \Biggl| \sum_{N < n \le (1+\epsilon)N} \mu(n)\, a_{mn} \Biggr| \ll A(X)(\log X)^{-A}, \tag{3.2}$$
which guarantees that $a_{mn}$ does not conspire with the Möbius function on average.

It is natural to divide the sieve computation into three steps. First, we apply Vaughan's identity to bound $S(X)$ in terms of $A(X)$ and the sums (3.1) and (3.2), and compute the main term. Then, we separately prove the two bounds (3.1) and (3.2).

At the heart of the remainder term bound (3.1) for $x^2 + Ny^2$ is a simple equidistribution result, namely that the solutions $(\nu, d)$ to $\nu^2 + N \equiv 0 \pmod{d}$ are well-spaced in the sense that the fractions $\nu/d$ are far apart when $d$ is restricted to a short interval $D < d \le (1 + \delta)D$.
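The well-spacing phenomenon can be explored experimentally. The following sketch is an illustration only, with hypothetical function names and an arbitrary choice of parameters. It lists every fraction $\nu/d$ with $\nu^2 + N \equiv 0 \pmod d$ for $d$ in a short interval, and computes the smallest distance mod 1 between two of them using exact rational arithmetic. The result can then be compared with the trivial spacing $1/D^2$ of distinct rationals.

```python
from fractions import Fraction

def root_fractions(N, lo, hi):
    """Pairs (nu, d) with lo < d <= hi, 0 <= nu < d and nu^2 + N = 0 (mod d)."""
    return [(nu, d)
            for d in range(lo + 1, hi + 1)
            for nu in range(d)
            if (nu * nu + N) % d == 0]

def min_gap_mod1(pairs):
    """Smallest circular distance between distinct points nu/d of R/Z."""
    pts = sorted(set(Fraction(nu, d) for nu, d in pairs))
    if len(pts) < 2:
        raise ValueError("need at least two distinct fractions")
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 1 - pts[-1])       # wrap-around gap
    return min(gaps)

if __name__ == "__main__":
    N, D, delta = 1, 100, 0.1               # illustrative parameters
    hi = int((1 + delta) * D)
    pairs = root_fractions(N, D, hi)
    gap = min_gap_mod1(pairs)
    print("fractions:", pairs)
    print("smallest gap:", float(gap), " trivial 1/D^2:", 1 / D**2)
```

The search over $\nu$ is quadratic in the modulus, which is adequate for the small ranges where such an experiment is informative.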

The case $N = 1$ was proved by Duke, Friedlander, and Iwaniec [2]; the generalization is not difficult once we first divide the fractions $\{\nu/d\}$ into a finite number of families (depending only on $N$). Such a strong well-spacing result gives a correspondingly strong large sieve inequality. In Section 3.4, we show how to deduce the remainder term bound from this large sieve inequality.

The bilinear forms condition requires writing $a_{mn}$ as a sum over ideals $I, J$ in the ring of integers of $\mathbb{Q}[\sqrt{-N}]$ such that $m = N(I)$, $n = N(J)$. After reformulating the sum in terms of ideals and conditioning on the ideal class group representative, the sum is essentially identical to the one Fouvry and Iwaniec treat. It requires a delicate application of the Cauchy-Schwarz inequality, reducing the inequality to a standard Siegel-Walfisz type bound on Möbius sums for $\mathbb{Q}[\sqrt{-N}]$.

3.1. The Value of A(X). We first compute asymptotically the sum
$$A(X) = \sum_{n \le X} a_n = \sum_{x^2 + Ny^2 \le X,\ (x, Ny) = 1} \Lambda(x).$$
This is elementary, depending only on the prime number theorem.

Lemma 16. Let $A(X)$ be as above. Then
$$A(X) = \frac{\pi X}{4\sqrt{N}} + O\!\left( X \exp\!\left( -\frac{c (\log X)^{3/5}}{(\log \log X)^{1/5}} \right) \right)$$
for a positive constant $c$.

Proof. We compute:
$$A(X) = \sum_{x \le \sqrt{X},\ (x, N) = 1} \Lambda(x) \sum_{(y, x) = 1,\ Ny^2 \le X - x^2} 1$$
$$= \sum_{p \le \sqrt{X},\ p \nmid N} \log p \sum_{p \nmid y,\ Ny^2 \le X - p^2} 1 + O(X^{3/4} \log X)$$
$$= \sum_{p \le \sqrt{X}} \log p \left( \sqrt{\frac{X - p^2}{N}} + O\!\left( \frac{\sqrt{X}}{p} \right) \right) + O(X^{3/4} \log X)$$
$$= \frac{1}{\sqrt{N}} \sum_{x \le \sqrt{X}} \Lambda(x) \sqrt{X - x^2} + O(X^{3/4} \log X).$$
We were free to include and exclude the prime powers $p^\alpha$, $\alpha \ge 2$, at will. Also the finite set of primes $p \mid N$ falls into the error term. Let
$$E(X) = O\!\left( X \exp\!\left( -\frac{c (\log X)^{3/5}}{(\log \log X)^{1/5}} \right) \right)$$

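be the best known error term on the prime number theorem, due to Ford [4]. Applying summation by parts to the main term, we get
$$\sum_{x \le \sqrt{X}} \Lambda(x) \sqrt{X - x^2} = \int_0^{\sqrt{X}} \Bigl( \sum_{x \le t} \Lambda(x) \Bigr) \frac{t\, dt}{\sqrt{X - t^2}}$$
$$= \int_0^{\sqrt{X}} \bigl( t + O(E(t)) \bigr) \frac{t\, dt}{\sqrt{X - t^2}}$$
$$= \frac{\pi}{4} X + O\bigl( \sqrt{X}\, E(\sqrt{X}) \bigr)$$
$$= \frac{\pi}{4} X + O\!\left( X \exp\!\left( -\frac{c (\log X)^{3/5}}{(\log \log X)^{1/5}} \right) \right).$$
The constant $c$ in the last line is not necessarily the same as in $E(X)$.

3.2. Vaughan's Identity. Using Vaughan's identity (Lemma 15), Iwaniec and Fouvry are able to split the sum $S(X)$ into three terms: the main term, a remainder term controlled by (3.1), and a bilinear term controlled by (3.2). We get
$$S(X) = S(z) + A(X; y, z) + B(X; y, z),$$
where
$$A(X; y, z) = \sum_{b \le y} \mu(b) \Biggl( \sum_{b \mid n,\, n \le X} a_n \log \frac{n}{b} - \sum_{c \le z} \Lambda(c) \sum_{bc \mid n,\, n \le X} a_n \Biggr),$$
$$B(X; y, z) = \sum_{bd \le X,\, b > y} \mu(b)\, a_{bd} \Biggl( \sum_{c \mid d,\, c > z} \Lambda(c) \Biggr).$$

3.2.1. Computation of the Main Term. To treat $A(X; y, z)$, we need the level of distribution estimate (3.1) proved in Section 3.4.

Lemma 17. If (3.1) holds and there exists $\varepsilon > 0$ for which $y \le X^{1-\varepsilon}$, then
$$A(X; y, z) = \frac{\pi H_N X}{4\sqrt{N}} + O_A\bigl( X (\log X)^{-A} \bigr)$$
for any $A > 0$, the implicit constant depending only on $A$.

Proof. We first express $A(X; y, z)$ in terms of $A_d(X)$:
$$A(X; y, z) = \sum_{b \le y} \mu(b) \Biggl( A_b(X) \log X - A_b(X) \log b - \int_1^X A_b(t) \frac{dt}{t} - \sum_{c \le z} \Lambda(c) A_{bc}(X) \Biggr).$$
Now, we have the estimate $A_d(X) = g(d)A(X) + r_d(X)$, so we wish to approximate $A(X; y, z)$ by
$$M(X; y, z) = A(X) \sum_{b \le y} \mu(b) \Biggl( g(b) \log(X/b) - \sum_{c \le z} \Lambda(c) g(bc) \Biggr) - \Biggl( \int_1^X A(t) \frac{dt}{t} \Biggr) \sum_{b \le y} \mu(b) g(b).$$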

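Since $A(X)$ is approximately linear, integrating against $dt/t$ does nothing except change constants in the error term:
$$M(X; y, z) = \frac{\pi X}{4\sqrt{N}} \left( 1 + O\!\left( \exp\!\left( -\frac{c (\log X)^{3/5}}{(\log \log X)^{1/5}} \right) \right) \right) \sum_{b \le y} \mu(b) \Biggl( g(b) \log(X/b) - g(b) - \sum_{c \le z} \Lambda(c) g(bc) \Biggr).$$
To deal with the sum over $b$, we first extend over all $b$ and show that most of the sum vanishes:
$$\sum_{b \ge 1} \mu(b) g(b) = \prod_p (1 - g(p)) = 0,$$
since for a positive proportion of primes, $g(p) = 2p^{-1}$. Similarly,
$$\sum_{b \ge 1} \mu(b) \sum_{c \le z} \Lambda(c) g(bc) = \sum_{c \le z} \Lambda(c) \sum_{b \ge 1} \mu(b) g(bc) = \sum_{c \le z} \Lambda(c) \prod_{p \nmid c} (1 - g(p)) \prod_{p \mid c} \bigl( g(c) - g(pc) \bigr) = 0.$$
For the last sum left, the following identity holds:
$$-\sum_{b \ge 1} \mu(b) g(b) \log b = \prod_p (1 - g(p))(1 - p^{-1})^{-1},$$
assuming only that
$$\sum_{p \le X} g(p) = \log \log X + C + O(\log^{-10} X)$$
for all $X$ [7]. Note that the product on the right is exactly $H_N$, so we get
$$M(X; y, z) = \frac{\pi H_N X}{4\sqrt{N}} + O\!\left( X \exp\!\left( -\frac{c (\log X)^{3/5}}{(\log \log X)^{1/5}} \right) \right) + O\Biggl( X \sum_{b > y} \mu(b) \Bigl( g(b) \log(X/b) - g(b) - \sum_{c \le z} \Lambda(c) g(bc) \Bigr) \Biggr).$$
We assume that the sum over $b > y$ in the second error term is $O((\log X)^{-A})$ for any $A$, for a suitable choice of $y$ [5]. It follows that
$$M(X; y, z) = \frac{\pi H_N X}{4\sqrt{N}} + O_A\bigl( X (\log X)^{-A} \bigr),$$
as desired. It remains to handle the remainder term
$$R(X; y, z) = A(X; y, z) - M(X; y, z) = \sum_{b \le y} \mu(b) \Biggl( r_b(X) \log(X/b) - \int_1^X r_b(t) \frac{dt}{t} - \sum_{c \le z} \Lambda(c) r_{bc}(X) \Biggr).$$
In Section 3.4 we show that
$$R_D(X) = \sum_{d \le D} |r_d(X)| \ll_\varepsilon X^{1-\varepsilon}$$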

for all $A \ge 1$, as long as $D \le X^{1-3\varepsilon}$. It follows that
$$R(X; y, z) \ll 2R_y(X) \log X + \int_1^X R_y(t) \frac{dt}{t} \ll_\varepsilon X^{1-\varepsilon},$$
as desired.

3.3. A Large Sieve Inequality.

3.3.1. Fractions Well-Spaced Mod 1. From the reduction of level-of-distribution results to large-sieve type inequalities, we are led to consider, for a fixed integer $N > 0$, the roots of $\nu^2 + N \equiv 0 \pmod{d}$ as $d$ ranges through all positive integers for which $-N$ has a square root mod $d$. In fact any well-spacing of these fractions $\nu/d$ better than the trivial $1/d^2$ spacing of distinct rationals will give a nontrivial large sieve bound. Iwaniec and Friedlander were able to show an almost perfect well-spacing of $\nu/d$ in the case $N = 1$, in fact that they are separated by at least about $1/4d$ when $d \in [(1 - \delta)D, D]$ lies in a short interval. We will try to show the general case, getting a slightly weaker bound.

The first step is to associate to each $\nu/d$ a solution $(x, y, T)$ to $x^2 + Ny^2 = Td$ satisfying $(x, y) = 1$, $x \equiv \nu y \pmod{d}$, and $0 < x \le y\sqrt{N}$. We call such a triple $(x, y, T)$ a standard solution for $\nu/d$. The relationship between well-spacedness of $\nu/d$ and these standard solutions is the following lemma.

Lemma 18. If $(x, y, T)$ is a standard solution for $\nu/d$, then
$$\frac{\nu}{d} \equiv \frac{x}{yd} - \frac{T\bar{x}}{y} \pmod{1},$$
where $\bar{x}$ is the multiplicative inverse of $x$ modulo $y$. In particular $\nu/d$ is within $\sqrt{N}/d$ of a fraction with denominator $y \le \sqrt{Td/N}$.

Proof. The identity is just combining
$$\nu y \equiv x \pmod{d}, \qquad Td\bar{x} \equiv x \pmod{y},$$
to find the residue class of $\nu y$ modulo $dy$. Note that $(x, y) = 1$ implies $(d, y) = 1$. The first fraction $x/yd$ is very small: since $0 < x \le y\sqrt{N}$, it is at most $\sqrt{N}/d$. On the other hand, the second fraction has denominator at most $y$, and $Ny^2 \le Td$.

Using Lemma 18, it is possible to partition the fractions $\nu/d$ into a small number of families, each of which is well-spaced.

Lemma 19. Suppose that for every $\nu/d$ satisfying $\nu^2 + N \equiv 0 \pmod{d}$ with $(d, N) = 1$ there exists a standard solution $(x, y, T)$ for $\nu/d$ for which $T < T_{\max}$. Then there exists a constant $C > 0$ such that for every dyadic interval $I = [D/2, D]$ of moduli, the set of fractions $\{\nu/d,\ d \in I\}$ can be partitioned into at most $C \max(T_{\max}^2,\ T_{\max}\sqrt{N})$

classes, and within each class any two fractions $\nu/d$, $\nu'/d'$ satisfy
$$\left\| \frac{\nu}{d} - \frac{\nu'}{d'} \right\| > \frac{\sqrt{N}}{D}.$$

Proof. Every fraction $\nu/d$ is within $\sqrt{N}/d$ of a fraction with denominator at most $d_{\max} = \sqrt{T_{\max} D/N}$, say $f(\nu/d)$. As before, the fibers $f^{-1}(a/b)$ have size bounded by $T_{\max}\sqrt{N}$. Now, sort the fractions up to denominator $d_{\max}$ mod 1; each consecutive pair has difference at least $1/d_{\max}^2$, so a pair of two such fractions that are $k$ apart are at least $k/d_{\max}^2$ apart in value. If we choose $k$ so that
$$k > \frac{4\sqrt{N}\, d_{\max}^2}{D},$$
then we can partition our set of fractions into $O(k)$ classes first, so that within each class the fractions either correspond to the same $x/y$ or else are at least $2\sqrt{N}/D$ apart. Split each of these classes further into $O(T_{\max}\sqrt{N})$ classes so no two fractions correspond to the same $x/y$, and we get the desired result with $O(kT_{\max}\sqrt{N})$ classes. Choosing $k = \max(1,\ 4T_{\max}/\sqrt{N})$, we get the exact bound $O(T_{\max}^2)$ on the number of classes.

With this lemma in hand, we can begin to prove the large sieve inequality we need. All that is needed now is the construction of standard solutions with bounded $T$.

3.3.2. Construction of Standard Solutions. Numbers representable as $Td = x^2 + Ny^2$ correspond to norms of principal ideals in the ring of integers in $\mathbb{Q}[\sqrt{-N}]$. Write $K = \mathbb{Q}[\sqrt{-N}]$ and $\mathcal{O} = \mathcal{O}_K$ for its ring of integers.

Lemma 20. There exists a constant $T_N > 0$ depending only on $N$ such that for each $d$ with $(d, 2N) = 1$ for which $-N$ has square roots mod $d$, and each solution $\nu$ to $\nu^2 + N \equiv 0 \pmod{d}$, there exists a corresponding standard solution $(x, y, T)$ for which $T \le T_N$.

Proof. Because $(d, 2N) = 1$ we don't need to deal with ramified primes. Let $G$ be the ideal class group of $\mathcal{O}$, and pick a set of generators as a finite abelian group $\{[I_1], [I_2], \ldots, [I_m]\}$, so that $[I_i]$ has order $l_i$ and
$$G \cong \bigoplus_{i \le m} \mathbb{Z}/l_i\mathbb{Z},$$
where the generator in each component $\mathbb{Z}/l_i\mathbb{Z}$ is $[I_i]$. Let $I_i$ be arbitrary prime representatives of $[I_i]$ (there are infinitely many primes in any ideal class). Now, for a given $d$, a standard solution $x^2 + Ny^2 = Td$ is just a principal ideal $(x + y\sqrt{-N})$ with norm a multiple of $d$.

We can factor
$$d = \prod_i \mathfrak{q}_i^{e_i}\, \bar{\mathfrak{q}}_i^{e_i}$$
over $\mathcal{O}$. Every integer prime factor splits since $-N$ has square roots mod $d$ and is coprime to $d$. A standard solution then corresponds to an integral multiple of
$$z = \prod_i \mathfrak{p}_i^{e_i},$$
where $\mathfrak{p}_i$ is either $\mathfrak{q}_i$ or $\bar{\mathfrak{q}}_i$. Regardless of the choice of $\mathfrak{p}_i$, it is possible to pick $J_i \in \{I_i, \bar{I}_i\}$ so that $J_i \ne \bar{\mathfrak{p}}_{i'}$ for any $i, i'$, so that $\{J_i\}_{i \le m}$ is a set of generators for $G$. It follows that there is a unique product
$$z \prod_{i \le m} J_i^{\alpha_i} = (x + y\sqrt{-N})$$

which is principal, where $0 \le \alpha_i < l_i$. We need to verify that $(x, y)$ are coprime, that distinct choices of $z$ correspond to distinct $\nu \equiv x/y \pmod{d}$, and that all possible $\nu$ are attained by some such $x + y\sqrt{-N}$.

We chose $J_i$ to be nonprincipal prime ideals which are not conjugate to any of the prime factors of $z$, and not conjugate to each other. Therefore, $x + y\sqrt{-N}$ has no nontrivial real integral factors and $(x, y) = 1$. If two distinct $z, z'$ correspond to the same solution of $\nu^2 + N \equiv 0 \pmod{d}$, then the corresponding standard solutions satisfy
$$x + y\sqrt{-N} \equiv u(x' + y'\sqrt{-N}) \pmod{d}$$
for a real integer $u$ coprime to $d$. But factoring $d$ and using the Chinese Remainder Theorem, this means that the same choice of conjugate of $\mathfrak{q}_i$ divides both $z$ and $z'$ for every $i$, and so $z, z'$ are the same. There are $2^{\omega(d)}$ distinct solutions to $\nu^2 + N \equiv 0 \pmod{d}$, where $\omega(d)$ counts the number of distinct (real) prime factors of $d$, and there are $2^{\omega(d)}$ choices of $z$. It follows that they are in bijection, as desired.

In general, elements $x + y\sqrt{-N}$ of $\mathcal{O}_N$ may have $x, y$ half-integers; since $d$ is odd we may multiply by 2 to guarantee $x, y \in \mathbb{Z}$. Finally, to make them all into standard solutions, we need to apply the transformation $(x, y) \to (Ny, x)$ to those $(x, y)$ with $x > y\sqrt{N}$. If $y < 0$ we multiply this by $-1$. The only common factors that can be introduced are ramified primes over $\mathcal{O}_N$, so we are free to divide them out from $(Ny, x)$ until we have a standard solution. Thus, we have constructed standard solutions satisfying
$$T \le 4N \cdot \mathrm{N}\Bigl( \prod_{i \le m} J_i^{l_i} \Bigr),$$
where $\mathrm{N}(\cdot)$ denotes the ideal norm, and this is the value of $T_N$ we take.

3.3.3. The Large Sieve Inequality. We have shown that when $D/2 < d \le D$ and $(d, 2N) = 1$, the maximum value of $T_{\max}$ for any given $d$ is $T_N$, independent of $D$. Now we need the large sieve inequality of Montgomery and Vaughan [11]. We say that a set of points $\alpha_r \in \mathbb{R}/\mathbb{Z}$ is $\delta$-spaced if their pairwise distances mod 1 are at least $\delta$. We state it in the same form as Theorem 9.1 from Opera de Cribro [6].

Theorem 21. (The Large Sieve Inequality.) For any set of $\delta$-spaced points $\alpha_r \in \mathbb{R}/\mathbb{Z}$ and any complex numbers $a_n$ with $n \le N$, where $0 < \delta \le 1$ and $N$ is a positive integer, we have
$$\sum_r \Bigl| \sum_{n \le N} a_n e(\alpha_r n) \Bigr|^2 \le (\delta^{-1} + N - 1) \sum_{n \le N} |a_n|^2.$$

This inequality is a statement about average cancellation of exponential sums, but originates from the study of sieving problems where the number of residue classes to sieve mod $p$ is comparatively large, hence the name. Here we do not use it directly as a sieve inequality.

Lemma 22. For any squarefree $N > 0$,
$$\sum_{d \le D}\ \sum_{\nu^2 + N \equiv 0\ (d)} \Bigl| \sum_{m \le M} \alpha_m e(\nu m/d) \Bigr| \ll D^{1/2} (D + M)^{1/2} \log D\ \|\alpha\|_2$$
for any complex numbers $\{\alpha_m\}_{m \le M}$. The first sum is over $(d, N) = 1$.

Proof. By the large sieve inequality, we have immediately
$$\sum_{D/2 < d \le D}\ \sum_{\nu^2 + N \equiv 0\ (d)} \Bigl| \sum_{m \le M} \alpha_m e(\nu m/d) \Bigr|^2 \ll (D + M) \|\alpha\|_2^2$$

26 XIAOYU HE when summe over (, 2N = 1. Summing over yaic intervals, we get α m e(νm/ 2 (D + M log D α 2 2, m M D ν 2 +N 0( an it remains to remove the parity conition on, assuming N is o. In this case, any given can be written = 2 t where is o, an an inner sum for can be split into 2 t sums for : D ν 2 +N 0( α m e(νm/ m M 2 = D 2 t D ν 2 +N 0(2 t t log 2 D D ν 2 +N 0( α m e(νm/2 t m M m M t log 2 D k 2 t D ν 2 +N 0( t log 2 D α m e(νm/2 t α 2 t m +ke(νk/2 t e(νm / m M/2 t (D + M log D α 2 t m +k 2 k 2 t m M/2 t (D + M(log D 2 α 2 2. Finally, Cauchy-Schwarz gives the esire inequality. In the level of istribution calculation, however, the harmonics we use will not be e(νm/ but instea the relate sum ( y0 k ρ(k, l; = e. Ny 2 0 +l2 0( For (, N = 1, we can multiply the conition by N 1 an write ( νk ρ(k, l; = e Thus, Write = ν 2 +N(N 1 l 2 0( ν 2 +N 0( α k,l ρ(k, l; = D k K l L D k K ( νn 1 kl e. l L D ν 2 +N 0( α n = τ(n kl=n α k,l, α k,l k K ν 2 +N 0( 2 2 ( νn 1 kl e ( νn 1 kl. α k,l e so that (α k,l 2 ( α n 2. Thus, we get a large sieve boun on sums involving ρ(k, l,. α k,l ρ(k, l; D 1/2 (D + KL 1/2 log D ( α n 2. D k K l L l L 2

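On the other hand,
$$\|(\alpha_n)\|_2^2 = \sum_k \sum_l \tau(kl)\, |\alpha_{k,l}|^2 \ll \log(KL)\, \|(\alpha_{k,l})\|_2^2.$$
We write $\|\alpha\|_2 = \|(\alpha_{k,l})\|_2$ henceforth.

Lemma 23. For any squarefree $N > 0$,
$$\sum_{d \le D} \Bigl| \sum_{k \le K} \sum_{l \le L} \alpha_{k,l}\, \rho(k, l; d) \Bigr| \ll D^{1/2} (D + KL)^{1/2} (\log(KL))^{1/2} \log D\ \|\alpha\|_2$$
for any complex numbers $\alpha_{k,l}$. Here the sum is over $(d, N) = 1$.

3.4. The Level of Distribution. The sums $A_d(X)$ should be approximable by a multiplicative function times $A(X)$:
$$A_d(X) = g(d)A(X) + r_d(X).$$
In this section we show that $r_d(X)$ is small on average; write
$$R_D(X) = \sum_{d \le D} |r_d(X)|.$$

Lemma 24. For any $\varepsilon > 0$, if $D = X^{1-3\varepsilon}$, then
$$R_D(X) \ll X^{1-\varepsilon}, \tag{3.3}$$
where the implicit constant depends only on $\varepsilon$.

3.4.1. Preliminaries. Pick $\delta > 0$. It is possible to construct $f : \mathbb{R} \to [0, 1]$ to be a smooth function approximating the indicator function of $[0, X]$ satisfying the following conditions: $f$ is supported on $[0, X]$, identically 1 on $[\delta, X - \delta]$, and the $n$-th derivative of $f$ scales inversely with $\delta$:
$$\frac{d^n f}{dt^n} \ll_n \frac{1}{\delta^n},$$
uniformly in $t$, the implicit constant depending only on $n$. Define
$$A_d(f) = \sum_{d \mid n} a_n f(n).$$
But expanding the definition of $a_n$, the sum $A_d(f)$ can be divided into many sums of smooth functions over arithmetic progressions. Thus we can apply Poisson summation; write $e(t) = e^{2\pi i t}$.
$$A_d(f) = \sum_{(x, N) = 1} \Lambda(x) \sum_{Ny_0^2 + x^2 \equiv 0\ (d)}\ \sum_{y \equiv y_0\ (d)} f(x^2 + Ny^2),$$
$$\sum_{y \equiv y_0\ (d)} f(x^2 + Ny^2) = \frac{1}{d} \sum_k e\!\left( \frac{y_0 k}{d} \right) \int f(x^2 + Nt^2)\, e\!\left( -\frac{tk}{d} \right) dt,$$
$$A_d(f) = \frac{1}{d} \sum_{(x, N) = 1} \Lambda(x) \sum_k\ \sum_{Ny_0^2 + x^2 \equiv 0\ (d)} e\!\left( \frac{y_0 k}{d} \right) \int f(x^2 + Nt^2)\, e\!\left( -\frac{tk}{d} \right) dt$$
$$= \frac{1}{d} \sum_{(x, N) = 1} \Lambda(x) \sum_k \rho(k, x; d)\, I(k/d, x),$$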