Experimental Mathematics Publication details, including instructions for authors and subscription information:

This article was downloaded by: [Boston College], [Robert Gross] On: 09 December 20, At: 6:07 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 072954 Registered office: Mortimer House, 37-4 Mortimer Street, London WT 3JH, UK Exerimental Mathematics Publication details, including instructions for authors and subscrition information: htt://www.tandfonline.com/loi/uexm20 Frequencies of Successive Pairs of Prime Residues Avner Ash a, Laura Beltis b, Robert Gross a & Warren Sinnott c a Deartment of Mathematics, Boston College, Chestnut Hill, MA, 02467-3806, USA b Actuarial Analysis Deartment, Tufts Health Plan, Watertown, MA, 02472, USA c Deartment of Mathematics, The Ohio State University, Columbus, OH, 4320-74, USA Available online: 28 Nov 20 To cite this article: Avner Ash, Laura Beltis, Robert Gross & Warren Sinnott 20: Frequencies of Successive Pairs of Prime Residues, Exerimental Mathematics, 20:4, 400-4 To link to this article: htt://dx.doi.org/0.080/0586458.20.565256 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: htt://www.tandfonline.com/age/terms-and-conditions This article may be used for research, teaching, and rivate study uroses. Any substantial or systematic reroduction, redistribution, reselling, loan, sub-licensing, systematic suly, or distribution in any form to anyone is exressly forbidden. The ublisher does not give any warranty exress or imlied or make any reresentation that the contents will be comlete or accurate or u to date. The accuracy of any instructions, formulae, and drug doses should be indeendently verified with rimary sources. The ublisher shall not be liable for any loss, actions, claims, roceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.

Exerimental Mathematics, 204:400 4, 20 Coyright C Taylor & Francis Grou, LLC ISSN: 058-6458 rint / 944-950X online DOI: 0.080/0586458.20.565256 Frequencies of Successive Pairs of Prime Residues Avner Ash, Laura Beltis, Robert Gross, and Warren Sinnott CONTENTS. Introduction 2. Notation 3. The Heuristic 4. Some Observations 5. Numerical Checks of Antidiagonal Symmetry 6. Numerical Checks of Power-of-2 Prediction 7. Numerical Checks of the Heuristic 8. Another Prediction 9. Further Oen Questions Acknowledgments References We consider statistical roerties of the sequence of ordered airs obtained by taking the sequence of rime numbers and reducing modulo m. Using an inclusion/exclusion argument and a cutoff of an infinite roduct suggested by Pólya, we obtain a heuristic formula for the robability that a air of consecutive rime numbers of size aroximately x will be congruent to a, a + d modulo m. We demonstrate some symmetries of our formula. We test our formula and some of its consequences against data for x in various ranges.. INTRODUCTION In a beautiful aer from 959, Pólya resented a heuristic formula to aroximate the number of rimes <x such that q = + d is also rime, where d is a ositive even integer [Pólya 59]. His formula is a secial case of the Bateman Horn conjecture [Bateman and Horn 62], and had already been roosed in [Hardy and Littlewood 23]. However, his method of derivation is much more elementary. Pólya uses the standard robabilistic model of the rimes. His innovation is in deciding where to cut off a certain roduct, which he determines by invoking Mertens s theorem. In our treatment, this cutoff ste, involving Euler s constant, occurs in 3. Pólya does not care whether there are rimes between and q. In contrast, we ask about consecutive rime airs, and we are interested in gas not only equal to d but congruent to d modulo a given ositive integer m. This interest stems from our earlier work [Ash et al. 09], in which we studied the statistics of the sequence of rime numbers modulo m. For this reason, we actually look at the finer count of consecutive rime airs congruent to a given a, a + d mod m. We can thus state our roblem as follows: 2000 AMS Subject Classification: N05, K45, N69 Keywords: rime airs, Bateman Horn, Hardy Littlewood, Pólya 400 Problem.. Given ositive integers a, d, andm and a ositive number x, define Na, d, m, x to be the number of consecutive rime airs <q such that <x,

Ash et al.: Frequencies of Successive Pairs of Prime Residues 40 a, andq a + d mod m. What are the asymtotics of Na, d, m, x asx tends to infinity? Dirichlet s theorem tells us that if we look only at single rimes, the number of rimes <x with a mod m is aroximately πx/ϕm, indeendent of a rovided that a is relatively rime to m. Here ϕ is the Euler ϕ-function. One might suose that the residues modulo m of consecutive rime airs would be indeendent in a seudorobabilistic sense, and therefore that Na, d, m, x would be aroximately indeendent of a and d rovided that a and a + d are relatively rime to m. However, this aears not to be the case in the numerical evidence comiled in [Ash et al. 09] and in this aer. For examle, suose m = 4 and consider consecutive airs of rimes congruent to, and, 3 resectively. Our data show that there is a definite tendency for the, -airs to occur considerably less frequently than the, 3-airs. We do not know whether this tendency ersists as x increases. The results in [Hardy and Littlewood 23] suggest an exlanation of the imbalance. If d is a ositive even integer, then Hardy and Littlewood roose [Hardy and Littlewood 23,. 42, Conjecture B] that the robability that x and x + d are rime is asymtotic to Here d,>2 C 2 = >2 2 2C 2 log x 2. 2. In articular, the robability that x and x + 2 are rime is asymtotic to 2C 2 log x 2. We observe that the deendence on d occurs only through the factor d,>2 2, and therefore this factor should tell us the relative frequency of rimes such that + d is also rime when comared to the frequency of twin rimes. Here is a table of values for small d: d 2 4 6 8 0 2 4 6 d,>2 2 2 4 3 2 6 5 Thus, for examle, rimes such that + 6 is also rime should occur twice as often as twin rimes. If is a rime congruent to mod 4, and the next rime q is congruent to 3 mod 4, then q = + d for d {2, 6, 0, 4,...}, while if the next rime q is congruent to mod 4, then q = + d for d {4, 8, 2, 6,...}. The table above suggests that, q, 3 mod 4 would be more likely. However, the suggestion raises further questions, since even if + d is rime, it does not mean that it is the next rime after. For examle, it is erfectly ossible for each of, +4,and + 6 to be rime. Thus there are interactions among the various events and + d are rime that need to be taken into account if we want to redict whether, q, 3 mod 4 is more likely than, q, mod 4. To the best of our knowledge, Problem. is wide oen, and cannot be treated using L-functions, unlike the case of Dirichlet s theorem. In this aer, we attemt a heuristic solution to Problem. by alying an inclusion/exclusion argument, together with Pólya s heuristic argument, in Section 3. This leads to a formula for a function we call P a, d, m, x, which heuristically should be the robability that x is a rime congruent to a mod m and the next largest rime is congruent to a + d mod m. Thus, we should have the aroximation Na, d, m, x P a, d, m, xx, in the sense that P a, d, m, xx lim =. x Na, d, m, x However, is not the aroriate conjecture to verify. For the same reasons that the aroximation πx x/ log x is less accurate than πx x 2 dy/ log y, we instead comare Na, d, m, X Na, d, m, x with X y =x+ P a, d, m, y. We do not attemt to rovide a measure of how good this aroximation should be, since there is no actual underlying robability distribution we are samling. Nor did Pólya do so in [Pólya 59]. In fact, our exression for P is an infinite series, and we do not know whether this series converges. Thus, we choose an integer J 0 and truncate P at the Jth term to obtain a finite exression P J a, d, m, x. Our heuristic rooses that P J a, d, m, x should be the robability that x is a rime congruent to a mod m and that the next rime is x + d + jm for some j =0,...,J. Thus, if J is sufficiently large given x, our heuristic suggests that P J a, d, m, x should be a density function for Na, d, m, x in the sense given above. Our heuristic does not tell us exactly how to choose J. We suggest below that d + Jm 4 log x is a reasonable choice. We first investigate roerties of the heuristic formula for P J for a fixed J and test them against some actual data. Below we resent data for only a few values of m, which suffice to convey the general icture.

402 Exerimental Mathematics, Vol. 20 20, No. 4 The function P J a, d, m, x ossesses two exact symmetries. First, Proosition 4.2 states that P J a, d, m, x =P J a d, d, m, x. We call this antidiagonal symmetry because of its aearance when we arrange the values of P J a, d, m, x in a matrix indexed by a and a + d. Second, Proosition 4. says that when m is a ower of 2, P J a, d, 2 k,x is indeendent of a. We test these two symmetry redictions against actual data in Sections 5 and 6 resectively. The idea is that X y =x+ P J a, d, m, y should closely aroximate Na, d, m, X Na, d, m, x. Therefore the matrix of Na, d, m, X, indexed by a, a + d, should show aroximately these two symmetries. In addition to these symmetries, the observations at the start of Section 4 imly that if we combine the terms in /log x 2 from P J a, d, m, x, we obtain a sum that is indeendent of a. Heuristically, we might exect this term to be dominant as x, and therefore indeendence of a should aear for large x. We test this rediction numerically in Section 8. It would be interesting to have analytical roofs of any of these symmetries for the ratio Na, d, m, x/πx in the limit as x. As stated above, we do not know whether Na, d, m, x/πx might be indeendent of both a and d in this limit. We comare the actual values of X y =x+ P J a, d, m, y and Na, d, m, X Na, d, m, x for x =0 3, X =0 6, and various small values of a, d, and m in Section 7. We begin our sum at 0 3, because our heuristic relies on x being large relative to a, d, andm. The alternating sums 9 become too large for feasible comutation when X gets much beyond 0 6. For any x, P J, 2, 2,x should aroximate log x. We test this for J = 28 and various x in Section 4. We also discuss what haens when J varies. We rove an internal consistency, called vertical comatibility, in Proosition 4.3 and Corollary 4.4. This asserts that if m n and J is given, there exists J such that P J a, d, m, x equals a certain sum of P J a,d,n,x s. The function P J aears to stabilize surrisingly quickly as a function of J. Exerimentally, for small values of x, P J still aears to remain aroximately constant, even when J is much larger than 4 log x. See the remarks at the end of Section 4. This stabilization as J varies is somewhat surrising; our heuristic is based on a robabilistic icture, whereas, for examle, we know that the next rime after is certainly less than 2. It would be interesting to have a theoretical gras of how P J varies with J. We could try to eliminate the deendence on J by looking at P a, d, m, x := lim J P J a, d, m, x. However, we see no way of roving that the limit exists. The sum defined by the limit is certainly not absolutely convergent see Section 4 below. Our heuristic makes less and less sense if x is fixed and J increases. 2. NOTATION To simlify the job of the reader, we record all of our considerable notation at the outset. Fix an integer m> and a ositive integer a. LetS be a finite set of nonnegative numbers containing 0. Let bearime.letx>2 be an integer. We resent a list of some of our notation in Table. The definition of n in 3 means the number of distinct residue classes modulo in the set S. Once is larger than the largest element in S, wehaven S = ns. The roduct in 5 is infinite but converges, because for >n, both the numerator and denominator have leading terms n. The roduct in 6 contains only finitely many terms that do not equal, because once >nand >maxs, the numerator and denominator are equal. Below, we will tyically think of a as a ositive integer. However, a aears only as the first argument in βa, m, S. In turn, the definition of βa, m, S deends only on a in testing the condition a + s, m =. Thus, βa, m, S, Qa, d, m, x, j, and P J a, d, m, x deend only on the value of a mod m. 3. THE HEURISTIC We aly Pólya s heuristic to the roblem of determining the following robabilities:. Given a ositive integer d, the robability that x and x + d are successive rimes. 2. Given integers a, d andm 2, the robability that x and x + d are successive rimes and x a mod m.

Ash et al.: Frequencies of Successive Pairs of Prime Residues 403 n = ns = cards 2 n = n S = cards mod 3 n if n = n 4 otherwise A n = n n 5 αs = n S ns 6 { m m n S βa, m, S = if a + s, m = for all s S 0 otherwise 7 S k = {0,,...,k} 8 Qa, d, m, x, j = ns A ns αsβa, m, S log x ns {0,d+jm} S S d + jm 9 P J a, d, m, x = J Qa, d, m, x, j 0 j=0 3. Given integers a, d andm 2, the robability that x is rime, x a mod m, and the next rime is congruent to a + d mod m. 3.. The Probability That x and x + d Are Successive Primes Let S be a nonemty finite set of nonnegative integers. Let be a rime number. Then for integers x, Prob x + k for k S n, where n = n S is defined in 3. Note that n. Furthermore, if n =, then each residue class modulo is reresented by some k S; hence x + k for some k S, sothatprob x + k for k S is indeed 0. Also n = n := card S as soon as is sufficiently large larger than the largest element in S will suffice. In [Pólya 59] a way is roosed to ass from the robability that x + k for k S to the robability that x + k is rime for k S. Pólya argues that the events x + k for k S for various rimes should be considered heuristically indeendent as long as is small TABLE. Summary of notation. comared with x, so that we may conclude that Prob x + k for k S for all <y <y n as long as x is sufficiently larger than y. Nowifx>y> x /2, then x + k for k S for all <y x + k is rime for k S, at least if x is large enough that the size of the elements of S can be ignored. Pólya roosed that the correct value of y should be x µ, where µ = e γ,andγ =0.5772... is Euler s constant. His justification for this trick of the magic µ is simly and solely that it works when S = {0}: indeed, by Mertens s theorem, <x µ log x, while the robability that x is rime is also log x,bythe rime number theorem. Accordingly we roose Probx + k is rime for each k S n. <x µ 3

404 Exerimental Mathematics, Vol. 20 20, No. 4 We rewrite 3 as follows: n <x µ n n n<<x µ n n n n<<x µ n <x µ n<<x µ n n<< n n n n n n n n<<x µ <x µ n n n n n. 3 2 The second roduct may be viewed as a finite roduct, because n = n for sufficiently large. The first two roducts are αs, as defined in 6. The third roduct aroaches A n, as defined in 5. The fourth roduct satisfies <x µ n log x n by Mertens s theorem. Putting all this together gives Probx + k is rime for k S αs log x n. 3 3 As a check, let d be a ositive integer and take S = {0,d}. We should retrieve the Hardy Littlewood formula. In fact, A 2 = 2 2 =4 2 2 >2 =4 2, >2 and n S =if d, with n S = 2 otherwise. Hence, { 0 if d is odd, α{0,d} = if d is even, 2 d,>2 2 which agrees with the formula from [Hardy and Littlewood 23] quoted in the introduction note that our A 2 is four times Hardy and Littlewood s C 2. Let d be a ositive integer. We want to aly the formula 3 3 to find the robability that x and x + d are A n successive rimes. Let S d = {0,,...,d}. Then by inclusion/exclusion, Probx is rime and x + d is the next rime = ns A ns αs log x. ns {0,d} S S d 3.2. The Probability That x and x + d Are Successive Primes and x a mod m We start over with a congruence condition. Let m 2 and a be integers. Then Probx a modm m. Suose that is a rime divisor of m. Then the congruence x a mod m determines whether x + k for any k S. Indeed, if a + k is not relatively rime to m for some k S, then Probx a mod m andx + S are rimes = 0. Here x + S denotes {x + s : s S}. On the other hand, if m, the conditions x + k for k S should be indeendent of congruences modulo m and indeendent of each other, heuristically seaking. So if a + k is relatively rime to m for every k S, then So Probx a mod m andx + S are rimes n. m <x µ m Probx a mod m andx + S are rimes n if a + k, m = k S, m <x µ m 0 otherwise. In the first case, when the elements of a + S are all relatively rime to m, wehaven <for all m, and therefore m Thus <x µ m n = m m m n <x µ n A n αs m n log x n. Probx a mod m andx + S are rimes A n βa, m, SαS log x n,

Ash et al.: Frequencies of Successive Pairs of Prime Residues 405 where βa, m, S is defined in 7. Let d be a ositive integer. Then by inclusion/exclusion, Probx is rime, x a mod m, and x + d is the next rime ns 2 Probx a mod m {0,d} S S d and x + S are rimes ns A ns αsβa, m, S log x. ns {0,d} S S d = Qa, d, m, x, 0. 3.3. The Probability That x Is Prime, x a mod m, and the Next Prime Is a + d mod m Finally, suose that d m and that a, m =and a + d, m =. Then Probx a mod m, x rime, and the next rime is a + d mod m P J a, d, m, x J = ns A ns αsβa, m, S log x, ns j=0 S where the second sum is over {0,d+ jm} S S d+jm. One ossibility is to let j range from 0 to. However, there are various finite uer bounds that can be used. We need to use a large enough value of J to ensure that the set x + S d+jm contains the next rime after x assuming that x is rime. Bertrand s ostulate guarantees that choosing J such that d + Jm x is sufficient. We can safely use a much smaller value of J, however. The average ga between rimes of size x is log x, and the standard deviation is also log x. Wecanbealmost certain that x + S d+jm will contain the next rime if d + Jm 4 log x. It is worth recalling Pólya s remark in [Pólya 59,. 384, footnote]: When we consider a fixed number of rimes, the robabilities introduced can be regarded as indeendent, but they cannot be so regarded when the number of rimes increases in an arbitrary manner. Unfortunately, as a ractical matter, we are only able to choose J such that d + Jm is smaller than 55 so that J = 55 d/m ; otherwise, the number of subsets of S d+jm is too large to be comutationally feasible. Thus in comuting our heuristic, we need to limit ourselves to 4 log x 55, or x less than roughly 0 6. We will discuss this further below. 4. SOME OBSERVATIONS We observe the following: If S contains any odd integers, then αs =0,because n 2 S =2. If a, m, then βa, m, S = 0, because 0 S. Therefore, we will henceforth assume that a, m =. If {0,d+ jm} S and a + d, m, then βa, m, S =0. If a a mod m, then βa, m, S =βa,m,s. If βa, m, S 0, then βa, m, S is indeendent of a. Proosition 4.. If m is a ower of 2, then αsβa, m, S is indeendent of a. Hence,ifm =2 k and a is odd, then P J a, d, m, x is indeendent of a. Proof. Because a, m =, we know that a is odd. If S contains any odd elements, then αs = 0. If S contains only even numbers, then a + S will contain only odd numbers, and hence will be relatively rime to m, regardless of the value of a. It is temting to rearrange the terms in 0 in order of increasing ns and let J. However, the sum of the terms with ns = 2 is divergent. This is true in general, but it is simlest to see when m =2 k. In that case, we must have a odd and d even in order for αsβa, m, S 0. The terms with ns = 2 all have the form S = {0,d+ jm}. In this situation, we can easily comute that βa, m, S = 2 k 2 = 2 k. On the other hand, αs will always be larger than 2,so all of the infinitely many terms with ns = 2 will be A larger than 2, and hence will diverge. 2 k log x 2 Proosition 4.2. Antidiagonal symmetry. P J a, d, m, x =P J a d, d, m, x.

406 Exerimental Mathematics, Vol. 20 20, No. 4 Proof. We will show that for each set {0,d+ jm} S S d+jm in the sum in 0 for P J a, d, m, x, there is a set {0,d+ jm} S S d+jm with ns =ns, αs = αs, and βa, m, S =β a d, m, S. In articular, set S =d + jm S. We automatically have {0,d+ jm} S S d+jm. Obviously, ns = ns, and for any rime, n S =n S. This shows that αs =αs. Finally, suose that for every element s S, wehave a + s, m =,soβa, m, S 0. Then a s, m =, and so jm a s, m =. Therefore, d + jm s+ a d,m =, showing that β a d, m, S 0.We know that n S =n S, and therefore if βa, m, S 0, we see that βa, m, S =β a d, m, S. Conversely, if βa, m, S = 0, the same argument shows that β a d, m, S = 0. We may conclude that P J a, d, m, x = P J a d, d, m, x. Proosition 4.3. Vertical comatibility. Fix J.Suose that n is a ositive integer and m n. Then P J a, d, m, x = P J a,d,n,x, where J +m =J +n. a a mod m d d mod m a,d n Proof. We will first show that each set S d+jm used in the sum on the left-hand side of the equation aears in the sum on the right-hand side of the equation for a unique d. To see this, choose d d + jm mod n with d n. Write d + jm d = j n, and then d + jm = d + j n. On the other hand, if we start with d d mod m and j 0, it is clear that we can find a unique j such that d + jm = d + j n. We know that j J and d n + d m. Therefore, the uer bound for J is J +n m m. This leaves us needing to show that βa, m, S = βa,n,s. a a mod m a n Notice that if βa, m, S = 0, then automatically βa,n,s = 0 for all ossible values of a a mod m. So we may assume that βa, m, S 0. It suffices to consider the case n = qm, with q rime. Notice that the set {a, a + m, a +2m,...,a+q m} contains the comlete set of integers a with a a mod m and a qm. In articular, there are q such integers a. There are two cases to finish the roof: Case : q m. Ifa + S contains only elements relatively rime to m, anda a mod m, then a + S contains only elements relatively rime to m. Hence, a + S contains only elements relatively rime to qm. Therefore, βa, m, S =0 βa,qm,s = 0. Note also that if is rime, then qm m. If βa,qm,s 0, then we know that βa,qm,sis indeendent of a. There are q ossible values of a,and we have βa,qm,s= qm n S a qm = qm n S m = q qm n S m = = βa, m, S. m n S m Case 2: q m. In this case, the elements of the set {a, a + m,...,a+q m} are distributed over all q of the residue classes modulo q. Suose that n q S =c. Then there are q c ossible residue classes r modulo q such that r + S will be relatively rime to q, and there are c residue classes such that r + S will have a factor in common with q namely r s mod q for some s S. Let r be one of the residue classes with r + S relatively rime to q. Recall as usual that if βa, qm, S 0, then it is indeendent of a. We have βa,qm,s a a mod m a qm =q n q Sβr,qm,S =q n q S qm n S qm =q n q S q qm q n q S n S m m = = βa, m, S. m n S Corollary 4.4. Choose an even integer m>2, fixj,and let J = J +m 2 2.Then P J a, d, m, x =P J, 2, 2,x. 4 a m d m

Ash et al.: Frequencies of Successive Pairs of Prime Residues 407 Proof. Proosition 4.3 reduces the left-hand side of 4 to P, 2, 2,x+P,, 2,x+P 2,, 2,x+P 2, 2, 2,x. The last three terms in this sum are all 0. Note that if for some reason it is desirable to begin with odd m, we may relace m with 2m without changing the sum on the left-hand side of 4, again because of Proosition 4.3. Theoretically, P J, 2, 2,x should give the robability that x is rime, which according to our heuristic is log x. We can sum only to J = 28, because the comutational time involved becomes inordinate, and we tabulate our results here: x P 28, 2, 2,x log x Ratio 0 0.425586 0.434294 0.9799 0 2 0.2747 0.2747.0000 0 3 0.44765 0.44765.0000 0 4 0.08567 0.08574 0.9999 0 5 0.0867455 0.0868589 0.9987 0 6 0.079549 0.0723824 0.994 We conjecture that the relatively oor agreement with rediction when x = 0 is caused by relacing a finite roduct over the range <x µ with the infinite roduct defining A n in the enultimate term of formula 3 2. The roduct defining A n converges relatively raidly, but there is still an areciable error introduced by including so many more terms when x = 0. The fact that the ratio is slowly tending away from might show that for x 0 4, J should be larger than 28 for best results. More interestingly, the comutations for x =0 2 and x =0 3 converged relatively raidly and did not change significantly when more terms were included in the sum. For x =0 2, the series attained the value 0.27... for J = 3. Extending the sum over larger values of J did not change the first three significant digits. The sum took the value 0.275... at J = 20, and that did not change u to our maximum value of J = 28. For x =0 3,thesummation takes values varying between 0.44765... and 0.44767... when J runs from 6 to 28. It is hard to understand this aarent stabilization of P J for varying J; our heuristic makes less and less sense as J increases for fixed x. 5. NUMERICAL CHECKS OF ANTIDIAGONAL SYMMETRY Take, for examle, m = 5. We count residues modulo 5 of consecutive rime airs <q with <98245653. 98245653 is the 50 millionth rime number. We then record our results in a matrix with ϕm rows and columns. The results for m =5are 228970 3778890 3732547 2698886 389954 290360 3386288 373408 2995506 3535854 29584 37773 4024863 299554 389838 228944 where a ij records the count of i mod 5, q j mod 5. The airs 3, 5 and 5, 7 are thus omitted from the counts in this table. To check antidiagonal symmetry, we take the ratio of each entry to its reflection across the antidiagonal. We obtain rounded to six laces 0.999893.00048 0.999606.000000.000036 0.99944.000000.000394 0.999997.000000.000559 0.999582.000000.000003 0.999964.00007 The entries are all within 0.0006 of, while the ratio of the largest to the smallest entry in the original array is about.84. If m = 2, we have the following table of counts of rime air residues mod 2, qmod 2, where <q are consecutive rimes with <98245653; we omit the airs 2, 3 and 3, 5 that are not relatively rime to 2: 2265842 3746000 3296656 390380 2944446 2266005 3994598 3295820 3294707 390968 2268555 3745878 3993884 3297895 2940299 2268064 For examle, the entry 3994598 records the number of airs of consecutive rimes congruent to 5, 7 mod 2. Again, our heuristic redicts that this matrix should be symmetric across the antidiagonal. Taking the ratio of each entry to its reflection across the antidiagonal, we obtain rounded to six laces 0.999020.000033.000254.000000.0040 0.998876.000000 0.999746 0.999033.000000.0025 0.999967.000000.000968 0.998592.00098

408 Exerimental Mathematics, Vol. 20 20, No. 4 40905 99733 843077 082276 749027 72892 8043 642323 643202 40807 99575 843672 083407 748537 722996 805076 80358 643323 408783 996936 84243 082780 749737 723085 72382 80528 642633 40877 99795 843473 08932 749402 750223 7224 804604 642735 40762 99768 84347 082432 08276 750074 7230 803572 643435 40708 996935 843628 842474 08207 75349 72474 804368 643327 408482 996565 996983 84330 08386 750633 722470 805240 642498 407695 TABLE 2. Residues modulo 6 of consecutive rime airs <qwith <98245653. The entries are all within 0.002 of, while the ratio of the largest to the smallest entry in the original array is about.76. 6. NUMERICAL CHECKS OF POWER-OF-2 PREDICTION As noted above, when m =2 k, our heuristic redicts that Na, d, 2 k,x should be indeendent of a. Taking m = 6, we count residues modulo 6 of consecutive rime airs <qwith <98245653, with the result shown in Table 2. Our conjecture redicts that the numbers on the broken diagonals arallel to the main diagonal should be aroximately equal. For examle, the numbers in boldface in the table form a broken diagonal. We can test our conjecture by noting that the counts in the matrix range from 40708 to 083407, with a ratio aroximately 2.66, while the counts in the broken diagonal in boldface range from 84243 to 843672, with a ratio of.0047... The next broken diagonal beginning with 082276 varies from 08386 to 083407, with a ratio of.0086... The remaining broken diagonals exhibit similar behavior. 7. NUMERICAL CHECKS OF THE HEURISTIC In Sections 5 and 6, we tested the heuristic indirectly, by observing symmetries in the heuristic and seeing whether the same symmetries aeared in the counts of residues of rime airs. In this section we comare our heuristic formula directly with such counts. This requires comuting the heuristic numerically, which, as noted above, becomes raidly unwieldy as J increases, so that we must restrict J so that d + Jm < 55 in our calculations, and corresondingly restrict x to be at most 0 6. We will use m = 4, 5,, and 2 as tyical examles. We count rime airs reduced modulo m between 0 3 and 0 6 : m =5 3207 6536 6284 3550 5053 2868 5430 6224 4383 5800 280 6630 6934 437 500 350 m =2 2942 635 5253 508 4358 2959 7034 526 5257 500 298 6438 6970 5283 449 2940 We comute 0 6 x=0 3 P J a, d, m, x for all ossible values of a and d for each value of m above. Because of limited comuter time, we chose J maximal with d + Jm < 55. The results are as follows: m =5 336.4 6562.3 6237.6 3570.7 508.4 2738.6 5464.4 6237.6 4360.4 5838.2 2738.6 6562.3 6946.5 4360.4 508.4 336.4 m =2 2960.0 648.9 5258.5 4877.2 4279.9 2960.0 703.5 5258.5 5258.5 4877.2 2960.0 648.9 703.5 5258.5 4279.9 2960.0 To gauge the success of the heuristic in redicting the counts, we take the ratios of the corresonding entries in these tables, that is, we calculate, for each congruence class i, j mod m, the ratio actual count/redicted

Ash et al.: Frequencies of Successive Pairs of Prime Residues 409 m = 263 07 242 522 09 426 625 67 77 285 89 200 836 8 485 06 329 603 50 765 346 84 27 987 227 675 063 324 555 588 834 328 87 272 834 22 499 060 38 640 682 05 362 028 27 095 223 663 049 448 68 534 775 256 794 275 866 29 485 036 382 826 594 784 253 004 262 004 230 58 055 366 599 596 779 387 803 98 867 208 604 072 36 795 577 004 355 836 92 062 238 599 038 403 597 665 832 333 845 273 0 6 P J a, d,,x x =0 3 263.7 048.5 242.9 497.9 029.7 434.8 642.4 606.3 778.9 257.8 802.7 9.4 85.5 29.9 486.6 077.4 33.4 595.3 488. 778.9 335.3 823. 93.7 983.5 205.5 655.8 069.6 33.0 595.3 606.3 80.8 332.3 800.4 237.2 865.3 250.3 493.8 069.6 33.4 642.4 656.4 000.8 359.0 08.3 230.0 7.3 250.3 655.8 077.4 434.8 627.2 560.9 777.0 252. 780.6 230.0 865.3 205.5 486.6 029.7 389.7 806.0 592.5 805.6 252. 08.3 237.2 983.5 29.9 497.9 072.6 322.0 596. 592.5 777.0 359.0 800.4 93.7 85.5 242.9 65.4 03.7 322.0 806.0 560.9 000.8 332.3 823. 9.4 048.5 245.0 65.4 072.6 389.7 627.2 656.4 80.8 335.3 802.7 263.7 Ratios.00.02.00.05 0.99 0.98 0.99.02 0.99..02.04 0.98 0.97.00.03.05.00.03 0.98.03 0.99.2.00.02.03 0.99 0.98 0.97 0.97.04 0.99.02.5 0.96 0.98.0 0.99.0.00.04.0.0.0.8 0.98 0.98.0 0.97.03 0.99 0.95.00.02.02.20.00.0.00.0 0.98.0.00 0.97.00 0.99.0.02.0.04 0.98.4.00.0.00.08.00.02.02 0.97 0.98 0.97 0.98 0.99.03.00.07.02.00.0 0.99 0.97 0.97.03 0.98.0.04 0.99.05.04 count, with the following results, rounded to two laces: m =5.02.00.0 0.99 0.99.05 0.99.00.0 0.99.03.0.00.00.00.00 m =2 0.99 0.98.00.03.02.00.00 0.99.00.03 0.99.00 0.99.00.03 0.99 TABLE 3. Actual counts and redicted values for m =. By comarison, the ratio of the largest to the smallest count is 6934/280 2.47 for m = 5, and 7034/298 2.4 for m = 2. Here are the results for m = 4, again for the range 0 3 to 0 6 : m =4: 0 6 6574 2252 22520 675 x=0 3 P J a, d, 4,x: ratios:.00.0.0.0 668.8 22407.7 22407.7 668.8

40 Exerimental Mathematics, Vol. 20 20, No. 4 Finally, for m =, the calculations aear in Table 3. The counts for m = vary from 92 to 826, with ratio 826/92 9.5. Our redicted values are for the most art quite close to the observed counts, though the redictions for a few entries notably, and curiously, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7 are a bit low, only 85% 90% of their observed counts. This is artly exlained by the size of the entries themselves the main diagonal entries are the smallest, in both the table of counts and the table of redictions, so a discreancy there between rediction and observation shows u as a larger ercentage error than the same discreancy elsewhere. For examle, the observed count for 5, 5 is 4 above the redicted count, or 8%; the observed count for 0, 9 is 42 above redicted, or 5%; the observed count for 2, 4 is 39 below redicted, or 3%. 8. ANOTHER PREDICTION If we let x and fix J, we can view the dominant terms in P J a, d, m, x as those with ns = 2. In reality, the estimation is more comlex, because the value of J should increase with x. If ns = 2, then S = {0,d+ jm}. As long as a, m =a + d, m =, these dominant terms in P J a, d, m, x will be indeendent of a. To see whether in fact this rediction aears to be true, we comute rime air counts modulo 5 and 2 for the first 0 7 robable rimes larger than 0 00.The results are m =5 300655 38723 320000 30947 33970 30042 35948 39266 32837 37325 300976 38355 322830 3372 3402 300023 m =2 30604 38324 35239 34667 33320 300633 3955 36479 35294 3544 300873 3836 32039 35883 34055 300335 These data are consistent with rediction: for examle, for m = d = 5 resectively m = d = 2, art of the rediction is that the entries on the main diagonal should tend to equality; an ad hoc way to judge this is to take the ratio of the largest to the smallest terms on the diagonal for both our first set of residues, involving rimes less than 0 6, and this set. For m = 5 resectively m = 2, the ratio for the small-rime data set is 638 50 390.8 resectively 430.05, while for the large-rime data set, the ratio is 300976 300023.003 resectively 30604 300335.004. 9. FURTHER OPEN QUESTIONS One obvious question is whether the different residue classes of rime airs occur asymtotically equally often: in other words, given any a, d, a,d with a, a + d, a,a + d relatively rime to m, dowehave Na, d, m, x Na,d,m,x as x? This corresonds to asking whether all of the entries in the matrices in the revious section are tending toward equality as x tends to infinity. One way to gauge the variation in our tables is to take the ratio of the largest to the smallest counts. In Section 7, the ratio for m = 5 is aroximately 2.47 resectively 2.39 for m = 2, while for the large rime data set in Section 8, the ratio for m = 5 is aroximately.08 resectively.07. Obviously, the terms aear to be getting closer together, but of course we cannot tell whether they are tending toward a limiting ratio of. If the different residue classes of rime airs do occur asymtotically equally often, we are in a rime race situation, as studied in [Rubinstein and Sarnak 94] and [Granville and Martin 06]. For examle, take m = 4 and count rime airs congruent to, mod 4 versus rime airs congruent to, 3 mod 4. Even if the counts are asymtotically equal, i.e., if lim x N, 4, 4,x/N, 2, 4,x =, we observe for finite chunks of rimes a decided tendency for the total of, 3 s to exceed the total of, s. We can kee score, as we search through the sequence of odd rimes 3, 5, 7,..., and see when the, s are ahead of the, 3 s and when the oosite holds. It seems quite ossible that N, 4, 4,x >N, 2, 4,x for almost all values of x. Another question concerns the aroriate value of J to use in our heuristic. The comutation mentioned at the end of Section 4 shows that in at least one case, P J a, d, m, x is nearly indeendent of J for a range of values of J. A recise understanding of the deendence of P J a, d, m, x onj would go a long way toward establishing the solidity of our heuristic.

Ash et al.: Frequencies of Successive Pairs of Prime Residues 4 ACKNOWLEDGMENTS We thank Jenny Baglivo and K. Soundarajan for their hel, and the referee for helful comments. The first and second authors thank the National Science Foundation for artial funding of this research under grant DMS-0455240. REFERENCES [Ash et al. 09] A. Ash, B. Bate, and R. Gross. Frequencies of Successful Tules of Frobenius Classes. Exeriment. Math. 8 2009, 55 63. [Bateman and Horn 62] Paul T. Bateman and Roger A. Horn. A Heuristic Asymtotic Formula Concerning the Distribution of Prime Numbers. Math. Com. 6 962, 363 367. [Granville and Martin 06] Andrew Granville and Greg Martin. Prime Number Races. Amer. Math. Monthly 3 2006, 33. [Hardy and Littlewood 23] G. H. Hardy and J. E. Littlewood. Some Problems of Partitio Numerorum III: On the Exression of a Number as a Sum of Primes. Acta Math. 44 923, 70. [Pólya 59] G, Pólya. Heuristic Reasoning in the Theory of Numbers. Amer. Math. Monthly 66 959, 375 384. [Rubinstein and Sarnak 94] Michael Rubinstein and Peter Sarnak. Chebyshev s Bias. Exeriment. Math. 3 994, 73 97. Avner Ash, Deartment of Mathematics, Boston College, Chestnut Hill, MA 02467-3806, USA ashav@bc.edu Laura Beltis, Actuarial Analysis Deartment, Tufts Health Plan, Watertown, MA 02472, USA lbeltis@gmail.com Robert Gross, Deartment of Mathematics, Boston College, Chestnut Hill, MA 02467-3806, USA gross@bc.edu Warren Sinnott, Deartment of Mathematics, The Ohio State University, Columbus, OH 4320-74, USA sinnott@math.ohio-state.edu Received February 2, 200; acceted May, 200.