Tight bound for the density of sequence of integers the sum of no two of which is a perfect square

Discrete Mathematics 256 (2002) 243–255
www.elsevier.com/locate/disc

A. Khalfalah, S. Lodha, E. Szemerédi

Department of Computer Science, Rutgers, State University of NJ, New Brunswick, NJ 08903, USA

Received 6 March 2001; received in revised form 2 September 2001; accepted October 2001

Corresponding author. E-mail address: lodha@cs.rutgers.edu (S. Lodha).
Partially supported by DIMACS, which was funded by the NSF under Grant STC 91-19999.
0012-365X/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0012-365X(01)00435-6

Abstract

Erdős and Sárközy proposed the problem of determining the maximal density attainable by a set $S$ of positive integers having the property that no two distinct elements of $S$ sum up to a perfect square. Massias [Sur les suites dont les sommes des termes 2 à 2 ne sont pas des carrés] exhibited such a set consisting of all $x \equiv 1 \pmod 4$ together with $x \equiv 14, 26, 30 \pmod{32}$. Lagarias et al. [J. Combin. Theory Ser. A 33 (1982) 167] showed that for any positive integer $n$, one cannot find more than $\frac{11}{32}n$ residue classes (mod $n$) such that the sum of any two is never congruent to a square (mod $n$), thus essentially proving that the Massias set has the best possible density. They [J. Combin. Theory Ser. A 34 (1983) 123] also proved that the density of such a set $S$ is never $> 0.475$ when we allow general sequences. We improve on the upper bound for general sequences, essentially proving that it is not 0.475, but arbitrarily close to $\frac{11}{32}$, the same as that for sequences made up of only arithmetic progressions. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Additive number theory; Combinatorial number theory

1. Introduction

Erdős and Sárközy [2, p. 209] (see also [1, p. 87]) proposed the problem of determining the maximal density attainable by a set $S = \{s_i\}$ of positive integers having the

following property:

Property NS: $i \ne j \Rightarrow s_i + s_j$ is not a perfect square.

Massias exhibited such a set, consisting of all $x \equiv 1 \pmod 4$ together with $x \equiv 14, 26, 30 \pmod{32}$, in [5]. Its density is $\frac{11}{32}$. In [3], Lagarias et al. proved the following theorem.

Theorem 1. Let $S$ be a union of arithmetic progressions (mod $M$) having property NS. Then the density $d(S) \le \frac{11}{32}$, with equality possible if and only if $32 \mid M$. For all other $M$, $d(S) \le \frac{1}{3}$.

Therefore, they essentially showed that the Massias set has the best possible density if $S$ were to be a union of arithmetic progressions. The same authors [4] also proved the following.

Theorem 2. Let $S$ denote a finite set with all elements $\le N$ which has property NS, and let $d(N) = \max_S |S|/N$. Then there exists an absolute constant $N_0$ such that for all $N \ge N_0$, $d(N) \le 0.475$.

It is an improvement over the trivial upper bound of 0.5 on $d(N)$, and yet there is a wide gap between the results of Theorems 1 and 2. One must also note that the behavior of sets $S$ having property NS is quite different from that of sets $S$ having the following property:

Property DS: $i \ne j \Rightarrow s_i - s_j$ is not a perfect square.

In [6], Sárközy proved that any set $S$ having property DS has density 0; to be precise, it has at most $O\big(x (\log\log x)^{2/3}/(\log x)^{1/3}\big)$ elements $\le x$. In this paper, we strengthen Theorem 2 and show that the bound in Theorem 1 applies in the general case too.

Theorem 3. Let $S$ denote a finite set with all elements $\le N$ which has property NS, and let $d(N) = \max_S |S|/N$. Then, for any positive real number $\varepsilon$, there exists an absolute constant $N_0$ such that for all $N \ge N_0$, $d(N) \le \frac{11}{32} + \varepsilon$.

2. Outline of proof

We sketch the outline of our proof in this section. We are using notation from Section 3.
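Before the outline proper, the Massias set from the Introduction can be checked numerically. The following is a minimal sketch, not part of the paper: it builds the set up to a hypothetical cutoff `LIMIT` (chosen here as a multiple of 32), confirms that no two distinct elements sum to a perfect square, and confirms that the density is exactly 11/32. The function names are illustrative assumptions.

```python
from math import isqrt


def massias_set(limit):
    """Integers 1..limit with x = 1 (mod 4) or x = 14, 26, 30 (mod 32)."""
    return [x for x in range(1, limit + 1)
            if x % 4 == 1 or x % 32 in (14, 26, 30)]


def is_square(n):
    """True if n is a perfect square."""
    r = isqrt(n)
    return r * r == n


def has_property_ns(elements):
    """True if no two distinct elements sum to a perfect square."""
    return not any(is_square(a + b)
                   for i, a in enumerate(elements)
                   for b in elements[i + 1:])


LIMIT = 3200  # hypothetical cutoff, a multiple of 32 so the density is exact
S = massias_set(LIMIT)
print(len(S), 11 * LIMIT // 32, has_property_ns(S))
```

The check works for a structural reason: every sum of two elements is congruent to 2 or 3 (mod 4), or to one of 8, 12, 20, 24, 28 (mod 32), and none of these is a square residue.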

Given a subset $S$ of $[N]$ of density $\frac{11}{32} + \varepsilon$, the number of solutions to the equation $x + y = z^2$ where $x, y \in S$ is equal to

$$\frac{1}{\tau}\sum_{t=0}^{\tau-1} f_{SQ}\!\left(-\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right). \qquad (1)$$

Assuming that there is no solution to this equation, the sum in Eq. (1) must be 0.

Let $M$ be a highly composite number. We shift $S$ by $jM$, for $j$ in the range allowed by the Shifting lemma (Lemma 15), and then count the number of solutions to the equation $x + y = z^2$, where $x \in S$ and $y \in S + jM$, for each $j$. We show that it is almost the same as in Eq. (1). The average error is $O(N\sqrt{N}/\sqrt{P})$, where $P$ is the largest prime divisor of $M$.

Then we partition $[N]$ into $M$ residue classes mod $M$ and observe how well $S$ gets partitioned into these different pieces of $[N]$. If $S$ is well-distributed in these residue classes, then the average number of solutions to the equation $x + y = z^2$ where $x \in S$ and $y \in S + jM$ (averaging over these shifts $j$) is of order $N\sqrt{N}/M^2$. But this is not good enough to obtain a contradiction!

In fact, we will show that we can combinatorially get a highly composite $M$ such that there exist quite a few pairs of well-distributed dense residue classes modulo $M$ which add up to a quadratic residue modulo $M$; shifting gives at most $O(N\sqrt{N}/\sqrt{P})$ average error in analytical counting, and combinatorial counting estimates the average number of solutions to be at least of order $N\sqrt{N}/\log^2 P$. The last two statements give us a contradiction.

3. Notation

Let $N$ be a sufficiently large positive integer and let $S$ be a subset of $[N]$ of density $\frac{11}{32} + \varepsilon$, $\varepsilon$ being a positive constant. Let $l = 56/\varepsilon^3$. For $1 \le i \le l$, let us define primes $p_i$ in the following way:

$$p_1 = 7; \qquad p_{i+1} \;[\ldots]\; 2^{6p_i}.$$

Using these primes, we define $l$ numbers, $q_1$ through $q_l$:

$$q_0 = 1; \qquad q_{i+1} = \prod p^{\alpha_p}, \quad p \text{ prime},\ p \le p_{i+1} \text{ and } p^{\alpha_p} \le p_{i+1}^2.$$

Let

$$\delta = \frac{\left(\frac{11}{32} + \varepsilon\right) - \left(\frac{11}{32} + \varepsilon\right)^2}{l}.$$

Note that $\delta \le \dfrac{1}{4l} = \dfrac{\varepsilon^3}{224}$.

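Let $\rho_{i,j}(S)$ be the density of the set $S$ in residue class $j$ modulo $q_i$. That is,

$$\rho_{i,j}(S) = \frac{|\{s \in S : s \equiv j \pmod{q_i}\}|}{N/q_i}.$$

Since we will be working with some fixed subset $S$ of $[N]$, we will drop $S$ from the above notation and just use $\rho_{i,j}$ throughout this paper.

We use the abbreviation $e(\alpha) = e^{2\pi i \alpha}$. We denote the trigonometrical sum over a set $X$ of integers as

$$f_X(\alpha) = \sum_{x \in X} e(\alpha x).$$

Let $SQ$ be the set of all perfect squares which are $\le \tau$. Note that

$$f_{SQ}(\alpha) = \sum_{s \in SQ} e(\alpha s) = \sum_{x=0}^{\sqrt{\tau}} e(\alpha x^2).$$

For the sake of simplification of writing, and without loss of generality, we may assume that $\sqrt{\tau}$ is an integer. Notice that given any subsets $A$ and $B$ of $[N]$, the number of triplets $(x, y, z)$ such that $x \in A$, $y \in B$ and $x + y = z^2$ is

$$\frac{1}{\tau}\sum_{t=0}^{\tau-1} f_{SQ}\!\left(-\frac{t}{\tau}\right) f_A\!\left(\frac{t}{\tau}\right) f_B\!\left(\frac{t}{\tau}\right).$$

4. Definitions

Definition 4. $\Delta_i = \dfrac{1}{q_i}\sum_{j=0}^{q_i-1} \rho_{i,j}^2$.

Definition 5. The residue class $j$ is full if $\rho_{i,j} = 1 - o(1)$.

Definition 6. The residue class $j$ is bad if

$$\left|\left\{k : 0 \le k < \frac{q_{i+1}}{q_i} \text{ and } |\rho_{i,j} - \rho_{i+1,\,j+kq_i}| \ge \frac{\varepsilon}{4}\right\}\right| \ge \frac{q_{i+1}}{8q_i}.$$

Definition 7. The residue class $j$ is good if it is not bad.

5. Lemmas

Lemma 8. $q_{i+1}$ $[\ldots]$ $p_{i+1}$ and $[\ldots]$ $q_i$ $[\ldots]$ $p_{i+1}$.

Proof. It can easily be done using induction and is left as an exercise to the reader.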
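The counting identity from Section 3 (the number of triplets with $x + y = z^2$ expressed through trigonometric sums) can be checked numerically on small sets. The sketch below is illustrative only. It assumes a modulus `T` strictly larger than the largest possible sum $x + y$, takes the squares below `T`, and evaluates the square sum at $-t/T$ so that the orthogonality relation picks out $x + y - z^2 = 0$. The names `count_direct` and `count_fourier` are hypothetical helpers, not notation from the paper.

```python
import cmath
from math import isqrt


def e(alpha):
    """e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def f(X, alpha):
    """Trigonometric sum over a finite set X of integers."""
    return sum(e(alpha * x) for x in X)


def count_direct(A, B):
    """Number of pairs (x, y) in A x B with x + y a perfect square."""
    return sum(1 for x in A for y in B if isqrt(x + y) ** 2 == x + y)


def count_fourier(A, B, T):
    """Same count via exponential sums; assumes T > max(A) + max(B)."""
    squares = [z * z for z in range(isqrt(T - 1) + 1)]  # squares below T
    total = sum(f(squares, -t / T) * f(A, t / T) * f(B, t / T)
                for t in range(T))
    return round((total / T).real)


A = [1, 2, 3, 5, 8]
B = [1, 6, 7, 11]
print(count_direct(A, B), count_fourier(A, B, T=20))
```

Both calls count the five pairs (2, 7), (3, 1), (3, 6), (5, 11) and (8, 1). The rounding step only removes floating-point noise; the underlying identity is exact.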

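Lemma 9. $\Delta_{i+1} \ge \Delta_i$.

Proof.

$$\Delta_{i+1} - \Delta_i = \frac{q_i}{2q_{i+1}^2}\sum_{j=0}^{q_i-1}\ \sum_{k,k'=0}^{q_{i+1}/q_i-1} \left(\rho_{i+1,\,j+kq_i} - \rho_{i+1,\,j+k'q_i}\right)^2 \ge 0.$$

Lemma 10. $\left(\frac{11}{32} + \varepsilon\right)^2 \le \Delta_i \le \frac{11}{32} + \varepsilon$.

Proof. Using the Cauchy–Schwarz inequality, we get

$$q_i\Delta_i = \sum_{j=0}^{q_i-1} \rho_{i,j}^2 \ge \frac{1}{q_i}\left(\sum_{j=0}^{q_i-1} \rho_{i,j}\right)^2 = q_i\left(\frac{11}{32} + \varepsilon\right)^2.$$

Since $0 \le \rho_{i,j} \le 1$, we get

$$\Delta_i = \frac{1}{q_i}\sum_{j=0}^{q_i-1} \rho_{i,j}^2 \le \frac{1}{q_i}\sum_{j=0}^{q_i-1} \rho_{i,j} = \frac{11}{32} + \varepsilon.$$

Notice that $\Delta_i = \frac{11}{32} + \varepsilon$ will mean that $\left(\frac{11}{32} + \varepsilon\right)q_i$ residue classes modulo $q_i$ are full.

Lemma 11 (Improved Cauchy–Schwarz inequality). If, for integers $0 < m < n$,

$$\frac{1}{m}\sum_{k=1}^{m} x_k = \frac{1}{n}\sum_{k=1}^{n} x_k + \gamma,$$

then

$$\sum_{k=1}^{n} x_k^2 \ge \frac{1}{n}\left(\sum_{k=1}^{n} x_k\right)^2 + \frac{\gamma^2 mn}{n-m}.$$

Proof.

$$\sum_{k=1}^{n} x_k^2 = \sum_{k=1}^{m} x_k^2 + \sum_{k=m+1}^{n} x_k^2 \ge \frac{1}{m}\left(\sum_{k=1}^{m} x_k\right)^2 + \frac{1}{n-m}\left(\sum_{k=m+1}^{n} x_k\right)^2$$

$$= \left[\frac{m}{n^2}\left(\sum_{k=1}^{n} x_k\right)^2 + \frac{2m\gamma}{n}\sum_{k=1}^{n} x_k + m\gamma^2\right] + \left[\frac{n-m}{n^2}\left(\sum_{k=1}^{n} x_k\right)^2 - \frac{2m\gamma}{n}\sum_{k=1}^{n} x_k + \frac{m^2\gamma^2}{n-m}\right]$$

$$= \frac{1}{n}\left(\sum_{k=1}^{n} x_k\right)^2 + \frac{\gamma^2 mn}{n-m}.$$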
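Lemmas 9 to 11 can be sanity-checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than the construction used in the paper: it takes a pseudo-random subset of $\{0, \ldots, N-1\}$, uses a hypothetical chain of moduli 1 | 4 | 12 | 72 (each dividing `N`, so every residue class has the same size), and computes the mean square density of Definition 4 for each modulus. The helper `improved_cs_holds` checks the inequality of Lemma 11 for a given list and prefix length.

```python
from fractions import Fraction
import random


def class_densities(S, N, q):
    """Density of S in each residue class mod q (assumes q divides N)."""
    counts = [0] * q
    for s in S:
        counts[s % q] += 1
    return [Fraction(c * q, N) for c in counts]


def mean_square_density(S, N, q):
    """(1/q) * sum of squared class densities, as in Definition 4."""
    return sum(d * d for d in class_densities(S, N, q)) / q


def improved_cs_holds(xs, m):
    """Check sum x^2 >= (sum x)^2 / n + gamma^2 * m * n / (n - m)."""
    n, total = len(xs), sum(xs)
    gamma = Fraction(sum(xs[:m]), m) - Fraction(total, n)
    lhs = sum(x * x for x in xs)
    rhs = Fraction(total * total, n) + gamma * gamma * Fraction(m * n, n - m)
    return lhs >= rhs


rng = random.Random(0)          # fixed seed so the run is reproducible
N = 7200                        # hypothetical size, divisible by 72
S = [x for x in range(N) if rng.random() < 0.4]
d = Fraction(len(S), N)         # overall density of S
moduli = (1, 4, 12, 72)         # each modulus divides the next
deltas = [mean_square_density(S, N, q) for q in moduli]
print([float(x) for x in deltas], float(d))
```

With equal-sized classes, the density in a class mod $q$ is exactly the average of the densities in its subclasses mod $q'$, which is why refinement can only increase the mean square density and why it stays between $d^2$ and $d$.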

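Lemma 12. If $\Delta_{i+1} - \Delta_i < \delta$, then the number of bad residue classes mod $q_i$ is $< \varepsilon q_i/2$.

Proof.

$$\Delta_{i+1} - \Delta_i = \frac{1}{q_{i+1}}\sum_{j=0}^{q_i-1}\left[\sum_{k=0}^{q_{i+1}/q_i-1} \rho_{i+1,\,j+kq_i}^2 - \frac{q_i}{q_{i+1}}\left(\sum_{k=0}^{q_{i+1}/q_i-1} \rho_{i+1,\,j+kq_i}\right)^2\right].$$

Let the number of bad residue classes mod $q_i$ be $B$. Then, using Lemma 11 with $n = q_{i+1}/q_i$, $m = n/8$ and $\gamma = \varepsilon/4$, we get that

$$\Delta_{i+1} - \Delta_i \ge \frac{1}{q_{i+1}} \cdot B \cdot \frac{q_{i+1}}{q_i} \cdot \frac{\varepsilon^2}{112} = \frac{B\varepsilon^2}{112\,q_i}.$$

Since this difference was $< \delta$, we get

$$B < \frac{112\,q_i\,\delta}{\varepsilon^2} \le \frac{\varepsilon q_i}{2}.$$

Lemma 13. If $\Delta_{i+1} - \Delta_i < \delta$, then there exists a pair of good residue classes, say $a$ and $b$, modulo $q_i$ such that
(1) $\rho_{i,a} \ge \varepsilon/2$,
(2) $\rho_{i,b} \ge \varepsilon/2$,
(3) $a + b$ is a quadratic residue modulo $q_i$.

Proof. Since $\Delta_{i+1} - \Delta_i < \delta$, Lemma 12 says that the number of bad residue classes modulo $q_i$ is $< \varepsilon q_i/2$. So the number of good residue classes is at least $(1 - \varepsilon/2)q_i$. Also note that

$$\sum_{k \text{ is good}} \rho_{i,k} \ge \left(\frac{11}{32} + \varepsilon\right)q_i - \frac{\varepsilon q_i}{2} = \left(\frac{11}{32} + \frac{\varepsilon}{2}\right)q_i.$$

Let

$$G = \{k : k \text{ is good and } \rho_{i,k} \ge \varepsilon/2\}.$$

If $|G| > \frac{11}{32}q_i$, then we have the required pair $a$ and $b$ simply by the pigeonhole principle. We will show now that $|G| > \frac{11}{32}q_i$. Therefore, such a pair must exist in the light of Theorem 1.

If $|G| \le \frac{11}{32}q_i$, then

$$\sum_{k \text{ is good}} \rho_{i,k} \le |G| + (q_i - |G|)\frac{\varepsilon}{2} \le \left(\frac{11}{32} + \frac{21\varepsilon}{64}\right)q_i,$$

contradicting the lower bound we got above.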
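The proof of Lemma 13 leans on Theorem 1: more than an 11/32 fraction of the residue classes must contain two classes whose sum is a square residue. For a small modulus this can be confirmed by exhaustive search. The sketch below is only an illustration for moduli 8 and 16 (a brute-force pass over all subsets, so it does not scale to modulus 32). It treats $a = b$ as a pair as well, on the assumption that a residue class always contains two distinct integers; `max_sum_square_free` is a hypothetical helper name.

```python
def squares_mod(q):
    """Set of square residues modulo q."""
    return {(x * x) % q for x in range(q)}


def max_sum_square_free(q):
    """Largest A in Z_q such that a + b is never a square mod q (a = b included)."""
    sq = squares_mod(q)
    # conflict[a] is a bitmask of every b with a + b congruent to a square
    conflict = [sum(1 << b for b in range(q) if (a + b) % q in sq)
                for a in range(q)]
    best = 0
    for mask in range(1 << q):
        # mask is admissible if no chosen a conflicts with any chosen b
        ok = all(conflict[a] & mask == 0
                 for a in range(q) if (mask >> a) & 1)
        if ok:
            best = max(best, bin(mask).count("1"))
    return best


for q in (8, 16):
    size = max_sum_square_free(q)
    print(q, size, 32 * size <= 11 * q)
```

For modulus 16 the search finds a maximum of 5 classes (for instance 1, 5, 9, 13 and 6), which is below $\frac{11}{32} \cdot 16 = 5.5$ and consistent with the $\frac{1}{3}$ bound of Theorem 1 for moduli not divisible by 32.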

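Lemma 14. Let $z$ be a quadratic residue modulo $q$, and let $n$ be a sufficiently large positive integer. Then the number of perfect squares in any interval of length $H$ of the elements of $[n]$ that are $\equiv z \pmod q$ is at least $H/(4\sqrt{n})$.

Proof. Let $z \equiv k^2 \pmod q$. Suppose the interval is $\{r, r+q, r+2q, \ldots, r+Hq\}$, where $0 \le r \le n - Hq$. We want to count the number of integral $l$ such that

$$\frac{\sqrt{r} - k}{q} \le l \le \frac{\sqrt{r+Hq} - k}{q}.$$

It follows that the number of such $l$ that satisfy these inequalities is at least

$$\frac{\sqrt{r+Hq} - k}{q} - \frac{\sqrt{r} - k}{q} - 2 = \frac{Hq}{\left(\sqrt{r+Hq} + \sqrt{r}\right)q} - 2 \ge \frac{H}{4\sqrt{n}}.$$

Lemma 15 (Shifting lemma). Let $S \subseteq [N]$ be such that the number of solutions to the equation $x + y = z^2$ where $x, y \in S$ is $0$. Let $P$ be a big prime, $C$ an integer constant, and let

$$M = C\prod p^{\alpha_p}, \quad p \text{ prime},\ p \le P \text{ and } p^{\alpha_p} \le P^2.$$

Then the number of solutions where $x \in S$ and $y \in S + jM$, for $0 \le j \le N^{1/2-\eta}$, $\eta > 0$, is at most $O(N\sqrt{N}/\sqrt{P})$.

Proof. Given a set $S \subseteq [N]$, the number of solutions to the equation $x_1 + x_2 = z^2$ where $x_1, x_2 \in S$ is given by Eq. (1):

$$\frac{1}{\tau}\sum_{t=0}^{\tau-1} f_{SQ}\!\left(-\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right).$$

Assuming that there is no solution to this equation, the sum in Eq. (1) must be 0. We would like to estimate $f_{SQ}(t/\tau)$ by approximating $t/\tau$ by $a_t/b_t$. Using a Farey fraction with denominator at most $Q = N^{1-\eta}$ and letting $r$ be the error in the approximation, we get

$$r = \frac{t}{\tau} - \frac{a_t}{b_t}, \qquad |r| \le \frac{1}{b_t N^{1-\eta}},$$

$$f_{SQ}\!\left(\frac{t}{\tau}\right) = \sum_{x=0}^{\sqrt{\tau}} e\!\left(x^2\,\frac{t}{\tau}\right) = \sum_{x=0}^{\sqrt{\tau}} e\!\left(x^2\left(\frac{a_t}{b_t} + r\right)\right).$$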
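The rational approximation step above rests on a standard fact (Dirichlet's theorem): for any real $\alpha$ and integer $Q \ge 1$ there is a fraction $a/b$ with $1 \le b \le Q$ and $|\alpha - a/b| < 1/(bQ)$. The sketch below finds such a fraction by direct search over denominators, using exact rationals. It is a simple illustration, not the Farey-sequence procedure itself, and `dirichlet_approx` is a hypothetical helper name.

```python
from fractions import Fraction


def dirichlet_approx(alpha, Q):
    """Return (a, b) with 1 <= b <= Q minimising |b*alpha - a|."""
    best = None
    for b in range(1, Q + 1):
        a = round(b * alpha)            # nearest integer to b*alpha
        err = abs(b * alpha - a)
        if best is None or err < best[0]:
            best = (err, a, b)
    _, a, b = best
    return a, b


alpha = Fraction(355, 997)              # plays the role of t / tau
Q = 31                                  # hypothetical denominator bound
a, b = dirichlet_approx(alpha, Q)
r = alpha - Fraction(a, b)              # approximation error
print(a, b, float(r), abs(r) * b * Q < 1)
```

Minimising $|b\alpha - a|$ over $b \le Q$ is enough, because the pigeonhole argument guarantees that the minimum is at most $1/(Q+1)$.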

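Now, we replace $x$ by $j + b_t k$, where $j$ runs from $0$ to $b_t - 1$ and $k$ runs from $0$ to $\sqrt{\tau}/b_t - 1$. We get $x^2 = j^2 + 2kjb_t + k^2b_t^2$, and $f_{SQ}(t/\tau)$ is equal to

$$\sum_{k}\sum_{j} e\!\left(\left(j^2 + 2kjb_t + k^2b_t^2\right)\left(\frac{a_t}{b_t} + r\right)\right) = \sum_{k}\sum_{j} e\!\left(k^2 b_t a_t\right) e\!\left(k^2 b_t^2 r\right) e\!\left(2kja_t\right) e\!\left(\frac{j^2 a_t}{b_t}\right) e\!\left((j^2 + 2kjb_t)r\right).$$

Now, using the fact that $e(l) = 1$ when $l$ is an integer, we get

$$f_{SQ}\!\left(\frac{t}{\tau}\right) = \sum_{k=0}^{\sqrt{\tau}/b_t - 1} e\!\left(k^2 b_t^2 r\right) \sum_{j=0}^{b_t - 1} e\!\left(\frac{j^2 a_t}{b_t}\right) e\!\left((j^2 + 2kjb_t)r\right). \qquad (2)$$

Assuming that $b_t \le \sqrt{\tau}$, we get

$$(j^2 + 2kjb_t)\,|r| \le 3N^{-0.5+\eta} \quad\text{and}\quad \left|e\!\left((j^2 + 2kjb_t)r\right)\right| \le 1 + 3N^{-0.5+\eta}.$$

Therefore,

$$\left|\sum_{j=0}^{b_t-1} e\!\left(\frac{j^2 a_t}{b_t}\right) e\!\left((j^2 + 2kjb_t)r\right)\right| \le \left|\sum_{j=0}^{b_t-1} e\!\left(\frac{j^2 a_t}{b_t}\right)\right| + \sum_{j=0}^{b_t-1} 3N^{-0.5+\eta} \le \sqrt{b_t} + 3b_t N^{-0.5+\eta}. \qquad (3)$$

In Eq. (3) we used the fact that the maximum value of $|e(j^2a_t/b_t)|$ is 1, so we divided the sum into two parts: the first part is the basic part and the second is the error. To find the maximum value of the error, we multiplied it by 1 instead of $e(j^2a_t/b_t)$. Using the Gauss sum, we get $\left|\sum_{j=0}^{b_t-1} e(j^2a_t/b_t)\right| = \sqrt{b_t}$. Now using Eqs. (2) and (3), we get Eq. (4), which is true for any $b_t \le \sqrt{\tau}$ (notice that for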

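$b_t > \sqrt{\tau}$ we get that $f_{SQ}(t/\tau)$ is $o(\sqrt{\tau})$, using Lemma 2.4 in [7, pp. 11–12]):

$$\left|f_{SQ}\!\left(\frac{t}{\tau}\right)\right| \le \sum_{k=0}^{\sqrt{\tau}/b_t - 1} \left|e\!\left(k^2b_t^2r\right)\right|\left(\sqrt{b_t} + 3b_tN^{-0.5+\eta}\right) \le \left(\sqrt{b_t} + 3b_tN^{-0.5+\eta}\right)\sum_{k=0}^{\sqrt{\tau}/b_t - 1} 1 \le \frac{2\sqrt{\tau}}{\sqrt{b_t}}. \qquad (4)$$

We can divide the sum in Eq. (1) into two parts based on the value of $b_t$ compared to $P$:

$$\Sigma_1 = \frac{1}{\tau}\sum_{b_t \le P} f_{SQ}\!\left(-\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right), \qquad \Sigma_2 = \frac{1}{\tau}\sum_{b_t > P} f_{SQ}\!\left(-\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right).$$

We want to upper bound $\Sigma_2$, so using Eq. (4) and the Parseval identity, we get

$$|\Sigma_2| \le \frac{1}{\tau}\sum_{b_t > P} \left|f_{SQ}\right|\left|f_S\right|^2 \le \frac{2}{\sqrt{\tau P}}\sum_{t} \left|f_S\!\left(\frac{t}{\tau}\right)\right|^2 = O\!\left(\frac{N\sqrt{N}}{\sqrt{P}}\right).$$

Now since $\Sigma_1 + \Sigma_2 = 0$, by the assumption that there is no solution, we conclude that

$$|\Sigma_1| = O\!\left(\frac{N\sqrt{N}}{\sqrt{P}}\right). \qquad (5)$$

We want to show that the number of solutions to $x_1 + x_2 = z^2$ where $x_1 \in S$ and $x_2 \in S + jM$, such that $j \le N^{1/2-\eta}$, is close to the number of solutions to our problem. It is clear that, to do so, we need only consider how much $\Sigma_1$ and $\Sigma_2$ will change if we replaced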

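Eq. (1) by Eq. (6):

$$\frac{1}{\tau}\sum_{t=0}^{\tau-1} f_{SQ}\!\left(-\frac{t}{\tau}\right) f_S\!\left(\frac{t}{\tau}\right) f_{S+jM}\!\left(\frac{t}{\tau}\right). \qquad (6)$$

From now on we will call $\Sigma_1$ and $\Sigma_2$ related to this equation $\Sigma_1^j$ and $\Sigma_2^j$. For $\Sigma_2^j$, we will use the fact from Eq. (4) with the Parseval identity and the Cauchy–Schwarz inequality to get

$$|\Sigma_2^j| \le \frac{2}{\sqrt{\tau P}}\sum_{b_t > P} \left|f_S\right|\left|f_{S+jM}\right| \le \frac{2}{\sqrt{\tau P}}\left(\sum_{t}\left|f_S\!\left(\frac{t}{\tau}\right)\right|^2\right)^{1/2}\left(\sum_{t}\left|f_{S+jM}\!\left(\frac{t}{\tau}\right)\right|^2\right)^{1/2} = O\!\left(\frac{N\sqrt{N}}{\sqrt{P}}\right). \qquad (7)$$

For $\Sigma_1^j$, we will use the fact that $b_t \mid M$ if $b_t \le P$, and that $jM|r| < N^{-\eta}$ for the values of $M$, $j$ and $Q$ chosen, to get

$$f_{S+jM}\!\left(\frac{t}{\tau}\right) = \sum_{x \in S+jM} e\!\left(\frac{xt}{\tau}\right) = \sum_{x \in S+jM} e\!\left(\frac{a_t x}{b_t} + rx\right) = \sum_{y \in S} e\!\left(\frac{a_t(y + jM)}{b_t} + r(y + jM)\right)$$

$$= \sum_{y \in S} e\!\left(\frac{a_t y}{b_t} + ry\right) e\!\left(\frac{jMa_t}{b_t}\right) e(jMr) = e(jMr)\sum_{y \in S} e\!\left(\frac{yt}{\tau}\right) = e(jMr)\,f_S\!\left(\frac{t}{\tau}\right).$$

Therefore,

$$\left(1 - N^{-\eta}\right)\left|f_S\!\left(\frac{t}{\tau}\right)\right| \le \left|f_{S+jM}\!\left(\frac{t}{\tau}\right)\right| \le \left(1 + N^{-\eta}\right)\left|f_S\!\left(\frac{t}{\tau}\right)\right|.$$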
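The computation above hinges on one exact fact: if the denominator $b$ divides $M$, then shifting a set by $jM$ multiplies its trigonometric sum at $a/b + r$ by the single factor $e(jMr)$. A small numerical sketch follows. The set `S`, the modulus `M` and the helper `shifted_sum_error` are illustrative assumptions, not objects from the paper.

```python
import cmath


def e(alpha):
    """e(alpha) = exp(2*pi*i*alpha)."""
    return cmath.exp(2j * cmath.pi * alpha)


def f(X, alpha):
    """Trigonometric sum over a finite set X of integers."""
    return sum(e(alpha * x) for x in X)


def shifted_sum_error(S, M, j, a, b, r):
    """|f_{S+jM}(a/b + r) - e(jMr) f_S(a/b + r)|, which is ~0 when b divides M."""
    alpha = a / b + r
    shifted = [s + j * M for s in S]
    return abs(f(shifted, alpha) - e(j * M * r) * f(S, alpha))


S = [1, 4, 6, 9, 15, 22]     # hypothetical small set
M = 12                       # hypothetical modulus
worst = max(shifted_sum_error(S, M, j, a, b, r=0.001)
            for b in (1, 2, 3, 4, 6, 12)
            for a in range(b)
            for j in range(5))
print(worst)                                      # floating-point noise only
print(shifted_sum_error(S, M, 1, 1, 5, r=0.0))    # b = 5 does not divide M
```

When $b$ does not divide $M$ the extra factor $e(jMa/b)$ no longer equals 1, so the two sums genuinely differ, as the last line illustrates.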

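Again, using the Parseval identity, we get

$$|\Sigma_1^j| = \left|\frac{1}{\tau}\sum_{b_t \le P} f_{SQ}\, f_S\, f_{S+jM}\right| \le |\Sigma_1| + \frac{1}{\tau}\sum_{b_t \le P} \left|f_{SQ}\right|\left|f_S\right|\left|f_{S+jM} - f_S\right|$$

$$\le |\Sigma_1| + \frac{O(N^{-\eta})}{\tau}\sum_{t} \left|f_{SQ}\right|\left|f_S\right|^2 \le |\Sigma_1| + O\!\left(N^{3/2-\eta}\right) = O\!\left(\frac{N\sqrt{N}}{\sqrt{P}}\right) \quad\text{(using Eq. (5))}. \qquad (8)$$

Using Eqs. (8) and (7), we get

$$|\Sigma_1^j + \Sigma_2^j| \le |\Sigma_1^j| + |\Sigma_2^j| = O\!\left(\frac{N\sqrt{N}}{\sqrt{P}}\right).$$

6. Proof of main theorem

Lemmas 9 and 10 imply that

$$\left(\frac{11}{32} + \varepsilon\right)^2 \le \Delta_1 \le \Delta_2 \le \cdots \le \Delta_l \le \frac{11}{32} + \varepsilon.$$

It cannot be the case that $\Delta_{i+1} - \Delta_i \ge \delta$ for all $i$. Therefore, let us say that $\Delta_{i+1} - \Delta_i < \delta$ for some $i$. By Lemma 13, we have a pair of good residue classes mod $q_i$; denote them by $a$ and $b$. Let

$$G_a = \left\{w_a : \rho_{i+1,\,a+w_a} \ge \frac{\varepsilon}{4}\right\} \quad\text{and}\quad G_b = \left\{w_b : \rho_{i+1,\,b+w_b} \ge \frac{\varepsilon}{4}\right\}.$$

Consider the residue class $c \equiv a + b$ modulo $q_i$. It is a QR modulo $q_i$. When class $c$ is subdivided into smaller residue classes modulo $q_{i+1}$, some of them will be QRs modulo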

$q_{i+1}$ (Fig. 1).

[Fig. 1. A pair of good residue classes modulo $q_i$. The figure shows the classes $a$, $b$ and $c = a + b$ modulo $q_i$ and the index sets $G_a$, $G_b$, $G_c$ modulo $q_{i+1}$.]

Let us denote their index set as

$$G_c = \left\{w_c : 0 \le w_c < \frac{q_{i+1}}{q_i} \text{ and } c + w_c \text{ is a QR modulo } q_{i+1}\right\}.$$

Let us denote $|G_c| = L$. Note that when we shift through an interval of length $hq_{i+1}/q_i$ in residue class $c$ modulo $q_i$, we are shifting through an interval of length $h$ through each of the refined residue classes modulo $q_{i+1}$. Using Lemma 14, the number of perfect squares in the interval of length $H = hq_{i+1}/q_i$ is at least $hq_{i+1}/(4q_i\sqrt{N})$. Then it is easy to see that the number of perfect squares in the corresponding interval of length $h$ in any of the residue classes in $G_c$ is at least $hq_{i+1}/(8q_iL\sqrt{N})$.

Let

$$W_{w_c} = \left\{(w_a, w_b) : w_a \in G_a \text{ and } w_b \equiv w_c - w_a + c - a - b \pmod{q_{i+1}}\right\}.$$

Note that when we add any residue pair $w_a$ and $w_b$ in $W_{w_c}$, we get the QR $w_c$ modulo $q_{i+1}$. From Lemma 13 and Definition 6, $|G_a| = |W_{w_c}|$ and $|G_a|, |G_b| \ge 3q_{i+1}/(4q_i)$. So at least 50% of the elements $(w_a, w_b)$ in $W_{w_c}$ are such that $w_a \in G_a$ and $w_b \in G_b$. That is, if we define

$$G_{w_c} = W_{w_c} \cap (G_a \times G_b),$$

then $|G_{w_c}| \ge q_{i+1}/(2q_i)$.

Choose any pair $(w_a, w_b) \in G_{w_c}$. The number of solutions we get for the equation $x + y = z^2$, given that we choose $x$ from residue class $a + w_a$ and $y$ from

residue class $b + w_b$ modulo $q_{i+1}$ and do shifting in the interval of length $h = N^{1/2-\eta}$, $\eta > 0$, is at least

$$\frac{\varepsilon N}{4q_{i+1}} \cdot \frac{\varepsilon N}{4q_{i+1}} \cdot \frac{hq_{i+1}}{8q_iL\sqrt{N}}.$$

If we do this for all pairs in $G_{w_c}$ and then repeat it for all different possible $w_c \in G_c$, we find that the number of solutions for $x + y = z^2$ is at least

$$\sum_{w_c \in G_c} \frac{q_{i+1}}{2q_i} \cdot \frac{\varepsilon N}{4q_{i+1}} \cdot \frac{\varepsilon N}{4q_{i+1}} \cdot \frac{hq_{i+1}}{8q_iL\sqrt{N}} = \sum_{w_c \in G_c} \frac{\varepsilon^2 N\sqrt{N}\,h}{256\,q_i^2\,L} = \frac{\varepsilon^2 N\sqrt{N}\,h}{256\,q_i^2}.$$

This contradicts the upper bound that we get when we use the Shifting lemma with $P = p_{i+1}$ and $C = 1$ (that is, $M = q_{i+1}$), since $h/\sqrt{p_{i+1}}$ $[\ldots]$ $\varepsilon^2 N\sqrt{N}\,h/(256\,q_i^2)$ owing to Lemma 8.

References

[1] P. Erdős, R.L. Graham, Old and new problems and results in combinatorial number theory, Monographs Enseign. Math., No. 28, University of Geneva, 1980.
[2] P. Erdős, A. Sárközy, On differences and sums of integers, II, Bull. Greek Math. Soc. 18 (1977) 204–223.
[3] J.P. Lagarias, A.M. Odlyzko, J.B. Shearer, On the density of sequences of integers the sum of no two of which is a square, I. Arithmetic progressions, J. Combin. Theory Ser. A 33 (1982) 167–185.
[4] J.P. Lagarias, A.M. Odlyzko, J.B. Shearer, On the density of sequences of integers the sum of no two of which is a square, II. General sequences, J. Combin. Theory Ser. A 34 (1983) 123–139.
[5] J.P. Massias, Sur les suites dont les sommes des termes 2 à 2 ne sont pas des carrés, quoted in [3,4].
[6] A. Sárközy, On difference sets of sequences of integers I, Acta Math. Acad. Sci. Hungar. 31 (1978) 125–149.
[7] R.C. Vaughan, The Hardy–Littlewood Method, 2nd Edition, Cambridge University Press, Cambridge, 1997.