ON THE ERDŐS-STONE THEOREM

V. CHVÁTAL AND E. SZEMERÉDI

In 1946, Erdős and Stone [3] proved that every graph with $n$ vertices and at least
$$\left(\frac{d-1}{2d} + c\right) n^2$$
edges contains a large $K_{d+1}(t)$, a complete $(d+1)$-partite graph with $t$ vertices in each part. More recently, Bollobás, Erdős and Simonovits [2] proved that
$$t \ge \frac{a \log n}{d \log (1/c)}$$
for some positive constant $a$ and conjectured that this bound can be improved into
$$t \ge \frac{b \log n}{\log (1/c)}$$
for some positive constant $b$. The purpose of our paper is to prove that
$$t \ge \frac{\log n}{500 \log (1/c)}$$
for all $n$ large enough with respect to $c$ and $d$. In a sense, this result is best possible: as shown by Bollobás and Erdős [1], our constant $1/500$ cannot be increased beyond 5.

Our argument hinges on a theorem asserting that every sufficiently large graph can be partitioned into a small number of classes in such a way that the partition exhibits strong regularity properties. To state the theorem in more precise terms, we need a few definitions. When $A$ and $B$ are nonempty disjoint sets of vertices in some graph then we denote by $d(A, B)$ the density of edges between $A$ and $B$: this is the number of edges with one endpoint in $A$ and the other endpoint in $B$ divided by $|A| \cdot |B|$. The pair $(A, B)$ is called $\varepsilon$-regular if $X \subseteq A$, $|X| \ge \varepsilon|A|$, $Y \subseteq B$ and $|Y| \ge \varepsilon|B|$ imply that $|d(X, Y) - d(A, B)| < \varepsilon$; otherwise the pair is $\varepsilon$-irregular. A partition of a set $V$ into classes $C_0, C_1, \ldots, C_k$ is called $\varepsilon$-regular if $|C_0| \le \varepsilon|V|$, $|C_i| = |C_j|$ whenever $1 \le i < j \le k$ and if at most $\varepsilon k^2$ of the pairs $(C_i, C_j)$ with $1 \le i < j \le k$ are $\varepsilon$-irregular. The Regular Partition Theorem, proved in [5], asserts that for every positive $\varepsilon$ and for every positive integer $m$ there are positive integers $M$ and $N$ with the following property: if the set $V$ of vertices of a graph has size at least $N$ then there is an $\varepsilon$-regular partition of $V$ into $k+1$ classes such that $m \le k \le M$.

LEMMA 1.
For every choice of positive numbers $c$, $d$ and $\varepsilon$ such that $d$ is an integer and $\varepsilon < c/10$ there are positive numbers $N$ and $\delta$ with the following property. If $n \ge N$

Received 5 December, 1979. [J. LONDON MATH. SOC. (2), (1981), 207-214]

then every graph with $n$ vertices and at least $\left(\frac{d-1}{2d} + c\right) n^2$ edges contains pairwise disjoint sets $C_1, C_2, \ldots, C_{d+1}$ of vertices such that $|C_i| \ge \delta n$ for all $i$,
$$\sum_{i \ne j} d(C_i, C_j) \ge d - 1 + cd$$
for all $j$, and such that every pair $(C_i, C_j)$ is $\varepsilon$-regular.

Proof. Since every graph with $n$ vertices has fewer than $n^2/2$ edges, we may assume that $c < 1/(2d)$. Choose an integer $m$ so that $m^2 \ge 100/c^3 d$ and $m^2 \ge 4(d+1)/c$; note that $m \ge 10/c$. The Regular Partition Theorem guarantees the existence of certain integers $M$ and $N$. We claim that $N$ and $\delta = (1-\varepsilon)/M$ have the properties required by the present lemma. To justify this claim, we consider an $\varepsilon$-regular partition of our graph into classes $C_0, C_1, \ldots, C_k$ such that $m \le k \le M$. Trivially, $|C_i| \ge \delta n$ whenever $0 < i \le k$. Writing $t = |C_i|$ for $i > 0$ we have [...] Since $t \le n/k$, it follows that
$$\sum_{i \ne j} d(C_i, C_j) \ge k^2\left(1 - \frac{1}{d} + 2c - 2\varepsilon - \frac{1}{k}\right).$$
Define numbers $d_{ij}$ ($i \ne j$) by
$$d_{ij} = \begin{cases} d(C_i, C_j) & \text{if the pair } (C_i, C_j) \text{ is } \varepsilon\text{-regular,} \\ 0 & \text{otherwise.} \end{cases}$$
Since [...] we have [...] Now call a subset $A$ of $\{1, 2, \ldots, k\}$ dense if
[...] As we have just proved, $\{1, 2, \ldots, k\}$ is dense. Furthermore, every dense set $A$ satisfies [...] hence $|A| \ge 4/c$ and $|A| \ge d+1$. We shall consider a minimal dense set $A$. Note that for every $j$ in $A$ we have [...] otherwise $A - \{j\}$ would be dense. Now it follows that for every $d$-point subset $D$ of $A$ we have [...] Finally, among all the $(d+1)$-point subsets of $A$, choose a subset $B$ maximizing $\sum_{i,j \in B} d_{ij}$. Consider an arbitrary $s$ in $B$ and set $D = B - \{s\}$. Since $\sum_{j \in D} d_{sj} \ge \sum_{j \in D} d_{ij}$ for all $i$ in $A - D$, we have [...] Now it follows that the sets $C_i$ ($i \in B$) have the desired properties.

Kővári, Sós and Turán [4] proved the existence of a function $m$ with the following property: if $n \ge m(k, p)$ then every bipartite graph with $n$ vertices in each part and $pn^2$ edges altogether contains a $K_2(k)$. (In fact, they proved that $m(k, p) < c^k$ whenever $c > 1/p$ and $k$ is large enough with respect to $c$; this bound was later improved by Znám [6].)

LEMMA 2. Let $n$ be a positive integer; let $c$ and $\varepsilon$ be positive constants such that $\varepsilon \le (c/6n)^{2^n}$. Let $C_1, C_2, \ldots, C_n$ be pairwise disjoint sets of vertices in some graph such that every pair $(C_i, C_j)$ is $\varepsilon$-regular and such that [...] for all $i$. Let $k$ be a positive integer and let $p$ be a positive number such that $d(C_i, C_j) \ge p + \varepsilon$ for all $i$ and $j$. Write $d_{ij} = d(C_i, C_j) - c$ and $f_{ij} =$ [...] and $f_{ij} = 2^k$ otherwise. Finally, assume that the vertices are labeled in such a way that at most
$$\frac{\varepsilon |C_j|}{m(k, p) \prod_{i \ne j} f_{ij}}$$
distinct labels appear in each $C_j$. Then each $C_j$ contains a subset $M_j$ of size at least $k/50$ such that every two vertices in distinct sets $M_j$ are adjacent and such that every two vertices in the same set $M_j$ have the same label.

Proof. We shall proceed by induction on $n$ and distinguish between two cases. In either case, $N(X, Y)$ stands for the largest $N$ such that every vertex in $X$ has at least $N$ neighbours in $Y$.

Case 1, when $d_{ij} \ge 2/5$ whenever $i \ne j$. In this case, we shall obtain the desired conclusion without appealing to the induction hypothesis. We shall find it convenient to write $\delta_t = 2(c/6n)^{2^{n-t}}$ whenever $1 \le t < n$. Note that [...] for all $t$ and that $\prod_{i=1}^{n-1} \delta_i/2 > n\varepsilon$. The desired sets $M_i$ will be constructed in $n$ iterations. At the beginning of iteration $t$, we have subsets $K_i$ of $C_i$ ($1 \le i < t$) and subsets $S_j = S_j^t$ of $C_j$ ($t \le j \le n$) such that $|K_i| \le k$, $N(S_j, K_i) \ge d_{ji}|K_i|$ and [...] for all $i$ and $j$. (To begin when $t = 1$, we set $S_j = C_j$ for all $j$.) We shall define first a certain subset $S$ of $S_t$, then the set $K_t$ along with certain subsets $S_j^*$ of $S_j$ ($t < j \le n$) and finally the sets $S_j^{t+1}$ ($t < j \le n$).

The set $S$ consists of all those vertices in $S_t$ which have at least $(d(C_j, C_t) - \varepsilon)|S_j|$ neighbours in each $S_j$ ($t < j \le n$). Since $|S_j| \ge n\varepsilon|C_j|$ whenever $t \le j \le n$, the assumption that all pairs $(C_j, C_t)$ are $\varepsilon$-regular implies that $|S| \ge |S_t| - (n-1)\varepsilon|C_t| \ge \varepsilon|C_t|$. Note that for every subset $X$ of $S$ and for every subscript $j$ such that $t < j \le n$ there is a subset $S^*$ of $S_j$ such that $|S^*| \ge c|S_j|$ and $N(S^*, X) \ge d_{jt}|X|$: otherwise the total number of edges between $S_j$ and $X$ would be at most
$$c|S_j|\,|X| + (1-c)|S_j|\,d_{jt}|X| \le |X|\,|S_j|\,(c + (1-c)d_{jt}),$$
contradicting the fact that $N(X, S_j) \ge (d(C_j, C_t) - \varepsilon)|S_j|$. We extend the label of each vertex $v$ in $S$ by a $(t-1)$-tuple of sets $N_i = N_i(v)$ such that $N_i \subseteq K_i$, $|N_i| = N(S_t, K_i)$ and such that $v$ is adjacent to every vertex in $N_i$ whenever $1 \le i < t$.
Our bound on the number of distinct original labels in $C_t$ implies that there are at most $|S|/k$ distinct extended labels in $S$. Hence $S$ contains a subset $K$ of size $k$ such that every two vertices in $K$ have the same extended label. In particular, there are sets $N_i^t$ ($1 \le i < t$) such that $N_i(v) = N_i^t$ for all $v$ in $K$. Now we ask the following question. Is there a subset $J$ of $\{t+1, t+2, \ldots, n\}$ along with subsets $S_j^*$ of $S_j$ ($j \in J$) and a subset $K^*$ of $K$ such that $(1/5)k \le |K^*| \le (4/5)k$, [...] and $|S_j^*| \ge \delta_t|S_j|$ for all $j$ in $J$?
If the answer is "yes" then we may assume that $N(S_j^*, K^*) \ge d_{jt}|K^*|$ for all $j$ in $J$: subscripts violating this inequality might be just as well deleted from $J$. For each $j$ not in $J$ but such that $t < j \le n$, we find a subset $S_j^*$ of $S_j$ such that $|S_j^*| \ge c|S_j|$ and $N(S_j^*, K^*) \ge d_{jt}|K^*|$. Then we set $K_t = K^*$. If the answer is "no" then we find subsets $S_j^*$ of $S_j$ ($t < j \le n$) such that $|S_j^*| \ge c|S_j|$ and $N(S_j^*, K) \ge d_{jt}k$. Then we set $K_t = K$.

Let $M_i = M_i^t$ ($1 \le i < t$) denote the intersection of all the sets $N_i^s$ ($i < s \le t$). For every $i$ and $j$ such that $1 \le i < t$ and $t < j \le n$, choose a subset $T_{ij} = T_{ij}^t$ of $S_j^*$ such that $(1/3n)|S_j^*| \le |T_{ij}| \le (1/2n)|S_j^*|$ and such that, for every choice of $v$ in $T_{ij}$ and $w$ in $S_j^* - T_{ij}$, the number of neighbours of $v$ in $K_i - M_i$ is at least the number of neighbours of $w$ in $K_i - M_i$. The new sets $S_j^{t+1}$ ($t < j \le n$) are obtained from $S_j^*$ by deleting all the sets $T_{ij}$ ($1 \le i < t$). Note that every vertex in $S_j^{t+1}$ has at most $N(T_{ij}^t, K_i - M_i^t)$ neighbours in $K_i - M_i^t$ and that $|S_j^{t+1}| \ge \frac{1}{2}|S_j^*|$.

When the last iteration is completed, every two vertices belonging to the same set $M_i^n$ have the same label and every two vertices belonging to different sets $M_i^n$ are adjacent. It remains to be shown that $|M_i^n| \ge k/50$ for all $i$. If the answer in iteration $i$ was "yes" then the conclusion follows easily: since [...] we have [...] If the answer was "no" then consider the smallest superscript $t$ such that $|M_i^t| \le (4/5)k$. Since $M_i^t = M_i^{t-1} - (K_i - N_i^t)$ and $|N_i^t| = N(S_t, K_i) \ge d_{ti}|K_i| \ge (2/5)k$, we have [...] Thus we may assume that $t < n$. Furthermore, since [...] we have [...] and so [...] Since the answer in iteration $i$ was "no", we have $|S_j^{i+1}| \ge (c/2)|S_j^i|$ whenever $i < j \le n$ and so [...] whenever $t \le j \le n$. But then [...] whenever $t < j \le n$. Since the answer was "no", we have [...]
Finally [...] But $|N_i^j| = N(S_j^j, K_i) \ge d_{ji}|K_i|$ and, as we have observed above,
$$|N_i^j - M_i^t| \le N(T_{ij}^t, K_i - M_i^t).$$
Hence [...]

Case 2, when $d_{ij} < 2/5$ for some $i$ and $j$. (Without loss of generality, we may assume that $i = n-1$ and $j = n$.) In this case, we shall rely on the induction hypothesis. First we shall find $k$-point subsets $K_j$ of $C_j$ ($n-1 \le j \le n$) such that every two vertices in different sets $K_j$ are adjacent and such that every two vertices in the same set $K_j$ have the same label. Then we shall use these sets $K_j$ to define certain subsets $C_i^*$ of $C_i$ ($1 \le i \le n-2$). Applying the induction hypothesis, with $\varepsilon$ replaced by $\varepsilon^*$, to these sets $C_i^*$ we shall obtain the first $n-2$ of the desired $n$ sets $M_i$. Finding the remaining two sets $M_j$ ($n-1 \le j \le n$) will be straightforward.

Each of the two sets $C_j$ with $n-1 \le j \le n$ contains a subset $S_j$ such that $N(S_j, C_i) \ge (d(C_i, C_j) - \varepsilon)|C_i|$ whenever $1 \le i \le n-2$ and such that [...] Write $m = m(k, p)$. In each $S_j$, consider a maximal subset $P_j$ which can be partitioned into $m$-point sets in such a way that every two vertices in the same $m$-point set have the same label. In $S_j - P_j$, each label appears at most $m-1$ times. Hence our upper bound on the number of distinct labels in $C_j$ implies that $|S_j - P_j| \le \varepsilon|C_j|$. Now we have $|P_j| \ge \varepsilon|C_j|$ and so $d(P_{n-1}, P_n) \ge d(C_{n-1}, C_n) - \varepsilon \ge p$. A straightforward averaging argument shows that each $P_j$ contains an $m$-point subset $Q_j$ such that every two vertices in the same set $Q_j$ have the same label and such that $d(Q_{n-1}, Q_n) \ge p$. Since $m = m(k, p)$, each $Q_j$ contains a $k$-point subset $K_j$ such that every vertex in $K_{n-1}$ is adjacent to every vertex in $K_n$. For every choice of $i$ and $j$ such that $1 \le i \le n-2$ and $n-1 \le j \le n$ we have $d_{ij} \ge 3/5$.
Hence $C_i$ contains a subset $S_i^j$ such that $|S_i^j| \ge (\frac{1}{2} + c)|C_i|$ and $N(S_i^j, K_j) \ge (2d_{ij} - 1)|K_j|$: otherwise the total number of edges between $C_i$ and $K_j$ would be at most
$$(\tfrac{1}{2} + c)|C_i|\,|K_j| + (\tfrac{1}{2} - c)|C_i|\,(2d_{ij} - 1)|K_j| \le |C_i|\,|K_j|\,(d_{ij} + 4c/5),$$
contradicting the fact that $N(K_j, C_i) \ge (d(C_j, C_i) - \varepsilon)|C_i|$. Let $C_i^*$ ($1 \le i \le n-2$) denote the intersection of $S_i^{n-1}$ and $S_i^n$. Note that $|C_i^*| \ge 2c|C_i|$ for all $i$ and that every pair $(C_i^*, C_j^*)$ is $\varepsilon^*$-regular with $\varepsilon^* = \max(\varepsilon/2c, 2\varepsilon)$. We extend the label of every vertex $v$ in every $C_i^*$ by sets $N_{n-1} = N_{n-1}(v)$ and
$N_n = N_n(v)$ such that $N_j \subseteq K_j$, $|N_j| = N(C_i^*, K_j)$ and such that $v$ is adjacent to every vertex in $N_j$ for both $j$. The induction hypothesis applied to $C_i^*$ ($1 \le i \le n-2$) and $\varepsilon^*$ guarantees that every $C_i^*$ contains a subset $M_i$ of size at least $k/50$ such that every two vertices in different sets $M_i$ are adjacent and such that every two vertices in the same set $M_i$ have the same extended label. In particular, there are sets $N_j^i$ ($1 \le i \le n-2$, $n-1 \le j \le n$) such that $N_j(v) = N_j^i$ for all $v$ in $M_i$. If $M_j$ ($n-1 \le j \le n$) denotes the intersection of all the sets $N_j^i$ ($1 \le i \le n-2$) then
$$|M_j| \ge |K_j| - \sum_{i=1}^{n-2} |K_j - N_j^i| = |K_j| - \sum_{i=1}^{n-2} \bigl(|K_j| - N(C_i^*, K_j)\bigr)$$
[...] Hence the sets $M_i$ ($1 \le i \le n$) have the desired properties.

LEMMA 3. Let $d$ be a positive integer; let $c$ and $\varepsilon$ be positive constants such that $\varepsilon \le (c/6(d+1))^{2^{d+1}}$. Let $C_1, C_2, \ldots, C_{d+1}$ be pairwise disjoint sets of vertices in some graph such that every pair $(C_i, C_j)$ is $\varepsilon$-regular and such that
$$\sum_{j \ne i} d(C_i, C_j) \ge d - 1 + cd$$
for all $i$. Let $k$ be a positive integer such that $(20d^2)^k m(k, cd/2) \le \varepsilon|C_j|$ for all subscripts $j$. Then the graph contains a $K_{d+1}(t)$ with $t \ge k/50$.

Proof. To apply Lemma 2 with $n = d+1$ and $p = cd/2$, we only have to verify that $\prod_{i \ne j} f_{ij} \le (20d^2)^k$ for all subscripts $j$. Denote by $I$ the set of those subscripts $i$ for which $d_{ij} \ge 3/4$; note that [...] Since the function $x \to x \log(e/x)$ is concave, we have
$$\prod_{i=1}^{n} (e/x_i)^{x_i} \le (ne/x)^x \le (ne/2)^2$$
whenever $n \ge 2$, $0 < x_i \le 1$ ($1 \le i \le n$) and $\sum_{i=1}^{n} x_i = x \le 2$. Hence $\prod_{i \in I} f_{ij} \le (ed/2)^{2k}$ as long as $|I| \ge 2$. Since there are at most three subscripts outside $I$, the desired result follows.
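As a numerical sanity check of the concavity inequality used in the proof of Lemma 3, the following Python sketch samples points with $0 < x_i \le 1$ and $\sum x_i \le 2$ and tests the chain $\prod (e/x_i)^{x_i} \le (ne/x)^x \le (ne/2)^2$ in logarithmic form. The function names, the sampling scheme and the tolerance are illustrative assumptions, not part of the original argument. The final constant records the arithmetic $(ed/2)^{2k} \cdot (2^k)^3 = (2e^2 d^2)^k \le (20d^2)^k$, assuming each of the at most three factors outside $I$ is bounded by $2^k$.

```python
import math
import random


def log_product(xs):
    """Return log of prod_i (e / x_i)^{x_i}, i.e. sum_i x_i * log(e / x_i)."""
    return sum(x * math.log(math.e / x) for x in xs)


def log_jensen_bound(xs):
    """Return log of (n e / x)^x, where n = len(xs) and x = sum(xs)."""
    n, x = len(xs), sum(xs)
    return x * math.log(n * math.e / x)


def log_final_bound(n):
    """Return log of (n e / 2)^2."""
    return 2 * math.log(n * math.e / 2)


def chain_holds(xs, tol=1e-9):
    """Check prod (e/x_i)^{x_i} <= (ne/x)^x <= (ne/2)^2 up to rounding error."""
    a = log_product(xs)
    b = log_jensen_bound(xs)
    c = log_final_bound(len(xs))
    return a <= b + tol and b <= c + tol


def random_point(rng, n):
    """Sample 0 < x_i <= 1 with sum at most 2 (hypothetical test distribution)."""
    xs = [rng.uniform(1e-6, 1.0) for _ in range(n)]
    total = sum(xs)
    if total > 2:
        # Shrink all coordinates so that the sum becomes 2; entries stay in (0, 1].
        xs = [x * 2 / total for x in xs]
    return xs


def check_random(trials=2000, seed=0):
    """Return True if the chain holds at every sampled point."""
    rng = random.Random(seed)
    return all(
        chain_holds(random_point(rng, rng.randint(2, 10))) for _ in range(trials)
    )


# Constant bookkeeping: (e d / 2)^{2k} * (2^k)^3 = (2 e^2 d^2)^k, and 2 e^2 < 20.
CONSTANT = 2 * math.e ** 2

if __name__ == "__main__":
    print(check_random())
    print(CONSTANT < 20)
```

The first inequality is Jensen's inequality for the concave function $x \log(e/x)$; the second holds because $x \log(ne/x)$ is nondecreasing for $x \le n$, and here $x \le 2 \le n$.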
Now our main result follows routinely from Lemma 1, Lemma 3 and the fact that $m(k, p) < (1/p)^k$ whenever $k$ is large enough with respect to $p$ (for a proof, see [6]). When $c$ and $d$ are fixed, choose an $\varepsilon$ small enough to satisfy the hypothesis of Lemma 3. Now Lemma 1 guarantees the existence of a certain constant $\delta$: if $n$ is large enough with respect to $c$ and $d$ then every graph with $n$ vertices and at least $\left(\frac{d-1}{2d} + c\right) n^2$ edges contains pairwise disjoint sets of vertices $C_1, C_2, \ldots, C_{d+1}$ of size at least $\delta n$ and satisfying the hypothesis of Lemma 3. Let $k = k(n)$ be the largest integer such that $(1/c)^{8k} \le \varepsilon\delta n$. If $n$ is large enough with respect to $c$ and $d$ then
$$k \ge \frac{\log n}{10 \log(1/c)}$$
and $m(k, cd/2) < (2/cd)^k$. But then
$$(20d^2)^k m(k, cd/2) < (1/c)^{8k}$$
as $cd \le 1/2$. By Lemma 3, our graph contains $K_{d+1}(t)$ with $t \ge k/50$.

References

1. B. Bollobás and P. Erdős, "On the structure of edge graphs", Bull. London Math. Soc., 5 (1973), 317-321.
2. B. Bollobás, P. Erdős and M. Simonovits, "On the structure of edge graphs II", J. London Math. Soc. (2), 12 (1976), 219-224.
3. P. Erdős and A. H. Stone, "On the structure of linear graphs", Bull. Amer. Math. Soc., 52 (1946), 1089-1091.
4. T. Kővári, V. T. Sós and P. Turán, "On a problem of Zarankiewicz", Colloq. Math., 2 (1954), 50-57.
5. E. Szemerédi, "Regular partitions of graphs", Problèmes en combinatoire et théorie des graphes (C.N.R.S., Paris, 1978), pp. 399-401.
6. Š. Znám, "Two improvements of a result concerning a problem of K. Zarankiewicz", Colloq. Math., 13 (1965), 255-258.

School of Computer Science, McGill University, 805 Sherbrooke Street West, Montreal PQ, Canada H3A 2K6.