THE SZEMERÉDI REGULARITY LEMMA AND ITS APPLICATION

YAQIAO LI

Date: April, 2015. McGill University, yaqiao.li@mail.mcgill.ca.

In this note we prove Szemerédi's regularity lemma and apply it to prove the triangle removal lemma and Roth's theorem on 3-term arithmetic progressions (3APs).

1. The Regularity Lemma

Consider a bipartite graph with vertex classes $A$ and $B$, and let $E(A, B)$ be the set of edges between them. We define the density of this bipartite graph $(A, B)$ as
\[ d(A, B) := \frac{|E(A, B)|}{|A|\,|B|}. \]

Definition 1 ($\epsilon$-regular). Say the bipartite graph $(A, B)$ is $\epsilon$-regular if for any $A' \subseteq A$, $B' \subseteq B$ with $|A'| \ge \epsilon |A|$ and $|B'| \ge \epsilon |B|$, we have
\[ |d(A', B') - d(A, B)| \le \epsilon. \]

Intuitively, regularity means that the bipartite graph given by $A, B$ is random looking, or that its edges are uniformly distributed.

We say a partition is an equipartition if the sizes of different parts differ by at most one.

Definition 2 ($\epsilon$-regular equipartition). Say an equipartition (of $V(G)$, say) given by $V_1, \ldots, V_k$ is $\epsilon$-regular if all but at most $\epsilon k^2$ of the pairs $(V_i, V_j)$ are $\epsilon$-regular.

That is, most pairs of this partition give random looking bipartite graphs, with just a tiny fraction of irregular pairs. Note, however, that different pairs can have very different edge densities (as bipartite graphs).

Szemerédi's regularity lemma guarantees the existence of a regular equipartition of an arbitrary graph.

Theorem 1 (Szemerédi's regularity lemma). For every $\epsilon > 0$, there exists $T(\epsilon) > 0$ (which depends only on $\epsilon$, not on the specific graph) such that every graph $G$ has an $\epsilon$-regular equipartition into $k$ parts, where $k$ is bounded as follows:
\[ \frac{1}{\epsilon} \le k \le T(\epsilon). \]
The upper bound $T(\epsilon)$ is a tower of 2s.

2. Application: The Triangle Removal Lemma

We will use the regularity lemma to prove the triangle removal lemma.

Theorem 2 (Triangle removal lemma). For every $\epsilon > 0$, there exists $\delta = \delta(\epsilon) > 0$ such that if $G$ is $\epsilon$-far from being triangle free (that is, we have to remove at least $\epsilon n^2$ edges to make $G$ triangle free), then $G$ contains at least $\delta n^3$ triangles, where $n$ is the number of vertices of $G$.
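To make the quantities in the triangle removal lemma concrete, here is a minimal Python sketch that counts the triangles of a small graph and the number of triangles through a given edge. The function names and the edge-list representation are illustrative assumptions, not part of the note.

```python
from itertools import combinations


def adjacency(n, edges):
    """Build adjacency sets for a simple graph on vertices 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def count_triangles(n, edges):
    """Count triangles; each triple u < v < w is examined exactly once."""
    adj = adjacency(n, edges)
    return sum(
        1
        for u, v, w in combinations(range(n), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )


def triangles_through_edge(n, edges, edge):
    """Number of triangles containing the given edge (common neighbours)."""
    adj = adjacency(n, edges)
    u, v = edge
    return len(adj[u] & adj[v])


# Complete graph K_5: C(5, 3) = 10 triangles, each edge lies in n - 2 = 3.
k5_edges = list(combinations(range(5), 2))
k5_triangles = count_triangles(5, k5_edges)
k5_per_edge = triangles_through_edge(5, k5_edges, (0, 1))
```

In the complete graph every edge lies in exactly $n - 2$ triangles, which is the largest number of triangles a single edge deletion can destroy.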

If a graph contains $\Omega(n^3)$ triangles, then we have to remove $\Omega(n^2)$ edges to make it triangle free, since deleting one edge destroys at most $n - 2$ triangles. The triangle removal lemma says the converse: if $G$ contains $o(n^3)$ triangles, then it can be made triangle free by deleting $o(n^2)$ edges.

We need the following lemma to prove this theorem.

Lemma 1. Let $A, B, C$ be the vertex classes of a tripartite graph such that $d(A, B) = b$, $d(A, C) = c$, $d(B, C) = a$, and $a \ge 2\epsilon$, $b \ge 2\epsilon$, $c \ge 2\epsilon$. If all three pairs $(A, B)$, $(A, C)$ and $(B, C)$ are $\epsilon$-regular, then the number of triangles in this tripartite graph is at least
\[ (1 - 2\epsilon)(a - \epsilon)(b - \epsilon)(c - \epsilon)\,|A|\,|B|\,|C|. \]

This lemma says that if a tripartite graph is pairwise dense and regular, then it contains a positive proportion of all possible triangles (i.e., a lot of triangles).

Proof. Consider the vertices in $A$ that have a small neighbourhood in $B$,
\[ S := \{ v \in A : |N_B(v)| < (b - \epsilon)|B| \} \subseteq A, \]
where $N_B(v)$ denotes the set of neighbours of $v$ in $B$. We claim that $S$ is small, specifically
\[ |S| < \epsilon |A|. \]
Assume otherwise. Then the pair $(S, B)$ satisfies $|S| \ge \epsilon |A|$ and $|B| \ge \epsilon |B|$, so since $(A, B)$ is $\epsilon$-regular we have
\[ |d(S, B) - d(A, B)| = |d(S, B) - b| \le \epsilon. \]
But
\[ d(S, B) = \frac{|E(S, B)|}{|S|\,|B|} < \frac{|S|\,(b - \epsilon)|B|}{|S|\,|B|} = b - \epsilon, \]
contradicting the preceding inequality. Hence $S$ is small as claimed.

Similarly, the set
\[ T := \{ v \in A : |N_C(v)| < (c - \epsilon)|C| \} \]
is also small, where $N_C(v)$ denotes the set of neighbours of $v$ in $C$. Specifically, $|T| < \epsilon |A|$.

Now consider the remaining vertices of $A$, that is, the set
\[ A' := A \setminus (S \cup T). \]
This set is large, specifically
\[ |A'| \ge (1 - 2\epsilon)|A|. \]
By our choice of $S$ and $T$, every vertex $v \in A'$ has many neighbours both in $B$ and in $C$:
\[ |N_B(v)| \ge (b - \epsilon)|B|, \qquad |N_C(v)| \ge (c - \epsilon)|C|, \qquad \forall v \in A'. \]
Since $b \ge 2\epsilon$ and $c \ge 2\epsilon$, these neighbourhoods have sizes at least $\epsilon |B|$ and $\epsilon |C|$ respectively. As the pair $(B, C)$ is $\epsilon$-regular, it follows that
\[ |d(N_B(v), N_C(v)) - d(B, C)| = |d(N_B(v), N_C(v)) - a| \le \epsilon, \qquad \forall v \in A'. \]
In particular we have
\[ d(N_B(v), N_C(v)) \ge a - \epsilon, \]
implying that
\[ |E(N_B(v), N_C(v))| \ge (a - \epsilon)\,|N_B(v)|\,|N_C(v)| \ge (a - \epsilon)(b - \epsilon)(c - \epsilon)\,|B|\,|C|. \]

Observe that $|E(N_B(v), N_C(v))|$ is exactly the number of triangles lying in $(\{v\}, N_B(v), N_C(v))$. Hence
\[ \#\text{triangles in } (A, B, C) \ge \#\text{triangles in } (A', B, C) \ge \big((1 - 2\epsilon)|A|\big)\big((a - \epsilon)(b - \epsilon)(c - \epsilon)\,|B|\,|C|\big). \]

Now we prove the triangle removal lemma. The idea is to first use the regularity lemma to obtain a regular equipartition, then clean up the partition so that we can focus only on dense and regular tripartite graphs, and finally apply the preceding lemma to conclude that there are a lot of triangles.

Proof.

Step 1: Apply the regularity lemma. Apply the regularity lemma with $\epsilon/4$ to the graph $G$. Then $G$ has an $\frac{\epsilon}{4}$-regular equipartition into $k$ parts, where
\[ \frac{4}{\epsilon} \le k \le T\!\left(\frac{\epsilon}{4}\right). \]
Say this equipartition is given by $V_1, \ldots, V_k$.

Step 2: Clean up. Think of this equipartition as an equipartition of the adjacency matrix of $G$. We clean up the partition by removing the edges in the following three bad parts: the diagonal, the irregular parts, and the sparse parts. Fortunately, these parts are small, so we are still left with large and dense parts to work with.

Remove the diagonal. That is, remove all edges inside every $V_i$. Since $k \ge 4/\epsilon$, this removes at most
\[ k \cdot \frac{(n/k)^2}{2} = \frac{n^2}{2k} \le \frac{1}{8}\epsilon n^2 \]
edges.

Remove the irregular parts. By the definition of an $\frac{\epsilon}{4}$-regular equipartition, there are at most $\frac{\epsilon}{4}k^2$ irregular pairs, hence this removes at most
\[ \frac{\epsilon}{4}k^2 \cdot (n/k)^2 = \frac{1}{4}\epsilon n^2 \]
edges.

Remove the sparse parts. Since we are working with $\frac{\epsilon}{4}$-regularity, the preceding lemma treats a bipartite pair as dense if its edge density is at least $2 \cdot \frac{\epsilon}{4} = \frac{\epsilon}{2}$. Hence we remove all edges in pairs whose edge density is less than $\frac{\epsilon}{2}$. This removes at most
\[ \binom{k}{2} \cdot \frac{\epsilon}{2} \cdot (n/k)^2 \le \frac{1}{4}\epsilon n^2 \]
edges.

In total, we have removed at most $\frac{5}{8}\epsilon n^2$ edges.

Step 3: Apply the preceding lemma. Since $G$ is $\epsilon$-far from being triangle free and we have removed fewer than $\epsilon n^2$ edges, there are still triangles left in $G$.
Suppose the tripartite graph $(V_i, V_j, V_l)$ contains a triangle. Then there is at least one edge between each pair, hence the edges of every pair have not been removed in Step 2. It follows that $i, j, l$ are pairwise distinct and that the three pairs are regular and dense. Applying the preceding lemma (with $\epsilon/4$ in place of $\epsilon$ and all three densities at least $\epsilon/2$), we conclude that $(V_i, V_j, V_l)$ contains at least
\[ \left(1 - 2 \cdot \frac{\epsilon}{4}\right)\left(\frac{\epsilon}{2} - \frac{\epsilon}{4}\right)^3 (n/k)^3 \ge \delta n^3 \]
triangles, in which
\[ \delta = \left(1 - \frac{\epsilon}{2}\right) \frac{\epsilon^3}{64} \cdot \frac{1}{T(\epsilon/4)^3}, \]
as desired.

Note that we have used both the lower bound and the upper bound from Szemerédi's regularity lemma to obtain a formula for $\delta$ in the triangle removal lemma. The sharp bound on $\delta$ is still an open question.

3. Application: Roth's 3AP Theorem

We now use the triangle removal lemma to show Roth's theorem.

Theorem 3 (Roth). For any $\epsilon > 0$, there exists $N = N(\epsilon) > 0$ such that for all $n \ge N$, if $A \subseteq \mathbb{Z}_n$ satisfies $|A| \ge \epsilon n$, then $A$ contains a nontrivial 3AP.
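As a small illustration of the statement, the following Python sketch searches a subset of $\mathbb{Z}_n$ for a nontrivial 3AP by brute force. The function name is hypothetical, and "nontrivial" is read here, as an assumption, to mean that the three terms $x, x + d, x + 2d$ are pairwise distinct modulo $n$.

```python
def find_nontrivial_3ap(n, A):
    """Return (x, d) with x, x+d, x+2d in A modulo n and all distinct, else None."""
    members = {a % n for a in A}
    for x in sorted(members):
        for d in range(1, n):
            # Skip d with 2d = 0 (mod n): then x + 2d = x and the terms repeat.
            if (2 * d) % n == 0:
                continue
            if (x + d) % n in members and (x + 2 * d) % n in members:
                return (x, d)
    return None


# {0, 1, 3} contains no 3AP modulo 7, while {0, 3, 5} does: 0, 5, 3 (d = 5).
no_ap = find_nontrivial_3ap(7, {0, 1, 3})
wrap_ap = find_nontrivial_3ap(7, {0, 3, 5})
```

The second example shows that progressions in $\mathbb{Z}_n$ may wrap around modulo $n$, which is why the theorem is stated for subsets of $\mathbb{Z}_n$ rather than of $\{1, \ldots, n\}$.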

Proof. We construct a graph $G$ and apply the triangle removal lemma. Let $G$ be a tripartite graph with vertex classes $V_1 = V_2 = V_3 = \mathbb{Z}_n$, so $|V(G)| = 3n$. (Here we assume $n$ is odd, so that division by 2 is well defined in $\mathbb{Z}_n$.) Define the edges of $G$ as follows. For $r \in V_1$, $s \in V_2$, put $(r, s) \in E(G)$ if and only if $s - r \in A$. For $s \in V_2$, $t \in V_3$, put $(s, t) \in E(G)$ if and only if $t - s \in A$. For $r \in V_1$, $t \in V_3$, put $(r, t) \in E(G)$ if and only if $(t - r)/2 \in A$.

Consider $(r, s, t) \in V_1 \times V_2 \times V_3$ and suppose they form a triangle. By our choice of edges, this means that
\[ s - r \in A, \qquad t - s \in A, \qquad (t - r)/2 \in A, \]
hence $\left(s - r, \frac{t - r}{2}, t - s\right)$ is a 3AP in $A$. The problem is that this might be just a trivial 3AP, i.e., we may have
\[ s - r = \frac{t - r}{2} = t - s = a \in A. \]
Such a trivial triangle has the form $(r, r + a, r + 2a) \in V_1 \times V_2 \times V_3$ where $r \in \mathbb{Z}_n$ and $a \in A$. That is, each trivial triangle corresponds to a distinct pair $(r, a) \in \mathbb{Z}_n \times A$, and any single edge of a trivial triangle determines this pair, hence different trivial triangles are edge-disjoint. The number of trivial triangles is $n|A|$, so
\[ \epsilon n^2 \le \#\text{trivial triangles} \le n^2. \]
Because the trivial triangles are edge-disjoint, the lower bound shows that at least $\epsilon n^2 = \frac{\epsilon}{9}(3n)^2$ edges must be removed to make $G$ triangle free, i.e., $G$ is $\frac{\epsilon}{9}$-far from being triangle free. By the triangle removal lemma, $G$ contains at least $\delta (3n)^3$ triangles for some $\delta = \delta(\epsilon) > 0$. Since $\delta (3n)^3 > n^2$ once $n > 1/(27\delta)$, the upper bound on the number of trivial triangles shows that there must exist some nontrivial triangles, equivalently a nontrivial 3AP in $A$, as long as $n$ is sufficiently large depending on $\epsilon$.

4. Proof of the Regularity Lemma

At last, let us prove the regularity lemma. The proof is an energy argument. We define a notion of energy associated with each partition, and show that lack of regularity implies an energy increment. Since the energy is bounded above, regularity must appear after finitely many refinements of the partition.

Given a real matrix $A_{m \times n} = (A_{ij})$, define its energy as
\[ E(A) := \sum_{i,j} A_{ij}^2 \ge 0. \]
Obviously, if the entries of $A$ are bounded by $r$ in absolute value, then a trivial upper bound is $E(A) \le mnr^2$. In particular, if $A$ is the $n \times n$ adjacency matrix of some graph $G$, then $E(A) \le n^2$.

For every matrix $A_{m \times n} = (A_{ij})$, let
\[ d(A) := \frac{\sum_{i,j} A_{ij}}{mn} \]
be the density of $A$.
Construct a new matrix $B_{m \times n} := (B_{ij})$ where $B_{ij} = d(A)$ for all $1 \le i \le m$ and $1 \le j \le n$. This new matrix can be viewed as a smoothing of the original matrix $A$. Let $\pi$ be the operator that maps every matrix $A$ to its smoothing $\pi(A) = B$ as just defined. Note that $\pi(A)$ is a constant matrix determined by $d(A)$, hence the smoothing of $A$ is unique.

Conversely, $A$ can be viewed as a mixing of $\pi(A)$. Note, however, that $\pi(A)$ can have different mixings: for two matrices $A \ne B$ of the same size, it is possible that $\pi(A) = \pi(B) = C$, namely whenever $d(A) = d(B)$. In that case $A$ and $B$ are smoothed to the same matrix $C$, so either $A$ or $B$ can be viewed as a mixed version of $C$.

Here is a useful fact. It says that the energy decreases as we smooth a matrix; conversely, the energy increases as we mix a matrix.

Lemma 2. For every matrix $A_{m \times n} = (A_{ij})$, we have
\[ E(A) - E(\pi(A)) = \sum_{i,j} (A_{ij} - d(A))^2 = E(A - \pi(A)). \]
In particular, we have $E(\pi(A)) \le E(A)$.

Proof. It is a direct calculation: expanding the square and using $\sum_{i,j} A_{ij} = mn\,d(A)$ gives
\[ \sum_{i,j} (A_{ij} - d(A))^2 = E(A) - 2mn\,d(A)^2 + mn\,d(A)^2 = E(A) - mn\,d(A)^2 = E(A) - E(\pi(A)). \]
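The identity in Lemma 2 can also be checked numerically on a small example. The following Python sketch uses exact rational arithmetic; the helper names `energy`, `density`, `smooth` and `subtract` are illustrative choices, not notation from the note.

```python
from fractions import Fraction


def energy(A):
    """E(A): sum of the squares of all entries."""
    return sum(x * x for row in A for x in row)


def density(A):
    """d(A): average of all entries, as an exact fraction."""
    m, n = len(A), len(A[0])
    return Fraction(sum(sum(row) for row in A), m * n)


def smooth(A):
    """pi(A): the constant matrix whose entries all equal d(A)."""
    d = density(A)
    return [[d for _ in row] for row in A]


def subtract(A, B):
    """Entrywise difference A - B of two matrices of the same shape."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 0, 1],
     [0, 0, 1]]
lhs = energy(A) - energy(smooth(A))      # E(A) - E(pi(A))
rhs = energy(subtract(A, smooth(A)))     # E(A - pi(A))
```

For this matrix $d(A) = 1/2$, $E(A) = 3$ and $E(\pi(A)) = 6 \cdot \frac{1}{4} = \frac{3}{2}$, so both sides equal $\frac{3}{2}$.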

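Let $\mathcal{M}$ be a partition of $[m]$ and $\mathcal{N}$ be a partition of $[n]$. For $S \in \mathcal{M}$ and $T \in \mathcal{N}$ (so that $S \subseteq [m]$ and $T \subseteq [n]$), let $A_{S \times T}$ be the corresponding submatrix of $A$. Thus $\mathcal{M}$ and $\mathcal{N}$ together define a partition of the matrix $A$ into blocks. We denote this partition by $P_{\mathcal{M},\mathcal{N}} := (\mathcal{M}, \mathcal{N})$, and we also write $P_{\mathcal{M},\mathcal{N}} := \{A_{S \times T} : S \in \mathcal{M}, T \in \mathcal{N}\}$ for its set of blocks. Smooth every submatrix $A_{S \times T}$ to $\pi(A_{S \times T})$, and denote the resulting blockwise-smoothed version of $A$ by $P_{\mathcal{M},\mathcal{N}}(A)$.

There are two extreme cases. If $\mathcal{M} = \{[m]\}$ and $\mathcal{N} = \{[n]\}$, i.e., the matrix is not partitioned, then $P_{\mathcal{M},\mathcal{N}}(A) = \pi(A)$ is just the smoothing of $A$. If $\mathcal{M} = \{\{1\}, \{2\}, \ldots, \{m\}\}$ and $\mathcal{N} = \{\{1\}, \{2\}, \ldots, \{n\}\}$, i.e., the matrix is partitioned into singletons, then $P_{\mathcal{M},\mathcal{N}}(A) = A$, since the smoothing of each singleton is just itself.

Recall that given a partition $P_{\mathcal{M},\mathcal{N}} = (\mathcal{M}, \mathcal{N})$ of the matrix $A$, the matrix $P_{\mathcal{M},\mathcal{N}}(A)$ is a blockwise-smoothed version of $A$. Denote its entries by $p_{ij}$, i.e., $P_{\mathcal{M},\mathcal{N}}(A) = (p_{ij})$. Given another partition $P_{\mathcal{M}',\mathcal{N}'} = (\mathcal{M}', \mathcal{N}')$ that is a refinement of $P_{\mathcal{M},\mathcal{N}}$, denote its entries by $p'_{ij}$, i.e., $P_{\mathcal{M}',\mathcal{N}'}(A) = (p'_{ij})$. A useful generalization of the above lemma is the following.

Lemma 3. For every matrix $A_{m \times n} = (A_{ij})$, and any two partitions $P_{\mathcal{M},\mathcal{N}}$ and $P_{\mathcal{M}',\mathcal{N}'}$ of the matrix $A$ where $P_{\mathcal{M}',\mathcal{N}'}$ is a refinement of $P_{\mathcal{M},\mathcal{N}}$, we have
\[ E(P_{\mathcal{M}',\mathcal{N}'}(A)) - E(P_{\mathcal{M},\mathcal{N}}(A)) = \sum_{i,j} (p'_{ij} - p_{ij})^2 = E(P_{\mathcal{M}',\mathcal{N}'}(A) - P_{\mathcal{M},\mathcal{N}}(A)). \]
In particular we have
\[ E(\pi(A)) \le E(P_{\mathcal{M},\mathcal{N}}(A)) \le E(P_{\mathcal{M}',\mathcal{N}'}(A)) \le E(A). \]

Proof. View each block (that is, a submatrix) of $P_{\mathcal{M},\mathcal{N}}$ as an individual matrix. On each such block, $P_{\mathcal{M},\mathcal{N}}(A)$ is the smoothing of $P_{\mathcal{M}',\mathcal{N}'}(A)$, so the refinement is just a mixing of each block. Apply the previous lemma to each block and sum over all blocks.

Observe also that we have $\pi(P_{\mathcal{M},\mathcal{N}}(A)) = \pi(P_{\mathcal{M}',\mathcal{N}'}(A)) = \pi(A)$. That is, $P_{\mathcal{M},\mathcal{N}}(A)$, $P_{\mathcal{M}',\mathcal{N}'}(A)$ and $A$ can all be viewed as mixed versions of $\pi(A)$; the difference is that they are successively finer mixings.

Definition 3 ($\epsilon$-regularity of a matrix). Say a matrix $A_{m \times n}$ is $\epsilon$-regular if for any $S \subseteq [m]$, $T \subseteq [n]$ with $|S| \ge \epsilon m$, $|T| \ge \epsilon n$, we have
\[ |d(A_{S \times T}) - d(A)| \le \epsilon. \]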
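For very small matrices, Definition 3 can be tested directly by enumerating all admissible pairs $(S, T)$. The sketch below does this in Python with exact fractions. It takes exponential time and is meant only to make the definition concrete; the function names are hypothetical, and indices are 0-based rather than 1-based.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def density(A, rows, cols):
    """d(A_{S x T}) for row index set `rows` and column index set `cols`."""
    total = sum(A[i][j] for i in rows for j in cols)
    return Fraction(total, len(rows) * len(cols))


def subsets_of_size_at_least(universe, min_size):
    """Yield all non-empty subsets of `universe` with at least `min_size` elements."""
    for size in range(max(min_size, 1), len(universe) + 1):
        yield from combinations(universe, size)


def irregularity_witness(A, eps):
    """Return (S, T) violating Definition 3, or None if A is eps-regular."""
    eps = Fraction(eps)
    m, n = len(A), len(A[0])
    d = density(A, range(m), range(n))
    for S in subsets_of_size_at_least(range(m), ceil(eps * m)):
        for T in subsets_of_size_at_least(range(n), ceil(eps * n)):
            if abs(density(A, S, T) - d) > eps:
                return S, T
    return None


# A block matrix of density 1/2 whose top-left entry already deviates by 1/2.
block = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
witness_quarter = irregularity_witness(block, Fraction(1, 4))  # ((0,), (0,))
witness_half = irregularity_witness(block, Fraction(1, 2))     # None
```

The same matrix is irregular for $\epsilon = 1/4$ but regular for $\epsilon = 1/2$, because no submatrix density of a 0/1 matrix can differ from $d(A) = 1/2$ by more than $1/2$.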
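The following is important.

Lemma 4 (Lack of regularity implies bounded energy increment). Suppose $A_{m \times n}$ is not $\epsilon$-regular. Then there is a partition $P_{\mathcal{M},\mathcal{N}}$ of $A$ such that
\[ E(P_{\mathcal{M},\mathcal{N}}(A)) - E(\pi(A)) > mn\epsilon^4. \]

Proof. As $A$ is not $\epsilon$-regular, there exist $S \subseteq [m]$, $T \subseteq [n]$ with $|S| \ge \epsilon m$, $|T| \ge \epsilon n$ such that
\[ |d(A_{S \times T}) - d(A)| > \epsilon. \]
Define a partition $P_{\mathcal{M},\mathcal{N}} := (\mathcal{M}, \mathcal{N})$ by $\mathcal{M} = \{S, [m] \setminus S\}$ and $\mathcal{N} = \{T, [n] \setminus T\}$. Applying Lemma 3, we have
\[ E(P_{\mathcal{M},\mathcal{N}}(A)) - E(\pi(A)) = \sum_{i,j} (p_{ij} - d(A))^2 \ge \sum_{i \in S, j \in T} (p_{ij} - d(A))^2 = \sum_{i \in S, j \in T} (d(A_{S \times T}) - d(A))^2 > \epsilon^2 |S|\,|T| \ge mn\epsilon^4, \]
as desired.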
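The energy increment in Lemma 4 can be computed explicitly for a given witness pair $(S, T)$. The Python sketch below builds the two-by-two partition from the proof and compares the increment with $mn\epsilon^4$. The function names and the 0-based index sets are illustrative assumptions.

```python
from fractions import Fraction


def block_density(A, rows, cols):
    """Average entry of the submatrix of A on rows x cols."""
    total = sum(A[i][j] for i in rows for j in cols)
    return Fraction(total, len(rows) * len(cols))


def blockwise_energy(A, row_parts, col_parts):
    """Energy of the blockwise-smoothed matrix for the given partition."""
    total = Fraction(0)
    for S in row_parts:
        for T in col_parts:
            if S and T:  # an empty part contributes nothing
                d = block_density(A, S, T)
                total += d * d * len(S) * len(T)
    return total


def energy_increment(A, S, T):
    """E(P(A)) - E(pi(A)) for the partition {S, complement} x {T, complement}."""
    m, n = len(A), len(A[0])
    S, T = list(S), list(T)
    S_rest = [i for i in range(m) if i not in S]
    T_rest = [j for j in range(n) if j not in T]
    refined = blockwise_energy(A, [S, S_rest], [T, T_rest])
    smoothed = blockwise_energy(A, [list(range(m))], [list(range(n))])
    return refined - smoothed


block = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
eps = Fraction(1, 4)
# S = T = {0, 1} has density 1 while d(A) = 1/2, so it witnesses irregularity.
increment = energy_increment(block, [0, 1], [0, 1])
lower_bound = len(block) * len(block[0]) * eps ** 4
```

Here the refined energy is $8$ and $E(\pi(A)) = 4$, so the increment is $4$, comfortably above $mn\epsilon^4 = 1/16$.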

Definition 4 ($\epsilon$-regular partition of a matrix). Say a partition $P_{\mathcal{M},\mathcal{N}}$ of a matrix $A_{m \times n}$ is $\epsilon$-regular if all except at most $\epsilon |\mathcal{M}|\,|\mathcal{N}|$ pairs $(S, T) \in \mathcal{M} \times \mathcal{N}$ satisfy that $A_{S \times T}$ is $\epsilon$-regular.

As a corollary of Lemma 4 we have the following.

Lemma 5 (Lack of regularity implies bounded energy increment, 2). Suppose an equipartition $P_{\mathcal{M},\mathcal{N}}$ of a matrix $A_{m \times n}$ is not $\epsilon$-regular. Then there is a refined equipartition $P_{\mathcal{M}',\mathcal{N}'}$ such that
\[ E(P_{\mathcal{M}',\mathcal{N}'}(A)) - E(P_{\mathcal{M},\mathcal{N}}(A)) > mn\epsilon^5. \]

Proof. Assume $|\mathcal{M}| = k$ and $|\mathcal{N}| = l$, that is, the rows and columns of $A$ are partitioned equally into $k$ and $l$ parts, respectively. As $P_{\mathcal{M},\mathcal{N}}$ is not $\epsilon$-regular, there exist more than $\epsilon kl$ pairs $(S, T) \in \mathcal{M} \times \mathcal{N}$ such that $A_{S \times T}$ is not $\epsilon$-regular. By Lemma 4, we can partition each such block $A_{S \times T}$ so that this block has energy increment
\[ > |S|\,|T|\,\epsilon^4 = \frac{mn}{kl}\epsilon^4. \]
Define the refined partition $P_{\mathcal{M}',\mathcal{N}'}$ of the whole matrix $A$ to be the common refinement of the partitions of all these blocks; if it is not an equipartition, refine it further to make it into an equipartition. Observe that the effect of $P_{\mathcal{M}',\mathcal{N}'}$ on each block is an even finer partition than the one used to achieve the energy increment above, so by Lemma 3 the energy can only increase even more. Hence by Lemma 3 and Lemma 4 we have
\[ E(P_{\mathcal{M}',\mathcal{N}'}(A)) - E(P_{\mathcal{M},\mathcal{N}}(A)) > (\epsilon kl) \cdot \left(\frac{mn}{kl}\epsilon^4\right) = mn\epsilon^5, \]
as desired.

We have seen that refining an irregular partition increases the energy by an amount bounded from below. But the energy is trivially bounded above by $mnr^2$ if all the entries are bounded by $r$. Hence the refining process must terminate, meaning that regularity of the partition must be achieved after a finite number of steps. Now we can give a sketch of the proof of the regularity lemma.

Proof. Let $A_{n \times n}$ be the adjacency matrix of a graph $G$. Start with an equipartition of $A$ into $k = \lceil 1/\epsilon \rceil$ parts and repeat the refinement process. Since $E(A) \le n^2$ and every refinement step increases the energy by more than $n^2\epsilon^5$, regularity of the partition is achieved in a finite number of steps, specifically in at most $n^2/(n^2\epsilon^5) = 1/\epsilon^5$ steps.
It should be noted that to apply Lemma 5 we also have to control the size of the refined equipartition, i.e., it cannot be too large. It can be shown that from a $k$-equipartition $P_{\mathcal{M},\mathcal{N}}$, one can take an equipartition $P_{\mathcal{M}',\mathcal{N}'}$ into at most $8^k$ parts that achieves the energy increment bound in Lemma 5. Hence eventually we reach an $\epsilon$-regular $l$-equipartition, where $l$ is bounded above by a tower of exponentials of the form $k \mapsto 8^k$ with at most $1/\epsilon^5$ levels.
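The bookkeeping behind this bound can be sketched in a few lines of Python. The code assumes, as in the proof sketch, an initial equipartition into $\lceil 1/\epsilon \rceil$ parts (the ceiling is an assumption made here to get an integer), at most $1/\epsilon^5$ refinement steps, and growth $k \mapsto 8^k$ per step. The function names are hypothetical, and only the first few levels are computed because the values grow as a tower.

```python
from fractions import Fraction
from math import ceil, floor


def max_refinement_steps(eps):
    """Upper bound 1/eps^5 on the number of refinement steps."""
    return floor(1 / Fraction(eps) ** 5)


def part_count_bounds(eps, levels):
    """Bounds on the number of parts after 0, 1, ..., `levels` refinements."""
    k = ceil(1 / Fraction(eps))   # size of the initial equipartition
    bounds = [k]
    for _ in range(levels):
        k = 8 ** k                # one refinement: at most 8^k parts
        bounds.append(k)
    return bounds


eps = Fraction(1, 2)
steps = max_refinement_steps(eps)          # 32
first_bounds = part_count_bounds(eps, 2)   # [2, 64, 8**64]
```

Even for $\epsilon = 1/2$, two refinements already allow $8^{64}$ parts, and the full bound iterates this map up to 32 times, which illustrates why $T(\epsilon)$ is of tower type.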