An Application of First-Order Logic to a Problem in Combinatorics [1]

I. The Combinatorial Background.

A. Families of objects.

In combinatorics, one frequently encounters a set of objects in which (a) each object has an order $n$ ($n = 0, 1, 2, \ldots$) and (b) for each $n$, there is a finite number $f(n)$ of objects of that order. Here are three examples; the setting of this handout will be the third one.

Example 1. Let $S_n$ be the set of permutations of the set $[n] := \{1, \ldots, n\}$ (that is, the set of bijections from $[n]$ onto $[n]$). It is easy to see that $f(n) = n!$ for $n \ge 1$, and an argument can be made that one should put $f(0) := 1$.

Example 2. For $n \ge 1$, let $\Pi_n$ be the set of all partitions of the set $[n]$; each of these can be identified with the corresponding equivalence relation on $[n]$. By direct listing, we get
\[
\Pi_3 = \bigl\{\, [\{1,2,3\}],\ [\{1,2\}\{3\}],\ [\{1,3\}\{2\}],\ [\{2,3\}\{1\}],\ [\{1\}\{2\}\{3\}] \,\bigr\},
\]
so that $f(3) = 5$. In this case, there is a very strong argument for putting $f(0) := 1$.

Example 3. Informally, a labeled simple graph with vertex set $[n]$ is a drawing consisting of $n$ dots, labeled $1, 2, \ldots, n$, and of line segments or curves (called edges) joining some of the pairs of dots. For $n \ge 1$, let $G_n$ be the set of labeled simple graphs with vertex set $[n]$; a moment's thought will show that
\[
f(n) = 2^{\binom{n}{2}} = 2^{n(n-1)/2} \quad (n \ge 1); \tag{1}
\]
usually one puts $f(0) := 0$.

Exercise 1. Prove (1).

Exercise 2. Draw all eight graphs in $G_3$.

B. Properties in families.

Sometimes, you want to count how many objects of order $n$ have some special property. I will posit a possible property for each of the three examples.

In $S_n$, you might want to count the fixed-point-free permutations; [2] that is, the permutations $\sigma \in S_n$ that satisfy
\[
A_1:\ (\forall i)(\sigma(i) \ne i). \tag{2}
\]
In $\Pi_n$, you might want to count the singleton-free partitions; that is, the partitions $\pi \in \Pi_n$ that satisfy
\[
A_2:\ (\forall i)(\exists j)\bigl((i \ne j) \wedge (i \sim_\pi j)\bigr), \tag{3}
\]
where $i \sim_\pi j$ means that $i$ and $j$ lie in the same class of $\pi$. In $G_n$, you might want to count the isolated-point-free graphs; that is, the graphs $g \in G_n$ that satisfy
\[
A_3:\ (\forall i)(\exists j)(\mathrm{edge}(i, j)), \tag{4}
\]
where $\mathrm{edge}(i, j)$ means that in $g$ there is an edge connecting vertices $i$ and $j$.

Observe that each of the properties above has been expressed by a suitably interpreted first-order wf over $\mathbb{N}$; such a property is dubbed, naturally enough, a first-order property. Not every property is first-order; for example consider, for $\sigma \in S_n$, the possible property of being reducible [3] (that you can partition $[n]$ into 2 or more classes such that $\sigma$ separately permutes the integers in each class). I doubt that the property of reducibility can be expressed by a first-order wf; it certainly doesn't smell first-order, defined as it is in terms of subsets of $[n]$. This handout will deal only with first-order properties.

[1] This handout is a greatly expanded version of notes I took at a talk given by Peter Winkler at Swarthmore College in the early 90's.
[2] This is the famous hatcheck problem.
[3] This is not a standard term; the standard description would be that $\sigma$ is not an $n$-cycle.

C. The probability of encountering a property.

Let $A$ be a first-order property in a family of objects (that is, a closed first-order wf that, when interpreted, specifies a property in the family). Put
\[
\operatorname{limprob}(A) := \lim_{n\to\infty} \frac{\text{number of objects of order } n \text{ with property } A}{f(n)}, \tag{5}
\]
if this limit exists; $\operatorname{limprob}(A)$ has a pretty clear right to be named the probability that a randomly chosen object has property $A$.

Consider the wfs in equations (2), (3) and (4). It is well known that $\operatorname{limprob}(A_1) = \frac{1}{e}$; I do not know the value of $\operatorname{limprob}(A_2)$, or whether this limit even exists; [4] and as for $\operatorname{limprob}(A_3)$, you will be able to find this for yourself using the mathematics presented in this handout (see Exercise 17).

II. A Quick Review of Some Probability Theory.

In an experiment in which different outcomes are possible, the sample space $\Omega$ is the set of all possible outcomes to the experiment; we will restrict ourselves to the case in which $\Omega$ is finite or denumerable. To each outcome $\omega \in \Omega$, we assign a probability $p(\omega) \ge 0$ in such a way that [5]
\[
\sum_{\omega \in \Omega} p(\omega) = 1.
\]
An event is a subset of $\Omega$. Each event $A$ also is assigned a probability $P(A)$:
\[
P(A) := \sum_{\omega \in A} p(\omega) \ \text{ if } A \ne \emptyset; \qquad P(\emptyset) = 0.
\]
Two events $A$ and $B$ are independent if
\[
P(A \cap B) = P(A)P(B);
\]
events $A$ and $B$ will be independent whenever the occurrence of one of them has no influence over whether the other one occurs.

Exercise 3. Prove that the empty event is independent of every event $A$.

A random variable is a numerical function defined on a sample space; for each number $x$, we use the notation $X = x$ to denote the (possibly empty) event $\{\omega \in \Omega : X(\omega) = x\}$. Random variables $X$ and $Y$ (defined on the same sample space) are independent if the events $X = x$ and $Y = y$ are independent events for all $x$ and $y$.
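The definition of independence can be checked by direct computation on a tiny sample space. The following is a minimal sketch, assuming a sample space of two fair coin flips; the names `omega`, `prob`, and `independent` are illustrative choices and do not come from the handout.

```python
from fractions import Fraction
from itertools import product

# Sample space: the four outcomes of two fair coin flips, each with p = 1/4.
omega = list(product("HT", repeat=2))
p = {w: Fraction(1, len(omega)) for w in omega}

def prob(event):
    """P(A): the sum of p(w) over the outcomes w in the event A (0 if A is empty)."""
    return sum((p[w] for w in event), Fraction(0))

def independent(a, b):
    """Events a and b are independent when P(a & b) == P(a) * P(b)."""
    return prob(a & b) == prob(a) * prob(b)

first_heads = {w for w in omega if w[0] == "H"}
second_heads = {w for w in omega if w[1] == "H"}
at_least_one_head = {w for w in omega if "H" in w}

print(independent(first_heads, second_heads))       # separate flips
print(independent(first_heads, at_least_one_head))  # overlapping information
print(independent(set(), first_heads))              # the empty event
```

Exact fractions are used so that the equality test in `independent` is not disturbed by floating-point rounding.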
The distribution of a random variable $X$ is the (possibly infinite) table
\[
\begin{array}{c|c}
x_1 & p_1 = P(X = x_1) \\
x_2 & p_2 = P(X = x_2) \\
x_3 & p_3 = P(X = x_3) \\
x_4 & p_4 = P(X = x_4) \\
\vdots & \vdots
\end{array}
\]
(where $\{x_1, x_2, x_3, x_4, \ldots\}$ are the values that $X$ can assume).

[4] But I may have an approach for finding this out.
[5] It is impossible to proceed in this fashion if $\Omega$ is not denumerable.
The expected value of a random variable $X$ is the number
\[
E[X] := \sum_{k \ge 1} x_k p_k; \tag{6}
\]
it is easy to confirm the alternative formula
\[
E[X] = \sum_{\omega \in \Omega} X(\omega)p(\omega), \tag{7}
\]
in which the sum runs over the elements of the sample space.

Exercise 4. Prove (7) from (6).

If $X$ and $Y$ are two random variables on the same sample space, then one defines their sum $X + Y$ by
\[
(X + Y)(\omega) := X(\omega) + Y(\omega);
\]
clearly $X + Y$ is a third random variable on this sample space.

Exercise 5. Using equation (7) and the definition of $X + Y$, show that $E[X + Y] = E[X] + E[Y]$. (Note that this is true regardless of whether or not $X$ and $Y$ are independent!)

III. Random Graphs and the Statement of Fagin's Theorem.

You completely specify a graph $g \in G_n$ by stating whether each of the $\binom{n}{2}$ possible edges is present or absent; if you do this by flipping a fair coin $\binom{n}{2}$ times and filling in an edge exactly when the corresponding flip comes up heads, you are conducting a probabilistic experiment, in which $\Omega$ is the set of all $2^{\binom{n}{2}}$ possible sequences of coin flips, with $p(\omega) = 1/2^{\binom{n}{2}}$ for each sequence $\omega \in \Omega$. Any first-order property $A$ in the family of graphs corresponds to an event $A_n$ in this sample space:
\[
A_n := \{\omega \in \Omega : \text{the graph } g_\omega \text{ constructed by } \omega \text{ has property } A\}.
\]
Moreover, since all sequences are equally likely, all graphs in $G_n$ have the same chance of being constructed. This implies that
\[
P(A_n) = \frac{\text{the number of graphs in } G_n \text{ with property } A}{2^{\binom{n}{2}}},
\]
which together with (5) implies that
\[
\operatorname{limprob}(A) = \lim_{n\to\infty} P(A_n).
\]
In this way, we can use the power of probability theory to help evaluate $\operatorname{limprob}(A)$.

This insight is due to Paul Erdős and Alfréd Rényi, who, with the help of probability theory, calculated $\operatorname{limprob}(A)$ for a number of first-order properties $A$ (in the 1950's, I think). [6] The answers they got all turned out to be either 0 or 1, and they conjectured that this would always be the case. In 1976, Ronald Fagin proved this conjecture:

Theorem (Fagin, 1976). For every first-order property $A$ of graphs, either $\operatorname{limprob}(A) = 1$ or $\operatorname{limprob}(A) = 0$.
The remainder of this handout is an account of how Fagin proved this result.

[6] I don't think that they singled out the class of first-order properties or even used the term.
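For small $n$, the coin-flip experiment of Section III can be carried out exhaustively. The sketch below is only an illustration under that reading of the text: it enumerates every sequence of $\binom{n}{2}$ flips, builds the corresponding graph, and reports $P(A_n)$ for the isolated-point-free property $A_3$ of equation (4). The function names are hypothetical, and brute force is practical only for very small $n$.

```python
from fractions import Fraction
from itertools import combinations, product

def all_graphs(n):
    """Yield every labeled simple graph on vertices 1..n as a set of edges.

    Each graph corresponds to exactly one sequence of C(n, 2) coin flips:
    an edge is present exactly when its flip comes up heads (True).
    """
    possible_edges = list(combinations(range(1, n + 1), 2))
    for flips in product([False, True], repeat=len(possible_edges)):
        yield {e for e, heads in zip(possible_edges, flips) if heads}

def isolated_point_free(n, edges):
    """Property A_3: every vertex i has some vertex j with edge(i, j)."""
    touched = {v for e in edges for v in e}
    return touched == set(range(1, n + 1))

def prob_of_property(n, has_property):
    """P(A_n): the fraction of the 2^C(n,2) equally likely graphs with the property."""
    graphs = list(all_graphs(n))
    good = sum(1 for g in graphs if has_property(n, g))
    return Fraction(good, len(graphs))

for n in range(1, 6):
    print(n, prob_of_property(n, isolated_point_free))
```

Any other property of graphs can be tested the same way by passing a different predicate to `prob_of_property`.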
IV. The First-Order System.

The wfs that define first-order properties in this family will all be drawn from the following first-order system (call it $GT$). The first-order language will have no individual constants or function letters; the formal variables will be $\{x_1, x_2, \ldots ; y_1, y_2, \ldots ; z\}$. Since $GT$ will turn out to be a first-order system with equality, the language will include the predicate letter $A^2_1$ (nickname "$=$"); in addition, there will be a predicate letter $A^2_2$ (nickname "edge"). The axioms will include the Predicate Calculus axioms K1-K6 and the three equality axioms; in addition, there will be a few graph theory axioms, such as (I suppose)
\[
(\forall x_1)(\forall y_1)\bigl(\mathrm{edge}(x_1, y_1) \rightarrow \neg(x_1 = y_1)\bigr)
\]
and
\[
(\forall x_1)(\forall y_1)\bigl(\mathrm{edge}(x_1, y_1) \rightarrow \mathrm{edge}(y_1, x_1)\bigr).
\]
Clearly, any graph $g \in G_n$ can be viewed as a model of $GT$, which I will call $I_g$. $I_g$ is defined in the natural way: $D_{I_g}$ will be the vertex set $[n]$; $A^2_1(k, l)$ will be true if and only if $k = l$; and $\mathrm{edge}(k, l)$ will be true if and only if there is an edge in $g$ from vertex $k$ to vertex $l$.

V. The Alice's Restaurant Predicates.

The good idea that opened up an approach to the proof was to single out the family of Alice's Restaurant wfs: for each $i \ge 1$ and $j \ge 1$, let $B_{i,j}$ be the closed wf
\[
B_{i,j}:\ (\forall x_1)\cdots(\forall x_i)(\forall y_1)\cdots(\forall y_j)(\exists z)\Bigl( \bigl(\neg(x_1 = y_1) \wedge \cdots \wedge \neg(x_i = y_j)\bigr) \rightarrow \underbrace{\bigl(\mathrm{edge}(z, x_1) \wedge \cdots \wedge \mathrm{edge}(z, x_i) \wedge \neg\mathrm{edge}(z, y_1) \wedge \cdots \wedge \neg\mathrm{edge}(z, y_j)\bigr)}_{(*)} \Bigr).
\]
The name for these wfs comes of course from the Arlo Guthrie song; [7] it was chosen because of the observation that a graph $g$ with property $B_{i,j}$ (that is, such that $I_g \models B_{i,j}$) gives you almost anything you want, to wit: for any choice of sets $S \subseteq [n]$ and $T \subseteq [n]$, where $|S| = i$, $|T| = j$ and $S \cap T = \emptyset$, $g$ can supply a vertex $z$ that is adjacent [8] to all the vertices in $S$ and nonadjacent to any of the vertices in $T$. As you will see, the proof turns on the properties of the wfs $\{B_{i,j}\}$.

VI. The First Step of the Proof.

Lemma 1. For each $i \ge 1$ and $j \ge 1$, $\operatorname{limprob}(B_{i,j}) = 1$.

Proof. First: Fix $i, j \ge 1$ and $n \ge (i + j + 1)$.
Choose from $[n]$: an $i$-element set $S = \{x_1, \ldots, x_i\}$; a $j$-element set $T = \{y_1, \ldots, y_j\}$ disjoint from $S$; and an element $z \in [n] - (S \cup T)$. Call the choice above $C(S, T, z)$. Let us say that a graph $g \in G_n$ succeeds with respect to $C(S, T, z)$ if, in $g$, there are edges from $z$ to all of the $x$'s and there are no edges from $z$ to any of the $y$'s; that is, if part $(*)$ of $B_{i,j}$ (to the right of the quantifiers) interprets to a true statement in $I_g$:
\[
\bigl(\mathrm{edge}(z, x_1) \wedge \cdots \wedge \mathrm{edge}(z, x_i) \wedge \neg\mathrm{edge}(z, y_1) \wedge \cdots \wedge \neg\mathrm{edge}(z, y_j)\bigr).
\]

[7] My guess is that this is Winkler's own coinage; I can't imagine that Fagin was so flip in a research article.
[8] Two vertices are adjacent if there is an edge between them.
Exercise 6. Express the condition "$g$ succeeds with respect to $C(S, T, z)$" as a condition on the valuations in $I_g$.

Exercise 7. Prove that the number of graphs $g \in G_n$ that succeed with respect to $C(S, T, z)$ is exactly $2^{\binom{n}{2} - (i+j)}$.

It follows from Exercise 7 that in $\Omega$,
\[
P(\{\omega : g_\omega \text{ succeeds with respect to } C(S, T, z)\}) = \frac{2^{\binom{n}{2} - (i+j)}}{2^{\binom{n}{2}}} = \left(\frac{1}{2}\right)^{i+j},
\]
so that
\[
P(\{\omega : g_\omega \text{ fails with respect to } C(S, T, z)\}) = 1 - \left(\frac{1}{2}\right)^{i+j}. \tag{7}
\]
It will help to have a name for this last quantity. Put
\[
\alpha := 1 - \left(\frac{1}{2}\right)^{i+j},
\]
so that (7) can be written
\[
P(\{\omega : g_\omega \text{ fails with respect to } C(S, T, z)\}) = \alpha. \tag{7$'$}
\]

Second: passing to the negation. The reason for writing down (7$'$) is this: the natural next task would be to track what happens to the probabilities when we put the quantifiers back in front of $(*)$ to rebuild $B_{i,j}$; what the proof actually does is to track probabilities while putting quantifiers in front of $\neg(*)$ to build $\neg B_{i,j}$. To make the rest of the argument easier to follow, I will construct this wf:
\[
\neg B_{i,j}:\ (\exists x_1)\cdots(\exists x_i)(\exists y_1)\cdots(\exists y_j)(\forall z)\Bigl( \bigl(\neg(x_1 = y_1) \wedge \cdots \wedge \neg(x_i = y_j)\bigr) \wedge \neg\underbrace{\bigl(\mathrm{edge}(z, x_1) \wedge \cdots \wedge \mathrm{edge}(z, x_i) \wedge \neg\mathrm{edge}(z, y_1) \wedge \cdots \wedge \neg\mathrm{edge}(z, y_j)\bigr)}_{(*)} \Bigr).
\]

Third: incorporating the $(\forall z)$. Consider what happens when you fix $S$ and $T$ but vary the choice of $z$. This gives rise to $n - (i + j)$ distinct events
\[
\{\omega \in \Omega : g_\omega \text{ fails with respect to } C(S, T, z_t)\} \qquad (1 \le t \le n - (i + j)).
\]
Each of these events clearly has probability $\alpha$ of occurring; and since they depend upon nonoverlapping sets of edges, they are all independent of each other. This implies that the probability that all of these events occur is $\alpha^{n-(i+j)}$. In short, for fixed $S$ and $T$:
\[
P(\{\omega : g_\omega \text{ fails with respect to } C(S, T, z) \text{ for EVERY } z \notin S \cup T\}) = \alpha^{n-(i+j)}. \tag{8}
\]
It will help to have a name for this event. Put
\[
\mathrm{FAIL}(S, T) := \{\omega : g_\omega \text{ fails with respect to } C(S, T, z) \text{ for EVERY } z \notin S \cup T\},
\]
so that (8) can be written
\[
P(\mathrm{FAIL}(S, T)) = \alpha^{n-(i+j)}. \tag{8$'$}
\]

Fourth: incorporating the existential quantifiers. In order to do this, we will analyze the behavior of some random variables on $\Omega$ associated with the events $\{\mathrm{FAIL}(S, T)\}$.
For each possible choice of $S$ and $T$, put
\[
Y_{S,T}(\omega) := \begin{cases} 1, & \text{if } \omega \in \mathrm{FAIL}(S, T); \\ 0, & \text{else.} \end{cases}
\]
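As a sanity check on (8$'$), the event $\mathrm{FAIL}(S, T)$ and its indicator $Y_{S,T}$ can be evaluated by brute force on a small vertex set. This is a sketch under assumed parameters ($n = 5$, $S = \{1\}$, $T = \{2\}$, so $i = j = 1$); the helper names are illustrative, and the enumeration is a numerical check only, not a substitute for the independence argument above.

```python
from fractions import Fraction
from itertools import combinations, product

def all_graphs(n):
    """Yield every labeled simple graph on vertices 1..n as a set of frozenset edges."""
    possible_edges = list(combinations(range(1, n + 1), 2))
    for flips in product([False, True], repeat=len(possible_edges)):
        yield {frozenset(e) for e, heads in zip(possible_edges, flips) if heads}

def succeeds(g, S, T, z):
    """True when z is adjacent to every vertex of S and to no vertex of T."""
    return (all(frozenset((z, x)) in g for x in S)
            and not any(frozenset((z, y)) in g for y in T))

def in_fail(g, n, S, T):
    """True when g fails with respect to C(S, T, z) for EVERY z outside S and T."""
    others = set(range(1, n + 1)) - S - T
    return all(not succeeds(g, S, T, z) for z in others)

def Y(g, n, S, T):
    """Indicator random variable of the event FAIL(S, T)."""
    return 1 if in_fail(g, n, S, T) else 0

# Assumed example parameters: n = 5, i = j = 1.
n, S, T = 5, {1}, {2}
graphs = list(all_graphs(n))

# All graphs are equally likely, so E[Y] is the average of Y over all graphs.
expected_Y = Fraction(sum(Y(g, n, S, T) for g in graphs), len(graphs))
alpha = 1 - Fraction(1, 2) ** (len(S) + len(T))

print(expected_Y, alpha ** (n - len(S) - len(T)))
```

The two printed values should agree, which is what (8$'$) predicts for the expectation of an indicator variable.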
Exercise 8. Write down the distribution of $Y_{S,T}$, and show that $E[Y_{S,T}] = \alpha^{n-(i+j)}$.

Exercise 9. Show that there are exactly $\binom{n}{i}\binom{n-i}{j}$ of the $Y_{S,T}$'s.

We need one more random variable: the sum of all of the $Y_{S,T}$'s. Put
\[
X_{i,j} = \sum_{\substack{S \subseteq [n] \\ |S| = i}} \ \sum_{\substack{T \subseteq [n] \\ |T| = j \\ T \cap S = \emptyset}} Y_{S,T}. \tag{9}
\]

Exercise 10. Show, for any $\omega \in \Omega$, that $X_{i,j}(\omega)$ counts how many of the events $\{\mathrm{FAIL}(S, T)\}$ contain $\omega$.

It follows from Exercises 5, 8 and 9 that
\[
E[X_{i,j}] = \binom{n}{i}\binom{n-i}{j}\,\alpha^{n-(i+j)} = \frac{1}{\alpha^{i+j}}\binom{n}{i}\binom{n-i}{j}\,\alpha^{n}. \tag{10}
\]
Let us examine (10) more closely. Observe that for a fixed positive integer $k$, the function
\[
g(n) = \binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}
\]
is a polynomial of degree $k$, so that $h(n) = \binom{n}{i}\binom{n-i}{j}$ is a polynomial in $n$ of degree $(i + j)$. Since $0 < \alpha < 1$, it follows that
\[
\lim_{n\to\infty} E[X_{i,j}] = \lim_{n\to\infty} \frac{1}{\alpha^{i+j}}\binom{n}{i}\binom{n-i}{j}\,\alpha^{n} = 0. \tag{11}
\]
But it is also easy to show that
\[
0 \le P(X_{i,j} \ge 1) \le E[X_{i,j}], \tag{12}
\]
so that by the Squeeze Theorem,
\[
\lim_{n\to\infty} P(X_{i,j} \ge 1) = 0. \tag{13}
\]

Exercise 11. Prove (11).

Exercise 12. Prove (12).

Exercise 13. Prove (13).

Fifth (and finally): A moment's thought shows that for any $\omega \in \Omega$,
\[
X_{i,j}(\omega) \ge 1 \iff I_{g_\omega} \models \neg B_{i,j}, \tag{14}
\]
and this means that Equation (13) is equivalent to the statement
\[
\operatorname{limprob}(\neg B_{i,j}) = 0, \tag{13$'$}
\]
which in turn is obviously equivalent to the statement
\[
\operatorname{limprob}(B_{i,j}) = 1. \qquad \square\ \text{(Lemma 1)}
\]

Exercise 14. Explain why (14) is true.

VII. The Second Step of the Proof.

We now extend $GT$ by adjoining finite subsets of the set $\{B_{i,j}\}$; the main point of this section will be to prove that all such extensions are consistent.
Lemma 2. Let $T = \{B^{(1)}, \ldots, B^{(k)}\}$ be a $k$-element subset of $\{B_{i,j}\}$. Then $\mathrm{Thm}(GT \cup T)$ is a consistent first-order system.

Proof. I will show this system to be consistent by demonstrating that it has a model. For $1 \le t \le k$, let
\[
\mathrm{MODEL}(t) := \{\omega \in \Omega : I_{g_\omega} \models B^{(t)}\},
\]
and put
\[
\mathrm{MODEL}(T) := \mathrm{MODEL}(1) \cap \cdots \cap \mathrm{MODEL}(k) \quad\text{and}\quad p^{(T)}_n := P(\mathrm{MODEL}(T)).
\]
(The point here is that if $p^{(T)}_n > 0$ for any $n$, then this system has an $n$-vertex model.) First of all, clearly,
\[
p^{(T)}_n = 1 - P\bigl(\overline{\mathrm{MODEL}(T)}\bigr) = 1 - P\bigl(\overline{\mathrm{MODEL}(1)} \cup \cdots \cup \overline{\mathrm{MODEL}(k)}\bigr),
\]
so that by Exercise 15,
\[
1 \ \ge\ p^{(T)}_n \ \ge\ 1 - \sum_{t=1}^{k} P\bigl(\overline{\mathrm{MODEL}(t)}\bigr). \tag{15}
\]
Next, consider the limit of the right-hand side of (15):
\[
\begin{aligned}
\lim_{n\to\infty}\Bigl(1 - \sum_{t=1}^{k} P\bigl(\overline{\mathrm{MODEL}(t)}\bigr)\Bigr)
&= 1 - \sum_{t=1}^{k} \lim_{n\to\infty} P\bigl(\overline{\mathrm{MODEL}(t)}\bigr) \\
&= 1 - \sum_{t=1}^{k} \operatorname{limprob}(\neg B^{(t)}) \\
&= 1 - \sum_{t=1}^{k} 0 \qquad \text{(by Equation (13$'$))} \\
&= 1.
\end{aligned}
\]
Because this limit equals 1, the Squeeze Theorem can be applied to the inequalities in (15), and doing so gives
\[
\lim_{n\to\infty} p^{(T)}_n = 1. \tag{16}
\]
This certainly implies that $p^{(T)}_n > 0$ for all large $n$, and for any such $n$, as mentioned above, the system has an $n$-vertex model. $\square$ (Lemma 2)

Exercise 15. Prove, for any events $\{E_1, \ldots, E_n\}$ of any sample space, that
\[
P(E_1 \cup \cdots \cup E_n) \le P(E_1) + \cdots + P(E_n).
\]

VIII. The Third Step of the Proof.

The next move is to adjoin the entire set $\{B_{i,j}\}$ to $GT$: let $S$ be the first-order system $\mathrm{Thm}(GT \cup \{B_{i,j}\})$. By Lemma 2 and the Compactness Theorem, $S$ is consistent, so by the Löwenheim-Skolem Theorem, it has a denumerable model [9] (which will be a simple labeled graph on a denumerable vertex set). The last big piece of the puzzle is Lemma 3, which will imply that $S$ has only one denumerable model.

[9] It occurs to me that our proof of the Löwenheim-Skolem Theorem does not cover $GT$, because $GT$ has no closed terms. I am sure that the result is true for $GT$, though.
Lemma 3. Any two denumerable models of $S$ are isomorphic.

Proof. Observe first that any model of $S$ will possess property $B_{i,j}$ for all $i \ge 1$ and all $j \ge 1$. Let $G$ and $H$ be two denumerable models of $S$; I will construct an isomorphism between them. The first step is to enumerate the vertices of each graph:
\[
G = \{g_1, g_2, g_3, \ldots\}, \qquad H = \{h_1, h_2, h_3, \ldots\}.
\]
The construction will iteratively reorder the vertices
\[
G = \{\hat g_1, \hat g_2, \hat g_3, \ldots\}, \qquad H = \{\hat h_1, \hat h_2, \hat h_3, \ldots\};
\]
this will be done in such a way that for all $i < j$,
\[
\mathrm{edge}(\hat g_i, \hat g_j) \iff \mathrm{edge}(\hat h_i, \hat h_j) \tag{17}
\]
(so that the function $f(\hat g_i) = \hat h_i$ will be an isomorphism).

The reordering starts by putting $\hat g_1 = g_1$ and $\hat h_1 = h_1$. The second step puts $\hat g_2 = g_2$ but lets $\hat h_2$ be the first vertex that is adjacent/nonadjacent to $\hat h_1$ according as $\hat g_2$ is adjacent/nonadjacent to $\hat g_1$. The third step makes $\hat h_3$ the first unprocessed $H$ vertex and lets $\hat g_3$ be the first vertex that will have adjacencies/nonadjacencies to $\hat g_1$ and $\hat g_2$ that match those of $\hat h_3$ to $\hat h_1$ and $\hat h_2$. The construction proceeds in this fashion, alternating the rôles of $G$ and $H$.

Because $G$ and $H$ possess property $B_{i,j}$ for all $i \ge 1$ and all $j \ge 1$, the searches for vertices with the correct adjacency patterns will always succeed, so the process will not halt. Furthermore, the alternation of the graphs' rôles guarantees that after at most $2n$ iterations, all of $\{g_1, \ldots, g_n\}$ and $\{h_1, \ldots, h_n\}$ will have been processed, so every vertex of each graph will eventually be reached. Thirdly, after $n$ iterations of the process, clearly the subgraphs induced by
\[
\{\hat g_1, \hat g_2, \hat g_3, \ldots, \hat g_n\} \quad\text{and}\quad \{\hat h_1, \hat h_2, \hat h_3, \ldots, \hat h_n\} \tag{18}
\]
are isomorphic. Finally, to see that (17) is true for any $i < j$, just let $n = j$ in (18): all four of $\{\hat g_i, \hat h_i, \hat g_j, \hat h_j\}$ will be present. $\square$ (Lemma 3)

Exercise 16. Show that $S$ does not have any finite models.

IX. Putting the Pieces Together.

As mentioned earlier, Lemma 3 implies that $S$ has only one denumerable model. An immediate consequence is

Lemma 4. $S$ is a consistent complete first-order system.

Proof. We already have that $S$ is consistent.
Suppose $S$ were not complete. Then there would be a closed wf $A$ such that neither $A$ nor $\neg A$ could be deduced in $S$. This would make both $\mathrm{Thm}(S \cup \{A\})$ and $\mathrm{Thm}(S \cup \{\neg A\})$ consistent first-order systems. Then each of these systems would have a model (which, by the Löwenheim-Skolem Theorem and Exercise 16, may be taken to be denumerable). Each of these models would also be a model of $S$, but $A$ would be true in one of them and false in the other. This is not possible, since, by Lemma 3, the two models must be isomorphic graphs. $\square$ (Lemma 4)

Finally, let $A$ be any closed wf. Since $S$ is consistent and complete, exactly one of the statements
\[
\vdash_S A, \qquad \vdash_S \neg A
\]
is true, and $\operatorname{limprob}(A)$ is determined by which one it is.
Lemma 5. If $\vdash_S A$, then $\operatorname{limprob}(A) = 1$; if $\vdash_S \neg A$, then $\operatorname{limprob}(A) = 0$.

Proof. Clearly the two assertions are equivalent, so it suffices to prove the first one. Suppose that $\vdash_S A$; fix a particular deduction of it. Since the deduction is finite in length, it uses only a finite subset $T = \{B^{(1)}, \ldots, B^{(k)}\}$ of the $\{B_{i,j}\}$. This implies that
\[
\vdash_{GT \cup T} A. \tag{19}
\]
Now consider any graph $g \in G_n$. If $g$ has all of the properties $\{B^{(1)}, \ldots, B^{(k)}\}$, then $g$ is a model for $GT \cup T$; in conjunction with (19), this implies that $I_g \models A$ (that is, $g$ has property $A$). As an immediate consequence, we get
\[
1 \ \ge\ \frac{\text{number of graphs in } G_n \text{ with property } A}{2^{\binom{n}{2}}} \ \ge\ \frac{|\mathrm{MODEL}(T)|}{2^{\binom{n}{2}}} = p^{(T)}_n. \tag{20}
\]
Since $\lim_{n\to\infty} p^{(T)}_n = 1$ (equation (16)), we can apply the Squeeze Theorem in (20) to get
\[
\operatorname{limprob}(A) = \lim_{n\to\infty} \frac{\text{number of graphs in } G_n \text{ with property } A}{2^{\binom{n}{2}}} = 1. \qquad \square\ \text{(Lemma 5 and theorem)}
\]

Exercise 17. Find $\operatorname{limprob}(A_3)$ (introduced at the end of Section I).
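The squeeze in (20) can be given a rough numerical feel. Combining (12), (14) and (15) yields $p^{(T)}_n \ge 1 - \sum_t E[X_{i_t,j_t}]$ when $B^{(t)} = B_{i_t,j_t}$, and (10) gives each expectation in closed form. The sketch below simply tabulates that lower bound; the function names and the sample choice $T = \{B_{1,1}, B_{2,2}\}$ are assumptions made for illustration, and floating-point arithmetic is used, so the values are approximate.

```python
from math import comb

def expected_failures(n, i, j):
    """E[X_{i,j}] as in equation (10): C(n,i) * C(n-i,j) * alpha^(n-(i+j)).

    Intended for n >= i + j + 1, matching the hypothesis in the proof of Lemma 1.
    """
    alpha = 1 - 0.5 ** (i + j)
    return comb(n, i) * comb(n - i, j) * alpha ** (n - (i + j))

def lower_bound(n, pairs):
    """Lower bound on p_n^(T) for T = {B_{i,j} : (i, j) in pairs}.

    Uses p_n^(T) >= 1 - sum of E[X_{i,j}], clipped at 0 because a
    probability cannot be negative.
    """
    total = sum(expected_failures(n, i, j) for i, j in pairs)
    return max(0.0, 1.0 - total)

# Hypothetical example: a deduction that uses only B_{1,1} and B_{2,2}.
sample_T = [(1, 1), (2, 2)]
for n in (10, 50, 100, 200, 400, 800):
    print(n, lower_bound(n, sample_T))
```

For small $n$ the bound is vacuous (it is clipped to 0), which is consistent with the theorem being a statement about the limit only; the convergence also slows quickly as $i + j$ grows.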