Week 6-8: The Inclusion-Exclusion Principle

March 24, 2017

1 The Inclusion-Exclusion Principle

Let $S$ be a finite set. Given subsets $A, B, C$ of $S$, we have
\[ |A \cup B| = |A| + |B| - |A \cap B|, \]
\[ |A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|. \]
Let $P_1, P_2, \ldots, P_n$ be properties referring to the objects in $S$. Let $A_i$ be the set of elements of $S$ that satisfy the property $P_i$, i.e.,
\[ A_i = \{x \in S : x \text{ satisfies property } P_i\}, \quad 1 \le i \le n. \]
The elements of $A_i$ may also satisfy properties other than $P_i$. On many occasions we need to find the number of objects satisfying none of the properties $P_1, P_2, \ldots, P_n$.

Theorem 1.1. The number of objects of $S$ which satisfy none of the properties $P_1, P_2, \ldots, P_n$ is given by
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n| = |S| - \sum_{i} |A_i| + \sum_{i<j} |A_i \cap A_j| - \sum_{i<j<k} |A_i \cap A_j \cap A_k| + \cdots + (-1)^n |A_1 \cap A_2 \cap \cdots \cap A_n|. \tag{1} \]

Proof. The left side of (1) counts the number of objects of $S$ with none of the properties. We establish the identity (1) by showing that an object with none of the properties makes a net contribution of 1 to the right side of (1), and that an object with at least one of the properties makes a net contribution of 0.

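Let $x$ be an object satisfying none of the properties. Then the net contribution of $x$ to the right side of (1) is
\[ 1 - 0 + 0 - 0 + \cdots + (-1)^n 0 = 1. \]
Let $x$ be an object of $S$ satisfying exactly $r$ properties of $P_1, P_2, \ldots, P_n$, where $r > 0$. Then $x$ lies in exactly $\binom{r}{k}$ of the $k$-fold intersections, so the net contribution of $x$ to the right side of (1) is
\[ \binom{r}{0} - \binom{r}{1} + \binom{r}{2} - \binom{r}{3} + \cdots + (-1)^r \binom{r}{r} = (1 - 1)^r = 0. \]

Corollary 1.2. The number of objects of $S$ which satisfy at least one of the properties $P_1, P_2, \ldots, P_n$ is given by
\[ |A_1 \cup A_2 \cup \cdots \cup A_n| = \sum_{i} |A_i| - \sum_{i<j} |A_i \cap A_j| + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} |A_1 \cap A_2 \cap \cdots \cap A_n|. \tag{2} \]

Proof. Note that the set $A_1 \cup A_2 \cup \cdots \cup A_n$ consists of all those objects in $S$ which possess at least one of the properties, and
\[ |A_1 \cup A_2 \cup \cdots \cup A_n| = |S| - |\overline{A_1 \cup A_2 \cup \cdots \cup A_n}|. \]
Then by the De Morgan law we have
\[ \overline{A_1 \cup A_2 \cup \cdots \cup A_n} = \bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n. \]
Thus
\[ |A_1 \cup A_2 \cup \cdots \cup A_n| = |S| - |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n|. \]
Putting this into the identity (1), the identity (2) follows immediately.

2 Combinations with Repetition

Let $M$ be a multiset. Let $x$ be an object of $M$ whose repetition number is larger than $r$. Let $M'$ be the multiset whose objects have the same repetition numbers as those objects in $M$, except that the repetition number of $x$ in $M'$ is exactly $r$. Then
\[ \#\{r\text{-combinations of } M\} = \#\{r\text{-combinations of } M'\}. \]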

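Example 2.1. Determine the number of 10-combinations of the multiset $M = \{3a, 4b, 5c\}$.

Let $S$ be the set of 10-combinations of the multiset $M' = \{\infty \cdot a, \infty \cdot b, \infty \cdot c\}$. Let $P_1$, $P_2$, and $P_3$ be the properties that a 10-combination of $M'$ has more than 3 $a$'s, more than 4 $b$'s, and more than 5 $c$'s, respectively. Then the number of 10-combinations of $M$ is the number of 10-combinations of $M'$ which have none of the properties $P_1$, $P_2$, and $P_3$. Let $A_i$ denote the set of 10-combinations of $M'$ which have the property $P_i$, $1 \le i \le 3$. Then by the Inclusion-Exclusion Principle the number to be determined is
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3| = |S| - (|A_1| + |A_2| + |A_3|) + (|A_1 \cap A_2| + |A_1 \cap A_3| + |A_2 \cap A_3|) - |A_1 \cap A_2 \cap A_3|. \]
Writing $\left\langle{n \atop r}\right\rangle = \binom{n+r-1}{r}$ for the number of $r$-combinations of $n$ kinds of objects with unlimited repetition, note that
\[ |S| = \left\langle{3 \atop 10}\right\rangle = \binom{3+10-1}{10} = \binom{12}{10} = 66, \]
\[ |A_1| = \left\langle{3 \atop 6}\right\rangle = \binom{3+6-1}{6} = \binom{8}{6} = 28, \]
\[ |A_2| = \left\langle{3 \atop 5}\right\rangle = \binom{3+5-1}{5} = \binom{7}{5} = 21, \]
\[ |A_3| = \left\langle{3 \atop 4}\right\rangle = \binom{3+4-1}{4} = \binom{6}{4} = 15, \]
\[ |A_1 \cap A_2| = \left\langle{3 \atop 1}\right\rangle = \binom{3+1-1}{1} = \binom{3}{1} = 3, \]
\[ |A_1 \cap A_3| = \left\langle{3 \atop 0}\right\rangle = \binom{3+0-1}{0} = \binom{2}{0} = 1, \]
\[ |A_2 \cap A_3| = 0, \quad |A_1 \cap A_2 \cap A_3| = 0. \]
Putting all these results into the inclusion-exclusion formula, we have
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3| = 66 - (28 + 21 + 15) + (3 + 1 + 0) - 0 = 6. \]
The six 10-combinations are
\[ \{3a, 4b, 3c\},\ \{3a, 3b, 4c\},\ \{3a, 2b, 5c\},\ \{2a, 4b, 4c\},\ \{2a, 3b, 5c\},\ \{a, 4b, 5c\}. \]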
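The count in Example 2.1 can be cross-checked numerically. The following is a minimal sketch, not part of the original notes: the function names `multichoose`, `bounded_combinations`, and `brute_force` are hypothetical, and the inclusion-exclusion sum runs over all subsets of the "cap exceeded" properties exactly as in the example.

```python
from itertools import product
from math import comb


def multichoose(n, r):
    """Number of r-combinations of n kinds with unlimited repetition."""
    return comb(n + r - 1, r) if r >= 0 else 0


def bounded_combinations(caps, r):
    """Count r-combinations of a multiset whose i-th kind has repetition
    number caps[i], by inclusion-exclusion over 'cap exceeded' properties."""
    n = len(caps)
    total = 0
    for mask in product([0, 1], repeat=n):
        # Each selected kind is forced to appear at least caps[i] + 1 times.
        used = sum(c + 1 for c, bit in zip(caps, mask) if bit)
        total += (-1) ** sum(mask) * multichoose(n, r - used)
    return total


def brute_force(caps, r):
    """Direct enumeration of all admissible repetition vectors."""
    ranges = (range(c + 1) for c in caps)
    return sum(1 for xs in product(*ranges) if sum(xs) == r)


print(bounded_combinations([3, 4, 5], 10))  # 6
print(brute_force([3, 4, 5], 10))           # 6
```

The same function applies to any bounded multiset, for instance to the shifted equation $y_1 + y_2 + y_3 + y_4 = 12$ with caps 4, 3, 6, 5.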

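Example 2.2. Find the number of integral solutions of the equation
\[ x_1 + x_2 + x_3 + x_4 = 15 \]
which satisfy the conditions
\[ 2 \le x_1 \le 6, \quad -2 \le x_2 \le 1, \quad 0 \le x_3 \le 6, \quad 3 \le x_4 \le 8. \]
Let $y_1 = x_1 - 2$, $y_2 = x_2 + 2$, $y_3 = x_3$, and $y_4 = x_4 - 3$. Then the problem becomes to find the number of nonnegative integral solutions of the equation
\[ y_1 + y_2 + y_3 + y_4 = 12 \]
subject to
\[ 0 \le y_1 \le 4, \quad 0 \le y_2 \le 3, \quad 0 \le y_3 \le 6, \quad 0 \le y_4 \le 5. \]
Let $S$ be the set of all nonnegative integral solutions of the equation $y_1 + y_2 + y_3 + y_4 = 12$. Let $P_1$ be the property that $y_1 \ge 5$, $P_2$ the property that $y_2 \ge 4$, $P_3$ the property that $y_3 \ge 7$, and $P_4$ the property that $y_4 \ge 6$. Let $A_i$ denote the subset of $S$ consisting of the solutions satisfying the property $P_i$, $1 \le i \le 4$. Then the problem is to find the cardinality $|\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3 \cap \bar{A}_4|$ by the inclusion-exclusion principle. In fact,
\[ |S| = \left\langle{4 \atop 12}\right\rangle = \binom{4+12-1}{12} = \binom{15}{12} = 455. \]
Similarly,
\[ |A_1| = \left\langle{4 \atop 7}\right\rangle = \binom{4+7-1}{7} = \binom{10}{7} = 120, \]
\[ |A_2| = \left\langle{4 \atop 8}\right\rangle = \binom{4+8-1}{8} = \binom{11}{8} = 165, \]
\[ |A_3| = \left\langle{4 \atop 5}\right\rangle = \binom{4+5-1}{5} = \binom{8}{5} = 56, \]
\[ |A_4| = \left\langle{4 \atop 6}\right\rangle = \binom{4+6-1}{6} = \binom{9}{6} = 84. \]
For the intersections of two sets, we have
\[ |A_1 \cap A_2| = \left\langle{4 \atop 3}\right\rangle = \binom{4+3-1}{3} = \binom{6}{3} = 20, \]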

\[ |A_1 \cap A_3| = 1, \quad |A_1 \cap A_4| = |A_2 \cap A_3| = 4, \quad |A_2 \cap A_4| = 10, \quad |A_3 \cap A_4| = 0. \]
For the intersections of more sets,
\[ |A_1 \cap A_2 \cap A_3| = |A_1 \cap A_2 \cap A_4| = |A_1 \cap A_3 \cap A_4| = |A_2 \cap A_3 \cap A_4| = |A_1 \cap A_2 \cap A_3 \cap A_4| = 0. \]
Thus the number required is given by
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3 \cap \bar{A}_4| = 455 - (120 + 165 + 56 + 84) + (20 + 1 + 4 + 4 + 10) = 69. \]

3 Derangements

A permutation of $\{1, 2, \ldots, n\}$ is called a derangement if no integer $i$ ($1 \le i \le n$) is placed at the $i$th position. We denote by $D_n$ the number of derangements of $\{1, 2, \ldots, n\}$.

Let $S$ be the set of all permutations of $\{1, 2, \ldots, n\}$. Then $|S| = n!$. Let $P_i$ be the property that a permutation of $\{1, 2, \ldots, n\}$ has the integer $i$ in its $i$th position, and let $A_i$ be the set of all permutations satisfying the property $P_i$, where $1 \le i \le n$. Then
\[ D_n = |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n|. \]
For each $(i_1, i_2, \ldots, i_k)$ such that $1 \le i_1 < i_2 < \cdots < i_k \le n$, a permutation of $\{1, 2, \ldots, n\}$ with $i_1, i_2, \ldots, i_k$ fixed at the $i_1$th, $i_2$th, ..., $i_k$th positions respectively can be identified with a permutation of the set $\{1, 2, \ldots, n\} \setminus \{i_1, i_2, \ldots, i_k\}$ of $n - k$ objects. Thus
\[ |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = (n - k)!. \]

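By the inclusion-exclusion principle, we have
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n| = |S| + \sum_{k=1}^{n} (-1)^k \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| \]
\[ = n! + \sum_{k=1}^{n} (-1)^k \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} (n - k)! \]
\[ = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)! \]
\[ = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!} \approx \frac{n!}{e} \quad \text{(when $n$ is large)}. \]

Theorem 3.1. For $n \ge 1$, the number $D_n$ of derangements of $\{1, 2, \ldots, n\}$ is
\[ D_n = n! \left( 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + (-1)^n \frac{1}{n!} \right). \tag{3} \]
Here are a few derangement numbers:
\[ D_0 = 1, \quad D_1 = 0, \quad D_2 = 1, \quad D_3 = 2, \quad D_4 = 9, \quad D_5 = 44. \]

Corollary 3.2. The number of permutations of $\{1, 2, \ldots, n\}$ with exactly $k$ numbers displaced is $\binom{n}{k} D_k$.

Proposition 3.3. The derangement sequence $D_n$ satisfies the recurrence relation
\[ D_n = (n - 1)(D_{n-1} + D_{n-2}), \quad n \ge 3, \]
with the initial conditions $D_1 = 0$, $D_2 = 1$. The sequence $D_n$ also satisfies the recurrence relation
\[ D_n = nD_{n-1} + (-1)^n, \quad n \ge 2. \]

Proof. The recurrence relations can be proved without using the formula (3). Let $S_k$ denote the set of derangements of $\{1, 2, \ldots, n\}$ having the pattern $k a_2 a_3 \cdots a_n$, where $k = 2, 3, \ldots, n$. We may think of $a_2 a_3 \cdots a_n$ as a permutation of $\{2, \ldots, k-1, 1, k+1, \ldots, n\}$ with respect to its order $2\,3 \cdots (k-1)\,1\,(k+1) \cdots n$.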

The derangements of $S_k$ can be partitioned into two types:
\[ k a_2 a_3 \cdots a_k \cdots a_n \ (a_k \ne 1) \quad \text{and} \quad k a_2 a_3 \cdots a_{k-1} 1 a_{k+1} \cdots a_n. \]
The first type can be considered as permutations of $k\,2\,3 \cdots (k-1)\,1\,(k+1) \cdots n$ such that the first member is fixed and no other member is placed in its original place. The number of such permutations is $D_{n-1}$.

The second type can be considered as permutations of $k\,2\,3 \cdots (k-1)\,1\,(k+1) \cdots n$ such that the first and the $k$th members are fixed, and no other member is placed in its original place. The number of such permutations is $D_{n-2}$.

Since there are $n - 1$ choices for $k$, we obtain the recurrence relation
\[ D_n = (n - 1)(D_{n-1} + D_{n-2}), \quad n \ge 3. \]
Let us rewrite the recurrence relation as
\[ D_n - nD_{n-1} = -\big( D_{n-1} - (n-1)D_{n-2} \big), \quad n \ge 3. \]
Applying this recurrence relation repeatedly, we have
\[ D_n - nD_{n-1} = (-1)^i \big( D_{n-i} - (n-i)D_{n-i-1} \big), \quad 1 \le i \le n - 2. \]
Thus $D_n - nD_{n-1} = (-1)^{n-2}(D_2 - 2D_1) = (-1)^n$. Hence $D_n = nD_{n-1} + (-1)^n$.

4 Surjective Functions

Let $X$ be a set of $m$ objects and $Y$ a set of $n$ objects. Then the number of functions from $X$ to $Y$ is $n^m$. The number of injective functions from $X$ to $Y$ is $\binom{n}{m} m! = P(n, m)$. Let $C(m, n)$ denote the number of surjective functions from $X$ to $Y$. What is $C(m, n)$?

Theorem 4.1. The number $C(m, n)$ of surjective functions from a set of $m$ objects to a set of $n$ objects is given by
\[ C(m, n) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)^m. \]
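The formula of Theorem 4.1 is easy to test against direct enumeration for small $m$ and $n$. The sketch below is illustrative only; the names `surjections` and `surjections_brute` are hypothetical, and functions are represented as tuples of images.

```python
from itertools import product
from math import comb


def surjections(m, n):
    """C(m, n) from the inclusion-exclusion formula of Theorem 4.1."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))


def surjections_brute(m, n):
    """Enumerate all n^m functions {0..m-1} -> {0..n-1}, keep the onto ones."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)


print(surjections(4, 2))  # 14
print(surjections(2, 3))  # 0, since no map from 2 objects onto 3 exists
```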

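Proof. Let $S$ be the set of all functions from $X$ to $Y$. Write $Y = \{y_1, y_2, \ldots, y_n\}$. Let $A_i$ be the set of all functions $f$ such that $y_i$ is not assigned to any element of $X$ by $f$, i.e., $y_i \notin f(X)$, where $1 \le i \le n$. Then
\[ C(m, n) = |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n|. \]
For each $(i_1, i_2, \ldots, i_k)$ such that $1 \le i_1 < i_2 < \cdots < i_k \le n$, the intersection $A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}$ can be identified with the set of functions $f$ from $X$ to the set $Y \setminus \{y_{i_1}, y_{i_2}, \ldots, y_{i_k}\}$. Thus
\[ |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = (n - k)^m. \]
By the Inclusion-Exclusion Principle, we have
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n| = |S| + \sum_{k=1}^{n} (-1)^k \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| \]
\[ = n^m + \sum_{k=1}^{n} (-1)^k \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} (n - k)^m = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)^m. \]
Note that $C(m, n) = 0$ for $m < n$; we have
\[ \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)^m = 0 \quad \text{if } m < n. \]

Corollary 4.2. For integers $m, n \ge 1$,
\[ \sum_{\substack{i_1 + \cdots + i_n = m \\ i_1, \ldots, i_n \ge 1}} \binom{m}{i_1, \ldots, i_n} = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n - k)^m. \]

Proof. The integer $C(m, n)$ can be interpreted as the number of ways to place the objects of $X$ into $n$ distinct boxes so that no box is empty. Let the 1st box receive $i_1$ objects, ..., the $n$th box receive $i_n$ objects; then $i_1 + \cdots + i_n = m$.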

The number of placements of $X$ into $n$ distinct boxes, such that the 1st box contains exactly $i_1$ objects, ..., the $n$th box contains exactly $i_n$ objects, is
\[ \frac{m!}{i_1! \cdots i_n!}, \]
which is the multinomial coefficient $\binom{m}{i_1, \ldots, i_n}$. We thus have
\[ C(m, n) = \sum_{\substack{i_1 + \cdots + i_n = m \\ i_1, \ldots, i_n \ge 1}} \binom{m}{i_1, \ldots, i_n}. \]

5 The Euler Phi Function

Let $n$ be a positive integer. We denote by $\varphi(n)$ the number of integers of $[1, n]$ which are coprime to $n$, i.e.,
\[ \varphi(n) = |\{k \in [1, n] : \gcd(k, n) = 1\}|. \]
For example,
\[ \varphi(1) = 1, \quad \varphi(2) = 1, \quad \varphi(3) = 2, \quad \varphi(4) = 2, \quad \varphi(5) = 4, \quad \varphi(6) = 2. \]
The integer-valued function $\varphi$, defined on the set of positive integers, is called the Euler phi function.

Theorem 5.1. Let $n$ be a positive integer factorized into the form
\[ n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}, \]
where $p_1, p_2, \ldots, p_r$ are distinct primes and $e_1, e_2, \ldots, e_r \ge 1$. Then
\[ \varphi(n) = n \prod_{i=1}^{r} \left( 1 - \frac{1}{p_i} \right). \]

Proof. Let $S = \{1, 2, \ldots, n\}$. Let $P_i$ be the property of integers in $S$ having the factor $p_i$, and let $A_i$ be the set of integers in $S$ that satisfy the property $P_i$, where $1 \le i \le r$. Then $\varphi(n)$ is the number of integers satisfying none of the properties $P_1, P_2, \ldots, P_r$, i.e.,
\[ \varphi(n) = |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_r|. \]
Note that
\[ A_i = \left\{ 1 p_i,\ 2 p_i,\ \ldots,\ \left( \frac{n}{p_i} \right) p_i \right\}, \quad 1 \le i \le r. \]

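Likewise, for $q = p_{i_1} p_{i_2} \cdots p_{i_k}$ with $1 \le i_1 < i_2 < \cdots < i_k \le r$,
\[ A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k} = \left\{ 1q,\ 2q,\ \ldots,\ \left( \frac{n}{q} \right) q \right\}. \]
Thus
\[ |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = \frac{n}{q} = \frac{n}{p_{i_1} p_{i_2} \cdots p_{i_k}}. \]
By the Inclusion-Exclusion Principle, we have
\[ |\bar{A}_1 \cap \cdots \cap \bar{A}_r| = |S| + \sum_{k=1}^{r} (-1)^k \sum_{i_1 < \cdots < i_k} |A_{i_1} \cap \cdots \cap A_{i_k}| \]
\[ = n + \sum_{k=1}^{r} (-1)^k \sum_{i_1 < i_2 < \cdots < i_k} \frac{n}{p_{i_1} p_{i_2} \cdots p_{i_k}} \]
\[ = n \left[ 1 - \left( \frac{1}{p_1} + \cdots + \frac{1}{p_r} \right) + \left( \frac{1}{p_1 p_2} + \frac{1}{p_1 p_3} + \cdots + \frac{1}{p_{r-1} p_r} \right) - \left( \frac{1}{p_1 p_2 p_3} + \frac{1}{p_1 p_2 p_4} + \cdots + \frac{1}{p_{r-2} p_{r-1} p_r} \right) + \cdots + (-1)^r \frac{1}{p_1 p_2 \cdots p_r} \right] \]
\[ = n \prod_{i=1}^{r} \left( 1 - \frac{1}{p_i} \right). \]

Example 5.1. For the integer $36 = 2^2 \cdot 3^2$, we have
\[ \varphi(36) = 36 \left( 1 - \frac{1}{2} \right) \left( 1 - \frac{1}{3} \right) = 12. \]
The following are the twelve integers of $[1, 36]$ that are coprime to 36:
\[ 1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35. \]

Corollary 5.2. For any prime number $p$ and integer $k \ge 1$,
\[ \varphi(p^k) = p^k - p^{k-1}. \]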
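The product formula of Theorem 5.1 and the value in Example 5.1 can be confirmed with a short computation. This is a sketch under simple assumptions: trial division is used for factoring (adequate only for small $n$), and the function names `prime_factors`, `phi`, and `phi_brute` are hypothetical.

```python
from math import gcd


def prime_factors(n):
    """Distinct prime divisors of n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def phi(n):
    """Euler phi via n * prod(1 - 1/p), kept in exact integer arithmetic."""
    result = n
    for p in prime_factors(n):
        result = result // p * (p - 1)
    return result


def phi_brute(n):
    """Count the integers in [1, n] that are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(phi(36))                                       # 12
print([k for k in range(1, 37) if gcd(k, 36) == 1])  # the twelve integers listed above
```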

Proof. The result can be proved directly, without Theorem 5.1. The set $[1, p^k]$ has $p^{k-1}$ integers $1p, 2p, \ldots, p^{k-1} p$ not coprime to $p^k$. Thus $\varphi(p^k) = p^k - p^{k-1}$.

Lemma 5.3. Let $m = m_1 m_2$. If $\gcd(m_1, m_2) = 1$, then we have

(i) The function $f : [m] \to [m_1] \times [m_2]$ defined by $f(a) = (r_1, r_2)$, where
\[ a = q_1 m_1 + r_1 = q_2 m_2 + r_2 \in [m], \quad 1 \le r_1 \le m_1, \quad 1 \le r_2 \le m_2, \]
is a bijection.

(ii) The restriction of $f$ to $\{a \in [m] : \gcd(a, m) = 1\}$ is a map to the product set
\[ \{a \in [m_1] : \gcd(a, m_1) = 1\} \times \{a \in [m_2] : \gcd(a, m_2) = 1\}, \]
and is also a bijection.

Proof. (i) It suffices to show that $f$ is surjective. Since $\gcd(m_1, m_2) = 1$, by the Euclidean Algorithm there exist integers $x$ and $y$ such that $x m_1 + y m_2 = 1$. For each $(r_1, r_2) \in [m_1] \times [m_2]$, the integer $r := r_2 x m_1 + r_1 y m_2$ can be written as
\[ r = (r_2 - r_1) x m_1 + r_1 (x m_1 + y m_2) = (r_1 - r_2) y m_2 + r_2 (x m_1 + y m_2). \]
Since $x m_1 + y m_2 = 1$, we have
\[ r = (r_2 - r_1) x m_1 + r_1 = (r_1 - r_2) y m_2 + r_2. \]
We modify $r$ by adding an appropriate multiple $qm$ of $m$ to obtain $a := qm + r$ such that $1 \le a \le m$. Then $a = q_1 m_1 + r_1 = q_2 m_2 + r_2 \in [m]$ for some integers $q_1$ and $q_2$. We thus have $f(a) = (r_1, r_2)$. This shows that $f$ is surjective. Since $[m]$ and $[m_1] \times [m_2]$ have the same cardinality $m_1 m_2$, it follows that $f$ must be a bijection.

(ii) It follows from the fact that an integer $a \in [m_1 m_2]$ is coprime to $m_1 m_2$ iff $a$ is coprime to $m_1$ and coprime to $m_2$.

Theorem 5.4. For positive integers $m$ and $n$ such that $\gcd(m, n) = 1$,
\[ \varphi(mn) = \varphi(m) \varphi(n). \]

If $n = p_1^{e_1} \cdots p_r^{e_r}$ with $e_1, \ldots, e_r \ge 1$, where $p_1, \ldots, p_r$ are distinct primes, then
\[ \varphi(n) = n \prod_{i=1}^{r} \left( 1 - \frac{1}{p_i} \right). \]

Proof. The first part follows from Lemma 5.3. Note that $[p_i^{e_i}]$ has $p_i^{e_i - 1}$ integers $1 p_i, 2 p_i, \ldots, p_i^{e_i - 1} p_i$ not coprime to $p_i^{e_i}$. So $\varphi(p_i^{e_i}) = p_i^{e_i} - p_i^{e_i - 1}$. The second part follows from the first part, i.e.,
\[ \varphi(n) = \prod_{i=1}^{r} \varphi\big( p_i^{e_i} \big) = \prod_{i=1}^{r} \big( p_i^{e_i} - p_i^{e_i - 1} \big) = \prod_{i=1}^{r} p_i^{e_i} \left( 1 - \frac{1}{p_i} \right) = n \prod_{i=1}^{r} \left( 1 - \frac{1}{p_i} \right). \]

6 Permutations with Forbidden Positions

Let $X_1, X_2, \ldots, X_n$ be subsets (possibly empty) of $\{1, 2, \ldots, n\}$. We denote by $P(X_1, X_2, \ldots, X_n)$ the set of all permutations $a_1 a_2 \cdots a_n$ of $\{1, 2, \ldots, n\}$ such that
\[ a_1 \notin X_1, \quad a_2 \notin X_2, \quad \ldots, \quad a_n \notin X_n. \]
In other words, a permutation of $\{1, 2, \ldots, n\}$ belongs to $P(X_1, X_2, \ldots, X_n)$ provided that no member of $X_1$ occupies the first place, no member of $X_2$ occupies the second place, ..., and no member of $X_n$ occupies the $n$th place. Let
\[ p(X_1, X_2, \ldots, X_n) = |P(X_1, X_2, \ldots, X_n)|. \]
It is known that there is a one-to-one correspondence between permutations of $\{1, 2, \ldots, n\}$ and placements of $n$ non-attacking indistinguishable rooks on an $n$-by-$n$ board. The permutation $a_1 a_2 \cdots a_n$ of $\{1, 2, \ldots, n\}$ corresponds to the placement of $n$ rooks on the board in the squares with coordinates
\[ (1, a_1),\ (2, a_2),\ \ldots,\ (n, a_n). \]

The permutations in $P(X_1, X_2, \ldots, X_n)$ correspond to placements of $n$ non-attacking rooks on an $n$-by-$n$ board in which certain squares are not allowed to hold a rook.

Let $S$ be the set of all placements of $n$ non-attacking rooks on an $n \times n$ board. A rook placement in $S$ is said to satisfy the property $P_i$ provided that the rook in the $i$th row has column index in $X_i$, where $1 \le i \le n$. Let $A_i$ be the set of rook placements satisfying the property $P_i$. Then by the Inclusion-Exclusion Principle,
\[ p(X_1, X_2, \ldots, X_n) = |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n| = |S| - \sum_{i} |A_i| + \sum_{i<j} |A_i \cap A_j| - \cdots + (-1)^n |A_1 \cap A_2 \cap \cdots \cap A_n|. \]

Proposition 6.1. Let $r_k$ ($1 \le k \le n$) denote the number of ways to place $k$ non-attacking rooks on an $n \times n$ board where each of the $k$ rooks is in a forbidden position. Then
\[ r_k = \frac{1}{(n - k)!} \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}|. \tag{4} \]

Proof. Fix $(i_1, i_2, \ldots, i_k)$ with $1 \le i_1 < i_2 < \cdots < i_k \le n$. Let $r(i_1, i_2, \ldots, i_k)$ denote the number of ways to place $k$ non-attacking rooks such that the rook in the $i_1$th row has column index in $X_{i_1}$, the rook in the $i_2$th row has column index in $X_{i_2}$, ..., and the rook in the $i_k$th row has column index in $X_{i_k}$. For each such $k$-rook arrangement, delete the $i_1$th, $i_2$th, ..., $i_k$th rows, and delete the columns in which these $k$ rooks are placed; the other $n - k$ rooks cannot be placed in the deleted rows and columns. What is left is an $(n - k) \times (n - k)$ board, and the other $n - k$ rooks can be arranged on it in $(n - k)!$ ways. So
\[ |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = r(i_1, i_2, \ldots, i_k) \, (n - k)!. \]

Since $r_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} r(i_1, i_2, \ldots, i_k)$, it follows that
\[ r_k \, (n - k)! = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}|. \]

Theorem 6.2. The number of ways to place $n$ non-attacking rooks on an $n \times n$ board with forbidden positions is given by
\[ p(X_1, X_2, \ldots, X_n) = \sum_{k=0}^{n} (-1)^k r_k (n - k)!, \]
where $r_k$ is the number of ways to place $k$ non-attacking rooks on an $n \times n$ board where each of the $k$ rooks is in a forbidden position.

Example 6.1. Let $n = 5$ and $X_1 = \{1, 2\}$, $X_2 = \{3, 4\}$, $X_3 = \{1, 5\}$, $X_4 = \{2, 3\}$, and $X_5 = \{4, 5\}$. Find the number of rook placements with the given forbidden positions.

Solution. Note that $r_0 = 1$. It is easy to see that
\[ r_1 = 5 \times 2 = 10. \]
Since $r_1 = \frac{1}{4!} \sum_i |A_i|$, we have
\[ \sum_i |A_i| = r_1 \, 4! = 10 \cdot 4!. \]
(This is not needed.) Since
\[ |A_1 \cap A_2| = |A_2 \cap A_3| = |A_3 \cap A_4| = |A_4 \cap A_5| = |A_1 \cap A_5| = 4 \cdot 3!, \]
\[ |A_1 \cap A_3| = |A_1 \cap A_4| = |A_2 \cap A_4| = |A_2 \cap A_5| = |A_3 \cap A_5| = 3 \cdot 3!, \]

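we see that
\[ r_2 = \frac{1}{3!} \sum_{i<j} |A_i \cap A_j| = 5 \times 4 + 5 \times 3 = 35. \]
Using the symmetry between $A_1, A_2, A_3, A_4, A_5$ and $A_5, A_4, A_3, A_2, A_1$ respectively, we see that
\[ |A_1 \cap A_2 \cap A_3| = |A_1 \cap A_2 \cap A_5| = |A_1 \cap A_4 \cap A_5| = |A_2 \cap A_3 \cap A_4| = |A_3 \cap A_4 \cap A_5| = 6 \cdot 2!, \]
\[ |A_1 \cap A_2 \cap A_4| = |A_1 \cap A_3 \cap A_4| = |A_1 \cap A_3 \cap A_5| = |A_2 \cap A_3 \cap A_5| = |A_2 \cap A_4 \cap A_5| = 4 \cdot 2!. \]
These can be obtained by considering the following six patterns:

[Figure: six board patterns]

We then have
\[ r_3 = 5 \times 6 + 5 \times 4 = 50. \]
Using the symmetric position again, we see that
\[ |A_1 \cap A_2 \cap A_3 \cap A_4| = |A_1 \cap A_2 \cap A_3 \cap A_5| = |A_1 \cap A_2 \cap A_4 \cap A_5| = |A_1 \cap A_3 \cap A_4 \cap A_5| = |A_2 \cap A_3 \cap A_4 \cap A_5| = 5 \cdot 1!. \]
Thus
\[ r_4 = 5 \times 5 = 25. \]
Finally,
\[ r_5 = |A_1 \cap A_2 \cap A_3 \cap A_4 \cap A_5| = 2. \]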

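The answer $\sum_{k=0}^{5} (-1)^k r_k (5 - k)!$ is
\[ 5! - 10 \cdot 4! + 35 \cdot 3! - 50 \cdot 2! + 25 \cdot 1! - 2 = 13. \]

A permutation of $\{1, 2, \ldots, n\}$ is nonconsecutive if none of the patterns $12, 23, \ldots, (n-1)n$ occurs. We denote by $Q_n$ the number of nonconsecutive permutations of $\{1, 2, \ldots, n\}$. We have
\[ Q_1 = 1, \quad Q_2 = 1, \quad Q_3 = 3, \quad Q_4 = 11. \]

Theorem 6.3. For $n \ge 1$,
\[ Q_n = \sum_{k=0}^{n-1} (-1)^k \binom{n-1}{k} (n - k)!. \]

Proof. Let $S$ be the set of permutations of $\{1, 2, \ldots, n\}$. Let $P_i$ be the property that the pattern $i(i+1)$ occurs in a permutation, where $1 \le i \le n - 1$. Let $A_i$ be the set of all permutations satisfying the property $P_i$. Then $Q_n$ is the number of permutations satisfying none of the properties $P_1, \ldots, P_{n-1}$, i.e.,
\[ Q_n = |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_{n-1}|. \]
Note that
\[ |A_i| = (n - 1)!, \quad 1 \le i \le n - 1. \]
Similarly,
\[ |A_i \cap A_j| = (n - 2)!, \quad 1 \le i < j \le n - 1. \]
More generally,
\[ |A_{i_1} \cap \cdots \cap A_{i_k}| = (n - k)!, \quad 1 \le i_1 < \cdots < i_k \le n - 1. \]
Thus by the Inclusion-Exclusion Principle,
\[ Q_n = |S| + \sum_{k=1}^{n-1} (-1)^k \sum_{1 \le i_1 < \cdots < i_k \le n-1} |A_{i_1} \cap \cdots \cap A_{i_k}| = \sum_{k=0}^{n-1} (-1)^k \binom{n-1}{k} (n - k)!. \]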
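The values of $Q_n$ for small $n$ can be checked by comparing the formula of Theorem 6.3 with direct enumeration. A minimal sketch follows; the names `Q` and `Q_brute` are hypothetical, and the brute-force check is only practical for small $n$.

```python
from itertools import permutations
from math import comb, factorial


def Q(n):
    """Number of nonconsecutive permutations, from Theorem 6.3."""
    return sum((-1) ** k * comb(n - 1, k) * factorial(n - k) for k in range(n))


def Q_brute(n):
    """Count permutations of 1..n in which no pattern i(i+1) occurs."""
    return sum(
        1
        for p in permutations(range(1, n + 1))
        if all(p[i] + 1 != p[i + 1] for i in range(n - 1))
    )


print([Q(n) for n in range(1, 6)])        # [1, 1, 3, 11, 53]
print([Q_brute(n) for n in range(1, 6)])  # [1, 1, 3, 11, 53]
```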

Example 6.2. Eight persons line up in one column in such a way that every person except the first one has a person in front. What is the probability that, when the eight persons line up again after a break, everyone has a different person in front of him or her?

We assign the numbers $1, 2, \ldots, 8$ to the eight persons so that the number $i$ is assigned to the $i$th person (counted from the front). The problem is then to find the number of permutations of $\{1, 2, \ldots, 8\}$ in which the patterns $12, 23, \ldots, 78$ do not occur. For instance, 31542876 is an allowed permutation, while 83475126 is not. The answer is given by
\[ P = \frac{Q_8}{8!} = \sum_{k=0}^{7} (-1)^k \binom{7}{k} \frac{(8 - k)!}{8!} \approx 0.413864. \]

Example 6.3. There are $n$ persons seated at a round table. The $n$ persons leave the table and are reseated after a break. How many seating plans can be made the second time so that each person has a different person seated on his or her left than before the break?

This is equivalent to finding the number of circular nonconsecutive permutations of $\{1, 2, \ldots, n\}$. A circular nonconsecutive permutation of $\{1, 2, \ldots, n\}$ is a circular permutation of $\{1, 2, \ldots, n\}$ such that $12, 23, \ldots, (n-1)n, n1$ do not occur in the counterclockwise direction.

Let $S$ be the set of all circular permutations of $\{1, 2, \ldots, n\}$. Let $A_i$ denote the subset of all circular permutations of $\{1, 2, \ldots, n\}$ in which $i(i+1)$ occurs, $1 \le i \le n$. We understand that $A_n$ is the subset of all circular permutations in which $n1$ occurs. The answer is
\[ |\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n|. \]
Note that $|S| = (n - 1)!$, and
\[ |A_i| = (n - 1)!/(n - 1) = (n - 2)!. \]
More generally,
\[ |A_{i_1} \cap \cdots \cap A_{i_k}| = (n - k)!/(n - k) = (n - k - 1)!, \quad 1 \le k \le n - 1; \]
\[ |A_1 \cap A_2 \cap \cdots \cap A_n| = 1. \]

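We thus have
\[ |\bar{A}_1 \cap \cdots \cap \bar{A}_n| = \sum_{k=0}^{n-1} (-1)^k \binom{n}{k} (n - k - 1)! + (-1)^n. \]

Theorem 6.4.
\[ Q_n = D_n + D_{n-1}, \quad n \ge 2. \]

Proof.
\[ D_n + D_{n-1} = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!} + (n - 1)! \sum_{k=0}^{n-1} \frac{(-1)^k}{k!} \]
\[ = (n - 1)! \left( n + n \sum_{k=1}^{n} \frac{(-1)^k}{k!} + \sum_{k=1}^{n} \frac{(-1)^{k-1}}{(k-1)!} \right) \]
\[ = n! + (n - 1)! \sum_{k=1}^{n} \frac{(-1)^k}{k!} (n - k) \]
\[ = n! + \sum_{k=1}^{n-1} (-1)^k \binom{n-1}{k} (n - k)! \]
\[ = \sum_{k=0}^{n-1} (-1)^k \binom{n-1}{k} (n - k)! = Q_n. \]

7 Rook Polynomials

Definition 7.1. Let $C$ be a board; each square of $C$ is referred to as a cell. Let $r_k(C)$ denote the number of ways to arrange $k$ rooks on the board $C$ so that no rook can take another. We assume $r_0(C) = 1$. The rook polynomial of $C$ is
\[ R(C, x) = \sum_{k=0}^{\infty} r_k(C) x^k. \]
A $k$-rook arrangement on the board $C$ is an arrangement of $k$ non-attacking rooks on $C$.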

Proposition 7.2. Let $C$ be a board. For each cell $\sigma$ of $C$, let $C_\sigma$ denote the board obtained from $C$ by deleting the cell $\sigma$, and let $C^\sigma$ denote the board obtained from $C$ by deleting all cells in the row and column that contain the cell $\sigma$. Then
\[ r_k(C) = r_k(C_\sigma) + r_{k-1}(C^\sigma). \]
Equivalently,
\[ R(C, x) = R(C_\sigma, x) + x R(C^\sigma, x). \]

Proof. The $k$-rook arrangements on the board $C$ can be divided into two kinds: those in which the cell $\sigma$ is not occupied and those in which it is occupied, i.e., the $k$-rook arrangements on the board $C_\sigma$ and the $(k-1)$-rook arrangements on the board $C^\sigma$. We thus have $r_k(C) = r_k(C_\sigma) + r_{k-1}(C^\sigma)$.

Two boards $C_1$ and $C_2$ are said to be independent if they have no common rows and no common columns. Independent boards must be disjoint. If $C_1$ and $C_2$ are independent boards, we denote by $C_1 + C_2$ the board that consists of the cells either in $C_1$ or in $C_2$, i.e., the union of the cells.

Proposition 7.3. Let $C_1$ and $C_2$ be independent boards. Then
\[ r_k(C_1 + C_2) = \sum_{i=0}^{k} r_i(C_1) \, r_{k-i}(C_2), \]
where $C_1 + C_2 = C_1 \cup C_2$. Equivalently,
\[ R(C_1 + C_2, x) = R(C_1, x) \, R(C_2, x). \]

Proof. Since $C_1$ and $C_2$ have disjoint rows and columns, each $i$-rook arrangement of $C_1$ together with each $j$-rook arrangement of $C_2$ constitutes an $(i + j)$-rook arrangement of $C_1 + C_2$, and vice versa. Thus
\[ r_k(C_1 + C_2) = \sum_{\substack{i + j = k \\ i, j \ge 0}} r_i(C_1) \, r_j(C_2). \]
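The cell-deletion recurrence of Proposition 7.2 translates directly into a recursive computation of rook numbers. The sketch below is illustrative and unoptimized (its running time is exponential in the number of cells); the function name `rook_numbers` and the representation of a board as a set of `(row, column)` pairs are assumptions made here, not notation from the notes.

```python
def rook_numbers(cells):
    """Return [r_0, r_1, ...] for a board given as a set of (row, col) cells,
    using R(C, x) = R(C minus one cell, x) + x * R(C minus its row and column, x)."""
    cells = frozenset(cells)
    if not cells:
        return [1]  # the empty board has rook polynomial 1
    sigma = next(iter(cells))
    # Board with only the cell sigma removed.
    without = rook_numbers(cells - {sigma})
    # Board with every cell in sigma's row or column removed.
    reduced = rook_numbers(
        {c for c in cells if c[0] != sigma[0] and c[1] != sigma[1]}
    )
    coeffs = [0] * max(len(without), len(reduced) + 1)
    for k, value in enumerate(without):
        coeffs[k] += value
    for k, value in enumerate(reduced):
        coeffs[k + 1] += value  # multiplication by x shifts the degree
    while len(coeffs) > 1 and coeffs[-1] == 0:
        coeffs.pop()
    return coeffs


# Forbidden cells for n = 5 with X1={1,2}, X2={3,4}, X3={1,5}, X4={2,3}, X5={4,5}.
forbidden = {1: {1, 2}, 2: {3, 4}, 3: {1, 5}, 4: {2, 3}, 5: {4, 5}}
board = {(i, j) for i, cols in forbidden.items() for j in cols}
print(rook_numbers(board))  # [1, 10, 35, 50, 25, 2]
```

On a full 2-by-3 board the same function returns `[1, 6, 6]`, i.e. $1 + 6x + 6x^2$.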

Example 7.1. The rook polynomial of an $m$-by-$n$ board $C$ with $m \le n$ is
\[ R(C, x) = \sum_{k=0}^{m} \binom{m}{k} \binom{n}{k} k! \, x^k. \]

Example 7.2. Find the rook polynomial of the board below. We use a square with a dot to denote a selected square when applying the recurrence formula of the rook polynomial.

[Figure: the board and the boards obtained at each step of the recurrence]

Applying the recurrence formula twice gives
\[ R(C, x) = (1 + 6x + 3 \cdot 2x^2) + 2x(1 + 4x + 2x^2) = 1 + 8x + 14x^2 + 4x^3. \]

8 Weighted Version of Inclusion-Exclusion Principle

Let $X$ be a set, either finite or infinite. The characteristic function of a subset $A$ of $X$ is the real-valued function $1_A$ on $X$ defined by
\[ 1_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases} \]
For real-valued functions $f$, $g$, and a real number $c$, we define the functions $f + g$, $cf$, and $fg$ on $X$ as follows:
\[ (f + g)(x) = f(x) + g(x), \quad (cf)(x) = c f(x), \quad (fg)(x) = f(x) g(x). \]
For subsets $A, B \subseteq X$ and an arbitrary function $f$ on $X$, it is easy to verify the following properties:

(i) $1_{A \cap B} = 1_A 1_B$,

(ii) $1_{\bar{A}} = 1_X - 1_A$,

(iii) $1_{A \cup B} = 1_A + 1_B - 1_{A \cap B}$,

(iv) $1_X f = f$.

The set of all real-valued functions on $X$ is a vector space over $\mathbb{R}$, and is further a commutative algebra with identity $1_X$.

Let $w : X \to \mathbb{R}$ be a function, usually referred to as a weight function on $X$, such that $w$ is nonzero at only finitely many elements of $X$; the value $w(x)$ is called the weight of $x$. For each subset $A \subseteq X$, the weight of $A$ is
\[ w(A) = \sum_{x \in A} w(x). \]
If $A = \emptyset$, we assume $w(\emptyset) = 0$. For each function $f : X \to \mathbb{R}$, the weight of $f$ is
\[ w(f) = \sum_{x \in X} w(x) f(x) = \langle w, f \rangle. \]
Clearly, $w(1_A) = w(A)$. For functions $f_i$ and constants $c_i$ ($1 \le i \le m$), we have
\[ w\left( \sum_{i=1}^{m} c_i f_i \right) = \sum_{i=1}^{m} c_i \, w(f_i). \]
This means that $w$ is a linear functional on the vector space of all real-valued functions on $X$.

Proposition 8.1. Let $P_1, \ldots, P_n$ be properties of the elements of a set $X$. Let $A_i$ denote the set of elements of $X$ that satisfy the property $P_i$, $1 \le i \le n$. Let $w$ be a weight function on $X$. Then the Inclusion-Exclusion Principle can be stated as
\[ 1_{\bar{A}_1 \cap \bar{A}_2 \cap \cdots \cap \bar{A}_n} = 1_X + \sum_{k=1}^{n} (-1)^k \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} 1_{A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}}; \tag{5} \]
\[ w\big( \bar{A}_1 \cap \cdots \cap \bar{A}_n \big) = w(X) + \sum_{k=1}^{n} (-1)^k \sum_{i_1 < \cdots < i_k} w\big( A_{i_1} \cap \cdots \cap A_{i_k} \big). \tag{6} \]

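Proof. Applying the properties of characteristic functions,
\[ 1_{\bar{A}_1 \cap \cdots \cap \bar{A}_n} = 1_{\bar{A}_1} \cdots 1_{\bar{A}_n} = (1_X - 1_{A_1}) \cdots (1_X - 1_{A_n}) \]
\[ = \sum f_1 \cdots f_n \quad (f_i = 1_X \text{ or } f_i = -1_{A_i},\ 1 \le i \le n) \]
\[ = \underbrace{1_X \cdots 1_X}_{n} + \sum_{k=1}^{n} \sum_{i_1 < \cdots < i_k} \underbrace{1_X \cdots 1_X}_{n-k} \, (-1_{A_{i_1}}) \cdots (-1_{A_{i_k}}) \]
\[ = 1_X + \sum_{k=1}^{n} (-1)^k \sum_{1 \le i_1 < \cdots < i_k \le n} 1_{A_{i_1} \cap \cdots \cap A_{i_k}}. \]
Applying the weight $w$ to both sides, we obtain
\[ w\big( \bar{A}_1 \cap \cdots \cap \bar{A}_n \big) = w(X) + \sum_{k=1}^{n} (-1)^k \sum_{i_1 < \cdots < i_k} w\big( A_{i_1} \cap \cdots \cap A_{i_k} \big). \]

Let $X$ be a finite set and $A_1, \ldots, A_n$ be subsets of $X$. Let $[n] = \{1, 2, \ldots, n\}$. We introduce two functions $\alpha$ and $\beta$ on the power set $\mathcal{P}([n])$ of $[n]$ as follows: for each subset $I \subseteq [n]$,
\[ \alpha(I) = \begin{cases} w\left( \bigcap_{i \in I} A_i \right) & \text{if } I \ne \emptyset, \\ 0 & \text{if } I = \emptyset; \end{cases} \qquad \beta(I) = \begin{cases} w\left( \bigcup_{i \in I} A_i \right) & \text{if } I \ne \emptyset, \\ 0 & \text{if } I = \emptyset. \end{cases} \]
By Inclusion-Exclusion,
\[ 1_{\bigcup_{i=1}^{n} A_i} = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \le i_1 < \cdots < i_k \le n} 1_{A_{i_1} \cap \cdots \cap A_{i_k}} = \sum_{I \subseteq [n],\, I \ne \emptyset} (-1)^{|I| - 1} \, 1_{\bigcap_{i \in I} A_i}. \]
Taking the weight $w$ on both sides, we obtain
\[ \beta([n]) = \sum_{I \subseteq [n],\, I \ne \emptyset} (-1)^{|I| - 1} \alpha(I) = \sum_{I \subseteq [n]} (-1)^{|I| + 1} \alpha(I). \]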

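If one replaces each $A_i$ with $\bar{A}_i$ in (5), then, using $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$, we have
\[ 1_{\bigcap_{i=1}^{n} A_i} = 1_X + \sum_{k=1}^{n} (-1)^k \sum_{i_1 < \cdots < i_k} 1_{\bar{A}_{i_1} \cap \cdots \cap \bar{A}_{i_k}} \]
\[ = \sum_{k=1}^{n} (-1)^{k+1} \sum_{i_1 < \cdots < i_k} \big( 1_X - 1_{\bar{A}_{i_1} \cap \cdots \cap \bar{A}_{i_k}} \big) \]
\[ = \sum_{k=1}^{n} (-1)^{k+1} \sum_{i_1 < \cdots < i_k} 1_{A_{i_1} \cup \cdots \cup A_{i_k}} = \sum_{I \subseteq [n],\, I \ne \emptyset} (-1)^{|I| + 1} \, 1_{\bigcup_{i \in I} A_i}. \]
Taking the weight $w$ on both sides, we obtain
\[ \alpha([n]) = \sum_{I \subseteq [n],\, I \ne \emptyset} (-1)^{|I| + 1} \beta(I) = \sum_{I \subseteq [n]} (-1)^{|I| + 1} \beta(I). \]

Theorem 8.2. We have the identities
\[ \beta(J) = \sum_{I \subseteq J} (-1)^{|I| + 1} \alpha(I), \quad J \subseteq [n]; \tag{7} \]
\[ \alpha(J) = \sum_{I \subseteq J} (-1)^{|I| + 1} \beta(I), \quad J \subseteq [n]. \tag{8} \]

9 Möbius Inversion

Let $(X, \le)$ be a locally finite poset, i.e., for each $x \le y$ in $X$ the interval $[x, y] = \{z \in X : x \le z \le y\}$ is a finite set. Let $I(X)$ be the set of all functions $f : X \times X \to \mathbb{R}$ such that
\[ f(x, y) = 0 \quad \text{if } x \not\le y; \]
such functions are called incidence functions on the poset $X$. For an incidence function $f$, we only specify the values $f(x, y)$ for the pairs $(x, y)$ such that $x \le y$, since $f(x, y) = 0$ for all pairs $(x, y)$ such that $x \not\le y$.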

The convolution product of two incidence functions $f, g \in I(X)$ is the incidence function $f * g : X \times X \to \mathbb{R}$ defined by
\[ (f * g)(x, y) = \sum_{z \in X} f(x, z) g(z, y). \]
In fact, $(f * g)(x, y) = 0$ if $x \not\le y$, and
\[ (f * g)(x, y) = \sum_{x \le z \le y} f(x, z) g(z, y) \quad \text{if } x \le y. \]
The convolution product satisfies the associative law:
\[ f * (g * h) = (f * g) * h, \]
where $f, g, h \in I(X)$. Indeed, for $x \le y$, we have
\[ (f * (g * h))(x, y) = \sum_{x \le z_1 \le y} f(x, z_1) (g * h)(z_1, y) = \sum_{x \le z_1 \le y} f(x, z_1) \sum_{z_1 \le z_2 \le y} g(z_1, z_2) h(z_2, y) = \sum_{x \le z_1 \le z_2 \le y} f(x, z_1) g(z_1, z_2) h(z_2, y). \]
Likewise, for $x \le y$, we have
\[ ((f * g) * h)(x, y) = \sum_{x \le z_1 \le z_2 \le y} f(x, z_1) g(z_1, z_2) h(z_2, y). \]
For $x \not\le y$, we automatically have $(f * (g * h))(x, y) = ((f * g) * h)(x, y) = 0$.

The vector space $I(X)$ together with the convolution is called the incidence algebra of $X$. We may think of the incidence functions $f$ as defined only on the set $\{(x, y) \in X \times X : x \le y\}$, with the convolution defined as
\[ (f * g)(x, y) = \sum_{x \le z \le y} f(x, z) g(z, y). \]

Example 9.1. Let $[n] = \{1, 2, \ldots, n\}$ be the poset with the natural order of natural numbers. An incidence function $f : [n] \times [n] \to \mathbb{R}$ can be viewed as an upper triangular $n \times n$ matrix $A = [a_{ij}]$ given by $a_{ij} = f(i, j)$. The convolution is just the multiplication of upper triangular matrices.
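Example 9.1 can be made concrete with a few lines of code. The sketch below stores an incidence function on the chain $[n]$ as a dictionary keyed by pairs $(x, y)$ with $x \le y$; this representation and the helper names `convolve`, `to_matrix`, and `matmul` are assumptions chosen for illustration.

```python
def convolve(f, g, n):
    """(f * g)(x, y) = sum of f(x, z) g(z, y) over x <= z <= y on the chain [n]."""
    return {
        (x, y): sum(f[(x, z)] * g[(z, y)] for z in range(x, y + 1))
        for x in range(1, n + 1)
        for y in range(x, n + 1)
    }


def to_matrix(f, n):
    """Upper triangular matrix a_ij = f(i, j), with zeros below the diagonal."""
    return [[f.get((i, j), 0) for j in range(1, n + 1)] for i in range(1, n + 1)]


def matmul(a, b):
    """Plain square matrix product."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


n = 4
pairs = [(x, y) for x in range(1, n + 1) for y in range(x, n + 1)]
f = {(x, y): x + y for (x, y) in pairs}      # arbitrary incidence functions
g = {(x, y): y - x + 1 for (x, y) in pairs}
h = {(x, y): x * y for (x, y) in pairs}

# Convolution agrees with multiplication of the upper triangular matrices.
print(to_matrix(convolve(f, g, n), n) == matmul(to_matrix(f, n), to_matrix(g, n)))
# Associativity of the convolution product.
print(convolve(f, convolve(g, h, n), n) == convolve(convolve(f, g, n), h, n))
```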

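There is a special function $\delta \in I(X)$, called the delta function of the poset $(X, \le)$, defined by
\[ \delta(x, y) = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{if } x \ne y. \end{cases} \]
The delta function $\delta$ is the identity of the algebra $I(X)$, i.e., for all $f \in I(X)$,
\[ \delta * f = f = f * \delta. \]
Indeed, for $x \le y$,
\[ (\delta * f)(x, y) = \sum_{x \le z \le y} \delta(x, z) f(z, y) = f(x, y); \]
\[ (f * \delta)(x, y) = \sum_{x \le z \le y} f(x, z) \delta(z, y) = f(x, y). \]
Let $f \in I(X)$ be an incidence function. A left inverse of $f$ is a function $g \in I(X)$ such that
\[ g * f = \delta. \]
A right inverse of $f$ is a function $h \in I(X)$ such that
\[ f * h = \delta. \]
If $f$ has a left inverse $g$ and a right inverse $h$, then $g = h$. In fact,
\[ g = g * \delta = g * (f * h) = (g * f) * h = \delta * h = h. \]
If $f$ has both a left and a right inverse, we say that $f$ is invertible; the left inverse and right inverse of $f$ must be the same and unique, and it is simply called the inverse of $f$. Note that
\[ g * f = \delta \iff \sum_{x \le z \le y} g(x, z) f(z, y) = \delta(x, y), \quad \forall x \le y. \]
When $x = y$, we have $g(x, x) f(x, x) = 1$, i.e., $g(x, x) = \frac{1}{f(x, x)}$; so $f(x, x) \ne 0$. We can obtain $g \in I(X)$ inductively as follows:
\[ g(x, x) = \frac{1}{f(x, x)}, \quad \forall x \in X, \tag{9} \]
\[ g(x, y) = -\frac{1}{f(y, y)} \sum_{x \le z < y} g(x, z) f(z, y), \quad \forall x < y. \tag{10} \]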

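This means that $f$ is invertible iff $f(x, x) \ne 0$ for all $x \in X$. Likewise,
\[ f * g = \delta \iff \sum_{x \le z \le y} f(x, z) g(z, y) = \delta(x, y), \quad \forall x \le y. \]
We can obtain $g \in I(X)$ inductively as follows:
\[ g(x, x) = \frac{1}{f(x, x)}, \quad \forall x \in X, \tag{11} \]
\[ g(x, y) = -\frac{1}{f(x, x)} \sum_{x < z \le y} f(x, z) g(z, y), \quad \forall x < y. \tag{12} \]
The zeta function $\zeta$ of the poset $(X, \le)$ is the incidence function such that $\zeta(x, y) = 1$ for all $(x, y)$ with $x \le y$. Clearly, $\zeta$ is invertible. The Möbius function $\mu$ of the poset $(X, \le)$ is the inverse of the zeta function $\zeta$ in the incidence algebra $I(X)$, i.e., $\mu = \zeta^{-1}$. The Möbius function $\mu$ can be inductively defined by
\[ \mu(x, x) = 1, \quad \forall x \in X, \tag{13} \]
\[ \mu(x, y) = -\sum_{x \le z < y} \mu(x, z) = -\sum_{x < z \le y} \mu(z, y), \quad \forall x < y. \tag{14} \]

Example 9.2. Let $X = \{1, 2, \ldots, n\}$. The Möbius function of the poset $(\mathcal{P}(X), \subseteq)$ is given by
\[ \mu(A, B) = (-1)^{|B - A|}, \quad \text{where } A \subseteq B. \]
This can be proved by induction on $|B - A|$. For $|B - A| = 0$, i.e., $A = B$, it is obviously true. Assume that it is true when $|B - A| < m$. Consider the case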

of $|B - A| = m$. In fact,
\[ \mu(A, B) = -\sum_{A \subseteq C \subsetneq B} \mu(A, C) = -\sum_{A \subseteq C \subsetneq B} (-1)^{|C - A|} = -\sum_{k=0}^{m-1} (-1)^k \binom{m}{k} = (-1)^m = (-1)^{|B - A|}. \]

Example 9.3. Let $X = \{1, 2, \ldots, n\}$ and consider the linearly ordered set $(X, \le)$, where $1 < 2 < \cdots < n$. Then for $(k, l) \in X \times X$ with $k \le l$, the Möbius function is given by
\[ \mu(k, l) = \begin{cases} 1 & \text{if } l = k, \\ -1 & \text{if } l = k + 1, \\ 0 & \text{otherwise.} \end{cases} \]
It is easy to see that $\mu(k, k) = 1$ and $\mu(k, k + 1) = -1$. It follows that $\mu(k, k + 2) = 0$ and subsequently, $\mu(k, k + i) = 0$ for all $i \ge 2$.

[Figure 1: Computing the Möbius function by Hasse diagram]

Let $(X, \le)$ be a finite poset. For each function $f : X \to \mathbb{R}$, we can multiply an incidence function $\alpha \in I(X)$ to the left of $f$ and to the right as follows to

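obtain two functions $\alpha * f$ and $f * \alpha$ on $X$, defined by
\[ (\alpha * f)(x) = \sum_{x \le y} \alpha(x, y) f(y), \quad x \in X; \tag{15} \]
\[ (f * \alpha)(y) = \sum_{x \le y} f(x) \alpha(x, y), \quad y \in X. \tag{16} \]

Theorem 9.1. Let $(X, \le)$ be a finite poset. Let $\alpha \in I(X)$ be invertible and let $f, g \in F(X)$, the set of real-valued functions on $X$. Then $g = \alpha * f$ iff $f = \alpha^{-1} * g$, i.e.,
\[ g(x) = \sum_{x \le y} \alpha(x, y) f(y), \ \forall x \in X \iff f(x) = \sum_{x \le y} \alpha^{-1}(x, y) g(y), \ \forall x \in X. \]
Likewise, $g = f * \alpha$ iff $f = g * \alpha^{-1}$, i.e.,
\[ g(y) = \sum_{x \le y} f(x) \alpha(x, y), \ \forall y \in X \iff f(y) = \sum_{x \le y} g(x) \alpha^{-1}(x, y), \ \forall y \in X. \tag{17} \]

Proof. It follows from the fact that $g = \alpha * f$ iff $\alpha^{-1} * g = \alpha^{-1} * (\alpha * f)$, and the fact that $\alpha^{-1} * (\alpha * f) = (\alpha^{-1} * \alpha) * f = \delta * f = f$. Likewise, $g = f * \alpha \iff g * \alpha^{-1} = f * \alpha * \alpha^{-1} = f * \delta = f$.

Theorem 9.2. Let $(X, \le)$ be a finite poset. Let $f, g$ be real-valued functions on $X$. Then
\[ g(x) = \sum_{x \le y} f(y), \ \forall x \in X \iff f(x) = \sum_{x \le y} \mu(x, y) g(y), \ \forall x \in X; \]
\[ g(y) = \sum_{x \le y} f(x), \ \forall y \in X \iff f(y) = \sum_{x \le y} g(x) \mu(x, y), \ \forall y \in X. \tag{18} \]

Proof. The first inversion formula follows from the fact that $g = \zeta * f \iff f = \zeta^{-1} * g = \mu * g$. The second inversion formula follows from the fact that $g = f * \zeta \iff f = g * \zeta^{-1} = g * \mu$.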
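The inductive definition of $\mu$ and the first inversion formula of Theorem 9.2 can be exercised on a small poset. The following is a sketch only: the helper `mobius_function`, the choice of the subsets of $\{1, 2, 3\}$ ordered by inclusion as the test poset, and the test function `f` are all assumptions made for illustration.

```python
from functools import lru_cache
from itertools import combinations


def mobius_function(elements, leq):
    """Return mu for a finite poset, computed from mu(x, x) = 1 and
    mu(x, y) = -(sum of mu(x, z) over x <= z < y) for x < y."""

    @lru_cache(maxsize=None)
    def mu(x, y):
        if x == y:
            return 1
        if not leq(x, y):
            return 0
        return -sum(
            mu(x, z) for z in elements if leq(x, z) and leq(z, y) and z != y
        )

    return mu


# Subsets of {1, 2, 3} ordered by inclusion.
subsets = [frozenset(c) for k in range(4) for c in combinations([1, 2, 3], k)]
mu = mobius_function(subsets, lambda a, b: a <= b)

f = {s: len(s) ** 2 + 1 for s in subsets}  # arbitrary test function
g = {x: sum(f[y] for y in subsets if x <= y) for x in subsets}
recovered = {x: sum(mu(x, y) * g[y] for y in subsets if x <= y) for x in subsets}

print(recovered == f)  # True: f is recovered from g by Moebius inversion
```

On this poset the computed values agree with $\mu(A, B) = (-1)^{|B - A|}$.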

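Writing in summations, for each fixed $y \in X$, we have
\[ \sum_{x \le y} g(x) \mu(x, y) = \sum_{x \le y} \sum_{u \le x} f(u) \mu(x, y) = \sum_{x \le y} \sum_{u \le x} f(u) \zeta(u, x) \mu(x, y) \]
\[ = \sum_{u \le y} f(u) \sum_{u \le x \le y} \zeta(u, x) \mu(x, y) = \sum_{u \le y} f(u) \delta(u, y) = f(y). \]

Corollary 9.3. Let $[n] = \{1, 2, \ldots, n\}$. Let $f, g : \mathcal{P}([n]) \to \mathbb{R}$ be functions such that
\[ g(I) = \sum_{J \subseteq I} f(J), \quad \forall I \subseteq [n]. \]
Then
\[ f(I) = \sum_{J \subseteq I} (-1)^{|I - J|} g(J), \quad \forall I \subseteq [n]. \]

Permanent. Fix a positive integer $n$. Let $S_n$ denote the symmetric group of $[n] = \{1, 2, \ldots, n\}$, i.e., the set of all permutations of $[n]$. Let $A$ be an $n \times n$ real matrix. The permanent of $A$ is defined as the number
\[ \operatorname{per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i, \sigma(i)}. \]
For the chessboard $C$ in Example 6.1, we associate a 0-1 matrix $A = [a_{ij}]$ as follows:

[Figure: the chessboard $C$]

\[ A = \begin{pmatrix} 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \end{pmatrix}. \]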

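Then the number of ways to put 5 non-attacking indistinguishable rooks on $C$ is the permanent $\operatorname{per}(A)$.

Fix an $n$-by-$n$ matrix $A$. For each subset $I \subseteq [n]$, let $A_I$ denote the submatrix of $A$ whose columns are those of $A$ indexed by the members of $I$. Let $F(I)$ be the set of all functions $\sigma : [n] \to I$, and let $G(I)$ be the set of all surjective functions from $[n]$ onto $I$. Then
\[ F(I) = \bigsqcup_{J \subseteq I} G(J). \]
We introduce a real-valued function $f$ on the power set $\mathcal{P}([n])$ of $[n]$, defined by $f(\emptyset) = 0$ and
\[ f(I) = \sum_{\sigma \in G(I)} \prod_{i=1}^{n} a_{i, \sigma(i)}, \quad I \subseteq [n],\ I \ne \emptyset. \]
Note that $f([n]) = \operatorname{per}(A)$. Let $g : \mathcal{P}([n]) \to \mathbb{R}$ be defined by
\[ g(I) = \sum_{J \subseteq I} f(J), \quad I \subseteq [n]. \]
Then
\[ g(I) = \sum_{J \subseteq I} \sum_{\sigma \in G(J)} \prod_{i=1}^{n} a_{i, \sigma(i)} = \sum_{\sigma \in F(I)} \prod_{i=1}^{n} a_{i, \sigma(i)} = \prod_{i=1}^{n} \sum_{j \in I} a_{ij}, \quad I \subseteq [n]. \]
Thus by the Möbius inversion, we have
\[ f(I) = \sum_{J \subseteq I} (-1)^{|I - J|} g(J), \quad I \subseteq [n]. \]
In particular,
\[ f([n]) = \sum_{I \subseteq [n]} (-1)^{n - |I|} g(I). \]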
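The expression for $f([n])$ in terms of $g(I) = \prod_{i} \sum_{j \in I} a_{ij}$ can be evaluated directly and compared with the definition of the permanent. The sketch below is illustrative; the names `permanent_ie` and `permanent_brute` are hypothetical, and the test matrix is the 0-1 matrix of the chessboard of Example 6.1.

```python
from itertools import combinations, permutations
from math import prod


def permanent_ie(a):
    """Permanent as the signed sum over column subsets I of prod_i sum_{j in I} a_ij.
    The empty subset is skipped because g(empty set) = 0."""
    n = len(a)
    total = 0
    for size in range(1, n + 1):
        for cols in combinations(range(n), size):
            g = prod(sum(row[j] for j in cols) for row in a)
            total += (-1) ** (n - size) * g
    return total


def permanent_brute(a):
    """Permanent straight from the definition, summing over all permutations."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


A = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 0],
]
print(permanent_ie(A))     # 13
print(permanent_brute(A))  # 13
```

The value 13 agrees with the rook-placement count found in Example 6.1.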

Since f([n] per(a, it follows that per(a I [n] ( 1 n I n ( a ij. (19 However this formula is not much useful because there are 2 n terms in the summation. Definition 9.4. Let (X i, i (i 1, 2 be two posets. (X 1 X 2, is given by i1 j I (x 1, x 2 (y 1, y 2 iff x 1 1 y 1, x 2 2 y 2. The product poset For the convenience, we write 1 and 2 simply as. Then (X 1 X 2, is a poset. Theorem 9.5. Let µ i be the Möbius functions of posets (X i, i, i 1, 2. Then the Möbius function µ of X 1 X 2 for (x 1, x 2 (y 1, y 2 is given by µ((x 1, x 2, (y 1, y 2 µ 1 (x 1, y 1 µ 2 (x 2, y 2. Proof. We proceed by induction on l((x 1, x 2, (y 1, y 2, the length of the longest chains in the interval [(x 1, x 2, (y 1, y 2 ]. It is obviously true when l 0. For l 1, by inductive definition of µ, µ((x 1, x 2, (y 1, y 2 µ((x 1, x 2, (z 1, z 2 (x 1,x 2 (z 1,z 2 (y 1,y 2 (x 1,x 2 (z 1,z 2 (y 1,y 2 µ 1 (x 1, z 1 µ 2 (x 2, z 2 (by IH µ 1 (x 1, y 1 µ 2 (x 2, y 2 µ 1 (x 1, z 1 µ 2 (x 2, z 2 x 1 z 1 y 1 x 2 z 2 y 2 µ 1 (x 1, y 1 µ 2 (x 2, y 2 δ 1 (x 1, y 1 δ 2 (x 2, y 2 µ 1 (x 1, y 1 µ 2 (x 2, y 2. Example 9.4. The set Z + {1, 2,...} of positive integers is a poset with the partial order of divisibility. Let n Z + be factored as n p e 1 1 pe 2 2 pe k k, 31

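where the $p_i$ are distinct primes and the $e_i$ are positive integers. Since $\mu(m, m) = 1$ for all $m \in \mathbb{Z}_+$ and $\mu(1, n)$ is inductively given by
\[ \mu(1, n) = -\sum_{m \in \mathbb{Z}_+,\ m \mid n,\ m \ne n} \mu(1, m), \]
we only need to consider the subposet $(D(n), \text{divisibility})$, where
\[ D(n) = \{d \in [n] : d \mid n\}. \]
Elements $r, s \in D(n)$ can be written as
\[ r = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}, \quad s = p_1^{b_1} p_2^{b_2} \cdots p_k^{b_k}, \quad \text{where } 0 \le a_i, b_i \le e_i. \]
Then $r \mid s$ iff $a_i \le b_i$ for all $i$. This means that the poset $D(n)$ is isomorphic to the product poset
\[ Q = \{(a_1, \ldots, a_k) : a_i \in [0, e_i]\} = \prod_{i=1}^{k} [0, e_i], \]
where $[0, e_i] = \{0, 1, \ldots, e_i\}$. Thus $\mu(1, n) = \mu_Q((0, \ldots, 0), (e_1, \ldots, e_k))$, where
\[ \mu_Q((0, \ldots, 0), (e_1, \ldots, e_k)) = \prod_{i=1}^{k} \mu_{[0, e_i]}(0, e_i). \]
Note that
\[ \mu_{[0, e_i]}(0, e_i) = \begin{cases} 1 & \text{if } e_i = 0, \\ -1 & \text{if } e_i = 1, \\ 0 & \text{if } e_i \ge 2 \end{cases} = \begin{cases} (-1)^{e_i} & \text{if } e_i \le 1, \\ 0 & \text{if } e_i \ge 2. \end{cases} \]
It follows that
\[ \mu(1, n) = \begin{cases} (-1)^{e_1 + \cdots + e_k} & \text{if all } e_i \le 1, \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1 & \text{if } n = 1, \\ (-1)^j & \text{if } n \text{ is a product of } j \text{ distinct primes}, \\ 0 & \text{otherwise.} \end{cases} \]
Now for arbitrary $m, n \in \mathbb{Z}_+$ such that $m \mid n$, the bijection
\[ \{u \in \mathbb{Z}_+ : m \mid u,\ u \mid n\} \to \left\{ v \in \mathbb{Z}_+ : v \,\Big|\, \frac{n}{m} \right\}, \quad u \mapsto \frac{u}{m} \]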

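is an isomorphism of posets for the partial order of divisibility. We thus have

$$\mu(m, n) = \mu\Bigl(1, \frac{n}{m}\Bigr).$$

In number theory, we write $\mu(1, n)$ as $\mu(n)$.

Theorem 9.6. Let $f, g : \mathbb{Z}_+ \to \mathbb{C}$ be two functions. Then

$$g(n) = \sum_{d \mid n} f(d) \quad \text{for all } n \in \mathbb{Z}_+$$

is equivalent to

$$f(n) = \sum_{d \mid n} g(d)\, \mu\Bigl(\frac{n}{d}\Bigr) \quad \text{for all } n \in \mathbb{Z}_+.$$

Example 9.5. Let $\Phi_n = \{a \in [n] : \gcd(a, n) = 1\}$. Then $\varphi(n) = |\Phi_n|$. Define

$$g(n) = \sum_{d \mid n} \varphi(d), \quad n \in \mathbb{Z}_+.$$

Consider the set $\Phi_{n,d} = \{k \in [n] : \gcd(k, n) = d\}$ for each factor $d$ of $n$. In particular, if $d = 1$, then $\Phi_{n,1} = \Phi_n$. In fact, there is a bijection

$$\Phi_{n,d} \to \Phi_{n/d}, \qquad k \mapsto k/d.$$

(Injectivity is trivial. Surjectivity follows from $da \mapsto a$ for $a \in \Phi_{n/d}$.) Then $\varphi(n/d) = |\Phi_{n/d}| = |\Phi_{n,d}|$. Note that for each integer $k \in [n]$, there is a unique integer $d \in [n]$ such that $\gcd(k, n) = d$. We have $[n] = \bigsqcup_{d \mid n} \Phi_{n,d}$ (disjoint union). Thus

$$n = \sum_{d \mid n} |\Phi_{n,d}| = \sum_{d \mid n} \varphi\Bigl(\frac{n}{d}\Bigr) = \sum_{k \mid n} \varphi(k) = \sum_{d \mid n} \varphi(d).$$

By the Möbius inversion,

$$\varphi(n) = \sum_{k \mid n} k\, \mu(k, n) = \sum_{k \mid n} k\, \mu\Bigl(\frac{n}{k}\Bigr) = \sum_{dk = n} k\, \mu(d) = \sum_{d \mid n} \mu(d)\, \frac{n}{d}. \tag{20}$$

Let $n \ge 2$ and let $p_1, p_2, \ldots, p_r$ be the distinct primes dividing $n$. Then

$$\{d \in [n] : d \mid n,\ \mu(d) \ne 0\} = \Bigl\{ \prod_{i \in I} p_i : I \subseteq [r] \Bigr\},$$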

where $\prod_{i \in \emptyset} p_i = 1$. Since $\mu(1) = 1$, $\mu(d) = (-1)^k$ if $d = p_{i_1} \cdots p_{i_k}$ is a product of $k$ distinct primes, and $\mu(d) = 0$ otherwise, we see that (20) becomes

$$\begin{aligned}
\varphi(n) &= n - \Bigl( \frac{n}{p_1} + \frac{n}{p_2} + \cdots + \frac{n}{p_r} \Bigr) + \Bigl( \frac{n}{p_1 p_2} + \frac{n}{p_1 p_3} + \cdots \Bigr) - \cdots + (-1)^r \frac{n}{p_1 p_2 \cdots p_r} \\
&= n \prod_{i=1}^{r} \Bigl( 1 - \frac{1}{p_i} \Bigr) = n \prod_{p \mid n,\ p \text{ prime}} \Bigl( 1 - \frac{1}{p} \Bigr).
\end{aligned}$$

Example 9.6. Let $\Sigma = \{a_1, \ldots, a_k\}$ be a set and $M = \{\infty \cdot a_1, \ldots, \infty \cdot a_k\}$ a multiset over $\Sigma$. A circular $n$-permutation of $M$ is an arrangement of $n$ elements of $M$ around a circle. Each circular $n$-permutation of $M$ may be considered as a periodic double-infinite sequence

$$(x_i) = (x_i)_{i \in \mathbb{Z}} = \cdots x_{-2}\, x_{-1}\, x_0\, x_1\, x_2 \cdots$$

of period $n$, i.e., $x_{i+n} = x_i$ for all $i \in \mathbb{Z}$. The minimum period of a circular permutation $(x_i)$ of $M$ is the smallest positive integer among all periods. We shall see below that the minimum period of a double-infinite sequence divides all periods of the sequence.

Let $\Sigma^n$ denote the set of all words of length $n$ over $\Sigma$, and $\Sigma^*$ the set of all words over $\Sigma$. Then $\Sigma^* = \bigsqcup_{n \ge 0} \Sigma^n$ (disjoint union). Consider the map

$$\sigma : \Sigma^* \to \Sigma^*, \qquad \sigma(x_1 x_2 \cdots x_n) = x_2 \cdots x_n x_1, \qquad \sigma(\lambda) = \lambda.$$

A word $w$ of length $n$ is said to be primitive if the words $w, \sigma(w), \sigma^2(w), \ldots, \sigma^{n-1}(w)$ are distinct. A period of an $n$-word $w$ is a positive integer $m$ such that $\sigma^m(w) = w$. Every $n$-word has the trivial period $n$. The smallest of all periods of a word $w$ is called the minimum period of $w$, and it is a common factor of all periods of $w$. In fact, for a period $m$ of a word $w$ which has minimum period $d$, write $m = qd + r$, where $0 \le r < d$. Suppose $r > 0$. Then

$$\sigma^r(w) = \sigma^r \underbrace{\sigma^d \cdots \sigma^d}_{q}(w) = \sigma^{qd + r}(w) = \sigma^m(w) = w.$$

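This means that $r$ is a period of $w$ and is smaller than $d$, which is a contradiction.

Let $\Sigma^0_d$ denote the set of all primitive $d$-words over $\Sigma$, and $\Sigma^0_{n,d}$ the subset of $\Sigma^n$ whose $n$-words have minimum period $d$, where $d \mid n$. Then $|\Sigma^0_{n,d}| = |\Sigma^0_d|$.

Denote by $\Sigma_\infty$ the set of all double-infinite sequences over $\Sigma$, by $\Sigma_{\infty,n}$ the subset of $\Sigma_\infty$ whose members have period $n$, and by $\Sigma^0_{\infty,n}$ the subset of $\Sigma_{\infty,n}$ whose members have minimum period $n$. Then

$$\Sigma_{\infty,n} = \bigsqcup_{d \mid n} \Sigma^0_{\infty,d}.$$

Let $C_n(M)$ denote the set of all circular $n$-permutations of $M$. Then $C_n(M)$ can be identified with the set $\Sigma_{\infty,n}$. Thus

$$|C_n(M)| = \sum_{m \mid n} |\Sigma^0_{\infty,m}|.$$

Now we consider the map

$$F : \Sigma^0_m \to \Sigma^0_{\infty,m}, \qquad w = s_1 s_2 \cdots s_m \mapsto (x_i) = \cdots www \cdots,$$

which is clearly surjective, and each member of $\Sigma^0_{\infty,m}$ receives exactly $m$ objects of $\Sigma^0_m$. So $|\Sigma^0_{\infty,m}| = |\Sigma^0_m| / m$. Thus

$$|C_n(M)| = \sum_{m \mid n} |\Sigma^0_{\infty,m}| = \sum_{m \mid n} \frac{|\Sigma^0_m|}{m}.$$

Since $\Sigma^n = \bigsqcup_{d \mid n} \Sigma^0_{n,d}$, we have

$$|\Sigma^n| = \sum_{d \mid n} |\Sigma^0_{n,d}| = \sum_{d \mid n} |\Sigma^0_d|.$$

By the Möbius inversion,

$$|\Sigma^0_n| = \sum_{d \mid n} |\Sigma^d|\, \mu(d, n) = \sum_{d \mid n} |\Sigma^d|\, \mu\Bigl(\frac{n}{d}\Bigr).$$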
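The inversion formula for the number of primitive words can be checked numerically for small alphabets. The following is a minimal sketch, assuming $|\Sigma| = k$ so that $|\Sigma^d| = k^d$; the function names `mobius`, `primitive_word_count`, and `brute_force_primitive` are illustrative and do not come from the notes.

```python
from itertools import product


def mobius(n):
    """Number-theoretic Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divides the original argument
                return 0
            result = -result
        p += 1
    if n > 1:                   # one prime factor is left over
        result = -result
    return result


def primitive_word_count(k, n):
    """Evaluate sum_{d | n} k^d * mu(n/d), the inverted formula."""
    return sum(k ** d * mobius(n // d) for d in range(1, n + 1) if n % d == 0)


def brute_force_primitive(k, n):
    """Count length-n words over k letters whose n rotations are all distinct."""
    count = 0
    for w in product(range(k), repeat=n):
        rotations = {w[i:] + w[:i] for i in range(n)}
        if len(rotations) == n:
            count += 1
    return count
```

For example, `primitive_word_count(2, 4)` evaluates $2^4 - 2^2 = 12$, which agrees with removing the words 0000, 1111, 0101, 1010 from the 16 binary words of length 4.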

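It follows that

$$|C_n(M)| = \sum_{m \mid n} \frac{1}{m} \sum_{a \mid m} |\Sigma^a|\, \mu\Bigl(\frac{m}{a}\Bigr) = \sum_{a \mid m,\ m \mid n} \frac{1}{m}\, |\Sigma^a|\, \mu\Bigl(\frac{m}{a}\Bigr).$$

Let $b = m/a$, i.e., $m = ab$. Then $a \mid m$ and $m \mid n$ are equivalent to $a \mid n$ and $b \mid (n/a)$. Thus

$$|C_n(M)| = \sum_{a \mid n} |\Sigma^a| \sum_{b \mid (n/a)} \frac{\mu(b)}{ab}.$$

Since $\varphi(n) = \sum_{d \mid n} \mu(d)\, n/d$ by (20), we see that

$$\sum_{b \mid (n/a)} \frac{\mu(b)}{ab} = \frac{1}{n} \sum_{b \mid (n/a)} \mu(b)\, \frac{n/a}{b} = \frac{1}{n}\, \varphi\Bigl(\frac{n}{a}\Bigr).$$

We finally have

$$\begin{aligned}
|C_n(M)| &= \frac{1}{n} \sum_{a \mid n} |\Sigma^a|\, \varphi\Bigl(\frac{n}{a}\Bigr) && (\text{since } |\Sigma^a| = k^a) \\
&= \frac{1}{n} \sum_{a \mid n} k^a\, \varphi\Bigl(\frac{n}{a}\Bigr) && (\text{set } d = n/a) \\
&= \frac{1}{n} \sum_{d \mid n} k^{n/d}\, \varphi(d).
\end{aligned}$$

Theorem 9.7. The number of circular $n$-permutations of the objects of a $k$-set, with repetition allowed, is

$$\frac{1}{n} \sum_{d \mid n} k^{n/d}\, \varphi(d).$$

Example 9.7. The number of circular permutations of the multiset $M$ of type $(n_1, \ldots, n_k)$ with $m = \gcd(n_1, \ldots, n_k)$ and $n = n_1 + \cdots + n_k$ is given by

$$\frac{1}{n} \sum_{a \mid m} \binom{n/a}{n_1/a, \ldots, n_k/a}\, \varphi(a). \tag{21}$$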
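The count in Theorem 9.7 can be sanity-checked against a direct enumeration for small $k$ and $n$. The sketch below is only an illustration: the names `phi`, `circular_count`, and `circular_count_brute` are hypothetical, and the brute-force check represents each circular permutation by the least of its rotations.

```python
from itertools import product
from math import gcd


def phi(n):
    """Euler's totient, counted directly as |{a in [n] : gcd(a, n) = 1}|."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def circular_count(k, n):
    """Evaluate (1/n) * sum_{d | n} k^(n/d) * phi(d)."""
    total = sum(k ** (n // d) * phi(d) for d in range(1, n + 1) if n % d == 0)
    return total // n   # the sum is divisible by n, since it counts classes


def circular_count_brute(k, n):
    """Count rotation classes of length-n words over k letters."""
    representatives = set()
    for w in product(range(k), repeat=n):
        # The least rotation is a canonical representative of the class.
        representatives.add(min(w[i:] + w[:i] for i in range(n)))
    return len(representatives)
```

For instance, with $k = 2$ and $n = 4$ the formula gives $(2^4 \cdot 1 + 2^2 \cdot 1 + 2^1 \cdot 2)/4 = 6$.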

If $a$ is the minimum period of a permutation of $M$, then the permutation is the $(n/a)$-fold repetition of a word of length $a$, so $n/a$ divides each $n_i$ and hence $(n/a) \mid m$. Let

$$M_d = \{(dn_1/m) \cdot a_1, \ldots, (dn_k/m) \cdot a_k\}, \quad d \ge 1.$$

Clearly, $\gcd\{dn_1/m, \ldots, dn_k/m\} = d$ and $M_m = M$. Let $S(M_d)$ denote the set of all permutations of $M_d$, $S^0_a(M_d)$ the set of all permutations of $M_d$ with minimum period $a$, and $S^0(M_d)$ the set of all primitive permutations of $M_d$, i.e., permutations whose minimum period is the cardinality $dn/m$ of $M_d$.

For each $w \in S(M_d)$, let $a$ be the minimum period of $w$. Then

$$w = \underbrace{w_1 w_1 \cdots w_1}_{b}$$

with a primitive word $w_1$. Thus $w_1$ is a word of length $a$ of type

$$\Bigl( \frac{dn_1/m}{b}, \ldots, \frac{dn_k/m}{b} \Bigr).$$

Since $b \mid (dn_i/m)$ for all $i$, it follows that $b \mid \gcd\{dn_1/m, \ldots, dn_k/m\}$, i.e., $b \mid d$. So $w \in S^0_a(M_d)$ with

$$a = \sum_{i=1}^{k} (d/b)(n_i/m) = (d/b)(n/m).$$

Note that

$$\gcd\{(d/b)(n_1/m), \ldots, (d/b)(n_k/m)\} = d/b.$$

We see that

$$S(M_d) = \bigsqcup_{b \mid d} S^0_{(d/b)(n/m)}(M_d), \qquad |S^0_{(d/b)(n/m)}(M_d)| = |S^0(M_{d/b})| \quad \text{if } b \mid d.$$

We then have

$$|S(M_d)| = \sum_{b \mid d} |S^0(M_{d/b})| = \sum_{a \mid d} |S^0(M_a)|.$$

By the Möbius inversion,

$$|S^0(M_d)| = \sum_{a \mid d} |S(M_a)|\, \mu\Bigl(\frac{d}{a}\Bigr).$$

Let $C(M_d)$ denote the set of all circular permutations of $M_d$, $C^0_a(M_d)$ the set of all circular permutations of $M_d$ with minimum period $a$, and $C^0(M_d)$ the set of all primitive circular permutations of $M_d$. Likewise,

$$C(M_d) = \bigsqcup_{a \mid d} C^0_{an/m}(M_d),$$

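$$|C^0_{an/m}(M_d)| = |C^0(M_a)| \quad \text{if } a \mid d.$$

Note that $|M_a| = an/m$ and $|S^0(M_a)| = |C^0(M_a)| \cdot an/m$. We have

$$|C(M_d)| = \sum_{a \mid d} |C^0(M_a)| = \sum_{a \mid d} \frac{|S^0(M_a)|}{an/m} = \sum_{a \mid d} \frac{m}{an} \sum_{b \mid a} |S(M_b)|\, \mu\Bigl(\frac{a}{b}\Bigr).$$

Set $c = a/b$, i.e., $a = bc$; then $a \mid d$ and $b \mid a$ are equivalent to $b \mid d$ and $c \mid (d/b)$. Thus

$$\begin{aligned}
|C(M_d)| &= \frac{1}{n} \sum_{b \mid d} \frac{m}{d}\, |S(M_b)| \sum_{c \mid (d/b)} \mu(c)\, \frac{d/b}{c} \\
&= \frac{1}{n} \sum_{b \mid d} \frac{m}{d}\, |S(M_b)|\, \varphi\Bigl(\frac{d}{b}\Bigr) && (\text{set } a = d/b) \\
&= \frac{1}{n} \sum_{a \mid d} \frac{m}{d}\, |S(M_{d/a})|\, \varphi(a).
\end{aligned}$$

Taking $d = m$, we have $M = M_m$. Recall that

$$|S(M_b)| = \binom{bn/m}{bn_1/m, \ldots, bn_k/m};$$

therefore

$$|C(M)| = |C(M_m)| = \frac{1}{n} \sum_{a \mid m} \varphi(a)\, |S(M_{m/a})| = \frac{1}{n} \sum_{a \mid m} \binom{n/a}{n_1/a, \ldots, n_k/a}\, \varphi(a).$$

Example 9.8. Consider the multiset $M = \{12 \cdot a_1, 24 \cdot a_2, 18 \cdot a_3\}$ of type $(12, 24, 18)$. Then $m = \gcd(12, 24, 18) = 6$, whose factors are $1, 2, 3, 6$. Recall the values $\varphi(1) = 1$, $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(6) = 2$. The number of circular permutations of $M$ is

$$\frac{1}{54} \left[ \binom{54}{12, 24, 18} \varphi(1) + \binom{27}{6, 12, 9} \varphi(2) + \binom{18}{4, 8, 6} \varphi(3) + \binom{9}{2, 4, 3} \varphi(6) \right].$$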
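Formula (21) is easy to evaluate by machine, which is convenient for types as large as the one in Example 9.8. The following is a minimal sketch under the assumption that a multiset is passed as its type $(n_1, \ldots, n_k)$; the names `multinomial`, `circular_multiset_count`, and `circular_multiset_brute` are illustrative, and the brute-force routine is only feasible for very small types.

```python
from functools import reduce
from itertools import permutations
from math import factorial, gcd


def phi(n):
    """Euler's totient, counted directly from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def multinomial(parts):
    """Multinomial coefficient (sum(parts))! / (parts[0]! * parts[1]! * ...)."""
    value = factorial(sum(parts))
    for p in parts:
        value //= factorial(p)
    return value


def circular_multiset_count(type_):
    """Evaluate (1/n) * sum_{a | m} multinomial(n_i / a) * phi(a), formula (21)."""
    m = reduce(gcd, type_)
    n = sum(type_)
    total = sum(
        multinomial([ni // a for ni in type_]) * phi(a)
        for a in range(1, m + 1)
        if m % a == 0
    )
    return total // n


def circular_multiset_brute(type_):
    """Count rotation classes of all arrangements of a small multiset."""
    letters = [i for i, ni in enumerate(type_) for _ in range(ni)]
    words = set(permutations(letters))
    return len({min(w[i:] + w[:i] for i in range(len(w))) for w in words})


# Example 9.8: the multiset of type (12, 24, 18).
example_9_8 = circular_multiset_count((12, 24, 18))
```

On small types the two routines can be compared directly, for example `circular_multiset_count((2, 2))` and `circular_multiset_brute((2, 2))` both give 2 (the arrangements aabb and abab).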