Report 1
The Axiom of Choice
By Li Yu

This report collects the material I presented in the first round presentation of the course MATH 2002. It focuses on the Principle of Recursive Definition, the Axiom of Choice and its applications. Some proofs which were omitted in the presentation can be found here.

1. Principle of Recursive Definition

Throughout, $\mathbb{Z}_+$ denotes the set of positive integers and $S_n = \{1, \dots, n\}$ denotes a section of the positive integers.

Theorem 1.1 (Recursive Definition) Let $A$ be a nonempty set and $a_0$ an element of $A$. Suppose $\rho$ is a function that assigns, to each function $f$ mapping a section of the positive integers into $A$, an element of $A$. Then there exists a unique function $h: \mathbb{Z}_+ \to A$ satisfying:
(1) $h(1) = a_0$;
(2) $h(i) = \rho(h|_{S_{i-1}})$ for all integers $i \ge 2$.

Proof We first prove that:
(*) For any positive integer $n$, there exists a unique function $h_n: S_n \to A$ satisfying:
(1') $h_n(1) = a_0$;
(2') $h_n(i) = \rho(h_n|_{S_{i-1}})$ for all integers $i$ s.t. $2 \le i \le n$.

Step Ⅰ When $n = 1$, define $h_1: S_1 \to A$ by $h_1(1) = a_0$. It is immediately clear that $h_1$ satisfies (1') and (2'), and that such an $h_1$ is unique.

Step Ⅱ Assume that (*) is true when $n = k$. Define a function $h_{k+1}: S_{k+1} \to A$ by $h_{k+1}(i) = h_k(i)$ for all integers $i$ s.t. $1 \le i \le k$, and $h_{k+1}(k+1) = \rho(h_k)$. By the definition of $h_{k+1}$, it satisfies (1') and (2'). Hence the existence of such a function when $n = k+1$ is proved.

Let $h_{k+1}: S_{k+1} \to A$ and $h'_{k+1}: S_{k+1} \to A$ be two functions satisfying (1') and (2'). Then $h_{k+1}|_{S_k}$ and $h'_{k+1}|_{S_k}$ are functions from $S_k$ into $A$ satisfying (1') and (2'). By the assumption of Step Ⅱ, $h_{k+1}|_{S_k}$ and $h'_{k+1}|_{S_k}$ are identical. Hence, $h_{k+1}(k+1) = \rho(h_{k+1}|_{S_k}) = \rho(h'_{k+1}|_{S_k}) = h'_{k+1}(k+1)$. This proves the uniqueness of such a function.

Step Ⅲ By mathematical induction, (*) is true. Now define $h: \mathbb{Z}_+ \to A$ to be the function whose graph is the union of the graphs of the functions $h_n$ in (*), where $n$ ranges over the positive integers. By (*), $h_m(n) = (h_m|_{S_n})(n) = h_n(n)$ for any positive integers $m, n$ s.t. $n < m$.
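Hence $h$ defined above is indeed a function, and $h(n) = h_n(n) = h_{n+1}(n) = h_{n+2}(n) = \cdots$. To prove that $h$ satisfies (1), we observe that $h(1) = h_1(1) = a_0$. To prove that $h$ satisfies (2), we observe that $h(i) = h_i(i) = \rho(h_i|_{S_{i-1}}) = \rho(h|_{S_{i-1}})$ for all integers $i \ge 2$, since $h$ and $h_i$ agree on $S_{i-1}$. Finally, $h$ is unique: if $h'$ also satisfies (1) and (2), then $h'|_{S_n}$ satisfies (1') and (2') for every $n$, so $h'|_{S_n} = h_n = h|_{S_n}$ by (*). This completes the proof. Q.E.D.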
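The construction in the proof can be imitated on a computer for any finite section. The following is only a minimal sketch, not part of the original presentation: the names `build_h` and `rho_sum` are hypothetical, a function on $S_n$ is represented as a Python dict, and `rho_sum` is an arbitrarily chosen rule that uses the whole restriction $h|_{S_{i-1}}$ rather than just its last value.

```python
def build_h(a0, rho, n):
    """Return h_n : S_n -> A as a dict {1: h(1), ..., n: h(n)}.

    Mirrors Steps I and II of the proof: start from h_1(1) = a0 and
    extend h_k to h_{k+1} by applying rho to the function built so far.
    """
    h = {1: a0}
    for i in range(2, n + 1):
        # rho receives a copy of the restriction h | S_{i-1}
        h[i] = rho(dict(h))
    return h


def rho_sum(f):
    """Hypothetical rule: send a function on S_k to the sum of its values."""
    return sum(f.values())


h5 = build_h(1, rho_sum, 5)
h3 = build_h(1, rho_sum, 3)
print(h5)  # {1: 1, 2: 1, 3: 2, 4: 4, 5: 8}

# The compatibility used in Step III: h_5 restricted to S_3 equals h_3.
print({i: h5[i] for i in range(1, 4)} == h3)  # True
```

Passing a copy of the dict to `rho` keeps each call faithful to the statement of the theorem: the rule sees only the function on $S_{i-1}$ and cannot modify the values already fixed.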
Example We consider a simple case. The function $h: \mathbb{Z}_+ \to \mathbb{Z}_+$ s.t. $h(n) = n$ for all positive integers $n$ can be defined recursively as the function $h: \mathbb{Z}_+ \to \mathbb{Z}_+$ satisfying $h(1) = 1$ and $h(n) = h(n-1) + 1$ for $n \ge 2$. We look for the $\rho$ of Theorem 1.1 for this recursion. For every function $f$ from $S_{k-1}$ into $\mathbb{Z}_+$, where $k$ is an integer s.t. $k \ge 2$, we define $\rho(f)$ to be $f(k-1) + 1$. Then it is clear that $h(n) = \rho(h|_{S_{n-1}}) = h(n-1) + 1$, since $h|_{S_{n-1}}$ is a function from $S_{n-1}$ into $\mathbb{Z}_+$. Hence this is the desired $\rho$ in Theorem 1.1.

Now we turn to the main part of this report.

2. The Axiom of Choice

In this section we introduce the Axiom of Choice and prove the equivalence of different forms of this axiom.

Axiom of Choice (AC 1) Let $A$ be a set of pairwise disjoint nonempty sets. There exists a set $C$ consisting of exactly one element of each member of $A$.

Example Let $A = \{\mathbb{Z}, \mathbb{Q} - \mathbb{Z}, \mathbb{R} - \mathbb{Q}\}$. It is obvious that $A$ is a set of pairwise disjoint nonempty sets. The Axiom of Choice guarantees the existence of a set consisting of one and only one element of each of the sets $\mathbb{Z}$, $\mathbb{Q} - \mathbb{Z}$, $\mathbb{R} - \mathbb{Q}$. For instance, $C = \{0, -1/2, x\}$, where $x$ is any irrational number such as $\sqrt{2}$, is such a set.

Example The previous example is trivial to some extent. Let us study another example. Let $A$ be a set of pairwise disjoint sets, each consisting of two arbitrary elements, i.e. the members of $A$ are doubletons. Is it possible to prove that there exists a set consisting of exactly one element of each member of $A$ without referring to the Axiom of Choice, explicitly or implicitly? It seems quite plausible. But this time one cannot give a desired set explicitly, as we did in the previous example, since the members of $A$ are arbitrary. How about choosing the larger element of each member of $A$? But what does one mean by larger? How about listing the elements of each member of $A$ in the form $\{a, b\}$ and choosing the left element? But how does one list the elements, and which one is written on the left?
If one gives a rule to list these elements, why bother to list them at all? One may simply form the set of the elements that are intended to be written on the left.

The Axiom of Choice justifies the practice of selecting something without explicitly stating the rule governing the selection. When one says "let $v$ be a vector which is not in the subspace spanned by a given set of vectors", he or she is referring to the Axiom of Choice implicitly. There are quite a lot of circumstances where we refer to the Axiom of Choice in mathematics. To mention two of them: the Hahn-Banach Theorem, and the theorem that every field has an algebraic closure, unique up to isomorphism.

The Axiom of Choice can be formulated in different forms. We next investigate two of them.

Definition 2.1 $f$ is a choice function for a set $A$ if and only if $f$ is a function from the set of all nonempty subsets of $A$ into $A$ and, for every nonempty subset $B$ of $A$, $f(B) \in B$.

Example A choice function $f$ for $\mathbb{Z}_+$ may be defined by $f(S)$ = the smallest element of $S$, where $S$ is any nonempty subset of $\mathbb{Z}_+$. A choice function $g$ for {Tommy, Jerry} may be defined by $g$({Tommy}) = Tommy, $g$({Jerry}) = Jerry and $g$({Tommy, Jerry}) = Jerry.

Intuitively, a choice function for a set is a function which chooses an element from each nonempty subset of the set.

Axiom of Choice (AC 2) Every nonempty set has a choice function.

Theorem 2.1 AC 1 and AC 2 are equivalent.

Proof (AC 1 => AC 2) Let $A$ be a nonempty set. For any nonempty subset $B$ of $A$, define $S_B = \{(B, b): b \in B\}$. Then define $S = \{S_B: B \text{ is a nonempty subset of } A\}$. Each $S_B$ is nonempty, and for any distinct nonempty subsets $B_1$ and $B_2$ of $A$, $S_{B_1}$ and $S_{B_2}$ are disjoint, since the first components of all elements of $S_{B_1}$ are $B_1$, whereas those of all elements of $S_{B_2}$ are $B_2$. Hence by AC 1, there exists a set $C$ consisting of exactly one element of each member of $S$. By the definition of $C$, for every nonempty subset $B$ of $A$, there exists a unique element $b \in B$ s.t. $(B, b) \in C$. Hence there exists a function $F$ from the set of all nonempty subsets of $A$ into $A$ whose graph is $C$, and $F(B) \in B$ for every such $B$. This proves what we want.

(AC 2 => AC 1) Let $A$ be a set of pairwise disjoint nonempty sets, and let $S = \{x: x \in B \text{ for some } B \in A\}$. By the definition of $S$, $B$ is contained in $S$ for every member $B$ of $A$. If $A$ is empty, $C = \emptyset$ is the desired set, so assume that $A$, and hence $S$, is nonempty. By AC 2, $S$ has a choice function $F$. Let $C = \{F(B): B \in A\}$. For any $B \in A$, $F(B)$ is an element of $B$ and $F(B) \in C$, so $C$ contains at least one element of each member of $A$. Assume there exists a set $B_0$ in $A$ s.t. $b_1$ and $b_2$ are distinct elements of $B_0$ which are also elements of $C$. Then by the definition of $C$, there exist sets $B_1$ and $B_2$ in $A$ s.t. $F(B_1) = b_1$ and $F(B_2) = b_2$, and hence $b_1 \in B_1$ and $b_2 \in B_2$. Then $b_1 \in B_1$ and $b_1 \in B_0$, and hence $B_1 = B_0$, since the members of $A$ are pairwise disjoint. Similarly, $B_2 = B_0$, which implies $B_1 = B_2$. Hence $b_1 = F(B_1) = F(B_2) = b_2$.
This contradicts the assertion that $b_1$ and $b_2$ are distinct. So we conclude that $C$ consists of exactly one element of each member of $A$. This proves AC 1. Q.E.D.

Axiom of Choice (AC 3) Let $A$ be a set of pairwise disjoint nonempty sets. The Cartesian product of the sets in $A$ is nonempty.

Theorem 2.2 AC 1 and AC 3 are equivalent.

The simple proof of Theorem 2.2 is omitted.
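For finite sets no axiom is needed, because a choice function can be written down by an explicit rule. The sketch below illustrates the direction AC 2 => AC 1 of Theorem 2.1 in that finite setting only. The helper names `nonempty_subsets`, `choice_function` and `choice_set` are hypothetical, and taking the minimum is just one convenient rule, available here because the sample elements are integers.

```python
from itertools import combinations


def nonempty_subsets(s):
    """Yield every nonempty subset of the finite set s as a frozenset."""
    items = sorted(s)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def choice_function(s):
    """A choice function for s: map each nonempty subset B to min(B)."""
    return {B: min(B) for B in nonempty_subsets(s)}


def choice_set(family):
    """Build C = {F(B) : B in family}, where F is a choice function
    for the union S of the family, as in the proof of AC 2 => AC 1."""
    union = frozenset().union(*family)
    F = choice_function(union)
    return {F[B] for B in family}


# A family of pairwise disjoint nonempty sets.
family = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5})]
C = choice_set(family)
print(sorted(C))  # [1, 3, 5]

# C meets every member of the family in exactly one element.
print(all(len(C & B) == 1 for B in family))  # True
```

The point of the axiom is precisely that, for an arbitrary infinite family, no rule such as `min` need be available.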
Although the content of the Axiom of Choice seems intuitively clear, or even trivial, amazing consequences can be deduced from it. The next theorem is rather revealing.

Theorem 2.3 (Banach-Tarski Paradox) Given a solid ball in $\mathbb{R}^3$, it is possible to partition it into finitely many pieces and reassemble them to form two solid balls, each identical in size to the first.

Idea of the proof Quite a lot of abstract algebra is used. The group of rotations in $\mathbb{R}^3$ is involved, and the crucial step of the proof is to apply AC 1 to the set of all cosets of a subgroup of this group.

This counterintuitive logical consequence of the Axiom of Choice was first presented by Banach and Tarski, reportedly with the intention of rejecting this axiom. The Banach-Tarski Paradox shows us that the Axiom of Choice is not as trivial as we once imagined. Unfortunately, there is no elementary proof available.

3. Applications of the Axiom of Choice

In this section we study some applications of the Axiom of Choice. We use this axiom to prove two theorems, the Hausdorff Maximal Principle and Zorn's Lemma, which are widely referred to in practice, and then indicate that they are actually equivalent to the Axiom of Choice.

Definition 3.1 A relation $R$ on a set $A$ is called an order relation if it has the following properties:
(1) For every $x$ and $y$ in $A$ for which $x \ne y$, either $xRy$ or $yRx$;
(2) For no $x$ in $A$ does the relation $xRx$ hold;
(3) If $xRy$ and $yRz$, then $xRz$.

Example The ordinary less than relation on the set of all real numbers is an order relation. So is the ordinary greater than relation on the same set. The relation $R$ on the set $\mathbb{R} \times \mathbb{R}$ defined by $(r_1, r_2)R(s_1, s_2)$ if $r_1 < s_1$, or $r_1 = s_1$ and $r_2 < s_2$, is an order relation. The proper inclusion relation on the power set of a set need not be an order relation, since property (1) fails to hold in general: two distinct subsets need not be comparable.

Definition 3.2 Let $A$ be a set and $<$ be an order relation on $A$. An element $a$ is said to be the smallest element of a subset $X$ of $A$ if $a \in X$ and, for every $x \in X$, $a = x$ or $a < x$.
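On a finite set, the three properties of Definition 3.1 and the smallest element of Definition 3.2 can be checked by brute force. The following sketch does this for a small window of the dictionary order and for proper inclusion on a power set. The function names `is_order_relation` and `smallest`, and the choice of sample sets, are assumptions made for illustration.

```python
from itertools import product


def is_order_relation(elements, rel):
    """Check properties (1)-(3) of Definition 3.1 on a finite set."""
    elements = list(elements)
    comparable = all(rel(x, y) or rel(y, x)
                     for x in elements for y in elements if x != y)
    irreflexive = all(not rel(x, x) for x in elements)
    transitive = all(rel(x, z)
                     for x, y, z in product(elements, repeat=3)
                     if rel(x, y) and rel(y, z))
    return comparable and irreflexive and transitive


def smallest(subset, rel):
    """Return the smallest element of a finite subset, or None if none exists."""
    for a in subset:
        if all(a == x or rel(a, x) for x in subset):
            return a
    return None


def lex(p, q):
    """Dictionary order on pairs: compare first components, then second."""
    return p[0] < q[0] or (p[0] == q[0] and p[1] < q[1])


def proper_subset(x, y):
    """Proper inclusion between frozensets."""
    return x < y


grid = list(product(range(3), repeat=2))
power_set = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]

print(is_order_relation(grid, lex))                 # True
print(is_order_relation(power_set, proper_subset))  # False: {1} and {2} are incomparable
print(smallest([(2, 0), (1, 2), (1, 1)], lex))      # (1, 1)
```

Such a check says nothing about infinite sets like $\mathbb{R} \times \mathbb{R}$; it only makes the definitions concrete.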
Definition 3.3 A set $A$ with an order relation $<$ is said to be well-ordered if every nonempty subset of $A$ has a smallest element.

Example The set $\mathbb{Z}_+$ of all positive integers is well-ordered by the ordinary less than relation, as can be easily proved by mathematical induction. In fact, one may show that the Well-ordering Property of $\mathbb{Z}_+$ implies the Principle of Mathematical Induction. However, the set $\mathbb{R}$ of all real numbers is not well-ordered by the ordinary less than relation, since the open interval $(a, b)$, where $a < b$, does not have a smallest element. It should also be noticed that $\mathbb{Z}_+$ is not well-ordered by the ordinary greater than relation. Hence, well-orderedness does not depend on the set alone, but also on the order relation. Nevertheless, we have the following beautiful theorem.

Theorem 3.1 (Well-ordering Theorem) If $A$ is a set, then there exists an order relation on $A$ that well-orders $A$.
Idea of the proof The proof is split into two stages. The first stage is to prove that there exists a well-ordered set of any given size. The second is to construct by transfinite recursion, with the help of a choice function for $A$, a bijection from $A$ onto one of these well-ordered sets; the order on $A$ is then defined so that this bijection preserves it.

Example To have a concrete understanding of the proof, let us construct an order relation on $\mathbb{Z}$, which is not well-ordered by the ordinary less than relation. Define $f: \mathbb{Z} \to \mathbb{Z}_+$ by $f(z) = 2z + 1$ when $z \ge 0$ and $f(z) = -2z$ when $z < 0$. It is clear that $f$ is a bijection. Define a relation $R$ on $\mathbb{Z}$ by $z_1 R z_2$ if $f(z_1) < f(z_2)$. The reader may prove without difficulty that $R$ well-orders $\mathbb{Z}$; under $R$ the integers are arranged as $0, -1, 1, -2, 2, \dots$

From the idea of the proof we see that the Well-ordering Theorem is a logical consequence of the Axiom of Choice. An interesting fact is that the converse is also true, i.e. the Axiom of Choice is derivable from the Well-ordering Theorem. Therefore, the Well-ordering Theorem is said to be equivalent to the Axiom of Choice.

Historically, numerous equivalent versions of the Axiom of Choice have been used by mathematicians. The famous and practically important ones are listed below without formal proof.

Definition 3.4 A relation $R$ on a set $A$ is called a strict partial order relation if it has the following properties:
(1) For no $x$ in $A$ does the relation $xRx$ hold;
(2) If $xRy$ and $yRz$, then $xRz$.

Example Note that every order relation on a set is a strict partial order relation. Hence every order relation in the example following Definition 3.1 is an example of a strict partial order relation. To differentiate between the two concepts, consider the Cartesian product $\mathbb{Z}_+ \times \mathbb{Z}_+$. Define a relation $R$ on $\mathbb{Z}_+ \times \mathbb{Z}_+$ by $(m_1, n_1)R(m_2, n_2)$ if $m_1 = m_2$ and $n_1 < n_2$. The reader may easily verify from the definition that $R$ is a strict partial order relation on $\mathbb{Z}_+ \times \mathbb{Z}_+$. However, neither $(1,2)R(2,1)$ nor $(2,1)R(1,2)$ holds. The first condition of Definition 3.1 is not satisfied, so $R$ is not an order relation on $\mathbb{Z}_+ \times \mathbb{Z}_+$.
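The idea of transporting the usual well-ordering of $\mathbb{Z}_+$ to $\mathbb{Z}$ along a bijection can be tried out numerically. The sketch below is an illustration under an explicit assumption: it takes $f(z) = 2z + 1$ for $z \ge 0$ and $f(z) = -2z$ for $z < 0$, chosen so that every value is a positive integer. The names `f`, `R` and `smallest` are hypothetical helpers, and only a finite window of $\mathbb{Z}$ is inspected.

```python
def f(z):
    """Assumed bijection Z -> Z+: nonnegative z go to odd numbers,
    negative z go to even numbers."""
    return 2 * z + 1 if z >= 0 else -2 * z


def R(z1, z2):
    """The transported order: z1 R z2 iff f(z1) < f(z2)."""
    return f(z1) < f(z2)


def smallest(subset):
    """Smallest element of a nonempty finite subset of Z with respect to R."""
    return min(subset, key=f)


window = range(-5, 6)

# On this window f hits each of 1, ..., 11 exactly once.
print(sorted(f(z) for z in window) == list(range(1, 12)))  # True

# The integers listed in increasing order with respect to R.
print(sorted(window, key=f))  # [0, -1, 1, -2, 2, -3, 3, -4, 4, -5, 5]

# Every nonempty subset has a smallest element; for example:
print(smallest({-7, -3, 4}))  # -3
```

A finite experiment cannot prove well-orderedness, but it shows why the argument works: the smallest element of a subset $X$ of $\mathbb{Z}$ under $R$ is the preimage of the smallest element of $f(X)$ in $\mathbb{Z}_+$.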
The adjective strict indicates that $xRx$ does not hold. It may be omitted in this report without risk of ambiguity. The adjective partial refers to the fact clearly illustrated in the last example, i.e. it is not the case that any two elements are comparable. In this sense, strictly partially ordered sets behave less regularly than well-ordered ones. But patterns can indeed be observed in strictly partially ordered sets, as the next theorem will illustrate.

Definition 3.5 Let $R$ be a strict partial order relation on a set $A$. A subset $C$ of $A$ is said to be a chain in $A$ if $R$, when restricted to $C$, is an order relation on $C$.

Definition 3.6 A chain $C_0$ in a strictly partially ordered set $A$ is said to be a maximal chain if for any chain $C$ in $A$, it is not the case that $C_0$ is properly contained in $C$.

Theorem 3.2 (Hausdorff Maximal Principle) Every strictly partially ordered set has a maximal chain.

Example Consider the set $\mathbb{Z}_+ \times \mathbb{Z}_+$ and the relation $R$ studied above. $C = \{(1, n): n \in \mathbb{Z}_+\}$ is a chain in $\mathbb{Z}_+ \times \mathbb{Z}_+$, since any two distinct elements of $C$ are comparable with respect to $R$. Suppose $C'$ is a chain in $\mathbb{Z}_+ \times \mathbb{Z}_+$ which properly contains $C$. Then $C'$ must have an element whose first component is not 1. This element is not comparable with $(1,1)$, which is in $C'$. This contradicts the assumption that $C'$ is a chain. Hence $C$ is a maximal chain in $\mathbb{Z}_+ \times \mathbb{Z}_+$.

Idea of the proof Assume there is no maximal chain in a strictly partially ordered set. Then every chain in this set can be extended. For a chain $C$ in this set, let $S_C$ be the set of all chains properly containing $C$. We may define a function $f$ on the set of all chains by associating to every chain $C$ a chain in $S_C$, which can be done by applying AC 2 to choose an element of each $S_C$, where $C$ goes through all chains in the set considered. Intuitively, $f$ extends every chain. Then we prove that such a function has a fixed point, i.e. $f(C) = C$ for some chain $C$. This contradicts the construction of the function $f$, since, by definition, $f(C)$ is a proper extension of $C$.

Now we proceed to Zorn's Lemma.

Definition 3.7 Let $R$ be a strict partial order relation on a set $A$ and $B$ a subset of $A$. An element $x$ of $A$ is said to be an upper bound of $B$ if for every $b$ in $B$, either $b = x$ or $bRx$. An element $m$ of $A$ is said to be a maximal element of $A$ if for no $a$ in $A$ does the relation $mRa$ hold.

There is no harm in defining the concepts of lower bound and minimal element as well, although these concepts will not be used in what follows. The definition is as follows.

Definition 3.8 Let $R$ be a strict partial order relation on a set $A$ and $B$ a subset of $A$. An element $x$ of $A$ is said to be a lower bound of $B$ if for every $b$ in $B$, either $x = b$ or $xRb$. An element $m$ of $A$ is said to be a minimal element of $A$ if for no $a$ in $A$ does the relation $aRm$ hold.

Theorem 3.3 (Zorn's Lemma) Let $A$ be a strictly partially ordered set. If every chain in $A$ has an upper bound, then A has a maximal element.
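For a finite strictly partially ordered set, chains, maximal chains and maximal elements can all be enumerated directly, which makes Definitions 3.5 to 3.7 easy to experiment with. The sketch below restricts the relation $R$ on $\mathbb{Z}_+ \times \mathbb{Z}_+$ studied above to the finite window $\{1,2,3\} \times \{1,2,3\}$. The restriction and the helper names `is_chain`, `maximal_elements` and `maximal_chains` are assumptions made for illustration. Note that the window has maximal elements although the full set $\mathbb{Z}_+ \times \mathbb{Z}_+$ has none under $R$.

```python
from itertools import combinations, product


def R(p, q):
    """(m1, n1) R (m2, n2) iff m1 == m2 and n1 < n2."""
    return p[0] == q[0] and p[1] < q[1]


def is_chain(subset):
    """A subset is a chain if any two distinct elements are comparable."""
    return all(R(x, y) or R(y, x) for x, y in combinations(subset, 2))


def maximal_elements(elements):
    """Elements m such that m R a holds for no a."""
    return [m for m in elements if not any(R(m, a) for a in elements)]


def maximal_chains(elements):
    """Chains that are not properly contained in any other chain."""
    chains = [frozenset(c)
              for r in range(len(elements) + 1)
              for c in combinations(elements, r)
              if is_chain(c)]
    return [c for c in chains if not any(c < d for d in chains)]


A = list(product(range(1, 4), repeat=2))

tops = sorted(maximal_elements(A))
print(tops)  # [(1, 3), (2, 3), (3, 3)]

columns = maximal_chains(A)
print(len(columns))  # 3

# In each maximal chain, the element with the largest second component
# is an upper bound of the chain and a maximal element of A.
print(all(max(c, key=lambda p: p[1]) in tops for c in columns))  # True
```

The enumeration is exponential in the size of the set, so it is only meant for very small examples.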
Example Consider the set $\mathbb{Z}_+$ of positive integers with the ordinary greater than relation as the strict partial order. Every nonempty chain in $\mathbb{Z}_+$ has an upper bound, namely the least integer in the chain (and any element is an upper bound of the empty chain), and we know that 1 is the maximal element of $\mathbb{Z}_+$. Another example which is slightly different may be constructed by considering the Cartesian product $(\mathbb{R} - \mathbb{R}^-) \times (\mathbb{R} - \mathbb{R}^-)$ of the set of nonnegative real numbers with itself, together with a relation $R$ defined by $(a, b)R(c, d)$ if $a > c$ and $b = d$. Every chain in $(\mathbb{R} - \mathbb{R}^-) \times (\mathbb{R} - \mathbb{R}^-)$ is contained in a set of the form $\{(a, b): b = r\}$ for some fixed $r$ in $\mathbb{R} - \mathbb{R}^-$, and $(0, r)$ is an upper bound of such a chain. Hence, by Zorn's Lemma, we know that $(\mathbb{R} - \mathbb{R}^-) \times (\mathbb{R} - \mathbb{R}^-)$ has a maximal element. This time we have infinitely many maximal elements: every element of the form $(0, r)$, where $r$ is a nonnegative real number, is a maximal element.

Idea of the proof Let us, again, demonstrate the power of the Axiom of Choice informally. To prove Theorem 3.3, we define, by transfinite recursion, an increasing (with respect to the strict partial order) transfinite sequence and then prove that there is an element in this sequence which is a maximal element. The choice function plays a role analogous to that of the function $\rho$ in the Principle of Recursive Definition. The significance of the Axiom of Choice in this proof is the following: when we define something by the Principle of Recursive Definition, we can usually state explicitly what the function $\rho$ is, while it is not always as easy to find such a function in the general case.

An alternative, and probably much simpler, way to prove Zorn's Lemma is to use the Hausdorff Maximal Principle. Note that an upper bound of a maximal chain, if it exists, must be a maximal element, as the reader may easily verify. By the assumption of Zorn's Lemma, an upper bound of a maximal chain does exist. This proves the theorem.

We conclude this section by stating the equivalence of the Hausdorff Maximal Principle and Zorn's Lemma to the Axiom of Choice.

Theorem 3.4 Zorn's Lemma and the Hausdorff Maximal Principle are equivalent to the Axiom of Choice.

4. Conclusion

This report deals with some basic concepts of set theory and the Axiom of Choice. It starts from the Principle of Recursive Definition, then introduces the Axiom of Choice and briefly discusses some applications of this axiom. The Axiom of Choice has been studied in great detail by mathematicians. This report covers only a very small part of the subject, but I hope that it provides the reader with a reasonable amount of background knowledge in this area.