HISTORY OF MATHEMATICS, Spring 2005

Basics on Sets and Functions

Introduction to the work of Georg Cantor. #10 in the series.

Some dictionaries define a set as a collection or aggregate of objects. However, it probably is not possible to define the concept; the definition I just gave merely shifts things around. If a set is a collection of objects, what is a collection? A set is just a bunch of objects put together somehow, or imagined together; and the objects can themselves be sets. In mathematics, sets are considered a primitive concept; i.e., they don't get defined. Nobody even cares what they are; only their properties matter. Therefore, in what follows, I assume you know what a set is.

Belonging. In general (but not always), I'll use capital letters to denote sets. If $A$ is a set and $x$ is an object (possibly another set, possibly $A$ again), then in regular set theory one and only one possibility holds. Either $x$ is an element of $A$, symbolized by writing $x \in A$ (one also says $x$ is in $A$ or $x$ belongs to $A$), or $x$ is not an element of $A$; in symbols, $x \notin A$. There is no other choice, and exactly one must hold. The relation of belonging is the basic relation in set theory.

Equality of sets. Two sets are equal if they are equal, but we need to be a bit more precise. If we have two sets $A$, $B$, to say they are equal means that every element of $A$ is an element of $B$, and every element of $B$ is an element of $A$. In symbols:

$(A = B) \iff ((x \in A \Rightarrow x \in B) \text{ and } (x \in B \Rightarrow x \in A))$

The empty set. The empty set is the set without elements. It is the set theoretic equivalent of 0. It is denoted by $\emptyset$. In terms of the relation of belonging, it is defined by $x \notin \emptyset$ for all $x$. So the empty set is the set for which the relation of belonging is always false, no matter what the object. There is only one empty set! Once you have nothing, it's all the same? One can even prove this, if we remember when an implication is true.
An implication, a statement of the form $p \Rightarrow q$, is false if $p$ is true and $q$ is false; in every other instance it is true. That is all that matters logically: true should not imply false; otherwise, what's the point? In particular, $p \Rightarrow q$ is always true when $p$ is false. Now one has to be careful here. To give an example, if I say "New York is the capital of the United States, therefore double fudge sundaes are very healthy," and say that this statement is logically impeccable, I am not saying anything about New York or the health value of double fudge sundaes. It is impeccably correct from a logical point of view precisely because New York is not the capital of the United States; thus the $p$ part is false, hence it doesn't matter anymore whether the $q$ part is false or not. If some day New York does indeed become the capital of the United States (as it was briefly after the revolution), the statement becomes quite controversial and possibly false.

Returning to the empty set, to prove there is only one (just for the fun of it), suppose that $A$, $B$ both share the definition of the empty set. We have to prove, for equality, that

$x \in A \Rightarrow x \in B$ and $x \in B \Rightarrow x \in A$.

Both implications are true! In fact, if $A$ has the property that $x \in A$ is always false (the defining property of the empty set), then the premise in the statement $x \in A \Rightarrow x \in B$, namely $x \in A$, is always false; thus the whole statement is always true. Similarly for $x \in B \Rightarrow x \in A$. Think about it. But not too much.

Set notation. It is now standard to use one of the following notations to define or identify sets.

1. If the set is finite (and not too large), one lists the elements within curly brackets. For example, if we write $A = \{1, 5, 6\}$, we are saying that $A$ is the set whose elements are the numbers 1, 5, and 6.

2. Some infinite sets can also be defined or indicated by listing the elements up to the point where it is clear what all the elements are. At that point one places "..." to indicate "and so forth." A similar device is used for finite but very large sets, or finite sets in which the elements are variable. For example:

(a) $B = \{1, 2, 3, \ldots\}$ is interpreted as stating that $B$ is the set of all natural numbers.

(b) $C = \{1, \ldots, 100\}$ is interpreted as stating that $C$ is the set of all natural numbers from 1 to 100. We could list them all, but it would take a lot of time and paper, and serve no real purpose.
(c) $D = \{1, \ldots, n\}$ defines a set in terms of some (presumably) variable $n$; the set depends on that $n$. It consists of all natural numbers from 1 to $n$. If $n = 100$ it is the same set as in the previous example.

3. One defines the set by a property of its elements. This definition, perhaps the most common one, looks like this:

$A = \{x \mid x \text{ satisfies } P\}$ or $A = \{x : x \text{ satisfies } P\}$.

The symbol $x$ is what is called a dummy variable, a placeholder; it has no meaning by itself, and does not exist outside of the curly brackets. Any other symbol would do as well. $P$ should be a property (a statement) referring to $x$. We refer to $A$ as the set of all $x$ such that $x$ satisfies $P$. Whether to separate the first part of the notation from the second part by a vertical bar ($\mid$) or a colon (:) is a matter of choice. I think the world is more or less evenly divided between those who use a vertical bar as a separator and those who use a colon. Some people (but I think they are fewer) use a comma: $A = \{x, x \text{ satisfies } P\}$. Here are some examples.

(a) $\mathbb{N} = \{x : x \text{ is a natural number}\}$.

(b) $B = \{z : z \text{ is a natural number and } z^2 > 7\}$. Frequently, if the elements of the set one is defining or identifying are all elements of some other fixed set for which one has a symbol, one indicates this by writing $\{x \in D : x \text{ satisfies} \ldots\}$ rather than $\{x : x \in D \text{ and} \ldots\}$. For example, assuming (as we will assume) that the symbol $\mathbb{N}$ denotes the set of natural numbers, the set $B$ of this example can also be described by $B = \{z \in \mathbb{N} : z^2 > 7\}$.

(c) $X = \{y \in \mathbb{N} : y \ge 3 \text{ and there exist } a, b, c \in \mathbb{N} \text{ such that } a^y + b^y = c^y\}$. For some three hundred years, nobody knew if this set was empty or not; in the 1990s it was proved empty by Andrew Wiles and Richard Taylor.

(d) $P = \{(x, y, z) : x, y, z \in \mathbb{N},\ x^2 + y^2 = z^2\}$.

Time for some exercises.

Exercise 1 Let $A = \{1, 2, 3\}$. Let $B = \{x \in \mathbb{N} : x^2 \le 12\}$. Prove $A = B$.

Exercise 2 Let $\mathbb{R}$ denote the set of all real numbers and let $B = \{x \in \mathbb{R} : x^2 - 3x + 1 = 0\}$. List all the elements of $B$.

Exercise 3 Let $A = \{x \in \mathbb{N} : x > 5\}$, $B = \{t \in \mathbb{N} : t \ge 6\}$. Is $A = B$? Explain.

Exercise 4 Let $A = \{x \in \mathbb{N} : x \equiv 2 \pmod 3,\ x \equiv 3 \pmod 4,\ x \equiv 1 \pmod 5\}$. Prove that $A = \{x \in \mathbb{N} : x \equiv 11 \pmod{60}\}$.
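Set-builder notation translates almost word for word into a set comprehension. The following is a minimal sketch, not a proof: it checks the claim of Exercise 4 only on a finite window of the natural numbers. The bound `LIMIT` and the variable names are assumptions made for illustration, and the natural numbers are taken to start at 1.

```python
# A finite window stands in for the natural numbers (assumption: N starts at 1).
LIMIT = 1000
N = range(1, LIMIT + 1)

# {x in N : x = 2 (mod 3), x = 3 (mod 4), x = 1 (mod 5)} as a set comprehension.
three_congruences = {x for x in N if x % 3 == 2 and x % 4 == 3 and x % 5 == 1}

# {x in N : x = 11 (mod 60)}.
single_congruence = {x for x in N if x % 60 == 11}

# Equality of sets: each is a subset of the other.
print(three_congruences == single_congruence)   # True
print(sorted(three_congruences)[:5])            # [11, 71, 131, 191, 251]
```

Agreement on the first thousand numbers is only evidence; the exercise still asks for an argument valid for every natural number.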

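Set inclusion. If $A$, $B$ are sets, we say $A$ is a subset of $B$, and write $A \subset B$, iff every element of $A$ is also an element of $B$; that is, if the proposition $x \in A \Rightarrow x \in B$ is true. It should be clear from this that $A = B$ if and only if $A \subset B$ and $B \subset A$.

The empty set is a subset of every set. In fact, let $A$ be any set. Consider the proposition $x \in \emptyset \Rightarrow x \in A$. Since the premise is false for every $x$, the proposition is always true.

Exercise 5 Let $A = \{n \in \mathbb{N} : n = 3k \text{ for some } k \in \mathbb{N}\}$, $B = \{m \in \mathbb{N} : m = 6k \text{ for some } k \in \mathbb{N}\}$. Prove that $B \subset A$.

Basic set operations. If $A$, $B$ are sets, one defines

$A \cup B = \{x : x \in A \text{ or } x \in B\}$ (the union of $A$ and $B$).
$A \cap B = \{x : x \in A \text{ and } x \in B\}$ (the intersection of $A$ and $B$).
$A \setminus B = \{x : x \in A,\ x \notin B\}$ (the set theoretic difference of $A$ and $B$).
$A \triangle B = \{x : (x \in A \text{ and } x \notin B) \text{ or } (x \in B \text{ and } x \notin A)\}$ (the symmetric difference of $A$ and $B$).

Exercise 6 Prove, if $A$, $B$, $C$ are sets:

1. $A \cup B = B \cup A$.
2. $A \cap B = B \cap A$.
3. $A \triangle B = (A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B)$.
4. $(A \cup B) \cup C = A \cup (B \cup C)$.
5. $(A \cap B) \cap C = A \cap (B \cap C)$.
6. $(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$.
7. $(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$.
8. $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$.
9. $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$.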
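The four operations correspond to Python's `|`, `&`, `-` and `^` on sets, so the identities of Exercise 6 can be spot-checked mechanically. The sketch below is an illustration under stated assumptions: the helper names `subsets` and `identities_hold` are made up here, the identities are written out as they are read in these notes, and the check runs over all triples of subsets of a three-element universe. It is a sanity check, not a proof for arbitrary sets.

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def identities_hold(A, B, C):
    """Check the nine identities of Exercise 6 for one triple of sets."""
    return all([
        A | B == B | A,                                   # 1
        A & B == B & A,                                   # 2
        A ^ B == (A - B) | (B - A) == (A | B) - (A & B),  # 3
        (A | B) | C == A | (B | C),                       # 4
        (A & B) & C == A & (B & C),                       # 5
        (A | B) & C == (A & C) | (B & C),                 # 6
        (A & B) | C == (A | C) & (B | C),                 # 7
        A - (B | C) == (A - B) & (A - C),                 # 8
        A - (B & C) == (A - B) | (A - C),                 # 9
    ])

all_subsets = subsets({1, 2, 3})
# 8 subsets, hence 8 * 8 * 8 = 512 triples (A, B, C).
print(all(identities_hold(A, B, C)
          for A, B, C in product(all_subsets, repeat=3)))   # True
```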

Due to part 4 of this exercise, parentheses are not needed when forming the union of several sets. Because $(A \cup B) \cup C = A \cup (B \cup C)$, we simply denote the set obtained as the union of these three sets by $A \cup B \cup C$. Similarly, because $(A \cup B) \cup (C \cup D)$ is the same as $A \cup (B \cup C \cup D)$, etc., we just write $A \cup B \cup C \cup D$; no parentheses. All that has been said here for $\cup$ is also valid, thanks to part 5, for $\cap$.

Two useful symbols. The symbol $\forall$ abbreviates "for all" or "for every" or "for each." For example, the proposition "$x^2 \ge 0$ for all real numbers $x$" can be abbreviated to $x^2 \ge 0 \ \forall x \in \mathbb{R}$. The symbol $\exists$ abbreviates "there exists" or "there is." For example, the set defined by $A = \{n \in \mathbb{N} : n = 3k \text{ for some } k \in \mathbb{N}\}$ can be defined in a somewhat more abbreviated form by $A = \{n \in \mathbb{N} : \exists k \in \mathbb{N},\ n = 3k\}$.

Indexed families of sets. When dealing with more than two or three sets at the same time, it is convenient (sometimes) to index them. For example, if we have to deal with twenty sets we could call them A, B, C, D, E, F, G, H, I, J, K, L, M, N, P, Q, R, S, T, U. But it is better to index them, for example by the integers $1, \ldots, 20$, and refer to the sets as $A_1, A_2, \ldots, A_{20}$. If one is working with an infinite number of sets, indexing is even more common. Any set can serve as an index set, but usually one uses consecutive natural numbers starting at 1 or 0 for finite families, and the set of all positive integers for countable (see below) infinite families.

If $I$ is a set (finite or not), the notation $\{A_i\}_{i \in I}$ denotes a family of sets; for each element $i$ of $I$ one has a set $A_i$. Thus, if $I = \{1, 2, 3\}$, then the family $\{A_i\}_{i \in I}$ consists of the sets $A_1, A_2, A_3$. If $J = \{x \in \mathbb{R} : x^2 = 2\}$, then $\{B_j\}_{j \in J}$ consists of two sets, namely $B_{\sqrt{2}}$ and $B_{-\sqrt{2}}$. This may seem like a silly choice for an index set, but it is a possible choice. And $\{\{x\}\}_{x \in \mathbb{R}}$ indicates a family of sets, indexed by the real numbers, where you have for each real number the set having that real number as its only element.
It should be mentioned, perhaps, that in the context of set theory, the words set and family are synonymous. One uses family when it sounds better than saying set. For example, one talks of a family of sets, but one could also call it a set of sets. An indexed family of sets is, however, a slightly different concept than just a plain family of sets, but only very slightly so. For example, let $A$, $B$ be sets and form the new set $C = \{A, B\}$; the set having $A$, $B$ as elements. Let $I = \{1, 2, 3\}$ and set $A_1 = A$, $A_2 = B$, $A_3 = B$. Then $C$ is not exactly the same thing as $\{A_i\}_{i \in I}$; the latter is not really a set. But everything one can do with one, one can do with the other.

Suppose $\mathcal{C}$ is a family of sets; i.e., a set whose elements are sets. We can index it in a weird way, using the family $\mathcal{C}$ itself. If $C \in \mathcal{C}$, we define the set $A_C$ by $A_C = C$. Then $\{A_C\}_{C \in \mathcal{C}}$ is basically the set of sets $\mathcal{C}$ disguised now as an indexed family of sets.

Set operations for indexed families. Assume $\{A_i\}_{i \in I}$ is an indexed family of sets. All this means is that $I$ is a set and for each $i \in I$ we are given a set $A_i$. Then one defines:

$\bigcup_{i \in I} A_i = \{x : \exists i \in I \text{ such that } x \in A_i\}$
$\bigcap_{i \in I} A_i = \{x : x \in A_i \ \forall i \in I\}$.

Exercise 7 Let $A$, $B$ be sets, let $I = \{1, 2\}$ and define $A_1 = A$, $A_2 = B$. Prove

$A \cup B = \bigcup_{i \in I} A_i$, $\quad A \cap B = \bigcap_{i \in I} A_i$.

If the index set is of the form $I = \{1, \ldots, n\}$, it is customary to write

$\bigcup_{i=1}^{n} A_i$ for $\bigcup_{i \in I} A_i$, and $\bigcap_{i=1}^{n} A_i$ for $\bigcap_{i \in I} A_i$.

If the index set is $\mathbb{N} = \{1, 2, \ldots\}$, it is customary to write

$\bigcup_{i=1}^{\infty} A_i$ for $\bigcup_{i \in \mathbb{N}} A_i$, and $\bigcap_{i=1}^{\infty} A_i$ for $\bigcap_{i \in \mathbb{N}} A_i$.

A number of other such (one hopes) self-explanatory notations are in use.

Exercise 8

1. Let $A$ be a set. Prove: $A = \bigcup_{x \in A} \{x\}$.

2. Prove $\bigcap_{n \in \mathbb{N}} \{x \in \mathbb{N} : x \ge n\} = \emptyset$.

3. Let $P = \{2, 3, 5, 7, \ldots\}$ be the set of prime numbers. If $p$ is a prime number, let $A_p$ be the set of all integers which are multiples of $p$; in symbols

$A_p = \{x \in \mathbb{Z} : x = kp \text{ for some integer } k\}$.

Prove

$\bigcup_{p \in P} A_p = \mathbb{Z} \setminus \{1, -1\}$.

Here (and always) $\mathbb{Z}$ denotes the set of all integers; $\mathbb{Z} = \{0, 1, -1, 2, -2, \ldots\}$.

Correspondences and functions. The importance of the concept of function most likely begins with the work of Euler; it became the dominating concept of mathematics in the nineteenth century, and has stayed so ever since. I am not going to give here a rigorous definition of function (as developed at the beginning of the twentieth century); I will rely rather on the intuitive notion and develop some standard notation. If $A$, $B$ are sets, a function from $A$ to $B$ is a rule assigning to each element of $A$ a unique element of $B$. The reason this definition is not rigorous is that someone may wonder "what is a rule?" and request a definition. For our purposes my answer would be: anything goes.

One usually denotes functions by symbols such as $f, g, h$ (or $F, G, H$, or the Greek letters $\varphi$, $\psi$); recall that usually does not mean always. The symbol $f : A \to B$ indicates that $f$ is a function from $A$ to $B$. That means we are assigning to each element $a \in A$ an element of $B$; the element assigned to $a$ is denoted by $f(a)$ if the function is denoted by $f$. One says $f(a)$ is the value of the function at $a$.

Let $A, B, C, D$ be sets and let $f : A \to B$, $g : C \to D$. Then one declares the two functions equal if $A = C$, $B = D$ (this condition is sometimes relaxed a bit to: $B$, $D$ are subsets of a common set), and $f(x) = g(x)$ for each $x \in A$ ($f$, $g$ assume the same values).

If $f : A \to B$, we call the set $A$ the domain of $f$. The set of all values of $f$; that is, the subset $\{f(x) : x \in A\}$ of $B$, is called the range of $f$. A function $f : A \to B$ is also called a map or mapping from $A$ to $B$. While function is the generic word, map or mapping is frequently preferred when the sets involved are not necessarily sets of numbers.

Examples.

1. If $A$ is a small finite set, one can define a function from $A$ to $B$ by just listing all of its values. For example, let $A = \{1, 2, 3\}$; we can define $f : A \to \mathbb{N}$ by stating $f(1) = 1$, $f(2) = 4$, $f(3) = 9$. The domain of this function is $A = \{1, 2, 3\}$, the range is $\{1, 4, 9\} \subset \mathbb{N}$.

2. A common way of defining functions when the domain and range are sets of mathematical objects is by a formula. For example, the function $f$ of the previous example (with the same domain $A$ and the same target set $\mathbb{N}$) can also be defined by stating that $f(x) = x^2$, $x \in A$. Suppose we now define $g : \mathbb{N} \to \mathbb{N}$ by $g(x) = x^2$ for $x \in \mathbb{N}$. Then $g \ne f$ because the domains are different.

3. Let $A$ be the set of all working U.S. residents; we can define a function $f : A \to \mathbb{R}$ by the rule

$f(a) = \dfrac{\text{last 4 digits of the social security \# of } a}{\text{first three digits} + 1}$.

For example, if someone has the social security number 012-34-5678, then the value of $f$ evaluated at that someone would be

$\dfrac{5678}{12 + 1} = \dfrac{5678}{13} = 436.769230\ldots$

Of course, for this function to be well defined, every working U.S. resident must have a social security number. If this is false, then the function is not well defined, another way of saying it isn't defined at all and should not be used under danger of suffering severe mathematical injuries.

Let $A$ be any set. The identity function on $A$, which we'll denote by $\mathrm{id}_A$, is the function from $A$ to $A$ that assigns every element to itself. In symbols, $\mathrm{id}_A(x) = x \ \forall x \in A$.

Exercise 9 How many different functions are there from a set $A$ of $m$ elements to a set $B$ of $n$ elements?

Composition of functions. Let $A, B, C$ be sets and assume $f : A \to B$ ($f$ is a function from $A$ to $B$), $g : B \to C$ ($g$ is a function from $B$ to $C$). Then one can define a function from $A$ to $C$, denoted by $g \circ f$ and called the composition of $g$ and $f$, by specifying that

$g \circ f(x) = g(f(x))$

for all elements $x \in A$.

Examples.

1. Let $A$ be the set of all human beings that were born after the year 1000. Let $B$ be the set of all human beings born after the year 800. For $x \in A$, let $f(x)$ = mother of $x$. Notice that $f(x) \in B$ (as far as we know). For $x \in B$, let $g(x)$ = father of $x$. Then $g \circ f(x)$ is the maternal grandfather of $x$.

2. Let $A = \{n \in \mathbb{N} : n \ge 2\}$ and let $P$ be the set of all prime numbers. Define $f : A \to P$ by $f(n)$ = first (smallest) prime divisor of $n$. For example, $f(2) = 2$, $f(3) = 3$, $f(4) = 2$, $f(5) = 5$, $f(34) = 2$, etc. Define $g : P \to \mathbb{N}$ by $g(p) = p^{2(p+1)}$. Then (please, check that it is so) $f \circ g = \mathrm{id}_P$, $g \circ f \ne \mathrm{id}_A$.

3. Let $A = \{x \in \mathbb{R} : x \ge 0\}$ and define $f : \mathbb{R} \to A$ by $f(x) = x^2$. Define $g : A \to \mathbb{R}$ by $g(x) = x^{1/2}$. Then $g \circ f(x) = |x|$ for all $x \in \mathbb{R}$. If we let $h = f|_A$, the restriction of $f$ to $A$; that is, if we define $h(x) = x^2$ for $x \in A$, then $g \circ h = h \circ g = \mathrm{id}_A$.

Exercise 10 Let $A = \{a, b, c, d, e, f, g, h\}$ and define $F : A \to A$ by $F(a) = b$, $F(b) = c$, $F(c) = b$, $F(d) = c$, $F(e) = c$, $F(f) = d$, $F(g) = f$, $F(h) = f$. An easier way of describing such a function on a finite set is by writing out the elements of the set in a row, writing below each element the value of the function at that element. Something like:

$$F = \begin{pmatrix} a & b & c & d & e & f & g & h \\ b & c & b & c & c & d & f & f \end{pmatrix}$$

Describe $F \circ F$ and $(F \circ F) \circ F$.

Exercise 11 Let $A = (0, 1) = \{x \in \mathbb{R} : 0 < x < 1\}$. Let $B = \{x \in \mathbb{R} : x > 0\}$. Define $f : A \to B$, $g : B \to A$ by

$f(x) = \dfrac{x}{1 - x}$ if $x \in A$, $\qquad g(x) = \dfrac{x}{1 + x}$ if $x \in B$.

Show that these functions are well defined, more precisely that $f$ maps into $B$ ($f(x) \in B \ \forall x \in A$) and $g$ maps into $A$ ($g(x) \in A \ \forall x \in B$), and that $g \circ f = \mathrm{id}_A$, $f \circ g = \mathrm{id}_B$.

One-to-one correspondences; etc. When you count how many objects a set has, you are putting the objects of the set in correspondence with a section of the set of counting (natural) numbers. We'll try to give the precise definitions here.

Let $A$, $B$ be sets. A function $f : A \to B$ is said to be one-to-one or injective iff it assigns different values to different elements: $a, b \in A$, $a \ne b$ implies $f(a) \ne f(b)$. An equivalent way of stating this definition, which suggests how one proves functions injective, is by the contrapositive: $f : A \to B$ is injective iff $f(a) = f(b)$ implies $a = b$. The usual (but by no means only) way of proving that a function $f : A \to B$ is one-to-one is by saying: Let $a, b \in A$, assume $f(a) = f(b)$.

And somehow one should end by getting $a = b$. This works best when $f$ is given by a formula so that $f(a) = f(b)$ is an equation that can be solved to give $a = b$.

Example Let $A = B = \mathbb{N}$ and define $f : \mathbb{N} \to \mathbb{N}$ by $f(n) = 2n + 1$. Prove that $f$ is one-to-one. To do this, let $n, m \in \mathbb{N}$ and assume $f(n) = f(m)$. Then $2n + 1 = 2m + 1$; subtracting 1 from each side we get $2n = 2m$; dividing both sides by 2 we get $n = m$. Done.

Example Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x/(1 + |x|)$. Is $f$ one-to-one? We let $x, y \in \mathbb{R}$ and assume $f(x) = f(y)$, which works out to

$$\frac{x}{1 + |x|} = \frac{y}{1 + |y|} \qquad (1)$$

The absolute value is a bit disconcerting, but there are several (standard) ways to deal with it. We might divide into cases ($x > 0$, $y > 0$, etc.), but here it is easiest to take the absolute value of both sides of (1) to get

$$\frac{|x|}{1 + |x|} = \frac{|y|}{1 + |y|}$$

Multiplying out we get $|x|(1 + |y|) = |y|(1 + |x|)$, then $|x| + |x||y| = |y| + |x||y|$, hence $|x| = |y|$. Armed with this we return to (1) and see that the denominators are equal. Thus the numerators must also be equal; i.e., $x = y$. The function is one-to-one.

Example Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x/(1 + x^2)$. Is $f$ one-to-one? As before, we let $x, y \in \mathbb{R}$ and assume $f(x) = f(y)$, which works out this time to

$$\frac{x}{1 + x^2} = \frac{y}{1 + y^2}$$

which multiplies out to $x + xy^2 = y + x^2 y$, hence $xy^2 - x^2 y = y - x$, thus $xy(y - x) = y - x$. Unfortunately, from this we cannot conclude $y = x$; it is possible that one has $xy = 1$. Does this prove $f$ is not one-to-one? The answer is that it almost does so, except that we were moving in the wrong direction. We began assuming $f(x) = f(y)$; if we are trying to prove $f$ is NOT one-to-one, then we have to start with $x \ne y$ and end with $f(x) = f(y)$. But our calculations tell us what to do. Let's take any $x, y$ with $x \ne y$ and $xy = 1$. For example, $x = 2$, $y = 1/2$. Then

$$f(2) = \frac{2}{5}; \qquad f\left(\tfrac{1}{2}\right) = \frac{\tfrac{1}{2}}{1 + \tfrac{1}{4}} = \frac{2}{5}.$$

Because $2 \ne 1/2$ but $f(2) = f(1/2)$, we conclude that $f$ is not one-to-one.

Exercise 12 Let $A = \{x \in \mathbb{R} : x \ge 1\}$ and define $f : A \to \mathbb{R}$ by $f(x) = x/(1 + x^2)$. Prove that $f$ is one-to-one.
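The search for a pair $x \ne y$ with $f(x) = f(y)$ can be imitated on a finite sample of the domain. The sketch below is an illustration only: the helper `is_injective_on` and the sample points are assumptions introduced here, and exact rational arithmetic (`fractions.Fraction`) is used so that equal values compare as equal. A collision on a sample refutes injectivity; finding no collision on a sample proves nothing about the whole domain.

```python
from fractions import Fraction

def is_injective_on(f, sample):
    """Look for two different sample points with the same value of f.

    Returns (True, None) if no collision is found on the sample, and
    (False, (a, b)) with a != b and f(a) == f(b) otherwise.
    """
    seen = {}  # maps a value f(a) to the first point a that produced it
    for a in sample:
        value = f(a)
        if value in seen and seen[value] != a:
            return False, (seen[value], a)
        seen[value] = a
    return True, None

def f_odd(n):
    return 2 * n + 1            # the map n -> 2n + 1 of the first example

def f_rational(x):
    return x / (1 + x * x)      # the map x -> x / (1 + x^2) of the last example

print(is_injective_on(f_odd, range(1, 101)))    # (True, None)

# Sample containing a pair with x * y == 1, as suggested by the computation above.
sample = [Fraction(1, 2), Fraction(2), Fraction(3)]
ok, witness = is_injective_on(f_rational, sample)
print(ok, witness)   # reports a collision between 1/2 and 2
```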

Let $A$, $B$ be sets. A function $f : A \to B$ is said to be onto or surjective iff the range of $f$ is $B$. That is, for every $y \in B$ there exists $x \in A$ such that $f(x) = y$. If a function $f$ is defined by a formula, a popular way of proving surjectivity is by letting $y$ be an arbitrary element of $B$ and trying to solve the equation $f(x) = y$ in terms of $x$.

Example. Let $f : \mathbb{Z} \to \mathbb{Z}$ be defined by $f(n) = n + 5$. Prove that $f$ is surjective. To do this, we let $n \in \mathbb{Z}$ and show there exists $m$ with $f(m) = n$; i.e., $m$ such that $m + 5 = n$. The solution of this equation is $m = n - 5$ and we see that $f(n - 5) = n$. Done.

Example. Let $f : [0, \infty) \to \mathbb{R}$ be defined by $f(x) = x \sin(x)$. This function is surjective. The proof is a bit involved and relies on a Calculus result, the intermediate value theorem. Let $y \in \mathbb{R}$. We can find a positive integer $k$ such that $-(4k - 1)\pi/2 \le y < (4k + 1)\pi/2$. We have

$$f\left((4k-1)\tfrac{\pi}{2}\right) = (4k-1)\tfrac{\pi}{2} \sin\left((4k-1)\tfrac{\pi}{2}\right) = -(4k-1)\tfrac{\pi}{2} \le y < (4k+1)\tfrac{\pi}{2} = (4k+1)\tfrac{\pi}{2} \sin\left((4k+1)\tfrac{\pi}{2}\right) = f\left((4k+1)\tfrac{\pi}{2}\right).$$

Since $f$ is continuous, by the intermediate value theorem, there is $x$, $(4k-1)\frac{\pi}{2} \le x < (4k+1)\frac{\pi}{2}$, such that $f(x) = y$.

Example. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x/(1 + x^2)$. Is this function surjective? What is its range if not? We begin looking at the equation

$$y = f(x) = \frac{x}{1 + x^2}.$$

The numbers $y$ for which there exists a solution $x$ are precisely the numbers in the range of $f$; if there is a solution for every $y \in \mathbb{R}$, then $f$ is onto (surjective). The equation can be rewritten in the form $yx^2 - x + y = 0$. If $y = 0$ this reduces to $x = 0$, so $0$ is in the range. If $y \ne 0$ it is a quadratic equation for $x$ with solutions

$$x = \frac{1 \pm \sqrt{1 - 4y^2}}{2y}.$$

For this $x$ to make sense (as a real number) we must have $4y^2 \le 1$; in other words, $|y| \le 1/2$. The range of $f$ is therefore $[-1/2, 1/2]$. If we had defined $f$ as a function into $[-1/2, 1/2]$ it would be surjective. We didn't, so it isn't.

Exercise 13 Define $f : \mathbb{R} \setminus (-1, 1) \to \mathbb{R}$ by $f(x) = x/(x^2 - 1)$. Is $f$ onto? Explain.
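The "solve $f(x) = y$ for $x$" recipe for $f(x) = x/(1 + x^2)$ can be written out directly. This is a hedged sketch in floating point: the name `preimage` and the test values are assumptions chosen for illustration. The case $y = 0$ is treated separately because the quadratic formula divides by $y$; there the equation $yx^2 - x + y = 0$ reduces to $x = 0$.

```python
import math

def f(x):
    return x / (1 + x * x)

def preimage(y):
    """Return one real x with f(x) == y, or None if no real solution exists.

    Solves y*x**2 - x + y == 0 for x.
    """
    if y == 0:
        return 0.0                      # the equation reduces to -x == 0
    disc = 1 - 4 * y * y                # discriminant of the quadratic in x
    if disc < 0:
        return None                     # |y| > 1/2: y is not a value of f
    return (1 - math.sqrt(disc)) / (2 * y)

for y in (0.0, 0.25, 0.5, -0.5, 0.6):
    x = preimage(y)
    print(y, x, None if x is None else f(x))
```

The values $y = \pm 1/2$ are attained at $x = \pm 1$, while $y = 0.6$ has no preimage, so the function is not onto $\mathbb{R}$.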

Let $A$, $B$ be sets and let $f : A \to B$. If $f$ is one-to-one and onto (injective and surjective), one also says that $f$ is bijective (or a bijection). We have the following theorem.

Theorem 1 Let $f : A \to B$. The following statements are equivalent.

1. $f$ is one-to-one and onto.
2. There exists $g : B \to A$ such that $g \circ f = \mathrm{id}_A$, $f \circ g = \mathrm{id}_B$.

Proof. Suppose $f$ is one-to-one and onto. Let $y \in B$. Because $f$ is onto, there exists $x \in A$ with $f(x) = y$; because $f$ is one-to-one, there is only one such $x$ and we can unambiguously define $g(y) = x$. That is, $g(y) = x$, where $x$ is the unique element of $A$ with $f(x) = y$. This defines $g : B \to A$. If $x \in A$ then $g(f(x)) = x$, since $x$ is the only element $z \in A$ satisfying $f(z) = f(x)$. Thus $g \circ f = \mathrm{id}_A$. If $y \in B$, then, by definition, $g(y)$ is such that $f(g(y)) = y$; i.e., $f \circ g = \mathrm{id}_B$.

Conversely, suppose such a $g : B \to A$ exists. Let $a, b \in A$ and assume $f(a) = f(b)$. Then $a = g(f(a)) = g(f(b)) = b$, proving that $f$ is one-to-one. Let $y \in B$. Taking $x = g(y) \in A$, we get $f(x) = f(g(y)) = y$, proving $f$ is onto.

If $f : A \to B$ is one-to-one and onto, then the function $g$ of the last theorem is unique; i.e., there is only one such function. One needs, however, both equalities, $g \circ f = \mathrm{id}_A$ and $f \circ g = \mathrm{id}_B$. In fact, it is not too hard to show that $f : A \to B$ being merely one-to-one is equivalent to there existing $g : B \to A$ such that $g \circ f = \mathrm{id}_A$ (and that there are zillions of such $g$ if $f$ is not onto), and that $f$ is onto if and only if there is $g : B \to A$ such that $f \circ g = \mathrm{id}_B$ (and that there are zillions of such $g$ if $f$ is not one-to-one).

For example, we can define $f : \mathbb{N} \to \mathbb{Z}$ by $f(n) = n - 1$. This function is clearly one-to-one but not onto. If we define $g$ by $g(y) = |y| + 1$ for $y \in \mathbb{Z}$, we see that $g : \mathbb{Z} \to \mathbb{N}$ and $g(f(n)) = n$ for all $n \in \mathbb{N}$; thus $g \circ f = \mathrm{id}_{\mathbb{N}}$. However, $f \circ g(y) = |y|$ for $y \in \mathbb{Z}$, thus $f \circ g \ne \mathrm{id}_{\mathbb{Z}}$. One could also have obtained $g \circ f = \mathrm{id}_{\mathbb{N}}$ in many other ways, since the value of $g$ on negative elements of $\mathbb{Z}$ plays no role. For example, $g(n) = n + 1$ if $n \ge 0$, $g(n) = 1$ if $n < 0$.

Exercise 14 Let $A$, $B$ be sets; assume $f : A \to B$ and $g : B \to A$. Prove: If $g \circ f = \mathrm{id}_A$, then $f$ is one-to-one and $g$ is onto.

The result of the following exercise will be used.

Exercise 15 Let $f : A \to B$ be such that there exist $g, h : B \to A$ satisfying $g \circ f = \mathrm{id}_A$, $f \circ h = \mathrm{id}_B$. Then $f$ is one-to-one and onto, and $g = h$.
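The one-sided inverses in the example $f(n) = n - 1$ can be checked numerically on a finite window. The sketch below makes some assumptions for illustration: the natural numbers are taken to start at 1, the windows `naturals` and `integers` are arbitrary finite ranges, and `g_alt` is the name given here to the second choice of $g$ mentioned in the text.

```python
def f(n):
    return n - 1                       # f : N -> Z, one-to-one but not onto

def g(y):
    return abs(y) + 1                  # g : Z -> N with g(f(n)) == n

def g_alt(y):
    return y + 1 if y >= 0 else 1      # another left inverse of f

naturals = range(1, 51)                # finite window of N (assumed to start at 1)
integers = range(-50, 51)              # finite window of Z

# g o f and g_alt o f both act as the identity on the window of N ...
print(all(g(f(n)) == n for n in naturals))        # True
print(all(g_alt(f(n)) == n for n in naturals))    # True

# ... but f o g is not the identity on Z: it sends y to |y|.
print([y for y in integers if f(g(y)) != y][:3])  # [-50, -49, -48]
```

Two different left inverses agreeing on the range of $f$ but differing elsewhere is exactly the non-uniqueness the text describes.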

Assume $f : A \to B$ is one-to-one and onto. Then by Theorem 1, there exists $g : B \to A$ such that $g \circ f = \mathrm{id}_A$, $f \circ g = \mathrm{id}_B$. This function is unique. In fact, if there were two such functions we could apply the last exercise letting $g$ be one of these functions, $h$ the other one, and conclude $g = h$; i.e., the two functions coincide. This allows one to introduce a notation for this function; if $f : A \to B$ is one-to-one and onto, then $f^{-1} : B \to A$ is the one and only function from $B$ to $A$ defined by

$f^{-1}(y) = x$ if and only if $f(x) = y$.

It is the one and only function from $B$ to $A$ satisfying $f \circ f^{-1} = \mathrm{id}_B$, $f^{-1} \circ f = \mathrm{id}_A$. It is called the inverse function of $f$.

Exercise 16 Let $f : (-1, 1) \to \mathbb{R}$ be defined by $f(x) = x/(1 - |x|)$. Prove that $f$ is one-to-one and onto and determine a formula for $f^{-1}$.

Exercise 17 Let $f : A \to B$ be bijective. Prove that $f^{-1} : B \to A$ is bijective and show that $(f^{-1})^{-1} = f$.

Exercise 18 Let $A, B, C$ be sets and let $f : A \to B$, $g : B \to C$. Prove:

1. If $f$ and $g$ are one-to-one, then $g \circ f$ is one-to-one.
2. If $f$ and $g$ are onto, then $g \circ f$ is onto.
3. If $f$ and $g$ are one-to-one and onto, then $g \circ f$ is one-to-one and onto and $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.

Exercise 19 Let $A$ be a set of $n$ elements. Show that there are exactly $n!$ distinct bijections from $A$ to $A$. Show also that in this context (a finite set $A$), a function $f : A \to A$ is injective if and only if it is surjective; it is surjective (or injective) if and only if it is bijective.

Cantor's Theory. One of the basic definitions is: If $A$, $B$ are sets, then we say that $A$, $B$ have the same power, or that $A$ and $B$ are equipotent, iff there exists a one-to-one onto function $f : A \to B$. We indicate this in symbols by writing $A \sim B$. The definition seems somewhat asymmetric in the sense that the function goes from $A$ to $B$. But if $f : A \to B$ is one-to-one and onto, then (Exercise 17) $f^{-1} : B \to A$ is one-to-one and onto, so that things are really symmetric in $A$, $B$. We have:

I. $A \sim A$ for every set $A$. In fact, the map $\mathrm{id}_A : A \to A$ is one-to-one and onto.

II. If $A$, $B$ are sets and if $A \sim B$, then $B \sim A$. As pointed out above, if $f : A \to B$ is one-to-one and onto, then so is $f^{-1} : B \to A$.

III. If $A, B, C$ are sets and if $A \sim B$, $B \sim C$, then $A \sim C$. In fact, if $f : A \to B$, $g : B \to C$ are one-to-one and onto, then so is $g \circ f : A \to C$ (Exercise 18).

Examples and exercises. As additional exercises, all unproved statements in these examples should be proved.

1. The set $E = \{2, 4, 6, \ldots\}$ has the same power as $\mathbb{N}$, the set of natural numbers. In fact, the function $n \mapsto n/2$ is a bijection from $E$ to $\mathbb{N}$.

2. The sets $\mathbb{N}$ and $\mathbb{Z}$ have the same power. A typical one-to-one, onto correspondence is given by the listing

$0, 1, -1, 2, -2, 3, -3, \ldots$

If you can list the elements of a set in this manner, it is countable! The listing means that $1 \mapsto 0$, $2 \mapsto 1$, $3 \mapsto -1$, etc. The fact that no elements of $\mathbb{Z}$ are repeated in the list, and that every element of $\mathbb{Z}$ will get listed, is equivalent to saying that we have a one-to-one onto function from $\mathbb{N}$ to the set. In the present case, it is easy to come up with a more synthetic way of defining the function, a formula. In fact, the correspondence is given by:

$$f(n) = \begin{cases} -\dfrac{n-1}{2} & \text{if } n \text{ is odd}, \\[4pt] \dfrac{n}{2} & \text{if } n \text{ is even}. \end{cases}$$

It is easy to prove the function is one-to-one and onto. In fact, let $n, m \in \mathbb{N}$ and assume $f(n) = f(m)$. Noticing that if $n$ is even we have $f(n) > 0$ while $n$ odd implies $f(n) \le 0$, we see that $n$, $m$ will have the same parity (otherwise one of $f(n)$, $f(m)$ is positive, the other one non-positive, and they can't be equal). If both are odd, then $f(n) = f(m)$ works out to $-(n-1)/2 = -(m-1)/2$, from which $n = m$ follows. If both are even, then one gets $n/2 = m/2$, hence also $n = m$. Having seen that the function is one-to-one, we see it is onto. Let $m \in \mathbb{Z}$. If $m > 0$, let $n = 2m$. Then $n$ is an even natural number and $f(n) = n/2 = m$. If $m \le 0$, let $n = -2m + 1$. Then $n$ is an odd natural number and $f(n) = -(n-1)/2 = m$. The function $f$ is one-to-one and onto.

3. Other sets that are equipotent with $\mathbb{N}$ are the set of all even integers (not just all even positive integers), the set of all odd integers, and any set obtained from $\mathbb{N}$ or $\mathbb{Z}$ by removing a finite set; for example the set $\{4, 5, 6, \ldots\} = \{n \in \mathbb{N} : n \ge 4\}$. Proofs are quite easy and it is time to go to more interesting examples. The ones that began to blow Cantor's mind.

4. The set $\mathbb{Q}$ of rational numbers is countable. We'll give Cantor's argument showing that $\mathbb{Q}^+$, the set of all positive rationals, is countable. With a bit of imagination, one sees how one can modify the argument to include all the rationals. We begin writing out the rationals in the form of a double list

$$\begin{array}{ccccc}
1 & 2 & 3 & 4 & \cdots \\
\tfrac{1}{2} & \tfrac{3}{2} & \tfrac{5}{2} & \tfrac{7}{2} & \cdots \\
\tfrac{1}{3} & \tfrac{2}{3} & \tfrac{4}{3} & \tfrac{5}{3} & \cdots \\
\tfrac{1}{4} & \tfrac{3}{4} & \tfrac{5}{4} & \tfrac{7}{4} & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array}$$

The first row consists of the positive integers. The second row consists of all rational numbers that have 2 as a denominator, omitting those already appearing in the first row. The third row consists of rational numbers of the form $m/3$, except when $m$ has a factor of 3. In general, the $n$-th row consists of all rationals of the form $m/n$, where $\gcd(m, n) = 1$. It should be clear that if we keep on expanding this double list, every positive rational number will appear exactly once. We can now traverse the list linearly, one diagonal after another, following each diagonal from its entry in the first column up to its entry in the first row. The rationals get listed in the following order:

$$1,\ \tfrac{1}{2},\ 2,\ \tfrac{1}{3},\ \tfrac{3}{2},\ 3,\ \tfrac{1}{4},\ \tfrac{2}{3},\ \tfrac{5}{2},\ 4,\ \tfrac{1}{5},\ \tfrac{3}{4},\ \tfrac{4}{3},\ \tfrac{7}{2},\ 5,\ \ldots$$

It should be clear that this establishes a one-to-one, onto correspondence between the natural numbers and the positive rational numbers. Clear, because every positive rational number will eventually appear in this list; its position in the list is the natural number to which it is assigned by the correspondence.
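The diagonal traversal of the double list can be turned into a generator. This is a sketch under stated assumptions rather than Cantor's own formulation: row $n$ is taken to hold the fractions $m/n$ with $\gcd(m, n) = 1$ in increasing order of $m$, each diagonal is read from the first column up to the first row, and the helper names `kth_coprime` and `positive_rationals` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def kth_coprime(n, k):
    """Return the k-th positive integer m (k >= 1) with gcd(m, n) == 1."""
    m = 0
    count = 0
    while count < k:
        m += 1
        if gcd(m, n) == 1:
            count += 1
    return m

def positive_rationals():
    """Yield the positive rationals diagonal by diagonal.

    Diagonal d holds the entries (row n, column d - n + 1) for n = d, ..., 1,
    so each diagonal starts in the first column and ends in the first row.
    """
    d = 1
    while True:
        for n in range(d, 0, -1):
            k = d - n + 1
            yield Fraction(kth_coprime(n, k), n)
        d += 1

first = list(islice(positive_rationals(), 15))
print(", ".join(str(q) for q in first))
# 1, 1/2, 2, 1/3, 3/2, 3, 1/4, 2/3, 5/2, 4, 1/5, 3/4, 4/3, 7/2, 5
```

The position of a rational in this stream is the natural number assigned to it; no value repeats because each row lists only fractions in lowest terms.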

5. Is every infinite set perhaps equipotent with $\mathbb{N}$? Cantor discovered in several ways that this was not so. First of all, he found that the reals were not equipotent with $\mathbb{N}$. In fact: The interval $[0, 1]$ consisting of all real numbers $x$ with $0 \le x \le 1$ is not countable. Here is how Cantor proceeded. Assume it is. Then we can list all of its numbers, say in the form $a_1, a_2, a_3, \ldots$ Let us write out each one of these numbers in decimal expansion; by writing 1 as $0.99999\ldots$ wherever it may go in the list, we can assume all these expansions start with "0.". The list, written vertically, will look something like

$$\begin{array}{rcl}
a_1 &=& 0.d_{11} d_{12} d_{13} \ldots, \\
a_2 &=& 0.d_{21} d_{22} d_{23} \ldots, \\
a_3 &=& 0.d_{31} d_{32} d_{33} \ldots, \\
\cdots &=& \cdots, \\
a_n &=& 0.d_{n1} d_{n2} d_{n3} \ldots, \\
\cdots &=& \cdots
\end{array}$$

Here $d_{n1}, d_{n2}, \ldots$ denote the decimal digits of $a_n$, the real number corresponding to $n \in \mathbb{N}$ (the $n$-th number in the list); $d_{nj} \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$ for all $n$, $j$.

Suppose now $x = 0.d_1 d_2 \ldots$ and $y = 0.\tilde{d}_1 \tilde{d}_2 \ldots$ are two numbers in the interval $[0, 1]$. If they are equal, must we have $d_j = \tilde{d}_j$ for all $j$? The answer is no; for example $0.12399999\ldots = 0.1240000\ldots$ and not all the digits are equal. But these are special cases and one easily sees that if $d_j \ne \tilde{d}_j$ for a single $j$, and either $x$ or $y$ does not end with a tail consisting entirely of 0's or of 9's, then $x \ne y$.

This allows us to blow to pieces the idea that we have managed to list all the reals between 0 and 1. In fact, for each positive integer $n$ let $e_n$ be a digit (an integer in the range 0-9) that is different from 0, 9 and $d_{nn}$ ($d_{nn}$ being the $n$-th digit of the $n$-th number in the list). Since there are 10 digits and we only require that $e_n$ differ from at most 3 of them, that leaves us at least 7 possible choices. Let $x = 0.e_1 e_2 \ldots$. Since no digit of $x$ is either 0 or 9, $x$ cannot end with a tail of 0's or a tail of 9's. Then $x \ne a_1$ because the first digit of $x$, $e_1$, differs from $d_{11}$, the first digit of $a_1$. Similarly $x \ne a_2$, because the second digit of $x$, $e_2$, differs from $d_{22}$, the second one of $a_2$. And so forth. No matter how we may try to list the reals in $[0, 1]$, there is always a real left over. It can't be done! We conclude: $[0, 1] \not\sim \mathbb{N}$.

6. The Power set. Let $A$ be a set. The power set of $A$, denoted by $\mathcal{P}(A)$, is the set of all subsets of $A$. In symbols: $\mathcal{P}(A) = \{B : B \subset A\}$.

For example, if $A = \{1, 2, 3\}$, then

$\mathcal{P}(A) = \{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}$.

Exercise 20 Let $A$ be a finite set of $n$ elements. Show that $\mathcal{P}(A)$ contains $2^n$ elements.

Cantor proved that one always has $A \not\sim \mathcal{P}(A)$. In fact, one has a bit more.

Theorem 2 Let $A$ be a set. There cannot exist an onto map $f : A \to \mathcal{P}(A)$ (not to mention one-to-one and onto).

Proof. This is actually a very pretty proof. Assume we have a set $A$ and there is $f : A \to \mathcal{P}(A)$ onto. Then $f(x)$ is a subset of $A$ for each $x \in A$, and for $x \in A$ we can ask whether or not $x \in f(x)$. In fact, we can define a subset $C$ by

$C = \{x \in A : x \notin f(x)\}$.

Nothing strange, so far. Except that because we are assuming $f$ is onto, we must have $C = f(c)$ for some $c \in A$. Now either $c \in f(c)$ or $c \notin f(c)$. But if $c \in f(c)$ then $c \notin C$ (since $C$ consists of elements $x$ not in $f(x)$). Since $C = f(c)$ this can be rephrased as: If $c \in f(c)$, then $c \notin f(c)$, which is nonsense. On the other hand, if $c \notin f(c)$, then $c \in C = f(c)$, again nonsense. We have reached an absolute contradiction which proves that there cannot exist an onto map $f : A \to \mathcal{P}(A)$.

Time for a definition. We will say that a set $A$ is countable iff $A \sim \mathbb{N}$.

And another definition. Let $A$, $B$ be sets. We will say that $A$ is of lesser power than $B$, and write $A \preceq B$, iff there exists a one-to-one map $f : A \to B$. Let us also write $A \prec B$ to indicate that $A \preceq B$ but $A \not\sim B$. For example, $\mathbb{N} \prec \mathbb{R}$. For every set $A$, $A \prec \mathcal{P}(A)$. The following exercise should be very easy.

Exercise 21 Let $A, B, C$ be sets. Prove: If $A \preceq B$ and $B \preceq C$, then $A \preceq C$.

Also easy is

Exercise 22 Prove that the following statements are equivalent for sets $A$, $B$:

1. $A \preceq B$.
2. There exists a surjective map $g : B \to A$.
3. $A \sim C$ for some subset $C$ of $B$.

(In particular, if $A \subset B$, then $A \preceq B$.)
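For a small finite set, the argument of Theorem 2 can be watched at work on every possible map $f : A \to \mathcal{P}(A)$. The following is a sketch for illustration, with the helper names `power_set` and `diagonal_set` and the choice $A = \{0, 1, 2\}$ being assumptions made here: for each of the $8^3 = 512$ maps it builds the set $C = \{x \in A : x \notin f(x)\}$ and confirms that $C$ is not a value of $f$.

```python
from itertools import combinations, product

def power_set(A):
    """Return all subsets of the finite collection A as frozensets."""
    items = sorted(A)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(A, f):
    """Return C = {x in A : x not in f(x)} for a map f given as a dict."""
    return frozenset(x for x in A if x not in f[x])

A = [0, 1, 2]
subsets_of_A = power_set(A)
print(len(subsets_of_A))   # 8, i.e. 2**3 as in Exercise 20

# Every map f : A -> P(A) is a choice of one subset per element of A.
missed_every_time = True
for images in product(subsets_of_A, repeat=len(A)):
    f = dict(zip(A, images))
    if diagonal_set(A, f) in images:
        missed_every_time = False
print(missed_every_time)   # True: C is never in the range of f
```

For finite sets the conclusion also follows from counting ($n < 2^n$); the point of the sketch is that the set $C$ itself is always a witness, which is the part of the argument that carries over to infinite sets.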

It is a much more difficult fact that A ≼ B and B ≼ A together imply A ∼ B. We'll discuss this later; for now it might help to accept that it is true. Cantor also saw that N was the smallest possible infinite set (from the point of view of number of elements); i.e., there is no smaller infinity than countable. I'll rely on intuitive ideas about infinity and finiteness in the proof of the following theorem.

Theorem 3 Let A be an infinite set. Then N ≼ A.

Proof. We will define inductively a map f : N → A as follows. Since A is infinite, it clearly is not empty; let a_1 ∈ A. Define f(1) = a_1. Assume now f(1) ∈ A, ..., f(n) ∈ A defined for some n ∈ N, in such a way that f(i) ≠ f(j) for 1 ≤ i < j ≤ n. The set {f(1), ..., f(n)} is a finite subset of A; since A is infinite, the set A\{f(1), ..., f(n)} must be non-empty; select a_{n+1} ∈ A\{f(1), ..., f(n)} and define f(n + 1) = a_{n+1}. By induction, f is defined on N and it is clear (?) that f is a one-to-one map from N into A.

The alephs. What is a number? A number indicates a very basic property of a set; perhaps the most primitive thing two sets can have in common is the number of elements. One could say (for example), in a somewhat circular form, that 3 is the thing that all sets having three elements have in common. In all events, Cantor decided to extend the numbering system past the natural numbers and so introduced the alephs, aleph, written ℵ, being the first letter of the Hebrew alphabet. The power of a set is called, in this context, also the cardinality of the set. He used the symbol ℵ_0 (aleph sub-zero) to denote the smallest possible infinite cardinality; that is, the cardinality of N. As seen, the sets N, Z, Q all share the cardinality ℵ_0. What about the set P of all primes? Well, P ⊆ N, thus P ≼ N by Exercise 22. On the other hand, P is an infinite set, thus N ≼ P by Theorem 3. This should imply P ∼ N, by the result we mentioned above.
Actually, it is very easy to show directly that every infinite subset of a countable set is also countable; i.e., if A ⊆ B and the cardinality of B is ℵ_0, then either A is a finite set, or has cardinality ℵ_0. Things get more complicated for sets of higher cardinality. Since Cantor, the cardinal numbers are 1, 2, 3, 4, ..., ℵ_0, ℵ_1, ... We'll get to ℵ_1 and higher alephs soon.

The Cantor-Schröder-Bernstein Theorem. There are some paternity questions about this theorem. The MacTutor pages state that it was proved by Cantor in a set of two treatises on Set Theory published in 1895 and 1897. The MacTutor pages also state that it was proved independently by Felix Bernstein (1878-1956) and Ernst Schröder (1841-1902). Other authors say that Cantor conjectured the theorem, Schröder gave an incomplete proof, and Bernstein the first complete one. It is frequently called the Schröder-Bernstein theorem, sometimes the Cantor-Bernstein theorem. Here it is.

Theorem 4 If A, B are sets and if there exist injections f : A → B, g : B → A, then there exists a bijection h : A → B. Briefly: If A ≼ B and B ≼ A, then A ∼ B.

Proof. We introduce some notation. Let a be in A. We say x ∈ A or x ∈ B is an ancestor of a if starting at x and applying successively and alternately f, g we can get to a. (Applying f first if x ∈ A, g first if x ∈ B.) If a is not in the range of g, then a has no ancestors. If a is in the range of g, then a = g(x_1) for some x_1 ∈ B; this element x_1 is an ancestor of a. The element x_1 ∈ B may or may not be in the range of f. If not, it is the first ancestor of a. If it is in the range of f, then x_1 = f(x_2) for some x_2 ∈ A. The element x_2 is an ancestor of a. If x_2 is in the range of g, then x_2 = g(x_3) for some x_3 ∈ B, and x_3 is also an ancestor of a. If not, if x_2 is not in the range of g, then x_2 is the first ancestor of a. And so forth.

There are exactly three things that can happen, and exactly one must happen. One, this chain of ancestry goes on forever: every ancestor of a has itself an ancestor. Two, the chain never starts, or it ends at some element of A without an ancestor. Three, the chain ends at some element of B that has no ancestor; i.e., is not in the range of f. I hope this is clear; I really don't know how to make it clearer or easier. We can thus partition the set A into three mutually disjoint sets A_1, A_2, A_3 defined as follows:

    A_1 = {a ∈ A : a has no first ancestor, the ancestry chain goes on forever}
    A_2 = {a ∈ A : a has no ancestors or has a first ancestor in A}
    A_3 = {a ∈ A : a has a first ancestor in B}

One or even two of these sets can be empty. But their union is A and no two of them have any common element.
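Tracing an ancestry chain is a purely mechanical process, and a small Python sketch may help. Everything concrete here is an assumption made for illustration: A and B are both taken to be the positive integers, with the injections f(n) = 2n and g(n) = 3n, and the function names are hypothetical. In this example every chain ends, so A_1 is empty; in general no finite computation can certify that a chain goes on forever, which is why the sketch gives up after max_steps.

```python
def f_inverse(b):
    """Preimage of b under f(n) = 2n, or None if b is not in the range of f."""
    return b // 2 if b % 2 == 0 else None


def g_inverse(a):
    """Preimage of a under g(n) = 3n, or None if a is not in the range of g."""
    return a // 3 if a % 3 == 0 else None


def classify(a, max_steps=1000):
    """Follow the ancestry chain of a in A backwards.
    Returns "A2" if a has no ancestors or a first ancestor in A,
    "A3" if the first ancestor lies in B, and "undecided" if the
    chain has not ended after max_steps steps."""
    side, x = "A", a
    for _ in range(max_steps):
        parent = g_inverse(x) if side == "A" else f_inverse(x)
        if parent is None:
            # x has no ancestor: the chain starts here.
            return "A2" if side == "A" else "A3"
        side = "B" if side == "A" else "A"
        x = parent
    return "undecided"


for a in (1, 3, 6, 18):
    print(a, classify(a))
```

For instance, 6 = g(2), 2 = f(1), and 1 is not in the range of g, so the first ancestor of 6 is the element 1 of A and 6 lands in A_2.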
We are ready to define the one-to-one and onto map. We define h : A → B as follows. If a ∈ A_1 or a ∈ A_2, we set h(a) = f(a). If a ∈ A_3 then a must be in the range of g, so a = g(b) for a unique b ∈ B (unique because g is one-to-one); set h(a) = b. We claim that h : A → B is one-to-one and onto. Perhaps the best approach to establishing the claim is to similarly divide B into three mutually disjoint sets B_1, B_2, B_3 where

    B_1 = {b ∈ B : b has no first ancestor, the ancestry chain goes on forever}
    B_2 = {b ∈ B : b has a first ancestor in A}
    B_3 = {b ∈ B : b has no ancestors or has a first ancestor in B}

We can then see that if a ∈ A_1, then h(a) ∈ B_1. In fact, if a ∈ A_1 then a has an infinite ancestry chain and h(a) = f(a); since a is thus an ancestor of f(a), the ancestry chain of f(a) is also infinite. If a ∈ A_2, then h(a) = f(a) ∈ B_2. Again this is because a is an ancestor of f(a) and the ancestry chain of a begins with a itself or with another element of A; the same is then true for f(a). Finally, if a ∈ A_3, then h(a) ∈ B_3. In fact, in this case we have h(a) = b where g(b) = a. This makes b an ancestor of a and every ancestor of b one of a; since the first ancestor of a must be in B, the only choice for b is to have no ancestors, or to have a first one in B. Thus b = h(a) ∈ B_3.

h is one-to-one. Let a_1, a_2 ∈ A and assume h(a_1) = h(a_2). We have to see this leads inexorably to a_1 = a_2. Let b = h(a_1) = h(a_2). From the discussion above, if b ∈ B_1, we must have a_1, a_2 ∈ A_1; if b ∈ B_2, we must have a_1, a_2 ∈ A_2; if b ∈ B_3, we must have a_1, a_2 ∈ A_3. But then, if a_1, a_2 ∈ A_1 ∪ A_2, the equation h(a_1) = h(a_2) becomes f(a_1) = f(a_2), hence a_1 = a_2 because f is one-to-one; if a_1, a_2 ∈ A_3, then h(a_1) = h(a_2) = b leads to a_1 = g(b) = a_2. One-to-oneness follows.

h is onto. Let b ∈ B. If b ∈ B_1, then b has an infinite ancestry chain; in particular, b = f(a) for some a ∈ A. From the discussion above, we'll have a ∈ A_1, thus b = f(a) = h(a). Next, let b ∈ B_2. In this case the ancestry chain of b ends in A and we see that b = f(a) for some a ∈ A_2. Again, b = f(a) = h(a). Finally, let b ∈ B_3. Let a = g(b) ∈ A. It is clear that a ∈ A_3 and h(a) = b. Ontoness is established. The claim has been established, the theorem is proved.

Exercise 23 Let A, B, f, g be as in Cantor-Schröder-Bernstein; i.e., f : A → B and g : B → A are one-to-one. Define the sets A_1, A_2, A_3, B_1, B_2, B_3 as in the proof of the theorem. Prove:
1. If A_2 = A_3 = ∅, then both f and g are onto; hence one-to-one and onto.
2. If A_1 = A_3 = ∅, then f is onto, but g is not onto.
3. If A_1 = A_2 = ∅, then g is onto, but f is not onto.

Exercise 24 Let A = N, the set of natural numbers, and let B = Q+ be the set of all positive rational numbers. Define f : N → Q+ by f(n) = n for all n ∈ N.
Define g : Q+ → N as follows. If r ∈ Q+ write r = a/b where a, b are positive integers and gcd(a, b) = 1. Then set g(r) = 2^a 3^b. It should be obvious that both f and g are one-to-one. The subsets A_1, A_2, A_3 of N, the subsets B_1, B_2, B_3 of Q+, and the function h : N → Q+ are defined as in the proof of Cantor-Schröder-Bernstein.
1. Prove that A_1 = ∅, hence also B_1 = ∅. Hint: Notice that f(n) = n ≥ n, while g(r) > r.
2. For the following numbers in A = N determine their first ancestor and decide if they are in A_2 or A_3: 1, 6, 30, 192, 648.
3. Determine a ∈ N so that h(a) = b for the following values of b: b = 2/5, b = 1/2, b = 1/3.

The algebraic numbers. Cantor published his first paper on set theory in 1874, in Crelle's journal. It was the first public appearance of the idea that there were different levels of infinity and that some sets are more infinite than others. One of the sets he showed was not very infinite, of cardinality ℵ_0, was the set of algebraic numbers. The argument uses the fact that if you have a countable number of finite sets and form the union of all their elements, the result is at most countable. Here is the result, in the form of a theorem. The proof is not 100% rigorous; there is some appeal to intuition.

Theorem 5 Let A_n be a finite (possibly empty) set for each n ∈ N. Then ⋃_{n=1}^∞ A_n is at most countable.

Proof. We may assume that all the A_n's are pairwise disjoint; repeated elements only make the union of all sets smaller than it could be. Similarly, we may assume that all the A_n's are non-empty; if we discard the indices for which the sets are empty, the union stays the same. Let K_n denote the number of elements in the set A_n, so that K_n is a positive integer for each n ∈ N. Next, order all the elements in each set A_n; that is, list them in the form a_{n1}, a_{n2}, ..., a_{nK_n}. We define now f : ⋃_{n=1}^∞ A_n → N. Let a ∈ ⋃_{n=1}^∞ A_n. By our assumptions, there is one and only one n ∈ N such that a ∈ A_n; and then a = a_{nk} for some k, 1 ≤ k ≤ K_n. If n = 1, set f(a) = k. If n = 2, set f(a) = K_1 + k. In general, set f(a) = f(a_{nk}) = K_1 + ... + K_{n−1} + k. It is easy to see this map is one-to-one (and onto when infinitely many non-empty sets remain).

In case the proof was a bit too murky, here is the simple idea. Suppose, for example, that A_1 = {a, b, c}, A_2 = {d, e}, A_3 = {f, g, h, i}, and so forth. Then we are listing the elements of the union in the form a, b, c, d, e, f, g, h, i, etc. If, for example, the first set has three elements, then these elements get assigned to 1, 2, 3.
If the second set has two elements, they get assigned to 4, 5. Etc.

Remark. This last theorem remains true if one replaces the word "finite" by "at most countable": A countable union of at most countable sets is at most countable. The proof of this more general result uses essentially the same idea Cantor used in proving that the rationals were countable.

We turn to the algebraic numbers. We recall that a real number x is algebraic iff it is the solution of an equation of the form

    (2)  a_n x^n + ... + a_1 x + a_0 = 0,

where n ≥ 1, a_n ≠ 0 and all the coefficients a_0, a_1, ..., a_n are integers. All the rational numbers are algebraic. In fact, if x is rational, then x = a/b where a, b are integers, b ≠ 0. Then bx − a = 0, an equation of type (2) with n = 1 and integer coefficients a_1 = b, a_0 = −a. Roots of rational numbers are algebraic. By definition, a number that is not algebraic is said to be transcendental. If the following proof, due to Cantor, of the countability of the set of algebraic numbers is too obscure, there are many textbooks that carry it. Or you can ask me for clarification. The proof here is from the MacTutor page on Set Theory.

Begin quoted passage. In his 1874 paper Cantor considers at least two different kinds of infinity. Before this, orders of infinity did not exist but all infinite collections were considered "the same size". However Cantor examines the set of algebraic real numbers, that is the set of all real roots of equations of the form

    a_n x^n + a_{n−1} x^{n−1} + a_{n−2} x^{n−2} + ... + a_1 x + a_0 = 0,

where a_i is an integer. Cantor proves that the algebraic real numbers are in one-one correspondence with the natural numbers in the following way. For an equation of the above form define its index to be

    |a_n| + |a_{n−1}| + |a_{n−2}| + ... + |a_1| + |a_0| + n.

There is only one equation of index 2, namely x = 0. There are 4 equations of index 3, namely 2x = 0, x + 1 = 0, x − 1 = 0 and x^2 = 0. These give roots 0, 1, −1. For each index there are only finitely many equations and so only finitely many roots. Putting them in 1-1 correspondence with the natural numbers is now clear, by ordering them in order of index and increasing magnitude within each index. End of the quoted passage.

It is easy to see that if A, B are countable, then A ∪ B is countable. Since R is not countable, one concludes that the set of transcendental numbers must be uncountable.
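The counts in the quoted passage (one equation of index 2, four of index 3) can be checked by brute force. The Python sketch below makes two assumptions that the passage does not spell out: the index is computed with absolute values of the coefficients, and the leading coefficient is taken positive, so that an equation and its negative are counted once. The function name equations_of_index is hypothetical.

```python
from itertools import product


def equations_of_index(index):
    """List coefficient tuples (a_n, ..., a_1, a_0) with n >= 1, a_n > 0 and
    |a_n| + ... + |a_0| + n == index."""
    found = []
    for n in range(1, index):           # a_n >= 1 forces n <= index - 1
        total = index - n                # required sum of absolute values
        for lead in range(1, total + 1):
            rest_sum = total - lead      # what the lower coefficients must add up to
            for rest in product(range(-rest_sum, rest_sum + 1), repeat=n):
                if sum(abs(c) for c in rest) == rest_sum:
                    found.append((lead,) + rest)
    return found


for index in (2, 3, 4):
    eqs = equations_of_index(index)
    print("index", index, ":", len(eqs), "equations", eqs)
```

Each index yields a finite list, so the equations, and with them their finitely many roots, can be enumerated one index after another as in Theorem 5.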
In fact, because every interval (a, b) with a < b is uncountable, "Cantor now remarks that this proves a theorem due to Liouville (1809-1882), namely that there are infinitely many transcendental (i.e. not algebraic) numbers in each interval." The quote is again from the MacTutor pages. It should be remarked that Liouville's proof was constructive, while Cantor's is, at first glance at least, highly non-constructive since he seems to give no indication on how to obtain even a single transcendental number.

On with the alephs, infinity after infinity. Cantor called the smallest infinite cardinal number ℵ_0; it is the cardinality of N. But since every set has cardinality less than its power set, one has more cardinals. Cantor introduced more and more alephs so that one has

    ℵ_0 < ℵ_1 < ℵ_2 < ...

With equipotence among sets being an equivalence relation, Cantor defined cardinals as being equivalence classes of equipotent sets and, of course, for cardinals α, β one has α ≤ β iff A ∈ α, B ∈ β implies A ≼ B; one has α < β if A ≼ B but A ≁ B. All this is very nice and thanks to Cantor-Schröder-Bernstein one almost has trichotomy. But not quite. Suppose A, B are sets, arbitrary sets, possibly enormously large, vast, infinite sets. Is there any reason in the whole wide world to assume that one will have A ≼ B or B ≼ A? Couldn't these sets be completely incomparable, with no way of defining a function from one to the other, at least not a one-to-one function (one can always define a constant function)?

Cantor conjectured that every set can be well ordered; i.e., that for every set A one could define a relation, usually denoted by <, among its elements satisfying the following properties.
1. If a, b, c ∈ A, if a < b and b < c, then a < c. (Transitivity.)
2. If a, b ∈ A then exactly one of the following three possibilities holds: a < b, a = b, b < a. (Trichotomy.)
3. For each non-empty subset B of A there is b ∈ B such that b < x for all x ∈ B, x ≠ b.
The last of these three properties is frequently stated by saying that every non-empty subset contains a first or smallest element, and it is what characterizes the order as a well order. The usual order relation < is a well order for the set N of natural numbers. However, the usual order is not a well order for Q, and even less for R. For example, the set {1/n : n ∈ N} is a non-empty subset of Q yet contains no smallest element. However, Q can easily be well ordered by setting it into one-to-one and onto correspondence with N and then transferring the order of N to it. In other words, we know there exists a function f : Q → N that is one-to-one and onto. If we redefine x < y for x, y ∈ Q to mean f(x) < f(y), then we have well ordered the rationals. The situation is much more iffy for the reals. If one thinks a bit about well orderings one sees that they are very weird animals when occurring in uncountable sets. If the set is countable, it is easy to well order it.
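The transfer of the order of N to Q described above can be made concrete. In the Python sketch below, the particular enumeration of Q (first 0, then ±p/q in lowest terms, arranged by increasing p + q) is just one convenient choice and an assumption of this example; the names rationals, rank and precedes are hypothetical. Here rank plays the role of the bijection f : Q → N.

```python
from fractions import Fraction
from itertools import count
from math import gcd


def rationals():
    """Yield every rational number exactly once: 0 first, then p/q and -p/q
    for coprime positive p, q, in order of increasing p + q."""
    yield Fraction(0)
    for s in count(2):
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                yield Fraction(-p, q)


def rank(x):
    """Position (starting at 1) of the rational x in the enumeration."""
    for n, r in enumerate(rationals(), start=1):
        if r == x:
            return n


def precedes(x, y):
    """The transferred order: x comes before y iff rank(x) < rank(y)."""
    return rank(x) < rank(y)


unit_fractions = [Fraction(1, n) for n in range(1, 7)]
print(min(unit_fractions, key=rank))       # first element in the new order
print(precedes(Fraction(1, 2), Fraction(-1, 2)))
```

In the usual order the set {1/n : n ∈ N} has no smallest element, but in the transferred order any non-empty set of rationals has a first element, namely the one of least rank.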
Just decide on any element and call it the first element of the set. Then consider the set minus the first element, select an element from there and call it the second element. And so forth. A well ordered set must have a first element, a second element, a third element, etc., and if the set is countable such elements can be made to exhaust the set. But the reals? Let's say we try to well order the reals; we decide on a first element, a second one, a third one, etc. Well, we are already in an infinite process here, but we are only affecting a minor portion of the reals. The list consisting of the first real, second real, etc. is perforce countable, and most reals won't yet be related to anything after we are done. Well, with a flight of the imagination we can assume that we have listed a whole infinite (but countable) set of reals, which are now ordered. We'll declare every other real to be larger than those listed, and begin a second list. For the sake of an example, let's assume that what we listed first were the positive integers in their usual order; then we want to add to our reordering all the non-integral square roots of integers, and after that the positive rational numbers that have denominator three in reduced form. A bit helter-skelter, but we could have lists that look like this, where every listed element is considered as being less than anything to its right:

    1, 2, 3, ..., √2, √3, √5, ..., 1/3, 2/3, ...

This is just an example! For example, in this order 1000 < √2 and √2 < 1/3. So far the order is a well order. Notice that nobody says that every non-empty subset has to have a last element, just a first element. What next? Well, maybe after finishing with numbers having denominator 3, we could include cubic roots? The thing is that even after a countable number of steps of this sort, adding again and again an infinite set of ordered numbers to the right of the previous ones, we are not even close to finishing. So far, all we are doing is well ordering in a weird way a countable subset of R. A bit of reflection shows that there isn't, there can't be, a constructive way of well ordering the reals; thus Cantor's conjecture that all sets, even the reals, even sets much more dizzyingly uncountable than the reals, could be well ordered was a pretty bold one.

It does imply that any two sets are comparable. In fact, once a set A is well ordered by < one can define a section of A to be any subset B such that if x ∈ B, then y ∈ B for every y ∈ A with y < x. For example, in the case of N with the usual order, sections are all sets of the form {1, 2, ..., m} for some m ∈ N. One can prove that if A, B are well ordered then either A is equivalent to a section of B or B is equivalent to a section of A. Thus, if every set can be well ordered, then any two sets are comparable, meaning that trichotomy holds for cardinals. This conjecture of Cantor was "proved" by the German mathematician Ernst Zermelo (1871-1953). I wrote proved in quotation marks because what Zermelo really did was to posit an axiom, now known as the axiom of choice, from which he derived the result.
Very roughly the axiom of choice says that if you have an arbitrary indexed family of sets {A_i}_{i∈I}, and if A_i ≠ ∅ for each i ∈ I, then you can form a new object (x_i)_{i∈I}, consisting of a choice of one element x_i ∈ A_i for each i ∈ I.