Math 341: Convex Geometry. Xi Chen


Math 341: Convex Geometry
Xi Chen
479 Central Academic Building, University of Alberta, Edmonton, Alberta T6G 2G1, Canada
E-mail address: xichen@math.ualberta.ca

CHAPTER 1
Basics

1. Euclidean Geometry

1.1. Vector spaces. Let R be the set of real numbers. We use R^n to denote the n-dimensional Euclidean space. Every point in R^n is given by an n-tuple (x_1, x_2, ..., x_n), where x_i ∈ R. We can define two operations on R^n. One is called vector addition: we define the sum of (x_1, x_2, ..., x_n) ∈ R^n and (y_1, y_2, ..., y_n) ∈ R^n to be

(1.1) (x_1, x_2, ..., x_n) + (y_1, y_2, ..., y_n) = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n).

The other is called scalar multiplication: we define the product of λ ∈ R and (x_1, x_2, ..., x_n) to be

(1.2) λ(x_1, x_2, ..., x_n) = (λx_1, λx_2, ..., λx_n).

These two operations obviously have the following properties:

VS1 (Commutativity) For any u = (x_1, x_2, ..., x_n) and v = (y_1, y_2, ..., y_n) ∈ R^n,

(1.3) u + v = v + u.

VS2 (Associativity) For any u = (x_1, x_2, ..., x_n), v = (y_1, y_2, ..., y_n) and w = (z_1, z_2, ..., z_n) ∈ R^n,

(1.4) (u + v) + w = u + (v + w).

VS3 (Associativity for Scalar Multiplication) For any λ_1, λ_2 ∈ R and u = (x_1, x_2, ..., x_n) ∈ R^n,

(1.5) (λ_1 λ_2)u = λ_1(λ_2 u).

VS4 (Distributive Law) For any λ ∈ R and u = (x_1, x_2, ..., x_n), v = (y_1, y_2, ..., y_n) ∈ R^n,

(1.6) λ(u + v) = λu + λv.

R^n, equipped with these two operations, is called a vector space over R, or a real vector space. Indeed, any set equipped with two operations that satisfy VS1-VS4, together with the remaining standard axioms (a zero vector, additive inverses, 1·u = u and (λ + µ)u = λu + µu), is a vector space.
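These axioms are easy to verify numerically. Here is a minimal sketch (assuming NumPy is available; the scalars and vectors are arbitrary choices for illustration):

```python
import numpy as np

# Arbitrary sample data: two scalars and three vectors in R^3.
lam1, lam2 = 2.5, -1.5
u = np.array([1.0, 2.0, 3.0])
v = np.array([-1.0, 0.0, 4.0])
w = np.array([0.5, 0.5, 0.5])

assert np.allclose(u + v, v + u)                          # VS1: commutativity
assert np.allclose((u + v) + w, u + (v + w))              # VS2: associativity
assert np.allclose((lam1 * lam2) * u, lam1 * (lam2 * u))  # VS3
assert np.allclose(lam1 * (u + v), lam1 * u + lam1 * v)   # VS4: distributivity
print("VS1-VS4 hold for the sample data")
```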

1.1.1. Geometric representation of vectors. A vector u = (u_1, u_2, ..., u_n) ∈ R^n can be represented by an arrow starting at a point P = (x_1, x_2, ..., x_n) and ending at the point Q = (y_1, y_2, ..., y_n), where y_i = x_i + u_i for i = 1, 2, ..., n. By convention, we write u = PQ. Let O = (0, 0, ..., 0) be the origin. Since there is a one-to-one correspondence between the vector OP and the point P, we also call a vector in R^n a point from time to time.

Vector addition can be interpreted geometrically by the so-called parallelogram criterion: let A, B, C, D be the vertices (in clockwise order) of a parallelogram. Then AB + AD = AC. Similarly, here is the geometric interpretation of scalar multiplication: let u = PQ and PR = λu. Then P, Q, R are collinear, |PR| = |λ| |PQ|, and Q and R are on the same side of P iff λ ≥ 0.

Proposition 1.1. Let R be the point lying on the line PQ between P and Q and satisfying |PR| / |RQ| = λ. Then

(1.7) OR = t OP + (1 - t) OQ with t = 1/(1 + λ).

Proof. We see from the assumption that PR = λ RQ as vectors. And since PR = OR - OP and RQ = OQ - OR, we have OR - OP = λ(OQ - OR), i.e., (1 + λ) OR = OP + λ OQ. And (1.7) follows.

Let V be a vector space (over R).

1.1.2. Linear dependence and dimension. We say that v_1, v_2, ..., v_k ∈ V are linearly dependent if there exist λ_1, λ_2, ..., λ_k ∈ R, not all zero, such that λ_1 v_1 + λ_2 v_2 + ... + λ_k v_k = 0; and v_1, v_2, ..., v_k are linearly independent if they are not linearly dependent.

Proposition 1.2. Any m > n vectors v_1, v_2, ..., v_m in R^n are linearly dependent.

Proof. Let v_i = (a_i1, a_i2, ..., a_in) for i = 1, 2, ..., m and A = (a_ij)_{m×n}. Using Gaussian elimination, A can be turned into an upper-triangular matrix B = (b_ij)_{m×n} by a series of row operations. That is, there exists a nonsingular matrix C = (c_ij)_{m×m} such that CA = B = (b_ij)_{m×n}, where b_ij = 0 for i > j. Since m > n, the m-th row of B vanishes and therefore

(1.8) c_m1 v_1 + c_m2 v_2 + ... + c_mm v_m = (b_m1, b_m2, ..., b_mn) = 0.

Since C is nonsingular, i.e., det(C) ≠ 0, c_m1, c_m2, ..., c_mm cannot all be zero. Hence v_1, v_2, ..., v_m are linearly dependent.

We say that v ∈ V is a linear combination of v_1, v_2, ..., v_k ∈ V if there exist λ_1, λ_2, ..., λ_k ∈ R such that v = λ_1 v_1 + λ_2 v_2 + ... + λ_k v_k. Let S be a subset of V. We say that S generates V if every vector in V is a linear combination of some vectors in S. We say that S is a basis of V if S generates V and v_1, v_2, ..., v_k are linearly independent for any finite subset {v_1, v_2, ..., v_k} ⊆ S.
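Proposition 1.2 is easy to check by machine: the rank of the m × n matrix A whose rows are the vectors is at most n, and a nontrivial vanishing combination can be read off from the null space of A^T. A small sketch (NumPy assumed; the sample vectors are arbitrary):

```python
import numpy as np

# Four vectors in R^3: more vectors than the dimension, so they are dependent.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])   # rows are v_1, ..., v_4

print(np.linalg.matrix_rank(A))  # 3 < 4, so the rows are linearly dependent

# A nontrivial combination summing to zero is a null vector of A^T.
_, _, vt = np.linalg.svd(A.T)
c = vt[-1]                       # satisfies A^T c = 0 with c != 0
print(np.allclose(c @ A, 0))     # c_1 v_1 + ... + c_4 v_4 = 0 -> True
```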

Theorem 1.3. Let S_1 and S_2 be two bases of V. Then |S_1| = |S_2|.

Proof. We will only prove the theorem when S_1 and S_2 are finite, as that is the only case we need here. Let S_1 = {u_1, u_2, ..., u_n} and S_2 = {v_1, v_2, ..., v_m}. Since S_1 generates V, each v_j is a linear combination of u_1, u_2, ..., u_n, i.e., v_j = a_1j u_1 + a_2j u_2 + ... + a_nj u_n for some a_ij ∈ R. Suppose that m > n. By Proposition 1.2, the m vectors (a_1j, a_2j, ..., a_nj) ∈ R^n are linearly dependent. Hence there exist λ_1, λ_2, ..., λ_m ∈ R, not all zero, such that

(1.9) Σ_{j=1}^m λ_j (a_1j, a_2j, ..., a_nj) = 0.

By (1.9), we have

(1.10) Σ_{j=1}^m λ_j v_j = Σ_{j=1}^m λ_j (a_1j u_1 + a_2j u_2 + ... + a_nj u_n) = 0

and hence v_1, v_2, ..., v_m are linearly dependent. Contradiction. Therefore, m ≤ n. Similarly, n ≤ m. Therefore, m = n and |S_1| = |S_2|.

Let S be a basis of V; we define the dimension of the vector space V to be dim V = |S|. By the above theorem, dim V is independent of the choice of the basis S. To emphasize the fact that V is a vector space over R, we write dim_R V for the dimension of V over R. Obviously, dim_R R^n = n.

1.1.3. Linear subspaces. A subset W ⊆ V is a (linear) subspace of V if W is closed under vector addition and scalar multiplication, i.e., u + v ∈ W for any u, v ∈ W and λu ∈ W for any u ∈ W and λ ∈ R. A subspace W ⊆ V is itself a vector space (over R).

Proposition 1.4. A nonempty subset W ⊆ V is a linear subspace if and only if u + λv ∈ W for any u, v ∈ W and λ ∈ R.

Proof. ⇒: Since W is a subspace, λv ∈ W and hence u + λv ∈ W. ⇐: Since u + λv ∈ W for all u, v ∈ W and λ ∈ R, u + v ∈ W by letting λ = 1, and λv ∈ W by letting u = 0 (note that 0 = u + (-1)u ∈ W for any u ∈ W).

For two subsets A, B ⊆ V and λ ∈ R, we define

(1.11) A + B = {a + b : a ∈ A, b ∈ B} and λA = {λa : a ∈ A}.

If A = {x} consists of a single vector, we often write x + B for A + B and we call x + B a translate of B. The following are obvious.

Proposition 1.5. Let A, B, C ⊆ V and λ, µ ∈ R. Then
(1) A + B = B + A;
(2) (A + B) + C = A + (B + C);
(3) λ(A + B) = λA + λB;
(4) (λ + µ)A ⊆ λA + µA;
(5) (λµ)A = λ(µA);
(6) A is a subspace if and only if A ≠ ∅, A + A = A and λA = A for any λ ≠ 0;

(7) If A and B are subspaces, so is A + B;
(8) The intersection of any collection of subspaces is a subspace.

Let V and W be two vector spaces. We call a map f : V → W a linear map or linear transformation if f(x + y) = f(x) + f(y) and f(λx) = λf(x) for all x, y ∈ V and λ ∈ R.

1.1.4. Affine dependence and affine subspaces. The translate of a linear subspace of V is called an affine subspace. Let W ⊆ V be a linear subspace and x + W be an affine subspace. The dimension of x + W is defined to be the dimension of W. An affine subspace of dimension 1 is called a line. An affine subspace in V of dimension dim V - 1 is called a hyperplane. An affine subspace of dimension k is called a k-plane.

Proposition 1.6.
(1) Let S ⊆ V be an affine subspace of V and let y ∈ S. Then S - y = (-y) + S is a linear subspace of V.
(2) The intersection of any collection of affine subspaces is an affine subspace.

Proof. For (1), let S = x + W for some x ∈ V and linear subspace W ⊆ V. Then y = x + w for some w ∈ W. Then S - y = x + W - (x + w) = W - w. We claim that W - w = W. For any u ∈ W, u + w ∈ W and hence u = (u + w) - w ∈ W - w; therefore W ⊆ W - w. And since u - w ∈ W for any u ∈ W, W - w ⊆ W. Therefore, S - y = W - w = W is a linear subspace.

For (2), let {S_i ⊆ V}_{i∈I} be a collection of affine subspaces of V. If ∩_{i∈I} S_i is empty, we are done. If not, let y ∈ ∩_{i∈I} S_i. By (1), W_i = S_i - y is a linear subspace. We claim that

(1.12) ∩_{i∈I} S_i = ∩_{i∈I} (y + W_i) = y + ∩_{i∈I} W_i,

and then ∩_{i∈I} S_i is an affine subspace since ∩_{i∈I} W_i is a linear subspace. To prove (1.12), let x ∈ ∩_{i∈I} S_i. Then x ∈ S_i = y + W_i and hence x - y ∈ W_i for all i ∈ I. Consequently,

(1.13) x - y ∈ ∩_{i∈I} W_i ⟹ x ∈ y + ∩_{i∈I} W_i ⟹ ∩_{i∈I} S_i ⊆ y + ∩_{i∈I} W_i.

Conversely, let w ∈ ∩_{i∈I} W_i. Then y + w ∈ S_i for all i ∈ I. Therefore, y + w ∈ ∩_{i∈I} S_i and y + ∩_{i∈I} W_i ⊆ ∩_{i∈I} S_i. And (1.12) follows.

We say vectors u_1, u_2, ..., u_m ∈ V are affinely dependent if there exist λ_1, λ_2, ..., λ_m ∈ R, not all zero, such that λ_1 + λ_2 + ... + λ_m = 0 and λ_1 u_1 + λ_2 u_2 + ... + λ_m u_m = 0. We say u is an affine combination of u_1, u_2, ..., u_m if there exist λ_1, λ_2, ..., λ_m ∈ R such that λ_1 + λ_2 + ... + λ_m = 1 and u = λ_1 u_1 + λ_2 u_2 + ... + λ_m u_m.

Proposition 1.7. Let v_1, v_2, ..., v_m ∈ R^n with v_j = (a_1j, a_2j, ..., a_nj) for j = 1, 2, ..., m and let u_j = (1, a_1j, a_2j, ..., a_nj) ∈ R^{n+1}.
(1) v_1, v_2, ..., v_m are affinely dependent if and only if u_1, u_2, ..., u_m are linearly dependent.
(2) Let v = (a_1, a_2, ..., a_n) and u = (1, a_1, a_2, ..., a_n). Then v is an affine combination of v_1, v_2, ..., v_m if and only if u is a linear combination of u_1, u_2, ..., u_m.
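Proposition 1.7 turns affine dependence into an ordinary rank computation on the lifted vectors u_j = (1, v_j). Before the proof, here is a quick numerical illustration of part (1) (NumPy assumed; the sample points are arbitrary):

```python
import numpy as np

def affinely_dependent(points):
    """Test affine dependence by lifting each v_j to u_j = (1, v_j) (Prop. 1.7)."""
    V = np.asarray(points, dtype=float)           # m x n, rows are the v_j
    U = np.hstack([np.ones((V.shape[0], 1)), V])  # m x (n+1), rows are the u_j
    return np.linalg.matrix_rank(U) < V.shape[0]  # dependent iff the u_j are

# Three collinear points in R^2 are affinely dependent; a triangle is not.
print(affinely_dependent([[0, 0], [1, 1], [2, 2]]))  # True
print(affinely_dependent([[0, 0], [1, 0], [0, 1]]))  # False
```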

Proof. For (1),

(1.14) u_1, u_2, ..., u_m are linearly dependent
⟺ there exist λ_1, λ_2, ..., λ_m ∈ R, not all zero, such that 0 = Σ_{j=1}^m λ_j u_j = Σ_{j=1}^m λ_j (1, a_1j, a_2j, ..., a_nj)
⟺ Σ_{j=1}^m λ_j = 0 and Σ_{j=1}^m λ_j v_j = Σ_{j=1}^m λ_j (a_1j, a_2j, ..., a_nj) = 0
⟺ v_1, v_2, ..., v_m are affinely dependent.

For (2),

(1.15) u is a linear combination of u_1, u_2, ..., u_m
⟺ there exist λ_1, λ_2, ..., λ_m ∈ R such that (1, a_1, a_2, ..., a_n) = u = Σ_{j=1}^m λ_j u_j = Σ_{j=1}^m λ_j (1, a_1j, a_2j, ..., a_nj)
⟺ Σ_{j=1}^m λ_j = 1 and Σ_{j=1}^m λ_j v_j = Σ_{j=1}^m λ_j (a_1j, a_2j, ..., a_nj) = (a_1, a_2, ..., a_n) = v
⟺ v is an affine combination of v_1, v_2, ..., v_m.

Corollary 1.8. Any m > n + 1 vectors in R^n are affinely dependent.

A set S is said to be affine if every affine combination of two points in S belongs to S, i.e., λx + (1 - λ)y ∈ S for any x, y ∈ S and λ ∈ R.

Proposition 1.9. A set S is affine if and only if every affine combination of points of S lies in S.

Proof. ⟸ is trivial. ⟹: We prove by induction on n that λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n ∈ S for any v_1, v_2, ..., v_n ∈ S and λ_1 + λ_2 + ... + λ_n = 1. This holds for n = 2 by definition. Suppose that it holds for all n < k and consider n = k > 2. At least one of λ_1, λ_2, ..., λ_k is not equal to 1. Without loss of generality, assume that

λ_k ≠ 1. Then λ_1 + λ_2 + ... + λ_{k-1} = 1 - λ_k ≠ 0. Therefore,

(1.16) λ_1 v_1 + λ_2 v_2 + ... + λ_{k-1} v_{k-1} + λ_k v_k
= (λ_1 + λ_2 + ... + λ_{k-1}) ( λ_1/(λ_1 + ... + λ_{k-1}) v_1 + λ_2/(λ_1 + ... + λ_{k-1}) v_2 + ... + λ_{k-1}/(λ_1 + ... + λ_{k-1}) v_{k-1} ) + λ_k v_k.

By the induction hypothesis,

(1.17) v = λ_1/(λ_1 + ... + λ_{k-1}) v_1 + λ_2/(λ_1 + ... + λ_{k-1}) v_2 + ... + λ_{k-1}/(λ_1 + ... + λ_{k-1}) v_{k-1} ∈ S,

since these k - 1 coefficients sum to 1. Therefore, λ_1 v_1 + λ_2 v_2 + ... + λ_k v_k = (λ_1 + ... + λ_{k-1})v + λ_k v_k ∈ S, being an affine combination of the two points v and v_k.

Theorem 1.10. S ⊆ V is affine if and only if S is an affine subspace.

Proof. ⟹: If S is empty, there is nothing to prove. Suppose that v ∈ S. Let W = S - v. We will show that W is a linear subspace of V; it then follows that S = v + W is an affine subspace. Let x, y ∈ S and λ ∈ R. Then

(1.18) (x - v) + λ(y - v) = (x + λy - λv) - v.

Note that x + λy - λv is an affine combination of x, y and v. Therefore, x + λy - λv ∈ S, so (x - v) + λ(y - v) ∈ W and W is a linear subspace by Proposition 1.4.

⟸: Suppose that S is an affine subspace. Let S = v + W, where W is a linear subspace of V. For x, y ∈ W and λ ∈ R,

(1.19) λ(v + x) + (1 - λ)(v + y) = v + (λx + (1 - λ)y) ∈ v + W = S.

Therefore, S is affine.

By the above theorem, affine subspaces and affine sets are the same things. Therefore, we will use these two terms interchangeably from now on.

The affine hull of a set S is the intersection of all the affine sets which contain S, denoted by aff(S). Alternatively, the affine hull of S can be defined as the smallest affine set that contains S, in the sense that aff(S) ⊆ T for any affine set T ⊇ S.

Proposition 1.11. The affine hull of S consists precisely of all affine combinations of points of S. That is,

(1.20) aff(S) = T := {λ_1 x_1 + λ_2 x_2 + ... + λ_m x_m : λ_1 + λ_2 + ... + λ_m = 1, x_1, x_2, ..., x_m ∈ S}.

Proof. By Proposition 1.9, λ_1 x_1 + λ_2 x_2 + ... + λ_m x_m ∈ aff(S) for all λ_1 + λ_2 + ... + λ_m = 1 and x_1, x_2, ..., x_m ∈ S. Therefore, T ⊆ aff(S).

On the other hand, for any λ ∈ R and any x = α_1 x_1 + α_2 x_2 + ... + α_m x_m and y = β_1 y_1 + β_2 y_2 + ... + β_n y_n ∈ T with α_1 + α_2 + ... + α_m = β_1 + β_2 + ... + β_n = 1 and x_1, ..., x_m, y_1, ..., y_n ∈ S,

(1.21) λx + (1 - λ)y = Σ_{i=1}^m λα_i x_i + Σ_{j=1}^n (1 - λ)β_j y_j.

Since

(1.22) Σ_{i=1}^m λα_i + Σ_{j=1}^n (1 - λ)β_j = λ + (1 - λ) = 1,

λx + (1 - λ)y is an affine combination of x_1, ..., x_m, y_1, ..., y_n. Therefore, λx + (1 - λ)y ∈ T and T is affine. It follows from the definition of aff(S) that aff(S) ⊆ T. Hence T = aff(S).

Using the concept of affine hull, we have a (rudimentary) notion of dimension for subsets of a vector space. The dimension of a set S ⊆ V is the dimension of its affine hull aff(S). Note that this is not a widely accepted notion of dimension. For example, take S = {p, q} consisting of two distinct points p and q. Then aff(S) is the line joining p and q, so dim(S) = dim(aff(S)) = 1. However, our intuition tells us that S should have dimension 0; a better notion of dimension should give dim(S) = 0. Nevertheless, this is the definition we will use throughout these notes.

1.2. Topology on R^n. The inner product ⟨u, v⟩ of u = (x_1, x_2, ..., x_n) and v = (y_1, y_2, ..., y_n) ∈ R^n is defined to be

(1.23) ⟨u, v⟩ = x_1 y_1 + x_2 y_2 + ... + x_n y_n.

Alternatively, we write ⟨u, v⟩ = u · v. It is easy to check the following.

Proposition 1.12. Let u, v, w ∈ R^n and λ ∈ R. Then
(1) ⟨u, u⟩ ≥ 0 and the equality holds if and only if u = 0;
(2) ⟨u, v⟩ = ⟨v, u⟩;
(3) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩;
(4) λ⟨u, v⟩ = ⟨λu, v⟩;
(5) ⟨u + v, u + v⟩ + ⟨u - v, u - v⟩ = 2⟨u, u⟩ + 2⟨v, v⟩.

For every v ∈ R^n, we define the norm of v, denoted by ‖v‖, to be ‖v‖ = √⟨v, v⟩.

Proposition 1.13. Let u, v ∈ R^n and λ ∈ R. Then
(1) ‖λv‖ = |λ| ‖v‖;
(2) |⟨u, v⟩| ≤ ‖u‖ ‖v‖;
(3) ‖u + v‖ ≤ ‖u‖ + ‖v‖, and the equality holds when u = λv for some λ ≥ 0.
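Before the proof, these identities and inequalities are easy to spot-check numerically; here is a minimal sketch (NumPy assumed; the random vectors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.standard_normal(5), rng.standard_normal(5)  # arbitrary vectors in R^5

# Cauchy-Schwarz, Prop 1.13 (2), and the triangle inequality, Prop 1.13 (3):
assert abs(np.dot(u, v)) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12

# The parallelogram law, Prop 1.12 (5):
lhs = np.dot(u + v, u + v) + np.dot(u - v, u - v)
assert np.isclose(lhs, 2 * np.dot(u, u) + 2 * np.dot(v, v))
print("Prop 1.12 (5) and Prop 1.13 (2), (3) verified on sample vectors")
```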

Proof. (1) is trivial. For (2), which is usually called the Cauchy-Schwarz inequality: since

(1.24) ⟨u + λv, u + λv⟩ = ‖v‖^2 λ^2 + 2⟨u, v⟩λ + ‖u‖^2 ≥ 0

for any u, v and λ ∈ R, the discriminant of this quadratic in λ satisfies

(1.25) (2⟨u, v⟩)^2 - 4 ‖u‖^2 ‖v‖^2 ≤ 0

and (2) follows. (3) follows from (2) directly.

For any two points p, q ∈ R^n, we define the distance between p and q as

(1.26) d(p, q) = ‖op - oq‖ = ‖p - q‖

where o is the origin. It is easy to check:

Proposition 1.14. Let x, y, z ∈ R^n. Then
(1) (Triangle Inequality) d(x, y) + d(y, z) ≥ d(x, z);
(2) d(x, y) ≥ 0 and the equality holds if and only if x = y.

d(·, ·) is called a metric on R^n and this makes R^n a metric space. It induces a topology on R^n. For x ∈ R^n and r > 0, we define

(1.27) B(x, r) = {y ∈ R^n : d(y, x) < r}

to be the open ball with center x and radius r. Let S ⊆ R^n. A point x ∈ S is an interior point of S if B(x, r) ⊆ S for some r > 0. The interior of S, denoted by Int(S), is the subset of S consisting of all interior points of S. S is open if S = Int(S), i.e., every point of S is an interior point of S.

Proposition 1.15.
(1) The union of any collection of open sets is open.
(2) The intersection of finitely many open sets is open.
(3) Int(S) is the union of all open subsets that are contained in S.

Proof. For (1), let {S_i}_{i∈I} be a collection of open sets. For any point x ∈ ∪_{i∈I} S_i, x ∈ S_α for some α ∈ I. Since S_α is open, there exists r > 0 such that B(x, r) ⊆ S_α and hence B(x, r) ⊆ ∪_{i∈I} S_i. Consequently, x is an interior point of ∪_{i∈I} S_i and ∪_{i∈I} S_i is open.

For (2), it suffices to prove that the intersection of two open sets is open. Let S_1 and S_2 be two open sets and let x ∈ S_1 ∩ S_2. Since S_1 and S_2 are open, there exist r_1, r_2 > 0 such that B(x, r_1) ⊆ S_1 and B(x, r_2) ⊆ S_2. Let r = min(r_1, r_2). Then B(x, r) ⊆ S_1 ∩ S_2 and x is an interior point of S_1 ∩ S_2. Hence S_1 ∩ S_2 is open.

For (3), it is obvious that Int(S) is an open subset contained in S. It suffices to show that every open set T ⊆ S is contained in Int(S). Let x ∈ T. Since T is open, there exists r > 0 such that B(x, r) ⊆ T ⊆ S. So x is an interior point of S and x ∈ Int(S). Therefore, T ⊆ Int(S).

A set S is called closed if the complement of S is open. The closure of S, denoted by cl(S), is the intersection of all closed sets that contain S.

Proposition 1.16.
(1) The intersection of any collection of closed sets is closed.
(2) The union of finitely many closed sets is closed.
(3) x ∈ cl(S) if and only if B(x, r) ∩ S ≠ ∅ for all r > 0.
(4) cl(S^c) = Int(S)^c.

Proof. (1) and (2) follow directly from Proposition 1.15.

For (3), let x ∈ cl(S). If B(x, r) ∩ S = ∅ for some r > 0, then B(x, r)^c is a closed set containing S, and hence x ∈ B(x, r)^c by the definition of cl(S); but x ∈ B(x, r), a contradiction. Therefore, B(x, r) ∩ S ≠ ∅ for all r > 0. On the other hand, let x be a point with the property that B(x, r) ∩ S ≠ ∅ for all r > 0. To show that x ∈ cl(S), it suffices to show that x ∈ T for every closed set T ⊇ S. Suppose that x ∉ T. Then x ∈ T^c. Since T^c is open, B(x, r) ⊆ T^c for some r > 0. Hence B(x, r) ∩ T = ∅ and B(x, r) ∩ S = ∅. Contradiction.

For (4), first notice that S^c ⊆ Int(S)^c. Since Int(S)^c is closed, cl(S^c) ⊆ Int(S)^c. Conversely, let x ∈ Int(S)^c. If x ∉ cl(S^c), then B(x, r) ∩ S^c = ∅ for some r > 0. Consequently, B(x, r) ⊆ S and x ∈ Int(S). Contradiction. Therefore, x ∈ cl(S^c) and (4) follows.

The boundary of S, denoted by bd(S), is the difference cl(S) \ Int(S). It is not hard to see that x ∈ bd(S) if and only if B(x, r) ∩ S ≠ ∅ and B(x, r) ∩ S^c ≠ ∅ for every r > 0.

Let S be a subset of R^n. S is contained in its affine hull aff(S), which we may identify with R^k, where k = dim aff(S). The relative interior of S, denoted by relint(S), is the interior of S regarded as a subset of R^k.

A map f : R^n → R^m is called continuous if f^{-1}(U) is open for any open set U ⊆ R^m. A set S ⊆ R^n is bounded if S ⊆ B(o, R) for some R > 0. A set S ⊆ R^n is compact if it is closed and bounded.

Theorem 1.17. The image of a compact set under a continuous map is compact.

We will not prove this theorem here. Interested students may check any standard real analysis book for a proof.

Theorem 1.18 (Extreme Value Theorem). Let S be a compact set and f : S → R be a continuous function on S. Then f achieves its maximum and minimum on S.

Again, one may find a proof of this theorem in any standard real analysis book.

A set S is connected if there do not exist two open sets A and B such that A ∩ S ≠ ∅, B ∩ S ≠ ∅, A ∩ B = ∅ and S ⊆ A ∪ B.

2. Convex Sets

2.1. Definitions. Let x, y ∈ R^n. We use the notation xy to denote the line segment

(2.1) xy = {αx + βy : α, β ≥ 0, α + β = 1}.

A set S ⊆ R^n is convex if xy ⊆ S for any x, y ∈ S.

We call a vector x a convex combination of x_1, x_2, ..., x_m if there exist λ_1, λ_2, ..., λ_m ≥ 0 such that λ_1 + λ_2 + ... + λ_m = 1 and x = λ_1 x_1 + λ_2 x_2 + ... + λ_m x_m. Obviously, xy is the set consisting of all convex combinations of the two points x and y. So S is convex if it contains all convex combinations of any two points in S. Actually, this is true for any finitely many points in S. That is, we have

Proposition 2.1. S is convex if and only if it contains all convex combinations of any finitely many points in S.

The proof of the above proposition goes exactly like that of Proposition 1.9. We will leave the proofs of the following two statements as exercises.

Proposition 2.2. The intersection of any collection of convex sets is convex.

Proposition 2.3. Let f : V → W be a linear transformation between two real vector spaces V and W. Then f(S) is convex for any convex set S ⊆ V and f^{-1}(T) is convex for any convex set T ⊆ W.

The convex hull of a set S, denoted by conv(S), is the intersection of all convex sets which contain S.

Proposition 2.4. The convex hull of S consists precisely of all convex combinations of points of S. That is,

(2.2) conv(S) = {λ_1 x_1 + λ_2 x_2 + ... + λ_m x_m : λ_i ≥ 0, Σ_{i=1}^m λ_i = 1, x_i ∈ S for i = 1, 2, ..., m}.

The proof of the above proposition goes exactly like that of Proposition 1.11, which we will not repeat here.
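Proposition 2.4 invites a direct numerical experiment: sample random convex combinations of a finite point set and check that every one of them lands in the hull. A sketch (NumPy and SciPy assumed; the point set is an arbitrary choice, and the Delaunay triangulation is used only as a membership test for the hull):

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(1)
pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0], [0.5, 0.5]])  # points in R^2
tri = Delaunay(pts)   # triangulates conv(pts); find_simplex >= 0 means "inside"

for _ in range(1000):
    lam = rng.random(len(pts))
    lam /= lam.sum()                 # lam_i >= 0 and sum lam_i = 1
    x = lam @ pts                    # a convex combination of the points
    assert tri.find_simplex(x) >= 0  # x lies in conv(pts)
print("all sampled convex combinations lie in the hull")
```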

Next we will show that the interior and the closure of a convex set are still convex.

Proposition 2.5. Let S be a convex set. If x ∈ Int(S) and y ∈ S, then relint xy ⊆ Int(S). Consequently, Int(S) is convex.

Proof. Note that

(2.3) relint xy = {αx + βy : α, β > 0, α + β = 1}.

Let z = αx + βy ∈ relint xy. Since x ∈ Int(S), B(x, r) ⊆ S for some r > 0. Since S is convex, αB(x, r) + βy ⊆ S. Note that

(2.4) αB(x, r) + βy = B(αx, αr) + βy = B(αx + βy, αr) = B(z, αr).

Therefore, B(z, αr) ⊆ S, so z ∈ Int(S) and relint xy ⊆ Int(S).

Proposition 2.6. Let S be a convex set. Then cl(S) is convex.

Proof. Let x, y ∈ cl(S). To show cl(S) is convex, it suffices to show αx + βy ∈ cl(S) for all α, β ≥ 0 with α + β = 1. Since x ∈ cl(S), there exists a sequence {x_n} ⊆ S such that d(x, x_n) → 0 as n → ∞. Similarly, there exists a sequence {y_n} ⊆ S such that d(y, y_n) → 0 as n → ∞. To show that z = αx + βy ∈ cl(S), it suffices to show that there exists a sequence {z_n} ⊆ S such that d(z, z_n) → 0 as n → ∞. Let z_n = αx_n + βy_n. Obviously, z_n ∈ S since x_n, y_n ∈ S and S is convex. Then

(2.5) d(z, z_n) = ‖z - z_n‖ = ‖α(x - x_n) + β(y - y_n)‖ ≤ α‖x - x_n‖ + β‖y - y_n‖ = αd(x, x_n) + βd(y, y_n).

Therefore, lim_{n→∞} d(z, z_n) = 0 because lim_{n→∞} d(x, x_n) = lim_{n→∞} d(y, y_n) = 0.

Proposition 2.7. If S is an open set, then conv(S) is also open.

Proof. It suffices to show that conv(S) ⊆ Int(conv(S)). Since S ⊆ conv(S), Int(S) ⊆ Int(conv(S)); and since S is open, S = Int(S) ⊆ Int(conv(S)). By Proposition 2.5, Int(conv(S)) is convex. Therefore, conv(S) ⊆ Int(conv(S)), since conv(S) is the smallest convex set containing S.

It is natural to expect that the convex hull of a closed set is also closed. However, this is not true. For example, let f : R → R be an arbitrary positive continuous function satisfying

(2.6) lim_{x→-∞} f(x) = lim_{x→+∞} f(x) = 0

and let S = {(x, y) : y ≥ f(x)}. Since f(x) is continuous, S is closed, but we claim that conv(S) = {(x, y) : y > 0}, which is not closed.

Let r = (x_0, y_0) ∈ R^2 with y_0 > 0. Since f(x) → 0 as x → -∞, there exists x_1 < x_0 such that f(x_1) < y_0. Similarly, there exists x_2 > x_0 such that f(x_2) < y_0. Let p = (x_1, y_0) and q = (x_2, y_0). Since y_0 > f(x_i) for i = 1, 2, we have p, q ∈ S. So pq ⊆ conv(S). Obviously, r ∈ pq and therefore r ∈ conv(S). Hence H = {(x, y) : y > 0} ⊆ conv(S). On the other hand, S ⊆ H and H is convex. Therefore, H = conv(S).

Therefore, the convex hull of a closed set S ⊆ R^n is not necessarily closed. However, if we further assume that S is bounded, i.e., that S is compact, then conv(S) is compact. The proof of this fact requires Carathéodory's theorem.

2.2. Carathéodory's Theorem.

Theorem 2.8 (Carathéodory). If S is a nonempty subset of R^n, then every x in conv(S) can be expressed as a convex combination of n + 1 or fewer points of S.

Proof. Let m be the smallest number such that x is a convex combination of m points of S. Then there exist x_1, x_2, ..., x_m ∈ S and λ_1, λ_2, ..., λ_m ≥ 0 such that

(2.7) Σ_{i=1}^m λ_i = 1 and x = Σ_{i=1}^m λ_i x_i.

Due to our choice of m, λ_i > 0 for all i = 1, 2, ..., m. Suppose that m > n + 1. By Corollary 1.8, x_1, x_2, ..., x_m are affinely dependent, i.e., there exist γ_1, γ_2, ..., γ_m, not all zero, such that

(2.8) Σ_{i=1}^m γ_i = 0 and Σ_{i=1}^m γ_i x_i = 0.

Let I = {i : γ_i > 0} ⊆ {1, 2, ..., m}. Since γ_1, γ_2, ..., γ_m are not all zero and sum to zero, I ≠ ∅. Let

(2.9) a = min_{i∈I} λ_i / γ_i

and let α_i = λ_i - aγ_i. If i ∈ I, then α_i ≥ 0 since a ≤ λ_i / γ_i; if i ∉ I, then γ_i ≤ 0 and hence α_i ≥ 0. Therefore, α_i ≥ 0 for all i = 1, 2, ..., m. By (2.8),

(2.10) Σ_{i=1}^m α_i = Σ_{i=1}^m λ_i = 1 and Σ_{i=1}^m α_i x_i = Σ_{i=1}^m λ_i x_i = x.

Due to our choice of a, there is at least one i ∈ I such that a = λ_i / γ_i; correspondingly, α_i = 0. Therefore, by (2.10), x is a convex combination of the m - 1 points x_1, x_2, ..., x̂_i, ..., x_m. Contradiction.

Proposition 2.9. If S ⊆ R^n is a compact set, then conv(S) is also compact.

Proof. By Carathéodory's theorem, every point x ∈ conv(S) is a convex combination of some n + 1 points x_1, x_2, ..., x_{n+1} in S. Let f : (R^n)^{n+1} × R^{n+1} → R^n be the map

(2.11) f(x_1, x_2, ..., x_{n+1}, λ_1, λ_2, ..., λ_{n+1}) = λ_1 x_1 + λ_2 x_2 + ... + λ_{n+1} x_{n+1}

where x_i ∈ R^n and λ_i ∈ R. Let

(2.12) D = {(λ_1, λ_2, ..., λ_{n+1}) : λ_1, λ_2, ..., λ_{n+1} ≥ 0, λ_1 + λ_2 + ... + λ_{n+1} = 1}.

Then f is continuous and D is compact. By Carathéodory's theorem, f(S^{n+1} × D) = conv(S). Since S^{n+1} × D is compact, conv(S) is compact by Theorem 1.17.
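The proof of Theorem 2.8 is constructive and translates directly into an algorithm: while more than n + 1 points carry positive weight, find an affine dependence γ, shift the weights by a·γ as in (2.9), and drop a point whose weight reaches zero. A sketch (NumPy assumed; the tolerance and the sample data are arbitrary choices):

```python
import numpy as np

def caratheodory_reduce(points, weights, tol=1e-12):
    """Rewrite x = sum(weights[i] * points[i]) using at most n+1 points (Thm 2.8)."""
    P = np.asarray(points, dtype=float)   # m x n, rows are the points
    lam = np.asarray(weights, dtype=float)
    n = P.shape[1]
    while True:
        keep = lam > tol                  # discard points with zero weight
        P, lam = P[keep], lam[keep]
        m = len(lam)
        if m <= n + 1:
            return P, lam
        # Affine dependence gamma: sum gamma_i = 0 and sum gamma_i x_i = 0,
        # i.e., a null vector of the lifted (n+1) x m matrix (cf. Prop. 1.7).
        U = np.vstack([np.ones(m), P.T])
        gamma = np.linalg.svd(U)[2][-1]   # U @ gamma ~ 0 and gamma != 0
        if gamma.max() <= 0:
            gamma = -gamma                # ensure I = {i : gamma_i > 0} is nonempty
        pos = gamma > tol
        a = np.min(lam[pos] / gamma[pos])  # this is (2.9)
        lam = lam - a * gamma              # new weights: >= 0, sum to 1, same x

# Example: a combination of 6 points in R^2 is reduced to at most 3 points.
P = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.2], [0.3, 0.7]], float)
lam = np.full(6, 1 / 6)
x = lam @ P
Q, mu = caratheodory_reduce(P, lam)
print(len(Q) <= 3, np.allclose(mu @ Q, x), np.isclose(mu.sum(), 1.0))
# True True True
```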

Proposition 2.10. Let S ⊆ R^n be a convex set. If S ≠ ∅, then relint(S) ≠ ∅.

Proof. Consider S as a convex subset of aff(S), which we identify with R^k. We will show that Int(S) ≠ ∅, where the interior is taken in R^k, so that relint(S) ≠ ∅. Since aff(S) = R^k, there exist x_1, x_2, ..., x_{k+1} ∈ S which are affinely independent. Let D ⊆ R^{k+1} be the subset

(2.13) D = {(λ_1, λ_2, ..., λ_{k+1}) : λ_1 + λ_2 + ... + λ_{k+1} = 1}.

Let f : D → R^k be the map

(2.14) f(λ_1, λ_2, ..., λ_{k+1}) = λ_1 x_1 + λ_2 x_2 + ... + λ_{k+1} x_{k+1}.

f is obviously continuous, and it is onto since aff{x_1, x_2, ..., x_{k+1}} = R^k. We also claim that f is one-to-one. Otherwise, there exist (λ_1, λ_2, ..., λ_{k+1}) ≠ (λ'_1, λ'_2, ..., λ'_{k+1}) ∈ D such that f(λ_1, λ_2, ..., λ_{k+1}) = f(λ'_1, λ'_2, ..., λ'_{k+1}), which implies

(2.15) (λ_1 - λ'_1)x_1 + (λ_2 - λ'_2)x_2 + ... + (λ_{k+1} - λ'_{k+1})x_{k+1} = 0,

where the coefficients λ_i - λ'_i sum to 0 and are not all zero; hence x_1, x_2, ..., x_{k+1} are affinely dependent. Contradiction. Hence f is one-to-one.

Let g = f^{-1} : R^k → D be the inverse of f and write

(2.16) g(z_1, z_2, ..., z_k) = (g_1(z_1, z_2, ..., z_k), g_2(z_1, z_2, ..., z_k), ..., g_{k+1}(z_1, z_2, ..., z_k))

where the g_i are functions from R^k to R. More explicitly, let x_j = (a_1j, a_2j, ..., a_kj) and let A be the (k + 1) × (k + 1) matrix

(2.17)
A = [ a_11  a_12  ...  a_1,k+1 ]
    [ a_21  a_22  ...  a_2,k+1 ]
    [ ...   ...   ...  ...     ]
    [ a_k1  a_k2  ...  a_k,k+1 ]
    [ 1     1     ...  1       ]

which is nonsingular by Proposition 1.7, since x_1, x_2, ..., x_{k+1} are affinely independent. Then

(2.18) (g_1(z_1, ..., z_k), g_2(z_1, ..., z_k), ..., g_{k+1}(z_1, ..., z_k))^T = A^{-1} (z_1, z_2, ..., z_k, 1)^T.

Therefore, each g_i(z_1, z_2, ..., z_k) is a (nonhomogeneous) linear function in z_1, z_2, ..., z_k. Therefore, each g_i is continuous and g is continuous.

Let

(2.19) x_0 = (1/(k+1)) (x_1 + x_2 + ... + x_{k+1}) ∈ S.

Then

(2.20) y_0 = g(x_0) = (1/(k+1), 1/(k+1), ..., 1/(k+1)).

Choose an arbitrary r with 0 < r < 1/(k+1). Since g is continuous, U = g^{-1}(B(y_0, r) ∩ D) is an open set in R^k that contains the point x_0. Note that

(2.21) U = {λ_1 x_1 + λ_2 x_2 + ... + λ_{k+1} x_{k+1} : (λ_1, λ_2, ..., λ_{k+1}) ∈ B(y_0, r) ∩ D}.

Since 0 < r < 1/(k+1), we have λ_1, λ_2, ..., λ_{k+1} > 0 for every (λ_1, λ_2, ..., λ_{k+1}) ∈ B(y_0, r). Therefore, every point of U is a convex combination of x_1, x_2, ..., x_{k+1}, so U ⊆ S by Proposition 2.1, and x_0 ∈ Int(S).
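The map g in this proof is nothing but the computation of barycentric coordinates: one solves the (k+1) × (k+1) linear system (2.18). A sketch (NumPy assumed; the simplex vertices are arbitrary):

```python
import numpy as np

def barycentric(z, vertices):
    """Coordinates (g_1, ..., g_{k+1}) of z in R^k with respect to k+1
    affinely independent vertices, via the linear system (2.18)."""
    X = np.asarray(vertices, dtype=float).T       # k x (k+1), columns are x_j
    A = np.vstack([X, np.ones(X.shape[1])])       # the matrix (2.17)
    return np.linalg.solve(A, np.append(z, 1.0))  # solve A g = (z, 1)

verts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]      # a triangle in R^2 (k = 2)
x0 = np.mean(verts, axis=0)                       # the centroid, as in (2.19)
print(barycentric(x0, verts))                     # [1/3 1/3 1/3], cf. (2.20)
# z lies in the closed simplex iff all of its barycentric coordinates are >= 0.
```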

CHAPTER 2
Some Selected Topics in Convex Geometry

1. Helly's Theorem

Theorem 1.1. Let S_1, S_2, ..., S_m be m ≥ n + 1 convex sets in R^n. If every n + 1 sets among S_1, S_2, ..., S_m have nonempty intersection, then S_1 ∩ S_2 ∩ ... ∩ S_m ≠ ∅.

Proof. We prove it by induction on m. It is trivial for m = n + 1. Suppose that it holds for all m < l, where l > n + 1. We want to show that it holds for m = l. By the induction hypothesis, any m - 1 sets among S_1, S_2, ..., S_m have nonempty intersection. That is, if we let

(1.1) T_k = S_1 ∩ S_2 ∩ ... ∩ Ŝ_k ∩ ... ∩ S_m,

then T_k ≠ ∅ for every k = 1, 2, ..., m. Since T_k ≠ ∅, we choose (arbitrarily) a point x_k ∈ T_k. Since m ≥ n + 2, the points x_1, x_2, ..., x_m are affinely dependent by Corollary 1.8. That is, there exist λ_1, λ_2, ..., λ_m ∈ R, not all zero, such that

(1.2) λ_1 x_1 + λ_2 x_2 + ... + λ_m x_m = 0

and

(1.3) λ_1 + λ_2 + ... + λ_m = 0.

Obviously, we may rewrite (1.2) as

(1.4) Σ_{λ_i > 0} λ_i x_i = Σ_{λ_j ≤ 0} (-λ_j) x_j.

By (1.3),

(1.5) λ := Σ_{λ_i > 0} λ_i = Σ_{λ_j ≤ 0} (-λ_j).

Since λ_1, λ_2, ..., λ_m are not all zero, λ > 0. Hence

(1.6) x := Σ_{λ_i > 0} (λ_i / λ) x_i = Σ_{λ_j ≤ 0} (-λ_j / λ) x_j.

Therefore, x is a convex combination of {x_i : λ_i > 0} and it is also a convex combination of {x_j : λ_j ≤ 0}. Let I = {i : λ_i > 0} and J = {j : λ_j ≤ 0}. Obviously, I ∪ J = {1, 2, ..., m} and I ∩ J = ∅. Since x_i ∈ S_j for any i ≠ j,

we have x_i ∈ ∩_{j∈J} S_j for every i ∈ I. Since ∩_{j∈J} S_j is convex, x ∈ ∩_{j∈J} S_j. By the same argument, we see that x ∈ ∩_{i∈I} S_i. Therefore,

(1.7) x ∈ (∩_{i∈I} S_i) ∩ (∩_{j∈J} S_j) = ∩_{k=1}^m S_k.

Therefore, ∩_{k=1}^m S_k ≠ ∅.

Obviously, the number n + 1 in the theorem is optimal. It is easy to find, for example, three convex sets in R^2 such that any two of them have nonempty intersection but all three of them have empty intersection.

2. Introduction to Linear Programming

Let us consider the following question:

Question 2.1. Given points x, x_1, x_2, ..., x_m ∈ R^n, determine whether x ∈ conv{x_1, x_2, ..., x_m}.

Here we assume that dim aff{x_1, x_2, ..., x_m} = n; otherwise, we can restrict ourselves to R^k = aff{x_1, x_2, ..., x_m}.

The case m = n + 1 is easy to treat. Since x_1, x_2, ..., x_{n+1} are affinely independent, x can be written as an affine combination of x_1, x_2, ..., x_{n+1} in a unique way:

(2.1) x = λ_1 x_1 + λ_2 x_2 + ... + λ_{n+1} x_{n+1}

where

(2.2) λ_1 + λ_2 + ... + λ_{n+1} = 1.

Therefore, x ∈ conv{x_1, x_2, ..., x_{n+1}} if and only if λ_i ≥ 0 for all i = 1, 2, ..., n + 1.

However, the situation is more complicated when m > n + 1. We let x_i = (a_1i, a_2i, ..., a_ni) and x = (b_1, b_2, ..., b_n). Then (2.1) and (2.2) expand to

(2.3)
a_11 λ_1 + a_12 λ_2 + ... + a_1m λ_m = b_1
a_21 λ_1 + a_22 λ_2 + ... + a_2m λ_m = b_2
.........................................
a_n1 λ_1 + a_n2 λ_2 + ... + a_nm λ_m = b_n
λ_1 + λ_2 + ... + λ_m = 1

with λ_1, λ_2, ..., λ_m ≥ 0. Obviously, x ∈ conv{x_1, x_2, ..., x_m} if and only if (2.3) has a solution with λ_1, λ_2, ..., λ_m ≥ 0. Here we may assume b_i ≥ 0 for all i; otherwise, if b_i < 0, we replace the i-th equation by

(2.4) -a_i1 λ_1 - a_i2 λ_2 - ... - a_im λ_m = -b_i.

For reasons that will become clear later, we add n + 1 auxiliary variables λ_{m+1}, λ_{m+2}, ..., λ_{m+n}, λ_{m+n+1}. We consider the following problem:

Question 2.2. Minimize the function

(2.5) f(λ_1, λ_2, ..., λ_m, λ_{m+1}, λ_{m+2}, ..., λ_{m+n+1}) = λ_{m+1} + λ_{m+2} + ... + λ_{m+n+1}

with λ_i ≥ 0 (i = 1, 2, ..., m + n + 1) satisfying

(2.6)
Σ_{j=1}^m a_1j λ_j + λ_{m+1} = b_1
Σ_{j=1}^m a_2j λ_j + λ_{m+2} = b_2
.................................
Σ_{j=1}^m a_nj λ_j + λ_{m+n} = b_n
Σ_{j=1}^m λ_j + λ_{m+n+1} = 1

Lemma 2.3. x ∈ conv{x_1, x_2, ..., x_m} if and only if f_min = 0 in Question 2.2.

Question 2.2 is a typical problem in so-called Linear Programming (LP). One of the many algorithms to solve an LP problem is the Simplex Method. We will use the following example to demonstrate the algorithm.

Example 2.1. Let x_1 = (1, 0), x_2 = (0, 1), x_3 = (1, 1) and x_4 = (-2, -2). Determine whether x = (-1, -1) ∈ conv{x_1, x_2, x_3, x_4}.

First we need to formulate the problem as in Question 2.2. Obviously, x ∈ conv{x_1, x_2, x_3, x_4} if and only if the system of linear equations

(2.7)
λ_1 + λ_3 - 2λ_4 = -1
λ_2 + λ_3 - 2λ_4 = -1
λ_1 + λ_2 + λ_3 + λ_4 = 1

has a solution with λ_i ≥ 0 for i = 1, 2, 3, 4. It is a requirement of the simplex method that b_i ≥ 0. Therefore, we change (2.7) to the form

(2.8)
-λ_1 - λ_3 + 2λ_4 = 1
-λ_2 - λ_3 + 2λ_4 = 1
λ_1 + λ_2 + λ_3 + λ_4 = 1

By adding three auxiliary variables λ_5, λ_6, λ_7, we turn it into an LP problem:

(*) Minimize f(λ_1, ..., λ_7) = λ_5 + λ_6 + λ_7 with λ_i ≥ 0 satisfying

(2.9)
-λ_1 - λ_3 + 2λ_4 + λ_5 = 1
-λ_2 - λ_3 + 2λ_4 + λ_6 = 1
λ_1 + λ_2 + λ_3 + λ_4 + λ_7 = 1

In order to apply the simplex method to (*), we first need to express the objective function f in terms of λ_1, λ_2, λ_3, λ_4 (using (2.9)):

(2.10) f(λ_1, ..., λ_7) = λ_3 - 5λ_4 + 3.

We put the coefficients of (2.9) and (2.10) into a table called a simplex tableau:

(2.11)
  λ_1  λ_2  λ_3  λ_4  λ_5  λ_6  λ_7 |
  -1    0   -1    2    1    0    0  | 1
   0   -1   -1    2    0    1    0  | 1
   1    1    1    1    0    0    1  | 1
  ----------------------------------|----
   0    0    1   -5    0    0    0  | 3

Note that the last row (c_1, c_2, ..., c_{m+n+1} | A) corresponds to the objective function

(2.12) f(λ_1, λ_2, ..., λ_{m+n+1}) = c_1 λ_1 + c_2 λ_2 + ... + c_{m+n+1} λ_{m+n+1} + A.

The algorithm goes as follows:

(1) If all the entries of the last row except possibly the rightmost one are nonnegative, i.e., c_j ≥ 0 for all j, stop: we are done. The minimum of f is given by A, the rightmost entry of the last row (since f = c_1 λ_1 + ... + c_{m+n+1} λ_{m+n+1} + A ≥ A when all c_j ≥ 0 and all λ_j ≥ 0). And x ∈ conv{x_1, x_2, ..., x_m} if and only if A = 0, i.e., the rightmost entry of the last row vanishes.

(2) If c_j < 0 for some j, we look at the j-th column. We choose an entry a_kj (called the pivoting term) such that a_kj > 0 and

(2.13) b_k / a_kj = min { b_i / a_ij : a_ij > 0 }.

Then we use a_kj to eliminate all the other entries of the j-th column by row operations. (For the last row, this amounts to substituting λ_j, solved from the pivot row, into the expression for f, so that the identity f = c_1 λ_1 + ... + c_{m+n+1} λ_{m+n+1} + A remains valid.)

(3) Repeat step (2) until (1) happens.
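In practice, one would simply hand Question 2.2 to an LP solver. The following sketch (SciPy assumed) settles Example 2.1 directly; since scipy.optimize.linprog runs its own phase-1 feasibility test, we can even skip the auxiliary variables and minimize the zero objective over the constraints (2.3):

```python
import numpy as np
from scipy.optimize import linprog

# Example 2.1: is x in conv{x_1, x_2, x_3, x_4}?
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-2.0, -2.0]])  # rows x_1..x_4
x = np.array([-1.0, -1.0])
m = len(X)

# Constraints (2.3): sum_j lam_j x_j = x and sum_j lam_j = 1, with lam_j >= 0.
A_eq = np.vstack([X.T, np.ones(m)])
b_eq = np.append(x, 1.0)

res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.status == 0)  # True -> feasible -> x lies in the convex hull
print(res.x)            # one valid weight vector, e.g. [0.2, 0.2, 0, 0.6];
                        # the weights are not unique in this example
```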

Now let's carry it out for our example (the pivoting term of each step is marked with brackets):

(2.14)
  -1    0   -1   [2]   1    0    0  | 1
   0   -1   -1    2    0    1    0  | 1
   1    1    1    1    0    0    1  | 1
  -----------------------------------
   0    0    1   -5    0    0    0  | 3

  -1    0   -1    2    1    0    0  | 1
  [1]  -1    0    0   -1    1    0  | 0
  3/2   1   3/2   0  -1/2   0    1  | 1/2
  -----------------------------------
 -5/2   0  -3/2   0   5/2   0    0  | 1/2

   0   -1   -1    2    0    1    0  | 1
   1   -1    0    0   -1    1    0  | 0
   0  [5/2] 3/2   0    1  -3/2   1  | 1/2
  -----------------------------------
   0  -5/2 -3/2   0    0   5/2   0  | 1/2

   0    0  -2/5   2   2/5  2/5  2/5 | 6/5
   1    0   3/5   0  -3/5  2/5  2/5 | 1/5
   0   5/2  3/2   0    1  -3/2   1  | 1/2
  -----------------------------------
   0    0    0    0    1    1    1  | 0

Therefore, f_min = 0 and x ∈ conv{x_1, x_2, x_3, x_4}. In addition, the final tableau also tells us how to write x as a convex combination of x_1, x_2, x_3, x_4. Note that if we let λ_3 = λ_5 = λ_6 = λ_7 = 0, then the three constraint rows read 2λ_4 = 6/5, λ_1 = 1/5 and (5/2)λ_2 = 1/2, i.e., λ_4 = 3/5, λ_1 = 1/5 and λ_2 = 1/5. Consequently, x = (1/5)x_1 + (1/5)x_2 + (3/5)x_4.

3. Convex Functions

Given a function f(x), we consider the region S = {(x, y) : y ≥ f(x)} ⊆ R^2. There is a very useful criterion for the convexity of S.

Theorem 3.1. Suppose that f(x) is twice differentiable and f''(x) ≥ 0 for all a ≤ x ≤ b. Then the region S = {(x, y) : y ≥ f(x), a ≤ x ≤ b} is convex.

An easy corollary of the above theorem is the following:

Corollary 3.2. Suppose that f(x) is twice differentiable and f''(x) ≤ 0 for all a ≤ x ≤ b. Then the region S = {(x, y) : y ≤ f(x), a ≤ x ≤ b} is convex.

Proof. Since f''(x) ≤ 0, (-f)''(x) ≥ 0 for a ≤ x ≤ b. Then S' = {(x, y) : y ≥ -f(x), a ≤ x ≤ b} is convex by the above theorem. Geometrically, S is the reflection of S' with respect to the x-axis. More precisely, let g : R^2 → R^2 be the linear map given by g(x, y) = (x, -y). Then S = g(S'). Since the images of convex sets under linear maps are convex (Proposition 2.3), S is convex.

The proof of Theorem 3.1 is elementary. It uses nothing more than the Mean Value Theorem (MVT). Let p = (x_p, y_p) and q = (x_q, y_q) be two points in S. WLOG, assume that x_p < x_q. Since p, q ∈ S, y_p ≥ f(x_p) and y_q ≥ f(x_q). Suppose that pq ⊄ S. Then there exists a point r = (x_r, y_r) ∈ pq such that r ∉ S, i.e., y_r < f(x_r). By the MVT, there exists x_u ∈ (x_p, x_r) such that

(3.1) f'(x_u) = (f(x_r) - f(x_p)) / (x_r - x_p)

and there exists x_v ∈ (x_r, x_q) such that

(3.2) f'(x_v) = (f(x_q) - f(x_r)) / (x_q - x_r).

Since y_r < f(x_r) and y_p ≥ f(x_p),

(3.3) (f(x_r) - f(x_p)) / (x_r - x_p) > (y_r - y_p) / (x_r - x_p).

By the same reason,

(3.4) (f(x_q) - f(x_r)) / (x_q - x_r) < (y_q - y_r) / (x_q - x_r).

Since p, q, r are collinear,

(3.5) (y_r - y_p) / (x_r - x_p) = (y_q - y_r) / (x_q - x_r).

Therefore,

(3.6) (f(x_r) - f(x_p)) / (x_r - x_p) > (f(x_q) - f(x_r)) / (x_q - x_r).

That is, f'(x_u) > f'(x_v) although x_u < x_v. This contradicts the fact that f''(x) ≥ 0, i.e., that f'(x) is nondecreasing on [a, b].
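Theorem 3.1 can also be sanity-checked numerically: convexity of S is equivalent to every chord of the graph lying on or above the graph. A minimal sketch (NumPy assumed; f(x) = x^2 on [-1, 1] is an arbitrary example with f'' = 2 ≥ 0):

```python
import numpy as np

f = lambda x: x**2            # f''(x) = 2 >= 0 on [-1, 1]
a, b = -1.0, 1.0
rng = np.random.default_rng(2)

for _ in range(1000):
    s, t = rng.uniform(a, b, 2)       # endpoints p = (s, f(s)), q = (t, f(t))
    for alpha in np.linspace(0.0, 1.0, 11):
        xm = alpha * s + (1 - alpha) * t
        chord = alpha * f(s) + (1 - alpha) * f(t)  # y-value on the segment pq
        assert f(xm) <= chord + 1e-12              # the segment stays inside S
print("all sampled chords of the graph lie in S")
```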

Theorem 3.1 can be generalized to dimensions greater than 2. We state the following without proof.

Theorem 3.3. Let f(x_1, x_2, ..., x_n) be a twice differentiable function in n variables. Suppose that the n × n matrix (called the Hessian of f)

(3.7) H = [ ∂²f / ∂x_i ∂x_j ]_{n×n}

is positive definite for every (x_1, x_2, ..., x_n) in a convex set D ⊆ R^n. Then the region

(3.8) S = {(x_1, x_2, ..., x_{n+1}) : x_{n+1} ≥ f(x_1, x_2, ..., x_n), (x_1, x_2, ..., x_n) ∈ D}

in R^{n+1} is convex.
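Positive definiteness of the Hessian is easy to test numerically, for instance via eigenvalues or a Cholesky factorization. A sketch (NumPy assumed; f(x, y) = x^2 + xy + y^2 is an arbitrary example whose Hessian is constant):

```python
import numpy as np

# f(x, y) = x^2 + x*y + y^2 has the constant Hessian below.
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # [[f_xx, f_xy], [f_yx, f_yy]]

# A symmetric matrix is positive definite iff all its eigenvalues are > 0 ...
print(np.all(np.linalg.eigvalsh(H) > 0))   # True: eigenvalues are 1 and 3

# ... equivalently, iff it admits a Cholesky factorization.
try:
    np.linalg.cholesky(H)
    print("H is positive definite, so the region above the graph is convex")
except np.linalg.LinAlgError:
    print("H is not positive definite")
```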
