Division of the Humanities and Social Sciences
Ec 181, KC Border
Convex Analysis and Economic Theory, AY 2018–2019
KC Border: for Ec 181, 2018–2019 src: ConvexFunctions1 v. 2018.11.02::16.24

Topic 6: Convex functions I

6.1 Elementary properties of convex functions

We may occasionally want to rearrange the definition of convexity to read as follows:
\[ f\bigl(x + \lambda(y - x)\bigr) \le f(x) + \lambda\bigl[f(y) - f(x)\bigr]. \]
Or, letting $v = y - x$, we may write
\[ f(x + \lambda v) \le f(x) + \lambda\bigl[f(x + v) - f(x)\bigr]. \]

We start with the following simple, but fundamental, result for convex functions of one variable, cf. Fenchel [2, 2.16, p. 69], Phelps [4, Theorem 1.16, pp. 9–11], or Royden [7, Proposition 5.17, p. 113]. Its proof relies on the following trivial identity, but I shall put a box around it because I have trouble remembering it.
\[ \boxed{\; x < y < z \implies y = \frac{z - y}{z - x}\,x + \frac{y - x}{z - x}\,z = (1 - \lambda)x + \lambda z \quad\text{where } \lambda = \frac{y - x}{z - x} > 0 \text{ and } 1 - \lambda = \frac{z - y}{z - x} > 0. \;} \]

The next result is very simple, but is the key to identifying convex and concave functions.

6.1.1 Theorem (Monotonicity of slopes) Given a function $f \colon I \to \mathbb{R}$ on an interval $I$ in $\mathbb{R}$, for $x, y \in I$ with $x < y$ define
\[ S(x, y) = \frac{f(y) - f(x)}{y - x}, \]
the slope of the chord joining the points $(x, f(x))$ and $(y, f(y))$. Then statements (1)–(5) are equivalent.

1. The function $f$ is convex.

2. $S(x, y)$ is nondecreasing in $x$ and in $y$. That is, if $x < y < z$, then $S(x, y) \le S(x, z) \le S(y, z)$.

[Figure 6.1.1. Slopes: the chords with slopes $S(x, y)$, $S(x, z)$, and $S(y, z)$ for $x < y < z$.]
3. $S(x, y)$ is nondecreasing in $x$. That is, if $x < x' < y$, then $S(x, y) \le S(x', y)$.

4. $S(x, y)$ is nondecreasing in $y$. That is, if $x < y < y'$, then $S(x, y) \le S(x, y')$.

5. If $x < y < z$, then $S(x, y) \le S(y, z)$.

In addition, statements (1')–(5') are equivalent.

1'. The function $f$ is strictly convex.

2'. $S(x, y)$ is strictly increasing in $x$ and in $y$. That is, if $x < y < z$, then $S(x, y) < S(x, z) < S(y, z)$.

3'. $S(x, y)$ is strictly increasing in $x$. That is, if $x < x' < y$, then $S(x, y) < S(x', y)$.

4'. $S(x, y)$ is strictly increasing in $y$. That is, if $x < y < y'$, then $S(x, y) < S(x, y')$.

5'. If $x < y < z$, then $S(x, y) < S(y, z)$.

6.1.2 Exercise Prove Theorem 6.1.1.

Sample answer: $(1) \implies (2)$ Let $x < y < z$ belong to $I$. Then $y$ is the convex combination $\frac{z - y}{z - x}\,x + \frac{y - x}{z - x}\,z$, so the convexity of $f$ implies
\[ f(y) \le \frac{z - y}{z - x}\,f(x) + \frac{y - x}{z - x}\,f(z). \tag{1} \]
Subtract $f(x)$ from both sides of (1) to get
\[ f(y) - f(x) \le \frac{x - y}{z - x}\,f(x) + \frac{y - x}{z - x}\,f(z) = \frac{y - x}{z - x}\bigl[f(z) - f(x)\bigr], \]
and dividing by the positive quantity $y - x$ gives $S(x, y) \le S(x, z)$.
Subtracting $f(z)$ from both sides of (1) gives
\[ f(y) - f(z) \le \frac{z - y}{z - x}\,f(x) + \frac{y - z}{z - x}\,f(z) = \frac{y - z}{z - x}\bigl[f(z) - f(x)\bigr], \]
and dividing by the negative quantity $y - z$ gives $S(y, z) \ge S(x, z)$. Combining everything gives $S(x, y) \le S(x, z) \le S(y, z)$.

Clearly $(2) \implies (3)$ and $(2) \implies (4)$. It is also clear that $(3) \implies (5)$ and $(4) \implies (5)$, since for $x < y < z$ the slope $S(x, z)$ is a convex combination of $S(x, y)$ and $S(y, z)$ and so lies between them. I will now show that $(5) \implies (1)$.

$(5) \implies (1)$ Let $x, z \in I$ and let $0 < \lambda < 1$. Without loss of generality we may take $x < z$. Let $y = (1 - \lambda)x + \lambda z$. Then
\[ \lambda = \frac{y - x}{z - x} \quad\text{and}\quad 1 - \lambda = \frac{z - y}{z - x}. \]
Clearly $x < y < z$, so by (5),
\begin{align*}
S(x, y) &\le S(y, z), \\
\frac{f(y) - f(x)}{y - x} &\le \frac{f(z) - f(y)}{z - y}, \\
(z - y)\bigl[f(y) - f(x)\bigr] &\le (y - x)\bigl[f(z) - f(y)\bigr], \\
(z - x)f(y) &\le (z - y)f(x) + (y - x)f(z), \\
f(y) &\le \frac{z - y}{z - x}\,f(x) + \frac{y - x}{z - x}\,f(z), \\
f\bigl((1 - \lambda)x + \lambda z\bigr) &\le (1 - \lambda)f(x) + \lambda f(z).
\end{align*}
That is, $f$ is convex.

The case of strict convexity is similar.

The next lemma is a key to proving results about the continuity of convex functions. It provides a bound on the change in a convex function $f$ for changes in a given direction. Since the graph of $f$ lies below the chord joining $(x, f(x))$ and $(x + v, f(x + v))$, it follows by similar triangles that
\[ f(x + \delta v) - f(x) \le \delta\bigl[f(x + v) - f(x)\bigr]. \]
Figure 6.1.2 shows a nice example. But when $f(x + \delta v) - f(x)$ is negative, as in Figure 6.1.3, this does not help us get a bound on the absolute value of $f(x + \delta v) - f(x)$. But all is not lost. If $f(x + \delta v) - f(x) < 0$, then we have $f(x - \delta v) - f(x) \ge 0$. (Since $x$ is a convex combination of $x - \delta v$ and $x + \delta v$ and $f$ is a convex function, the value $f(x)$ cannot be greater than both $f(x + \delta v)$ and $f(x - \delta v)$.) Then $f(x - \delta v) - f(x)$ provides a bound on the absolute value of $f(x + \delta v) - f(x)$, and $\delta\bigl[f(x - v) - f(x)\bigr]$ bounds $f(x - \delta v) - f(x)$. See Figure 6.1.4.
[Figure 6.1.2: the graph of $f$ over $x$, $x + \delta v$, $x + v$, with $0 \le f(x + \delta v) - f(x) \le \delta\bigl[f(x + v) - f(x)\bigr]$.]

[Figure 6.1.3: the graph of $f$ over $x$, $x + \delta v$, $x + v$, with $f(x + \delta v) - f(x) < 0$.]

[Figure 6.1.4. $|f(x + \delta v) - f(x)| \le \delta \max\{f(x + v) - f(x),\; f(x - v) - f(x)\}$.]
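As a numerical sanity check of the bound pictured in Figure 6.1.4, here is a minimal sketch. The function $f(t) = (t - 1)^2$, the point $x = 0$, the direction $v = 1$, and the helper name `lemma_bound` are illustrative assumptions, not part of the notes. The example is chosen so that $f(x + \delta v) - f(x)$ is negative, which is the case of Figure 6.1.3 where the chord toward $x + v$ alone gives no control on the absolute value.

```python
def f(t):
    """A convex function of one variable (illustrative choice)."""
    return (t - 1.0) ** 2


def lemma_bound(f, x, v, delta):
    """delta * max{f(x+v) - f(x), f(x-v) - f(x)}, the bound in Figure 6.1.4."""
    return delta * max(f(x + v) - f(x), f(x - v) - f(x))


x, v = 0.0, 1.0
deltas = [k / 10 for k in range(11)]  # delta = 0.0, 0.1, ..., 1.0

# Absolute change of f along the direction v, and the corresponding bound.
changes = [abs(f(x + d * v) - f(x)) for d in deltas]
bounds = [lemma_bound(f, x, v, d) for d in deltas]

# The small tolerance only absorbs floating-point rounding.
bound_holds = all(c <= b + 1e-12 for c, b in zip(changes, bounds))
print(bound_holds)
```

Here $f(x + v) - f(x) = -1$ while $f(x - v) - f(x) = 3$, so the maximum is attained in the direction $-v$, exactly the situation the discussion above describes.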
6.1.3 Lemma Let $C$ be a convex subset of a vector space $X$ and let $f \colon C \to \mathbb{R}$ be a convex function. Let $x$ belong to $C$ and suppose $v$ satisfies $[x - v, x + v] \subset C$. Let $\delta \in [0, 1]$. Then
\[ |f(x + \delta v) - f(x)| \le \delta \max\{f(x + v) - f(x),\; f(x - v) - f(x)\} \]
and
\[ |f(x - \delta v) - f(x)| \le \delta \max\{f(x + v) - f(x),\; f(x - v) - f(x)\}. \]
Consequently, $\max\{f(x + v) - f(x),\; f(x - v) - f(x)\} \ge 0$.

Proof: Since $f$ is convex, $f(x + \delta v) \le (1 - \delta)f(x) + \delta f(x + v)$. Rearranging terms yields
\[ f(x + \delta v) - f(x) \le \delta\bigl[f(x + v) - f(x)\bigr], \tag{2} \]
and replacing $v$ by $-v$ gives
\[ f(x - \delta v) - f(x) \le \delta\bigl[f(x - v) - f(x)\bigr]. \tag{3} \]
Also $f(x) \le \tfrac12 f(x + \delta v) + \tfrac12 f(x - \delta v)$. Multiplying by two and rearranging terms we obtain
\[ f(x) - f(x + \delta v) \le f(x - \delta v) - f(x). \tag{4} \]
Combining (3) and (4) yields
\[ f(x) - f(x + \delta v) \le \delta\bigl[f(x - v) - f(x)\bigr]. \]
This and (2) imply
\[ |f(x + \delta v) - f(x)| \le \delta \max\{f(x + v) - f(x),\; f(x - v) - f(x)\}. \]
Reversing the roles of $v$ and $-v$ completes the proof.

The next result is an immediate consequence of the preceding.

6.1.4 Theorem For a convex function $f \colon I \to \mathbb{R}$ defined on a real interval:

1. The function $f$ is continuous at every interior point of $I$.

2. The left and right derivatives exist and are finite at each interior point of $I$. Moreover, if $f'_\ell$ and $f'_r$ denote the left and right derivatives of $f$, respectively, and $x < y$ are interior points of $I$, then
\[ f'_\ell(x) \le f'_r(x) \le f'_\ell(y) \le f'_r(y). \]
In particular, the left and right derivatives are both nondecreasing on the interior of $I$.
3. The function $f$ is differentiable everywhere on the interior of $I$ except for at most countably many points.

Proof:
1. Let $x$ be an interior point of $I$ and choose $v$ small enough so that $0 < v \le 1$ and $[x - v, x + v] \subset I$. Now let $\varepsilon > 0$ be given, and choose $0 < \delta \le 1$ so that $\delta \max\{f(x + v) - f(x),\; f(x - v) - f(x)\} < \varepsilon$. Then by Lemma 6.1.3, if $|y - x| < \delta v$, then $|f(y) - f(x)| < \varepsilon$. That is, $f$ is continuous at $x$.

2. This follows from the Monotonicity of Slopes Theorem 6.1.1, and the definitions
\[ f'_\ell(x) = \lim_{v \downarrow 0} \frac{f(x - v) - f(x)}{-v} \quad\text{and}\quad f'_r(x) = \lim_{v \downarrow 0} \frac{f(x + v) - f(x)}{v}. \]

3. Given (2), the only way $f$ can fail to be differentiable at $x$ is if $f'_\ell(x) < f'_r(x)$, in which case there exists some rational number $q_x$ satisfying $f'_\ell(x) < q_x < f'_r(x)$. It follows from (2) that if $x < y$ are both points of nondifferentiability, then $q_x < q_y$. The conclusion follows from the countability of the rational numbers.

There are some practical corollaries that allow us to use derivatives to identify concave and convex functions.

6.1.5 Corollary A differentiable function $f \colon I \to \mathbb{R}$ on a real interval is concave if and only if its derivative is nonincreasing. It is convex if and only if its derivative is nondecreasing.

A twice differentiable function $f \colon I \to \mathbb{R}$ on a real interval is concave if and only if its second derivative is everywhere nonpositive. It is convex if and only if its second derivative is everywhere nonnegative.

Another characterization of differentiable concave and convex functions is in terms of the subgradient inequality, which we shall deal with in more depth in Section 18.3.

6.1.6 Theorem (Concave/convex functions lie below/above tangent lines) Let $I$ be an interval of reals, let $f \colon I \to \mathbb{R}$, let $x$ be an interior point of $I$, and assume that $f$ is differentiable at $x$.

If $f$ is convex, then for every $y$ in $I$,
\[ f(x) + f'(x)(y - x) \le f(y). \]
If $f$ is strictly convex, then for every $y$ in $I$,
\[ f(x) + f'(x)(y - x) < f(y) \quad\text{for } y \ne x. \]

[Figure: the graph of $f$ and the tangent line $y \mapsto f(x) + f'(x)(y - x)$ through the point $(x, f(x))$.]
If $f$ is concave, then for every $y$ in $I$,
\[ f(x) + f'(x)(y - x) \ge f(y). \]
If $f$ is strictly concave, then for every $y$ in $I$,
\[ f(x) + f'(x)(y - x) > f(y) \quad\text{for } y \ne x. \]

Proof for the concave case: By the definition of concavity,
\[ f\bigl(x + \lambda(y - x)\bigr) \ge f(x) + \lambda\bigl(f(y) - f(x)\bigr), \]
and rearranging gives
\[ f\bigl(x + \lambda(y - x)\bigr) - f(x) \ge \lambda\bigl(f(y) - f(x)\bigr). \]
Dividing both sides by $\lambda > 0$ and multiplying the left-hand side by $(y - x)/(y - x)$ yields
\[ \frac{f\bigl(x + \lambda(y - x)\bigr) - f(x)}{\lambda(y - x)}\,(y - x) \ge f(y) - f(x). \]
Letting $\lambda \downarrow 0$, the left-hand side converges to $f'(x)(y - x)$.

For the case that $f$ is strictly concave, we need a strict inequality, but the argument above just gives us a weak one. But the Monotonicity of Slopes Theorem 6.1.1 supplies us with what we need. Let $y \ne x$ and assume first that $y > x$, and let $z$ be midway between $x$ and $y$. Then
\[ \frac{f(z) - f(x)}{z - x} \le f'(x) \]
by the above, but
\[ \frac{f(y) - f(x)}{y - x} < \frac{f(z) - f(x)}{z - x}. \]
Multiplying by $y - x > 0$ implies $f(y) - f(x) < f'(x)(y - x)$, as desired. The case $y < x$ is similar: The argument for the concave case now shows that
\[ \frac{f(z) - f(x)}{z - x} \ge f'(x), \]
since $z - x < 0$, and monotonicity of slopes gives
\[ \frac{f(y) - f(x)}{y - x} > \frac{f(z) - f(x)}{z - x}, \]
but $y - x < 0$, so we reverse the inequality by multiplication and again get $f(y) - f(x) < f'(x)(y - x)$.

The converse is true as the following clever argument shows.

6.1.7 Theorem Let $I$ be an open interval of reals, let $f \colon I \to \mathbb{R}$, and assume that $f$ is differentiable at every point in $I$.

If $f(x) + f'(x)(y - x) \ge f(y)$ for all $x, y \in I$, then $f$ is concave.

If $f(x) + f'(x)(y - x) \le f(y)$ for all $x, y \in I$, then $f$ is convex.

Proof: We prove only the concave case. For each $x \in I$, define the function $h_x$ by $h_x(y) = f(x) + f'(x)(y - x)$. Each $h_x$ is affine (and hence concave), $f \le h_x$ for each $x$, and $f(x) = h_x(x)$. Thus
\[ f = \inf_{x \in I} h_x, \]
so by Exercise 1.3.3(5), $f$ is concave.
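The envelope argument in the proof of Theorem 6.1.7 can be illustrated numerically. The following is a minimal sketch under stated assumptions: the concave function $f = \ln$ on $(0, \infty)$, the finite grid, and the names `h`, `above`, and `envelope_gap` are all illustrative choices, and a finite grid can only suggest, not prove, the identity $f = \inf_x h_x$.

```python
import math


def f(t):
    return math.log(t)  # concave on (0, infinity)


def fprime(t):
    return 1.0 / t


def h(x, y):
    """Tangent line to f at x, evaluated at y: f(x) + f'(x) (y - x)."""
    return f(x) + fprime(x) * (y - x)


grid = [0.25 * k for k in range(1, 41)]  # 0.25, 0.50, ..., 10.0

# Every tangent line lies on or above the graph of the concave function f
# (the tolerance only absorbs floating-point rounding).
above = all(h(x, y) >= f(y) - 1e-12 for x in grid for y in grid)

# The lower envelope of the tangent lines touches f at each grid point,
# because h(y, y) = f(y); so the largest gap should be zero up to rounding.
envelope_gap = max(min(h(x, y) for x in grid) - f(y) for y in grid)
print(above, envelope_gap)
```

Since each $h_x$ is affine, the pointwise infimum is concave, which is the step the proof delegates to Exercise 1.3.3(5).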
6.1.8 Exercise (Convexity and classical inequalities) Theorem 6.1.6 can be used to derive a number of well-known inequalities, including:

1. For all $x \in \mathbb{R}$, $1 - e^{-x} \le x \le e^{x} - 1$, with strict inequalities for $x \ne 0$. Hint: Apply Theorem 6.1.6 to the case $x = 0$.

2. For all $x > 0$, $x - 1 \ge \ln x$. Hint: Apply Theorem 6.1.6 to the case $x = 1$.

[Margin note: Collect more of these. Check out [3].]

By refining the argument used to prove Theorem 6.1.6, you should be able to prove the following.

6.1.9 Proposition Let $I$ be an interval of reals, let $f \colon I \to \mathbb{R}$ be convex, let $x$ be an interior point of $I$. Let $c$ satisfy
\[ f'_\ell(x) \le c \le f'_r(x). \]
Then for every $y$ in $I$,
\[ f(x) + c\,(y - x) \le f(y). \]
If $f$ is strictly convex, then for every $y$ in $I$,
\[ f(x) + c\,(y - x) < f(y) \quad\text{for } y \ne x. \]

6.1.10 Exercise Prove Proposition 6.1.9. Find the analogous result for concave functions. What can you say if $x$ is on the boundary of the interval $I$?

[Margin note: Add Jensen's Inequality here?]

6.2 Extrema of convex and concave functions

A useful corollary of Theorem 6.1.6 is the following result on extrema.

6.2.1 Theorem (First order conditions are necessary and sufficient) Let $I$ be an interval of reals, let $f \colon I \to \mathbb{R}$, let $x$ be an interior point of $I$, and assume that $f$ is differentiable at $x$.

If $f$ is concave, then $f'(x) = 0$ if and only if $x$ maximizes $f$ over $I$.

If $f$ is convex, then $f'(x) = 0$ if and only if $x$ minimizes $f$ over $I$.

Proof: If $f$ is concave, then by Theorem 6.1.6, we have $f(x) + f'(x)(y - x) \ge f(y)$ for all $y \in I$. Thus $f'(x) = 0$ implies $f(x) \ge f(y)$ for all $y \in I$. For the converse, we may invoke the usual first order condition for a maximum. The convex case is similar.
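A small numerical illustration of the sufficiency half of Theorem 6.2.1 follows. It is only a sketch: the concave quadratic $f(t) = 3 - (t - 2)^2$, the interval $I = [0, 5]$, and the sample grid are assumptions made for the example, and checking finitely many points is of course not a proof.

```python
def f(t):
    return -(t - 2.0) ** 2 + 3.0  # concave and differentiable


def fprime(t):
    return -2.0 * (t - 2.0)


x_star = 2.0                           # an interior point of I = [0, 5]
grid = [0.05 * k for k in range(101)]  # sample points 0.00, 0.05, ..., 5.00

# First order condition at x_star ...
stationary = fprime(x_star) == 0.0

# ... and the tangent-line inequality f(x*) + f'(x*)(y - x*) >= f(y),
# which with f'(x*) = 0 reduces to f(x*) >= f(y).
is_max_on_grid = all(f(x_star) + fprime(x_star) * (y - x_star) >= f(y)
                     for y in grid)

# A non-stationary interior point, by contrast, is beaten somewhere on the grid.
x_other = 1.0
other_is_max = all(f(x_other) >= f(y) for y in grid)
print(stationary, is_max_on_grid, other_is_max)
```

The contrast with `x_other` reflects the necessity half: at an interior maximizer of a differentiable function the derivative must vanish.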
6.2.2 Theorem (Local extrema are global) Let $C$ be a nonempty convex subset of a topological vector space.

If $f \colon C \to \mathbb{R}$ is concave, then every local maximizer of $f$ is a global maximizer of $f$ over $C$.

If $f \colon C \to \mathbb{R}$ is convex, then every local minimizer of $f$ is a global minimizer of $f$ over $C$.

Proof: We prove the result for the case where $f$ is concave and $x^*$ is a local maximizer. That is, there is a neighborhood $V$ of zero such that for all $v \in V$ with $x^* + v \in C$ we have $f(x^*) \ge f(x^* + v)$. Now pick any $x$ belonging to $C$. By continuity of scalar multiplication, there is some $\delta > 0$ such that for all $0 \le \lambda \le \delta$ we have $\lambda(x - x^*) \in V$, so
\[ f(x^*) \ge f\bigl(x^* + \lambda(x - x^*)\bigr). \]
By the definition of concavity,
\[ f\bigl(x^* + \lambda(x - x^*)\bigr) \ge f(x^*) + \lambda\bigl[f(x) - f(x^*)\bigr] \]
for all $0 \le \lambda \le 1$, so combining these gives
\[ f(x^*) \ge f(x^*) + \lambda\bigl[f(x) - f(x^*)\bigr] \]
for $0 < \lambda \le \min\{\delta, 1\}$, which implies
\[ f(x) - f(x^*) \le 0. \]
Since $x$ was an arbitrary point in $C$, we see that $x^*$ maximizes $f$ over $C$.

6.2.3 Corollary If $f$ is strictly concave, a local maximum is a strict global maximum.

6.3 Continuity of convex functions

If we adopt the convex analyst's approach that a concave function assumes the value $-\infty$ outside its effective domain, then we are likely to have discontinuities at the boundary of the effective domain. Thus continuity is too much to expect as a global property of concave or convex functions. However, we shall see below that proper convex and concave functions are locally Lipschitz continuous on the relative interiors of their effective domains. On the boundary of its domain, by contrast, a concave or convex function can be very badly behaved.

6.3.1 Example (Bad behavior, Rockafellar [6, p. 53]) Let $D$ be the open unit disk in $\mathbb{R}^2$, and let $S$ be the unit circle. Define $f$ to be zero on the disk $D$, and let $f$ assume any values in $[0, \infty]$ on $S$. Then $f$ is convex. Note that $f$ can have both types of discontinuities along $S$, so convexity or concavity places virtually no restrictions on the boundary behavior. In fact, by taking $f$ to be the indicator of a non-Borel subset of $S$, convexity by itself does not even guarantee measurability.
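The reason Example 6.3.1 works is that a proper convex combination of two distinct points of the closed disk never lands on the circle $S$, so the convexity inequality is only ever tested at points where $f = 0$. Here is a hedged sketch of that observation. The particular boundary values, sample angles, and weights are arbitrary illustrative choices (the helper names `boundary_value` and `norm_of_combination` are not from the notes), and only finite boundary values are sampled even though the example allows $\infty$.

```python
import math


def boundary_value(theta):
    """Arbitrary nonnegative values of f on the unit circle (illustrative)."""
    return 5.0 if int(theta * 7) % 2 == 0 else 0.0


# Seventeen distinct points on S, given by their angles in (0, 2*pi).
angles = [0.1 + 0.37 * k for k in range(17)]
lams = [0.1, 0.25, 0.5, 0.75, 0.9]


def norm_of_combination(a, b, lam):
    """Euclidean norm of (1 - lam) p + lam q for circle points p, q at angles a, b."""
    x = (1 - lam) * math.cos(a) + lam * math.cos(b)
    y = (1 - lam) * math.sin(a) + lam * math.sin(b)
    return math.hypot(x, y)


checks = []
for a in angles:
    for b in angles:
        if a == b:
            continue
        for lam in lams:
            # The combination lies in the open disk D, where f = 0 ...
            inside = norm_of_combination(a, b, lam) < 1.0
            # ... so convexity only asks that 0 <= (1 - lam) f(p) + lam f(q).
            rhs = (1 - lam) * boundary_value(a) + lam * boundary_value(b)
            checks.append(inside and 0.0 <= rhs)

convex_on_samples = all(checks)
print(convex_on_samples)
```

Combinations involving a point of $D$ are not sampled: they also land in $D$, where the left-hand side is again zero.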
6.3.2 Lemma (Local continuity of convex functions) Let $f \colon X \to \mathbb{R}$ be a convex function on a topological vector space. If $f$ is bounded above on a neighborhood of an interior point of $\operatorname{dom} f$, then $f$ is continuous at that point.

Proof: Assume that $x \in \operatorname{dom} f$ and that there is a neighborhood $U$ of $x$ such that $y \in U \implies f(y) < f(x) + \mu$.

[Margin note: I need to do something about this argument.]

Without loss of generality we may assume that $U$ is of the form $x + V$ where $V$ is a circled neighborhood of zero (see footnote 1). Fix $\varepsilon > 0$ and choose $0 < \delta \le 1$ so that $\delta\mu < \varepsilon$. I claim that for every $y \in x + \delta V$ we have $|f(y) - f(x)| < \varepsilon$. So let $y \in x + \delta V$, and pick $v \in V$ such that $y = x + \delta v$. Since $V$ is circled, the line segment $[x - v, x + v]$ is included in $x + V$. By Lemma 6.1.3 we have
\[ |f(y) - f(x)| \le \delta \max\{f(x + v) - f(x),\; f(x - v) - f(x)\} < \delta\mu < \varepsilon. \]
This shows that $f$ is continuous at $x$.

Amazingly, continuity at a single interior point implies global continuity for convex functions.

6.3.3 Theorem (Global continuity of convex functions) Let $f \colon X \to \mathbb{R}$ be a convex function on a topological vector space, and let
\[ C = \operatorname{int} \operatorname{dom} f. \]
The following statements are equivalent.

1. $f$ is continuous on $C$.

2. $f$ is upper semicontinuous on $C$.

3. $f$ is bounded above on a neighborhood of each point in $C$.

4. $f$ is bounded above on a neighborhood of some point in $C$.

5. $f$ is continuous at some point in $C$.

Footnote 1. Recall (Definition A.11.2) that a set $A$ is circled if for any point $x \in A$, the line segment joining $x$ and $-x$ is included in $A$. In any tvs, a neighborhood of zero includes a circled open neighborhood. In $\mathbb{R}^m$ every Euclidean ball centered at zero is circled.
Proof: $(1) \implies (2)$ Obvious.

$(2) \implies (3)$ Assume that $f$ is upper semicontinuous and $x \in C$. Then the set $\{y \in C : f(y) < f(x) + 1\}$ is an open neighborhood of $x$ on which $f$ is bounded above.

$(3) \implies (4)$ This is trivial.

$(4) \implies (5)$ This is Lemma 6.3.2.

$(5) \implies (1)$ Suppose $f$ is continuous at the point $x$, and let $y$ be any other point in $C$. Since scalar multiplication is continuous and $C$ is open, since $y = x + 1(y - x) \in C$, there is some $\alpha > 1$ so that $z = x + \alpha(y - x) \in C$. Then $y = (1 - \lambda)z + \lambda x = z + \lambda(x - z)$ where $\lambda = (\alpha - 1)/\alpha$. Also, since $f$ is continuous at $x$, there is a circled neighborhood $V$ of zero such that $x + V \subset C$ and $f$ is bounded above on $x + V$, say by $\mu$. I claim that $f$ is bounded above on $y + \lambda V$.

[Figure: the points $x$, $y$, and $z$, with $y = z + \lambda(x - z)$, and the neighborhoods $x + V$ and $y + \lambda V$.]

To see this, let $v \in V$. Then $y + \lambda v = \lambda(x + v) + (1 - \lambda)z \in C$. The convexity of $f$ thus implies
\[ f(y + \lambda v) \le \lambda f(x + v) + (1 - \lambda)f(z) \le \lambda\mu + (1 - \lambda)f(z). \]
That is, $f$ is bounded above by $\lambda\mu + (1 - \lambda)f(z)$ on $y + \lambda V$. So by Lemma 6.3.2, $f$ is continuous at $y$.

6.3.4 Theorem In a finite dimensional topological vector space, every convex function is continuous on the interior of its domain.

Proof: Without loss of generality, we may assume we are dealing with $\mathbb{R}^m$. Let $f \colon C \to \mathbb{R}$ be a convex function defined on a convex subset $C$ of $\mathbb{R}^m$. Let $x$ be an interior point of $C$. Let $\{e_1, \dots, e_m\}$ be the standard basis, and pick $\varepsilon > 0$ such that $x \pm \varepsilon e_i \in C$ for each $i = 1, \dots, m$. Let $V$ be the convex hull of $\{x \pm \varepsilon e_i\}$. Then $x$ belongs to the interior of $V$. Moreover, since $C$ is convex, $V \subset C$. Let $\mu = \max_i \{f(x \pm \varepsilon e_i)\}$.

[Figure: the set $V$ for $m = 2$, the convex hull of the four points $x \pm \varepsilon e_1$ and $x \pm \varepsilon e_2$, with $x$ at its center.]

Now any point $y \in V$ may be written as a convex combination
\[ y = \sum_{i=1}^{m} \alpha_i (x + \varepsilon e_i) + \sum_{i=1}^{m} \beta_i (x - \varepsilon e_i). \]
Since $f$ is convex,
\[ f(y) \le \sum_{i=1}^{m} \alpha_i f(x + \varepsilon e_i) + \sum_{i=1}^{m} \beta_i f(x - \varepsilon e_i) \le \mu. \]
This shows that $f$ is bounded above on the neighborhood $V$ of $x$. By Lemma 6.3.2 the function $f$ is continuous at $x$.

6.3.5 Corollary In a finite dimensional topological vector space, every convex function is continuous on the relative interior of its domain.

A convex function on a convex subset of an infinite dimensional topological vector space need not be continuous on the interior of its domain. For instance, a discontinuous linear functional on an infinite dimensional topological vector space is convex. (Recall Example 0.3.1.)

If the topology of a tvs is generated by a norm, continuity of a convex function at an interior point implies local Lipschitz continuity. The proof of the next result is adapted from Roberts and Varberg [5].

6.3.6 Theorem Let $f \colon X \to \mathbb{R}$ be a convex function on a normed tvs. If $f$ is continuous at the interior point $x$ of $\operatorname{dom} f$, then $f$ is Lipschitz continuous on a neighborhood of $x$. That is, there exist $\delta > 0$ and $\mu > 0$ such that $B_\delta(x) \subset \operatorname{dom} f$ and for $y, z \in B_\delta(x)$, we have
\[ |f(y) - f(z)| \le \mu \|y - z\|. \]

Proof: Since $f$ is continuous at $x$, there exists $\delta > 0$ such that $B_{2\delta}(x) \subset \operatorname{dom} f$ and $w, z \in B_{2\delta}(x)$ implies $|f(w) - f(z)| < 1$. Given distinct $y$ and $z$ in $B_\delta(x)$, let $\alpha = \|y - z\|$ and let $w = y + \frac{\delta}{\alpha}(y - z)$, so $\|w - y\| = \frac{\delta}{\alpha}\|y - z\| = \delta$. Then $w$ belongs to $B_{2\delta}(x)$ and we may write $y$ as the convex combination
\[ y = \frac{\alpha}{\alpha + \delta}\,w + \frac{\delta}{\alpha + \delta}\,z. \]

[Figure: the balls $B_\delta(x)$ and $B_{2\delta}(x)$, with the points $y$, $z$, and $w$.]

Therefore
\[ f(y) \le \frac{\alpha}{\alpha + \delta}\,f(w) + \frac{\delta}{\alpha + \delta}\,f(z). \]
Subtracting $f(z)$ from each side gives
\[ f(y) - f(z) \le \frac{\alpha}{\alpha + \delta}\bigl[f(w) - f(z)\bigr] < \frac{\alpha}{\alpha + \delta}. \]
Switching the roles of $y$ and $z$ allows us to conclude
\[ |f(y) - f(z)| < \frac{\alpha}{\alpha + \delta} < \frac{\alpha}{\delta} = \frac{1}{\delta}\|y - z\|, \]
so $\mu = 1/\delta$ is the desired Lipschitz constant.
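To see the constant $\mu = 1/\delta$ from the proof of Theorem 6.3.6 in a concrete case, here is a minimal one-dimensional sketch. The function $f(t) = t^2$, the center $x = 1$, and the radius $\delta = 0.1$ are illustrative assumptions chosen so that $f$ varies by less than 1 on $B_{2\delta}(x) = (0.8, 1.2)$. The proof's constant is valid but not sharp, which the sampled difference quotients make visible.

```python
def f(t):
    return t * t  # convex on the real line


x, delta = 1.0, 0.1

# Hypothesis used in the proof: |f(w) - f(z)| < 1 on B_{2 delta}(x).
# f is increasing on [0.8, 1.2], so comparing the endpoints suffices here.
oscillation = f(x + 2 * delta) - f(x - 2 * delta)

mu = 1.0 / delta  # Lipschitz constant delivered by the proof

# Sample points of B_delta(x) and compute the largest difference quotient.
pts = [x - delta + 0.01 * k for k in range(1, 20)]  # 0.91, 0.92, ..., 1.09
worst_ratio = max(abs(f(y) - f(z)) / abs(y - z)
                  for y in pts for z in pts if y != z)
print(oscillation, mu, worst_ratio)
```

For this example the difference quotient equals $|y + z|$, so the sampled ratios stay near 2, well below the guaranteed constant $\mu = 10$.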
It turns out that strictly convex functions on infinite dimensional spaces are quite special. In order for a continuous function to be strictly convex on a compact convex set, the relative topology of the set must be metrizable. See Choquet [1, p. II-139].

References

[1] G. Choquet. 1969. Lectures on analysis, volume 1. Reading, Massachusetts: Benjamin.

[2] W. Fenchel. 1953. Convex cones, sets, and functions. Lecture notes, Princeton University, Department of Mathematics. From notes taken by D. W. Blackett, Spring 1951.

[3] G. H. Hardy, J. E. Littlewood, and G. Pólya. 1952. Inequalities, 2d. ed. Cambridge: Cambridge University Press.

[4] R. R. Phelps. 1993. Convex functions, monotone operators and differentiability, 2d. ed. Number 1364 in Lecture Notes in Mathematics. Berlin: Springer-Verlag.

[5] A. W. Roberts and D. E. Varberg. 1974. Another proof that convex functions are locally Lipschitz. American Mathematical Monthly 81:1014–1016. http://www.jstor.org/stable/2319313

[6] R. T. Rockafellar. 1970. Convex analysis. Number 28 in Princeton Mathematical Series. Princeton: Princeton University Press.

[7] H. L. Royden. 1988. Real analysis, 3d. ed. New York: Macmillan.