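Convexity in $\mathbb{R}^n$

Let $E$ be a convex subset of $\mathbb{R}^n$. A function $f : E \to (-\infty, \infty]$ is convex iff
\[ f(tx + (1-t)y) \le t f(x) + (1-t) f(y) \qquad \forall\, x, y \in E,\ t \in [0, 1]. \]
A similar definition holds in any vector space. The next results need a topology, and they can fail in the infinite dimensional case. Intuitively, a function is convex if at every point the graph lies above the tangent plane. Of course there is no guarantee at first that there is a tangent plane.

A first thing to notice is that the definition of convexity implies that if $f$ is convex on $E$ then the set $\{x \in E : f(x) < \infty\}$ is convex. From now on we assume that $E$ is a convex subset of $\mathbb{R}^n$ and $f : E \to \mathbb{R}$ is convex. That is, we throw away for the time being the points where $f$ could be infinite. We will also assume from now on that $E$ is an open convex subset of $\mathbb{R}^n$. Some of the results hold even if $E$ is not open, but a lot don't; in particular Lemma 2 is false if $E$ is not open.

If $x \in E$, $u \in \mathbb{R}^n$, we define $I(x, u) = \{t \in \mathbb{R} : x + tu \in E\}$; this is then an open interval in $\mathbb{R}$ containing $0$. If $x \in E$ and $u \in \mathbb{R}^n$, when writing $f(x + tu)$ we will always assume that $t \in I(x, u)$. The following lemma will be needed in a while.

Lemma 1. Let $x \in E$, $u \in \mathbb{R}^n$. If $\tau \in I(x, u)$, $\tau \ne 0$, define
\[ \varphi(\tau) = \frac{f(x + \tau u) - f(x)}{\tau}. \]
Then $\varphi$ increases in $I(x, u) \setminus \{0\}$.

Proof. Assume first $0 < \sigma < \tau \in I(x, u)$. Then $0 < \sigma/\tau < 1$ and
\[ x + \sigma u = \left(1 - \frac{\sigma}{\tau}\right) x + \frac{\sigma}{\tau}\,(x + \tau u); \]
hence
\[ f(x + \sigma u) \le \left(1 - \frac{\sigma}{\tau}\right) f(x) + \frac{\sigma}{\tau}\, f(x + \tau u). \]
Subtracting $f(x)$ from both sides and dividing by $\sigma$, we get $\varphi(\sigma) \le \varphi(\tau)$. One proceeds similarly if $\sigma < \tau < 0$. In this case $0 < \tau/\sigma < 1$ and we get $f(x + \tau u) - f(x) \le (\tau/\sigma)[f(x + \sigma u) - f(x)]$. Dividing by $\tau < 0$ reverses the inequality, giving $\varphi(\tau) \ge \varphi(\sigma)$. Finally, assume $\sigma < 0 < \tau$. Then
\[ x = \frac{\tau}{\tau - \sigma}\,(x + \sigma u) + \frac{-\sigma}{\tau - \sigma}\,(x + \tau u), \]
hence
\[ f(x) \le \frac{\tau}{\tau - \sigma}\, f(x + \sigma u) + \frac{-\sigma}{\tau - \sigma}\, f(x + \tau u). \]
Since the two coefficients add up to $1$, subtracting $\frac{\tau}{\tau - \sigma} f(x + \sigma u) + \frac{-\sigma}{\tau - \sigma} f(x)$ from both sides gives
\[ \frac{-\sigma}{\tau - \sigma}\,\bigl[f(x + \tau u) - f(x)\bigr] \ge \frac{\tau}{\tau - \sigma}\,\bigl[f(x) - f(x + \sigma u)\bigr]. \]
Multiplying by the positive quantity $(\tau - \sigma)/(-\sigma\tau)$ gives $\varphi(\sigma) \le \varphi(\tau)$.

Next, continuity. The main step is proving that $f$ is locally bounded.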
The idea behind that part of the proof is that every point of $E$ lies in the interior of some polyhedron contained in $E$, a cube for example. By convexity, the function inside the cube is bounded above by its values on the faces, and on each face it is bounded above by its values at the vertices. Thus inside a cube, $f$ is bounded above by the maximum of a finite set of values. Boundedness from below needs a separate argument.
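The vertex bound can be checked numerically. The following is only a sketch under stated assumptions: the test function `f` and the helper names `cube_vertices`, `vertex_bound` and `check_bound` are hypothetical choices made for illustration, and random sampling gives evidence rather than a proof.

```python
import itertools
import random


def cube_vertices(center, delta):
    """Return the 2^n points center + delta*(eps_1, ..., eps_n), eps_j in {-1, 1}."""
    return [
        tuple(c + delta * e for c, e in zip(center, eps))
        for eps in itertools.product((-1.0, 1.0), repeat=len(center))
    ]


def f(p):
    """Hypothetical convex test function: squared norm plus |first coordinate|."""
    return sum(t * t for t in p) + abs(p[0])


def vertex_bound(func, center, delta):
    """Maximum of func over the vertices of the cube."""
    return max(func(v) for v in cube_vertices(center, delta))


def check_bound(func, center, delta, samples=1000, seed=0):
    """Sample points of the cube and test func(p) <= max over the vertices."""
    rng = random.Random(seed)
    bound = vertex_bound(func, center, delta)
    for _ in range(samples):
        p = tuple(c + rng.uniform(-delta, delta) for c in center)
        if func(p) > bound + 1e-12:  # small tolerance for rounding
            return False
    return True
```

For a convex `func` the check should always succeed; a non-convex function such as the negative of the squared norm would typically fail it.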
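Lemma 2. Let $E$ be an open convex subset of $\mathbb{R}^n$ and let $f : E \to \mathbb{R}$ be convex. Then $f$ is continuous.

Proof. First one has to see that $f$ is locally bounded. Let $x_0 = (x_{01}, \ldots, x_{0n}) \in E$ and let
\[ V = \{x = (x_1, \ldots, x_n) \in \mathbb{R}^n : |x_j - x_{0j}| \le \delta,\ j = 1, \ldots, n\}, \]
where $\delta > 0$ is chosen small enough so that $V \subset E$. We see first that $f$ is bounded above in $V$. If $y \in V$, then $y$ is a convex combination (in more than one way) of the $2^n$ vertices of $V$; thus by convexity
\[ f(y) \le \max\Bigl\{ f\Bigl(x_0 + \delta \sum_{j=1}^n \epsilon_j e_j\Bigr) : \epsilon = (\epsilon_1, \ldots, \epsilon_n) \in \{-1, 1\}^n \Bigr\}, \]
where $\{e_1, \ldots, e_n\}$ is the canonical basis of $\mathbb{R}^n$.

Now let $W$ be a compact neighborhood of $x_0$ contained in the interior of $V$. To see that $f$ is bounded below on $W$, assume it isn't. Then, by compactness, there will exist a sequence $\{x_m\}$ in $W$, converging to some $z \in W$, such that $f(x_m) \to -\infty$ for $m \to \infty$. Now let $y_m \in V$ be on the line joining $x_m$ to $z$, but on the opposite side of $z$ from $x_m$; for example, let
\[ y_m = z + \epsilon(z - x_m) = (1 + \epsilon) z - \epsilon x_m, \]
where $\epsilon > 0$ is small enough so that $y_m \in V$ for all $m$. Then
\[ z = \frac{\epsilon}{1 + \epsilon}\, x_m + \frac{1}{1 + \epsilon}\, y_m, \qquad\text{hence}\qquad f(z) \le \frac{\epsilon}{1 + \epsilon}\, f(x_m) + \frac{1}{1 + \epsilon}\, f(y_m), \]
but the right hand side goes to $-\infty$ as $m \to \infty$, since $\epsilon > 0$ is fixed and $\{f(y_m)\}$ remains bounded above. This contradicts that $f(z) > -\infty$. We proved $f$ is locally bounded.

For continuity, assume again $x \in E$ and let $V$ be a compact neighborhood of $x$ in $E$ on which $f$ is bounded, say $|f| \le C$ on $V$. Let $\{x_m\}$ be a sequence in $V$ converging to $x$. Discarding the terms with $x_m = x$, we can write $x_m = x + \tau_m u_m$ where for each $m \in \mathbb{N}$, $u_m$ is a unit vector in $\mathbb{R}^n$ and $\tau_m > 0$; $\lim_{m \to \infty} \tau_m = 0$. Let $\sigma > 0$ be such that $x \pm \sigma u_m \in V$ for all $m$. The main thing is that $\sigma$ is strictly positive and does not depend on $m$. By Lemma 1 we have (since $\tau_m > 0 > -\sigma$)
\[ \frac{f(x + \tau_m u_m) - f(x)}{\tau_m} \ge \frac{f(x + (-\sigma) u_m) - f(x)}{-\sigma}, \]
hence
\[ f(x_m) - f(x) = \tau_m\, \frac{f(x + \tau_m u_m) - f(x)}{\tau_m} \ge \tau_m\, \frac{f(x) - f(x - \sigma u_m)}{\sigma} \ge -\frac{2C}{\sigma}\, \tau_m \to 0 \quad\text{as } m \to \infty, \]
proving $\liminf_{m \to \infty} (f(x_m) - f(x)) \ge 0$. Similarly, comparing $\varphi(\tau_m)$ with $\varphi(\sigma)$ once $\tau_m < \sigma$, one proves that $\limsup_{m \to \infty} (f(x_m) - f(x)) \le 0$. Continuity follows.

Hahn-Banach. The version we need here is a finite dimensional one, but it is valid in all topological vector spaces.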
A good reference is Schaefer's Topological Vector Spaces book, but it appears in zillions of places.

Theorem 3. Let $L$ be a topological vector space, let $M$ be a linear manifold in $L$ and let $A$ be an open non-empty convex subset of $L$ such that $A \cap M = \emptyset$. There exists a closed hyperplane in $L$ containing $M$ not intersecting $A$.

Just in case, a topological vector space (tvs) is a vector space $L$ over $F = \mathbb{R}$ or $\mathbb{C}$ with a topology such that the maps
\[ (x, y) \mapsto x + y : L \times L \to L, \qquad (\alpha, x) \mapsto \alpha x : F \times L \to L, \]
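are continuous. For example, normed spaces are tvs; in particular $\mathbb{R}^n$. A linear manifold in a vector space $L$ is the translate of a subspace; that is, any set of the form $a + W = \{a + w : w \in W\}$, where $W$ is a linear subspace of $L$. A hyperplane is a linear manifold of codimension 1; i.e., a linear manifold of the form $a + W$ where $L = W \oplus N$, and $N$ is one-dimensional. All hyperplanes are closed in the finite dimensional case, not so in the infinite dimensional case. One proves that a subset $H$ of the tvs $L$ is a closed hyperplane if and only if it is of the form $H = \{x \in L : f(x) = \alpha\}$ where $f : L \to F$ is a continuous, non-identically vanishing, linear functional and $\alpha \in F$. In $\mathbb{R}^n$ a subset $H$ is a hyperplane if and only if there exist $\xi \in \mathbb{R}^n \setminus \{0\}$, $\alpha \in \mathbb{R}$ such that $H = \{x \in \mathbb{R}^n : x \cdot \xi = \alpha\}$.

We want to consider the case in which $A \subset \mathbb{R}^n$ is open and convex, $\emptyset \ne A \ne \mathbb{R}^n$ and $x \in \partial A$, the boundary of $A$. Singleton sets are linear manifolds; they are translates of the zero dimensional subspace $\{0\}$. Thus $\{x\}$ is a linear manifold and, since $A$ is open, $\{x\} \cap A = \emptyset$. By Theorem 3 there exist $0 \ne \xi \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, such that $\xi \cdot x = \alpha$ and $\xi \cdot y \ne \alpha$ for all $y \in A$. Since $A$ is connected, being convex, one either has $\xi \cdot y > \alpha$ for all $y \in A$ or $\xi \cdot y < \alpha$ for all $y \in A$. Replacing $\xi$ by $-\xi$ and $\alpha$ by $-\alpha$ if necessary, we can assume $\xi \cdot y > \alpha$ for all $y \in A$.

Assume now that $f : E \to \mathbb{R}$ is convex; $E$ a convex open subset of $\mathbb{R}^n$. Consider the set $A = \{(x, y) \in E \times \mathbb{R} : y > f(x)\}$. We see that $A$ is convex in $\mathbb{R}^n \times \mathbb{R}$ and, since $f$ is continuous, $A$ is open. Let $x \in E$; then $(x, f(x))$ is on the boundary of $A$ and by what we just did there exist $(\xi, \lambda) \in \mathbb{R}^n \times \mathbb{R}$, $(\xi, \lambda) \ne (0, 0)$, and $\alpha \in \mathbb{R}$ such that
\[ \xi \cdot x + \lambda f(x) = \alpha, \qquad \xi \cdot y + \lambda s > \alpha \quad \forall\, y \in E,\ s \in \mathbb{R},\ s > f(y). \]
We notice that $\lambda > 0$. In fact, we can't have $\lambda = 0$, otherwise $\xi \cdot y > \alpha$ for all $y \in E$, contradicting that (if $\lambda = 0$) $\xi \cdot x = \alpha$. Take now any point in $E$, for example $x$. Then $\xi \cdot x + \lambda s > \alpha$ for all $s > f(x)$, which is only possible if $\lambda > 0$. Dividing by $\lambda$, setting $\eta = -\xi/\lambda$ and writing again $\alpha$ for $\alpha/\lambda$, we can write the consequence of applying Hahn-Banach as
\[ \alpha = f(x) - \eta \cdot x, \qquad s > \alpha + \eta \cdot y \quad \forall\, y \in E,\ s \in \mathbb{R},\ s > f(y). \]
We conclude
\[ f(y) \ge \alpha + \eta \cdot y = f(x) + \eta \cdot (y - x) \quad \forall\, y \in E. \]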
The equation $z = f(x) + \eta \cdot (y - x)$ is the equation of a hyperplane through the point $(x, f(x))$, so the last inequality states that the graph of the function is always on or above that hyperplane. We have the following definition.

Let $E$ be a convex open subset of $\mathbb{R}^n$ and let $f : E \to \mathbb{R}$ be convex. Let $x \in E$. The subdifferential of $f$ at $x$ is the set $\partial f(x)$ consisting of all $\eta \in \mathbb{R}^n$ such that
\[ f(y) \ge f(x) + \eta \cdot (y - x) \quad \forall\, y \in E. \tag{1} \]
We have

Lemma 4. Let $f : E \to \mathbb{R}$ be convex, where $E$ is an open convex subset of $\mathbb{R}^n$. For every $x \in E$, the set $\partial f(x)$ is a non-empty closed convex subset of $\mathbb{R}^n$. The function $f$ is differentiable at $x \in E$ if and only if $\partial f(x)$ is a singleton set, in which case $\partial f(x) = \{\nabla f(x)\}$.
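As a one-dimensional illustration of definition (1) and of Lemma 4, consider $f(x) = |x|$ on $E = \mathbb{R}$: at $x = 0$ the subdifferential is the interval $[-1, 1]$ and $f$ is not differentiable there, while at $x = 1$ it is the singleton $\{1\}$. The sketch below tests inequality (1) on a finite grid only, so it can refute a candidate subgradient but cannot certify one; the helper name `is_subgradient` and the grid are assumptions made for this example.

```python
def is_subgradient(func, x, eta, test_points, tol=1e-12):
    """Test f(y) >= f(x) + eta*(y - x) at finitely many y (necessary, not sufficient)."""
    fx = func(x)
    return all(func(y) >= fx + eta * (y - x) - tol for y in test_points)


# Hypothetical test grid on [-3, 3] with step 0.01.
grid = [k / 100.0 for k in range(-300, 301)]

# Candidates at x = 0 for f = abs: every eta in [-1, 1] passes, eta = 1.5 fails.
at_zero = {eta: is_subgradient(abs, 0.0, eta, grid) for eta in (-1.0, 0.5, 1.0, 1.5)}

# Candidates at x = 1: only eta = 1 passes.
at_one = {eta: is_subgradient(abs, 1.0, eta, grid) for eta in (0.5, 1.0, 1.5)}
```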
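Proof. It is proved above, as a consequence of Hahn-Banach, that $\partial f(x) \ne \emptyset$ for all $x \in E$. The fact that $\partial f(x)$ is closed and convex is immediate from (1). Next we want to see that if $x \in E$ and $\partial f(x)$ is a singleton set, then $f$ is differentiable at $x$. Assume thus that $x \in E$ and $\partial f(x) = \{\eta\}$, for some $\eta \in \mathbb{R}^n$. Let $u \in \mathbb{R}^n$ and consider the function $\varphi$ of Lemma 1. By the lemma, the limits
\[ \rho(u) = \lim_{\tau \to 0+} \frac{f(x + \tau u) - f(x)}{\tau}, \qquad l(u) = \lim_{\sigma \to 0-} \frac{f(x + \sigma u) - f(x)}{\sigma} \]
exist and
\[ \frac{f(x + \tau u) - f(x)}{\tau} \ge \rho(u) \ge l(u) \ge \frac{f(x + \sigma u) - f(x)}{\sigma} \]
for all $u \in \mathbb{R}^n$, $\sigma, \tau \in I(x, u)$, $\sigma < 0 < \tau$. We claim $l(u) = \eta \cdot u = \rho(u)$ for all $u \in \mathbb{R}^n$.

To see this, let $u \in \mathbb{R}^n$ and select $s_0 \in \mathbb{R}$ such that $l(u) \le s_0 \le \rho(u)$. As before, let $A = \{(y, s) \in E \times \mathbb{R} : s > f(y)\}$, which is open and convex in $\mathbb{R}^n \times \mathbb{R}$, and let $M = \{(x + tu, f(x) + t s_0) : t \in \mathbb{R}\}$, which is a linear manifold (a line) in $\mathbb{R}^n \times \mathbb{R}$. If $(y, s) \in A \cap M$, then $y = x + tu$, $s = f(x) + t s_0$ for some $t \in \mathbb{R}$ (necessarily $t \in I(x, u)$) and $f(x + tu) < f(x) + t s_0$. This is impossible if $t = 0$. If $t > 0$ it works out to $\varphi(t) < s_0$, where $\varphi$ is defined as above in terms of the $u$ we are considering right now. However, $\varphi(t) \ge \rho(u) \ge s_0$ if $t > 0$. If $t < 0$ we similarly get $\varphi(t) > s_0$, a contradiction because $\varphi(t) \le l(u) \le s_0$ if $t < 0$. Thus $A \cap M = \emptyset$, and by Hahn-Banach there exist $(\xi, \lambda) \in (\mathbb{R}^n \times \mathbb{R}) \setminus \{(0, 0)\}$ and $\alpha \in \mathbb{R}$ such that
\[ \xi \cdot (x + tu) + \lambda (f(x) + t s_0) = \alpha \quad \forall\, t \in \mathbb{R}, \tag{2} \]
\[ \xi \cdot y + \lambda s > \alpha \quad \forall\, (y, s) \in E \times \mathbb{R},\ s > f(y). \tag{3} \]
Obviously, (2) is equivalent to
\[ \xi \cdot u + \lambda s_0 = 0, \tag{4} \]
\[ \xi \cdot x + \lambda f(x) = \alpha. \tag{5} \]
If $\lambda = 0$, we get $\xi \cdot x = \alpha$ from (5), contradicting that by (3) we must get $\xi \cdot x = \xi \cdot x + \lambda s > \alpha$ if $s > f(x)$. Thus $\lambda \ne 0$, and letting $s \to \infty$ in (3) shows $\lambda > 0$. In view of this, we see that (3) is equivalent to $\lambda > 0$ and
\[ f(y) \ge \frac{\alpha}{\lambda} - \frac{1}{\lambda}\, \xi \cdot y \quad \forall\, y \in E. \]
Using the value of $\alpha$ given by (5), we get
\[ f(y) \ge f(x) - \frac{1}{\lambda}\, \xi \cdot (y - x) \quad \forall\, y \in E. \]
This implies that $(-1/\lambda)\xi \in \partial f(x)$, hence $(-1/\lambda)\xi = \eta$. By (4), this implies that $s_0 = \eta \cdot u$. The value of $s_0$ is thus uniquely determined by $u$ and by $\eta$; since $s_0$ was an arbitrary point of $[l(u), \rho(u)]$, this proves that $\rho(u) = l(u)$. On the other hand, if $\tau > 0$, $u \in \mathbb{R}^n$, then $f(x + \tau u) \ge f(x) + \tau\, \eta \cdot u$ implies that $\varphi(\tau) \ge \eta \cdot u$ for all $\tau > 0$. Similarly $\varphi(\tau) \le \eta \cdot u$ if $\tau < 0$. Thus, in general $\rho(u) \ge \eta \cdot u \ge l(u)$, proving the claim.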
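To see that $f$ is differentiable at $x$ and that $\eta = \nabla f(x)$, let $\epsilon > 0$ be given. Since $l(u) = \rho(u) = \eta \cdot u$ for all $u \in \mathbb{R}^n$, there exists $\delta > 0$ such that $0 < |\tau| < \delta$ implies
\[ \left| \frac{f(x + \tau e_i) - f(x)}{\tau} - \eta \cdot e_i \right| < \epsilon \]
for $i = 1, \ldots, n$, where $\{e_1, \ldots, e_n\}$ is the canonical basis of $\mathbb{R}^n$. Let $0 \ne u = (u_1, \ldots, u_n) \in \mathbb{R}^n$, and assume that $\tau = |u|_1 = |u_1| + \cdots + |u_n| < \delta$. We can write $u = \sum_{i=1}^n |u_i|\, \tilde e_i$ where $\tilde e_i = e_i$ if $u_i \ge 0$, $\tilde e_i = -e_i$ if $u_i < 0$. Then
\[ x + u = \sum_{i=1}^n \frac{|u_i|}{\tau}\, (x + \tau \tilde e_i); \]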
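by convexity,
\[ f(x + u) \le \sum_{i=1}^n \frac{|u_i|}{\tau}\, f(x + \tau \tilde e_i). \]
Thus, the lower bound coming from $\eta \in \partial f(x)$,
\[ 0 \le f(x + u) - f(x) - \eta \cdot u \le \sum_{i=1}^n |u_i| \left( \frac{f(x + \tau \tilde e_i) - f(x)}{\tau} - \eta \cdot \tilde e_i \right). \tag{6} \]
Now
\[ \frac{f(x + \tau \tilde e_i) - f(x)}{\tau} - \eta \cdot \tilde e_i = \pm \left( \frac{f(x + \tau' e_i) - f(x)}{\tau'} - \eta \cdot e_i \right), \]
where $\tau' = \pm \tau$; since $|\tau'| = \tau < \delta$, we get that
\[ \left| \frac{f(x + \tau \tilde e_i) - f(x)}{\tau} - \eta \cdot \tilde e_i \right| < \epsilon; \]
using this in (6) we proved that
\[ 0 \le f(x + u) - f(x) - \eta \cdot u < \epsilon\, |u|_1 \]
whenever $0 \ne u \in \mathbb{R}^n$ and $|u|_1 < \delta$. Since all norms on $\mathbb{R}^n$ are equivalent, the proof of differentiability is complete.

Conversely, assume $f$ is differentiable at $x$. In this case we get at once that
\[ \rho(u) = \lim_{\tau \to 0+} \frac{f(x + \tau u) - f(x)}{\tau} = \nabla f(x) \cdot u = \lim_{\tau \to 0-} \frac{f(x + \tau u) - f(x)}{\tau} = l(u); \]
as before we see that $\eta \in \partial f(x)$ implies $l(u) \le \eta \cdot u \le \rho(u)$ for all $u$. It follows that $\eta = \nabla f(x)$.

Monotonicity and cyclical monotonicity. Let $S \subset \mathbb{R}^n \times \mathbb{R}^n$. It is said to be monotone of degree 1 iff
\[ (x_0 - x_1) \cdot \xi_0 + (x_1 - x_0) \cdot \xi_1 \ge 0 \]
for all $(x_0, \xi_0), (x_1, \xi_1) \in S$. This, of course, can be abbreviated as $(x_0 - x_1) \cdot (\xi_0 - \xi_1) \ge 0$ for all $(x_0, \xi_0), (x_1, \xi_1) \in S$. What exactly this means geometrically may have to be looked into, perhaps. Similarly, one defines a set $S \subset \mathbb{R}^n \times \mathbb{R}^n$ to be monotone of degree $m$, where $m \in \mathbb{N}$, iff
\[ \sum_{i=0}^m (x_{i+1} - x_i) \cdot \xi_i \le 0 \]
for all $(x_0, \xi_0), \ldots, (x_m, \xi_m) \in S$, where $x_{m+1} = x_0$. A set $S \subset \mathbb{R}^n \times \mathbb{R}^n$ is said to be cyclically monotone iff it is monotone of all degrees.

Let $f : E \to \mathbb{R}$ be convex, $E$ a convex, open subset of $\mathbb{R}^n$. The set $\partial f \subset \mathbb{R}^n \times \mathbb{R}^n$ is defined by
\[ \partial f = \{(x, \xi) : x \in E,\ \xi \in \partial f(x)\}. \]
If $(x_i, \xi_i) \in \partial f$ for $i = 0, \ldots, m$, then $f(y) \ge f(x_i) + (y - x_i) \cdot \xi_i$ for all $y \in E$, $i = 0, \ldots, m$. Taking in the $i$-th inequality $y = x_{i+1}$, with $x_{m+1} = x_0$, we see that
\[ f(x_{i+1}) - f(x_i) \ge (x_{i+1} - x_i) \cdot \xi_i, \quad i = 0, \ldots, m. \]
Adding over $i = 0, \ldots, m$, the left hand sides telescope to $0$ and one gets $\sum_{i=0}^m (x_{i+1} - x_i) \cdot \xi_i \le 0$. In other words, the set $\partial f$ is cyclically monotone. It being obvious that a subset of a cyclically monotone set is cyclically monotone, we see that all subsets of $\partial f$ for a convex $f$ are cyclically monotone. Rockafellar's theorem is the converse of this result. That is, he states it in the form of an if and only if condition, but the serious part is the converse. It seems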
however that the convex function that comes out of this could be infinite at some points. Rockafellar defines a convex function $f : \mathbb{R}^n \to (-\infty, \infty]$ as being proper iff it is not identically $\infty$. As mentioned at the beginning, the set $E = \{x \in \mathbb{R}^n : f(x) < \infty\}$ where $f$ is finite is convex, so a convex function can only be equal to $\infty$ on the outskirts of its domain. The set $E$ need not be open, however (the function equal to $0$ on a closed ball and to $\infty$ elsewhere is convex). Thus if $f : \mathbb{R}^n \to (-\infty, \infty]$ is a proper convex function, all we said so far is valid on the interior of $E$. For such $f$ we define $\partial f(x)$ by requiring (1) for all $y \in \mathbb{R}^n$. And, of course, $\partial f \subset E \times \mathbb{R}^n$: if $\partial f(x) \ne \emptyset$ then $x \in E$, since otherwise (1) would force $f \equiv \infty$; conversely $\partial f(x) \ne \emptyset$ at every interior point $x$ of $E$.

Theorem 5 (Rockafellar in $\mathbb{R}^n$). A subset $S \subset \mathbb{R}^n \times \mathbb{R}^n$ is cyclically monotone if and only if there exists a proper convex $f : \mathbb{R}^n \to (-\infty, \infty]$ such that $S \subset \partial f$.

Proof. The if part has been done, so we only need to do the only if part. Assume thus that $S \subset \mathbb{R}^n \times \mathbb{R}^n$ is cyclically monotone. One of the simplest of convex functions is an affine function, which is simply a function of the form $f(x) = a + x \cdot \xi$, for $a \in \mathbb{R}$, $\xi \in \mathbb{R}^n$. This type of function has the distinction that the convexity inequality is actually an equality, hence it is not only convex, it is also concave. We will need the following Lemma. The proof is quite simple, except that one has to be careful because the supremum in question can be infinite at some points, or possibly everywhere.

Lemma 6. Let $f_\alpha : \mathbb{R}^n \to (-\infty, \infty]$ be convex for all $\alpha \in A$, $A$ an index set. Then $f : \mathbb{R}^n \to (-\infty, \infty]$ defined by
\[ f(x) = \sup_{\alpha \in A} f_\alpha(x) \]
is convex.

To prove the theorem, Rockafellar now assumes, as one may, that $S \ne \emptyset$ and fixes a pair $(x_0, \xi_0) \in S$. He then defines
\[ f(x) = \sup\Bigl\{ (x - x_m) \cdot \xi_m + \sum_{i=0}^{m-1} (x_{i+1} - x_i) \cdot \xi_i : m \in \mathbb{N},\ (x_i, \xi_i) \in S,\ i = 1, \ldots, m \Bigr\}. \]
The function $f$, as a sup of affine, hence convex, functions, is convex. It could be $\infty$ at a lot of places, but because $S$ is cyclically monotone one has for all choices of $(x_1, \xi_1), \ldots, (x_m, \xi_m) \in S$,
\[ (x_0 - x_m) \cdot \xi_m + \sum_{i=0}^{m-1} (x_{i+1} - x_i) \cdot \xi_i \le 0, \]
thus $f(x_0) \le 0$. Selecting $m = 1$ and $(x_1, \xi_1) = (x_0, \xi_0)$ we get the affine function $x \mapsto (x - x_0) \cdot \xi_0$, which is $0$ at $x_0$, thus $f(x_0) = 0$.
Thus $f$ is proper. The rest seems easy. To see that this function works we have to see that if $(x, \xi) \in S$ then $f(y) \ge f(x) + (y - x) \cdot \xi$ for all $y \in \mathbb{R}^n$ (an inequality that automatically holds for $y$ such that $f(y) = \infty$, thus we could limit ourselves to $y$'s such that $f(y) < \infty$). So let $(x, \xi) \in S$. In principle, we don't know yet that $f(x) < \infty$, so what we need to prove is: If $\alpha < f(x)$ then
\[ f(y) \ge \alpha + (y - x) \cdot \xi \quad \forall\, y \in \mathbb{R}^n. \]
This will prove it (and taking $y = x_0$, where $f(x_0) = 0$, proves $f(x) \le (x - x_0) \cdot \xi < \infty$). And the proof is now almost immediate, since by the definition of $f$, $\alpha < f(x)$ implies the existence of $(x_1, \xi_1), \ldots, (x_m, \xi_m) \in S$ such that
\[ \alpha < (x - x_m) \cdot \xi_m + (x_m - x_{m-1}) \cdot \xi_{m-1} + \cdots + (x_1 - x_0) \cdot \xi_0. \]
On the other hand, since $(x_1, \xi_1), \ldots, (x_m, \xi_m), (x, \xi) \in S$, the definition of $f(y)$ implies
\[ f(y) \ge (y - x) \cdot \xi + (x - x_m) \cdot \xi_m + (x_m - x_{m-1}) \cdot \xi_{m-1} + \cdots + (x_1 - x_0) \cdot \xi_0 > \alpha + (y - x) \cdot \xi. \]
We are (i.e., Rockafellar is) done.

In Singular integrals and differentiability properties of functions Stein proves the following theorem:

Theorem ES. Let $E \subset \mathbb{R}^n$, let $U$ be an open subset of $\mathbb{R}^n$ containing $E$ and let $f : U \to \mathbb{R}$ be Lebesgue measurable. Then $f$ is differentiable at almost every point of $E$ if and only if the relation
\[ f(x + y) - f(x) = O(|y|) \quad\text{as } y \to 0 \tag{7} \]
holds for almost every $x \in E$.

It is, of course, not assumed that the constant appearing in the $O$ above is uniform in $x$. (Chapter VIII, Theorem 3, page 250.) The proof is quite lengthy and uses a lot of results Stein developed previously. It has an immediate application to convex functions.

Lemma 7. Let $E$ be an open convex subset of $\mathbb{R}^n$ and let $f : E \to \mathbb{R}$ be convex. Then $f$ is differentiable a.e.

Proof. By Lemma 2, $f$ is continuous, hence measurable. Let $x \in E$. Since $f$ is locally bounded, there is $r > 0$ and a constant $C \ge 0$ such that if $|y - x| \le r$ then $y \in E$ and $|f(y)| \le C$. Assuming now $0 < |y| < r$, let $\omega = r y/|y|$ and let $t = |y|/r \in (0, 1)$. Then $x + y = (1 - t) x + t (x + \omega)$ so that by convexity
\[ f(x + y) - f(x) \le (1 - t) f(x) + t f(x + \omega) - f(x) = t\,(f(x + \omega) - f(x)) \le 2Ct = \frac{2C}{r}\, |y|. \]
On the other hand, if $\xi \in \partial f(x)$, then
\[ f(x + y) - f(x) \ge \xi \cdot y \ge -|\xi|\, |y|. \]
It follows that (7) holds for all $x \in E$; by Theorem ES $f$ is a.e. differentiable.
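The two-sided estimate in the proof of Lemma 7 can be sanity-checked in one dimension. This is a minimal sketch under stated assumptions: the helper name `lipschitz_type_bounds`, the sample function $f(t) = t^2$, and the values $x = 1$, $r = 1$, $C = 4$ (a bound for $|f|$ on $[0, 2]$) and $\xi = 2$ (the derivative at $1$) are all chosen here for illustration.

```python
def lipschitz_type_bounds(func, x, xi, r, C, samples=200):
    """Check -|xi||y| <= f(x+y) - f(x) <= (2C/r)|y| at sampled y with 0 < |y| < r."""
    for k in range(1, samples):
        for sign in (-1.0, 1.0):
            y = sign * r * k / samples
            diff = func(x + y) - func(x)
            if diff > (2.0 * C / r) * abs(y) + 1e-12:  # upper bound from convexity
                return False
            if diff < -abs(xi) * abs(y) - 1e-12:  # lower bound from the subgradient
                return False
    return True


def square(t):
    """Hypothetical convex sample function."""
    return t * t


# |f| <= 4 on [0, 2], and 2 is the (only) subgradient of f at x = 1.
ok = lipschitz_type_bounds(square, 1.0, 2.0, 1.0, 4.0)

# With a constant C that does not bound |f| near x, the upper estimate can fail.
too_small = lipschitz_type_bounds(square, 1.0, 2.0, 1.0, 0.1)
```

Here `ok` should hold because $C = 4$ really bounds $|f|$ on the closed ball of radius $r$ around $x$; the second call shows that the estimate depends on that hypothesis.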