Convexity in $\mathbb{R}^n$

Let $E$ be a convex subset of $\mathbb{R}^n$. A function $f : E \to (-\infty, \infty]$ is convex iff
$$f((1-t)x + ty) \le (1-t)f(x) + t f(y) \qquad \forall\, x, y \in E,\ t \in [0,1].$$
A similar definition holds in any vector space. A topology is needed for the next results, and they can fail in the infinite dimensional case. Intuitively, a function is convex if at every point the graph lies above the tangent plane. Of course there is no guarantee at first that there is a tangent plane.

A first thing to notice is that the definition of convexity implies that if $f$ is convex on $E$ then the set $\{x \in E : f(x) < \infty\}$ is convex. From now on we assume that $E$ is a convex subset of $\mathbb{R}^n$ and $f : E \to \mathbb{R}$ is convex. That is, we throw away for the time being the points where $f$ could be infinite. We will also assume from now on that $E$ is an open convex subset of $\mathbb{R}^n$. Some of the results hold even if $E$ is not open, but a lot don't; in particular Lemma 2 is false if $E$ is not open.

If $x \in E$, $u \in \mathbb{R}^n$, we define $I(x,u) = \{t \in \mathbb{R} : x + tu \in E\}$; this is then an open interval in $\mathbb{R}$ containing 0. If $x \in E$ and $u \in \mathbb{R}^n$, when writing $f(x + tu)$ we will always assume that $t \in I(x,u)$. The following lemma will be needed in a while.

Lemma 1 Let $x \in E$, $u \in \mathbb{R}^n$. If $\tau \in I(x,u)$, $\tau \ne 0$, define
$$\varphi(\tau) = \frac{f(x + \tau u) - f(x)}{\tau}.$$
Then $\varphi$ increases in $I(x,u) \setminus \{0\}$.

Proof. Assume first $0 < \sigma < \tau \in I(x,u)$. Then $0 < \sigma/\tau < 1$ and
$$x + \sigma u = \left(1 - \frac{\sigma}{\tau}\right) x + \frac{\sigma}{\tau}(x + \tau u);$$
hence
$$f(x + \sigma u) \le \left(1 - \frac{\sigma}{\tau}\right) f(x) + \frac{\sigma}{\tau} f(x + \tau u).$$
Subtracting $f(x)$ from both sides and dividing by $\sigma$, we get $\varphi(\sigma) \le \varphi(\tau)$. One proceeds similarly if $\sigma < \tau < 0$. In this case $0 < \tau/\sigma < 1$ and we get $f(x + \tau u) - f(x) \le (\tau/\sigma)[f(x + \sigma u) - f(x)]$. Dividing by $\tau < 0$ reverses the inequality and gives $\varphi(\tau) \ge \varphi(\sigma)$. Finally, assume $\sigma < 0 < \tau$. Then
$$x = \frac{\tau}{\tau - \sigma}(x + \sigma u) + \frac{-\sigma}{\tau - \sigma}(x + \tau u),$$
hence
$$f(x) \le \frac{\tau}{\tau - \sigma} f(x + \sigma u) + \frac{-\sigma}{\tau - \sigma} f(x + \tau u).$$
Subtracting $\frac{\tau}{\tau - \sigma} f(x)$ from both sides and rearranging gives
$$\frac{\tau}{\tau - \sigma}\,\bigl(f(x) - f(x + \sigma u)\bigr) \le \frac{-\sigma}{\tau - \sigma}\,\bigl(f(x + \tau u) - f(x)\bigr).$$
Multiplying by the positive quantity $(\tau - \sigma)/(-\sigma\tau)$ gives $\varphi(\sigma) \le \varphi(\tau)$.

Next, continuity. The main step is proving that $f$ is locally bounded.
The idea behind that part of the proof is that every point of $E$ lies in the interior of some polyhedron contained in $E$, a cube for example. By convexity, the function inside the cube is bounded above by its values on the faces, and on each face it is bounded above by its values at the vertices. Thus inside a cube, $f$ is bounded above by the maximum of a finite set of values; a separate argument gives the bound from below.
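The vertex bound just described can be sanity-checked numerically. The sketch below is only an illustration: the convex test function `f`, the cube centre and the half-width `delta` are arbitrary choices made here, not taken from the notes, and random sampling can support the inequality but never prove it.

```python
import itertools
import random

def f(p):
    # A convex test function on R^2 (an assumption chosen for illustration):
    # a sum of the convex functions x^2, |y| and max(x, y).
    x, y = p
    return x * x + abs(y) + max(x, y)

def vertex_max(center, delta):
    # Maximum of f over the 2^n vertices center + delta * eps, eps in {-1, 1}^n.
    n = len(center)
    return max(
        f([c + delta * e for c, e in zip(center, eps)])
        for eps in itertools.product((-1.0, 1.0), repeat=n)
    )

def sample_cube(center, delta, rng):
    # A random point of the cube {x : |x_j - center_j| <= delta}.
    return [c + rng.uniform(-delta, delta) for c in center]

rng = random.Random(0)
center, delta = [0.3, -0.2], 0.5
bound = vertex_max(center, delta)
worst = max(f(sample_cube(center, delta, rng)) for _ in range(10_000))
print(worst <= bound)
```

The largest sampled value stays below the largest vertex value, as the convex-combination argument predicts.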

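Lemma 2 Let $E$ be an open convex subset of $\mathbb{R}^n$ and let $f : E \to \mathbb{R}$ be convex. Then $f$ is continuous.

Proof. First one has to see that $f$ is locally bounded. Let $x_0 = (x_{01}, \dots, x_{0n}) \in E$ and let
$$V = \{x = (x_1, \dots, x_n) \in \mathbb{R}^n : |x_j - x_{0j}| \le \delta,\ j = 1, \dots, n\},$$
where $\delta > 0$ is chosen small enough so that $V \subset E$. We see first that $f$ is bounded above in $V$. If $y \in V$, then $y$ is a convex combination (in more than one way) of the $2^n$ vertices of $V$; thus by convexity
$$f(y) \le \max\Bigl\{ f\Bigl(x_0 + \delta \sum_{j=1}^n \epsilon_j e_j\Bigr) : \epsilon = (\epsilon_1, \dots, \epsilon_n) \in \{-1, 1\}^n \Bigr\},$$
where $\{e_1, \dots, e_n\}$ is the canonical basis of $\mathbb{R}^n$.

Now let $W$ be a compact neighborhood of $x_0$ contained in the interior of $V$. To see that $f$ is bounded below on $W$, assume it isn't. Then, by compactness, there will exist a sequence $\{x_m\}$ in $W$, converging to some $z \in W$, such that $f(x_m) \to -\infty$ for $m \to \infty$. Now let $y_m \in V$ be on the line joining $x_m$ to $z$, but on the opposite side of $z$ from $x_m$; for example, let
$$y_m = z + \epsilon(z - x_m) = (1 + \epsilon) z - \epsilon x_m,$$
where $\epsilon > 0$ is small enough so that $y_m \in V$ for all $m$. Then
$$z = \frac{\epsilon}{1 + \epsilon}\, x_m + \frac{1}{1 + \epsilon}\, y_m, \qquad \text{hence} \qquad f(z) \le \frac{\epsilon}{1 + \epsilon} f(x_m) + \frac{1}{1 + \epsilon} f(y_m),$$
but the right hand side goes to $-\infty$ as $m \to \infty$, since $\epsilon > 0$ is fixed and $\{f(y_m)\}$ remains bounded above. This contradicts that $f(z) > -\infty$. We proved $f$ is locally bounded.

For continuity, assume again $x \in E$ and let $V$ be a compact neighborhood of $x$ in $E$, so that $f$ is bounded on $V$. Let $\{x_m\}$ be a sequence in $V$ converging to $x$. We can write $x_m = x + \tau_m u_m$ where for each $m \in \mathbb{N}$, $u_m$ is a unit vector in $\mathbb{R}^n$ and $\tau_m \ge 0$; $\lim_{m \to \infty} \tau_m = 0$. Let $\sigma > 0$ be such that $x - \sigma u_m \in V$ for all $m$. The main point is that $\sigma$ is strictly positive and does not depend on $m$, so that by Lemma 1 we have, for every $m$ with $\tau_m > 0$ (since $\tau_m > 0 > -\sigma$),
$$\frac{f(x + \tau_m u_m) - f(x)}{\tau_m} \ge \frac{f(x + (-\sigma) u_m) - f(x)}{-\sigma},$$
hence
$$f(x_m) - f(x) = \tau_m\, \frac{f(x + \tau_m u_m) - f(x)}{\tau_m} \ge \tau_m\, \frac{f(x) - f(x - \sigma u_m)}{\sigma} \to 0 \quad \text{as } m \to \infty,$$
because $f$ is bounded on $V$ and $\tau_m \to 0$ (if $\tau_m = 0$ then $x_m = x$ and there is nothing to prove). This proves $\liminf_{m \to \infty} (f(x_m) - f(x)) \ge 0$. Similarly one proves that $\limsup_{m \to \infty} (f(x_m) - f(x)) \le 0$. Continuity follows.

Hahn-Banach. The version we need here is a finite dimensional one, but it is valid in all topological vector spaces.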
A good reference is Schaefer's Topological Vector Spaces book, but the theorem appears in many places.

Theorem 3 Let $L$ be a topological vector space, let $M$ be a linear manifold in $L$ and let $A$ be an open non-empty convex subset of $L$ such that $A \cap M = \emptyset$. There exists a closed hyperplane in $L$ containing $M$ and not intersecting $A$.

Just in case, a topological vector space (tvs) is a vector space $L$ over $F = \mathbb{R}$ or $\mathbb{C}$ with a topology such that the maps
$$(x, y) \mapsto x + y : L \times L \to L, \qquad (\alpha, x) \mapsto \alpha x : F \times L \to L,$$
are continuous. For example, normed spaces are tvs; in particular $\mathbb{R}^n$. A linear manifold in a vector space $L$ is the translate of a subspace; that is, any set of the form $a + W = \{a + w : w \in W\}$, where $W$ is a linear subspace of $L$. A hyperplane is a linear manifold of codimension 1; i.e., a linear manifold of the form $a + W$ where $L = W \oplus N$, and $N$ is one-dimensional. All hyperplanes are closed in the finite dimensional case, not so in the infinite dimensional case. One proves that a subset $H$ of the tvs $L$ is a closed hyperplane if and only if it is of the form $H = \{x \in L : f(x) = \alpha\}$ where $f : L \to F$ is a continuous, non-identically vanishing, linear functional and $\alpha \in F$. In $\mathbb{R}^n$ a subset $H$ is a hyperplane if and only if there exist $\xi \in \mathbb{R}^n \setminus \{0\}$, $\alpha \in \mathbb{R}$ such that $H = \{x \in \mathbb{R}^n : x \cdot \xi = \alpha\}$.

We want to consider the case in which $A \subset \mathbb{R}^n$ is open and convex, $\emptyset \ne A \ne \mathbb{R}^n$ and $x \in \partial A$, the boundary of $A$. Singleton sets are linear manifolds; they are translates of the zero dimensional subspace $\{0\}$. Thus $\{x\}$ is one and, since $A$ is open, $\{x\} \cap A = \emptyset$; thus there exist $0 \ne \xi \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, such that $\xi \cdot x = \alpha$ and $\xi \cdot y \ne \alpha$ for all $y \in A$. Since $A$ is connected, being convex, one either has $\xi \cdot y > \alpha$ for all $y \in A$ or $\xi \cdot y < \alpha$ for all $y \in A$. Replacing $\xi$ by $-\xi$ (and $\alpha$ by $-\alpha$) if necessary, we can assume $\xi \cdot y > \alpha$ for all $y \in A$.

Assume now that $f : E \to \mathbb{R}$ is convex; $E$ a convex open subset of $\mathbb{R}^n$. Consider the set $A = \{(x, y) \in E \times \mathbb{R} : y > f(x)\}$. We see that $A$ is convex in $\mathbb{R}^n \times \mathbb{R}$ and, since $f$ is continuous, $A$ is open. Let $x \in E$; then $(x, f(x))$ is on the boundary of $A$ and by what we just did there exist $(\xi, \tau) \in \mathbb{R}^n \times \mathbb{R}$, $(\xi, \tau) \ne (0, 0)$, and $\alpha \in \mathbb{R}$ such that
$$\xi \cdot x + \tau f(x) = \alpha, \qquad \xi \cdot y + \tau s > \alpha \quad \forall\, y \in E,\ s \in \mathbb{R},\ s > f(y).$$
We notice that $\tau > 0$. In fact, we can't have $\tau = 0$, otherwise $\xi \cdot y > \alpha$ for all $y \in E$, contradicting that (if $\tau = 0$) $\xi \cdot x = \alpha$. Take now any point in $E$, for example $x$. Then $\xi \cdot x + \tau s > \alpha$ for all $s > f(x)$, which is only possible if $\tau > 0$. Dividing by $\tau$, setting $\eta = -\xi/\tau$ and writing again $\alpha$ for $\alpha/\tau$, we can write the consequence of applying Hahn-Banach as
$$\alpha = f(x) - \eta \cdot x, \qquad s > \alpha + \eta \cdot y \quad \forall\, y \in E,\ s \in \mathbb{R},\ s > f(y).$$
We conclude
$$f(y) \ge \alpha + \eta \cdot y = f(x) + \eta \cdot (y - x) \quad \forall\, y \in E.$$
The equation $z = f(x) + \eta \cdot (y - x)$ is the equation of a hyperplane through the point $(x, f(x))$, so the last inequality states that the graph of the function is always above that hyperplane. We have the following definition.

Let $E$ be a convex open subset of $\mathbb{R}^n$ and let $f : E \to \mathbb{R}$ be convex. Let $x \in E$. The subdifferential of $f$ at $x$ is the set $\partial f(x)$ consisting of all $\eta \in \mathbb{R}^n$ such that
$$(1) \qquad f(y) \ge f(x) + \eta \cdot (y - x) \quad \forall\, y \in E.$$
We have

Lemma 4 Let $f : E \to \mathbb{R}$ be convex, where $E$ is an open convex subset of $\mathbb{R}^n$. For every $x \in E$, the set $\partial f(x)$ is a non-empty closed convex subset of $\mathbb{R}^n$. The function $f$ is differentiable at $x \in E$ if and only if $\partial f(x)$ is a singleton set, in which case $\partial f(x) = \{\nabla f(x)\}$.
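Before the proof, the definition and the statement of Lemma 4 can be illustrated on the simplest non-differentiable convex function, $f(x) = |x|$ on $\mathbb{R}$. The sketch below tests inequality (1) on a finite grid of points $y$ for a finite list of candidate slopes $\eta$; the grid, the candidate list and the helper name `is_subgradient` are choices made here for illustration. A grid check is only a necessary condition for membership in $\partial f(x)$, although for this particular $f$ it happens to give the exact answer.

```python
def f(x):
    # The test function f(x) = |x|, convex on R and not differentiable at 0.
    return abs(x)

def is_subgradient(eta, x, grid, tol=1e-12):
    # Check inequality (1), f(y) >= f(x) + eta * (y - x), at every grid point y.
    return all(f(y) >= f(x) + eta * (y - x) - tol for y in grid)

grid = [k / 100.0 for k in range(-300, 301)]     # test points y in [-3, 3]
candidates = [k / 10.0 for k in range(-20, 21)]  # candidate slopes in [-2, 2]

at_zero = [eta for eta in candidates if is_subgradient(eta, 0.0, grid)]
at_one = [eta for eta in candidates if is_subgradient(eta, 1.0, grid)]

print(at_zero[0], at_zero[-1])  # surviving slopes at the kink x = 0
print(at_one)                   # surviving slopes at the smooth point x = 1
```

At the kink the surviving slopes fill the interval $[-1, 1]$, so $\partial f(0)$ is not a singleton, while at $x = 1$ only the derivative $1$ survives, in line with the lemma.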

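Proof. It is proved above, as a consequence of Hahn-Banach, that $\partial f(x) \ne \emptyset$ for all $x \in E$. The fact that $\partial f(x)$ is closed and convex is trivial. Next we want to see that if $x \in E$ and $\partial f(x)$ is a singleton set, then $f$ is differentiable at $x$. Assume thus that $x \in E$ and $\partial f(x) = \{\eta\}$, for some $\eta \in \mathbb{R}^n$. Let $u \in \mathbb{R}^n$ and consider the function $\varphi$ of Lemma 1. By the lemma, the limits
$$\rho(u) = \lim_{\tau \to 0+} \frac{f(x + \tau u) - f(x)}{\tau}, \qquad l(u) = \lim_{\sigma \to 0-} \frac{f(x + \sigma u) - f(x)}{\sigma}$$
exist and
$$\frac{f(x + \sigma u) - f(x)}{\sigma} \le l(u) \le \rho(u) \le \frac{f(x + \tau u) - f(x)}{\tau}$$
for all $u \in \mathbb{R}^n$, $\sigma, \tau \in I(x,u)$, $\sigma < 0 < \tau$.

We claim $l(u) = \eta \cdot u = \rho(u)$ for all $u \in \mathbb{R}^n$. To see this, let $u \in \mathbb{R}^n$ and select $s_0 \in \mathbb{R}$ such that $l(u) \le s_0 \le \rho(u)$. As before, let $A = \{(y, s) \in E \times \mathbb{R} : s > f(y)\}$, which is open and convex in $\mathbb{R}^n \times \mathbb{R}$, and let $M = \{(x + tu, f(x) + t s_0) : t \in \mathbb{R}\}$, which is a linear manifold (a line) in $\mathbb{R}^n \times \mathbb{R}$. If $(y, s) \in A \cap M$, then $y = x + tu$, $s = f(x) + t s_0$ for some $t \in \mathbb{R}$ (necessarily $t \in I(x,u)$) and $f(x + tu) < f(x) + t s_0$. This is impossible if $t = 0$. If $t > 0$ it works out to $\varphi(t) < s_0$, where $\varphi$ is defined as above in terms of the $u$ we are considering right now. However, $\varphi(t) \ge \rho(u) \ge s_0$ if $t > 0$. If $t < 0$ we similarly get $\varphi(t) > s_0$, a contradiction because $\varphi(t) \le l(u) \le s_0$ if $t < 0$. Thus $A \cap M = \emptyset$, and by Hahn-Banach there exist $(\xi, \tau) \in \mathbb{R}^n \times \mathbb{R} \setminus \{(0, 0)\}$, $\alpha \in \mathbb{R}$, such that
$$(2) \qquad \xi \cdot (x + tu) + \tau (f(x) + t s_0) = \alpha \quad \forall\, t \in \mathbb{R},$$
$$(3) \qquad \xi \cdot y + \tau s > \alpha \quad \forall\, (y, s) \in E \times \mathbb{R},\ s > f(y).$$
Obviously, (2) is equivalent to
$$(4) \qquad \xi \cdot u + \tau s_0 = 0,$$
$$(5) \qquad \xi \cdot x + \tau f(x) = \alpha.$$
If $\tau = 0$, we get $\xi \cdot x = \alpha$ from (5), contradicting that by (3) we must get $\xi \cdot x = \xi \cdot x + \tau s > \alpha$ if $s > f(x)$. Thus $\tau \ne 0$. In view of this, we see that (3) is equivalent to $\tau > 0$ and
$$f(y) \ge \frac{\alpha}{\tau} - \frac{1}{\tau}\, \xi \cdot y \quad \forall\, y \in E.$$
Using the value of $\alpha$ given by (5), we get
$$f(y) \ge f(x) - \frac{1}{\tau}\, \xi \cdot (y - x) \quad \forall\, y \in E.$$
This implies that $(-1/\tau)\xi \in \partial f(x)$, hence $(-1/\tau)\xi = \eta$. By (4), this implies that $s_0 = \eta \cdot u$. The value of $s_0$ is thus uniquely determined by $u$ and by $\eta$, proving that $\rho(u) = l(u)$. On the other hand, if $\tau > 0$, $u \in \mathbb{R}^n$, then $f(x + \tau u) \ge f(x) + \tau\, \eta \cdot u$ implies that $\varphi(\tau) \ge \eta \cdot u$ for all $\tau > 0$. Similarly $\varphi(\tau) \le \eta \cdot u$ if $\tau < 0$. Thus, in general, $\rho(u) \ge \eta \cdot u \ge l(u)$, proving the claim.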
To see that $f$ is differentiable at $x$ and that $\eta = \nabla f(x)$, let $\epsilon > 0$ be given. Since $l(u) = \rho(u) = \eta \cdot u$ for all $u \in \mathbb{R}^n$, there exists $\delta > 0$ such that $0 < |\tau| < \delta$ implies
$$\left| \frac{f(x + \tau e_i) - f(x)}{\tau} - \eta \cdot e_i \right| < \epsilon$$
for $i = 1, \dots, n$, where $\{e_1, \dots, e_n\}$ is the canonical basis of $\mathbb{R}^n$. Let $0 \ne u = (u_1, \dots, u_n) \in \mathbb{R}^n$, and assume that $\tau = |u_1| + \cdots + |u_n| < \delta$. We can write $u = \sum_{i=1}^n |u_i|\, \tilde e_i$ where $\tilde e_i = e_i$ if $u_i \ge 0$, $\tilde e_i = -e_i$ if $u_i < 0$. Then
$$x + u = \sum_{i=1}^n \frac{|u_i|}{\tau}\, (x + \tau \tilde e_i);$$
by convexity,
$$f(x + u) \le \sum_{i=1}^n \frac{|u_i|}{\tau}\, f(x + \tau \tilde e_i).$$
Thus, using also that $\eta \in \partial f(x)$ for the lower bound,
$$(6) \qquad 0 \le f(x + u) - f(x) - \eta \cdot u \le \sum_{i=1}^n |u_i| \left( \frac{f(x + \tau \tilde e_i) - f(x)}{\tau} - \eta \cdot \tilde e_i \right).$$
Now
$$\frac{f(x + \tau \tilde e_i) - f(x)}{\tau} - \eta \cdot \tilde e_i = \pm \left( \frac{f(x + \tau' e_i) - f(x)}{\tau'} - \eta \cdot e_i \right)$$
where $\tau' = \pm\tau$; since $|\tau'| = \tau < \delta$, we get that
$$\left| \frac{f(x + \tau \tilde e_i) - f(x)}{\tau} - \eta \cdot \tilde e_i \right| < \epsilon;$$
using this in (6) we proved that $0 \le f(x + u) - f(x) - \eta \cdot u < \epsilon \tau$ whenever $u \in \mathbb{R}^n$ and $\tau = |u_1| + \cdots + |u_n| < \delta$. Since $\tau \le \sqrt{n}\, |u|$, the proof of differentiability is complete.

Conversely, assume $f$ is differentiable at $x$. In this case we get at once that
$$\rho(u) = \lim_{\tau \to 0+} \frac{f(x + \tau u) - f(x)}{\tau} = \nabla f(x) \cdot u = \lim_{\tau \to 0-} \frac{f(x + \tau u) - f(x)}{\tau} = l(u);$$
as before we see that $\eta \in \partial f(x)$ implies $l(u) \le \eta \cdot u \le \rho(u)$ for all $u$. It follows that $\eta = \nabla f(x)$.

Monotonicity and cyclical monotonicity. Let $S \subset \mathbb{R}^n \times \mathbb{R}^n$. It is said to be monotone of degree 1 iff $(x_0 - x_1) \cdot \xi_0 + (x_1 - x_0) \cdot \xi_1 \ge 0$ for all $(x_0, \xi_0), (x_1, \xi_1) \in S$. This, of course, can be abbreviated as $(x_0 - x_1) \cdot (\xi_0 - \xi_1) \ge 0$ for all $(x_0, \xi_0), (x_1, \xi_1) \in S$. What exactly this means geometrically may have to be looked into. Similarly, one defines a set $S \subset \mathbb{R}^n \times \mathbb{R}^n$ to be monotone of degree $m$, where $m \in \mathbb{N}$, iff
$$\sum_{i=0}^m (x_{i+1} - x_i) \cdot \xi_i \le 0$$
for all $(x_0, \xi_0), \dots, (x_m, \xi_m) \in S$, where $x_{m+1} = x_0$. A set $S \subset \mathbb{R}^n \times \mathbb{R}^n$ is said to be cyclically monotone iff it is monotone of all degrees.

Let $f : E \to \mathbb{R}$ be convex, $E$ a convex, open subset of $\mathbb{R}^n$. The set $\partial f \subset \mathbb{R}^n \times \mathbb{R}^n$ is defined by
$$\partial f = \{(x, \xi) : x \in E,\ \xi \in \partial f(x)\}.$$
If $(x_i, \xi_i) \in \partial f$ for $i = 0, \dots, m$, then $f(y) \ge f(x_i) + (y - x_i) \cdot \xi_i$ for all $y \in E$, $i = 0, \dots, m$. Taking in the $i$-th inequality $y = x_{i+1}$, with $x_{m+1} = x_0$, we see that
$$f(x_{i+1}) - f(x_i) \ge (x_{i+1} - x_i) \cdot \xi_i, \qquad i = 0, \dots, m.$$
Adding over $i = 0, \dots, m$, the left hand sides telescope to 0 and one gets $\sum_{i=0}^m (x_{i+1} - x_i) \cdot \xi_i \le 0$. In other words, the set $\partial f$ is cyclically monotone. It being obvious that a subset of a cyclically monotone set is cyclically monotone, we see that all subsets of $\partial f$ for a convex $f$ are cyclically monotone.

Rockafellar's theorem is the converse of this result. That is, he states it in the form of an if and only if condition, but the serious part is the converse. It seems however that the convex function that comes out of this could be infinite at some points. Rockafellar defines a convex function as being proper iff it is not identically $\infty$. As mentioned at the beginning, the set of points where $f$ is finite is convex, so a convex function can only be equal to $\infty$ outside a convex set. That set need not be open in general, so if $f : \mathbb{R}^n \to (-\infty, \infty]$ is a proper convex function, all we said so far is valid on the interior $E$ of $\{x \in \mathbb{R}^n : f(x) < \infty\}$, provided this interior is not empty. For such $f$, $\partial f$ is the set of pairs $(x, \xi)$ with $f(x) < \infty$ and $f(y) \ge f(x) + (y - x) \cdot \xi$ for all $y \in \mathbb{R}^n$; in particular $\partial f(x) \ne \emptyset$ for every $x \in E$, and the argument above shows that $\partial f$ is again cyclically monotone.

Theorem 5 (Rockafellar in $\mathbb{R}^n$) A subset $S \subset \mathbb{R}^n \times \mathbb{R}^n$ is cyclically monotone if and only if there exists a proper convex $f : \mathbb{R}^n \to (-\infty, \infty]$ such that $S \subset \partial f$.

Proof. The if part has been done, so we only need to do the only if part. Assume thus that $S \subset \mathbb{R}^n \times \mathbb{R}^n$ is cyclically monotone. One of the simplest of convex functions is an affine function, which is simply a function of the form $f(x) = a + x \cdot \xi$, for $a \in \mathbb{R}$, $\xi \in \mathbb{R}^n$. This type of function has the distinction that the convexity inequality is actually an equality, hence it is not only convex, it is also concave. We will need the following lemma. The proof is quite simple, except that one has to be careful because the supremum in question can be infinite at some points, or possibly everywhere.

Lemma 6 Let $f_\alpha : \mathbb{R}^n \to (-\infty, \infty]$ be convex for all $\alpha \in A$, $A$ an index set. Then $f : \mathbb{R}^n \to (-\infty, \infty]$ defined by
$$f(x) = \sup_{\alpha \in A} f_\alpha(x)$$
is convex.

To prove the theorem, Rockafellar now assumes, as one may, that $S \ne \emptyset$ and fixes a pair $(x_0, \xi_0) \in S$. He then defines
$$f(x) = \sup\Bigl\{ (x - x_m) \cdot \xi_m + \sum_{i=0}^{m-1} (x_{i+1} - x_i) \cdot \xi_i : m \in \mathbb{N},\ (x_i, \xi_i) \in S,\ i = 1, \dots, m \Bigr\}.$$
The function $f$, as a sup of affine, hence convex, functions, is convex. It could be $\infty$ at a lot of places, but because $S$ is cyclically monotone one has, for all choices of $(x_1, \xi_1), \dots, (x_m, \xi_m) \in S$,
$$(x_0 - x_m) \cdot \xi_m + \sum_{i=0}^{m-1} (x_{i+1} - x_i) \cdot \xi_i \le 0,$$
thus $f(x_0) \le 0$. Selecting $m = 1$ and $(x_1, \xi_1) = (x_0, \xi_0)$ we get an affine function that is 0 at $x_0$, thus $f(x_0) = 0$.
Thus $f$ is proper. The rest seems easy. To see that this function works we have to see that if $(x, \xi) \in S$ then $f(y) \ge f(x) + (y - x) \cdot \xi$ for all $y \in \mathbb{R}^n$ (an inequality that automatically holds for $y$ such that $f(y) = \infty$, thus we could limit ourselves to $y$'s such that $f(y) < \infty$). So let $(x, \xi) \in S$. In principle, we don't know yet that $f(x) < \infty$, so what we need to prove is: If $\alpha < f(x)$ then
$$f(y) \ge \alpha + (y - x) \cdot \xi \quad \forall\, y \in \mathbb{R}^n.$$
This will prove it (and by restricting to $y = x_0$ proves $f(x) < \infty$). And the proof is now almost immediate, since by the definition of $f$, $\alpha < f(x)$ implies the existence of $(x_1, \xi_1), \dots, (x_m, \xi_m) \in S$ such that
$$\alpha < (x - x_m) \cdot \xi_m + (x_m - x_{m-1}) \cdot \xi_{m-1} + \cdots + (x_1 - x_0) \cdot \xi_0.$$

On the other hand, since $(x_1, \xi_1), \dots, (x_m, \xi_m), (x, \xi) \in S$, the definition of $f(y)$ implies
$$f(y) \ge (y - x) \cdot \xi + (x - x_m) \cdot \xi_m + (x_m - x_{m-1}) \cdot \xi_{m-1} + \cdots + (x_1 - x_0) \cdot \xi_0 > \alpha + (y - x) \cdot \xi.$$
We are (i.e., Rockafellar is) done.

In Singular Integrals and Differentiability Properties of Functions, Stein proves the following theorem:

Theorem ES. Let $E \subset \mathbb{R}^n$, let $U$ be open in $\mathbb{R}^n$ with $E \subset U$ and let $f : U \to \mathbb{R}$ be Lebesgue measurable. Then $f$ is differentiable at almost every point of $E$ if and only if the relation
$$(7) \qquad f(x + y) - f(x) = O(|y|) \quad \text{as } y \to 0$$
holds for almost every $x \in E$.

It is, of course, not assumed that the constant appearing in the $O$ above is uniform in $x$. (Chapter VIII, Theorem 3, page 250.) The proof is quite lengthy and uses a lot of results Stein developed previously. It has an immediate application to convex functions.

Lemma 7 Let $E$ be an open convex subset of $\mathbb{R}^n$ and let $f : E \to \mathbb{R}$ be convex. Then $f$ is differentiable a.e.

Proof. By Lemma 2, $f$ is continuous, hence measurable. Let $x \in E$. Since $f$ is locally bounded, there is $r > 0$ and a constant $C \ge 0$ such that if $|y - x| \le r$ then $y \in E$ and $|f(y)| \le C$. Assuming now $0 < |y| < r$, let $\omega = r y/|y|$ and let $t = |y|/r \in (0, 1)$. Then $x + y = (1 - t)x + t(x + \omega)$ so that by convexity
$$f(x + y) - f(x) \le (1 - t) f(x) + t f(x + \omega) - f(x) = t\,(f(x + \omega) - f(x)) \le 2Ct = \frac{2C}{r}\, |y|.$$
On the other hand, if $\xi \in \partial f(x)$, then
$$f(x + y) - f(x) \ge \xi \cdot y \ge -|\xi|\, |y|.$$
It follows that (7) holds for all $x \in E$; by Theorem ES, $f$ is a.e. differentiable.
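The upper estimate $f(x+y) - f(x) \le (2C/r)|y|$ from the proof of Lemma 7 can be checked numerically. The following sketch is an illustration under assumptions made here, not part of the notes: the convex test function, the base point `x`, the radius `r` and the explicit bound `C` are all chosen for the example, and random sampling only supports the inequality.

```python
import math
import random

def f(p):
    # A convex test function on R^2 (an assumption chosen for illustration).
    return p[0] ** 2 + p[1] ** 2 + abs(p[0])

x = (0.3, -0.2)
r = 1.0
# On the ball |p - x| <= r we have |p| <= |x| + r =: R, hence 0 <= f(p) <= R^2 + R.
R = math.hypot(*x) + r
C = R * R + R

rng = random.Random(1)
worst_ratio = 0.0
for _ in range(10_000):
    angle = rng.uniform(0.0, 2.0 * math.pi)
    length = rng.uniform(1e-6, 0.999 * r)          # 0 < |y| < r
    y = (length * math.cos(angle), length * math.sin(angle))
    increment = f((x[0] + y[0], x[1] + y[1])) - f(x)
    worst_ratio = max(worst_ratio, increment / length)

lipschitz_bound = 2.0 * C / r
print(worst_ratio <= lipschitz_bound)
```

The bound $2C/r$ is far from sharp for this example, which is to be expected: the proof only uses the size of $f$ on the ball, not any information about its slope.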