Convex Functions. Pontus Giselsson


Today's lecture
- lower semicontinuity, closure, convex hull
- convexity-preserving operations: precomposition with affine mapping, infimal convolution, image function, supremum of convex functions (example: conjugate functions)
- support functions
- sublinearity
- directional derivative

Domain
- Assume that $f : X \to \mathbb{R}$ is finite-valued.
- Then $X \subseteq \mathbb{R}^n$ is the domain of $f$.

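Extending the domain
- We want to avoid explicitly stating the domain $X$ in $f : X \to \mathbb{R}$.
- Extend the domain of functions with $X \neq \mathbb{R}^n$ by constructing
  $$\hat{f}(x) = \begin{cases} f(x) & \text{if } x \in X \\ \infty & \text{else} \end{cases}$$
  (this extension works well in convex analysis).
- Obviously $\hat{f} : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$.
- Define $\overline{\mathbb{R}} := \mathbb{R} \cup \{+\infty\}$.
- Function values can be compared on all of $\mathbb{R}^n$ using $\overline{\mathbb{R}}$ arithmetic ($\infty = \infty$ and $c < \infty$ for all $c \in \mathbb{R}$).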
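The extension above can be mimicked in floating-point arithmetic, where `math.inf` plays the role of $+\infty$. The following is a minimal sketch; the helper name `extend` and the example function $f(x) = 1/x$ on $X = (0, \infty)$ are assumptions chosen for illustration, not part of the lecture.

```python
import math

def extend(f, in_domain):
    """Return the extended-valued version of f: f(x) on X, +inf elsewhere."""
    def f_hat(x):
        return f(x) if in_domain(x) else math.inf
    return f_hat

# Hypothetical example: f(x) = 1/x with domain X = (0, inf).
f_hat = extend(lambda x: 1.0 / x, lambda x: x > 0)

inside = f_hat(2.0)     # finite value, since 2 lies in X
outside = f_hat(-1.0)   # +inf, since -1 lies outside X

print(inside, outside)
print(3.0 < outside)            # c < inf for every real c
print(outside == f_hat(-2.0))   # inf == inf
```

With this convention a comparison such as `f_hat(x) < f_hat(y)` is meaningful for any pair of points, which is what the extended-valued arithmetic is used for.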

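Standing assumptions
Throughout this course we assume that all functions $f$:
- may be extended valued, i.e., have range $\overline{\mathbb{R}}$
- have extended domain $\mathbb{R}^n$
- are proper, i.e., $f \not\equiv +\infty$
- are minorized by an affine function, i.e., there exist $s \in \mathbb{R}^n$ and $r \in \mathbb{R}$ such that
  $$f(x) \geq \langle s, x\rangle + r \quad \text{for all } x,$$
  or equivalently
  $$r \leq \inf_x \{f(x) - \langle s, x\rangle\} \quad \text{or} \quad 0 \leq \inf_x \{f(x) - \langle s, x\rangle - r\}.$$
- Example, affine minorizer: [Figure: $f(x)$ and an affine minorizer $\langle s, x\rangle + r$.]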

Effective domain
- The effective domain of $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is the set
  $$\operatorname{dom} f = \{x \in \mathbb{R}^n \mid f(x) < \infty\}.$$

Convex functions
- A function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is convex if for all $x, y \in \mathbb{R}^n$ and $\theta \in [0, 1]$:
  $$f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta) f(y)$$
  (in extended-valued arithmetic).
- Every convex combination of two points on the graph of $f$ lies on or above the graph.
- [Figure: a nonconvex function and a convex function.]

Comparison to other definition
- A function $f : X \to \mathbb{R}$ (without extended domain) is convex if
  $$f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta) f(y)$$
  holds for all $x, y \in X$ and $\theta \in [0, 1]$, and if $X$ is convex.
- This is equivalent to the definition for functions with extended domain.
- For $f : \mathbb{R}^n \to \overline{\mathbb{R}}$, convexity of $\operatorname{dom} f$ is implicit in the convexity definition.

Why convexity?
- Local minima are also global minima!
- We can search for local minima to minimize the function.
- It is much easier to devise algorithms.

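Jensen's inequality
- Assume that $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is convex.
- Then for all collections $\{x_1, \ldots, x_k\}$ of points
  $$f\Big(\sum_{i=1}^k \theta_i x_i\Big) \leq \sum_{i=1}^k \theta_i f(x_i)$$
  where $\theta_i \geq 0$ and $\sum_{i=1}^k \theta_i = 1$.
- For $k = 2$ this reduces to the convexity definition.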
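Jensen's inequality can be spot-checked numerically. The sketch below draws random points and random convex weights and evaluates the gap between the two sides; the test function $f(x) = x^2 + |x|$, the sampling ranges, and the helper name `jensen_gap` are assumptions made only for illustration. A numerical check like this can expose a violation but cannot prove convexity.

```python
import random

def f(x):
    # Convex test function (an assumption for illustration): f(x) = x^2 + |x|.
    return x * x + abs(x)

def jensen_gap(points, weights):
    """Return sum_i theta_i f(x_i) - f(sum_i theta_i x_i); nonnegative if f is convex."""
    mean = sum(w * x for w, x in zip(weights, points))
    return sum(w * f(x) for w, x in zip(weights, points)) - f(mean)

random.seed(0)
gaps = []
for _ in range(1000):
    k = random.randint(2, 6)
    points = [random.uniform(-5.0, 5.0) for _ in range(k)]
    raw = [random.uniform(0.1, 1.0) for _ in range(k)]
    total = sum(raw)
    weights = [r / total for r in raw]  # theta_i >= 0 and sum_i theta_i = 1
    gaps.append(jensen_gap(points, weights))

# Allow a small tolerance for floating-point rounding.
print(min(gaps) >= -1e-9)
```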

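Strict and strong convexity
- A function is strictly convex if it is convex with strict inequality
  $$f(\theta x + (1-\theta)y) < \theta f(x) + (1-\theta) f(y)$$
  for all $x \neq y$ and $\theta \in (0, 1)$.
- A function is $\sigma$-strongly convex if there exists $\sigma > 0$ such that
  $$f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta) f(y) - \tfrac{\sigma}{2}\theta(1-\theta)\|x - y\|^2$$
  for every $x, y \in \mathbb{R}^n$ and $\theta \in [0, 1]$.
- Strongly convex functions are strictly convex.
- A function $f$ is $\sigma$-strongly convex iff $f - \tfrac{\sigma}{2}\|\cdot\|^2$ is convex (prove by inserting $f - \tfrac{\sigma}{2}\|\cdot\|^2$ in the convexity definition).
- A strongly convex function has at least the curvature of $\tfrac{\sigma}{2}\|\cdot\|^2$.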
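The equivalence between $\sigma$-strong convexity of $f$ and convexity of $f - \tfrac{\sigma}{2}\|\cdot\|^2$ rests on one identity for the squared Euclidean norm. The derivation below is a sketch of that step, writing $g := f - \tfrac{\sigma}{2}\|\cdot\|^2$ and $z := \theta x + (1-\theta)y$ (both names are introduced here only for the derivation).

```latex
% Identity obtained by expanding \|z\|^2 with z = \theta x + (1-\theta) y:
\begin{align*}
\theta\|x\|^2 + (1-\theta)\|y\|^2 - \|z\|^2
  &= \theta(1-\theta)\|x\|^2 - 2\theta(1-\theta)\langle x, y\rangle
     + \theta(1-\theta)\|y\|^2 \\
  &= \theta(1-\theta)\|x - y\|^2 .
\end{align*}
% Convexity of g = f - (\sigma/2)\|\cdot\|^2 reads
\begin{align*}
g(z) \le \theta g(x) + (1-\theta) g(y)
\;\Longleftrightarrow\;
f(z) &\le \theta f(x) + (1-\theta) f(y)
   - \tfrac{\sigma}{2}\bigl(\theta\|x\|^2 + (1-\theta)\|y\|^2 - \|z\|^2\bigr) \\
  &= \theta f(x) + (1-\theta) f(y)
   - \tfrac{\sigma}{2}\theta(1-\theta)\|x - y\|^2 ,
\end{align*}
% which is exactly the defining inequality of \sigma-strong convexity.
```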

Uniqueness of minimizers
- If a function is strictly (strongly) convex, its minimizer is unique whenever one exists.
- Proof: assume that $x_1 \neq x_2$ and that both are minimizers, $x_1, x_2 \in \operatorname{argmin}_x f(x)$, i.e., $f(x_1) = f(x_2) = \inf_x f(x)$. Then
  $$f(\tfrac{1}{2}x_1 + \tfrac{1}{2}x_2) < \tfrac{1}{2}(f(x_1) + f(x_2)) = \inf_x f(x),$$
  a contradiction!
- (A minimizer might not exist for a strictly convex function, but always exists for a closed strongly convex one.)

Smoothness
- A convex function is $\beta$-smooth if there exists $\beta > 0$ such that
  $$f(\theta x + (1-\theta)y) \geq \theta f(x) + (1-\theta) f(y) - \tfrac{\beta}{2}\theta(1-\theta)\|x - y\|^2$$
  for every $x, y \in \mathbb{R}^n$ and $\theta \in [0, 1]$ (and the convexity definition holds).
- The inequality is flipped compared to strong convexity.
- A convex function $f$ is $\beta$-smooth iff $\tfrac{\beta}{2}\|\cdot\|^2 - f$ is convex.
- A smooth function is continuously differentiable.
- (Sometimes higher-order differentiability is required in smoothness definitions.)

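Graphs and epigraphs
- The graph of $f$ is the set of all couples $(x, f(x)) \in \mathbb{R}^n \times \mathbb{R}$.
- The epigraph of a (proper) function $f$ is the nonempty set
  $$\operatorname{epi} f = \{(x, r) \mid f(x) \leq r\}$$
  (note that the dimension of $\operatorname{epi} f$ is $n + 1$ when the dimension of $\operatorname{dom} f$ is $n$).
- [Figure: a function with its epigraph $\operatorname{epi} f$.]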

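Epigraphs and convexity
- Let $f : \mathbb{R}^n \to \overline{\mathbb{R}}$.
- Then $f$ is convex if and only if $\operatorname{epi} f$ is a convex set in $\mathbb{R}^n \times \mathbb{R}$.
- [Figure: two epigraphs, one nonconvex and one convex.]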

Level-sets
- A (sub)level-set $S_r(f)$ of the function $f$ is defined as
  $$S_r(f) = \{x \in \mathbb{R}^n \mid f(x) \leq r\}.$$
- [Figure: level $r$ and the level-set $S_r(f)$ (slice the epigraph and project back to $\mathbb{R}^n$).]
- Level-sets of convex functions are convex.
- Even if all level-sets are convex, the function might be nonconvex.
- If all level-sets are convex, the function is quasi-convex.
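To see that convex level-sets do not imply convexity, one can look at a concrete quasi-convex function. The sketch below uses $f(x) = \sqrt{|x|}$, an example chosen here as an assumption (it does not appear in the slides): on a grid, each sampled level-set is a single contiguous interval, yet the convexity inequality fails between $x = 0$ and $y = 4$.

```python
import math

def f(x):
    # Assumed example: f(x) = sqrt(|x|) is quasi-convex but not convex.
    return math.sqrt(abs(x))

grid = [i / 100 for i in range(-400, 401)]

def level_set_is_interval(r):
    """Check that the grid points with f(x) <= r form at most one contiguous run."""
    inside = [f(x) <= r for x in grid]
    # Count the number of places where a run of 'inside' points starts.
    runs = sum(1 for prev, cur in zip([False] + inside, inside) if cur and not prev)
    return runs <= 1

all_intervals = all(level_set_is_interval(r) for r in [0.0, 0.5, 1.0, 1.5, 2.0])

# Convexity fails: the midpoint value lies strictly above the chord.
x, y, theta = 0.0, 4.0, 0.5
violates_convexity = f(theta * x + (1 - theta) * y) > theta * f(x) + (1 - theta) * f(y)

print(all_intervals, violates_convexity)
```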

Level-sets and constraint qualification
- Assume that $f : \mathbb{R}^n \to \mathbb{R}$ is convex (and finite-valued).
- Slater's constraint qualification assumes existence of $\bar{x}$ such that $f(\bar{x}) < 0$ (0 is often used to define constraints, but any level can be used).
- This clearly implies that the level-set $S_0(f)$ is nonempty.
- In fact, it implies the following statements:
  $$\operatorname{cl}\{x \mid f(x) < 0\} = \{x \mid f(x) \leq 0\}, \qquad \{x \mid f(x) < 0\} = \operatorname{int}\{x \mid f(x) \leq 0\}.$$
- Consequently:
  $$\operatorname{bd}\{x \mid f(x) \leq 0\} = \{x \mid f(x) = 0\}.$$

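Affine functions
- Affine functions: $f(x) = \langle s, x\rangle - b$.
- For any $x_0 \in \mathbb{R}^n$, affine functions can be written as
  $$f(x) = f(x_0) + \langle s, x - x_0\rangle$$
  (since $b = \langle s, x_0\rangle - f(x_0)$).
- The epigraph of an affine function is a closed half-space with non-horizontal normal vector $(s, -1) \in \mathbb{R}^n \times \mathbb{R}$:
  $$\operatorname{epi} f = \{(x, r) : r \geq \langle s, x\rangle - b\} = \{(x, r) : b \geq \langle (s, -1), (x, r)\rangle\}.$$
- [Figure: the line $\langle s, x\rangle - b = f(x_0) + \langle s, x - x_0\rangle$ with normal vector $(s, -1)$.]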

Affine minorizers
- Any proper, convex $f$ is minorized by some affine function.
- More precisely: for any $x_0 \in \operatorname{ri} \operatorname{dom} f$, there is $s$ such that
  $$f(x) \geq f(x_0) + \langle s, x - x_0\rangle,$$
  which coincides with $f$ at $x_0$.
- I.e., there is an affine function whose epigraph covers $\operatorname{epi} f$.
- A convex epigraph is supported by non-vertical hyperplanes.
- The normal vector $s$ is called a subgradient; much more on this later.

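Lower semicontinuity
- A function $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is lower semicontinuous if
  $$\liminf_{y \to x} f(y) \geq f(x).$$
- Consider the following function $f$, which is not defined at $x = 0$: [Figure: graph of $f$ with a jump at $x = 0$, endpoints marked by circles.]
- Construct a lower semicontinuous function
  $$g(x) = \begin{cases} f(x) & \text{if } x \neq 0 \\ c & \text{else} \end{cases}$$
- What can $c$ be? At or below the lower circle.
- (Upper semicontinuous if the inequality is flipped.)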

Lower semicontinuity
The following are equivalent for $f : \mathbb{R}^n \to \overline{\mathbb{R}}$:
- $f$ is lower semicontinuous (l.s.c.)
- $\operatorname{epi} f$ is a closed set
- all level-sets $S_r(f)$ are closed (may be empty)
We will call lower semicontinuous functions closed.
[Figure: one function that is l.s.c. and one that is not l.s.c.]

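Closure (lower-semicontinuous hull)
- The closure $\operatorname{cl} f$ of the function $f$ is defined by
  $$\operatorname{epi}(\operatorname{cl} f) := \operatorname{cl}(\operatorname{epi} f).$$
- A function is closed iff $\operatorname{cl} f = f$, i.e., iff $\operatorname{epi} f = \operatorname{cl}(\operatorname{epi} f)$.
- The closure is really the lower-semicontinuous hull:
  $$\operatorname{cl} f(x) = \sup\{h(x) \mid h \text{ lower-semicontinuous},\ h \leq f\}.$$
- What is the closure of $g_1$? [Figure: $g_1$ and $\operatorname{cl} g_1$.]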

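Outer construction
- The closure of a convex function $f$ is the supremum of all minorizing affine functions, i.e., $\operatorname{cl} f = g$ where
  $$g(x) = \sup_{s,b}\{\langle s, x\rangle - b : \langle s, y\rangle - b \leq f(y) \text{ for all } y \in \mathbb{R}^n\}.$$
- Let $\Sigma_1$ be all non-vertical hyperplanes that support $\operatorname{epi} f$ (solid in the figure).
- $\operatorname{epi} g$ is the intersection of the halfspaces defined by hyperplanes in $\Sigma_1$.
- Let $\Sigma_0$ be all vertical hyperplanes that support $\operatorname{epi} f$ (dashed in the figure).
- $\operatorname{cl}(\operatorname{epi} f)$ is the intersection of all halfspaces defined by $\Sigma_1$ and $\Sigma_0$ (a consequence of the strict separating hyperplane theorem).
- Prove the result by showing that the halfspaces defined by $\Sigma_0$ are redundant (i.e., the dashed line is not needed; then $\operatorname{epi} g = \operatorname{cl}(\operatorname{epi} f) = \operatorname{epi}(\operatorname{cl} f)$).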

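Continuity for convex functions
- For convex $f$, $\operatorname{cl} f(x) = f(x)$ for $x \in \operatorname{ri} \operatorname{dom} f$.
- For finite-valued convex functions, $\operatorname{ri} \operatorname{dom} f = \mathbb{R}^n$, so they are l.s.c.
- We can say more: convex $f$ are locally Lipschitz continuous. For each compact convex subset $S \subseteq \operatorname{ri} \operatorname{dom} f$ there exists $L(S)$ such that
  $$|f(x) - f(y)| \leq L(S)\|x - y\| \quad \text{for all } x \text{ and } y \text{ in } S.$$
- Consequence: convex functions $f$ are continuous on $\operatorname{ri} \operatorname{dom} f$.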

Why closed functions?
- A closed function defined on a nonempty closed and bounded set is bounded below and attains its infimum.
- This is a generalization of the Weierstrass extreme value theorem.
- [Figure: left panel "inf attained" (closed function), right panel "inf not attained" (not closed).]
- (For the supremum to be attained, upper semicontinuity is needed instead.)

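Convex hull
- The convex hull is the largest convex minorizing function, i.e.:
  $$\operatorname{conv} f(x) = \sup\{h(x) : h \text{ convex},\ h \leq f\}.$$
- The closed convex hull:
  $$\overline{\operatorname{conv}} f(x) = \sup\{h(x) : h \text{ closed convex},\ h \leq f\}.$$
- The closed convex hull can equivalently be written as
  $$\overline{\operatorname{conv}} f(x) = \sup_{s,b}\{\langle s, x\rangle - b : \langle s, y\rangle - b \leq f(y) \text{ for all } y \in \mathbb{R}^n\}$$
  (the supremum of the affine functions minorizing $f$).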

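Convex hull: example
- The figure shows the closed convex hull
  $$\overline{\operatorname{conv}} f(x) = \sup_{s,b}\{\langle s, x\rangle - b : \langle s, y\rangle - b \leq f(y) \text{ for all } y \in \mathbb{R}^n\}$$
  of a nonconvex function $f$.
- [Figure: $f(x)$ and its closed convex hull, with $f(1)$ and $\overline{\operatorname{conv}} f(1)$ marked.]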

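First-order conditions for convexity
- A differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is convex iff
  $$f(y) \geq f(x) + \langle \nabla f(x), y - x\rangle$$
  holds for all $x, y \in \mathbb{R}^n$.
- [Figure: $f(y)$, the point $(x, f(x))$, and the tangent $f(x) + \langle \nabla f(x), y - x\rangle$.]
- The function has an affine minorizer defined by $\nabla f$.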
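The first-order condition says that every tangent is a global affine minorizer. The sketch below checks this on a grid for one differentiable convex function in one dimension; the choice $f(x) = e^x + x^2$, the grid, and the name `min_gap` are assumptions for illustration.

```python
import math

def f(x):
    # Assumed example: f(x) = exp(x) + x^2, differentiable and convex.
    return math.exp(x) + x * x

def grad_f(x):
    # Derivative of the example function above.
    return math.exp(x) + 2 * x

grid = [i / 10 for i in range(-30, 31)]

# Smallest value of f(y) - [f(x) + f'(x) (y - x)] over all grid pairs;
# the first-order condition says this quantity is never negative.
min_gap = min(f(y) - (f(x) + grad_f(x) * (y - x)) for x in grid for y in grid)

print(min_gap >= -1e-9)
```

For a nonconvex function such as $\sin$, the same computation would return a clearly negative value, since some tangents cut through the graph.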

Second-order conditions for convexity
- A twice differentiable function is convex if and only if
  $$\nabla^2 f(x) \succeq 0$$
  for all $x \in \mathbb{R}^n$ (i.e., the Hessian is positive semi-definite).
- The function has non-negative curvature.

Examples of convex functions
- indicator function
  $$\iota_S(x) = \begin{cases} 0 & \text{if } x \in S \\ \infty & \text{else} \end{cases}$$
  [closed] and convex iff $S$ is [closed] and convex
- norms: $\|x\|$
- norm-squared: $\|x\|^2$
- (shortest) distance to a convex set: $\operatorname{dist}_S(y) = \inf_{x \in S}\{\|x - y\|\}$
- linear functions: $f(x) = \langle q, x\rangle$
- quadratic forms: $f(x) = \tfrac{1}{2}\langle Qx, x\rangle$ with $Q$ a positive semi-definite linear operator
- matrix fractional function: $f(x, Y) = x^T Y^{-1} x$

How to conclude convexity
Different ways to conclude convexity:
- use the convexity definition
- show that the epigraph is a convex set
- use the first- or second-order condition for convexity
- show that the function is built by convexity-preserving operations (next)

Operations that preserve convexity
- Assume that $f_j$ are convex for $j \in \{1, \ldots, m\}$.
- Assume that there exists $x$ such that $f_j(x) < \infty$ for all $j$.
- Then the positive combination
  $$f = \sum_{j=1}^m t_j f_j$$
  with $t_j > 0$ is convex.
- Proof: add the convexity definitions
  $$t_j f_j(\theta x + (1-\theta)y) \leq t_j(\theta f_j(x) + (1-\theta) f_j(y)).$$

Precomposition with affine mapping
- Let $f$ be convex and $L$ be affine; then
  $$(f \circ L)(x) := f(L(x))$$
  is convex.
- If $\operatorname{Im} L \cap \operatorname{dom} f \neq \emptyset$, then $f \circ L$ is proper.

Infimal convolution
- The infimal convolution of two functions $f, g$ is defined as
  $$(f \,\square\, g)(x) := \inf_{y \in \mathbb{R}^n}\{f(y) + g(x - y)\}.$$
- It is convex if $f$ and $g$ are convex with a common affine minorizer.
- It is closed and convex if $f, g$ are closed and convex and, e.g.:
  - $f(x) \to \infty$ as $\|x\| \to \infty$ (coercive) and $g$ is bounded from below, or
  - $f(x)/\|x\| \to \infty$ as $\|x\| \to \infty$ (super-coercive).
- In this case, the infimal convolution is set addition of epigraphs (in the other case, the strict epigraphs are equal).

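Moreau envelope
- Let $\gamma > 0$ and $f$ be closed and convex.
- The infimal convolution with $g = \tfrac{1}{2\gamma}\|\cdot\|^2$ is called the Moreau envelope:
  $$(f \,\square\, \tfrac{1}{2\gamma}\|\cdot\|^2)(x) := \min_{y \in \mathbb{R}^n}\{f(y) + \tfrac{1}{2\gamma}\|x - y\|^2\}.$$
- The argmin of this is called the proximal operator (more on this later).
- The Moreau envelope is a smooth under-estimator of $f$.
- The minimizers coincide (one can minimize the smooth envelope instead of $f$).
- Example $f(x) = |x|$: [Figure: Moreau envelopes of $|x|$ for $\gamma = 2$ and for $\gamma = 1, 2, 4$.]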
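For the example $f(x) = |x|$ the minimization in the Moreau envelope can be carried out by hand: the minimizer is the soft-thresholding of $x$ and the envelope is the Huber function. The sketch below states these closed forms and compares them with a brute-force grid minimization; the function names, the grid bounds, and the grid resolution are assumptions for illustration.

```python
def moreau_envelope_abs(x, gamma):
    """Closed form of the Moreau envelope of |.| (the Huber function)."""
    if abs(x) <= gamma:
        return x * x / (2 * gamma)
    return abs(x) - gamma / 2

def prox_abs(x, gamma):
    """Minimizer of |y| + (x - y)^2 / (2*gamma): soft-thresholding."""
    if x > gamma:
        return x - gamma
    if x < -gamma:
        return x + gamma
    return 0.0

def moreau_envelope_brute(f, x, gamma, lo=-10.0, hi=10.0, n=20001):
    """Approximate min_y f(y) + (x - y)^2 / (2*gamma) on a uniform grid."""
    step = (hi - lo) / (n - 1)
    return min(f(lo + i * step) + (x - (lo + i * step)) ** 2 / (2 * gamma)
               for i in range(n))

# Largest disagreement between closed form and grid search on a few samples.
max_error = max(
    abs(moreau_envelope_abs(x, g) - moreau_envelope_brute(abs, x, g))
    for g in (1.0, 2.0, 4.0)
    for x in (-3.0, -0.5, 0.0, 0.7, 2.5)
)
print(max_error < 1e-2)
```

The closed form also makes the two slide claims visible: the envelope never exceeds $|x|$, and both functions are minimized at $x = 0$.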

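Image of function under linear mapping
- The image function $Lf : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is defined as
  $$(Lf)(x) := \inf_y \{f(y) : Ly = x\}$$
  where $L : \mathbb{R}^m \to \mathbb{R}^n$ is linear and $f : \mathbb{R}^m \to \mathbb{R} \cup \{\infty\}$.
- $Lf$ is convex if $f$ is convex and, for every $x$, bounded below on the inverse image $\{y : Ly = x\}$.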

Examples of image functions
- Marginal function: let $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R} \cup \{\infty\}$ be convex; then (if $F$ is bounded below) the marginal function
  $$f : \mathbb{R}^n \to \mathbb{R} \cup \{\pm\infty\} : x \mapsto \inf_{y \in \mathbb{R}^m} F(x, y)$$
  is convex.
- Why? The marginal function is $f = LF$ where $L(x, y) = x$:
  $$(LF)(z) = \inf_{x,y}\{F(x, y) \mid L(x, y) = z\} = \inf_{x,y}\{F(x, y) \mid x = z\} = \inf_y \{F(z, y)\} = f(z).$$

Infimal convolution
- The infimal convolution is also an image function.
- Infimal convolution of $f_1$ and $f_2$:
  $$(f_1 \,\square\, f_2)(z) = \inf_x \{f_1(x) + f_2(z - x)\}.$$
- Introduce $g(x, y) = f_1(x) + f_2(y)$ and $L(x, y) = x + y$; then
  $$(Lg)(z) = \inf_{x,y}\{g(x, y) : L(x, y) = z\} = \inf_{x,y}\{f_1(x) + f_2(y) : x + y = z\} = \inf_x \{f_1(x) + f_2(z - x)\} = (f_1 \,\square\, f_2)(z).$$
- (Therefore the image function is not closed in the general case.)

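Supremum of convex functions
- Point-wise supremum of convex functions from a family $\{f_j\}_{j \in J}$:
  $$f := \sup\{f_j : j \in J\}.$$
- Example: $f_1 = \tfrac{1}{2}x^2 - 3x$, $f_2 = x + 2$, $f_3 = \tfrac{1}{4}x^2 + 2x$. [Figure: graphs of $f_1(x)$, $f_2(x)$, $f_3(x)$ and their point-wise supremum.]
- The supremum is convex since its epigraph is the intersection of convex epigraphs!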
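The point-wise supremum of finitely many functions is just their maximum at each point. The sketch below builds it for three functions in the spirit of the example and spot-checks midpoint convexity on a grid; the exact coefficients and signs of `f1`, `f2`, `f3` are assumptions made here for illustration, and the conclusion does not depend on them as long as each function is convex.

```python
def f1(x):
    return 0.5 * x * x - 3 * x   # assumed form of the first example function

def f2(x):
    return x + 2                 # assumed form of the second example function

def f3(x):
    return 0.25 * x * x + 2 * x  # assumed form of the third example function

def f_sup(x):
    """Point-wise supremum (here: maximum) of the three functions."""
    return max(f1(x), f2(x), f3(x))

grid = [i / 2 for i in range(-20, 21)]

# Midpoint convexity of the supremum, checked on all grid pairs.
midpoint_convex = all(
    f_sup((x + y) / 2) <= (f_sup(x) + f_sup(y)) / 2 + 1e-9
    for x in grid for y in grid
)
print(midpoint_convex)
```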

Example: conjugate functions
- The conjugate function $f^*$ is defined as
  $$f^*(s) := \sup_{x \in \mathbb{R}^n}\{\langle s, x\rangle - f(x)\}.$$
- For each $x_j \in \mathbb{R}^n$, let $r_j = f(x_j)$; then
  $$f^*(s) = \sup_{x_j}\{\langle s, x_j\rangle - r_j\}.$$
- $\langle s, x_j\rangle - r_j$ is convex (affine) in $s$ (independent of convexity of $f$).
- A supremum of a family of affine functions is convex.
- The epigraph of the conjugate is an intersection of epigraphs of affine functions, i.e., of closed half-spaces.

Draw the conjugate
- Recall:
  $$f^*(s) := \sup_{x \in \mathbb{R}^n}\{\langle s, x\rangle - f(x)\}.$$
- Draw the conjugate of $f$ ($f(x) = \infty$ outside the points). [Figure: the points $(-1, 0)$, $(0, 0.2)$, $(1, 0)$ and the affine functions $-s + 0$, $0s - 0.2$, $s + 0$.]
- What if $f(0) = -0.2$ instead?
- What if $f(0) = -0.2$ and the points are connected with straight lines?
- Each feasible $x$ defines a slope; $f(x)$ defines the vertical translation.
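When $f$ is finite only at a few points, the supremum in the conjugate is a maximum over finitely many affine functions of $s$. The sketch below evaluates it for the three points $(-1, 0)$, $(0, 0.2)$, $(1, 0)$ in one dimension; the helper name `conjugate` is hypothetical, and the second data set with the value $-0.2$ at $x = 0$ is included as an assumed variant for comparison.

```python
def conjugate(points, s):
    """f*(s) = max_j (s * x_j - r_j) for f finite only at the points (x_j, r_j)."""
    return max(s * x - r for x, r in points)

# f(-1) = 0, f(0) = 0.2, f(1) = 0, and f = +inf elsewhere.
points = [(-1.0, 0.0), (0.0, 0.2), (1.0, 0.0)]
# Assumed variant: the middle value lowered to -0.2.
variant = [(-1.0, 0.0), (0.0, -0.2), (1.0, 0.0)]

samples = [i / 10 for i in range(-20, 21)]

# With f(0) = 0.2 the middle point never attains the maximum: f*(s) = |s|.
matches_abs = all(conjugate(points, s) == abs(s) for s in samples)
# With f(0) = -0.2 the constant piece 0.2 is active near s = 0.
matches_max = all(conjugate(variant, s) == max(abs(s), 0.2) for s in samples)

print(matches_abs, matches_max)
```

Each point contributes one line in $s$: its $x$-coordinate is the slope and its function value shifts the line vertically, which is the reading suggested on the slide.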

Support functions
- The support function of a set $C$ is defined as
  $$\sigma_C(s) := \sup_{x \in C}\langle s, x\rangle.$$
- It can be written as
  $$\sigma_C(s) = \sup_x\{\langle s, x\rangle - \iota_C(x)\} =: \iota_C^*(s),$$
  i.e., it is the conjugate of the indicator function (more on general conjugate functions later).

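Support function properties
- Graphical interpretation ($\sigma_C(s) = \sup_{x \in C}\langle s, x\rangle = \langle s, x^*\rangle$ in the figure). [Figure: the set $C$, the direction $s$, the maximizer $x^*$, and the hyperplanes $\langle s, x\rangle = r_2$, $\langle s, x\rangle = r_1$, $\langle s, x\rangle = \sigma_C(s)$.]
- Put inequalities between $r_2$, $r_1$, and $\sigma_C(s)$: $0 \leq \sigma_C(s) \leq r_1 \leq r_2$.
- Suppose that $\sigma_C(s) = r$; what is $\sigma_C(2s)$? $2r$.
- Suppose that $\sigma_C(s) = r$; what is $\sigma_C(-s)$? Don't know!
- The support function is positively homogeneous of degree 1, i.e.,
  $$\sigma_C(ts) = t\sigma_C(s) \quad \text{if } t > 0.$$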

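Closure and convexity
- Assume that $C$ is nonempty; then
  $$\sigma_C(s) = \sigma_{\operatorname{cl} C}(s) = \sigma_{\operatorname{conv} C}(s).$$
- Example: the support function is the same if the (closed) convex hull is considered instead. [Figure: direction $s$ and the hyperplane $\langle s, x\rangle = \sigma_C(s) = \sigma_{\operatorname{conv} C}(s)$.]
- Therefore it is only necessary to consider closed and convex sets.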

Further properties
- The support function is convex (since it is a conjugate function).
- Functions that are convex and positively homogeneous of degree 1 are called sublinear.

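1D example
- Consider the set $C = [0.2, 1]$. [Figure: the interval $C$ from $0.2$ to $1$.]
- Draw the support function ($\sigma_C(s) = \sup_{x \in C}\langle s, x\rangle$). [Figure: graph of $\sigma_C$ against $s$, with line labels $x$ and $0.2x$.]
- The epigraph of this support function is a convex cone.
- Actually: $f$ is sublinear if and only if $\operatorname{epi} f$ is a convex cone.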
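For an interval the supremum in the support function is attained at an endpoint, which gives a two-piece linear function. The sketch below evaluates it for $C = [0.2, 1]$ and checks positive homogeneity and subadditivity on a few samples; the function name `support_interval` and the sample values are assumptions for illustration.

```python
def support_interval(a, b, s):
    """Support function of C = [a, b]: sup of s*x over C, attained at an endpoint."""
    return max(s * a, s * b)

def sigma(s):
    # Support function of the interval C = [0.2, 1] from the example.
    return support_interval(0.2, 1.0, s)

samples = [i / 4 for i in range(-12, 13)]

# Slope 1 for s >= 0 and slope 0.2 for s <= 0.
piecewise_ok = all(
    abs(sigma(s) - (s if s >= 0 else 0.2 * s)) < 1e-12 for s in samples
)
# Positive homogeneity: sigma(t*s) = t*sigma(s) for t > 0.
homogeneous = all(abs(sigma(3.0 * s) - 3.0 * sigma(s)) < 1e-12 for s in samples)
# Subadditivity, which together with homogeneity gives sublinearity.
subadditive = all(
    sigma(s1 + s2) <= sigma(s1) + sigma(s2) + 1e-12
    for s1 in samples for s2 in samples
)
print(piecewise_ok, homogeneous, subadditive)
print(sigma(1.0), sigma(-1.0))  # sigma(-s) is not determined by sigma(s)
```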

Directional derivative
- Assume that $f : \mathbb{R}^n \to \mathbb{R}$ is finite-valued and convex.
- The directional derivative of $f$ at $x$ in the direction $d$ is
  $$f'(x, d) := \lim_{t \downarrow 0} \frac{f(x + td) - f(x)}{t}.$$
- The directional derivative is convex in $d$ for fixed $x$.

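Positive homogeneity
- The directional derivative is positively homogeneous of degree 1.
- Proof: let
  $$f'(x, d_1) = \lim_{t \downarrow 0} \frac{f(x + td_1) - f(x)}{t}$$
  and set $d_2 = \alpha d_1$ for some $\alpha > 0$; then, substituting $s = \alpha t$,
  $$f'(x, d_2) = \lim_{t \downarrow 0} \frac{f(x + td_2) - f(x)}{t} = \lim_{t \downarrow 0} \frac{f(x + t\alpha d_1) - f(x)}{t} = \lim_{s \downarrow 0} \frac{f(x + sd_1) - f(x)}{s/\alpha} = \alpha f'(x, d_1).$$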

Sublinearity
- $f'(x, d)$ is convex and positively homogeneous, hence sublinear (in $d$).
- It is also finite.
- It is the support function of the subdifferential (next lecture).

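Example
- 1D example $f(x) = \tfrac{1}{2}x^2 + |x - 2|$: compute $f'(2, 1)$ and $f'(2, -1)$:
  $$f'(2, 1) = \lim_{t \downarrow 0} \frac{0.5(2 + t)^2 + |t| - 2}{t} = \lim_{t \downarrow 0} \frac{3t + \tfrac{t^2}{2}}{t} = 3$$
  $$f'(2, -1) = \lim_{t \downarrow 0} \frac{0.5(2 - t)^2 + |-t| - 2}{t} = \lim_{t \downarrow 0} \frac{-t + \tfrac{t^2}{2}}{t} = -1$$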

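Example cont'd
- Use that $f'(2, d)$ is positively homogeneous to get the explicit expression
  $$f'(2, d) = \begin{cases} 3d & \text{if } d \geq 0 \\ 1d & \text{if } d \leq 0 \end{cases}$$
  (since $f'(2, -1) = -1$, then $f'(2, -2) = -2$, etc.).
- With the origin shifted to the point of interest $(2, f(2))$, we get: [Figure: graph of $f'(2, \cdot)$ drawn at the point $(2, f(2))$.]
- The tangent cone to the epigraph of $f$ is the epigraph of the directional derivative.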
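The values in this example can be reproduced with a one-sided difference quotient at a small step. The sketch below does so for $f(x) = \tfrac{1}{2}x^2 + |x - 2|$ at $x = 2$; the helper name `dir_deriv` and the step size `t = 1e-6` are assumptions, and the quotient only approximates the limit.

```python
def f(x):
    # The example function f(x) = 0.5*x^2 + |x - 2|.
    return 0.5 * x * x + abs(x - 2)

def dir_deriv(func, x, d, t=1e-6):
    """One-sided difference quotient approximating f'(x, d) for small t > 0."""
    return (func(x + t * d) - func(x)) / t

right = dir_deriv(f, 2.0, 1.0)    # close to 3
left = dir_deriv(f, 2.0, -1.0)    # close to -1
left2 = dir_deriv(f, 2.0, -2.0)   # close to -2, by positive homogeneity

print(right, left, left2)
```

Note that `right` and `left` are not negatives of each other: the kink of $|x - 2|$ at $x = 2$ makes $d \mapsto f'(2, d)$ sublinear but not linear.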

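Example: level-sets
- Assume that $f$ is convex with the following level-sets (increasing values for larger sets). [Figure: nested level-sets with a point $x$ on a level-set boundary.]
- Draw the set of directions $d$ (from $x$) for which $f'(x, d) \leq 0$. [Figure: the same level-sets with the cone of directions drawn at $x$.]
- The set of $d$ for which $f'(x, d) \leq 0$ is the tangent cone to the level-set (under some additional assumptions).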