Lecture 1: Background on Convex Analysis

Lecture 1: Background on Convex Analysis John Duchi PCMI 2016

Outline I Convex sets 1.1 Definitions and examples 2.2 Basic properties 3.3 Projections onto convex sets 4.4 Separating and supporting hyperplanes II Convex functions 1.1 Definitions 2.2 Subgradients and directional derivatives 3.3 Optimality properties 4.4 Calculus rules

Why convex analysis and optimization? Consider problem minimize x f(x) subject to x X. When is this (efficiently) solvable? When things are convex If we can formulate a numerical problem as minimization of a convex function f over a convex set X, then (roughly) it is solvable

Convex sets Definition A set C R n is convex if for any x, y C tx + (1 t)y C for all t C

Examples Hyperplane: Let a R n, b R, C := {x R n : a, x = b}. Polyhedron: Let a 1, a 2,..., a m R n, b R m, C := {x : Ax b} = {x R n : a i, x b i }

Examples Norm balls: let be any norm, C := {x R n : x 1}

Basic properties Intersections: If C α, α A, are all convex sets, then C := is convex. α A C α Minkowski addition: If C, D are convex sets, then C + D := {x + y : x C, y D} is convex.

Basic properties Convex hulls: If x 1,..., x m are points, { m m } Conv{x 1,..., x m } := t i x i : t i 0, t i = 1. i=1 i=1

Projections Let C be a closed convex set. Define π C (x) := argmin{ x y 2 2 }. y C Existence: Assume x = 0 and that 0 C. Show that if x n is such that x n 2 inf y C y 2 then x n is Cauchy.

Projections Characterization of projection π C (x) := argmin y C { } x y 2 2 We have π C (x) is the projection of x onto C if and only if π C (x) x, y π C (x) 0 for all y C.

Separation Properties Separation of a point and convex set: Let C be closed convex and x C. Then there exists a separating hyperplane: there is v 0 and a constant b such that v, x > b and v, y b for all y C.

Separation Properties Separation of two convex sets: Let C be convex and compact and D be closed convex. Then there is a non-zero separating hyperplane v inf x C v, x > sup v, y. y D Proof: D C is closed convex and 0 D C

Supporting hyperplanes Supporting hyperplane: A hyperplane {x : v, x = b} supports the set C if C is contained in the halfspace {x : v, x b} and for some y bd C we have v, y = b

Supporting hyperplanes Let C be a convex set with x bd C. Then there exists a non-zero supporting hyperplane H passing through x. That is, a v 0 such that Proof: C {y : v, y b} and v, x = b.

Convex functions A function f is convex if its domain dom f is a convex set and for all x, y dom f we have f(tx + (1 t)y) tf(x) + (1 t)f(y) for all t [0, 1]. (Define f(z) = + for z dom f)

Convex functions and epigraphs Duality between convex sets and functions. A function f is convex if and only if its epigraph is convex epi f := {(x, t) : f(x) t, t R}

Minima of convex functions Why convex? Let x be a local minimizer of f on the convex set C. Then global minmization: Proof: Note that for y C, f(x) f(y) for all y C. f(x + t(y x)) = f((1 t)x + ty) (1 t)f(x) + tf(y).

Subgradients A vector g is a subgradient of f at x if f(y) f(x) + g, y x for all y.

Subdifferential The subdifferential (subgradient set) of f at x is f(x) := {g : f(y) f(x) + g, y x for all y}.

Subdifferential examples Let f(x) = x = max{x, x}. Then 1 if x > 0 f(x) = 1 if x < 0 [ 1, 1] if x = 0.

Existence of subgradients Theorem: Let x int dom f. Then f(x) Proof: Supporting hyperplane to epi f at the point (x, f(x))

Optimality properties Subgradient optimality A point x minimizes f if and only if 0 f(x). Immediate: f(y) f(x) + g, y x

Advanced optimality properties Subgradient optimality Consider problem minimize x f(x) subject to x C. Then x solves problem if and only if there exists g f(x) such that g, y x 0 for all y C.

Subgradient calculus Addition Let f 1,..., f m be convex and f = m i=1 f i. Then f(x) = m { m } f i (x) = g i : g i f i (x) i=1 i=1 (Extends to infinite sums, integrals, etc.)

Subgradient calculus Composition Let A R n m and f : R n R be convex, with h(x) = f(ax). Then h(x) = A T f(ax) = {A T g : g f(ax)}.

Subgradient calculus: maxima Let f i, i = 1,..., m, be convex f(x) = max f i (x) i Then with I(x) = {i : f i (x) = f(x) = max j f j (x)}, f(x) = Conv{g i : g i f i (x), i I(x)}.

Subgradient calculus: suprema Let {f α : α A} be a collection of convex functions, f(x) = sup f α (x). α A Then if the supremum is attained and A(x) = {α : f α (x) = f(x)}, f(x) Conv {g α : g α f α (x), α A(x)}

Examples of subgradients Norms: Recall l -norm,, Then x = Conv x = max x j j { } {e i : e i, x = x } { e i : e i, x = x }

Examples of subgradients General norms: Recall dual norm of norm y = sup x, y and x = sup x, y. x: x 1 y: y 1 Then x = {y : y 1, y, x = x }.