IE 521: Convex Optimization, Spring 2017, UIUC
Instructor: Niao He
Lecture 4: Convex Functions, Part I (February 1)
Scribe: Shuanglong Wang

Courtesy warning: These notes do not necessarily cover everything discussed in the class. Please email the TA (swang157@illinois.edu) if you find any typos or mistakes.

In this lecture, we cover the following topics:
- Convex Functions
- Examples
- Convexity-preserving Operations

Reference: Boyd & Vandenberghe, Chapter 3.1-3.2

4.1 Convex Function

Let $f$ be a function from $\mathbb{R}^n$ to $\mathbb{R}$. The domain of $f$ is defined as
$$\mathrm{dom}(f) = \{x \in \mathbb{R}^n : f(x) < \infty\}.$$
For example,
- $f(x) = \frac{1}{x}$, $\mathrm{dom}(f) = \mathbb{R} \setminus \{0\}$
- $f(x) = \sum_{i=1}^n x_i \ln(x_i)$, $\mathrm{dom}(f) = \mathbb{R}^n_{++} = \{x : x_i > 0,\ i = 1, \ldots, n\}$

Definition 4.1 (Convex function) A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if
(i) $\mathrm{dom}(f) \subseteq \mathbb{R}^n$ is a convex set;
(ii) $\forall x, y \in \mathrm{dom}(f)$ and $\lambda \in [0, 1]$,
$$f(\lambda x + (1-\lambda) y) \le \lambda f(x) + (1-\lambda) f(y).$$

Geometrically, the line segment between $(x, f(x))$ and $(y, f(y))$ sits above the graph of $f$.

Definitions
- A function is called strictly convex if (ii) holds with strict inequality whenever $x \neq y$ and $\lambda \in (0, 1)$, i.e. $f(\lambda x + (1-\lambda) y) < \lambda f(x) + (1-\lambda) f(y)$.
- A function is called $\alpha$-strongly convex if $f(x) - \frac{\alpha}{2}\|x\|_2^2$ is convex.
- A function is called concave if $-f(x)$ is convex.

Note that strongly convex $\Longrightarrow$ strictly convex $\Longrightarrow$ convex.
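Inequality (ii) can be probed numerically by sampling points and weights. The following is only a rough sketch, not part of the lecture: the helper names (`violates_convexity`, `sample_positive`, `sample_real`, `sine_sum`) and the sampling ranges are assumptions made for illustration. A sampled check can disprove convexity by exhibiting a violation, but it can never prove it.

```python
import numpy as np

def violates_convexity(f, sample, trials=2000, tol=1e-9, seed=0):
    """Return True if some sampled (x, y, lam) breaks inequality (ii)."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x, y = sample(rng), sample(rng)
        lam = rng.uniform(0.0, 1.0)
        lhs = f(lam * x + (1.0 - lam) * y)
        rhs = lam * f(x) + (1.0 - lam) * f(y)
        if lhs > rhs + tol:  # tol absorbs floating-point rounding
            return True
    return False

def neg_entropy(x):
    """Negative entropy sum_i x_i ln(x_i), defined on the positive orthant."""
    return float(np.sum(x * np.log(x)))

def sine_sum(x):
    """A non-convex comparison function (illustrative assumption)."""
    return float(np.sum(np.sin(x)))

def sample_positive(rng, n=3):
    # Points inside dom(f) = R^n_{++} for the negative entropy.
    return rng.uniform(0.1, 5.0, size=n)

def sample_real(rng, n=3):
    return rng.uniform(-5.0, 5.0, size=n)

print(violates_convexity(neg_entropy, sample_positive))
print(violates_convexity(sine_sum, sample_real))
```

The first call should find no violation, since the negative entropy is convex on its domain, while the second should find one quickly, since the sine is concave on parts of the sampled range.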
Figure 4.1: Convex function

4.2 Examples

1. Simple univariate functions:
- Even powers: $x^p$, where $p$ is an even positive integer
- Exponential: $e^{ax}$, $a \in \mathbb{R}$
- Negative logarithm: $-\log x$
- Absolute value: $|x|$
- Negative entropy: $x \log(x)$

2. Affine functions: $f(x) = a^T x + b$
- both convex and concave, but not strictly convex/concave

3. Some quadratic functions: $f(x) = \frac{1}{2} x^T Q x + b^T x + c$
- convex if and only if $Q \succeq 0$ is positive semi-definite
- strictly convex if and only if $Q \succ 0$ is positive definite
- special case: $f(x) = \|Ax - b\|_2^2$ is convex

4. Norms: A function $\pi(\cdot)$ is called a norm if
(a) $\pi(x) \ge 0,\ \forall x$, and $\pi(x) = 0$ iff $x = 0$
(b) $\pi(\alpha x) = |\alpha|\, \pi(x),\ \forall \alpha \in \mathbb{R}$
(c) $\pi(x + y) \le \pi(x) + \pi(y)$

Note that norms are convex: $\forall \lambda \in [0, 1]$,
$$\pi(\lambda x + (1-\lambda) y) \le \pi(\lambda x) + \pi((1-\lambda) y) = \lambda \pi(x) + (1-\lambda) \pi(y),$$
where the inequality comes from (c) and the equality comes from (b).

Examples of norms include:
- $\ell_p$-norm on $\mathbb{R}^n$: $\|x\|_p := \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$, where $p \ge 1$
- $Q$-norm on $\mathbb{R}^n$: $\|x\|_Q = \sqrt{x^T Q x}$, where $Q \succ 0$ is positive definite
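- Frobenius norm on $\mathbb{R}^{m \times n}$: $\|A\|_F = \left(\sum_{i=1}^m \sum_{j=1}^n |A_{i,j}|^2\right)^{1/2}$
- spectral norm on $\mathbb{S}^n$: $\|A\| = \max_{i=1,\ldots,n} |\lambda_i(A)|$, where the $\lambda_i$'s are the eigenvalues of $A$.

5. Indicator function:
$$I_C(x) = \begin{cases} 0, & x \in C \\ +\infty, & x \notin C \end{cases}$$
The indicator function $I_C(x)$ is convex if the set $C$ is a convex set.

6. Support function: $I_C^*(x) = \sup_{y \in C} x^T y$
The support function $I_C^*(x)$ is always convex for any set $C$.
Proof: Note that
$$\sup_{y \in C} \{f(y) + g(y)\} \le \sup_{y \in C} f(y) + \sup_{y \in C} g(y).$$
Then $\forall x_1, x_2$ and $\lambda \in [0, 1]$,
$$I_C^*(\lambda x_1 + (1-\lambda) x_2) = \sup_{y \in C} \{\lambda x_1^T y + (1-\lambda) x_2^T y\} \le \sup_{y \in C} \lambda x_1^T y + \sup_{y \in C} (1-\lambda) x_2^T y = \lambda I_C^*(x_1) + (1-\lambda) I_C^*(x_2).$$

7. More examples
- Piecewise linear functions: $\max(a_1^T x + b_1, \ldots, a_k^T x + b_k)$
- Log of exponential sums: $\log\left(\sum_{i=1}^k e^{a_i^T x + b_i}\right)$
- Negative log of determinant: $-\log(\det(X))$
How to show convexity of these functions?

4.3 Convexity-Preserving Operators

1. Taking conic combination: If $f_i(x)$, $i \in I$, are convex functions and $\alpha_i \ge 0$, $i \in I$, then
$$g(x) = \sum_{i \in I} \alpha_i f_i(x)$$
is a convex function.
Proof: The domain of $g$ is $\mathrm{dom}(g) = \bigcap_{i : \alpha_i > 0} \mathrm{dom}(f_i)$, which is convex as an intersection of convex sets. For any $x, y \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda x + (1-\lambda) y) = \sum_{i \in I} \alpha_i f_i(\lambda x + (1-\lambda) y) \le \sum_{i \in I} \alpha_i \left[\lambda f_i(x) + (1-\lambda) f_i(y)\right] = \lambda \sum_{i \in I} \alpha_i f_i(x) + (1-\lambda) \sum_{i \in I} \alpha_i f_i(y) = \lambda g(x) + (1-\lambda) g(y).$$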
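As an informal sanity check of the support-function and conic-combination statements above, one can sample the convexity inequality numerically. The sketch below is illustrative only: the finite set `C`, the weights in `conic`, and the helper `max_gap` are hypothetical choices, not taken from the lecture. Note that `C` is deliberately a finite (non-convex) set, since the support function is convex for any set.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical finite set C: 20 points in R^3 (not a convex set).
C = rng.normal(size=(20, 3))

def support(x):
    """Support function of the finite set C: max over y in C of x^T y."""
    return float(np.max(C @ x))

def abs_sum(x):
    """The l_1 norm, convex because every norm is convex."""
    return float(np.sum(np.abs(x)))

def sq_norm(x):
    """Squared Euclidean norm, a convex quadratic."""
    return float(x @ x)

def conic(x, alphas=(2.0, 0.5, 0.0)):
    """Conic combination 2*support + 0.5*abs_sum + 0*sq_norm."""
    fs = (support, abs_sum, sq_norm)
    return sum(a * f(x) for a, f in zip(alphas, fs))

def max_gap(f, trials=5000):
    """Largest sampled f(lam x + (1-lam) y) - lam f(x) - (1-lam) f(y)."""
    worst = -np.inf
    for _ in range(trials):
        x, y = rng.normal(size=3), rng.normal(size=3)
        lam = rng.uniform()
        gap = f(lam * x + (1 - lam) * y) - lam * f(x) - (1 - lam) * f(y)
        worst = max(worst, gap)
    return worst

print(max_gap(support), max_gap(conic))
```

For convex functions the largest sampled gap should be non-positive up to floating-point rounding.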
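Remark The property extends to infinite sums and integrals. If $f(x, \omega)$ is convex in $x$ for any $\omega \in \Omega$ and $\alpha(\omega) \ge 0$, $\forall \omega \in \Omega$, then
$$g(x) = \int_\Omega \alpha(\omega) f(x, \omega)\, d\omega$$
is convex if well defined.
For example, if $\eta = \eta(\omega)$ is a well-defined random variable on $\Omega$, and $f(x, \eta(\omega))$ is convex in $x$, $\forall \omega \in \Omega$, then $\mathbb{E}_\eta[f(x, \eta)]$ is a convex function.

2. Taking affine composition: If $f : \mathbb{R}^n \to \mathbb{R}$ is convex and $A(y) : y \mapsto Ay + b$ is an affine mapping from $\mathbb{R}^m$ to $\mathbb{R}^n$, then
$$g(y) := f(Ay + b)$$
is convex on $\mathbb{R}^m$.
Proof: $\mathrm{dom}(g) = \{y : Ay + b \in \mathrm{dom}(f)\}$ is convex. For all $y_1, y_2 \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda y_1 + (1-\lambda) y_2) = f(\lambda (A y_1 + b) + (1-\lambda)(A y_2 + b)) \le \lambda f(A y_1 + b) + (1-\lambda) f(A y_2 + b) = \lambda g(y_1) + (1-\lambda) g(y_2).$$
For example, $\|Ax - b\|_2^2$, $\sum_i e^{a_i^T x - b_i}$ and $-\sum_{i=1}^n \log(a_i^T x - b_i)$ are convex.

3. Taking pointwise maximum and supremum: If $f_i(x)$, $i \in I$, are convex, then
$$g(x) := \max_{i \in I} f_i(x)$$
is also convex.
Proof: First of all, $\mathrm{dom}(g) = \bigcap_{i \in I} \mathrm{dom}(f_i)$ is convex. For any $x, y \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda x + (1-\lambda) y) = \max_{i \in I} f_i(\lambda x + (1-\lambda) y) \le \max_{i \in I} \{\lambda f_i(x) + (1-\lambda) f_i(y)\} \le \max_{i \in I} \lambda f_i(x) + \max_{i \in I} (1-\lambda) f_i(y) = \lambda g(x) + (1-\lambda) g(y).$$
Remark The property extends to the pointwise supremum over an infinite set. If $f(x, \omega)$ is convex in $x$ for every $\omega \in \Omega$, then
$$g(x) := \sup_{\omega \in \Omega} f(x, \omega)$$
is convex.
For example, the following functions are convex: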
(a) piecewise linear functions: $f(x) = \max(a_1^T x + b_1, \ldots, a_k^T x + b_k)$
(b) support function: $I_C^*(x) = \sup_{y \in C} x^T y$
(c) maximum distance to any set $C$: $d_{\max}(x, C) = \max_{y \in C} \|y - x\|_2$
(d) maximum eigenvalue of a symmetric matrix: $\lambda_{\max}(X) = \max_{\|y\|_2 = 1} y^T X y$
Indeed, almost every convex function can be expressed as the pointwise supremum of a family of affine functions!

4. Taking convex monotone composition:
- scalar case: If $f$ is a convex function on $\mathbb{R}^n$ and $F(\cdot)$ is a convex and non-decreasing function on $\mathbb{R}$, then $g(x) = F(f(x))$ is convex.
- vector case: If $f_i(x)$, $i = 1, \ldots, m$, are convex on $\mathbb{R}^n$ and $F(y_1, \ldots, y_m)$ is convex and non-decreasing (component-wise) in each argument, then
$$g(x) = F(f_1(x), \ldots, f_m(x))$$
is convex.
Proof: By convexity of $f_i$, we have
$$f_i(\lambda x + (1-\lambda) y) \le \lambda f_i(x) + (1-\lambda) f_i(y), \quad \forall i,\ \forall \lambda \in [0, 1].$$
Hence, for any $x, y \in \mathrm{dom}(g)$ and $\lambda \in [0, 1]$,
$$g(\lambda x + (1-\lambda) y) = F(f_1(\lambda x + (1-\lambda) y), \ldots, f_m(\lambda x + (1-\lambda) y))$$
$$\le F(\lambda f_1(x) + (1-\lambda) f_1(y), \ldots, \lambda f_m(x) + (1-\lambda) f_m(y)) \quad \text{(by monotonicity of } F\text{)}$$
$$\le \lambda F(f_1(x), \ldots, f_m(x)) + (1-\lambda) F(f_1(y), \ldots, f_m(y)) \quad \text{(by convexity of } F\text{)}$$
$$= \lambda g(x) + (1-\lambda) g(y) \quad \text{(by definition of } g\text{)}$$
Remark Taking pointwise maximum is a special case of the above rule, by setting $F(y_1, \ldots, y_m) = \max(y_1, \ldots, y_m)$, so that
$$\max_{i=1,\ldots,m} f_i(x) = F(f_1(x), \ldots, f_m(x)).$$
For example:
(a) $e^{f(x)}$ is convex if $f$ is convex
(b) $-\log f(x)$ is convex if $f$ is concave and positive
(c) $\log\left(\sum_{i=1}^k e^{f_i}\right)$ is convex if the $f_i$ are convex.

5. Taking partial minimization: If $f(x, y)$ is convex in $(x, y) \in \mathbb{R}^n$ and $Y$ is a convex set, then
$$g(x) = \inf_{y \in Y} f(x, y)$$
is convex.
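Proof: $\mathrm{dom}(g) = \{x : (x, y) \in \mathrm{dom}(f) \text{ for some } y \in Y\}$ is a projection of $\mathrm{dom}(f)$ restricted to $y \in Y$, hence convex.
Given any $x_1, x_2 \in \mathrm{dom}(g)$, by definition of the infimum, for any $\epsilon > 0$ there exist $y_1 \in Y$, $y_2 \in Y$ such that
$$f(x_1, y_1) \le g(x_1) + \epsilon/2,$$
$$f(x_2, y_2) \le g(x_2) + \epsilon/2.$$
For any $\lambda \in [0, 1]$, multiplying the first inequality by $\lambda$ and the second by $1-\lambda$ and adding them, we have
$$\lambda f(x_1, y_1) + (1-\lambda) f(x_2, y_2) \le \lambda g(x_1) + (1-\lambda) g(x_2) + \epsilon.$$
By convexity of $f(x, y)$, this implies
$$f(\lambda x_1 + (1-\lambda) x_2,\ \lambda y_1 + (1-\lambda) y_2) \le \lambda g(x_1) + (1-\lambda) g(x_2) + \epsilon.$$
Since $Y$ is convex, $\lambda y_1 + (1-\lambda) y_2 \in Y$. Hence for any $\epsilon > 0$,
$$g(\lambda x_1 + (1-\lambda) x_2) \le \lambda g(x_1) + (1-\lambda) g(x_2) + \epsilon.$$
Letting $\epsilon \to 0$ leads to the convexity of $g$.

Examples
(a) Minimum distance to a convex set: $d(x, C) = \min_{y \in C} \|x - y\|_2$, where $C$ is convex;
(b) The function $g(x) = \inf_y \{h(y) : Ay = x\}$ is convex if $h$ is convex. This is because $g(x) = \inf_y f(x, y)$, where
$$f(x, y) := \begin{cases} h(y), & Ay = x \\ +\infty, & \text{o.w.} \end{cases}$$
is convex in $(x, y)$.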
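Example (a) can be illustrated numerically, along with the role played by the convexity of the set. The sketch below is an assumption-laden illustration rather than part of the lecture: it takes $C$ to be the box $[-1, 1]^3$ (so the minimizer is obtained by clipping), and it contrasts this with the non-convex two-point set $\{-1, +1\}$ on the real line. The function names are hypothetical.

```python
import numpy as np

def dist_to_box(x, lo=-1.0, hi=1.0):
    """d(x, C) for the box C = [lo, hi]^n; the minimizer is the clipped point."""
    return float(np.linalg.norm(x - np.clip(x, lo, hi)))

def dist_to_two_points(x):
    """Distance from a scalar x to the non-convex set {-1, +1}."""
    return min(abs(x - 1.0), abs(x + 1.0))

def jensen_gap(f, x, y, lam):
    """f(lam x + (1-lam) y) - lam f(x) - (1-lam) f(y); positive means a violation."""
    return f(lam * x + (1 - lam) * y) - lam * f(x) - (1 - lam) * f(y)

rng = np.random.default_rng(2)
gaps = [
    jensen_gap(
        dist_to_box,
        rng.uniform(-4, 4, size=3),
        rng.uniform(-4, 4, size=3),
        rng.uniform(),
    )
    for _ in range(5000)
]

# Convex set: no sampled violation is expected (up to rounding).
print(max(gaps))

# Non-convex set: the midpoint 0 of -1 and +1 is at distance 1 from the set,
# while both endpoints are at distance 0, so the gap equals 1.0.
print(jensen_gap(dist_to_two_points, -1.0, 1.0, 0.5))
```

The second check shows why the assumption that $Y$ (here $C$) is convex cannot be dropped: the distance to $\{-1, +1\}$ fails the convexity inequality at the midpoint.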