Convex Functions. Wing-Kin (Ken) Ma The Chinese University of Hong Kong (CUHK)

Course on Convex Optimization for Wireless Comm. and Signal Proc.
Jointly taught by Daniel P. Palomar and Wing-Kin (Ken) Ma
National Chiao Tung Univ., Hsinchu, Taiwan
December 19-21, 2008

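Convex Functions
A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be convex if
i) $\operatorname{dom} f$ is convex; and
ii) for any $x, y \in \operatorname{dom} f$ and $\theta \in [0, 1]$,
$$f(\theta x + (1-\theta) y) \le \theta f(x) + (1-\theta) f(y).$$
[Figure: graph of a convex $f$ with the points $(x, f(x))$ and $(y, f(y))$ marked.]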
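The defining inequality can be probed numerically. The sketch below samples random pairs of points and values of $\theta$ and looks for a violation; finding one disproves convexity, while finding none proves nothing. The helper name `find_convexity_violation`, the sampling intervals, and the tolerance are illustrative assumptions, not part of the course material.

```python
import math
import random


def find_convexity_violation(f, lo, hi, trials=10000, tol=1e-9, seed=0):
    """Search [lo, hi] for x, y, theta with
    f(theta*x + (1-theta)*y) > theta*f(x) + (1-theta)*f(y).

    Returns a violating triple (x, y, theta), or None if none was found.
    Finding none does NOT prove convexity; it is only a sanity check.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        theta = rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + tol:
            return (x, y, theta)
    return None


# x^2 is convex, so no violation should turn up.
print(find_convexity_violation(lambda x: x * x, -5.0, 5.0))
# sin is strictly concave on [0, 3], so a violation is found quickly.
print(find_convexity_violation(math.sin, 0.0, 3.0) is not None)
```

A returned triple is a concrete counterexample that can be plugged back into the inequality by hand.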

$f$ is strictly convex if $f(\theta x + (1-\theta) y) < \theta f(x) + (1-\theta) f(y)$ for all $0 < \theta < 1$ and for all $x \ne y$.
$f$ is concave if $-f$ is convex.
[Figure: three example curves, labeled "convex", "strictly convex (and of course convex)", and "non-convex".]

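1st and 2nd Order Conditions
Gradient (for differentiable $f$):
$$\nabla f(x) = \left[ \frac{\partial f(x)}{\partial x_1}, \ldots, \frac{\partial f(x)}{\partial x_n} \right]^T \in \mathbb{R}^n$$
Hessian (for twice differentiable $f$): a matrix function $\nabla^2 f(x) \in \mathbb{S}^n$ in which
$$[\nabla^2 f(x)]_{ij} = \frac{\partial^2 f(x)}{\partial x_i \, \partial x_j}$$
Taylor series:
$$f(x + \nu) = f(x) + \nabla f(x)^T \nu + \frac{1}{2} \nu^T \nabla^2 f(x) \nu + \ldots$$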

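First-order condition: A differentiable $f$ is convex iff, given any $x_0 \in \operatorname{dom} f$,
$$f(x) \ge f(x_0) + \nabla f(x_0)^T (x - x_0), \quad \forall x \in \operatorname{dom} f.$$
[Figure: graph of $f(x)$ together with its tangent $f(x_0) + \nabla f(x_0)^T (x - x_0)$ at $x_0$.]
Second-order condition: A twice differentiable $f$ is convex if and only if
$$\nabla^2 f(x) \succeq 0, \quad \forall x \in \operatorname{dom} f.$$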
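The first-order condition says that every tangent is a global underestimator of a convex function. A minimal sketch in one dimension, assuming hand-supplied derivatives (the helper `tangent_gap` and the grid are hypothetical choices made for illustration):

```python
import math


def tangent_gap(f, df, x0, x):
    """Return f(x) - [f(x0) + df(x0) * (x - x0)].

    For a convex differentiable f this gap is nonnegative for all x0 and x.
    """
    return f(x) - (f(x0) + df(x0) * (x - x0))


grid = [i / 10.0 for i in range(-30, 31)]  # sample points in [-3, 3]

# f(x) = exp(x) is convex: the gap never goes negative on the grid.
exp_min_gap = min(tangent_gap(math.exp, math.exp, x0, x) for x0 in grid for x in grid)
print(exp_min_gap >= -1e-12)

# f(x) = x^3 is not convex on R: the tangent at x0 = 1 lies above f at x = -3.
cube_gap = tangent_gap(lambda x: x ** 3, lambda x: 3 * x ** 2, 1.0, -3.0)
print(cube_gap)  # -16.0
```

The negative gap for $x^3$ matches the second-order condition as well, since its second derivative $6x$ is negative for $x < 0$.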

Examples on $\mathbb{R}$
$ax + b$ is convex. It is also concave.
$x^2$ is convex (on $\mathbb{R}$).
$|x|$ is convex.
$e^{\alpha x}$ is convex.
$\log x$ is concave on $\mathbb{R}_{++}$.
$x \log x$ is convex on $\mathbb{R}_{+}$.
$\log \int_{-\infty}^{x} e^{-t^2/2} \, dt$ is concave.

Examples on $\mathbb{R}^n$
Affine function
$$f(x) = a^T x + b = \sum_{i=1}^{n} a_i x_i + b$$
is both convex and concave.
[Figure: surface plot of $f(x)$ over $(x_1, x_2)$.]

Quadratic function
$$f(x) = x^T P x + 2 q^T x + r = \sum_{i=1}^{n} \sum_{j=1}^{n} P_{ij} x_i x_j + 2 \sum_{i=1}^{n} q_i x_i + r$$
is convex if and only if $P \succeq 0$.
[Figure: surface plots of $f(x)$ over $(x_1, x_2)$ for two choices of $P$, panels (a) and (b).]
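For a $2 \times 2$ symmetric $P$, positive semidefiniteness can be checked by hand: both diagonal entries and the determinant must be nonnegative. The sketch below uses that test and then exhibits a midpoint violation for an indefinite $P$. The matrices, `q`, `r`, and the helper names are made-up illustration values, not taken from the slides.

```python
def is_psd_2x2(P, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff both diagonal entries and the determinant are >= 0."""
    (a, b), (c, d) = P
    assert abs(b - c) <= tol, "P must be symmetric"
    return a >= -tol and d >= -tol and a * d - b * c >= -tol


def quad(P, q, r, x):
    """Evaluate f(x) = x^T P x + 2 q^T x + r for x in R^2."""
    Px = [P[0][0] * x[0] + P[0][1] * x[1], P[1][0] * x[0] + P[1][1] * x[1]]
    return x[0] * Px[0] + x[1] * Px[1] + 2 * (q[0] * x[0] + q[1] * x[1]) + r


P_convex = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues 1 and 3
P_saddle = [[1.0, 2.0], [2.0, 1.0]]  # eigenvalues -1 and 3
q, r = [0.5, -1.0], 3.0

print(is_psd_2x2(P_convex), is_psd_2x2(P_saddle))  # True False

# Along the eigenvector (1, -1) of P_saddle the midpoint inequality fails:
# f(midpoint) = 3 exceeds the average of the endpoint values, (4 + (-2)) / 2 = 1.
x, y, mid = [1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]
saddle_violates = quad(P_saddle, q, r, mid) > 0.5 * quad(P_saddle, q, r, x) + 0.5 * quad(P_saddle, q, r, y)
print(saddle_violates)  # True
```

The linear and constant terms play no role in convexity here; only the sign structure of $P$ matters.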

p-norm
$$f(x) = \|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$$
is convex for $p \ge 1$.
[Figure: (a) Region of $\|x\|_p = 1$, $p \ge 1$, with curves for $p = \infty$, $p = 2$, $p = 1$. (b) Region of $\|x\|_p = 1$, $p \le 1$, with curves for $p = 0.5$, $p = 0.3$, $p = 0.1$.]
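The role of the condition $p \ge 1$ shows up already at the midpoint of two unit vectors. A small sketch (the helper `p_norm` and the test points are illustrative assumptions):

```python
def p_norm(x, p):
    """Evaluate (sum_i |x_i|^p)^(1/p); a norm only when p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


x, y = [1.0, 0.0], [0.0, 1.0]
mid = [0.5, 0.5]

results = {}
for p in (1.0, 2.0, 0.5):
    lhs = p_norm(mid, p)                          # f at the midpoint
    rhs = 0.5 * p_norm(x, p) + 0.5 * p_norm(y, p)  # average of f at the endpoints
    results[p] = lhs <= rhs + 1e-12
    print(p, lhs, rhs, results[p])
# The midpoint inequality holds for p = 1 and p = 2 but fails for p = 0.5,
# where the midpoint value is about 2 while the average of the endpoints is 1.
```

This is the same effect visible in the unit "balls" for $p < 1$, which are not convex sets.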

Geometric mean
$$f(x) = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$$
is concave on $\mathbb{R}^n_{++}$.
Log-sum-exp
$$f(x) = \log \left( \sum_{i=1}^{n} e^{x_i} \right)$$
is convex on $\mathbb{R}^n$. (Log-sum-exp can be used as an approximation to $\max_{i=1,\ldots,n} x_i$.)
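How good the log-sum-exp approximation to the maximum is can be made concrete through the bound $\max_i x_i \le \log \sum_i e^{x_i} \le \max_i x_i + \log n$. The sketch below checks this bound and shows the common scaling trick $(1/t) \log \sum_i e^{t x_i}$; the function name `log_sum_exp` and the sample vector are assumptions made for illustration.

```python
import math


def log_sum_exp(x):
    """Numerically stable log(sum_i exp(x_i)), computed by factoring out max(x)."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))


x = [1.0, -2.0, 3.5, 3.0]
lse = log_sum_exp(x)

# Sandwich bound: max(x) <= log_sum_exp(x) <= max(x) + log(n).
print(max(x) <= lse <= max(x) + math.log(len(x)))

# Scaling sharpens the approximation: (1/t) * log_sum_exp(t * x) approaches max(x) as t grows.
for t in (1.0, 10.0, 100.0):
    print(t, log_sum_exp([t * xi for xi in x]) / t)
```

Factoring out the maximum before exponentiating avoids overflow for large entries without changing the value.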

Examples on $\mathbb{S}^n$ (intuitively less obvious)
Affine function:
$$f(X) = \operatorname{tr}(AX) + b = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ji} X_{ij} + b$$
is convex and concave.
Logarithmic determinant function:
$$f(X) = \log \det(X)$$
is concave on $\mathbb{S}^n_{++}$.
Maximum eigenvalue function:
$$f(X) = \lambda_{\max}(X) = \sup_{y \ne 0} \frac{y^T X y}{y^T y}$$
is convex on $\mathbb{S}^n$.
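Since these matrix examples are less intuitive, a numerical spot check can help. The sketch below restricts itself to $2 \times 2$ symmetric matrices, stored as triples `(a, b, d)` for $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$, where the largest eigenvalue and the determinant have closed forms. The storage convention, the helper names, and the random test matrices are all assumptions made for this illustration.

```python
import math
import random


def lam_max(X):
    """Largest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = X
    return 0.5 * (a + d) + math.hypot(0.5 * (a - d), b)


def log_det(X):
    """log det of a positive definite 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = X
    return math.log(a * d - b * b)


def mix(X, Y, theta):
    """Convex combination theta*X + (1-theta)*Y, entry by entry."""
    return tuple(theta * xi + (1 - theta) * yi for xi, yi in zip(X, Y))


def random_pd(rng):
    """Random positive definite matrix L L^T + 0.1 I with L lower triangular."""
    l11, l21, l22 = (rng.uniform(-1, 1) for _ in range(3))
    return (l11 * l11 + 0.1, l11 * l21, l21 * l21 + l22 * l22 + 0.1)


rng = random.Random(5)
eig_violations = 0
logdet_violations = 0
for _ in range(2000):
    X, Y, theta = random_pd(rng), random_pd(rng), rng.random()
    Z = mix(X, Y, theta)
    # Convexity of lambda_max: value at the mix is at most the mixed values.
    if lam_max(Z) > theta * lam_max(X) + (1 - theta) * lam_max(Y) + 1e-9:
        eig_violations += 1
    # Concavity of log det: value at the mix is at least the mixed values.
    if log_det(Z) < theta * log_det(X) + (1 - theta) * log_det(Y) - 1e-9:
        logdet_violations += 1

print(eig_violations, logdet_violations)  # 0 0
```

Positive definite samples are used for both checks only so that $\log \det$ is defined; $\lambda_{\max}$ itself is convex on all of $\mathbb{S}^n$.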

Convexity Preserving Operations
Affine transformation of the domain:
$$f \text{ convex} \Longrightarrow f(Ax + b) \text{ convex}$$
Example: (Least squares function) $f(x) = \|y - Ax\|_2$ is convex, since $f(x) = \|x\|_2$ is convex.
Example: (MIMO capacity) $f(X) = \log \det(H X H^T + I)$ is concave on $\mathbb{S}^n_{+}$, since $f(X) = \log \det(X)$ is concave on $\mathbb{S}^n_{++}$.

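Composition: Let $g : \mathbb{R}^n \to \mathbb{R}$ and $h : \mathbb{R} \to \mathbb{R}$.
$g$ convex, $h$ convex, extended-value extension of $h$ nondecreasing $\Longrightarrow$ $f(x) = h(g(x))$ convex
$g$ concave, $h$ convex, extended-value extension of $h$ nonincreasing $\Longrightarrow$ $f(x) = h(g(x))$ convex
Example: $f(x) = \|y - Ax\|_2^2$ is convex by composition, where $g(x) = \|y - Ax\|_2$ and $h(x) = \max\{0, x\}^2$. Here $h$ is convex and nondecreasing, and $h(g(x)) = g(x)^2$ because $g(x) \ge 0$.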

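Non-negative weighted sum:
$$f_1, \ldots, f_m \text{ convex}, \quad w_1, \ldots, w_m \ge 0 \Longrightarrow \sum_{i=1}^{m} w_i f_i \text{ convex}$$
Example: Regularized least squares function
$$f(x) = \|y - Ax\|_2^2 + \gamma \|x\|_2^2$$
is convex for $\gamma \ge 0$.
Extension of non-negative weighted sum:
$$f(x, y) \text{ convex in } x \text{ for each } y \in \mathcal{A}, \quad w(y) \ge 0 \text{ for each } y \in \mathcal{A} \Longrightarrow \int_{\mathcal{A}} w(y) f(x, y) \, dy \text{ convex}$$
Example: Ergodic MIMO capacity
$$f(X) = \mathrm{E}_H \{ \log \det(H X H^T + I) \} \left( = \int p(H) \log \det(H X H^T + I) \, dH \right)$$
is concave on $\mathbb{S}^n_{+}$.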

Pointwise maximum:
$$f_1, \ldots, f_m \text{ convex} \Longrightarrow f(x) = \max\{f_1(x), \ldots, f_m(x)\} \text{ convex}$$
Example: Infinity norm $f(x) = \|x\|_\infty = \max_{i=1,\ldots,n} |x_i|$ is convex because $f(x) = \max\{x_1, -x_1, x_2, -x_2, \ldots, x_n, -x_n\}$.
Pointwise supremum:
$$f(x, y) \text{ convex in } x \text{ for each } y \in \mathcal{A} \Longrightarrow \sup_{y \in \mathcal{A}} f(x, y) \text{ convex}$$
Example: Worst-case least squares function
$$f(x) = \max_{E \in \mathcal{E}} \|y - (A + E)x\|_2^2$$
is convex. ($\mathcal{E}$ does not even need to be convex!)
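The infinity-norm example can be written out directly as a maximum over the $2n$ linear functions $x_i$ and $-x_i$. A minimal sketch, with a random spot check of the convexity inequality (the helper name `inf_norm`, the dimension, and the sampling range are illustrative assumptions):

```python
import random


def inf_norm(x):
    """Infinity norm as the pointwise maximum of the 2n linear functions x_i and -x_i."""
    return max(max(xi, -xi) for xi in x)


rng = random.Random(4)
violations = 0
for _ in range(2000):
    x = [rng.uniform(-3, 3) for _ in range(4)]
    y = [rng.uniform(-3, 3) for _ in range(4)]
    theta = rng.random()
    z = [theta * xi + (1 - theta) * yi for xi, yi in zip(x, y)]
    # Convexity: f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y).
    if inf_norm(z) > theta * inf_norm(x) + (1 - theta) * inf_norm(y) + 1e-12:
        violations += 1

print(inf_norm([1.0, -4.0, 2.5]))  # 4.0
print(violations)                  # 0
```

Each of the $2n$ pieces is linear, hence convex, so the pointwise-maximum rule applies without further work.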

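Jensen's Inequality
The basic inequality for convex $f$,
$$f(\theta x + (1-\theta) y) \le \theta f(x) + (1-\theta) f(y),$$
is also called Jensen's inequality. It can be extended to
$$f\left( \sum_{i=1}^{k} \theta_i x_i \right) \le \sum_{i=1}^{k} \theta_i f(x_i)$$
where $\theta_1, \ldots, \theta_k \ge 0$ and $\sum_{i=1}^{k} \theta_i = 1$; and to
$$f\left( \int_S p(x) \, x \, dx \right) \le \int_S p(x) f(x) \, dx$$
where $p(x) \ge 0$ on $S \subseteq \operatorname{dom} f$, and $\int_S p(x) \, dx = 1$.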
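The finite-sum form of Jensen's inequality is easy to check numerically. In the sketch below, `jensen_gap` (a hypothetical helper) returns the right-hand side minus the left-hand side, which should be nonnegative for convex functions and nonpositive for concave ones; the sample points and weights are random illustration data.

```python
import math
import random


def jensen_gap(f, points, weights):
    """Return sum_i theta_i f(x_i) - f(sum_i theta_i x_i); nonnegative for convex f."""
    mean = sum(w * p for w, p in zip(weights, points))
    return sum(w * f(p) for w, p in zip(weights, points)) - f(mean)


rng = random.Random(1)
points = [rng.uniform(0.1, 5.0) for _ in range(6)]
raw = [rng.random() for _ in range(6)]
weights = [w / sum(raw) for w in raw]  # nonnegative and summing to 1

print(jensen_gap(math.exp, points, weights) >= 0)                  # exp is convex
print(jensen_gap(lambda x: -math.log(x), points, weights) >= 0)    # -log is convex on R_++
print(jensen_gap(math.log, points, weights) <= 0)                  # log is concave
```

Reading the weights as probabilities gives the familiar statement $f(\mathrm{E}[x]) \le \mathrm{E}[f(x)]$.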

Inequalities derived from Jensen's inequality:
Arithmetic-geometric inequality: $\sqrt{ab} \le (a + b)/2$ for $a, b \ge 0$
Hadamard inequality: $\det X \le \prod_{i=1}^{n} X_{ii}$ for $X \in \mathbb{S}^n_{+}$
Kullback-Leibler divergence: Let $p(x), q(x)$ be PDFs on $S$,
$$\int_S p(x) \log \left( \frac{p(x)}{q(x)} \right) dx \ge 0$$
Hölder inequality:
$$x^T y \le \|x\|_p \|y\|_q$$
where $1/p + 1/q = 1$, $p > 1$.
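Two of these inequalities can be spot-checked with a few lines of code. The sketch below tracks the smallest slack observed for the Hölder inequality and for the arithmetic-geometric inequality over random samples; the choice $p = 3$, the vector length, and the helper `p_norm` are illustrative assumptions.

```python
import math
import random


def p_norm(x, p):
    """Evaluate (sum_i |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


rng = random.Random(2)
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1

holder_slack = float("inf")
amgm_slack = float("inf")
for _ in range(1000):
    x = [rng.uniform(-1, 1) for _ in range(5)]
    y = [rng.uniform(-1, 1) for _ in range(5)]
    dot = sum(xi * yi for xi, yi in zip(x, y))
    # Hoelder: x^T y <= ||x||_p * ||y||_q
    holder_slack = min(holder_slack, p_norm(x, p) * p_norm(y, q) - dot)

    a, b = rng.uniform(0, 10), rng.uniform(0, 10)
    # Arithmetic-geometric: sqrt(a*b) <= (a + b) / 2
    amgm_slack = min(amgm_slack, (a + b) / 2 - math.sqrt(a * b))

print(holder_slack >= -1e-12, amgm_slack >= -1e-12)  # True True
```

With $p = q = 2$ the Hölder inequality reduces to the Cauchy-Schwarz inequality.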

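Epigraph
The epigraph of $f$ is
$$\operatorname{epi} f = \{ (x, t) \mid x \in \operatorname{dom} f, \; f(x) \le t \}$$
A powerful property:
$$f \text{ convex} \Longleftrightarrow \operatorname{epi} f \text{ convex}$$
For example, some convexity-preserving properties can be proven quite easily using the epigraph.
[Figure: the region $\operatorname{epi} f$ above the graph of $f(x)$, with axes $x$ and $t$.]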

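Sublevel Sets
The $\alpha$-sublevel set of $f$ is
$$S_\alpha = \{ x \in \operatorname{dom} f \mid f(x) \le \alpha \}$$
$f$ convex $\Longrightarrow$ $S_\alpha$ convex for every $\alpha$, but
$S_\alpha$ convex for every $\alpha$ $\not\Longrightarrow$ $f$ convex.
[Figure: two panels showing $f(x)$, a level $\alpha$, and the set $S_\alpha$ on the $x$ axis: "convex $f$ and convex $S_\alpha$", and "non-convex $f$ but convex $S_\alpha$".]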
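A concrete function with convex sublevel sets that is nevertheless not convex is $f(x) = \sqrt{|x|}$, whose $\alpha$-sublevel set is the interval $[-\alpha^2, \alpha^2]$ for $\alpha \ge 0$. This example is chosen here for illustration and is not taken from the slides; the grid and the level $\alpha = 1.5$ are likewise arbitrary.

```python
import math


def f(x):
    """f(x) = sqrt(|x|): every sublevel set is an interval, yet f is not convex."""
    return math.sqrt(abs(x))


def in_sublevel(x, alpha):
    return f(x) <= alpha


grid = [i / 100.0 for i in range(-400, 401)]  # grid on [-4, 4]
alpha = 1.5
members = [x for x in grid if in_sublevel(x, alpha)]

# On a grid, an interval appears as a run of grid points with no gaps.
contiguous = all(in_sublevel(x, alpha) for x in grid if members[0] <= x <= members[-1])
print(contiguous)  # True

# Convexity fails at the midpoint of 0 and 4: f(2) is about 1.414, above (0 + 2) / 2 = 1.
not_convex = f(2.0) > 0.5 * f(0.0) + 0.5 * f(4.0)
print(not_convex)  # True
```

Functions of this kind are exactly what the notion of quasiconvexity is meant to capture.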

Quasiconvex Functions
$f : \mathbb{R}^n \to \mathbb{R}$ is called quasiconvex (or unimodal) if $\operatorname{dom} f$ is convex and every $S_\alpha$ is convex.
If $f$ is convex then $f$ is quasiconvex (by the definition).
$f$ is called quasiconcave if $-f$ is quasiconvex.
$f$ is called quasilinear if $f$ is quasiconvex and quasiconcave.
Example: Linear fractional function (useful in modeling SINR)
$$f(x) = \frac{a^T x + b}{c^T x + d}$$
is quasiconvex on $\{ x \mid c^T x + d > 0 \}$, because $S_\alpha = \{ x \mid a^T x + b \le \alpha (c^T x + d) \}$ is a halfspace (restricted to the domain), and hence convex. In fact it is also quasiconcave.
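Quasilinearity of the linear fractional function means that, along any segment inside the domain, its value stays between the values at the two endpoints. The sketch below spot-checks this with made-up coefficients `a`, `b`, `c`, `d`, chosen so that $c^T x + d > 0$ on the sampled box $[-1, 1]^2$; all names and numbers are hypothetical.

```python
import random

# Hypothetical coefficients; on the box [-1, 1]^2 the denominator is at least 3 - 1.5 = 1.5 > 0.
a, b = [1.0, -2.0], 0.5
c, d = [0.5, 1.0], 3.0


def linfrac(x):
    """Evaluate (a^T x + b) / (c^T x + d) on the region where c^T x + d > 0."""
    num = a[0] * x[0] + a[1] * x[1] + b
    den = c[0] * x[0] + c[1] * x[1] + d
    assert den > 0, "x lies outside the domain"
    return num / den


rng = random.Random(3)
between = True
for _ in range(5000):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    y = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    theta = rng.random()
    z = [theta * xi + (1 - theta) * yi for xi, yi in zip(x, y)]
    fx, fy, fz = linfrac(x), linfrac(y), linfrac(z)
    # Quasiconvex: fz <= max(fx, fy). Quasiconcave: fz >= min(fx, fy).
    if not (min(fx, fy) - 1e-9 <= fz <= max(fx, fy) + 1e-9):
        between = False

print(between)  # True
```

The check passes because $f(z)$ is a weighted average of $f(x)$ and $f(y)$, with weights proportional to $\theta (c^T x + d)$ and $(1-\theta)(c^T y + d)$.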