Lecture: Examples of LP, SOCP and SDP

1/34 Lecture: Examples of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html wenzw@pku.edu.cn Acknowledgement: this slides is based on Prof. Farid Alizadeh and Prof. Lieven Vandenberghe s lecture notes

Problems with absolute values 2/34 c i x i, assume c 0 i Ax b Reformulation 1: c i z i i Ax b x i z i c i z i i Ax b z i x i z i Reformulation 2: x i = x i + xi, x i +, xi 0. Then x i = x i + + xi c i (x i + + xi ) i Ax + Ax b, x +, x 0

3/34 Problems with absolute values data fitting: Compressive sensing x Ax b x Ax b 1 x 1, Ax = b (LP) µ x 1 + 1 2 Ax + b 2 (QP, SOCP) Ax b, x 1 1

4/34 Quadratic Programg (QP) q(x) = x Qx + a x + β assume Q 0, Q = Q Ax = b x 0 q(x) = ū 2 + β 1 4 a Q 1 a, where ū = Q 1/2 x + 1 2 Q 1/2 a. equivalent SOCP u 0 ū = Q 1/2 x + 1 2 Q 1/2 a Ax = b x 0, (u 0, ū) Q 0

Quadratic constraints 5/34 q(x) = x B Bx + a x + β 0 is equivalent to where ū = ( Bx ) a x+β+1 2 (u 0, ū) Q 0, and u 0 = 1 a x β 2

Norm imization problems 6/34 Let v i = A i x + b i R n i. x i v i is equivalent to i v i0 v i = A i x + b i (v i0, v i ) Q 0 x max 1 i r v i is equivalent to t v i = A i x + b i (t, v i ) Q 0

Norm imization problems 7/34 Let v i = A i x + b i R n i. v [1],..., v [r] are the norms v 1,..., v r sorted in nonincreasing order x i v [i] is equivalent to u i + kt i v i = A i x + b i v i u i + t u i 0

Rotated Quadratic Cone 8/34 rotated cone w w xy, where x, y 0, is equivalent to ( ) 2w x + y x y Minimize the harmonic mean of positive affine functions 1/(a i x + β i ), a i x + β i > 0 i is equivalent to i u i v i = a i x + β i 1 u i v i u i 0

9/34 Logarithmic Tchebychev approximation x max 1 i r ln(a i x) ln b i Since ln(a i x) ln b i = ln max(a i x/b i, b i /a i x), the problem is equivalent to t 1 (a i x/b i )t a i x/b i t t 0 Inequalities involving geometric means ( n 1/n (a i x + b i )) t i=1

10/34 n=4 max 4 (a i x b i ) i=1 max w 3 a i x b i 0 (a 1 x b 1 )(a 2 x b 2 ) w 2 1 (a 3 x b 3 )(a 4 x b 4 ) w 2 2 w 1 w 2 w 2 3 w i 0 This can be extended to products of rational powers of affine functions

Robust linear programg 11/34 the parameters in LP are often uncertain There can be uncertainty in c, a i, b. c x a i x b i two common approaches to handling uncertainty (in a i, for simplicity) deteristic model: constraints must hold for all a i E i c x a i x b i, for all a i E i stochastic model: a i is random variable; constraints must hold with probability η c x prob(a i x b i ) η

deteristic approach via SOCP 12/34 Choose an ellipsoid as E i : E i = {ā i + P i u u 2 1}, ā i R n, P i R n n Robust LP c x a i x b i, for all a i E i is equivalent to the SOCP c x ā i x + P i x 2 b i since sup (ā i + P i u) x = ā i x + P i x 2 u 2 1

stochastic approach via SOCP a i is Gaussian with mean ā i, covariance Σ i ( a i N (ā i, Σ i ) a i x is Gaussian r.v. with mean ā i x, variance x Σ i x; hence ( prob(a bi ā ) i x i x b i ) = Φ Σ 1/2 x 2 where Φ(x) = (1/ 2π) x e t2 /2 dt is CDF of N (0, 1) robust LP c x prob(a i x b i ) η is equivalent to the SOCP c x ā i x + Φ 1 (η) Σ 1/2 x 2 b i 13/34

14/34 SDP Standard Form S n = {X R n n X = X}, S n + = {X S n X 0}, S n ++ = {X S n X 0} Define linear operator A : S n R m : A(X) = ( A 1, X,..., A m, X ), X S n. Since A(X) y = m i=1 y i A i, X = m i=1 y ia i, X, the adjoint of A: A (y) = m y i A i i=1 The SDP standard form: (P) C, X A(X) = b X 0 (D) max b y A (y) + S = C S 0

Facts on matrix calcuation If A, B R m n, then Tr(AB ) = Tr(B A) If U, V S n and Q is orthogonal, then U, V = Q UQ, Q UQ If X S n, then U = Q ΛQ, where Q Q = I and Λ is diagonal. Matrix norms: X F = λ(x) 2, X 2 = λ(x), λ(x) = diag(λ) X 0 v Xv for all v R n λ(x) 0 X = B B The dual cone of S+ n is S+ n If X 0, then X ii 0. If X ii = 0, then X ik = X ki = 0 for all k. If X 0, then PXP 0 for any P of approriate dimensions ( ) X11 X If X = 12 X12 0, then X X 11 0. 22 X 0 iff every principal submatrix is positive semidefinite (psd). 15/34

Facts on matrix calcuation ( ) A B Let U = B with A and C symmetric and A 0. Then C U 0 ( or 0) C B A 1 B 0 ( or 0). The matrix C B A 1 B is the Schur complement of A in U: ( ) ( ) ( ) ( A B I 0 A 0 I A B = 1 ) B C B A 1 I 0 C B A 1 B 0 I If A S n, then x Ax = A, xx If A 0, then A, B > 0 for every nonzero B 0 and {B 0 A, B β} is bounded for β > 0 If A, B 0, then A, B = 0 iff AB = 0 A, B S n, then A and B are commute iff AB is symmetric, iff A and B can be simultaneously diagonalized 16/34

17/34 Eigenvalue optimization imizing the largest eigenvalue λ max (A 0 + i x ia i ): λ max (A 0 + i x i A i ) can be expressed as an SDP and its dual is z max A 0, Y zi i x i A i A 0 A i, Y = k I, Y = 1 Y 0 follows from λ max (A) t A ti

Eigenvalue optimization 18/34 Let A i R m n. Minimizing the 2-norm of A(x) = A 0 + i x ia i : can be expressed as an SDP x A(x) 2 x,t t ( ) ti A(x) A(x) 0 ti Constraint follows from A 2 t A A t 2 I, t 0 ( ) ti A(x) A(x) 0 ti

Eigenvalue optimization 19/34 Let Λ k (A) indicate sum of the k largest eigenvalues of A. Then imizing Λ k (A 0 + i x ia i ): Λ k (A 0 + i x i A i ) can be expressed as an SDP kz + Tr(X) zi + X i x i A i A 0 since X 0 Λ k (A) t t kz Tr(X) 0, zi + X A, X 0

20/34 The following problems can be expressed as SDP maximizing sum of the k smallest eigenvalues of A 0 + i x ia i imizing sum of the k absolute-value-wise largest eigenvalues imizing sum of the k largest singular values of A 0 + i x ia i

Quadratically Constrained Quadratic Programg 21/34 Consider QCQP x A 0 x + 2b 0 x + c 0 assume A i S n x A i x + 2b i x + c i 0, i = 1,..., m If A 0 0 and A i = B i B i, i = 1,..., m, then it is a SOCP If A i S n but may be indefinite x A i x + 2b i x + c i = A i, xx + 2b i x + c i The original problem is equivalent to TrA 0 X + 2b 0 x + c 0 TrA i X + 2b i x + c i 0, i = 1,..., m X = xx

QCQP 22/34 If A i S n but may be indefinite ( x A i x + 2b Ai b i x + c i = i b i c i ), ( ) X x x := Ā 1 i, X X 0 is equivalent to X xx The SDP relaxation is TrA 0 X + 2b 0 x + c 0 TrA i X + 2b i x + c i 0, i = 1,..., m X xx Maxcut: max x Wx, xi 2 = 1 Phase retrieval: a i x = b i, the value of a i x is complex

Max cut 23/34 For graph (V, E) and weights w ij = w ji 0, the maxcut problem is (Q) max x 1 2 w ij (1 x i x j ), x i { 1, 1} i<j Relaxation: (P) 1 max v i R n 2 w ij (1 v i v j ), v i 2 = 1 i<j Equivalent SDP of (P): (SDP) 1 max X S n 2 w ij (1 X ij ), X ii = 1, X 0 i<j

24/34 Max cut: rounding procedure Goemans and Williamson s randomized approach Solve (SDP) to obtain an optimal solution X. Compute the decomposition X = V V, where V = [v 1, v 2,..., v n ] Generate a vector r uniformly distributed on the unit sphere, i.e., r 2 = 1 Set x i = { 1 v i r 0 1 otherwise

Max cut: theoretical results Let W be the objective function value of x and E(W) be the expected value. Then E(W) = 1 w ij arccos(v i v j ) π Goemans and Williamson showed: E(W) α 1 w ij (1 v i v j ) 2 i<j i<j where 2 θ α = 0 θπ π 1 cos θ > 0.878 Let Z(SDP) and Z (Q) be the optimal values of (SDP) and (Q) E(W) αz (SDP) αz (Q) 25/34

SDP-Representablity 26/34 What kind of problems can be expressed by SDP and SOCP? Definition: A set X R n is SDP-representable (or SDP-Rep for short) if it can be expressed linearly as the feasible region of an SDP X = { x there exist u R k such that for some A i, B j, C R m m : x i A i + u j B j + C 0 i j

SDP-Representablity 27/34 Definition: A function f (x) is SDP-Rep if its epigraph is SDP-representable epi(f ) = {(x 0, x) f (x) x 0 } If X is SDP-Rep, then x X c x is an SDP If f (x) is SDP-Rep, then x f (x) is an SDP

28/34 A calculus of SDP-Rep sets and functions SDP-Rep sets and functions remain so under finitely many applications of most convex-preserving operations. If X and Y are SDP-Rep then so are Minkowski sum X + Y intersection X Y Affine pre-image A 1 (X) if A is affine Affine map A(X) if A is affine Cartesian Product: X Y = {(x, y) x X, y Y}

29/34 SDP-Rep Functions If functions f i, i = 1,..., m and g are SDP-Rep. Then the following are SDP-Rep nonnegative sum i α if i for α i 0 maximum max i f i composition: g(f 1 (x),..., f m (x)) if f i : R n R and g : R m R Legendre transform f (y) = max x y x f (x)

Positive Polynomials 30/34 The set of nonnegative polynomials of a given degree forms a proper cone P n = {(p 0,..., p n ) p 0 + p 1 t +... + p n t t > 0 for all t I} where I is any of [a, b], [a, ) or (, ) Important fact: The cone of positive polynomials is SDP-Rep To see this we need to introduce another problem

The Moment cone 31/34 The Moment space and Moment cone Let (c 0, c 1,..., c n ) be such that there is a probability measure F where c i = t i df, for i = 0,..., n. The set of such vectors is the Moment space I The Moment cone M n = {ac There is a distribution F : c i = t i df and a 0} I

The Moment cone 32/34 The moment cone is also SDP-Rep: The discrete Hamburger moment problem: I = R, c M 2n+1 c 0 c 1... c n c 1 c 2... c n+1...... 0 c n c n+1... c 2n This is the Hankel matrix

The Moment cone 33/34 The discrete Stieltjes moment problem I = [0, ), c M m c 0 c 1... c n c 1 c 2... c n 1 c 1 c 2... c n+1 c 2 c 3... c n..... 0, and....... 0 c n c n+1... c 2n c n 1 c n... c 2n 1 where m = n 2 The Hausdorff moment problem where I = [0, 1] is similarly SDP-Rep

Moment and positive polynomial cones 34/34 P n = M n, i.e., i.e. moment cones and nonnegative polynomials are dual of each other If {u 0 (x),..., u n (x)}, x I are linearly independent functions (possibly of several variables) The cone of polynomials that can be expressed as sum of squares is SDP-Rep. if I is a one dimensional set then positive polynomials and sum of square polynomials coincide In general except for I one-dimensional positive polynomials properly include sum of square polynomials