Convex Optimization (EE227A, UC Berkeley). Lecture 6: Conic Optimization. 07 Feb 2013. Suvrit Sra.
Organizational Info
Quiz coming up on 19th Feb. Project teams due by 19th Feb. Good if you can mix your research with class projects. More info in a few days.
Mini Challenge
Kummer's confluent hypergeometric function:
M(a, c, x) := Σ_{j≥0} ((a)_j / (c)_j) (x^j / j!),  a, c, x ∈ R,
where (a)_0 = 1 and (a)_j = a(a+1)⋯(a+j−1) is the rising factorial.
Claim: Let c > a > 0 and x ≥ 0. Then the function
h_{a,c}(μ; x) := (Γ(a+μ) / Γ(c+μ)) M(a+μ, c+μ, x)
is strictly log-convex on [0, ∞) (note that h is a function of μ).
Recall: Γ(x) := ∫_0^∞ t^{x−1} e^{−t} dt is the Gamma function (which is known to be log-convex for x ≥ 1; see also Exercise 3.52 of BV).
Write ‖Ax − b‖₁ as a linear program.
min_{x∈R^n} ‖Ax − b‖₁ = min_x Σ_i |a_i^T x − b_i|
LP formulation:
min_{x,t} 1^T t  s.t.  −t_i ≤ a_i^T x − b_i ≤ t_i,  i = 1, …, m.
Exercise: Recast ‖Ax − b‖₂² + λ‖Bx‖₁ as a QP.
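A small pure-Python sanity check of the reformulation (illustrative data, no solver): for any fixed x, choosing t_i = |a_i^T x − b_i| satisfies the LP constraints and makes 1^T t equal to ‖Ax − b‖₁, so the LP and the ℓ₁ objective agree pointwise in x.

```python
def l1_residuals(A, b, x):
    """Residuals r_i = a_i^T x - b_i, computed row by row."""
    return [sum(aij * xj for aij, xj in zip(row, x)) - bi
            for row, bi in zip(A, b)]

def lp_certificate(A, b, x):
    """A feasible (t, 1^T t) pair for the LP reformulation at this x."""
    r = l1_residuals(A, b, x)
    t = [abs(ri) for ri in r]
    # the LP constraints -t_i <= r_i <= t_i hold by construction
    assert all(-ti <= ri <= ti for ri, ti in zip(r, t))
    return t, sum(t)

# tiny made-up instance
A = [[1.0, 2.0], [3.0, -1.0], [0.0, 1.0]]
b = [1.0, 2.0, -1.0]
x = [0.5, -0.25]

t, lp_obj = lp_certificate(A, b, x)
l1_obj = sum(abs(ri) for ri in l1_residuals(A, b, x))
print(lp_obj, l1_obj)  # the two objectives coincide: 2.0 2.0
```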
Cone programs: overview
Last time we briefly saw LP, QP, SOCP, SDP.
LP (standard form): min f^T x  s.t.  Ax = b, x ≥ 0.
Feasible set: X = {x | Ax = b} ∩ R^n_+ (nonnegative orthant).
Input data: (A, b, f). Structural constraint: x ≥ 0.
How should we generalize this model?
Replace the linear map x ↦ Ax by a nonlinear map? Quickly becomes nonconvex, potentially intractable.
Better: generalize the structural constraint R^n_+. Replace the nonnegative orthant by a convex cone K, and replace ≥ by a conic inequality ⪰_K.
Nesterov and Nemirovski developed an elegant theory in the late 80s: a rich class of cones for which cone programs are tractable.
Conic inequalities
We are looking for "good" vector inequalities ⪰ on R^n. Any such inequality is characterized by the set K := {x ∈ R^n | x ⪰ 0} of vectors nonnegative w.r.t. ⪰:
x ⪰ y ⟺ x − y ⪰ 0 ⟺ x − y ∈ K.
A necessary and sufficient condition for a set K ⊆ R^n to define a useful vector inequality: it should be a nonempty, pointed convex cone.
Cone programs: inequalities
K is nonempty: K ≠ ∅
K is closed under addition: x, y ∈ K ⟹ x + y ∈ K
K is closed under nonnegative scaling: x ∈ K, α ≥ 0 ⟹ αx ∈ K
K is pointed: x ∈ K and −x ∈ K ⟹ x = 0
Cone inequalities:
x ⪰_K y ⟺ x − y ∈ K
x ≻_K y ⟺ x − y ∈ int(K).
Conic inequalities
The cone underlying the standard coordinatewise vector inequality,
x ≥ y ⟺ x_i ≥ y_i ⟺ x_i − y_i ≥ 0 for all i,
is the nonnegative orthant R^n_+.
Two more important properties that R^n_+ has as a cone:
It is closed: {x^i} ⊂ R^n_+, x^i → x ⟹ x ∈ R^n_+
It has nonempty interior (contains a Euclidean ball of positive radius).
We'll require our cones to also satisfy these two properties.
Conic optimization problems
Standard form cone program:
min f^T x  s.t.  Ax = b, x ∈ K    or, in inequality form,    min f^T x  s.t.  Ax ⪰_K b.
Key examples of cones:
The nonnegative orthant R^n_+
The second-order cone Q^n := {(x, t) ∈ R^{n+1} | ‖x‖₂ ≤ t}
The semidefinite cone S^n_+ := {X = X^T ⪰ 0}
Other cones K given by Cartesian products of these.
These cones are "nice": LP, QP, SOCP, SDP are all cone programs, and can be treated theoretically in a (roughly) uniform way.
Not all cones are nice!
Cone programs: a tough case
Copositive cone
Def. Let CP_n := {A ∈ S^{n×n} | x^T A x ≥ 0 for all x ≥ 0}.
Exercise: Verify that CP_n is a convex cone.
If someone told you "convex is easy"... they lied! Testing membership in CP_n is co-NP-complete (deciding whether a given matrix is not copositive is NP-complete). Copositive cone programming: NP-hard.
Exercise: Verify that the following (Horn) matrix is copositive:
A := [  1  −1   1   1  −1
       −1   1  −1   1   1
        1  −1   1  −1   1
        1   1  −1   1  −1
       −1   1   1  −1   1 ].
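A numerical sanity check of the exercise, not a proof: sample random nonnegative vectors and confirm x^T A x ≥ 0 for the 5×5 Horn matrix (the sign pattern is assumed here; the extraction lost the signs, and the Horn matrix is the classic copositive example of this shape).

```python
import random

# Horn matrix: copositive, but not a sum of a PSD and an entrywise-nonnegative matrix
HORN = [[ 1, -1,  1,  1, -1],
        [-1,  1, -1,  1,  1],
        [ 1, -1,  1, -1,  1],
        [ 1,  1, -1,  1, -1],
        [-1,  1,  1, -1,  1]]

def quad_form(A, x):
    """Compute x^T A x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

random.seed(0)
violations = 0
for _ in range(10000):
    x = [random.random() for _ in range(5)]   # x >= 0 entrywise
    if quad_form(HORN, x) < -1e-12:
        violations += 1
print("violations:", violations)  # expect 0: x^T A x >= 0 on the nonneg orthant
```

Sampling of course cannot certify copositivity (that is exactly the co-NP-hard part); it can only fail to find a counterexample.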
SOCP in conic form
min f^T x  s.t.  ‖A_i x + b_i‖₂ ≤ c_i^T x + d_i,  i = 1, …, m.
Let A_i ∈ R^{n_i×n}, so A_i x + b_i ∈ R^{n_i}. Take
K = Q^{n_1} × Q^{n_2} × ⋯ × Q^{n_m},
A = [A_1; c_1^T; A_2; c_2^T; …; A_m; c_m^T],  b = −[b_1; d_1; b_2; d_2; …; b_m; d_m],
so that Ax − b stacks the vectors (A_i x + b_i, c_i^T x + d_i).
SOCP in conic form: min f^T x  s.t.  Ax ⪰_K b.
SOCP representation
Exercise: Let 0 ⪯ Q = LL^T. Show that
x^T Q x + b^T x + c ≤ 0 ⟺ ‖L^T x + ½L^{−1}b‖₂ ≤ (¼ b^T Q^{−1} b − c)^{1/2}.
Rotated second-order cone:
Q^n_r := {(x, y, z) ∈ R^{n+2} | ‖x‖² ≤ yz, y ≥ 0, z ≥ 0}.
Convert into a standard SOC (verify!):
‖(2x, y − z)‖₂ ≤ y + z ⟺ ‖x‖² ≤ yz.
Exercise: Rewrite the constraint x^T Q x ≤ t, where both x and t are variables, using the rotated second-order cone.
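A pure-Python check of the identity behind the conversion: for y, z ≥ 0, ‖(2x, y − z)‖₂ ≤ y + z holds exactly when ‖x‖² ≤ yz, because (y + z)² − (y − z)² = 4yz.

```python
import random

def in_rotated_cone(x, y, z):
    """Membership in Q^n_r: ||x||^2 <= y z with y, z >= 0."""
    return y >= 0 and z >= 0 and sum(xi * xi for xi in x) <= y * z

def in_standard_soc(x, y, z):
    """Membership of (2x, y - z; y + z) in the standard SOC (squares, no sqrt)."""
    lhs_sq = sum((2.0 * xi) ** 2 for xi in x) + (y - z) ** 2
    return y + z >= 0 and lhs_sq <= (y + z) ** 2

random.seed(1)
agree = True
for _ in range(10000):
    x = [random.uniform(-1.0, 1.0) for _ in range(3)]
    y = random.uniform(0.0, 2.0)
    z = random.uniform(0.0, 2.0)
    agree = agree and (in_rotated_cone(x, y, z) == in_standard_soc(x, y, z))
print(agree)  # the two membership tests coincide on all samples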
Convex QP as SOCP
min x^T Q x + c^T x  s.t.  Ax = b
⟹ min_{x,t} c^T x + t  s.t.  Ax = b, x^T Q x ≤ t
⟹ min_{x,t} c^T x + t  s.t.  Ax = b, (L^T x, t, 1) ∈ Q^n_r,
since x^T Q x = x^T L L^T x = ‖L^T x‖₂² ≤ t · 1.
Convex QCQPs as SOCPs
Quadratically constrained QP:
min q_0(x)  s.t.  q_i(x) ≤ 0,  i = 1, …, m,
where each q_i(x) = x^T P_i x + b_i^T x + c_i is a convex quadratic.
Exercise: Show how QCQPs can be cast as SOCPs using Q^n_r. Hint: See Lecture 5!
Exercise: Explain why we cannot cast SOCPs as QCQPs. That is, why can't we simply use the equivalence
‖Ax + b‖₂ ≤ c^T x + d ⟺ ‖Ax + b‖₂² ≤ (c^T x + d)², c^T x + d ≥ 0?
Hint: Look carefully at the inequality!
Robust LP
min c^T x  s.t.  a_i^T x ≤ b_i for all a_i ∈ E_i,
where E_i := {ā_i + P_i u | ‖u‖₂ ≤ 1}.
Robust half-space constraint: we wish to ensure a_i^T x ≤ b_i holds irrespective of which a_i we pick from the uncertainty set E_i. This happens iff b_i ≥ sup_{a_i∈E_i} a_i^T x, and
sup_{‖u‖₂≤1} (ā_i + P_i u)^T x = ā_i^T x + ‖P_i^T x‖₂.
We used the fact that sup_{‖u‖₂≤1} u^T v = ‖v‖₂ (recall dual norms).
SOCP formulation:
min c^T x  s.t.  ā_i^T x + ‖P_i^T x‖₂ ≤ b_i,  i = 1, …, m.
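A quick numerical check of the dual-norm fact used above: sup over the unit ball of u^T v equals ‖v‖₂, attained at u* = v/‖v‖₂, and no sampled feasible u does better (illustrative vector v, pure Python).

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

v = [0.3, -1.2, 0.7]               # illustrative "P^T x" vector
vnorm = math.sqrt(dot(v, v))
ustar = [vi / vnorm for vi in v]   # the maximizer: v normalized

random.seed(2)
best_random = 0.0
for _ in range(5000):
    u = [random.gauss(0.0, 1.0) for _ in range(3)]
    unorm = math.sqrt(dot(u, u))
    if unorm > 1.0:
        u = [ui / unorm for ui in u]   # project onto the unit ball
    best_random = max(best_random, dot(u, v))

# u* attains the sup; random feasible points never exceed it
print(abs(dot(ustar, v) - vnorm) < 1e-12, best_random <= vnorm + 1e-12)
```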
Semidefinite Program (SDP)
Cone program (semidefinite): min c^T x  s.t.  Ax = b, x ∈ K, where K is a product of semidefinite cones.
Standard form: think of x as a matrix variable X.
Wlog we may assume K = S^n_+ (why?). Say K = S^{n_1}_+ × S^{n_2}_+. Then
(X_1, X_2) ∈ K ⟺ X := Diag(X_1, X_2) ∈ S^{n_1+n_2}_+.
Thus, by forcing the off-diagonal blocks to be zero, we reduce to the case where K is the semidefinite cone itself (of suitable dimension).
So, in matrix notation: c^T x → Tr(CX); a_i^T x = b_i → Tr(A_i X) = b_i; and x ∈ K becomes X ⪰ 0.
SDP (conic form):
min_{y∈R^n} c^T y  s.t.  A(y) := A_0 + y_1 A_1 + y_2 A_2 + ⋯ + y_n A_n ⪰ 0.
Standard form SDP:
min Tr(CX)  s.t.  Tr(A_i X) = b_i, i = 1, …, m;  X ⪰ 0.
Each form can be converted into the other.
SDP: CVX form

    cvx_begin
        variable X(n,n) symmetric
        minimize( trace(C*X) )
        subject to
            for i = 1:m
                trace(A{i}*X) == b(i);
            end
            X == semidefinite(n);
    cvx_end

Note: remember symmetric and semidefinite.
SDP representation of LP
LP as SDP: min f^T x  s.t.  Ax ≤ b.
SDP formulation:
min f^T x  s.t.  A(x) := Diag(b_1 − a_1^T x, …, b_m − a_m^T x) ⪰ 0.
SDP representation of SOCP
SOCP as SDP: min f^T x  s.t.  ‖A_i^T x + b_i‖ ≤ c_i^T x + d_i,  i = 1, …, m.
SDP formulation. First,
‖x‖₂ ≤ t ⟺ [ tI  x ; x^T  t ] ⪰ 0.
Schur complements: [ A  B^T ; B  C ] ⪰ 0 ⟺ A − B^T C^{−1} B ⪰ 0 (for C ≻ 0).
Hence each SOC constraint becomes the LMI
[ (c_i^T x + d_i) I   A_i^T x + b_i ;  (A_i^T x + b_i)^T   c_i^T x + d_i ] ⪰ 0.
SDP / LMI representation
Def. A set S ⊆ R^n is linear matrix inequality (LMI) representable if there exist symmetric matrices A_0, …, A_n such that
S = {x ∈ R^n | A_0 + x_1 A_1 + ⋯ + x_n A_n ⪰ 0}.
S is SDP representable if it equals the projection of some higher-dimensional LMI representable set.
Linear inequalities: Ax ≤ b iff Diag(b_1 − a_1^T x, …, b_m − a_m^T x) ⪰ 0.
Convex quadratics: x^T L L^T x + b^T x ≤ c iff
[ I   L^T x ;  x^T L   c − b^T x ] ⪰ 0.
Eigenvalue inequalities: λ_max(X) ≤ t iff tI − X ⪰ 0; λ_min(X) ≥ t iff X − tI ⪰ 0. (λ_max is convex, λ_min concave.)
Matrix norm: for X ∈ R^{m×n}, ‖X‖₂ ≤ t (i.e., σ_max(X) ≤ t) iff
[ tI_m   X ;  X^T   tI_n ] ⪰ 0.
Proof. By Schur complements this is t²I − XX^T ⪰ 0, i.e., t² ≥ λ_max(XX^T) = σ²_max(X).
Sum of top eigenvalues: For X ∈ S^n, Σ_{i=1}^k λ_i(X) ≤ t iff there exist s ∈ R and Z ∈ S^n with
t − ks − Tr(Z) ≥ 0,  Z ⪰ 0,  Z − X + sI ⪰ 0.
Proof: Suppose Σ_{i=1}^k λ_i(X) ≤ t. Then, choosing s = λ_k and Z = Diag(λ_1 − s, …, λ_k − s, 0, …, 0) (in an eigenbasis of X), the above LMIs hold.
Conversely, if the LMIs hold then, since Z ⪰ 0 and X ⪯ Z + sI,
Σ_{i=1}^k λ_i(X) ≤ Σ_{i=1}^k (λ_i(Z) + s) ≤ Σ_{i=1}^n λ_i(Z) + ks = Tr(Z) + ks ≤ t (from the first inequality).
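The "easy direction" of the proof can be sketched for diagonal X, where the eigenvalues are just the sorted diagonal entries and all three matrix conditions reduce to entrywise checks (a sketch under that simplifying assumption, not a general PSD verifier):

```python
def certificate(diag_entries, k, t):
    """Build (s, Z) as in the proof for diagonal X and check the three conditions."""
    lam = sorted(diag_entries, reverse=True)   # eigenvalues of a diagonal X
    assert sum(lam[:k]) <= t                   # hypothesis: top-k sum <= t
    s = lam[k - 1]
    Z = [lam[i] - s if i < k else 0.0 for i in range(len(lam))]
    cond1 = t - k * s - sum(Z) >= -1e-12                            # t - ks - Tr(Z) >= 0
    cond2 = all(zi >= 0 for zi in Z)                                # Z >= 0
    cond3 = all(zi - li + s >= -1e-12 for zi, li in zip(Z, lam))    # Z - X + sI >= 0
    return cond1 and cond2 and cond3

# top-2 sum is 5 + 3 = 8 <= t = 8: the certificate conditions hold
print(certificate([5.0, 3.0, 1.0, -2.0], k=2, t=8.0))  # True
```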
Nuclear norm: for X ∈ R^{m×n}, ‖X‖_tr := Σ_i σ_i(X) ≤ t iff there exist s and Z with
t − ns − Tr(Z) ≥ 0,  Z ⪰ 0,  Z − [ 0  X ; X^T  0 ] + sI_{m+n} ⪰ 0.
Follows from:
λ([ 0  X ; X^T  0 ]) = (±σ(X), 0, …, 0).
Alternatively, we may SDP-represent the nuclear norm as
‖X‖_tr ≤ t ⟺ ∃ U, V :  [ U  X ; X^T  V ] ⪰ 0,  Tr(U + V) ≤ 2t.
The proof is slightly more involved (see lecture notes).
SDP example
Logarithmic Chebyshev approximation:
min_x max_{1≤i≤m} |log(a_i^T x) − log b_i|
|log(a_i^T x) − log b_i| = log max(a_i^T x / b_i, b_i / a_i^T x)
Reformulation:
min_{x,t} t  s.t.  1/t ≤ a_i^T x / b_i ≤ t,  i = 1, …, m.
The constraint 1/t ≤ a_i^T x / b_i (i.e., (a_i^T x / b_i)·t ≥ 1) becomes the LMI
[ a_i^T x / b_i   1 ;  1   t ] ⪰ 0,  i = 1, …, m.
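A brute-force check of the 2×2 LMI above: [ r  1 ; 1  t ] ⪰ 0 holds exactly when r ≥ 0, t ≥ 0 and r·t ≥ 1, i.e., exactly the constraint 1/t ≤ r (grid check, pure Python):

```python
def psd2(a, b, c):
    """Is the symmetric 2x2 matrix [[a, b], [b, c]] PSD? (diagonal + det test)"""
    return a >= 0 and c >= 0 and a * c - b * b >= 0

agree = True
steps = [i * 0.1 for i in range(41)]       # grid over r, t in [0, 4]
for r in steps:
    for t in steps:
        lmi = psd2(r, 1.0, t)
        direct = r >= 0 and t >= 0 and r * t >= 1.0
        agree = agree and (lmi == direct)
print(agree)  # True: the LMI matches the scalar condition on the whole grid
```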
Least-squares SDP
min ‖X − Y‖₂²  s.t.  X ⪰ 0.
Exercise 1: Try solving using CVX (assume Y^T = Y); note that ‖·‖₂ above is the operator 2-norm, not the Frobenius norm.
Exercise 2: Recast as an SDP. Hint: Begin with min_{X,t} t s.t. …
Exercise 3: Solve the two questions also with ‖X − Y‖²_F.
Exercise 4: Verify against the analytic solution X = UΛ₊U^T, where Y = UΛU^T and Λ₊ = Diag(max(0, λ_1), …, max(0, λ_n)).
SDP relaxation
Binary least-squares:
min ‖Ax − b‖₂²  s.t.  x_i ∈ {−1, +1},  i = 1, …, n.
Fundamental problem (engineering, computer science).
Nonconvex: x_i ∈ {−1, +1} gives 2^n possible solutions.
Very hard in general (even to approximate).
min x^T A^T A x − 2x^T A^T b + b^T b  s.t.  x_i² = 1
⟺ min Tr(A^T A xx^T) − 2b^T Ax  s.t.  x_i² = 1
⟺ min Tr(A^T A Y) − 2b^T Ax  s.t.  Y = xx^T, diag(Y) = 1.
Still hard: Y = xx^T is a nonconvex constraint.
SDP relaxation
Replace Y = xx^T by Y ⪰ xx^T. Thus, we obtain
min Tr(A^T A Y) − 2b^T Ax  s.t.  Y ⪰ xx^T, diag(Y) = 1.
This is an SDP, since
Y ⪰ xx^T ⟺ [ Y  x ; x^T  1 ] ⪰ 0 (Schur complements).
The optimal value gives a lower bound on binary LS.
Recover a binary x by randomized rounding.
Exercise: Try the above problem in CVX.
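For contrast, the baseline the relaxation avoids: exhaustive search over all 2^n sign vectors, fine for n = 3 but hopeless at scale (tiny illustrative data, pure Python):

```python
from itertools import product

# tiny made-up instance of binary least-squares
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
b = [1.0, -2.0, 3.0]

def residual_sq(x):
    """||Ax - b||_2^2 for a candidate sign vector x."""
    return sum((sum(aij * xj for aij, xj in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b))

# enumerate all 2^3 = 8 sign vectors; this loop is 2^n in general
best = min(product([-1.0, 1.0], repeat=3), key=residual_sq)
print(best, residual_sq(best))  # (1.0, -1.0, 1.0) 9.0
```

An SDP relaxation plus randomized rounding replaces this exponential enumeration with one polynomial-size convex solve.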
Nonconvex quadratic optimization
min x^T A x + b^T x  s.t.  x^T P_i x + b_i^T x + c_i ≤ 0,  i = 1, …, m.
Exercise: Show that x^T Q x = Tr(Q xx^T) (where Q is symmetric).
min_{X,x} Tr(AX) + b^T x  s.t.  Tr(P_i X) + b_i^T x + c_i ≤ 0, i = 1, …, m;  X = xx^T (i.e., X ⪰ 0, rank(X) = 1).
Relax the nonconvex rank-one constraint to X ⪰ xx^T.
Can be quite bad, but sometimes also quite tight.
References
1. L. Vandenberghe. MLSS 2012 lecture slides; EE236B slides.
2. A. Nemirovski. Lecture slides on modern convex optimization.