Semidefinite Programming, Combinatorial Optimization and Real Algebraic Geometry assoc. prof., Ph.D. 1 1 UNM - Faculty of information studies Edinburgh, 16. September 2014
Outline Introduction Definition of SDP Dual theory
Definition of SDP Dual theory Some notation I - unit matrix, X R m n x = vec(x ) R mn x, y = x T y = i x iy i A, B = trace(a T B) = i,j a ijb ij
Definition of SDP Dual theory Some notation Symmetric matrices S n := {X R n n : X T = X } Cone of positive semidefinite matrices S n + := {X S n : u T Xu 0, u R n } - closed convex pointed cone Positive definite matrices S ++ n := {X S n : u T Xu > 0, u R n } The dual cone: (S + n ) = {Y S n : Y, X 0, X S + n } = S + n (Note: inf X 0 X, Y = 0 Y 0) For X S + n we use X 0.
Definition of SDP Dual theory Few properties of PSD matrices For A S n the following are equivalent A S n +, all eigenvalues of A are nonnegative real numbers, there exist P and D = Diag(d), d 0, such that A = PDP T, det AII 0 for every main submatrix A II (A II is main submatrix, if A = [a ij ], for i, j I {1,..., n}). BAB T S n +, for some non-singular B.
Definition of SDP Dual theory Few properties of PSD matrices For X, Y S + n : X, Y 0 and X, Y = 0 XY = 0. Schur s complement: if A S n ++ then [ ] A B B t 0 C B t A 1 B 0. C
Definition of SDP Dual theory Primal and the dual SDP Primal semidefinite programming problem (PSDP) inf C, X s. t. A i, X = b i, i, X S n + Dual semidefinite programming problem (DSDP) s. t. sup b T y i y ia i + Z = C Z S n + PSDP and DSDP are dual of each other.
Definition of SDP Dual theory Example 1: linear programming (LP) Linear programming problem (LP) in standard primal form min c T x p. p. Ax = b x 0 LP as PSDP: C = Diag(c), X = Diag(x), A i = Diag(A(i, :)) min C, X p. p. A i, X = b i, 1 i m, X S n +
Definition of SDP Dual theory Example 2: convex quadratic programming Def. Convex quadratic programming problem (CCP): Rem. inf f 0(x) p. p. f i (x) 0, i = 1,..., m. (CCP) where: f i (x) = x t U i x v t i x z i, U i 0. Note f i (x) 0 A i = [ ] I U 1/2 i x (U 1/2 i x) t vi t x + z i 0. [ [ [ ] I 0 0 U A i = ]+x 1/2 i (:, 1) 0 U 0 z 1 ]+ 1/2 i U 1/2 i (:, n) +x n i (1, :) v i1 U 1/2. i (n, :) v in
Definition of SDP Dual theory Example 2: convex quadratic programming cnt. Finding min of f 0 (x) is equiv. to finding min. of t with additional constraint f 0 (x) t. (CCP) is equivalent to: inf t p. p. Diag(A 0, A 1,..., A m ) 0, where [ ] I U 1/2 A 0 = 0 x (U 1/2 0 x) t v0 tx + z. 0 + t
Definition of SDP Dual theory Weak duality Let us define A i, X = b i =: A(X ) = b y i A i =: A T (y) i OPT P = inf { C, X ; A(X ) = b, X 0} and OPT D = sup {b t y ; A t (y) + Z = C, Z 0, y R m }. More definitions sup =, inf =. Thm.: OPT P OPT D. Proof: Duality gap: OPT P OPT D = C, X opt b T y opt = X opt, Z opt.
Definition of SDP Dual theory Strong duality Def.: PSDP is strictly feasible, if there exists X S ++ n A(X ) = b. DSDP is strictly feasible, if there exists pair (y, Z) R m S ++ n such that A T (y) + Z = C. Thm.: Let (DSDP) be strictly feasible. Then OPTP = OPT D. such that If OPT P < then there exists X 0 s.t. A(X ) = b and C, X = OPT P.
Definition of SDP Dual theory Example 3: optimum is not attained min [ ] 1 0, X 0 0 [ ] 0 1 p. p., X = 2, X 0. 1 0 [ ] x11 1 Feasible solutions: X = with x 1 x 22 > 0 in 22 x 11 x 22 1. OPT P = 0, but OPT P is not attained. Interpretation : DSDP has no strictly feasible solution.
Definition of SDP Dual theory Example 4: positive duality gap min p. p. 0 0 0 0 0 0, X 0 0 1 0 0 1 0 0 0, X = 0, 0 0 0 1 0 0 1 0 0, X = 2, X 0. 0 0 2 OPT P = 1, OPT D = 0.
Definition of SDP Dual theory Optimal conditions for SDP Let PSDP and DSDP be strictly feasible. Thm.: X in (y, Z ) are optimal for PSDP and DSDP if and only if: (prim. dop.) A(X ) = b, X 0 (dual. dop.) A T (y) + Z = C, Z 0 (zero duality gap) XZ = 0 (b T y = C, X )
Central path Assumptions eqs. Ai, X = b i linearly independant PSDP and DSDP strictly feasible Then the system (prim. dop.) A(X ) = b, X 0 (dual. dop.) A T (y) + Z = C, Z 0 XZ = µi µ > 0 has a unique solution (X µ, y µ, Z µ ). Def.: The central path: {(X µ, y µ, Z µ ): µ > 0}.
Example PSDP min p. p. [ ] 1 0, X 0 1 [ ] 0 1, X 1 0 = 2, X 0. DSDP max 2y p. p. Z = [ 1 ] y y 1 0.
The central path (primal part)
The properties of the central path Thm: The central path is a smooth curve parameterized by µ. Thm: The central path always converge lim µ 0 (X µ, y µ, Z µ ) = (X, y, Z ). Limit point is maximally complementary primal dual optimal solution. Def: Let (X, y, Z ) be primal dual optimal solution. It is maximally complementary if rank(x ) is maximal among all primal optimal solutions and rank(y ) is maximal among all dual optimal solutions. Proof: See e.g. H. Wolkowicz, R. Saigal, L. Vanderberghe (ed.): Handbook of Semidefinite Programming. Kluwer Academic Publishers, Boston-Dordrecht-London, 2000.
Path following IPMs Idea: We solve approximately the equations defining the central path (we follow the central path approximately).
Path following IPMs - more precisely Input: A, b R m, C S n, ε > 0, σ (0, 1) and (X 0, y 0, Z 0 ) strictly feasible for PSDP in DSDP 1. Set k := 0. 2. Repeat 2.1 µ k = X k, Z k /n 2.1 Solve A(X k + X ) = b, A T (y k + y) + Z + Z = C, (X k + X )(Z k + Z) = σµ k I. 2.2 X = ( X + ( X ) T )/2. 2.3 (X k+1, y k+1, Z k+1 ) := (X k + X, y k + y, Z k + Z). 2.4 k := k + 1. 3. until X k+1, Z k+1 ε. Output: (X k, y k, Z k ).
Solving the system Note that the starting point is strictly feasible. A( X ) = 0, A T ( y) + Z = 0, XZ k + X k Z = µ k I X k Z k, FIRSTLY: Z = A T ( y) X = ( X + ( X ) T )/2, THEN: X = (µi X k Z k X k Z)Z 1 k FINALLY: A(X k A T ( y)z 1 k ) = A(µZ 1 k )+A(X k ) = b A(µZ 1 k ).
Solving the system: bottlenecks On each step we have to - compute one inverse (Z 1 k ) - compose the system matrix M y = b. Each m ij = A i, XA j Z 1 k takes O(m 2 n 2 + mn 3 ) flops. - Solve the system M y = b (takes O(m 3 ) flops) - Compute Z in X (takes O(mn 2 + n 3 ) flops)
Theoretical guaranty Thm.: Let (X 0, y 0, Z 0 ) be strictly feasible starting point with Z 1/2 XZ 1/2 µi θ X, Z. The described path following IPM gives ε-optimal solution in at most n/δ log(ε 1 X 0, S 0 ) iterations, where δ = n(1 σ). (1+θ) 2 1 (θ 2 + n(1 σ) 2 ) σθ. 2(1 θ) 3 2
(Povh, Rendl, Wiegele, 2005) Relaxed optimality conditions (primal feas.) A(X ) b, X 0 (dual feas.) A T (y) + Z C, Z 0 (zero opt. dual. gap) XZ = 0
(Povh, Rendl, Wiegele, 2005) Relaxed optimality conditions (primal feas.) A(X ) b, X 0 (dual feas.) A T (y) + Z C, Z 0 (zero opt. dual. gap) XZ = 0
Idea behind the Augmented Lagrangian approach to DSDP sup{b T y : A T y + Z = C, Z S n + } Replace DSDP by DSDP-L: min {b T y + X, C A T (y) Z + σ 2 C AT (y) Z 2 : Z 0}
BPM: algorithm INPUT: A, b, C S n. 1. Select σ > 0, X 0 0. 2. k = 0. 3. Repeat 3.1 Solve DSDP-L approximately to get y k and Z k 0. 3.2 Update X k+1 := X k + σ(z k C + A T (y k )) 3.3 k := k + 1. 4. Until: stopping criteria is satisfied. OUTPUT: X k, y k, Z k. Efficient idea: alternating y and Z.
in praxis - we use software packages Nekaj najbolj razširjenih in robustnih paketa - SEDUMI (http://sedumi.ie.lehigh.edu/). - SDPT3 (http://www.math.nus.edu.sg/ mattohkc/sdpt3.html). - SDPA (http://sdpa.sourceforge.net/). - MOSEK (http://www.mosek.com/). - YALMIP (http://users.isy.liu.se/johanl/yalmip/). To solve large SDP (m is large) we can use: - SDPLR (http://dollar.biz.uiowa.edu/ burer/software/sdplr/) - spectral bundle method (http://www-user.tu-chemnitz.de/ helmberg/sbmethod/). - boundary point method (http://www.math.uni-klu.ac.at/or/software/).
Literature 1. Miguel F. Anjos and Jean B. Lasserre. Handbook of Semidefinite, Conic and Polynomial Optimization: Theory, Algorithms, Software and Applications, volume 166 of International Series in Operational Research and Management Science. Springer, 2012 2. E. de Klerk: Aspects of Semidefinite Programming - Interior Point Algorithms and Selected Applications. Kluwer Academic Publishers, Dordrecht, 2002. 3. M. Laurent. Sums of squares, moment matrices and optimization over polynomials, volume 149 of The IMA Volumes in Mathematics and its Applications, str. 157 270. Springer, 2009. 4. J. Povh: Semidefinitno programiranje in kombinatorična optimizacija: magistrsko delo. Ljubljana, 2002 5. J. Povh: Semidefinitno programiranje. Obz. mat. fiz., 2002, letn. 49, št. 6, str. 161-173 6. H. Wolkowicz, R. Saigal, L. Vanderberghe (ed.): Handbook of Semidefinite Programming. Kluwer Academic Publishers, Boston-Dordrecht-London, 2000.