Semidefinite Programming Basics and Applications Ray Pörn, principal lecturer Åbo Akademi University Novia University of Applied Sciences
Content
- What is semidefinite programming (SDP)?
- How to represent different constraints (representability)
- Relaxation techniques
- Reformulation strategies
Convex optimization. General form: minimize f(x) subject to x ∈ X, where f is a convex function and X is a convex set. Why is convex optimization important? Many practical problems can be posed as convex programs; a local optimum is a global optimum; hard non-convex problems can be approximated with convex ones; efficient (polynomial time) algorithms exist.
Basic linear algebra and notation. Definition: A symmetric matrix A is called positive semidefinite (A ⪰ 0) if x^T A x ≥ 0 for all vectors x ∈ R^n.
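As a quick numerical sketch of this definition (not part of the original slides; the example matrices are illustrative), positive semidefiniteness of a symmetric matrix can be tested by checking that all eigenvalues are non-negative, which is equivalent to x^T A x ≥ 0 for all x:

```python
import numpy as np

def is_psd(A, tol=1e-9):
    """Return True if the symmetric matrix A is positive semidefinite."""
    return np.min(np.linalg.eigvalsh(A)) >= -tol

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])   # eigenvalues 1 and 3 -> PSD
B = np.array([[1.0, 2.0],
              [2.0, 1.0]])    # eigenvalues -1 and 3 -> not PSD
```

Here `eigvalsh` exploits symmetry; a small tolerance guards against floating-point noise at the boundary of the cone.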
The cone of positive semidefinite matrices. The cone is a convex set, so X ⪰ 0 is a convex constraint.
Hierarchy of optimization problems. An optimization problem is either a convex program (nice) or a non-convex program (not so nice). Convex programs include the conic programs SDP, SOCP and LP as well as QP; non-convex programs can be handled by convexification or by LP approximation.
Intro to semidefinite programming. Hierarchy of model classes: SDP ⊇ SOCP ⊇ QCQP ⊇ QP ⊇ LP, i.e. semidefinite programs contain second order conic programs, convex quadratically constrained quadratic programs, convex quadratic programs and linear programs as special cases. A (linear) SDP has the form:

minimize trace(CX)
subject to trace(A_1 X) = b_1, …, trace(A_m X) = b_m
X ⪰ 0

In words: minimize a linear function over the intersection of an affine set and the cone of positive semidefinite matrices.
Intro to SDP. The constant matrices C, A_1, …, A_m are assumed to be symmetric. Different notations for the same quantity:

trace(CX) = Tr(CX) = ⟨C, X⟩ = C • X = Σ_{i=1}^n Σ_{j=1}^n c_ij x_ij

trace(CX) is the natural inner product ⟨C, X⟩ in the space of symmetric matrices, and it is a linear function of the variables x_ij.
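The identity above can be verified numerically (a sketch with illustrative matrices, not from the slides):

```python
import numpy as np

# trace(CX) equals the elementwise inner product sum_ij c_ij * x_ij
C = np.array([[1.0, 2.0], [2.0, 3.0]])
X = np.array([[4.0, 0.5], [0.5, 1.0]])

lhs = np.trace(C @ X)   # trace of the matrix product
rhs = np.sum(C * X)     # elementwise product, then sum
```

Both expressions evaluate to the same number; the elementwise form makes it obvious that trace(CX) is linear in the entries of X.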
Semidefinite programming. Standard form of SDP:

minimize trace(CX)
s.t. trace(A_i X) = a_i, i = 1, …, m
X ⪰ 0

This form is often called the primal problem. It has a matrix variable X, linear equality constraints and one conic constraint (X is psd).

Equivalent form of SDP:

minimize c^T x
s.t. Bx = b
B_0 + x_1 B_1 + x_2 B_2 + … + x_n B_n ⪰ 0

This form is connected to the dual problem. It has a vector variable x, one Linear Matrix Inequality (LMI) and a set of linear equalities.
Example SDP with

C = [1 2; 2 3], A_1 = [1 −1; −1 2], A_2 = [2 3; 3 0], a = (8, 6)

and symmetric matrix variable X = [x_1 x_2; x_2 x_3]. Then

trace(CX) = x_1 + 4x_2 + 3x_3
trace(A_1 X) = x_1 − 2x_2 + 2x_3
trace(A_2 X) = 2x_1 + 6x_2

so the SDP reads

minimize x_1 + 4x_2 + 3x_3
s.t. x_1 − 2x_2 + 2x_3 = 8
2x_1 + 6x_2 = 6
[x_1 x_2; x_2 x_3] ⪰ 0

Decompose the matrix variable:

X = x_1 [1 0; 0 0] + x_2 [0 1; 1 0] + x_3 [0 0; 0 1]

and define c = (1, 4, 3)^T, b = a, B = [1 −2 2; 2 6 0]. The equivalent LMI form becomes

minimize c^T x
s.t. Bx = b
x_1 [1 0; 0 0] + x_2 [0 1; 1 0] + x_3 [0 0; 0 1] ⪰ 0
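The three trace identities in this example can be checked numerically at an arbitrary test point (a sketch; the point (x1, x2, x3) is chosen freely and is not from the slides):

```python
import numpy as np

# Data of the example SDP
C  = np.array([[1.0,  2.0], [ 2.0, 3.0]])
A1 = np.array([[1.0, -1.0], [-1.0, 2.0]])
A2 = np.array([[2.0,  3.0], [ 3.0, 0.0]])

# Arbitrary test point for the symmetric variable X = [[x1, x2], [x2, x3]]
x1, x2, x3 = 1.5, -0.5, 2.0
X = np.array([[x1, x2], [x2, x3]])

t_C  = np.trace(C  @ X)   # should equal x1 + 4*x2 + 3*x3
t_A1 = np.trace(A1 @ X)   # should equal x1 - 2*x2 + 2*x3
t_A2 = np.trace(A2 @ X)   # should equal 2*x1 + 6*x2
```

Because both sides are linear in (x1, x2, x3), agreement at a generic point confirms the coefficient calculations.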
Representability - LP. A set of linear inequality constraints:

2x + 3y ≤ 10
x + 2y ≤ 5

is equivalent to 10 − 2x − 3y ≥ 0 and 5 − x − 2y ≥ 0, i.e. to the diagonal LMI

[10 − 2x − 3y, 0; 0, 5 − x − 2y] ⪰ 0

since a diagonal matrix is PSD iff all diagonal elements are non-negative. Hence any LP can be written as an SDP.
Representability - a convex quadratic constraint. Consider 4x² − 10x + 2 ≤ 0. Recall that for a 2×2 matrix: [a b; b c] ⪰ 0 ⟺ a ≥ 0, c ≥ 0 and ac − b² ≥ 0. Hence

4x² − 10x + 2 ≤ 0 ⟺ (10x − 2) − (2x)² ≥ 0 ⟺ [1, 2x; 2x, 10x − 2] ⪰ 0

What about a concave quadratic constraint, 4x² − 10x + 2 ≥ 0? Its solution set is non-convex, so it admits no such LMI representation.
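The equivalence between the scalar constraint and the 2×2 LMI can be tested on a grid of points (a numerical sketch, not from the slides):

```python
import numpy as np

# 4x^2 - 10x + 2 <= 0 should hold exactly when [[1, 2x], [2x, 10x-2]] is PSD
def quad_ok(x):
    return 4 * x**2 - 10 * x + 2 <= 1e-9

def lmi_ok(x):
    M = np.array([[1.0, 2 * x], [2 * x, 10 * x - 2.0]])
    return np.min(np.linalg.eigvalsh(M)) >= -1e-9

# Compare the two conditions on a grid straddling both roots of the quadratic
agree = all(bool(quad_ok(x)) == bool(lmi_ok(x))
            for x in np.linspace(-1.0, 3.0, 81))
```

The grid covers points inside and outside the feasible interval, so agreement exercises both directions of the equivalence.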
Representability - QP. A general convex quadratic constraint: x^T Q x + q^T x + q_0 ≤ 0 with Q ⪰ 0. Factor Q = R^T R:

x^T Q x + q^T x + q_0 ≤ 0 ⟺ (Rx)^T (Rx) + q^T x + q_0 ≤ 0 ⟺ (Rx)^T I (Rx) ≤ −q^T x − q_0

which by the Schur complement is the LMI

[I, Rx; (Rx)^T, −q^T x − q_0] ⪰ 0
Representability - convex QCQP. Each constraint of a convex QCQP is nonlinear in x. Using the reformulations above (directly via the Schur complement, or alternatively by introducing an auxiliary matrix variable W), every constraint becomes linear in the enlarged set of variables, so a convex QCQP is representable as an SDP.
Representability - SOCP. A second order conic constraint: ‖Qx + d‖ ≤ g^T x + h. Squaring (with g^T x + h ≥ 0):

‖Qx + d‖² ≤ (g^T x + h)² ⟺ (g^T x + h) − (Qx + d)^T (Qx + d) / (g^T x + h) ≥ 0

which by the Schur complement is the LMI

[(g^T x + h) I, Qx + d; (Qx + d)^T, g^T x + h] ⪰ 0
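This "arrow" LMI can be checked numerically against the original norm constraint (a sketch; the data Q, d, g, h and the test points are illustrative):

```python
import numpy as np

# ||Qx + d|| <= g^T x + h versus PSD-ness of the arrow matrix
Q = np.array([[1.0, 0.0], [0.0, 2.0]])
d = np.array([1.0, -1.0])
g = np.array([1.0, 1.0])
h = 3.0

def soc_ok(x):
    return np.linalg.norm(Q @ x + d) <= g @ x + h + 1e-9

def lmi_ok(x):
    u = Q @ x + d
    t = g @ x + h
    n = len(u)
    M = np.block([[t * np.eye(n), u.reshape(-1, 1)],
                  [u.reshape(1, -1), np.array([[t]])]])
    return np.min(np.linalg.eigvalsh(M)) >= -1e-9

pts = [np.array([0.0, 0.0]), np.array([2.0, 2.0]),
       np.array([-4.0, 0.0]), np.array([0.0, 5.0])]
agree = all(bool(soc_ok(x)) == bool(lmi_ok(x)) for x in pts)
```

The test points include both feasible points and points violating the cone constraint (including one with a negative right-hand side g^T x + h).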
Representability - the Schur complement. For a symmetric block matrix M = [A, B; B^T, C] with A ≻ 0:

M ⪰ 0 ⟺ C − B^T A^{−1} B ⪰ 0

This lemma is the workhorse behind the representations above: it turns certain nonlinear (quadratic or fractional) inequalities into LMIs.
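The lemma is easy to test numerically (a sketch with illustrative blocks A, B, C):

```python
import numpy as np

# Schur complement lemma: for M = [[A, B], [B^T, C]] with A positive
# definite, M is PSD iff the Schur complement C - B^T A^{-1} B is PSD.
A = np.array([[4.0, 1.0], [1.0, 3.0]])   # positive definite
B = np.array([[1.0], [2.0]])
C = np.array([[2.0]])

M = np.block([[A, B], [B.T, C]])
schur = C - B.T @ np.linalg.inv(A) @ B

m_psd = np.min(np.linalg.eigvalsh(M)) >= -1e-9
s_psd = np.min(np.linalg.eigvalsh(schur)) >= -1e-9
```

For this data both conditions hold; perturbing C downward makes both fail together, in line with the equivalence.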
Nonlinear matrix inequalities. Riccati inequality (P variable; A, B, Q fixed and R fixed and positive definite):

A^T P + PA + P B R^{−1} B^T P + Q ⪯ 0 ⟺ [−A^T P − PA − Q, PB; B^T P, R] ⪰ 0

by the Schur complement. Fractional quadratic inequalities and more general matrix inequalities containing inverse or quadratic terms can be linearized into LMIs in the same way.
Application: SDP relaxation of MAX-CUT. MAX-CUT problem: Consider an undirected graph G = (V, E) with n vertices and edge weights w_ij ≥ 0 (w_ij = w_ji) for all edges (i, j) ∈ E. Find a subset S of V such that the sum of the weights of the edges that cross from S to V \ S (the complement of S) is maximized.

MAX-CUT (combinatorial) formulation:

maximize (1/4) Σ_{i=1}^n Σ_{j=1}^n w_ij (1 − x_i x_j)
s.t. x_j ∈ {−1, 1}, j = 1, …, n

MAX-CUT is a non-convex, quadratic combinatorial problem. The problem is NP-hard (very difficult to solve to optimality for large instances).
Let X = xx^T, that is X_ij = x_i x_j, and note that x_j ∈ {−1, 1} ⟺ x_j² = 1 ⟺ X_jj = 1. This leads to an equivalent formulation of MAX-CUT:

maximize (1/4) Σ_{i,j} w_ij − (1/4) Σ_{i,j} w_ij X_ij
s.t. X_jj = 1, j = 1, …, n
X = xx^T

We relax the problematic rank-1 constraint X = xx^T to X ⪰ 0 and denote by W the matrix with elements w_ij.

SDP relaxation of MAX-CUT:

maximize (1/4) Σ_{i,j} w_ij − (1/4) trace(WX)
s.t. X_jj = 1, j = 1, …, n
X ⪰ 0
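The combinatorial objective really does count cut weight: for any sign vector x, (1/4) Σ_ij w_ij (1 − x_i x_j) equals the total weight of the edges crossing the induced partition. A brute-force sketch on a small illustrative triangle graph (data not from the slides):

```python
import numpy as np
from itertools import product

# Symmetric weight matrix of a 3-vertex graph, zero diagonal
W = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])

def objective(x):
    """MAX-CUT objective (1/4) * sum_ij w_ij * (1 - x_i x_j)."""
    return 0.25 * np.sum(W * (1 - np.outer(x, x)))

def cut_weight(x):
    """Sum of w_ij over edges (i < j) whose endpoints lie on different sides."""
    n = len(x)
    return sum(W[i, j] for i in range(n) for j in range(i + 1, n)
               if x[i] != x[j])

identity_holds = all(
    abs(objective(np.array(x)) - cut_weight(np.array(x))) < 1e-9
    for x in product([-1, 1], repeat=3))

best = max(objective(np.array(x)) for x in product([-1, 1], repeat=3))
```

For this triangle the best cut separates vertex 3 from the rest, with weight 2 + 3 = 5; the SDP relaxation above would give an upper bound on this value.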
Classification. Separation of two sets of points in R^n. Set 1: X = {x_1, x_2, …, x_N}. Set 2: Y = {y_1, y_2, …, y_M}. Find a function f(x) that separates X and Y as well as possible, i.e. f(x_i) > 0 and f(y_j) < 0 for as many points as possible. Linear discrimination: f(x) = a^T x + b (an LP).
Classification. Quadratic convex discrimination: f(x) = x^T P x + q^T x + r. Assumption: the separation surface is ellipsoidal (P ≺ 0) and contains all points of X and none of the points of Y. This leads to an SDP feasibility-type problem; one standard formulation uses slack variables u, v that measure constraint violation:

minimize 1^T u + 1^T v
s.t. x_i^T P x_i + q^T x_i + r ≥ 1 − u_i, i = 1, …, N
y_j^T P y_j + q^T y_j + r ≤ −(1 − v_j), j = 1, …, M
u ≥ 0, v ≥ 0, P ≺ 0
SDP representability. We have seen, for example, that convex quadratic constraints x^T P x + q^T x + r ≤ 0 and second order cone constraints ‖Ax + b‖ ≤ c^T x + d can be represented by linear semidefinite constraints, also called LMIs. Such constraints (sets) are semidefinite representable (SDr). Definition: a convex function is called SDr if its epigraph is SDr. We will see that a wide variety of convex functions admit an SDr. This means that the modeling abilities of SDP are far greater than those of LP, QP, QCQP and SOCP programming.
Eigenvalue formulations using SDP. SDP allows modeling of functions that involve eigenvalues and singular values of matrices. Largest eigenvalue of a symmetric matrix A: λ_max(A) ≤ t ⟺ tI − A ⪰ 0. Spectral norm of a symmetric matrix: ‖A‖ ≤ t ⟺ −tI ⪯ A ⪯ tI.
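The epigraph characterization λ_max(A) ≤ t ⟺ tI − A ⪰ 0 can be verified on a small example (a sketch with an illustrative matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3

def lmi_ok(t):
    """Check t*I - A >= 0 via the smallest eigenvalue."""
    return np.min(np.linalg.eigvalsh(t * np.eye(2) - A)) >= -1e-9

lam_max = np.max(np.linalg.eigvalsh(A))  # should be 3
```

Values of t at or above λ_max satisfy the LMI, values below it do not, so the set {(A, t) : λ_max(A) ≤ t} is exactly the LMI-feasible set.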
Singular value formulations using SDP. The representation of singular values of a general rectangular matrix A follows from the symmetric block matrix [0, A; A^T, 0], whose eigenvalues are ± the singular values of A. The largest singular value (the operator norm): σ_max(A) ≤ t ⟺ [tI, A; A^T, tI] ⪰ 0. The sum of the p largest singular values is also SDr.
Combinatorial problem: 0-1 Quadratic Program (0-1 QP). A standard 0-1 QP has the form:

minimize x^T Q x + q^T x
s.t. Ax = a
Bx ≤ b
x ∈ {0, 1}^n

where Q, A, B are matrices and q, a, b are vectors of appropriate dimensions. Some applications include: Max-Cut of a graph (unconstrained), knapsack problems (inequality constrained), graph bipartitioning, task allocation, quadratic assignment problems, Coulomb glass, Boolean least squares.
SDP relaxation of 0-1 QP. Relax the binary x into a positive semidefinite matrix variable:

X = xx^T → X − xx^T ⪰ 0 ⟺ [1, x^T; x, X] ⪰ 0

(by the Schur complement). A quadratic expression in x is linear in X:

x^T Q x = Q • X = Σ_i Σ_j Q_ij X_ij

Binary condition: x_i ∈ {0, 1} ⟺ x_i² − x_i = 0 ⟹ X_ii = x_i, i.e. diag(X) = x.

Semidefinite relaxation (gives a tight lower bound on the 0-1 QP):

minimize Q • X + q^T x
s.t. Ax = a
Bx ≤ b
diag(X) = x
[1, x^T; x, X] ⪰ 0
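The key facts behind this lifting can be checked numerically (a sketch with an illustrative binary point and matrix Q):

```python
import numpy as np

x = np.array([1.0, 0.0, 1.0])   # a binary point
X = np.outer(x, x)              # the exact lifting X = x x^T

# The bordered matrix [[1, x^T], [x, X]] is rank-1 PSD for the exact lifting
M = np.block([[np.array([[1.0]]), x.reshape(1, -1)],
              [x.reshape(-1, 1), X]])

Q = np.array([[1.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

quad = x @ Q @ x                # quadratic in x
lin = np.sum(Q * X)             # <Q, X>, linear in X
diag_matches = np.allclose(np.diag(X), x)   # X_ii = x_i for binary x
m_psd = np.min(np.linalg.eigvalsh(M)) >= -1e-9
```

The relaxation keeps these linear consequences (diag(X) = x, the bordered LMI) while dropping the non-convex rank-1 condition itself.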
Convexification of 0-1 QPs. Basic approach: if Q is indefinite, add sufficiently large quadratic terms to the diagonal and subtract the same amount from the linear terms. Recall that x_i ∈ {0, 1} ⟹ x_i² = x_i.

Example: f(x) = x^T [1 3; 3 2] x = x_1² + 6x_1x_2 + 2x_2² (indefinite).

The same function on {0,1}×{0,1}:

f(x) = x^T [3 3; 3 5] x − (2, 3)^T x = 3x_1² + 6x_1x_2 + 5x_2² − 2x_1 − 3x_2 (positive semidefinite)
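The example can be verified by enumerating all four binary points (a numerical sketch of the slide's data):

```python
import numpy as np
from itertools import product

# Indefinite form and its convexified counterpart from the example
Q    = np.array([[1.0, 3.0], [3.0, 2.0]])
Qbar = np.array([[3.0, 3.0], [3.0, 5.0]])
d    = np.array([2.0, 3.0])

# On {0,1}^2 the two functions agree because x_i^2 = x_i there
same_on_binary = all(
    abs(x @ Q @ x - (x @ Qbar @ x - d @ x)) < 1e-9
    for x in (np.array(v, dtype=float) for v in product([0, 1], repeat=2)))

qbar_psd = np.min(np.linalg.eigvalsh(Qbar)) >= -1e-9
q_indefinite = (np.min(np.linalg.eigvalsh(Q)) < 0
                < np.max(np.linalg.eigvalsh(Q)))
```

The convexified form thus has the same binary optimum but is a convex function on all of R², so its continuous relaxation gives a valid lower bound.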
Convexification of 0-1 QPs. The following are equivalent (Q = Q^T): the quadratic function f(x) = x^T Q x is convex on R^n; the matrix Q is positive semidefinite (Q ⪰ 0); all eigenvalues of Q are non-negative (λ_i ≥ 0). A sufficient condition for convexity: a diagonally dominant symmetric matrix with non-negative diagonal is PSD. Definition: a matrix Q is diagonally dominant if Q_ii ≥ Σ_{j≠i} |Q_ij| for all i.
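The sufficient condition can be illustrated numerically (a sketch; the matrix is illustrative):

```python
import numpy as np

def diag_dominant(Q):
    """Check Q_ii >= sum_{j != i} |Q_ij| for every row i."""
    off = np.sum(np.abs(Q), axis=1) - np.abs(np.diag(Q))
    return bool(np.all(np.diag(Q) >= off))

# Symmetric, non-negative diagonal, diagonally dominant -> PSD (Gershgorin)
Q = np.array([[3.0, -1.0, 1.0],
              [-1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

dominant = diag_dominant(Q)
psd = np.min(np.linalg.eigvalsh(Q)) >= -1e-9
```

The converse fails: a PSD matrix need not be diagonally dominant, which is why diagonal dominance gives only a crude (but cheap) convexifying diagonal shift.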
Convexification of 0-1 QPs. Example: minimize x^T Q x s.t. x ∈ {0, 1}^4, with

Q = [1 2 −3 2; 2 2 3 4; −3 3 2 0; 2 4 0 −2], eig(Q) = (−5.17, −1.04, 0.95, 8.26)

a) Diagonal dominance: increase the diagonal until it dominates,

Q̄ = [7 2 −3 2; 2 9 3 4; −3 3 6 0; 2 4 0 6], q = (6, 7, 4, 8), eig(Q̄) = (1.66, 4.90, 6.88, 14.56)

minimize x^T Q̄ x − q^T x s.t. x ∈ [0, 1]^4: optimal value = −5.93

b) Minimum eigenvalue: shift the diagonal by −λ_min(Q) = 5.17,

Q̄ = [6.17 2 −3 2; 2 7.17 3 4; −3 3 7.17 0; 2 4 0 3.17], q = (5.17, 5.17, 5.17, 5.17), eig(Q̄) = (0, 4.13, 6.12, 13.43)

minimize x^T Q̄ x − q^T x s.t. x ∈ [0, 1]^4: optimal value = −5.34
Convexification of 0-1 QPs. c) The best diagonal: the QCR (SDP based) method computes the diagonal perturbation that gives the largest value of the relaxation.

Q̄ = [2.93 2 −3 2; 2 4.28 3 4; −3 3 6.83 0; 2 4 0 6.20], q = (1.93, 2.28, 4.83, 8.20), eig(Q̄) = (0, 1.31, 6.71, 12.21)

minimize x^T Q̄ x − q^T x s.t. x ∈ [0, 1]^4: optimal value = −4.08

The original problem, minimize x^T Q x s.t. x ∈ {0, 1}^4, has optimal value −3.

Bounding: −5.93 ≤ −5.34 ≤ −4.08 ≤ −3
Convexification of 0-1 QPs. Recall the relaxation into a positive semidefinite matrix variable:

X = xx^T → X − xx^T ⪰ 0 ⟺ [1, x^T; x, X] ⪰ 0

A quadratic expression in x is linear in X: x^T Q x = Q • X = Σ_i Σ_j Q_ij X_ij. Binary condition: x_i ∈ {0, 1} ⟺ x_i² − x_i = 0 ⟹ X_ii = x_i.

Semidefinite relaxation:

minimize Q • X + q^T x
s.t. Ax = a, Bx ≤ b, diag(X) = x, [1, x^T; x, X] ⪰ 0
Deriving the optimal diagonal. Lagrangian relaxation of the 0-1 QP:

f(x, λ, μ, δ) = x^T Q x + q^T x + λ^T(Ax − a) + μ^T(Bx − b) + Σ_{i=1}^n δ_i (x_i² − x_i)
= x^T (Q + Diag(δ)) x + (q + A^T λ + B^T μ − δ)^T x − λ^T a − μ^T b
= x^T Q̄ x + q̄^T x + c̄

Lagrangian dual problem:

sup_{δ, λ, μ} inf_{x ∈ R^n} x^T Q̄ x + q̄^T x + c̄

which equals the semidefinite program

maximize t
s.t. [−t + c̄, (1/2) q̄^T; (1/2) q̄, Q̄] ⪰ 0
δ ∈ R^n, λ ∈ R^m, μ ∈ R^k_+
Convexification of 0-1 QPs. Primal (SDP relaxation):

minimize Q • X + q^T x
s.t. Ax = a, Bx ≤ b, diag(X) = x, [1, x^T; x, X] ⪰ 0

Dual:

maximize t
s.t. [−t + c̄, (1/2) q̄^T; (1/2) q̄, Q̄] ⪰ 0, δ ∈ R^n, λ ∈ R^m, μ ∈ R^k_+

Solution of the dual gives optimal multipliers δ*, λ*, μ*; solution of the primal gives optimal values x* and X*. The multipliers δ* from the constraints x_i² = x_i are used to construct the best diagonal perturbation of the matrix Q according to Q̄ = Q + Diag(δ*).
Summary. A short introduction to semidefinite programming: SDP is a very general class of convex programs that includes LP, QP and SOCP as special cases. SDP can be used, for example, for relaxation and reformulation of hard combinatorial problems. SDP has many applications in modern control theory, statistics, mechanics and various problems connected to eigenvalues and singular values.