CSC2411 - Linear Programming and Combinatorial Optimization
Lecture 10: Semidefinite Programming

Notes taken by Mike Jamieson
March 28, 2005

Lecture notes for a course given by Avner Magen, Dept. of Computer Science, University of Toronto.

Summary: In this lecture we introduce semidefinite programming (SDP). We describe SDP as a generalization of linear programming (LP) and show how an LP and other problems can be written as an SDP. We present vector programming (VP) as another form of SDP and discuss duality in the SDP context. Finally, we note some important differences between LP and SDP.

1 Components of Semidefinite Programming

Today we go beyond LP into semidefinite programming (SDP). The development of SDP was one of the major successes of the 1990s. It is a special case of convex optimization with many useful properties.

1.1 LP and Closed Cones

Recall that in standard form, an LP may be written as

    $\min \langle c, x \rangle$
    such that $\langle a_i, x \rangle = b_i, \quad i = 1, \dots, m$
              $x \geq 0.$

If we define $K = (\mathbb{R}^+)^n = \{x \mid x \geq 0\}$, then we can rewrite the last condition above as $x \in K$.

Definition 1.1. A set $K \subseteq \mathbb{R}^n$ is a closed cone if $x, y \in K$ and $\alpha, \beta \in \mathbb{R}^+$ imply $\alpha x + \beta y \in K$, and $K$ is closed.
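To make the cone view concrete, here is a minimal sketch (made-up data; the use of scipy is our choice, not the notes') of the standard form with $K = (\mathbb{R}^+)^n$:

```python
# A minimal sketch of the LP standard form above:
#   min <c, x>  s.t.  <a_i, x> = b_i (i = 1..m),  x in K = (R+)^n.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0, 3.0])        # objective vector (made up)
A_eq = np.array([[1.0, 1.0, 1.0]])   # a single equality constraint a_1
b_eq = np.array([1.0])               # b_1

# bounds=(0, None) is exactly the cone constraint x >= 0.
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.x, res.fun)                # x = (1, 0, 0), objective 1.0
```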

1.2 Positive Semidefinite Matrices

Definition 1.2. An $n \times n$ matrix $X$ is positive semidefinite (PSD) if $\forall a \in \mathbb{R}^n$, $a^t X a \geq 0$.

We denote the set of all symmetric $n \times n$ PSD matrices as $PSD_n$:

    $PSD_n = \{X \mid X \text{ is an } n \times n \text{ symmetric matrix and } X \text{ is PSD}\}.$

Claim 1.3. $PSD_n$ is a closed cone.

Proof. To see that $PSD_n$ is closed under non-negative combinations, consider $X, Y \in PSD_n$ and $\alpha, \beta \in \mathbb{R}^+$; for every $a$,

    $a^t(\alpha X + \beta Y)a = \alpha\, a^t X a + \beta\, a^t Y a \geq 0.$

To show that $PSD_n$ is closed, it is sufficient to show that its complement $\overline{PSD_n}$ is open. For every $X \notin PSD_n$ there is some $a \in \mathbb{R}^n$ and $\varepsilon > 0$ with $a^t X a + \varepsilon < 0$; then for any $Y \in \mathbb{R}^{n \times n}$ (with finite entries) and $k > 0$ sufficiently small,

    $|k\, a^t Y a| < \varepsilon \implies a^t (X + kY) a < 0.$

This shows that for any $X \notin PSD_n$, every sufficiently small perturbation of $X$ is also outside $PSD_n$. Therefore $\overline{PSD_n}$ is open and its complement $PSD_n$ is closed. $\square$

In the LP formulation, we require $x \in (\mathbb{R}^+)^n$. In the next section, we describe semidefinite programs in which the cone $(\mathbb{R}^+)^n$ is replaced by $PSD_n$. For now, we concentrate on the algebraic properties of $PSD_n$.

Claim 1.4. For a symmetric matrix $A$, the following are equivalent:
(i) $A$ is PSD;
(ii) the eigenvalues of $A$ are non-negative;
(iii) $A$ can be written as $V^t V$ ($V$ an $n \times n$ matrix).

Proof. To show (iii) $\Rightarrow$ (i), consider that, given (iii), for $V \in \mathbb{R}^{n \times n}$ and any $x \in \mathbb{R}^n$,

    $x^t A x = x^t V^t V x = \|Vx\|_2^2 \geq 0.$

For (i) $\Rightarrow$ (ii), let $\lambda \in \mathbb{R}$ be an eigenvalue of $A$ and $x \in \mathbb{R}^n$ a corresponding eigenvector, so that $Ax = \lambda x$. Then, since $A \in PSD_n$,

    $\lambda \|x\|_2^2 = x^t \lambda x = x^t A x \geq 0.$

To demonstrate (ii) $\Rightarrow$ (iii), we employ the fact that if $A$ is an $n \times n$ symmetric matrix, it has a decomposition

    $A = U^t D U$

where $U$ and $D$ are $n \times n$ matrices, $U$ is orthogonal, $D$ is diagonal, and the diagonal elements of $D$ are the eigenvalues of $A$. If all eigenvalues of $A$ are non-negative, then $\sqrt{D} \in \mathbb{R}^{n \times n}$, so

    $A = U^t D U = (U^t \sqrt{D}^{\,t})(\sqrt{D}\, U) = V^t V. \qquad \square$

Remark 1.5. To sketch how we can decompose $A$, we note that a symmetric matrix is made diagonal using symmetric row and column operations. One row operation can remove a term below the main diagonal while the symmetric column operation removes the corresponding above-diagonal term:

    $\begin{pmatrix} 1 & 2 \\ 2 & 7 \end{pmatrix} \xrightarrow{R_2 := R_2 - 2R_1} \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix} \xrightarrow{C_2 := C_2 - 2C_1} \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}.$

These row and column operations are equivalent to applying the corresponding elementary modifications of $I$ on the left and right of $A$:

    $\begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix} A \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}.$

Remark 1.6. (i) $\Leftrightarrow$ (iii) says that $A$ is PSD if and only if there exist $v_1, \dots, v_n \in \mathbb{R}^n$ so that $a_{ij} = \langle v_i, v_j \rangle$:

    $A \in PSD_n \iff A = V^t V, \quad V = \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix}, \quad \text{so } A = (\langle v_i, v_j \rangle)_{ij}.$

So each element $a_{ij}$ encodes the dot product of a pair of vectors, and $A$ as a whole can be thought of as encoding the relationships among a set of vectors. Note that $V$ is not unique: we can rotate or reflect the set of vectors without changing the inner products.

Remark 1.7. For $x \in (\mathbb{R}^+)^n$, all elements of $x$ are non-negative. $X \in PSD_n$ does not place such a constraint on the individual elements; instead, from (i) $\Leftrightarrow$ (ii), we see that the eigenvalues of the matrix as a whole are non-negative.
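A quick numerical reading of Claim 1.4 and Remark 1.6, as a sketch in numpy (the test matrix is the one from Example 1.8(ii) below):

```python
# Verify the equivalences of Claim 1.4 on a small PSD matrix.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])

lam, U = np.linalg.eigh(A)            # A = U diag(lam) U^t (numpy's convention)
print((lam >= 0).all())               # (ii): non-negative eigenvalues

V = np.diag(np.sqrt(lam)) @ U.T       # (iii): a factorization A = V^t V
print(np.allclose(V.T @ V, A))

# Remark 1.6: the columns of V are vectors v_i with <v_i, v_j> = a_ij.
print(np.isclose(V[:, 0] @ V[:, 1], A[0, 1]))
```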

Example 1.8. Some instances of PSD matrices $A$ with the corresponding $V$ and $v_i$:

(i) Let $a_1, \dots, a_n \geq 0$ and $A = \mathrm{diag}(a_1, \dots, a_n)$. Then

    $A = \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_n \end{pmatrix}, \quad V = \begin{pmatrix} \sqrt{a_1} & & \\ & \ddots & \\ & & \sqrt{a_n} \end{pmatrix}, \quad v_i = (0, \dots, \sqrt{a_i}, \dots, 0)^t.$

(ii) $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, \quad V = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \quad v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$

1.3 Matrix Inner Product (Frobenius Inner Product)

We would like to view matrices as vectors when we apply linear constraints and optimize linear functions on them. The matrix inner product is SDP's counterpart to the vector inner product in LP.

Definition 1.9. The matrix inner product of $A$ and $B$ is

    $A \bullet B = \sum_{i,j} a_{ij} b_{ij} = \mathrm{trace}(AB^t).$

Notice that if $A$ and $B$ are thought of simply as reshaped vectors, this definition is equivalent to the usual vector inner product.

2 Writing Semidefinite Programs

A semidefinite program (SDP) has the form

    $\min C \bullet X$
    such that $A_i \bullet X = b_i, \quad i = 1, \dots, m$
              $X \succeq 0$

where $X$, $C$ and each $A_i$ are $n \times n$ matrices. Whereas in LP we had $n$ variables, in SDP the objective, constraint and solution matrices have $n^2$ elements. Note that we may also write SDPs in which the objective is to maximize $C \bullet X$. The last condition, $X \succeq 0$, is an order relation in which $0$ represents the $n \times n$ zero matrix: $X \succeq 0$ means that $X \in PSD_n$, and in general

    $X \succeq B \iff B \preceq X \iff (X - B) \succeq 0 \iff \forall x \in \mathbb{R}^n, \; x^t (X - B) x \geq 0.$

Next we study some examples of semidefinite programs.
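As a sanity check of Definition 1.9, a short numpy sketch with random matrices (illustration only):

```python
# A . B equals both trace(A B^t) and the dot product of the
# matrices reshaped as vectors.
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

frob = (A * B).sum()                            # sum_ij a_ij b_ij
print(np.isclose(frob, np.trace(A @ B.T)))      # True
print(np.isclose(frob, A.ravel() @ B.ravel()))  # True: reshaped vectors
```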

Example 2.1. $X$ is $n \times n$ where $n = 3$, and there are $m = 2$ constraints:

    $C = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}, \quad A_1 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 7 \\ 1 & 7 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 2 & 0 \\ 2 & 6 & 0 \\ 0 & 0 & 4 \end{pmatrix}, \quad b_1 = 0, \quad b_2 = 2.$

Writing the problem more compactly in terms of the elements of $X = (x_{ij})$, we have

    $\min \sum_{i,j} x_{ij}$
    $x_{11} + 2x_{13} + 14x_{23} = 0$
    $4x_{12} + 6x_{22} + 4x_{33} = 2$
    $X \in PSD_n.$

Example 2.2. Given the following incomplete matrix,

    $\begin{pmatrix} 5 & 3 & ? \\ 3 & 4 & ? \\ ? & ? & ? \end{pmatrix}$

we want to complete it to a PSD matrix while minimizing the sum of entries. Representing the solution matrix as $X = (x_{ij})$, we write

    $\min \sum_{i,j} x_{ij} \quad \text{s.t.} \quad x_{11} = 5, \; x_{21} = 3, \; x_{22} = 4, \; X \in PSD_n.$

Expressed in matrix form,

    $\min \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \bullet X \quad \text{s.t.} \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \bullet X = 5, \quad \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \bullet X = 3, \quad \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \bullet X = 4, \quad X \succeq 0.$
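Example 2.2 is small enough to solve directly. The following sketch uses cvxpy, which is our choice of modelling tool, not the notes':

```python
# Complete the matrix of Example 2.2 to a PSD one while minimizing
# the sum of its entries.
import cvxpy as cp
import numpy as np

X = cp.Variable((3, 3), symmetric=True)
constraints = [X >> 0,          # X in PSD_3
               X[0, 0] == 5,
               X[1, 0] == 3,    # symmetry gives X[0, 1] == 3 as well
               X[1, 1] == 4]

prob = cp.Problem(cp.Minimize(cp.sum(X)), constraints)  # C is all-ones
prob.solve()
print(round(prob.value, 3))
print(np.round(X.value, 3))
```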

Observation 2.3. LP is a special case of SDP. It is enough to consider the standard form: an LP $\min \langle c, x \rangle$ s.t. $\langle a_i, x \rangle = b_i$ ($i = 1, \dots, m$), $x \geq 0$, where $x$ has $n$ elements, becomes the SDP

    $\min \begin{pmatrix} c_1 & & 0 \\ & \ddots & \\ 0 & & c_n \end{pmatrix} \bullet X$
    $\forall i \neq j, \; X_{ij} = 0$ (by inner product constraints)
    $\begin{pmatrix} a_i^{(1)} & & 0 \\ & \ddots & \\ 0 & & a_i^{(n)} \end{pmatrix} \bullet X = b_i, \quad i = 1, \dots, m$
    $X \succeq 0.$

(A diagonal matrix is PSD exactly when its diagonal entries are non-negative, so $X \succeq 0$ recovers $x \geq 0$.) While we would not normally choose to write an LP in this form due to its inefficiency, it is useful to keep in mind that SDP is an extension of LP. Unfortunately, as we shall see later, not all the nice properties of LP are maintained in the extension.

Example 2.4. Suppose you have two sets of points in $\mathbb{R}^n$: $P = \{p_1, \dots, p_r\}$ and $Q = \{q_1, \dots, q_s\}$. In Figure 1, points in $P$ are represented by o's and points in $Q$ are represented by x's. We wish to find an ellipse, centred at the origin, that includes all $p_i \in P$ and excludes all $q_i \in Q$ (where we allow $q_i$ to lie on the boundary). Recall that an ellipse centred at the origin is $\{x \mid x^t A x \leq 1\}$ for some $A \in PSD_n$. Here we use the $n \times n$ matrix $X \in PSD_n$ to represent the ellipse:

    $\forall p_i \in P, \quad p_i^t X p_i \leq 1$
    $\forall q_i \in Q, \quad q_i^t X q_i \geq 1.$    (1)

We write the ellipse constraints (1) as matrix inner products using $p^t X p = (pp^t) \bullet X$, where $pp^t$ is the outer product (which also happens to be PSD). Finally, we add slack variables $s_{p_i}$ and $s_{q_i}$ to transform each constraint into an equality:

    $\forall p_i \in P: \quad p_i^t X p_i \leq 1 \iff (p_i p_i^t) \bullet X \leq 1 \iff (p_i p_i^t) \bullet X + s_{p_i} = 1$
    $\forall q_i \in Q: \quad q_i^t X q_i \geq 1 \iff (q_i q_i^t) \bullet X \geq 1 \iff (q_i q_i^t) \bullet X - s_{q_i} = 1.$

We can incorporate the slack variables by writing an augmented matrix $\widetilde{X}$ as a block diagonal:

    $\widetilde{X} = \begin{pmatrix} X & 0 & 0 \\ 0 & S_P & 0 \\ 0 & 0 & S_Q \end{pmatrix}, \quad S_P = \begin{pmatrix} s_{P_1} & & \\ & \ddots & \\ & & s_{P_r} \end{pmatrix}, \quad S_Q = \begin{pmatrix} s_{Q_1} & & \\ & \ddots & \\ & & s_{Q_s} \end{pmatrix}.$

Figure 1: An ellipse centred at the origin, separating two sets of points.

Figure 2: A feasible region bounded by an infinite family of linear constraints.

Then for every $p_i \in P$ we have a constraint in matrix form:

    $\begin{pmatrix} p_i p_i^t & 0 & 0 \\ 0 & E_i & 0 \\ 0 & 0 & 0 \end{pmatrix} \bullet \widetilde{X} = 1$

where $E_i$ is the matrix with zeros everywhere except the $i$-th position along the diagonal, which is one. Similar constraints may be written for every $q_i \in Q$. Note that the final condition $\widetilde{X} \succeq 0$ ensures simultaneously that $X \succeq 0$ and that $S_P, S_Q \succeq 0$; this in turn ensures that all slack variables are non-negative.

Remark 2.5. In Example 2.4 we rewrote the PSD property in terms of the outer product of a vector with itself and the matrix inner product:

    $X \in PSD_n \iff \forall a, \; a^t X a \geq 0 \iff \forall a, \; (aa^t) \bullet X \geq 0.$

For a specific vector $a$, the constraint $(aa^t) \bullet X \geq 0$ acts like a single LP constraint, a hyperplane in $\mathbb{R}^{n \times n}$. The constraint holds for any $a \in \mathbb{R}^n$, so $aa^t$ defines an infinite family of linear constraints. Thus, in a sense, $X$ satisfies infinitely many constraints, and the region of feasible solutions to an SDP is bounded by an infinite set of hyperplanes, as in Figure 2, for instance.
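A sketch of Example 2.4 in cvxpy, with made-up point sets; we keep the inequality form (1) directly rather than introducing the slack variables:

```python
# Find a PSD matrix X describing an origin-centred ellipse with
# p^t X p <= 1 for p in P and q^t X q >= 1 for q in Q.
import cvxpy as cp
import numpy as np

P = np.array([[0.2, 0.1], [-0.3, 0.2], [0.1, -0.4]])  # the o's, inside
Q = np.array([[2.0, 0.0], [0.0, 1.5], [-1.8, -1.0]])  # the x's, outside

X = cp.Variable((2, 2), symmetric=True)
cons = [X >> 0]
cons += [cp.quad_form(p, X) <= 1 for p in P]  # p^t X p = (p p^t) . X
cons += [cp.quad_form(q, X) >= 1 for q in Q]

# Any feasible X describes a separating ellipse; we just ask for one.
cp.Problem(cp.Minimize(0), cons).solve()
print(np.round(X.value, 3))
```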

3 Vector Programming

Recall that $X \in PSD_n$ iff there exist $v_1, \dots, v_n \in \mathbb{R}^n$ s.t. $x_{ij} = \langle v_i, v_j \rangle$. That is, instead of finding the best matrix $X$ that satisfies a given set of constraints, we can think of finding an optimal set of vectors $V = \{v_1, \dots, v_n\}$ with constraints on the vector inner products. We write a vector program (VP) as

    $\min \sum_{i,j} c_{ij} \langle v_i, v_j \rangle$
    s.t. $\sum_{i,j} a^{(k)}_{ij} \langle v_i, v_j \rangle = b_k, \quad k = 1, \dots, m$
         $v_1, \dots, v_n \in \mathbb{R}^n.$

Observation 3.1. This is identical to an SDP, just written in a different form. It is clear that a feasible solution here translates to a feasible SDP solution, by $X = (x_{ij}) = (\langle v_i, v_j \rangle)$, and vice versa, with $C = (c_{ij})$ and $A_k = (a^{(k)}_{ij})$.

Definition 3.2. A metric $d$ on $n$ points $\{1, \dots, n\}$ is a function $d_{ij}$ s.t.

    $d_{ij} \geq 0$ (non-negativity)
    $d_{ij} = d_{ji}$ (symmetry)
    $d_{ii} = 0$
    $d_{ij} + d_{jk} \geq d_{ik}$ (triangle inequality)

Note that $d$ can be represented as an $n \times n$ symmetric matrix.

Definition 3.3. A metric $d$ on $n$ points is Euclidean if there exist $p_1, \dots, p_n \in \mathbb{R}^n$ s.t.

    $\|p_i - p_j\|_2 = d_{ij}.$    (2)

The set $\{p_1, \dots, p_n\}$ is referred to as the Euclidean embedding of $d$.

Observation 3.4. Not every metric is Euclidean.

Proof. Consider Figure 3, for instance. Suppose $\{p_A, p_B, p_C, p_D\}$ is the Euclidean embedding. Since

    $\|p_A - p_B\|_2 + \|p_B - p_C\|_2 = d_{AB} + d_{BC} = d_{AC} = \|p_A - p_C\|_2,$

the points $p_A$, $p_B$ and $p_C$ must lie on a line with $p_B$ at the midpoint. Similarly, $p_A$, $p_B$ and $p_D$ must also lie on a line with $p_B$ at the midpoint. This collapses $p_C$ and $p_D$, implying $\|p_C - p_D\|_2 = 0$. However, this is inconsistent with $d_{CD} = 2$. $\square$

We relax (2) and define the cost of the embedding as the minimum $c$ such that

    $\forall i, j \in \{1, \dots, n\}, \quad d_{ij} \leq \|p_i - p_j\|_2 \leq c\, d_{ij}.$

Figure 3: An example of a metric on four points that is not Euclidean.

The inequalities are preserved when squaring all terms:

    $d_{ij}^2 \leq \|p_i - p_j\|_2^2 \leq c^2 d_{ij}^2.$

Using the connection between the Euclidean norm and the inner product,

    $\|p_i - p_j\|_2^2 = \langle p_i - p_j, p_i - p_j \rangle = \langle p_i, p_i \rangle + \langle p_j, p_j \rangle - 2\langle p_i, p_j \rangle.$

In other words, taking $C = c^2$, we can formulate the problem of finding the minimum cost embedding as a VP instance:

    $\min C$
    s.t. $\forall i, j \in \{1, \dots, n\}, \quad d_{ij}^2 \leq \langle p_i, p_i \rangle + \langle p_j, p_j \rangle - 2\langle p_i, p_j \rangle \leq C d_{ij}^2.$

Observation 3.5. We cannot hope to add a dimension constraint to the above, for example to ask for the vectors to be of dimension 5: vector programming with restrictions on dimension is NP-hard.

Example 3.6. Consider the Vertex Cover problem: given a graph $G = (V, E)$, find the smallest set of vertices $S \subseteq V$ such that for every edge $\{i, j\} \in E$, either $i \in S$ or $j \in S$ (or both). Suppose we have a VP in which the vectors are restricted to dimension one. We have variables $x_i$ for the vertices $i \in V$, and $x_0$ (an indicator value):

    $\min \sum_i \langle x_0, x_i \rangle$
    $x_i^2 = 1 \quad \forall i$
    $\forall \{i, j\} \in E, \quad \langle x_i, x_0 \rangle + \langle x_j, x_0 \rangle \geq 0$
    $x_i \in \mathbb{R}^1.$

Claim 3.7. A solution to the above translates to an exact solution to Vertex Cover, where $S = \{v_i \mid x_i = x_0\}$ and $\sum_i \langle x_0, x_i \rangle = 2|S| - n$.
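The minimum-cost-embedding VP above, written over the Gram matrix $(\langle p_i, p_j \rangle)_{ij}$, is itself an SDP. A sketch in cvxpy, using as data one metric consistent with the proof of Observation 3.4 (our reading of Figure 3: a unit-edge star with centre B, so every pair of leaves is at distance 2):

```python
# Minimize C = c^2 over Gram matrices G with g_ij = <p_i, p_j>,
# using ||p_i - p_j||^2 = g_ii + g_jj - 2 g_ij.
import cvxpy as cp
import numpy as np

d = np.array([[0., 1., 2., 2.],    # point order: A, B, C, D
              [1., 0., 1., 1.],
              [2., 1., 0., 2.],
              [2., 1., 2., 0.]])
n = d.shape[0]

G = cp.Variable((n, n), symmetric=True)
C = cp.Variable()                   # C = c^2, the squared distortion
cons = [G >> 0]
for i in range(n):
    for j in range(i + 1, n):
        dist2 = G[i, i] + G[j, j] - 2 * G[i, j]
        cons += [dist2 >= d[i, j] ** 2, dist2 <= C * d[i, j] ** 2]

cp.Problem(cp.Minimize(C), cons).solve()
print(C.value)   # > 1, since this metric is not Euclidean
```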

4 Duality in SDP

As in LP, we should think of the dual as a linear method that bounds the optimum. To bound $C \bullet X$ from below, we want to:

(i) find coefficients $y_i$ for each of the $m$ constraints;
(ii) guarantee that for every $X \succeq 0$ it holds that

    $\sum_{i=1}^m y_i A_i \bullet X \leq C \bullet X.$    (3)

Having done so, since $\sum_i y_i A_i \bullet X = \sum_i y_i b_i$, we get a lower bound on the objective of the primal: $C \bullet X \geq \langle y, b \rangle$.

What restrictions must we put on $y$ so that (3) holds? Rewrite (3) as

    $\forall X \succeq 0, \quad \Big(C - \sum_{i=1}^m y_i A_i\Big) \bullet X \geq 0.$

In particular, for $x \in \mathbb{R}^n$ we have $xx^t \succeq 0$, and so $C - \sum_i y_i A_i$ must satisfy

    $\forall x \in \mathbb{R}^n, \quad \Big(C - \sum_i y_i A_i\Big) \bullet (xx^t) = x^t \Big(C - \sum_i y_i A_i\Big) x \geq 0.$

Therefore, to satisfy (3), $C - \sum_i y_i A_i$ must be PSD. On the other hand, if $X \succeq 0$ then $X = V^t V$, and writing the rows of $V$ as $w_1^t, \dots, w_n^t$,

    $X = V^t V = \sum_i w_i w_i^t.$

That is, every $X \succeq 0$ can be represented as a non-negative sum of outer products, and so if $C - \sum_{i=1}^m y_i A_i$ satisfies (3) for every $xx^t$, it satisfies it for every $X \succeq 0$.

Definition 4.1. Let $K$ be a cone. The dual cone $K^*$ is defined as

    $K^* = \{y \mid \langle x, y \rangle \geq 0 \;\; \forall x \in K\}.$

In formulating the dual of LP, we use $((\mathbb{R}^+)^n)^* = (\mathbb{R}^+)^n$. We now need to understand the dual of $PSD_n$.

Claim 4.2. $PSD_n^* = PSD_n$.

We write the dual pair as

    Primal: $\min C \bullet X$ s.t. $A_i \bullet X = b_i$ ($i = 1, \dots, m$), $X \succeq 0$.
    Dual: $\max \langle y, b \rangle$ s.t. $\sum_i y_i A_i - C \preceq 0$ (with $y \in \mathbb{R}^m$ free).

Note that the constraint $\sum_i y_i A_i - C \preceq 0$ is equivalent to $\sum_i y_i A_i \preceq C$.
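A small numerical instance of this primal-dual pair (made-up data with a single constraint $A_1 = I$, $b_1 = 1$, in which case the primal computes $\lambda_{\min}(C)$ and the dual matches):

```python
# Primal: min C . X  s.t. trace(X) = 1, X >= 0.
# Dual:   max y      s.t. y*I <= C   (one multiplier for one constraint).
import cvxpy as cp
import numpy as np

C = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

X = cp.Variable((3, 3), symmetric=True)
primal = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                    [cp.trace(X) == 1, X >> 0])
primal.solve()

y = cp.Variable()
dual = cp.Problem(cp.Maximize(y), [C - y * np.eye(3) >> 0])
dual.solve()

# Both values agree with the smallest eigenvalue of C.
print(primal.value, dual.value, np.linalg.eigvalsh(C).min())
```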

5 Differences between LP and SDP

Many of the nice LP properties do not carry over to SDP:

- The optimum cannot always be attained, even if the problem is bounded (a small example appears after this section).
- Even if the optimum can be attained, it may be irrational, and so it cannot be represented exactly on a finite machine.
- There is no analogue in SDP of LP's basic feasible solutions (BFS).
- There can be a duality gap: strong duality does not necessarily hold (but see Theorem 5.1 below).
- Implied by the above, there is no natural bound on the size of the solution in terms of the input size.

Theorem 5.1. If there is a feasible solution $X \succ 0$ (all eigenvalues strictly positive) and a $y$ with $C - \sum_i y_i A_i \succ 0$, then both the primal and the dual attain their optimum, and the two optimal values are equal.
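To illustrate the first point, a standard small example (ours, not from the notes): minimize $x_{22}$ subject to $x_{12} = 1$ and $X \succeq 0$ for a $2 \times 2$ symmetric $X$. PSD-ness forces $x_{11} x_{22} \geq 1$, so $x_{22} > 0$ for every feasible $X$, yet $\inf x_{22} = 0$: the infimum is never attained.

```python
# The problem is bounded below by 0, but no feasible X achieves 0.
import cvxpy as cp

X = cp.Variable((2, 2), symmetric=True)
prob = cp.Problem(cp.Minimize(X[1, 1]), [X[0, 1] == 1, X >> 0])
prob.solve()
# The solver returns a small x_22 paired with a large x_11, never 0.
print(prob.value)
print(X.value)
```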