On the Sandwich Theorem and a 0.878-approximation algorithm for MAX CUT


On the Sandwich Theorem and a 0.878-approximation algorithm for MAX CUT
Kees Roos, Technische Universiteit Delft, Faculteit Electrotechniek, Wiskunde en Informatica
E-mail: C.Roos@its.tudelft.nl, URL: http://ssor.twi.tudelft.nl/~roos
WI4 060, February 2004

Outline
Conic Optimization (CO)
Duality theory for CO
Semidefinite Optimization (SDO); some examples
Algorithms for SDO
Application to some combinatorial problems: maximal clique, graph coloring, Lovász sandwich theorem
The maximal cut problem: semidefinite relaxation of MAX CUT, the result of Goemans and Williamson, Nemirovski's proof, the original proof
Concluding remarks
Some references

General conic optimization
A general conic optimization problem is a problem in the conic form

    min_{x in R^n} { c^T x : Ax - b in K },

where K is a closed convex pointed cone. We restrict ourselves to the cases where K is either the non-negative orthant R^m_+ (linear inequality constraints); the Lorentz (or second-order, or ice-cream) cone L^m (conic quadratic constraints); the semidefinite cone S^m_+, i.e. the cone of positive semidefinite m x m matrices (linear matrix inequality (LMI) constraints); or a direct product of such cones. In all these cases the above problem can be solved efficiently by an interior-point method.

Conic optimization
Conic optimization addresses the problem of minimizing a linear objective function over the intersection of an affine set and a convex cone. The general form is

    (COP)    min_{x in R^n} { c^T x : Ax - b in K }.

The convex cone K is a subset of R^m. The objective function c^T x is linear. Ax - b represents an affine function from R^n to R^m; usually A is given as an m x n (constraint) matrix, and b in R^m. Two important facts: many nonlinear problems can be modelled in this way, and under some weak conditions on the underlying cone K, conic optimization problems can be solved efficiently. The easiest and best-known case occurs when the cone K is the nonnegative orthant of R^m, i.e. when K = R^m_+:

    (LO)    min_{x in R^n} { c^T x : Ax - b in R^m_+ }.

This is nothing else than one of the standard forms of the well-known Linear Optimization (LO) problem. Thus it becomes clear that LO is a special case of CO.
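As a minimal illustration of the conic form (a sketch assuming the cvxpy modelling package; the data are made up, not taken from the slides), the LO special case K = R^m_+ can be written down directly:

```python
import cvxpy as cp
import numpy as np

# Example data (assumptions for illustration only).
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([1.0, 1.0])
c = np.array([1.0, 1.0])

x = cp.Variable(2)
# K = R^m_+ : the constraint Ax - b in K is simply Ax - b >= 0 componentwise.
prob = cp.Problem(cp.Minimize(c @ x), [A @ x - b >= 0])
prob.solve()
print(prob.value, x.value)
```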

Convex cones
A subset K of R^m is a cone if

    a in K, lambda >= 0  =>  lambda a in K,    (1)

and the cone K is a convex cone if moreover

    a, a' in K  =>  a + a' in K.    (2)

We will impose three more conditions on K. Recall that CO is a generalization of LO. To obtain duality results for CO similar to those for LO, the cone K should inherit three more properties from the cone underlying LO, namely the nonnegative orthant

    R^m_+ = { x = (x_1, ..., x_m)^T : x_i >= 0, i = 1, ..., m }.

This cone is called the linear cone.

Convex cones (cont.)
The linear cone is not just a convex cone; it is also pointed, it is closed, and it has a nonempty interior. These are exactly the three properties we need. We describe these properties now. A convex cone K is called pointed if it does not contain a line. This property can be stated equivalently as

    a in K, -a in K  =>  a = 0.    (3)

A convex cone K is called closed if it is closed under taking limits:

    a_i in K (i = 1, 2, ...), a = lim_{i -> inf} a_i  =>  a in K.    (4)

Finally, denoting the interior of a cone K as int K, we will require that

    int K != empty set.    (5)

This means that there exists a vector (in K) such that a ball of positive radius centered at the vector is contained in K.

Convex cones (cont.)
In conic optimization we only deal with cones K that enjoy all of the above properties. So we always assume that K is a pointed and closed convex cone with a nonempty interior. Apart from the linear cone, two other relevant examples of such cones are:

1. The Lorentz cone

    L^m = { x in R^m : x_m >= sqrt(x_1^2 + ... + x_{m-1}^2) }.

This cone is also called the second-order cone, or the ice-cream cone.

2. The positive semidefinite cone S^m_+. This cone lives in the space S^m of m x m symmetric matrices (equipped with the Frobenius inner product <A, B> = Tr(AB) = sum_{i,j} A_ij B_ij) and consists of all m x m matrices A that are positive semidefinite, i.e.,

    S^m_+ = { A in S^m : x^T A x >= 0 for all x in R^m }.

We assume that the cone K in (COP) is a direct product of the form K = K_1 x ... x K_m, where each K_i is either a linear, a Lorentz, or a semidefinite cone.

Conic Duality
Before we discuss the duality theory for conic optimization, we need to define the dual cone of a convex cone K:

    K* = { lambda in R^m : lambda^T a >= 0 for all a in K }.

Theorem 1. Let K in R^m be a nonempty cone. Then
(i) the set K* is a closed convex cone;
(ii) if K has a nonempty interior (i.e., int K != empty set), then K* is pointed;
(iii) if K is a closed convex pointed cone, then int K* != empty set;
(iv) if K is a closed convex cone, then so is K*, and the cone dual to K* is K itself.

Corollary 1. If K in R^m is a closed pointed convex cone with nonempty interior, then so is K*, and vice versa.

One may easily verify that the linear, the Lorentz and the semidefinite cone are self-dual. Since K = K_1 x ... x K_m implies K* = K_1* x ... x K_m*, any direct product of linear, Lorentz and semidefinite cones is self-dual.

Conic Duality
Now we are ready to deal with the problem dual to a conic problem (COP). We start by observing that whenever x is a feasible solution for (COP), the definition of K* implies

    lambda^T (Ax - b) >= 0 for all lambda in K*,

and hence x satisfies the scalar inequality

    lambda^T Ax >= lambda^T b,   lambda in K*.

It follows that whenever lambda in K* satisfies the relation

    A^T lambda = c,    (6)

then one has

    c^T x = (A^T lambda)^T x = lambda^T Ax >= lambda^T b = b^T lambda

for all x feasible for (COP). So, if lambda in K* satisfies (6), then the quantity b^T lambda is a lower bound for the optimal value of (COP). The best lower bound obtainable in this way is the optimal value of the problem

    (COD)    max_{lambda in R^m} { b^T lambda : A^T lambda = c, lambda in K* }.

By definition, (COD) is the dual problem of (COP). Using Theorem 1 (iv), one easily verifies that the duality is symmetric: the dual problem is conic and the problem dual to the dual problem is the primal problem.

Conic Duality
Indeed, from the construction of the dual problem it immediately follows that we have the weak duality property: if x is feasible for (COP) and lambda is feasible for (COD), then

    c^T x - b^T lambda >= 0.

The crucial question is, of course, whether we have equality of the optimal values whenever (COP) and (COD) have optimal values. Different from the LO case, however, this is in general not the case, unless some additional conditions are satisfied. The following theorem clarifies the situation. We call the problem (COP) solvable if it has a (finite) optimal value and this value is attained. Before stating the theorem it may be worth pointing out that a finite optimal value is not necessarily attained. For example, the problem

    min { x : [ x  1 ; 1  y ] is positive semidefinite, x, y in R }

has optimal value 0, but one may easily verify that this value is not attained. We need one more definition: if there exists an x such that Ax - b in int K, then we say that (COP) is strictly feasible. We have similar, and obvious, definitions for (COD) being solvable and strictly feasible, respectively.

Conic Duality Theorem
Theorem 2. Let the primal problem (COP) and its dual problem (COD) be as given above. Then one has:
(i) a. If (COP) is bounded below and strictly feasible, then (COD) is solvable and the respective optimal values are equal.
    b. If (COD) is bounded above and strictly feasible, then (COP) is solvable, and the respective optimal values are equal.
(ii) Suppose that at least one of the two problems (COP) and (COD) is bounded and strictly feasible. Then a primal-dual feasible pair (x, lambda) is comprised of optimal solutions to the respective problems
    a. if and only if b^T lambda = c^T x (zero duality gap);
    b. if and only if lambda^T [Ax - b] = 0 (complementary slackness).

This result is slightly weaker than the duality theorem for LO: in the LO case the theorem holds with "feasible" everywhere instead of "strictly feasible". The adjective "strictly" cannot be omitted here, however. For a more extensive discussion and some appropriate counterexamples we refer to the book of Ben-Tal and Nemirovski.

Bad duality example
Consider the following conic problem with two variables x = (x_1, x_2)^T and the 3-dimensional ice-cream cone:

    min { x_1 ... wait, rather:  min { x_2 : Ax - b = (x_1, x_2, x_1)^T in L^3 }.

The problem is equivalent to the problem

    min { x_2 : sqrt(x_1^2 + x_2^2) <= x_1 },

i.e., to the problem min { x_2 : x_2 = 0, x_1 >= 0 }. The problem is clearly solvable, and its optimal set is the ray { x_1 >= 0, x_2 = 0 }. Now let us build the conic dual to our (solvable!) primal. Since the cone dual to an ice-cream cone is this ice-cream cone itself, the dual problem is

    max_lambda { 0 : lambda_1 + lambda_3 = 0, lambda_2 = 1, lambda in L^3 }.

In spite of the fact that the primal problem is solvable, the dual is infeasible: indeed, assuming that lambda is dual feasible, we have lambda in L^3, which means that lambda_3 >= sqrt(lambda_1^2 + lambda_2^2); since also lambda_1 + lambda_3 = 0, we come to lambda_2 = 0, which contradicts the equality lambda_2 = 1.
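For what it is worth, the primal of this example is easy to hand to a numerical solver; the sketch below (assuming cvxpy; since the problem is not strictly feasible, the solver may only return an approximately optimal point) reproduces the optimal value 0:

```python
import cvxpy as cp

x = cp.Variable(2)
# ||(x1, x2)|| <= x1, i.e. (x1, x2, x1) in L^3.
constraints = [cp.SOC(x[0], cp.hstack([x[0], x[1]]))]
prob = cp.Problem(cp.Minimize(x[1]), constraints)
prob.solve()
print(prob.value)  # expected: approximately 0, attained e.g. at x = (1, 0)
```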

LO as a special case of SOCO
By definition, the m-dimensional Lorentz cone is given by

    L^m = { x in R^m : x_m >= sqrt(x_1^2 + ... + x_{m-1}^2) }.

Hence, the 1-dimensional Lorentz cone is given by

    L^1 = { x in R : x >= 0 }.

Thus it follows that

    R^n_+ = L^1 x L^1 x ... x L^1   (n times).

As a consequence, every LO problem can be written as a SOCO problem.

SOCO as a special case of SDO
Recall that the m-dimensional Lorentz cone is given by

    L^m = { x in R^m : x_m >= sqrt(x_1^2 + ... + x_{m-1}^2) }.

One may easily verify that x in L^m if and only if the matrix

    ( x_m      x_{m-1}  x_{m-2}  ...  x_1 )
    ( x_{m-1}  x_m      0        ...  0   )
    ( x_{m-2}  0        x_m      ...  0   )
    (  ...                                )
    ( x_1      0        0        ...  x_m )

belongs to S^m_+. The above matrix depends linearly on (the coordinates of) the vector x, and hence any SOCO constraint can be written as an SDO constraint.

SOCO as a special case of SDO (proof)
Let a in R^{m-1} and alpha in R. Then (a, alpha) in L^m if and only if ||a|| <= alpha. We need to show that this holds if and only if

    [ alpha   a^T    ]
    [ a       alpha I ]  is positive semidefinite,

where I denotes the (m-1) x (m-1) identity matrix. The latter is equivalent to

    (beta, b^T) [ alpha a^T ; a alpha I ] (beta ; b) >= 0   for all beta in R, b in R^{m-1}.

Thus we obtain that the matrix above is positive semidefinite

    <=>  alpha beta^2 + 2 beta b^T a + alpha b^T b >= 0   for all beta in R, b in R^{m-1}
    <=>  alpha^2 beta^2 + 2 alpha beta b^T a + alpha^2 b^T b >= 0   for all beta, b, and alpha >= 0
    <=>  (alpha beta + b^T a)^2 + alpha^2 b^T b - (b^T a)^2 >= 0   for all beta, b, and alpha >= 0
    <=>  alpha^2 b^T b - (b^T a)^2 >= 0   for all b in R^{m-1}, and alpha >= 0
    <=>  alpha^2 a^T a - (a^T a)^2 >= 0 and alpha >= 0
    <=>  alpha^2 >= a^T a and alpha >= 0
    <=>  ||a|| <= alpha.

This proves the claim.
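A quick numerical sanity check of this equivalence is easy to set up; the sketch below (assuming numpy; the sample vectors are arbitrary) compares membership in L^m with positive semidefiniteness of the matrix from the previous sheet:

```python
import numpy as np

def arrow(x):
    # The matrix from the slide: x_m on the diagonal,
    # first row/column equal to (x_m, x_{m-1}, ..., x_1).
    m = len(x)
    M = x[-1] * np.eye(m)
    M[0, 1:] = x[-2::-1]
    M[1:, 0] = x[-2::-1]
    return M

def in_lorentz(x):
    return x[-1] >= np.linalg.norm(x[:-1])

for x in [np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 2.0])]:
    psd = np.all(np.linalg.eigvalsh(arrow(x)) >= -1e-9)
    print(in_lorentz(x), psd)  # the two booleans agree for each example
```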

Convex quadratic optimization (CQO) as a special case of SOCO
Any convex quadratic problem with quadratic constraints can be written as

    min { f_0(x) : f_i(x) <= 0, i = 1, ..., m },

where

    f_i(x) = (B_i x + b_i)^T (B_i x + b_i) - c_i^T x - d_i,   i = 0, 1, ..., m.

The objective can be made linear by introducing an extra variable tau such that f_0(x) <= tau, and minimizing tau. Redefining f_0(x) := f_0(x) - tau, the problem becomes

    min { tau : f_i(x) <= 0, i = 0, ..., m }.

So it suffices to show that every convex quadratic constraint f_i(x) <= 0 can be written as a SOC constraint. Omitting the index i, such a constraint has the form

    (Bx + b)^T (Bx + b) <= c^T x + d,   or   ||Bx + b||^2 <= c^T x + d.

This is not yet a SOC constraint! A SOC constraint has the form

    ||Gx + g|| <= p^T x + q,   i.e.,   ( Gx + g ; p^T x + q ) in L^k.

Convex quadratic optimization (CQO) as a special case of SOCO

    ||Bx + b||^2 <= c^T x + d.

To put this constraint in the SOC form we observe that

    c^T x + d = [ c^T x + d + 1/4 ]^2 - [ c^T x + d - 1/4 ]^2.

Thus we have

    ||Bx + b||^2 <= c^T x + d   <=>   ||Bx + b||^2 + [ c^T x + d - 1/4 ]^2 <= [ c^T x + d + 1/4 ]^2.

This is equivalent, for a suitable k (k = rowsize(B) + 2), to

    ( Bx + b ; c^T x + d - 1/4 ; c^T x + d + 1/4 ) in L^k.
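The equivalence above can also be checked numerically; the following sketch (assuming numpy; B, b, c, d are arbitrary example data) compares the quadratic constraint with its SOC reformulation on random points:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 2))
b = rng.standard_normal(3)
c = rng.standard_normal(2)
d = 0.5

for _ in range(1000):
    x = rng.standard_normal(2)
    s = c @ x + d
    lhs = np.linalg.norm(B @ x + b) ** 2 <= s
    # (Bx + b ; s - 1/4) has norm at most s + 1/4
    rhs = np.linalg.norm(np.append(B @ x + b, s - 0.25)) <= s + 0.25
    assert lhs == rhs
print("equivalence verified on random samples")
```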

More on Semidefinite Optimization
A semidefinite optimization problem can be written in the form

    (SD)    d* = sup { b^T y : A(y) <= C },

where

    A(y) := y_1 A_1 + ... + y_m A_m;   A_i = A_i^T in R^{n x n}, 1 <= i <= m;   C = C^T in R^{n x n};

and A(y) <= C means: C - A(y) is positive semidefinite (PSD).

Convex quadratic optimization (CQO) as a special case of SDO

    z^T z <= rho   <=>   [ I   z ; z^T   rho ] is positive semidefinite.

We have seen before that any convex quadratic constraint has the form

    ||Bx + b||^2 <= c^T x + d.

By the above statement (which is a simple version of the so-called Schur complement lemma) this can be equivalently expressed as the SD constraint

    [ I           Bx + b      ]
    [ (Bx + b)^T  c^T x + d   ]  is positive semidefinite.
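The Schur-complement fact quoted above is easy to illustrate numerically; in the sketch below (assuming numpy; z and rho are arbitrary test data), the PSD test and the inequality z^T z <= rho agree:

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(4)
for rho in [z @ z + 0.5, z @ z - 0.5]:
    M = np.block([[np.eye(4), z[:, None]],
                  [z[None, :], np.array([[rho]])]])
    psd = np.all(np.linalg.eigvalsh(M) >= -1e-9)
    print(z @ z <= rho, psd)  # the two booleans agree in both cases
```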

Semidefinite duality
Primal problem:

    p* = inf { Tr(CX) : Tr(A_i X) = b_i (for all i), X >= 0 }.

Dual problem:

    d* = sup { b^T y : sum_{i=1}^m y_i A_i + S = C, S >= 0 }.

Duality gap:

    Tr(CX) - b^T y = Tr(SX) >= 0.

Central path:

    SX = mu I,   mu > 0.

The central path exists if and only if the primal and the dual problem are strictly feasible (IPC); then both problems have optimal solutions and the duality gap is 0 at optimality.

Algorithms for SDO: Dikin-type affine scaling approach
Primal-dual search directions (DX, Dy, DZ) must satisfy

    Tr(A_i DX) = 0,  i = 1, ..., m,
    sum_{i=1}^m Dy_i A_i + DZ = 0.

Then DX and DZ are orthogonal: Tr(DX DZ) = 0. Duality gap after the step: Tr((X + DX)(Z + DZ)). We minimize this duality gap over the so-called Dikin ellipsoid. The search directions follow by solving

    DX + D DZ D = - XZX / ( Tr((XZ)^2) )^{1/2},

subject to the feasibility conditions. Here D is the so-called Nesterov-Todd (NT) scaling matrix

    D := Z^{-1/2} ( Z^{1/2} X Z^{1/2} )^{1/2} Z^{-1/2}.

We assume that the matrices A_i are linearly independent.

Measure of centrality
The eigenvalues of XZ are real and positive if X, Z > 0, since

    XZ ~ X^{-1/2} (XZ) X^{1/2} = X^{1/2} Z X^{1/2} > 0,

where ~ denotes the similarity relation. The proximity to the central path is measured by

    kappa(XZ) := lambda_max(XZ) / lambda_min(XZ),

where lambda_max(XZ) denotes the largest eigenvalue of XZ and lambda_min(XZ) the smallest.
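Since XZ is similar to the symmetric matrix X^{1/2} Z X^{1/2}, the centrality measure kappa(XZ) can be computed with a symmetric eigenvalue solver; a small sketch (assuming numpy; X and Z are random positive definite test matrices):

```python
import numpy as np

def psd_sqrt(X):
    # Symmetric square root of a positive definite matrix via eigendecomposition.
    w, V = np.linalg.eigh(X)
    return V @ np.diag(np.sqrt(w)) @ V.T

def kappa(X, Z):
    # kappa(XZ) = lambda_max(XZ) / lambda_min(XZ); the eigenvalues of XZ are
    # those of the symmetric matrix X^{1/2} Z X^{1/2}.
    Xh = psd_sqrt(X)
    eig = np.linalg.eigvalsh(Xh @ Z @ Xh)
    return eig.max() / eig.min()

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)); X = A @ A.T + np.eye(3)
B = rng.standard_normal((3, 3)); Z = B @ B.T + np.eye(3)
print(kappa(X, Z))
```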

Primal-dual Dikin affine-scaling algorithm

Input: a strictly feasible pair (X^0, Z^0); a step size parameter alpha > 0; an accuracy parameter eps > 0.
begin
    X := X^0; Z := Z^0;
    while Tr(XZ) >= eps do
    begin
        X := X + alpha DX;  y := y + alpha Dy;  Z := Z + alpha DZ;
    end
end

Theorem 3. Let tau > 1 be such that kappa(X^0 Z^0) <= tau. If alpha = 1/(tau n), the Dikin Step Algorithm stops after at most tau n L iterations, where L := ln( Tr(X^0 Z^0) / eps ). The output is a feasible primal-dual pair (X*, Z*) satisfying kappa(X* Z*) <= tau and Tr(X* Z*) <= eps.

Primal-dual Newton direction
As before, the search directions (DX, Dy, DZ) must satisfy

    Tr(A_i DX) = 0,  i = 1, ..., m,
    sum_{i=1}^m Dy_i A_i + DZ = 0.

We want

    (X + DX)(Z + DZ) = mu I.

Omitting the quadratic term this leads to

    XZ + DX Z + X DZ = mu I,

which can be rewritten as

    DX + X DZ Z^{-1} = mu Z^{-1} - X.

Note that DZ will be symmetric, but DX possibly not! To overcome this difficulty we use instead the equation

    DX + D DZ D = mu Z^{-1} - X,

and we obtain the so-called Nesterov-Todd (NT) directions. Here D is the same NT scaling matrix as introduced before.

Proximity measure
Let (X, Z) be a strictly feasible pair. We measure the distance of this pair to the mu-center by

    delta(X, Z; mu) := (1/2) || sqrt(mu) V^{-1} - V / sqrt(mu) ||,

where V is determined by

    V^2 = D^{-1/2} X Z D^{1/2}.

Theorem 4. If delta := delta(X, Z; mu) < 1, then the full Newton step is strictly feasible and the duality gap attains its target value n mu. Moreover,

    delta(X+, Z+; mu) <= delta^2 / sqrt( 2 (1 - delta^2) ).

Algorithm with full Newton steps

Input: a proximity parameter tau, 0 <= tau < 1; an accuracy parameter eps > 0; X^0, Z^0, mu^0 > 0 such that delta(X^0, Z^0; mu^0) <= tau; a barrier update parameter theta, 0 < theta < 1.
begin
    X := X^0; Z := Z^0; mu := mu^0;
    while n mu >= (1 - theta) eps do
    begin
        X := X + DX;  Z := Z + DZ;  mu := (1 - theta) mu;
    end
end

Theorem 5. If tau = 1/sqrt(2) and theta = 1/(2 sqrt(n)), then the above algorithm with full NT steps requires at most

    2 sqrt(n) log( n mu^0 / eps )

iterations. The output is a primal-dual pair (X, Z) such that Tr(XZ) <= eps.

Approximation algorithm for discrete problems, via SDO relaxation
In a graph G = (V, E), find a maximal clique, i.e., a subset C of V such that |C| is maximal and {i, j} in E for all i, j in C (i != j).

Linear model:

    omega(G) := max e^T x
                s.t.  x_i + x_j <= 1,  {i, j} not in E (i != j),
                      x_i in {0, 1},  i in V.

Quadratic model:

    omega(G) = max e^T x
               s.t.  x_i x_j = 0,  {i, j} not in E (i != j),
                     x_i in {0, 1},  i in V.

Semidefinite relaxation:

    omega(G) <= theta(G) := max Tr(e e^T X) = e^T X e
                            s.t.  X_ij = 0,  {i, j} not in E (i != j),
                                  Tr(X) = 1,  X >= 0.
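The semidefinite relaxation above is directly expressible in a modelling language; the sketch below (assuming cvxpy; the 5-node graph is an arbitrary example, not one from the slides) computes theta(G):

```python
import itertools
import cvxpy as cp

n = 5
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)}  # example graph (an assumption)

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.trace(X) == 1]
# X_ij = 0 for every non-edge {i, j} with i != j.
for i, j in itertools.combinations(range(n), 2):
    if (i, j) not in edges and (j, i) not in edges:
        constraints.append(X[i, j] == 0)

prob = cp.Problem(cp.Maximize(cp.sum(X)), constraints)  # e^T X e = sum of all entries
prob.solve()
print(prob.value)  # theta(G): an upper bound on the clique number omega(G)
```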

Lovász sandwich theorem (1979)
Semidefinite dual of theta(G):

    theta(G) = min lambda
               s.t.  Y + e e^T <= lambda I,
                     Y_ij = 0,  {i, j} in E,
                     Y_ii = 0,  i in V.

A coloring of G with k colors induces a feasible Y, with lambda = k, yielding theta(G) <= k. Thus we obtain the sandwich theorem of Lovász:

    omega(G) <= theta(G) <= chi(G),

where omega(G) and chi(G) are the clique and chromatic numbers of G. The name sandwich theorem comes from the interesting paper of D.E. Knuth, The sandwich theorem, The Electronic Journal of Combinatorics, Vol. 1, 1-48, 1994. The next two sheets provide proofs of both inequalities based on semidefinite duality. The first proof is more or less classical, but the second proof seems to be new. If G is perfect then omega(G) = theta(G) = chi(G) (Grötschel, Lovász, Schrijver, 1981).

Proof of the first inequality
Let V = {v_1, v_2, ..., v_n}, and let C = {v_1, v_2, ..., v_k} be a clique in G = (V, E) of size k, 1 <= k <= n. So the vertices v_i in C are mutually connected. Define

    x_C = (1, ..., 1, 0, ..., 0)^T   (k ones, n - k zeros),    X := (1/k) x_C x_C^T,

so that X equals 1/k times the matrix with an all-one k x k block in its upper-left corner and zeros elsewhere. Then X satisfies

    X_ij = 0,  {i, j} not in E (i != j),   X >= 0,   Tr(X) = 1.

Since e^T X e = k, theta(G) >= k.

Proof of the second inequality
Let Gamma = (C_1, C_2, ..., C_k) be a coloring of G = (V, E) with k colors. So the C_i are vertex-disjoint cocliques. Let gamma_i := |C_i|, 1 <= i <= k. For i = 1, 2, ..., k define

    M_i := k ( J_{gamma_i} - I_{gamma_i} ),

where J_{gamma_i} denotes the all-one matrix of size gamma_i x gamma_i and I_{gamma_i} the identity matrix of the same size. Then the block diagonal matrix

    Y = diag( M_1, M_2, ..., M_k )

is dual feasible, with lambda = k. Hence theta(G) <= k.

The maximal cut problem
Input: a graph G = (V, E) with rational nonnegative weights a_vw for {v, w} in E. We take a_vw = 0 if {v, w} not in E.
Goal: partition the nodes into two classes so as to maximize the sum of the weights of the edges whose nodes are in different classes (the weight of the cut).

[Figure: an example graph on the nodes a, b, c, d, e with edge weights 5, 2, 4, 1, 3.]

The maximal cut problem (cont.)
[Figure: the same example graph.]
The maximum cut problem (MAX CUT) is NP-hard, even if a_vw in {0, 1} for all {v, w} in E. N.B. The problem is solvable in polynomial time if the graph is planar.

Relevance of the problem
In a mathematical sense the maximum cut problem is interesting in itself, but there also exist interesting applications in a wide variety of domains: finding the ground state of magnetic particles subject to a field in the Ising spin glass model; minimizing the number of vias (holes) drilled in a two-sided circuit board; solving network design problems; finding the radius of nonsingularity of a square matrix.

Solution approaches
Several approaches: integer/linear optimization; enumerative techniques (e.g., branch and cut); heuristics (e.g., local search methods); approximation algorithms.

Approximation algorithms
An alpha-approximation algorithm for an NP-hard optimization problem runs in polynomial time and returns a feasible solution with value not worse than a factor alpha from optimal (with alpha < 1 in case of a maximization and alpha > 1 in case of a minimization problem). For randomized alpha-approximation algorithms the expected value of the solution is within a factor alpha of optimal. A 1/2-approximation algorithm of Sahni and Gonzalez (1976) was the best known for MAX CUT for almost 20 years. Goemans and Williamson (1995) improved this ratio from 1/2 to 0.878. If a 0.94-approximation algorithm exists, then P = NP; this has been shown by Trevisan, Sorkin, Sudan and Williamson (1996) and by Håstad (1997).

Quadratic model for MAX CUT
Let (S, T) be any cut, i.e., a partitioning of the set V of the nodes into two disjoint classes S and T. Assuming |V| = n, the cut can be identified with an n-dimensional {-1, 1}-vector x as follows:

    x_v = 1 if v in S,   x_v = -1 if v in T,   v in V.

The weight of the cut (S, T) is then given by

    (1/4) sum_{v,w in V} a_vw (1 - x_v x_w).

We conclude that MAX CUT can be posed as follows:

    max_x { (1/4) sum_{v,w in V} a_vw (1 - x_v x_w) : x_v^2 = 1, v in V }.

Semidefinite relaxation of MAX CUT

    max_x { (1/4) sum_{v,w in V} a_vw (1 - x_v x_w) : x_v^2 = 1, v in V }.

Defining the matrix X = x x^T, we have X_vw = x_v x_w, and the matrix X is a symmetric positive semidefinite matrix of rank 1. Thus we can reformulate the problem as

    max_X { (1/4) sum_{v,w in V} a_vw (1 - X_vw) : X >= 0, rank(X) = 1, X_vv = 1, v in V }.

Omitting the rank constraint we arrive at the following relaxation:

    max_X { (1/4) sum_{v,w in V} a_vw (1 - X_vw) : X >= 0, X_vv = 1, v in V }.

The optimal solutions of this problem are the same as those of the SDO problem

    min_X { sum_{v,w in V} a_vw X_vw = Tr(AX) : X >= 0, X_vv = 1, v in V },

where A = (a_vw). Note that X_vv = 1 iff Tr(E_v X) = 1, where (E_v)_vv = 1 and all other elements of E_v are zero.
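This relaxation is again easy to state in a modelling language; the sketch below (assuming cvxpy; the weight matrix is an arbitrary example, not the graph from the earlier sheets) computes the value SDP of the relaxation:

```python
import cvxpy as cp
import numpy as np

# Example symmetric weight matrix A = (a_vw) (an assumption for illustration).
A = np.array([[0, 5, 2, 0, 0],
              [5, 0, 0, 4, 0],
              [2, 0, 0, 1, 3],
              [0, 4, 1, 0, 0],
              [0, 0, 3, 0, 0]], dtype=float)
n = A.shape[0]

X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.diag(X) == 1]
sdp = cp.Problem(cp.Maximize(0.25 * cp.sum(cp.multiply(A, 1 - X))), constraints)
sdp.solve()
print(sdp.value)  # SDP, an upper bound on OPT (the maximum cut weight)
```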

Result of Goemans and Williamson

    OPT = max_x { (1/4) sum_{v,w in V} a_vw (1 - x_v x_w) : x_v^2 = 1, v in V },    (7)

    SDP = max_X { (1/4) sum_{v,w in V} a_vw (1 - X_vw) : X >= 0, X_vv = 1, v in V }.    (8)

The relaxation (8) can be used in an ingenious way to obtain a 0.878-approximation algorithm for the maximal cut problem. It is based on

Theorem 6. With alpha = 0.878, one has alpha SDP <= OPT <= SDP,

and on a rounding procedure that generates a cut whose expected weight is at least alpha SDP (Goemans & Williamson, 1994). S.D.G.

Nemirovski's proof of the Goemans-Williamson bound
Theorem 6. One has alpha SDP <= OPT <= SDP, with alpha = 0.878.

Proof: The right inequality is obvious. To get the left-hand side inequality, let X = [X_vw] be an optimal solution to the SD relaxation. Since X is positive semidefinite, it is the covariance matrix of a Gaussian random vector xi with zero mean, so that E{xi_v xi_w} = X_vw. Now consider the random vector zeta = sign[xi] comprised of the signs of the entries of xi. A realization of zeta is almost surely a vector with coordinates +-1, i.e., it is a cut. A straightforward computation demonstrates that

    E{zeta_v zeta_w} = (2/pi) arcsin(X_vw).

It follows that the expected weight of the cut vector zeta is given by

    E{ (1/4) sum_{v,w in V} a_vw (1 - zeta_v zeta_w) } = (1/4) sum_{v,w in V} a_vw (1 - (2/pi) arcsin(X_vw)) = (1/4) sum_{v,w in V} a_vw (2/pi) arccos(X_vw);

we used that arccos(t) + arcsin(t) = pi/2 for -1 <= t <= 1. Now one may easily verify that if -1 <= t <= 1 then

    (2/pi) arccos(t) >= alpha (1 - t),   alpha = 0.878.

Using also a_vw >= 0, this implies that

    E{ (1/4) sum_{v,w in V} a_vw (1 - zeta_v zeta_w) } >= (alpha/4) sum_{v,w in V} a_vw (1 - X_vw) = alpha SDP.

The left-hand side in this inequality, for evident reasons, is at most OPT. Thus we have proved that OPT >= alpha SDP.

The inequality (2/pi) arccos(t) >= alpha (1 - t), alpha = 0.878
[Figure: plot of (2/pi) arccos(t) and alpha (1 - t) for t in [-1, 1]; the first curve dominates the second.]
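A quick numerical check of this inequality on a fine grid (assuming numpy):

```python
import numpy as np

alpha = 0.878
t = np.linspace(-1.0, 1.0, 100001)
gap = (2.0 / np.pi) * np.arccos(t) - alpha * (1.0 - t)
print(gap.min())             # smallest gap over the grid
print(bool(gap.min() >= 0))  # expected: True
```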

Original proof of the Goemans-Williamson bound
Theorem 6. One has alpha SDP <= OPT <= SDP, with alpha = 0.878.

Proof: As before, let X = [X_vw] be an optimal solution to the SD relaxation. Since X is positive semidefinite, we may write X = U^T U for some k x n matrix U, where k = rank(X). Let u_1, ..., u_n denote the columns of U. Then we have u_v^T u_w = X_vw for all v and all w. Note that X_vv = 1 implies that ||u_v|| = 1 for each v in V. Let r in R^k be a randomly chosen unit vector in R^k. Define a cut vector zeta by

    zeta_v = +1 if r^T u_v >= 0,   zeta_v = -1 if r^T u_v < 0.

What is the expected weight of the cut corresponding to zeta? We claim that

    Pr(zeta_v zeta_w = -1) = (1/pi) arccos(u_v^T u_w) = (1/pi) arccos(X_vw).

Hence one has

    E(zeta_v zeta_w) = ( 1 - (1/pi) arccos(X_vw) ) - (1/pi) arccos(X_vw) = (2/pi) arcsin(X_vw),

where we used again that arccos(t) + arcsin(t) = pi/2 for -1 <= t <= 1. Hence

    E{ (1/4) sum_{v,w in V} a_vw (1 - zeta_v zeta_w) } = (1/4) sum_{v,w in V} a_vw (1 - (2/pi) arcsin(X_vw)) = (1/4) sum_{v,w in V} a_vw (2/pi) arccos(X_vw).

From here we proceed as in Nemirovski's proof.
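The rounding step of this proof is straightforward to sketch (assuming numpy; X_opt is meant to be an optimal solution of the SD relaxation, e.g. X.value from the earlier cvxpy sketch, and A the weight matrix from that sketch):

```python
import numpy as np

def random_hyperplane_cut(X_opt, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    # Factor X = U^T U via an eigendecomposition; the columns of U are the u_v.
    w, V = np.linalg.eigh(X_opt)
    U = np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T
    r = rng.standard_normal(U.shape[0])   # direction of a random hyperplane
    zeta = np.sign(r @ U)                 # zeta_v = sign(r^T u_v)
    zeta[zeta == 0] = 1.0                 # break ties arbitrarily
    return zeta                           # a +/-1 vector, i.e. a cut

def cut_weight(A, zeta):
    # (1/4) * sum_{v,w} a_vw (1 - zeta_v zeta_w)
    return 0.25 * np.sum(A * (1.0 - np.outer(zeta, zeta)))

# Example usage (assuming A and X from the relaxation sketch above):
# print(cut_weight(A, random_hyperplane_cut(X.value)))
```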

Geometric pictures related to the Goemans-Williamson proof
[Figure: unit vectors u_1, ..., u_n on the unit sphere, a random unit vector r, and the angle theta_vw between u_v and u_w.]
The figure shows n unit vectors u_v and a random unit vector r. If u_v is on one side of the hyperplane r^T v = 0 then zeta_v = 1, and if u_v is on the other side of this hyperplane then zeta_v = -1. What is Pr(zeta_v zeta_w = -1)? In the green region r^T u_v and r^T u_w have opposite signs. If r is a random vector, this happens with probability

    2 theta_vw / (2 pi) = theta_vw / pi = arccos(u_v^T u_w) / pi.

Concluding remarks
The last decade gave rise to a revolution in algorithms and software for linear, convex and semidefinite optimization. SDO unifies a wide variety of optimization problems, and SDO models can be solved efficiently. This opens the way to many new applications, including applications which could not be handled some years ago. Since 1995, the techniques discussed in this talk have led to numerous improved approximation algorithms for other combinatorial optimization problems, such as MAX SAT, MAX 2SAT, MAX 3SAT, MAX 4SAT, MAX k-CUT, k-coloring, scheduling, etc.

Some references
A. Ben-Tal and A. Nemirovski. Lectures on Modern Convex Optimization: Analysis, Algorithms and Engineering Applications. Volume 1 of MPS/SIAM Series on Optimization. SIAM, Philadelphia, USA, 2001.
S. Boyd, L. El Ghaoui, E. Feron and V. Balakrishnan. Linear Matrix Inequalities in System and Control Theory. SIAM, 1994.
M. Goemans and D. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM 42, 1115-1145, 1995.
E. de Klerk. Aspects of Semidefinite Programming. Volume 65 in the series Applied Optimization. Kluwer, 2002.
L. Lovász and A. Schrijver. Cones of matrices and set-functions and 0-1 optimization. SIAM Journal on Optimization 1, 166-190, 1991.
S. Sahni and T. Gonzalez. P-complete approximation problems. Journal of the ACM 23, 555-565, 1976.
Y. Nesterov and A. Nemirovskii. Interior-Point Polynomial Algorithms in Convex Programming. SIAM, 1994.
L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review 38, 49-95, 1996.