On the Sandwich Theorem and a 0.878-approximation algorithm for MAX CUT
Kees Roos
Technische Universiteit Delft, Faculteit Electrotechniek, Wiskunde en Informatica
e-mail: C.Roos@its.tudelft.nl, URL: http://ssor.twi.tudelft.nl/~roos
WI4 060, February A.D. 2004
Outline
- Conic Optimization (CO)
- Duality theory for CO
- Semidefinite Optimization (SDO)
- Some examples
- Algorithms for SDO
- Application to some combinatorial problems
  - Maximal clique
  - Graph coloring
  - Lovász sandwich theorem
- Maximal cut problem
  - Semidefinite relaxation of MAX CUT
  - Result of Goemans and Williamson
  - Nemirovski's proof
  - Original proof
- Concluding remarks
- Some references
Optimization Group 1
General conic optimization
A general conic optimization problem is a problem in the conic form
min_{x ∈ R^n} { c^T x : Ax − b ∈ K },
where K is a closed convex pointed cone. We restrict ourselves to the cases where K is either
- the nonnegative orthant R^m_+ (linear inequality constraints);
- the Lorentz (or second-order, or ice-cream) cone L^m (conic quadratic constraints);
- the semidefinite cone S^m_+, i.e. the cone of positive semidefinite matrices of size m × m (Linear Matrix Inequality (LMI) constraints);
- a direct product of such cones.
In all these cases the above problem can be solved efficiently by an interior-point method.
Conic optimization
Conic optimization addresses the problem of minimizing a linear objective function over the intersection of an affine set and a convex cone. The general form is
(COP)  min_{x ∈ R^n} { c^T x : Ax − b ∈ K }.
The convex cone K is a subset of R^m, the objective function c^T x is linear, and x ↦ Ax − b is an affine function from R^n to R^m. Usually A is given as an m × n (constraint) matrix, and b ∈ R^m.
Two important facts: many nonlinear problems can be modelled in this way, and under some weak conditions on the underlying cone K, conic optimization problems can be solved efficiently.
The easiest and best-known case occurs when the cone K is the nonnegative orthant of R^m, i.e. when K = R^m_+:
(LO)  min_{x ∈ R^n} { c^T x : Ax − b ∈ R^m_+ }.
This is nothing else than one of the standard forms of the well-known Linear Optimization (LO) problem. Thus it becomes clear that LO is a special case of CO.
Convex cones
A subset K of R^m is a cone if
a ∈ K, λ ≥ 0  ⟹  λa ∈ K,  (1)
and the cone K is a convex cone if moreover
a′, a″ ∈ K  ⟹  a′ + a″ ∈ K.  (2)
We will impose three more conditions on K. Recall that CO is a generalization of LO. To obtain duality results for CO similar to those for LO, the cone K should inherit three more properties from the cone underlying LO, namely the nonnegative orthant
R^m_+ = { x = (x_1, ..., x_m)^T : x_i ≥ 0, i = 1, ..., m }.
This cone is called the linear cone.
Convex cones (cont.)
The linear cone is not just a convex cone; it is also pointed, it is closed, and it has a nonempty interior. These are exactly the three properties we need. We describe these properties now.
A convex cone K is called pointed if it does not contain a line. This property can be stated equivalently as
a ∈ K, −a ∈ K  ⟹  a = 0.  (3)
A convex cone K is called closed if it is closed under taking limits:
a_i ∈ K (i = 1, 2, ...), a = lim_{i→∞} a_i  ⟹  a ∈ K.  (4)
Finally, denoting the interior of a cone K as int K, we will require that
int K ≠ ∅.  (5)
This means that there exists a vector in K such that a ball of positive radius centered at this vector is contained in K.
Convex cones (cont.)
In conic optimization we only deal with cones K that enjoy all of the above properties. So we always assume that K is a pointed and closed convex cone with a nonempty interior. Apart from the linear cone, two other relevant examples of such cones are:
1. The Lorentz cone
L^m = { x ∈ R^m : x_m ≥ √(x_1² + ... + x_{m−1}²) }.
This cone is also called the second-order cone, or the ice-cream cone.
2. The positive semidefinite cone S^m_+. This cone lives in the space S^m of m × m symmetric matrices (equipped with the Frobenius inner product ⟨A, B⟩ = Tr(AB) = Σ_{i,j} A_{ij} B_{ij}) and consists of all m × m symmetric matrices A which are positive semidefinite, i.e.,
S^m_+ = { A ∈ S^m : x^T A x ≥ 0 for all x ∈ R^m }.
We assume that the cone K in (COP) is a direct product of the form K = K_1 × ... × K_m, where each K_i is either a linear, a Lorentz or a semidefinite cone.
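Membership in the two cones above is easy to test numerically; the sketch below (the helper names are ours, not from the slides) uses the norm condition for L^m and an eigenvalue test for S^m_+.

```python
import numpy as np

def in_lorentz(x, tol=1e-9):
    # x in L^m  iff  x_m >= sqrt(x_1^2 + ... + x_{m-1}^2)
    return x[-1] >= np.linalg.norm(x[:-1]) - tol

def in_psd_cone(A, tol=1e-9):
    # A in S^m_+  iff  A is symmetric with nonnegative eigenvalues
    return np.allclose(A, A.T) and np.linalg.eigvalsh(A).min() >= -tol

print(in_lorentz(np.array([3.0, 4.0, 5.0])))        # True: 5 >= sqrt(9 + 16)
print(in_lorentz(np.array([3.0, 4.0, 4.9])))        # False
print(in_psd_cone(np.array([[2.0, 1.0],
                            [1.0, 2.0]])))          # True: eigenvalues 1 and 3
```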
Conic Duality
Before we discuss the duality theory for conic optimization, we need to define the dual cone of a convex cone K:
K* = { λ ∈ R^m : λ^T a ≥ 0 for all a ∈ K }.
Theorem 1. Let K ⊆ R^m be a nonempty cone. Then
(i) The set K* is a closed convex cone.
(ii) If K has a nonempty interior (i.e., int K ≠ ∅) then K* is pointed.
(iii) If K is a closed convex pointed cone, then int K* ≠ ∅.
(iv) If K is a closed convex cone, then so is K*, and the cone dual to K* is K itself.
Corollary 1. If K ⊆ R^m is a closed pointed convex cone with nonempty interior, then so is K*, and vice versa.
One may easily verify that the linear, the Lorentz and the semidefinite cone are self-dual. Since
(K_1 × ... × K_m)* = K_1* × ... × K_m*,
any direct product of linear, Lorentz and semidefinite cones is self-dual.
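The self-duality of the Lorentz cone can at least be made plausible numerically: λ^T a ≥ 0 should hold for any two points of L^3. A small sampling sketch (a consistency check, not a proof; the sampler is our own construction):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_lorentz(m):
    # a random point of L^m: the last entry dominates the norm of the rest
    u = rng.standard_normal(m - 1)
    return np.append(u, np.linalg.norm(u) + rng.random())

# lambda^T a >= 0 for all sampled a, lambda in L^3, as self-duality predicts
ok = all(sample_lorentz(3) @ sample_lorentz(3) >= 0 for _ in range(1000))
print(ok)   # True
```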
Conic Duality
Now we are ready to deal with the problem dual to a conic problem (COP). We start by observing that whenever x is a feasible solution for (COP), the definition of K* implies
λ^T (Ax − b) ≥ 0 for all λ ∈ K*,
and hence x satisfies the scalar inequality
λ^T Ax ≥ λ^T b,  λ ∈ K*.
It follows that whenever λ ∈ K* satisfies the relation
A^T λ = c,  (6)
then one has
c^T x = (A^T λ)^T x = λ^T Ax ≥ λ^T b = b^T λ
for all x feasible for (COP). So, if λ ∈ K* satisfies (6), then the quantity b^T λ is a lower bound for the optimal value of (COP). The best lower bound obtainable in this way is the optimal value of the problem
(COD)  max_{λ ∈ R^m} { b^T λ : A^T λ = c, λ ∈ K* }.
By definition, (COD) is the dual problem of (COP). Using Theorem 1 (iv), one easily verifies that the duality is symmetric: the dual problem is conic, and the problem dual to the dual problem is the primal problem.
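The lower-bound argument can be illustrated on a tiny LO instance (K = R^3_+); the data below are made up purely for illustration:

```python
import numpy as np

# (COP): min c^T x  s.t. Ax - b >= 0, with dual (COD): max b^T lam
# s.t. A^T lam = c, lam >= 0. A made-up 3x2 instance:
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 2.0])
lam = np.array([1.0, 1.0, 0.0])      # lam in K* = R^3_+
c = A.T @ lam                        # chosen so that A^T lam = c holds exactly
x = np.array([1.5, 1.2])             # primal feasible: Ax - b >= 0
assert np.all(A @ x - b >= 0)
print(c @ x >= b @ lam)              # True: b^T lam lower-bounds c^T x
```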
Conic Duality
Indeed, from the construction of the dual problem it immediately follows that we have the weak duality property: if x is feasible for (COP) and λ is feasible for (COD), then
c^T x − b^T λ ≥ 0.
The crucial question is, of course, whether we have equality of the optimal values whenever (COP) and (COD) have optimal values. Different from the LO case, however, this is in general not the case, unless some additional conditions are satisfied. The following theorem clarifies the situation.
We call the problem (COP) solvable if it has a (finite) optimal value, and this value is attained. Before stating the theorem it may be worth pointing out that a finite optimal value is not necessarily attained. For example, the problem
min { x : [ x  1 ; 1  y ] ⪰ 0, x, y ∈ R }
has optimal value 0 (the constraint amounts to x ≥ 0, y ≥ 0, xy ≥ 1), but one may easily verify that this value is not attained.
We need one more definition: if there exists an x such that Ax − b ∈ int K, then we say that (COP) is strictly feasible. We have similar, and obvious, definitions for (COD) being solvable and strictly feasible, respectively.
Conic Duality Theorem
Theorem 2. Let the primal problem (COP) and its dual problem (COD) be as given above. Then one has:
(i) a. If (COP) is bounded below and strictly feasible, then (COD) is solvable and the respective optimal values are equal.
    b. If (COD) is bounded above and strictly feasible, then (COP) is solvable, and the respective optimal values are equal.
(ii) Suppose that at least one of the two problems (COP) and (COD) is bounded and strictly feasible. Then a primal-dual feasible pair (x, λ) is comprised of optimal solutions to the respective problems
    a. if and only if b^T λ = c^T x (zero duality gap);
    b. if and only if λ^T (Ax − b) = 0 (complementary slackness).
This result is slightly weaker than the duality theorem for LO: in the LO case the theorem holds with "feasible" everywhere in place of "strictly feasible". The adjective "strictly" cannot be omitted here, however. For a more extensive discussion and some appropriate counterexamples we refer to the book of Ben-Tal and Nemirovski.
Bad duality example
Consider the following conic problem with two variables x = (x_1, x_2)^T and the 3-dimensional ice-cream cone:
min { x_2 : Ax − b = (x_1, x_2, x_1)^T ∈ L^3 }.
The problem is equivalent to the problem
min { x_2 : √(x_1² + x_2²) ≤ x_1 },
i.e., to the problem min { x_2 : x_2 = 0, x_1 ≥ 0 }. The problem is clearly solvable, and its optimal set is the ray { x_1 ≥ 0, x_2 = 0 }.
Now let us build the conic dual to our (solvable!) primal. Since the cone dual to an ice-cream cone is this ice-cream cone itself, the dual problem is
max { b^T λ = 0 : λ_1 + λ_3 = 0, λ_2 = 1, λ ∈ L^3 }.
In spite of the fact that the primal problem is solvable, the dual is infeasible: indeed, assuming that λ is dual feasible, we have λ ∈ L^3, which means that λ_3 ≥ √(λ_1² + λ_2²); since also λ_1 + λ_3 = 0, we come to λ_2 = 0, which contradicts the equality λ_2 = 1.
LO as a special case of SOCO
By definition, the m-dimensional Lorentz cone is given by
L^m = { x ∈ R^m : x_m ≥ √(x_1² + ... + x_{m−1}²) }.
Hence, the 1-dimensional Lorentz cone is given by
L^1 = { x ∈ R : x ≥ 0 }.
Thus it follows that
R^n_+ = L^1 × L^1 × ... × L^1  (n times).
As a consequence, every LO problem can be written as a SOCO problem.
SOCO as a special case of SDO
Recall that the m-dimensional Lorentz cone is given by
L^m = { x ∈ R^m : x_m ≥ √(x_1² + ... + x_{m−1}²) }.
One may easily verify that x ∈ L^m if and only if
[ x_m      x_{m−1}  x_{m−2}  ...  x_1 ]
[ x_{m−1}  x_m      0        ...  0   ]
[ x_{m−2}  0        x_m      ...  0   ]
[ ...                                 ]
[ x_1      0        0        ...  x_m ]  ∈ S^m_+.
The above matrix depends linearly on (the coordinates of) the vector x, and hence any SOCO constraint can be written as an SDO constraint.
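The equivalence can be spot-checked numerically by building this "arrow" matrix and testing its eigenvalues; a small sketch with helper names of our own choosing:

```python
import numpy as np

def arrow(x):
    # m x m matrix with x_m on the diagonal and (x_{m-1}, ..., x_1)
    # bordering the first row and column, as on the slide
    m = len(x)
    M = x[-1] * np.eye(m)
    M[0, 1:] = x[-2::-1]
    M[1:, 0] = x[-2::-1]
    return M

def in_lorentz(x):
    return x[-1] >= np.linalg.norm(x[:-1])

for x in (np.array([3.0, 4.0, 5.1]), np.array([3.0, 4.0, 4.9])):
    psd = np.linalg.eigvalsh(arrow(x)).min() >= -1e-9
    print(in_lorentz(x) == psd)     # True in both cases
```

The eigenvalues of this arrow matrix are x_m (with multiplicity m − 2) and x_m ± ‖(x_1, ..., x_{m−1})‖, which makes the equivalence visible directly.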
SOCO as a special case of SDO (proof)
Let a ∈ R^{m−1} and α ∈ R. Then (a, α) ∈ L^m if and only if ‖a‖ ≤ α. We need to show that this holds if and only if
[ α  a^T ; a  αI ] ⪰ 0,
where I denotes the (m−1) × (m−1) identity matrix. The latter is equivalent to
(β, b^T) [ α  a^T ; a  αI ] (β; b) ≥ 0 for all β ∈ R, b ∈ R^{m−1}.
Thus we obtain
[ α  a^T ; a  αI ] ⪰ 0
⟺ αβ² + 2β b^T a + α b^T b ≥ 0, ∀ β ∈ R, b ∈ R^{m−1}
⟺ α²β² + 2αβ b^T a + α² b^T b ≥ 0, ∀ β ∈ R, b ∈ R^{m−1}, α ≥ 0
⟺ (αβ + b^T a)² + α² b^T b − (b^T a)² ≥ 0, ∀ β ∈ R, b ∈ R^{m−1}, α ≥ 0
⟺ α² b^T b − (b^T a)² ≥ 0, ∀ b ∈ R^{m−1}, α ≥ 0  (choose β with αβ + b^T a = 0)
⟺ α² a^T a − (a^T a)² ≥ 0, α ≥ 0  (the worst case is b = a, by the Cauchy-Schwarz inequality)
⟺ α² − a^T a ≥ 0, α ≥ 0
⟺ ‖a‖ ≤ α.
This proves the claim.
Convex quadratic optimization (CQO) as a special case of SOCO
Any convex quadratic problem with quadratic constraints can be written as
min { f_0(x) : f_i(x) ≤ 0, i = 1, ..., m },
where
f_i(x) = (B_i x + b_i)^T (B_i x + b_i) − c_i^T x − d_i,  i = 0, 1, ..., m.
The objective can be made linear by introducing an extra variable τ such that f_0(x) ≤ τ, and minimizing τ. Redefining f_0(x) := f_0(x) − τ, the problem becomes
min { τ : f_i(x) ≤ 0, i = 0, ..., m }.
So it suffices to show that every convex quadratic constraint f_i(x) ≤ 0 can be written as a SOC constraint. Omitting the index i, such a constraint has the form
(Bx + b)^T (Bx + b) ≤ c^T x + d, or ‖Bx + b‖² ≤ c^T x + d.
This is not yet a SOC constraint! A SOC constraint has the form
‖Gx + g‖ ≤ p^T x + q, i.e., (Gx + g; p^T x + q) ∈ L^k.
Convex quadratic optimization (CQO) as a special case of SOCO (cont.)
We have to bring the constraint ‖Bx + b‖² ≤ c^T x + d into SOC form. To this end we observe that
c^T x + d = [c^T x + d + 1/4]² − [c^T x + d − 1/4]².
Thus we have
‖Bx + b‖² ≤ c^T x + d  ⟺  ‖Bx + b‖² + [c^T x + d − 1/4]² ≤ [c^T x + d + 1/4]².
This is equivalent, for a suitable k (k = rowsize(B) + 2), to
(Bx + b; c^T x + d − 1/4; c^T x + d + 1/4) ∈ L^k.
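The identity t = (t + 1/4)² − (t − 1/4)² behind this reformulation, and the resulting equivalence, can be spot-checked on random data (our own notation; points within 1e-6 of the boundary are skipped to avoid rounding artifacts):

```python
import numpy as np

def in_lorentz(x):
    return x[-1] >= np.linalg.norm(x[:-1])

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 2))
b, c, d = rng.standard_normal(3), rng.standard_normal(2), 2.0

for _ in range(200):
    x = rng.standard_normal(2)
    t = c @ x + d
    lhs = np.dot(B @ x + b, B @ x + b)              # ||Bx + b||^2
    soc = np.concatenate([B @ x + b, [t - 0.25, t + 0.25]])
    if lhs <= t - 1e-6:
        assert in_lorentz(soc)                      # constraint holds -> in L^k
    elif lhs >= t + 1e-6:
        assert not in_lorentz(soc)                  # constraint fails -> not in L^k
print("equivalence verified on 200 random points")
```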
More on Semidefinite Optimization
A semidefinite optimization problem can be written in the form
(SD)  d = sup { b^T y : A(y) ⪯ C },
where
- A(y) := y_1 A_1 + ... + y_m A_m;
- A_i = A_i^T ∈ R^{n×n}, 1 ≤ i ≤ m;
- C = C^T ∈ R^{n×n};
- A(y) ⪯ C means: C − A(y) is positive semidefinite (PSD).
Convex quadratic optimization (CQO) as a special case of SDO z T z ρ I z T z ρ 0 We have seen before that any convex quadratic constraint has the form Bx + b 2 c T x + d. By the above statement (which is a simple version of the so-called Schur complement lemma) this can be equivalently expressed as the SD constraint I Bx + b (Bx + b) T c T 0. x + d Optimization Group 18
Semidefinite duality
Primal problem:
d = sup { b^T y : Σ_{i=1}^m y_i A_i + S = C, S ⪰ 0 }
Dual problem:
p = inf { Tr(CX) : Tr(A_i X) = b_i (∀i), X ⪰ 0 }
Duality gap:
Tr(CX) − b^T y = Tr(SX) ≥ 0.
Central path:
SX = µI, µ > 0.
The central path exists if and only if the primal and the dual problem are strictly feasible (the interior-point condition, IPC); then both problems have optimal solutions, and the duality gap is 0 at optimality.
Algorithms for SDO
Dikin-type affine scaling approach
The primal-dual search directions (ΔX, Δy, ΔZ) must satisfy the feasibility conditions
Tr(A_i ΔX) = 0, i = 1, ..., m,
Σ_{i=1}^m Δy_i A_i + ΔZ = 0.
Hence ΔX and ΔZ are orthogonal: Tr(ΔX ΔZ) = 0. The duality gap after the step is Tr((X + ΔX)(Z + ΔZ)). We minimize this duality gap over the so-called Dikin ellipsoid. The search directions follow by solving
ΔX + D ΔZ D = − XZX / (Tr((XZ)²))^{1/2},
subject to the feasibility conditions. Here D is the so-called Nesterov-Todd (NT) scaling matrix
D := Z^{−1/2} (Z^{1/2} X Z^{1/2})^{1/2} Z^{−1/2}.
We assume that the matrices A_i are linearly independent.
Measure of centrality
The eigenvalues of XZ are real and positive if X, Z ≻ 0, since
XZ ∼ X^{−1/2} (XZ) X^{1/2} = X^{1/2} Z X^{1/2} ≻ 0,
where ∼ denotes the similarity relation. The proximity to the central path is measured by
κ(XZ) := λ_max(XZ) / λ_min(XZ),
where λ_max(XZ) denotes the largest eigenvalue of XZ and λ_min(XZ) the smallest.
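The similarity argument also gives a numerically stable way to compute κ(XZ): the eigenvalues of XZ are those of the symmetric matrix X^{1/2} Z X^{1/2}. A small sketch:

```python
import numpy as np

def kappa(X, Z):
    # eigenvalues of XZ = eigenvalues of X^{1/2} Z X^{1/2} (symmetric)
    w, V = np.linalg.eigh(X)
    Xh = V @ np.diag(np.sqrt(w)) @ V.T      # symmetric square root of X
    ev = np.linalg.eigvalsh(Xh @ Z @ Xh)
    return ev.max() / ev.min()

# XZ = 2I here, so the pair is perfectly centered: kappa = 1
print(np.isclose(kappa(np.diag([1.0, 2.0]), np.diag([2.0, 1.0])), 1.0))  # True
```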
Primal-dual Dikin affine-scaling algorithm
Input: a strictly feasible pair (X^0, Z^0); a step size parameter α > 0; an accuracy parameter ε > 0.
begin
  X := X^0; Z := Z^0;
  while Tr(XZ) ≥ ε do
  begin
    X := X + αΔX; y := y + αΔy; Z := Z + αΔZ;
  end
end
Theorem 3. Let τ > 1 be such that κ(X^0 Z^0) ≤ τ. If α = 1/(τ√n), the Dikin Step Algorithm stops after at most τnL iterations, where L := ln(Tr(X^0 Z^0)/ε). The output is a feasible primal-dual pair (X*, Z*) satisfying κ(X* Z*) ≤ τ and Tr(X* Z*) ≤ ε.
Primal-dual Newton direction
As before, the search directions (ΔX, Δy, ΔZ) must satisfy
Tr(A_i ΔX) = 0, i = 1, ..., m,
Σ_{i=1}^m Δy_i A_i + ΔZ = 0.
We want
(X + ΔX)(Z + ΔZ) = µI.
Omitting the quadratic term ΔX ΔZ this leads to
XZ + ΔX Z + X ΔZ = µI,
which can be rewritten as
ΔX + X ΔZ Z^{−1} = µZ^{−1} − X.
Note that ΔZ will be symmetric, but ΔX possibly not! To overcome this difficulty we use instead the equation
ΔX + D ΔZ D = µZ^{−1} − X,
and we obtain the so-called Nesterov-Todd (NT) directions. Here D is the same NT scaling matrix as introduced before.
Proximity measure
Let (X, Z) be a strictly feasible pair. We measure the distance of this pair to the µ-center by
δ(X, Z; µ) := (1/2) ‖ √µ V^{−1} − V/√µ ‖,
where V is determined by
V² = D^{−1/2} X Z D^{1/2}.
Theorem 4. If δ := δ(X, Z; µ) < 1, then the full Newton step is strictly feasible and the duality gap attains its target value nµ. Moreover,
δ(X⁺, Z⁺; µ) ≤ δ² / √(2(1 − δ²)).
Algorithm with full Newton steps
Input: a proximity parameter τ, 0 ≤ τ < 1; an accuracy parameter ε > 0; X^0, Z^0, µ^0 > 0 such that δ(X^0, Z^0; µ^0) ≤ τ; a barrier update parameter θ, 0 < θ < 1.
begin
  X := X^0; Z := Z^0; µ := µ^0;
  while nµ ≥ (1 − θ)ε do
  begin
    X := X + ΔX; Z := Z + ΔZ; µ := (1 − θ)µ;
  end
end
Theorem 5. If τ = 1/√2 and θ = 1/(2√n), then the above algorithm with full NT steps requires at most
2√n log(nµ^0/ε)
iterations. The output is a primal-dual pair (X, Z) such that Tr(XZ) ≤ ε.
Approximation algorithm for discrete problems, via SDO relaxation
In a graph G = (V, E), find a maximal clique, i.e., a subset C ⊆ V of maximal cardinality such that all vertices of C are mutually adjacent: {i, j} ∈ E for all i, j ∈ C (i ≠ j).
Linear model:
ω(G) := max { e^T x : x_i + x_j ≤ 1, {i, j} ∉ E (i ≠ j); x_i ∈ {0, 1}, i ∈ V }.
Quadratic model:
ω(G) = max { e^T x : x_i x_j = 0, {i, j} ∉ E (i ≠ j); x_i ∈ {0, 1}, i ∈ V }.
Semidefinite relaxation:
ω(G) ≤ ϑ(G) := max { Tr(ee^T X) = e^T X e : X_{ij} = 0, {i, j} ∉ E (i ≠ j); Tr(X) = 1; X ⪰ 0 }.
Lovász sandwich theorem (1979)
Semidefinite dual for ϑ(G):
ϑ(G) = min { λ : Y + ee^T ⪯ λI; Y_{ij} = 0, {i, j} ∈ E; Y_{ii} = 0, i ∈ V }.
A coloring of G with k colors induces a feasible Y with λ = k, yielding ϑ(G) ≤ k. Thus we obtain the sandwich theorem of Lovász:
ω(G) ≤ ϑ(G) ≤ χ(G),
where ω(G) and χ(G) are the clique and chromatic numbers of G. The name "sandwich theorem" comes from the interesting paper of D.E. Knuth, The sandwich theorem, The Electronic Journal of Combinatorics, 1:1-48, 1994.
The next two sheets provide proofs of both inequalities based on semidefinite duality. The first proof is more or less classical, but the second proof seems to be new.
If G is perfect then ω(G) = ϑ(G) = χ(G). (Grötschel, Lovász, Schrijver, 1981)
Proof of the first inequality
Let V = {v_1, v_2, ..., v_n}, and let C = {v_1, v_2, ..., v_k} be a clique in G = (V, E) of size k, 1 ≤ k ≤ n. So the vertices v_i ∈ C are mutually connected. Define
x_C = (1, ..., 1, 0, ..., 0)^T  (k ones, n − k zeros),  X := (1/k) x_C x_C^T.
Then X is (1/k times) the matrix with the k × k all-one block in its upper-left corner and zeros elsewhere, and X satisfies
X_{ij} = 0, {i, j} ∉ E (i ≠ j), and X ⪰ 0, Tr(X) = 1.
Since e^T X e = k, ϑ(G) ≥ k.
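The feasibility of this X and the value e^T X e = k are easy to verify in code; the 5-node, k = 3 instance below is our own example:

```python
import numpy as np

n, k = 5, 3
x_C = np.zeros(n)
x_C[:k] = 1.0                       # indicator vector of the clique
X = np.outer(x_C, x_C) / k          # X = (1/k) x_C x_C^T

assert np.isclose(np.trace(X), 1.0)
assert np.linalg.eigvalsh(X).min() >= -1e-9     # X is PSD (rank 1)
e = np.ones(n)
print(e @ X @ e)                    # 3.0 = k, so theta(G) >= k
```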
Proof of the second inequality
Let Γ = (C_1, C_2, ..., C_k) be a coloring of G = (V, E) with k colors. So the C_i are vertex-disjoint cocliques. Let γ_i := |C_i|, 1 ≤ i ≤ k. For i = 1, 2, ..., k define
M_i := −k (J_{γ_i} − I_{γ_i}),
where J_{γ_i} denotes the all-one matrix of size γ_i × γ_i and I_{γ_i} the identity matrix of the same size. Then the block-diagonal matrix
Y = diag(M_1, M_2, ..., M_k)
is dual feasible with λ = k: indeed,
λI − Y − ee^T = k diag(J_{γ_1}, ..., J_{γ_k}) − ee^T ⪰ 0
by the Cauchy-Schwarz inequality. Hence ϑ(G) ≤ k.
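Dual feasibility of this Y, i.e. Y + ee^T ⪯ kI, can be confirmed numerically on a made-up 3-coloring (note the minus sign in the blocks, which the feasibility check requires):

```python
import numpy as np

gamma = [2, 2, 1]                   # class sizes of a 3-coloring (made up)
k, n = len(gamma), sum(gamma)
Y = np.zeros((n, n))
i = 0
for g in gamma:
    # block M_i = -k (J - I) for a color class of size g
    Y[i:i+g, i:i+g] = -k * (np.ones((g, g)) - np.eye(g))
    i += g
e = np.ones(n)
M = k * np.eye(n) - Y - np.outer(e, e)        # k I - Y - e e^T
print(np.linalg.eigvalsh(M).min() >= -1e-9)   # True: Y + ee^T <= k I, so theta(G) <= k
```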
The maximal cut problem
Input: a graph G = (V, E) with rational nonnegative weights a_vw for {v, w} ∈ E. We take a_vw = 0 if {v, w} ∉ E.
Goal: partition the nodes into two classes so as to maximize the sum of the weights of the edges whose endpoints are in different classes (the weight of the cut).
[Figure: a weighted graph on the nodes a, b, c, d, e with edge weights 1-5.]
The maximal cut problem (cont.)
[Figure: the same weighted graph on the nodes a, b, c, d, e.]
The maximum cut problem (MAX CUT) is NP-hard, even if a_vw ∈ {0, 1} for all {v, w} ∈ E.
N.B. The problem is solvable in polynomial time if the graph is planar.
Relevance of the problem
In a mathematical sense the maximum cut problem is interesting in itself. But there also exist interesting applications in a wide variety of domains:
- finding the ground state of magnetic particles subject to a field in the Ising spin glass model;
- minimizing the number of vias (holes) drilled in a two-sided circuit board;
- solving network design problems;
- finding the radius of nonsingularity of a square matrix.
Solution approaches
Several approaches:
- integer/linear optimization
- enumerative techniques (e.g., branch and cut)
- heuristics (e.g., local search methods)
- approximation algorithms
Approximation algorithms
An α-approximation algorithm for an NP-hard optimization problem:
- runs in polynomial time;
- returns a feasible solution with value not worse than a factor α from optimal (with α < 1 in case of a maximization and α > 1 in case of a minimization problem).
For randomized α-approximation algorithms the expected value of the solution is within a factor α of optimal.
A 1/2-approximation algorithm of Sahni and Gonzalez (1976) was the best known for MAX CUT for almost 20 years. Goemans and Williamson (1995) improved this ratio from 1/2 to 0.878. If a 0.94-approximation algorithm exists, then P = NP. This has been shown by Trevisan, Sorkin, Sudan and Williamson (1996) and by Håstad (1997).
Quadratic model for MAX CUT
Let (S, T) be any cut, i.e., a partitioning of the set V of the nodes into two disjoint classes S and T. Assuming |V| = n, the cut can be identified with an n-dimensional {−1, 1}-vector x as follows:
x_v = 1 if v ∈ S, x_v = −1 if v ∈ T,  v ∈ V.
The weight of the cut (S, T) is then given by
(1/4) Σ_{v,w ∈ V} a_vw (1 − x_v x_w).
We conclude that MAX CUT can be posed as follows:
max_x { (1/4) Σ_{v,w ∈ V} a_vw (1 − x_v x_w) : x_v² = 1, v ∈ V }.
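The cut-weight formula can be spot-checked on a made-up 3-node weighted graph (each edge appears twice in the double sum, which the factor 1/4 compensates):

```python
import numpy as np

A = np.array([[0, 2, 1],
              [2, 0, 3],
              [1, 3, 0]], dtype=float)   # symmetric weights (made up)
x = np.array([1, -1, -1])                # the cut S = {0}, T = {1, 2}
w = 0.25 * np.sum(A * (1 - np.outer(x, x)))
print(w)   # 3.0: exactly the weight of the cut edges (0,1) and (0,2)
```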
Semidefinite relaxation of MAX CUT
We start from
max_x { (1/4) Σ_{v,w ∈ V} a_vw (1 − x_v x_w) : x_v² = 1, v ∈ V }.
Defining the matrix X = xx^T, we have X_vw = x_v x_w, and the matrix X is a symmetric positive semidefinite matrix of rank 1. Thus we can reformulate the problem as
max_X { (1/4) Σ_{v,w ∈ V} a_vw (1 − X_vw) : X ⪰ 0, rank(X) = 1, X_vv = 1, v ∈ V }.
Omitting the rank constraint we arrive at the following relaxation:
max_X { (1/4) Σ_{v,w ∈ V} a_vw (1 − X_vw) : X ⪰ 0, X_vv = 1, v ∈ V }.
The optimal solutions of this problem are the same as for the SDO problem
min_X { Σ_{v,w ∈ V} a_vw X_vw = Tr(AX) : X ⪰ 0, X_vv = 1, v ∈ V },
where A = (a_vw). Note that X_vv = 1 iff Tr(E_v X) = 1, where (E_v)_{vv} = 1 and all other elements of E_v are zero.
Result of Goemans and Williamson
OPT = max_x { (1/4) Σ_{v,w ∈ V} a_vw (1 − x_v x_w) : x_v² = 1, v ∈ V }  (7)
SDP = max_X { (1/4) Σ_{v,w ∈ V} a_vw (1 − X_vw) : X ⪰ 0, X_vv = 1, v ∈ V }  (8)
The relaxation (8) can be used in an ingenious way to obtain a 0.878-approximation algorithm for the maximal cut problem. It is based on
Theorem 6. With α = 0.878, one has α·SDP ≤ OPT ≤ SDP,
and on a rounding procedure that generates a cut whose expected weight is at least α·SDP. (Goemans & Williamson, 1994)
S.D.G.
Nemirovski's proof of the Goemans-Williamson bound
Theorem 6. One has α·SDP ≤ OPT ≤ SDP, with α = 0.878.
Proof: The right inequality is obvious. To get the left-hand inequality, let X = [X_vw] be an optimal solution to the SD relaxation. Since X is positive semidefinite, it is the covariance matrix of a Gaussian random vector ξ with zero mean, so that E{ξ_v ξ_w} = X_vw. Now consider the random vector ζ = sign[ξ] comprised of the signs of the entries of ξ. A realization of ζ is almost surely a vector with coordinates ±1, i.e., it is a cut. A straightforward computation demonstrates that
E{ζ_v ζ_w} = (2/π) arcsin(X_vw).
It follows that the expected weight of the cut vector ζ is given by
E[ (1/4) Σ_{v,w ∈ V} a_vw (1 − ζ_v ζ_w) ] = (1/4) Σ_{v,w ∈ V} a_vw (1 − (2/π) arcsin(X_vw)) = (1/4) Σ_{v,w ∈ V} a_vw (2/π) arccos(X_vw);
we used that arccos(t) + arcsin(t) = π/2 for −1 ≤ t ≤ 1. Now one may easily verify that if −1 ≤ t ≤ 1, then
(2/π) arccos(t) ≥ α(1 − t),  α = 0.878.
Using also a_vw ≥ 0, this implies that
E[ (1/4) Σ_{v,w ∈ V} a_vw (1 − ζ_v ζ_w) ] ≥ (α/4) Σ_{v,w ∈ V} a_vw (1 − X_vw) = α·SDP.
The left-hand side of this inequality, for evident reasons, is at most OPT. Thus we have proved that OPT ≥ α·SDP.
The inequality (2/π) arccos(t) ≥ α(1 − t), α = 0.878
[Figure: the graphs of (2/π) arccos(t) and α(1 − t) for t ∈ [−1, 1]; the first curve dominates the second on the whole interval.]
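The picture can be backed up numerically: the gap (2/π)·arccos(t) − α(1 − t) stays nonnegative on a fine grid of [−1, 1] (a sanity check, not a proof):

```python
import numpy as np

alpha = 0.878
t = np.linspace(-1.0, 1.0, 100001)
gap = (2.0 / np.pi) * np.arccos(t) - alpha * (1.0 - t)
print(gap.min() >= 0.0)   # True: the inequality holds on the whole grid
```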
Original proof of the Goemans-Williamson bound
Theorem 6. One has α·SDP ≤ OPT ≤ SDP, with α = 0.878.
Proof: As before, let X = [X_vw] be an optimal solution to the SD relaxation. Since X is positive semidefinite, we may write X = U^T U for some k × n matrix U, where k = rank(X). Let u_1, ..., u_n denote the columns of U. Then we have u_v^T u_w = X_vw for all v and w. Note that X_vv = 1 implies that ‖u_v‖ = 1 for each v ∈ V. Let r ∈ R^k be a randomly chosen unit vector in R^k. Define a cut vector ζ by
ζ_v = +1 if r^T u_v ≥ 0,  ζ_v = −1 if r^T u_v < 0.
What is the expected weight of the cut corresponding to ζ? We claim that
Pr(ζ_v ζ_w = −1) = (1/π) arccos(u_v^T u_w) = (1/π) arccos(X_vw).
Hence one has
E(ζ_v ζ_w) = (1 − (1/π) arccos(X_vw)) − (1/π) arccos(X_vw) = (2/π) arcsin(X_vw),
where we used again that arccos(t) + arcsin(t) = π/2. Hence
E[ (1/4) Σ_{v,w ∈ V} a_vw (1 − ζ_v ζ_w) ] = (1/4) Σ_{v,w ∈ V} a_vw (1 − (2/π) arcsin(X_vw)) = (1/4) Σ_{v,w ∈ V} a_vw (2/π) arccos(X_vw).
From here we proceed as in Nemirovski's proof.
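The whole construction can be sketched on the triangle graph, where an optimal X of the relaxation is known in closed form (three unit vectors at mutual angle 120°); the instance and helper names are ours, and we skip solving the SDP itself:

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)          # unweighted triangle
# columns u_v at 120 degrees: X = U^T U has X_vv = 1, X_vw = -1/2
U = np.array([[1.0, -0.5, -0.5],
              [0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2]])
sdp = 0.25 * np.sum(A * (1 - U.T @ U))          # relaxation value 2.25

def round_once():
    r = rng.standard_normal(2)                  # random direction in R^k, k = 2
    return np.where(U.T @ r >= 0, 1, -1)        # cut vector zeta_v = sign(r^T u_v)

cuts = [0.25 * np.sum(A * (1 - np.outer(z, z)))
        for z in (round_once() for _ in range(500))]
print(np.mean(cuts) >= 0.878 * sdp)             # True: mean weight >= alpha * SDP
```

On this instance every hyperplane separates one vertex from the other two, so each rounded cut has weight 2, comfortably above 0.878 · 2.25.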
Geometric pictures related to the Goemans-Williamson proof
[Figure: unit vectors u_1, ..., u_n on the unit circle, a random unit vector r, the hyperplane r^T u = 0, and the angle θ_vw between u_v and u_w.]
The figure shows n unit vectors u_v and a random unit vector r. If u_v is on one side of the hyperplane r^T u = 0 then ζ_v = 1, and if u_v is on the other side of this hyperplane then ζ_v = −1. What is Pr(ζ_v ζ_w = −1)? In the green region r^T u_v and r^T u_w have opposite signs. If r is a random unit vector, this happens with probability
2θ_vw / (2π) = θ_vw / π = arccos(u_v^T u_w) / π.
Concluding remarks
- The last decade gave rise to a revolution in algorithms and software for linear, convex and semidefinite optimization.
- SDO unifies a wide variety of optimization problems.
- SDO models can be solved efficiently. This opens the way to many new applications, including applications which could not be handled some years ago.
- Since 1995, the techniques discussed in this talk have led to numerous improved approximation algorithms for other combinatorial optimization problems, e.g., MAX SAT, MAX 2SAT, MAX 3SAT, MAX 4SAT, MAX k-CUT, k-coloring, scheduling, etc.
Some references
A. Ben-Tal and A. Nemirovski. Lectures on Modern Convex Optimization: Analysis, Algorithms and Engineering Applications. Volume 1 of MPS/SIAM Series on Optimization. SIAM, Philadelphia, USA, 2001.
S. Boyd, L. El Ghaoui, E. Feron and V. Balakrishnan. Linear Matrix Inequalities in System and Control Theory. SIAM, 1994.
M. Goemans and D. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42:1115-1145, 1995.
E. de Klerk. Aspects of Semidefinite Programming. Volume 65 in the series Applied Optimization. Kluwer, 2002.
L. Lovász and A. Schrijver. Cones of matrices and set-functions and 0-1 optimization. SIAM Journal on Optimization, 1:166-190, 1991.
S. Sahni and T. Gonzalez. P-complete approximation problems. Journal of the ACM, 23:555-565, 1976.
Y. Nesterov and A. Nemirovskii. Interior-Point Polynomial Algorithms in Convex Programming. SIAM, 1994.
L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Review, 38:49-95, 1996.