Lecture: Semidefinite Programming and Graph Partitioning


Approximation Algorithms and Hardness of Approximation — April 16, 2013
Lecture 14
Lecturer: Alantha Newman — Scribe: Marwa El Halabi

1 Semidefinite Programming and Graph Partitioning

In previous lectures, we saw how linear programs can help us devise approximation algorithms for various NP-hard optimization problems. In this lecture, we introduce a more general class of relaxations, vector programs, which allow variables to be vectors instead of scalars. In particular, vector programs are a relaxation for problems that can be formulated as strict quadratic programs.

Definition 1 (Quadratic program) A quadratic program (QP) is an optimization problem whose objective function is quadratic in integer-valued variables, subject to quadratic constraints. If in addition all of its monomials have degree 0 or 2, it is a strict quadratic program.

As we will show later, vector programs are equivalent to semidefinite programs, which can be viewed as a generalization of linear programs. We can solve semidefinite programs with the ellipsoid algorithm up to an arbitrarily small additive error $\epsilon$, in time polynomial in the input size and $\log(1/\epsilon)$. We will illustrate the concept of semidefinite programming by examining some graph partitioning problems. We start with the well-known maximum cut problem.

1.1 Maximum Cut

Given an undirected graph $G = (V, E)$, the goal of the maximum cut problem is to find a partition of the vertices $(S, \bar{S})$ that maximizes the number of edges crossing the cut, i.e., edges with one endpoint in $S$ and the other endpoint in $\bar{S}$. We denote the number of edges crossing the maximum cut by $\mathrm{OPT}_{\text{max-cut}}$. The max-cut problem is known to be NP-hard, so our goal is to find a good polynomial-time approximation algorithm for it. Note that $|E|$ is an upper bound on $\mathrm{OPT}_{\text{max-cut}}$. Thus, we can achieve a $1/2$-approximation for max-cut simply by placing each vertex in $S$ or $\bar{S}$ independently with probability $1/2$.

The max-cut problem can be formulated as the quadratic program

$$\max \sum_{(i,j) \in E} x_i (1 - x_j) + x_j (1 - x_i) \qquad (1)$$
$$\text{s.t. } x_i \in \{0, 1\} \quad \forall i \in V,$$

where $x_i$ is set to one if vertex $i$ is in $S$, and to zero if it is in $\bar{S}$. Note that an edge with both endpoints in the same set contributes nothing to the objective function. If we relax the integrality constraint in this QP, we obtain the formulation

$$\max \sum_{(i,j) \in E} x_i (1 - x_j) + x_j (1 - x_i) \qquad (2)$$
$$\text{s.t. } x_i \in [0, 1] \quad \forall i \in V.$$

If we could solve this relaxed version optimally, we would still be able to recover a cut. Consider an optimal solution of the relaxed QP, of value $\mathrm{OPT}_{\mathrm{rel}}$. For any vertex $h \in V$ whose variable $x_h$ is assigned a fractional value, let $\delta(h)$ denote the set of edges incident to $h$ and rewrite the objective function (1) as

$$\sum_{(i,j) \in E \setminus \delta(h)} \big( x_i (1 - x_j) + x_j (1 - x_i) \big) \;+\; x_h \underbrace{\sum_{j : (h,j) \in \delta(h)} (1 - x_j)}_{A} \;+\; (1 - x_h) \underbrace{\sum_{j : (h,j) \in \delta(h)} x_j}_{B}.$$

Then if $A \ge B$, we round $x_h$ to one; otherwise we round it to zero. Since the edges in $\delta(h)$ contribute $x_h A + (1 - x_h) B \le \max(A, B)$, this step never decreases the objective.
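To make this rounding step concrete, here is a minimal sketch in Python (NumPy only). The graph representation and the helper names `round_fractional_qp` and `cut_value` are ours, for illustration; they are not part of the lecture.

```python
import numpy as np

def round_fractional_qp(adj, x):
    """One-by-one rounding of a fractional solution x in [0,1]^n of the
    relaxed QP (2). `adj[h]` is the list of neighbours of vertex h.

    For each fractional x_h, compare
        A = sum_{j ~ h} (1 - x_j)   and   B = sum_{j ~ h} x_j
    and set x_h = 1 if A >= B, else x_h = 0.  The edges at h contribute
    x_h * A + (1 - x_h) * B <= max(A, B), so no step decreases the objective.
    """
    x = np.asarray(x, dtype=float).copy()
    for h in range(len(x)):
        if 0.0 < x[h] < 1.0:
            A = sum(1.0 - x[j] for j in adj[h])
            B = sum(x[j] for j in adj[h])
            x[h] = 1.0 if A >= B else 0.0
    return x

def cut_value(edges, x):
    """Objective (1): edges with endpoints on different sides."""
    return sum(x[i] * (1 - x[j]) + x[j] * (1 - x[i]) for i, j in edges)
```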

Let $\mathrm{OPT}_{\mathrm{rd}}$ denote the value of the solution we obtain after rounding every fractional variable of $\mathrm{OPT}_{\mathrm{rel}}$ in this way. Note that $\mathrm{OPT}_{\mathrm{rd}} \ge \mathrm{OPT}_{\mathrm{rel}}$, so by solving the relaxed QP for max-cut and rounding the solution, we would obtain an integral cut that is at least as good as $\mathrm{OPT}_{\mathrm{rel}}$, which in turn is at least as good as the true optimum $\mathrm{OPT}_{\text{max-cut}}$. We can deduce that solving this particular relaxation is itself NP-hard, since doing so would amount to solving the NP-hard max-cut problem exactly in polynomial time. So relaxing the integrality constraint does not help here. Instead, we relax the max-cut problem to a semidefinite program.

Definition 2 (Semidefinite program (SDP)) A semidefinite program is an optimization problem whose objective function is linear in its variables $x_{ij}$, subject to linear constraints on these variables, with the additional constraint that the symmetric matrix $X = [x_{ij}]$ be positive semidefinite.

We recall the definition of a positive semidefinite matrix:

Definition 3 A symmetric matrix $X \in \mathbb{R}^{n \times n}$ is positive semidefinite ($X \succeq 0$) iff $y^T X y \ge 0$ for all $y \in \mathbb{R}^n$.

Theorem 4 If $X \in \mathbb{R}^{n \times n}$ is a symmetric matrix, then the following are equivalent:
1. $X$ is positive semidefinite.
2. All eigenvalues of $X$ are non-negative.
3. There exists $V \in \mathbb{R}^{m \times n}$, $m \le n$, such that $X = V^T V$.

Note that we can compute the eigendecomposition $X = Q \Lambda Q^T$ of a symmetric matrix in polynomial time, so we can test for positive semidefiniteness in polynomial time. However, the decomposition $X = V^T V$ is not exactly computable in polynomial time, since taking the square root of the diagonal matrix $\Lambda$ can lead to irrational values. We can get an arbitrarily good approximation of this decomposition, so we will assume we have the exact decomposition in polynomial time, since the inaccuracy can be absorbed into the approximation factor.

By replacing each variable $x_{ij}$ in an SDP by the inner product $v_i \cdot v_j$ of the corresponding columns of $V$ in the decomposition $X = V^T V$, we obtain an equivalent vector program. The SDP

$$\max \sum_{i,j} c_{ij} x_{ij} \quad \text{s.t.} \quad \sum_{i,j} a_{ijk} x_{ij} = b_k \;\; \forall k, \qquad x_{ij} = x_{ji}, \qquad X \succeq 0$$

is equivalent to the vector program

$$\max \sum_{i,j} c_{ij} (v_i \cdot v_j) \quad \text{s.t.} \quad \sum_{i,j} a_{ijk} (v_i \cdot v_j) = b_k \;\; \forall k, \qquad v_i \in \mathbb{R}^n \;\; \forall i.$$

The max-cut problem admits a strict quadratic program formulation, which can be relaxed to a vector program. The strict QP

$$\max \sum_{(i,j) \in E} \frac{1 - v_i v_j}{2} \quad \text{s.t.} \quad v_i \in \{-1, +1\} \;\; \forall i \in V$$

relaxes to

$$\max \sum_{(i,j) \in E} \frac{1 - v_i \cdot v_j}{2} \quad \text{s.t.} \quad v_i \cdot v_i = 1 \;\; \forall i \in V, \quad v_i \in \mathbb{R}^n.$$
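As a concrete illustration (not part of the lecture notes), the vector program above can be solved as an SDP and the vectors $v_i$ recovered from an approximate factorization of $X$. The sketch below assumes the cvxpy package with an SDP-capable solver (e.g. SCS) is available; the function name and graph encoding are ours.

```python
import numpy as np
import cvxpy as cp

def solve_maxcut_sdp(n, edges):
    """Solve  max sum_{(i,j) in E} (1 - X_ij)/2  s.t.  X psd, X_ii = 1,
    and return unit vectors v_1, ..., v_n (as rows of V) with X ~ V V^T.
    `edges` is a list of 0-indexed pairs (i, j)."""
    X = cp.Variable((n, n), PSD=True)            # X_ij plays the role of v_i . v_j
    objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
    constraints = [cp.diag(X) == 1]              # v_i . v_i = 1
    cp.Problem(objective, constraints).solve()

    # Approximate factorization X = V V^T via the eigendecomposition,
    # clipping tiny negative eigenvalues caused by numerical error.
    w, Q = np.linalg.eigh(X.value)
    V = Q * np.sqrt(np.clip(w, 0.0, None))       # scale the columns of Q
    return V                                     # row i is the vector v_i
```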

1.2 Random Hyperplane Rounding

Given a solution $\{v_u : u \in V\}$ to the vector program for max-cut, with value $\mathrm{OPT}_v$, we round it as follows:

- Pick a random vector $r = (r_1, r_2, \ldots, r_n)$ with each $r_i \sim \mathcal{N}(0, 1)$ independently.
- For all $u \in V$: if $r \cdot v_u \ge 0$, place $u$ in $S$; if $r \cdot v_u < 0$, place $u$ in $\bar{S}$.

This rounding procedure is called random hyperplane rounding.

Theorem 5 There exists a polynomial-time algorithm that achieves a 0.878-approximation of the maximum cut with high probability.

To prove Theorem 5, we start by stating the following facts.

Fact 1: The normalized vector $r / \lVert r \rVert$ is uniformly distributed on the unit sphere $S^{n-1} \subset \mathbb{R}^n$. This is because the density of $r$ depends only on $\lVert r \rVert$.

Fact 2: The projections of $r$ onto unit vectors $e_1$ and $e_2$ are independent iff $e_1$ and $e_2$ are orthogonal.

Proof: Let $r_1$ and $r_2$ denote the projections of $r$ onto $e_1$ and $e_2$, respectively. The projections of a Gaussian random vector are jointly Gaussian, so it suffices to have $\mathbb{E}[r_1 r_2] = 0$ for $r_1$ and $r_2$ to be independent. Since $\mathbb{E}[r_1 r_2] = \mathbb{E}[(e_1^T r)(r^T e_2)] = e_1^T \mathbb{E}[r r^T] e_2 = e_1^T e_2$, Fact 2 follows directly.

Corollary 6 Let $r'$ be the projection of $r$ onto a 2-dimensional plane; then $r' / \lVert r' \rVert$ is uniformly distributed on the unit circle in that plane.

Lemma 7 The probability that edge $(i, j)$ is cut is $\arccos(v_i \cdot v_j)/\pi = \theta_{ij}/\pi$, where $\theta_{ij}$ is the angle between the vectors $v_i$ and $v_j$.

Proof: Project the vector $r$ onto the plane containing $v_i$ and $v_j$. It is easy to see that edge $(i, j)$ is cut iff this projection falls within the double wedge bounded by the lines perpendicular to $v_i$ and $v_j$, which has total angle $2\theta_{ij}$. By Corollary 6, the probability of this event is $2\theta_{ij}/(2\pi) = \theta_{ij}/\pi$.

Lemma 8 For $x \in [-1, 1]$, one can show that
$$\frac{\arccos(x)}{\pi} \ge 0.878 \cdot \frac{1 - x}{2}.$$

Let $W$ be a random variable denoting the weight of the cut we obtain from solving the vector program for max-cut and then applying random hyperplane rounding. Then:

$$\mathbb{E}[W] = \sum_{(i,j) \in E} \Pr\big(\text{edge } (i,j) \text{ is cut}\big) = \sum_{(i,j) \in E} \frac{\theta_{ij}}{\pi} = \sum_{(i,j) \in E} \frac{\arccos(v_i \cdot v_j)}{\pi} \qquad \text{(by Lemma 7)}$$
$$\ge 0.878 \sum_{(i,j) \in E} \frac{1 - v_i \cdot v_j}{2} = 0.878 \cdot \mathrm{OPT}_v \ge 0.878 \cdot \mathrm{OPT}_{\text{max-cut}} \qquad \text{(by Lemma 8)}$$

Given this expected value, one can show the existence of an algorithm that achieves a 0.878-approximation of the maximum cut in polynomial time, with high probability. This concludes the proof of Theorem 5.

Finally, we give an example showing that this approximation factor is almost tight. Consider the 5-cycle. Here $\mathrm{OPT}_{\text{max-cut}} = 4$, while the optimal solution for the SDP places the 5 vectors in a 2-dimensional plane with an angle of $4\pi/5$ between the vectors of adjacent vertices, giving $\mathrm{OPT}_v = 5 \cdot \big(1 - \cos(4\pi/5)\big)/2 \approx 4.52$. The approximation factor achieved in this case is therefore $\approx 0.884$.
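Putting the pieces together, here is a minimal sketch of the hyperplane rounding itself (continuing the hypothetical `solve_maxcut_sdp` above), followed by a quick numerical check of the 0.878 constant in Lemma 8 and of the 5-cycle figures. None of this code is from the lecture.

```python
import numpy as np

def hyperplane_round(V, rng=None):
    """Random hyperplane rounding: S = {u : r . v_u >= 0} for Gaussian r."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.standard_normal(V.shape[1])
    return V @ r >= 0                      # True = side S, False = side S-bar

# Lemma 8: the minimum of (arccos(x)/pi) / ((1 - x)/2) over (-1, 1) is ~0.8786.
x = np.linspace(-0.9999, 0.9999, 1_000_000)
print(((np.arccos(x) / np.pi) / ((1 - x) / 2)).min())    # ~0.8786

# 5-cycle: OPT_max-cut = 4, OPT_v = 5 * (1 - cos(4*pi/5)) / 2.
opt_v = 5 * (1 - np.cos(4 * np.pi / 5)) / 2
print(opt_v, 4 / opt_v)                                  # ~4.5225, ~0.8845
```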

1.3 Correlation Clustering

Given an undirected graph $G = (V, E)$, we assign to each edge $(i, j) \in E$ weights $W^+_{ij}$ and $W^-_{ij}$ denoting, respectively, how similar and how different the endpoints of the edge are. (In our analysis we assume each edge carries only one of these two kinds of weight, but the approximation algorithm still works in general.) The goal of the correlation clustering problem is to find a partition of the graph into clusters of similar vertices. In other words, we aim to maximize the objective function

$$\sum_{i,j \text{ in the same cluster}} W^+_{ij} \;+\; \sum_{i,j \text{ in different clusters}} W^-_{ij}.$$

We denote the optimal value by $\mathrm{OPT}_{cc}$.

The correlation clustering problem admits a simple $1/2$-approximation algorithm: pick the better of the following two clusterings.
1. Form one cluster: $S = V$.
2. Place each vertex in its own cluster.

Note that if the total sum of the $W^+_{ij}$ in the graph is greater than the total sum of the $W^-_{ij}$, choice 1 guarantees at least half of $\mathrm{OPT}_{cc}$; otherwise choice 2 does.

The correlation clustering problem admits an exact formulation that can be relaxed to a vector program. The exact formulation is

$$\max \sum_{(i,j) \in E} \Big( W^+_{ij} (v_i \cdot v_j) + W^-_{ij} (1 - v_i \cdot v_j) \Big) \quad \text{s.t.} \quad v_i \in \{e_1, e_2, \ldots, e_n\} \;\; \forall i \in V,$$

where $e_k$ denotes the unit vector with the $k$-th entry equal to one; $v_i$ is set to $e_k$ if vertex $i$ belongs to cluster $k$. Its vector program relaxation is

$$\max \sum_{(i,j) \in E} \Big( W^+_{ij} (v_i \cdot v_j) + W^-_{ij} (1 - v_i \cdot v_j) \Big) \quad \text{s.t.} \quad v_i \cdot v_i = 1 \;\; \forall i \in V, \qquad v_i \cdot v_j \ge 0 \;\; \forall i, j \in V.$$

1.4 Rounding Algorithm for Correlation Clustering

Given a solution $\{v_u : u \in V\}$ to the vector program for correlation clustering, with value $\mathrm{OPT}_v$, we round it as follows:

- Pick two random vectors $r_1$ and $r_2$, with each entry drawn independently from the standard normal distribution $\mathcal{N}(0, 1)$.
- Form 4 clusters according to the signs of the two projections (see the code sketch below):
$$R_1 = \{i \in V : r_1 \cdot v_i \ge 0 \text{ and } r_2 \cdot v_i \ge 0\}$$
$$R_2 = \{i \in V : r_1 \cdot v_i \ge 0 \text{ and } r_2 \cdot v_i < 0\}$$
$$R_3 = \{i \in V : r_1 \cdot v_i < 0 \text{ and } r_2 \cdot v_i \ge 0\}$$
$$R_4 = \{i \in V : r_1 \cdot v_i < 0 \text{ and } r_2 \cdot v_i < 0\}.$$

Theorem 9 There exists a polynomial-time algorithm that achieves a $3/4$-approximation of the correlation clustering problem with high probability.
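Here is a minimal sketch of this 2-hyperplane rounding step in Python (NumPy only); the function name and the encoding of the four clusters as labels 0–3 are ours, not from the notes.

```python
import numpy as np

def two_hyperplane_round(V, rng=None):
    """Round unit vectors (rows of V) into the four clusters R_1, ..., R_4.

    Draw two independent Gaussian vectors r_1, r_2 and assign vertex i to a
    cluster according to the sign pattern (sign(r_1 . v_i), sign(r_2 . v_i)).
    Returns an array of cluster labels in {0, 1, 2, 3}.
    """
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.standard_normal(V.shape[1])
    r2 = rng.standard_normal(V.shape[1])
    side1 = (V @ r1 >= 0).astype(int)
    side2 = (V @ r2 >= 0).astype(int)
    return 2 * side1 + side2
```

In practice one would repeat the rounding and keep the best clustering; the expectation bound proved below is what drives the high-probability statement of Theorem 9.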

Proof: We start with a useful lemma.

Lemma 10 For $x \in [0, 1]$, one can show that
$$\Big(1 - \frac{\arccos(x)}{\pi}\Big)^2 \ge 0.75 \, x \qquad \text{and} \qquad 1 - \Big(1 - \frac{\arccos(x)}{\pi}\Big)^2 \ge 0.75 \, (1 - x).$$

Let $x_{ij}$ be a random variable that takes the value one if $i$ and $j$ are in the same cluster, and zero otherwise. Note that the probability that $v_i$ and $v_j$ are separated by neither $r_1$ nor $r_2$ is $\big(1 - \arccos(v_i \cdot v_j)/\pi\big)^2$. Thus,

$$\mathbb{E}[x_{ij}] = \Big(1 - \frac{\arccos(v_i \cdot v_j)}{\pi}\Big)^2.$$

Let $W$ be a random variable denoting the weight of the clustering we obtain from solving the vector program for correlation clustering and then applying the random 2-hyperplane rounding. Then:

$$\mathbb{E}[W] = \mathbb{E}\Big[\sum_{(i,j) \in E} W^+_{ij} x_{ij} + W^-_{ij} (1 - x_{ij})\Big] = \sum_{(i,j) \in E} W^+_{ij} \Big(1 - \frac{\arccos(v_i \cdot v_j)}{\pi}\Big)^2 + W^-_{ij} \bigg(1 - \Big(1 - \frac{\arccos(v_i \cdot v_j)}{\pi}\Big)^2\bigg)$$
$$\ge 0.75 \sum_{(i,j) \in E} \Big( W^+_{ij} (v_i \cdot v_j) + W^-_{ij} (1 - v_i \cdot v_j) \Big) \qquad \text{(by Lemma 10)}$$
$$= 0.75 \cdot \mathrm{OPT}_v \ge 0.75 \cdot \mathrm{OPT}_{cc}.$$

Given this expected value, one can show the existence of an algorithm that achieves a 0.75-approximation for the correlation clustering problem in polynomial time, with high probability.
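As a final sanity check (ours, not part of the scribe notes), the two inequalities of Lemma 10 can be verified numerically on a grid:

```python
import numpy as np

# Grid over [0, 1); at x = 1 both sides of the second inequality equal 0.
x = np.linspace(0.0, 1.0, 1_000_001)[:-1]
p_same = (1 - np.arccos(x) / np.pi) ** 2        # Pr[i, j not separated] when v_i . v_j = x
print((p_same - 0.75 * x).min())                # ~0.03 >= 0, first inequality
print(((1 - p_same) - 0.75 * (1 - x)).min())    # 0.0 (tight at x = 0), second inequality
```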