Positive Semi-definite programming and applications for approximation


Combinatorial Optimization 1 Positive Semi-definite programming and applications for approximation Guy Kortsarz

Combinatorial Optimization 2 Positive Semi-Definite (PSD) matrices, a definition Note that we deal only with symmetric matrices, because in that case all eigenvalues are real; it does not make sense to speak of a non-symmetric PSD matrix. The following definitions are equivalent: 1. The symmetric matrix A is PSD. 2. There is a matrix B so that B^T B = A. 3. All the eigenvalues of A are non-negative. 4. For every vector v, v^T A v ≥ 0.
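The equivalence between definitions 2 and 4 can be sanity-checked numerically; a minimal sketch in pure Python (the matrix B is an arbitrary toy choice, not from the lecture):

```python
import random

# Build A = B^T B (definition 2), then check definition 4
# (v^T A v >= 0) on random vectors. Toy 2x2 example.
B = [[1.0, 2.0],
     [0.0, 3.0]]

def mat_T(M):
    return [list(row) for row in zip(*M)]

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def quad_form(A, v):
    # computes v^T A v
    Av = [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]
    return sum(v[i] * Av[i] for i in range(len(v)))

A = mat_mul(mat_T(B), B)  # A = B^T B, hence PSD

random.seed(0)
for _ in range(1000):
    v = [random.uniform(-1, 1) for _ in range(2)]
    assert quad_form(A, v) >= -1e-12  # definition 4 holds (up to rounding)
```

Note that A = B^T B is automatically symmetric, which matches the restriction to symmetric matrices above.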

Combinatorial Optimization 3 What is PSD programming? Consider a collection of variables y_ij for 1 ≤ i,j ≤ n. We use them in an LP and add the constraint Y ⪰ 0, meaning that the matrix Y = (y_ij) is PSD. Note that the definition v^T A v ≥ 0 implies that if A and B are PSD matrices and a,b > 0, then aA + bB is PSD. Hence any convex combination of PSD matrices is PSD, which means that the set of PSD matrices, viewed as vectors of size n^2 in R^{n^2}, is convex. Thus we may apply the Ellipsoid algorithm to handle Y ⪰ 0, if we can translate this constraint into a collection of linear constraints and then find a violated constraint when one exists.

Combinatorial Optimization 4 Posing Y ⪰ 0 as a collection of linear constraints We can give an alternative definition of Y ⪰ 0 by an infinite number of linear constraints: for every x ∈ R^n there is the constraint x^T Y x ≥ 0. Note that this is just a linear inequality in the y_ij variables, since x is a constant vector. The number of linear constraints is infinite (this makes no difference for the Ellipsoid algorithm). Finding a violated constraint amounts to finding a negative eigenvalue λ < 0: if Y is not PSD then it has an eigenvalue λ < 0. Let x be a corresponding eigenvector. Then x^T Y x = λ x^T x < 0, because x^T x is clearly a positive number. Thus we found a violated constraint. Hence PSD programming can be solved in polynomial time.
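The separation step can be illustrated in the 2×2 case, where the smaller eigenvalue has a closed form; this is a toy sketch of the idea, not how real SDP solvers work:

```python
import math

# Separation oracle sketch for Y >= 0 (PSD), 2x2 case only. If Y has a
# negative eigenvalue lam, the corresponding eigenvector x witnesses the
# violated linear constraint x^T Y x >= 0.
def min_eig_2x2(Y):
    a, b, c = Y[0][0], Y[0][1], Y[1][1]  # symmetric: Y[1][0] == b
    tr, det = a + c, a * c - b * b
    lam = (tr - math.sqrt(tr * tr - 4 * det)) / 2  # smaller eigenvalue
    # an eigenvector for lam (diagonal case handled separately)
    x = (1.0, 0.0) if b == 0 and a <= c else ((lam - c, b) if b else (0.0, 1.0))
    return lam, x

Y = [[1.0, 2.0],
     [2.0, 1.0]]  # eigenvalues 3 and -1: not PSD

lam, x = min_eig_2x2(Y)
qf = (x[0] * (Y[0][0] * x[0] + Y[0][1] * x[1]) +
      x[1] * (Y[1][0] * x[0] + Y[1][1] * x[1]))
assert lam < 0 and qf < 0  # x is a violated constraint: x^T Y x < 0
```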

Combinatorial Optimization 5 Summary We can use n^2 variables y_ij and the constraint Y ⪰ 0 in a program. Finding a violated constraint for Y ⪰ 0 can be done in polynomial time. There is another way to look at it: Y being PSD is equivalent to the existence of a matrix D so that D^T D = Y, namely y_ij = v_i^T v_j for vectors v_i, v_j ∈ R^n (the columns of D). So we can also look at this as vector programming: instead of y_ij we work with the inner products v_i^T v_j, with v_i, v_j ∈ R^n.

Combinatorial Optimization 6 How can we use the fact that we can handle products such as v_i^T v_j? Here is a stronger relaxation for Max-Cut. In what follows v_i, v_j ∈ R^n. Maximize Σ_{ij∈E} (1 − v_i^T v_j)/2 subject to v_i^T v_i = 1 for every i. Note that if we denote y_ij = v_i^T v_j, then Y ⪰ 0. Let the optimum cut be (C, V \ C). Why is the above a relaxation? Set v_i = (1,0,0,...,0) if i ∈ C, and set v_i = (−1,0,0,...,0) if i ∈ V \ C.

Combinatorial Optimization 7 Why this is a relaxation, continued Note that an edge ij in the cut adds a value of 2 before dividing: if v_i^T v_j = −1 it adds 1 − v_i^T v_j = 2, while if i, j are on the same side it adds 1 − v_i^T v_j = 0. We divide by 2 so each cut edge contributes exactly 1, giving exactly the cut value. Thus the maximum of the above program is at least the max cut. It is very important to note that we are not able to force a solution of low dimension: we can never require a vector in which only one entry is non-zero. All of this is due to Goemans and Williamson. They knew the above relaxation for a long time; the question was how to go from vectors to a partition, and that took, I think, about 3 years.
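The argument above (each cut edge contributes exactly 1 after dividing by 2) can be checked on a toy instance; the 4-cycle and the cut below are arbitrary choices:

```python
# Sanity check: embedding a cut as +/-e1 vectors gives objective value
# equal to the cut size, so the SDP optimum is >= max cut.
# Toy 4-cycle; the cut {0, 2} vs {1, 3} cuts all 4 edges.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
C = {0, 2}
n = 4
vecs = {i: [(1.0 if i in C else -1.0)] + [0.0] * (n - 1) for i in range(n)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

sdp_value = sum((1 - dot(vecs[i], vecs[j])) / 2 for i, j in edges)
cut_value = sum(1 for i, j in edges if (i in C) != (j in C))
assert sdp_value == cut_value == 4
```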

Combinatorial Optimization 8 What is the problem? We are not going to get vectors that have a single non-zero entry. We cannot impose such a thing (it would make the program NP-hard). The solution will be some complex collection of vectors in R^n. The question is how to translate the vectors into a partition. At times, when we go from real numbers to vectors, the answer can become completely meaningless. But this is not the case for Max Cut.

Combinatorial Optimization 9 Rounding Let opt_v (for vectors) be the optimum of the PSD program defined above and opt the value of the maximum cut. Thus opt_v ≥ opt. Note that the vectors are unit vectors in R^n, because of the constraint v_i^T v_i = 1. The algorithm is: 1) Choose a random unit vector r on the unit sphere. 2) Place in S all i such that v_i^T r ≥ 0, and place the rest of the i in V \ S. Remark: we later discuss at length how to choose a random vector on a sphere.
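The rounding step can be sketched as follows, assuming we already hold a feasible set of unit vectors (the three vectors below are a hypothetical toy solution, not an actual SDP optimum):

```python
import random, math

# Random-hyperplane rounding sketch: pick a Gaussian vector r and split
# the vertices by the sign of r . v_i.
random.seed(1)

def round_vectors(vecs):
    d = len(next(iter(vecs.values())))
    r = [random.gauss(0.0, 1.0) for _ in range(d)]  # random direction
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return {i for i, v in vecs.items() if dot(r, v) >= 0}

# three unit vectors in the plane at 120 degrees from each other
vecs = {i: [math.cos(2 * math.pi * i / 3), math.sin(2 * math.pi * i / 3)]
        for i in range(3)}
S = round_vectors(vecs)
# a half-space can contain at most 2 of the 3 vectors, and at least 1
assert len(S) in (1, 2)
```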

Combinatorial Optimization 10 Moving to the plane Consider just two vectors. Then we can map them to the plane, because the dimension is only 2. [Figure 1: The angle α between two vectors v_1, v_2 on the unit circle.]

Combinatorial Optimization 11 Intuition Consider a random unit vector r and an edge ij. When is sign(r^T v_i) different from sign(r^T v_j)? We can always take the angle α between v_i and v_j to be at most π. When is cos(α) > 0? If α < π/2. When is cos(α) < 0? If α > π/2. Consider the following figure.

Combinatorial Optimization 12 Separating r vectors [Figure 2: Separating values of r on the unit circle, marked r_1, r_2, r_3, r_4 relative to v_1, v_2.] cos(α) is positive until π/2 and negative from π/2 to π.

Combinatorial Optimization 13 Explanation When r = r_1 the angle between r and v_2 is exactly π/2. Say we move r from r_1 toward r_2, where r_2 has angle π/2 with v_1. When r is strictly between r_1 and r_2 then r·v_1 > 0 and r·v_2 < 0, because the angle between r and v_1 is less than π/2 while the angle between r and v_2 is more than π/2. Past r_2, the angle between r and both vectors is more than π/2, so the signs agree again. When r reaches r_3 the angle toward v_2 becomes less than π/2 while the angle toward v_1 stays more than π/2, and this persists until we get to r_4. Thus out of the 2π possible directions for r, an arc of total measure 2α gives different signs. We proved: the probability that i and j are separated is 2α/2π = α/π. Thus the vector program will try to choose a large angle between v_i and v_j.

Combinatorial Optimization 14 So what does the PSD program do? It increases the chance for i and j to contribute to the objective function when the angle between them is large. The PSD program is stronger than the LP because it finds the best collection of vectors with respect to having a large angle between v_i and v_j whenever ij is an edge. Define S_ij as 1 if ij is a cut edge and 0 otherwise. We showed that E(S_ij) = Pr(ij is in the cut) = α_ij/π, where α_ij is the angle between v_i and v_j. Let T (for Total) be Σ_{ij∈E} S_ij. Thus E(T) = Σ_{ij∈E} Pr(ij is in the cut) = Σ_{ij∈E} α_ij/π.

Combinatorial Optimization 15 Putting T in terms of v_i · v_j Lemma: The probability that i and j are separated is arccos(v_i · v_j)/π. Proof: since v_i · v_j = cos(α_ij) and α = arccos(cos(α)) for α ∈ [0, π], the probability of separation is arccos(v_i · v_j)/π. The following fact can be verified by calculus: arccos(x)/π ≥ 0.878 · (1 − x)/2 for all x ∈ [−1, 1].
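The calculus fact can at least be checked numerically on a grid (this is evidence, not a proof):

```python
import math

# Grid check of arccos(x)/pi >= 0.878 * (1 - x)/2 over x in [-1, 1].
# The true worst-case constant is about 0.87856, so 0.878 passes.
alpha_gw = 0.878
for k in range(2001):
    x = -1 + k / 1000.0
    lhs = math.acos(x) / math.pi
    rhs = alpha_gw * (1 - x) / 2
    assert lhs >= rhs - 1e-12
```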

Combinatorial Optimization 16 The approximation ratio Theorem: The ratio of the algorithm is 0.878. Proof: The contribution of edge ij to the vector program is (1 − v_i · v_j)/2. The contribution of i and j to the expectation is P(i and j are separated) = arccos(v_i · v_j)/π ≥ 0.878 · (1 − v_i · v_j)/2. Thus the contribution of ij to the expectation, namely the probability that they are separated, is at least 0.878 times its contribution to the PSD value. The ratio of 0.878 follows. While it is not immediate, the algorithm can be derandomized. The ratio is tight under the Unique Games Conjecture.

Combinatorial Optimization 17 Coloring a 3-colorable graph This is a promise problem: we cannot check that a graph can be colored by 3 colors. Thus think of the instance as produced by an adversary who starts with 3 independent sets of vertices and then adds edges in an arbitrary way (just not within an independent set). This problem is definitely easier than Min Coloring, which was shown to be n^{1−ǫ} inapproximable by Feige et al. We shall discuss a simple Õ(√n)-ratio algorithm in the next slides, with Õ() hiding polylogarithmic factors. This already shows the problem is easier than general coloring.

Combinatorial Optimization 18 A simple algorithm Recall that we can find an independent set of size n/(d+1), with d the average degree. By a standard analysis this yields an O(d log n) approximation. But d may be large. We use the fact that 2-coloring a graph is a polynomial problem. The algorithm: 1. While there is a vertex v of degree at least √n do (a) 2-color N(v) with two new colors. 2. Use the O(d log n) coloring. /* When we get to this line, the maximum degree is at most √n */
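Step (a) relies on 2-coloring a bipartite graph, which a BFS does; a sketch (the helper name two_color and the toy path graph are illustrative choices, not from the lecture):

```python
from collections import deque

# BFS 2-coloring sketch: in a 3-colorable graph every neighborhood N(v)
# is 2-colorable, and this routine 2-colors the induced subgraph on
# `vertices` (returns None if it is not bipartite).
def two_color(vertices, adj):
    color = {}
    for s in vertices:
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for w in adj.get(u, ()):
                if w not in vertices:
                    continue  # restrict to the induced subgraph
                if w not in color:
                    color[w] = 1 - color[u]
                    q.append(w)
                elif color[w] == color[u]:
                    return None  # odd cycle: not 2-colorable

    return color

# toy example: the path 0-1-2 is bipartite
adj = {0: [1], 1: [0, 2], 2: [1]}
c = two_color({0, 1, 2}, adj)
assert c is not None and c[0] != c[1] and c[1] != c[2]
```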

Combinatorial Optimization 19 Analysis The neighborhood of every vertex in a 3-colorable graph is 2-colorable, and 2-coloring a graph is a polynomial problem. Each time we find a vertex of degree at least √n we use two new colors and color at least √n vertices with these 2 colors. Thus the number of colors used in the loop is at most 2n/√n = 2√n. When the loop ends, the maximum degree, and thus the average degree d as well, is at most √n, and an Õ(√n) ratio follows. We proved an Õ(√n) approximation ratio. Unfortunately there are no strong hardness results for coloring 3-colorable graphs.

Combinatorial Optimization 20 Finding an independent set of size Ω(n/(Δ^{1/3} · √(log Δ))) Here Δ is the maximum degree of the graph. Interestingly, we need some theory of the normal distribution. We shall use only the normal distribution with mean 0 and variance 1. A way to think of the normal distribution: say that X_i is 1 with probability 1/2 and −1 with probability 1/2, and consider Σ_{i=1}^n X_i/√n with n going to infinity. Clearly the mean is 0 and the standard deviation is 1. Almost all the probability is concentrated around the value X = 0, and as n goes to infinity the Gaussian bell emerges. The density of the standard normal distribution (mean 0 and variance 1) is f(x) = e^{−x²/2}/√(2π).

Combinatorial Optimization 21 Some facts Fact 1: the sum of two independent normal random variables with means μ_1 and μ_2 and variances σ_1² and σ_2² is normal with mean μ_1 + μ_2 and variance σ_1² + σ_2². Let ψ(y) = ∫_y^∞ f(x) dx, the upper tail of the standard normal. Fact 2, which we will not prove: for every x > 0, f(x) · (1/x − 1/x³) ≤ ψ(x) ≤ f(x)/x. Fact 3: choose an n-entry vector r whose entries are independently drawn from the normal distribution with mean 0 and variance 1; its direction is uniform, so normalizing by its length yields a random vector on the unit sphere. From now on, when we say random vector we mean such a Gaussian vector r.
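Fact 2 can be checked numerically using the exact tail ψ(x) = erfc(x/√2)/2, available in the standard library:

```python
import math

# Numeric check of the tail sandwich
#   f(x) * (1/x - 1/x^3) <= psi(x) <= f(x) / x
# where psi(x) is the standard normal upper tail.
def f(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def psi(x):
    return math.erfc(x / math.sqrt(2)) / 2

for x in [0.5, 1.0, 2.0, 3.0, 5.0]:
    assert f(x) * (1 / x - 1 / x**3) <= psi(x) <= f(x) / x
```

Note that the lower bound is only meaningful for x > 1; for smaller x it is negative and holds trivially.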

Combinatorial Optimization 22 Projections Lemma: for a unit vector v and a random vector r, r^T v has the standard normal distribution. This can be proved by the previous facts: r^T v = Σ_{i=1}^n v_i X_i, with v_i the i-th entry of v and X_i independent standard normals (mean 0 and variance 1). As we saw above, this gives mean 0 and variance Σ_{i=1}^n v_i² = 1, because v is a unit vector. We now define the main notion: vector 3-coloring. This notion is different from 3-coloring; in fact the chromatic number of a vector 3-colorable graph can be polynomially large.

Combinatorial Optimization 23 Vector 3-coloring Assign to every vertex a unit vector v_i so that if ij is an edge then v_i^T v_j ≤ −1/2. Such an arrangement of vectors is possible if the graph is 3-colorable: consider the 3 independent sets I_1, I_2, I_3 in the graph, and let u_1, u_2, u_3 be three unit vectors in the plane with angle 2π/3 (120 degrees) between each pair. Assign u_k to all vertices of I_k. Clearly if ij is an edge then v_i^T v_j = cos(2π/3) = −1/2.
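The three-vector construction is easy to verify; a short sketch:

```python
import math

# Three unit vectors in the plane at 120 degrees; every pair has inner
# product cos(2*pi/3) = -1/2, giving a vector 3-coloring of any
# 3-colorable graph (assign u_k to all vertices of color class I_k).
u = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
     for k in range(3)]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

for k in range(3):
    assert abs(dot(u[k], u[k]) - 1.0) < 1e-12            # unit vectors
    assert abs(dot(u[k], u[(k + 1) % 3]) + 0.5) < 1e-12  # pairwise -1/2
```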

Combinatorial Optimization 24 An example of vector 3-coloring [Figure 3: Vector 3-coloring of a graph: the three color classes are mapped to three unit vectors at 120 degrees from each other.]

Combinatorial Optimization 25 How to find a legal vector 3-coloring in polynomial time? Use Positive Semi-Definite programming: Minimize z such that v_i^T v_j ≤ z for every edge ij, and v_i^T v_i = 1 for every i. And we know that the optimum satisfies z ≤ −1/2 when the graph is 3-colorable. Thus we can assume we have a collection of unit vectors with angle at least 2π/3 between the vectors of the endpoints of every edge.

Combinatorial Optimization 26 An algorithm to find a large independent set Let θ be a threshold chosen later. 1. Find a vector 3-coloring. 2. Choose a random vector r. 3. Let S = {i : r^T v_i ≥ θ}. 4. Let G(S) be the graph induced by S. 5. As long as G(S) contains an edge e = uv, remove u from S. 6. Let S′ be the non-removed vertices. 7. Return S′.
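The algorithm can be sketched as follows, assuming a vector 3-coloring is already in hand (here the ideal 120-degree solution for a toy triangle; theta = 0.5 is an arbitrary pick, not the value chosen in the analysis):

```python
import random, math

# Rounding sketch: keep vertices with r . v_i >= theta, then delete one
# endpoint of every surviving edge. The output is always independent.
random.seed(2)

def independent_set(vecs, edges, theta):
    d = len(next(iter(vecs.values())))
    r = [random.gauss(0.0, 1.0) for _ in range(d)]  # random vector
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    S = {i for i, v in vecs.items() if dot(r, v) >= theta}
    for (i, j) in edges:
        if i in S and j in S:
            S.discard(i)  # remove one endpoint per surviving edge
    return S

# toy triangle with its ideal vector 3-coloring
vecs = {k: (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
        for k in range(3)}
edges = [(0, 1), (1, 2), (2, 0)]
S = independent_set(vecs, edges, theta=0.5)
assert all(not (i in S and j in S) for (i, j) in edges)  # independent
```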

Combinatorial Optimization 27 Some random variables Let X = |S| and let Y = |E(S)|, the number of edges inside S. Let Z = |S′|. Then Z ≥ max{0, X − Y}, since we remove at most one vertex per edge of G(S), and so E(Z) ≥ E(X) − E(Y). We now show how to bound E(X) from below and E(Y) from above. As r^T v_i has the standard normal distribution, P(r^T v_i ≥ θ) = ψ(θ). Thus E(X) = n · ψ(θ). We now upper bound E(Y).

Combinatorial Optimization 28 Upper bounding E(Y) E(Y) = Σ_{edges ij} P(r^T v_i ≥ θ and r^T v_j ≥ θ) ≤ Σ_{edges ij} P(r^T v_i + r^T v_j ≥ 2θ). Note that the distribution of r^T (v_i + v_j) is not the standard normal one, because v_i + v_j is not a unit vector. However, define u = (v_i + v_j)/‖v_i + v_j‖. This is a unit vector, and so r^T u has the standard normal distribution.

Combinatorial Optimization 29 Analysis continued Note that ‖v_i + v_j‖² = ‖v_i‖² + ‖v_j‖² + 2 v_i^T v_j. For an edge, v_i^T v_j ≤ −1/2. Thus ‖v_i + v_j‖² ≤ 1 + 1 − 1 = 1. We got that P(r^T v_i + r^T v_j ≥ 2θ) = P(r^T u ≥ 2θ/‖v_i + v_j‖) ≤ P(r^T u ≥ 2θ) = ψ(2θ). Recall that m, the number of edges, is at most n·Δ/2, with Δ the maximum degree. Thus by the linearity of expectation E(Y) ≤ m · ψ(2θ) ≤ n · Δ · ψ(2θ)/2. We got E(Z) ≥ E(X) − E(Y) ≥ n · ψ(θ) − n · Δ · ψ(2θ)/2. We need to maximize this expression over θ.

Combinatorial Optimization 30 Analysis continued Recall that we showed f(x) · (1/x − 1/x³) ≤ ψ(x) ≤ f(x)/x, with f(x) = e^{−x²/2}/√(2π). Set θ = √(2 ln Δ / 3). We get that n · ψ(θ) − n · Δ · ψ(2θ)/2 = Ω(n/(Δ^{1/3} · √(log Δ))).
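The trade-off can be illustrated numerically; in the sketch below Δ and n are arbitrary picks, and erfc gives the exact tail ψ (an illustration of the bound, not part of the lecture's proof):

```python
import math

# Illustrate n*psi(theta) - n*Delta*psi(2*theta)/2 with the choice
# theta = sqrt(2*ln(Delta)/3): the lower bound stays positive and tracks
# n / (Delta^(1/3) * sqrt(log Delta)) up to constants.
def psi(x):
    return math.erfc(x / math.sqrt(2)) / 2  # standard normal upper tail

n = 10_000
for Delta in [10, 100, 1000]:
    theta = math.sqrt(2 * math.log(Delta) / 3)
    lower = n * psi(theta) - n * Delta * psi(2 * theta) / 2
    target = n / (Delta ** (1 / 3) * math.sqrt(math.log(Delta)))
    assert lower > 0
    assert lower > 0.05 * target  # crude constant; illustration only
```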

Combinatorial Optimization 31 The number of colors From the above it follows that we can color with Õ(Δ^{1/3}) colors: repeatedly find an independent set of size Ω(n/(Δ^{1/3} · √(log Δ))), give it a new color, and remove it. It is possible to combine this with the removal of large-degree vertices (as in the Õ(√n) algorithm) and get Õ(n^{1/4}) colors. This, however, is not the best ratio. An approximation of O(n^{0.2111}) is the best(?) known, using the ARV algorithm for Sparsest Cut. It may be possible that lift-and-project techniques will give a polylog ratio in time quasi-polynomial in n (as far as I know this is not known yet). Using vector coloring alone we cannot get a polylog ratio: there are vector 3-colorable graphs whose minimum vertex coloring is n^{0.05}. The SDP does not capture the problem.