Canonical SDP Relaxation for CSPs


Lecture 14: Canonical SDP Relaxation for CSPs

Lecturer: Ryan O'Donnell. Scribe: Jamie Morgenstern, Ryan O'Donnell.

14.1 Recalling the canonical LP relaxation

Last time, we talked about the canonical LP relaxation for a CSP. A CSP($\Gamma$) is specified by $\Gamma$, a collection of predicates $R$ over a label domain $D$. The canonical LP relaxation is comprised of two parts. Given an instance $\mathcal{C}$, with the domain of the variables being $D$, a constraint will be written as $C = (R, S) \in \mathcal{C}$. A solution to the LP relaxation then contains two objects.

First, for each $v \in V$, a probability distribution over labels for $v$. Formally, we have LP variables $(\mu_v[\ell])_{v \in V,\, \ell \in D}$ subject to
$$\forall v \in V, \quad \sum_{\ell \in D} \mu_v[\ell] = 1, \qquad \forall \ell \in D, \quad \mu_v[\ell] \ge 0.$$

Second, for each $C = (R, S) \in \mathcal{C}$ we have a probability distribution $\lambda_C$ over local assignments $S \to D$. These are similarly encoded, with $|D|^{|S|}$ LP variables per constraint. The objective function is
$$\max \sum_{C = (R,S) \in \mathcal{C}} w_C \cdot \Pr_{L \sim \lambda_C}[L(S) \text{ satisfies } R].$$

Finally, the thing that ties the $\mu$'s and the $\lambda$'s together is the consistent marginals condition (a collection of linear equalities):
$$\forall C = (R, S) \in \mathcal{C},\ \forall v \in S,\ \forall \ell \in D, \quad \Pr_{L \sim \lambda_C}[L(v) = \ell] = \mu_v[\ell].$$

We also showed that rounding the canonical LP relaxation of Max-SAT using plain randomized rounding achieves a $(1 - 1/e)$ approximation ratio. Recall that plain randomized rounding assigns a value to each variable $v$ independently:
$$F(v) = \begin{cases} 1 & \text{w.p. } \mu_v[1] \\ 0 & \text{w.p. } \mu_v[0]. \end{cases}$$

The proof of this approximation factor compared $p_C$, the probability that a particular clause $C$ is satisfied by $L \sim \lambda_C$, with the probability that $F$ satisfies that clause. In the last lecture it was shown that
$$\Pr[F \text{ satisfies } C] \ \ge\ 1 - \left(1 - \frac{p_C}{|S|}\right)^{|S|}. \tag{14.1}$$

When $|S| \le 2$, the right-hand side of (14.1) is at least $(3/4)\,p_C$, so this algorithm achieves a $3/4$-factor on clauses of length at most $2$. On the other hand, for clauses of length at least $2$, the trivial random algorithm (assigning each variable $1$ or $0$ with probability $1/2$) satisfies each clause with probability at least $3/4$. Can we get the best of both worlds, and combine the trivial random assignment with plain randomized rounding of the LP relaxation to get a $3/4$ approximation for Max-SAT? The answer is yes, by averaging the two assignment schemes. If we create our assignment $F$ as
$$F(v) = \begin{cases} 1 & \text{w.p. } \operatorname{avg}\{\mu_v[1],\, 1/2\} \\ 0 & \text{w.p. } \operatorname{avg}\{\mu_v[0],\, 1/2\}, \end{cases}$$
then $F$ satisfies at least a $3/4$ fraction of the clauses in expectation for Max-SAT. Showing this will be on the homework.
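The two rounding schemes are easy to phrase in code. The following is a minimal sketch (not part of the original notes), assuming the LP marginals $\mu_v$ have already been computed by an LP solver; the clause representation and all names are invented for the example.

```python
import random

def plain_randomized_rounding(mu):
    """Plain LP rounding: set F(v) = 1 with probability mu[v][1].
    `mu` maps each variable to its LP marginal distribution over {0, 1}."""
    return {v: 1 if random.random() < dist[1] else 0 for v, dist in mu.items()}

def averaged_rounding(mu):
    """'Best of both worlds' rounding: set F(v) = 1 with probability
    avg(mu[v][1], 1/2), i.e. average plain rounding with a uniform coin."""
    return {v: 1 if random.random() < (dist[1] + 0.5) / 2 else 0
            for v, dist in mu.items()}

def satisfied_fraction(clauses, assignment):
    """A clause is a list of (variable, desired_value) literals; it is
    satisfied if the assignment matches at least one literal."""
    sat = sum(any(assignment[v] == b for v, b in clause) for clause in clauses)
    return sat / len(clauses)

# Tiny hypothetical instance: clauses (x OR not y) and (not x), made-up LP marginals.
clauses = [[("x", 1), ("y", 0)], [("x", 0)]]
mu = {"x": {0: 0.3, 1: 0.7}, "y": {0: 0.6, 1: 0.4}}
print(satisfied_fraction(clauses, averaged_rounding(mu)))
```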

In fact, it is possible to do better than a $3/4$ approximation for various versions of Max-SAT. Below is a laundry list of results proven using SDPs to improve this approximation ratio for Max-SAT.

- An $(\alpha_2(\beta), \beta)$-approximation for Max-2SAT, where $\alpha_2(\beta) \ge .940\,\beta$ [LLZ02] and $\alpha_2(1 - \epsilon) \ge 1 - O(\sqrt{\epsilon})$ [CMM07].
- A $(\tfrac{7}{8}\beta, \beta)$-approximation for Max-3SAT, [KZ97] (computer-assisted), [Zwi02] (computer-verified).
- A $(\tfrac{7}{8}, 1)$-approximation for Max-4SAT [HZ99].
- A $(.833\,\beta, \beta)$-approximation for Max-SAT [AW00].

It is reasonable to conjecture that there is a polynomial-time $(\tfrac{7}{8}\beta, \beta)$-approximation algorithm for Max-kSAT for every $k$.

14.2 Canonical CSP SDP relaxation

The SDP relaxation is similar to the LP relaxation, but with an important generalization. We will have exactly the same $\lambda_C$'s for each constraint, and the same objective function. Rather than having the $\mu_v$'s, however, we'll have a collection of jointly distributed real random variables $(I_v[\ell])_{v \in V,\, \ell \in D}$. We will also have constraints which cause these random variables to hang together with the $\lambda_C$'s in a gentlemanly fashion.

The random variables $I_v[\ell]$ will be called pseudoindicator random variables. We emphasize that they are jointly distributed. You should think of them as follows: there is a box, and when you press a button on the side of the box ("make a draw"), out come values for each of the $|V| \cdot |D|$ random variables.

Figure 14.1: The pseudoindicator joint draw $(I_v[\ell])$.

For now, never mind how we actually represent these random variables or enforce conditions on them; we'll come to that later. We'd love it if these pseudoindicator random variables were actual indicator random variables, corresponding to a genuine assignment $V \to D$. However, we can only enforce something weaker than that. Specifically, we will enforce the following two sets of conditions:

1. Consistent first moments:
$$\forall C = (R, S) \in \mathcal{C},\ \forall v \in S,\ \forall \ell \in D, \quad \Pr_{L \sim \lambda_C}[L(v) = \ell] = \mathbf{E}\big[I_v[\ell]\big] \tag{14.2}$$

2. Consistent second moments:
$$\forall C = (R, S) \in \mathcal{C},\ \forall v, v' \in S,\ \forall \ell, \ell' \in D, \quad \Pr_{L \sim \lambda_C}\big[L(v) = \ell \wedge L(v') = \ell'\big] = \mathbf{E}\big[I_v[\ell]\, I_{v'}[\ell']\big] \tag{14.3}$$

(We emphasize that $v$ and $v'$ need not be distinct, and $\ell$ and $\ell'$ need not be distinct.)

We also emphasize again that these pseudoindicator random variables need not be independent, so the expected value of their product is not necessarily the product of their expected values. We will show that we can solve this optimization problem optimally as an SDP (there are actually vectors hiding inside the box). Also, as explained more carefully in the next lecture, assuming the Unique Games Conjecture the best polynomial-time approximation for any CSP is given by this SDP.

Now, a few remarks about this relaxation. First:

Remark 14.1. For all $v, \ell$, $\mathbf{E}\big[I_v[\ell]\big] = \mathbf{E}\big[I_v[\ell]^2\big]$.

Proof. Consider any $C \ni v$. Apply (2) with $v' = v$, $\ell' = \ell$. Then we have
$$\Pr_{L \sim \lambda_C}\big[L(v) = \ell \wedge L(v) = \ell\big] = \mathbf{E}\big[I_v[\ell]^2\big].$$
Of course also
$$\Pr_{L \sim \lambda_C}\big[L(v) = \ell \wedge L(v) = \ell\big] = \Pr_{L \sim \lambda_C}[L(v) = \ell].$$
Finally, apply (1), which says
$$\Pr_{L \sim \lambda_C}[L(v) = \ell] = \mathbf{E}\big[I_v[\ell]\big]. \qquad \Box$$

This is somewhat nice because $\mathbf{E}[I^2] = \mathbf{E}[I]$ is something satisfied by a genuinely $\{0,1\}$-valued random variable $I$. In fact, our pseudoindicator random variables may take values outside the range $[0, 1]$. Still, they will at least satisfy the above.

Now we show that this SDP relaxation is in fact a relaxation (we still haven't explained why it's an SDP):

Theorem 14.2. $\mathrm{Opt}(\mathcal{C}) \le \mathrm{SDPOpt}(\mathcal{C})$.

Proof. Let $F$ be a legitimate (optimal) assignment. We construct $\lambda_C$'s and $I_v[\ell]$'s which achieve $\mathrm{Val}(F)$. Set
$$\lambda_C[L] = \begin{cases} 1 & \text{if } L \text{ is consistent with } F \\ 0 & \text{otherwise,} \end{cases}$$
and let the $I_v[\ell]$ be the constant random variables
$$I_v[\ell] \equiv \begin{cases} 1 & \text{if } F(v) = \ell \\ 0 & \text{otherwise.} \end{cases}$$
It is easy to check that these $\lambda_C$'s and $I_v[\ell]$'s satisfy the consistent first and second moment constraints and have SDP value equal to $\mathrm{Val}(F)$. $\Box$
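The check at the end of this proof is mechanical. Here is a small sketch (my own illustration, not from the notes) that builds the integral solution of Theorem 14.2 for a single constraint and verifies the two moment conditions; the data structures and names are invented for the example.

```python
from itertools import product

def integral_sdp_solution(F, scope, domain):
    """Build the SDP solution of Theorem 14.2 from a genuine assignment F
    (a dict variable -> label), restricted to one constraint with the given
    scope. Returns lambda_C (a point distribution on local assignments) and
    the constant pseudoindicators I[(v, l)]."""
    lam = {L: 1.0 if all(L[i] == F[v] for i, v in enumerate(scope)) else 0.0
           for L in product(domain, repeat=len(scope))}
    I = {(v, l): (1.0 if F[v] == l else 0.0) for v in scope for l in domain}
    return lam, I

def check_moments(lam, I, scope, domain):
    """Verify conditions (1) and (2): probabilities under lambda_C equal the
    first and second moments of the (constant) pseudoindicators."""
    for v, l in product(scope, domain):
        p1 = sum(p for L, p in lam.items() if L[scope.index(v)] == l)
        assert abs(p1 - I[(v, l)]) < 1e-9                      # condition (1)
    for (v, l), (w, m) in product(product(scope, domain), repeat=2):
        p2 = sum(p for L, p in lam.items()
                 if L[scope.index(v)] == l and L[scope.index(w)] == m)
        assert abs(p2 - I[(v, l)] * I[(w, m)]) < 1e-9          # condition (2)

# Hypothetical example: Boolean domain, one constraint on (x, y), assignment x=1, y=0.
lam, I = integral_sdp_solution({"x": 1, "y": 0}, scope=("x", "y"), domain=(0, 1))
check_moments(lam, I, scope=("x", "y"), domain=(0, 1))
```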

Now we show that the SDP relaxation is at least as tight as the LP relaxation.

Theorem 14.3. $\mathrm{SDPOpt}(\mathcal{C}) \le \mathrm{LPOpt}(\mathcal{C})$.

Proof. Given an SDP solution $\mathcal{S} = (\lambda_C, I_v[\ell])$ achieving $\mathrm{SDPOpt}$, we must construct an LP solution whose objective value is no less than that of $\mathcal{S}$. Use the same $\lambda_C$'s for the LP solution. Since the objective value depends only on the $\lambda_C$'s, the objective value of the LP solution will equal the SDP value of $\mathcal{S}$. It remains to construct distributions $\mu_v$ which are consistent with the $\lambda_C$'s. Naturally, we set
$$\mu_v[\ell] = \mathbf{E}\big[I_v[\ell]\big].$$
Note that this is indeed a probability distribution, because if we select any $C \ni v$ and apply (1), we get
$$\mathbf{E}\big[I_v[\ell]\big] = \Pr_{L \sim \lambda_C}[L(v) = \ell],$$
and the numbers on the right-hand side come from the genuine probability distribution $\lambda_C$. The fact that the $\lambda_C$'s and the $\mu_v$'s satisfy the LP's consistent marginals condition is exactly equivalent to (1). $\Box$

Next, fix some $v \in V$. If the pseudoindicator random variables $(I_v[\ell])_{\ell \in D}$ really were legitimate indicator random variables for a genuine assignment, we'd have $\sum_{\ell \in D} I_v[\ell] = 1$. In fact, this is true with probability $1$ in any SDP solution:

Proposition 14.4. Given a valid SDP solution, for any $v$, let $J = J_v = \sum_{\ell \in D} I_v[\ell]$. Then $J \equiv 1$.

Proof. We compute the mean and variance of $J$. By linearity of expectation and condition (1),
$$\mathbf{E}[J] = \sum_{\ell} \mathbf{E}\big[I_v[\ell]\big] = 1.$$
Also,
$$\mathbf{E}[J^2] = \mathbf{E}\Big[\Big(\sum_{\ell} I_v[\ell]\Big)\Big(\sum_{\ell'} I_v[\ell']\Big)\Big],$$
which by linearity of expectation is just
$$\sum_{\ell, \ell'} \mathbf{E}\big[I_v[\ell]\, I_v[\ell']\big].$$
Choose any constraint $C \ni v$. By (2), this equals
$$\sum_{\ell, \ell'} \Pr_{L \sim \lambda_C}\big[L(v) = \ell \wedge L(v) = \ell'\big].$$
Here every term with $\ell \ne \ell'$ is $0$, so this reduces to
$$\sum_{\ell} \Pr_{L \sim \lambda_C}[L(v) = \ell] = 1.$$
Then, computing the variance of $J$:
$$\mathrm{Var}[J] = \mathbf{E}[J^2] - \mathbf{E}[J]^2 = 0.$$
Any random variable with zero variance is a constant random variable, with value equal to its mean. Thus $J \equiv 1$. $\Box$

Theorem 14.5. Condition (1) in the SDP is superfluous, in the sense that dropping it leads to an equivalent SDP (equivalent meaning that the optimum is the same for all instances).

Proof. On the homework. $\Box$

Given the above theorem, we focus for a while on the optimization problem in which the joint pseudoindicator random variables only need to satisfy (2). Let's now answer the big question: how is this optimization problem an SDP?

14.3 Why is it an SDP and how do we construct the pseudoindicators?

Let us define the numbers
$$\sigma_{(v,\ell),(v',\ell')} = \mathbf{E}\big[I_v[\ell]\, I_{v'}[\ell']\big].$$
As this notation suggests, we will collect these numbers into a matrix $\Sigma$. It will be an $N \times N$ matrix (for $N = |V| \cdot |D|$), with rows and columns indexed by variable/label pairs:
$$\Sigma = \big(\sigma_{(v,\ell),(v',\ell')}\big).$$
Now let us ask: what is the consistent second moments condition (2) saying? The second moments constraint is satisfied if and only if there exists a collection of $N$ random variables (the pseudoindicators) whose second moment matrix is $\Sigma$. But, if you recall, this is equivalent definition #5 from Lecture 10 of PSD-ness of the matrix $\Sigma$. Thus our optimization problem, which has linear constraints on the variables $\lambda_C$ and $\sigma_{(v,\ell),(v',\ell')}$ together with the condition that $\Sigma$ is PSD, is indeed an SDP!

We still need to discuss how to actually construct/sample from pseudoindicator random variables $(I_v[\ell])$ corresponding to the Ellipsoid Algorithm's output $\Sigma$. It's much like the beginning of the Goemans-Williamson algorithm: given $\Sigma$ PSD, you compute (a very accurate approximation to) a matrix $U \in \mathbb{R}^{N \times N}$ such that $U^\top U = \Sigma$. The columns of $U$ are vectors $\vec{y}_{v,\ell} \in \mathbb{R}^N$ such that $\vec{y}_{v,\ell} \cdot \vec{y}_{v',\ell'} = \sigma_{(v,\ell),(v',\ell')}$.
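As a concrete illustration of this factorization step (not from the notes), the following sketch assumes an SDP solver has already produced a PSD second-moment matrix $\Sigma$ and recovers column vectors with the right inner products; the example matrix is made up (it corresponds to a single Boolean variable with $\lambda_C$ uniform on $\{0,1\}$).

```python
import numpy as np

def factor_psd(Sigma):
    """Given a PSD matrix Sigma, return U with U.T @ U = Sigma (up to numerical
    error). Column j of U is the vector y_j for the j-th (variable, label) pair."""
    # Symmetric eigendecomposition; clip tiny negative eigenvalues due to round-off.
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    U = np.diag(np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T
    assert np.allclose(U.T @ U, Sigma, atol=1e-6)
    return U

# Made-up 2x2 second-moment matrix for a single Boolean variable v:
# entries E[I_v[0]^2], E[I_v[0] I_v[1]], ..., arising from lambda_C uniform on {0,1}.
Sigma = np.array([[0.5, 0.0],
                  [0.0, 0.5]])
U = factor_psd(Sigma)
y = {("v", 0): U[:, 0], ("v", 1): U[:, 1]}
print(np.dot(y[("v", 0)], y[("v", 0)]))  # ~0.5 = sigma_{(v,0),(v,0)}
```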

How does this help? The key idea is that you can think of a vector as a random variable, and a collection of vectors as a collection of jointly distributed random variables. How? A vector $\vec{y} \in \mathbb{R}^N$ defines a random variable $Y$ as follows: to get a draw from $Y$, pick $i \in [N]$ uniformly at random and output $Y = y_i$. A collection of vectors $\vec{y}^{(1)}, \ldots, \vec{y}^{(d)}$ defines a collection of jointly distributed random variables $Y^{(1)}, \ldots, Y^{(d)}$ as follows: to get one draw from (all of) the $Y^{(j)}$'s, pick $i \in [N]$ uniformly at random and output $Y^{(j)} = \big(\vec{y}^{(j)}\big)_i$ for each $j \in [d]$.

In this way, we can view the vectors that the SDP solver outputs (more precisely, the vectors obtained from the columns of the factorization $U^\top U = \Sigma$), namely the $\vec{y}_{v,\ell}$'s, as a collection of jointly distributed pseudoindicators $Y_v[\ell]$. Why does this work? The idea is that inner products are preserved (up to a trivial scaling):

Observation 14.6. Given vectors $\vec{y}, \vec{y}' \in \mathbb{R}^N$, the corresponding random variables $Y, Y'$ satisfy
$$\mathbf{E}[Y Y'] = \sum_{i=1}^{N} \frac{1}{N}\, y_i\, y'_i = \frac{1}{N}\, \vec{y} \cdot \vec{y}'.$$

We'll make a slight definition to get rid of this annoying scaling factor:

Definition 14.7. We introduce the scaled inner product $\langle \vec{y}, \vec{y}' \rangle := \frac{1}{N}\, \vec{y} \cdot \vec{y}'$.

Solving the SDP is thus equivalent to coming up with the pseudoindicator random variables, up to this slight need to scale. Given the $\vec{y}_{v,\ell}$'s as in the original SDP, we define
$$\vec{z}_{v,\ell} = \sqrt{N}\, \vec{y}_{v,\ell}.$$
Then
$$\langle \vec{z}_{v,\ell}, \vec{z}_{v',\ell'} \rangle = N\, \langle \vec{y}_{v,\ell}, \vec{y}_{v',\ell'} \rangle = \vec{y}_{v,\ell} \cdot \vec{y}_{v',\ell'} = \sigma_{(v,\ell),(v',\ell')}.$$
So in fact, the joint random variables corresponding to this collection of vectors $\vec{z}_{v,\ell}$ are the pseudoindicator random variables.
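To make the vector-as-random-variable view concrete, here is a hedged sketch (again my own, not from the notes): it presses the "button" by drawing a uniformly random coordinate, reads off every scaled vector at that coordinate, and checks empirically that the resulting second moments approximate the entries of $\Sigma$. The example matrix is the same made-up one as above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same made-up example: Sigma for one Boolean variable with lambda_C uniform.
Sigma = np.array([[0.5, 0.0],
                  [0.0, 0.5]])
N = Sigma.shape[0]
eigvals, eigvecs = np.linalg.eigh(Sigma)
U = np.diag(np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T
Z = np.sqrt(N) * U          # column j of Z is z_j = sqrt(N) * y_j

def draw_pseudoindicators(Z, rng):
    """One press of the 'button': pick a uniform coordinate i and output the
    i-th entry of every scaled vector z_j simultaneously (one joint draw)."""
    i = rng.integers(Z.shape[0])
    return Z[i, :]

# Empirical second moments over many draws should approximate Sigma,
# since E[Z_j Z_k] = <z_j, z_k> = sigma_{jk}.
samples = np.array([draw_pseudoindicators(Z, rng) for _ in range(200000)])
print((samples[:, :, None] * samples[:, None, :]).mean(axis=0))  # ~ Sigma
```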

14.4 Summary

There are several equivalent perspectives on the canonical SDP relaxation for a CSP.

- Pseudoindicator random variables which satisfy the first and second moment consistency constraints. This perspective is arguably best for understanding the SDP.
- Pseudoindicator random variables which just satisfy the consistent second moments constraints. This perspective is arguably best when constructing SDP solutions by hand.
- Vectors $(\vec{y}_{v,\ell})$ satisfying the first and second moment consistency constraints. This perspective is arguably best for using an SDP solver; it is the one that's actually used computationally, on a computer.

There is one more equivalent perspective that we will see in the next lecture, which is arguably the best perspective for developing SDP rounding algorithms:

- Jointly Gaussian pseudoindicator random variables which satisfy the consistent first and second moment constraints.

In the next lecture we will see how to make the pseudoindicators jointly Gaussian, and why this is good for rounding algorithms.

Bibliography

[AW00] T. Asano and D. P. Williamson. Improved approximation algorithms for MAX SAT. In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 96-105. Society for Industrial and Applied Mathematics, 2000.

[CMM07] M. Charikar, K. Makarychev, and Y. Makarychev. Near-optimal algorithms for maximum constraint satisfaction problems. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 62-68. Society for Industrial and Applied Mathematics, 2007.

[HZ99] E. Halperin and U. Zwick. Approximation algorithms for MAX 4-SAT and rounding procedures for semidefinite programs. In Integer Programming and Combinatorial Optimization, pages 202-217, 1999.

[KZ97] H. Karloff and U. Zwick. A 7/8-approximation algorithm for MAX 3SAT? In Proceedings of the 38th Annual Symposium on Foundations of Computer Science, page 406. IEEE Computer Society, 1997.

[LLZ02] M. Lewin, D. Livnat, and U. Zwick. Improved rounding techniques for the MAX 2-SAT and MAX DI-CUT problems. In Proceedings of the 9th International Conference on Integer Programming and Combinatorial Optimization (IPCO), pages 67-82, 2002.

[Zwi02] U. Zwick. Computer assisted proof of optimal approximability results. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 496-505. Society for Industrial and Applied Mathematics, 2002.