Low-rank semidefinite programming for the MAX2SAT problem
Po-Wei Wang, J. Zico Kolter
Machine Learning Department, Carnegie Mellon University; School of Computer Science, Carnegie Mellon University, and Bosch Center for Artificial Intelligence 1

Highlights
We will talk about:
- Why the maximum satisfiability problem matters
- How to approximate the MAXSAT problem with a semidefinite program
- How to solve this approximation efficiently
- How to use the approximation inside a combinatorial search procedure
In conclusion, you will find:
- The first continuous solver for MAXSAT that beats all the state-of-the-art solvers on some subclasses
- A new avenue for combinatorial optimization problems 2

Outline
- Maximum satisfiability
- Lifting and the Goemans-Williamson approximation
- SDP solvers and the Mixing method
- Branch-and-bound for low-rank SDP
- Experimental results

Reasoning with conflicts
Given a knowledge graph with imperfect facts and rules:
- Facts: Spain_Nbr_Portugal, Spain_Nbr_Morocco, Spain_Nbr_France, Portugal_In_Eu, France_In_Eu, Morocco_In_Af
- Rules: a country is located in only one region; neighbors tend to be in the same region
Say that we want to infer whether S_In_Eu or S_In_Af. The problem can be translated into a Boolean formula
(¬S_In_Eu ∨ ¬S_In_Af) ∧ (¬S_Nbr_M ∨ ¬M_In_Af ∨ S_In_Af) ∧ ...
Thus, reasoning corresponds to finding a model with the fewest conflicts, i.e., solving the corresponding MAXSAT problem. 3
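The rule-to-clause translation above can be sketched in code. This is a minimal illustration, not the talk's actual encoder; the variable indices and the `num_conflicts` helper are hypothetical names chosen for the example.

```python
# Hypothetical encoding of the slide's rules as CNF clauses.
# Variable indices (illustrative): 1 = S_In_Eu, 2 = S_In_Af,
# 3 = S_Nbr_M (Spain neighbors Morocco), 4 = M_In_Af.
# A positive integer i is the literal v_i; -i is its negation.

clauses = [
    [-1, -2],     # "one region only": not (S_In_Eu and S_In_Af)
    [-3, -4, 2],  # "neighbors share regions": S_Nbr_M and M_In_Af -> S_In_Af
]

def num_conflicts(assignment, clauses):
    """Count clauses falsified by `assignment` (a dict var -> bool)."""
    unsat = 0
    for clause in clauses:
        if not any(assignment[abs(l)] == (l > 0) for l in clause):
            unsat += 1
    return unsat

# Placing Spain in Africa, like its neighbor Morocco, satisfies both rules:
facts = {1: False, 2: True, 3: True, 4: True}
assert num_conflicts(facts, clauses) == 0
```

Reasoning with conflicts then means searching for the assignment minimizing `num_conflicts`, which is exactly the MAXSAT objective.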

MAXSAT = maximizing the # of satisfied clauses
Given a Boolean formula over variables v_i ∈ {True, False}, e.g.
(¬v_1) ∧ (v_1 ∨ v_2) ∧ (v_1 ∨ ¬v_2 ∨ v_3) ∧ (¬v_1 ∨ v_2 ∨ ¬v_3),
where each parenthesized disjunction is a clause, the SAT problem finds a satisfying assignment V ∈ {T, F}^n if one exists. But in ML, inputs are noisy and models are imperfect, so usually we cannot find a model that's 100% correct.
MAXSAT: maximize the number of satisfied clauses. That is, find a model with the least error. 4
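The MAXSAT objective can be stated as a one-screen brute-force search, which also makes clear why the problem is hard (the search space is 2^n). The clause list below is illustrative, with negations chosen for the example.

```python
from itertools import product

def satisfied(clauses, assign):
    """Count satisfied clauses; `assign` is a tuple of bools, literal i > 0
    means v_i is required True, i < 0 means required False."""
    return sum(any(assign[abs(l) - 1] == (l > 0) for l in c) for c in clauses)

def brute_force_maxsat(clauses, n):
    """Exhaustively find the maximum number of simultaneously
    satisfiable clauses over all 2^n assignments."""
    return max(satisfied(clauses, a) for a in product([False, True], repeat=n))

# An illustrative formula in the spirit of the slide:
clauses = [[-1], [1, 2], [1, -2, 3], [-1, 2, -3]]
best = brute_force_maxsat(clauses, 3)   # here all 4 clauses are satisfiable
```

The contradictory formula `[[1], [-1]]` shows the MAXSAT view: no assignment satisfies both clauses, but every assignment satisfies one, so the optimum is 1.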

Difficulties: MAXSAT is NP-complete
MAXSAT is a binary optimization problem, hard to solve in general. Modern discrete solvers apply different heuristics:
- Some bound the solution using SAT solvers (core-guided methods)
- Some perform greedy local search (e.g., CCLS, CCEHC)
There are some continuous approximations in theoretical computer science, but they don't outperform discrete solvers in general:
- The underlying optimization problems are expensive compared to heuristics
We will show an efficient approximation that is easy to compute. 5

Outline
- Maximum satisfiability
- Lifting and the Goemans-Williamson approximation
- SDP solvers and the Mixing method
- Branch-and-bound for low-rank SDP
- Experimental results

Relaxation: lifting binary variables to higher dimensions
A binary optimization problem can in general be written as
minimize_V L(V), s.t. v_i ∈ {−1, +1}, where L(·) is the loss function.
The binary constraint v_i ∈ {−1, +1} is equivalent to
‖v_i‖ = 1, v_i ∈ ℝ^1.
To make it smoother, we can lift it to a higher-dimensional space:
‖v_i‖ = 1, v_i ∈ ℝ^k.
That is, we relax the binary variables to the continuous unit sphere. This lifting from binary variables to the unit sphere is the Goemans-Williamson approximation. 6

The quadratic approximation for MAX-2-SAT
Consider the simplest MAXSAT: MAX-2-SAT (clause length ≤ 2). Maximizing satisfiability is equivalent to minimizing a loss:
maximize_V merit (v_1 ∨ v_2) ⟺ minimize_V loss (¬v_1 ∧ ¬v_2).
We can cleverly design the loss as
loss(V) = (‖v_1 + v_2 − 1‖^2 − 1) / 8.
Verify that the loss is 1 exactly when all literals are false:
- v_1 = −1, v_2 = −1: loss(V) = (9 − 1)/8 = 1 (clause unsatisfied)
- v_1 = +1, v_2 = ±1: loss(V) = (1 − 1)/8 = 0 (clause satisfied)
Replace the 1 with v_0 as the new truth direction. This extends to a general clause of length n_j (a quadratic lower bound on the discrete loss). 7
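The verification on the slide can be run directly. This is a sketch of the stated formula loss(V) = (‖v_1 + v_2 − v_0‖^2 − 1)/8, checked in the 1-D discrete case where the truth direction v_0 is the scalar +1.

```python
import numpy as np

def clause_loss(v1, v2, v0):
    """Quadratic loss for a 2-literal clause (v1 or v2), following the
    slide's formula: (||v1 + v2 - v0||^2 - 1) / 8."""
    return (np.linalg.norm(v1 + v2 - v0) ** 2 - 1.0) / 8.0

# 1-D discrete case: truth direction v0 = +1, False = -1, True = +1.
v0 = np.array([1.0])
false, true = np.array([-1.0]), np.array([1.0])

assert clause_loss(false, false, v0) == 1.0  # both literals false: loss 1
assert clause_loss(true, false, v0) == 0.0   # clause satisfied: loss 0
assert clause_loss(true, true, v0) == 0.0    # clause satisfied: loss 0
```

Because the same formula accepts unit vectors of any dimension, it is exactly the objective that gets lifted to ℝ^k on the next slides.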

Randomized rounding: from continuous to discrete
After obtaining the optimal v_i, we can recover a discrete assignment.
Randomized rounding: pick a uniformly random direction r and set
binary v_i = sign((r^T v_0)(r^T v_i)), i = 1 … n.
Goemans & Williamson (1995) proved a 0.878 approximation ratio on MAX2SAT in expectation. Performing the rounding multiple times gives a surprisingly high approximation ratio. 8
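The rounding rule is a few lines of numpy. This is a sketch under the convention that rows of `V` are the unit vectors v_i (the slides use columns); the function name and toy data are illustrative.

```python
import numpy as np

def randomized_rounding(V, v0, trials=100, rng=None):
    """GW-style rounding: draw a random direction r and assign each
    variable the sign of (r.v0)(r.v_i). Rows of V are the vectors v_i.
    Returns the sign vectors from `trials` independent roundings."""
    rng = np.random.default_rng(rng)
    out = []
    for _ in range(trials):
        r = rng.standard_normal(v0.shape[0])
        out.append(np.sign((r @ v0) * (V @ r)))
    return out

# Toy check: a vector aligned with v0 always rounds to +1 (True),
# an opposed vector always rounds to -1 (False).
v0 = np.array([1.0, 0.0])
V = np.array([[1.0, 0.0], [-1.0, 0.0]])  # rows are v_1, v_2
roundings = randomized_rounding(V, v0, trials=10, rng=0)
```

In practice one would keep the best of the `trials` assignments, which is the "perform multiple times" trick on the slide.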

Approximation ratio experiment
Guaranteed 0.878 approximation: E[Rounding] ≥ 0.878 · OPT.
Experiment: 0.978 approximation ratio on the 2016 MAXSAT evaluation dataset.
Key idea: E[Rounding] ≤ E[sup_r (r-th Rounding)] for i.i.d. roundings, so keeping the best of many roundings beats the expected quality of a single one. 9

Outline
- Maximum satisfiability
- Lifting and the Goemans-Williamson approximation
- SDP solvers and the Mixing method
- Branch-and-bound for low-rank SDP
- Experimental results

Reducing the vector program to a semidefinite program
Recall that the loss is a homogeneous quadratic function of V:
loss(V) = (‖v_1 + v_2 − v_0‖^2 − 1) / 8.
Thus, the problem is equivalent to the quadratic vector program
minimize_V ⟨C, V^T V⟩, s.t. ‖v_i‖ = 1, V ∈ ℝ^{k×n}.
When we lift the solution to a high enough dimension (k ≥ √(2n)), the optimum over V is equivalent to that of the following SDP (Pataki, 1998):
minimize_X ⟨C, X⟩, s.t. X_ii = 1, X ⪰ 0, X ∈ ℝ^{n×n}.
Hence we obtain the optimal solution X* = V^T V in polynomial time. 10

Biggest problem: SDP solvers are slow
Since SDP is convex, we can solve it in polynomial time. That said, the complexity of SDP solvers (interior-point methods) is O(n^6):
- Only scales to n ≈ 10^3
- Complexity is cubic in the # of variables (X ∈ ℝ^{n×n}) due to matrix inversion
We'd like to solve the problem in the low-rank space V ∈ ℝ^{k×n}:
- Many fewer variables
- But non-convex (unit-sphere shell)
Fortunately, we have a secret weapon that provably recovers the optimal solution of the SDP in the low-rank space:
- It recovers the globally optimal solution despite the non-convexity
- At the same time it is 10–100× faster than other state-of-the-art solvers 11

Secret weapon: the Mixing method for diagonal SDPs
For the optimization problem on unit spheres,
minimize_V ⟨C, V^T V⟩, s.t. ‖v_i‖ = 1, V ∈ ℝ^{k×n},
observe that the objective decomposes as
⟨C, V^T V⟩ = Σ_i v_i^T (Σ_j c_ij v_j).
If we solve one vector v_i at a time with the others fixed, the subproblem has a closed-form solution:
v_i := normalize(−Σ_{j≠i} c_ij v_j).
Thus, we can cyclically update all the vectors for i = 1 … n.
- Scales to tens of millions of variables, 10–100× faster than SOTA
- Provably converges to the global optimum of the corresponding SDP! 12
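The cyclic update above fits in a dozen lines of numpy. This is a minimal sketch, not the authors' implementation: rows of `V` play the role of the columns v_i in the slides, and the toy coupling matrix `C` is illustrative.

```python
import numpy as np

def mixing_method(C, k, iters=100, rng=0):
    """Sketch of the Mixing method for min <C, V^T V> s.t. ||v_i|| = 1.
    Cyclically sets each v_i to the normalized negative weighted sum of
    the other vectors, the closed-form minimizer of the i-th subproblem."""
    n = C.shape[0]
    rng = np.random.default_rng(rng)
    V = rng.standard_normal((n, k))
    V /= np.linalg.norm(V, axis=1, keepdims=True)  # random unit rows
    for _ in range(iters):
        for i in range(n):
            g = C[i] @ V - C[i, i] * V[i]          # sum_{j != i} c_ij v_j
            norm = np.linalg.norm(g)
            if norm > 0:
                V[i] = -g / norm                   # closed-form update
    return V

# Two mutually "repelling" variables (c_12 = 1): the minimum of
# <C, V V^T> = 2 v_1 . v_2 is -2, attained at antipodal unit vectors.
C = np.array([[0.0, 1.0], [1.0, 0.0]])
V = mixing_method(C, k=2)
obj = np.trace(C @ (V @ V.T))
```

On this toy instance a single sweep already reaches the fixed point v_1 = −v_2, i.e., the objective value −2.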

No prior convergence proof for low-rank SDP solvers
The low-rank decomposition was first proposed by Burer and Monteiro (2001):
minimize_V ⟨C, V^T V⟩, s.t. ‖v_i‖ = 1, V ∈ ℝ^{k×n}.
There are many working variants, but none was provably convergent to the optimal objective:
- Boumal et al. (2018): reach points with Hessian ⪰ −γI in O(1/γ^3) iterations, bounding f − f* for k > n
- Bhojanapalli et al. (2018): f_μ − f*_μ → 0 (f_μ is an extended Lagrangian)
Difficulties:
- Non-convex (spherical-manifold domain), rotational equivalence
- Random initialization required
- Singularity of the Jacobian and Hessian 13

Main theoretical result
How do we prove it?
- Lyapunov instability from control theory, and similarity to Gauss-Seidel
Results:
- First proof of convergence to global optima in 17 years
- Asymptotic cost O(mn log(1/ϵ)) vs. O(n^6), where m is the # of literals
- Further, it is in practice 10–100× faster than other low-rank methods 14

Proof sketch: Lyapunov instability and stable manifolds
Consider one step of the Mixing method as an operator, V_next := M(V).
Instability: the operator has expansive directions at non-optimal critical points.
- Eigenvalues of the Jacobian on the manifold contain those in Euclidean space: Eigvals((A ⊗ I_k) diag(I − v_i v_i^T)(B ⊗ I_k)) ⊇ Eigvals(AB) for k > 2n
- The Jacobian of Gauss-Seidel is unstable when the system is not PSD (some |λ_i| > 1)
- Every non-optimal critical point corresponds to Gauss-Seidel on a non-PSD system
- Thus, the Mixing method is unstable at non-optimal critical points
Center-stable manifold theorem: existence of invariant manifolds.
- The Mixing method with a step size is a diffeomorphism (1-to-1 and C^1)
- By the center-stable manifold theorem, the basin of attraction of the unstable points has measure zero
- That is, random initialization almost never converges to unstable critical points
- Converging to a critical point, but never to an unstable one ⇒ convergence to the global optimum 15

Outline
- Maximum satisfiability
- Lifting and the Goemans-Williamson approximation
- SDP solvers and the Mixing method
- Branch-and-bound for low-rank SDP
- Experimental results

Branch-and-bound framework for SDP
Despite the good approximation, we need tree search for discrete optimality.
Branch-and-bound. Goal: find BEST = the minimum-UNSAT solution.
- Initialize a priority queue Q = {the unassigned MAXSAT problem}
- While Q is not empty:
  - N = Q.pop()
  - Update BEST based on N if possible
  - Evaluate a heuristic lower bound f(N) ≤ UNSAT(N)
  - If BEST ≤ f(N): prune
  - Else: pick a variable v_i, Q.push(N with v_i = T, and N with v_i = F)
Since the loss is a quadratic lower bound on the discrete loss, the SDP value is a valid lower bound in the branch-and-bound framework. But that means solving an SDP for every internal node... 16
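The loop above can be sketched generically. This is an illustrative skeleton, not the MIXSAT code: `lower_bound`, `feasible_unsat`, and `branch` are hypothetical callbacks standing in for the SDP bound, the rounded incumbent, and variable branching; a heap gives best-first order.

```python
import heapq
from itertools import count

def branch_and_bound(root, lower_bound, feasible_unsat, branch):
    """Generic skeleton of the slide's framework. `lower_bound(N)`
    under-estimates UNSAT on subproblem N, `feasible_unsat(N)` is the
    UNSAT count of some rounded feasible assignment, and `branch(N)`
    yields child subproblems (empty when N is fully assigned)."""
    best = float("inf")
    tie = count()                       # tie-breaker so nodes never compare
    Q = [(0.0, next(tie), root)]
    while Q:
        _, _, node = heapq.heappop(Q)
        best = min(best, feasible_unsat(node))   # update incumbent
        if best <= lower_bound(node):
            continue                             # prune this subtree
        for child in branch(node):
            heapq.heappush(Q, (lower_bound(child), next(tie), child))
    return best

# Toy model: nodes are partial 0/1 assignments of length <= 2; we
# minimize the number of 1s, completing pessimistically with 1s.
def branch_fn(n): return [n + (0,), n + (1,)] if len(n) < 2 else []
def lb_fn(n): return sum(n)                     # optimistic completion
def feas_fn(n): return sum(n) + (2 - len(n))    # pessimistic completion
```

Running `branch_and_bound((), lb_fn, feas_fn, branch_fn)` explores the tree and returns the optimum 0, pruning the all-ones branches by their lower bound.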

Bounding the SDP by initialization
Fortunately, we don't need to solve an SDP for every internal node!
- Observation: child problems are similar to the parent problem
- We can apply primal and dual initialization to bound the SDP values, and only solve SDPs for the unpredictable cases
By SDP duality, we have the primal-dual pair
minimize_{V ∈ ℝ^{k×n}} f(V) := ⟨C, V^T V⟩, s.t. ‖v_i‖^2 = 1,
maximize_{λ ∈ ℝ^n} D(λ) := −1^T λ, s.t. C + diag(λ) ⪰ 0.
Known facts: f(V) ≥ f* ≥ D(λ), and BEST ≥ UNSAT* ≥ f*.
For any feasible primal and dual solutions V and λ:
- f(V) < BEST ⇒ expand (the node is not prunable by the SDP bound)
- D(λ) ≥ BEST ⇒ prune (the node cannot contain the optimal solution) 17
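A feasible dual λ can be produced cheaply. The sketch below uses a simple eigenvalue shift rather than the talk's arrow-head construction (which exploits the structure of C), but it shows why any λ with C + diag(λ) ⪰ 0 yields the valid lower bound D(λ) = −1^T λ: for any feasible X, ⟨C, X⟩ = ⟨C + diag(λ), X⟩ − Σ λ_i ≥ −Σ λ_i.

```python
import numpy as np

def feasible_dual(C):
    """One simple (illustrative, not the paper's) way to get a feasible
    dual: shift every diagonal entry by the most negative eigenvalue of
    C, which makes C + diag(lambda) PSD."""
    lam_min = np.linalg.eigvalsh(C).min()
    return np.full(C.shape[0], max(0.0, -lam_min))

# Two-variable example: eigenvalues of C are -1 and +1, so lam = (1, 1).
C = np.array([[0.0, 1.0], [1.0, 0.0]])
lam = feasible_dual(C)
shifted_min = np.linalg.eigvalsh(C + np.diag(lam)).min()  # PSD check
dual_bound = -lam.sum()   # D(lambda), a lower bound on the SDP value
```

Here the bound −2 is tight: the SDP optimum is attained at X = [[1, −1], [−1, 1]] with ⟨C, X⟩ = −2, so this node's dual bound would prune whenever BEST ≤ −2.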

Primal and dual initialization
Primal initialization:
- Trivially copy the parent solution and evaluate
- Helps expansion
Dual initialization:
- Exploit the difference in C, which exhibits an arrow-head structure
- Calculate a feasible λ satisfying C + diag(λ) ⪰ 0
- Helps pruning
From each SDP root, primal initialization marks children to expand, dual initialization marks children to prune, and the uncertain cases become new SDP roots.
This greatly reduces the number of SDP subproblems that actually need to be tackled. 18

Different priority queues for different tasks
There are two tracks in the MAXSAT evaluation, with different requirements.
The complete track:
- Output the result only after verification
- Thus, we set the priority queue Q to be a stack
- Equivalent to performing depth-first search on the tree
The incomplete track:
- Output the best result as soon as you get it; no verification
- Thus, we set the priority queue Q to be a heap
- Equivalent to best-first search on the tree
The same algorithm handles both tracks! 19
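The stack-vs-heap swap can be demonstrated with two tiny queue classes sharing one push/pop interface. The class and function names are illustrative; the point is that only the pop discipline changes, exactly as on the slide.

```python
import heapq

class Stack:
    """LIFO queue: popping the most recent node gives depth-first search."""
    def __init__(self): self.items = []
    def push(self, priority, node): self.items.append((priority, node))
    def pop(self): return self.items.pop()

class Heap:
    """Min-heap on the bound: popping the most promising node gives
    best-first search."""
    def __init__(self): self.items = []
    def push(self, priority, node): heapq.heappush(self.items, (priority, node))
    def pop(self): return heapq.heappop(self.items)

def drain(q, entries):
    """Push all (priority, node) entries, then pop until empty."""
    for p, n in entries:
        q.push(p, n)
    order = []
    while q.items:
        order.append(q.pop()[1])
    return order

entries = [(3, "a"), (1, "b"), (2, "c")]
assert drain(Stack(), entries) == ["c", "b", "a"]  # LIFO: depth-first
assert drain(Heap(), entries) == ["b", "c", "a"]   # by bound: best-first
```

The search loop itself never changes; swapping the queue object switches between the complete-track and incomplete-track behavior.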

Outline
- Maximum satisfiability
- Lifting and the Goemans-Williamson approximation
- SDP solvers and the Mixing method
- Branch-and-bound for low-rank SDP
- Experimental results

Experiment setup
For fairness, we replicate the setup of the 2016 MAXSAT evaluation:
- the same CPU (Intel Xeon E5-2620)
- single-core mode
- 3.5 GB memory limit
- 30 min time limit for the complete track, 5 min for the incomplete track
We compare our results to the best solver for each problem instance:
- that is, the virtual best solver (VBS) 20

Complete track
For the complete random categories, we solved 169/228 MAX2SAT instances in an average of 273 s, while the best solvers solve 145/228 in 341 s. Consistently outperforms the VBS!
[Figure: MAX2SAT in the complete track — solve time in seconds (log scale) vs. instances solved, MIXSAT vs. the best complete 2016 solvers] 21

Incomplete track
For the incomplete random categories, we solved all the instances in 0.22 s (to the best solution), while the 2016 best solvers take 1.92 s. An 8.72× speedup!
[Figure: MAX2SAT in the incomplete track — solve time in seconds (log scale) vs. instances, MIXSAT vs. the best incomplete 2016 solvers] 22

MAX3SAT
For the MAX3SAT instances in the incomplete random track, we solved all but 2 of the instances, and were faster than the 2016 best solvers on 114/144 instances.
[Figure: MAX3SAT in the incomplete track — solve time in seconds (log scale) vs. instances, MIXSAT vs. the best incomplete 2016 solvers] 23

Summary
In this talk, we
- Presented the first continuous MAXSAT solver that outperforms state-of-the-art discrete solvers on some subclasses
- Developed a lifting approximation and solved it via a low-rank SDP method (the Mixing method) that provably converges to the optimum
- Recovered the discrete optimal solution by developing primal-dual initialization within a branch-and-bound framework
And we showed that
- The solver is many times faster than the best 2016 solvers in some MAX2SAT categories
- It is also promising on MAX3SAT, though certainly not the best yet
- It opens a new avenue for combinatorial optimization, taking SDPs out of the purely theoretical world! 24

Appendix: The MIXSAT algorithm
Initialize a priority queue Q = {initial problem};
while Q is not empty do
    New SDP root P = Q.pop();
    Solve f* := SDP(P) with the Mixing method;
    if f* ≥ BEST − ϵ then continue;
    Update BEST and resolution orders by randomized rounding on V := arg SDP(P);
    foreach subproblem of P do
        Initialize the PRIMAL and DUAL objective values;
        if PRIMAL < BEST then expand;
        else if DUAL ≥ BEST then prune;
        else push the subproblem into Q;
    end
end 25

Appendix: Crafted MAXCUT instances
[Figure: Crafted MAXCUT in the 2016/2018 incomplete track — solve time in seconds (log scale) vs. instances, MIXSAT vs. the best incomplete 2016/2018 solvers]
For the MAXCUT instances in the incomplete crafted track in 2016/2018, we solved all the instances in 0.62 s, while the 2016/2018 best solvers take 1.84 s. Still a 3× speedup. 26