Introduction to Solving Combinatorial Problems with SAT Javier Larrosa December 19, 2014
Overview of the session
- Review of Propositional Logic
- The Conjunctive Normal Form (CNF)
- Modeling and solving combinatorial problems with SAT solvers
Definition of Propositional Logic
SYNTAX (what is a formula?):
- The vocabulary consists of a set $P$ of propositional variables, usually denoted by (subscripted) $p, q, r, \ldots$
- A formula over $P$ is defined as follows:
  - Every propositional variable is a formula
  - If $F$ is a formula, $\neg F$ is also a formula
  - If $F$ and $G$ are formulas, $(F \wedge G)$ is also a formula
  - If $F$ and $G$ are formulas, $(F \vee G)$ is also a formula
  - Nothing else is a formula
- Formulas are usually denoted by (subscripted) $F, G, H, \ldots$
- Examples: $p$, $\neg p$, $(p \vee q)$, $(p \wedge q)$, $(p \wedge (\neg p \vee q))$, $((p \wedge q) \vee (r \wedge q))$, ...
Definition of Propositional Logic
SEMANTICS (what is the meaning of a formula?):
- Propositional variables are boolean (i.e., 0 or 1)
- An interpretation (a.k.a. truth assignment) $I$ over $P$ is a 0/1 assignment to each variable in $P$
- $I$ satisfies $F$ (written $I \models F$) if and only if the evaluation of $F$ under $I$ is 1 (i.e., $eval(I,F) = 1$)
- The boolean operators ($\neg$, $\vee$, $\wedge$) are specified with their corresponding truth tables or, equivalently:
  - $eval(I, \neg F) =_{def} 1 - eval(I,F)$
  - $eval(I, F \vee G) =_{def} \max\{eval(I,F), eval(I,G)\}$
  - $eval(I, F \wedge G) =_{def} \min\{eval(I,F), eval(I,G)\}$
- If $I \models F$ we say that $I$ is a model of $F$.
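The evaluation rules above translate almost literally into code. The following is a minimal sketch, assuming a representation that is not part of the slides: a variable is a string, and compound formulas are tuples tagged with "not", "or" or "and". The name `evaluate` is used instead of `eval` only to avoid shadowing the Python builtin.

```python
def evaluate(interp, formula):
    """Evaluate a formula under an interpretation (dict: variable name -> 0 or 1)."""
    if isinstance(formula, str):              # a propositional variable
        return interp[formula]
    op = formula[0]
    if op == "not":                           # eval(I, not F) = 1 - eval(I, F)
        return 1 - evaluate(interp, formula[1])
    if op == "or":                            # eval(I, F or G) = max of both sides
        return max(evaluate(interp, formula[1]), evaluate(interp, formula[2]))
    if op == "and":                           # eval(I, F and G) = min of both sides
        return min(evaluate(interp, formula[1]), evaluate(interp, formula[2]))
    raise ValueError(f"unknown connective {op!r}")


def is_model(interp, formula):
    """I is a model of F exactly when F evaluates to 1 under I."""
    return evaluate(interp, formula) == 1


# Illustrative formula (hypothetical, not taken from the slides): (p or q) and (not p)
example = ("and", ("or", "p", "q"), ("not", "p"))
```

For instance, `is_model({"p": 0, "q": 1}, example)` holds, while the interpretation with `p = 1` is not a model.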
Example
Consider the formula $(p \wedge q) \vee (r \wedge q)$.
- The interpretation $I = \{p := 0, q := 1, r := 0\}$ is NOT a model of the formula.
- The interpretation $I = \{p := 0, q := 1, r := 1\}$ is a model of the formula.
Small Syntax Extension
- We will write $(F \rightarrow G)$ as an abbreviation for $(\neg F \vee G)$
- Similarly, $(F \leftrightarrow G)$ is an abbreviation for $((F \rightarrow G) \wedge (G \rightarrow F))$
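Removing parentheses
- From most to least priority: $\neg$, $\wedge$, $\vee$, $\rightarrow$, $\leftrightarrow$
- All connectives are left-associative
EXAMPLES:
- $\neg F_1 \wedge F_2 \vee F_3 \rightarrow \neg F_4$ is $((((\neg F_1) \wedge F_2) \vee F_3) \rightarrow (\neg F_4))$
- $F_1 \wedge F_2 \vee F_3 \rightarrow \neg F_4 \leftrightarrow F_5$ is $((((F_1 \wedge F_2) \vee F_3) \rightarrow (\neg F_4)) \leftrightarrow F_5)$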
First Modeling Example (Pigeonhole Problem)
We have 3 pigeons and 2 holes. If each hole can hold at most one pigeon, is it possible to place all pigeons in the holes?
Vocabulary: $p_{ij}$ means that the i-th pigeon is in the j-th hole
Constraints:
- Each pigeon is placed in at least one hole: $(p_{11} \vee p_{12}) \wedge (p_{21} \vee p_{22}) \wedge (p_{31} \vee p_{32})$
- Each pigeon is placed in at most one hole: $(\neg p_{11} \vee \neg p_{12}) \wedge (\neg p_{21} \vee \neg p_{22}) \wedge (\neg p_{31} \vee \neg p_{32})$
- Each hole can hold at most one pigeon: $(\neg p_{11} \vee \neg p_{21}) \wedge (\neg p_{11} \vee \neg p_{31}) \wedge (\neg p_{21} \vee \neg p_{31}) \wedge (\neg p_{12} \vee \neg p_{22}) \wedge (\neg p_{12} \vee \neg p_{32}) \wedge (\neg p_{22} \vee \neg p_{32})$
The resulting formula has no model, meaning that there is no way to put 3 pigeons in 2 holes.
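The claim that the pigeonhole formula has no model can be checked by enumerating all $2^6$ interpretations. This is a small sketch under assumptions of my own: clauses are lists of nonzero integers (a negative number is a negated variable), and `var`, `pigeonhole_clauses` and `models` are illustrative names, not part of the slides.

```python
from itertools import combinations, product

PIGEONS, HOLES = 3, 2


def var(i, j):
    """Number the variable p_ij (pigeon i in hole j) as a positive integer 1..6."""
    return (i - 1) * HOLES + j


def pigeonhole_clauses():
    clauses = []
    for i in range(1, PIGEONS + 1):
        # pigeon i is in at least one hole
        clauses.append([var(i, j) for j in range(1, HOLES + 1)])
        # pigeon i is in at most one hole
        for j, j2 in combinations(range(1, HOLES + 1), 2):
            clauses.append([-var(i, j), -var(i, j2)])
    for j in range(1, HOLES + 1):
        # hole j holds at most one pigeon
        for i, i2 in combinations(range(1, PIGEONS + 1), 2):
            clauses.append([-var(i, j), -var(i2, j)])
    return clauses


def models(clauses, n_vars):
    """Brute force: list every assignment that satisfies all clauses."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses)]


pigeon_models = models(pigeonhole_clauses(), PIGEONS * HOLES)
```

Here `pigeon_models` comes out empty, which matches the statement on the slide. Brute force is only viable because the instance has six variables.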
Usual Queries
Let $F$ and $G$ be arbitrary formulas. Then:
- $F$ is satisfiable (also consistent) if it has at least one model
- $G$ is a logical consequence of $F$ ($F$ entails $G$), denoted $F \models G$, if every model of $F$ is a model of $G$
- $F$ and $G$ are logically equivalent, denoted $F \equiv G$, if $F$ and $G$ have the same models
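Logical Equivalences
- $F \wedge F \equiv F$ and $F \vee F \equiv F$
- $F \wedge G \equiv G \wedge F$ and $F \vee G \equiv G \vee F$
- $\neg\neg F \equiv F$
- $(F \wedge G) \wedge H \equiv F \wedge (G \wedge H)$ and $(F \vee G) \vee H \equiv F \vee (G \vee H)$
- $F \wedge (G \vee H) \equiv (F \wedge G) \vee (F \wedge H)$ and $F \vee (G \wedge H) \equiv (F \vee G) \wedge (F \vee H)$
- $\neg(F \wedge G) \equiv \neg F \vee \neg G$ and $\neg(F \vee G) \equiv \neg F \wedge \neg G$
- If $F$ is a tautology then $F \wedge G \equiv G$ and $F \vee G \equiv F$
- If $F$ is unsatisfiable then $F \wedge G \equiv F$ and $F \vee G \equiv G$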
Reduction to SAT
Assume we have a black-box SAT that, given a formula $F$, answers:
- SAT($F$) = YES iff $F$ is satisfiable
- SAT($F$) = NO iff $F$ is unsatisfiable
How can we reuse SAT for detecting logical consequences, equivalences, ...?
- $F \models G$ iff SAT($F \wedge \neg G$) = NO
- $F \equiv G$ iff SAT($(F \wedge \neg G) \vee (\neg F \wedge G)$) = NO
Hence, a single tool suffices.
A VERY USEFUL TOOL: black-box SAT
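The two reductions can be sketched as follows. The function `sat` here is only a brute-force stand-in for the black box, and formulas are represented as Python predicates over boolean arguments; both choices are assumptions made for illustration.

```python
from itertools import product


def sat(formula, n_vars):
    """Stand-in for the black box: True iff some assignment satisfies the formula."""
    return any(formula(*bits) for bits in product([False, True], repeat=n_vars))


def entails(f, g, n_vars):
    """F |= G iff F and (not G) is unsatisfiable."""
    return not sat(lambda *v: f(*v) and not g(*v), n_vars)


def equivalent(f, g, n_vars):
    """F == G iff (F and not G) or (not F and G) is unsatisfiable."""
    return not sat(lambda *v: (f(*v) and not g(*v)) or (not f(*v) and g(*v)), n_vars)


conj = lambda p, q: p and q
disj = lambda p, q: p or q
```

With these definitions, `entails(conj, disj, 2)` holds but `entails(disj, conj, 2)` does not, and De Morgan's law can be confirmed with `equivalent`.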
Complexity of SAT
- SAT is an NP-complete problem
- All known algorithms have worst-case exponential cost in the size of the input
- Solving the problem is exponentially expensive in the worst case, but checking whether a given assignment is a model is only polynomially expensive
- Research in the SAT community (see http://www.satlive.org/): develop general-purpose SAT solvers that can solve real-size problems in reasonable time. There has been great success in the last 30 years.
- Real problems may not be worst-case instances for some algorithms
Conjunctive Normal Form (CNF)
To construct our SAT black box, it simplifies our job to assume that the formula $F$ has a given format.
- A literal is a propositional variable ($p$) or the negation of one ($\neg p$)
- A clause is a disjunction of zero or more literals ($l_1 \vee \ldots \vee l_n$)
- A formula is in Conjunctive Normal Form (CNF) if it is a conjunction of clauses
CNF
- Example: $p \wedge (q \vee r) \wedge (q \vee p \vee r)$ is in CNF
- Property: every formula can be transformed into CNF
We are going to see three different ways to do it.
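Transformation to CNF via truth table
Let us take the formula $F := (p \wedge q) \vee \neg(\neg p \wedge (q \vee \neg r))$
Its truth table is:
p q r | F
0 0 0 | 0
0 0 1 | 1
0 1 0 | 0
0 1 1 | 0
1 0 0 | 1
1 0 1 | 1
1 1 0 | 1
1 1 1 | 1
It is easy to compute a CNF for $F$, with one clause per row where $F$ is 0: $(p \vee q \vee r) \wedge (p \vee \neg q \vee r) \wedge (p \vee \neg q \vee \neg r)$
This method may produce unnecessarily large CNF formulas (e.g. for $p \wedge q \wedge r$)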
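The truth-table method can be sketched in a few lines: every row where the formula evaluates to 0 yields one clause that excludes exactly that row. The representation is an assumption made here for illustration: the formula is a Python predicate, literals are strings, and a leading "-" marks negation. The formula `F` below is the example read as $(p \wedge q) \vee \neg(\neg p \wedge (q \vee \neg r))$, which agrees with the truth table of the slide.

```python
from itertools import product


def truth_table_cnf(formula, names):
    """One clause per falsifying row: the clause is false on that row only."""
    clauses = []
    for bits in product([0, 1], repeat=len(names)):
        if not formula(*bits):
            # a variable set to 0 appears positively, a variable set to 1 negated
            clauses.append([name if bit == 0 else "-" + name
                            for name, bit in zip(names, bits)])
    return clauses


F = lambda p, q, r: (p and q) or not ((not p) and (q or not r))
cnf_of_F = truth_table_cnf(F, ["p", "q", "r"])

# p and q and r is false on 7 of its 8 rows, so it gets 7 clauses of size 3
cnf_of_conjunction = truth_table_cnf(lambda p, q, r: p and q and r, ["p", "q", "r"])
```

The second call illustrates the size problem: a formula that is already a CNF with three unit clauses is turned into seven clauses of three literals each.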
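Transformation to CNF via distributivity
1. Apply the three transformation rules up to completion:
   - $\neg\neg F \Rightarrow F$
   - $\neg(F \wedge G) \Rightarrow \neg F \vee \neg G$
   - $\neg(F \vee G) \Rightarrow \neg F \wedge \neg G$
2. Now apply the distributivity rule up to completion:
   - $F \vee (G \wedge H) \Rightarrow (F \vee G) \wedge (F \vee H)$
3. Remove tautology clauses and repeated literals
EXAMPLE: let $F$ be $(p \wedge q) \vee \neg(\neg p \wedge (q \vee \neg r))$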
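The slide leaves the example as an exercise. One possible derivation, assuming the example formula reads $(p \wedge q) \vee \neg(\neg p \wedge (q \vee \neg r))$ and applying the three steps in order, is sketched below.

```latex
\begin{align*}
F &= (p \wedge q) \vee \neg(\neg p \wedge (q \vee \neg r)) \\
  &\Rightarrow (p \wedge q) \vee (\neg\neg p \vee \neg(q \vee \neg r))
      && \text{step 1: De Morgan} \\
  &\Rightarrow (p \wedge q) \vee (p \vee (\neg q \wedge \neg\neg r))
      && \text{step 1: double negation, De Morgan} \\
  &\Rightarrow (p \wedge q) \vee (p \vee (\neg q \wedge r))
      && \text{step 1: double negation} \\
  &\Rightarrow (p \vee p \vee \neg q) \wedge (p \vee p \vee r)
      \wedge (q \vee p \vee \neg q) \wedge (q \vee p \vee r)
      && \text{step 2: distributivity} \\
  &\Rightarrow (p \vee \neg q) \wedge (p \vee r) \wedge (p \vee q \vee r)
      && \text{step 3: drop tautology and repeated literals}
\end{align*}
```

The result is equivalent to $p \vee (\neg q \wedge r)$. The last clause is implied by $(p \vee r)$, but the three steps as stated do not remove it.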
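Transformation to CNF via distributivity
This method may produce an exponential growth in the size of the formula.
Example:
- $(p_0 \wedge p_1) \vee (p_2 \wedge p_3) \equiv (p_0 \vee p_2) \wedge (p_0 \vee p_3) \wedge (p_1 \vee p_2) \wedge (p_1 \vee p_3)$
- $(p_0 \wedge p_1) \vee (p_2 \wedge p_3) \vee (p_4 \wedge p_5) \equiv (p_0 \vee p_2 \vee p_4) \wedge (p_0 \vee p_3 \vee p_4) \wedge (p_1 \vee p_2 \vee p_4) \wedge (p_1 \vee p_3 \vee p_4) \wedge (p_0 \vee p_2 \vee p_5) \wedge (p_0 \vee p_3 \vee p_5) \wedge (p_1 \vee p_2 \vee p_5) \wedge (p_1 \vee p_3 \vee p_5)$
- ...
In general: $(\bigwedge_i D_i) \vee (p \wedge q) \equiv (\bigwedge_i (D_i \vee p)) \wedge (\bigwedge_i (D_i \vee q))$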
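The blow-up can be observed directly: distributing a disjunction of $k$ two-literal conjunctions yields one clause per way of choosing one literal from each conjunction, that is, $2^k$ clauses. A minimal sketch, with `distribute` and `blowup` as hypothetical helper names and literals as plain strings:

```python
from itertools import product


def distribute(conjunctions):
    """CNF of a disjunction of conjunctions of literals:
    one clause for each choice of one literal per conjunction."""
    return [list(choice) for choice in product(*conjunctions)]


def blowup(k):
    """Number of clauses for (p0 & p1) | (p2 & p3) | ... with k conjunctions."""
    conjunctions = [[f"p{2 * i}", f"p{2 * i + 1}"] for i in range(k)]
    return len(distribute(conjunctions))


two = distribute([["p0", "p1"], ["p2", "p3"]])
```

Here `two` contains the four clauses of the first example, and `blowup(10)` already reaches 1024 clauses for an input of only 20 literals.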
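Transformation to CNF via Tseitin
Let $F$ be $(p \wedge q) \vee \neg(\neg p \wedge (q \vee \neg r))$
Formula tree, with one new variable $e_k$ per connective:
- $e_1$: $\vee$ (root), with children $e_2$ and $e_3$
  - $e_2$: $\wedge$, with children $p$ and $q$
  - $e_3$: $\neg$, with child $e_4$
    - $e_4$: $\wedge$, with children $e_5$ and $e_6$
      - $e_5$: $\neg$, with child $p$
      - $e_6$: $\vee$, with children $q$ and $e_7$
        - $e_7$: $\neg$, with child $r$
Resulting definitions:
- $e_1$
- $e_1 \leftrightarrow e_2 \vee e_3$
- $e_2 \leftrightarrow p \wedge q$
- $e_3 \leftrightarrow \neg e_4$
- $e_4 \leftrightarrow e_5 \wedge e_6$
- $e_5 \leftrightarrow \neg p$
- $e_6 \leftrightarrow q \vee e_7$
- $e_7 \leftrightarrow \neg r$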
Transformation to CNF via Tseitin
- Variations of Tseitin are the ones used in practice
- Tseitin does not produce an equivalent CNF but...
  - Any model of the CNF can be projected to the variables in $F$, giving a model of $F$
  - Any model of $F$ can be completed to a model of the CNF
- Hence no model is lost or added in the conversion
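Transformation to CNF via Tseitin
Tseitin does not produce an equivalent CNF:
- $F = p \vee (q \wedge r)$
- $G = e_1 \wedge (e_1 \leftrightarrow e_2 \vee p) \wedge (e_2 \leftrightarrow q \wedge r)$
Is $F \equiv G$? No, because $I = \{p = 1, q = r = e_1 = e_2 = 0\}$ is a model of $F$ and it is not a model of $G$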
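The following is a minimal sketch of a Tseitin-style translation, not the exact variant used by any particular tool. It assumes formulas are nested tuples tagged "not", "and", "or" over string variables, and clauses are lists of signed integers. Each connective gets a fresh variable $e$ whose defining equivalence is written out as clauses, and the root variable is asserted as a unit clause.

```python
from itertools import product


def tseitin(formula, variables):
    """Return (clauses, n_vars). Original variables are numbered 1..len(variables);
    every connective gets a fresh variable after those."""
    number = {name: k + 1 for k, name in enumerate(variables)}
    state = {"next": len(variables)}
    clauses = []

    def encode(f):
        if isinstance(f, str):
            return number[f]
        args = [encode(g) for g in f[1:]]
        state["next"] += 1
        e = state["next"]
        if f[0] == "not":                       # e <-> not a
            (a,) = args
            clauses.extend([[-e, -a], [e, a]])
        elif f[0] == "and":                     # e <-> a and b
            a, b = args
            clauses.extend([[-e, a], [-e, b], [e, -a, -b]])
        elif f[0] == "or":                      # e <-> a or b
            a, b = args
            clauses.extend([[-e, a, b], [e, -a], [e, -b]])
        else:
            raise ValueError(f"unknown connective {f[0]!r}")
        return e

    clauses.append([encode(formula)])           # the root must be true
    return clauses, state["next"]


def models(clauses, n_vars):
    """Brute force: list every assignment that satisfies all clauses."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses)]


# F = p or (q and r), as on the slide
cnf, n = tseitin(("or", "p", ("and", "q", "r")), ["p", "q", "r"])
cnf_models = models(cnf, n)
projected = {m[:3] for m in cnf_models}
expected = {(p, q, r) for p, q, r in product([False, True], repeat=3) if p or (q and r)}
```

On this example `projected` equals `expected`: projecting the models of the CNF onto $p, q, r$ gives exactly the models of $F$, and each model of $F$ extends to exactly one model of the CNF because the auxiliary variables are determined by their definitions.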
SAT solvers
- A SAT solver is a program that receives a CNF formula and returns one model (or reports failure if the formula is unsatisfiable)
- We can use SAT solvers to answer most useful queries in Propositional Logic
- In the last 30 years, more and more efficient SAT solvers have been developed
- Although SAT is an NP-hard problem, modern SAT solvers can solve instances with thousands of variables and hundreds of thousands of clauses
- See http://www.satlive.org/
Solving Combinatorial Problems with SAT solvers
Given a Combinatorial Problem:
1. Modeling: find a CNF formula such that its models are the problem solutions
2. Solving: use a SAT solver to find one such model
   - We need to choose a SAT solver (such as MiniSAT)
   - We need to know the input format (such as DIMACS)
   - We need to write a program that writes the CNF in DIMACS format
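The last step, writing the CNF in DIMACS format, is short. In DIMACS CNF, a header line `p cnf <variables> <clauses>` is followed by one clause per line, written as signed integers and terminated by `0`. The sketch below assumes clauses are lists of nonzero integers; `to_dimacs` is an illustrative name.

```python
def to_dimacs(clauses, n_vars):
    """Serialize a list of integer clauses as DIMACS CNF text."""
    lines = [f"p cnf {n_vars} {len(clauses)}"]
    for clause in clauses:
        # literals separated by spaces, each clause closed by a 0
        lines.append(" ".join(str(lit) for lit in clause) + " 0")
    return "\n".join(lines) + "\n"


# (x1 or not x2) and (x2 or x3)
text = to_dimacs([[1, -2], [2, 3]], 3)
```

The resulting text can be written to a file and handed to the chosen solver.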
Modeling Example: 3-Coloring
Given an undirected graph $G = (V,E)$ (with $|V| = n$, $|E| = m$) and 3 colors, assign one color to each vertex $v \in V$ in such a way that no two adjacent vertices have the same color.
Variables: $x_{ij}$ ($0 \le i < n$, $0 \le j < 3$)
Meaning of variables: $x_{ij}$ is true if vertex $i$ gets color $j$
Constraints:
- Every vertex gets a color:
  - For $i = 0..n-1$: $(x_{i0} \vee x_{i1} \vee x_{i2})$
  - For $i = 0..n-1$, for $0 \le j < j' < 3$: $(\neg x_{ij} \vee \neg x_{ij'})$
- Adjacent vertices do not get the same color:
  - For $(i,i') \in E$, for $j = 0..2$: $(\neg x_{ij} \vee \neg x_{i'j})$
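The model can be generated mechanically from the graph. The sketch below assumes vertices are numbered from 0, edges are pairs of vertex numbers, and the variable $x_{ij}$ is mapped to the integer `i * 3 + j + 1`; the brute-force `models` function is only there to check tiny instances and stands in for a real SAT solver.

```python
from itertools import combinations, product

COLORS = 3


def x(i, j):
    """Integer id of the variable 'vertex i gets color j'."""
    return i * COLORS + j + 1


def coloring_clauses(n, edges):
    clauses = []
    for i in range(n):
        # vertex i gets at least one color
        clauses.append([x(i, j) for j in range(COLORS)])
        # vertex i gets at most one color
        for j, j2 in combinations(range(COLORS), 2):
            clauses.append([-x(i, j), -x(i, j2)])
    for i, i2 in edges:
        # adjacent vertices do not share color j
        for j in range(COLORS):
            clauses.append([-x(i, j), -x(i2, j)])
    return clauses


def models(clauses, n_vars):
    """Brute force: list every assignment that satisfies all clauses."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses)]


triangle = [(0, 1), (1, 2), (0, 2)]
k4 = list(combinations(range(4), 2))        # complete graph on 4 vertices
triangle_models = models(coloring_clauses(3, triangle), 3 * COLORS)
k4_models = models(coloring_clauses(4, k4), 4 * COLORS)
```

The triangle has six models, one per way of giving its three vertices distinct colors, while the complete graph on four vertices has none.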
Class Assignment: Sudoku Use a SAT solver to solve Sudoku
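Modeling Example: Cardinality Constraints
Consider a set of variables $\{x_1, x_2, \ldots, x_n\}$:
- $\sum_{i=1}^{n} x_i \le 1$: for all $1 \le i < j \le n$: $(\neg x_i \vee \neg x_j)$. This gives $\binom{n}{2}$ clauses of size 2.
- $\sum_{i=1}^{n} x_i \le 2$: for all $1 \le i < j < k \le n$: $(\neg x_i \vee \neg x_j \vee \neg x_k)$. This gives $\binom{n}{3}$ clauses of size 3.
- ...
- $\sum_{i=1}^{n} x_i \le u$: for all $1 \le i_1 < i_2 < \ldots < i_{u+1} \le n$: $(\neg x_{i_1} \vee \neg x_{i_2} \vee \ldots \vee \neg x_{i_{u+1}})$. This gives $\binom{n}{u+1}$ clauses of size $u+1$.
- ...
- $\sum_{i=1}^{n} x_i \le n-1$: $(\neg x_1 \vee \neg x_2 \vee \ldots \vee \neg x_n)$. This gives $\binom{n}{n} = 1$ clause of size $n$.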
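Modeling Example: Cardinality Constraints
Consider a set of variables $\{x_1, x_2, \ldots, x_n\}$:
- $\sum_{i=1}^{n} x_i \ge n-1$: for all $1 \le i < j \le n$: $(x_i \vee x_j)$. This gives $\binom{n}{2}$ clauses of size 2.
- $\sum_{i=1}^{n} x_i \ge n-2$: for all $1 \le i < j < k \le n$: $(x_i \vee x_j \vee x_k)$. This gives $\binom{n}{3}$ clauses of size 3.
- ...
- $\sum_{i=1}^{n} x_i \ge n-u$: for all $1 \le i_1 < i_2 < \ldots < i_{u+1} \le n$: $(x_{i_1} \vee x_{i_2} \vee \ldots \vee x_{i_{u+1}})$. This gives $\binom{n}{u+1}$ clauses of size $u+1$.
- ...
- $\sum_{i=1}^{n} x_i \ge 1$: $(x_1 \vee x_2 \vee \ldots \vee x_n)$. This gives $\binom{n}{n} = 1$ clause of size $n$.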
Modeling Example: Equality Constraints
Consider a set of variables $\{x_1, x_2, \ldots, x_n\}$ and the constraint $\sum_{i=1}^{n} x_i = u$.
This is equivalent to $\sum_{i=1}^{n} x_i \le u \;\wedge\; \sum_{i=1}^{n} x_i \ge u$, and we just saw how to model each part.
This method has space problems (the formulas are too big) when the value of $u$ is not near the extremes (0 or $n$).
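The three naive encodings fit in a few lines using subsets. This is a sketch with illustrative function names, assuming variables are positive integers and clauses are lists of signed integers: "at most $u$" forbids every subset of $u+1$ variables from being all true, and "at least $u$" requires every subset of $n-u+1$ variables to contain a true one.

```python
from itertools import combinations, product
from math import comb


def at_most(variables, u):
    """sum <= u: no u+1 variables are simultaneously true."""
    return [[-v for v in subset] for subset in combinations(variables, u + 1)]


def at_least(variables, u):
    """sum >= u: every group of n-u+1 variables contains a true one."""
    return [list(subset)
            for subset in combinations(variables, len(variables) - u + 1)]


def exactly(variables, u):
    """sum = u, as the conjunction of both bounds."""
    return at_most(variables, u) + at_least(variables, u)


def models(clauses, n_vars):
    """Brute force: list every assignment that satisfies all clauses."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses)]


xs = [1, 2, 3, 4, 5]
exactly_two = models(exactly(xs, 2), 5)
middle_size = len(exactly(list(range(1, 21)), 10))   # n = 20, u = 10
```

For $n = 20$ and $u = 10$ the encoding already needs `2 * comb(20, 11)` = 335920 clauses, which illustrates the space problem for values of $u$ far from the extremes.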
Modeling Example: Cardinality Constraints with auxiliary variables
We want to model $\sum_{i=1}^{n} x_i = u$ with a small formula. We introduce new variables $y_{ij}$: $y_{ij}$ means that $x_i = 1$ and it is the j-th variable taking that value.
- For all $i = 1..n$: $x_i \leftrightarrow y_{i1} \vee y_{i2} \vee \ldots \vee y_{iu}$
- For all $i = 1..n$: $\sum_{j=1}^{u} y_{ij} \le 1$
- For all $j = 1..u$: $\sum_{i=1}^{n} y_{ij} = 1$
Auxiliary variables are very useful for modeling
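A possible clause-level rendering of this encoding is sketched below. The numbering of variables ($x_i$ as `i`, $y_{ij}$ as `n + (i-1)*u + j`) and the use of pairwise clauses for the small "at most one" and "exactly one" constraints are choices made here for illustration, not prescribed by the slide. The brute-force check confirms on a small case that the models, projected onto the $x_i$, are exactly the assignments with $u$ true variables.

```python
from itertools import combinations, product


def exactly_with_aux(n, u):
    """Clauses for sum(x_1..x_n) = u using auxiliary variables y_ij. Returns (clauses, n_vars)."""
    y = lambda i, j: n + (i - 1) * u + j          # i = 1..n, j = 1..u
    clauses = []
    for i in range(1, n + 1):
        row = [y(i, j) for j in range(1, u + 1)]
        clauses.append([-i] + row)                          # x_i -> y_i1 or ... or y_iu
        clauses += [[-lit, i] for lit in row]               # y_ij -> x_i
        clauses += [[-a, -b] for a, b in combinations(row, 2)]   # sum_j y_ij <= 1
    for j in range(1, u + 1):
        col = [y(i, j) for i in range(1, n + 1)]
        clauses.append(col)                                 # sum_i y_ij >= 1
        clauses += [[-a, -b] for a, b in combinations(col, 2)]   # sum_i y_ij <= 1
    return clauses, n + n * u


def models(clauses, n_vars):
    """Brute force: list every assignment that satisfies all clauses."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses)]


aux_clauses, aux_vars = exactly_with_aux(4, 2)
projected_x = {m[:4] for m in models(aux_clauses, aux_vars)}
```

Unlike the naive encoding, the number of clauses here grows polynomially in $n$ and $u$, at the price of $n \cdot u$ extra variables.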
Modeling Example: 8-queens problem
Modeling Example: 8-queens
Given an $8 \times 8$ chess board, place 8 queens in such a way that they do not attack each other.
Variables: $x_{ij}$ ($0 \le i,j < 8$)
Meaning of variables: $x_{ij}$ is true if there is a queen in cell $(i,j)$
Constraints:
- There are exactly 8 queens on the board: $\sum_{i,j} x_{ij} = 8$
- Conflicting pairs of cells do not both contain queens: for every pair of distinct cells $(i,j)$, $(i',j')$ that share a row, a column or a diagonal: $(\neg x_{ij} \vee \neg x_{i'j'})$
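The conflict clauses are easy to enumerate. The sketch below covers only that second constraint (the cardinality constraint would be added with one of the encodings seen earlier) and assumes the variable $x_{ij}$ is mapped to the integer `i * n + j + 1`.

```python
from itertools import combinations


def attacks(a, b):
    """Two distinct cells conflict if they share a row, a column or a diagonal."""
    (i, j), (i2, j2) = a, b
    return i == i2 or j == j2 or abs(i - i2) == abs(j - j2)


def conflict_clauses(n=8):
    x = lambda i, j: i * n + j + 1
    cells = [(i, j) for i in range(n) for j in range(n)]
    return [[-x(*a), -x(*b)]
            for a, b in combinations(cells, 2) if attacks(a, b)]


queens_conflicts = conflict_clauses(8)
```

For the $8 \times 8$ board this produces 728 binary clauses: 224 for rows, 224 for columns and 280 for the diagonals.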
Modeling Example: Clique Finding
Given an undirected graph $G = (V,E)$ (with $|V| = n$, $|E| = m$) and a natural number $k$, find in $G$ an embedded clique of size $k$.
Variables: $x_i$ ($0 \le i < n$)
Meaning of variables: $x_i$ is true if vertex $i$ is in the clique
Constraints:
- $k$ vertices are selected: $\sum_{i=0}^{n-1} x_i = k$
- If two vertices are selected, they must be connected in $G$: for all $i \ne i'$ with $(i,i') \notin E$: $(\neg x_i \vee \neg x_{i'})$
This model is only feasible when $k$ is near the limits (close to 0 or to $n$), because otherwise the cardinality constraint is too large.
Another Model for Clique Finding
Given an undirected graph $G = (V,E)$ (with $|V| = n$, $|E| = m$) and a natural number $k$, find in $G$ an embedded clique of size $k$.
Variables: $y_{ij}$ ($0 \le i < n$, $0 \le j < k$)
Meaning of variables: $y_{ij}$ is true if vertex $i$ is the j-th vertex of the clique
Constraints:
- There are $k$ selected vertices:
  - For $j = 0..k-1$: $(y_{0j} \vee y_{1j} \vee \ldots \vee y_{n-1,j})$
  - For $j = 0..k-1$, for $0 \le i < i' < n$: $(\neg y_{ij} \vee \neg y_{i'j})$
- Each vertex is selected at most once:
  - For $i = 0..n-1$, for $0 \le j < j' < k$: $(\neg y_{ij} \vee \neg y_{ij'})$
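A sketch of this positional model follows. The slide lists only the selection constraints, so the adjacency clauses in the last loop are an assumption added here, carried over from the requirement of the previous model that selected vertices be connected. The variable $y_{ij}$ is mapped to the integer `i * k + j + 1`, and brute force replaces the SAT solver for a tiny graph.

```python
from itertools import combinations, permutations, product


def clique_clauses(n, edges, k):
    y = lambda i, j: i * k + j + 1
    adjacent = {frozenset(e) for e in edges}
    clauses = []
    for j in range(k):
        # position j is filled by at least one vertex ...
        clauses.append([y(i, j) for i in range(n)])
        # ... and by at most one vertex
        for i, i2 in combinations(range(n), 2):
            clauses.append([-y(i, j), -y(i2, j)])
    for i in range(n):
        # vertex i occupies at most one position
        for j, j2 in combinations(range(k), 2):
            clauses.append([-y(i, j), -y(i, j2)])
    # ASSUMPTION (not on this slide): non-adjacent vertices cannot both be selected
    for i, i2 in combinations(range(n), 2):
        if frozenset((i, i2)) not in adjacent:
            for j, j2 in permutations(range(k), 2):
                clauses.append([-y(i, j), -y(i2, j2)])
    return clauses


def models(clauses, n_vars):
    """Brute force: list every assignment that satisfies all clauses."""
    return [bits for bits in product([False, True], repeat=n_vars)
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
                   for clause in clauses)]


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2
graph = [(0, 1), (1, 2), (0, 2), (2, 3)]
clique_models = models(clique_clauses(4, graph, 3), 4 * 3)
path_models = models(clique_clauses(4, [(0, 1), (1, 2), (2, 3)], 3), 4 * 3)
```

The first graph has six models, one per ordering of the clique {0, 1, 2}; the path has none. The six-fold repetition shows that this model distinguishes orderings of the same clique.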