UCLID: Deciding Combinations of Theories via Eager Translation to SAT. SAT-based Decision Procedures

Sanjit A. Seshia

SAT-based Decision Procedures

Eager encoding: Input formula -> satisfiability-preserving Boolean encoder -> Boolean formula -> SAT solver -> satisfiable / unsatisfiable.

Lazy encoding: Input formula -> approximate Boolean encoder -> Boolean formula -> SAT solver. If the SAT solver reports unsatisfiable, the input formula is unsatisfiable. Otherwise the satisfying assignment is passed to a first-order ground decision procedure: if that procedure reports satisfiable, the input formula is satisfiable; if it reports unsatisfiable, an additional clause is added to the Boolean formula and the SAT solver is run again.

UCLID Logic

Atomic predicates/formulas from the following theories:
- Equality & Uninterpreted Functions (EUF)
- Quantifier-Free Presburger Arithmetic (QFP), also called integer linear arithmetic
- Restricted lambda expressions (Rλ), for modeling arrays, memories, etc.
- Finite-precision bit-vector arithmetic [new]

Formulas are quantifier-free, arbitrary Boolean combinations of atomic formulas.

Modeling Arrays with λs

An array M is modeled as a function that maps an address a to the value M(a).

Array operation      UCLID expression
Select(M, a)         M(a)
Update(M, a, d)      λ i. ITE(i = a, d, M(i))
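The array-as-function view can be sketched directly with closures. This is only an illustration in Python, not UCLID syntax; the helper names `select` and `update` and the all-zero initial memory are assumptions made for the example.

```python
def select(memory, address):
    """Select(M, a) is just function application M(a)."""
    return memory(address)


def update(memory, address, data):
    """Update(M, a, d) is the function: lambda i. ITE(i = a, d, M(i))."""
    return lambda i: data if i == address else memory(i)


# Hypothetical initial memory: every address holds 0.
initial = lambda i: 0
m1 = update(initial, 3, 42)
m2 = update(m1, 5, 7)
```

Each update wraps the previous function instead of copying a table, which mirrors how the lambda expression nests inside the formula.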

UCLID Operation: Eager Encoding to SAT

Input formula -> lambda expansion -> λ-free formula -> function & predicate elimination -> linear arithmetic formula -> encoding of integer linear arithmetic -> Boolean formula -> Boolean satisfiability.

- Series of transformations leading to a Boolean formula
- Each step is validity (satisfiability) preserving
- Each step performs optimizations

Counterexample Generation

Maps a counterexample of the Boolean formula to a (partial) interpretation of the symbols in the input formula, walking back through the same stages:

Boolean satisfiability -> Boolean assignment -> (encoding of integer linear arithmetic) -> integer assignment -> (function & predicate elimination) -> partial interpretation of function symbols -> (lambda expansion) -> partial interpretation of lambdas -> concrete counterexample.

Talk Focus

Input formula -> lambda expansion -> λ-free formula -> function & predicate elimination -> linear arithmetic formula -> encoding of integer linear arithmetic -> Boolean formula -> Boolean satisfiability.

Eliminating Function Applications

Two applications of an uninterpreted function f in a formula: $f(x_1)$ and $f(x_2)$.

Ackermann's encoding:
- $f(x_1)$ is replaced by $vf_1$
- $f(x_2)$ is replaced by $vf_2$
- added constraint: $x_1 = x_2 \Rightarrow vf_1 = vf_2$

Bryant, German, Velev's encoding:
- $f(x_1)$ is replaced by $vf_1$
- $f(x_2)$ is replaced by $\mathrm{ITE}(x_1 = x_2,\ vf_1,\ vf_2)$
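The two elimination schemes can be compared on a list of argument terms. The sketch below builds the replacements as plain strings; the function names are hypothetical. The extension to more than two applications (pairwise constraints for Ackermann, nested ITEs for Bryant, German, Velev) is the usual generalization and is an assumption here, since the slide shows only the two-application case.

```python
def ackermann(args):
    """Replace the i-th application f(args[i]) by a fresh variable vf_i and
    return the fresh variables plus pairwise functional-consistency constraints."""
    fresh = [f"vf{i + 1}" for i in range(len(args))]
    constraints = [
        f"({args[i]} = {args[j]}) => ({fresh[i]} = {fresh[j]})"
        for i in range(len(args))
        for j in range(i + 1, len(args))
    ]
    return fresh, constraints


def bryant_german_velev(args):
    """Replace the j-th application by a nested ITE over earlier arguments,
    so that no separate consistency constraints are needed."""
    terms = []
    for j, arg in enumerate(args):
        term = f"vf{j + 1}"
        # Wrap from the most recent earlier application outward to the first.
        for i in reversed(range(j)):
            term = f"ITE({args[i]} = {arg}, vf{i + 1}, {term})"
        terms.append(term)
    return terms
```

For the arguments `["x1", "x2"]` the second scheme yields `vf1` and `ITE(x1 = x2, vf1, vf2)`, matching the two-application case above.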

Main Part: Encoding Linear Arithmetic using Finite Instantiation

Quantifier-Free Presburger Arithmetic (QFP)

$\varphi ::= a_1 x_1 + a_2 x_2 + \dots + a_n x_n \ge b \;\mid\; \varphi_1 \wedge \varphi_2 \;\mid\; \varphi_1 \vee \varphi_2 \;\mid\; \neg\varphi$

$a_i, x_i, b \in \mathbb{Z}$

- Also called integer linear arithmetic
- Decision problem: is $\varphi$ satisfiable?
- Only conjunctions: integer linear programs (ILP)

Deciding QFP is NP-complete

In NP: if a satisfying solution exists, then one exists within a bound d, and log d is polynomial in the input size ("small model property").

Expression for d [Papadimitriou, 82]:

$d = (n+m)\,(b_{\max}+1)\,(m\,a_{\max})^{2m+3}$

Input size:
- m: number of constraints
- n: number of variables
- $b_{\max}$: largest constant (absolute value)
- $a_{\max}$: largest coefficient (absolute value)

Finite Instantiation

Steps:
1. Calculate the solution bound d
2. Encode each integer variable with $\lceil \log d \rceil$ bits and translate to a Boolean formula
3. Run a SAT solver

Problem: for QFP, d is $\Omega(m^m)$, which means $\Omega(m \log m)$ bits per variable.
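The first two finite-instantiation steps are plain arithmetic, and a small sketch makes the growth in m visible. The function names are hypothetical, and rounding log d up to a whole number of bits is an assumption of this example.

```python
def papadimitriou_bound(n, m, b_max, a_max):
    """Solution bound d = (n + m) (b_max + 1) (m a_max)^(2m + 3)."""
    return (n + m) * (b_max + 1) * (m * a_max) ** (2 * m + 3)


def bits_per_variable(d):
    """ceil(log2(d)) computed exactly with integer arithmetic (d >= 1)."""
    return (d - 1).bit_length()


def bits_for_problem(n, m, b_max, a_max):
    """Bits needed to encode one integer variable under the bound above."""
    return bits_per_variable(papadimitriou_bound(n, m, b_max, a_max))
```

Calling `bits_for_problem` with a fixed n and increasing m shows the bit width growing faster than linearly in m, which is the problem the slide points out.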

Equality Logic

- Linear constraints are equalities $x_i = x_j$
- Result: d = n
- Reason: n variables $x_1, x_2, x_3, \dots, x_n$ need at most n distinct values

Difference Logic

- Also called separation logic
- Only difference-bound constraints: $x_i \ge x_j + b$, $\pm x_i \ge b$
- Result: $d = n\,(b_{\max}+1)$ [Bryant, Lahiri, Seshia, CAV 02]
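The d = n result for equality logic can be checked by brute force on tiny instances. The sketch below handles only conjunctions of equalities and disequalities over variable indices, and it enumerates assignments instead of calling a SAT solver; both are simplifying assumptions, and the function name is hypothetical.

```python
from itertools import product


def satisfiable_over_domain(num_vars, equalities, disequalities, domain_size):
    """Search assignments with values in range(domain_size).

    equalities / disequalities are lists of index pairs (i, j).
    Returns the first satisfying assignment found, or None.
    """
    for values in product(range(domain_size), repeat=num_vars):
        if all(values[i] == values[j] for i, j in equalities) and all(
            values[i] != values[j] for i, j in disequalities
        ):
            return values
    return None
```

Three pairwise-distinct variables are satisfiable with a domain of size 3 but not with size 2, so the domain cannot in general be smaller than n.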

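Constraint Graph

- Directed multigraph with edges labeled by constants
- Vertices correspond to variables
- Edges correspond to constraints that are true in the current assignment: $x \ge y + c_1$ gives an edge between x and y labeled $c_1$
- Zero variable $x_0$ (thus $n \to n+1$): $x_i \ge b$ becomes $x_i \ge x_0 + b$
- Assume the original formula is in negation normal form; the new $b_{\max}$ is $b_{\max}+1$

[Figure: example graph on vertices x, y, z with edge labels -5, 3, 2, 0.]

Bounding Paths

- Formula satisfiable implies there is an assignment that yields a graph without positive cycles
- Example [figure]: x = 10, y = 6, z = 6, on the graph with edge labels -5, 3, 2, 0
- Maximum spread in variable values = length of the longest path (that repeats no vertices) $\le n \cdot b_{\max}$, which becomes $n\,(b_{\max}+1)$ with the adjusted $b_{\max}$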
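The link between positive cycles and unsatisfiability can be sketched for a plain conjunction of difference constraints. The code uses Bellman-Ford style relaxation of longest paths, a standard technique swapped in for illustration and not taken from the talk; the function name and the triple format `(x, y, c)` for $x \ge y + c$ are assumptions.

```python
def solve_difference_constraints(variables, constraints):
    """constraints: list of (x, y, c), each meaning x >= y + c.

    Returns a satisfying assignment (dict) if the constraint graph has no
    positive cycle, otherwise None.
    """
    value = {v: 0 for v in variables}
    # Without a positive cycle, values settle within len(variables) passes.
    for _ in range(len(variables)):
        changed = False
        for x, y, c in constraints:
            if value[y] + c > value[x]:
                value[x] = value[y] + c  # raise x just enough to satisfy x >= y + c
                changed = True
        if not changed:
            return value
    return None  # still changing after n passes: a positive cycle exists
```

When an assignment is returned, every value lies between 0 and the length of the longest path in the graph, which is the quantity the bounding-paths argument limits.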

General Case [Seshia & Bryant, LICS 04]

- New parameterized solution bound d
- Parameters characterize sparse structure
- This structure occurs in software verification

Linear Constraints in Software Verification

Characteristics:
- Mostly difference constraints
- Non-difference constraints are sparse

Project                     Max fraction of non-difference    Max width
Blast [UC Berkeley]         0.0276                            6
Magic [CMU]                 0.0032                            2
Upgrade checking [MIT]      0.0087                            3
WiSA [U. Wisc.]             0.0054                            4

Some similar observations: Pratt 77, ESC/Java-Simplify-TR 03.

Parameterized Solution Bound

New parameters: k non-difference constraints, w variables per constraint (width).

Our solution bound: $(n+2)\cdot n\cdot(b_{\max}+1)\cdot(w\,a_{\max})^{k}$

Previous: $(n+m)\cdot(b_{\max}+1)\cdot(m\,a_{\max})^{2m+3}$

Symbols: m = number of constraints, n = number of variables, $b_{\max}$ = max constant, $a_{\max}$ = max coefficient.

Direct dependence on m eliminated (and $k \ll m$).

Proof of Our Bound: Steps

1. Previous result for integer linear programming (ILP) by Borosh-Treybig-Flahive [76, 86]
2. Express the above result in k and w, in addition to the other parameters
3. Derive the QFP bound from the ILP bound
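A quick numeric comparison shows what removing the direct dependence on m buys in bits per variable. The two formulas are the ones stated on the slide; the function names and any sample values passed to them are hypothetical and are not taken from the benchmarks.

```python
def previous_bound(n, m, b_max, a_max):
    """(n + m) (b_max + 1) (m a_max)^(2m + 3)."""
    return (n + m) * (b_max + 1) * (m * a_max) ** (2 * m + 3)


def parameterized_bound(n, b_max, a_max, k, w):
    """(n + 2) n (b_max + 1) (w a_max)^k."""
    return (n + 2) * n * (b_max + 1) * (w * a_max) ** k


def bits(d):
    """ceil(log2(d)) for d >= 1, using exact integer arithmetic."""
    return (d - 1).bit_length()


def compare(n, m, b_max, a_max, k, w):
    """Return (bits under previous bound, bits under parameterized bound)."""
    return bits(previous_bound(n, m, b_max, a_max)), bits(
        parameterized_bound(n, b_max, a_max, k, w)
    )
```

With k held small, increasing m leaves the second number unchanged while the first keeps growing.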

Integer Linear Programming (ILP): Notation

$A x = b,\quad x \ge 0$, where A is an $m \times n$ matrix

$a_{\max} = \max_{i,j} |a_{ij}|$

$b_{\max} = \max_i |b_i|$

Borosh-Treybig-Flahive Result [1986]

Solution bound d is $(n+2)\,\Delta$, where $\Delta$ = largest sub-determinant of $[A \mid b]$ (absolute value).

Problem: exponentially many sub-determinants!

Matrix Structure

The matrix has m rows and n columns. The k rows for non-difference constraints have w non-zeroes per row.

k = 0: Only Difference Constraints

- $x_i - x_j \ge b$, $\pm x_i \ge b$
- Totally unimodular: all subdeterminants are in {0, -1, +1}
- $\Delta \le \sum_i |b_i| \le \min(n+1, m)\, b_{\max}$

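Arbitrary k

Expanding a sub-determinant over the k non-difference rows (w non-zeroes each):
- Each term $\le a_{\max}^{k}$
- Number of terms $\le w^{k}$
- Remaining determinant $\in \{0, \pm 1\}$

$\Delta \le \sum_i |b_i|\,(a_{\max}^{k} w^{k}) \le \min(n+1, m)\, b_{\max}\,(a_{\max}^{k} w^{k})$

Bound for ILP

$d = (n+2)\,\Delta$ [Borosh-Treybig-Flahive]

$\phantom{d} \le (n+2)\,\min(n+1, m)\, b_{\max}\,(a_{\max}^{k} w^{k})$

$\phantom{d} \le (n+2)\, n\, b_{\max}\,(a_{\max}^{k} w^{k})$ (assuming $m \le n$, so that $\min(n+1, m) \le n$)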
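The sub-determinant argument can be checked by brute force on tiny matrices. The sketch below enumerates every square sub-matrix, which is exactly the exponential enumeration the bound is designed to avoid, so it is only usable for toy sizes. The function names and the sample matrices in the usage note are made up for illustration.

```python
from itertools import combinations


def det(matrix):
    """Integer determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def max_abs_subdeterminant(matrix):
    """Largest |det| over all square sub-matrices (exponentially many)."""
    rows, cols = len(matrix), len(matrix[0])
    best = 0
    for size in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), size):
            for c in combinations(range(cols), size):
                sub = [[matrix[i][j] for j in c] for i in r]
                best = max(best, abs(det(sub)))
    return best
```

For a coefficient matrix with only difference rows, such as `[[1, -1, 0], [0, 1, -1], [-1, 0, 1], [1, 0, 0]]`, the result is 1. Replacing one row by a non-difference row such as `[2, 3, 0]` (k = 1, w = 2, a_max = 3) gives 5 for the matrix `[[1, -1, 0], [2, 3, 0], [0, 1, -1]]`, which stays below $a_{\max}^{k} w^{k} = 6$.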

QFP Bound from ILP Bound

- Consider the DNF of an arbitrary QFP formula $\varphi$: $\varphi = \varphi_1 \vee \varphi_2 \vee \dots \vee \varphi_N$
- A satisfying assignment to $\varphi$ must satisfy some $\varphi_i$
- Each $\varphi_i$ is an ILP
- Parameters of $\varphi_i$ are bounded by those of $\varphi$
- Therefore: $d = (n+2)\, n\, b_{\max}\,(a_{\max}^{k} w^{k})$, where $b_{\max}$ is taken as $b_{\max}+1$ once negated constraints are accounted for

Other Main Approaches to Solving QFP

- ILP + SAT [Pugh, SC 92; Berezin et al., TACAS 03]
  - Worst-case exponential number of ILPs to solve
- Automata-based methods [Boigelot, Wolper et al., SAS 95]
  - Accepted words = binary encoded solutions
  - Exponential-sized automata

Experimental Comparison: Setup

- Our decision procedure: UCLID
  - Uses the zchaff SAT solver [Zhang et al., Princeton]
- Compared against:
  - ILP+SAT: CVC-Lite [Barrett, Berezin, et al., Stanford]; also uses zchaff
  - Automata-based: LASH [Boigelot et al., U. Liège]
- Benchmarks
  - From the software verification projects cited earlier
  - $n \approx 100$, $m > 1000$
  - Several thousand Boolean operators

Experimental Results (3600)

[Figure: plot comparing UCLID and CVC-Lite, with regions labeled "UCLID faster" and "CVC-Lite faster".]

Experimental Results (3600)

LASH (automata-based solver) timed out on all benchmarks.

Newer Results

- Generalized 2SAT constraints: $x_i + x_j \ge b$, $-x_i - x_j \ge b$, $x_i - x_j \ge b$, $x_i \ge b$
  - Solution bound goes from $O(n^2\,(b_{\max}+1)\,2^m)$ to $2n\,(b_{\max}+1)$ [Seshia, Subramani, Bryant, 04]
- Lazy encoding to SAT [Kroening, Ouaknine, Seshia, Strichman, CAV 04]
  - Instead of a conservative d, start with a smaller d and increase it on demand
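The "start small, increase on demand" idea can be illustrated with a toy search loop. This is only a loose sketch and not the method of the cited CAV 04 paper: brute-force enumeration stands in for the SAT solver, the symmetric range [-d, d] and the doubling schedule are assumptions, and the function name is hypothetical.

```python
from itertools import product


def search_with_growing_bound(predicate, num_vars, max_bound, start=1):
    """Try small value ranges first, doubling d until max_bound is reached.

    predicate: function of num_vars integers returning True/False.
    Returns (assignment, d_used), or (None, max_bound) if nothing is found
    within the conservative bound.
    """
    d = min(start, max_bound)
    while True:
        for values in product(range(-d, d + 1), repeat=num_vars):
            if predicate(*values):
                return values, d
        if d >= max_bound:
            return None, d
        d = min(2 * d, max_bound)
```

For the constraint pair $x - y \ge 3$, $y \ge 1$ the loop fails at d = 1 and d = 2 and succeeds at d = 4, well before a conservative bound such as 16 is reached.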

Summary of d Values

Logic                               d
Equality logic                      $n$
Separation logic                    $n\,(b_{\max}+1)$
Generalized 2SAT logic              $2n\,(b_{\max}+1)$
Quantifier-free Presburger logic    $(n+2)\, n\,(b_{\max}+1)\,(a_{\max}^{k} w^{k})$

UCLID Decision Procedure Features

- Eager translation to a Boolean formula
  - Can plug in the champion SAT solver
  - E.g., 100x speedup going from zchaff to Siege
- Novel translation schemes
  - Positive equality (uninterpreted functions)
  - Finite instantiation
- Terms interpreted over integers
- Not proof-generating, yet
- Produces concrete counterexamples

Other Work on UCLID

Encoding methods for difference logic:
- Eager explication of transitivity axioms [Strichman, Seshia, Bryant, CAV 02]
- Hybrid encoding [Seshia, Lahiri, Bryant, DAC 03]

Verification:
- Deductive verification [Lahiri, Seshia, Bryant, FMCAD 02; Lahiri & Bryant, CAV 03]
- Predicate abstraction [Lahiri, Bryant, Cook, CAV 03]