Automated Program Verification and Testing 15414/15614 Fall 2016 Lecture 7: Procedures for First-Order Theories, Part 1

Matt Fredrikson
mfredrik@cs.cmu.edu
October 17, 2016

First-Order Theories

A first-order theory $T$ is defined by:
  Its signature $\Sigma_T$, a set of non-logical symbols
  Its axioms $A_T$, a set of closed formulas over $\Sigma_T$

$\Sigma_T$-formula: a $\Sigma_T$-formula contains only non-logical symbols from $\Sigma_T$, as well as variables and logical connectives.

Basic idea: first-order theories define a limited vocabulary for talking about a subject of interest. The theory's axioms define the intended meaning of that vocabulary.

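Theory of Equality

Signature: $\Sigma_E : \{=, a, b, c, \ldots, f, g, h, \ldots, p, q, r, \ldots\}$

Axioms:
1. Reflexivity: $\forall x.\ x = x$
2. Symmetry: $\forall x, y.\ x = y \rightarrow y = x$
3. Transitivity: $\forall x, y, z.\ x = y \land y = z \rightarrow x = z$
4. Function congruence: $\forall \bar{x}, \bar{y}.\ \left(\bigwedge_{i=1}^{n} x_i = y_i\right) \rightarrow f(\bar{x}) = f(\bar{y})$
5. Predicate congruence: $\forall \bar{x}, \bar{y}.\ \left(\bigwedge_{i=1}^{n} x_i = y_i\right) \rightarrow (p(\bar{x}) \leftrightarrow p(\bar{y}))$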

Theory of Equality and Uninterpreted Functions

We will make things simpler by removing predicate symbols.

Signature: $\Sigma_E : \{=, a, b, c, \ldots, f, g, h, \ldots\}$

Axioms:
1. Reflexivity: $\forall x.\ x = x$
2. Symmetry: $\forall x, y.\ x = y \rightarrow y = x$
3. Transitivity: $\forall x, y, z.\ x = y \land y = z \rightarrow x = z$
4. Function congruence: $\forall \bar{x}, \bar{y}.\ \left(\bigwedge_{i=1}^{n} x_i = y_i\right) \rightarrow f(\bar{x}) = f(\bar{y})$

This is the Theory of Equality and Uninterpreted Functions (EUF).

Does this restrict the theory?

Removing Predicates from $T_E$

We can remove predicate symbols entirely:
1. For each predicate $p$, introduce a fresh function symbol $f_p$
2. Introduce a fresh constant $\bullet$
3. Replace each instance $p(t_1, \ldots, t_n)$ with $f_p(t_1, \ldots, t_n) = \bullet$

Basic idea: we can define $f_p(\ldots) = \bullet$ whenever $p(\ldots)$ is true.

Example: $x = y \rightarrow (p(x) \leftrightarrow p(y))$ becomes:
  $x = y \rightarrow ((f_p(x) = \bullet) \leftrightarrow (f_p(y) = \bullet))$

Example: $p(x) \land q(x, y) \land q(y, z) \rightarrow \neg q(x, z)$ becomes:
  $(f_p(x) = \bullet \land f_q(x, y) = \bullet \land f_q(y, z) = \bullet) \rightarrow f_q(x, z) \neq \bullet$
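The three rewriting steps above can be sketched as a small recursive function. This is only an illustration under assumed conventions: formulas are nested tuples, predicate atoms are tagged `"pred"`, and the names `BULLET`, `"app"`, and the `f_` prefix are hypothetical choices that are assumed not to clash with symbols already in the formula.

```python
# Hypothetical encoding (not from the lecture):
#   terms:    "x"  or  ("app", "f", (t1, ..., tn))
#   formulas: ("=", s, t), ("pred", "p", (t1, ..., tn)),
#             ("not", F), ("and", F, G), ("implies", F, G), ("iff", F, G)
BULLET = "bullet"  # assumed name for the fresh constant


def eliminate_predicates(formula):
    """Replace every atom p(t1..tn) by the equation f_p(t1..tn) = BULLET."""
    op = formula[0]
    if op == "pred":
        _, name, args = formula
        return ("=", ("app", "f_" + name, args), BULLET)
    if op == "=":
        # Equations contain only terms, so there is nothing to rewrite.
        return formula
    # Logical connective: rewrite each subformula.
    return (op,) + tuple(eliminate_predicates(sub) for sub in formula[1:])


# x = y -> (p(x) <-> p(y))
example = ("implies", ("=", "x", "y"),
           ("iff", ("pred", "p", ("x",)), ("pred", "p", ("y",))))
rewritten = eliminate_predicates(example)
```

After the call, `rewritten` contains no `"pred"` atoms, so it is a formula over equality and function symbols only.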

Deciding $T_E$

Today we'll discuss an algorithm for deciding $T_E$. It is called the Congruence Closure Algorithm.

Recall: a binary relation $R$ over a set $S$ is an equivalence relation when:
  It is reflexive: $\forall s \in S.\ sRs$
  It is symmetric: $\forall s_1, s_2 \in S.\ s_1 R s_2 \rightarrow s_2 R s_1$
  It is transitive: $\forall s_1, s_2, s_3 \in S.\ s_1 R s_2 \land s_2 R s_3 \rightarrow s_1 R s_3$

If it also obeys congruence, then it is a congruence relation:
  Function congruence: $\forall \bar{s}, \bar{t} \in S^n.\ \left(\bigwedge_{i=1}^{n} s_i R t_i\right) \rightarrow f(\bar{s}) R f(\bar{t})$

I.e., evaluation of terms related by $R$ yields results related by $R$.

Classes

Let $R$ be an equivalence relation over $S$.

The equivalence class of $s \in S$ under $R$ is:
  $[s]_R \stackrel{\text{def}}{=} \{s' \in S : sRs'\}$

Every member of $S$ belongs to an equivalence class of $R$.

If $R$ is a congruence relation, then $[s]_R$ is the congruence class of $s$.

Consider the relation $\equiv_2$ over $\mathbb{Z}$, where $a \equiv_2 b$ iff $(a \bmod 2) = (b \bmod 2)$.

The equivalence class of 4 under $\equiv_2$ is:
  $[4]_{\equiv_2} = \{n \in \mathbb{Z} : (n \bmod 2) = 0\} = \{n \in \mathbb{Z} : n \text{ is even}\}$

Refinements

We can view a relation $R$ over $S$ as a set of pairs, i.e., $\hat{R} \subseteq S \times S$.

The set $\hat{R}$ determined by $R$ is:
  $\hat{R} \stackrel{\text{def}}{=} \{(s_1, s_2) \in S \times S \mid s_1 R s_2\}$

Given two relations $R_1$ and $R_2$ over $S$, we say $R_1$ refines $R_2$ if:
  $\hat{R}_1 \subseteq \hat{R}_2$

Notationally, we write $R_1 \prec R_2$, and can also define it as:
  $R_1 \prec R_2$ iff $\forall s_1, s_2 \in S.\ s_1 R_1 s_2 \rightarrow s_1 R_2 s_2$

Refinement Examples

Consider the relations:
  $R_1 : \{s R_1 s : s \in S\}$
  $R_2 : \{s_1 R_2 s_2 : s_1, s_2 \in S\}$

Does $R_1 \prec R_2$?

Recall the relation:
  $\equiv_n : \{a \equiv_n b : (a \bmod n) = (b \bmod n)\}$

Does $\equiv_2 \prec \equiv_4$? What about $\equiv_4 \prec \equiv_2$?
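One way to explore these questions is to materialize each relation as a set of pairs over a small finite window of integers and test set inclusion. The sketch below does that; the window `range(-8, 9)` and the helper names are arbitrary choices for illustration, and a check on a finite window is evidence rather than a proof about all of $\mathbb{Z}$.

```python
def relation_pairs(universe, related):
    """Materialize a relation over a finite universe as a set of pairs."""
    return {(a, b) for a in universe for b in universe if related(a, b)}


def refines(r1, r2):
    """R1 refines R2 when every pair of R1 is also a pair of R2."""
    return r1 <= r2


universe = range(-8, 9)  # assumed finite window onto the integers
identity = relation_pairs(universe, lambda a, b: a == b)   # R1 above
total = relation_pairs(universe, lambda a, b: True)        # R2 above
mod2 = relation_pairs(universe, lambda a, b: a % 2 == b % 2)
mod4 = relation_pairs(universe, lambda a, b: a % 4 == b % 4)

identity_refines_total = refines(identity, total)
mod4_refines_mod2 = refines(mod4, mod2)
mod2_refines_mod4 = refines(mod2, mod4)
```

On this window the pair `(0, 2)` is in `mod2` but not in `mod4`, which is why the inclusion only holds in one direction.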

Equivalence Closure

The equivalence closure $R^E$ of a relation $R$ over $S$ is the equivalence relation such that:
  $R$ refines $R^E$: $R \prec R^E$
  For all other equivalence relations $R'$ where $R \prec R'$, either:
    1. $R' = R^E$
    2. $R^E \prec R'$

$R^E$ is the smallest equivalence relation that includes $R$.

Equivalence Closure: Example

Suppose $S = \{a, b, c, d\}$, and $R$ is a relation where $aRb$, $bRc$, $dRd$.

To find $R^E$, think in terms of the definitions:
  $R \prec R^E$: $aRb, bRc, dRd \in R^E$
  Reflexivity: $aRa, bRb, cRc \in R^E$
  Symmetry: $bRa, cRb \in R^E$
  Transitivity: $aRc \in R^E$

We have to keep repeating until there are no more updates:
  Symmetry: $cRa \in R^E$

$R^E = \{aRb, bRa, aRa, bRb, bRc, aRc, cRa, cRb, cRc, dRd\}$
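The "keep repeating until there are no more updates" step is a fixpoint computation. A minimal sketch, assuming the relation is given as a set of pairs over a finite set (the function name is illustrative, and this naive loop is not meant to be efficient):

```python
def equivalence_closure(universe, pairs):
    """Smallest equivalence relation over `universe` that contains `pairs`."""
    closure = set(pairs)
    # Reflexivity: relate every element to itself.
    closure |= {(s, s) for s in universe}
    changed = True
    while changed:
        changed = False
        # Symmetry: add the reverse of every pair.
        new = {(b, a) for (a, b) in closure}
        # Transitivity: chain pairs that share a middle element.
        new |= {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if not new <= closure:
            closure |= new
            changed = True
    return closure


R = {("a", "b"), ("b", "c"), ("d", "d")}
RE = equivalence_closure("abcd", R)
```

For the example above, `RE` relates every pair drawn from `a`, `b`, `c`, plus `d` with itself.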

Congruence Closures

Define the congruence closure $R^C$ of $R$ similarly:
  $R^C$ is a congruence relation, and $R \prec R^C$
  For all other congruence relations $R'$ where $R \prec R'$, either:
    1. $R' = R^C$
    2. $R^C \prec R'$

Main idea: given a $T_E$-formula $F$,
  $F : s_1 = t_1 \land \cdots \land s_m = t_m \land s_{m+1} \neq t_{m+1} \land \cdots \land s_n \neq t_n$

$F$ is $T_E$-satisfiable iff there exists a congruence relation $\sim$ where:
  for each $i \in \{1, \ldots, m\}$, $s_i \sim t_i$
  for each $i \in \{m + 1, \ldots, n\}$, $s_i \not\sim t_i$

Note: we'll only work with conjunctions. Why isn't this a problem?

Congruence Closure Algorithm

More precisely, $\sim$ is a relation over the set of subterms $S_F$ of $F$.

We want to decide the satisfiability of:
  $F : s_1 = t_1 \land \cdots \land s_m = t_m \land s_{m+1} \neq t_{m+1} \land \cdots \land s_n \neq t_n$

The algorithm works as follows:
  Construct the congruence closure $\sim$ of $\{s_1 = t_1, \ldots, s_m = t_m\}$
  If $s_i \sim t_i$ for any $i \in \{m + 1, \ldots, n\}$, then return unsat
  Otherwise, return sat

Congruence Closure Algorithm

$F : s_1 = t_1 \land \cdots \land s_m = t_m \land s_{m+1} \neq t_{m+1} \land \cdots \land s_n \neq t_n$

Given that $\sim$ satisfies:
  for each $i \in \{1, \ldots, m\}$, $s_i \sim t_i$
  for each $i \in \{m + 1, \ldots, n\}$, $s_i \not\sim t_i$

We construct a $T_E$-interpretation that satisfies $F$:
  $D$ consists of the congruence classes of $\sim$
  $I$ assigns elements of $D$ to terms of $S_F$ so as to respect $\sim$
  $I$ assigns $=$ a relation that behaves like $\sim$

Example

$F : f(a, b) = a \land f(f(a, b), b) \neq a$

1. Build the subterm set $S_F$:
   $S_F = \{a, b, f(a, b), f(f(a, b), b)\}$
2. Construct the finest congruence relation on $S_F$:
   $\{\{a\}, \{b\}, \{f(a, b)\}, \{f(f(a, b), b)\}\}$
3. For each $i \in \{1, \ldots, m\}$, impose $s_i = t_i$ by merging:
   $\{\{a, f(a, b)\}, \{b\}, \{f(f(a, b), b)\}\}$
4. After each merge, apply the axioms to propagate

Example

$F : f(a, b) = a \land f(f(a, b), b) \neq a$

1. We left off with:
   $\{\{a, f(a, b)\}, \{b\}, \{f(f(a, b), b)\}\}$
2. We can apply function congruence using $f(a, b) \sim a$, $b \sim b$:
   $\{\{a, f(a, b), f(f(a, b), b)\}, \{b\}\}$
3. This is the congruence closure over $S_F$
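The merge-then-propagate steps of this example can be reproduced with a deliberately naive sketch that stores the partition directly and re-scans all pairs of function applications until nothing changes. It assumes terms are encoded as strings (constants) or tuples `(symbol, arg1, ..., argn)`; the function names are illustrative, and the lecture's efficient DAG-based version is a different implementation of the same idea.

```python
def subterms(term):
    """All subterms of a term; constants are strings, applications are tuples."""
    result = {term}
    if isinstance(term, tuple):
        for arg in term[1:]:
            result |= subterms(arg)
    return result


def congruence_closure(terms, equalities):
    """Partition `terms` into the congruence classes induced by `equalities`."""
    class_of = {t: frozenset([t]) for t in terms}  # finest partition

    def merge(s, t):
        merged = class_of[s] | class_of[t]
        for u in merged:
            class_of[u] = merged

    for s, t in equalities:
        merge(s, t)
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:  # propagate function congruence to a fixpoint
        changed = False
        for s in apps:
            for t in apps:
                same_shape = s[0] == t[0] and len(s) == len(t)
                if (class_of[s] != class_of[t] and same_shape
                        and all(class_of[x] == class_of[y]
                                for x, y in zip(s[1:], t[1:]))):
                    merge(s, t)
                    changed = True
    return set(class_of.values())


fab = ("f", "a", "b")
ffab = ("f", fab, "b")
S_F = subterms(fab) | subterms(ffab)
classes = congruence_closure(S_F, [(fab, "a")])
```

Here `classes` holds two classes, one containing `a`, `f(a, b)`, and `f(f(a, b), b)`, and one containing only `b`.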

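Example

Given $\{\{a, f(a, b), f(f(a, b), b)\}, \{b\}\}$, we construct an interpretation with one domain element per class, written here as $d_1$ and $d_2$:
  $D = \{d_1, d_2\}$
  $I[a] = d_1$, $I[f(a, b)] = d_1$, $I[f(f(a, b), b)] = d_1$
  $I[b] = d_2$
  $I[=] : \{(d_1, d_1), (d_2, d_2)\}$

Does $\{\{a, f(a, b), f(f(a, b), b)\}, \{b\}\} \models F$? No: $f(f(a, b), b)$ and $a$ are in the same class, which contradicts the literal $f(f(a, b), b) \neq a$.

Therefore, this formula is unsat.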

Example

$F : f(f(f(a))) = a \land f(f(f(f(f(a))))) = a \land f(a) \neq a$

1. Build the subterm set $S_F$:
   $S_F = \{a, f(a), f^2(a), f^3(a), f^4(a), f^5(a)\}$
2. Construct the initial congruence relation on $S_F$:
   $\{\{a\}, \{f(a)\}, \{f^2(a)\}, \{f^3(a)\}, \{f^4(a)\}, \{f^5(a)\}\}$
3. From $f^3(a) = a$, merge $\{f^3(a)\}$ and $\{a\}$:
   $\{\{a, f^3(a)\}, \{f(a)\}, \{f^2(a)\}, \{f^4(a)\}, \{f^5(a)\}\}$
4. From $f^3(a) \sim a$, propagate $f^4(a) \sim f(a)$:
   $\{\{a, f^3(a)\}, \{f(a), f^4(a)\}, \{f^2(a)\}, \{f^5(a)\}\}$
5. From $f^4(a) \sim f(a)$, propagate $f^5(a) \sim f^2(a)$:
   $\{\{a, f^3(a)\}, \{f(a), f^4(a)\}, \{f^2(a), f^5(a)\}\}$

Example

$F : f(f(f(a))) = a \land f(f(f(f(f(a))))) = a \land f(a) \neq a$

1. We're at:
   $\{\{a, f^3(a)\}, \{f(a), f^4(a)\}, \{f^2(a), f^5(a)\}\}$
2. From $f^5(a) = a$, merge $\{f^2(a), f^5(a)\}$ and $\{a, f^3(a)\}$:
   $\{\{a, f^2(a), f^3(a), f^5(a)\}, \{f(a), f^4(a)\}\}$
3. From $f^3(a) \sim f^2(a)$, propagate $f^4(a) \sim f^3(a)$:
   $\{\{a, f(a), f^2(a), f^3(a), f^4(a), f^5(a)\}\}$
4. This is the congruence closure over $S_F$

$\{\{a, f(a), f^2(a), f^3(a), f^4(a), f^5(a)\}\} \not\models F$, so unsat.

Computing Congruence Closures

The Union-Find algorithm efficiently computes congruence closures.

First step: represent the subterm set $S_F$ as a DAG. For each $t \in S_F$ there is a node:
  Each node has a unique id
  Each node stores the function or constant symbol it represents
  Directed edges go from a function node to its arguments

[Figure: example DAG]

What term does this graph represent?

Computing Congruence Closures

To support merging, each node also tracks its equivalence class.

This is done by maintaining a unique representative node for each class:
  Each node keeps a pointer to another node in its class
  The representative points to itself
  To find the representative for a given node, we follow these pointers transitively

[Figure: example DAG with class pointers]

What are the congruence classes in this graph? What are the representatives?

Computing Congruence Closures

When merging classes, we need to propagate congruences.

This requires tracking the parents of subterms in the congruence class.

We track all parents in the representative node.

  type Id = int
  datatype Node = Node(
    id: Id,           // unique id
    fn: string,       // symbol
    args: seq<Id>,    // arg pointers
    find: Id,         // class pointer
    ccpar: set<Id>    // parent set
  )

Union Find: Basic Operations

find(i): returns the representative of node i's congruence class by following find fields
  When a node's find field points to itself, it is the representative

union(i1, i2): union of the classes of i1 and i2
  First, find the class representatives for nodes i1 and i2
  Make one of them the representative by setting the other's find field to it
  Update the parents of the new representative by adding the other's parents
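The two operations can be sketched in Python over a dictionary of nodes. The `Node` class mirrors the fields of the datatype from the previous slide; the dictionary layout, the choice of which representative survives, and clearing the old representative's parent set are assumptions of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int                                   # unique id
    fn: str                                   # symbol
    args: list                                # ids of argument nodes
    find: int                                 # class pointer
    ccpar: set = field(default_factory=set)   # parents (kept at the representative)


def find(nodes, i):
    """Follow find pointers until reaching a node that points to itself."""
    while nodes[i].find != i:
        i = nodes[i].find
    return i


def union(nodes, i1, i2):
    """Merge the classes of i1 and i2; the representative of i2 survives."""
    r1, r2 = find(nodes, i1), find(nodes, i2)
    if r1 == r2:
        return
    nodes[r1].find = r2
    nodes[r2].ccpar |= nodes[r1].ccpar  # new representative inherits parents
    nodes[r1].ccpar = set()


# DAG for f(f(a)): node 0 is a, node 1 is f(a), node 2 is f(f(a)).
nodes = {
    0: Node(0, "a", [], 0, {1}),
    1: Node(1, "f", [0], 1, {2}),
    2: Node(2, "f", [1], 2),
}
union(nodes, 0, 1)
```

After `union(nodes, 0, 1)`, both `find(nodes, 0)` and `find(nodes, 1)` return node 1, whose parent set now holds the parents of both original classes.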

Merging Congruence Classes

For each equality $s_i = t_i$ in $F$, we need to merge classes.

First we can simply call union($s_i$, $t_i$).

After this, we also need to propagate the new congruence.

Look at pairs of parents $(p_1, p_2)$ from the respective classes of $s_i$ and $t_i$. If:
1. $p_1$ and $p_2$ share the same function symbol,
2. have the same arity,
3. and all their children are in the same classes, pairwise,
then we recursively merge the classes of $p_1$ and $p_2$.

Decision Procedure for $T_E$-Satisfiability

Given a $T_E$-formula
  $F : s_1 = t_1 \land \cdots \land s_m = t_m \land s_{m+1} \neq t_{m+1} \land \cdots \land s_n \neq t_n$
with subterm set $S_F$:

1. Construct the DAG for $S_F$
2. For $i \in \{1, \ldots, m\}$, merge $s_i$ and $t_i$
3. If find($s_i$) = find($t_i$) for any $i \in \{m + 1, \ldots, n\}$, then unsat
4. If find($s_i$) $\neq$ find($t_i$) for all $i \in \{m + 1, \ldots, n\}$, then sat
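The four steps can be put together in one self-contained sketch. It assumes terms are encoded as strings (constants) or tuples `(symbol, arg1, ..., argn)`, stores node fields in parallel lists instead of a `Node` datatype, and collects both parent sets before calling union so that they are not lost when the classes are joined. The class and function names are illustrative, not the lecture's code.

```python
class CongruenceClosure:
    def __init__(self):
        self.ids = {}     # term -> node id
        self.fn = []      # symbol of each node
        self.args = []    # argument ids of each node
        self.rep = []     # find pointer of each node
        self.ccpar = []   # parent sets, meaningful at representatives

    def add(self, term):
        """Add a term and its subterms to the DAG (call before any merge)."""
        if term in self.ids:
            return self.ids[term]
        symbol, subs = (term, ()) if isinstance(term, str) else (term[0], term[1:])
        arg_ids = [self.add(t) for t in subs]
        i = len(self.fn)
        self.ids[term] = i
        self.fn.append(symbol)
        self.args.append(arg_ids)
        self.rep.append(i)
        self.ccpar.append(set())
        for a in arg_ids:
            self.ccpar[self.find(a)].add(i)
        return i

    def find(self, i):
        while self.rep[i] != i:
            i = self.rep[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        self.rep[ri] = rj
        self.ccpar[rj] |= self.ccpar[ri]
        self.ccpar[ri] = set()

    def congruent(self, i, j):
        return (self.fn[i] == self.fn[j]
                and len(self.args[i]) == len(self.args[j])
                and all(self.find(x) == self.find(y)
                        for x, y in zip(self.args[i], self.args[j])))

    def merge(self, i, j):
        if self.find(i) == self.find(j):
            return
        parents_i = set(self.ccpar[self.find(i)])
        parents_j = set(self.ccpar[self.find(j)])
        self.union(i, j)
        for p in parents_i:
            for q in parents_j:
                if self.find(p) != self.find(q) and self.congruent(p, q):
                    self.merge(p, q)


def decide(equalities, disequalities):
    cc = CongruenceClosure()
    for s, t in equalities + disequalities:   # step 1: build the DAG
        cc.add(s)
        cc.add(t)
    for s, t in equalities:                   # step 2: merge
        cc.merge(cc.ids[s], cc.ids[t])
    for s, t in disequalities:                # steps 3 and 4: check
        if cc.find(cc.ids[s]) == cc.find(cc.ids[t]):
            return "unsat"
    return "sat"


def f(t):
    return ("f", t)


a = "a"
result = decide([(f(f(f(a))), a), (f(f(f(f(f(a))))), a)], [(f(a), a)])
```

With this encoding the formula $f^3(a) = a \land f^5(a) = a \land f(a) \neq a$ yields `"unsat"`, since all six subterms end up in one class.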

Example

$F : f(f(f(a))) = a \land f(f(f(f(f(a))))) = a \land f(a) \neq a$

[Figure: the initial DAG; node $k$ represents $f^k(a)$, so node 0 is $a$ and node 5 is $f^5(a)$]

We first process $f(f(f(a))) = a$ by merging 3 and 0.

The parents of each class in this merge are {4} and {1}, so we recursively merge 4 and 1.

The next parents are {5} and {2}.

Example, contd.

$F : f(f(f(a))) = a \land f(f(f(f(f(a))))) = a \land f(a) \neq a$

The next parents are {5} and {2}, so we recursively merge 5 and 2.

After processing $f(f(f(a))) = a$, we have the congruence classes:
  $\{\{a, f^3(a)\}, \{f(a), f^4(a)\}, \{f^2(a), f^5(a)\}\}$

Now, process $f^5(a) = a$ by merging the representatives of 5 and 0.

Example, contd.

$F : f(f(f(a))) = a \land f(f(f(f(f(a))))) = a \land f(a) \neq a$

The parents of 5's class are {3}, and the parents of 0's class are {1, 4}, so merge 3 and 1.

Now we have a single congruence class:
  $\{\{a, f(a), f^2(a), f^3(a), f^4(a), f^5(a)\}\}$

This class violates the disequality $f(a) \neq a$, so unsat.

$T_A$: Theory of Arrays

Signature: $\Sigma_A : \{=, \cdot[\cdot], \cdot\langle\cdot \triangleleft \cdot\rangle\}$
  $a[i]$ is a binary function denoting a read of $a$ at index $i$
  $a\langle i \triangleleft v \rangle$ is a ternary function denoting a write of value $v$ into $a$ at index $i$

We'll see how to decide the quantifier-free, conjunctive fragment.

Is this expressive?
  It can only talk about individual elements, not entire arrays
  See Chapter 11 of the book for more expressive fragments

Deciding Theory of Arrays

Basic idea: we'll reduce this to deciding $T_E$.
  If a $T_A$-formula has no writes, then reads can be viewed as uninterpreted function terms
  If there is a write, it must occur in the context of a read. Why?
  So all writes occur in read-over-write terms $a\langle i \triangleleft v \rangle[j]$
  We apply the read-over-write axioms to decompose these terms into simpler ones:
    (read-over-write 1): $\forall a, v, i, j.\ i = j \rightarrow a\langle i \triangleleft v \rangle[j] = v$
    (read-over-write 2): $\forall a, v, i, j.\ i \neq j \rightarrow a\langle i \triangleleft v \rangle[j] = a[j]$
  Then we use our $T_E$ solver

Deciding Theory of Arrays, In Detail

Given a $T_A$-formula $F$, follow these steps recursively.

If $F$ doesn't contain any write terms, do the following:
1. Associate each array variable $a$ with a fresh function symbol $f_a$
2. Replace each read term $a[i]$ with $f_a(i)$
3. Decide and return the $T_E$-satisfiability of the resulting formula

Otherwise, select a term $a\langle i \triangleleft v \rangle[j]$, and split into cases:
1. By (read-over-write 1), replace $F[a\langle i \triangleleft v \rangle[j]]$ with $F_1 : F[v] \land i = j$
2. By (read-over-write 2), replace $F[a\langle i \triangleleft v \rangle[j]]$ with $F_2 : F[a[j]] \land i \neq j$
3. Recurse on $F_1$ and $F_2$. If both are unsat, then return unsat
4. If either is sat, then return sat
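The case split can be sketched as a recursive function that returns the write-free conjunctions at the leaves of the recursion; $F$ is then $T_A$-satisfiable iff at least one leaf is $T_E$-satisfiable. The sketch assumes a hypothetical encoding (`("read", a, i)`, `("write", a, i, v)`, literals `("=", s, t)` and `("!=", s, t)`, a formula as a list of literals) and leaves the $T_E$ check to a separate solver. It is meant for small inputs like the one shown, and no termination argument is attempted for arbitrary nesting.

```python
def find_read_over_write(term):
    """Return some subterm of the form read(write(a, i, v), j), or None."""
    if isinstance(term, str):
        return None
    if term[0] == "read" and isinstance(term[1], tuple) and term[1][0] == "write":
        return term
    for sub in term[1:]:
        hit = find_read_over_write(sub)
        if hit is not None:
            return hit
    return None


def substitute(term, old, new):
    """Replace every occurrence of `old` inside `term` by `new`."""
    if term == old:
        return new
    if isinstance(term, str):
        return term
    return (term[0],) + tuple(substitute(sub, old, new) for sub in term[1:])


def case_split(literals):
    """Write-free conjunctions obtained by applying both read-over-write axioms."""
    for lit in literals:
        target = find_read_over_write(lit)
        if target is not None:
            break
    else:
        return [literals]  # no write terms left
    _, (_, a, i, v), j = target
    # (read-over-write 1): the read returns v, under the assumption i = j.
    f1 = [substitute(l, target, v) for l in literals] + [("=", i, j)]
    # (read-over-write 2): the read skips the write, under the assumption i != j.
    f2 = [substitute(l, target, ("read", a, j)) for l in literals] + [("!=", i, j)]
    return case_split(f1) + case_split(f2)


def to_euf(term):
    """Replace each read a[i] on an array variable a by f_a(i)."""
    if isinstance(term, str):
        return term
    if term[0] == "read":
        return ("f_" + term[1], to_euf(term[2]))
    return (term[0],) + tuple(to_euf(sub) for sub in term[1:])


# i1 = j, i1 != i2, a[j] = v1, a<i1 <| v1><i2 <| v2>[j] != a[j]
nested = ("write", ("write", "a", "i1", "v1"), "i2", "v2")
F = [("=", "i1", "j"), ("!=", "i1", "i2"),
     ("=", ("read", "a", "j"), "v1"),
     ("!=", ("read", nested, "j"), ("read", "a", "j"))]
leaves = [[to_euf(l) for l in leaf] for leaf in case_split(F)]
```

For this input the split produces three leaves, one for each combination of assumptions explored by the recursion; each leaf can then be handed to a $T_E$ decision procedure.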

$T_A$ Example

$F : i_1 = j \land i_1 \neq i_2 \land a[j] = v_1 \land a\langle i_1 \triangleleft v_1 \rangle\langle i_2 \triangleleft v_2 \rangle[j] \neq a[j]$

$F$ has a write term, so select a read-over-write term to deconstruct:
  $a\langle i_1 \triangleleft v_1 \rangle\langle i_2 \triangleleft v_2 \rangle[j]$

According to (read-over-write 1), assume $i_2 = j$ and recurse on:
  $F_1 : i_1 = j \land i_1 \neq i_2 \land a[j] = v_1 \land v_2 \neq a[j] \land i_2 = j$

This doesn't have any write terms, so build a $T_E$-formula:
  $F_1' : i_1 = j \land i_1 \neq i_2 \land f_a(j) = v_1 \land v_2 \neq f_a(j) \land i_2 = j$

This is unsatisfiable, since $i_1 = j$ and $i_2 = j$ contradict $i_1 \neq i_2$, so let's move on to the next case.

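$T_A$ Example

$F : i_1 = j \land i_1 \neq i_2 \land a[j] = v_1 \land a\langle i_1 \triangleleft v_1 \rangle\langle i_2 \triangleleft v_2 \rangle[j] \neq a[j]$

According to (read-over-write 2), assume $i_2 \neq j$ and recurse on:
  $F_2 : i_1 = j \land i_1 \neq i_2 \land a[j] = v_1 \land a\langle i_1 \triangleleft v_1 \rangle[j] \neq a[j] \land i_2 \neq j$

This has a write term, so apply (read-over-write 1) and assume $i_1 = j$:
  $F_3 : i_1 = j \land i_1 \neq i_2 \land a[j] = v_1 \land v_1 \neq a[j] \land i_2 \neq j$

This is unsatisfiable, so apply (read-over-write 2) and assume $i_1 \neq j$:
  $F_4 : i_1 = j \land i_1 \neq i_2 \land a[j] = v_1 \land a[j] \neq a[j] \land i_2 \neq j \land i_1 \neq j$

This is also unsatisfiable. Now all branches have been tried, and we conclude that $F$ is $T_A$-unsat.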

Next Lecture

For more on today's material, see Chapter 9 of Bradley & Manna.

Next time, we'll talk about:
  Dealing with quantifiers
  Disjunctive formulas, better approaches than DNF
  Satisfiability Modulo Theories (SMT)

Second homework is due on Tuesday!

Good questions on Piazza so far, be sure to check up on the answers.