Scalable and Accurate Verification of Data Flow Systems Cesare Tinelli The University of Iowa
Overview AFOSR Supported Research Collaborations NYU (project partner) Chalmers University (research collaborator) Rockwell-Collins (end user) Overall Objective Improve performance and scalability of techniques for verifying mission-critical embedded software 2
Overview Scientific Approach Focus on synchronous data-flow systems Develop automated reasoning techniques and tools based on more powerful logic than propositional logic Develop modular and compositional verification methods and tools Breakthrough Opportunities Use Satisfiability Modulo Theories (SMT) instead of SAT Exploit recent dramatic advances in SMT technology Reason about finite- as well as infinite-state systems 3
Formally Verifying Software Systems Traditional main alternatives: Deductive verification Based on first- or higher-order logical calculi for theorem proving Model checking Based on automata or SAT techniques 4
Deductive Verification vs Model Checking Deductive Verification Pros Natural translation Unrestricted data types Arbitrary properties Favors proving validity Cons Time consuming Expertise required May be hard to produce counterexamples Model checking Pros Fast Automatable Generates concrete counterexamples Cons More complex translation Finite data types only Propositional properties Harder to prove validity 5
Main Idea of This Work Middle-ground approach SMT-based model checking: Automatically translate system S and property P into a first-order logic with built-in theories Try to prove or disprove P automatically with an inductive model checker/verifier Use an SMT solver as the main reasoning engine Verify control and data properties 6
Satisfiability Modulo Theories Lifting of SAT solving techniques to certain fragments of data type theories (linear arithmetic, arrays, lists, tuples, ) Uses efficient specialized reasoners to handle predicates with theory symbols (>, +, a[i], cons, ) SAT SMT Boolean formulas quantifier free formulas over theories More powerful than Boolean representation, but retains decidability and efficiency More compact formulas, better scalability More natural encodings for verification 7
This Research Background Seminal work on SAT-based temporal induction at Chalmers University [Biesse&Claessen, Sheeran et al.] Our contributions lifting to SMT case extension to infinite-state systems enhancements to induction method state-of-the-art SMT solver 8
Main Focus and Approach Consider reactive systems specifiable with synchronous dataflow languages Use SMT-based k-induction to verify safety properties (ie., invariant functional properties) of transition systems For experimental evaluations, work with systems written in Lustre 9
Essence of Lustre Programs Declarative and deterministic System of equational constraints between input and output streams Each stream s of values of type τ is a function s : τ that maps instants to stream values 10
Lustre by Example node thermostat (a_temp, d_temp, marg: real) returns (cool, heat: bool) let cool = (a_temp - d_temp) > marg ; heat = (a_temp - d_temp) < -marg ; tel node therm_control (actual: real; up, dn: bool) returns (heat, cool: bool) var target, margin: real; let margin = 1.5 ; desired = 21.0 -> if dn then (pre desired) - 1.0 else if up then (pre desired) + 1.0 else pre target ; (cool, heat) = thermostat (actual, desired, margin) ; tel 11
From stream algebra constraints to temporal constraints Stream constraints can be reduced to Boolean & arithmetic constraints over instantaneous configurations: margin(n) = 1.5 target(n) = ite(n = 0, 70.0,ite(dn(n), target(n 1) 1.0,K)) cool(n) = (actual(n) target(n)) > margin(n) heat(n) = (actual(n) target(n)) < ( margin(n)) Crucial observation: SMT solvers can process this sort of constraints natively and efficiently 12
SMT Solvers Properties to (dis)prove Context Strategies Formulas Queries Proofs Translated counterexamples SMT Solver: automatic engine for checking satisfiability/validity Formulas can encode transition systems and their properties Solver returns with: 13 Models Counterexamples Proof hints SMT Solver Whether input formula is valid or not in the built-in theory If so, a proof of validity If not, a counter-model
Our Verification Framework Translation of a Lustre program L and a putative invariant property P into set F of SMT formulas SMT-based k-induction on F to prove or disprove P for L Main enhancements: path compression abstraction invariant discovery Lustre program Properties to verify Lustre Verifier SMT Solver Logical formula s OK / Error trace 14
Our Verification Framework Translation of a Lustre program L and a putative invariant property P into set F of SMT formulas SMT-based k-induction on F to prove or disprove P for L Main enhancements: path compression abstraction invariant discovery Lustre program Properties to verify Lustre Verifier SMT Solver Logical formula s OK / Error trace 15
Some Basic Terminology Q : a state space S : a state transition system over Q I : set of S s initial states T : S s transition relation Reachable states: all initial states of S and all T-successors of reachable states 16
A Classification of Functional Properties Valid: satisfied by all states in Q Inductive: I(s 0 ) => P(s 0 ) P(s n ), T(s n, s n+1 ) => P(s n+1 ) k-inductive: I(s 0 ), T(s 0, s 1 ),, T(s k-1, s k ) => P(s 0 ),, P(s k ) T(s n, s n+1 ),, T(s n+k, s n+k+1 ), P(s n ),, P(s n+k ) => P(s n+k+1 ) Invariant: satisfied by all reachable states of S Valid Inductive k-inductive Invariant 17
Proving Invariants by k-induction Automatable when 1. I, T, P can be encoded in some formal logic L 2. Implication (=>) in L can be checked efficiently Challenges Accuracy encoding of T is often an over-approximation k-induction is incomplete for proving invariant in infinite-state systems Scalability size of state space size and complexity of resulting formulas 18
Improving Accuracy: Invariant Discovery Problem If P is invariant for S but k-induction fails to show it then T(s, s ) holds for some unreachable state s Possible Solution 1. Find some other invariant Inv(x) for S 2. Strengthen T(x, y) into T (x, y) = T(x, y) /\ Inv(x) 3. Try again with T 19
A Novel Invariant Discovery Method 1. Let R(x,y) be a formula standing for a binary relation over program values (ex: x => y, x = y, x< y, x y< 10, ) 2. Determine heuristically a finite set U of relevant terms (e.g., including selected terms from T) 3. Find a subset C of { (t 1, t 2 ) in U U R(t 1, t 2 ) is invariant } 4. Let Inv = {R(t 1, t 2 ) (t 1, t 2 ) in C } 20
A Novel Invariant Discovery Method 1. Challenges: Let R(x,y) be a formula standing for a binary C relation should over program values (ex: x => y, x = y, x< y, x y< 10, ) be computed quickly be fairly small 2. Determine heuristically a finite set U of relevant exclude pairs (t terms (e.g., including 1, t 2 ) with R(t selected 1, t terms 2 ) valid from T) 3. Find a subset C of { (t 1, t 2 ) in U U R(t 1, t 2 ) is invariant } 4. Let Inv = {R(t 1, t 2 ) (t 1, t 2 ) in C } 21
Initial Results Developed efficient method for computing set C when R is over a basic data type (eg., bool, int, real) R is reflexive, transitive and anti-symmetric (eg, =>,, =, ) Properties of R allow a compact encoding(dag) of C Method capitalizes on ability of SMT solvers to find counter-models 22
Initial Evaluation ( => ) Considered ~600 Lustre benchmarks (program + property) All properties conjectured to hold Timeout: 120s / benchmark Kind with Yices SMT solver proves 369 (61%) directly 448 (75%) with the aid of separately discovered invariants Average proving times: without invariants: 0.06s with invariants: 0.29s 23
Invariant Discovery Statistics Timeout limit: 300s Timeouts: 17 # of invariants/benchmark Min: 0 Max: 168 Average: 8 Invariants 1 10 100 Times 551 501 451 401 351 301 251 201 151 101 51 1 Generation time 24 Total: 130m Average: 13s 0.01 0.1 1 10 100 1000 551 501 451 401 351 301 251 201 151 101 51 1
Conclusion Automated verification of safety properties of synchronous data-flow systems specified in Lustre Translation of Lustre programs + properties into suitable FOL fragment with built-in theories Use of off-the-shelf SMT solvers to prove properties by k-induction methods Enhanced with path compression, abstraction, & invariant discovery techniques Lustre verifier highly competitive with previous tools (Lesar, Luke, SAL) 25
Future Work Incorporation of in-house new SMT solver, CVC4, into Lustre verifier More powerful abstraction thanks to new CVC4 features Additional invariant discovery methods Modular verification Improved support for Lustre programs with nonlinear arithmetic 26
27 Thank you