Model Checking & Program Analysis Markus Müller-Olm Dortmund University Overview Introduction Model Checking Flow Analysis Some Links between MC and FA Conclusion Apology for not giving proper credit to other researchers' work in this tutorial! MOVEP'02, Nantes, June 17-21, 2002 2
Purposes of Automatic Analysis Optimizing compilation Validation/Verification Type checking Functional correctness Security properties... Debugging MOVEP'02, Nantes, June 17-21, 2002 3 Fundamental Limit Rice's Theorem [Rice,1953]: All interesting semantic questions about programs from a universal programming language are undecidable. MOVEP'02, Nantes, June 17-21, 2002 4
Example: Detection of Constants read(new); π; write(new)? read(new); π; write(k) write(new) can be replaced by write(k) for some constant k π terminates Hence: Constant Detection is undecidable MOVEP'02, Nantes, June 17-21, 2002 5 Two Solutions Weaker formalisms analyze abstract models of systems e.g.: automata, labelled transition systems,... Approximate analyses yield sound but, in general, incomplete results e.g.: detects some instead of all constants Model checking Flow analysis Abstract interpretation Type checking MOVEP'02, Nantes, June 17-21, 2002 6
Weaker Formalisms Program main() { x=17; if (x>63) { y=17;x=10;x=x+1;} else { x=42; while (y<99) { y=x+y;x=y+1;} y=11;} x=y+1; out(x); } Approximate Abstract model Exact Exact analyzer for abstract model MOVEP'02, Nantes, June 17-21, 2002 7 Overview Introduction Model Checking Flow Analysis Some Links between MC and FA Conclusion MOVEP'02, Nantes, June 17-21, 2002 8
Model Checking OK Finite-state structure or G( Φ FΨ) Temporal logic formula Model Checker Error trace MOVEP'02, Nantes, June 17-21, 2002 9 What is a Model Checker? an automatic procedure for deciding M where M is a model structure is a (temporal-logic) formula means satisfaction MOVEP'02, Nantes, June 17-21, 2002 10
Model Structures M MOVEP'02, Nantes, June 17-21, 2002 11 Model Structures own use own use own use Kripke structure acquire start Labeled Transition System release end P Q acquire P Q start P Q Kripke Transition System release end MOVEP'02, Nantes, June 17-21, 2002 12
Kripke Structures own use own use own use K= ( S, R, I), where S states R S S transition relation, total AP I : S 2 interpretation AP atomic propositions MOVEP'02, Nantes, June 17-21, 2002 13 Temporal Logics M MOVEP'02, Nantes, June 17-21, 2002 16
Temporal Logics Linear-Time Logics formulas specify properties of program paths state satisfies property if all paths starting in this state do Branching-Time Logic can specify properties sensitive to branching has (or can simulate) path quantifiers A and E MOVEP'02, Nantes, June 17-21, 2002 17 Branching vs. Linear-Time Logics P coin Q coin coin coffee tea coffee tea LT logics cannot distinguish P and Q: P and Q have same execution paths BT logics can: P [coin] <tea> true but Q [coin] <tea> true MOVEP'02, Nantes, June 17-21, 2002 18
A Linear-Time Logic: PLTL own use own use own use 1. own the resource before using it use U own 2. release the resource in finite time own F( own) PLTL formulas :: = p X U 1 2 1 2 Abbreviations F : = trueu G : = F( ) WUψ : = U ψ G MOVEP'02, Nantes, June 17-21, 2002 19 Linear-Time Modalities X :... next G :... Generally F :... Finally U ψ: ψ... Until WU ψ: U ψ or G Weak Until MOVEP'02, Nantes, June 17-21, 2002 20
Until ψ Linear-time: blue nodes satisfy v U ψ ψ Branching-time: blue nodes satisfy v A( U ψ) MOVEP'02, Nantes, June 17-21, 2002 21 Weak Until ψ Linear-time: blue nodes satisfy v WU ψ ψ Branching-time: blue nodes satisfy v A ( WU ψ) MOVEP'02, Nantes, June 17-21, 2002 22
A Branching-Time Logic: CTL own use own use own use 1. own the resource before using it A( use U own) 2. release the resource in finite time own AF( own) State formulas :: = p Eψ Aψ Path formulas ψ 1 2 ψ :: = X U G F 1 2 MOVEP'02, Nantes, June 17-21, 2002 23 Duality Path quantifiers are duals: A = E E = A Until and weak until are almost duals (on paths): ( U ψ) = ψ WU ( ψ) ( WU ψ) = ψ U ( ψ) MOVEP'02, Nantes, June 17-21, 2002 24
Modal mu-calculus a branching-time logic with local modalities : all successor states satisfy : some successor state satisfies and fixpoint formulae. µ X. (X) : minimum fixpoint formula ν X. (X) : maximum fixpoint formula Fixpoint formulae can be nested Alternation: µx.(νy.x Y) MOVEP'02, Nantes, June 17-21, 2002 25 Local Modalities for Labelled Edges [a] : a a a.. <a> : a a a.. MOVEP'02, Nantes, June 17-21, 2002 26
Computing Fixpoint Formulae On finite structures, meaning of fixpoint formulae can be computed by computing Kleene chains until stabilization: µ :,f( ),f(f( ),f(f(f( ))),... ν :,f( ),f(f( )),f(f(f( ))),... f is a function on state sets derived from the body of the fixpoint formula. MOVEP'02, Nantes, June 17-21, 2002 27 CTL and Modal mu-calculus CTL-formulae can inductively be transformed to modal mu-calculus formulae Global modalities can be expressed by fixpoint formulae, e.g.: A( Uψ ) = µ X.( ψ ( X true)) E( Uψ ) = µ X.( ψ ( X)) A( WU ψ) = ν X.( ψ ( X)) E( WU ψ) = ν X.( ψ ( ( X false))) MOVEP'02, Nantes, June 17-21, 2002 28
Model Checking Approaches M MOVEP'02, Nantes, June 17-21, 2002 29 Global vs. Local Model Checking Global model checking problem Given: finite model structure M, formula Determine: {s s } Local model checking problem Given: finite model structure M, formula, and state s in M Determine, whether s or not MOVEP'02, Nantes, June 17-21, 2002 30
Model Checking Approaches Iterative model checking Automata-theoretic model checking Tableau-based model checking MOVEP'02, Nantes, June 17-21, 2002 31 Iterative Model Checking Good for global model-checking of branching-time logics Idea: Compute semantics of formula on given model by structural induction on the formula Reduce modalities to their fixpoint definition Compute the fixpoints by Kleene chains Alternating fixpoints lead to backtracking ( proceedings) MOVEP'02, Nantes, June 17-21, 2002 32
Iterative Model Checking of CTL Annotate each state with those subformulas ψ of with s ψ. Use structural induction on. Use backwards propagation to compute modalities A( U ψ) and A( WU ψ). MOVEP'02, Nantes, June 17-21, 2002 33 Model Checking A(Uψ) ψ ψ 1. Mark all nodes v with v ψ 2. Mark all unmarked nodes w with w and all successors marked 3. Iterate 2. until stabilization MOVEP'02, Nantes, June 17-21, 2002 34
Model Checking A(WUψ) ψ ψ 1. Mark all nodes with v or v ψ 2. Unmark all nodes v with v ψ and some unmarked successor 3. Iterate 2. until stabilization MOVEP'02, Nantes, June 17-21, 2002 35 Automata-theoretic MC Good for linear-time logics Idea: reduce model-checking to nonemptiness problem of an automaton MOVEP'02, Nantes, June 17-21, 2002 36
Automata-theoretic MC Construct (Büchi-) automaton, A, from A accepts paths satisfying Construct (Büchi-) automaton, A M, from M A M accepts paths exhibited by M M iff L(A ) L(A ) M iff L(A ) L(A ) = M iff L(A A ) = M MOVEP'02, Nantes, June 17-21, 2002 37 Automata-theoretic MC Construct (Büchi-) automaton, A, from A accepts paths satisfying Construct (Büchi-) automaton, A M, from M A M accepts paths exhibited by M Compute automaton A M A Complementation & product construction Decide L(A M A )= by reachability analysis MOVEP'02, Nantes, June 17-21, 2002 38
Automaton for F(P) G(Q) (Finite Maximal Paths Interpretation) {Q} {Q},{P,Q} {Q} {Q},{P,Q} 1 {P,Q} 3 {P,Q} 1 3 {},{P} {},{P} {},{P} {},{P} F(P) & G(Q) Automata construction, Determinization, and Minimization 2 {},{P},{Q},{P,Q} Complementation 2 {},{P},{Q},{P,Q} MOVEP'02, Nantes, June 17-21, 2002 40 A Successful Model-Check for formula F(P) G(Q) Kripke structure: Q P,Q Q 1 2 3 Corresponding automaton: {Q} {P,Q} {Q} 1 2 3 4 {P,Q} Product automaton: {Q} {P,Q} {Q} 11 12 33 34 {P,Q} 32 {P,Q} {P,Q} MOVEP'02, Nantes, June 17-21, 2002 41
A Failing Model-Check for formula F(P) G(Q) Kripke structure: Q P,Q 1 2 3 Corresponding automaton: {Q} {P,Q} {} 1 2 3 4 {P,Q} Product automaton: {Q} {P,Q} {} 11 12 33 24 {P,Q} 32 {P,Q} {P,Q} MOVEP'02, Nantes, June 17-21, 2002 42 Tableau-Based Model-Checking Idea: solve local model checking problem by sub-goaling Try to construct a proof tree that witnesses s If no proof tree can be found, then s For proof tree construction, tableau rules are given that reduce goals to sub-goals MOVEP'02, Nantes, June 17-21, 2002 43
Some Tableau Rules AND s 1 2 s s 1 2 OR1 s 1 2 s 1 OR2 s 1 2 s 2 1 [ a] s a BOX if {s 1,..., sn} = { t s t} s,..., s n DIAMOND s a a if s t t Tableau Rules for Fixpoints Fixpoint formulas are analyzed with unfolding rules, e.g., s µ X. (X) µ -UNFOLD s µ ( X. (X)) If fixpoint formula sequent re-appears later: µ : unsuccessful leaf ν : successful leaf Special care needed for nested fixpoints ( proceedings) MOVEP'02, Nantes, June 17-21, 2002 45
A Model Check by Tableau = ν X.( ψ [b]x) a ψ = µ X. a true b X s t s [ b] s ψ b b ν successful! s s ψ a true s a true t true b ψ s s [ b] Typical Profile of Model Checking Techniques Iterative Tableau methods Automatatheoretic Branchingtime X X Linear-time X X Global X X Local X X MOVEP'02, Nantes, June 17-21, 2002 49
State-Explosion Problem State-Explosion number of states grows exponentially in number of components or variables Techniques for fighting state-explosion symbolic model-checking incremental construction of state space abstraction symmetry reductions partial-order methods compositional methods MOVEP'02, Nantes, June 17-21, 2002 50 Other Classes of Systems Infinite-state systems Timed systems Hybrid systems Probabilistic systems... Learn more about such systems this week! MOVEP'02, Nantes, June 17-21, 2002 51
Overview Introduction Model Checking Flow Analysis Some Links between MC and FA Conclusion MOVEP'02, Nantes, June 17-21, 2002 52 From Programs to Flow Graphs main() { x=17; if (x>63) { y=17;x=10;x=x+1;} else { x=42; while (y<99) { y=x+y;x=y+1;} y=11;} x=y+1; out(x); } x := 17 y := 17 x := 10 x := x+1 x := y+1 out(x) x := 42 y := 11 y := x+y x := y+1 MOVEP'02, Nantes, June 17-21, 2002 53
Dead Code Elimination Goal: find and eliminate assignments that compute values which are never used Fundamental problem: undecidability hence: use approximate algorithm; ignore guards Technique: propagate set of non-needed variables backwards through the flow graph MOVEP'02, Nantes, June 17-21, 2002 54 x := 17 y := 17 {x,y} {x} x := 42 {x} {x,y} y := x+y x := 10 {x} x := x+1 {x} x := y+1 {x,y} y := 11 {x} x := y+1 {x,y} {y} out(x) {x,y} MOVEP'02, Nantes, June 17-21, 2002 55
x := 17 y := 17 {x} x := 42 {x} y := x+y x := 10 {x} x := x+1 {x} x := y+1 y := 11 {x} x := y+1 {y} out(x) {x,y} MOVEP'02, Nantes, June 17-21, 2002 56 Remarks Forward vs. backward analyses Bitvector analyses backward: live/dead variables, very busy expressions, forward: reaching definitions, available expressions Computation strategies MOVEP'02, Nantes, June 17-21, 2002 57
Flow Equations for Dead Code DeadIn[ i] = (DeadOut[ i] { x}\{ y, z} j 1 x := y+z... j k DeadOut[ i] = DeadIn[ i] j Succ[ i] General equations: Equations may be combined or replaced by inequations: DeadIn[ i] = (DeadOut[ i] Mod[ i]) \ Use[ i] DeadOut[ i] = DeadIn[ i] j Succ[ i] DeadOut[ i] = (DeadOut[ j] Mod[ j]) \ Use[ j] j Succ[ i] DeadOut[ i] (DeadOut[ j] Mod[ j]) \ Use[ j], for j Succ[ i] MOVEP'02, Nantes, June 17-21, 2002 58 Data-Flow Frameworks Correctness generic properties of frameworks can be studied and proved Implementation efficient, generic implementations can be constructed MOVEP'02, Nantes, June 17-21, 2002 59
Data-Flow Frameworks a complete lattice (D, ) of data-flow facts a space F of transfer functions f: D D Id F closed under composition initial value init D a control-flow graph with set of entry/exit nodes mapping assigning transfer functions f i to flow graph nodes i MOVEP'02, Nantes, June 17-21, 2002 60 Framework for dead variables lattice D 2 Var Var control flow program graph initial value function space { f : D D d,d : f( d) = ( d d ) \ d } 1 2 1 2 f f ( d) = ( d Mod[ i]) \ Use[ i] i i MOVEP'02, Nantes, June 17-21, 2002 61
What Data Flow Algorithms Compute Forward Analysis Computes maximal solution w.r.t. (D, ) of init In[ i] = Backward Analysis j Pred( i) j f (In[ j]) Computes maximal solution of init Out[ i] = f (Out[ j]) j Succ( i) j i Entry otherwise i Exit otherwise This is called the maximal fixpoint solution MFP[i] MOVEP'02, Nantes, June 17-21, 2002 62 Assessing Data Flow Frameworks Execution Semantics Abstraction MOP-solution sound? how precise? MFP-solution sound? precise? MOVEP'02, Nantes, June 17-21, 2002 63
{x,y} v x := 17 {x,y} {x} y := 17 x := 10 x := x+1 x := 42 y := 11 y := x+y x := y+1 x := y+1 out(x) infinitely many such paths MOP[] v = {, x y} {} x = {} x MOVEP'02, Nantes, June 17-21, 2002 64 Meet-Over-All-Paths Solution Forward Analysis MOP[ i] : = p Paths[ entry, i) F p( init) Backward Analysis MOP[ i] : = p Paths( i, exit ] F p ( init) MOVEP'02, Nantes, June 17-21, 2002 65
Coincidence Theorem Definition: A framework is distributive if f( X)= {f(x) x X} for all X D, f F. Theorem: In any distributive framework, MOP[i]=MFP[i] for all program points i. Theorem: All bitvector frameworks are distributive. MOVEP'02, Nantes, June 17-21, 2002 66 Lattice for Constant Propagation inconsistent value... -2-1 0 1 2... unknown value MOVEP'02, Nantes, June 17-21, 2002 67
Constant Propagation Framework lattice D Var ( {, }) = Var ConstVal ρ ρ': x: ρ( x) ρ'( x) pointwise meet ( x) = f.a. x Var control flow program graph initial value function space { f : D D f monotone} f i e CP dx [ ( d)] if i annotated with x: = e fi( d) = d otherwise MOVEP'02, Nantes, June 17-21, 2002 68 x := 17 ( ρ( x), ρ( y), ρ( z)) x := 2 x := 3 y := 3 y := 2 z := x+y (2,3,5) out(x) (3,2,5) MOP[ v ] = (,,5) MOVEP'02, Nantes, June 17-21, 2002 69
x := 17 ( ρ( x), ρ( y), ρ( z)) (,, ) x := 2 (2,, ) y := 3 (2,3, ) z := x+y out(x) (,, ) (,, ) x := 3 y := 2 (3,, ) (3,2, ) MFP[ v ] = (,, ) MOP[ v ] = (,,5) MOVEP'02, Nantes, June 17-21, 2002 70 Correctness Theorem Definition: A framework is monotone if f(x) f(y) for all x,y D, x y. Theorem: In any monotone framework, MFP[i] MOP[i] for all program points i. Remark: Any "reasonable" framework is monotone. MOVEP'02, Nantes, June 17-21, 2002 71
Assessing Data Flow Frameworks Execution Semantics Abstraction MOP-solution sound MFP-solution sound precise, if distrib. MOVEP'02, Nantes, June 17-21, 2002 72 Where Flow Analysis Looses Precision Execution Semantic MOP MFP Widening Narrowing Loss of Precision MOVEP'02, Nantes, June 17-21, 2002 73
Overview Introduction Model Checking Flow Analysis Some Links between MC and FA Conclusion MOVEP'02, Nantes, June 17-21, 2002 74 Some Links between MC and FA Flow Analysis via Model Checking Model Checking via Flow Analysis Synergy of MC and FA MOVEP'02, Nantes, June 17-21, 2002 75
Flow Analysis via Model Checking yes/no data-flow problem flow graph Flow graph transformer Design & Programming formula Kripke structure Model-Checker yes/no annotation of Kripke structure nodes Interpretation data flow information for flow graph nodes Dead-Code Elimination Flow Graph Transformation: y := x+y Mod(y), Use(x), Use(y) Formula specifying "x is dead": Dead x = X ( use( x) WU (mod( x) use( x))) Note: more direct specification than in the data flow framework MOVEP'02, Nantes, June 17-21, 2002 77
x := 17 Mod(x) y := 17 x := 42 Mod(y) Mod(x) Mod(x), Use(y) y := x+y x := 10 Mod(x) x := x+1 x := y+1 y := 11 x := y+1 Mod(x), Use(x) Mod(y) Mod(y), Use(x), Use(y) out(x) Mod(x), Use(y) exit Use(x) End Check: Dead = X ( use( x) WU (mod( x) use( x))) x MOVEP'02, Nantes, June 17-21, 2002 78 Model Checking via Flow Analysis Evaluation of (CTL) modalities can be seen as a data flow analysis CTL model-checking is iterated flow analysis MOVEP'02, Nantes, June 17-21, 2002 79
Framework for E( U ψ) lattice D B = {0,1} 1 0 0 1 = 1 0 control flow Kripke structure ( SRI,, ) initial nodes Nodes s with s ψ initial value 1 function space { λx.0, λx.1, λxx. } f i λxx. if i ψ, i fi( d) = λx.0 if i ψ, i 0 1 MOVEP'02, Nantes, June 17-21, 2002 80 Why the Framework computes E( U ψ) MFP[ v ] = MOP[ v] = 1 iff s E( U ψ ), ψ, ψ, ψ ψ 0 1 λ x. x λx. x λx. x 1 1 0, ψ, ψ, ψ ψ λx. x λx. x λx.0 1
Model Checking E( U ψ) 0 λxx. 1 ψ λxx. ψ λxx. λxx. λx.0 λxx. MOVEP'02, Nantes, June 17-21, 2002 82 The Model-Extraction Problem in Software Model-Checking Program Abstract model Model Checker void add(object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; } Gap Idea: Use techniques from abstract interpretation and program analysis to extract a finite-state model from program MOVEP'02, Nantes, June 17-21, 2002 85
Bandera: [Dwyer, Hatcliff, et. al.] An open tool set for model-checking Java source code Graphical User Interface Optimization Control Checker Inputs Bandera Temporal Specification void add(object o) { buffer[head] = o; head = (head+1)%size; } Object take() { tail=(tail+1)%size; return buffer[tail]; } Java Source taken from Bandera tutorial of John Hatcliff http://www.cis.ksu.edu/santos/bandera Transformation & Abstraction Tools Error Trace Mapping Bandera Model Checkers Checker Outputs Conclusion Overview on fundamentals of Model Checking Flow Analysis Some links MOVEP'02, Nantes, June 17-21, 2002 98
Just the Beginning... More complex properties Theory of Abstract Interpretation Interprocedural Flow Analysis Parallel Programs MOVEP'02, Nantes, June 17-21, 2002 99 Any Questions? MOVEP'02, Nantes, June 17-21, 2002 100