Reachability Analysis of Pushdown Automata with an Upper Stack

Adrien Pommellet (1), Marcio Diaz (1), Tayssir Touili (2)
(1) Université Paris-Diderot and LIPN, France
(2) LIPN, CNRS, and Université Paris 13, France
March 8, 2017

Pushdown Systems
Pushdown Systems (PDSs) are often used to model programs with unbounded recursion, but can fail to accurately represent the actual stack.
Figure 1: An assembly stack (memory cells ... 1 2 3 4 5 6 7 ...)

The Limits of PDSs
Figures 2 to 19 (one slide built up in four steps) compare the actual stack with its PDS representation:
Initially, the stack is ... 1 2 3 4 ... and the simple PDS stack is 3 4 ...
We push a value on the stack: the stack becomes ... 1 5 3 4 ... and the PDS stack becomes 5 3 4 ...
We pop a value from the stack: the stack is still ... 1 5 3 4 ..., but the PDS stack is back to 3 4 ...
How can we handle the instruction mov eax, [esp - 4]?
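The mismatch illustrated on this slide can be reproduced with a small sketch. The class `Memory` and its methods are hypothetical names introduced here for illustration, cells are indexed one by one (so offset -1 stands for `[esp - 4]` under the assumption of 4-byte cells), and the stack is assumed to grow to the left as in the figures.

```python
class Memory:
    """A flat memory region with a stack pointer (illustrative model only)."""

    def __init__(self, cells, sp):
        self.cells = list(cells)
        self.sp = sp  # index of the cell on top of the stack

    def push(self, value):
        # The stack grows to the left: move the pointer, then write.
        self.sp -= 1
        self.cells[self.sp] = value

    def pop(self):
        # Only the pointer moves: the popped cell keeps its content.
        value = self.cells[self.sp]
        self.sp += 1
        return value

    def read(self, offset):
        # Read a cell relative to the stack pointer (negative = above it).
        return self.cells[self.sp + offset]


memory = Memory([1, 2, 3, 4], sp=2)  # the stack of the figures: ... 1 2 3 4 ...
pds_stack = [3, 4]                   # its PDS representation, top first

memory.push(5)
pds_stack.insert(0, 5)
memory.pop()
pds_stack.pop(0)

# The memory still holds 5 just above the stack pointer,
# but the PDS stack has no trace of it.
value_above_sp = memory.read(-1)
```

After the push and the pop, `memory.cells` is `[1, 5, 3, 4]` while `pds_stack` is `[3, 4]`, so a read above the stack pointer cannot be answered from the PDS alone.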

An idea
Our intuition is to use another stack to model the memory section left of the stack pointer.
Figure 20: The assembly stack (... 1 2 3 4 5 6 7 ...)
Figure 21: Its PDS representation (5 6 7 ...)
Figure 22: Using two stacks (... 1 2 3 4 and 5 6 7 ...)

A New Model
Figures 23 to 25: The stack (... 1 2 3 4 ...), the stack after a push (... 1 5 3 4 ...), and the stack after a pop (... 1 5 3 4 ...)
Figures 26 to 28: Lower and upper stacks, after a push, and after a pop (same contents as Figures 23 to 25, split at the stack pointer)

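Pushdown Systems with an Upper Stack
Definition (Pushdown system with an upper stack). A pushdown system with an upper stack (UPDS) is a triplet $\mathcal{P} = (P, \Gamma, \Delta)$ where $P$ is a finite set of control states, $\Gamma$ is a finite stack alphabet, and $\Delta \subseteq P \times \Gamma \times P \times (\{\varepsilon\} \cup \Gamma \cup \Gamma^2)$ is a finite set of transition rules.
We consider configurations of the form $\langle p, w_u, w_l \rangle$, with a write-only upper stack $w_u$ and a lower stack $w_l$. Let $\overline{\Gamma}$ be a copy of the stack alphabet $\Gamma$. Assuming there is only a single state in $P$, we can represent a configuration as a single word in $\overline{\Gamma}^* \Gamma^*$: the configuration $\langle p, abc, def \rangle$ is written $\bar{a}\bar{b}\bar{c}def$.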

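Semantics of Pop Rules
For a pop rule $\delta = (p, b) \to (p', \varepsilon)$: $a\,p\,b\,c\,d \Rightarrow_\delta a\,b\,p'\,c\,d$
The memory contents (... a b c d ...) are unchanged: only the stack pointer moves, and $b$ now belongs to the upper stack.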

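Semantics of Push Rules
For a push rule $\delta = (p, b) \to (p', ab)$: $x\,y\,p\,b\,c \Rightarrow_\delta x\,p'\,a\,b\,c$
The memory contents change from ... x y b c ... to ... x a b c ...: the pushed symbol $a$ overwrites the top $y$ of the upper stack.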
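The pop and push semantics can be sketched as a one-step function. This is a minimal illustration, not the authors' implementation: `Config`, `Rule` and `step` are hypothetical names, the switch case (upper stack unchanged) and the behaviour of a push on an empty upper stack (it stays empty) are assumptions, since the slides only show the pop and push cases.

```python
from typing import NamedTuple, Optional


class Config(NamedTuple):
    state: str
    upper: str  # top of the upper stack is the LAST character
    lower: str  # top of the lower stack is the FIRST character


class Rule(NamedTuple):
    p: str  # source control state
    a: str  # symbol read on top of the lower stack
    q: str  # target control state
    w: str  # "" for a pop, one letter for a switch, two letters for a push


def step(c: Config, r: Rule) -> Optional[Config]:
    """Apply rule r to configuration c, or return None if r is not enabled."""
    if c.state != r.p or not c.lower.startswith(r.a):
        return None
    rest = c.lower[1:]
    if len(r.w) == 0:
        # Pop: the popped symbol stays in memory, now in the upper stack.
        return Config(r.q, c.upper + r.a, rest)
    if len(r.w) == 1:
        # Switch (assumed): the upper stack is left untouched.
        return Config(r.q, c.upper, r.w + rest)
    # Push: the new symbol overwrites the top of the upper stack.
    # Assumption: an empty upper stack stays empty.
    return Config(r.q, c.upper[:-1], r.w + rest)


pop_rule = Rule("p", "b", "p'", "")
push_rule = Rule("p", "b", "p'", "ab")
after_pop = step(Config("p", "a", "bcd"), pop_rule)
after_push = step(Config("p", "xy", "bc"), push_rule)
```

The two calls mirror the slides: `after_pop` is the configuration written a b p' c d, and `after_push` is the one written x p' a b c.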

The Reachability Problem
What are the sets of predecessors $pre^*$ and successors $post^*$ of a regular set of configurations of a UPDS? Can we compute them? Are they regular, as is the case for lower stack configurations, as shown by Caucal (CAAP'90), Bouajjani et al. (CONCUR'97), and Esparza et al. (CAV'00)?

Reachability Properties of UPDSs
Theorem. There exist a UPDS $\mathcal{P}$ and a regular set of configurations $C$ for which $post^*(C)$ is not regular.
Theorem. There exist a UPDS $\mathcal{P}$ and a regular set of configurations $C$ for which $pre^*(C)$ is not regular.
Theorem. Given a UPDS $\mathcal{P}$, a regular set of configurations $C$, and a configuration $c$ of $\mathcal{P}$, we can decide whether $c \in post^*(\mathcal{P}, C)$ or not.

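A Counter-Example of Regularity for post*
We consider the UPDS $\mathcal{P}$ with the rules:
$(R_a)$: $(p, a) \to (p, \varepsilon)$
$(R_b)$: $(p, b) \to (p, \varepsilon)$
$(C)$: $(p, a) \to (p, ab)$
and the regular set $C = \{p\} \times \{\varepsilon\} \times a(ba)^*$.
Figure 29: A configuration in $C$ ($p\,a\,b\,a\,b\,a$)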

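A Relevant Subset of post*
We consider the subset $L = \{\langle p, a^{n+1}, b^n \rangle, n \in \mathbb{N}\} \subseteq post^*(C)$.
Example run: $p\,ababa \Rightarrow_{R_a R_b} ab\,p\,aba \Rightarrow_{C} a\,p\,abba \Rightarrow_{R_a R_b R_b} aabb\,p\,a \Rightarrow_{C\,C} aa\,p\,abb \Rightarrow_{R_a} aaa\,p\,bb \in L$
Rules: $(R_a)$: $(p, a) \to (p, \varepsilon)$; $(R_b)$: $(p, b) \to (p, \varepsilon)$; $(C)$: $(p, a) \to (p, ab)$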

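A Constraint on post*
For any reachable configuration $\langle p, w_u, w_l \rangle$ and the word $w = \overline{w_u} w_l$, the inequality $|w|_b + |w|_{\bar{b}} + 1 \ge |w|_a + |w|_{\bar{a}}$ holds.
The inequality holds on the starting set of configurations $C = \{p\} \times \{\varepsilon\} \times a(ba)^*$.
The rules $(R_a) = (p, a) \to (p, \varepsilon)$ and $(R_b) = (p, b) \to (p, \varepsilon)$ change neither $|w|_a + |w|_{\bar{a}}$ nor $|w|_b + |w|_{\bar{b}}$: they only move a letter from the lower stack to the upper stack.
The rule $(C) = (p, a) \to (p, ab)$ can only make $|w|_a + |w|_{\bar{a}}$ smaller, and never decreases $|w|_b + |w|_{\bar{b}}$.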
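The constraint can be checked experimentally on a bounded part of the reachability set. The sketch below explores the configurations reachable from one element of the starting set in at most a fixed number of steps. The function names are hypothetical, the single control state p is left implicit, and a push on an empty upper stack is assumed to leave it empty.

```python
def successors(upper, lower):
    """One-step successors under (R_a), (R_b) and (C).

    The top of the upper stack is the last character of `upper`,
    the top of the lower stack is the first character of `lower`.
    """
    if not lower:
        return []
    top, rest = lower[0], lower[1:]
    result = [(upper + top, rest)]  # (R_a) or (R_b): pop the top letter
    if top == "a":
        # (C): push, overwriting the top of the upper stack
        # (assumption: an empty upper stack stays empty).
        result.append((upper[:-1], "ab" + rest))
    return result


def reachable(start, max_steps):
    """Configurations reachable from `start` in at most `max_steps` steps."""
    seen, frontier = {start}, {start}
    for _ in range(max_steps):
        frontier = {s for c in frontier for s in successors(*c)} - seen
        seen |= frontier
    return seen


def satisfies_constraint(upper, lower):
    # Barred and unbarred letters are counted together, as in the inequality.
    w = upper + lower
    return w.count("b") + 1 >= w.count("a")


configs = reachable(("", "ababa"), max_steps=10)
constraint_holds = all(satisfies_constraint(u, l) for (u, l) in configs)
member_of_L_reached = ("aaa", "bb") in configs
```

Both `constraint_holds` and `member_of_L_reached` evaluate to true here. A bounded exploration is only a sanity check, not a proof: the inductive argument on the slide is what establishes the inequality for every reachable configuration.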

Applying the Pumping Lemma
If we suppose that $post^*(C)$ is regular, let $k$ be its pumping length. We consider the word $w = \bar{a}^{k+1} b^k$ of the language $L$.
We apply the pumping lemma to $w$: $w = xyz$, $|xy| \le k$, $|y| \ge 1$, and $xy^iz \in post^*(C)$, $\forall i \ge 1$, with $x \in \bar{a}^*$, $y \in \bar{a}^+$ and $z \in (\bar{a} + b)^*$.
For $i$ large enough, $w_i = xy^iz \in post^*(C)$ and $|w_i|_{\bar{a}} > |w_i|_b + 1$.
This contradicts the constraint on $post^*(C)$, so $post^*(C)$ is not regular.
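To make "large enough" concrete, one can write $y = \bar{a}^j$ with $1 \le j \le k$ (the name $j$ is introduced here only for illustration). The following computation is a sketch of the counting step under that notation:

```latex
\begin{align*}
w_i &= x y^i z = \bar{a}^{\,k+1+(i-1)j}\, b^k, \\
|w_i|_{\bar{a}} &= k + 1 + (i-1)j, \qquad |w_i|_b + 1 = k + 1, \\
|w_i|_{\bar{a}} &> |w_i|_b + 1 \quad \text{for every } i \ge 2 .
\end{align*}
```

Since $w_i$ contains neither $a$ nor $\bar{b}$, the constraint would require $|w_i|_b + 1 \ge |w_i|_{\bar{a}}$, which fails as soon as $i \ge 2$.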

What About pre*?
We use a similar proof. We consider the UPDS $\mathcal{P}$ with the rules:
$(C_0)$: $(p, c) \to (p, ab)$
$(C_1)$: $(p, c) \to (p, cb)$
$(R_a)$: $(p, a) \to (p, \varepsilon)$
$(R_b)$: $(p, b) \to (p, \varepsilon)$
and the regular set $C = \{p\} \times (ab)^* \times \{c\}$.
We then prove that $L = \{\langle p, b^n, c^n c \rangle, n \in \mathbb{N}\}$ is a subset of $pre^*(C)$, and that the inequality $|w_u|_a + |w_l|_a \le n$ holds if $\langle p, b^m, c^n \rangle \Rightarrow^* \langle p, w_u, w_l \rangle$.
If we suppose that $pre^*(C)$ is regular, by applying the pumping lemma to a word of $L$, we can find a word in $pre^*(C)$ such that the inequality does not hold. Hence, $pre^*(C)$ is not regular.

post* Is Context-Sensitive
For a given UPDS $\mathcal{P}$, we can define a context-sensitive grammar $G$ whose language is $post^*$. A configuration $\langle p, w_u, w_l \rangle$ is represented by the word $w_u\,p\,w_l$, to which context-sensitive rules are applied. We can simulate a pop rule $\delta : (p, a) \to (p', \varepsilon)$ with the following sequence of grammar rules:
$(r_0^\delta)$: $pa \to p\delta$
$(r_1^\delta)$: $p\delta \to a\delta$
$(r_f^\delta)$: $a\delta \to ap'$
The push and switch cases are similar. $post^*$ is therefore context-sensitive, hence membership in it is decidable.
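Chaining the three grammar rules on the word encoding a configuration shows why they simulate the pop rule. The derivation below is a sketch that treats $\delta$ as an auxiliary grammar symbol, as the rules suggest:

```latex
\[
w_u\, p\, a\, w_l
\;\Rightarrow_{r_0^\delta}\; w_u\, p\, \delta\, w_l
\;\Rightarrow_{r_1^\delta}\; w_u\, a\, \delta\, w_l
\;\Rightarrow_{r_f^\delta}\; w_u\, a\, p'\, w_l
\]
```

Each step rewrites a single symbol in the context of its neighbour, so no rule shortens the word, and the final word encodes the configuration $\langle p', w_u a, w_l \rangle$ expected after the pop.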

Runs and the Upper Stack
The set of runs of a UPDS, being similar to a PDS's, is context-free. But what if this set is regular?
Theorem. For a UPDS $\mathcal{P} = (P, \Gamma, \Delta)$, a regular set of configurations $C$, and a regular set of runs $R$ of $\mathcal{P}$ from $C$, the set of upper stack configurations reachable using runs in $R$ is regular and effectively computable.
Using a finite automaton $\mathcal{A}$ of runs, we compute an upper stack automaton sharing the same states and whose edges are defined according to saturation rules.

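The Pop Saturation Rule
$(S_{pop})$: for each edge $q_0 \xrightarrow{\delta_{pop}} q_1$ of the run automaton $\mathcal{A}$ with $\delta_{pop} = (p, a) \to (p', \varepsilon)$, add the edge $q_0 \xrightarrow{a} q_1$ to the upper stack automaton.
Figure 30: The run automaton $\mathcal{A}$ ($q_0 \xrightarrow{\delta_{pop}} q_1$)
Figure 31: The upper stack automaton ($(q_0, p) \xrightarrow{a} (q_1, p')$)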

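The Switch Saturation Rule
$(S_{switch})$: for each edge $q_0 \xrightarrow{\delta_{switch}} q_1$ of the run automaton $\mathcal{A}$ with $\delta_{switch} = (p, a) \to (p', b)$, add the edge $q_0 \xrightarrow{\varepsilon} q_1$ to the upper stack automaton.
Figure 32: The run automaton $\mathcal{A}$ ($q_0 \xrightarrow{\delta_{switch}} q_1$)
Figure 33: The upper stack automaton ($(q_0, p) \xrightarrow{\varepsilon} (q_1, p')$)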

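The Push Saturation Rule
$(S_{push})$: for each edge $q_0 \xrightarrow{\delta_{push}} q_1$ of the run automaton $\mathcal{A}$ with $\delta_{push} = (p, a) \to (p', bc)$, and for each state $q$ such that either $q \xrightarrow{x} q_0$ in the upper stack automaton with $x$ a letter, or $q$ is an initial state and $q \xrightarrow{\varepsilon} q_0$, add the edge $q \xrightarrow{\varepsilon} q_1$ to the upper stack automaton.
Figure 34: The run automaton $\mathcal{A}$ ($q_2 \xrightarrow{\delta_{push}} q_3$)
Figure 35: The upper stack automaton (states $q_0$, $q_1$, $q_2$ and $q_3$, each paired with a control state)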
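The three saturation rules can be sketched as a fixpoint computation. This is one possible reading of the rules, not the authors' implementation, and it rests on assumptions that are not in the slides: the upper stack is taken to be empty at the start of every run, edges are matched up to epsilon-closure (a letter edge followed by any number of epsilon-edges), and all function names and the edge encoding `(source, kind, letter, target)` are hypothetical.

```python
from collections import defaultdict


def eps_closure(states, eps):
    """All states reachable from `states` through epsilon-edges only."""
    seen, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for r in eps[q]:
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen


def saturate(run_edges, initial):
    """Build the edges of the upper stack automaton from the run automaton."""
    letter = set()          # (q, x, r) stands for the edge q -x-> r
    eps = defaultdict(set)  # eps[q] is the set of r such that q -eps-> r
    for (q0, kind, a, q1) in run_edges:
        if kind == "pop":       # (S_pop): the popped letter joins the upper stack
            letter.add((q0, a, q1))
        elif kind == "switch":  # (S_switch): the upper stack is unchanged
            eps[q0].add(q1)
    changed = True
    while changed:              # (S_push): iterate until no edge is added
        changed = False
        for (q0, kind, _a, q1) in run_edges:
            if kind != "push":
                continue
            # States q with q -x-> r and r reaching q0 by epsilon-edges:
            # the push overwrites x, so q1 is reached with the word read up to q.
            sources = {q for (q, _x, r) in letter if q0 in eps_closure({r}, eps)}
            # Initial states reaching q0 with an empty upper stack.
            sources |= {q for q in initial if q0 in eps_closure({q}, eps)}
            for q in sources:
                if q1 not in eps[q]:
                    eps[q].add(q1)
                    changed = True
    return letter, eps


def accepts(word, letter, eps, initial, final):
    """Does the upper stack automaton accept `word` (bottom of the stack first)?"""
    current = eps_closure(initial, eps)
    for x in word:
        moved = {r for (q, y, r) in letter if q in current and y == x}
        current = eps_closure(moved, eps)
    return bool(current & set(final))


# Hypothetical run automaton: any number of "pop a", then one push, then "pop b".
run_edges = [(0, "pop", "a", 0), (0, "push", None, 1), (1, "pop", "b", 2)]
letter, eps = saturate(run_edges, initial={0})
```

On this example, a run made of n pops of a, one push and one pop of b leaves the upper stack holding n - 1 letters a (or none when n is 0) followed by b, and the saturated automaton accepts exactly the words of that shape.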

An Example I
Figure: A run automaton with initial state $q_0$ and states $q_1$, $q_2$, $q_3$; its edges are labelled push a, pop a, pop a, pop b, and switch a to b.

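An Example II
Figure: An upper stack automaton with initial state $q_0$ and states $q_1$, $q_2$, $q_3$; its edges are labelled $a$, $b$ and $\varepsilon$.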

Computing a Regular Overapproximation of post*
1. Compute a regular overapproximation of the set of runs of the PDS $\mathcal{P}$ from $C$;
2. Compute the set of upper stack configurations reachable using overapproximated runs of $\mathcal{P}$;
3. Compute the exact set of reachable lower stack configurations;
4. Combine the upper and lower stack sets to create an overapproximation of $post^*(\mathcal{P}, C)$.

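Using an Overapproximation
An overapproximation $O$ of $post^*(C)$ can be used to prove safety properties regarding a regular set $X$ of forbidden configurations: since $post^*(C) \subseteq O$, if $O \cap X = \emptyset$ then $post^*(C) \cap X = \emptyset$.
Figure: The sets $O$, $X$ and $post^*(C)$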

Bounded-Phase Analysis
A phase is a part of a run during which either pop or push rules are forbidden. We want to compute configurations reachable within a bounded number of phases. This method was first applied to Multi-Stack Pushdown Systems (MPDSs) by La Torre et al. (LICS'07), and it has been proven by Anil Seth (CAV'10) that the set of reachable predecessors given a bounded number of phases is regular.

From a UPDS to a 2-MPDS
Figure: The UPDS configuration $x\,y\,p\,a\,b\,c$ and its two-stack encoding (state $p$, first stack $y\,x$ followed by a bottom symbol, second stack $a\,b\,c$)
A UPDS can be simulated by an MPDS with two stacks. The second stack of the MPDS is similar to the lower stack. The first stack is a mirrored upper stack followed by a symbol that cannot be popped and is used to detect when the end of the stack has been reached. We use bounded-phase analysis to underapproximate $pre^*$.

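Using an Underapproximation
An underapproximation $U$ of $pre^*(X)$, where $X$ is a regular set of forbidden configurations, can be used to detect forbidden behaviours: if $U \cap C \ne \emptyset$, then some configuration of $C$ can reach $X$.
Figure: The sets $pre^*(X)$, $U$ and $C$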

Application 1: Stack Overflow Detection
We put a special symbol on top of an upper stack of bounded height $m$.
Figure 36: Using a special symbol to bound the upper stack (the special symbol, then # repeated $m$ times, then the lower stack a b ...)
If the symbol is overwritten, a stack overflow occurs.
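The detection idea can be sketched by counting how many pushes are needed before the bounding symbol is overwritten. The names below are hypothetical, the character `!` is only a stand-in for the bounding symbol (which the slide text does not preserve), and pushes are assumed to overwrite the top of the upper stack as in the push semantics.

```python
MARKER = "!"  # hypothetical stand-in for the bounding symbol


def push(upper, lower, symbol):
    """Push `symbol`: it overwrites the cell on top of the upper stack.

    The top of the upper stack is the last character of `upper`.
    Returns the new stacks and whether the marker was overwritten.
    """
    overwritten = upper[-1:]  # empty string if the upper stack is empty
    return upper[:-1], symbol + lower, overwritten == MARKER


def pushes_until_overflow(m):
    """Number of pushes performed when the marker gets overwritten."""
    upper, lower = MARKER + "#" * m, "ab"
    count = 0
    while True:
        upper, lower, overflow = push(upper, lower, "x")
        count += 1
        if overflow:
            return count


overflow_after = pushes_until_overflow(3)
```

With `m = 3`, the first three pushes consume the # cells and the fourth one overwrites the marker, so checking whether the marker can be overwritten amounts to a reachability question on the UPDS.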

Application 2: Reading the Upper Stack
A register is assigned a value located in the upper stack.
Figure 37: The stack being read (... 1 2 3 4 5 6 7 ..., read at offset -8 from the stack pointer)
The instruction mov eax, [esp - 8] copies into the register eax the second symbol above the stack pointer. We can approximate this value.

Application 3: Changing the Stack Pointer
Changing the stack pointer leads to a new stack configuration.
Figure 38: The original stack (... 1 2 3 4 5 6 7 ...)
Figure 39: After changing the stack pointer (... 1 2 3 4 5 6 7 ...)
We can approximate this new stack configuration.

Conclusion
We defined a new automaton model, called UPDS, to capture advanced stack properties. We analyzed the forward and backward reachability sets of UPDSs. We can either underapproximate or overapproximate these sets. We have shown some potential applications of this model.

Thank you!