(Co-)algebraic Automata Theory

(Co-)algebraic Automata Theory Sergey Goncharov, Stefan Milius, Alexandra Silva Erlangen, 5.11.2013 Chair 8 (Theoretical Computer Science)

Short Histrory of Coalgebraic Invasion to Automata Theory Deterministic automata as coalgebras [Rutten, 1998]. Generalized regular expressions and Kleene s theorem for (Kripke-)polinomial functors [Silva, 2010]. Generalized powerset construction [Silva et al., 2010]. Regular expressions for equationally presented functors and monads [Myers, 2013]. Context-free languages coalgebraically [Winter et al., 2013]. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 2

Base Case: Moore Automata Moore automaton with input alphabet A and output alphabet B is given by t m : X A X (transition) and o m : X B (output) Put differently, a Moore automaton is a coalgebra m : X B X A on Set. A final L A,B -coalgebra is carried by the set B A of formal power series on B. Its coalgebra structure ι : B A B (B A ) A is given by o ι (σ : A B) = σ(ɛ) and t ι (σ : A B, a) = λw.σ(a w). Equivalently, the L A,B -coalgebra structure can be given in terms if derivatives: given w A, ɛ (σ) = o ι (σ) and a w (σ) = t ι ( w (σ), a) Particular important case is B = 2. Then B A P(A ) is the set of all formal languages over A. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 3

L A,B X := B X A Base Case: Moore Automata Moore automaton with input alphabet A and output alphabet B is given by t m : X A X (transition) and o m : X B (output) Put differently, a Moore automaton is a coalgebra m : X B X A on Set. A final L A,B -coalgebra is carried by the set B A of formal power series on B. Its coalgebra structure ι : B A B (B A ) A is given by o ι (σ : A B) = σ(ɛ) and t ι (σ : A B, a) = λw.σ(a w). Equivalently, the L A,B -coalgebra structure can be given in terms if derivatives: given w A, ɛ (σ) = o ι (σ) and a w (σ) = t ι ( w (σ), a) Particular important case is B = 2. Then B A P(A ) is the set of all formal languages over A. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 4

L A,B X := B X A We shall assume finite Base Case: Moore Automata Moore automaton with input alphabet A and output alphabet B is given by t m : X A X (transition) and o m : X B (output) Put differently, a Moore automaton is a coalgebra m : X B X A on Set. A final L A,B -coalgebra is carried by the set B A of formal power series on B. Its coalgebra structure ι : B A B (B A ) A is given by o ι (σ : A B) = σ(ɛ) and t ι (σ : A B, a) = λw.σ(a w). Equivalently, the L A,B -coalgebra structure can be given in terms if derivatives: given w A, ɛ (σ) = o ι (σ) and a w (σ) = t ι ( w (σ), a) Particular important case is B = 2. Then B A P(A ) is the set of all formal languages over A. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 5

Rationality of Formal Power Series By the the universal property of the final coalgebra for any m : X B X A there exists a unique L A,B -coalgebra homomorphism m such that: m X m B A B X A B m A B (B A ) A Given x X, x m := m(x) is the language recognized by m at x. A formal power series σ : A B is rational if the set { w (σ) w A } is finite. Theorem. A formal power series σ is rational iff σ = x m for some m : X B X A and x X. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 6 ι

Expressions for Moore Automata Let σ : A B be rational, let A = {a 1,..., a n }, and let {σ 1,..., σ k } = { w (σ) w A }. Then σ = σ 1 can be found as the solution of system of recursive equations σ i = a 1.σ i1 a n.σ in c i, ( ) which should be read: for all i, j, aj (σ i ) = σ ij and σ i (ɛ) = c i. System ( ) uniquely determines the σ 1,..., σ n. Any recursive equation of ( ) can be rewritten as σ i = µσ i. (a 1.σ i1 a n.σ in c i ) where µ is the fixpoint operator binding the occurrences of σ i in the right term. One can successively eliminate σ i (i 1) and obtain a solution σ = t of ( ) in σ where t is a closed term given by γ ::= µx. (a 1.δ a n.δ B) δ ::= X γ Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 7

Expressions for Moore Automata Finite set of variables Let σ : A B be rational, let A = {a 1,..., a n }, and let {σ 1,..., σ k } = { w (σ) w A }. Then σ = σ 1 can be found as the solution of system of recursive equations σ i = a 1.σ i1 a n.σ in c i, ( ) which should be read: for all i, j, aj (σ i ) = σ ij and σ i (ɛ) = c i. System ( ) uniquely determines the σ 1,..., σ n. Any recursive equation of ( ) can be rewritten as σ i = µσ i. (a 1.σ i1 a n.σ in c i ) where µ is the fixpoint operator binding the occurrences of σ i in the right term. One can successively eliminate σ i (i 1) and obtain a solution σ = t of ( ) in σ where t is a closed term given by γ ::= µx. (a 1.δ a n.δ B) δ ::= X γ Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 8

Kleene s Theorem for Moore Automata Expressions γ ::= µx. (a 1.δ a n.δ B) δ ::= X γ faithfully represent all rational formal power series. We can introduce Brzozowski derivatives: given e = µx. (a 1.e 1 a n.e 1 c), let ai (e) = e i [e/x] o(e) = c By induction, let a w = a w. Then, let e (w) = o( w (e)). Brzozowski s theorem: for any expression e, { w (e) w A } is finite. Kleene s theorem: for any Moore automaton m and any state x X there is an expression e such that x m = e and converselly, for any expression e there are m and x such that x m = e. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 9

Expressions for Deterministic FSM: Example B = 2 a a b start q 0 b q 1 a q 2 b e 0 = a.e 0 b.e 1 0 e 1 = a.e 0 b.e 2 0 e 2 = a.e 0 b.e 0 1 Hence: e 0 = µx.(a.x b.µy.(a.x b.µz.(a.x b.x 1) 0) 0) Equivalently: e 0 = (a + ba + b) b. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 10

Powerset Construction Let m : X B (P ω X) A where B is a join-semilattice with bottom. We can extend it to m : m : P ω X B (P ω X) A by defining t m (s P ω X, a) = {t m (x, a) x s} o m (s P ω X) = x s o(x) Let x m = m{x}. m X {--} m P ω X m B A B (P ω X) A id ( m ) A B (B A ) A Theorem: x m is the formal power series accepted by m at state x X. ι Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 11

Equivalently given by t m : X A P ω X and o m : X B Powerset Construction Let m : X B (P ω X) A where B is a join-semilattice with bottom. We can extend it to m : m : P ω X B (P ω X) A by defining t m (s P ω X, a) = {t m (x, a) x s} o m (s P ω X) = x s o(x) Let x m = m{x}. m X {--} m P ω X m B A B (P ω X) A id ( m ) A B (B A ) A Theorem: x m is the formal power series accepted by m at state x X. ι Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 12

Generalized Powerset Construction and T-automata Definition: Let T be a monad. A T-automaton is triple of maps o m : X B, t m : X A TX, a m : TB B where the latter map satisfies the axioms of T-algebra. A T-automaton can be identified as a coalgebra m : X B (TX) A. Let Then t m (p, a) = do x p; t m (x, a) m X η m and we can define x m = η(x) m. o m (p) = a m (do x p; η(o m (x))). TX m B A B (TX) A id ( m ) A B (B A ) A ι Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 13

Algebraic Theories An algebraic theory is given by a signature Σ and a set of equations. Any algebraic theory E defines a monad: TX = set of Σ-terms over X modulo E ; η coerces a variable to a term; do x p; q substitutes each x in p : TX with q(x). Definition: We call a monad T finitely presented if it is generated by a finitely axiomatizable algebraic theory E over finite Σ. Example: Finite powerset monad P ω join semilattices with bottom. Finite state monad TX = (X S) S finite mnemoids (lookup l : X v X, update l,v : X X). Non-Example: Distribution monad (TX = set of finitely supported probability distributions ). Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 14

Delimited Stores Definition: Given a set S, let L be a nonempty meet-semilattice of equivalence relations of finite index on S. An L-delimited store monad is a submonad T of the store monad on S such that: for any L and any r, t : S (X S) TX, there exists L finer than such that t preserves and r sends -equivalent elements to equal. Examples: Stack monad: L consists of equivalences n on Γ identifying any s and s with the same n-prefixes. Dequeue monad: L consists of equivalences n,m on Γ identifying any s and s with the same n-prefixes and m-postfixes. Tape monad: L consists of equivalences n,m on Γ (Γ + 1) Γ identifying any s, x, t and s, x, t for which s and s have the same n-postfixes and t and t have the same m-prefixes. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 15

Delimited Stores Finite tape alphabet Definition: Given a set S, let L be a nonempty meet-semilattice of equivalence relations of finite index on S. An L-delimited store monad is a submonad T of the store monad on S such that: for any L and any r, t : S (X S) TX, there exists L finer than such that t preserves and r sends -equivalent elements to equal. Examples: Stack monad: L consists of equivalences n on Γ identifying any s and s with the same n-prefixes. Dequeue monad: L consists of equivalences n,m on Γ identifying any s and s with the same n-prefixes and m-postfixes. Tape monad: L consists of equivalences n,m on Γ (Γ + 1) Γ identifying any s, x, t and s, x, t for which s and s have the same n-postfixes and t and t have the same m-prefixes. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 16

Stack Automata For a stack monad T, TX (X Γ ) Γ consists of those state transformers r, t : Γ (X Γ ), for which r(u w) = r(u) t(u w) = t(u) w whenever u n for some n. The stack monad is finitely presentable w.r.t. pop : X Γ+1 X and push i : X X (i Γ ): push i (pop(x 1,..., x n, y)) = x i pop(push 1 (x),..., push n (x), x) = x pop(x 1,..., x n, pop(y 1,..., y n, z)) = pop(x 1,..., x n, z) For a deterministic stack automaton m : X 2 L[Γ ] TX a m is the w. p. transformer: for any s S and r, t : (2 L[Γ ] Γ ) Γ T (2 L[Γ ] ) a m (f )(s) = 1 iff s Γ. s t 1 (s ) r(s)(s ) = 1 Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 17

Stack Automata p 2 L[Γ ] iff it is stable under some n L For a stack monad T, TX (X Γ ) Γ consists of those state transformers r, t : Γ (X Γ ), for which r(u w) = r(u) t(u w) = t(u) w whenever u n for some n. The stack monad is finitely presentable w.r.t. pop : X Γ+1 X and push i : X X (i Γ ): push i (pop(x 1,..., x n, y)) = x i pop(push 1 (x),..., push n (x), x) = x pop(x 1,..., x n, pop(y 1,..., y n, z)) = pop(x 1,..., x n, z) For a deterministic stack automaton m : X 2 L[Γ ] TX a m is the w. p. transformer: for any s S and r, t : (2 L[Γ ] Γ ) Γ T (2 L[Γ ] ) a m (f )(s) = 1 iff s Γ. s t 1 (s ) r(s)(s ) = 1 Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 18

(Deterministic) Real-time CFL Deterministic real-time context-free languages are exactly those, which are recognized by deterministic real-time (with no internal transitions) push-down automata. Theorem. Let m be a stack automaton. Then for any x and any w Γ, x m (w) is a deterministic real-time CFL; and conversely: any deterministic real-time CFL can be obtained in this way. Conjecture. Real-time CFL (=proper CFL) are recognized exactly by non-deterministic stack automata. That is, automata over TX P ω (X Γ ) Γ, such that r, t : Γ P ω (X Γ ) TX iff r(u w) = r(u) whenever u n for some n. t(u w) = {u w u t(u)} Theorem. Given x X and m : X B TX A with finite B, x m is rational. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 19

(Deterministic) Real-time CFL T = P ω stack monad over Γ Deterministic real-time context-free languages are exactly those, which are recognized by deterministic real-time (with no internal transitions) push-down automata. Theorem. Let m be a stack automaton. Then for any x and any w Γ, x m (w) is a deterministic real-time CFL; and conversely: any deterministic real-time CFL can be obtained in this way. Conjecture. Real-time CFL (=proper CFL) are recognized exactly by non-deterministic stack automata. That is, automata over TX P ω (X Γ ) Γ, such that r, t : Γ P ω (X Γ ) TX iff r(u w) = r(u) whenever u n for some n. t(u w) = {u w u t(u)} Theorem. Given x X and m : X B TX A with finite B, x m is rational. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 20

Tensors Given two algebraic theories E 1 and E 2 their tensor E 1 E 2 is obtained by joining operations and equations and forcing tensor laws of the form f (g(x 1 1,..., x 1 m),..., g(x n 1,..., x n m)) = g(f (x 1 1,..., x n 1),..., f (x 1 m,..., x n m)) for any f : X n X from E 1 and any g : X m X from E 2. This induces tensor of monads; and if T 1 and T 2 are finitely presentable then so is T 1 T 2. Properties: T ( S) S yields state monad transformer T (X S) S. Tape monad is the tensor of stack monad with itself. Finite powermonad T P ω can be axiomatized in terms of unary operations, + and 0 due to tensor law f (x 1,..., x n ) = f (x 1,..., 0) +... + f (0,..., x n ) Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 21

Effectful Fixpoint Expressions Let E be an algebraic theory over signature Σ corresponding to T. Given a finite A = {a 1,..., a n } and a T-algebra B effectful fixpoint expressions E B are closed terms γ given by the following grammar γ ::= µx. (a 1.δ a n.δ B) δ ::= X γ f (δ,..., δ) (f Σ) These form a Σ-algebra: given e k = µx. ( a 1.e k 1 a n.e k n bk), f (e 1,..., e m ) = µx. ( a 1.f (e 1 1,..., e m 1 ) a n.f (e 1 n,..., e m n ) f (b 1,..., b m ) ) but also a L A,B -coalgebra: o(µx. (a 1.e 1 a n.e n b)) = b ai (µx. (a 1.e 1 a n.e n b)) = e i [e/x] Let e e iff for all w A, o( w (e)) = o( w (e )). Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 22

Kleene s Theorem for Effectful Fixpoint Expressions Given e E T,B and w A, let e (w) = o( w (e)). Conjecture (Kleene s Theorem): for any T-automaton with a finitely presentable T and any x X there is e E T,B such that x m = e. Conversely, for any e E T,B there is m and x X, such that x m = e. Conjecture: is properly Π 0 1 (by reduction to CFL). Conjecture: Let T be the tensor product of n instances of the stack monad and let m be a delimited store monad over T. Then deciding if w e m is linear time. Deciding if w e m with T replaced by T P ω is nondeterministic linear time. Open Question: Does it generalize to any delimited store monad? Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 23

Push-down Expressions Push-down expressions are effectful fixpoint expressions for nondeterministic stack automata. Concretely, γ ::= 0 µx. ξ π ξ ::= a.δ a.δ + ξ a.δ + π π ::= pop s (π) empty(0) empty(1) δ ::= X γ pop s (δ) push s (δ) empty(δ) Example: Consider CFG (in Greibach normal form): X axx X b This corresponds to expression µx. (a.push c (x) + b.pop c (x) + b.empty(1)) Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 24

Thank You for Your Attention!

Myers, R. (2013). Rational Coalgebraic Machines in Varieties: Languages, Completeness and Automatic Proofs. PhD thesis, Imperial College London. Rutten, J. J. M. M. (1998). Automata and coinduction (an exercise in coalgebra). In Sangiorgi, D. and de Simone, R., editors, CONCUR, volume 1466 of Lecture Notes in Computer Science, pages 194 218. Springer. Silva, A. (2010). Kleene coalgebra. PhD thesis, Radboud Universiteit Nijmegen. Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 26

Silva, A., Bonchi, F., Bonsangue, M. M., and Rutten, J. J. M. M. (2010). Generalizing the powerset construction, coalgebraically. In FSTTCS, volume 8 of LIPIcs, pages 272 283. Winter, J., Bonsangue, M. M., and Rutten, J. J. M. M. (2013). Coalgebraic characterizations of context-free languages. Logical Methods in Computer Science, 9, abs/1308.1228(3). Erlangen, 5.11.2013 Sergey Goncharov, Stefan Milius, Alexandra Silva Chair 8 (Theoretical Computer Science) 27