Learning of Bi-ω Languages from Factors

Similar documents
Learning k-edge Deterministic Finite Automata in the Framework of Active Learning

Inference of ω-languages from Prefixes

Learning Regular ω-languages

Learning Regular ω-languages

Nondeterministic Finite Automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

On the Accepting Power of 2-Tape Büchi Automata

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

Lecture 3: Nondeterministic Finite Automata

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Theory of Computation

Theory of Computation

On Recognizable Languages of Infinite Pictures

Büchi Automata and their closure properties. - Ajith S and Ankit Kumar

September 11, Second Part of Regular Expressions Equivalence with Finite Aut

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

CPSC 421: Tutorial #1

Induction of Non-Deterministic Finite Automata on Supercomputers

Computational Models - Lecture 1 1

We define the multi-step transition function T : S Σ S as follows. 1. For any s S, T (s,λ) = s. 2. For any s S, x Σ and a Σ,

Closure under the Regular Operations

Notes on State Minimization

Büchi Automata and Their Determinization

Theory of computation: initial remarks (Chapter 11)

On Recognizable Languages of Infinite Pictures

DM17. Beregnelighed. Jacob Aae Mikkelsen

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2017

Uncountable Automatic Classes and Learning

Computer Sciences Department

Closure under the Regular Operations

Computational Models #1

Name: Student ID: Instructions:

CSci 311, Models of Computation Chapter 4 Properties of Regular Languages

Hierarchy among Automata on Linear Orderings

Context-Free Languages (Pre Lecture)

What we have done so far

COM364 Automata Theory Lecture Note 2 - Nondeterminism

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Chapter Five: Nondeterministic Finite Automata

Final exam study sheet for CS3719 Turing machines and decidability.

Advanced Automata Theory 7 Automatic Functions

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Finite Automata. Seungjin Choi

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

Chapter Two: Finite Automata

CSE 105 THEORY OF COMPUTATION. Spring 2018 review class

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Lecture 17: Language Recognition

Monadic Second Order Logic and Automata on Infinite Words: Büchi s Theorem

Equivalence of DFAs and NFAs

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

ECS 120: Theory of Computation UC Davis Phillip Rogaway February 16, Midterm Exam

Computability and Complexity

Theory of Computation (IX) Yijia Chen Fudan University

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Enhancing Active Automata Learning by a User Log Based Metric

Timed Watson-Crick Online Tessellation Automaton

Uses of finite automata

COMP4141 Theory of Computation

Foundations of Informatics: a Bridging Course

Tree Automata and Rewriting

Chapter 2: Finite Automata

Finite Automata and Regular Languages

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

Learning Residual Finite-State Automata Using Observation Tables

3515ICT: Theory of Computation. Regular languages

Comment: The induction is always on some parameter, and the basis case is always an integer or set of integers.

2. Elements of the Theory of Computation, Lewis and Papadimitrou,

Context Free Languages. Automata Theory and Formal Grammars: Lecture 6. Languages That Are Not Regular. Non-Regular Languages

CS 455/555: Finite automata

Subsumption of concepts in FL 0 for (cyclic) terminologies with respect to descriptive semantics is PSPACE-complete.

UNIT-VIII COMPUTABILITY THEORY

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

Time Magazine (1984)

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application

Aperiodic languages p. 1/34. Aperiodic languages. Verimag, Grenoble

CS 154. Finite Automata, Nondeterminism, Regular Expressions

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

Fooling Sets and. Lecture 5

Sri vidya college of engineering and technology

CS 154 Introduction to Automata and Complexity Theory

CSE 105 THEORY OF COMPUTATION

Ogden s Lemma. and Formal Languages. Automata Theory CS 573. The proof is similar but more fussy. than the proof of the PL4CFL.

Languages. A language is a set of strings. String: A sequence of letters. Examples: cat, dog, house, Defined over an alphabet:

Theoretical Computer Science

Automata Theory and Formal Grammars: Lecture 1

Pushdown Automata. We have seen examples of context-free languages that are not regular, and hence can not be recognized by finite automata.

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION

Languages, regular languages, finite automata

CSCE 551 Final Exam, Spring 2004 Answer Key

arxiv: v2 [cs.fl] 29 Nov 2013

Transcription:

JMLR: Workshop and Conference Proceedings 21:139 144, 2012 The 11th ICGI Learning of Bi-ω Languages from Factors M. Jayasrirani mjayasrirani@yahoo.co.in Arignar Anna Government Arts College, Walajapet D.G. Thomas Madras Christian College, Chennai - 600 059 M.H. Begam Arignar Anna Government Arts College, Walajapet J.D. Emerald Arignar Anna Government Arts College, Walajapet dgthomasmcc@yahoo.com humrosia begam@yahoo.co.in emerald sheela@yahoo.com Editors: Jeffrey Heinz, Colin de la Higuera and Tim Oates Abstract De la Higuera and Janodet (2001) gave a polynomial algorithm that identifies the class of safe ω-languages which is a subclass of deterministic ω-languages from positive and negative prefixes. As an extension of this work we study the learning of the family of bi-ω languages. Keywords: DB-machine, bi-ω language, learning. 1. Introduction Nivat (1979) has made a study of infinite words and infinite successful computations in an attempt to define the semantics of recursive programs. He has considered a method of generating infinite words by algebraic or context-free grammars. Bi-infinite words or two sided infinite words are natural extensions of infinite words and have also been objects of interest and study. The theory of finite automata has been extended to bi-infinite words by Nivat and Perrin (1986) and the study has been continued (Beauquier, 1986; Devolder and Litovsky, 1991). Maler and Pnueli (1995) have exhibited a learning algorithm for a subclass of ω-regular languages which is recognized by a finite automaton with Büchi condition and is also recognized by a finite automaton with Muller condition from membership queries and counter examples based on the framework suggested by Angluin (1987). A linear time learning algorithm in Gold s framework of identification in the limit from positive data (Gold, 1967) is given for a subclass of ω-regular languages with restricted superset queries by Saoudi and Yokomori (1994). S. Gnanasekaran and Thomas (2001) have introduced a new class called Büchi local ω- languages and proved that the class of ω-regular languages is an alphabetic morphic image of the class of Büchi local ω-languages. They gave a polynomial algorithm that identifies the class of ω-regular languages. Thomas et al. Thomas et al. (2002) extended this result to bi-ω regular languages. de la Higuera and Janodet (2001) investigated the question of inferring ω-languages through the prefixes of accepted or rejected infinite words. They gave a polynomial alc 2012 M. Jayasrirani, D. Thomas, M. Begam & J. Emerald.

Jayasrirani Thomas Begam Emerald gorithm that identifies the class of safe ω-languages which is a subclass of deterministic ω-languages from positive and negative prefixes. As an extension of this work, in this paper, we exhibit that the classes of bi-ω regular languages and deterministic bi-ω languages are not identifiable in the limit using factors of accepted and rejected bi-ω words and the subclass of bi-ω regular languages called bi-ω safe languages are learnable from positive and negative factors of bi-ω words. 2. Bi-ω Languages and Results In this section we recall the notions of bi-ω languages (Beauquier, 1986; Nivat and Perrin, 1986) and prove certain results. An alphabet Σ is a finite set whose members are called symbols. A finite string or word on an alphabet Σ is a finite sequence of zero or more symbols of Σ. The set of all words on Σ is denoted by Σ and ɛ denotes the empty word which has length zero. If x, y Σ then x is a prefix of xy and y is a suffix of xy. If x, y, z Σ then y is a subword or factor of xyz. For each x Σ, F act(x) is the collection of all factors of x and F act(l) = x L F act(x). An infinite word or a right-infinite word or an ω-word on an alphabet Σ is a mapping from the set N of natural numbers to Σ. An infinite word x on Σ is written as x = x 1 x 2 x 3... (x i Σ, i 1). Let left-infinite word x on an alphabet Σ is mapping from the set Z of negative integers to Σ and is written as x =... x 3 x 2 x 1 (x i Σ, i 1). Let Σ Z denote the set of all mappings from the set of integers into Σ. A bi-infinite word is a class under the equivalence relation ρ over Σ Z, defined by uρv, where u = (u i ) i Z and v = (v j ) j Z, if and only if there exists some p Z such that for every n Z, u n+p = v n. A bi-infinite word is written as x =... x 3 x 2 x 1 x 0 x 1 x 2 x 3... (x i Σ, i Z). The set of infinite words on Σ is denoted by Σ ω, the set of all left-infinite words on Σ is denoted by ω Σ and set of all bi-infinite words on Σ is denoted by ω Σ ω. Definition 1 A Büchi (finite) automaton M over an alphabet Σ is M = Q, Σ, E M, I linf, T rinf where Q is a finite set of states; Σ is a finite alphabet; E M is a subset of Q Σ Q, called the set of arrows; I linf Q is a set of left-infinite repetitive states; T rinf Q is a set of right-infinite repetitive states. An arrow c = (p, a, q) is also denoted by c : p a q. A path in M is a bi-infinite sequence of arrows c i, i Z, namely c =... c 2 c 1 c 0 c 1 c 2.... The arrows in the paths a i+1 are consecutive in the sense that c i+1 : q i q i+1 where i Z. The bi-infinite word w a =... a 2 a 1 a 0 a 1 a 2... is called the label of the path c. A notation c : I linf Trinf denotes a bi-infinite path c of label w which passes through states of I linf infinitely often on the left and through states of T rinf infinitely often on the right. Let L ωω (M) = {w ω Σ ω / (c : I linf w Trinf )} be a bi-ω language accepted by M. Let Rec( ω Σ ω ) = {L ω Σ ω / a Büchi automaton M such that L ωω (M) = L} and Rec ω Σ ω is known as the family of all bi-ω regular languages each of which is a finite union of sets of the form ω XY Z ω = {u ω Σ ω /u =... x n... x 2 x 1 yz 1 z 2... z n 1 z n..., with x i X, y Y and z i Z, i 1} 140

Learning of Bi-ω Languages from Factors where X, Y and Z are regular languages in Σ. The Büchi automaton M over Σ is said to be deterministic if for every three states q, q 1, q 2 Q and every symbol a Σ, (q, a, q 1 ) E M and (q, a, q 2 ) E M q 1 = q 2. Definition 2 A bi-ω language L is safe if x ω Σ ω, v F act(x), u ω Σ, w Σ ω such that uvw L x L i.e., x ω Σ ω, F act(x) F act(l) x L i.e., x ω Σ ω, x L v F act(x) such that u ω Σ ω, w Σ ω, uvw L Let Safe ωω (Σ) denote the class of all safe bi-ω regular languages. Example 1 ω ab ω + ω ba ω + ω ab a ω + ω b ω is a safe language accepted by the automaton M = Q, Σ, δ, I linf, T rinf, Q = {1, 2, 3}, Σ = {a, b}, I linf = {1, 2}, T rinf = {2, 3} and δ is the transition function. a b a b a 1 2 3 But ω ab ω + ω ba ω + ω ab a ω is not a safe language because every factor b k of ω b ω is a factor of ω ab ω + ω ba ω + ω ab a ω + ω b ω. It follows that the class of all safe bi-ω regular languages is a subclass of all deterministic bi-ω languages. Hence Safe ωω (Σ) Det ωω (Σ). Definition 3 A DB-machine is a deterministic Büchi automaton where I linf = Q = T rinf. Definition 4 (Beauquier, 1986) A language L is bi-ω regular if there exist three sequences of regular languages (A i ) i {1,...,n}, (B i ) i {1,...,n} and (C i ) i {1,...,n} such that We note that F act(l) = F act L = ( n n ω A i B i C ω i ω A i B i C ω i. ) = n F act ( ω A i B i Ci ω ) F act ( ω A i B i C ω i ) = Suff(A i )A i B i C i P ref(c i ) Suff(A i )A i P ref(b i ) Suff(B i )C i P ref(c i ) Suff(C i )C i P ref(c i ) Suff(A i )A i P ref(a i ) F act(b i ) where F act(l) is the set of all factors (finite words) of members of L ω Σ ω, P ref(c) is the set of all prefixes (finite words) of members of C Σ and Suff(A) is the set of all suffixes (finite words) of members of A Σ. Definition 5 (de la Higuera and Janodet, 2001) A language P Σ is a regular factor language if (1) P is regular (2) Every factor of a word of P is a word of P i.e., u Σ, a, b Σ, aub P u P (3) Every word of P is a proper factor of another word of P i.e., u P, a, b Σ such that aub P. A deterministic finite state automaton (dfa) is a factor automaton (factor dfa) if and only if (1) Every state is initial and final (2) Every state is alive i.e., q Q, a, b Σ and q Q, δ(q, b) Q and δ(q, a) = q. 141

Jayasrirani Thomas Begam Emerald Proposition 6 1. If L is a bi-ω regular language then F act(l) is a regular factor language. 2. If P is a regular factor language, then there exists a factor automaton which recognizes P. 3. If A = Q, Σ, δ, I, T is a factor automaton then the language L(M) recognized by the DB- machine M = Q, Σ, δ, I linf, T rinf is bi-ω regular and satisfies L(A) = F actl(m). Theorem 7 L is a safe bi-ω regular language if and only if L is recognized by a DBmachine. 3. Learning of Bi-ω Languages In this section, we define positive factors and negative factors of a word in a bi-ω language L and exhibit that bi-ω safe languages are learnable from positive and negative factors of bi-ω words. Definition 8 Let v Σ 1. v is an positive factor of L iff u ω Σ, w Σ ω such that uvw L 2. v is an positive factor of L iff u ω Σ, w Σ ω such that uvw L 3. v is an negative factor of L iff u ω Σ, w Σ ω such that uvw L 4. v is an negative factor of L iff u ω Σ, w Σ ω such that uvw L Given a bi-ω language L, let P (L) denote the set of all -positive factors of L, P (L) denote the set of all positive factors of L, N (L) the set of -negative factors of L and N (L) the set of all -negative factors of L. Two finite sets S+ and S of finite words form together a set of (p, n) examples for a bi-ω language if and only if S+ P p (L) and S N n (L). Without explaining the notions and notations which can be naturally extended for bi-ω languages from the case of ω-languages (de la Higuera and Janodet, 2001), we obtain the following results. Lemma 9 Let L be a class of bi-ω languages and R a class of representations for L. If there exist L 1 and L 2 in L such that L 1 L 2, P p (L 1 ) = P p (L 2 ) and N n (L 1 ) = N n (L 2 ) then the problem < L R, idlim, < p, n >> has a negative status. Theorem 10 For any class of representations R and for all (p, n) {, } {, } problems Reg ωω (Σ) R, idlim, (p, n) and Det ωω (Σ) R, idlim, (p, n) have negative status where idlim stands for identification in the limit. Theorem 11 Safe ωω (Σ) DBM, polyid, (, ) has a positive status where polyid stands for polynomially identifiable in the limit. The result is an extension of the result given in (de la Higuera and Janodet, 2001). 142

Learning of Bi-ω Languages from Factors References D. Angluin. Learning regular sets from queries and counterexamples. Information and Computation, 75:87 106, 1987. D. Beauquier. Thin homogeneous sets of factors. In LNCS, volume 241, pages 239 251, 1986. C. de la Higuera and J.C Janodet. Inference of ω-languages from prefixes. LNAI, 2225: 364 378, 2001. J. Devolder and I. Litovsky. Finitely generated bi-ω-languages. Theoretical Computer Science, 85(1):33 52, 1991. M. Gold. Language identification in the limit. Information and Control, 10:447 474, 1967. O. Maler and A. Pnueli. On the learnability of infinitary regular sets. Information and Computation, 118:316 326, 1995. M. Nivat. Infinite words, infinite trees and infinite computations. 109:1 52, 1979. M. Nivat and D. Perrin. Ensembles reconnaissables de mots bi-infinis. Canadian Journal of Mathematics, 38:513 537, 1986. K.G. Subramanian S. Gnanasekaran, V.R. Dare and D.G. Thomas. Learning ω-regular languages. In Proceedings of the International Symposium on Artificial Intelligence, pages 206 211. Allied Publishers, 2001. A. Saoudi and T. Yokomori. Learning local and recognizable ω-languages and monadic logic programs. In Proceedings of EuroColt 1993, pages 157 169. Oxford University Press, 1994. D.G. Thomas, M.H. Begam, K.G. Subramanian, and S. Gnanasekaran. Learning of regular bi-ω languages. LNAI, 2424:283 292, 2002. Appendix Proof of Proposition 1. 1. L is a bi-ω regular language. So L = n ω A i B i Ci ω where A i, B i, C i are regular sets. Since regular languages are closed under union, product and star, n F act( ω A i B i Ci ω ) is a regular language. Hence F act(l) is regular. 2. Let P be a regular factor language. P is recognized by a dfa A which is minimal and dead state free. As P is a factor language, every state of this automaton is initial and final. Moreover let q 1 be a state of A and u, a word of P, such that δ(q, u) = q 1. By the definition of a factor language, there exist a, b Σ such that aub P. So δ(q, a) = q and δ(q 1, b) is necessarily defined. Hence q is alive. 3. Let A = Q, Σ, δ, I, T be a factor automaton. Consider the corresponding DBmachine M = Q, Σ, δ, I linf, T rinf (where I linf = T rinf = Q). Let us prove that F actl(m) = L(A). Let v F actl(m). 143

Jayasrirani Thomas Begam Emerald Then there exist u ω Σ and w Σ ω such that uvw L(M). So there exists a state q Q such that δ(q, v) Q and hence v L(A). Converesly let v L(A) and q = δ(q, v), q, q Q. As q and q are alive, we can build two words x, u, w, y such that δ(q 0, x) = q 0, δ(q 0, u) = q, δ(q, w) = q and δ(q, y) = y. x y q u v w 0 q q q Clearly the bi-infinite path of label ω xuvwy ω passes through the state q 0 infinitely often on the left and the state q infinitely often on the right. So ω xuvwy ω L(M). Thus v F actl(m). Proof of Theorem 1. Let L be a language recognized by a DB-machine M = Q, Σ, δ, Q, Q and w ω Σ ω. Assume that every factor w i of w can be continued into a word of L recognized by M. The mapping c : N Q such that c(i + 1) = δ(c(i), w i ) is a run of M on w i and hence on w. Since all the states of M are marked, this run is successful and so w L. Hence L is a safe bi-ω regular language. Conversely, let L be a safe bi-ω regular language. By proposition 6, F act(l) is a regular factor language which is recognized by some factor automaton A = Q, Σ, δ, Q, Q. We claim that L is recognized by the DB- machine M = Q, Σ, δ, Q, Q. By Proposition 6, the language L(M) satisfies F actl(m) = L(A). By the first part of the proof, L(M) is a safe language since M is a DB-machine. So L and L(M) are both safe languages such that F act(l) = F actl(m) = L(A). Assume that there exists a word w L and not in L(M) (or vice-versa). As F act(l) = F actl(m), every factor of w is in F actl(m). Since L(M) is a safe language, w itself is in L(M), a contradiction. So L = L(M). Proof of Theorem 2. We prove the theorem by giving counterexample. L 1 = ω aba (a + b) ω, L 2 = ω a ω + ω aba (a + b) ω These languages are accepted by the following automata, respectively. 1) Q = {q 1, q 2 }, I linf = {q 1 }, T rinf = {q 2 } a a,b q b 1 q 2 2) Q = {q 1, q 2 }, I linf = {q 1 }, T rinf = {q 1, q 2 } a q b 1 q 2 a,b But it is clear that whatever the choice of quantifiers p and n, languages P p and N n are identical in both the cases. Formally, P (L 1 ) = P (L 2 ) = Σ ; P (L 1 ) = P (L 2 ) = a ba (a+b) ; N (L 1 ) = N (L 2 ) = a + a ba ; N (L 1 ) = N (L 2 ) = φ. 144