Gold's algorithm. Acknowledgements. Why would this be true? Gold's Algorithm. 1 Key ideas. Strings as states

Transcription:

Gold's algorithm

Acknowledgements
Laurent Miclet, Jose Oncina and Tim Oates for previous versions of these slides. Rafael Carrasco, Paco Casacuberta, Rémi Eyraud, Philippe Ezequel, Henning Fernau, Thierry Murgue, Franck Thollard, Enrique Vidal, Frédéric Tantini, ... The list is necessarily incomplete. Apologies to those who have been forgotten.
http://eurise.univ-st-etienne.fr/~/slides

Gold's Algorithm (Gold 1978)
The algorithm tries to find the minimum DFA compatible with the sample. If we have such an algorithm, we can identify the regular languages.

Why would this be true?
It can be proved that, given all the strings of length up to n, there is only one automaton consistent with the data.

1 Key ideas
- Represent the states of an automaton as strings, prefixes of the strings in the learning set.
- Find some incompatibilities between these prefixes due to separating suffixes.
- Invent the others.

Strings as states
[Figure: automaton whose states are labelled by strings]
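The second key idea, finding prefixes that some suffix separates, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `prefixes` and `separating_suffix` and the sample strings `aab` and `bab` are hypothetical and do not come from the slides.

```python
def prefixes(words):
    """All prefixes (including the empty string) of the given words."""
    return {w[:i] for w in words for i in range(len(w) + 1)}


def separating_suffix(u, v, positive, negative):
    """Return a suffix e such that ue and ve carry different labels, else None.

    If e separates u and v then ue is a labelled string, so it is enough
    to try the remainders of the labelled strings that start with u.
    """
    for w in sorted(positive | negative):
        if w.startswith(u):
            e = w[len(u):]
            if (u + e in positive and v + e in negative) or \
               (u + e in negative and v + e in positive):
                return e
    return None


# Hypothetical sample, chosen only for this illustration.
positive = {"aab"}
negative = {"bab"}

# The prefixes "a" and "b" are separated by the suffix "ab",
# so they cannot be the same state.
print(separating_suffix("a", "b", positive, negative))
```

Prefixes for which no separating suffix exists are the ones the learner is free to merge, which is where the "invent the others" step comes in.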

Incompatible prefixes
X+ = { }, X- = { }
Then clearly there are at least 2 states, one corresponding to [...] and another to [...].
[Figure: automaton with string-labelled states]

2 Observation table

The information is organised in a table OT(S,E) where:
- S ⊆ Σ* is the set of strings/states. Some of them are RED, and some are BLUE. The RED states/strings are those such that S = RED ∪ [RED·Σ \ RED].
- E ⊆ Σ* is the experiment set.
- OT: (RED ∪ BLUE) × E → {0, 1, *} such that:
    OT[u][e] = 1 if ue ∈ L
    OT[u][e] = 0 if ue ∉ L
    OT[u][e] = * otherwise

An observation table
[Figure: example table. Columns: the experiments (E). Upper rows: the states (RED). Lower rows: the transitions (BLUE = RED·Σ \ RED). Meaning of an entry: the string ue is in L, or is not in L.]

Holes
A hole in the table is an entry OT[u][e] = *. We say that the table is complete if it has no holes, i.e. if ∀u ∈ RED ∪ BLUE, ∀e ∈ E, OT[u][e] ∈ {0, 1}.

Rows
Let u ∈ RED ∪ BLUE; OT[u] denotes the row indexed by u.
Given u and v, u and v are consistent for OT if, for every e ∈ E, the entries OT[u][e] and OT[v][e] are equal whenever neither of them is a hole.
Given u and v, u and v are obviously different (OD) for OT if ∃e ∈ E such that (OT[u][e] = 0 and OT[v][e] = 1) or (OT[u][e] = 1 and OT[v][e] = 0).

Example: consistent rows
[Figure: example table] [...] and [...] are consistent; [...] is consistent with both [...] and [...].

Example: obviously different rows
[Figure: example table] [...] and [...] are OD; [...] is OD from all RED rows.

3 About complete tables

Closed table without holes
We say that the table is closed if ∀t ∈ BLUE, ∃s ∈ RED: OT[s] = OT[t].
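The row relations and the closedness test can be written down directly. The following is a minimal sketch, assuming a representation that is not in the slides: each row is a tuple over a fixed ordering of E, a hole is `None`, and the helper names are hypothetical.

```python
HOLE = None  # stands for the '*' entry of the table


def obviously_different(row_u, row_v):
    """True if some experiment gives 0 in one row and 1 in the other."""
    return any(
        x is not HOLE and y is not HOLE and x != y
        for x, y in zip(row_u, row_v)
    )


def consistent(row_u, row_v):
    """True if the rows agree wherever neither of them has a hole."""
    return not obviously_different(row_u, row_v)


def is_complete(table):
    """True if the table has no holes."""
    return all(x is not HOLE for row in table.values() for x in row)


def is_closed(table, red, blue):
    """True if every BLUE row is equal to some RED row."""
    return all(any(table[t] == table[s] for s in red) for t in blue)


# Hypothetical complete table over two experiments.
red = ["", "a"]
blue = ["b", "aa", "ab"]
table = {"": (1, 0), "a": (0, 1), "b": (1, 0), "aa": (0, 1), "ab": (1, 1)}

print(is_complete(table))           # no entry is a hole
print(is_closed(table, red, blue))  # row "ab" matches no RED row
```

Note that `consistent` is not transitive: a row made only of holes is consistent with two rows that are obviously different from each other.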

This table is closed
[Figure: example table]

This table is not closed
[Figure: example table] Not closed.

From a table to an automaton
- The table must be complete.
- The table must be closed.
- S must be prefix-closed and E must be suffix-closed.
A language L is
- prefix-closed if uv ∈ L ⟹ u ∈ L
- suffix-closed if uv ∈ L ⟹ v ∈ L

Building an automaton from a complete and closed table
We define C A (OT,S,E) = (Σ, Q, δ, q_0, F) as follows:
- Q = {q_s : s ∈ RED}
- q_0 = q_λ
- F = {q_u : OT[u][λ] = 1}
- ∀a ∈ Σ, ∀s ∈ RED:
    δ(q_s, a) = q_sa if sa ∈ RED
    δ(q_s, a) = any q_t such that t ∈ RED and OT[t] = OT[sa], otherwise
[Figure: resulting automaton with string-labelled states]
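The construction from a complete and closed table can be sketched as follows. The representation is an assumption made for this example (rows are tuples indexed like the list `experiments`, the empty string plays the role of λ), and the table itself is hypothetical: it was filled in by hand for the language of strings with an odd number of a's.

```python
def build_dfa(table, red, alphabet, experiments):
    """Build (initial state, accepting states, transitions) from the table.

    Assumes the table is complete and closed and that the empty
    string is one of the experiments.
    """
    lam = experiments.index("")  # column of the empty experiment
    accepting = {s for s in red if table[s][lam] == 1}
    delta = {}
    for s in red:
        for a in alphabet:
            target = s + a
            if target not in red:
                # Closedness guarantees that some RED row equals this BLUE row.
                target = next(t for t in red if table[t] == table[s + a])
            delta[(s, a)] = target
    return "", accepting, delta


def accepts(dfa, word):
    """Run the automaton on word and report whether it ends in F."""
    state, accepting, delta = dfa
    for a in word:
        state = delta[(state, a)]
    return state in accepting


# Hypothetical complete and closed table, E = ["", "a"].
experiments = ["", "a"]
red = ["", "a"]
table = {"": (0, 1), "a": (1, 0), "b": (0, 1), "aa": (0, 1), "ab": (1, 0)}

dfa = build_dfa(table, red, "ab", experiments)
print(accepts(dfa, "bab"), accepts(dfa, "aba"))
```

When several RED rows are equal to the BLUE row, `next` simply takes the first one, which corresponds to the "any q_t" in the definition.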

Compatibility
Theorem: Let OT(S,E) be an observation table, closed and complete. If S is prefix-closed and E is suffix-closed then C A (OT,S,E) is consistent with the data in OT(S,E).

Example (why E should be suffix-closed)
RED = [], Q = {[]}, q_0 = [], F = {[]}
[Figure: example table]
But suppose the experiments are not suffix-closed... Notice that the intermediate column cannot exist.

Building an automaton from a table

4 What do we do with the holes?

We define C A (OT,S,E) = (Σ, Q, δ, q_0, F) as follows:
- Q = {q_s : s ∈ RED}
- q_0 = q_λ
- F = {q_ue : OT[u][e] = 1}
- δ(q_s, a) = q_sa if sa ∈ RED; otherwise the only q_t such that t ∈ RED and OT[t] = OT[sa]

Given a sample X and a test set S that is prefix-complete, it is always possible to select a set of experiments E such that the table OT(S,E) contains all the information in X. But normally this table is going to have some holes.

Algorithm to build a table given X and RED
    E ← SUFF(X+) ∪ SUFF(X-)
    BLUE ← RED·Σ \ RED
    For each p in RED ∪ BLUE do
        For each e in E do
            If p.e ∈ X+ then OT[p][e] ← 1
            else if p.e ∈ X- then OT[p][e] ← 0
            else OT[p][e] ← *
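The table-building algorithm translates almost line for line into Python. This is a sketch under assumptions that are not in the slides: holes are stored as `None`, rows are tuples ordered like the returned list of experiments, and the one-string samples are hypothetical.

```python
def suffixes(words):
    """SUFF: all suffixes (including the empty string) of the given words."""
    return {w[i:] for w in words for i in range(len(w) + 1)}


def build_table(positive, negative, red, alphabet):
    """Build OT for the RED set `red` from the sample (positive, negative)."""
    # E <- SUFF(X+) U SUFF(X-), put in a fixed order so rows can be tuples.
    experiments = sorted(suffixes(positive | negative),
                         key=lambda e: (len(e), e))
    # BLUE <- RED.Sigma \ RED
    blue = sorted({s + a for s in red for a in alphabet} - set(red))
    table = {}
    for p in list(red) + blue:
        row = []
        for e in experiments:
            if p + e in positive:
                row.append(1)
            elif p + e in negative:
                row.append(0)
            else:
                row.append(None)  # a hole, written * in the slides
        table[p] = tuple(row)
    return table, blue, experiments


# Hypothetical sample: one positive and one negative string.
table, blue, experiments = build_table({"a"}, {"b"}, [""], "ab")
for p, row in table.items():
    print(repr(p), row)
```

Even on this tiny sample most entries are holes, which is the situation the rest of the algorithm has to deal with.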

Theorem. If ∃t ∈ BLUE such that OT[t] is obviously different from every OT[s] (s ∈ RED), then no filling of holes in OT(S,E) can produce a closed table.
[Figure: example table] [...] is OD with each RED s.

General algorithm
1. Given X, build the initial table OT(RED={λ}, BLUE=Σ, E=SUFF(X)).
2. (loop) Find RED states when a BLUE state is OD from all RED, updating the table.
3. Fill in the holes.
4. If OT(S,E) is incompatible with X, return PTA(X+).

Algorithm for finding the RED states
    RED = {λ}
    build OT(S,E) from X with E suffix-closed
    while ∃s ∈ BLUE: s is OD do
        add s to RED
        add sa to BLUE, where sa ∉ RED
        update OT(S,E)

Algorithm for filling in the holes
    For each p ∈ RED ∪ BLUE, e ∈ E s.t. OT[p][e] = *
        if ∃u,v s.t. OT[u][v] ≠ * then OT[p][e] ← OT[u][v]
        else
            Let t be a row s.t. OT[t] is consistent with OT[p]
            OT[p][e] ← OT[t][e]
There can be several such t.

Example run
X+ = { , , , }, X- = { , , , }
[Figure: initial table] [...] and [...] are OD.

1) We promote line [...]
2) We expand the table, adding lines [...] and [...]
3) [...] is OD
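The loop that finds the RED states (step 2 of the general algorithm) can be sketched as below. The code is an illustration under assumptions not found in the slides: rows are recomputed from the sample at every iteration instead of being updated in place, BLUE candidates are tried in length-then-alphabetical order, and the sample is hypothetical.

```python
def cell(word, positive, negative):
    """Table entry for a string: 1, 0, or None for a hole."""
    if word in positive:
        return 1
    if word in negative:
        return 0
    return None


def obviously_different(r1, r2):
    """True if some experiment gives 0 in one row and 1 in the other."""
    return any(x is not None and y is not None and x != y
               for x, y in zip(r1, r2))


def find_red_states(positive, negative, alphabet):
    """Promote BLUE rows that are OD from every RED row until none is left."""
    sample = positive | negative
    # E = SUFF(X), which is suffix-closed by construction.
    experiments = sorted({w[i:] for w in sample for i in range(len(w) + 1)})
    red = [""]
    while True:
        blue = sorted({s + a for s in red for a in alphabet} - set(red),
                      key=lambda s: (len(s), s))
        rows = {p: tuple(cell(p + e, positive, negative) for e in experiments)
                for p in red + blue}
        promoted = next(
            (t for t in blue
             if all(obviously_different(rows[t], rows[s]) for s in red)),
            None,
        )
        if promoted is None:
            return red, blue, experiments
        red.append(promoted)  # its successors join BLUE on the next pass


# Hypothetical sample over the alphabet {a, b}.
red, blue, experiments = find_red_states({"a"}, {"", "aa"}, "ab")
print(red, blue)
```

The loop terminates because a promoted string has at least one entry that is not a hole, so it must be a prefix of a sample string, and there are only finitely many of those.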

1) We promote line [...]
2) We expand the table, adding lines [...] and [...]
3) We construct the automaton, as no line is OD.
[Figure: final table and resulting automaton]

X+ = { , , , }, X- = { , , , , }
[Figure: automaton]
The automaton is inconsistent with X. We shall have to return the PTA instead.

Equivalence of problems
Let S be a prefix-closed set, and X be a sample. Let OT(S,E) be an observation table consistent with all the data in X, with E suffix-closed.
The question:
    Does there exist a DFA with the states from RED and consistent with X?
is equivalent to:
    Can we fill the holes such that OT(S,E) is closed?

Complexity
The problem:
    Given a labelled sample X and a set of strings RED, is there a DFA with states in RED and consistent with X?
is NP-Complete.

Corollary
The question:
    Given a labelled sample X and a positive integer n, is there a DFA with n states consistent with X?
is NP-Complete.

Identification in the limit
There is a learning algorithm Gold such that:
1) Gold is consistent;
2) Gold identifies in the limit any regular language;
3) Gold is polynomial in ‖X‖;
4) Gold has polynomial characteristic samples.

Details
If the size of the canonical acceptor of the language is n, then there is a characteristic sample with ‖CS_L‖ = 2n²(|Σ|+1), such that Gold(X) produces the canonical acceptor for all X ⊇ CS_L.

Conclusion
- The algorithm will find the correct automaton when a characteristic sample is included in the data.
- The algorithm runs in polynomial time.

Open questions
- Can one fill the holes in a more intelligent way?
- How fast can we detect that a choice (for filling) is good or bad?

Exercise
Run Gold's algorithm for the following data:
X+ = { , , , }, X- = { , , , , , }