Learning Moore Machines from Input-Output Traces

Learning Moore Machines from Input-Output Traces
Georgios Giantamidis (1) and Stavros Tripakis (1, 2)
(1) Aalto University, Finland   (2) UC Berkeley, USA
December 8, 2016

Motivation: learning models from black boxes
[Diagram: a black box whose inputs and outputs are observed by a Learner, which produces a Formal Model.]
Many applications:
Verify that a black-box component is safe to use
Dynamic malware analysis
...

Learning FSMs from input-output traces
[Diagram: a set of IO-traces (input words over {a, b} paired with output words such as 020, 0122, 02220) is fed to the Learner, which produces a learned four-state FSM.]

Outline
1 Background
2 Formal problem definition
3 Related work
4 Identification in the limit
5 Our learning algorithms
6 Results
7 Summary & future work

Moore machines
[Diagram: a four-state example Moore machine over inputs {a, b} with outputs 0, 1, 2.]
A Moore machine is a tuple (I, O, Q, q0, δ, λ):
input alphabet, I = {a, b}
output alphabet, O = {0, 1, 2}
set of states, Q = {q0, q1, q2, q3}
initial state, q0
transition function, δ : Q × I → Q
output function, λ : Q → O
By definition, our machines are deterministic and complete.
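
To make the definition concrete, here is a minimal Python sketch of a deterministic, complete Moore machine as a tuple (I, O, Q, q0, δ, λ); the class and field names are illustrative, not the presentation's implementation.

```python
from dataclasses import dataclass

@dataclass
class MooreMachine:
    inputs: set    # input alphabet I
    outputs: set   # output alphabet O
    states: set    # set of states Q
    q0: object     # initial state
    delta: dict    # transition function delta: (state, input symbol) -> state
    lam: dict      # output function lambda: state -> output symbol

    def run(self, word):
        """Output trace produced on an input word; it includes the output of
        the initial state, so its length is len(word) + 1."""
        q = self.q0
        trace = [self.lam[q]]
        for a in word:
            q = self.delta[(q, a)]
            trace.append(self.lam[q])
        return trace
```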

Input-output traces
[Diagram: the example Moore machine, alongside some I/O traces it generates (input words over {a, b} with output words 020, 0122, 0122, 02220, 02220).]
A Moore machine and some I/O traces generated by the machine.

Consistency
[Diagram: the example Moore machine next to the set of traces from the previous slide.]
This machine is consistent with this set of traces.

Consistency
[Diagram: a different four-state machine (states r0, r1, r2, r3) next to the same set of traces.]
This machine is inconsistent with this set of traces.
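
Operationally, consistency can be checked by replaying each trace on the machine. A minimal sketch, assuming the MooreMachine class sketched earlier and traces given as (input word, output trace) pairs (my reading of the slides, not the paper's exact formalization):

```python
def consistent(machine, traces):
    """A machine is consistent with a set of IO-traces if, for every pair
    (input word, output trace) in the set, running the machine on the
    input word reproduces exactly that output trace."""
    return all(machine.run(word) == list(out) for word, out in traces)
```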

A first attempt at a problem definition
Given...
Input alphabet, I
Output alphabet, O
Set of IO-traces, S (the training set)
... find a Moore machine M such that:
M is deterministic
M is complete
M is consistent with S

A trivial solution
[Diagram: the prefix-tree machine built from the example traces, with one state per input-word prefix (q_ε, q_a, q_b, ...) labelled by the observed outputs.]
This is called the prefix-tree machine.
Not quite a solution: the machine is incomplete...
... but easily completed with self-loops.
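
A sketch of this prefix-tree construction plus self-loop completion, reusing the MooreMachine sketch from earlier; state names and the assumption that traces are (input word, output trace) pairs are illustrative.

```python
def prefix_tree_machine(inputs, traces):
    """Build the prefix-tree Moore machine from a set of IO-traces and
    complete it with self-loops.  Each state is an input-word prefix,
    labelled with the output observed after that prefix."""
    delta, lam = {}, {}
    for word, out in traces:
        lam[""] = out[0]                  # output of the initial state (empty prefix)
        for i, a in enumerate(word):
            src, dst = word[:i], word[:i + 1]
            delta[(src, a)] = dst         # tree edge from prefix to longer prefix
            lam[dst] = out[i + 1]         # output observed after reading dst
    for q in list(lam):                   # trivial completion with self-loops
        for a in inputs:
            delta.setdefault((q, a), q)
    return MooreMachine(set(inputs), set(lam.values()), set(lam), "", delta, lam)
```

This is essentially the PTAP baseline discussed later in the talk.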

Problems with the trivial solution
(1) Poor generalization, due to the trivial completion with self-loops
The machine may be consistent with the training set...
... but how accurate is it on a test set?
(2) Large number of states in the learned machine
The prefix-tree machine does not merge states at all.

Revised problem definition
The LMoMIO problem (Learning Moore Machines from Input-Output Traces):
Given...
Input alphabet, I
Output alphabet, O
Set of IO-traces, S (the training set)
... find a Moore machine M such that:
M is deterministic
M is complete
M is consistent with S
... and also:
M generalizes well (good accuracy on a-priori unknown test sets)
M is small (few states)
M is found quickly (good learning algorithm complexity)

How to measure accuracy?
We define three metrics: Strong, Medium, Weak.

test trace (input / expected output)   machine output   strong acc.   medium acc.   weak acc.
abc / 1234                             1234             1             1             1
abc / 1234                             4321             0             0             0
abc / 1234                             1212             0             1/2           1/2
abc / 1234                             3434             0             0             1/2
abc / 1234                             1324             0             1/4           1/2
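
The following sketch matches my reading of the table: strong accuracy is all-or-nothing exact match, medium accuracy is the fraction of the expected trace covered by the longest correct prefix, and weak accuracy is the fraction of individual positions predicted correctly. The function names are illustrative, not the paper's.

```python
def strong_accuracy(expected, produced):
    """1 if the machine reproduces the whole expected output trace, else 0."""
    return 1.0 if list(produced) == list(expected) else 0.0

def medium_accuracy(expected, produced):
    """Length of the longest correct prefix divided by the trace length
    (e.g. expected 1234 vs produced 1212 gives 2/4 = 1/2)."""
    n = 0
    for e, p in zip(expected, produced):
        if e != p:
            break
        n += 1
    return n / len(expected)

def weak_accuracy(expected, produced):
    """Fraction of positions where the produced output matches the expected
    one (e.g. expected 1234 vs produced 3434 gives 2/4 = 1/2)."""
    return sum(e == p for e, p in zip(expected, produced)) / len(expected)
```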

Related work
active: L* [Angluin, 1987]
passive, exact: NP-hard [Gold, 1978]
passive, heuristic: k-tails [Biermann & Feldman, 1972], Gold's algorithm [Gold, 1978], RPNI [Oncina & Garcia, 1992], genetic algorithms, ant colony optimization, our work

Identification in the limit
Concept introduced in [Gold, 1967], in the context of formal language learning
Learning is seen as an infinite process
The training set keeps growing: S0 ⊆ S1 ⊆ S2 ⊆ ...
Every input word is guaranteed to eventually appear in the training set
For each Si, the learner outputs a machine Mi
Identification in the limit := the learner outputs the right machine after some i
A good passive learning algorithm must identify in the limit.

Characteristic samples
To prove identification in the limit, we use the notion of the Characteristic Sample [C. de la Higuera, 2010]:
The concept exists for DFAs (deterministic finite automata); we adapt it to Moore machines
Intuition: a set of IO-traces that covers the machine (covers all states, all transitions)
For a minimal Moore machine M = (I, O, Q, q0, δ, λ), there exists a CS of total length O(|Q|^4 · |I|)
Characteristic Sample Requirement (CSR): A learning algorithm satisfies CSR if, whenever the training set S is a characteristic sample of a minimal machine M, the algorithm learns from S a machine isomorphic to M.
CSR can be shown to imply identification in the limit.

Three learning algorithms
PTAP - Prefix Tree Acceptor Product
PRPNI - Product RPNI
MooreMI - Moore Machine Inference

PTAP - Prefix Tree Acceptor Product
This is the trivial solution we discussed earlier:
[Diagram: the prefix-tree machine completed with self-loops.]
Drawbacks:
Large number of states in the learned machine
Poor generalization / accuracy

PRPNI - Product RPNI
Observations:
A DFA is a special case of a Moore machine with binary output (accept/reject)
A Moore machine can be encoded as a product of ⌈log2 |O|⌉ DFAs
Based on these observations, PRPNI works as follows:
Uses the RPNI algorithm [Oncina & Garcia, 1992], which learns DFAs
Learns several DFAs that encode the learned Moore machine
Computes the product of the learned DFAs and completes it
Drawbacks:
The DFAs are learned separately, and therefore do not have the same state-transition structure ⇒ state explosion during product computation
Invalid output codes
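
A sketch of the output encoding that PRPNI builds on: each output symbol gets a fixed-width binary code of ⌈log2 |O|⌉ bits, and each IO-trace is split into accept/reject samples for one DFA per bit position. The helper names and the exact sample format are assumptions for illustration.

```python
from math import ceil, log2

def output_codes(output_alphabet):
    """Fixed-width binary code for each output symbol,
    e.g. {0: '00', 1: '01', 2: '10'} for O = {0, 1, 2}."""
    width = max(1, ceil(log2(len(output_alphabet))))
    codes = {o: format(i, "0{}b".format(width))
             for i, o in enumerate(sorted(output_alphabet, key=repr))}
    return codes, width

def per_bit_samples(traces, codes, width):
    """For DFA number i, the prefix w of an input word is labelled 'accept'
    iff bit i of the output reached after reading w is 1."""
    samples = [[] for _ in range(width)]
    for word, out in traces:
        for k, o in enumerate(out):
            bits = codes[o]
            for i in range(width):
                samples[i].append((word[:k], bits[i] == "1"))
    return samples
```

RPNI is then run once per bit position and the learned DFAs are combined by a product construction; because the DFAs are learned independently, the product can blow up and can label states with bit-vectors that decode to no output symbol, as the next slide illustrates.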

Invalid output codes
Output alphabet: O = {0, 1, 2}
Binary encoding of O: f = {0 ↦ 00, 1 ↦ 01, 2 ↦ 10}
[Diagram: two learned DFAs (states q0-q2 and r0-r1) and their product (states s0-s5); the product states carry the bit-vectors 00, 11, 00, 01, 10, 01.]
Invalid output code: 11 does not correspond to any output symbol
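
A small sketch of the check implied here, assuming the product machine's states are labelled with bit-vector strings as in per_bit_samples above (the names are hypothetical):

```python
def invalid_code_states(product_labels, codes):
    """States of the DFA product whose bit-vector decodes to no output
    symbol, e.g. '11' when O = {0, 1, 2} is encoded as 00 / 01 / 10.
    `product_labels` maps product states to bit-vector strings."""
    valid = set(codes.values())
    return {q for q, bits in product_labels.items() if bits not in valid}
```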

MooreMI - Moore Machine Inference
A modified RPNI, tailored to Moore machine learning
Like PRPNI, it learns several DFAs that encode the learned Moore machine
Unlike PRPNI, the learned DFAs maintain the same state-transition structure
Therefore, no state explosion during product computation
No invalid output codes either
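
MooreMI itself is an RPNI-style state-merging algorithm and is not reproduced here; the sketch below only illustrates the kind of compatibility test that drives merging on the (uncompleted) prefix-tree machine: two states may be merged only if their outputs agree and every pair of successors reached on the same input remains compatible. This is a simplified illustration of the idea, not the algorithm from the paper.

```python
def compatible(delta, lam, inputs, p, q, seen=None):
    """Folding check on a prefix-tree Moore machine: states p and q can be
    identified only if their outputs agree and, recursively, so do the
    states reached from them on every common input symbol."""
    if seen is None:
        seen = set()
    if (p, q) in seen:
        return True
    if lam[p] != lam[q]:
        return False
    seen.add((p, q))
    for a in inputs:
        ps, qs = delta.get((p, a)), delta.get((q, a))
        if ps is not None and qs is not None:
            if not compatible(delta, lam, inputs, ps, qs, seen):
                return False
    return True
```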

Results
Theorem 1. All three algorithms return Moore machines consistent with the IO-traces received as input.
Theorem 2. The MooreMI algorithm satisfies the characteristic sample requirement and identifies in the limit.
Experimental evaluation result: MooreMI is better not just in theory, but also in practice.

Summary
Learning deterministic, complete Moore machines from input-output traces
Characteristic sample for Moore machines
Three algorithms to solve the problem
The MooreMI algorithm identifies in the limit

Future work
Extend to Mealy machines
Learning symbolic machines
Learning from traces and formal requirements (e.g. LTL formulas)
Industrial case studies

Thank you! Questions?

References
E. M. Gold. Language identification in the limit. Information and Control, 10(5):447-474, 1967.
A. W. Biermann and J. A. Feldman. On the synthesis of finite-state machines from samples of their behavior. IEEE Trans. Comput., 21(6):592-597, June 1972.
E. M. Gold. Complexity of automaton identification from given data. Information and Control, 37(3):302-320, 1978.
D. Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87-106, 1987.
J. Oncina and P. Garcia. Identifying regular languages in polynomial time. In Advances in Structural and Syntactic Pattern Recognition, pages 99-108, 1992.
C. de la Higuera. Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, 2010.