Finite State Automata Design

Similar documents
REGular and Context-Free Grammars

FINITE STATE AUTOMATA

Closure under the Regular Operations

Formal Definition of Computation. August 28, 2013

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Equivalence of DFAs and NFAs

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Closure under the Regular Operations

CS 154, Lecture 3: DFA NFA, Regular Expressions

Some useful tasks involving language. Finite-State Machines and Regular Languages. More useful tasks involving language. Regular expressions

3515ICT: Theory of Computation. Regular languages

Constructions on Finite Automata

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

CSE 105 THEORY OF COMPUTATION

Sri vidya college of engineering and technology

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Regular Expressions and Finite-State Automata. L545 Spring 2008

Theory of computation: initial remarks (Chapter 11)

Constructions on Finite Automata

Theory of computation: initial remarks (Chapter 11)

CS 154 Introduction to Automata and Complexity Theory

Automata: a short introduction

Inf2A: Converting from NFAs to DFAs and Closure Properties

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

CS 121, Section 2. Week of September 16, 2013

Uses of finite automata

Regular Expressions. Definitions Equivalence to Finite Automata

How do regular expressions work? CMSC 330: Organization of Programming Languages

Formal Models in NLP

CS 455/555: Finite automata

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

CSC173 Workshop: 13 Sept. Notes

UNIT-III REGULAR LANGUAGES

CS21 Decidability and Tractability

Deterministic Finite Automata (DFAs)

September 11, Second Part of Regular Expressions Equivalence with Finite Aut

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata

Deterministic Finite Automata

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F)

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

INF Introduction and Regular Languages. Daniel Lupp. 18th January University of Oslo. Department of Informatics. Universitetet i Oslo

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

Proofs, Strings, and Finite Automata. CS154 Chris Pollett Feb 5, 2007.

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

Non-deterministic Finite Automata (NFAs)

Fooling Sets and. Lecture 5

Optimizing Finite Automata

Regular expressions and Kleene s theorem

Deterministic Finite Automata (DFAs)

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

Deterministic Finite Automata (DFAs)

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

COMP4141 Theory of Computation

Finite Automata. Seungjin Choi

Chapter Two: Finite Automata

CMSC 330: Organization of Programming Languages

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

Computer Sciences Department

Finite Automata and Regular languages

Finite Automata. Finite Automata

Nondeterministic Finite Automata

1. Induction on Strings

Chapter Five: Nondeterministic Finite Automata

Finite Automata Part Two

Formal Definition of a Finite Automaton. August 26, 2013

Lecture 3: Nondeterministic Finite Automata

Kleene Algebras and Algebraic Path Problems

Regular Expressions and Language Properties

CSC236 Week 10. Larry Zhang

CSE 105 THEORY OF COMPUTATION

Notes on Finite Automata

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Regular expressions. Regular expressions. Regular expressions. Regular expressions. Remark (FAs with initial set recognize the regular languages)

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Equivalence of Regular Expressions and FSMs

Decision, Computation and Language

INF210 Datamaskinteori (Models of Computation)

Finite State Transducers

Takeaway Notes: Finite State Automata

Foundations of

Extended transition function of a DFA

1 Alphabets and Languages

Theory of Computation (I) Yijia Chen Fudan University

Lecture 23 : Nondeterministic Finite Automata DRAFT Connection between Regular Expressions and Finite Automata

Regular expressions and Kleene s theorem

1 More finite deterministic automata

Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018

Lecture 17: Language Recognition

Nondeterministic Finite Automata. Nondeterminism Subset Construction

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

Automata theory. An algorithmic approach. Lecture Notes. Javier Esparza

Lecture 7 Properties of regular languages

Transcription:

Finite State Automata Design Nicholas Mainardi 1 Dipartimento di Elettronica e Informazione Politecnico di Milano nicholas.mainardi@polimi.it March 14, 2017 1 Mostly based on Alessandro Barenghi s material, enriched by few additional examples.

Finite State Automata What are FSAs? Finite State Automata (Automi a stati finiti AF) are the simplest among abstract computing models They are still able to describe a good deal of useful systems They are characterized by a finite memory capacity In a sense, every real world computing machine is a FSA (as it only has a limited number of memory cells) However modeling it as such is not a good idea (too many states, roughly 2 243 10 300000000000 )

Formalization A recognizer FSA is formally defined as a quintuple (Q, I, δ, q 0, F), where: Q, is the set of states of the automata I is the alphabet of the input string which will be checked δ : Q I Q the transition function q 0 Q the (unique) initial state from where the automaton starts F Q the set of final accepting states of the automaton

Regular expressions cheat sheet A small RE cheat sheet Symbols belonging to the input alphabet are usually lowercase s.t : s concatenated with t r s : either r or s Parentheses force a precedence when reading the regexpr s : an arbitrary number of occurrences of s (Kleene s Star) s + : one or more occurrences of s (equivalent to s(s) ) s? : s is optional zero or one occurrence of s (equivalent to s ɛ)

Introduction An oven knob We will start looking at the design for a simple automaton representing an oven knob The oven may be On, Off or on Grill position. The knob turns left and right: left raises the temperature, right lowers it.

A first attempt l l Off On Grill r r But this oven starts in an undefined state...

A first attempt l l start Off On Grill r r This one s better, but we d like to accept only when we remembered to turn it off at the end of the cooking...

A first attempt l l start Off On Grill r r Ok, but what if the knob is turned further left (resp. right) when it s already on the Grill (resp. Off ) position?

A first attempt l l start Off On Grill r r r l Great. What sequences of actions (language strings) does this recognize? L = (r l(l(l) r) r) = (r l(l + r) r)

A first recognizer Let s try the other way around: design a FSA recognizing a specific language Example: L = (01 1), that is the language where every zero is followed by at least a one (i.e. there cannot be two consecutive zeros) start q 0 q 1 0 1 1 δ(q 1, 0) is undefined here!

The error state The previous FSA does not explicitly represent error states As this may result in issues when performing operation among automata, we ll add it 0 0 q 0 start q 1 error 0 1 1 1 The δ(, ) function is now complete (safer to perform complement).

Complement How-To Given a recognizer FSA for the language L, the one recognizing the complement of the language L C can be obtained flipping the termination condition Basically: all the accepting states become non accepting and vice-versa Take care: the error state is nothing more than a common non accepting state and must be included in the process. A safe way to take it into account is to complete the automaton before building the complementary one

Complement Let s take as an example the recognizer for L = 0(0 1) The recognizer looks like this : Partial δ(, ) 0 Complete δ(, ) 0 1 0 q 1 0 q 1 0 1 start q 0 1 start q 0 1 error

Complement The recogniser for L C is thus : Recogniser for (0 1) \ L 0 0 1 0 1 q 0 start q 1 error 1

Intersection The recogniser automaton for the intersection of two languages L M can be built starting from the ones recognising them The steps to be followed are : 1 Build the state set as the Cartesian product of the state sets (hint: label the states with the combination of the two original labels, that is l i, m i ) 2 Set the initial state to the one obtained via the Cartesian product of l 0 and m 0 3 Start deriving the δ function from the initial state: simply check the original deltas and choose the destination state obtained as the product of the destinations. If at least one of the two original deltas is undefined, than δ is undefined too. 4 Mark as final only the states obtained as the product of two final states: l i, m i F if and only if l i F l m i F m

Example intersection We want to obtain a recogniser for L M Recogniser for L Recogniser for M 0 1 0 1 0 l 1 1 m 1 start l 0 start m 0

Intersection Automaton Resulting automaton (including unreachable states) start l 0, m 0 l 0, m 1 0 1 l 1, m 0 l 1, m 1 0 1

Union It is possible to use complement and intersection to compute the recognizer automaton for L M The key idea is to exploit De Morgan s Law applied to set operations: (a b) = ( a) ( b) It is thus possible to derive a b = ( (a b)) = (( a) ( b)) Therefore, applying in sequence two complements an intersection and another complement we obtain the union automaton

Vasco Rossi s lyrics Vasco Rossi s lyrics clearly belong to the language L = (e + h la) +. Therefore, we can recognize Vasco s lyric with a FSA: FSA Recognizer start q 0 q 1 l e e e e q 3 l q 4 q 2 a h l

Proving Properties on an FSA The following automaton recognizes L = (0 1) 01 2k+1. 0 0 q 1 1 start q 0 1 0,1 q 2 Even if it is possible to intuitively check that this is right, we need to provide a formal proof The most straightforward way to prove it is by induction on the steps performed

Mathematical Induction The idea is to prove by induction (on the steps performed by the automaton): That all the strings in the language lead the computation in an accepting state That all the strings not in the language lead the computation to a non accepting state To this end, we split the possible strings in three classes, associated to the states of the automaton : q 0 Strings without a 0, i.e. L 1 = 1 (/ L) q 1 Ending with an even number of 1 or with at least a 0, i.e. L 2 = {(0 1) 01 2k, k 0} (/ L) q 2 Ending with an odd number of 1, i.e. L 3 = {(0 1) 01 2k+1, k 0} ( L)

Mathematical Induction Building hypotheses and checking base cases Our purpose is thus to prove that: 1 s L 1, δ (q 0, s) = q 0 2 s L 2, δ (q 0, s) = q 1 3 s L 3, δ (q 0, s) = q 2 These three propositions will work for us as induction hypotheses Since we are performing the induction on the steps of the automaton, the base cases are represented by the shortest strings from each class Let s start from checking the base cases: 1 δ(q 0, ɛ) = q 0 and δ(q 0, 1) = q 0? Yes. 2 δ(q 0, 011) = q 1 and δ(q 0, 0) = q 1? Yes. 3 δ(q 0, 01) = q 2? Yes.

Mathematical Induction Proofs 1 s L 1 : δ (q 0, s = s.i) = δ(δ (q 0, s ), i) = δ(δ (q 0, s ), 1) = δ(q 0, 1), since s L 1. Then, δ(q 0, 1) = q 0, thus s L 1, δ (q 0, s) = q 0 is true 2 s L 2 : δ (q 0, s = s.i) = δ(δ (q 0, s ), i). Two cases: i = 0 i = 1 1 i = 0. δ(q i, 0) = q 1 q i Q 2 i = 1. δ(δ (q 0, s ), 1) = δ(q 2, 1), since s L 3. Then,δ(q 2, 1) = q 1 Thus, s L 2, δ (q 0, s) = q 1 is true 3 s L 3 : δ (q 0, s = s.i) = δ(δ (q 0, s ), i) = δ(δ (q 0, s ), 1) = δ(q 1, 1), since s L 2. Then, δ(q 1, 1) = q 2, thus s L 3, δ (q 0, s) = q 2 is true

Translators It is possible to enrich the computation model of a recogniser FSA with the ability to translate languages The FSA is only able to translate languages where the source is recognisable A translation move is characterised by reading at most one character and outputting zero or more characters A typical example is the find-replace function of a common editor

Formalization A transducer FSA is formally defined as a 7-tuple (Q, I, δ, q 0, F, O, η), where: Q, is the set of states of the automata I is the alphabet of the input string which will be checked δ : Q I Q the transition function q 0 Q the (unique) initial state from where the automaton starts F Q the set of final accepting states of the automaton O the output alphabet (may coincide with I) η : Q I {O ɛ} the transduction function

Example We want to build a transducer automaton accepting the input language L = (ab) + c(ba) The target language is L t = (dd) + ef The translation map τ is defined as τ((ab) n c(ba) m ) = d 2n ef (m 2), n 1, m 0 Since only a finite memory is required to perform the 2 operation, the language can be translated by a FSA.

Transducer FSA L to L t transducer a/d b/d start q 0 q 1 q 2 a/f a/d c/e q 7 q 6 q 5 q 4 b/ɛ a/ɛ b/ɛ

Data Compression We want to compress words from the language L = {a n 1 b k 1... a n i b k i... a n N b k N c N 1, i N, 1 n i 4, 1 k i 4} into a more compact, equivalently expressive format L t = n 1 k 1... n i k i... n N k N N 1. L to L t transducer a/4 start a/ɛ q 1 a q 0 a/2 a/3 b/1 qb 1 b/ɛ qb 2 b/ɛ qb 3 b/ɛ qb 4 a/1 c/3 b/2 b/4 c/1 c/2 a/ɛ b/3 c/4 qa 2 a/ɛ qa 3 a/ɛ qa 4 q f