Constructions on Finite Automata

Similar documents
Constructions on Finite Automata

Deterministic finite Automata

Regular expressions and Kleene s theorem

Lecture 3: Nondeterministic Finite Automata

Inf2A: Converting from NFAs to DFAs and Closure Properties

Closure under the Regular Operations

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Regular expressions and Kleene s theorem

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Equivalence of DFAs and NFAs

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

CS 154, Lecture 3: DFA NFA, Regular Expressions

CS 121, Section 2. Week of September 16, 2013

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

Lecture 1: Finite State Automaton

Nondeterministic Finite Automata

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

Intro to Theory of Computation

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

Deterministic Finite Automaton (DFA)

Uses of finite automata

Decision, Computation and Language

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Nondeterministic Finite Automata

Nondeterministic finite automata

3515ICT: Theory of Computation. Regular languages

Chapter 2: Finite Automata

CS 154. Finite Automata, Nondeterminism, Regular Expressions

CSE 105 Theory of Computation Professor Jeanne Ferrante

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Automata and Formal Languages - CM0081 Non-Deterministic Finite Automata

NFA and regex. the Boolean algebra of languages. regular expressions. Informatics 1 School of Informatics, University of Edinburgh

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Introduction to the Theory of Computation. Automata 1VO + 1PS. Lecturer: Dr. Ana Sokolova.

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

Non-deterministic Finite Automata (NFAs)

Chapter Five: Nondeterministic Finite Automata

Extended transition function of a DFA

Finite Automata and Regular languages

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

Further discussion of Turing machines

September 7, Formal Definition of a Nondeterministic Finite Automaton

CPS 220 Theory of Computation REGULAR LANGUAGES

Regular Language Equivalence and DFA Minimization. Equivalence of Two Regular Languages DFA Minimization

Properties of Regular Languages. BBM Automata Theory and Formal Languages 1

CSC173 Workshop: 13 Sept. Notes

Recitation 2 - Non Deterministic Finite Automata (NFA) and Regular OctoberExpressions

Computational Theory

CSC236 Week 11. Larry Zhang

Nondeterminism. September 7, Nondeterminism

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS 455/555: Finite automata

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Regular Expressions Kleene s Theorem Equation-based alternate construction. Regular Expressions. Deepak D Souza

Chapter 6: NFA Applications

Finite Automata. Mahesh Viswanathan

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Automata and Languages

Homework 1 Due September 20 M1 M2

Harvard CS 121 and CSCI E-207 Lecture 4: NFAs vs. DFAs, Closure Properties

Optimizing Finite Automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Finite Automata Part Two

Chap. 1.2 NonDeterministic Finite Automata (NFA)

Finite Automata - Deterministic Finite Automata. Deterministic Finite Automaton (DFA) (or Finite State Machine)

Theory of Computation (I) Yijia Chen Fudan University

Sri vidya college of engineering and technology

Formal Language and Automata Theory (CS21004)

Computer Sciences Department

Java II Finite Automata I

Regular expressions and Kleene s theorem

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

Before we show how languages can be proven not regular, first, how would we show a language is regular?

Fooling Sets and. Lecture 5

Deterministic Finite Automata (DFAs)

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

Formal Models in NLP

Computational Models - Lecture 5 1

Chapter 5. Finite Automata

Finite Automata. Finite Automata

Deterministic Finite Automata (DFAs)

Nondeterministic Finite Automata. Nondeterminism Subset Construction

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

CSE 105 THEORY OF COMPUTATION

Lecture 17: Language Recognition

Regular Languages. Problem Characterize those Languages recognized by Finite Automata.

Let us first give some intuitive idea about a state of a system and state transitions before describing finite automata.

Finite Automata and Regular Languages

Inf2A: The Pumping Lemma

Pushdown automata. Twan van Laarhoven. Institute for Computing and Information Sciences Intelligent Systems Radboud University Nijmegen

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

CS 154 Introduction to Automata and Complexity Theory

Lecture 23 : Nondeterministic Finite Automata DRAFT Connection between Regular Expressions and Finite Automata

CSE 105 THEORY OF COMPUTATION

UNIT-III REGULAR LANGUAGES

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata

CSE 105 THEORY OF COMPUTATION

Lecture 4 Nondeterministic Finite Accepters

Transcription:

Constructions on Finite Automata Informatics 2A: Lecture 4 Mary Cryan School of Informatics University of Edinburgh mcryan@inf.ed.ac.uk 24 September 2018 1 / 33

Determinization The subset construction Closure properties of regular languages Union Intersection Complement DFA minimization The problem An algorithm for minimization 2 / 33

Recap of Lecture 3 A language is a set of strings over an alphabet Σ. A language is called regular if it is recognised by some NFA. DFAs are an important subclass of NFAs. (hinted) An NFA with n states can be determinized to an equivalent DFA with 2 n states, using the subset construction. Therefore the regular languages are exactly the languages recognised by DFAs. 3 / 33

The key insight The process we described on Friday for our example (what is the set of states we can reach after reading the next character) is a completely deterministic process! Given any current set of coloured states, and any input symbol in Σ, there s only one right answer to the question: What will the new set of coloured states be? What s more, it s a finite state process. A state is simply a choice of coloured states in the original NFA N. If N has n states, there are 2 n such choices. This suggests how an NFA with n states can be converted into a equivalent DFA with 2 n states. 4 / 33

The subset construction: example Our 3-state NFA gives rise to a DFA with 2 3 = 8 states. The states of this DFA are subsets of {q0, q1, q2}. a {q0,q1, q2} b a a a q0 a a,b q2 a,b a q1 b {q0,q1} {q1,q2} {q0,q2} a b b a {q0} {q1} {q2} b a a,b b a,b {} The accepting states of this DFA are exactly those that contain an accepting state of the original NFA. 5 / 33

The subset construction in general Given an NFA N = (Q,, S, F ), we can define an equivalent DFA M = (Q, δ, s, F ) (over the same alphabet Σ) like this: Q is 2 Q, the set of all subsets of Q. (Also written P(Q).) δ (A, u) = {q Q q A. (q, u, q ) }. (Set of all states reachable via u from some state in A.) s = S. F = {A Q q A. q F }. It s then not hard to prove mathematically that L(M) = L(N). (See Kozen for details.) This process is called determinization. Coming up in lecture 6: Application of this process to efficient string searching. 6 / 33

Summary We ve shown that for any NFA N, we can construct a DFA M with the same associated language. Since every DFA is also an NFA, the classes of languages recognised by DFAs and by NFAs coincide these are the regular languages. Often a language can be specified more concisely by an NFA than by a DFA. We can automatically convert an NFA to a DFA, at the risk of an exponential blow-up in the number of states. To determine whether a string x is accepted by an NFA we do not need to construct the entire DFA, but instead we efficiently simulate the execution of the DFA on x on a step-by-step basis. (This is called just-in-time simulation.) 7 / 33

Question 1 Let M be the DFA shown on Friday: 1 1 0 even odd Give a simple, concise definition of the strings that are in L(M). 0 8 / 33

Question 1 Let M be the DFA shown on Friday: 1 1 0 even odd Give a simple, concise definition of the strings that are in L(M). Answer: They are the strings containing an even number of 0 s. 0 8 / 33

Question 2 Which of these three languages do you think are regular? L 1 = {a, aa, ab, abbc} L 2 = {axb x Σ } L 3 = {a n b n n 0} If not regular, can you explain why not?? 9 / 33

Question 2 Which of these three languages do you think are regular? L 1 = {a, aa, ab, abbc} L 2 = {axb x Σ } L 3 = {a n b n n 0} If not regular, can you explain why not?? Answers: L 1 is regular (easy to see that any finite set of strings is a regular language). 9 / 33

Question 2 Which of these three languages do you think are regular? L 1 = {a, aa, ab, abbc} L 2 = {axb x Σ } L 3 = {a n b n n 0} If not regular, can you explain why not?? Answers: L 1 is regular (easy to see that any finite set of strings is a regular language). L 2 is regular (easy to construct a DFA). 9 / 33

Question 2 Which of these three languages do you think are regular? L 1 = {a, aa, ab, abbc} L 2 = {axb x Σ } L 3 = {a n b n n 0} If not regular, can you explain why not?? Answers: L 1 is regular (easy to see that any finite set of strings is a regular language). L 2 is regular (easy to construct a DFA). L 3 is not regular. We shall see why in lecture 8. 9 / 33

Question 3 Consider our first example NFA over {0, 1} (from Friday): 0,1 1 0,1 0,1 0,1 0,1 q0 q1 q2 q3 q4 q5 What is the number of states of the smallest DFA that recognises the same language? Answer will be given in Lecture 5. 10 / 33

Union of regular languages Consider the following little theorem: If L 1 and L 2 are regular languages over Σ, so is L 1 L 2. This is dead easy to prove using NFAs. Suppose N 1 = (Q 1, 1, S 1, F 1 ) is an NFA for L 1, and N 2 = (Q 2, 2, S 2, F 2 ) is an NFA for L 2. We may assume Q 1 Q 2 = (just relabel states if not). Now consider the NFA (Q 1 Q 2, 1 2, S 1 S 2, F 1 F 2 ) This is just N 1 and N 2 side by side. Clearly, this NFA recognizes precisely L 1 L 2. Number of states = Q 1 + Q 2 no state explosion! 11 / 33

Intersection of regular languages If L 1 and L 2 are regular languages over Σ, so is L 1 L 2. Suppose N 1 = (Q 1, 1, S 1, F 1 ) is an NFA for L 1, and N 2 = (Q 2, 2, S 2, F 2 ) is an NFA for L 2. We define a product NFA (Q,, S, F ) by: (q, r) a (q, r ) Q = Q 1 Q 2 q a q 1 and r S = S 1 S 2 F = F 1 F 2 a r 2 Number of states = Q 1 Q 2 a bit more costly than union! If N 1 and N 2 are DFAs then the product automaton is a DFA too. 12 / 33

Example of language intersection 13 / 33

Complement of a regular language ( Recall the set-difference operation, where A, B are sets. ) A B = {x A x / B} If L is a regular language over Σ, then so is Σ L. Suppose N = (Q, δ, s, F ) is a DFA for L. Then (Q, δ, s, Q F ) is a DFA for Σ L. (We simply swap the accepting and rejecting states in N.) Number of states = Q no blow up at all, but we are required to start with a DFA. This in itself has size implications. The complement construction does not work if N is not deterministic! 14 / 33

Closure properties of regular languages We ve seen that if both L 1 and L 2 are regular languages, then so are: L 1 L 2 (union) L 1 L 2 (intersection) Σ L 1 (complement) We sometimes express this by saying that regular languages are closed under the operations of union, intersection and complementation. ( Closed used here in the sense of self-contained.) Each closure property corresponds to an explicit construction on finite automata. Sometimes this uses NFAs (union), sometimes DFAs (complement), and sometimes the construction works equally well for both NFAs and DFAs (intersection). 15 / 33

The Minimization Problem Determinization involves an exponential blow-up in the automaton. Is it sometimes possible to reduce the size of the resulting DFA? Many different DFAs can give rise to the same language, e.g.: 1 1 0 1 0 1 q0 q1 0,1 q4 even odd 0 0 0 q3 q2 1 0 1 We shall see that there is always a unique smallest DFA for a given regular language. 16 / 33

DFA minimization 1 0 1 q0 q1 0,1 q4 0 0 q3 q2 1 0 1 We perform the following steps to reduce M above: Throw away unreachable states (in this case, q4). Squish together equivalent states, i.e. states q, q such that: every string accepted starting from q is accepted starting from q, and vice versa. (In this case, q0 and q2 are equivalent, as are q1 and q3.) Let s write Min(M) for the resulting reduced DFA. In this case, Min(M) is essentially the two-state machine on the previous slide. 17 / 33

Properties of minimization The minimization operation on DFAs enjoys the following properties which characterise the construction: L(Min(M)) = L(M). If L(M ) = L(M) and M Min(M) then M = Min(M). Here M is the number of states of the DFA M, and = means the two DFAs are isomorphic: that is, identical apart from a possible renaming of states. Two consequences of the above are: Min(M) = Min(M ) if and only if L(M) = L(M ). Min(Min(M)) = Min(M). For a formal treatment of minimization, see Kozen chapters 13 16. 18 / 33

Challenge question Consider the following DFA over {a, b}. q0 a q1 a a b b a q2 b q3 b How many states does the minimized DFA have? 19 / 33

Solution The minimized DFA has just 2 states: a q012 b b q3 a The minimized DFA has been obtained by squishing together states q0, q1 and q2. Clearly q3 must be kept distinct. Note that the corresponding language consists of all strings ending with b. 20 / 33

Minimization in practice Let s look again at our definition of equivalent states: states q, q such that: every string accepted starting from q is accepted starting from q, and vice versa. This is fine as an abstract mathematical definition of equivalence, but it doesn t seem to give us a way to compute which states are equivalent: we d have to check infinitely many strings x Σ. Fortunately, there s an actual algorithm for DFA minimization that works in reasonable time. This is useful in practice: we can specify our DFA in the most convenient way without worrying about its size, then minimize to a more compact DFA to be implemented e.g. in hardware. 21 / 33

An algorithm for minimization First eliminate any unreachable states (easy). Then create a table of all possible pairs of states (p, q), initially unmarked. (E.g. a two-dimensional array of booleans, initially set to false.) We mark pairs (p, q) as and when we discover that p and q cannot be equivalent. 1. Start by marking all pairs (p, q) where p F and q F, or vice versa. 2. Look for unmarked pairs (p, q) such that for some u Σ, the pair (δ(p, u), δ(q, u)) is marked. Then mark (p, q). 3. Repeat step 2 until no such unmarked pairs remain. If (p, q) is still unmarked, can collapse p, q to a single state. 22 / 33

Illustration of minimization algorithm Consider the following DFA over {a, b}. 23 / 33

Illustration of minimization algorithm After eliminating unreachable states: 24 / 33

Illustration of minimization algorithm We mark states to be kept distinct using a half matrix: q0 q1 q2 q3 q0 q1 q2 q3 25 / 33

Illustration of minimization algorithm First mark accepting/non-accepting pairs: q0 q1 q2 q3 q0 q1 q2 q3 26 / 33

Illustration of minimization algorithm (q0,q1) is unmarked, qo a q1, q1 a q3, and (q1,q3) is marked. q0 q1 q2 q3 q0 q1 q2 q3 27 / 33

Illustration of minimization algorithm So mark (q0,q1). q0 q1 q2 q3 q0 q1 q2 q3 28 / 33

Illustration of minimization algorithm (q0,q2) is unmarked, qo a q1, q2 a q3, and (q1,q3) is marked. q0 q1 q2 q3 q0 q1 q2 q3 29 / 33

Illustration of minimization algorithm So mark (q0,q2). q0 q1 q2 q3 q0 q1 q2 q3 30 / 33

Illustration of minimization algorithm The only remaining unmarked pair (q1,q2) stays unmarked. q0 q1 q2 q3 q0 q1 q2 q3 31 / 33

Illustration of minimization algorithm So obtain minimized DFA by collapsing q1, q2 to a single state. 32 / 33

Reading Relevant reading: Subset Construction: Kozen chapter 6 Closure properties of regular languages: Kozen chapter 4. Minimization: Kozen chapters 13 14. Next time: Regular expressions and Kleene s Theorem. (Kozen chapters 7, 8.) 33 / 33