September 11, Second Part of Regular Expressions Equivalence with Finite Aut

Similar documents
Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application

Closure under the Regular Operations

acs-04: Regular Languages Regular Languages Andreas Karwath & Malte Helmert Informatik Theorie II (A) WS2009/10

Theory of Computation (II) Yijia Chen Fudan University

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Theory of Languages and Automata

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

Theory of Computation (I) Yijia Chen Fudan University

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

Nondeterministic Finite Automata

What we have done so far

CS21 Decidability and Tractability

Nondeterminism. September 7, Nondeterminism

September 7, Formal Definition of a Nondeterministic Finite Automaton

TDDD65 Introduction to the Theory of Computation

Computational Models Lecture 2 1

Computational Models Lecture 2 1

Formal Definition of a Finite Automaton. August 26, 2013

front pad rear pad door

Sri vidya college of engineering and technology

CS 154, Lecture 3: DFA NFA, Regular Expressions

Computer Sciences Department

Lecture 3: Nondeterministic Finite Automata

UNIT-III REGULAR LANGUAGES

COM364 Automata Theory Lecture Note 2 - Nondeterminism

Deterministic Finite Automata. Non deterministic finite automata. Non-Deterministic Finite Automata (NFA) Non-Deterministic Finite Automata (NFA)

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Chapter Five: Nondeterministic Finite Automata

Lecture 4 Nondeterministic Finite Accepters

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

Harvard CS 121 and CSCI E-207 Lecture 6: Regular Languages and Countability

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

Nondeterministic Finite Automata

Lecture 4: More on Regexps, Non-Regular Languages

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata (cont )

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

COMP4141 Theory of Computation

CS 154 Formal Languages and Computability Assignment #2 Solutions

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

Computational Models - Lecture 3

Foundations of

Regular Expressions. Definitions Equivalence to Finite Automata

October 6, Equivalence of Pushdown Automata with Context-Free Gramm

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

CMSC 330: Organization of Programming Languages

Nondeterministic Finite Automata. Nondeterminism Subset Construction

CS243, Logic and Computation Nondeterministic finite automata

CMPSCI 250: Introduction to Computation. Lecture #22: From λ-nfa s to NFA s to DFA s David Mix Barrington 22 April 2013

CS21 Decidability and Tractability

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

CS 322 D: Formal languages and automata theory

CSE 135: Introduction to Theory of Computation Nondeterministic Finite Automata

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

Algorithms for NLP

Exam 1 CSU 390 Theory of Computation Fall 2007

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Computational Theory

ECS 120 Lesson 15 Turing Machines, Pt. 1

Languages. Non deterministic finite automata with ε transitions. First there was the DFA. Finite Automata. Non-Deterministic Finite Automata (NFA)

Closure under the Regular Operations

Lecture 2: Regular Expression

Finite Automata. BİL405 - Automata Theory and Formal Languages 1

Formal Definition of Computation. August 28, 2013

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Finite Automata and Regular languages

Non-deterministic Finite Automata (NFAs)

Applied Computer Science II Chapter 1 : Regular Languages

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

Deterministic Finite Automata (DFAs)

Finite Automata. Seungjin Choi

Outline. Nondetermistic Finite Automata. Transition diagrams. A finite automaton is a 5-tuple (Q, Σ,δ,q 0,F)

Deterministic Finite Automata (DFAs)

Nondeterminism and Epsilon Transitions

(Refer Slide Time: 0:21)

CS 455/555: Finite automata

HKN CS/ECE 374 Midterm 1 Review. Nathan Bleier and Mahir Morshed

Homework 1 Due September 20 M1 M2

Deterministic Finite Automaton (DFA)

Introduction to Languages and Computation

Lecture 17: Language Recognition

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Lecture 1: Finite State Automaton

Automata and Formal Languages - CM0081 Finite Automata and Regular Expressions

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Automata: a short introduction

CS 121, Section 2. Week of September 16, 2013

Chapter 5. Finite Automata

Outline. CS21 Decidability and Tractability. Regular expressions and FA. Regular expressions and FA. Regular expressions and FA

Incorrect reasoning about RL. Equivalence of NFA, DFA. Epsilon Closure. Proving equivalence. One direction is easy:

Properties of Regular Languages. BBM Automata Theory and Formal Languages 1

Introduction to Theory of Computing

Computational Models - Lecture 1 1

Inf2A: Converting from NFAs to DFAs and Closure Properties

Extended transition function of a DFA

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

Transcription:

Second Part of Regular Expressions Equivalence with Finite Automata September 11, 2013

Lemma 1.60 If a language is regular then it is specified by a regular expression Proof idea: For a given regular language A we will construct a regular expression that describes A.

Procedure Because A is regular, there is a DFA D A that recognizes A

Procedure Because A is regular, there is a DFA D A that recognizes A D A will be converted into a regular expression R A that specifies A

Procedure Because A is regular, there is a DFA D A that recognizes A D A will be converted into a regular expression R A that specifies A Note: This procedure is broken in two parts:

Procedure Because A is regular, there is a DFA D A that recognizes A D A will be converted into a regular expression R A that specifies A Note: This procedure is broken in two parts: 1. Convert a DFA into a generalized nondeterministic finite automaton GNFA

Procedure Because A is regular, there is a DFA D A that recognizes A D A will be converted into a regular expression R A that specifies A Note: This procedure is broken in two parts: 1. Convert a DFA into a generalized nondeterministic finite automaton GNFA 2. Convert GNFA into a regular expression

What is an GNFA?

What is an GNFA? A GNFA is an NFA wherein the transition arrows may have any regular expressions as labels, instead only members of the alphabet or ɛ

What is an GNFA? A GNFA is an NFA wherein the transition arrows may have any regular expressions as labels, instead only members of the alphabet or ɛ Hence, GNFA reads strings specified by regular expressions (block of symbols) from the input (not necessarily just one symbol)

What is an GNFA? A GNFA is an NFA wherein the transition arrows may have any regular expressions as labels, instead only members of the alphabet or ɛ Hence, GNFA reads strings specified by regular expressions (block of symbols) from the input (not necessarily just one symbol) GNFA moves along a transition arrow connecting two states representing regular expression, Figure 1

Example GNFA aa q 1 (aa) b q start ab a q 2 ab Figure 1 : A GNFA ab ba b q accept

Note A GNFA is nondeterministic and so, it may have many different ways to process the same input string A GNFA accepts its input if its processing can cause the GNFA to be in an accept state at the end of the input

GNFA of special form

GNFA of special form The start state has transition arrows to every other state but no arrow coming from any other state

GNFA of special form The start state has transition arrows to every other state but no arrow coming from any other state There is only one accept state and it has arrows coming in from every other state, but has no arrows going to any other state; in addition, the accept state is not the same with the start state

GNFA of special form The start state has transition arrows to every other state but no arrow coming from any other state There is only one accept state and it has arrows coming in from every other state, but has no arrows going to any other state; in addition, the accept state is not the same with the start state Except for start and accept states,one arrow go from every state to every other state and from each state to itself

Converting DFA to GNFA A DFA is converted to a GNFA of special form by the followsing procedure:

Converting DFA to GNFA A DFA is converted to a GNFA of special form by the followsing procedure: 1. Add a new start state with an ɛ arrow to the old start state and a new accept state with an ɛ arrow from all old accept states

Converting DFA to GNFA A DFA is converted to a GNFA of special form by the followsing procedure: 1. Add a new start state with an ɛ arrow to the old start state and a new accept state with an ɛ arrow from all old accept states 2. If any arrows have multiple labels or if there are multiple arrows going between the same two states in the same direction replace each with a single arrow whose label is the union of the previous labels

Converting DFA to GNFA A DFA is converted to a GNFA of special form by the followsing procedure: 1. Add a new start state with an ɛ arrow to the old start state and a new accept state with an ɛ arrow from all old accept states 2. If any arrows have multiple labels or if there are multiple arrows going between the same two states in the same direction replace each with a single arrow whose label is the union of the previous labels 3. Add arrows labeled between states that had no arrows

Note Adding transitions don t change the language recognized by DFA because a transition labeled by can never be used Assumption: now we assume that all GNFAs are in the special form just defined.

Converting GNFA RE Assume that GNFA has k states

Converting GNFA RE Assume that GNFA has k states Because start and accept states are different from each other, it results that k 2

Converting GNFA RE Assume that GNFA has k states Because start and accept states are different from each other, it results that k 2 If k > 2 we construct an equivalent GNFA with k 1 states. This can be repeated for each new GNFA until we obtain a GNFA with k = 2 states.

Converting GNFA RE Assume that GNFA has k states Because start and accept states are different from each other, it results that k 2 If k > 2 we construct an equivalent GNFA with k 1 states. This can be repeated for each new GNFA until we obtain a GNFA with k = 2 states. If k = 2, GNFA has a single arrow that goes from start to accept and is labeled by a regular expression that specifies the language recognized by the original DFA

Example DFA conversion Assuming that the original DFA has 3 states the process of its conversion is shown in Figure 2 3-state DFA 5-state GNFA 4-state GNFA Regular 2-state GNFA expression 3-state GNFA Figure 2 : Example DFA conversion to regular expression

Note

Note The crucial step is the construction of an equivalent GNFA with one fewer states than a GNFA when GNFA has k > 2 states.

Note The crucial step is the construction of an equivalent GNFA with one fewer states than a GNFA when GNFA has k > 2 states. This is done by selecting a state, ripping it out of the machine, and repairing the remainder so that the same language is still recognized

Note The crucial step is the construction of an equivalent GNFA with one fewer states than a GNFA when GNFA has k > 2 states. This is done by selecting a state, ripping it out of the machine, and repairing the remainder so that the same language is still recognized Any state can be selected for ripping, providing that it is not start or accept state. Such a state exist because k > 2

Repairing after ripping a state Assume that the state of a GNFA selected for ripping is q rip

Repairing after ripping a state Assume that the state of a GNFA selected for ripping is q rip After removing q rip we repair the machine by altering the regular expressions that label each of the remaining transitions

Repairing after ripping a state Assume that the state of a GNFA selected for ripping is q rip After removing q rip we repair the machine by altering the regular expressions that label each of the remaining transitions The new labels compensate for the absence of q rip by adding back the lost computation

Repairing after ripping a state Assume that the state of a GNFA selected for ripping is q rip After removing q rip we repair the machine by altering the regular expressions that label each of the remaining transitions The new labels compensate for the absence of q rip by adding back the lost computation The new label of the arrow going from state q i to q j is a regular expression that specifies all strings that would take the machine from q i to q j either directly or via q rip

Illustration We illustrate the approach of ripping and repairing in Figure 3 q i R 4 q j R3 R 1 (R q 1 )(R 2 ) (R3) (R 4 ) rip q i q j R 2 before ripping after ripping Figure 3 : Ripping and repairing an GNFA

Note New labels are obtained by concatenating regular expressions of the arrows that go through q rip and union them with the labels of the arrows that travel directly between q i and q j This construct is carried out for each arrow that goes from state q i to any state q j including q i = q j

Formal proof First we need to define formally the GNFA Since new labels are regular expressions we use the symbol R Σ to denote the collection of regular expressions over an alphabet Σ To simplify, denote by q s and q a the start and accept states of the GNFA

Transition function of a GNFA Because an arrow connects every state to every other state, except that no arrows are coming from q a or going to q s, the domain of the transition function of a GNFA is δ : (Q {q a }) (Q {q s }) R Σ If δ(q i, q j ) = R the arrow from q i to q j has the label R

Definition 1.64 A generalized nondeterministic finite automaton (GNFA) is a 5-tuple (Q, Σ, δ, q s, q a ) where: 1. Q is the finite set of state 2. Σ is the input alphabet 3. δ : (Q {q a}) (Q {q s}) R Σ is the transition functioni where R Σ is the set of regular expressions over Σ 4. q s is the unique start state 5. q a is the unique accept state and q a q s.

GNFA computation A GNFA accepts a string w Σ if w = w 1 w 2... w k where w i Σ, 1 i k, and a sequence of states q 0, q 1,..., q k exits such that: 1. q o = q s is the start state 2. q k = q a is the accept state 3. For each i, δ(q i 1, q i ) = R i and w i L(R i ), i.e., R i is the regular expression labeling the arrow from q i 1 to q i and w i is an element of the language specified by this expression

More proof ideas Returning to the proof of Lemma 1.60, we assume that M is a DFA recognizing the language A and proceed as follows: Convert M into a GNFA G by adding a new start state and a new accept state and the additional arrows Use the procedure Convert(G) that maps G into a regular expression, as explained before, while preserving the language A Note: Convert() is recursive; however the case when GNFA has only two sates is handled without recursion

Convert(G) 1. Let k be the number of states of G, k 2. 2. If k = 2 then G must consists of a start state and an accept state and a single arrow connecting them, labeled by a regular expression R. Return R 3. While k > 2, select any state q rip Q, different from q s and q a and let G be the GNFA (Q, Σ, δ, q s, q a) where: Q = Q {q rip } for any qi Q {q a } and any q j Q {q s } let δ (q i, q j ) = (R 1 )(R 2 ) (R 3 ) (R 4 ) where: R 1 = δ(q i, q rip ), R 2 = δ(q rip, q rip ), R 3 = δ(q rip, q j ), R 4 = δ(q i, q j ) Convert(G );

Claim 1.65 For any GNFA G, Convert(G) is equivalent to G Proof: by induction on k, the number of states of G

Induction Basis: k = 2 If G has only two states, by definition, it can have only a single arrow which goes from q s to q a The regular expression labeling this arrow specify the language accepted by G Since this expression is returned by Convert(G), it means that G and Convert(G) are equivalent

Induction Step Assume that the claim is true for G having k 1 states and use this assumption to show that the claim is true for an GNFA with k states Observe from construction that G and G recognize the same language Suppose G accepts the input w. Then in an accepting branch of computation, G enters the sequence of states q s, q 1, q 2, q 3,..., q a Show that G has an accepting computation for w, too.

Induction step, continuation 1. If none of the states q s, q 1, q 2,..., q a is q rip, clearly G also accepts w because each of the new regular expressions labeling arrows of G contain the old regular expressions as part of a union

Induction step, continuation 1. If none of the states q s, q 1, q 2,..., q a is q rip, clearly G also accepts w because each of the new regular expressions labeling arrows of G contain the old regular expressions as part of a union 2. If q rip does appear in the computation q s, q 1, q 2,..., q a by removing each run of consecutive q rip states we obtain an accepting computation for G. This is because states q i and q j bracketing a run of consecutive q rip states have a new regular expression on the arrow between them that specify all strings taking q i to q j via q rip on G. So, G accepts w in this case too.

Induction step, continuation For the other direction, suppose that G accepts w. 1. Each arrow between any two states q i and q j in G is labeled by a regular expression that specifies strings specified by arrows in G from q i directly to q j or via q rip 2. Hence, by the definition of GNFA it follows that G must also accept w. That is, G and G accept the same language

Conclusion The induction hypothesis states that when the algorithm calls itself recursively on input G, the result is a regular expression that is equivalent to G because G has k 1 states Hence, that regular expression is also equivalent to G because G is equivalent to G Consequently Convert(G) and G are equivalent

Example 1.35 Convert the DFA D in Figure 4 into the regular expression that specifies the language accepted by D Figure 4 : a 1 b b a 2 DFA D to be converted

GNFA G 1 obtained from D Figure 5 shows the four-state GNFA obtained from D by adding new start state and accept state and replacing a, b by a b a q ɛ s 1 b b a q a ɛ 2 Figure 5 : GNFA G 1 obtained from D

Eliminating nodes Removing state 1 and then state 2, Figure 6 shows the GNFA G 3 : q s a b(a ba b) q a Figure 6 : GNFA G 3 obtained from G 2