acs-04: Regular Languages Regular Languages Andreas Karwath & Malte Helmert Informatik Theorie II (A) WS2009/10

Similar documents
Applied Computer Science II Chapter 1 : Regular Languages

UNIT-III REGULAR LANGUAGES

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

Introduction to Languages and Computation

Theory of Languages and Automata

September 11, Second Part of Regular Expressions Equivalence with Finite Aut

CPS 220 Theory of Computation REGULAR LANGUAGES

Critical CS Questions

CHAPTER 1 Regular Languages. Contents

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Theory of Computation (II) Yijia Chen Fudan University

Computational Models #1

Examples of Regular Expressions. Finite Automata vs. Regular Expressions. Example of Using flex. Application

Computational Models - Lecture 1 1

Computational Models: Class 1

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

Incorrect reasoning about RL. Equivalence of NFA, DFA. Epsilon Closure. Proving equivalence. One direction is easy:

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

Computer Sciences Department

Theory of Computation (I) Yijia Chen Fudan University

Finite Automata and Regular languages

Non-Deterministic Finite Automata

What we have done so far

Time Magazine (1984)

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

CS 154, Lecture 3: DFA NFA, Regular Expressions

UNIT-II. NONDETERMINISTIC FINITE AUTOMATA WITH ε TRANSITIONS: SIGNIFICANCE. Use of ε-transitions. s t a r t. ε r. e g u l a r

Nondeterministic Finite Automata

Lecture 4: More on Regexps, Non-Regular Languages

TDDD65 Introduction to the Theory of Computation

CS 455/555: Finite automata

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Automata: a short introduction

front pad rear pad door

CS 154. Finite Automata, Nondeterminism, Regular Expressions

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

Closure under the Regular Operations

Intro to Theory of Computation

Computational Models Lecture 2 1

Computational Models Lecture 2 1

UNIT-I. Strings, Alphabets, Language and Operations

CS:4330 Theory of Computation Spring Regular Languages. Finite Automata and Regular Expressions. Haniel Barbosa

Computational Theory

1.3 Regular Expressions

Uses of finite automata

September 7, Formal Definition of a Nondeterministic Finite Automaton

CS 154, Lecture 2: Finite Automata, Closure Properties Nondeterminism,

Clarifications from last time. This Lecture. Last Lecture. CMSC 330: Organization of Programming Languages. Finite Automata.

Chapter Five: Nondeterministic Finite Automata

Lecture 4 Nondeterministic Finite Accepters

CISC 4090 Theory of Computation

CMSC 330: Organization of Programming Languages. Theory of Regular Expressions Finite Automata

Computational Models - Lecture 3

Lecture 3: Nondeterministic Finite Automata

Regular Languages. Problem Characterize those Languages recognized by Finite Automata.

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

COMP4141 Theory of Computation

CMSC 330: Organization of Programming Languages

Chapter 6. Properties of Regular Languages

GEETANJALI INSTITUTE OF TECHNICAL STUDIES, UDAIPUR I

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

Before we show how languages can be proven not regular, first, how would we show a language is regular?

Proofs, Strings, and Finite Automata. CS154 Chris Pollett Feb 5, 2007.

Sri vidya college of engineering and technology

CSE 105 THEORY OF COMPUTATION

CS5371 Theory of Computation. Lecture 5: Automata Theory III (Non-regular Language, Pumping Lemma, Regular Expression)

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

CS21 Decidability and Tractability

Automata and Computability. Solutions to Exercises

Automata & languages. A primer on the Theory of Computation. Laurent Vanbever. ETH Zürich (D-ITET) September,

Chapter 2 The Pumping Lemma

Languages, regular languages, finite automata

Formal Definition of a Finite Automaton. August 26, 2013

Proving languages to be nonregular

TAFL 1 (ECS-403) Unit- II. 2.1 Regular Expression: The Operators of Regular Expressions: Building Regular Expressions

Automata and Computability. Solutions to Exercises

Theory of Computation

CS21 Decidability and Tractability

CS 121, Section 2. Week of September 16, 2013

Finite Automata and Regular Languages

CSE 105 Theory of Computation Professor Jeanne Ferrante

Tasks of lexer. CISC 5920: Compiler Construction Chapter 2 Lexical Analysis. Tokens and lexemes. Buffering

Nondeterminism and Epsilon Transitions

Name: Student ID: Instructions:

Lecture 1: Finite State Automaton

Properties of Regular Languages. BBM Automata Theory and Formal Languages 1

Lecture 2: Regular Expression

Automata and Languages

3515ICT: Theory of Computation. Regular languages

Nondeterminism. September 7, Nondeterminism

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Inf2A: Converting from NFAs to DFAs and Closure Properties

Equivalence of DFAs and NFAs

Finite Automata. Seungjin Choi

CSE 105 Homework 1 Due: Monday October 9, Instructions. should be on each page of the submission.

Finite State Automata

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Unit 6. Non Regular Languages The Pumping Lemma. Reading: Sipser, chapter 1

Transcription:

Regular Languages Andreas Karwath & Malte Helmert 1

Overview Deterministic finite automata Regular languages Nondeterministic finite automata Closure operations Regular expressions Nonregular languages The pumping lemma 2

Finite Automata An intuitive example : supermarket door controller front pad rear pad rear both neither closed front open front rear both door Top view of an automatic door neither State diagram for the automatic door controler Probabilistic counterparts exist Markov chains, Bayesian nets, etc. Not in this course Transition table for the automatic door controler: neither front rear both closed closed open closed closed open closed open open open 3

A finite automaton Figure 1.4 Formally 1 0 0 States : q, q, q Startstate : q 1 q 1 q 2 1 2 3 1 Acceptstate : q q 3 0,1 Transitions Output : accept or 2 reject A finite automaton is a 5-tuple ( Q, Σ, δ, q, F) 1. Qis a finite set of states 2. Σ is a finite set, the alphabet 3. δ : Q Σ Qis the transition function 4. q Q is the start state o 5. F Qis the set of accept states 0 4

1 0 0 1 q 1 q 2 q 1 3 0 Ais the language of machine M we write LM ( ) = A A= { w wcontains at least one 1 and an even number of 0s follows the last 1 } Describe Q= { q, q, q } 1 1 1 2 3 Σ= {0,1} δ defined by 0 1 q F 1 1 2 2 3 2 3 2 2 start state = { q } 2 M q q q q q q q q q 5

0 1 1 q 1 q 2 State diagram of the two-state finite automaton M 2 0 0 1 1 q 1 q 2 State diagram of the two-state finite automaton M 3 0 6

Other examples 7,8,9 7

Another example A generalisation : A i is the language of all strings where the sum of the numbers is a multiple of iexcept that the sum is reset to 0 whenever the symbol reset appears Automaton B = 1. Q = { q,..., i 0 qi 1 } 2. Σ ={0,1,2, reset } 3. δ ( q,0) = q j i δ ( q,1) = q where k = ( j+ 1) modi j δ ( q,2) = q where k = ( j+ 2) mod i j δ ( q, reset ) = q j k j k 4. q Q is start and accept state o 0 8

Formal definition of computation Let M be a finite automaton ( Q, Σ, δ, q, F) Let w= w... w be a string over Σ 1 n 0 M accepts wif a sequence of states r,..., r exists in Q such that 1. 0 0 δ i i+ 1 i+ 1 2. ( r, w ) = r for all i = 0,..., n 1 3. r r n = q F 0 n M recognizes language Aif A= { w M accepts w} A language is regular if some finite automaton recognizes it. 9

Designing finite automata Design automaton for language consisting of binary strings with an odd number of 1s Design first states 0 1 Then transitions q even Start state and accept states 1 0 q odd 10

Another example Design an automaton to recognize the language of binary strings containing the string 001 as substring We have four possibilities: 1. we haven t seen any symbol of the pattern yet, or 2. we have seen a 0, or 3. we have seen a 00, or 4. we have seen the pattern 001 11

Another example Design an automaton to recognize the language of binary strings containing the string 001 as substring We have four possibilities: 1. we haven t seen any symbol of the pattern yet, or 2. we have seen a 0, or 3. we have seen a 00, or 4. we have seen the pattern 001 1 0 0 0, 1 0 1 q q 0 q 00 q 001 1 12

The Regular Operations Let A and B be languages We define : Union : A B= { x x A or x B} Concatenation : Ao B= { xy x A and y B} Star * : A = { x1x2... xn n 0 and each xi A} note: always A * Example A = { good, bad} B = { boy, girl} A B= { good, bad, boy, girl} Ao B = { goodboy, goodgirl, badboy, badgirl} * A = {, good, bad, goodgood, goodbad, badgood, badbad, goodgoodgood, goodgoodbad,...} 13

Regular languages are closed under A set S is closed under an operation oif applying oon elements of S yields elements of S. Example: multiplication on natural numbers Counterexample :division of natural numbers Theorem 1.12 The class of the regular languages is closed under the union operation. In other words, if A and A are regular languages, so is A A 1 2 1 2 14

Proof 1.12 (by construction) Let M recognize A, where M = (Q, Σ, δ,q,f ), and 1 1 1 1 1 1 1 M recognize A, where M = (Q, Σδ,,q,F ). 2 2 2 2 2 2 2 Construct M to recognize A A, where M = (Q, Σ,δ, q,f ). 1 2 0 { } 1. Q = (r,r ) r Q and r Q. 1 2 1 1 2 2 This set is the Cartesian product of sets Q and Q (written Q Q ). 1 2 1 2 1 2 It is the set of all pairs of states, the first from Q and the second from Q. 1 2 2. Σ, the alphabet, is the same as in M and M. The theorem remains true if they have different alphabets, Σ and Σ. We would then modify the proof to let Σ = Σ Σ. 1 2 1 2 15

3. δ, the transition function, is defined as follows. For each (r,r ) Q and each a Σ, let δ(( r,r ),a ) = ( δ ( r,a ), δ ( r,a )). 1 2 1 1 2 2 Hence δ gets a state of M (which actually is a pair of states from M and M ), together with an input symbol, and returns M's next state. 4. q is the pair (q,q ). 1 2 1 2 0 1 2 F M1 M2 5. is the set of pairs in which either member is an accept state of and. We can write it as F = {(r,r ) r F or r F }. 1 2 1 1 2 2 This expression is the same as F = (F Q ) (Q F ). 1 2 1 2 Note that it is not the same as F = F F. What would that give us? 1 2 16

Example M 1 with L(M 1 ) = {w w contains a 1} 0 0,1 1 q 1 q 2 M = ( Q, Σ, δ, q, F) constructed from M = ( Q, Σ, δ, q, F) and M = ( Q, Σ, δ, q, F ) Define 1. Q= {( r, r ) r Q and r Q } 2. Σ=Σ Σ 1 1 1 1 1 1 2 2 2 2 2 2 1 2 1 1 2 2 1 2 3. δ(( r, r ), a) = ( δ ( r, a), δ ( r, a)) 1 2 1 1 2 2 4. q = ( q, q ) 1 2 5. F = {( r, r ) r F or r F } 1 2 1 1 2 2 M 2 with L(M 2 ) = {w w contains at least two 0s} 1 1 0,1 0 0 p 1 p 2 p 3 17

Theorem 1.13 The class of the regular languages is closed under the concatenation operation. In other words, if A and A are regular languages, so is A o A 1 2 1 2 18

Non deterministic finite automata Deterministic One successor state transitions not allowed Non deterministic Several successor states possible transitions possible 0,1 0,1 1 0, 1 q 1 q 2 q 3 q 4 19

Deterministic versus non deterministic computation Figure 15 20

0,1 0,1 q 1 1 0, q 2 q 3 1 q 4 0 q 1 q 1 1 Input: w = 010110 0 q 1 q 2 q 3 1 q 1 q 3 1 q 1 q 2 q 3 q 4 0 q 1 q 2 q 3 q 4 q 4 q 1 q 3 q 4 q 4 21

Another NFA 22

Nondeterministic finite automaton A nondeterministic finite automaton is a 5-tuple ( Q, Σ, δ, q, F) 1. Qis a finite set of states 2. Σ is a finite set, the alphabet 3. δ : Q Σ P( Q) is the transition function 4. q Q is the start state o 5. F Qis the set of accept states 0 Σ includes P( Q) the powerset of Q 23

Example 0,1 1 0, 0,1 q 1 q 2 q 3 q 4 1 1. Q= { q, q, q, q } 1 2 3 4 2. Σ = {0,1} 3. δ is given as: 0 1 q { q} { q, q } {} 1 1 1 2 q { q } {} { q } q 2 3 3 3 4 4 4 4 1 {} { q } {} q { q } { q } {} 4. q is the start state 5. F = { q } 4 24

Formal definition of computation Let M be a nondeterministic finite automaton ( Q, Σ, δ, q, F) Let w= w... w be a string over Σ 1 n 0 M accepts wif a sequence of states r,..., r exists in Q such that 1. r = q 0 0 2. r δ ( r, w ) for all i = 0,..., n 1 i+ 1 i i+ 1 3. r n F 0 n 25

Every NFA has an equivalent DFA 26

Equivalence NFA and DFA Two machines are equivalent if they recognize the same language Theorem 1.19 Every nondeterministic finite automaton has an equivalent finite automaton Corollary 1.20 A language is regular if and onl yi f automaton recognizes it. some nondeterministic finite 27

Proof: Theorem 1.19 Let N = (Q, Σ, δ,q,f ) be the NFA recognizing some language A. 0 0 Construct a DFA M recognizing A. First we consider the easier case wherein N has no arrows. The arrows are taken into account later. 28

Proof: Theorem 1.19 (cont.) ' ' Construct M = (Q', Σ, δ 0,q 0,F'). 1. Q' = P(Q ). Every state of M is a set of states of N. (Recall that P(Q) is the power set of Q). 2. For R Q' and a Σ let δ '(R,a) = {q Q q δ(r,a) for some r R}. If R is a state of M, it is also a set of states of N. When M reads a symbol a in state R, it shows where a takes each state in R. Because each state leads to to a set of states, we take the union of all these sets. Alternativly we write: 3. q = {q }. ' 0 0 U δ '( R,a ) = δ( r,a ). r R M starts in the state corresponding to the collection containing just the start state of N. 4. F' = {R Q' R contains an accept state of N }. The machine M accepts if one of the possible states that N could be in at this point is an accept state. 29

Proof: Theorem 1.19 (cont.) Now for the arrows one needs to set up an extra bit of notation. For any state R of M we define E(R) to be the collection of states that can be reached from R by going only along arrows, including the members of R themselves. Formally, for R Q let E( R ) = { q q can be reached from R by traveling along 0 or more arrows }. The transition function of M is then modified to take int o account all states that can be reached by going along arrows after every step. Replacing δ(r,a) by E( δ(r, a)) achieves this. Thus δ '( R,a ) = { q Q q E( δ( r,a )) for some r R }. Additionally the start state of M has to be modified to cater for all possible states that can be reached from the start state of N along the arrows. Changing q to be E({q }) achieves this effect. ' 0 0 We have now completed the construction of the DFA M that simulates the NFA N. 30

An example 31

An example The resulting DFA The resulting DFA after removing redundant states 32

Closure under the regular operations Theorem 1.12/1.22 The class of the regular languages is closed under the union operation. In other words, if A and A are regular languages, so is A A 1 2 1 2 Theorem 1.23 The class of the regular languages is closed under the concatenation operation. Theorem 1.24 The class of the regular languages is closed under the star operation. 33

Proof idea Theorem 1.12/1.22 The class of the regular languages is closed under the union operation. In other words, if A and A are regular languages, so is A A 1 2 1 2 34

Proof 1.12/1.22 Let N = (Q, Σ, δ,q,f ) recognize A, and 1 1 1 1 1 1 N = (Q, Σ, δ,q,f ) recognize A. 2 2 2 2 2 2 Construct N = (Q, Σ, δ,q,f ) to recognize A A. 0 1 2 1. Q = {q } Q Q. 0 1 2 The states of N are all the states of N and N, with the addition of a new start state q. 2. The state q is the start state of N. 0 3. The accept states F = F F. 1 2 1 2 1 2 The accept states of N are all the accept states of N and N. That way N accepts if either N accepts or N accepts. 4. Define δ so that for any q Q and any a Σ, δ1(q,a) q Q1 δ2(q,a) q Q2 δ (q,a) = {q 1,q 2} q= q0 and a= q = q0 and a 1 2 0 35

Proof idea Theorem 1.23 The class of the regular languages is closed under the concatenation operation. 36

Proof 1.23 Let N = (Q, Σ, δ,q,f ) recognize A, and 1 1 1 1 1 1 N = (Q, Σ, δ,q,f ) recognize A. 2 2 2 2 2 2 Construct N = (Q, Σ, δ,q,f ) to recognize A o A. 1 2 1 2 1. Q = Q1 Q 2. The states of N are all the states of N1 and N 2. 2. The state q is the same as the start state of N. 1 1 3. The accept states F are the same as the accept states of N. 2 2 4. Define δ so that for any q Q and any a Σ, δ (q,a) q Q and q F δ (q,a) q F and a 1 1 1 1 1 δ (q,a) = δ 1(q,a) {q 2} q = F1 and a = δ (q,a) q Q 2 2 37

Proof idea Theorem 1.24 The class of the regular languages is closed under the star operation. 38

Proof 1.24 Let N = (Q, Σ, δ,q,f ) recognize A. 1 1 1 1 1 1 Construct N = (Q, Σ, δ,q,f ) recognize A. * 0 1 1. Q = {q } Q. 0 1 The states of N are the states of N plus a new start state. 2. The state q is the new start state. 3. 0 F = {q } F 0 1 The accept states are the old accept states plus the new start state. 4. Define δ so that for any q Q and any a Σ, 1 δ1(q,a) q Q and q F δ 1(q,a) q F1 and a δ (q,a) = δ 1(q,a) {q 1} q F1 and a = {q 1} q= q0 and a= q = q0 and a 1 1 39

Regular expressions Definition Say that R is a regular expression if R is 1. a for some a in the alphabet Σ, 2., 3., 4. (R R ), where R and R are regular expressions, 1 2 1 2 5. (R o R ), where R and R are regular expressions, or 1 2 1 2 6. R, where Ris a regular expression. * 1 1 40

RE Examples In the following examples we assume that the alphabet Σ is {0,1}. * * 1. 010 = {w whas exactly a single 1}. 2. Σ 1 Σ = {w w has at least one 1}. 3. Σ 001 Σ = { w w contains the string 001 as a substring }. 4. ( ΣΣ) = {w w is a string of even length }. * 5. ( ΣΣΣ ) = {w the length of w is a multiple of three }. 6. 01 10 = {01,10 }. 7. 0Σ 0 1Σ 1 0 1 = { w w starts and ends with the same symbol }. 41

RE Examples (cont.) 8. 9. * * * (0 )(1 ) = 01 1. The expression 0 describes the language {0, }, so the concatenation * operation adds either 0 or before every string in 1. (0 )(1 ) = {,0,1,01}. * 10. 1 =. Concatenating the empty set to any set yields the empty set. * 11. = { }. The star operation puts together any number of strings from the language to get a string in the result. If the language is empty, the star operation can put together 0 string, giving only the empty string. R R R R = R o = R o R R 42

Applications Design of compilers * * * * { +,, }( DD DD. D D. DD ) where D = {0,...,9} awk, grep, vi in unix (search for strings) Perl, Python, or Java programming languages Bioinformatics So called motifs (patterns occurring in sequences, e.g. proteins) 43

Equivalence RE and NFA Theorem 1.28 A language is regular if and only if some regular expression describes it Proof through : Lemma 1.29 If a language is described by some regular expression, then it is regular Lemma 1.32 If a language is regular, then it is described by some regular expression 44

Proof for Lemma 1.29 Convert R into an NFA N. Consider the six cases in the formal definition of regular expressions. 1. R= a for some a in Σ. Then L( R ) = { a }, and the following NFA recognizes L( R ). a Note that this machine fits the definition of an NFA but not that of a DFA, as not all input symbols possess exiting arrows. Formally, N = ({ q,q }, Σ, δ,q,{ q }), where we describe δ by saying that δ (q,a) = { q }, δ (r,b) = for r q or b a. 1 2 1 2 1 2 1 45

Proof for Lemma 1.29 (cont.) 2. R =. Then L(R) = { }, and the following NFA recognizes L(R). Formally, N = ({q }, Σ, δ,q,{q }), 1 1 1 where δ (r,b) = for any r and b. 46

Proof for Lemma 1.29 (cont.) 3. R =. Then L(R) =, and the following NFA recognizes L(R). Formally, N = ({q}, Σ, δ,q, ), where δ (r,b) = for any r and b. 47

Proof for Lemma 1.29 (cont.) 4. 5. 6. R= R R. 1 2 R= R o R. R = R. 1 2 * 1 For the last three cases we use the constructions given in the proofs that the class of regular languages is closed under the regular operations. In other words, we construct the NFA for R from the NFAs for R and R (or just R in case 6) and the appropriate closure construction. 1 2 1 48

Example 1.30 * We convert the regular expression (ab a) to an NFA in a sequence of stages. We build up from the smallest subexpressions to larger subexpressions until we have an NFA for the original expression, as shown in the following diagram. Note that this procedure generally doesn't give the NFA with the fewest states! 49

Example: NFA for: (ab a)* a: a b: b ab: a b ab a a b a (ab a)* a a b 50

Exercise: NFA for: (a b)*aba a: b: a b a b a b (a b)* a b 51

acs-04: Regular Languages 52 a b a b Example: NFA for: (a b)*aba (cont.) aba: (a b)*aba: a b a a b a a b a

Lemma 1.32 If a language is regular, then it is described by some regular expression Two steps DFA into GNFA (generalized nondeterministic finite automaton) Convert GNFA into regular expression 53

GNFAs Labels are regular expressions Two states q and r are connected in both directions (fully connected) Exception : One direction only Start state (exiting transition arrows) Accept state (only one!) (only incoming transition arrows) b q start Ø ab ab * a * (aa) * b * q accept aa ab ba 54

Formally A generalized nondeterministic finite automaton is a 5-tuple ( Q, Σ, δ, q, q ) 1. Qis a finite set of states 2. Σ is a finite set, the alphabet 3. δ : ( Q { q }) ( Q { q }) R is the transition function accept 4. q Q is the start state start start 55 start accept 5. qaccept Q the accept state * A GNFA accepts w= w1... wk where each wi Σ if a sequence of states r,..., r exists in Q such that 1. r 0 2. r k = = q q start accept 3. for all i = 0,..., n 1, we ha ve that w L( R) where R = δ ( r, r) i i 1 i 0 n i i

Convert DFA into GNFA Add new start state, with arrow to old start state Add new accept state, with arrows from old accept states If any arrows have multiple labels a and b, replace by a b Add arrows with label between states where necessary (*:between states that had no arrows before) * 56

Convert GNFA into regular expression 3 state DFA 5 state GNFA 4 state GNFA Regular Expression 2 state GNFA 3 state GNFA 57

Ripping of states Replace one state by the corresponding RE R 4 q 1 q 2 (R 1 )(R 2 )* ) (R 3 ) R 4 q 1 q 2 R 1 q rip R 3 R 2 58

Convert(G) Convert( G ) : 1. Let k be the number of states of G. 2. If k = 2, then G must consist of a start state, an accept state, and a single arrow connectiong them and labeled with a regular expression R. Return the expression R. 3. If k > 2, we select any state q Q different from q and q and let G' be the GNFA (Q', Σδ, ',q,q ), where start rip start accept accept Q' = Q {q }, and for any q Q' {q } and any q Q' {q } let rip i accept j start δ '(q,q ) = (R )(R ) (R ) (R ), * i j 1 2 3 4 for R =δ (q,q ),R =δ (q,q ),R =δ (q,q ), and R =δ(q,q ). 1 i rip 2 rip rip 3 rip j 4 i j 4. Compute Convert( G') and return this value. 59

Example R 4 q 1 q 2 q 1 (R 1 )(R 2 )* (R 3 ) R 4 q 2 R 1 q rip R 3 R 2 Rip 2: Rip 1: 60

Another Example DFA: a 1 2 b a b 3 a b GNFA: s a 1 2 b a b 3 a b a Rip 1: Rip 2: 2 a s b ab 3 aa b b aba a a bb (ba a) (aa b)*ab bb s a(aa b)*ab b a(aa b)* 3 a (ba a) (aa b)* Rip 3: s (a(aa b)*ab b)((ba a) (aa b)*ab bb)*((ba a) (aa b)* ) a(aa b)* a 61

Induction Proof Claim For any GNFA G, Convert( G ) is equivalent to G. R 4 q 1 q 2 q 1 (R 1 )(R 2 )* (R 3 ) R 4 q 2 We prove this claim by induction on k, the number of states of the GNFA. R 1 q rip R 2 Basis: Prove the claim true for k = 2 states. If G has only two R 2 states, it can have only a single arrow, which goes from the start state to the accept state. The regular expression label on this arrow describes all the strings that allow G to get to the accept state. Hence this expression is equivalent to G. Induction step: Assume that the claim is true for k 1states and use this assumption to prove that the claim is true for k states. First we show that G and G' recognize the same language. Suppose that G accepts an input w. Then in an accepting branch of the computation G enters a sequence of states q,q,q,q,...,q. start 1 2 3 accept If none of them is the removed state q, clearly G' also accepts w. The reason is that each of the new regular expressions labeling the arrows of G' contains the old regular expression as part of a union. rip 62

Induction Proof (cont.) q 1 R 4 q 2 q 1 (R 1 )(R 2 )* (R 3 ) R 4 q 2 If q does appear, removing each run of consecutive q states rip forms an accepting computation for G'. The states q and q bracketing a run have a new regular expression on the arrow between them that i rip j R 1 q rip R 2 describes all strings taking q to q via q on G. So G' accepts w. i j rip For the other direction, suppose that G' accepts an input w. As each R 2 arrow between any two states q and q in G' describes the collection i j of strings taking q to q in G, either directly or via q,g must also i j rip accept wthus G and G' are equivalent. The induction hypothesis states that when the alorithm calls itself recursively on input G', the result is a regular expression that is equivalent to G' because G' has k 1 states. Hence the regular expression also is equivalent to G, and the algorithm is proved correct. This concludes the proof of Claim 1.34, Lemma 1.32, and theorem 1.28. 63

Nonregular Languages Finite Automata have a finite memory Are the following languages regular? n n B= {0 1 n 0} C = { w whas an equal number of 0s and 1s} D= { w whas an equal number of occurences of 01 and 10} Mathematical proof necessary 64

The pumping lemma If A is regular language, then there is a number p (the pumping length), where, if s is any string in A of length at least p then s may be divided into three pieces s = xyz such that i 1. for each i 0, xy z A 2. y > 0 3. xy p Note from 2: y 65

Proof Idea Let M be a DFA recognizing A. Assign p to be the number of states in M. Show that string s, with length at least p, canbebrokenintoxyz. Pigeonhole principle Now prove that all three conditions are met 66

Proof: Pumping Lemma Let M = (Q, Σ, δ, q 1, F) be a DFA recognizing A and Q = p. Let s = s 1 s 2...s n be a string in A, with s = n, and n p Let r =r 1,...,r n+1 be the sequence of states that M enters for s, so r i+1 = δ(r i,s i ) with 1 i n. r 1,...,r n+1 = n+1, n+1 p+1. Amoung the first p+1 elements in r, there must be a r j and a r l being the same state q m, with j l. As r l occurs in the first p+1 states: l p+1. Let x = s 1...s j-1, y = s j...s l-1 and z = s l...s n : as x takes M from r 1 to r j, y from r j to r l, and z from r l to r n+1, being an accept state, M must accept xy i z for i 0 with j l, y > 0 with l p+1, xy p 67

Pumping Lemma (cont.) Use pumping lemma to prove that a language A is not regular: 1. Assume that A is regular (Proof by contradiction) 2. use the lemma to guarantee the existence of p, such that strings of length p or greater can be pumped 3. find string s of A, with s p that cannot be pumped 4. demonstrate that s cannot be pumped using all different ways of dividing s into x,y, and z (using condition 3. is here very useful ) 5. the existence of s contradicts the assumption, therefore A is not a regular language 68

Nonregular languages examples B n n = {0 1 n 0} p Choose s = 0 1 p for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p If we would now only consider condition2, then we would have that: 1. string y consists only of 0s 2. string y consists only of 1s 3. string y consists of both 0s and 1s 69

C = { w whas an equal number of 0s and 1s} p Choose s = 0 1 p Would seem possible without condition 3! However, condition 3 of lemma states xy Thus y consists of 0s only Then xyyz C p for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p Choc i e of s crucial. Consider s = (01) p Alternative proof : B is nonregular If C were C = B * * regular, then 0 1 regular Regular languages closed under intersection 70

Example language B again B n n = {0 1 n 0} for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p p Choose s = 0 1 p condition 3 of lemma states xy Thus y consists of 0s only Then xyyz B p 71

F = * { ww w {0,1}} p p Choose s = 0101 Condition 3 of lemma states xy Thus y consists of 0s only Then xyyz F p for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p p p 0 0 would not work, as it can be pumped! 72

i j E = { w 01 where i > j} p+ p Choose s = 0 1 Condition 3 of lemma states xy Thus y consists of 0s only Then 0 xy z F p for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p 73

Example Exam Question for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p Q: Use the pumping lemma to prove that L = {0 k 1 j : k,j 0 and k 2j} is not regular. A: Assume that L = {0 k 1 j : k,j 0 and k 2j} is regular. Let p be the pumping length of L. The pumping lemma states that for any string s L of at least length p, there exist string x,y, and z such that s = xyz, xy p, y > 0, and for all i 0: xy i z L. Choose s = 0 2p 1 p. Because s L and s =3p p, we obtain from the pumping lemma the strings x,y, and z with the above properties. As s = xyz, xy p, and s begins with 2p zeros, one can see that xy can only consist of zeros. If we pump s down, i.e. select i = 0, the string xy 0 z = xz = 0 2p- y 1 p. As xz has p ones, and y > 0, xz has fewer than 2p zeros. Hence xz L CONTRADICTION. Therfore L is not regular! 74

Example Exam Question for s p: i 1. for each i 0, xy z A 2. y > 0 3. xy p Q: Use the pumping lemma to prove that L = {0 k 1 j : k,j 0 and k 2j} is not regular. A: Assume that L = {0 k 1 j : k,j 0 and k 2j} is regular. Let p be the pumping length of L. The pumping lemma states that for any string s L of at least length p, there exist string x,y, and z such that s = xyz, xy p, y > 0, and for all i 0: xy i z L. Choose s = 0 2p 1 p. Because s L and s =3p p, we obtain from the pumping lemma the strings x,y, and z with the above properties. As s = xyz, xy p, and s begins with 2p zeros, one can see that xy can only consist of zeros. If we pump s down, i.e. select i = 0, the string xy 0 z = xz = 0 2p- y 1 p. As xz has p ones, and y > 0, xz has fewer than 2p zeros. Hence xz L CONTRADICTION. Therfore L is not regular! 75

Summary Deterministic finite automata Regular languages Nondeterministic finite automata Closure operations Regular expressions Nonregular languages The pumping lemma 76