INTRODUCTION TO AUTOMATA THEORY

Chapter 3 INTRODUCTION TO AUTOMATA THEORY

In this chapter we study the most basic abstract model of computation. This model deals with machines that have a finite memory capacity. Section 3.1 deals with machines that operate deterministically for any given input while Section 3.2 deals with machines that are more flexible in the way they compute (i.e., nondeterministic choices are allowed).

3.1 Deterministic Finite-State Machines

A computer's ability to recognize specified patterns is desirable for many applications such as text editors, compilers, and databases. We will see that many of these patterns can be efficiently recognized with the simple finite-state machines discussed in this chapter.

3.1.1 The cheap engineer's elevators

What kind of machine has only a finite amount of memory? Of course, you may first think that a desktop computer's memory (e.g., one with 64M RAM) is finite. But this is not exactly true since we have external memory (hard drives, tape drives, and floppy disks) that could extend the memory capacity to an arbitrarily large limit. We have a much simpler model in mind where a machine is viewed as a closed computation box with only a read-only tape head (or toggle switches) for external input. That is, the internal part of the machine can be in one of a finite number of states. A certain subset of these states, called accepting states, will indicate that the computation has been successful. If the machine halts in an accepting state, we accept (or recognize) the input as valid.

COMPSCI.FT

To illustrate this further we consider an over-simplified construction of an elevator control mechanism. Of course, in this example, we are not planning to recognize valid input, just to show how a real-world finite-state device operates.

First consider an elevator that moves between two levels. We will build a device with two states $\{1, 2\}$, where the state number corresponds to the floor on which the elevator is currently located. To save cost we have two types of inputs, UP and DOWN: a button on each floor indicating that a person on one of the floors wants to go to the other. The state changes, called transitions, of this elevator can be depicted in the following table format or graphical diagram format. The entries in the table denote the new state of the machine after an input is received (the directed arcs of the digraph diagram denote the same thing). There are four cases to consider. E.g., if the elevator is on floor 2 and the DOWN button is pressed then the elevator should move to floor 1.

[Table and figure: transition table (inputs DOWN and UP, states 1 and 2) and the equivalent state diagram of the two-floor elevator]

We can extend this low budget elevator example to handle more levels. With one additional floor we will have more combinations of buttons and states to deal with. At floor 2 we need both an up button, U2, and a down button, D2. For floor 1 we just need an up button, U1, and likewise for floor 3, we just need a down button, D3. This particular elevator with three states is represented as follows:

[Table and figure: transition table (input buttons U1, U2, D2, D3; states 1, 2, 3) and the equivalent state diagram of the three-floor elevator]

One may see that the above elevator probably lacks functionality, since to travel two levels one has to press two buttons. Nevertheless, these two small examples should indicate what we mean by a finite-state machine.

3.1.2 Finite-state machines that accept/reject strings

We now consider finite-state machines where the input is from some finite character alphabet $\Sigma$. Our examples will mainly use small sets of letters or digits, but in practice the alphabet may be as big as the set of 7-bit ASCII characters commonly used by computers. To do real computations we need a notion of an initial or starting state; we also need some means to determine whether the result of our computation is successful or not. To achieve this we need to designate a unique starting state and classify each state as an accepting or rejecting state. A formal definition of our (first) finite-state computation model is given next.

Definition 40. A deterministic finite automaton (DFA) is a five-tuple $M = (Q, \Sigma, \delta, s, F)$ where
1. $Q$ is the finite set of machine states.
2. $\Sigma$ is the finite input alphabet.
3. $\delta$ is a transition function from $Q \times \Sigma$ to $Q$.
4. $s \in Q$ is the start state.
5. $F \subseteq Q$ is the set of accepting (membership) states.

Notice that the set of rejecting states is determined by the set difference $Q \setminus F$. Other authors sometimes define the next-state function $\delta$ as a partial function (i.e., not all states accept all inputs).

Example 41. A very simple DFA example is $M = (Q, \Sigma, \delta, s, F)$, where $\delta$ is represented in two different ways below.

[Table and figure: transition table and state diagram of the DFA of Example 41]

In the graphical representation we use double circles to denote the accepting states $F$ of $Q$. Also the initial state $s \in Q$ has an isolated arrow pointing at it.

Example 42. A more complicated DFA example is $M$ below with states $Q = \{a, b, c, d, e\}$, input alphabet $\Sigma = \{1, 2, 3, 4\}$, a start state $s$ and a set of accepting states $F$ that includes $e$; $\delta$ is represented by the following transition table.

[Table: transition table of the DFA of Example 42, with one row per state and one column per input symbol]
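The five-tuple of Definition 40 translates almost directly into code. The following is a minimal sketch, not part of the original notes: the class name `DFA`, the state names `p` and `q`, and the example machine (words over 0 and 1 that end in 1) are all assumptions chosen for illustration.

```python
class DFA:
    """A deterministic finite automaton M = (Q, Sigma, delta, s, F)."""

    def __init__(self, states, alphabet, delta, start, accepting):
        self.states = frozenset(states)
        self.alphabet = frozenset(alphabet)
        self.delta = dict(delta)              # maps (state, symbol) -> state
        self.start = start
        self.accepting = frozenset(accepting)
        # delta must be a total function from Q x Sigma to Q.
        for q in self.states:
            for c in self.alphabet:
                if self.delta.get((q, c)) not in self.states:
                    raise ValueError(f"delta({q!r}, {c!r}) is missing or not a state")
        if self.start not in self.states or not self.accepting <= self.states:
            raise ValueError("start and accepting states must belong to Q")

    def accepts(self, word):
        """Follow one transition per input symbol; accept if we halt in F."""
        q = self.start
        for c in word:
            if c not in self.alphabet:
                raise ValueError(f"symbol {c!r} is not in the alphabet")
            q = self.delta[(q, c)]
        return q in self.accepting


# Hypothetical example machine: accepts the words over {0, 1} that end in 1.
ends_in_one = DFA(
    states={"p", "q"},
    alphabet={"0", "1"},
    delta={("p", "0"): "p", ("p", "1"): "q",
           ("q", "0"): "p", ("q", "1"): "q"},
    start="p",
    accepting={"q"},
)
```

Calling `ends_in_one.accepts("0101")` walks the states p, p, q, p, q and reports acceptance because the run halts in an accepting state. The constructor rejects a partial transition function, which matches the total-function convention of Definition 40 rather than the partial-function convention some other authors use.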

It is easy to generate a directed graph (self-loops allowed) representation from $\delta$ above. We just view $\delta$ as an arc relationship on the states $Q$. Notice how we combine arcs (into one) with different labels between the same two states for ease of presentation, as done for this view of DFA $M$.

[Figure: state diagram of the DFA of Example 42, with arcs between the same two states combined and labelled with several input symbols]

3.1.3 Recognizing patterns with DFA

There are two main questions we have at this point concerning DFA and the process of pattern recognition of strings of characters.

- For a given DFA, what strings does it accept?
- For a given set of strings, can we build a DFA that recognizes just them?

Before proceeding we need a name for the set of inputs accepted by some automaton.

Definition 43. For a DFA $M$, the set of strings (words) accepted by $M$ is called the language $L(M)$ decided (recognized) by $M$. The set $L(M)$ is simply a subset of $\Sigma^*$, all character sequences of the input alphabet $\Sigma$. We will see later in Section 3.4 that the languages recognizable by finite automata are exactly those expressible as regular expressions.

Example 44. For the DFA $M$ listed below, $L(M)$ is the set of strings (over $\Sigma = \{1, 2, 3\}$) that contain a fixed substring ending in 3.

[Figure: state diagram of the DFA of Example 44]
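The first question above (which strings does a given DFA accept?) can be explored by brute force for short words. The sketch below is an illustration under stated assumptions: the DFA is a hypothetical one over the alphabet 1, 2, 3 that accepts the strings containing the substring 123 (the exact substring used in Example 44 is not assumed to be this one), and the names `DELTA`, `accepts` and `language_up_to` are invented for this example.

```python
from itertools import product

ALPHABET = "123"
# Assumed machine: state k means "the last k symbols read form a prefix of 123".
# State 3 is accepting and is never left once reached.
DELTA = {
    (0, "1"): 1, (0, "2"): 0, (0, "3"): 0,
    (1, "1"): 1, (1, "2"): 2, (1, "3"): 0,
    (2, "1"): 1, (2, "2"): 0, (2, "3"): 3,
    (3, "1"): 3, (3, "2"): 3, (3, "3"): 3,
}
START, ACCEPTING = 0, {3}


def accepts(word):
    """Run the DFA on word and report whether it halts in an accepting state."""
    state = START
    for c in word:
        state = DELTA[(state, c)]
    return state in ACCEPTING


def language_up_to(max_len):
    """List the words of L(M) of length at most max_len, shortest first."""
    words = []
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            word = "".join(letters)
            if accepts(word):
                words.append(word)
    return words
```

For this assumed machine, `language_up_to(4)` contains seven words: 123 itself plus the six words of length four obtained by adding one symbol before or after it. Enumeration only samples $L(M)$, of course; describing the whole (usually infinite) language requires reasoning about the paths of the diagram.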

To compute $L(M)$ we need to classify all possible strings that yield state transitions to an accept state from the initial state. When looking at a graphical representation of $M$, the strings are taken from the character sequences of traversed arcs. Note that more than one string may correspond to a given path because some arcs have more than one label.

Example 45. We construct a DFA that accepts all words with an even number of occurrences of one input symbol and an odd number of occurrences of the other. A parity guide for each of the four states of the DFA is given on the right.

[Figure: four-state DFA with parity guide Even/Even, Odd/Even, Even/Odd, Odd/Odd]

We end this section by mentioning that there are some languages, such as $L = \{x^n y^n \mid n > 0\}$ for two distinct input symbols $x$ and $y$, that are not accepted by any finite-state automaton. Why is $L$ not recognized by any DFA? Well, if we had a DFA $M$ of $m$ states then it would have problems accepting just the set $L$, since the automaton has to keep a different state for each count of the $x$'s it reads before it reads the $y$'s. With only $m$ states, two of the counts $1, 2, \ldots, m+1$ must share a state. If two different counts $i$ and $j$, $i < j$, of $x$'s share the same state then $x^j y^i$ would be accepted by $M$, which is not in $L$.

3.2 Nondeterministic Finite-State Machines

Nondeterminism allows a machine to select one of several state transitions randomly. This includes a choice of initial state. This flexibility makes it easier (for a human designer) to build an automaton that recognizes strings in a particular language. Below we formally define this relaxed model of computation. We will see in the next section how to (algorithmically) produce an equivalent deterministic machine from a nondeterministic one.

Definition 46. A nondeterministic finite automaton (NFA) is a five-tuple $N = (Q, \Sigma, \delta, S, F)$ where
1. $Q$ is the finite set of machine states.
2. $\Sigma$ is the finite input alphabet.
3. $\delta$ is a function from $Q \times \Sigma$ to $2^Q$, the set of subsets of $Q$.
4. $S \subseteq Q$ is a set of start (initial) states.
5. $F \subseteq Q$ is the set of accepting (membership) states.

Notice that the state transition function $\delta$ is more general for NFAs than for DFAs. Besides having transitions to multiple states for a given input symbol, we can have $\delta(q, c)$ undefined (that is, empty) for some $q \in Q$ and $c \in \Sigma$. This means that we can design automata such that no state moves are possible when in some state $q$ and the next character read is $c$ (i.e., the human designer does not have to worry about all cases).

An NFA accepts a string $w$ if there exists a nondeterministic path following the legal moves of the transition function $\delta$ on input $w$ to an accept state.

Other authors sometimes allow the next-state function for an NFA to include epsilon transitions. That is, an NFA's state may change to another state without needing to read the next character of the input string. These state jumps do not make NFAs any more powerful in recognizing languages because we can always add more transitions to bypass the epsilon moves (or add further start states if an epsilon move leaves from a start state). For this introduction, we do not consider epsilon transitions further.

3.2.1 Using nondeterministic automata

We now present two examples of nondeterministic finite-state automata (NFAs).

Example 47. An NFA $N$ with four states $Q = \{a, b, c, d\}$, input alphabet $\Sigma = \{1, 2, 3\}$, a single start state, a single accepting state and transition function $\delta$ is given below:

[Table: transition table of the NFA of Example 47; each entry is a set of states, and some entries are empty]

Note that some state/input pairs have no legal transitions in the above NFA. The corresponding graphical view is given below.

[Figure: state diagram of the NFA of Example 47]
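The acceptance condition of Definition 46 (some legal path ends in an accept state) can be checked without guessing by tracking the set of all states the NFA could currently be in. The following is a minimal sketch under assumptions of our own: the function name `nfa_accepts`, the dictionary encoding of $\delta$, and the three-state example machine are hypothetical and are not the NFA of Example 47.

```python
def nfa_accepts(delta, starts, accepting, word):
    """Return True if some path of the NFA on word ends in an accepting state.

    delta maps (state, symbol) to a set of states; a missing key means that
    no legal move exists for that state/symbol pair.
    """
    current = set(starts)                      # all states reachable so far
    for c in word:
        current = {r for q in current for r in delta.get((q, c), ())}
    return bool(current & set(accepting))


# Hypothetical NFA over {0, 1}: accepts the words whose second-to-last symbol is 1.
# State p loops on every symbol and "guesses" when the second-to-last symbol arrives.
EXAMPLE_DELTA = {
    ("p", "0"): {"p"},
    ("p", "1"): {"p", "q"},
    ("q", "0"): {"r"},
    ("q", "1"): {"r"},
}
EXAMPLE_STARTS = {"p"}
EXAMPLE_ACCEPTING = {"r"}
```

For instance, `nfa_accepts(EXAMPLE_DELTA, EXAMPLE_STARTS, EXAMPLE_ACCEPTING, "0110")` is true because one of the possible paths moves to `q` on the third symbol and to `r` on the last. State `r` has no outgoing moves at all, which is exactly the kind of undefined transition the definition permits.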

We can see that the language $L(N)$ accepted by this NFA $N$ is the set of strings that start with any number of occurrences of two of the input symbols, followed by one of three alternative endings, among them the string 33. We will see how to describe languages such as $L(N)$ more easily when we cover regular expressions in Section 3.4 of these notes.

Example 48. An example NFA with multiple start states is $N$ with six states $Q = \{a, b, c, d, e, f\}$, input alphabet $\Sigma = \{1, 2, 3\}$, a set $S$ of two start states, a set $F$ of two accepting states, and transition function $\delta$ given below:

[Table and figure: transition table and state diagram of the NFA of Example 48]

The above example is somewhat complicated. What set of strings will this automaton accept? Is there another NFA of smaller size (number of states) that recognizes the same language? We have to develop some tools to answer these questions more easily.

3.2.2 The reverse R(L) of a language L

If we have an automaton (either DFA or NFA) $M$ that recognizes a language $L$ we can systematically construct an NFA $M'$ that recognizes the reverse language $R(L)$. The reverse of a string $w = c_1 c_2 c_3 \cdots c_n$ is the string $w' = c_n c_{n-1} \cdots c_1$. If $M$ accepts $w$ then $M'$ accepts $w'$.

Definition 49. The reverse or dual machine $M'$ of an NFA $M$ is constructed as follows:
1. The start states of $M'$ are the accept states of $M$.
2. The accept states of $M'$ are the initial states of $M$.
3. If $q_2 \in \delta(q_1, c)$ in $M$ then $q_1 \in \delta'(q_2, c)$ in $M'$. I.e., all transitions are reversed.

It is easy to see that the dual machine $M'$ of an automaton $M$ recognizes the reverses of the strings that $M$ accepts.

Example 50. The dual machine of Example 44 is given below.
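The three rules of Definition 49 amount to swapping the start and accept sets and flipping every arc. The sketch below illustrates this under assumptions of our own: the function names and the small example NFA (words over 0 and 1 whose second-to-last symbol is 1) are hypothetical, and the dictionary encoding of $\delta$ is just one convenient choice.

```python
def nfa_accepts(delta, starts, accepting, word):
    """Set-of-states simulation of an NFA without epsilon moves."""
    current = set(starts)
    for c in word:
        current = {r for q in current for r in delta.get((q, c), ())}
    return bool(current & set(accepting))


def reverse_nfa(delta, starts, accepting):
    """Build the dual machine: returns (delta', starts', accepting')."""
    reversed_delta = {}
    for (q1, c), targets in delta.items():
        for q2 in targets:
            # Rule 3: an arc q1 --c--> q2 becomes an arc q2 --c--> q1.
            reversed_delta.setdefault((q2, c), set()).add(q1)
    # Rules 1 and 2: start states and accept states swap roles.
    return reversed_delta, set(accepting), set(starts)


# Hypothetical NFA: accepts words over {0, 1} whose second-to-last symbol is 1.
DELTA = {
    ("p", "0"): {"p"},
    ("p", "1"): {"p", "q"},
    ("q", "0"): {"r"},
    ("q", "1"): {"r"},
}
STARTS, ACCEPTING = {"p"}, {"r"}

R_DELTA, R_STARTS, R_ACCEPTING = reverse_nfa(DELTA, STARTS, ACCEPTING)
```

Here the dual machine accepts exactly the words whose second symbol is 1, which is the reverse of the original language. The result is returned as an NFA even when the input is deterministic, since reversing the arcs of a DFA can give a state several outgoing arcs with the same label.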

[Figure: dual machine of the DFA of Example 44]

Notice that the dual machine may not be the simplest machine for recognizing the reverse language of a given automaton.

3.2.3 The closure C(L) of a language L

We want to introduce another useful NFA associated with a given automaton. The closure, $C(L)$, of a language $L$ is defined to be the set of strings that can be formed by concatenating together any number of strings of $L$. Given a DFA (or NFA) $M$ that recognizes a language $L$ we can build an NFA $M'$ that recognizes the closure of $L$ by simply adding transitions from all accept state(s) to the neighbors of the initial state(s).

Example 51. The DFA displayed on the left below accepts only a single word $w$. The closure of this language, $C(L) = \{w^k \mid k \geq 1\}$, is accepted by the NFA on the right.

[Figure: DFA accepting the single word (left) and NFA accepting its closure (right)]

In the above example only one transition arc was added, since the other transition required by the construction already existed.

3.3 Recognition Capabilities of NFAs and DFAs

Although NFAs are easier than DFAs for a human to design, they are not as usable by a (deterministic) computer. This is because nondeterminism does not give precise steps for execution. This section shows how one can take advantage of both the convenience of NFAs and the practicality of DFAs. The following algorithm can be used to convert an NFA $N$ to a DFA $M$ that accepts the same set of strings.

algorithm NFAtoDFA(NFA N, DFA M)
1  The initial state $s_M$ of $M$ is the set of all initial states of $N$, $S_N$.
   $Q_M = \{s_M\}$
2  while $Q_M$ has increased in size
     for each new state $q_M = \{a_1, a_2, \ldots, a_k\} \in Q_M$ do
       for each input $x \in \Sigma$ do
         $\delta_M(q_M, x)$ is the set $q'_M$ of all states of $N$ reachable from some $a_i$ on input $x$.
         (I.e., $q'_M = \{a_j \mid a_j \in \delta_N(a_i, x),\ 1 \leq i \leq k\}$)
         $Q_M = Q_M \cup \{q'_M\}$
       endfor
     endfor
   endwhile
3  The accepting states $F_M$ are the states of $M$ that contain an accepting state of $N$.
   (I.e., $F_M = \{q_M \mid a_i \in q_M \text{ and } a_i \in F_N\}$)
end

The idea behind the above algorithm is to create potentially a state in $M$ for every subset of states of $N$. Many of these states are not reachable, so the algorithm often terminates with a smaller deterministic automaton than the worst case of $2^{|N|}$ states. The running time of this algorithm is $O(|Q_M| \, |\Sigma|)$ if the correct data structures are used.

The algorithm NFAtoDFA shows us that the recognition capabilities of NFAs and DFAs are equivalent. We already knew that any DFA can be viewed as a special case of an NFA; the above algorithm provides us with a method for mapping NFAs to DFAs.

Example 52. For the simple NFA $N$ given on the left below we construct the equivalent DFA $M$ on the right, where $L(N) = L(M)$.

[Figure: NFA N (left) and equivalent DFA M (right), whose states are labelled with sets of states of N]

Notice how the resulting DFA from the previous example has only 5 states compared with the potential worst case of 8 states. This often happens, as is evident in the next example too.

Example 53. For the NFA $N$ given on the left below we construct the equivalent DFA $M$ on the right, where $L(N) = L(M)$.
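The algorithm NFAtoDFA of this section can be sketched in a few lines with a worklist of newly discovered subset states. This is an illustrative sketch, not the notes' own code: the function names, the dictionary encoding of $\delta_N$, and the three-state example NFA (words over 0 and 1 whose second-to-last symbol is 1) are assumptions.

```python
def nfa_to_dfa(alphabet, delta, starts, accepting):
    """Subset construction; DFA states are frozensets of NFA states."""
    start = frozenset(starts)                 # step 1: s_M = S_N
    dfa_states = {start}
    dfa_delta = {}
    worklist = [start]                        # states of M not yet expanded
    while worklist:                           # step 2: stop when Q_M stops growing
        q_m = worklist.pop()
        for x in alphabet:
            target = frozenset(r for a in q_m for r in delta.get((a, x), ()))
            dfa_delta[(q_m, x)] = target
            if target not in dfa_states:
                dfa_states.add(target)
                worklist.append(target)
    # Step 3: a subset is accepting if it contains an accepting state of N.
    dfa_accepting = {q_m for q_m in dfa_states if q_m & set(accepting)}
    return dfa_states, dfa_delta, start, dfa_accepting


def dfa_accepts(dfa_delta, start, dfa_accepting, word):
    state = start
    for x in word:
        state = dfa_delta[(state, x)]
    return state in dfa_accepting


def nfa_accepts(delta, starts, accepting, word):
    current = set(starts)
    for x in word:
        current = {r for q in current for r in delta.get((q, x), ())}
    return bool(current & set(accepting))


# Hypothetical NFA: accepts words over {0, 1} whose second-to-last symbol is 1.
N_DELTA = {
    ("p", "0"): {"p"},
    ("p", "1"): {"p", "q"},
    ("q", "0"): {"r"},
    ("q", "1"): {"r"},
}
N_STARTS, N_ACCEPTING = {"p"}, {"r"}

M_STATES, M_DELTA, M_START, M_ACCEPTING = nfa_to_dfa("01", N_DELTA, N_STARTS, N_ACCEPTING)
```

For this assumed NFA only four of the eight possible subsets are reachable, so `M_STATES` has four elements. If some state/input pair had no legal move in every member of a subset, the empty frozenset would appear as an ordinary DFA state with all of its arcs looping back to itself.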

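[Figure: NFA N (left) and equivalent DFA M (right) for Example 53; one of the DFA states is the empty set]

In the above example notice that the empty subset $\emptyset$ is a state. This is sometimes called the dead state since no transitions are allowed out of it (and it is a non-accept state).

3.4 Regular Expressions

In this section we present a method for representing sets of strings over a fixed alphabet $\Sigma$. We begin with some formal definitions.

Definition 54. A word $w$ over a given alphabet $\Sigma$ is an element of $\Sigma^* = \bigcup_{i=0}^{\infty} \Sigma^i$. The empty word $\epsilon$ contains no symbols of $\Sigma$. A language $L$ is a subset of words. The concatenation of two words $w_1$ and $w_2$, denoted $w_1 w_2$, is formed by juxtaposing the symbols. A product of languages $L_1$ and $L_2$ is $L_1 L_2 = \{w_1 w_2 \mid w_1 \in L_1, w_2 \in L_2\}$. The (Kleene) closure of a language $L$ is defined by $L^* = \bigcup_{i=0}^{\infty} L^i$.

The following property holds for the empty word $\epsilon$ and any word $w$: $\epsilon w = w \epsilon = w$. For any language $L$, $L^0 = \{\epsilon\}$, $L^1 = L$ and $L^2 = LL$.

Example 55. For a two-letter alphabet $\Sigma$ and a suitably chosen finite language $L$, $L^*$ is the set of words, including $\epsilon$, in which each occurrence of one letter is followed by at least one occurrence of the other.

Definition 56. The standard regular expressions over alphabet $\Sigma$ (and the sets they designate) are as follows:
1. Any $c \in \Sigma$ is a regular expression (set $\{c\}$).
2. If $E_1$ (for some set $S_1$) and $E_2$ (for some set $S_2$) are regular expressions then so are:
   - $E_1 | E_2$ (union $S_1 \cup S_2$). Often denoted $E_1 + E_2$.
   - $E_1 E_2$ (language concatenation $S_1 S_2$).
   - $E_1^*$ (Kleene closure $S_1^*$).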
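The language operations of Definition 54 can be tried out on small finite languages. The sketch below is only an illustration: the function names are invented, the Kleene closure is truncated at a length bound because $L^*$ is normally infinite, and the sample language (the words a and ba) is an assumption, not necessarily the language of Example 55.

```python
def product(l1, l2):
    """L1 L2 = { w1 w2 : w1 in L1, w2 in L2 }."""
    return {u + v for u in l1 for v in l2}


def power(language, i):
    """L^i, with L^0 = {epsilon} (epsilon is the empty Python string)."""
    result = {""}
    for _ in range(i):
        result = product(result, language)
    return result


def closure_up_to(language, max_len):
    """All words of L* whose length is at most max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest words by one more factor from L, keeping short ones.
        frontier = {w for w in product(frontier, language) if len(w) <= max_len} - result
        result |= frontier
    return result


# Assumed sample language over the alphabet {a, b}.
SAMPLE = {"a", "ba"}
```

With this sample, `power(SAMPLE, 2)` is the four-word set aa, aba, baa, baba, and `closure_up_to(SAMPLE, 3)` collects every word of length at most three in which each b is immediately followed by an a, together with the empty word.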

The following table illustrates several regular expressions over the alphabet $\Sigma$ and the sets of strings, which we will shortly call regular languages, that they represent.

[Table: sample regular expressions and the regular languages they represent]

Definition 57. A regular language (set) over an alphabet $\Sigma$ is either the empty set, the set $\{\epsilon\}$, or the set of words designated by some regular expression.

3.4.1 The UNIX extensions to regular expressions

For the user's convenience, the UNIX operating system extends (for programs such as vi, sh, grep, lex and perl) the regular expressions mentioned above for pattern matching. However, these new operators do not extend the sets of languages that are recognizable. Some of the most common new features are listed below for the alphabet $\Sigma$ being the set of ASCII characters.

Character Classes and Wild Card Symbol. A range of characters can be enclosed in square brackets. For example [a-z] would denote the set of lower case letters. A period . is a wild card symbol used to denote any character except a newline.

Line Beginning and Ending. To match a string that begins the line use the hat ^ as the first character of the pattern string. To match a string that ends the line use the dollar sign $ as the last character of the pattern string. For example ^[0-9]*$ will match both empty lines and lines containing only digits.

Additional Operators. Let E be a regular expression. The regular expression E? denotes exactly 0 or 1 matches of E. This is shorthand for the regular expression ($\epsilon$|E). The regular expression E+ denotes EE*, that is, 1 or more occurrences of E.

Note that to match one of the special symbols above like * or . (instead of invoking its special feature) we have to escape it with a preceding backslash \ character. For example, a pattern ending in ig.*\. will match text such as "...igest." and "...iggy.", where the final period is matched literally. The line beginning and ending characters were added since, by default, most UNIX programs do substring matching.
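The extensions described above can be tried with Python's standard `re` module, whose syntax for these particular operators agrees with the UNIX tools (other features differ between tools, so this is only an approximation of any one of them). The pattern names, the helper `matches`, and the sample strings are assumptions made for this sketch.

```python
import re

# ^[0-9]*$ : the whole line consists of digits, possibly none at all.
digits_line = re.compile(r"^[0-9]*$")

# E? means zero or one occurrence of E; E+ means one or more occurrences.
optional_u = re.compile(r"^colou?r$")
one_or_more_ab = re.compile(r"^(ab)+$")

# \. matches a literal period, while a bare . matches any character except newline.
ends_with_period = re.compile(r"ig.*\.")

# Character class: a run of lower case letters.
lower_run = re.compile(r"[a-z]+")


def matches(pattern, text):
    """Substring matching, as most UNIX programs do by default."""
    return pattern.search(text) is not None
```

For example, `matches(digits_line, "")` and `matches(digits_line, "2024")` both succeed while `matches(digits_line, "20x4")` fails, and `matches(ends_with_period, "digest")` fails because the escaped period has nothing to match. Without the ^ and $ anchors, `lower_run.search("123abc456")` finds the substring abc inside a longer line.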

3.5 Regular Sets and Finite-State Automata

We now want to present a characterization of the computational power of finite-state automata. We have already seen that DFAs and NFAs have the same computational power. The set of languages accepted/decided by automata is precisely the set of regular languages (sets). We show how to build an NFA that recognizes the set of words depicted by any regular expression.

Theorem 58 (Kleene's Theorem). For any regular language $L$ there is a DFA $M$ such that $L(M) = L$.

Proof. It suffices to find an NFA $N$ that accepts $L$ since we have already seen how to convert NFAs to DFAs. (See Section 3.3.) An automaton for $L = \emptyset$ and an automaton for $L = \{\epsilon\}$ are given below.

[Figure: automata for the empty language and for the language containing only the empty word]

Now suppose $E$ is a regular expression for $L$. We construct $N$ based on the length of $E$. If $E = \{c\}$ for some $c \in \Sigma$ we can use the following automaton.

[Figure: two-state automaton accepting the single symbol c]

By induction we only need to show how to construct $N$ for $E$ being one of $E_1 + E_2$, $E_1 E_2$ or $E_1^*$, for smaller regular expressions $E_1$ and $E_2$. Let us assume we have correct automata $M_1$ and $M_2$ for $E_1$ and $E_2$.

Case 1: $E = E_1 + E_2$
We construct a (nondeterministic) automaton $N$ representing $E$ simply by taking the union of the two machines $M_1$ and $M_2$.

Case 2: $E = E_1 E_2$
We construct an automaton $N$ representing $E$ by slightly altering the union of the two machines $M_1$ and $M_2$. The initial states of $N$ will be the initial states of $M_1$. The initial states of $M_2$ will be initial states of $N$ only if at least one of $M_1$'s initial states is an accepting state. The final states of $N$ will be the final states of $M_2$. (I.e., the final states of $M_1$ become ordinary states.) For each transition $(q_1, q_2)$ to a final state $q_2$ of $M_1$ we add transitions to the initial states of $M_2$. That is, for $c \in \Sigma$, if $q_j \in \delta_1(q_i, c)$ for some final state $q_j \in F_1$ then $q_k \in \delta_N(q_i, c)$ for each start state $q_k \in S_2$.

Case 3: $E = E_1^*$

The closure of an automaton was seen in Section 3.2.3. An automaton representing $E_1^*$ is the union of the closure $C(M_1)$ and the automaton representing $\{\epsilon\}$ given above.

Kleene's Theorem is actually stronger than what we mentioned above. He also proved that for any finite automaton $M$ there exists a regular expression that represents the language decided by $M$. The construction is simple, but detailed, and is less useful so we omit it.

Example 59. For a regular expression of the form $(c_1 c_2)^* + c$, where $c_1$, $c_2$ and $c$ are input symbols, we construct an automaton that accepts the strings matched. First we build automata $M_1$ and $M_2$ that accept the simple languages $\{c_1\}$ and $\{c_2\}$.

[Figure: automata M1 and M2]

We next construct an automaton $M_{12}$ that accepts the language $\{c_1 c_2\}$. We can easily reduce the number of states of $M_{12}$ to create an equivalent automaton $M_3$.

[Figure: automata M12 and M3]

We next construct an automaton $M_4$ that accepts the language represented by the regular expression $(c_1 c_2)^*$.

[Figure: automaton M4]

The union of the automaton for the single symbol $c$ and $M_4$ is an automaton $N$ that accepts the regular language depicted by the whole expression. In the next section we show how to minimize automata to produce the following final deterministic automaton (from the output of algorithm NFAtoDFA on $N$) that accepts this language.
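The three cases in the proof of Theorem 58, together with the closure construction of Section 3.2.3, can be sketched as functions on NFAs with several start states and no epsilon moves. Everything named here is an assumption for illustration: an NFA is a tuple (states, delta, starts, finals), state names are fresh integers so that combined machines never share states, and the sample expression (ab)* + a is hypothetical.

```python
import itertools

_fresh = itertools.count()            # source of globally unique state names


def symbol(c):
    """Base case: a two-state NFA for the language {c}."""
    a, b = next(_fresh), next(_fresh)
    return ({a, b}, {(a, c): {b}}, {a}, {b})


def empty_word():
    """NFA for {epsilon}: one state that is both initial and accepting."""
    a = next(_fresh)
    return ({a}, {}, {a}, {a})


def union(m1, m2):
    """Case 1: place the two (state-disjoint) machines side by side."""
    (q1, d1, s1, f1), (q2, d2, s2, f2) = m1, m2
    return (q1 | q2, {**d1, **d2}, s1 | s2, f1 | f2)


def concat(m1, m2):
    """Case 2: each move into a final state of m1 may also enter a start state of m2."""
    (q1, d1, s1, f1), (q2, d2, s2, f2) = m1, m2
    delta = {key: set(targets) for key, targets in {**d1, **d2}.items()}
    for key, targets in d1.items():
        if targets & f1:
            delta[key] |= s2
    # Start states of m2 are initial only if m1 accepts the empty word.
    starts = s1 | s2 if s1 & f1 else set(s1)
    return (q1 | q2, delta, starts, set(f2))


def closure(m):
    """C(M): every accept state copies the outgoing moves of the initial states."""
    q, d, s, f = m
    delta = {key: set(targets) for key, targets in d.items()}
    for (p, c), targets in d.items():
        if p in s:
            for a in f:
                delta.setdefault((a, c), set()).update(targets)
    return (set(q), delta, set(s), set(f))


def star(m):
    """Case 3: union of the closure C(M) with the automaton for {epsilon}."""
    return union(closure(m), empty_word())


def accepts(m, word):
    _, delta, starts, finals = m
    current = set(starts)
    for c in word:
        current = {r for q in current for r in delta.get((q, c), ())}
    return bool(current & finals)


# Hypothetical expression (ab)* + a, built bottom-up from its sub-expressions.
N = union(star(concat(symbol("a"), symbol("b"))), symbol("a"))
```

The machine `N` accepts the empty word, a, ab, abab and so on, and rejects words such as b or aba. Each sub-machine must be used only once, because the constructions rely on the two operands having disjoint state sets; building the same sub-expression twice simply calls `symbol` again to obtain fresh states.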

[Figure: final minimized deterministic automaton for the expression of Example 59]

Usually more complicated (longer length) regular expressions require automata with more states. However, this is not true in general.

Example 60. The DFA for a longer regular expression of the form $(\ldots)^*(\ldots + \ldots)$, displayed below, has one fewer state than the previous example.

[Figure: DFA of Example 60]

3.6 Minimizing Deterministic Finite-State Machines

There are standard techniques for minimizing deterministic automata. We present an efficient algorithm based on finding (and eliminating) equivalent states.

Definition 61. For a DFA $M = (Q, \Sigma, \delta, s, F)$ and any $q \in Q$ define the DFA $M_q = (Q, \Sigma, \delta, q, F)$, that is, $s$ is replaced with $q$ in $M_q$. We say two states $p$ and $q$ of $M$ are distinguishable ($k$-distinguishable) if there exists a string $w \in \Sigma^*$ (of length $k$) such that exactly one of $M_p$ or $M_q$ accepts $w$. If there is no such string $w$ then we say $p$ and $q$ are equivalent.

Note that the empty string $\epsilon$ may also be used to distinguish two states of an automaton.

Lemma 62. If a DFA $M$ has two equivalent states $p$ and $q$ then there exists a smaller DFA $M'$ such that $L(M) = L(M')$.

Proof. Assume $M = (Q, \Sigma, \delta, s, F)$ and $p \neq s$. We create an equivalent DFA $M' = (Q \setminus \{p\}, \Sigma, \delta', s, F \setminus \{p\})$ where $\delta'$ is $\delta$ with all instances of $\delta(q_i, c) = p$ replaced with $\delta'(q_i, c) = q$ and all instances of $\delta(p, c) = q_i$ deleted. The resulting automaton $M'$ is deterministic and accepts the language $L(M)$.

Lemma 63. Two states $p$ and $q$ are $k$-distinguishable if and only if, for some $c \in \Sigma$, the states $\delta(p, c)$ and $\delta(q, c)$ are $(k-1)$-distinguishable. Equivalently, $p$ and $q$ are not $k$-distinguishable if and only if for each $c \in \Sigma$ the states $\delta(p, c)$ and $\delta(q, c)$ are not $(k-1)$-distinguishable.

Proof. Consider all strings $w = c w'$ of length $k$. If $\delta(p, c)$ and $\delta(q, c)$ are $(k-1)$-distinguishable by some string $w'$ then $p$ and $q$ must be $k$-distinguishable by $w$. Likewise, $p$ and $q$ being $k$-distinguishable by $w$ implies there exist two states $\delta(p, c)$ and $\delta(q, c)$ that are $(k-1)$-distinguishable by the shorter string $w'$.

Algorithm MinimizeDFA: Our algorithm to find equivalent states of a DFA $M = (Q, \Sigma, \delta, s, F)$ begins by defining a series of equivalence relations $\equiv_0, \equiv_1, \ldots$ on the states of $Q$.

   $p \equiv_0 q$ if both $p$ and $q$ are in $F$ or both are not in $F$.
   $p \equiv_{k+1} q$ if $p \equiv_k q$ and, for each $c \in \Sigma$, $\delta(p, c) \equiv_k \delta(q, c)$.

We stop generating these equivalence classes when $\equiv_n$ and $\equiv_{n+1}$ are identical. Lemma 63 guarantees no more non-equivalent states. Since there can be at most $|Q|$ non-equivalent states this bounds the number of equivalence relations $\equiv_k$ generated. We can eliminate one state from $M$ (using Lemma 62) whenever there exist two states $p$ and $q$ such that $p \neq q$ and $p \equiv_n q$. In practice, we often eliminate more than one (i.e., all but one) state per equivalence class.

Theorem 64. There exists a polynomial-time algorithm to minimize any DFA $M$.

Proof. To compute $\equiv_{k+1}$ from $\equiv_k$ we have to determine the equivalence (or non-equivalence) of at most $\binom{|Q|}{2} = O(|Q|^2)$ possible pairs of states $p$ and $q$. Each equivalence check requires $|\Sigma|$ transition look-ups. Since we have to compute this for at most $n \leq |Q|$ different equivalence relations $\equiv_k$, the preceding algorithm MinimizeDFA runs in time $O(|\Sigma| \, |Q|^3)$.

Currently there are no direct, efficient minimization algorithms for the nondeterministic counterparts of DFA. Note that the minimized equivalent DFA for an NFA may be larger than the original (nonminimized) NFA.

We end our introduction to automata theory by showing how to use this minimization algorithm. The first example shows how to verify that an automaton is minimal and the second shows how to find equivalent states for elimination.

Example 65. We use the algorithm MinimizeDFA to show the following automaton $M$ has the smallest number of states for the regular language it represents.

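[Figure: state diagram of the six-state automaton M of Example 65, whose states include e and f]

The initial equivalence relation $\equiv_0$ has two classes, one of four states and one of two states, based solely on the final states of $M$. We now calculate $\equiv_1$ using the recursive definition. Of the seven pairs of states that are equivalent under $\equiv_0$, six are separated because some input symbol leads them to states in different classes of $\equiv_0$. Only one pair, consisting of $f$ and one other state, has $\equiv_0$-equivalent successors on every input symbol. So $\equiv_1$ has five classes, one of which contains two states (one of them $f$). We now calculate $\equiv_2$ to check the two possible remaining equivalent states: on one input symbol $f$ moves to $e$ while the other state moves to a state that is not $\equiv_1$-equivalent to $e$, so the two states are not equivalent under $\equiv_2$. This shows that all states of $M$ are non-equivalent (i.e., our automaton is minimum).

Example 66. We use the algorithm MinimizeDFA to show that the following automaton $M$ can be reduced.

[Figure: state diagram of the five-state automaton M of Example 66, whose states include e]

The initial equivalence relation $\equiv_0$ has two classes, one of three states and one of two states. We now calculate $\equiv_1$. Of the four pairs of states that are equivalent under $\equiv_0$, two are separated because some input symbol leads them to states in different classes of $\equiv_0$, and two (one of them containing $e$) have $\equiv_0$-equivalent successors on every input symbol and therefore remain equivalent.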

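So $\equiv_1$ has three classes: a single state and two pairs of states (one pair containing $e$). We calculate $\equiv_2$ in the same fashion and see that it is the same as $\equiv_1$. This shows that we can eliminate one state from each pair (say, $e$ and one state of the other pair) to yield the following minimum DFA that recognizes the same language as $M$ does.

[Figure: minimum three-state DFA equivalent to the automaton of Example 66]

To test one's understanding, we invite the reader to reproduce the final automaton of Example 59 by using the algorithms NFAtoDFA and MinimizeDFA.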
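The relations $\equiv_0, \equiv_1, \ldots$ of algorithm MinimizeDFA can be computed by repeatedly relabelling each state with the class of the state itself and the classes of its successors. The sketch below is one possible rendering, not the notes' own code: the function names and the four-state example DFA (words over 0 and 1 that end in 1, with deliberately redundant states) are assumptions, and every state is assumed to be reachable from the start state.

```python
def equivalence_classes(states, alphabet, delta, accepting):
    """Refine the partition {F, Q - F} until two successive relations coincide."""
    states = sorted(states)
    block = {q: int(q in accepting) for q in states}          # the relation for k = 0
    while True:
        # p and q stay together only if they agree now and on every successor.
        signature = {
            q: (block[q],) + tuple(block[delta[(q, c)]] for c in alphabet)
            for q in states
        }
        ids = {}
        refined = {q: ids.setdefault(signature[q], len(ids)) for q in states}
        if len(ids) == len(set(block.values())):               # nothing was split
            break
        block = refined
    classes = {}
    for q in states:
        classes.setdefault(block[q], set()).add(q)
    return list(classes.values())


def minimize(states, alphabet, delta, start, accepting):
    """Merge each equivalence class into one state (repeated use of Lemma 62)."""
    merged = {q: frozenset(cls)
              for cls in equivalence_classes(states, alphabet, delta, accepting)
              for q in cls}
    new_delta = {(merged[q], c): merged[delta[(q, c)]] for q in states for c in alphabet}
    return set(merged.values()), new_delta, merged[start], {merged[q] for q in accepting}


def dfa_accepts(delta, start, accepting, word):
    state = start
    for c in word:
        state = delta[(state, c)]
    return state in accepting


# Hypothetical DFA accepting the words over {0, 1} that end in 1.
STATES = {"a", "b", "c", "d"}
DELTA = {
    ("a", "0"): "b", ("a", "1"): "c",
    ("b", "0"): "b", ("b", "1"): "c",
    ("c", "0"): "b", ("c", "1"): "d",
    ("d", "0"): "a", ("d", "1"): "d",
}
START, ACCEPTING = "a", {"c", "d"}

MIN_STATES, MIN_DELTA, MIN_START, MIN_ACCEPTING = minimize(STATES, "01", DELTA, START, ACCEPTING)
```

In this assumed example the first refinement already splits nothing, so the two classes of $\equiv_0$ (the states a and b, and the states c and d) are the final classes and the minimized machine has two states. Because refinement can only split classes, comparing the number of classes is enough to detect that two successive relations are identical.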
