Victor Adamchik, Danny Sleator
Great Theoretical Ideas In Computer Science
CS 15-251 Spring 2 Lecture 2 Mar 3, 2
Carnegie Mellon University

Deterministic Finite Automata

Finite Automata

A machine so simple that you can understand it in less than one minute. (Wishful thinking.)

Anatomy of a Deterministic Finite Automaton

[Figure: state diagram annotated with its states, transitions, start state (q0), and accept states (F).]

The machine accepts a string if the process ends in a double circle.

The Language L(M) of Machine M

The singular of automata is automaton.
The alphabet of a finite automaton is the set where the symbols come from, for example {0,1}.
The language of a finite automaton is the set of strings that it accepts.

[Figure: machine with the single state q0.]
L(M) = all strings of 0s and 1s
The Language L(M) of Machine M

[Figure: two-state machine with states q0 and q1.]
L(M) = { w | w has an even number of 1s }

Notation

An alphabet Σ is a finite set (e.g., Σ = {0,1}).
A string over Σ is a finite-length sequence of elements of Σ.
For x a string, |x| is the length of x.
The unique string of length 0 will be denoted by ε and will be called the empty or null string.
A language over Σ is a set of strings over Σ.

A finite automaton is M = (Q, Σ, δ, q0, F)

Q is the finite set of states
Σ is the alphabet
δ : Q × Σ → Q is the transition function
q0 ∈ Q is the start state
F ⊆ Q is the set of accept states
L(M) = the language of machine M = the set of all strings machine M accepts

EXAMPLE

M = (Q, Σ, δ, q0, F) where
Q = {q0, q1, q2, q3}
Σ = {0,1}
q0 ∈ Q is the start state
F = {q1, q2} ⊆ Q are the accept states
δ : Q × Σ → Q is the transition function:

  δ  |  0   1
  ---+---------
  q0 |  q0  q1
  q1 |  q2  q2
  q2 |  q3  q2
  q3 |  q0  q2

[Figure: state diagram of M.]

A finite-state automaton is deterministic if for each pair in Q × Σ of state and input value there is a unique next state given by the transition function. There is another type of machine in which there may be several possible next states. Such machines are called nondeterministic.

Build an automaton that accepts all and only those strings that contain 001.

[Figure: state diagram of the automaton.]
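One way to approach the last exercise is to let each state remember how much of the pattern has been matched so far. The sketch below takes the pattern to be 001; the state names and the dictionary encoding of δ are illustrative choices, not something fixed by the slides.

```python
# Sketch of a DFA for binary strings containing "001".
# Each (hypothetical) state name is the part of the pattern matched so far.
START = ""
ACCEPT = {"001"}
DELTA = {
    ("", "0"): "0",       ("", "1"): "",
    ("0", "0"): "00",     ("0", "1"): "",
    ("00", "0"): "00",    ("00", "1"): "001",
    ("001", "0"): "001",  ("001", "1"): "001",  # once found, stay accepting
}

def contains_001(w):
    """Run the DFA on w and report whether it ends in an accept state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in ACCEPT

print(contains_001("110010"))  # True
print(contains_001("0101"))    # False
```

Four states suffice because the machine never needs to remember more than the last two symbols that could still start a match.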
Build an automaton that accepts all binary numbers that are divisible by 3, i.e., L = {0, 11, 110, 1001, 1100, 1111, 10010, ...}

A language over Σ is a set of strings over Σ.
A language is regular if it is recognized by a deterministic finite automaton.

L = { w | w contains 001 } is regular
L = { w | w has an even number of 1s } is regular

Determine the language recognized by
[Figure: state diagram.]
L(M) = { 1^n | n = 0, 1, 2, ... }

Determine the language recognized by
[Figure: state diagram.]
L(M) = { 1, 01 }

Determine the language recognized by
[Figure: state diagram.]
L(M) = { 0^n, 0^n 10x | n = 0, 1, 2, ... and x is any string }

DFA Membership problem

Determine whether some word belongs to the language.

Theorem: The DFA Membership Problem is solvable in linear time.

Let M = (Q, Σ, δ, q0, F) and w = w_1 ... w_m.

Algorithm for DFA M:
  p := q0;
  for i := 1 to m do p := δ(p, w_i);
  if p ∈ F then return Yes else return No.
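The membership algorithm translates almost line for line into code. The sketch below assumes δ is stored as a dictionary keyed by (state, symbol); the example machine for divisibility by 3 is one possible answer to the exercise above, with the state standing for the value read so far modulo 3.

```python
def accepts(delta, start, accept, w):
    """DFA membership test: one table lookup per input symbol, so linear in len(w)."""
    p = start
    for symbol in w:
        p = delta[(p, symbol)]
    return p in accept

# Illustrative machine: state r is the number read so far, modulo 3.
# Appending bit b turns the value v into 2v + b, so r becomes (2r + b) mod 3.
DIV3_DELTA = {(r, b): (2 * r + int(b)) % 3 for r in range(3) for b in "01"}
DIV3_START = 0
DIV3_ACCEPT = {0}

print(accepts(DIV3_DELTA, DIV3_START, DIV3_ACCEPT, "1001"))  # 9 is divisible by 3: True
print(accepts(DIV3_DELTA, DIV3_START, DIV3_ACCEPT, "101"))   # 5 is not: False
```

Note that this particular machine also accepts the empty string, since the start state is the accept state; whether ε should count as a binary number is a modeling choice.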
Equivalence of two DFAs

Definition: Two DFAs M1 and M2 over the same alphabet are equivalent if they accept the same language: L(M1) = L(M2).

Given a few equivalent machines, we are naturally interested in the smallest one, the one with the least number of states.

Union Theorem

Given two languages, L1 and L2, define the union of L1 and L2 as
L1 ∪ L2 = { w | w ∈ L1 or w ∈ L2 }

Theorem: The union of two regular languages is also a regular language.

Proof (Sketch):
Let M1 = (Q1, Σ, δ1, q0^1, F1) be a finite automaton for L1
and M2 = (Q2, Σ, δ2, q0^2, F2) be a finite automaton for L2.
We want to construct a finite automaton M = (Q, Σ, δ, q0, F) that recognizes L = L1 ∪ L2.

Idea: Run both M1 and M2 at the same time!

Q = pairs of states, one from M1 and one from M2
  = { (q1, q2) | q1 ∈ Q1 and q2 ∈ Q2 }
  = Q1 × Q2

Automaton for Union

[Figure: two machines, one with states q0, q1 and one with states p0, p1, and their product automaton on pairs of states.]
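The "run both machines at the same time" idea can be sketched as a product construction. The proof sketch above only fixes the state set Q1 × Q2; the start state, transition function, and accept set used below are the standard way to complete it and are stated here as an assumption. The two example machines are hypothetical.

```python
def accepts(machine, w):
    """Run a DFA given as (delta, start, accept, states) on the string w."""
    delta, start, accept, _states = machine
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accept

def union_dfa(m1, m2, alphabet):
    """Product construction: states are pairs, and both components step together."""
    delta1, start1, accept1, states1 = m1
    delta2, start2, accept2, states2 = m2
    states = {(p, q) for p in states1 for q in states2}
    delta = {((p, q), a): (delta1[(p, a)], delta2[(q, a)])
             for (p, q) in states for a in alphabet}
    # Accept if either component accepts; using "and" here would give intersection.
    accept = {(p, q) for (p, q) in states if p in accept1 or q in accept2}
    return delta, (start1, start2), accept, states

# Hypothetical example machines over {0, 1}.
EVEN_ONES = ({("even", "0"): "even", ("even", "1"): "odd",
              ("odd", "0"): "odd", ("odd", "1"): "even"},
             "even", {"even"}, {"even", "odd"})
ENDS_IN_0 = ({(s, c): ("yes" if c == "0" else "no")
              for s in ("no", "yes") for c in "01"},
             "no", {"yes"}, {"no", "yes"})

UNION = union_dfa(EVEN_ONES, ENDS_IN_0, "01")
print(accepts(UNION, "10"))  # ends in 0: True
print(accepts(UNION, "1"))   # odd number of 1s and ends in 1: False
```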
The Regular Operations

Union: A ∪ B = { w | w ∈ A or w ∈ B }
Intersection: A ∩ B = { w | w ∈ A and w ∈ B }
Negation: ¬A = { w | w ∉ A }
Reverse: A^R = { w_1 ... w_k | w_k ... w_1 ∈ A }
Concatenation: A · B = { vw | v ∈ A and w ∈ B }
Star: A* = { w_1 ... w_k | k ≥ 0 and each w_i ∈ A }

Reverse

Reverse: A^R = { w_1 ... w_k | w_k ... w_1 ∈ A }

How to construct a DFA for the reversal of a language?
The direction in which we read a string should be irrelevant.
If we flip the transitions around, we might not get a DFA.

The Kleene closure: A*

Star: A* = { w_1 ... w_k | k ≥ 0 and each w_i ∈ A }

From the definition of concatenation, we define A^n, n = 0, 1, 2, ..., recursively:
A^0 = {ε}
A^(n+1) = A^n A

A* is the set consisting of concatenations of arbitrarily many strings from A:
A* = ∪_{k ≥ 0} A^k

What is A* of A = {0,1}? All binary strings.
What is A* of A = {11}? All binary strings consisting of an even number of 1s.

Regular Languages Are Closed Under The Regular Operations

We have seen part of the proof for Union. The proof for intersection is very similar. The proof for negation is easy.

Theorem: Any finite language is regular

Claim 1: Let w be a string over an alphabet. Then {w} is a regular language.
Proof: By induction on the number of characters. If {a} and {b} are regular then {ab} is regular.

Claim 2: A language consisting of n strings is regular.
Proof: By induction on the number of strings. If L and {a} are regular then L ∪ {a} is regular.
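The theorem on finite languages can also be seen constructively. The sketch below is a direct prefix-tree construction rather than the induction used in the two claims: it builds a DFA whose states are the prefixes of the given words plus one dead state. The function name, the encoding, and the example language are illustrative assumptions.

```python
def finite_language_dfa(words, alphabet):
    """Build (delta, start, accept) for a finite set of words.

    States are all prefixes of the words; None is a dead (trap) state that
    absorbs every string that is not a prefix of any word.
    """
    words = set(words)
    prefixes = {""} | {w[:i] for w in words for i in range(len(w) + 1)}
    delta = {}
    for p in prefixes:
        for a in alphabet:
            delta[(p, a)] = p + a if p + a in prefixes else None
    for a in alphabet:
        delta[(None, a)] = None
    return delta, "", words

def accepts(delta, start, accept, w):
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accept

LANGUAGE = {"ab", "b", "aab"}  # hypothetical finite language over {a, b}
DELTA, START, ACCEPT = finite_language_dfa(LANGUAGE, "ab")
print(accepts(DELTA, START, ACCEPT, "aab"))  # True
print(accepts(DELTA, START, ACCEPT, "aa"))   # False: a prefix, but not a word
```

The number of states is at most one more than the total number of distinct prefixes, which is finite because the language is.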
Pattern Matching

Input: Text T of length t, string S of length n
Problem: Does string S appear inside text T?

Naïve method: try to match S starting at each position of the text a_1, a_2, a_3, a_4, a_5, ..., a_t
Cost: Roughly nt comparisons

Automata Solution

Build a machine M that accepts any string with S as a consecutive substring.
Feed the text to M.
Cost: t comparisons + time to build M
As luck would have it, the Knuth, Morris, Pratt algorithm builds M quickly.

Real-life Uses of DFAs

Grep
Coke Machines
Thermostats (fridge)
Elevators
Train Track Switches
Lexical Analyzers for Parsers

Are all languages regular?

Consider the language L = { a^n b^n | n > 0 }, i.e., a bunch of a's followed by an equal number of b's.

No finite automaton accepts this language. Can you prove this?

a^n b^n is not regular. No machine has enough states to keep track of the number of a's it might encounter.
That is a fairly weak argument.

Consider the following example:
L = strings where the number of occurrences of the pattern ab is equal to the number of occurrences of the pattern ba

Can't be regular. No machine has enough states to keep track of the number of occurrences of ab.

[Figure: state diagram of a machine M over {a, b}.]

M accepts only the strings with an equal number of ab's and ba's!

Let me show you a professional-strength proof that a^n b^n is not regular.

How to prove a language is not regular

Assume it is regular, hence is accepted by a DFA M with n states.
Show that there are two strings s1 and s2 which both reach the same state in M (usually by the pigeonhole principle).
Then show there is some string t such that the string s1 t is in the language, but s2 t is not.
However, M accepts either both or neither.
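The recipe above can be made concrete for L = { a^n b^n | n > 0 }: given any DFA over {a, b}, the sketch below looks for two different runs of a's that land in the same state and turns them into a string the DFA gets wrong. The function names and the example machine are hypothetical; the dictionary encoding of the transition function is an assumption.

```python
def run(delta, state, w):
    """Follow the transition function from `state` along the string w."""
    for symbol in w:
        state = delta[(state, symbol)]
    return state

def in_language(w):
    """Membership in { a^n b^n | n > 0 }, checked directly."""
    n = len(w) // 2
    return n > 0 and w == "a" * n + "b" * n

def find_counterexample(delta, start, accept):
    """Return a string on which the given DFA disagrees with { a^n b^n | n > 0 }."""
    seen = {}  # state -> smallest i >= 1 such that a^i reaches that state
    i = 1
    while True:  # terminates after at most |Q| + 1 rounds, by pigeonhole
        state = run(delta, start, "a" * i)
        if state in seen:
            break
        seen[state] = i
        i += 1
    n, m = seen[state], i  # n < m, and a^n and a^m reach the same state
    good = "a" * n + "b" * n  # in the language
    bad = "a" * m + "b" * n   # not in the language
    # Both strings end in the same state, so the DFA answers the same on both.
    return bad if run(delta, start, good) in accept else good

# Hypothetical candidate: accepts one or more a's followed by one or more b's.
DELTA = {(0, "a"): 1, (0, "b"): 3, (1, "a"): 1, (1, "b"): 2,
         (2, "a"): 3, (2, "b"): 2, (3, "a"): 3, (3, "b"): 3}
print(find_counterexample(DELTA, 0, {2}))  # aab
```

Whatever DFA is supplied, the returned string is either a member of the language that the machine rejects or a non-member that it accepts.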
Pigeonhole principle: If we put n objects into m pigeonholes and if n > m, then at least one pigeonhole must have more than one item in it.

Theorem: L = { a^n b^n | n > 0 } is not regular

Proof (by contradiction):
Assume that L is regular, recognized by M = (Q, {a,b}, δ, q0, F).
Consider δ(q0, a^i) for i = 1, 2, 3, ...
There are infinitely many i's but only a finite number of states, so by the pigeonhole principle
δ(q0, a^n) = q and δ(q0, a^m) = q for some n ≠ m.
Since M accepts a^n b^n, δ(q, b^n) = q_f for some q_f ∈ F.
δ(q0, a^m b^n) = δ(δ(q0, a^m), b^n) = δ(q, b^n) = q_f
It follows that M accepts a^m b^n with n ≠ m, a contradiction.

Nondeterministic finite automaton (NFA)

An NFA is defined using the same notation M = (Q, Σ, δ, q0, F) as a DFA, except that the transition function δ assigns a set of states to each pair in Q × Σ of state and input.

Note, every DFA is automatically also an NFA.

Nondeterministic finite automaton

NFA for { 0^k | k is a multiple of 2 or 3 }

[Figure: state diagram with two ε-transitions out of the start state.]

Allows transitions from q_k on the same symbol to many states.
What does it mean for an NFA to recognize a string x = x_1 x_2 ... x_k?

[Figure: NFA with states s0, s1, s2, s3, s4.]

Since each input symbol x_j (for j > 1) takes the previous state to a set of states, we shall use the union of these sets.

Here we are going to define this formally.

For a state q and string w, δ*(q, w) is the set of states that the NFA can reach when it reads the string w starting at the state q.

Thus for an NFA M = (Q, Σ, δ, q0, F), the function δ* : Q × Σ* → 2^Q is defined by

δ*(q, ε) = {q}
δ*(q, y x_k) = ∪_{p ∈ δ*(q, y)} δ(p, x_k)

Find the language recognized by this NFA

[Figure: NFA with states s0, s1, s2, s3, s4.]
L = { 0^n, 0^n 01, 0^n 11 | n = 0, 1, 2, ... }

Find the language recognized by this NFA

[Figure: state diagram of a second NFA; its language, given on the slide as a regular expression, is not recoverable.]

Nondeterministic finite automaton

Theorem. If the language L is recognized by an NFA M0, then L is also recognized by a DFA M1.

In other words, if we ask whether there is an NFA that is not equivalent to any DFA, the answer is No.

NFA vs. DFA

Advantages
Easier to construct and manipulate.
Sometimes exponentially smaller.
Sometimes algorithms are much easier.

Drawbacks
Acceptance testing is slower.
Sometimes algorithms are more complicated.
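The definition of δ* and the NFA-to-DFA theorem can both be sketched in a few lines. The code below assumes an NFA without ε-transitions, stored as a dictionary from (state, symbol) to a set of states, and uses the standard subset construction, which is one common way to prove the theorem and is assumed here rather than taken from the slides. The example NFA is hypothetical.

```python
def delta_star(delta, states, w):
    """Set of NFA states reachable from any state in `states` after reading w."""
    current = set(states)
    for symbol in w:
        # Union of delta(p, symbol) over all currently reachable states p.
        current = {r for p in current for r in delta.get((p, symbol), set())}
    return current

def nfa_accepts(delta, start, accept, w):
    """The NFA accepts w if at least one reachable state is an accept state."""
    return bool(delta_star(delta, {start}, w) & accept)

def to_dfa(delta, start, accept, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = frozenset({start})
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(delta_star(delta, subset, a))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accept = {s for s in seen if s & accept}
    return dfa_delta, start_set, dfa_accept

# Hypothetical NFA for binary strings ending in 01: state 0 guesses where the suffix starts.
NFA_DELTA = {(0, "0"): {0, 1}, (0, "1"): {0}, (1, "1"): {2}}
print(nfa_accepts(NFA_DELTA, 0, {2}, "1101"))  # True
print(nfa_accepts(NFA_DELTA, 0, {2}, "0110"))  # False
```

Simulating the NFA directly tracks a set of states per symbol, which is why acceptance testing is slower than for a DFA; converting once with `to_dfa` trades that cost for a possibly much larger state set.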
Study Bee

DFA
NFA