CMPSCI 250: Introduction to Computation
Lecture #31: What DFA's Can and Can't Do
David Mix Barrington
9 April 2014
What DFA's Can and Can't Do
  Deterministic Finite Automata
  Formal Definition of DFA's
  Examples of DFA's
  DFA's in Java
  Characterizing Strings With Given Behavior
  Distinguishable Strings
  Languages With No DFA's
Deterministic Finite Automata
We now turn to finite-state machines, a model of computation that captures the idea of reading a file of text with a fixed limit on the memory we can use to remember what we have seen. In particular, the memory used must be constant, independent of the length of the file. We ensure this by requiring our machine to have a finite state set, so that at any time during the computation all that it knows is which state it is in.
Deterministic Finite Automata
The initial state is fixed. When the machine sees a new letter, it changes to a new state based on a fixed transition function. When it finishes the string, it gives a yes or no answer based on whether it is in a final state. Because the new state depends only on the old state and the letter seen, the computation is deterministic and the machine is called a deterministic finite automaton or DFA.
Where We Are Going
A DFA decides a language: it reads a string over its alphabet and then answers yes or no. The language of the DFA is the set of strings for which it says yes. We call a language X decidable if there exists a DFA whose language is X. Later we'll prove that some languages are not decidable.
Where We Are Going
The Myhill-Nerode Theorem will give us a way to take an arbitrary language and determine whether it is decidable. We'll define a particular equivalence relation on strings, based only on the language. If this relation has a finite set of equivalence classes, there is a DFA for the language, and there is a minimal DFA with as many states as there are classes. We'll see how to compute the minimal DFA from any DFA for the language.
Where We Are Going
As we've mentioned, there is a DFA for a language if and only if the language is regular (that is, if and only if it is the language denoted by some regular expression). We'll prove this important result, called Kleene's Theorem, over several lectures. Our proofs will show us how to convert a DFA to a regular expression and vice versa.
Formal Definition of DFA's
Formally a DFA is defined by its state set S, its initial state i ∈ S, its final state set F ⊆ S, its input alphabet Σ, and its transition function δ from (S × Σ) to S. We usually represent DFA's by diagrams (labeled directed multigraphs) with a node for each state, a special mark for the initial state, a double circle on each final state, and an arrow labeled a from node p to node q whenever δ(p, a) = q.
A DFA Example
Here is a DFA with state set {1, 2, 3, 4}, initial state 1, final state set {3}, alphabet {a, b}, and transition function indicated by the arrows. For any string, we can follow the arrows for its letters in order. The strings […], […], and […] are in this DFA's language.
[Figure: diagram of the DFA with states 1, 2, 3, 4]
Clicker Question #1
Which of these strings is not in the language of this DFA?
(a) […]
(b) […]
(c) […]
(d) […]
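Answer #1
Which of these strings is not in the language of this DFA?
(a) […] (1-3-2-4-2-4-2-2-4-3-1-4)
(b) […] (1-4-3-2-4-2-4-3-1-4-3)
(c) […] (1-3-1-3-1-4-2-2-2-4-3-1-3)
(d) […] (1-4-2-2-2-2-4-2-4-3-2-4-3)
Only the path for (a) ends in state 4 rather than the final state 3, so (a) is the string that is not in the language.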
Behavior of a DFA
The behavior function of a particular DFA is a function called δ* from (S × Σ*) to S, such that δ*(p, w) is the state of the DFA after it starts in state p and reads the string w. Formally, we say that δ*(p, λ) = p and that δ*(p, wa) = δ(δ*(p, w), a). The language of a DFA is defined to be the set of strings w such that δ*(i, w) is a final state. For a DFA M, we call this language L(M).
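The recursive definition of δ* translates almost line for line into code. The following is a minimal sketch, assuming states and letters have been relabeled as small integers and δ is stored as a table; the class name, method names, and the toy table in main are illustrative and not from the lecture.

```java
// Sketch of the behavior function delta*, following the recursive definition:
//   delta*(p, lambda) = p
//   delta*(p, wa)     = delta(delta*(p, w), a)
// Assumption: delta[s][x] holds the next state from state s on letter x.
class BehaviorSketch {
    // Computes delta*(p, w[0..len-1]) by recursion on the length of the string.
    static int deltaStar(int[][] delta, int p, int[] w, int len) {
        if (len == 0) {
            return p;                               // empty string: stay in p
        }
        int afterPrefix = deltaStar(delta, p, w, len - 1);
        return delta[afterPrefix][w[len - 1]];      // one more step on the last letter
    }

    // Convenience overload for the whole string.
    static int deltaStar(int[][] delta, int p, int[] w) {
        return deltaStar(delta, p, w, w.length);
    }

    public static void main(String[] args) {
        // Toy table chosen for illustration: letter 0 keeps the state, letter 1 flips it.
        int[][] toy = { {0, 1}, {1, 0} };
        System.out.println(deltaStar(toy, 0, new int[] {1, 0, 1, 1}));
        System.out.println(deltaStar(toy, 0, new int[] {}));
    }
}
```

The recursion peels letters off the right end of w, exactly as the definition does; an iterative left-to-right loop computes the same state.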
More Examples of DFA's
One of the simplest possible DFA's decides the language of binary strings with an odd number of ones. It has two states E and O, representing whether the machine has seen an even or odd number of ones so far. The initial state is E, and the final state set is {O}. The transition function has δ(E, 0) = E, δ(E, 1) = O, δ(O, 0) = O, and δ(O, 1) = E.
[Figure: diagram of the two-state DFA with states E and O]
More Examples of DFA's
We can build a four-state DFA for the language EE, of strings with an even number of a's and an even number of b's. Its states are EE, EO, OE, and OO. For example, δ*(EE, w) = EO if w has an even number of a's and an odd number of b's.
[Figure: diagram of the four-state DFA with states EE, OE, EO, OO]
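One way to read the four-state machine is as two parity bits tracked side by side. The sketch below assumes the start state is EE and the final state set is {EE}, which follows from the definition of the language but is not spelled out in the text; the class and method names are illustrative.

```java
// Sketch of the four-state DFA for EE (even number of a's, even number of b's).
// Assumptions: start state EE, final state set {EE}.
class EvenEvenSketch {
    // First letter of the name is the parity of a's, second is the parity of b's.
    enum State { EE, EO, OE, OO }

    static State step(State s, char letter) {
        boolean oddA = (s == State.OE || s == State.OO);
        boolean oddB = (s == State.EO || s == State.OO);
        if (letter == 'a') {
            oddA = !oddA;                 // an a flips the a-parity only
        } else if (letter == 'b') {
            oddB = !oddB;                 // a b flips the b-parity only
        } else {
            throw new IllegalArgumentException("letter must be a or b");
        }
        if (oddA) {
            return oddB ? State.OO : State.OE;
        }
        return oddB ? State.EO : State.EE;
    }

    // delta*(EE, w): the state reached from EE after reading w.
    static State run(String w) {
        State s = State.EE;
        for (char c : w.toCharArray()) {
            s = step(s, c);
        }
        return s;
    }

    static boolean decide(String w) {
        return run(w) == State.EE;
    }

    public static void main(String[] args) {
        System.out.println(run("abb"));      // one a, two b's
        System.out.println(decide("abab"));  // two a's, two b's
    }
}
```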
More Examples of DFA's
Another four-state DFA can decide whether the next-to-last letter of a binary string w exists and is 1. The state set is {00, 01, 10, 11} and the state after reading w represents the last two letters seen. The initial state is 00 and the final state set is {10, 11}.
[Figure: diagram of the four-state DFA with states 00, 01, 10, 11]
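Because each state records the last two letters seen, the transition on a new letter just shifts the window by one. The sketch below encodes the states 00, 01, 10, 11 as the integers 0 through 3; that encoding, the shift rule read off from the description, and all names are assumptions made for illustration.

```java
// Sketch of the four-state "next-to-last letter is 1" DFA.
// Assumption: state xy is encoded as the two-bit integer with bits x and y.
class NextToLastSketch {
    static final int START = 0b00;

    // From state xy on letter z, move to state yz.
    static int step(int state, int bit) {
        return ((state << 1) | bit) & 0b11;
    }

    // Final states are 10 and 11: the older of the two remembered letters is 1.
    static boolean isFinal(int state) {
        return (state & 0b10) != 0;
    }

    static boolean decide(String w) {
        int state = START;
        for (char c : w.toCharArray()) {
            state = step(state, c - '0');
        }
        return isFinal(state);
    }

    // Direct statement of the language, used only to cross-check the DFA.
    static boolean direct(String w) {
        int n = w.length();
        return n >= 2 && w.charAt(n - 2) == '1';
    }

    public static void main(String[] args) {
        for (String w : new String[] {"", "1", "10", "011", "0001"}) {
            System.out.println("[" + w + "] -> " + decide(w));
        }
    }
}
```

Starting in 00 handles short inputs correctly: after reading the single letter 1 the machine is in state 01, which is not final, matching the requirement that the next-to-last letter must exist.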
DFA's in Pseudo-Java
We consider the input to be given like a file, with a method to give the next letter and one to tell when the input is done. We relabel the state set and the alphabet to be {0,..., states - 1} and {0,..., letters - 1} respectively.
DFA's in Pseudo-Java
public class DFA {
    natural states;
    natural letters;
    natural start;
    boolean[] isFinal = new boolean[states];
    natural[][] delta = new natural[states][letters];
    natural getNext() {code omitted}
    boolean inputDone() {code omitted}
DFA's in Pseudo-Java
    boolean decide() {
        // returns whether input string is
        // in the language of the DFA
        natural current = start;
        while (!inputDone())
            current = delta[current][getNext()];
        return isFinal[current];
    }
}
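The pseudo-Java above can be turned into compilable Java with a few substitutions. This is one possible rendering, not the course's own code: it assumes int in place of natural, and an Iterator<Integer> standing in for the getNext/inputDone pair (hasNext plays the role of !inputDone). The class name TableDfa and the factory oddOnes, which builds the even-odd DFA with E = 0 and O = 1, are illustrative.

```java
import java.util.Arrays;
import java.util.Iterator;

// Runnable sketch of the table-driven DFA from the pseudo-Java slides.
class TableDfa {
    final int start;
    final boolean[] isFinal;   // isFinal[s] is true when state s is final
    final int[][] delta;       // delta[s][x] is the next state from s on letter x

    TableDfa(int start, boolean[] isFinal, int[][] delta) {
        this.start = start;
        this.isFinal = isFinal;
        this.delta = delta;
    }

    // Returns whether the input string is in the language of the DFA.
    boolean decide(Iterator<Integer> input) {
        int current = start;
        while (input.hasNext()) {
            current = delta[current][input.next()];
        }
        return isFinal[current];
    }

    // The odd-number-of-ones DFA: state 0 is E, state 1 is O.
    static TableDfa oddOnes() {
        int[][] delta = {
            {0, 1},   // from E: 0 stays in E, 1 moves to O
            {1, 0},   // from O: 0 stays in O, 1 moves to E
        };
        return new TableDfa(0, new boolean[] {false, true}, delta);
    }

    public static void main(String[] args) {
        TableDfa m = oddOnes();
        System.out.println(m.decide(Arrays.asList(1, 0, 1, 1).iterator()));  // three ones
        System.out.println(m.decide(Arrays.asList(1, 1).iterator()));        // two ones
    }
}
```

Note that the only memory the loop uses, apart from the fixed tables, is the single variable current, which is the constant-memory property the lecture is after.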
The Strings With Behavior
How do we prove that a particular DFA has a particular language? With the even-odd DFA, we can say that δ*(E, w) = E if w has an even number of ones, and δ*(E, w) = O if it has an odd number of ones.
The Strings With Behavior
• δ*(E, w) = E if w has an even number of ones, and δ*(E, w) = O if it has an odd number of ones.
Letting P(w) be the entire statement in the bullet above, we can prove ∀w: P(w) by induction on all binary strings. P(λ) says that δ*(E, λ) = E, because λ has no ones and 0 is even, and δ*(E, λ) = E is true by definition of δ*.
The Strings With Behavior
Now assume that P(w) is true, so that δ*(E, w) is E if w has an even number of ones and O otherwise. Then w0 has the same number of ones as w, so δ*(E, w0) should be the same state as δ*(E, w). And w1 has one more one than w, so δ*(E, w1) should be the other state from δ*(E, w). In each of the four cases, the new state is the state given by the δ function of the DFA.
The No-aba Language
The language No-aba is the set of strings that never have an aba substring. We can build a DFA M for No-aba with state set {1, 2, 3, 4}, start state 1, final state set {1, 2, 3}, and transition function as shown. (We call 4 a death state.) We can see that an aba will take us from any state to 4.
[Figure: diagram of the four-state DFA M for No-aba]
Clicker Question #2
Suppose δ*(1, w) = 4. Which statement must be true of w?
(a) w contains an […] substring
(b) w […] No-aba
(c) w contains no […]
(d) w […] ([…])*([…] + […])*
Answer #2
Suppose δ*(1, w) = 4. Which statement must be true of w?
(a) w contains an […] substring
(b) w […] No-aba
(c) w contains no […]
(d) w […] ([…])*([…] + […])*
Characterizing the States
Let L1 be the set of strings that have no aba and don't end in a or ab. Let L2 be the set of strings that don't have an aba and end in a. L3 is the set of strings that don't have an aba and end in ab. L4 is the set that have aba.
Characterizing the States
We can make eight checks, one for each value of δ. If δ(i, x) = j, we check that any string in Li, followed by the letter x, yields a string in Lj. We then do an inductive proof, where P(w) is the statement "for all states i, δ*(1, w) = i if and only if w ∈ Li", with each Li as defined on the previous slide. Thus w ∈ L(M) ↔ w ∈ No-aba.
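The claim that δ*(1, w) = i exactly when w ∈ Li can also be spot-checked by brute force on short strings. The transition table below is an assumption: the diagram did not survive in the text, so the eight entries are derived from the descriptions of L1 through L4 (for example, a string ending in ab followed by a now contains aba, so δ(3, a) = 4). The class and method names are illustrative, and an exhaustive check up to a fixed length is evidence, not a substitute for the inductive proof.

```java
// Brute-force check of the state characterization for the No-aba DFA.
// Assumption: DELTA is reconstructed from the sets L1..L4, not copied from the slide.
class NoAbaCheck {
    // DELTA[state][letter] with states 1..4 (row 0 unused), letter 0 = 'a', 1 = 'b'.
    static final int[][] DELTA = {
        {0, 0},   // unused row so that states can be numbered from 1
        {2, 1},   // state 1: a -> 2, b -> 1
        {2, 3},   // state 2: a -> 2, b -> 3
        {4, 1},   // state 3: a -> 4, b -> 1
        {4, 4},   // state 4 (death state): a -> 4, b -> 4
    };

    // delta*(1, w)
    static int run(String w) {
        int state = 1;
        for (char c : w.toCharArray()) {
            state = DELTA[state][c - 'a'];
        }
        return state;
    }

    // The index i of the set Li that contains w, straight from the definitions.
    static int classOf(String w) {
        if (w.contains("aba")) return 4;
        if (w.endsWith("ab")) return 3;
        if (w.endsWith("a")) return 2;
        return 1;
    }

    // True if run and classOf agree on every string over {a, b} of length <= maxLen.
    static boolean agreeUpTo(int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int i = len - 1; i >= 0; i--) {
                    sb.append((char) ('a' + ((bits >> i) & 1)));
                }
                String w = sb.toString();
                if (run(w) != classOf(w)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeUpTo(12));
    }
}
```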
Distinguishable Strings
Is it possible that another DFA with only three states could decide No-aba? We divided all possible strings into four sets. Suppose a DFA reads w and does not know which of the four sets w is in. We'll show that in this case it is doomed: for some string x, it will be wrong if it sees x and has to decide whether wx is in the language No-aba.
Distinguishable Strings
Look at the four strings λ, a, ab, and aba. If the DFA has δ*(i, λ) = δ*(i, a), we say that it cannot distinguish between λ and a. If this is true, the DFA must also have δ*(i, b) = δ*(i, ab) because b will take the same state to the same state. Then as well δ*(i, ba) = δ*(i, aba).
Distinguishable Strings
But now we know that the DFA cannot decide No-aba, because it gives the same answer on the strings ba (which is in No-aba) and aba (which is not in No-aba). We can call this an experiment that distinguishes the two strings λ and a.
Clicker Question #3
Two strings u and v are defined to be No-aba-distinguishable if there exists a string w such that exactly one of the strings uw and vw is in No-aba. Which one of these pairs of strings is not No-aba-distinguishable?
(a) {[…], […]}
(b) {λ, […]}
(c) {[…], […]}
(d) {[…], […]}
Answer #3
Two strings u and v are defined to be No-aba-distinguishable if there exists a string w such that exactly one of the strings uw and vw is in No-aba. Which one of these pairs of strings is not No-aba-distinguishable?
(a) {[…], […]} (both already have aba)
(b) {λ, […]} (append […] to each)
(c) {[…], […]} (append […] to each)
(d) {[…], […]} (append λ to each)
Sets of Distinguishable Strings
Let L be any language. We say that two strings u and v are L-distinguishable (also called L-inequivalent) if there exists a string w such that uw ∈ L and vw ∉ L, or vice versa. We call the strings L-equivalent if the negation of this statement is true, that is, if ∀w: uw ∈ L ↔ vw ∈ L.
A Lemma on Distinguishability
Lemma: If M is a DFA with transition function δ, L is any language, u and v are two L-distinguishable strings, and δ*(i, u) = δ*(i, v), then L(M) ≠ L.
Proof: We can prove by induction that if δ*(i, u) = δ*(i, v), then for any string w, δ*(i, uw) = δ*(i, vw). For the particular w that distinguishes u and v, then, the single state δ*(i, uw) = δ*(i, vw) would need to be both final and non-final if L(M) = L, which is impossible.
A Distinguishability Theorem
Theorem: If there exists a set of k pairwise L-distinguishable strings, then no DFA that decides L can have fewer than k states.
Proof: Suppose a DFA has fewer than k states. Then there are more strings in the set than there are states, so by the Pigeonhole Principle there must exist two L-distinguishable strings u and v in the set such that δ*(i, u) = δ*(i, v). In this case the Lemma says that the DFA does not decide L.
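For a concrete language, a set of pairwise distinguishable strings can be exhibited by searching for a distinguishing suffix for each pair. The sketch below does this for No-aba, assuming the four representative strings λ, a, ab, and aba; the class name, method names, and the bound on suffix length are all illustrative. Failing to find a suffix within the bound does not prove two strings are equivalent, it only reports that no short experiment separates them.

```java
// Sketch: search for experiments that distinguish strings with respect to No-aba.
class DistinguishSketch {
    static boolean inNoAba(String w) {
        return !w.contains("aba");
    }

    // Returns a shortest suffix x (length <= maxLen) such that exactly one of
    // ux and vx is in No-aba, or null if no such suffix exists within the bound.
    static String distinguisher(String u, String v, int maxLen) {
        for (int len = 0; len <= maxLen; len++) {
            for (int bits = 0; bits < (1 << len); bits++) {
                StringBuilder sb = new StringBuilder();
                for (int i = len - 1; i >= 0; i--) {
                    sb.append((char) ('a' + ((bits >> i) & 1)));
                }
                String x = sb.toString();
                if (inNoAba(u + x) != inNoAba(v + x)) {
                    return x;
                }
            }
        }
        return null;
    }

    // True if every pair of distinct entries has a distinguishing suffix within the bound.
    static boolean pairwiseDistinguishable(String[] words, int maxLen) {
        for (int i = 0; i < words.length; i++) {
            for (int j = i + 1; j < words.length; j++) {
                if (distinguisher(words[i], words[j], maxLen) == null) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] reps = {"", "a", "ab", "aba"};
        System.out.println(pairwiseDistinguishable(reps, 3));
        System.out.println(distinguisher("", "a", 3));
    }
}
```

If the four strings are pairwise distinguishable, the theorem gives a lower bound of four states for any DFA deciding No-aba, so a three-state DFA cannot exist.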
Languages With No DFA's
Consider the balanced parenthesis language Paren, which we will write as a subset of {L, R}* with L for left parens and R for right parens. We can prove that there is no DFA at all that decides this language. Look at the infinite set of strings {λ, L, LL, LLL,...}. I claim that this set is pairwise Paren-distinguishable, because if i and j are two naturals with i ≠ j, then L^i and L^j are distinguished by R^i, since L^i R^i is in Paren and L^j R^i is not.
Languages With No DFA's
So for any natural k, we can find more than k pairwise Paren-distinguishable strings, and by our theorem there cannot be a k-state DFA. Our real-life algorithm to decide Paren is to remember the number of L's we have seen, minus the number of R's. If this number ends at 0, without ever going negative, we are in Paren. But this requires more than constant memory: potentially a state for every natural from 0 through n.
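The counting algorithm described above is short enough to write out. This is a minimal sketch with illustrative names; it uses a long counter as a stand-in for an unbounded natural, which is exactly the piece of memory a DFA does not have.

```java
// Sketch of the counter-based decision procedure for Paren over the alphabet {L, R}.
class ParenCounter {
    static boolean inParen(String w) {
        long depth = 0;                       // number of L's seen minus number of R's seen
        for (char c : w.toCharArray()) {
            if (c == 'L') {
                depth++;
            } else if (c == 'R') {
                depth--;
            } else {
                throw new IllegalArgumentException("letter must be L or R");
            }
            if (depth < 0) {
                return false;                 // an R arrived with no unmatched L
            }
        }
        return depth == 0;                    // every L was eventually matched
    }

    // Builds the string c^n, for example power('L', 3) is LLL.
    static String power(char c, int n) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < n; k++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(inParen(power('L', 3) + power('R', 3)));  // L^3 R^3
        System.out.println(inParen(power('L', 4) + power('R', 3)));  // L^4 R^3
    }
}
```

The two calls in main mirror the distinguishing experiment from the previous slide: appending R^3 separates L^3 from L^4.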