CS:4330 Theory of Computation, Spring 2018. Regular Languages: Equivalences between Finite Automata and REs. Haniel Barbosa

Readings for this lecture: Chapter 1 of [Sipser 1996], 3rd edition. Section 1.3.

Finite automata and regular expressions are equivalent

Theorem: A language is regular if and only if some regular expression describes it.

Proof ideas:
1. If a language A is described by a regular expression R, then A is recognized by an NFA, and therefore A is regular.
   - There is an NFA N such that N recognizes L(R).
2. If a language A is regular, then it is recognized by a DFA, and we can always derive a regular expression from that DFA.
   - Turn the DFA into an equivalent regular expression.

Part 1: From regular expressions to NFAs

By induction on the length of R:
- Base cases (R has length 1):
    R = a for some symbol a ∈ Σ, so L(R) = {a}
    R = ε
    R = ∅
- Inductive case: let R have length k > 1. Assume that for any smaller regular expression there is an equivalent NFA. R is of one of the following forms:
    R = R_1 ∪ R_2
    R = R_1 ∘ R_2
    R = (R_1)*
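The induction above can be sketched as a recursive construction. The following is a minimal Python sketch, not the lecture's own code: the tuple encoding of regular expressions, the `NFA` class, and the helper names (`build`, `merge`, `nfa_accepts`) are all assumptions made for illustration. ε-transitions are stored under the symbol `None`.

```python
from itertools import count

_fresh = count()  # generator of fresh state names (hypothetical helper)

class NFA:
    def __init__(self, start, accept_states, delta):
        self.start = start                  # single start state
        self.accept_states = accept_states  # set of accept states
        self.delta = delta                  # (state, symbol or None) -> set of states

def merge(*deltas):
    """Union several transition tables without sharing target sets."""
    out = {}
    for d in deltas:
        for key, targets in d.items():
            out.setdefault(key, set()).update(targets)
    return out

def build(r):
    """r is ('sym', a), ('eps',), ('empty',), ('union', r1, r2),
    ('concat', r1, r2) or ('star', r1)."""
    kind = r[0]
    if kind == 'sym':      # base case R = a
        s, f = next(_fresh), next(_fresh)
        return NFA(s, {f}, {(s, r[1]): {f}})
    if kind == 'eps':      # base case R = ε
        s = next(_fresh)
        return NFA(s, {s}, {})
    if kind == 'empty':    # base case R = ∅
        return NFA(next(_fresh), set(), {})
    if kind == 'union':    # new start state with ε-arrows to both old starts
        n1, n2 = build(r[1]), build(r[2])
        s = next(_fresh)
        d = merge(n1.delta, n2.delta, {(s, None): {n1.start, n2.start}})
        return NFA(s, n1.accept_states | n2.accept_states, d)
    if kind == 'concat':   # ε-arrows from accept states of N1 to start of N2
        n1, n2 = build(r[1]), build(r[2])
        link = {(f, None): {n2.start} for f in n1.accept_states}
        return NFA(n1.start, n2.accept_states, merge(n1.delta, n2.delta, link))
    if kind == 'star':     # new accepting start state, loop back from old accepts
        n1 = build(r[1])
        s = next(_fresh)
        loop = {(f, None): {n1.start} for f in n1.accept_states}
        d = merge(n1.delta, loop, {(s, None): {n1.start}})
        return NFA(s, n1.accept_states | {s}, d)
    raise ValueError(f"unknown regexp node: {kind}")

def eps_closure(nfa, states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for t in nfa.delta.get((q, None), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def nfa_accepts(nfa, w):
    current = eps_closure(nfa, {nfa.start})
    for a in w:
        step = set()
        for q in current:
            step |= nfa.delta.get((q, a), set())
        current = eps_closure(nfa, step)
    return bool(current & nfa.accept_states)

# (a ∪ b)* a : all strings over {a, b} that end in a
ends_in_a = build(('concat',
                   ('star', ('union', ('sym', 'a'), ('sym', 'b'))),
                   ('sym', 'a')))
```

Each branch of `build` mirrors one case of the induction: the three base cases produce fixed small NFAs, and the three inductive cases glue together the NFAs obtained for the smaller expressions.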

Part 2: From DFAs to regular expressions

1. Define the Generalized Nondeterministic Finite Automaton (GNFA for short). Instead of δ : Q × Σ → Q, we use δ : Q × RE → Q.
   - Arrows are labelled with regular expressions
   - Blocks of symbols are read instead of one symbol at a time
   - One start state and one accept state
2. How to convert any DFA to an equivalent GNFA
3. Algorithm to convert any GNFA to an equivalent GNFA with 2 states
4. Convert a 2-state GNFA to an equivalent RE

Step 1: DFA to GNFA

[Figure: the DFA placed between a new start state q_start and a new accept state q_accept]

- Add unique and distinct start and accept states
- Edges with multiple labels become regexp labels
- If internal states (q_1, q_2) don't have an edge between them, add one labeled with ∅
This should be done such that q_start has no incoming edges and q_accept has no outgoing edges.
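Step 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: labels are stored as strings in Python `re` syntax, `None` stands for ∅, and the empty string stands for ε. The ε-arrows from the new start state to the old start state and from the old accept states to the new accept state follow the usual textbook construction; the slide does not spell them out. The function name and the state names "q_start" and "q_accept" are hypothetical and assumed not to clash with existing state names.

```python
def dfa_to_gnfa(states, delta, start, accept_states):
    """delta maps (state, symbol) -> state.
    Returns (gnfa_states, labels, q_start, q_accept), where labels maps
    (p, q) -> regexp string, with None meaning the empty-set regexp."""
    q_start, q_accept = "q_start", "q_accept"   # assumed fresh names

    # Group the symbols that label each DFA edge p -> q.
    symbols = {}
    for (p, a), q in delta.items():
        symbols.setdefault((p, q), []).append(a)

    # Start with ∅ on every arrow allowed in a GNFA:
    # nothing enters q_start and nothing leaves q_accept.
    labels = {}
    for p in [q_start] + list(states):
        for q in list(states) + [q_accept]:
            labels[(p, q)] = None

    # Edges with multiple labels become a union regexp.
    for (p, q), syms in symbols.items():
        labels[(p, q)] = "|".join(sorted(syms))

    labels[(q_start, start)] = ""        # ε into the old start state
    for f in accept_states:
        labels[(f, q_accept)] = ""       # ε out of each old accept state

    return [q_start] + list(states) + [q_accept], labels, q_start, q_accept

# DFA over {a, b} accepting strings with an even number of a's.
even_a = {("e", "a"): "o", ("e", "b"): "e",
          ("o", "a"): "e", ("o", "b"): "o"}
gnfa_states, labels, qs, qa = dfa_to_gnfa(["e", "o"], even_a, "e", {"e"})
```

Storing ∅ explicitly on every missing arrow keeps the later state-elimination step uniform, since every pair of states then has exactly one label.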

Step 2: Eliminate states from GNFA

While the machine has more than 2 states:
- Pick an internal state and rip it out
- Re-label the arrows with regular expressions to account for the missing state

GNFA: definition and acceptance

A GNFA is a tuple (Q, Σ, δ, q_start, q_accept) where:
- Q is the set of states
- Σ is the finite alphabet (not regexps)
- q_start is the initial state (unique, no incoming edges)
- q_accept is the accepting state (unique, no outgoing edges)
- δ : (Q \ {q_accept}) × (Q \ {q_start}) → R, where R is the set of all regexps over Σ

A GNFA accepts a string w ∈ Σ* if w = w_1 ... w_k, with each w_i ∈ Σ*, and a sequence of states q_0, ..., q_k exists such that:
- q_0 = q_start is the start state
- q_k = q_accept is the accept state
- for each i, we have w_i ∈ L(R_i), where R_i = δ(q_{i-1}, q_i), i.e. R_i is the expression on the arrow from q_{i-1} to q_i
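The acceptance condition can be checked mechanically by searching over pairs (state, position in w). The sketch below is only an illustration under assumptions: arrow labels are strings in Python `re` syntax, `None` stands for ∅, the empty string stands for ε, and the function name `gnfa_accepts` is hypothetical.

```python
import re
from collections import deque

def gnfa_accepts(delta, q_start, q_accept, w):
    """delta maps (p, q) -> regexp string, or None for the empty-set regexp.
    Searches for a split w = w_1 ... w_k and states q_0, ..., q_k such that
    each block w_i matches the label on the arrow from q_{i-1} to q_i."""
    seen = {(q_start, 0)}          # (state, number of symbols of w consumed)
    queue = deque(seen)
    while queue:
        q, i = queue.popleft()
        if q == q_accept and i == len(w):
            return True
        for (p, q_next), label in delta.items():
            if p != q or label is None:
                continue
            # Try every block w[i:j], including the empty block (j == i).
            for j in range(i, len(w) + 1):
                if (q_next, j) not in seen and re.fullmatch(label, w[i:j]):
                    seen.add((q_next, j))
                    queue.append((q_next, j))
    return False

# GNFA for "even number of a's" over {a, b}, written out by hand.
delta = {
    ("q_start", "e"): "", ("q_start", "o"): None, ("q_start", "q_accept"): None,
    ("e", "e"): "b", ("e", "o"): "a", ("e", "q_accept"): "",
    ("o", "e"): "a", ("o", "o"): "b", ("o", "q_accept"): None,
}
```

Because there are finitely many (state, position) pairs, the search terminates even when some arrows are labelled ε.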

CONVERT

Given a DFA M, let G be its GNFA. CONVERT(G) yields the equivalent regexp.
1. Let k be the number of states of G.
2. If k = 2, then return the regexp labeling the single transition of G.
3. Select any state q_rip ∈ Q \ {q_start, q_accept} and let G' be the GNFA (Q', Σ, δ', q_start, q_accept) such that Q' = Q \ {q_rip} and, for any q_i ∈ Q' \ {q_accept} and any q_j ∈ Q' \ {q_start},
     δ'(q_i, q_j) = (R_1)(R_2)*(R_3) ∪ (R_4)
   for R_1 = δ(q_i, q_rip), R_2 = δ(q_rip, q_rip), R_3 = δ(q_rip, q_j) and R_4 = δ(q_i, q_j).
4. Return CONVERT(G').
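The four steps translate almost directly into a recursive function. The following Python sketch is one possible rendering, under assumptions: labels are strings in Python `re` syntax, `None` stands for ∅, the empty string stands for ε, and the helpers `star`, `concat` and `union` are hypothetical names that apply the identities ∅* = ε, R∅ = ∅ and R ∪ ∅ = R to keep the output small.

```python
import re
from itertools import product

def star(r):
    return "" if r is None or r == "" else f"(?:{r})*"   # ∅* = ε* = ε

def concat(r, s):
    if r is None or s is None:
        return None                                      # R∅ = ∅R = ∅
    if r == "":
        return s
    if s == "":
        return r
    return f"(?:{r})(?:{s})"

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    if r == s:
        return r
    if r == "":
        return f"(?:{s})?"                               # ε ∪ S
    if s == "":
        return f"(?:{r})?"
    return f"(?:{r})|(?:{s})"

def convert(states, delta, q_start, q_accept):
    """delta maps (p, q) -> regexp string, or None for the empty-set regexp."""
    if len(states) == 2:                                 # step 2
        return delta[(q_start, q_accept)]
    q_rip = next(q for q in states if q not in (q_start, q_accept))
    rest = [q for q in states if q != q_rip]             # Q' = Q \ {q_rip}
    new_delta = {}
    for q_i in rest:
        if q_i == q_accept:
            continue
        for q_j in rest:
            if q_j == q_start:
                continue
            r1, r2 = delta[(q_i, q_rip)], delta[(q_rip, q_rip)]
            r3, r4 = delta[(q_rip, q_j)], delta[(q_i, q_j)]
            # δ'(q_i, q_j) = (R_1)(R_2)*(R_3) ∪ (R_4)
            new_delta[(q_i, q_j)] = union(concat(concat(r1, star(r2)), r3), r4)
    return convert(rest, new_delta, q_start, q_accept)   # step 4

# GNFA for "even number of a's" over {a, b}, written out by hand.
delta = {
    ("q_start", "e"): "", ("q_start", "o"): None, ("q_start", "q_accept"): None,
    ("e", "e"): "b", ("e", "o"): "a", ("e", "q_accept"): "",
    ("o", "e"): "a", ("o", "o"): "b", ("o", "q_accept"): None,
}
regexp = convert(["q_start", "e", "o", "q_accept"], delta, "q_start", "q_accept")

# Compare the result with the intended language on all short strings.
all_agree = all(
    bool(re.fullmatch(regexp, "".join(w))) == (w.count("a") % 2 == 0)
    for n in range(7) for w in product("ab", repeat=n)
)
```

The resulting expression depends on the order in which states are ripped out and is usually far from minimal, but any order yields an expression for the same language.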

Is CONVERT correct?

Theorem: Given any GNFA G, CONVERT(G) is equivalent to G.

Proof idea: By induction on k, the number of states of G.
- Base step: k = 2. Show that the regexp labeling the single arrow of G describes all strings accepted by G.
- Inductive step: assume the claim holds for k - 1 states. Show that G and G' are equivalent (i.e. they accept the same words); then, by the induction hypothesis, CONVERT(G') is equivalent to G', and hence to G.

The Complete Picture

[Diagram relating DFA, NFA, Reg. Language and Reg. Expression]

Limits of finite automata

Are the following languages regular?
- L_1 = {w | w has an equal number of 1s and 0s}
- L_2 = {w | w has an equal number of occurrences of 01 and 10}