Incorrect reasoning about RL. Equivalence of NFA, DFA. Epsilon Closure. Proving equivalence. One direction is easy:

Incorrect reasoning about RL
Since L1 = {w | w = a^n, n ∈ N} and L2 = {w | w = b^n, n ∈ N} are regular, therefore L1 ∘ L2 = {w | w = a^n b^n, n ∈ N} is regular.
(Flaw: L1 ∘ L2 is really {a^m b^n | m, n ∈ N}; the two exponents vary independently.)
If L1 is a regular language, then L2 = {w^R | w ∈ L1} is regular, and therefore L1 ∘ L2 = {w w^R | w ∈ L1} is regular.
(Flaw: L1 ∘ L2 is really {u v^R | u, v ∈ L1}; the two halves need not come from the same w.)

Equivalence of NFA, DFA
Pages 54-58 (Sipser)
We will prove that every NFA is equivalent to a DFA (with up to exponentially more states).
Non-determinism does not help FAs to recognize more languages!

5/23/27 CSE 21, Summer 27

Epsilon Closure
Let N = (Q, Σ, δ, q0, F) be any NFA.
Consider any set R ⊆ Q.
E(R) = {q | q can be reached from a state in R by following 0 or more ε-transitions}

Proving equivalence
For all languages L ⊆ Σ*:
L = L(N) for some NFA N iff L = L(M) for some DFA M
One direction is easy: a DFA M is also an NFA N! So N does not have to be "constructed" from M.

Proving equivalence contd.
The other direction: construct M from N.
N = (Q, Σ, δ, q0, F)
Construct M = (Q′, Σ, δ′, q0′, F′) such that, for any string w ∈ Σ*, w is accepted by N iff w is accepted by M.
Special case: assume that ε is not used in the NFA N.
- Need to keep track of each subset of the states of N
- So Q′ = P(Q), q0′ = {q0}
- δ′(R, a) = ∪ δ(r, a) over all r ∈ R, for R ∈ Q′
- F′ = {R ∈ Q′ | R contains an accept state of N}
Now let us assume that ε is used.
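The ε-closure E(R) and the ε-free transition δ′(R, a) defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding of δ, the use of the empty string as the ε label, and the names `epsilon_closure`, `subset_step_no_eps`, and `example_delta` are not from the slides.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

EPS = ""  # assumption: epsilon-transitions are stored under the empty string

# Hypothetical encoding of an NFA transition function:
# (state, symbol) -> set of successor states.
Delta = Dict[Tuple[str, str], Set[str]]


def epsilon_closure(delta: Delta, states: Iterable[str]) -> FrozenSet[str]:
    """E(R): all states reachable from R using 0 or more epsilon-transitions."""
    closure = set(states)
    stack = list(closure)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPS), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def subset_step_no_eps(delta: Delta, R: Iterable[str], a: str) -> FrozenSet[str]:
    """Special case without epsilon: the union of delta(r, a) over all r in R."""
    result: Set[str] = set()
    for r in R:
        result |= delta.get((r, a), set())
    return frozenset(result)


# Small made-up NFA fragment: q0 --eps--> q1 --eps--> q2, and q2 --a--> {q0, q2}.
example_delta: Delta = {
    ("q0", EPS): {"q1"},
    ("q1", EPS): {"q2"},
    ("q2", "a"): {"q0", "q2"},
}
```

With this fragment, `epsilon_closure(example_delta, {"q0"})` contains all three states, because zero, one, or two ε-steps are allowed.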

Construction (general case)
1. Q′ = P(Q)
2. q0′ = E({q0})
3. For all R ∈ Q′ and a ∈ Σ: δ′(R, a) = {q ∈ Q | q ∈ E(δ(r, a)) for some r ∈ R}
4. F′ = {R ∈ Q′ | R contains an accept state of N}

Why the construction works
For any string w ∈ Σ*, w is accepted by N iff w is accepted by M.
This can be proved by induction on the number of steps of the computation.

State minimization
It may be possible to design DFAs without the exponential blowup in the number of states. Consider the NFA and DFA below. We will defer this question for later.
[Figure: an NFA and an equivalent DFA over the alphabet {0, 1}]

Regular Expressions (Def. 1.52)
Given an alphabet Σ, R is a regular expression if (INDUCTIVE DEFINITION):
- R = a, with a ∈ Σ
- R = ε
- R = ∅
- R = (R1 ∪ R2), with R1 and R2 regular expressions
- R = (R1 ∘ R2), with R1 and R2 regular expressions
- R = (R1*), with R1 a regular expression
Precedence order: *, ∘, ∪

Regular Expressions
Unix grep command: Global Regular Expression and Print
Lexical Analyzer Generators (part of compilers)
Both use regular expression to DFA conversion.
Examples
e1 = a ∪ b, L(e1) = {a, b}
e2 = ab ∪ ba, L(e2) = {ab, ba}
e3 = a*, L(e3) = {a}*
e4 = (a ∪ b)*, L(e4) = {a, b}*
e5 = (e_m ∘ e_n), L(e5) = L(e_m) ∘ L(e_n)
e6 = a*b ∪ a*bb, L(e6) = {w | w ∈ {a, b}* and w has 0 or more a's followed by 1 or 2 b's}
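Returning to the general NFA-to-DFA construction (steps 1 to 4 above), the following is a minimal Python sketch. It deviates from the slides in one labeled way: instead of materializing all of P(Q), it only builds the subsets reachable from E({q0}), which accepts the same language. The dictionary encoding, the empty string as ε, and all function and variable names are assumptions made for illustration.

```python
EPS = ""  # assumption: epsilon-transitions are stored under the empty string


def eps_closure(delta, states):
    """E(R): states reachable from R via 0 or more epsilon-transitions."""
    closure = set(states)
    stack = list(closure)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPS), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return frozenset(closure)


def nfa_to_dfa(alphabet, delta, start, accepts):
    """Subset construction restricted to reachable subsets (a labeled shortcut)."""
    dfa_start = eps_closure(delta, {start})          # step 2: q0' = E({q0})
    dfa_delta = {}
    seen = {dfa_start}
    todo = [dfa_start]
    while todo:
        R = todo.pop()
        for a in alphabet:
            moved = set()
            for r in R:                              # step 3: union over r in R
                moved |= delta.get((r, a), set())
            target = eps_closure(delta, moved)
            dfa_delta[(R, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accepts = {R for R in seen if R & accepts}   # step 4
    return dfa_delta, dfa_start, dfa_accepts


def dfa_accepts_string(dfa_delta, dfa_start, dfa_accepts, w):
    """Run the constructed DFA on w."""
    R = dfa_start
    for a in w:
        R = dfa_delta[(R, a)]
    return R in dfa_accepts


# Made-up NFA for "binary strings ending in 01", with one epsilon-transition.
nfa_delta = {
    ("s", EPS): {"q0"},
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
dfa = nfa_to_dfa("01", nfa_delta, "s", {"q2"})
```

Calling `dfa_accepts_string(*dfa, "1101")` then runs the resulting DFA; each DFA state is a frozenset of NFA states, mirroring the R ∈ Q′ of the construction.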

Thm 1.54: RL ~ RE
We need to prove both ways:
If a language is described by a regular expression, then it is regular (Lemma 1.55). (We will show that we can convert a regular expression R into an NFA M such that L(R) = L(M).)
The second part: if a language is regular, then it can be described by a regular expression (Lemma 1.60).

Regular expression to NFA
Claim: If L = L(e) for some RE e, then L = L(M) for some NFA M.
Construction: use the inductive definition.
1. R = a, with a ∈ Σ
2. R = ε
3. R = ∅
4. R = (R1 ∪ R2), with R1 and R2 regular expressions
5. R = (R1 ∘ R2), with R1 and R2 regular expressions
6. R = (R1*), with R1 a regular expression
[Figure: NFAs for the base cases 1, 2, and 3]
Cases 4, 5, 6: similar to closure of RL under the regular operations.

Examples of RE to NFA conv.
L = {ab, ba}
L = {ab, abab, ababab, …}
L = {w | w = a^m b^n, m<1, n>1}

Back to RL ~ RE
The second part (Lemma 1.60): if a language is regular, then it can be described by a regular expression.
Proof strategy:
- regular implies an equivalent DFA
- convert the DFA to a GNFA (generalized NFA)
- convert the GNFA to a regular expression
GNFA: an NFA that has regular expressions as transition labels.

Example GNFA
[Figure: example GNFA whose transitions are labeled with regular expressions over {0, 1}]

Generalized NFA - defn
Generalized non-deterministic finite automaton M = (Q, Σ, δ, q_start, q_accept) with
- Q finite set of states
- Σ the input alphabet
- q_start the start state
- q_accept the (unique) accept state
- δ: (Q − {q_accept}) × (Q − {q_start}) → R is the transition function (R is the set of regular expressions over Σ)
(NOTE THE NEW DEFN OF δ)
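Returning to the regular-expression-to-NFA claim (cases 1 to 6 above), here is a small Python sketch of one way to carry out the induction. It is a Thompson-style variant in which every fragment has exactly one start and one accept state joined by ε-transitions; the slides' figures may wire cases 4 to 6 slightly differently. The tuple encoding of expressions, the class `NFABuilder`, and the helper names are all assumptions for illustration.

```python
import itertools
from collections import defaultdict

EPS = ""  # assumption: epsilon-transitions are stored under the empty string


class NFABuilder:
    """Builds NFA fragments (start, accept) by induction on the expression."""

    def __init__(self):
        self.delta = defaultdict(set)
        self._ids = itertools.count()

    def _add(self, src, label, dst):
        self.delta[(src, label)].add(dst)

    def build(self, r):
        kind = r[0]
        s, f = next(self._ids), next(self._ids)
        if kind == "sym":                      # case 1: R = a
            self._add(s, r[1], f)
        elif kind == "eps":                    # case 2: R = epsilon
            self._add(s, EPS, f)
        elif kind == "empty":                  # case 3: R = empty set
            pass                               # no path from s to f
        elif kind == "union":                  # case 4
            for sub in r[1:]:
                s1, f1 = self.build(sub)
                self._add(s, EPS, s1)
                self._add(f1, EPS, f)
        elif kind == "concat":                 # case 5
            s1, f1 = self.build(r[1])
            s2, f2 = self.build(r[2])
            self._add(s, EPS, s1)
            self._add(f1, EPS, s2)
            self._add(f2, EPS, f)
        elif kind == "star":                   # case 6
            s1, f1 = self.build(r[1])
            self._add(s, EPS, f)               # zero repetitions
            self._add(s, EPS, s1)
            self._add(f1, EPS, s1)             # another repetition
            self._add(f1, EPS, f)
        else:
            raise ValueError(f"unknown expression kind: {kind}")
        return s, f


def _closure(delta, states):
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for nxt in delta.get((q, EPS), set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def re_accepts(r, w):
    """Build the NFA for r and simulate it on w."""
    builder = NFABuilder()
    start, accept = builder.build(r)
    current = _closure(builder.delta, {start})
    for a in w:
        moved = set()
        for q in current:
            moved |= builder.delta.get((q, a), set())
        current = _closure(builder.delta, moved)
    return accept in current


# e6 = a*b U a*bb, written in the assumed tuple encoding.
a_star = ("star", ("sym", "a"))
b = ("sym", "b")
e6 = ("union", ("concat", a_star, b), ("concat", a_star, ("concat", b, b)))
```

For instance, `re_accepts(e6, "aab")` simulates the NFA built for a*b ∪ a*bb on the string aab.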

Characteristics of GNFAs
δ: (Q\{q_accept}) × (Q\{q_start}) → R
- The interior Q\{q_accept, q_start} is fully connected by δ
- From q_start only outgoing transitions
- To q_accept only ingoing transitions
- Impossible q_i → q_j transitions are labeled δ(q_i, q_j) = ∅
Observation: the two-state GNFA q_start --R--> q_accept recognizes the language L(R).

Proof Idea of Lemma 1.60
Proof idea (given a DFA M):
- Construct an equivalent GNFA M′ with k ≥ 2 states
- Reduce the internal states one by one until k = 2
- This GNFA will be of the form q_start --R--> q_accept
- This regular expression R will be such that L(R) = L(M)

DFA M → Equivalent GNFA M′
Let M have k states Q = {q1, …, qk}
- Add two states q_accept and q_start
- Connect q_start to the earlier start state q1 with an ε-transition
- Connect the old accepting states to q_accept with ε-transitions
- Complete missing transitions by ∅
- Join multiple transitions: q_i --0,1--> q_j becomes q_i --(0 ∪ 1)--> q_j

Remove Internal state of GNFA
If the GNFA M has more than 2 states, rip an internal state q_rip to get an equivalent GNFA M′ by:
- Removing state q_rip: Q′ = Q\{q_rip}
- Changing the transition function δ to
  δ′(q_i, q_j) = δ(q_i, q_j) ∪ (δ(q_i, q_rip) (δ(q_rip, q_rip))* δ(q_rip, q_j))
  for every q_i ∈ Q′\{q_accept} and q_j ∈ Q′\{q_start}
[Figure: q_i --R1--> q_rip (self-loop R2) --R3--> q_j together with q_i --R4--> q_j is replaced by q_i --R4 ∪ (R1 R2* R3)--> q_j]

Proof Lemma 1.60
- Let M be a DFA with k states
- Create an equivalent GNFA M′ with k+2 states
- Reduce M′ in k steps to M″ with 2 states
- The resulting GNFA describes a single regular expression R
- The regular language L(M) equals the language L(R) of the regular expression R

Proof Lemma 1.60 - continued
Use induction (on the number of states of the GNFA) to prove correctness of the conversion procedure.
Base case: k = 2.
Inductive step: 2 cases, depending on whether q_rip is or is not on the accepting path.
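The DFA-to-GNFA conversion and the rip step R4 ∪ (R1 R2* R3) can be sketched as follows. This is a minimal illustration, not the course's code: labels are stored as strings in Python `re` syntax (an assumption made so the result can be checked with `re.fullmatch`), `None` stands for ∅, the empty string stands for ε, and every name here is hypothetical.

```python
import re
from itertools import product


def union(r1, r2):
    """R1 U R2, with None playing the role of the empty set."""
    if r1 is None:
        return r2
    if r2 is None or r1 == r2:
        return r1
    return f"(?:{r1}|{r2})"


def concat(r1, r2):
    """R1 R2; concatenating with the empty set gives the empty set."""
    if r1 is None or r2 is None:
        return None
    return r1 + r2


def star(r):
    """R*; both (empty set)* and epsilon* equal epsilon."""
    if r is None or r == "":
        return ""
    return f"(?:{r})*"


def dfa_to_regex(states, alphabet, delta, start, accepts):
    """State elimination on the GNFA built from a DFA with a total delta."""
    START, ACCEPT = "q_start", "q_accept"     # assumed fresh state names
    label = {(START, start): ""}              # epsilon from q_start
    for q in accepts:
        label[(q, ACCEPT)] = ""               # epsilon into q_accept
    for q in states:
        for a in alphabet:                    # join parallel transitions
            t = delta[(q, a)]
            label[(q, t)] = union(label.get((q, t)), re.escape(a))
    remaining = list(states)
    while remaining:
        rip = remaining.pop()
        loop = star(label.get((rip, rip)))
        for qi, qj in product([START] + remaining, remaining + [ACCEPT]):
            via = concat(concat(label.get((qi, rip)), loop), label.get((rip, qj)))
            new = union(label.get((qi, qj)), via)
            if new is not None:
                label[(qi, qj)] = new
        label = {k: v for k, v in label.items() if rip not in k}
    return label.get((START, ACCEPT))         # None means the empty language


# Made-up DFA: binary strings with an odd number of 1s.
odd_delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
odd_ones_regex = dfa_to_regex(["even", "odd"], "01", odd_delta, "even", {"odd"})
```

The resulting `odd_ones_regex` can be compared against the DFA's language with `re.fullmatch(odd_ones_regex, w)`. The exact shape of the expression depends on the order in which states are ripped, so only the language, not the string, should be relied on.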

Recap RL = RE
Let R be a regular expression; then there exists an NFA M such that L(R) = L(M).
The language L(M) of a DFA M is equal to the language L(M′) of a GNFA M′, which can be converted to a two-state GNFA M″.
The transition q_start --R--> q_accept of M″ obeys L(R) = L(M″).
Hence: RE ⊆ NFA = DFA ⊆ GNFA ⊆ RE

Example
L = {w | the sum of the bits of w is odd}

Non-regular Languages (Section 1.4)
Which languages cannot be recognized by finite automata?
Example: L = {0^n 1^n | n ∈ N}
Playing around with FAs convinces you that the finiteness of an FA is problematic when L must be handled for all n ∈ N.
The problem occurs between the 0^n and the 1^n.
Informal: the memory of an FA is limited by the number of states |Q|.

Repeating DFA Paths
Consider an accepting DFA M with size |Q|.
On a string of length p, p+1 states get visited.
For p ≥ |Q|, there must be a j such that the computational path looks like: q1, …, qj, …, qj, …, qk
The action of the DFA in qj is always the same.
If we repeat (or ignore) the qj, …, qj part, the new path will again be an accepting path.

Line of Reasoning
Proof by contradiction:
- Assume that L is regular
- Hence, there is a DFA M that recognizes L
- For strings of length ≥ |Q| the DFA M has to repeat itself
- Show that M will accept strings outside L
- Conclude that the assumption was wrong
Note that we use the simple DFA, not the more elaborate (but equivalent) NFA or GNFA.
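The repeating-path argument can be made concrete with a short Python sketch: run a DFA on a string, find the first state that occurs twice on the path, and cut the string at those two positions. The example DFA (number of 1s divisible by 3), the dictionary encoding, and the function names are assumptions for illustration; indices are 0-based, so `path[t]` is the state reached after reading the first t symbols.

```python
def run_path(delta, start, s):
    """Return the list of visited states: len(s) + 1 entries."""
    path = [start]
    for a in s:
        path.append(delta[(path[-1], a)])
    return path


def split_at_first_repeat(delta, start, s):
    """Return (x, y, z) with s = xyz, where y drives the DFA around a loop.

    Returns None when no state repeats, which can only happen if
    len(s) is smaller than the number of states.
    """
    first_seen = {}
    for idx, state in enumerate(run_path(delta, start, s)):
        if state in first_seen:
            j, k = first_seen[state], idx      # path[j] == path[k], j < k
            return s[:j], s[j:k], s[k:]
        first_seen[state] = idx
    return None


def accepts(delta, start, finals, s):
    return run_path(delta, start, s)[-1] in finals


# Made-up 3-state DFA: accepts binary strings whose number of 1s is divisible by 3.
mod3_delta = {}
for q in range(3):
    mod3_delta[(q, "0")] = q
    mod3_delta[(q, "1")] = (q + 1) % 3

x, y, z = split_at_first_repeat(mod3_delta, 0, "10101")
pumped = [x + y * i + z for i in range(4)]
```

Because y takes the DFA from the repeated state back to itself, every string in `pumped` ends in the same state as the original string, so all of them are accepted whenever the original is.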

Pumping Lemma (Thm 1.37)
For every regular language L, there is a pumping length p, such that any string s ∈ L with |s| ≥ p can be written as s = xyz with
1) x y^i z ∈ L for every i ∈ {0, 1, 2, …}
2) |y| ≥ 1
3) |xy| ≤ p
Note that
- 1) implies that xz ∈ L
- 2) says that y cannot be the empty string ε
- Condition 3) is not always used

Formal Proof of Pumping Lemma
Let M = (Q, Σ, δ, q1, F) with Q = {q1, …, qp}.
Let s = s_1 … s_n ∈ L(M) with |s| = n ≥ p.
The computational path of M on s is the sequence r_1, …, r_{n+1} ∈ Q^{n+1} with r_1 = q1, r_{n+1} ∈ F and r_{t+1} = δ(r_t, s_t) for 1 ≤ t ≤ n.
Because n+1 ≥ p+1, there are two terms such that r_j = r_k (with j < k and k ≤ p+1).
Let x = s_1 … s_{j−1}, y = s_j … s_{k−1}, and z = s_k … s_n. Then |y| ≥ 1 and |xy| ≤ p.
x takes M from q1 = r_1 to r_j, y takes M from r_j to r_j, and z takes M from r_j to r_{n+1} ∈ F.
As a result, x y^i z takes M from q1 to r_{n+1} ∈ F for every i ≥ 0, so x y^i z ∈ L(M) for every i ∈ {0, 1, 2, …}.

Pumping 0^n 1^n (Ex. 1.38)
Assume that B = {0^n 1^n | n ≥ 0} is regular.
Let p be the pumping length, and s = 0^p 1^p ∈ B.
P.L.: s = xyz = 0^p 1^p, with x y^i z ∈ B for all i ≥ 0.
Three options for y:
1) y = 0^k, hence xyyz = 0^{p+k} 1^p ∉ B
2) y = 1^k, hence xyyz = 0^p 1^{k+p} ∉ B
3) y = 0^k 1^l, hence xyyz = 0^p 1^l 0^k 1^p ∉ B
Conclusion: no split of s satisfies the pumping lemma, so the language B is not regular.

F = {ww | w ∈ {0, 1}*} (Ex. 1.40)
Let p be the pumping length, and take s = 0^p 1 0^p 1.
Let s = xyz = 0^p 1 0^p 1 with condition 3) |xy| ≤ p.
Only one option: y = 0^k, with xyyz = 0^{p+k} 1 0^p 1 ∉ F.
Without 3) this would have been a pain.

Intersecting Regular Languages
Let C = {w | # of 0s in w equals # of 1s in w}.
Problem: if xyz ∈ C with y ∈ C, then x y^i z ∈ C, so pumping such a y gives no contradiction.
Idea: if C is regular and F is regular, then the intersection C ∩ F has to be regular as well.
Solution: assume that C is regular.
Take the regular F = {0^n 1^m | n, m ∈ N}; then for the intersection: C ∩ F = {0^n 1^n | n ∈ N}.
But we know that C ∩ F is not regular.
Conclusion: C is not regular.
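The pumping arguments for B = {0^n 1^n} and F = {ww} can be checked by brute force for small p. The sketch below enumerates every split s = xyz with |y| ≥ 1 and |xy| ≤ p and looks for one that survives pumping for i = 0, …, `max_i`; finding none for a given p confirms that no split works for that p. This only checks concrete values of p and is not a replacement for the proof. The function names and the bound `max_i` are assumptions for illustration.

```python
def in_B(w):
    """Membership in B = {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return w == "0" * n + "1" * n


def in_F(w):
    """Membership in F = {ww | w in {0,1}*}."""
    half, rem = divmod(len(w), 2)
    return rem == 0 and w[:half] == w[half:]


def surviving_split(s, p, in_lang, max_i=3):
    """Return a split (x, y, z) that stays in the language for i = 0..max_i.

    Only splits with |y| >= 1 and |xy| <= p are tried. None means every
    allowed split already fails for some i in that range.
    """
    for k in range(1, min(p, len(s)) + 1):     # k = |xy|
        for j in range(k):                     # j = |x|, so |y| = k - j >= 1
            x, y, z = s[:j], s[j:k], s[k:]
            if all(in_lang(x + y * i + z) for i in range(max_i + 1)):
                return x, y, z
    return None


def no_split_works_for_B(p):
    return surviving_split("0" * p + "1" * p, p, in_B) is None


def no_split_works_for_F(p):
    return surviving_split("0" * p + "1" + "0" * p + "1", p, in_F) is None
```

By contrast, for a regular language such as all binary strings, `surviving_split("0011", 2, lambda w: set(w) <= {"0", "1"})` does find a split, as the lemma predicts.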

Pumping Down
E = {0^i 1^j | i ≥ j}
Problem: pumping up s = 0^p 1^p with y = 0^k gives xyyz = 0^{p+k} 1^p, xy^3z = 0^{p+2k} 1^p, which are all in E (hence do not give contradictions).
Solution: pump down to xz = 0^{p−k} 1^p.
Overall, for s = xyz = 0^p 1^p (with |xy| ≤ p): y = 0^k, hence xz = 0^{p−k} 1^p ∉ E.
Contradiction: E is not regular.
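The difference between pumping up and pumping down for E can be seen on a concrete p. The sketch below assumes E = {0^i 1^j | i ≥ j}, which is the reading consistent with s = 0^p 1^p ∈ E; the helper names and the chosen values of p are assumptions for illustration.

```python
import re


def in_E(w):
    """Membership in E = {0^i 1^j | i >= j} (assumed reading of the slide)."""
    m = re.fullmatch(r"(0*)(1*)", w)
    return m is not None and len(m.group(1)) >= len(m.group(2))


def allowed_splits(s, p):
    """All splits s = xyz with |y| >= 1 and |xy| <= p."""
    for k in range(1, min(p, len(s)) + 1):
        for j in range(k):
            yield s[:j], s[j:k], s[k:]


def pumping_up_stays_in_E(p):
    """True if xyyz and xyyyz are in E for every allowed split of 0^p 1^p."""
    s = "0" * p + "1" * p
    return all(in_E(x + y * i + z) for x, y, z in allowed_splits(s, p) for i in (2, 3))


def pumping_down_leaves_E(p):
    """True if xz is outside E for every allowed split of 0^p 1^p."""
    s = "0" * p + "1" * p
    return all(not in_E(x + z) for x, y, z in allowed_splits(s, p))
```

For every p tried, `pumping_up_stays_in_E(p)` and `pumping_down_leaves_E(p)` should both hold: pumping up never leaves E, while removing y always drops the number of 0s below the number of 1s.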