DFA to Regular Expressions

Proposition: If $L$ is regular then there is a regular expression $r$ such that $L = L(r)$.

Proof Idea: Let $M = (Q, \Sigma, \delta, q_1, F)$ be a DFA recognizing $L$, with $Q = \{q_1, q_2, \ldots, q_n\}$ and $F = \{q_{f_1}, \ldots, q_{f_k}\}$.
Construct regular expressions $r_{1f_i}$ such that $L(r_{1f_i}) = \{w : \delta(q_1, w) = q_{f_i}\}$, i.e., $r_{1f_i}$ describes the set of strings on which $M$ ends up in state $q_{f_i}$ when started in the initial state $q_1$.
Then
$$L = L(M) = L(r_{1f_1}) \cup L(r_{1f_2}) \cup \cdots \cup L(r_{1f_k}) = L(r_{1f_1} + r_{1f_2} + \cdots + r_{1f_k}).$$
Thus, the desired regular expression is $r_{1f_1} + r_{1f_2} + \cdots + r_{1f_k}$.
Constructing $r_{1f_i}$

Idea 1: For every $i, j$, build regular expressions $r_{ij}$ describing the strings taking $M$ from state $q_i$ to state $q_j$, where $q_i$ need not be the initial state and $q_j$ need not be a final state.

Idea 2: Build the expressions $r_{ij}$ inductively. Start with expressions that describe paths from $q_i$ to $q_j$ that do not pass through any intermediate states; i.e., these are single nodes or single edges. Inductively build expressions that describe paths that pass through a progressively larger set of states.
Definitions and Notation

Define $R^k_{ij}$ to be the set (not regular expression) of strings leading from $q_i$ to $q_j$ such that any intermediate state is $\le k$.
Note, the superscript $k$ refers only to the intermediate states; so $i$ and $j$ could be greater than $k$.
$R^0_{ij}$: set of strings that go from $q_i$ to $q_j$ without passing through any intermediate states; in other words they are $\epsilon$ or single edges.
$R^n_{ij}$: set of all strings going from $q_i$ to $q_j$.
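Constructing set $R^k_{ij}$: Base Case

$$R^0_{ij} = \begin{cases} \{a \mid \delta(q_i, a) = q_j\} & \text{if } i \neq j \\ \{a \mid \delta(q_i, a) = q_j\} \cup \{\epsilon\} & \text{if } i = j \end{cases}$$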
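Constructing set $R^k_{ij}$: Inductive Step

[Figure: path from $q_i$ to $q_j$ through $q_k$; labels $i$, $j$, $k$, $k-1$]

Assume we have $R^{k-1}_{ij}$. Then
$$R^k_{ij} = R^{k-1}_{ij} \cup R^{k-1}_{ik} (R^{k-1}_{kk})^* R^{k-1}_{kj}$$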
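Constructing the Regular Expression

Task: Construct expression $r^k_{ij}$ such that $L(r^k_{ij}) = R^k_{ij}$.

Base Case:
$$r^0_{ij} = \begin{cases} \emptyset & \text{if } R^0_{ij} = \emptyset \\ a_1 + a_2 + \cdots + a_m & \text{if } R^0_{ij} = \{a_1, a_2, \ldots, a_m\} \\ \epsilon + a_1 + \cdots + a_m & \text{if } R^0_{ij} = \{\epsilon, a_1, \ldots, a_m\} \end{cases}$$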
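Constructing the Regular Expression: Inductive Step

Assume inductively that $r^{k-1}_{ij}$ is the regular expression for $R^{k-1}_{ij}$. Then
$$\begin{aligned} R^k_{ij} &= R^{k-1}_{ij} \cup R^{k-1}_{ik} (R^{k-1}_{kk})^* R^{k-1}_{kj} \\ &= L(r^{k-1}_{ij}) \cup L(r^{k-1}_{ik}) (L(r^{k-1}_{kk}))^* L(r^{k-1}_{kj}) \\ &= L(r^{k-1}_{ij} + r^{k-1}_{ik} (r^{k-1}_{kk})^* r^{k-1}_{kj}) \end{aligned}$$
So $r^k_{ij} = r^{k-1}_{ij} + r^{k-1}_{ik} (r^{k-1}_{kk})^* r^{k-1}_{kj}$ is the regular expression for $R^k_{ij}$.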
Completing the Proof

Proposition: If $L$ is regular then there is a regular expression $r$ such that $L = L(r)$.

Proof: Let $q_1$ be the initial state and $\{q_{f_1}, q_{f_2}, \ldots, q_{f_k}\}$ the final states of $M$ (which recognizes $L$). Then the desired regular expression is
$$r^n_{1f_1} + r^n_{1f_2} + \cdots + r^n_{1f_k}$$
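
The inductive construction can be sketched in Python as follows. This is a minimal illustration, not an optimized converter: the function name `dfa_to_regex`, the encoding of the DFA as a dictionary, and the use of `None` for $\emptyset$ are all assumptions made for this sketch. It applies only the simplifications $\emptyset L = \emptyset$, $\epsilon L = L$ and $\emptyset^* = \epsilon^* = \epsilon$, so the output is correct but far from minimal.

```python
def dfa_to_regex(n, delta, start, finals):
    """Return r^n_{start,f1} + ... + r^n_{start,fk} as a string.

    States are numbered 1..n; delta maps (state, symbol) -> state.
    Output uses the slides' notation: '+' is union, 'ε' the empty string.
    None stands for the empty-set expression ∅ (an encoding chosen here).
    """
    def union(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return a if a == b else f"{a}+{b}"

    def concat(a, b):
        if a is None or b is None:   # ∅L = L∅ = ∅
            return None
        if a == "ε":                 # εL = L
            return b
        if b == "ε":                 # Lε = L
            return a
        return f"({a})({b})"

    def star(a):
        if a is None or a == "ε":    # ∅* = ε* = ε
            return "ε"
        return f"({a})*"

    states = range(1, n + 1)

    # Base case: r^0_ij lists the single edges from q_i to q_j (plus ε if i = j).
    r = {}
    for i in states:
        for j in states:
            expr = "ε" if i == j else None
            for (p, symbol), q in sorted(delta.items()):
                if p == i and q == j:
                    expr = union(expr, symbol)
            r[i, j] = expr

    # Inductive step: r^k_ij = r^{k-1}_ij + r^{k-1}_ik (r^{k-1}_kk)* r^{k-1}_kj.
    for k in states:
        r = {
            (i, j): union(r[i, j],
                          concat(concat(r[i, k], star(r[k, k])), r[k, j]))
            for i in states
            for j in states
        }

    # Final answer: union over all final states.
    result = None
    for f in sorted(finals):
        result = union(result, r[start, f])
    return result


# Usage: a two-state DFA that accepts strings with an odd number of 0s.
delta = {(1, "1"): 1, (1, "0"): 2, (2, "1"): 2, (2, "0"): 1}
print(dfa_to_regex(2, delta, start=1, finals={2}))
```

Each pass of the loop over `k` rebuilds the whole table from the previous one, mirroring the induction on $k$ rather than updating entries in place.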
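Example

[Figure: two-state DFA $M$ over $\{0, 1\}$ with states $q_1$ and $q_2$; each state has a self-loop on 1, and 0 moves between $q_1$ and $q_2$]

$r^0_{11} = 1 + \epsilon$, $r^0_{22} = 1 + \epsilon$, $r^0_{12} = 0$, $r^0_{21} = 0$

$$r^1_{12} = r^0_{12} + r^0_{11} (r^0_{11})^* r^0_{12} = 0 + (1 + \epsilon)^+ 0$$
$$r^1_{22} = r^0_{22} + r^0_{21} (r^0_{11})^* r^0_{12} = (1 + \epsilon) + 0 (1 + \epsilon)^* 0$$
$$\begin{aligned} r^2_{12} &= r^1_{12} + r^1_{12} (r^1_{22})^* r^1_{22} \\ &= (0 + (1 + \epsilon)^+ 0) + (0 + (1 + \epsilon)^+ 0)((1 + \epsilon) + 0 (1 + \epsilon)^* 0)^+ \\ &= (0 + (1 + \epsilon)^+ 0)(1 + \epsilon + 0 (1 + \epsilon)^* 0)^* \\ &= (1 + \epsilon)^* 0 (1 + \epsilon + 0 1^* 0)^* \\ &= 1^* 0 (1 + 0 1^* 0)^* \end{aligned}$$

$L(M) = L(r^2_{12})$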
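
As a sanity check on the final expression, the sketch below compares $1^* 0 (1 + 0 1^* 0)^*$ against a direct simulation of the DFA on every string up to a given length. The transition table is read off the base expressions $r^0_{ij}$, and taking $q_1$ as initial and $q_2$ as the only final state is inferred from $L(M) = L(r^2_{12})$; the names `DELTA`, `dfa_accepts` and `agree_up_to` are made up for this illustration. A bounded comparison like this is evidence, not a proof of equivalence.

```python
import re
from itertools import product

# Transitions inferred from r^0_11 = r^0_22 = 1 + ε and r^0_12 = r^0_21 = 0.
DELTA = {(1, "1"): 1, (1, "0"): 2, (2, "1"): 2, (2, "0"): 1}


def dfa_accepts(word, start=1, finals=frozenset({2})):
    """Run the DFA on word; q_1 initial and q_2 final are assumptions."""
    state = start
    for symbol in word:
        state = DELTA[state, symbol]
    return state in finals


# r^2_12 = 1*0(1 + 01*0)* written in Python syntax, where union is '|'.
PATTERN = re.compile(r"1*0(1|01*0)*")


def agree_up_to(max_len):
    """True if the DFA and the expression agree on all strings of length <= max_len."""
    for length in range(max_len + 1):
        for letters in product("01", repeat=length):
            word = "".join(letters)
            if dfa_accepts(word) != bool(PATTERN.fullmatch(word)):
                return False
    return True


print(agree_up_to(10))
```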
Analysis of the Translation

Size of the constructed regular expression:
Number of regular expressions $r^k_{ij}$ = $O(n^3)$.
At each step the regular expression may blow up by a factor of 4.
Each regular expression $r^n_{ij}$ can be of size $O(4^n)$.

The above method works for both NFAs and DFAs.
For converting a DFA there is a slightly more efficient method (see textbook).
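
The factor-of-4 growth can be made concrete with a small recurrence. The sketch below assumes that all expressions at level $k-1$ have the same size $s_{k-1}$ and that combining four of them costs a constant number of extra operator symbols, so $s_k = 4 s_{k-1} + c$. The function name `size_bounds` and the defaults `base_size=1` and `overhead=4` are illustrative choices, not values from the slides.

```python
def size_bounds(n, base_size=1, overhead=4):
    """Sizes s_0..s_n under the assumed recurrence s_k = 4*s_{k-1} + overhead.

    r^k_ij is built from four level-(k-1) expressions, which gives the factor 4;
    overhead counts the added operators (one union, one star, two concatenations).
    """
    sizes = [base_size]
    for _ in range(n):
        sizes.append(4 * sizes[-1] + overhead)
    return sizes


print(size_bounds(3))  # [1, 8, 36, 148]
```

Unrolling the recurrence gives $s_k = 4^k s_0 + c(4^k - 1)/3$, which is $O(4^k)$, in line with the $O(4^n)$ bound stated above.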
Thus far...

[Figure: diagram relating NFA, DFA, $\epsilon$-NFA, and Regular Exp.]
Regular Expression Identities

Associativity and Commutativity:
$L + M = M + L$
$(L + M) + N = L + (M + N)$
$(LM)N = L(MN)$
Note: $LM \neq ML$

Distributivity:
$L(M + N) = LM + LN$
$(M + N)L = ML + NL$
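
These laws can be spot-checked on small finite languages represented as Python sets of strings. The helper `concat` and the particular languages `L`, `M`, `N` below are chosen only for illustration. Checking one instance does not prove an identity, but a single instance is enough to show that concatenation is not commutative.

```python
def concat(L, M):
    """Language concatenation: every string of L followed by every string of M."""
    return {x + y for x in L for y in M}


# Three small example languages (arbitrary choices for this check).
L = {"a", "ab"}
M = {"b", ""}
N = {"c"}

print((L | M) == (M | L))                                   # commutativity of +
print(concat(concat(L, M), N) == concat(L, concat(M, N)))   # associativity
print(concat(L, M | N) == concat(L, M) | concat(L, N))      # left distributivity
print(concat(M | N, L) == concat(M, L) | concat(N, L))      # right distributivity
print(concat(L, M) == concat(M, L))                         # False: LM differs from ML here
```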
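More Identities

Identities and Annihilators:
$\emptyset + L = L + \emptyset = L$
$\epsilon L = L \epsilon = L$
$\emptyset L = L \emptyset = \emptyset$

Idempotent Law:
$L + L = L$

Closure Laws:
$(L^*)^* = L^*$
$\emptyset^* = \epsilon$
$\epsilon^* = \epsilon$
$L^+ = LL^* = L^*L$
$L^* = L^+ + \epsilon$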
Testing Regular Expression Identities

To test if an equation $E = F$ holds [Example: $(L + M)^* = (L^* M^*)^*$]:
1. Convert $E$ and $F$ into concrete expressions $C$ and $D$ by replacing each language variable in $E$ and $F$ by a concrete symbol. [Replacing $L$ by $a$ and $M$ by $b$, we get $C = (a + b)^*$ and $D = (a^* b^*)^*$.]
2. $L(C) = L(D)$ iff the equation $E = F$ holds. [$L((a + b)^*) = L((a^* b^*)^*)$, and so $(L + M)^* = (L^* M^*)^*$ holds.]

Correctness of the algorithm follows from Theorems 3.13 and 3.14 in the book.
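
Step 2 requires deciding whether $L(C) = L(D)$. A full decision procedure would compare automata; the sketch below only approximates it by comparing the two concrete expressions on all strings up to a bounded length using Python's `re` module (with union written as `|`). The function name `first_difference` and the length bound are assumptions for this illustration, so a result of `None` is evidence of equality, not a proof.

```python
import re
from itertools import product


def first_difference(pattern_c, pattern_d, alphabet, max_len):
    """Return the first string (shortest first) on which the patterns disagree, or None."""
    c = re.compile(pattern_c)
    d = re.compile(pattern_d)
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if bool(c.fullmatch(word)) != bool(d.fullmatch(word)):
                return word
    return None


# (L + M)* = (L* M*)*  concretized to  (a + b)*  and  (a* b*)*
print(first_difference(r"(a|b)*", r"(a*b*)*", "ab", 8))

# A false equation for contrast: (L + M)* = L* + M*
print(first_difference(r"(a|b)*", r"a*|b*", "ab", 8))
```

For the second, false equation the search stops at a short witness: a string that mixes both symbols is in $(a + b)^*$ but not in $a^* + b^*$.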
However, caution!!

The algorithm only applies to equations of regular expressions and not to all equations.
Consider the equation $L \cap M \cap N = L \cap M$, where $\cap$ means set intersection.
The equation is clearly false.
Concretizing we get $\{a\} \cap \{b\} \cap \{c\}$ and $\{a\} \cap \{b\}$, which are clearly equal (both are empty)!!
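
The pitfall is easy to reproduce with Python sets. In the sketch below, `lhs` and `rhs` are hypothetical helper names for the two sides of the equation, and the counterexample languages are one arbitrary choice among many.

```python
def lhs(L, M, N):
    return L & M & N   # L ∩ M ∩ N


def rhs(L, M, N):
    return L & M       # L ∩ M


# Concretizing each variable by a distinct symbol makes both sides empty.
concretized = ({"a"}, {"b"}, {"c"})
print(lhs(*concretized) == rhs(*concretized))        # True

# But the equation fails for other languages, e.g. L = M = {a}, N = {b}.
counterexample = ({"a"}, {"a"}, {"b"})
print(lhs(*counterexample) == rhs(*counterexample))  # False
```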