Finite Automata Theory and Formal Languages TMV027/DIT321 LP4 2018 Lecture 9 Ana Bove April 19th 2018 Recap: Regular Expressions Algebraic representation of (regular) languages; R, S ::= a R + S RS R...... representing the languages, {}, {a}, L(R) L(S), L(R)L(S) and L(R) respectively; Brief on algebraic laws for RE; How to transform a FA into a RE: By eliminating states; With a system of linear equations and Arden s lemma. April 19th 2018, Lecture 9 TMV027/DIT321 1/29
Example: Eliminating States Consider the automaton D a c q b 0 q 1 b d By eliminating states the expression is a b(c + da b) a q 2 Consider the automaton D a c By eliminating states the expression is b q 0 q 1 (a + bc d) bc d But intuitively these automata are equivalent... April 19th 2018, Lecture 9 TMV027/DIT321 2/29 Example: Linear Equation System The linear equations corresponding to the automaton D are E 0 = ae 0 + be 1 E 1 = + ce 1 + de 0 The resulting RE depends on the order we solve the system. If we solve E 1 first we get E 0 = (a + bc d) bc. If we solve E 0 first we get E 0 = a b(c + da b). It should be that a b(c + da b) = (a + bc d) bc! (see proof in slide 10) Exercise: What RE do we obtain for the automaton D? April 19th 2018, Lecture 9 TMV027/DIT321 3/29
Overview of Today s Lecture Equivalence between FA and RE: from RE to FA; More on algebraic laws for regular expressions; Pumping Lemma for RL; Closure properties of RL. Contributes to the following learning outcome: Explain and manipulate the diff. concepts in automata theory and formal lang; Have a clear understanding about the equivalence between (non-)deterministic finite automata and regular expressions; Understand the power and the limitations of regular lang and context-free lang; Prove properties of languages, grammars and automata with rigorously formal mathematical methods; Design automata, regular expressions and context-free grammars accepting or generating a certain language; Describe the language accepted by an automata or generated by a regular expression or a context-free grammar; Determine if a certain word belongs to a language; Differentiate and manipulate formal descriptions of lang, automata and grammars. April 19th 2018, Lecture 9 TMV027/DIT321 4/29 From Regular Expressions to Finite Automata Proposition: Every language defined by a RE is accepted by a FA. Proof: Let L = L(R) for some RE R. By induction on R we construct a -NFA E with only one final state and no arcs into the initial state or out of the final state. Moreover, E is such that L = L(E). Base cases are the RE, and a Σ. The corresponding -NFA accepting the languages, {} and {a} are: a April 19th 2018, Lecture 9 TMV027/DIT321 5/29
From RE to FA: Inductive Step (Cont.) IH: Given the RE R and S there are -NFA with only one final state and no arcs into the initial state or out of the final state accepting L(R) and L(S). We construct the -NFA for R + S, RS and R accepting L(R) L(S), L(R)L(S) and L(R) respectively: R S R S R April 19th 2018, Lecture 9 TMV027/DIT321 6/29 Example: From RE to FA If we follow this method for the RE 0 1 we obtain the -NFA 0 1 Compare it with the following DFA for the same language: 0 1 April 19th 2018, Lecture 9 TMV027/DIT321 7/29
Recall: Algebraic Laws for Regular Expressions The following equalities hold for any RE R, S and T : Idempotent: R + R = R Commutative: R + S = S + R In general, RS SR Associative: R + (S + T ) = (R + S) + T R(ST ) = (RS)T Distributive: R(S + T ) = RS + RT (S + T )R = SR + TR Identity: R + = + R = R R = R = R Annihilator: R = R = = = R + = RR = R R R = (R ) = R R = + R + Note: Compare (some of) these laws with those for sets on slide 14 lecture 2. April 19th 2018, Lecture 9 TMV027/DIT321 8/29 Recall: More Algebraic Laws for Regular Expressions Other useful laws to simplify regular expressions are: Shifting rule: R(SR) = (RS) R Denesting rule: (R S) R = (R + S) Note: By the shifting rule we also get R (SR ) = (R + S) Variation of the denesting rule: (R S) = + (R + S) S Note: These rules are not always trivial to apply... :-) April 19th 2018, Lecture 9 TMV027/DIT321 9/29
Example: Proving Equalities Using the Algebraic Laws Example: The set of all words with no substring of more than two adjacent 0 s is (1 + 01 + 001) ( + 0 + 00). Now, (1 + 01 + 001) ( + 0 + 00) = (( + 0)( + 0)1) ( + 0)( + 0) by distributivity = ( + 0)( + 0)(1( + 0)( + 0)) by shifting = ( + 0 + 00)(1 + 10 + 100) by distributivity Then (1 + 01 + 001) ( + 0 + 00) = ( + 0 + 00)(1 + 10 + 100) Example: A proof that a b(c + da b) = (a + bc d) bc (recall slides 2 and 3): a b(c + da b) = a b(c da b) c by denesting (R = c, S = da b) a b(c da b) c = (a bc d) a bc by shifting (R = a b, S = c d) (a bc d) a bc = (a + bc d) bc by denesting (R = a, S = bc d) April 19th 2018, Lecture 9 TMV027/DIT321 10/29 Equality of Regular Expressions Recall: RE are a way to denote languages. Then, for RE R and S, R = S actually means L(R) = L(S). Hence we can prove the equality of RE in the same way we prove the equality of languages! Example: Let us show that R = R R. Let L = L(R). Then L(R ) = L(R) = L. L L L since L. Conversely, if L L L then x = x 1 x 2 with x 1 L and x 2 L. If x 1 = or x 2 = then it is clear that x L. Otherwise x 1 = u 1 u 2... u n with u i L and x 2 = v 1 v 2... v m with v j L. Then x = x 1 x 2 = u 1 u 2... u n v 1 v 2... v m is in L. April 19th 2018, Lecture 9 TMV027/DIT321 11/29
Proving Algebraic Laws for Regular Expressions In general, given the RE R and S we can prove the law R = S as follows: 1 Convert R and S into concrete RE C and D, respectively, by replacing each variable in the RE R and S by (different) concrete symbols. Example: R(SR) = (RS) R can be converted into a(ba) = (ab) a. 2 Prove or disprove whether L(C) = L(D). If L(C) = L(D) then R = S is a true law, otherwise it is not. Example: We can prove the shifting law by induction: n Æ.a(ba) n = (ab) n a. Theorem: The above procedure correctly identifies the true laws for RE. Proof: See theorems 3.14 and 3.13 in pages 121 and 120 respectively. April 19th 2018, Lecture 9 TMV027/DIT321 12/29 Example: Proving the Denesting Rule We can state (R S) R = (R + S) by proving L((a b) a ) = L((a + b) ): : Let x (a b) a, then x = vw with v (a b) and w a. By (structural) induction on v. If v = we are done. Otherwise v = av or v = bv. In both cases v (a b) hence by IH v w (a + b) and so is vw. : Let x (a + b). By (structural) induction on x. If x = then we are done. Otherwise x = x a or x = x b and x (a + b). By IH x (a b) a and then x = vw with v (a b) and w a. If x a = v(wa) (a b) a since v (a b) and (wa) a. If x b = (v(wb)) (a b) a since v(wb) (a b) and a. April 19th 2018, Lecture 9 TMV027/DIT321 13/29
How to Identify Regular Languages? We have seen that a language is regular iff there is a DFA that accepts the language. Then we saw that DFA, NFA and -NFA are equivalent in the sense that we can convert between them. Hence FA accept all and only the regular languages (RL). Now we have seen how to convert between FA and RE. Thus RE also define all and only the RL. April 19th 2018, Lecture 9 TMV027/DIT321 14/29 How to Prove that a Language is NOT Regular? In a FA with n states, any path has a loop if m n. a q 1 a 1 2 a q2 3 a m 1 a q3... q m m q m+1 That is, we have i < j such that q i = q j in the path above. This is an application of the Pigeonhole Principle. April 19th 2018, Lecture 9 TMV027/DIT321 15/29
How to Prove that a Language is NOT Regular? Example: Let us prove that L = {0 m 1 m m 0} is NOT a RL. Let us assume it is: then L = L(A) for some FA A with n states, n > 0. Let k n > 0 and let w = 0 k 1 k L. Then there must be an accepting path q 0 w q f F. Since k n, there is a loop (pigeonhole principle) when reading the 0 s. Then w = xyz with xy = j n, y and z = 0 k j 1 k such that q 0 x q l y ql z q f F Observe that the following path is also an accepting path q 0 x q l z q f F However y must be of the form 0 i with i > 0 hence xz = 0 k i 1 k / L. This contradicts the fact that A accepts L. April 19th 2018, Lecture 9 TMV027/DIT321 16/29 The Pumping Lemma for Regular Languages Theorem: Let L be a RL. Then, there exists a constant n which depends on L such that for every string w L with w n, it is possible to break w into 3 strings x, y and z such that w = xyz and 1 y ; 2 xy n; 3 k 0. xy k z L. April 19th 2018, Lecture 9 TMV027/DIT321 17/29
Proof of the Pumping Lemma Assume we have a FA A that accepts the language, then L = L(A). Let n be the number of states in A. Then any path of length m n has a loop. Let us consider w = a 1 a 2... a m L. We have an accepting path and a loop such that with w = xyz L, y, xy n. q 0 x q l y ql z q f F Then we also have q 0 x q l y k q l z q f F for any k, that is, k 0. xy k z L. April 19th 2018, Lecture 9 TMV027/DIT321 18/29 Example: Application of the Pumping Lemma We use the Pumping lemma to prove that L = {0 m 1 m m 0} is not a RL. We assume it is. Then the Pumping lemma applies. Let n be the constant given by the lemma and let w = 0 n 1 n L, then w n. By the lemma we know that w = xyz with y, xy n and k 0. xy k z L. Since y and xy n, we know that y = 0 i with i 1. However, we have a contradiction since xy k z / L for k 1 since it either has too few 0 s (k = 0) or too many (k > 1). Note: This is connected to the fact that a FA has finite memory! If we could build a machine with infinitely many states it would be able to recognise the language. April 19th 2018, Lecture 9 TMV027/DIT321 19/29
Example: Application of the Pumping Lemma Example: Let us prove that L = {0 i 1 j i j} is not a RL. Let us assume it is, hence the Pumping lemma applies. Let n be given by the Pumping lemma and let w = 0 n 1 n+1 L, hence w n. Then we know that w = xyz with y, xy n and k 0. xy k z L. Since y and xy n, we know that y = 0 r with r 1. However, we have a contradiction since xy k z / L for k > 2 since it will have more 0 s than 1 s. (Even for k = 2 if r > 1.) Example: What happens if we choose w = 1 n L? Or w = 0 n/2 1 n L? Exercise: What about the languages {0 i 1 j i j}, {0 i 1 j i > j} and {0 i 1 j i j}? April 19th 2018, Lecture 9 TMV027/DIT321 20/29 Pumping Lemma is not a Sufficient Condition By showing that the Pumping lemma does not apply to a certain language L we prove that L is not regular. However, if the Pumping lemma does apply to L, we cannot conclude whether L is regular or not! Example: We know L = {b m c m m 0} is not regular. Let us consider L = a + L (b + c). Using clousure properties (to come later) we can prove that L is not regular. However, the Pumping lemma does apply for L with n = 1. This shows the Pumping lemma is not a sufficient condition for a language to be regular. April 19th 2018, Lecture 9 TMV027/DIT321 21/29
Closure Properties for Regular Languages Let M and N be RL. Then M = L(R) = L(D) and N = L(S) = L(F ) for RE R and S, and DFA D and F. We have seen that RL are closed under the following operations: Union: M N = L(R + S) or M N = L(D F ) (s.22, l.5); Complement: M = L(D) (slide 24, lec. 5) Intersection: M N = M N or M N = L(D F ) (s.21, l.5); Difference: M N = M N ; Concatenation: MN = L(RS); Closure: M = L(R ). April 19th 2018, Lecture 9 TMV027/DIT321 22/29 More Closure Properties for Regular Languages RL are also closed under the following operations: Prefix: Reversal: See additional exercise 3 on DFA. Hint: in D, make final all states in a path from the start state to final state. Recall that rev(a 1... a n ) = a n... a 1 and x.rev(rev(x)) = x (slides 15 & 17, lec. 4). April 19th 2018, Lecture 9 TMV027/DIT321 23/29
Closure under Prefix Another way to prove that the language of prefixes of a RL is regular: Define the function: pre : RE RE pre( ) = pre() = pre(a) = + a pre(r 1 + R 2 ) = pre(r 1 ) + pre(r 2 ) pre(r 1 R 2 ) = pre(r 1 ) + R 1 pre(r 2 ) pre(r ) = R pre(r) and prove that L(pre(R)) = Prefix(L(R)). Then, if L = L(R) for some RE R then Prefix(L) = Prefix(L(R)) = L(pre(R)). April 19th 2018, Lecture 9 TMV027/DIT321 24/29 Closure under Reversal We define the function: r : RE RE r = (R 1 + R 2 ) r = R1 r + R 2 r r = (R 1 R 2 ) r = R2 rr 1 r a r = a (R ) r = (R r ) Theorem: If L is regular so is L r. Proof: (See theo. 4.11, pages 139 140). Let R be a RE such that L = L(R). We need to prove by induction on R that L(R r ) = (L(R)) r. Hence L r = (L(R)) r = L(R r ) and L r is regular. Example: The reverse of the language defined by (0 + 1) 0 can be defined by 0(0 + 1). April 19th 2018, Lecture 9 TMV027/DIT321 25/29
Closure under Reversal Another way to prove this result is by constructing a -NFA for L r. Proof: Let N = (Q, Σ, δ N, q 0, F ) be a NFA such that L = L(N). Define a -NFA E = (Q {q}, Σ, δ E, q, {q 0 }) with q / Q and δ E such that r δ E (s, a) iff s δ N (r, a) for r, s Q r δ E (q, ) iff r F April 19th 2018, Lecture 9 TMV027/DIT321 26/29 Using Closure Properties Example: Consider L 1 and L 2 such that L 1 is regular, L 2 is not regular but L 1 L 2 is regular. Is L 1 L 2 is regular? Let us assume that L 1 L 2 is regular. Then (L 1 L 2 L 1 ) (L 1 L 2 ) should also be regular because of the closure properties. But this is actually L 2 which is not regular! We arrive to a contradiction. Hence L 1 L 2 cannot be regular. April 19th 2018, Lecture 9 TMV027/DIT321 27/29
Overview of Next Lecture Sections 4.3 4.4: Decision properties for RL; Equivalence of RL; Minimisation of automata. April 19th 2018, Lecture 9 TMV027/DIT321 28/29 Overview of next Week Mon 23 Tue 24 Wed 25 Thu 26 Fri 27 Lec 13-15 HB3 RL. Ex 15-17 EA RL. Ex 10-12 EA RL. 15-17 EL41 Consultation 10-12 ES61 Individual help Lec 13-15 HB3 CFG. Assignment 4: RL. Deadline: Sunday April 29th 23:59. April 19th 2018, Lecture 9 TMV027/DIT321 29/29