Chapter 2: Finite Automata

Peter Cappello
Department of Computer Science
University of California, Santa Barbara
Santa Barbara, CA 93106
cappello@cs.ucsb.edu

Please read the corresponding chapter before attending this lecture. These notes are supplemented with figures and with material that arises during the lecture in response to questions. Please report any errors in these notes to cappello@cs.ucsb.edu. I'll fix them immediately.

Based on An Introduction to Formal Languages and Automata, 3rd Ed., Peter Linz, Jones and Bartlett Publishers, Inc.
2.4 Reduction of the Number of States in Finite Automata

Any DFA defines a unique language, but the converse is not true: a language may be accepted by many different DFAs.

Example: (Draw 2 distinct DFAs that accept the same language.)

We present an algorithm that, given a DFA, produces an equivalent DFA with the minimum possible number of states.

Defn. 2.8
2 states p and q of a DFA are indistinguishable if $\forall w \in \Sigma^*$, $\delta^*(p, w) \in F \iff \delta^*(q, w) \in F$.
2 states p and q of a DFA are distinguishable if they are not indistinguishable: $\exists w \in \Sigma^*$ such that $\delta^*(p, w) \in F$ and $\delta^*(q, w) \notin F$, or vice versa.

The binary relation "q is indistinguishable from p", $I(p, q)$, is an equivalence relation. For all $p, q, r \in Q$:
  $I(p, p)$ (reflexive),
  $I(p, q) \Rightarrow I(q, p)$ (symmetric),
  $(I(p, q) \wedge I(q, r)) \Rightarrow I(p, r)$ (transitive).
Thus, the indistinguishability relation partitions Q. (Informally indicate this on the example.)

First, here is a procedure for marking pairs of states distinguishable. Imagine a table with a row for every state and a column for every state. Entry (p, q) is marked if p is distinguishable from q.
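Defn. 2.8 can be checked directly on a small machine by trying strings in order of length. The sketch below is only an illustration: the 5-state DFA, the names `SIGMA`, `DELTA`, `FINAL`, `delta_star`, and `distinguishing_string`, and the length bound `max_len` are all assumptions introduced here, not part of the notes.

```python
from itertools import product

# Hypothetical example DFA (not from the notes): states 0..4, start state 0.
SIGMA = "01"
DELTA = {
    (0, "0"): 1, (0, "1"): 3,
    (1, "0"): 2, (1, "1"): 4,
    (2, "0"): 1, (2, "1"): 4,
    (3, "0"): 2, (3, "1"): 4,
    (4, "0"): 4, (4, "1"): 4,
}
FINAL = {4}

def delta_star(state, word):
    """Extended transition function: follow `word` one symbol at a time."""
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state

def distinguishing_string(p, q, max_len):
    """Return a shortest w, |w| <= max_len, that distinguishes p from q, else None."""
    for length in range(max_len + 1):
        for letters in product(SIGMA, repeat=length):
            w = "".join(letters)
            # p and q are distinguished by w if exactly one of them ends in F.
            if (delta_star(p, w) in FINAL) != (delta_star(q, w) in FINAL):
                return w
    return None

print(distinguishing_string(0, 1, max_len=4))  # prints 1
print(distinguishing_string(1, 2, max_len=4))  # prints None
```

This brute-force search is exponential in `max_len`, which is one reason to prefer the table-marking procedure that follows.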
Procedure mark
1. Remove all inaccessible states.
   (a) Do a DFS, starting at $q_0$, marking visited nodes reachable.
   (b) Remove all nodes that are not marked reachable.
2. $\forall p, q \in Q$, if $p \in F$ and $q \notin F$ or vice versa, mark (p, q) distinguishable.
3. Repeat {
     $\forall p, q \in Q$ and $\forall a \in \Sigma$,
     (a) compute $\delta(p, a) = p_a$ and $\delta(q, a) = q_a$.
     (b) If $(p_a, q_a)$ is marked distinguishable, mark (p, q) distinguishable.
   } until (no previously unmarked pairs are marked).
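Procedure mark can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the DFA is a hypothetical example (state 5 is deliberately inaccessible), a pair is stored as a `frozenset` so that (p, q) and (q, p) are the same table entry, and the names `reachable_states` and `mark` are introduced here for illustration.

```python
from itertools import combinations

# Hypothetical example DFA (not from the notes); state 5 is inaccessible from 0.
SIGMA = "01"
DELTA = {
    (0, "0"): 1, (0, "1"): 3,
    (1, "0"): 2, (1, "1"): 4,
    (2, "0"): 1, (2, "1"): 4,
    (3, "0"): 2, (3, "1"): 4,
    (4, "0"): 4, (4, "1"): 4,
    (5, "0"): 0, (5, "1"): 4,
}
FINAL = {4}

def reachable_states(sigma, delta, start):
    """Step 1: iterative DFS from the start state."""
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        for a in sigma:
            nxt = delta[(state, a)]
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def mark(sigma, delta, start, final):
    """Return (accessible states, set of pairs marked distinguishable)."""
    live = reachable_states(sigma, delta, start)
    # Step 2: a final and a non-final state are distinguishable by the empty string.
    marked = {frozenset((p, q)) for p, q in combinations(live, 2)
              if (p in final) != (q in final)}
    # Step 3: repeat until a full pass marks no new pair.
    changed = True
    while changed:
        changed = False
        for p, q in combinations(live, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in sigma:
                successors = frozenset((delta[(p, a)], delta[(q, a)]))
                if successors in marked:  # (p_a, q_a) is already marked
                    marked.add(pair)
                    changed = True
                    break
    return live, marked

live, marked = mark(SIGMA, DELTA, 0, FINAL)
print(sorted(live))                             # prints [0, 1, 2, 3, 4]
print(sorted(sorted(pair) for pair in marked))
# prints [[0, 1], [0, 2], [0, 3], [0, 4], [1, 4], [2, 4], [3, 4]]
```

When $\delta(p, a) = \delta(q, a)$, the successor "pair" is a one-element set, which is never marked, so identical successors are handled without a special case.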
Thm. 2.3: Procedure mark, applied to any DFA $M = (Q, \Sigma, \delta, q_0, F)$, terminates and determines all pairs of distinguishable states.

Proof:
1. The procedure terminates, since the number of state pairs is finite and every iteration of the Repeat step but the last marks at least 1 new pair.
2. It is clear that if a state pair is marked, then it is distinguishable.
3. We prove the converse: if a state pair is distinguishable, then it is marked.
4. First, we prove that after the $n$th iteration of the Repeat step, a state pair is marked if its component states are distinguishable by a string $w$, $|w| \le n$.
   (a) Basis $n = 0$: Step 2 marks all state pairs that are distinguishable by $\lambda$.
   (b) Induction hypothesis: Assume for $i = 0, \ldots, n-1$ that after $i$ iterations of the Repeat step, a state pair is marked if its component states are distinguishable by a string $w$, $|w| \le i$.
   (c) Induction step:
       i. Consider a state pair (p, q) that is distinguishable by a string of length $n$, but by no shorter string.
       ii. Let $w = av$, with $a \in \Sigma$ and $|v| = n-1$, be a string of length $n$ that distinguishes them.
       iii. Let $\delta(p, a) = p_a$ and $\delta(q, a) = q_a$. Since $\delta^*(p, w) = \delta^*(p_a, v)$ and $\delta^*(q, w) = \delta^*(q_a, v)$, the string $v$ distinguishes $p_a$ from $q_a$.
       iv. By the induction hypothesis, $(p_a, q_a)$ is marked after the $(n-1)$th iteration.
       v. This is detected during the $n$th iteration of the Repeat step: step 3(b) finds $(p_a, q_a)$ marked.
       vi. (p, q) is marked after the $n$th iteration of the Repeat step.
5. Assume that the Repeat step terminates after $n$ iterations.
6. Then, during the $n$th iteration, no new state pairs were marked.
7. Iterating again cannot change the table: each further iteration starts from the same set of marked pairs, so it must give the same result.
8. Therefore, by step 4, a pair distinguishable by a string of any length is already marked when the Repeat step terminates: all distinguishable states have been marked.
Procedure mark identifies each pair of distinguishable states. We then identify the equivalence classes, P, of indistinguishable states:

Procedure partition
1. Let $S = Q$ be the set of states not yet assigned to an equivalence class. Let $P = \emptyset$ be the set of equivalence classes.
2. While ($S \ne \emptyset$) {
   (a) Insert a new equivalence class C into P.
   (b) Let $q \in S$. Insert q into C. Delete q from S.
   (c) For all $p \in S$ indistinguishable from q, insert p into C and delete p from S.
   }
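Procedure partition translates almost line for line into Python. In this sketch the input `MARKED` is a hand-written table of distinguishable pairs for a hypothetical 5-state DFA (what procedure mark would produce for it); the names `STATES`, `MARKED`, and `partition` are assumptions made for the example.

```python
# Hypothetical input: accessible states and the pairs marked distinguishable.
STATES = {0, 1, 2, 3, 4}
MARKED = {frozenset(pair) for pair in
          [(0, 1), (0, 2), (0, 3), (0, 4), (1, 4), (2, 4), (3, 4)]}

def partition(states, marked):
    """Group states into classes of pairwise indistinguishable states."""
    unassigned = sorted(states)   # S: states not yet assigned to a class
    classes = []                  # P: the equivalence classes found so far
    while unassigned:
        q = unassigned.pop(0)     # pick some q in S and delete it from S
        cls = {q}                 # new equivalence class C containing q
        for p in list(unassigned):            # iterate over a copy of S
            if frozenset((p, q)) not in marked:
                cls.add(p)                    # p is indistinguishable from q
                unassigned.remove(p)
        classes.append(cls)
    return classes

print([sorted(c) for c in partition(STATES, MARKED)])  # prints [[0], [1, 2, 3], [4]]
```

Comparing each remaining state only against the representative q is enough because indistinguishability is an equivalence relation.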
Procedure reduce
Given DFA $M = (Q, \Sigma, \delta, q_0, F)$, construct DFA $\hat{M} = (\hat{Q}, \Sigma, \hat{\delta}, \hat{q}_0, \hat{F})$:
1. Invoke procedure mark to find all pairs of distinguishable states.
2. Invoke procedure partition to partition Q into equivalence classes.
3. For each equivalence class $\{q_i, q_j, \ldots, q_k\}$, create a state labelled $ij \cdots k$ in $\hat{Q}$.
4. For each transition rule of M of the form $\delta(r, a) = p$, where $r \in \{q_i, q_j, \ldots, q_k\}$ and $p \in \{q_l, q_m, \ldots, q_n\}$, add to $\hat{\delta}$ the transition $\hat{\delta}(ij \cdots k, a) = lm \cdots n$.
5. $\hat{q}_0$ is that state of $\hat{M}$ whose label includes 0.
6. $\hat{F}$ is the set of states whose label includes i such that $q_i \in F$.
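Steps 3 to 6 of procedure reduce can be sketched as below, taking the equivalence classes as given. The DFA and the classes are a hypothetical example, the name `reduce_dfa` is introduced here, and the string labels assume single-digit state indices (otherwise a label such as "123" would be ambiguous).

```python
# Hypothetical example DFA (not from the notes): states 0..4, start state 0.
SIGMA = "01"
DELTA = {
    (0, "0"): 1, (0, "1"): 3,
    (1, "0"): 2, (1, "1"): 4,
    (2, "0"): 1, (2, "1"): 4,
    (3, "0"): 2, (3, "1"): 4,
    (4, "0"): 4, (4, "1"): 4,
}
FINAL = {4}
CLASSES = [{0}, {1, 2, 3}, {4}]   # assumed output of procedure partition

def reduce_dfa(sigma, delta, start, final, classes):
    """Build the reduced DFA whose states are labels of equivalence classes."""
    # Step 3: the class {q_i, q_j, ..., q_k} gets the label "ij...k".
    label_of = {}
    for cls in classes:
        label = "".join(str(i) for i in sorted(cls))
        for q in cls:
            label_of[q] = label
    new_states = set(label_of.values())
    # Step 4: delta(r, a) = p becomes a transition between the two labels.
    new_delta = {(label_of[r], a): label_of[delta[(r, a)]]
                 for r in label_of for a in sigma}
    # Step 5: the start state is the label whose class contains q_0.
    new_start = label_of[start]
    # Step 6: a label is final if its class contains a final state.
    new_final = {label_of[q] for q in final if q in label_of}
    return new_states, new_delta, new_start, new_final

states, delta_hat, start_hat, final_hat = reduce_dfa(SIGMA, DELTA, 0, FINAL, CLASSES)
print(sorted(states))            # prints ['0', '123', '4']
print(delta_hat[("123", "1")])   # prints 4
print(start_hat, sorted(final_hat))  # prints 0 ['4']
```

Every member r of a class yields the same target label for a given symbol, so writing the same dictionary key several times is harmless.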
Example: (Apply mark, partition, and reduce to a DFA.)

Thm. 2.4: Given a DFA M, applying procedure reduce yields another DFA $\hat{M}$ such that $L(M) = L(\hat{M})$ and there is no DFA with fewer states that accepts L(M).

Proof: There are 2 things to prove:
1. $L(M) = L(\hat{M})$.
2. There is no DFA with fewer states that accepts L(M).

Part 1 can be done by induction on $|w|$, the length of the input (in a manner similar to showing the equivalence of DFAs and NFAs).

We now show part 2 by contradiction.
1. Let $\hat{M}$ have states $\{p_0, p_1, \ldots, p_m\}$ with initial state $p_0$.
2. Assume there is an equivalent DFA $M'$, with transition function $\delta'$ and initial state $q_0$, that has fewer states than $\hat{M}$.
3. Since $\hat{M}$ has no inaccessible states, there must be distinct strings $w_0, w_1, \ldots, w_m$ (with $w_0 = \lambda$) such that $\hat{\delta}^*(p_0, w_i) = p_i$, $i = 0, 1, \ldots, m$.
4. Since $M'$ has fewer states than $\hat{M}$, there must be at least 2 of these strings, $w_k$ and $w_l$ with $k \ne l$, such that ${\delta'}^*(q_0, w_k) = {\delta'}^*(q_0, w_l)$.
5. Since $p_k$ is distinguishable from $p_l$, there is a string x such that $\hat{\delta}^*(p_0, w_k x) = \hat{\delta}^*(p_k, x)$ is a final state, and $\hat{\delta}^*(p_0, w_l x) = \hat{\delta}^*(p_l, x)$ is not a final state (or vice versa): $w_k x \in L(\hat{M})$ and $w_l x \notin L(\hat{M})$, or vice versa.
6. But ${\delta'}^*(q_0, w_k x) = {\delta'}^*({\delta'}^*(q_0, w_k), x) = {\delta'}^*({\delta'}^*(q_0, w_l), x) = {\delta'}^*(q_0, w_l x)$. Thus, either $w_k x \in L(M')$ and $w_l x \in L(M')$, or $w_k x \notin L(M')$ and $w_l x \notin L(M')$.
7. This contradicts the assumption that there is a DFA with fewer states than $\hat{M}$ that is equivalent to $\hat{M}$.
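Part 1 of Thm. 2.4 can be sanity-checked, though of course not proved, by running a DFA and its reduction on every string up to some length. The sketch below uses a hypothetical 5-state DFA `M` and a hand-written 3-state reduction `M_HAT` with states A = {0}, B = {1, 2, 3}, C = {4}; all names and the bound passed to `agree_up_to` are assumptions made for illustration.

```python
from itertools import product

SIGMA = "01"

# Hypothetical DFA M: start state 0, final states {4}.
M = {
    (0, "0"): 1, (0, "1"): 3,
    (1, "0"): 2, (1, "1"): 4,
    (2, "0"): 1, (2, "1"): 4,
    (3, "0"): 2, (3, "1"): 4,
    (4, "0"): 4, (4, "1"): 4,
}
# Its assumed reduction: start state "A", final states {"C"}.
M_HAT = {
    ("A", "0"): "B", ("A", "1"): "B",
    ("B", "0"): "B", ("B", "1"): "C",
    ("C", "0"): "C", ("C", "1"): "C",
}

def accepts(delta, start, final, word):
    """Run the DFA on `word` and report whether it ends in a final state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in final

def agree_up_to(max_len):
    """True if M and M_HAT accept exactly the same strings of length <= max_len."""
    for length in range(max_len + 1):
        for letters in product(SIGMA, repeat=length):
            w = "".join(letters)
            if accepts(M, 0, {4}, w) != accepts(M_HAT, "A", {"C"}, w):
                return False
    return True

print(agree_up_to(8))  # prints True
```

A bounded check like this can expose a mistake in a hand reduction quickly, but only the induction on $|w|$ establishes equality of the two languages.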