CS 154, Lecture 4: Limitations on DFAs (I), Pumping Lemma, Minimizing DFAs
Regular or Not? Non-Regular Languages
D = { w | w has an equal number of occurrences of 01 and 10 }: REGULAR!
C = { w | w has an equal number of 1s and 0s }: NOT REGULAR!
How can we prove that there is no DFA for a particular language?
Surprising algorithms (even in restricted models) are routinely being discovered.
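Why is the first language regular? One way to see it (a sketch, not taken from the slides): a binary string has equally many occurrences of 01 and 10 exactly when it is empty or its first and last symbols agree, and a DFA can remember that much. The 5-state automaton below, with its state names, is a hypothetical construction; the code only checks it against the counting definition on all short strings.

```python
from itertools import product

def count_occurrences(w, pat):
    """Count (possibly overlapping) occurrences of pat in w."""
    return sum(1 for i in range(len(w) - len(pat) + 1) if w[i:i + len(pat)] == pat)

def in_D_by_counting(w):
    """Membership test straight from the definition of the language."""
    return count_occurrences(w, "01") == count_occurrences(w, "10")

# Hypothetical DFA: each state records "first symbol..last symbol" read so far.
DELTA = {
    ("start", "0"): "0..0", ("start", "1"): "1..1",
    ("0..0", "0"): "0..0", ("0..0", "1"): "0..1",
    ("0..1", "0"): "0..0", ("0..1", "1"): "0..1",
    ("1..1", "1"): "1..1", ("1..1", "0"): "1..0",
    ("1..0", "1"): "1..1", ("1..0", "0"): "1..0",
}
ACCEPT = {"start", "0..0", "1..1"}

def dfa_accepts(w):
    """Run the DFA on w and report whether it ends in an accepting state."""
    state = "start"
    for c in w:
        state = DELTA[(state, c)]
    return state in ACCEPT

def agree_up_to(n):
    """Compare the DFA with the counting definition on all strings of length <= n."""
    return all(in_D_by_counting("".join(t)) == dfa_accepts("".join(t))
               for k in range(n + 1) for t in product("01", repeat=k))
```

Agreement on short strings is evidence, not a proof; the proof is the first-equals-last observation.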
The Pumping Lemma: Structure in Regular Languages
Let $L$ be a regular language. Then there is a positive integer $P$ such that for all strings $w \in L$ with $|w| \ge P$ there is a way to write $w = xyz$, where:
1. $|y| > 0$ (that is, $y \ne \varepsilon$)
2. $|xy| \le P$
3. For all $i \ge 0$, $xy^iz \in L$
Why is it called the pumping lemma? The word $w$ gets pumped into longer and longer strings.
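Proof: Let $M$ be a DFA that recognizes $L$. Let $P$ be the number of states in $M$. Let $w$ be a string where $w \in L$ and $|w| \ge P$.
We show: $w = xyz$ with
1. $|y| > 0$
2. $|xy| \le P$
3. $xy^iz \in L$ for all $i \ge 0$
[Figure: the run of $M$ on $w$ visits states $q_0, \dots, q_j, \dots, q_k, \dots, q_{|w|}$; $x$ is read from $q_0$ to $q_j$, $y$ from $q_j$ to $q_k$, and $z$ from $q_k$ to $q_{|w|}$.]
Claim: There must exist $j$ and $k$ such that $0 \le j < k \le P$ and $q_j = q_k$ (pigeonhole: the $P+1$ states $q_0, \dots, q_P$ cannot all be different when $M$ has only $P$ states).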
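The proof is constructive, and can be sketched in code: run the DFA, stop at the first repeated state, and cut $w$ there. The function names and the example automaton (number of 1s divisible by 3) are illustrative assumptions, not part of the lecture.

```python
def run_states(delta, start, w):
    """Return the list of states q_0, q_1, ..., q_|w| visited while reading w."""
    states = [start]
    for c in w:
        states.append(delta[(states[-1], c)])
    return states

def accepts(delta, start, accepting, w):
    """True iff the run on w ends in an accepting state."""
    return run_states(delta, start, w)[-1] in accepting

def pump_split(delta, start, w):
    """Split w = xyz at the first repeated state q_j = q_k (j < k), so |y| > 0.
    If |w| is at least the number of states, a repeat exists with k <= that number."""
    seen = {}
    for idx, q in enumerate(run_states(delta, start, w)):
        if q in seen:
            j, k = seen[q], idx
            return w[:j], w[j:k], w[k:]
        seen[q] = idx
    raise ValueError("no state repeats; w is too short to pump")

# Hypothetical example DFA: accepts strings whose number of 1s is divisible by 3.
MOD3_DELTA = {(s, c): (s + 1) % 3 if c == "1" else s for s in range(3) for c in "01"}
MOD3_START, MOD3_ACCEPT = 0, {0}
```

For $w$ = 1011 the run is 0, 1, 1, 2, 0, so the first repeat gives $x$ = 1, $y$ = 0, $z$ = 11, and every string 1 0^i 11 is still accepted.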
Applying the Pumping Lemma
Let's prove that EQ = { w | #1s = #0s } is not regular.
By contradiction. Assume EQ is regular. Let $P$ be as in the pumping lemma. Let $w = 0^P 1^P \in$ EQ.
Can write $w = xyz$, with $|y| > 0$, $|xy| \le P$, such that for all $i \ge 0$, $xy^iz$ is also in EQ.
Claim: The string $y$ must be all zeroes. Why? Because $|xy| \le P$ and $w = xyz = 0^P 1^P$.
But then $xyyz$ has more 0s than 1s. Contradiction!
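The argument says that no split of $0^P 1^P$ allowed by the lemma survives pumping. For small values of $P$ this can be checked exhaustively. The sketch below is only a sanity check under the assumption that $P$ ranges over a few small integers; the helper names are made up for illustration.

```python
def in_eq(w):
    """Membership in EQ: equally many 0s and 1s."""
    return w.count("0") == w.count("1")

def splits(w, p):
    """Yield every (x, y, z) with w = xyz, |y| > 0 and |xy| <= p."""
    for j in range(p + 1):             # j = |x|
        for k in range(j + 1, p + 1):  # k = |xy|, so |y| = k - j > 0
            if k <= len(w):
                yield w[:j], w[j:k], w[k:]

def every_split_fails(p):
    """True iff pumping once (i = 2) leaves EQ for every allowed split of 0^p 1^p."""
    w = "0" * p + "1" * p
    return all(not in_eq(x + y * 2 + z) for x, y, z in splits(w, p))
```

Since $|xy| \le P$ forces $y$ into the block of zeroes, `every_split_fails(p)` holds for every $p \ge 1$; the code merely confirms this on the values tried.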
Applying the Pumping Lemma
Prove: SQ = { $0^{n^2}$ | $n \ge 0$ } is not regular.
Assume SQ is regular. Let $P$ be as in the pumping lemma and let $w = 0^{P^2}$.
Can write $w = xyz$, with $|y| > 0$, $|xy| \le P$, such that for all $i \ge 0$, $xy^iz$ is also in SQ.
So $xyyz \in$ SQ. Note that $xyyz = 0^{P^2 + |y|}$.
Note that $0 < |y| \le P$.
So $|xyyz| = P^2 + |y| \le P^2 + P < P^2 + 2P + 1 = (P+1)^2$, and $P^2 < |xyyz| < (P+1)^2$.
Therefore $|xyyz|$ is not a perfect square! Hence $0^{P^2 + |y|} = xyyz \notin$ SQ, so our assumption must be false.
SQ is not regular!
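The key inequality, $P^2 < P^2 + |y| < (P+1)^2$ whenever $0 < |y| \le P$, is easy to confirm numerically. This is a small sketch with hypothetical helper names; it checks the arithmetic only and is not a substitute for the proof.

```python
from math import isqrt

def is_square(n):
    """True iff n is a perfect square."""
    return isqrt(n) ** 2 == n

def pumped_lengths(p):
    """All possible lengths |xyyz| = p^2 + |y| with 1 <= |y| <= p."""
    return [p * p + y_len for y_len in range(1, p + 1)]

def no_pumped_length_is_square(p):
    """True iff none of the pumped lengths is a perfect square."""
    return not any(is_square(n) for n in pumped_lengths(p))
```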
Does this DFA have a minimal number of states?
Is this minimal? How can we tell in general?
Theorem: For every regular language $L$, there is a unique (up to re-labeling of the states) minimal-state DFA $M^*$ such that $L = L(M^*)$.
Furthermore, there is an efficient algorithm which, given any DFA $M$, will output this unique $M^*$.
If these were true for more general models of computation, that would be an engineering breakthrough.
Note: There isn't a uniquely minimal NFA.
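Extending transition function to strings
Given DFA $M = (Q, \Sigma, \delta, q_0, F)$, we extend $\delta$ to a function $\Delta : Q \times \Sigma^* \to Q$ as follows:
$\Delta(q, \varepsilon) = q$
$\Delta(q, \sigma) = \delta(q, \sigma)$
$\Delta(q, \sigma_1 \cdots \sigma_{k+1}) = \delta(\Delta(q, \sigma_1 \cdots \sigma_k), \sigma_{k+1})$
$\Delta(q, w)$ = the state of $M$ reached after reading in $w$, starting from state $q$.
Note: $\Delta(q_0, w) \in F \iff M$ accepts $w$.
Def. $w \in \Sigma^*$ distinguishes states $q_1$ and $q_2$ if exactly one of $\Delta(q_1, w)$, $\Delta(q_2, w)$ is a final state.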
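The recursive definition translates almost line for line into code. In this sketch the transition function is assumed to be a dictionary keyed by (state, symbol), and the parity automaton is a made-up example.

```python
def extended_delta(delta, q, w):
    """State reached from q after reading w, peeling off the last symbol
    exactly as in the recursive definition."""
    if w == "":
        return q                                   # base case: empty string
    return delta[(extended_delta(delta, q, w[:-1]), w[-1])]

def accepts(delta, q0, final, w):
    """M accepts w iff the extended transition function lands in F."""
    return extended_delta(delta, q0, w) in final

# Hypothetical example DFA: tracks the parity of the number of 1s read.
PARITY_DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```

A loop over the symbols of $w$ computes the same state without recursion; the recursive form is used here only to mirror the definition.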
Distinguishing two states
Def. $w \in \Sigma^*$ distinguishes states $q_1$ and $q_2$ if exactly one of $\Delta(q_1, w)$, $\Delta(q_2, w)$ is a final state.
[Cartoon: "Here... read this $w$." / "I'm in $q_1$ or $q_2$, but which? How can I tell?" / "Ok, I'm accepting! Must have been $q_1$."]
Fix $M = (Q, \Sigma, \delta, q_0, F)$ and let $p, q \in Q$.
Definitions:
State $p$ is distinguishable from state $q$ if there is $w \in \Sigma^*$ that distinguishes $p$ and $q$ ($\iff$ there is $w \in \Sigma^*$ so that exactly one of $\Delta(p, w)$, $\Delta(q, w)$ is a final state).
State $p$ is indistinguishable from state $q$ if $p$ is not distinguishable from $q$ ($\iff$ for all $w \in \Sigma^*$, $\Delta(p, w) \in F \iff \Delta(q, w) \in F$).
Pairs of indistinguishable states are redundant.
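To make the definitions concrete, here is a brute-force sketch that searches for a distinguishing string up to a chosen length. The bound `max_len`, the helper names, and the example automaton (strings ending in 01, with a deliberately redundant state `a2`) are all assumptions made for illustration; a result of `None` only means that nothing was found within the bound.

```python
from itertools import product

def run(delta, q, w):
    """State reached from q after reading w."""
    for c in w:
        q = delta[(q, c)]
    return q

def distinguishes(delta, final, w, p, q):
    """True iff exactly one of the states reached from p and q on w is final."""
    return (run(delta, p, w) in final) != (run(delta, q, w) in final)

def find_distinguishing_string(delta, final, alphabet, p, q, max_len):
    """Shortest distinguishing string of length <= max_len, or None if none is found."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            w = "".join(t)
            if distinguishes(delta, final, w, p, q):
                return w
    return None

# Hypothetical DFA for "ends in 01"; states a and a2 behave identically.
ENDS01_DELTA = {
    ("a", "0"): "b", ("a", "1"): "a",
    ("a2", "0"): "b", ("a2", "1"): "a",
    ("b", "0"): "b", ("b", "1"): "c",
    ("c", "0"): "b", ("c", "1"): "a2",
}
ENDS01_FINAL = {"c"}
```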
Which pairs of states are distinguishable here? $\varepsilon$ distinguishes all final states from non-final states.
Which pairs of states are distinguishable here? The string 10 distinguishes $q_0$ and $q_3$.
Which pairs of states are distinguishable here? The string 0 distinguishes $q_1$ and $q_2$.
Fix $M = (Q, \Sigma, \delta, q_0, F)$ and let $p, q, r \in Q$.
Define a binary relation $\sim$ on the states of $M$:
$p \sim q$ iff $p$ is indistinguishable from $q$
$p \not\sim q$ iff $p$ is distinguishable from $q$
Proposition: $\sim$ is an equivalence relation
$p \sim p$ (reflexive)
$p \sim q \Rightarrow q \sim p$ (symmetric)
$p \sim q$ and $q \sim r \Rightarrow p \sim r$ (transitive)
Proof?
Fix $M = (Q, \Sigma, \delta, q_0, F)$ and let $p, q, r \in Q$.
Proposition: $\sim$ is an equivalence relation.
Therefore, the relation $\sim$ partitions $Q$ into disjoint equivalence classes:
$[q] := \{ p \mid p \sim q \}$
Algorithm: MINIMIZE-DFA
Input: DFA $M$
Output: DFA $M_{MIN}$ such that:
$L(M) = L(M_{MIN})$
$M_{MIN}$ has no inaccessible states
$M_{MIN}$ is irreducible: for all states $p \ne q$ of $M_{MIN}$, $p$ and $q$ are distinguishable
Theorem: $M_{MIN}$ is the unique minimal DFA that is equivalent to $M$.
Intuition: The states of $M_{MIN}$ will be the equivalence classes of states of $M$.
We'll uncover these equivalent states with a dynamic programming algorithm.
The Table-Filling Algorithm
Input: DFA $M = (Q, \Sigma, \delta, q_0, F)$
Output: (1) $D_M = \{ (p, q) \mid p, q \in Q$ and $p \not\sim q \}$
(2) $EQUIV_M = \{ [q] \mid q \in Q \}$
High-Level Idea:
We know how to find those pairs of states that the string $\varepsilon$ distinguishes.
Use this and iteration to find those pairs distinguishable with longer strings.
The pairs of states left over will be indistinguishable.
Base Case: For all $(p, q)$ such that $p$ accepts and $q$ rejects, mark $p \not\sim q$.
Iterate: If there are states $p, q$ and a symbol $\sigma \in \Sigma$ satisfying $\delta(p, \sigma) = p'$, $\delta(q, \sigma) = q'$, and $p' \not\sim q'$ is already marked, then mark $p \not\sim q$.
Repeat until no more pairs can be marked.
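A direct, unoptimized sketch of the table-filling loop follows. Representing each unordered pair as a `frozenset`, and the example automaton (strings ending in 01 with a redundant state `a2`), are choices made here for illustration, not details from the slides.

```python
from itertools import combinations

def table_filling(states, alphabet, delta, final):
    """Return the set of distinguishable pairs, each stored as frozenset({p, q})."""
    # Base case: the empty string separates final from non-final states.
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in final) != (q in final)}
    changed = True
    while changed:                       # repeat until no new pair gets marked
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if succ in marked:       # successors distinguishable => so are p, q
                    marked.add(pair)
                    changed = True
                    break
    return marked

# Hypothetical DFA for "ends in 01"; states a and a2 behave identically.
ENDS01_STATES = ["a", "a2", "b", "c"]
ENDS01_DELTA = {
    ("a", "0"): "b", ("a", "1"): "a",
    ("a2", "0"): "b", ("a2", "1"): "a",
    ("b", "0"): "b", ("b", "1"): "c",
    ("c", "0"): "b", ("c", "1"): "a2",
}
ENDS01_FINAL = {"c"}
```

On this example every pair except {a, a2} ends up marked, so a and a2 form one equivalence class.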
Claim: If $(p, q)$ is marked by the Table-Filling algorithm, then $p \not\sim q$.
Proof: By induction on the number of steps in the algorithm before $(p, q)$ is marked.
If $(p, q)$ is marked at the start, then one state is in $F$ and the other isn't, so $\varepsilon$ distinguishes $p$ and $q$.
Suppose $(p, q)$ is marked at a later point. Then there are states $p', q'$ such that:
1. $(p', q')$ are marked $\Rightarrow p' \not\sim q'$ (by induction). So there's a string $w$ s.t. $\Delta(p', w) \in F \iff \Delta(q', w) \notin F$.
2. $p' = \delta(p, \sigma)$ and $q' = \delta(q, \sigma)$, where $\sigma \in \Sigma$.
The string $\sigma w$ distinguishes $p$ and $q$!
Claim: If $(p, q)$ is not marked by the Table-Filling algorithm, then $p \sim q$.
Proof (by contradiction): Suppose the pair $(p, q)$ is not marked by the algorithm, yet $p \not\sim q$ (call this a "bad pair").
Then there is a string $w$ such that $|w| > 0$ and: $\Delta(p, w) \in F$ and $\Delta(q, w) \notin F$. (Why is $|w| > 0$?)
Of all such bad pairs, let $p, q$ be a pair with the shortest distinguishing string $w$.
We have $w = \sigma w'$, for some string $w'$ and some $\sigma \in \Sigma$.
Let $p' = \delta(p, \sigma)$ and $q' = \delta(q, \sigma)$.
The pair $(p', q')$ cannot be marked, since otherwise the algorithm would have marked $(p, q)$.
Then $(p', q')$ is also a bad pair, but with a SHORTER distinguishing string, $w'$!
Algorithm MINIMIZE
Input: DFA $M$
Output: Equivalent minimal-state DFA $M_{MIN}$
1. Remove all inaccessible states from $M$.
2. Run the Table-Filling algorithm on $M$ to get: $EQUIV_M = \{ [q] \mid q$ is an accessible state of $M \}$
3. Define: $M_{MIN} = (Q_{MIN}, \Sigma, \delta_{MIN}, q_0^{MIN}, F_{MIN})$
   $Q_{MIN} = EQUIV_M$, $q_0^{MIN} = [q_0]$, $F_{MIN} = \{ [q] \mid q \in F \}$
   $\delta_{MIN}([q], \sigma) = [\delta(q, \sigma)]$
Claim: $L(M_{MIN}) = L(M)$
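The three steps can be sketched end to end as follows. The data layout (a dictionary for $\delta$, `frozenset`s for equivalence classes) and the example automaton with an inaccessible state `z` are assumptions made for illustration; no attempt is made at efficiency.

```python
from itertools import combinations

def minimize(alphabet, delta, q0, final):
    """Return (states, delta, start, final) of the minimized DFA."""
    # Step 1: keep only the states accessible from q0.
    reachable, frontier = {q0}, [q0]
    while frontier:
        q = frontier.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reachable:
                reachable.add(r)
                frontier.append(r)
    # Step 2: table filling on the accessible states.
    marked = {frozenset((p, q)) for p, q in combinations(reachable, 2)
              if (p in final) != (q in final)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(reachable, 2):
            if frozenset((p, q)) in marked:
                continue
            if any(frozenset((delta[(p, a)], delta[(q, a)])) in marked for a in alphabet):
                marked.add(frozenset((p, q)))
                changed = True
    # Step 3: states of the result are the equivalence classes [q].
    def cls(q):
        return frozenset(p for p in reachable
                         if p == q or frozenset((p, q)) not in marked)
    min_states = {cls(q) for q in reachable}
    min_delta = {(c, a): cls(delta[(next(iter(c)), a)])
                 for c in min_states for a in alphabet}
    min_final = {c for c in min_states if next(iter(c)) in final}
    return min_states, min_delta, cls(q0), min_final

def accepts(delta, start, final, w):
    """Run a DFA given as (delta, start, final) on w."""
    for c in w:
        start = delta[(start, c)]
    return start in final

# Hypothetical DFA for "ends in 01": a2 duplicates a, and z is inaccessible.
EX_DELTA = {
    ("a", "0"): "b", ("a", "1"): "a", ("a2", "0"): "b", ("a2", "1"): "a",
    ("b", "0"): "b", ("b", "1"): "c", ("c", "0"): "b", ("c", "1"): "a2",
    ("z", "0"): "z", ("z", "1"): "c",
}
EX_MIN = minimize("01", EX_DELTA, "a", {"c"})
```

Here `EX_MIN` has three states, the classes {a, a2}, {b} and {c}, and it accepts the same strings as the original five-state automaton.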
Thm: $M_{MIN}$ is the unique minimal DFA equivalent to $M$.
Claim: If $L(M') = L(M_{MIN})$ and $M'$ has no inaccessible states and $M'$ is irreducible $\Rightarrow$ there is an isomorphism between $M'$ and $M_{MIN}$.
Suppose for now the Claim is true.
If $M'$ is a minimal DFA, then $M'$ has no inaccessible states and is irreducible (why?)
So the Claim implies: Let $M'$ be a minimal DFA for $M$. $\Rightarrow$ there is an isomorphism between $M'$ and the DFA $M_{MIN}$ that is output by MINIMIZE($M$).
The Thm holds!
Claim: If $L(M') = L(M_{MIN})$ and $M'$ has no inaccessible states and $M'$ is irreducible $\Rightarrow$ there is an isomorphism between $M'$ and $M_{MIN}$.
Proof: We recursively construct a map from the states of $M_{MIN}$ to the states of $M'$.
Base Case: $q_0^{MIN} \mapsto q_0'$
Recursive Step: If $p \mapsto p'$, where $p \xrightarrow{\sigma} q$ in $M_{MIN}$ and $p' \xrightarrow{\sigma} q'$ in $M'$, then $q \mapsto q'$.
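The recursive construction of the map can be run as a breadth-first traversal of the two automata in lockstep. The sketch below assumes both DFAs have no inaccessible states and share an alphabet; the function name, argument layout and example automata are hypothetical. It returns the map when it is well defined, a bijection, and respects final states, and `None` otherwise.

```python
from collections import deque

def build_isomorphism(alphabet, delta1, start1, final1,
                      delta2, start2, final2, states2):
    """Grow the map start1 -> start2 along matching transitions."""
    mapping = {start1: start2}              # base case
    queue = deque([start1])
    while queue:
        p = queue.popleft()
        for a in alphabet:                  # recursive step
            q, q_img = delta1[(p, a)], delta2[(mapping[p], a)]
            if q not in mapping:
                mapping[q] = q_img
                queue.append(q)
            elif mapping[q] != q_img:
                return None                 # map would not be well defined
    one_to_one = len(set(mapping.values())) == len(mapping)
    onto = set(mapping.values()) == set(states2)
    finals_agree = all((p in final1) == (mapping[p] in final2) for p in mapping)
    return mapping if one_to_one and onto and finals_agree else None

# Two hypothetical 3-state DFAs for "ends in 01" that differ only in state names.
M1 = {("a", "0"): "b", ("a", "1"): "a", ("b", "0"): "b",
      ("b", "1"): "c", ("c", "0"): "b", ("c", "1"): "a"}
M2 = {(0, "0"): 1, (0, "1"): 0, (1, "0"): 1,
      (1, "1"): 2, (2, "0"): 1, (2, "1"): 0}
# A 4-state DFA for the same language; it is reducible, so no isomorphism exists.
M3 = {("a", "0"): "b", ("a", "1"): "a", ("a2", "0"): "b", ("a2", "1"): "a",
      ("b", "0"): "b", ("b", "1"): "c", ("c", "0"): "b", ("c", "1"): "a2"}
```

Mapping `M1` onto `M2` succeeds with a, b, c sent to 0, 1, 2; mapping `M3` onto `M2` fails because a and a2 are both sent to 0, which is exactly the one-to-one failure ruled out for irreducible automata.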
We need to prove:
The map is defined everywhere
The map is well defined
The map is a bijection
The map preserves all transitions: if $p \mapsto p'$ then $\delta_{MIN}(p, \sigma) \mapsto \delta'(p', \sigma)$ (this follows from the definition of the map!)
The map is defined everywhere
That is, for all states $q$ of $M_{MIN}$ there is a state $q'$ of $M'$ such that $q \mapsto q'$.
If $q \in M_{MIN}$, there is a string $w$ such that $\Delta_{MIN}(q_0^{MIN}, w) = q$.
Let $q' = \Delta'(q_0', w)$. Then $q \mapsto q'$.
The map is well defined
Proof by contradiction. Suppose there are states $q'$ and $q''$ such that $q \mapsto q'$ and $q \mapsto q''$.
We show that $q'$ and $q''$ are indistinguishable, so it must be that $q' = q''$.
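Suppose there are states $q'$ and $q''$ such that $q \mapsto q'$ and $q \mapsto q''$. Suppose $q'$ and $q''$ are distinguishable.
[Figure: in $M'$, $q_0' \xrightarrow{u} q' \xrightarrow{w}$ Accept and $q_0' \xrightarrow{v} q'' \xrightarrow{w}$ Reject; in $M_{MIN}$, $q_0^{MIN} \xrightarrow{u} q \xrightarrow{w}$ Accept and $q_0^{MIN} \xrightarrow{v} q \xrightarrow{w}$ Reject.]
Contradiction! Since $L(M') = L(M_{MIN})$, the single state $q$ of $M_{MIN}$ would have to both accept and reject after reading $w$.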
The map is onto
Want to show: For all states $q'$ of $M'$ there is a state $q$ of $M_{MIN}$ such that $q \mapsto q'$.
For every $q'$ there is a string $w$ such that $M'$ reaches state $q'$ after reading in $w$.
Let $q$ be the state of $M_{MIN}$ after reading in $w$.
Claim: $q \mapsto q'$
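The map is one-to-one
Proof by contradiction. Suppose there are states $p \ne q$ such that $p \mapsto q'$ and $q \mapsto q'$.
If $p \ne q$, then $p$ and $q$ are distinguishable.
[Figure: in $M_{MIN}$, $q_0^{MIN} \xrightarrow{u} p \xrightarrow{w}$ Accept and $q_0^{MIN} \xrightarrow{v} q \xrightarrow{w}$ Reject; in $M'$, $q_0' \xrightarrow{u} q' \xrightarrow{w}$ Accept and $q_0' \xrightarrow{v} q' \xrightarrow{w}$ Reject.]
Contradiction! Since $L(M') = L(M_{MIN})$, the single state $q'$ of $M'$ would have to both accept and reject after reading $w$.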
How can we prove that two regular expressions are equivalent?
Parting thoughts:
Pumping for contradictions
DFAs can't count
DFAs can be optimized
Later: Can DFAs be learned?
Questions?