CS 154, Lecture 4: Limitations on DFAs (I), Pumping Lemma, Minimizing DFAs

Similar documents
CS154. Non-Regular Languages, Minimizing DFAs

Lecture 5: Minimizing DFAs

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

Closure Properties of Regular Languages. Union, Intersection, Difference, Concatenation, Kleene Closure, Reversal, Homomorphism, Inverse Homomorphism

Minimization of DFAs

Lecture 4: More on Regexps, Non-Regular Languages

Proving languages to be nonregular

Before we show how languages can be proven not regular, first, how would we show a language is regular?

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

What we have done so far

FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY

More on Regular Languages and Non-Regular Languages

CS5371 Theory of Computation. Lecture 5: Automata Theory III (Non-regular Language, Pumping Lemma, Regular Expression)

Properties of Regular Languages. BBM Automata Theory and Formal Languages 1

CSCE 551 Final Exam, Spring 2004 Answer Key

Incorrect reasoning about RL. Equivalence of NFA, DFA. Epsilon Closure. Proving equivalence. One direction is easy:

Formal Language and Automata Theory (CS21004)

Theory of Computation p.1/?? Theory of Computation p.2/?? Unknown: Implicitly a Boolean variable: true if a word is

Name: Student ID: Instructions:

Outline. CS21 Decidability and Tractability. Regular expressions and FA. Regular expressions and FA. Regular expressions and FA

FORMAL LANGUAGES, AUTOMATA AND COMPUTATION

Chapter 2: Finite Automata

Chapter 2 The Pumping Lemma

CS 455/555: Finite automata

Computational Theory

Automata Theory. Lecture on Discussion Course of CS120. Runzhe SJTU ACM CLASS

TDDD65 Introduction to the Theory of Computation

CISC 4090: Theory of Computation Chapter 1 Regular Languages. Section 1.1: Finite Automata. What is a computer? Finite automata

CPSC 421: Tutorial #1

CSE 105 THEORY OF COMPUTATION

Lecture 4 Nondeterministic Finite Accepters

THEORY OF COMPUTATION (AUBER) EXAM CRIB SHEET

CSE 311: Foundations of Computing. Lecture 23: Finite State Machine Minimization & NFAs

Computability and Complexity

Uses of finite automata

3515ICT: Theory of Computation. Regular languages

CS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 9 : Myhill-Nerode Theorem and applications

CS 154 Introduction to Automata and Complexity Theory

Computational Models - Lecture 3

Fooling Sets and. Lecture 5

CSE 105 THEORY OF COMPUTATION. Spring 2018 review class

CS 530: Theory of Computation Based on Sipser (second edition): Notes on regular languages(version 1.1)

T (s, xa) = T (T (s, x), a). The language recognized by M, denoted L(M), is the set of strings accepted by M. That is,

Theory of Computation (II) Yijia Chen Fudan University

COMP4141 Theory of Computation

Great Theoretical Ideas in Computer Science. Lecture 4: Deterministic Finite Automaton (DFA), Part 2

CS 154, Lecture 3: DFA NFA, Regular Expressions

CSE 311: Foundations of Computing I Autumn 2014 Practice Final: Section X. Closed book, closed notes, no cell phones, no calculators.

CS481F01 Solutions 6 PDAS

COMP-330 Theory of Computation. Fall Prof. Claude Crépeau. Lec. 5 : DFA minimization

Nondeterministic Finite Automata

Finite Automata and Regular languages

CS 154 NOTES CONTENTS

Intro to Theory of Computation

The Pumping Lemma and Closure Properties

Equivalence of DFAs and NFAs

We define the multi-step transition function T : S Σ S as follows. 1. For any s S, T (s,λ) = s. 2. For any s S, x Σ and a Σ,

Lecture Notes On THEORY OF COMPUTATION MODULE -1 UNIT - 2

More on Finite Automata and Regular Languages. (NTU EE) Regular Languages Fall / 41

Computational Models - Lecture 3 1

Computational Models: Class 3

Applied Computer Science II Chapter 1 : Regular Languages

Nondeterministic finite automata

CSE 105 THEORY OF COMPUTATION

Definition: Let S and T be sets. A binary relation on SxT is any subset of SxT. A binary relation on S is any subset of SxS.

Chapter 5. Finite Automata

Nondeterministic Finite Automata

CSE 105 THEORY OF COMPUTATION

Extended RE's. UNIX pioneered the use of additional operators. and notation for RE's: æ E? = 0 or 1 occurrences of E = æ + E.

Harvard CS 121 and CSCI E-207 Lecture 6: Regular Languages and Countability

CS 154. Finite Automata vs Regular Expressions, Non-Regular Languages

CSE 105 THEORY OF COMPUTATION

CSE 105 Theory of Computation

CSE 135: Introduction to Theory of Computation Optimal DFA

Fall 1999 Formal Language Theory Dr. R. Boyer. 1. There are other methods of nding a regular expression equivalent to a nite automaton in

Please give details of your calculation. A direct answer without explanation is not counted.

PS2 - Comments. University of Virginia - cs3102: Theory of Computation Spring 2010

Properties of the Integers

a + b = b + a and a b = b a. (a + b) + c = a + (b + c) and (a b) c = a (b c). a (b + c) = a b + a c and (a + b) c = a c + b c.

1 More finite deterministic automata

Languages, regular languages, finite automata

CSE 105 Theory of Computation Professor Jeanne Ferrante

Lecture 13: Foundations of Math and Kolmogorov Complexity

acs-04: Regular Languages Regular Languages Andreas Karwath & Malte Helmert Informatik Theorie II (A) WS2009/10

V Honors Theory of Computation

Decidability (What, stuff is unsolvable?)

CSE 105 THEORY OF COMPUTATION

Chapter Five: Nondeterministic Finite Automata

CS243, Logic and Computation Nondeterministic finite automata

CpSc 421 Final Exam December 6, 2008

The Pumping Lemma: limitations of regular languages

CS 154 NOTES MOOR XU NOTES FROM A COURSE BY LUCA TREVISAN AND RYAN WILLIAMS

15.1 Proof of the Cook-Levin Theorem: SAT is NP-complete

Automata Theory and Formal Grammars: Lecture 1

Notes on State Minimization

1 The Local-to-Global Lemma

4.2 The Halting Problem

CS154, Lecture 12: Kolmogorov Complexity: A Universal Theory of Data Compression

Sample Midterm. Understanding the questions is part of the exam; you are not allowed to ask questions during the exam.

Transcription:

CS 154, Lecture 4: Limitations on FAs (I), Pumping Lemma, Minimizing FAs

Regular or Not? Non-Regular Languages = { w w has equal number of occurrences of 01 and 10 } REGULAR! C = { w w has equal number of 1s and 0s} NOT REGULAR! How can we prove that there is no FA for a particular language? Surprising Algorithms (even in restricted models) are routinely being discovered

The Pumping Lemma: Structure in Regular Languages Let L be a regular language Then there is a positive integer P s.t. for all strings w L with w P there is a way to write w = xyz, where: 1. y > 0 (that is, y ε) 2. xy P 3. For all i 0, xy i z L Why is it called the pumping lemma? The word w gets pumped into longer and longer strings

Proof: Let M be a FA that recognizes L Let P be the number of states in M Let w be a string where w L and w P We show: w = xyz y x w 1. y > 0 2. xy P 3. xy i z L for all i 0 z q 0 q j q k q w Claim: There must exist j and k such that 0 j < k P, and q j = q k

Applying the Pumping Lemma Let s prove that EQ={ w #1s = #0s} is not regular. By contradiction. Assume EQ is regular. Let P be as in pumping lemma. Let w = 0 P 1 P EQ. Can write w = xyz, with y > 0, xy P, such that for all i 0, xy i z is also in EQ Claim: The string y must be all zeroes. Why? Because xy P and w = xyz = 0 P 1 P But then xyyz has more 0s than 1s Contradiction!

Applying the Pumping Lemma Prove: SQ = {0 n2 n 0} is not regular Assume SQ is regular. Let w = 0 P2 Can write w = xyz, with y > 0, xy P, such that for all i 0, xy i z is also in EQ So xyyz SQ. Note that xyyz = 0 P2 + y Note that 0 < y P So xyyz = P 2 + y P 2 + P < P 2 + 2P + 1 = (P+1) 2 and P 2 < xyyz < (P+1) 2 Therefore xyyz is not a perfect square! Hence 0 P2 + y = xyyz SQ, so our assumption must be false. SQ is not regular!

oes this FA have a minimal number of states?

Is this minimal? How can we tell in general?

Theorem: For every regular language L, there is a unique (up to re-labeling of the states) minimal-state FA M* such that L = L(M*). Furthermore, there is an efficient algorithm which, given any FA M, will output this unique M*. If these were true for more general models of computation, that would be an engineering breakthrough

Note: There isn t a uniquely minimal NFA

Extending transition function to strings Given FA M = (Q, Σ,, q 0, F), we extend to a function : Q Σ* Q as follows: (q, ε) = (q, ) = q (q, ) (q, 1 k+1 ) = ( (q, 1 k ), k+1 ) (q, w) = the state of M reached after reading in w, starting from state q Note: (q 0, w) F M accepts w ef. w Σ* distinguishes states q 1 and q 2 if (q exactly 1, w) one F of (q (q 1, w), 2, w) (q 2 F, w) is a final state

istinguishing two states ef. w Σ* distinguishes states q 1 and q 2 if exactly one of (q 1, w), (q 2, w) is a final state Here read this I m in q 1 or q 2, but which? How can I tell? Ok, I m accepting! Must have been q 1 W

Fix M = (Q, Σ,, q 0, F) and let p, q Q efinitions: State p is distinguishable from state q if there is w Σ* that distinguishes p and q ( there is w Σ* so that exactly one of (p, w), (q, w) is a final state) State p is indistinguishable from state q if p is not distinguishable from q ( for all w Σ*, (p, w) F (q, w) F) Pairs of indistinguishable states are redundant

Which pairs of states are distinguishable here? ε distinguishes all final states from non-final states

Which pairs of states are distinguishable here? The string 10 distinguishes q 0 and q 3

Which pairs of states are distinguishable here? The string 0 distinguishes q 1 and q 2

Fix M = (Q, Σ,, q 0, F) and let p, q, r Q efine a binary relation on the states of M: p q iff p is indistinguishable from q p q iff p is distinguishable from q Proposition: is an equivalence relation p p (reflexive) p q q p (symmetric) p q and q r p r (transitive) Proof?

Fix M = (Q, Σ,, q 0, F) and let p, q, r Q Therefore, the relation partitions Q into disjoint equivalence classes Proposition: is an equivalence relation [q] := { p p q }

Algorithm: MINIMIZE-FA Input: FA M Output: FA M MIN such that: L(M) = L(M MIN ) M MIN has no inaccessible states M MIN is irreducible For all states p q of M MIN, p and q are distinguishable Theorem: M MIN is the unique minimal FA that is equivalent to M

Intuition: The states of M MIN will be the equivalence classes of states of M We ll uncover these equivalent states with a dynamic programming algorithm

Input: FA M = (Q, Σ,, q 0, F) The Table-Filling Algorithm Output: (1) M = { (p, q) p, q Q and p q } (2) EQUIV M = { [q] q Q } High-Level Idea: We know how to find those pairs of states that the string ε distinguishes Use this and iteration to find those pairs distinguishable with longer strings The pairs of states left over will be indistinguishable

The Table-Filling Algorithm Input: FA M = (Q, Σ,, q 0, F) Output: (1) M = { (p, q) p, q Q and p q } (2) EQUIV M = { [q] q Q } Base Case: For all (p, q) such that p accepts and q rejects p q

The Table-Filling Algorithm Input: FA M = (Q, Σ,, q 0, F) Output: (1) M = { (p, q) p, q Q and p q } (2) EQUIV M = { [q] q Q } Base Case: For all (p, q) such that p accepts and q rejects p q Iterate: If there are states p, q and symbol σ Σ satisfying: (p, ) = (q, ) = p mark p q q Repeat until no more s can be added

Claim: If (p, q) is marked by the Table-Filling algorithm, then p q Proof: By induction on the number of steps in the algorithm before (p,q) is marked If (p, q) is marked at the start, then one state s in F and the other isn t, so ε distinguishes p and q Suppose (p, q) is marked at a later point. Then there are states p, q such that: 1. (p, q ) are marked p q (by induction) So there s a string w s.t. (p, w) F (q, w) F 2. p = (p, ) and q = (q, ), where Σ The string w distinguishes p and q!

Claim: If (p, q) is not marked by the Table-Filling algorithm, then p q Proof (by contradiction): Suppose the pair (p, q) is not marked by the algorithm, yet p q (call this a bad pair ) Then there is a string w such that w > 0 and: (p, w) F and (q, w) F (Why is w > 0?) Of We all have such w bad = w, pairs, for let some p, q string be a pair w and with some the shortest Σ distinguishing string w Let p = (p, ) and q = (q, ) Then (p, q ) is also a bad pair, but with a SHORTER distinguishing string, w!

Algorithm MINIMIZE Input: FA M Output: Equivalent minimal-state FA M MIN 1. Remove all inaccessible states from M 2. Run Table-Filling algorithm on M to get: EQUIV M = { [q] q is an accessible state of M } 3. efine: M MIN = (Q MIN, Σ, MIN, q 0 MIN, F MIN ) Q MIN = EQUIV M, q 0 MIN = [q 0 ], F MIN = { [q] q F } MIN ( [q], ) = [ ( q, ) ] Claim: L(M MIN ) = L(M)

Thm: M MIN is the unique minimal FA equivalent to M Claim: If L(M )=L(M MIN ) and M has no inaccessible states and M is irreducible there is an isomorphism between M and M MIN Suppose for now the Claim is true. If M is a minimal FA, then M has no inaccessible states and is irreducible (why?) So the Claim implies: Let M be a minimal FA for M. there is an isomorphism between M and the FA M MIN that is output by MINIMIZE(M). The Thm holds!

Thm: M MIN is the unique minimal FA equivalent to M Claim: If L(M )=L(M MIN ) and M has no inaccessible states and M is irreducible there is an isomorphism between M and M MIN Proof: We recursively construct a map from the states of M MIN to the states of M Base Case: q 0 MIN q 0 Recursive Step: If p p q q Then q q

Base Case: q 0 MIN q 0 Recursive Step: If p p Then q q q q

Base Case: q 0 MIN q 0 Recursive Step: We need to prove: If p p The map is defined everywhere The map is well defined The map is a bijection q Then q q q The map preserves all transitions: If p p then MIN (p, ) (p, ) (this follows from the definition of the map!)

Base Case: q 0 MIN q 0 Recursive Step: If p p The map is defined everywhere That is, for all states q of M MIN there is a state q of M such that q q q Then q q q If q M MIN, there is a string w such that MIN (q 0 MIN,w) = q Let q = (q 0,w). Then q q

Base Case: q 0 MIN q 0 Recursive Step: If p p Then q q q q The map is well defined Proof by contradiction. Suppose there are states q and q such that q q and q q We show that q and q are indistinguishable, so it must be that q = q

Suppose there are states q and q such that q q and q q Suppose q and q are distinguishable M MIN M q 0 MIN u q w Accept q 0 u q w Accept q 0 MIN v Contradiction! w q Reject q 0 v q w Reject

Base Case: q 0 MIN q 0 Recursive Step: The map is onto If p p q Then q q q Want to show: For all states q of M there is a state q of M MIN such that q q For every q there is a string w such that M reaches state q after reading in w Let q be the state of M MIN after reading in w Claim: q q

The map is one-to-one Proof by contradiction. Suppose there are states p q such that p q and q q If p q, then p and q are distinguishable M MIN M q 0 MIN u p w Accept q 0 u q w Accept q 0 MIN v q w Reject q 0 v Contradiction! w q Reject

How can we prove that two regular expressions are equivalent?

Parting thoughts: Pumping for contradictions FAs can t count FAs can be optimized Later: Can FAs be learned? Questions?