
SIAM J. COMPUT. © 1998 Society for Industrial and Applied Mathematics
Vol. 27, No. 4, pp. 1073-1082, August 1998 008

SEPARATING EXPONENTIALLY AMBIGUOUS FINITE AUTOMATA FROM POLYNOMIALLY AMBIGUOUS FINITE AUTOMATA

HING LEUNG

Abstract. We resolve an open problem raised by Ravikumar and Ibarra [SIAM J. Comput., 18 (1989), pp. 1263-1282] on the succinctness of representations relating to the types of ambiguity of finite automata. We show that there exists a family of nondeterministic finite automata $\{A_n\}$ over a two-letter alphabet such that, for any positive integer $n$, $A_n$ is exponentially ambiguous and has $n$ states, whereas the smallest equivalent deterministic finite automaton has $2^n$ states, and any smallest equivalent polynomially ambiguous finite automaton has $2^n - 1$ states.

Key words. nondeterministic finite automata, ambiguity, succinctness of representation

AMS subject classification. 68Q68

PII. S0097539793252092

1. Introduction. In their 1989 paper [RI89], Ravikumar and Ibarra raised some interesting questions relating the type of ambiguity of finite automata to the succinctness in their number of states. They considered the following five classes of finite automata: deterministic finite automata (DFA), nondeterministic finite automata (NFA), unambiguous NFA (UFA), finitely ambiguous NFA (FNA), and polynomially ambiguous NFA (PNA). Formal definitions of these classes are given in section 2 of this paper.

Let $C_1$ and $C_2$ be any two of the above five classes of finite automata. We say that $C_1$ can be polynomially converted to $C_2$ (written $C_1 \le_P C_2$) if there exists a polynomial $p$ such that for any finite automaton in $C_1$ with $n$ states we can find an equivalent finite automaton in $C_2$ with at most $p(n)$ states. $C_1$ is said to be polynomially related to $C_2$ (written $C_1 =_P C_2$) if $C_1 \le_P C_2$ and $C_2 \le_P C_1$. $C_1$ is said to be separated from $C_2$ if $C_1 \ne_P C_2$. We write $C_1 <_P C_2$ if $C_1 \le_P C_2$ and $C_1 \ne_P C_2$.

It is immediate that DFA $\le_P$ UFA, UFA $\le_P$ FNA, FNA $\le_P$ PNA, and PNA $\le_P$ NFA. The following is known:
DFA $<_P$ NFA [MF71], [Mo71], DFA $<_P$ UFA [Sc78], [SH85], [RI89], UFA $<_P$ FNA [Sc78], [RI89], and UFA $<_P$ NFA [SH85]. It is unknown whether FNA $<_P$ NFA. Ravikumar and Ibarra conjecture that FNA $<_P$ PNA and that PNA $<_P$ NFA [RI89].

In this paper, we prove that PNA $<_P$ NFA, which immediately implies that FNA $<_P$ NFA. The other conjecture, that FNA $<_P$ PNA, still remains open. In summary, we have DFA $<_P$ UFA $<_P$ FNA $\le_P$ PNA $<_P$ NFA.

[Footnote: Received by the editors July 14, 1993; accepted for publication (in revised form) May 31, 1996; published electronically May 19, 1998. This research was supported by an Alexander von Humboldt research fellowship and was done while the author was visiting the University of Frankfurt, Germany. A preliminary version of the paper appeared in the proceedings of ISAAC '93. http://www.siam.org/journals/sicomp/27-4/25209.html]

[Footnote: Department of Computer Science, New Mexico State University, Las Cruces, NM 88003 (hleung@cs.nmsu.edu).]

Specifically, we show that there exists a family of NFAs $\{A_n \mid n \ge 1\}$ over a two-letter alphabet such that, for any positive integer $n$, $A_n$ is exponentially ambiguous

and has $n$ states, whereas the smallest equivalent DFA has $2^n$ states and any smallest equivalent PNA has $2^n - 1$ states. Our results show that any PNA equivalent to $A_n$ cannot do better in the number of states than the smallest equivalent DFA obtained by the subset construction except for the saving of the dead state.

Another way to interpret our results is as follows: let us first define that an NFA $M$ is strongly ambiguous if there is a useful state $q$ (that is, $q$ can be reached from some starting state and can reach some final state) and a string $w$ such that $M$ can process $w$ starting from state $q$ and ending also with state $q$ in more than one way. Then by a characterization in [IR86] our results show that $A_n$ is strongly ambiguous with $n$ states, whereas the smallest equivalent DFA has $2^n$ states and any smallest equivalent NFA that is not strongly ambiguous has $2^n - 1$ states.

Section 2 presents the definitions and some basic results. Section 3 presents the family of NFAs $\{A_n\}$ and proves the main result of this paper by a series of lemmas.

2. Preliminaries. We assume that the reader is familiar with the basic definitions and notations in finite automata theory and the Myhill-Nerode theorem [HU79] as well as the basics of graph theory [Ha69].

Let $w$ be a string, and $L$ a language. We denote by $w^R$ the reverse of $w$ and by $L^R$ the set of strings $v^R$ where $v \in L$.

Throughout this paper, we assume a model of NFA that is slightly more general than the one defined in [HU79] in that we allow a set of starting states instead of only one starting state. Thus, an NFA $M$ is a 5-tuple $(Q, \Sigma, \delta, Q_I, Q_F)$ where $Q$ is the set of states, $\Sigma$ is the alphabet set, $\delta : Q \times \Sigma \to 2^Q$ is the transition function, $Q_I$ is the set of starting states, and $Q_F$ is the set of final states.

Given an NFA $M$, we define the ambiguity of a string $w$ to be the number of different accepting paths for $w$ in $M$. Note that a string $w$ is in the language of $M$ if and only if the ambiguity of $w$ is not zero. The ambiguity function
$\mathrm{amb}_M : \mathbb{N} \to \mathbb{N}$ is defined such that $\mathrm{amb}_M(n)$ is the maximum of the ambiguities of strings that are of length $n$ or less. Remark: $\mathrm{amb}_M$ is nondecreasing.

$M$ is called unambiguous if the ambiguity of any string is either zero or one. $M$ is called finitely (respectively, polynomially, exponentially) ambiguous if $\mathrm{amb}_M$ can be bounded by a constant (respectively, polynomial, exponential) function $f$; that is, for all $n \in \mathbb{N}$, $\mathrm{amb}_M(n) \le f(n)$. It is easy to see that $\mathrm{amb}_M(n) \le s_1 s^n$ where $s_1$ is the cardinality of $Q_I$ and $s$ is the cardinality of $Q$. Thus, every NFA must be exponentially ambiguous. $M$ is called strictly exponentially ambiguous [IR86] if $M$ is exponentially ambiguous but not polynomially ambiguous. It is known [IR86] that $M$ is strictly exponentially ambiguous if and only if there is a useful state $q$ and there is a string $w$ such that $M$ can process $w$ starting from state $q$ and ending also with state $q$ in more than one way.

3. Main result. For any positive integer $n$, we define an NFA $A_n = (Q, \Sigma, \delta, \{q_1\}, \{q_1\})$ where $Q = \{q_1, q_2, \ldots, q_n\}$, $q_1$ is the only starting state and the only final state, $\Sigma = \{0, 1\}$, and $\delta$ (see Figure 1) is defined as follows:

$\delta(q_1, 0) = \{q_1, q_2\}$,
$\delta(q_i, 0) = \{q_{i+1}\}$ for $2 \le i \le n - 1$,
$\delta(q_n, 0) = \{q_1\}$,
$\delta(q_1, 1) = \emptyset$,
$\delta(q_i, 1) = \{q_i\}$ for $2 \le i \le n$.
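The definitions above can be exercised directly. The following Python sketch is an illustration, not part of the original paper: it encodes $\delta$ for $A_n$ with the integer `i` standing for $q_i$, and counts the accepting paths of a string by dynamic programming. The helper names `make_delta` and `ambiguity` are hypothetical, and $n \ge 2$ is assumed.

```python
def make_delta(n):
    """Transition function of A_n; state i stands for q_i. Assumes n >= 2."""
    def delta(q, a):
        if a == "0":
            if q == 1:
                return {1, 2}                  # q_1 loops and also enters the cycle
            return {q + 1} if q < n else {1}   # q_n closes the cycle back to q_1
        # a == "1": q_1 has no move, every other state loops
        return set() if q == 1 else {q}
    return delta


def ambiguity(n, w):
    """Number of distinct accepting paths of A_n on w (q_1 is start and final)."""
    delta = make_delta(n)
    paths = {1: 1}  # paths[q] = number of paths from q_1 to q on the prefix read so far
    for a in w:
        nxt = {}
        for q, count in paths.items():
            for r in delta(q, a):
                nxt[r] = nxt.get(r, 0) + count
        paths = nxt
    return paths.get(1, 0)


if __name__ == "__main__":
    for m in range(0, 10):
        print(m, ambiguity(3, "0" * m))
```

With $n = 3$, for example, the string 000 has two accepting paths (three self-loops at $q_1$, or one trip around the cycle), and the count for strings of zeros keeps growing with their length.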

Fig. 1. Transition diagram of $A_n$.

We denote the language of $A_n$ by $L_n$, which is $(0 + (01^*)^{n-1}0)^*$. It is easy to see that $L_n = L_n^R = L_n^*$.

It is argued in section 2 that every NFA is exponentially ambiguous. Thus $A_n$ is exponentially ambiguous. Moreover, $A_n$ is strictly exponentially ambiguous since the ambiguity of $0^m$ is at least $2^{\lfloor m/n \rfloor}$.

We will prove that any DFA recognizing $L_n$ has at least $2^n$ states (Lemma 2), any UFA recognizing $L_n$ has at least $2^n - 1$ states (Lemma 4), and any PNA recognizing $L_n$ also has at least $2^n - 1$ states (Theorem 1).

First, we present some definitions. Given a language $L$ and a string $x$,

$\mathrm{prefix}(L) \stackrel{\mathrm{def}}{=} \{w \mid \exists w',\ ww' \in L\}$ and $x^{-1}(L) \stackrel{\mathrm{def}}{=} \{w \mid xw \in L\}$.

The two operations prefix and $x^{-1}$ commute since both $x^{-1}(\mathrm{prefix}(L))$ and $\mathrm{prefix}(x^{-1}(L))$ equal $\{w \mid \exists w',\ xww' \in L\}$.

Let kill_non_q_1 denote $(01)^{n-1}0$, accept denote $0^{n-1}$, and reset denote accept · kill_non_q_1. The intuitive concepts of the strings kill_non_q_1, accept, and reset are reflected in the following properties: for any $P \subseteq Q$, $\delta(P, \text{kill\_non\_q\_1}) = P - \{q_2, \ldots, q_n\}$. For any nonempty subset $P \subseteq Q$, observe that $q_1 \in \delta(P, \text{accept})$ and $\delta(P, \text{reset}) = \{q_1\}$. Equivalently, for any $x \in \mathrm{prefix}(L_n)$, $x \cdot \text{accept} \in L_n$ and $(x \cdot \text{reset})^{-1}(L_n) = L_n$.

For any $P \subseteq Q$, let $w_P \in \Sigma^*$ be $w_1\,0w_n\,0w_{n-1} \cdots 0w_1$ where $w_i = \epsilon$ if $q_i \in P$ and $w_i = 1$ otherwise; and let $u_P \in \Sigma^*$ be $0^{n-1} w_P$. The meaning of $w_P$ and $u_P$ can be understood as the strings that satisfy Lemma 1 and Corollary 1, respectively, given below. The following properties (Lemma 1 and Corollaries 1 and 2) of the strings $w_P$ and $u_P$ are crucial in proving the main theorem of the paper.

Lemma 1. For any $P \subseteq Q$, we have
(1) $\delta(P, w_P) = P$;
(2) for any $q \in P$, $\delta(q, w_P) \supseteq \{q\}$;
(3) $\delta(Q - P, w_P) = \emptyset$.

Proof. The proof of Lemma 1, which is quite long but straightforward, is given in the appendix.

Corollary 1. For any $P \subseteq Q$, $\delta(q_1, u_P) = P$.

Proof. $\delta(q_1, u_P) = \delta(q_1, 0^{n-1} w_P) = \delta(Q, w_P) = \delta(P, w_P) \cup \delta(Q - P, w_P) = \delta(P, w_P) = P$ by parts (1) and (3) of Lemma 1.
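Lemma 1 and Corollary 1 can be checked exhaustively for small $n$. The sketch below is an illustration only and does not replace the proof in the appendix. It reads $w_P$ as $w_1\,0w_n\,0w_{n-1} \cdots 0w_1$ (with $w_i = \epsilon$ if $q_i \in P$ and $w_i = 1$ otherwise) and $u_P$ as $0^{n-1} w_P$; the function names are hypothetical, states are the integers 1..n, and $n \ge 2$ is assumed.

```python
from itertools import combinations


def run(n, S, w):
    """Set of states of A_n reachable from the state set S on reading w."""
    S = set(S)
    for a in w:
        nxt = set()
        for q in S:
            if a == "0":
                nxt |= {1, 2} if q == 1 else ({q + 1} if q < n else {1})
            elif q != 1:       # on '1': q_1 has no move, other states loop
                nxt.add(q)
        S = nxt
    return frozenset(S)


def w_str(n, P):
    """w_P = w_1 0w_n 0w_{n-1} ... 0w_1, with w_i empty if q_i in P, else '1'."""
    bit = lambda i: "" if i in P else "1"
    return bit(1) + "".join("0" + bit(i) for i in range(n, 0, -1))


def u_str(n, P):
    """u_P = 0^(n-1) w_P."""
    return "0" * (n - 1) + w_str(n, P)


def check_lemma1_and_corollary1(n):
    """Assert Lemma 1 (1)-(3) and Corollary 1 for every subset P of Q."""
    Q = frozenset(range(1, n + 1))
    for r in range(n + 1):
        for c in combinations(sorted(Q), r):
            P = frozenset(c)
            w = w_str(n, P)
            assert run(n, P, w) == P                     # part (1)
            assert all(q in run(n, {q}, w) for q in P)   # part (2)
            assert run(n, Q - P, w) == frozenset()       # part (3)
            assert run(n, {1}, u_str(n, P)) == P         # Corollary 1
    return True


if __name__ == "__main__":
    print(all(check_lemma1_and_corollary1(n) for n in range(2, 6)))
```

For $n = 2$ and $P = \{q_2\}$, for instance, the sketch builds $w_P = 1001$ and $u_P = 01001$, and reading $u_P$ from $q_1$ ends exactly in $\{q_2\}$.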
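Corollary 2. For any $P, P' \subseteq Q$, $u_P w_{P'} \in \mathrm{prefix}(L_n)$ if and only if $P \cap P' \ne \emptyset$.

Proof. Suppose $P \cap P' \ne \emptyset$. Let $q \in P \cap P'$. Then $\delta(q_1, u_P w_{P'}) = \delta(P, w_{P'}) \supseteq \delta(q, w_{P'}) \supseteq \{q\}$ by Corollary 1 and part (2) of Lemma 1. Since all states in $A_n$ are useful, $u_P w_{P'} \in \mathrm{prefix}(L_n)$.

Suppose $P \cap P' = \emptyset$. Then $P \subseteq Q - P'$ and $\delta(q_1, u_P w_{P'}) = \delta(P, w_{P'}) \subseteq \delta(Q - P', w_{P'}) = \emptyset$ by Corollary 1 and part (3) of Lemma 1. Thus we have that $u_P w_{P'} \notin \mathrm{prefix}(L_n)$.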
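Corollary 2 can be sanity-checked in the same spirit. The sketch below is an assumption-laden illustration, not the paper's argument: it tests membership in $\mathrm{prefix}(L_n)$ by appending the string accept $= 0^{n-1}$ and asking whether $A_n$ accepts, which relies on the stated property that $x \cdot \text{accept} \in L_n$ for every $x \in \mathrm{prefix}(L_n)$. It uses the same reading of $w_P$ and $u_P$ as $w_1\,0w_n \cdots 0w_1$ and $0^{n-1} w_P$; all function names are hypothetical and $n \ge 2$ is assumed.

```python
from itertools import combinations


def run(n, S, w):
    """Set of states of A_n (integers 1..n) reachable from S on reading w."""
    S = set(S)
    for a in w:
        nxt = set()
        for q in S:
            if a == "0":
                nxt |= {1, 2} if q == 1 else ({q + 1} if q < n else {1})
            elif q != 1:       # on '1': q_1 has no move, other states loop
                nxt.add(q)
        S = nxt
    return S


def w_str(n, P):
    """w_P = w_1 0w_n 0w_{n-1} ... 0w_1, with w_i empty if q_i in P, else '1'."""
    bit = lambda i: "" if i in P else "1"
    return bit(1) + "".join("0" + bit(i) for i in range(n, 0, -1))


def in_prefix(n, x):
    """x is in prefix(L_n) iff x followed by accept = 0^(n-1) is in L_n."""
    return 1 in run(n, {1}, x + "0" * (n - 1))


def check_corollary2(n):
    """Assert: u_P w_P' in prefix(L_n) iff P and P' intersect, for all P, P'."""
    states = range(1, n + 1)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(states, r)]
    for P in subsets:
        u_P = "0" * (n - 1) + w_str(n, P)
        for P2 in subsets:
            assert in_prefix(n, u_P + w_str(n, P2)) == bool(P & P2)
    return True


if __name__ == "__main__":
    print(all(check_corollary2(n) for n in range(2, 5)))
```

The Boolean table computed inside `check_corollary2` is exactly the intersection pattern "P and P' share a state", which is the form in which the corollary is used in the lower-bound arguments that follow.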

With the basic properties established, we begin to prove the first result that the smallest DFA recognizing $L_n$ has $2^n$ states.

Lemma 2. The smallest DFA recognizing $L_n$ has $2^n$ states.

Proof. By Corollary 1, all subsets of states can be realized in the subset construction. We show that any two different subsets of states are not equivalent; then we are done by the Myhill-Nerode theorem.

Let $P \ne P' \subseteq Q$. Let $q_i$ be the state with the largest subscript such that $q_i$ belongs to exactly one of $P$ and $P'$. That is, for any $q_j$ where $i + 1 \le j \le n$, either $q_j$ belongs to both $P$ and $P'$ or $q_j$ belongs to neither $P$ nor $P'$. If $i = 1$, then $P$ and $P'$ can be distinguished by the empty string. Assume that $1 < i \le n$. Then $P$ and $P'$ can be distinguished by $(10)^{n+1-i}$.

The next result (Lemma 4) that we want to establish is that a smallest UFA recognizing $L_n$ cannot do better in the number of states than a smallest DFA besides the saving of the dead state. Some technical definitions and a technical lemma (Lemma 3) are needed first.

Let $M_n$ be a $(2^n - 1) \times (2^n - 1)$ matrix over the field of characteristic 2 with rows and columns indexed by the nonempty subsets of $Q$ such that $M_n(P, P') = 1$ if $u_P w_{P'} \cdot \text{accept} \in L_n$, and $M_n(P, P') = 0$ otherwise. By Corollary 2 and the property of accept, $M_n(P, P') = 1$ if $P \cap P' \ne \emptyset$ and $M_n(P, P') = 0$ otherwise.

Lemma 3. The rank of $M_n$ is $2^n - 1$.

Proof. Equivalently, we can index rows and columns of $M_n$ by $n$-bit positive binary numbers in the order of increasing values such that any $n$-bit positive binary number $b_n b_{n-1} \cdots b_1$ corresponds to the nonempty subset $P \subseteq Q$ with the property that for any $1 \le i \le n$, $q_i \in P$ if and only if $b_i = 1$. Note that the indices range from the binary number of value 1 to the binary number of value $2^n - 1$. Thus, $M_n(\alpha, \beta) = 1$ if there is some $i$ such that the $i$th bits of $\alpha$ and $\beta$ are both 1, and $M_n(\alpha, \beta) = 0$ otherwise.

We are going to show by induction on $n$ that $M_n$ has rank $2^n - 1$. For $n = 1$, $M_n = [1]$, which is a $1 \times 1$ matrix of rank $2^1 - 1 = 1$. Suppose that the
statement is true for $n = k$. Consider $n = k + 1$. We observe that the matrix $M_{k+1}$ (see Figure 2) can be characterized as follows:

The matrix $M_{k+1}$ is symmetric. (Reason: According to the definition given above for $M_n(\alpha, \beta)$, it is immediate that $M_n(\alpha, \beta) = M_n(\beta, \alpha)$.)

The middle row and the middle column both have $2^k - 1$ zeros followed by $2^k$ ones. (Reason: The middle row has an index of a one followed by $k$ zeros. On the other hand, the first $2^k - 1$ column indices are binary numbers that begin with zero, and the last $2^k$ column indices are binary numbers that begin with one. Thus, by definition, the middle row has $2^k - 1$ zeros followed by $2^k$ ones. Since $M_{k+1}$ is symmetric, the middle column also has $2^k - 1$ zeros followed by $2^k$ ones.)

Separated by the middle row and the middle column, we have four square submatrices of sizes $(2^k - 1) \times (2^k - 1)$ each. Let us name the upper-left, upper-right, lower-left, and lower-right submatrices as UL, UR, LL, and LR. Then UL, UR, and LL are the same as $M_k$, and LR is a matrix of ones. (Reason: Elements in LR correspond to row and column indices that both begin with one. Thus LR is a matrix of ones. For elements in other submatrices, either the row or column index begins with zero. Thus the element values are determined by the remaining $k$ bits of the row and column indices, which behave in the same way as the row and column indices of $M_k$. Hence, UL,

UR, and LL are the same as $M_k$.)

Fig. 2. Structure of $M_{k+1}$.

Fig. 3. Structure of $M_{k+1}$ (after transformations).

We want to apply elementary row operations to show that $M_{k+1}$ can be transformed to an identity matrix. By subtracting the middle row from each of the rows in the lower half of the matrix, LL remains unchanged whereas the rest of the entries in the lower half become all zeros. Note that the middle column now has all zeros except a one in the middle. Let us swap the lower half of the matrix with the upper half. See Figure 3 for the current structure of $M_{k+1}$.

By induction, we can apply elementary row operations to the upper half such that UL becomes an identity matrix, and the rest of the entries in the upper half are zeros. With UL being the identity matrix and zeros elsewhere in the upper half, we apply again elementary row operations so that LL becomes all zeros whereas LR remains the same as $M_k$. See Figure 4 for the current structure of $M_{k+1}$.

Next, again by induction, we can transform LR to an identity matrix since the rest of the entries in the lower half are all zeros. Finally, subtract each of the rows in the lower half from the middle row; the middle row becomes a row with all zeros except a one in the middle position. Therefore, an identity matrix is obtained and the rank of $M_{k+1}$ is thus $2^{k+1} - 1$.

Lemma 4. A smallest UFA recognizing $L_n$ has $2^n - 1$ states.

Proof. First, by removing the dead state from the DFA obtained by the subset construction, we have a UFA with $2^n - 1$ states. Next we are going to use a technique introduced in [Sc78] to show that any UFA would require at least $2^n - 1$ states.

Let $U$ be a UFA recognizing $L_n$ with the finite set of states denoted by $K$. Let $R$ be a matrix over the field of characteristic 2 with rows indexed by $K$ and columns indexed by the nonempty subsets of $Q$ such that $R(k, P) = 1$ if $U$ can reach a final
state starting from state $k \in K$ on consuming $w_P \cdot \text{accept}$, and $R(k, P) = 0$ otherwise. We claim that any row in $M_n$ is a linear combination of the rows in $R$.

Given a nonempty subset $P$ of $Q$, let $K' \subseteq K$ be the set of states reached by $U$ from the set of starting states on consuming $u_P$. For any $k_1 \ne k_2 \in K'$ and for any nonempty subset

$P'$ of $Q$, $R(k_1, P')$ and $R(k_2, P')$ cannot both have value one; that is, at most one of $R(k, P')$, for $k \in K'$, is 1. Otherwise, there are two different accepting paths for $u_P w_{P'} \cdot \text{accept}$ in $U$, which contradicts the assumption that $U$ is unambiguous. Thus, the row indexed by $P$ in $M_n$ is the sum of the rows indexed by $K'$ in $R$. Therefore, the rank of $M_n$ is less than or equal to the rank of $R$. Hence, $K$ must have at least $2^n - 1$ states so that the rank of $R$ is at least $2^n - 1$.

Fig. 4. Structure of $M_{k+1}$ (after transformations).

We need Lemma 5 below to prove Lemma 6, which is the main technical lemma that helps us to show that a smallest PNA recognizing $L_n$ has $2^n - 1$ states (Theorem 1). In fact, Lemma 5 can be viewed as a generalization of Lemma 4.

Lemma 5. Any UFA recognizing $L$ such that $\mathrm{prefix}(L) = \mathrm{prefix}(L_n)$ requires at least $2^n - 1$ states.

Proof. Since $L$ is regular, there exists a finite set of strings $\{\gamma_1, \ldots, \gamma_h\} \subseteq \Sigma^*$ such that for any $z \in \mathrm{prefix}(L)$, exactly one of $z\gamma_i$, for $1 \le i \le h$, is in $L$.

Let $X$ be a $(2^n - 1) \times h(2^n - 1)$ matrix over the field of characteristic 2 with rows indexed by the nonempty subsets of $Q$ and columns indexed by $\{(P', i) \mid \emptyset \ne P' \subseteq Q,\ 1 \le i \le h\}$ such that $X(P, (P', i)) = 1$ if $u_P w_{P'} \gamma_i \in L$, and $X(P, (P', i)) = 0$ otherwise. We claim that the rank of $X$ is $2^n - 1$.

We define $X'$ to be a $(2^n - 1) \times (2^n - 1)$ matrix over the field of characteristic 2 with rows and columns indexed by the nonempty subsets of $Q$ such that the column indexed by $P'$ is the sum of the $h$ columns in $X$ indexed by $\{(P', i) \mid 1 \le i \le h\}$. We want to show that $X'$ is the same matrix as $M_n$. That is, we want to show that $X'(P, P') = 1$ if $P \cap P' \ne \emptyset$ and $X'(P, P') = 0$ otherwise.

Suppose $P \cap P' \ne \emptyset$. By Corollary 2, $u_P w_{P'} \in \mathrm{prefix}(L_n) = \mathrm{prefix}(L)$. By the definition of the $\gamma_i$'s, exactly one of $u_P w_{P'} \gamma_i$, for $1 \le i \le h$, is in $L$. That is, exactly one of $X(P, (P', i))$, for $1 \le i \le h$, is 1. Hence, $X'(P, P') = 1$.

Suppose $P \cap P' = \emptyset$. By Corollary 2, $u_P w_{P'} \notin \mathrm{prefix}(L_n) = \mathrm{prefix}(L)$. Therefore, $u_P w_{P'} \gamma_i \notin L$ for $1 \le i \le h$. Hence, $X(P, (P', i)) = 0$ for $1 \le i \le h$ and
$X'(P, P') = 0$.

By Lemma 3, the rank of $X'$ is $2^n - 1$. Since each column of $X'$ is obtained by a linear combination of the columns of $X$, the rank of $X$ must be bigger than or equal to the rank of $X'$. Thus, the rank of $X$ is at least $2^n - 1$. Moreover, since the number of rows in $X$ is $2^n - 1$, the rank of $X$ is at most $2^n - 1$. Therefore, the rank of $X$ is $2^n - 1$.

Finally, since the rank of $X$ is $2^n - 1$ and by using the same technique as in the proof of Lemma 4, a smallest UFA for $L$ must have at least $2^n - 1$ states.

The following lemma is a generalization of Lemma 5, which is the special case where $x = \epsilon$.
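Before that lemma is stated, the rank claim of Lemma 3, on which Lemmas 4 and 5 rest, can be checked numerically for small $n$. The sketch below is illustrative only: it builds $M_n$ from the bit-index description in the proof of Lemma 3 ($M_n(\alpha, \beta) = 1$ exactly when $\alpha$ and $\beta$ share a 1 bit), stores each row as a Python integer bitmask, and computes the rank over GF(2) by Gaussian elimination. The function names are hypothetical.

```python
def intersection_matrix_rows(n):
    """Rows of M_n as bitmasks; bit (beta - 1) of row alpha is set iff alpha & beta != 0."""
    size = 2 ** n - 1
    rows = []
    for alpha in range(1, size + 1):
        row = 0
        for beta in range(1, size + 1):
            if alpha & beta:
                row |= 1 << (beta - 1)
        rows.append(row)
    return rows


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # highest set bit -> reduced row having that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row   # new independent row
                break
            row ^= pivots[lead]      # eliminate the leading bit (addition mod 2)
    return len(pivots)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, rank_gf2(intersection_matrix_rows(n)), 2 ** n - 1)
```

For $n = 2$ the rows are 101, 110, and 111 in binary (columns ordered from the highest index down), and elimination finds three pivots, matching $2^2 - 1 = 3$.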

Lemma 6. Let $U$ be a UFA with the number of states less than $2^n - 1$ which accepts $L$ such that $\mathrm{prefix}(L) \subseteq \mathrm{prefix}(L_n)$. Then for all $x \in \mathrm{prefix}(L_n)$, $x^{-1}(\mathrm{prefix}(L)) \subsetneq x^{-1}(\mathrm{prefix}(L_n))$.

Proof. From $\mathrm{prefix}(L) \subseteq \mathrm{prefix}(L_n)$, it is immediate that $x^{-1}(\mathrm{prefix}(L)) \subseteq x^{-1}(\mathrm{prefix}(L_n))$ for any arbitrary $x$. We want to show that $x^{-1}(\mathrm{prefix}(L)) \ne x^{-1}(\mathrm{prefix}(L_n))$ for any $x \in \mathrm{prefix}(L_n)$.

Suppose to the contrary that $x^{-1}(\mathrm{prefix}(L)) = x^{-1}(\mathrm{prefix}(L_n))$ for some $x \in \mathrm{prefix}(L_n)$. Thus, $z^{-1}(\mathrm{prefix}(L)) = z^{-1}(\mathrm{prefix}(L_n))$ where $z = x \cdot \text{reset}$. Since $z^{-1}$ and prefix commute, $\mathrm{prefix}(z^{-1}(L)) = \mathrm{prefix}(z^{-1}(L_n))$. Let $L'$ be $z^{-1}(L)$. Then we obtain $\mathrm{prefix}(L') = \mathrm{prefix}(L_n)$ since $z^{-1}(L_n) = L_n$ by the property of reset.

We are going to construct a UFA $U'$ with less than $2^n - 1$ states to recognize $L'$, which is a contradiction because of Lemma 5. The transition diagram for $U'$ is the same as that of $U$. The set of starting states for $U'$ is defined to be the set of states reached by $U$ on consuming $z$ from the set of starting states of $U$. The set of final states for $U'$ is again the same as that of $U$. It is clear that the language accepted by $U'$ is $L' = z^{-1}(L)$. Also, $U'$ cannot be ambiguous; otherwise $U$ is also ambiguous. Moreover, the number of states in $U'$ is the same as the number of states in $U$; therefore, it is less than $2^n - 1$.

We are ready to prove the main result of this paper.

Theorem 1. A smallest PNA recognizing $L_n$ has $2^n - 1$ states.

Proof. By removing the dead state from the DFA obtained by the subset construction, we have a UFA, which is polynomially ambiguous, with $2^n - 1$ states.

Let $M$ be a PNA for $L_n$ with the smallest number of states. Then every state in $M$ must be useful. Consider the transition diagram of $M$. Since the strongly connected components form a partial ordering with respect to reachability, there must exist one strongly connected component, denoted $T$, that cannot be reached from other strongly connected components. We claim that $T$ must have at least $2^n - 1$ states. Suppose
to the contrary that it has less than $2^n - 1$ states.

$T$ must have some starting states in it. Otherwise it is not useful, which contradicts the definition of $M$. Let the set of starting states of $M$ that appear in $T$ be $\{p_1, \ldots, p_k\}$.

Let $1 \le i \le k$. We define an NFA $T_{p_i}$ such that $p_i$ is now the only starting and final state, and the transition diagram for $T_{p_i}$ is $T$. We want to check that Lemma 6 can be applied to the language of $T_{p_i}$. First, $T_{p_i}$ has less than $2^n - 1$ states. Next, $T_{p_i}$ is a UFA; otherwise by the characterization given in section 2, $M$ is strictly exponentially ambiguous, a contradiction. Let $w \in \mathrm{prefix}(L(T_{p_i}))$. Then $p_i$ must reach a nonempty subset of states in $T$ on consuming $w$. Since all states in $M$ are useful, $w$ is therefore in $\mathrm{prefix}(L_n)$. Hence, $\mathrm{prefix}(L(T_{p_i})) \subseteq \mathrm{prefix}(L_n)$.

Consider the set of UFAs $\{T_{p_i} \mid 1 \le i \le k\}$. Let $x_0 = \epsilon \in \mathrm{prefix}(L_n)$. For $1 \le i \le k$, we define $x_i$ to be a string chosen arbitrarily from

$(x_0 x_1 \cdots x_{i-1})^{-1}(\mathrm{prefix}(L_n)) - (x_0 x_1 \cdots x_{i-1})^{-1}(\mathrm{prefix}(L(T_{p_i})))$.

Thus, $x_0 x_1 \cdots x_i \in \mathrm{prefix}(L_n)$ for $0 \le i \le k$. The existence of $x_i$, for $1 \le i \le k$, is then guaranteed by Lemma 6.

Let $x$ be $x_1 \cdots x_k$. By the way the $x_i$'s are defined, $x \notin \mathrm{prefix}(L(T_{p_i}))$ for $1 \le i \le k$. Thus, each UFA $T_{p_i}$, $1 \le i \le k$, reaches the empty set from the starting state $p_i$ on consuming $x$ since all states in $T_{p_i}$ are useful.

Let $z = x \cdot \text{reset}$. Since $x \in \mathrm{prefix}(L_n)$, then $z^{-1}(L_n) = L_n$ by the property of reset.

Let us consider $M$ again. From the set of starting states, $M$ reaches a subset of states, denoted $P$, on consuming $z$. Since $z \in \mathrm{prefix}(L_n)$ and $L(M) = L_n$, $P$ is not empty. Moreover, by the previous discussions, $P$ does not include any state in $T$.

We define another NFA $M'$ by removing the set of states in $T$ from the state set of $M$ and let $P$ be the new set of starting states, whereas the set of final states is the set of final states of $M$ minus the set of states in $T$. By the facts that $M$ accepts $L_n$ and $z^{-1}(L_n) = L_n$, $M'$ must also accept $L_n$. But this is a contradiction since $M'$ is now a PNA accepting $L_n$ with a smaller number of states than $M$. Therefore, $T$ cannot have less than $2^n - 1$ states. Hence, $M$ has at least $2^n - 1$ states.

Remark. We can prove the same results for the related family of automata $B_n$, defined as the same as $A_n$ except that all states in $B_n$ are final.

Appendix (Proof of Lemma 1).

Lemma 1. For any $P \subseteq Q$, we have
(1) $\delta(P, w_P) = P$;
(2) for any $q \in P$, $\delta(q, w_P) \supseteq \{q\}$;
(3) $\delta(Q - P, w_P) = \emptyset$.

Proof. Given a state $q_i \in Q$, we define $\mathrm{shift}(q_i)$ to be $q_{i+1}$ if $1 \le i \le n - 1$, and $q_1$ if $i = n$. Given $P \subseteq Q$, we extend the definition of shift such that $\mathrm{shift}(P) = \{\mathrm{shift}(q) \mid q \in P\}$. Moreover, for any $q \in Q$ and $P \subseteq Q$, we define $\mathrm{shift}^0(q) = q$ and $\mathrm{shift}^0(P) = P$. Note that $\mathrm{shift}^n(q) = q$ and $\mathrm{shift}^n(P) = P$.

We begin by proving part (1) of the lemma that $\delta(P, w_P) = P$. Observe that $\delta(P, w_1) = P$. This is because if $q_1 \in P$, then $w_1 = \epsilon$ and $\delta(P, w_1) = \delta(P, \epsilon) = P$. Otherwise if $q_1 \notin P$, then $w_1 = 1$ and $\delta(P, w_1) = \delta(P, 1) = P$ since $q_1 \notin P$. Therefore, $\delta(P, w_P) = \delta(P, 0w_n\,0w_{n-1} \cdots 0w_1)$.

We want to show by induction that $\delta(P, 0w_n\,0w_{n-1} \cdots 0w_{n-i+1}) = \mathrm{shift}^i(P)$ for $0 \le i \le n$. Then we are done since $\delta(P, 0w_n\,0w_{n-1} \cdots 0w_1) = \mathrm{shift}^n(P) = P$.

Base $i = 0$. $\delta(P, \epsilon) = P = \mathrm{shift}^0(P)$.

Induction hypothesis. Assume that the statement is true for $0 \le k \le n - 1$.

Induction step. By induction, we have $\delta(P, 0w_n\,0w_{n-1} \cdots 0w_{n-k+1}\,0w_{n-k}) = \delta(\delta(P, 0w_n\,0w_{n-1} \cdots 0w_{n-k+1}),$
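$0w_{n-k}) = \delta(\mathrm{shift}^k(P), 0w_{n-k})$. The induction proof is completed if we can verify that $\delta(\mathrm{shift}^k(P), 0w_{n-k}) = \mathrm{shift}^{k+1}(P)$.

Case 1 ($q_{n-k} \in P$). Then $w_{n-k} = \epsilon$ and $q_n \in \mathrm{shift}^k(P)$. Thus, $\delta(\mathrm{shift}^k(P), 0w_{n-k}) = \delta(\mathrm{shift}^k(P), 0) = \mathrm{shift}^{k+1}(P)$ since $q_n \in \mathrm{shift}^k(P)$.

Case 2 ($q_{n-k} \notin P$). Then $w_{n-k} = 1$ and $q_n \notin \mathrm{shift}^k(P)$. Thus, $\delta(\mathrm{shift}^k(P), 0w_{n-k}) = \delta(\mathrm{shift}^k(P), 01) = \mathrm{shift}^{k+1}(P)$ since $q_n \notin \mathrm{shift}^k(P)$.

We finish the proof of part (1) of the lemma.

Next, we prove part (2) of the lemma that for any $q \in P$, $\delta(q, w_P) \supseteq \{q\}$. Observe that $\delta(q, w_1) = \{q\}$. This is because if $q_1 \in P$, then $w_1 = \epsilon$ and $\delta(q, w_1) = \delta(q, \epsilon) = \{q\}$. Otherwise if $q_1 \notin P$, then $w_1 = 1$ and $\delta(q, w_1) = \delta(q, 1) = \{q\}$ since $q \ne q_1$ by the facts that $q_1 \notin P$ and $q \in P$. Therefore, $\delta(q, w_P) = \delta(q, 0w_n\,0w_{n-1} \cdots 0w_1)$.

We want to show by induction that $\delta(q, 0w_n\,0w_{n-1} \cdots 0w_{n-i+1}) \supseteq \mathrm{shift}^i(\{q\})$ for $0 \le i \le n$. Then we are done since $\delta(q, 0w_n\,0w_{n-1} \cdots 0w_1) \supseteq \mathrm{shift}^n(\{q\}) = \{q\}$.

Base $i = 0$. $\delta(q, \epsilon) = \{q\} = \mathrm{shift}^0(\{q\})$. Thus, $\delta(q, \epsilon) \supseteq \mathrm{shift}^0(\{q\})$.

Induction hypothesis. Assume that the statement is true for $0 \le k \le n - 1$.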

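Induction step. By induction, we have $\delta(q, 0w_n\,0w_{n-1} \cdots 0w_{n-k+1}\,0w_{n-k}) = \delta(\delta(q, 0w_n\,0w_{n-1} \cdots 0w_{n-k+1}), 0w_{n-k}) \supseteq \delta(\mathrm{shift}^k(\{q\}), 0w_{n-k})$. The induction proof is complete if we can verify that $\delta(\mathrm{shift}^k(\{q\}), 0w_{n-k}) \supseteq \mathrm{shift}^{k+1}(\{q\})$.

Case 1 ($q_n = \mathrm{shift}^k(q)$). Then $w_{n-k} = \epsilon$ since $q_{n-k} = q \in P$. Thus, $\delta(\mathrm{shift}^k(\{q\}), 0w_{n-k}) = \delta(\mathrm{shift}^k(\{q\}), 0) = \delta(q_n, 0) = \{q_1\} = \mathrm{shift}(\{q_n\}) = \mathrm{shift}(\mathrm{shift}^k(\{q\})) = \mathrm{shift}^{k+1}(\{q\})$. Hence, $\delta(\mathrm{shift}^k(\{q\}), 0w_{n-k}) \supseteq \mathrm{shift}^{k+1}(\{q\})$.

Case 2 ($q_n \ne \mathrm{shift}^k(q)$). Then no matter whether $w_{n-k} = \epsilon$ or $w_{n-k} = 1$, we always have $\delta(\mathrm{shift}^k(\{q\}), 0w_{n-k}) \supseteq \mathrm{shift}^{k+1}(\{q\})$.

We finish the proof of part (2) of the lemma.

Finally, we prove part (3) of the lemma that $\delta(Q - P, w_P) = \emptyset$. We claim that $\delta(Q - P, w_1\,0w_n\,0w_{n-1} \cdots 0w_2) = \emptyset$. Hence, $\delta(Q - P, w_P) = \delta(Q - P, w_1\,0w_n\,0w_{n-1} \cdots 0w_2\,0w_1) = \delta(\delta(Q - P, w_1\,0w_n\,0w_{n-1} \cdots 0w_2), 0w_1) = \delta(\emptyset, 0w_1) = \emptyset$.

First observe that $\delta(Q - P, w_1) = Q - P - \{q_1\}$. This is because if $q_1 \in P$ then $w_1 = \epsilon$ and $q_1 \notin Q - P$. Thus $\delta(Q - P, w_1) = \delta(Q - P, \epsilon) = Q - P = Q - P - \{q_1\}$ since $q_1 \notin Q - P$. Otherwise if $q_1 \notin P$ then $w_1 = 1$. Thus $\delta(Q - P, w_1) = \delta(Q - P, 1) = Q - P - \{q_1\}$.

Next we want to show by induction that for $1 \le i \le n$, $\delta(Q - P - \{q_1\}, 0w_n\,0w_{n-1} \cdots 0w_{n-i+2}) = \mathrm{shift}^{i-1}(Q - P) - \{q_1, q_2, \ldots, q_i\}$. Then we are done since by taking $i = n$, we have $\delta(Q - P - \{q_1\}, 0w_n\,0w_{n-1} \cdots 0w_2) = \mathrm{shift}^{n-1}(Q - P) - \{q_1, q_2, \ldots, q_n\} = \emptyset$.

Base $i = 1$. $\delta(Q - P - \{q_1\}, \epsilon) = Q - P - \{q_1\} = \mathrm{shift}^0(Q - P) - \{q_1\}$.

Induction hypothesis. Assume that the statement is true for $1 \le k \le n - 1$.

Induction step. By induction, we have $\delta(Q - P - \{q_1\}, 0w_n\,0w_{n-1} \cdots 0w_{n-k+2}\,0w_{n-k+1}) = \delta(\mathrm{shift}^{k-1}(Q - P) - \{q_1, q_2, \ldots, q_k\}, 0w_{n-k+1})$. The induction proof is completed if we can verify that $\delta(\mathrm{shift}^{k-1}(Q - P) - \{q_1, q_2, \ldots, q_k\}, 0w_{n-k+1}) = \mathrm{shift}^k(Q - P) - \{q_1, q_2, \ldots, q_k, q_{k+1}\}$.

Case 1 ($q_{n-k+1} \in P$). Then $w_{n-k+1} = \epsilon$ and $q_n \notin \mathrm{shift}^{k-1}(Q - P)$. Thus, $\mathrm{shift}^{k-1}(Q - P) = \mathrm{shift}^{k-1}(Q - P) - \{q_n\}$. Hence, $\delta(\mathrm{shift}^{k-1}(Q - P) - \{q_1, q_2, \ldots, q_k\}, 0w_{n-k+1}) = \delta(\mathrm{shift}^{k-1}(Q - P) - \{q_n, q_1, q_2, \ldots, q_k\}, 0) = \mathrm{shift}^k(Q - P) - \{q_1, q_2, \ldots, q_k, q_{k+1}\}$.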
Case 2 ($q_{n-k+1} \notin P$). Then $w_{n-k+1} = 1$. Thus, $\delta(\mathrm{shift}^{k-1}(Q - P) - \{q_1, q_2, \ldots, q_k\}, 0w_{n-k+1}) = \delta(\mathrm{shift}^{k-1}(Q - P) - \{q_1, q_2, \ldots, q_k\}, 01) = \mathrm{shift}^k(Q - P) - \{q_1, q_2, \ldots, q_k, q_{k+1}\}$.

We finish the proof of the claim and hence part (3) of the lemma.

Acknowledgments. The author thanks Andreas Weber for many fruitful discussions and comments throughout this work. He also thanks Jonathan Goldstine and Detlef Wotschke for their valuable discussions.

REFERENCES

[Ha69] F. Harary, Graph Theory, Addison-Wesley, Reading, MA, 1969.
[HU79] J. Hopcroft and J. Ullman, Introduction to Automata Theory, Languages and Computation, Addison-Wesley, Reading, MA, 1979.
[IR86] O. Ibarra and B. Ravikumar, On sparseness, ambiguity and other decision problems for acceptors and transducers, in Proc. 3rd Annual Symposium on Theoretical Aspects of Computer Science, Orsay, France, Lecture Notes in Computer Science 210, 1986, pp. 171-179.
[MF71] A. Meyer and M. Fischer, Economy of description by automata, grammars, and formal systems, in Proc. 12th Symposium on Switching and Automata Theory, 1971, pp. 188-191.
[Mo71] F. Moore, On the bounds for state-set size in the proofs of equivalence between deterministic, nondeterministic, and two-way finite automata, IEEE Trans. Comput., 20 (1971), pp. 1211-1214.

[RI89] B. Ravikumar and O. Ibarra, Relating the type of ambiguity of finite automata to the succinctness of their representation, SIAM J. Comput., 18 (1989), pp. 1263-1282.
[Sc78] E. Schmidt, Succinctness of Descriptions of Context-Free, Regular, and Finite Languages, Ph.D. thesis, Cornell University, Ithaca, NY, 1978.
[SH85] R. Stearns and H. Hunt, On the equivalence and containment problems for unambiguous regular expressions, regular grammars and finite automata, SIAM J. Comput., 14 (1985), pp. 598-611.