arxiv: v1 [cs.fl] 5 Dec 2009


Quotient Complexity of Closed Languages

Janusz Brzozowski (1), Galina Jirásková (2), and Chenglong Zou (1)

arXiv:0912.1034v1 [cs.FL] 5 Dec 2009

(1) David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada N2L 3G1 {brzozo@,c2zou@student.math.}uwaterloo.ca
(2) Mathematical Institute, Slovak Academy of Sciences, Grešákova 6, 040 01 Košice, Slovakia {jiraskov@saske.sk}

This work was supported by the Natural Sciences and Engineering Research Council of Canada grant OGP0000871 and by VEGA grant 2/0111/09.

Abstract. A language $L$ is prefix-closed if, whenever a word $w$ is in $L$, then every prefix of $w$ is also in $L$. We define suffix-, factor-, and subword-closed languages in the same way, where by subword we mean subsequence. We study the quotient complexity (usually called state complexity) of operations on prefix-, suffix-, factor-, and subword-closed languages. We find tight upper bounds on the complexity of the prefix-, suffix-, factor-, and subword-closure of arbitrary languages, and on the complexity of boolean operations, concatenation, star and reversal in each of the four classes of closed languages. We show that repeated application of positive closure and complement to a closed language results in at most four distinct languages, while Kleene closure and complement gives at most eight languages.

Keywords: automaton, closed, factor, language, prefix, quotient, state complexity, subword, suffix, regular operation, upper bound

1 Introduction

The state complexity of a regular language $L$ is the number of states in the minimal deterministic finite automaton (dfa) recognizing $L$. The state complexity of an operation $f(K,L)$ (or $g(L)$) in a subclass $C$ of regular languages is the maximal state complexity of the language $f(K,L)$ (or $g(L)$), when $K$ and $L$ range over all languages in $C$. For a detailed discussion of general issues of state complexity see [4,22] and the reference lists in those papers. In 1994 the complexity of concatenation, star, left and right quotients, reversal, intersection and union in regular languages was examined in detail in [23].

The complexity of operations was also considered in several subclasses of regular languages: finite [22], unary [18,23], prefix-free [13] and suffix-free [12], and ideal languages [6]. These studies show that the complexity can be significantly lower in a subclass than in the general case. Here we examine state complexity in the classes of prefix-, suffix-, factor-, and subword-closed regular languages.

There are several reasons for considering closed languages. They appear often in theoretical computer science. Subword-closed languages were studied in 1969 [11], and also in 1973 [20]. Suffix-closed languages were considered in 1974 [10], and later in [9,14,21]. Factor-closed languages, also called factorial, have received some attention, for example, in [2,16]. Subword-closed languages were studied in [17]. Prefix-closed languages play a role in predictable semiautomata [7]. All four classes of closed languages were examined in [1], and decision problems for closed languages were studied in [8].

A language $L$ is a left ideal (respectively, right, two-sided, all-sided ideal) if $L = \Sigma^* L$ (respectively, $L = L\Sigma^*$, $L = \Sigma^* L \Sigma^*$ and $L = \Sigma^* \shuffle L$, where $\Sigma^* \shuffle L$ is the shuffle of $\Sigma^*$ with $L$). Closed languages are related to ideal languages as follows [1]: For every non-empty $L$, $L$ is a right (left, two-sided, all-sided) ideal if and only if its complement $\overline{L}$ is a prefix (suffix, factor, subword)-closed language. Closed languages are defined by the binary relations "is a prefix of" (respectively, "is a suffix of", "is a factor of", "is a subword of") [1], and are special cases of convex languages [1,20]. The fact that the four classes of closed languages are related to each other permits us to obtain many complexity results using similar methods.

2 Quotient Complexity

If $\Sigma$ is a non-empty finite alphabet, then $\Sigma^*$ is the free monoid generated by $\Sigma$. A word is any element of $\Sigma^*$, and $\varepsilon$ is the empty word. The length of a word $w \in \Sigma^*$ is $|w|$. A language over $\Sigma$ is any subset of $\Sigma^*$. The cardinality of a set $S$ is denoted by $|S|$. If $w = uxv$ for some $u,v,x \in \Sigma^*$, then $u$ is a prefix of $w$, $v$ is a suffix of $w$, and $x$ is a factor of $w$. If $w = w_0 a_1 w_1 \cdots a_n w_n$, where $a_1,\ldots,a_n \in \Sigma$ and $w_0,\ldots,w_n \in \Sigma^*$, then $v = a_1 \cdots a_n$ is a subword of $w$. A language $L$ is prefix-closed if $w \in L$ implies that every prefix of $w$ is also in $L$. In the same way, we define suffix-, factor-, and subword-closed languages. A language is closed if it is prefix-, suffix-, factor-, or subword-closed.
The following set operations are defined on languages: complement ($\overline{L} = \Sigma^* \setminus L$), union ($K \cup L$), intersection ($K \cap L$), difference ($K \setminus L$), and symmetric difference ($K \oplus L$). A general boolean operation with two arguments is denoted by $K \circ L$. We also define the product, usually called concatenation or catenation ($KL = \{w \in \Sigma^* \mid w = uv,\ u \in K,\ v \in L\}$), (Kleene) star ($K^* = \bigcup_{i \ge 0} K^i$), and positive closure ($K^+ = \bigcup_{i \ge 1} K^i$). The reverse $w^R$ of a word $w \in \Sigma^*$ is defined as follows: $\varepsilon^R = \varepsilon$, and $(wa)^R = a w^R$. The reverse of a language $L$ is denoted by $L^R$ and is defined as $L^R = \{w^R \mid w \in L\}$.

Regular languages over $\Sigma$ are languages that can be obtained from the set of basic languages $\{\emptyset, \{\varepsilon\}\} \cup \{\{a\} \mid a \in \Sigma\}$, using a finite number of operations of union, product and star. Such languages are usually denoted by regular expressions. If $E$ is a regular expression, then $L(E)$ is the language denoted by that expression. For example, $E = (\varepsilon \cup a)^* b$ denotes $L = L(E) = (\{\varepsilon\} \cup \{a\})^* \{b\}$. We usually do not distinguish notationally between regular languages and regular expressions; the meaning is clear from the context.
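As an illustration of the four closedness notions defined above, the following is a minimal sketch for finite languages represented as Python sets of strings. The helper names (`prefixes`, `suffixes`, `factors`, `subwords`, `closure`, `is_closed`) are hypothetical and chosen only for this example; they are not taken from the paper.

```python
from itertools import combinations

def prefixes(w):
    # All u such that w = u v.
    return {w[:i] for i in range(len(w) + 1)}

def suffixes(w):
    # All v such that w = u v.
    return {w[i:] for i in range(len(w) + 1)}

def factors(w):
    # All x such that w = u x v (contiguous pieces).
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

def subwords(w):
    # All subsequences of w (not necessarily contiguous).
    return {"".join(w[i] for i in idx)
            for r in range(len(w) + 1)
            for idx in combinations(range(len(w)), r)}

def closure(language, parts):
    # Closure of a finite language under the relation described by `parts`.
    return set().union(*(parts(w) for w in language))

def is_closed(language, parts):
    # A language is closed exactly when it equals its own closure.
    return closure(language, parts) == set(language)
```

For instance, `{"", "a", "ab"}` is prefix-closed but not suffix-closed, because the suffix `"b"` of `"ab"` is missing. The word `"ac"` is a subword of `"abc"` without being a factor of it, which is why subword-closed languages form a proper subclass of factor-closed ones.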

A deterministic finite automaton (dfa) is a tuple $D = (Q,\Sigma,\delta,q_0,F)$, where $Q$ is a set of states, $\Sigma$ is the alphabet, $\delta : Q \times \Sigma \to Q$ is the transition function, $q_0$ is the initial state, and $F$ is the set of final or accepting states. A nondeterministic finite automaton (nfa) is a tuple $N = (Q,\Sigma,\eta,Q_0,F)$, where $Q$, $\Sigma$ and $F$ are as in a dfa, $\eta : Q \times \Sigma \to 2^Q$ is the transition function and $Q_0 \subseteq Q$ is the set of initial states. If $\eta$ also allows $\varepsilon$, i.e., $\eta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$, we call $N$ an $\varepsilon$-nfa.

Our approach to quotient complexity follows closely that of [4]. Since state complexity is a property of a language, it is more appropriately defined in language-theoretic terms. The left quotient, or simply quotient, of a language $L$ by a word $w$ is the language $L_w = \{x \in \Sigma^* \mid wx \in L\}$. The quotient complexity of $L$ is the number of distinct quotients of $L$, and is denoted by $\kappa(L)$.

Quotients of regular languages [3,4] can be computed as follows. First, the $\varepsilon$-function $L^\varepsilon$ of a regular language $L$ is $L^\varepsilon = \emptyset$ if $\varepsilon \notin L$ and $L^\varepsilon = \varepsilon$ if $\varepsilon \in L$. The quotient by a letter $a \in \Sigma$ is computed by structural induction: $b_a = \emptyset$ if $b \in \{\emptyset, \varepsilon\}$ or $b \in \Sigma$ and $b \ne a$, and $b_a = \varepsilon$ if $b = a$; $(\overline{L})_a = \overline{L_a}$; $(K \cup L)_a = K_a \cup L_a$; $(KL)_a = K_a L \cup K^\varepsilon L_a$; $(K^*)_a = K_a K^*$. The quotient by a word $w \in \Sigma^*$ is computed by induction on the length of $w$: $L_\varepsilon = L$; $L_w = L_a$ if $w = a \in \Sigma$; $L_{wa} = (L_w)_a$. A quotient $L_w$ is accepting if $\varepsilon \in L_w$; otherwise it is rejecting.

The quotient automaton of a regular language $L$ is $D = (Q,\Sigma,\delta,q_0,F)$, where $Q = \{L_w \mid w \in \Sigma^*\}$, $\delta(L_w,a) = L_{wa}$, $q_0 = L_\varepsilon = L$, and $F = \{L_w \mid (L_w)^\varepsilon = \varepsilon\}$. This is the minimal dfa accepting $L$; hence the quotient complexity of $L$ is equal to the state complexity of $L$. However, there are some advantages to using quotients [4]. If a language $L$ has the empty quotient, we say that $L$ has $\emptyset$. To simplify the notation, we write $(L_w)^\varepsilon$ as $L_w^\varepsilon$. Whenever convenient, the following formulas are used to establish upper bounds on quotient complexity:

Proposition 1 ([3,4]). If $K$ and $L$ are regular languages, then

$(\overline{L})_w = \overline{L_w}; \quad (K \circ L)_w = K_w \circ L_w.$  (1)

$(KL)_w = K_w L \cup K^\varepsilon L_w \cup \Big( \bigcup_{w=uv,\ u,v \in \Sigma^+} K_u^\varepsilon L_v \Big).$  (2)

$(L^*)_\varepsilon = \varepsilon \cup LL^*, \quad (L^*)_w = \Big( L_w \cup \bigcup_{w=uv,\ u,v \in \Sigma^+} (L^*)_u^\varepsilon L_v \Big) L^* \text{ for } w \in \Sigma^+.$  (3)

3 Closure Operations

We now turn to the closure of languages under binary relations. All the relations that we study in this paper are partial orders. Let $\trianglelefteq$ be a partial order on $\Sigma^*$; the $\trianglelefteq$-closure of a language $L$ is the language ${}_{\trianglelefteq}L = \{x \in \Sigma^* \mid x \trianglelefteq w \text{ for some } w \in L\}$. We use $\le$, $\preceq$, $\sqsubseteq$, $\Subset$ for the relations "is a prefix of", "is a suffix of", "is a factor of", "is a subword of", respectively.

Suppose $L$ is an arbitrary regular language of complexity $n$. If $n = 1$ then $L = \emptyset$ or $L = \Sigma^*$, and each closure is $L$. We show that the worst-case complexity for prefix-closure is $n$, for suffix-closure it is $2^n - 1$, and for factor-closure it is $2^{n-1}$. These bounds are tight for binary languages. Subword-closure of languages was previously studied by Okhotin [17] under the name scattered subwords, but tight upper bounds were not established. Our next theorem solves this problem.

Theorem 1 (Closure Operations). Let $L$ be a regular language with $\kappa(L) = n \ge 2$. Let ${}_{\le}L$, ${}_{\preceq}L$, ${}_{\sqsubseteq}L$, ${}_{\Subset}L$ be the prefix-closure, suffix-closure, factor-closure, and subword-closure of $L$, respectively. Then
1. $\kappa({}_{\le}L) \le n$.
2. $\kappa({}_{\preceq}L) \le 2^n - 1$ if $L$ does not have $\emptyset$, and $\kappa({}_{\preceq}L) \le 2^{n-1}$ otherwise.
3. $\kappa({}_{\sqsubseteq}L) \le 2^{n-1}$.
4. $\kappa({}_{\Subset}L) \le 2^{n-2} + 1$.
The last bound is tight if $|\Sigma| \ge n - 2$; the other bounds are tight if $|\Sigma| \ge 2$.

Proof. 1. Given a language $L$ recognized by a dfa $D$, to get the dfa for its prefix-closure ${}_{\le}L$, we need only make each non-empty state accepting. Hence $\kappa({}_{\le}L) \le n$. For tightness, consider the language $L = \{a^i \mid i \le n - 2\}$. We have ${}_{\le}L = L$ and $\kappa({}_{\le}L) = n$.

2. Having a quotient automaton of a language $L$, we can construct an nfa for its suffix-closure by making each non-empty state initial. The equivalent dfa has at most $2^n - 1$ states if $L$ does not have the empty quotient (the empty set of states cannot be reached), and at most $2^{n-1}$ states otherwise. To prove tightness, consider the language $L$ defined by the quotient automaton shown in Fig. 1. Construct an nfa for the suffix-closure of $L$ by making all states initial. Let us show that the corresponding subset automaton has $2^n - 1$ reachable and pairwise inequivalent states.

Fig. 1. Quotient automaton of a language $L$ which does not have $\emptyset$.

We prove reachability by induction on the size of subsets. The basis, $|S| = n$, holds true since $\{0,1,\ldots,n-1\}$ is the initial state. Assume that each set of size $k$ is reachable, and let $S$ be a set of size $k - 1$. If $S$ contains state 0 but does not contain state 1, then it can be reached from the set $S \cup \{1\}$ of size $k$ by $b$. If $S$ contains both 0 and 1, then there is a state $i$ such that $i \in S$ and $i + 1 \notin S$.
Then $S$ can be reached from $\{(s - i) \bmod n \mid s \in S\}$ by $a^i$. The latter set contains 0 and does not contain 1, and so is reachable. If a non-empty $S$ does not contain 0, then it can be reached from $\{s - \min S \mid s \in S\}$, which contains 0, by $a^{\min S}$. To prove inequivalence notice that the word $a^{n-i}$ is accepted by the nfa only from state $i$ for all $i = 0,1,\ldots,n-1$. It follows that all the states in the subset automaton are pairwise inequivalent.
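The counting argument above can be checked mechanically for small $n$. The sketch below rests on an assumption: the diagram of Fig. 1 is not available in this transcript, so the automaton is reconstructed from the proof text alone, with $a$ mapping $i$ to $(i+1) \bmod n$, $b$ mapping state 1 to state 0 and fixing every other state, and state 0 as the only accepting state. All function names are hypothetical.

```python
def witness_dfa(n):
    # Assumed reconstruction of Fig. 1 from the proof text, not the figure itself.
    delta = {}
    for i in range(n):
        delta[i, "a"] = (i + 1) % n          # a is a cyclic shift
        delta[i, "b"] = 0 if i == 1 else i   # b merges state 1 into state 0
    return delta, {0}

def suffix_closure_complexity(n):
    delta, final = witness_dfa(n)

    def step(S, x):
        return frozenset(delta[s, x] for s in S)

    # Subset construction for the nfa in which every state is initial.
    start = frozenset(range(n))
    seen, todo = {start}, [start]
    while todo:
        S = todo.pop()
        for x in "ab":
            T = step(S, x)
            if T not in seen:
                seen.add(T)
                todo.append(T)

    # Moore-style partition refinement counts pairwise inequivalent subsets.
    block = {S: int(bool(S & final)) for S in seen}
    while True:
        ids = {}
        new = {}
        for S in seen:
            sig = (block[S],) + tuple(block[step(S, x)] for x in "ab")
            new[S] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = new
```

Under this assumed automaton, `suffix_closure_complexity(n)` agrees with the bound $2^n - 1$ of Theorem 1 for the small values of $n$ exercised in the checks. The refinement loop stops as soon as a round produces no new classes, at which point the partition is stable.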

Now consider the case where a language has $\emptyset$. Let $L$ be the language defined by the quotient automaton shown in Fig. 2. We first remove state $n - 1$ and all transitions going to this state, and then construct an nfa as above. The proof of reachability of all non-empty subsets of $\{0,1,\ldots,n-2\}$ is the same as in the previous case. The empty set can be reached from $\{0\}$ by $b$. For inequivalence, $(ab)^n$ is accepted only from 0, and $a^{n-1-i}(ab)^n$ only from $i$ for $i = 1,2,\ldots,n-2$.

Fig. 2. Quotient automaton of a language $L$ which has $\emptyset$.

3. Suppose we have the quotient automaton of a language $L$. To find an nfa for the factor-closure ${}_{\sqsubseteq}L$, we make all non-empty states of the quotient automaton both accepting and initial and delete the empty state. Hence the bound is $2^{n-1}$. The language $L$ defined by the quotient automaton shown in Fig. 2 meets the bound.

4. To get an $\varepsilon$-nfa for the subword-closure ${}_{\Subset}L$ from the quotient automaton of $L$, we remove the empty state (if there is no empty state, then ${}_{\Subset}L = \Sigma^*$), and add an $\varepsilon$-transition from state $p$ to state $q$ whenever there is a transition from $p$ to $q$ in the quotient automaton. Since the initial state can reach every non-empty state through $\varepsilon$-transitions, no other subset containing the initial state can be reached. Hence there are at most $2^{n-2} + 1$ reachable subsets.

To prove tightness, if $n = 2$, let $\Sigma = \{a,b\}$; then $L = a^*$ meets the bound. If $n \ge 3$, let $\Sigma = \{a_1,\ldots,a_{n-2}\}$, and $L = \bigcup_{a_i \in \Sigma} a_i(\Sigma \setminus \{a_i\})^*$. Thus the language $L$ consists of all words over $\Sigma$ in which the first letter occurs exactly once. Let $K$ be the subword-closure of $L$. Then $K = L \cup \{w \in \Sigma^* \mid \text{at least one letter is missing in } w\}$. For each boolean vector $b = (b_1,b_2,\ldots,b_{n-2})$, define the word $w(b) = w_1 w_2 \cdots w_{n-2}$, in which $w_i = \varepsilon$ if $b_i = 0$ and $w_i = a_i$ if $b_i = 1$. Now consider the word $\varepsilon$, and each word $a_1 w(b)$. Let us show that all quotients of $K$ by these $2^{n-2} + 1$ words are distinct. For each binary vector $b$, we have $a_1 a_2 \cdots a_{n-2} \in K_\varepsilon \setminus K_{a_1 w(b)}$. Let $b$ and $b'$ be two different vectors with $b_i = 0$ and $b'_i = 1$.
Then we have $a_1 a_2 \cdots a_{i-1} a_{i+1} a_{i+2} \cdots a_{n-2} \in K_{a_1 w(b)} \setminus K_{a_1 w(b')}$. Thus all quotients are distinct, and so $\kappa(K) \ge 2^{n-2} + 1$.

4 Basic Operations on Closed Languages

Now we study the quotient complexity of operations on closed languages. For regular languages, the following bounds are known [23]: $mn$ for boolean operations, $m2^n - 2^{n-1}$ for product, $\frac{3}{4} \cdot 2^n$ for star, and $2^n$ for reversal. The bounds for closed languages are smaller in most cases. We also show that the bounds are tight, usually for a fixed alphabet. The bounds for boolean operations and reversal follow from the results on ideal languages [6].

Theorem 2 (Boolean Operations). If $K$ and $L$ are prefix-closed (or factor-closed or subword-closed) with $\kappa(K) = m$ and $\kappa(L) = n$, then
1. $\kappa(K \cap L) \le mn - (m + n - 2)$,
2. $\kappa(K \cup L), \kappa(K \oplus L) \le mn$,
3. $\kappa(K \setminus L) \le mn - (n - 1)$.
For suffix-closed languages, $\kappa(K \circ L) \le mn$. All bounds are tight if $|\Sigma| \ge 4$.

Proof. Recall that the complement of a prefix-closed (respectively, suffix-, factor-, or subword-closed) language is a right (respectively, left, two-sided, all-sided) ideal. We get all the results using De Morgan's laws and the results from [6].

Remark 1. If $L$ is prefix-closed, then either $L = \Sigma^*$ or $L$ has $\emptyset$ as a quotient. Moreover, each quotient of $L$ is either accepting or $\emptyset$.

Remark 2. For a suffix-closed language $L$, if $v$ is a suffix of $w$ then $L_w \subseteq L_v$. In particular, $L_w \subseteq L_\varepsilon = L$ for each word $w$ in $\Sigma^*$.

Theorem 3 (Product). Let $K$ and $L$ be closed languages with $\kappa(K) = m$ and $\kappa(L) = n$, and let $k$ be the number of accepting quotients of $K$. If $m = 1$ or $n = 1$, then $\kappa(KL) = 1$. Otherwise,
1. If $K$ and $L$ are prefix-closed, then $\kappa(KL) \le (m + 1)2^{n-2}$.
2. If $K$ and $L$ are suffix-closed, then $\kappa(KL) \le (m - k)n + k$.
3. If $K$ and $L$ are both factor- or both subword-closed, then $\kappa(KL) \le m + n - 1$.
All bounds are tight if $|\Sigma| \ge 3$.

Proof. If $m = 1$, then $K = \emptyset$ or $K = \Sigma^*$, and so $KL = \emptyset$ or, since $\varepsilon \in L$, $KL = \Sigma^*$. Thus $\kappa(KL) = 1$. The case of $n = 1$ is similar. Now let $m, n \ge 2$.

1. If $K$ and $L$ are prefix-closed, then $\varepsilon \in K$, and, by Remark 1, both languages have $\emptyset$ as a quotient. The quotient $(KL)_w$ is given by Equation (2). If $K_w$ is accepting, then $L$ is always in the union, and there are $2^{n-2}$ non-empty subsets of non-empty quotients of $L$ that can be added. Since there are $m - 1$ accepting quotients of $K$, there are $(m - 1)2^{n-2}$ such quotients of $KL$. If $K_w$ is rejecting, then $2^{n-1}$ subsets of non-empty quotients of $L$ can be added. Altogether, $\kappa(KL) \le 2^{n-1} + (m - 1)2^{n-2} = (m + 1)2^{n-2}$.

For tightness, consider prefix-closed languages $K$ and $L$ defined by the quotient automata of Fig. 3 (if $n = 2$, then $L = \{a,c\}^*$).
Construct an $\varepsilon$-nfa for the language $KL$ from these quotient automata by adding an $\varepsilon$-transition from states $q_0,q_1,\ldots,q_{m-2}$ to state 0. The initial state of the nfa is $q_0$, and the accepting states are $0,1,\ldots,n-2$. Let us show that there are $(m + 1)2^{n-2}$ reachable and pairwise inequivalent states in the corresponding subset automaton.

State $\{q_0,0\}$ is the initial state, and each state $\{q_0,0,i_1,i_2,\ldots,i_k\}$, where $1 \le i_1 < i_2 < \cdots < i_k \le n - 2$, can be reached from state $\{q_0,0,i_2 - i_1,\ldots,i_k - i_1\}$ by the word $ab^{i_1 - 1}$. For each subset $S$ of $\{0,1,\ldots,n-2\}$ containing state 0, each state $\{q_i\} \cup S$ with $1 \le i \le m - 1$ can be reached from state $\{q_0\} \cup S$ by $c^i$. If a non-empty set $S$ does not contain state 0, then state $\{q_{m-1}\} \cup S$ can be reached from state $\{q_{m-1}\} \cup \{s - \min S \mid s \in S\}$, which contains state 0, by $a^{\min S}$. State $\{q_{m-1}, n-1\}$ can be reached from state $\{q_{m-1}, n-2\}$ by $b$.

Fig. 3. Quotient automata of prefix-closed languages $K$ and $L$.

To prove inequivalence, notice that the word $b^n$ is accepted by the quotient automaton for $L$ only from state 0, and the word $a^{n-1-i}b^n$ only from state $i$ ($1 \le i \le n - 2$). It follows that two different states $\{q_{m-1}\} \cup S$ and $\{q_{m-1}\} \cup T$ are inequivalent, and hence that states $\{q_i\} \cup S$ and $\{q_i\} \cup T$ are inequivalent as well. States $\{q_i\} \cup S$ and $\{q_j\} \cup T$ with $i < j$ can be distinguished by $c^{m-1-j} b^n a b^n$. Hence the subset automaton has $(m + 1)2^{n-2}$ reachable and pairwise inequivalent states, and so $\kappa(KL) = (m + 1)2^{n-2}$.

2. If $K$ and $L$ are suffix-closed, then, by Remark 2, for each word $w$ we have

$(KL)_w = K_w L \cup K^\varepsilon L_w \cup \Big( \bigcup_{w=uv,\ u,v \in \Sigma^+} K_u^\varepsilon L_v \Big) = K_w L \cup L_x,$

for some suffix $x$ of $w$. If $K_w$ is a rejecting quotient, there are at most $(m - k)n$ such quotients. If $K_w$ is accepting, then $\varepsilon \in K_w$, and since $L_x \subseteq L_\varepsilon = L \subseteq K_w L$, we have $(KL)_w = K_w L$. There are at most $k$ such quotients. Therefore there are at most $(m - k)n + k$ quotients in total.

To prove tightness, let $K$ and $L$ be ternary suffix-closed languages defined by the quotient automata shown in Fig. 4.

Fig. 4. Quotient automata of suffix-closed languages $K$ and $L$.

Consider the words $\varepsilon = a^0 b^0$, and $a^i b^j$ with $1 \le i \le m - 1$ and $0 \le j \le n - 1$. Let us show that all quotients of $KL$ by these words are distinct. Let $(i,j) \ne (k,l)$, and let $x = a^i b^j$ and $y = a^k b^l$. If $i < k$, take $z = a^{m-1-k} b^n$. Then $xz$ is in $KL$, while $yz$ is not, and so $z \in (KL)_x \setminus (KL)_y$. If $i = k$ and $j < l$, take $z = a^m b^{n-1-l}$. We again have $z \in (KL)_x \setminus (KL)_y$. Thus the language $KL$ has at least $(m - 1)n + 1$ distinct quotients, and so $\kappa(KL) = (m - 1)n + 1$. Notice that, if the quotients $K_{a^i}$ with $0 \le i \le k - 1$ are accepting, then the resulting product has quotient complexity $(m - k)n + k$.

3. It suffices to derive the bound for factor-closed languages, since every subword-closed language is also factor-closed. Since factor-closed languages are suffix-closed, $\kappa(KL) \le (m - k)n + k$. The language $K$ has at most one rejecting quotient, because it is prefix-closed. Thus, $k = m - 1$ and $\kappa(KL) \le m + n - 1$. For tightness, consider binary subword-closed languages $K = \{w \in \{a,b\}^* \mid a^{m-1} \text{ is not a subword of } w\}$ and $L = \{w \in \{a,b\}^* \mid b^{n-1} \text{ is not a subword of } w\}$ with $\kappa(K) = m$ and $\kappa(L) = n$. Consider the word $w = a^{m-1} b^{n-1}$. This word is not in the product $KL$. However, removing any non-empty subword from $w$ results in a word in $KL$. Therefore, $\kappa(KL) \ge m + n - 1$.

Theorem 4 (Star). Let $L$ be a closed language with $\kappa(L) = n \ge 2$.
1. If $L$ is prefix-closed, then $\kappa(L^*) \le 2^{n-2} + 1$.
2. If $L$ is suffix-closed, then $\kappa(L^*) \le n$ if $L = L^*$ and $\kappa(L^*) \le n - 1$ if $L \ne L^*$.
3. If $L$ is factor- or subword-closed, then $\kappa(L^*) \le 2$.
If $\kappa(L) = 1$, then $\kappa(L^*) \le 2$. All bounds are tight if $|\Sigma| \ge 2$.

Proof. 1. For every non-empty word $w$, the quotient $(L^*)_w$ is given by Equation (3). If $L$ is prefix-closed, then so is $L^*$ and $(L^*)_w$. Thus, if $(L^*)_w$ is non-empty, then it must contain the empty word. Hence $(L^*)_w \supseteq L^* \supseteq LL^* \supseteq L$. Since the empty quotient of $L$ and $L$ itself are always contained in every non-empty quotient of $L^*$, there are at most $2^{n-2}$ non-empty quotients of $L^*$. Since there is at most one empty quotient, there are at most $2^{n-2} + 1$ quotients in total. The quotient $(L^*)_\varepsilon$ has already been counted, since $L$ is closed and $\varepsilon \in L$ implies $(L^*)_\varepsilon = LL^*$, which has the form of Equation (3).
For $n = 1$ and $n = 2$, the bound 2 is met by $L = \emptyset$ and $L = \varepsilon$, respectively. Now let $n \ge 3$ and let $L$ be the prefix-closed language defined by the quotient automaton shown in Fig. 5; transitions not depicted in the figure go to state $n - 1$. Construct an $\varepsilon$-nfa for $L^*$ by removing state $n - 1$ and adding an $\varepsilon$-transition from all the remaining states to the initial state. Let us show that $2^{n-2} + 1$ states are reachable and pairwise inequivalent in the corresponding subset automaton.

Fig. 5. Quotient automaton of prefix-closed language $L$.

We first prove that each subset of $\{0,1,\ldots,n-2\}$ containing state 0 is reachable. The proof is by induction on the size of the subsets. The basis, $|S| = 1$, holds true since $\{0\}$ is the initial state of the subset automaton. Assume that each set of size $k$ containing state 0 is reachable, and let $S = \{0,i_1,i_2,\ldots,i_k\}$, where $0 < i_1 < i_2 < \cdots < i_k \le n - 2$, be a set of size $k + 1$. Then $S$ can be reached from the set $\{0,i_2 - i_1,\ldots,i_k - i_1\}$ of size $k$ by $ab^{i_1 - 1}$. Since the latter set is reachable by the induction hypothesis, the set $S$ is reachable as well. The empty set can be reached from $\{0\}$ by $b$, and we have $2^{n-2} + 1$ reachable states. To prove inequivalence of these states notice that the word $b^{n-3}$ is accepted by the nfa only from state 1, and each word $b^{n-2-i} b^{n-3}$ ($2 \le i \le n - 2$), only from state $i$. It follows that all the states in the subset automaton are pairwise inequivalent.

2. For a non-empty suffix-closed language $L$, the quotient $(L^*)_\varepsilon$ is $LL^*$, which is of the same form as the quotients by a non-empty word $w$ given by Equation (3), $(L^*)_w = (L_w \cup L_{v_1} \cup \cdots \cup L_{v_k})L^*$, where the $v_i$ are suffixes of $w$, and $v_k$ is the shortest. By Remark 2, if $v$ is a suffix of $w$, then $L_w \subseteq L_v$. Thus the quotient becomes $(L^*)_w = L_{v_k} L^*$. There are at most $n$ such quotients.

If $L \ne L^*$ for a non-empty suffix-closed language $L$, then there must be two words $x, y$ in $L$ such that $xy \notin L$. Hence $y \in L_\varepsilon \setminus L_x$, and so $L_\varepsilon \ne L_x$. However, since $\varepsilon \in L_x$ and $L$ is suffix-closed, we have $(L^*)_\varepsilon = L^* \subseteq L_x L^* \subseteq (L^*)_x \subseteq (L^*)_\varepsilon$, and so $(L^*)_\varepsilon = (L^*)_x$. It follows that $\kappa(L^*) \le n - 1$.

For $n = 1$, $L = \emptyset$ and for $n = 2$, $L = \varepsilon$ meet the bound 2. Let $n \ge 3$. If $L = (a \cup ba^{n-2})^*$, then $L$ is suffix-closed, $\kappa(L) = n$, and $L = L^*$. If $L = \varepsilon \cup \bigcup_{i=0}^{n-3} a^i b$, then $L$ is suffix-closed, $\kappa(L) = n$, $L^* = (\bigcup_{i=0}^{n-3} a^i b)^*$, and $\kappa(L^*) = n - 1$.

3. If each letter in $\Sigma$ appears in some word of a factor-closed language $L$, then $L^* = \Sigma^*$ and $\kappa(L^*) = 1$. Otherwise, $\kappa(L^*) = 2$.
The bound is met by the subword-closed language $L = \{w \in \{a,b\}^* \mid w = a^i \text{ and } 0 \le i \le n - 2\}$.

Since the operation of reversal commutes with complementation, the following results follow from those on ideal languages in [6]:

Theorem 5 (Reversal). Let $L$ be a closed language with $\kappa(L) = n \ge 2$.
1. If $L$ is prefix-closed, then $\kappa(L^R) \le 2^{n-1}$. The bound is tight if $|\Sigma| \ge 2$.
2. If $L$ is suffix-closed, then $\kappa(L^R) \le 2^{n-1} + 1$. The bound is tight if $|\Sigma| \ge 3$.
3. If $L$ is factor-closed, then $\kappa(L^R) \le 2^{n-2} + 1$. The bound is tight if $|\Sigma| \ge 3$.
4. If $L$ is subword-closed, then $\kappa(L^R) \le 2^{n-2} + 1$. The bound is tight if $|\Sigma| \ge 2n$.
If $\kappa(L) = 1$, then $\kappa(L^R) = 1$.

Unary Languages: Unary closed languages have special properties because the product of unary languages is commutative. The classes of prefix-closed, suffix-closed, factor-closed, and subword-closed unary languages all coincide. If a unary closed language $L$ is finite, then either it is empty and has $\kappa(L) = 1$, or it has the form $\{a^i \mid i \le n - 2\}$, for some $n \ge 2$, and has $\kappa(L) = n$. If $L$ is infinite, then $L = a^*$, and $\kappa(L) = 1$. The bounds for unary languages are given in Tables 1 and 2.

5 Kuratowski Algebras Generated by Closed Regular Languages

A theorem of Kuratowski [15] states that, given a topological space, at most 14 distinct sets can be produced by repeatedly applying the operations of closure and complement to a given set. A closure operation on a set $S$ is an operation $\square : 2^S \to 2^S$ satisfying the following conditions for any subsets $X, Y$ of $S$: (1) $X \subseteq X^\square$, (2) $X \subseteq Y$ implies $X^\square \subseteq Y^\square$, (3) $X^{\square\square} \subseteq X^\square$. Kuratowski's theorem was studied in the setting of formal languages in [5]. Positive closure and Kleene closure (star) are both closure operations. It was shown in [5] that at most 10 distinct languages can be produced by repeatedly applying the operations of positive closure and complement to a given language, and at most 14 distinct languages can be produced with Kleene closure instead of positive closure. We consider here the case where the given language is closed and regular, and give upper bounds for the complexity of the resulting languages. Here we denote the complement of a language $L$ by $L^-$. Moreover, the positive closure of the complement of $L$ is denoted by $L^{-+}$, etc.

We begin with positive closure. Let $L$ be a $\trianglelefteq$-closed language not equal to $\Sigma^*$. Then $L^-$ is an ideal, and $L^{-+} = L^-$. In addition, $L^+$ is also $\trianglelefteq$-closed, so $L^{+-+} = L^{+-}$. Hence there are at most 4 distinct languages that can be produced with positive closure and complementation.

Theorem 6. The worst-case complexities in every 4-element algebra generated by a closed language $L$ with $\kappa(L) = n$ under positive closure and complement are: $\kappa(L) = \kappa(L^-) = n$, $\kappa(L^+) = \kappa(L^{+-}) = f(n)$, where $f(n)$ is: $2^{n-2} + 1$ for prefix-closed languages, $n - 1$ for suffix-closed languages, and 2 for factor- and subword-closed languages. There exist closed languages that meet these bounds.

Proof. Since $L^+ = L^*$ for a non-empty closed language we have $\kappa(L^+) = \kappa(L^*)$, and the upper bounds $f(n)$ follow from our results on the quotient complexity of the star operation; in the case of suffix-closed languages, to get a 4-element algebra we need $L \ne L^*$.
All the languages that we have used in Theorem 4 to prove tightness can be used as examples meeting the bound $f(n)$.

The case of Kleene closure is similar. Let $L$ be a $\trianglelefteq$-closed language such that $L \notin \{\emptyset, \Sigma^*\}$. Then $L^-$ is an ideal and $L^-$ does not contain $\varepsilon$. Thus $L^{-*} = L^- \cup \varepsilon$ and $L^{-*-} = L \setminus \varepsilon$, which gives at most four languages thus far. Now $L^{-*-*} = (L \setminus \varepsilon)^*$, which equals $L^*$, and $L^*$ is also $\trianglelefteq$-closed. By the previous reasoning, we have at most four additional languages, giving a total of eight languages as the upper bound. The 8-element algebras are of the form $(L,\ L^-,\ L^{-*} = L^- \cup \varepsilon,\ L^{-*-} = L \setminus \varepsilon,\ L^*,\ L^{*-},\ L^{*-*} = L^{*-} \cup \varepsilon,\ L^{*-*-} = L^* \setminus \varepsilon)$.

Theorem 7. The worst-case complexities in every 8-element algebra generated by a closed language $L$ with $\kappa(L) = n$ under Kleene closure and complement are: $\kappa(L) = \kappa(L^-) = n$, $\kappa(L^*) = \kappa(L^{*-}) = f(n)$, $\kappa(L^{*-*}) = \kappa(L^{*-*-}) = f(n) + 1$, $\kappa(L^{-*}) = \kappa(L^{-*-}) = n + 1$, where $f(n)$ is: $2^{n-2} + 1$ for prefix-closed languages, $n - 1$ for suffix-closed languages, and 2 for factor- and subword-closed languages. Moreover, there exist closed languages that meet these bounds.

Proof. Since $L^{-*-} = L \setminus \varepsilon$ and $L^{*-*-} = L^* \setminus \varepsilon$ we have $\kappa(L^{-*-}) \le n + 1$ and $\kappa(L^{*-*-}) \le f(n) + 1$. In the case of suffix-closed languages, since $L^*$ must be distinct from $L$, we have $f(n) = n - 1$ by Theorem 4.

1. Let $L$ be the prefix-closed language defined by the quotient automaton in Fig. 5; then $L$ meets the upper bound on star. Add a loop with a new letter $d$ in each state and denote the resulting language by $K$. Then $K$ is a prefix-closed language with $\kappa(K) = n$ and $\kappa(K \setminus \varepsilon) = n + 1$. Next we have $\kappa(K^*) = \kappa(L^*) = 2^{n-2} + 1$ and $\kappa(K^* \setminus \varepsilon) = 2^{n-2} + 2$.

2. Let $L = b^* \cup \bigcup_{i=1}^{n-3} b^* a^i b$. Then $L$ is a suffix-closed language with $\kappa(L) = n$ and $\kappa(L \setminus \varepsilon) = n + 1$. Next, $\kappa(L^*) = n - 1$, and $\kappa(L^* \setminus \varepsilon) = n$.

3. Let $L = \{w \in \{a,b,c\}^* \mid w \in b^* a^i \text{ and } 0 \le i \le n - 2\}$. Then $L$ is a subword-closed language with $\kappa(L) = n$ and $\kappa(L \setminus \varepsilon) = n + 1$. Next $L^* = \{a,b\}^*$, and so $\kappa(L^*) = 2$ and $\kappa(L^* \setminus \varepsilon) = 3$.

6 Conclusions

Tables 1 and 2 summarize our complexity results. The complexities for regular languages are from [23], except those for difference and symmetric difference, which are from [4]. The bounds for boolean operations and reversal of closed languages are direct consequences of the results in [6]. In Table 2, $k$ is the number of accepting quotients of $K$.

Table 1. Bounds on quotient complexity of boolean operations.

                                                 $K \cup L$     $K \cap L$            $K \setminus L$    $K \oplus L$
unary closed                                     $\max(m,n)$    $\max(m,n)$           $m$                $\max(m,n)$
$\le$-, $\sqsubseteq$-, $\Subset$-closed         $mn$           $mn - (m+n-2)$        $mn - (n-1)$       $mn$
$\preceq$-closed                                 $mn$           $mn$                  $mn$               $mn$
regular                                          $mn$           $mn$                  $mn$               $mn$

Table 2. Bounds on quotient complexity of closure, product, star and reversal.

                          ${}_{\trianglelefteq}L$    $KL$                    $K^*$                     $K^R$
unary closed              $n$                        $m+n-2$                 $2$                       $n$
$\le$-closed              $n$                        $(m+1)2^{n-2}$          $2^{n-2}+1$               $2^{n-1}$
$\sqsubseteq$-closed      $2^{n-1}$                  $m+n-1$                 $2$                       $2^{n-2}+1$
$\Subset$-closed          $2^{n-2}+1$                $m+n-1$                 $2$                       $2^{n-2}+1$
$\preceq$-closed          $2^n-1$                    $(m-k)n+k$              $n$                       $2^{n-1}+1$
regular                   -                          $m2^n - k2^{n-1}$       $2^{n-1}+2^{n-k-1}$       $2^n$
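To make the gap between closed and general regular languages concrete, the sketch below evaluates the product, star and reversal bounds stated in Theorems 3, 4 and 5, together with the regular-language bounds quoted from [23] at the start of Section 4. The function names and the string labels for the four classes are hypothetical; the formulas assume $m, n \ge 2$.

```python
def regular_bounds(m, n):
    # Bounds for arbitrary regular languages as quoted from [23] in Section 4.
    return {
        "product": m * 2**n - 2**(n - 1),
        "star": 3 * 2**(n - 2),          # (3/4) * 2^n
        "reversal": 2**n,
    }

def closed_bounds(kind, m, n, k=None):
    # kind: "prefix", "suffix", "factor" or "subword" (labels chosen for this sketch).
    # k: number of accepting quotients of K, needed only for the suffix-closed product.
    if kind == "prefix":
        return {"product": (m + 1) * 2**(n - 2),
                "star": 2**(n - 2) + 1,
                "reversal": 2**(n - 1)}
    if kind == "suffix":
        if k is None:
            raise ValueError("suffix-closed product bound needs k")
        return {"product": (m - k) * n + k,
                "star": n,               # n - 1 when L differs from its star
                "reversal": 2**(n - 1) + 1}
    if kind in ("factor", "subword"):
        return {"product": m + n - 1,
                "star": 2,
                "reversal": 2**(n - 2) + 1}
    raise ValueError("unknown class: " + kind)
```

For example, with $m = 3$ and $n = 4$ the product bound is 40 for regular languages, 16 for prefix-closed languages, 9 for suffix-closed languages with $k = 1$, and 6 for factor- or subword-closed languages.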

References

1. Ang, T., Brzozowski, J.: Languages convex with respect to binary relations, and their closure properties. Acta Cybernet., to appear
2. Avgustinovich, S.V., Frid, A.E.: A unique decomposition theorem for factorial languages. Internat. J. Algebra Comput. 15, 149-160 (2005)
3. Brzozowski, J.: Derivatives of regular expressions. J. ACM 11, 481-494 (1964)
4. Brzozowski, J.: Quotient complexity of regular languages. In: Dassow, J., Pighizzini, G., Truthe, B. (eds.) DCFS 2009, pp. 25-42. Otto-von-Guericke-Universität, Magdeburg, Germany (2009) http://arxiv.org/abs/0907.4547
5. Brzozowski, J., Grant, E., Shallit, J.: Closures in formal languages and Kuratowski's theorem. In: Diekert, V., Nowotka, D. (eds.) DLT 2009. LNCS, vol. 5583, pp. 125-144. Springer, Heidelberg (2009)
6. Brzozowski, J., Jirásková, G., Li, B.: Quotient complexity of ideal languages. In: LATIN 2010, to appear. Full paper at http://arxiv.org/abs/0908.2083
7. Brzozowski, J., Santean, N.: Predictable semiautomata. Theoret. Comput. Sci. 410, 3236-3249 (2009)
8. Brzozowski, J., Shallit, J., Xu, Z.: Decision procedures for convex languages. In: Dediu, A., Ionescu, A., Martin-Vide, C. (eds.) LATA 2009. LNCS, vol. 5457, pp. 247-258. Springer, Heidelberg (2009)
9. Galil, Z., Simon, J.: A note on multiple-entry finite automata. J. Comput. System Sci. 12, 350-351 (1976)
10. Gill, A., Kou, L.T.: Multiple-entry finite automata. J. Comput. System Sci. 9, 1-19 (1974)
11. Haines, L.H.: On free monoids partially ordered by embedding. J. Combin. Theory 6, 94-98 (1969)
12. Han, Y.-S., Salomaa, K.: State complexity of basic operations on suffix-free regular languages. Theoret. Comput. Sci. 410, 2537-2548 (2009)
13. Han, Y.-S., Salomaa, K., Wood, D.: Operational state complexity of prefix-free regular languages. In: Automata, Formal Languages, and Related Topics, pp. 99-115. University of Szeged, Hungary (2009)
14. Holzer, M., Salomaa, K., Yu, S.: On the state complexity of k-entry deterministic finite automata. J. Autom. Lang. Comb. 6, 453-466 (2001)
15. Kuratowski, C.: Sur l'opération Ā de l'analysis situs. Fund. Math. 3, 182-199 (1922)
16. de Luca, A., Varricchio, S.: Some combinatorial properties of factorial languages. In: Capocelli, R. (ed.) Sequences, pp. 258-266. Springer (1990)
17. Okhotin, A.: On the state complexity of scattered subwords and superwords. Turku Centre for Computer Science Technical Report No. 849 (2007)
18. Pighizzini, G., Shallit, J.: Unary language operations, state complexity and Jacobsthal's function. Int. J. Found. Comput. Sci. 13, 145-159 (2002)
19. Salomaa, A., Wood, D., Yu, S.: On the state complexity of reversals of regular languages. Theoret. Comput. Sci. 320, 315-329 (2004)
20. Thierrin, G.: Convex languages. In: Nivat, M. (ed.) Automata, Languages and Programming, pp. 481-492. North-Holland (1973)
21. Veloso, P.A.S., Gill, A.: Some remarks on multiple-entry finite automata. J. Comput. System Sci. 18, 304-306 (1979)
22. Yu, S.: State complexity of regular languages. J. Autom. Lang. Comb. 6, 221-234 (2001)
23. Yu, S., Zhuang, Q., Salomaa, K.: The state complexities of some basic operations on regular languages. Theoret. Comput. Sci. 125, 315-328 (1994)