Generating All Circular Shifts by Context-Free Grammars in Greibach Normal Form


Peter R.J. Asveld
Department of Computer Science, Twente University of Technology, P.O. Box 217, 7500 AE Enschede, the Netherlands
e-mail: infprja@cs.utwente.nl

Abstract

For each alphabet $\Sigma_n = \{a_1, a_2, \ldots, a_n\}$, linearly ordered by $a_1 < a_2 < \cdots < a_n$, let $C_n$ be the language of circular or cyclic shifts over $\Sigma_n$, i.e., $C_n = \{a_1 a_2 \cdots a_{n-1} a_n,\ a_2 a_3 \cdots a_n a_1,\ \ldots,\ a_n a_1 \cdots a_{n-2} a_{n-1}\}$. We study a few families of context-free grammars $G_n$ ($n \ge 1$) in Greibach normal form such that $G_n$ generates $C_n$. The members of these grammar families are investigated with respect to the following descriptional complexity measures: the number of nonterminals $\nu(n)$, the number of rules $\pi(n)$ and the number of leftmost derivations $\delta(n)$ of $G_n$. As in the case of Chomsky normal form, these $\nu$, $\pi$ and $\delta$ are functions bounded by low-degree polynomials. However, the question whether there exists a family of grammars that is minimal with respect to all these measures remains open.

Keywords: context-free grammar, Greibach normal form, permutation, circular shift, cyclic shift, descriptional complexity, unambiguous grammar.

1 Introduction

Let $\Sigma_n = \{a_1, a_2, \ldots, a_n\}$ be an alphabet, linearly ordered by $a_1 < a_2 < \cdots < a_n$, and let $L_n$ be the language over $\Sigma_n$ of the $n!$ permutations of $a_1 a_2 \cdots a_n$. In 2002 G. Satta [15] conjectured that any context-free grammar $G_n$ in Chomsky normal form (CNF) that generates $L_n$ must have a number of nonterminal symbols that is not bounded by any polynomial function in $n$. This statement has been proved in [9], but without showing how to generate $\{L_n\}_{n \ge 1}$ by context-free grammars $\{G_n\}_{n \ge 1}$ in CNF. In [2] we provided several grammar families for $\{L_n\}_{n \ge 1}$ together with the usual descriptional complexity measures, namely the number of nonterminals $\nu(n)$ and the number of rules $\pi(n)$; cf. [11, 13, 14, 7, 5, 1, 6] for these measures. The relative descriptional complexity of these grammar families is anything but straightforward and the quest for a family of minimal grammars (with respect to these complexity measures) remains a challenging problem.

Then in [3] we investigated some specific permutations over $\Sigma_n$, viz. the circular or cyclic shifts, defined by $C_n = \{a_1 a_2 \cdots a_{n-1} a_n,\ a_2 a_3 \cdots a_n a_1,\ a_3 a_4 \cdots a_1 a_2,\ \ldots,\ a_n a_1 \cdots a_{n-2} a_{n-1}\}$. An alternative definition of $C_n$ in terms of the so-called circular closure operator $c$ on languages $L$, which is defined by $c(L) = \{vu \mid uv \in L\}$ [8], is: $C_n = c(\{a_1 a_2 \cdots a_n\})$.

The fact that $C_n$ has only $n$ elements, far fewer than the $n!$ elements of $L_n$, is also reflected by the complexity measures of the corresponding grammar families: for $\{L_n\}_{n \ge 1}$, the functions $\nu(n)$ and $\pi(n)$ are exponential functions [15, 9, 2], whereas for $\{C_n\}_{n \ge 1}$, they are bounded by low-degree polynomial functions; cf. [3].

In this paper we investigate a few ways of generating the family $\{C_n\}_{n \ge 1}$ by context-free grammars in Greibach normal form (GNF) and for these families of grammars we determine the complexity measures $\nu$, $\pi$ and $\delta$ as functions of $n$. The results we obtain are rather similar to those in [3], as is the organization of this paper. Section 2 is devoted to preliminaries and in Section 3 we consider elementary properties of grammars $G_n$ in GNF for $C_n$. As using the arbitrary GNF for $C_n$ is trivial (Section 3), we focus in Sections 4-7 on the Greibach $k$-form ($k = 1, 2$). An approach based on the set of circularly ordered strings results in a grammar family in Greibach 2-form (Section 4). Modifying this family into Greibach 1-form in Section 5 results in a family with fewer rules. Unambiguous grammars for $C_n$ are studied in Section 6: then $\nu(n)$ and $\pi(n)$ are related in a simple way. In Section 7 we discuss minimality for unambiguous grammars in GNF, but the existence of a family of grammars for which these complexity measures are minimal remains open. Finally, some concluding remarks are in Section 8.
2 Preliminaries

For rudiments of discrete mathematics, particularly of combinatorics, and of formal language theory, we refer to standard texts like [10] and [12], respectively.

We denote the empty word by $\lambda$ and the length of the word $x$ by $|x|$. For each word $w$ over $\Sigma$, $A(w)$ is defined as the set of all symbols from $\Sigma$ that do occur in $w$. Formally, $A(\lambda) = \emptyset$, and $A(ax) = \{a\} \cup A(x)$ for each $a \in \Sigma$ and $x \in \Sigma^*$. This mapping is extended to languages $L$ over $\Sigma$ by $A(L) = \bigcup \{A(w) \mid w \in L\}$.

Remember that a $\lambda$-free context-free grammar $G = (V, \Sigma, P, S)$ is in Chomsky normal form (CNF) if $P \subseteq N \times (N - \{S\})^2 \cup N \times \Sigma$ where $N = V - \Sigma$. And $G$ is in Greibach normal form (GNF) if $P \subseteq N \times \Sigma (N - \{S\})^*$. Particularly, $G$ is in Greibach $k$-form if $P \subseteq N \times \Sigma \bigl(\bigcup_{i=0}^{k} (N - \{S\})^i\bigr)$.

For a context-free grammar $G = (V, \Sigma, P, S)$ with $\alpha \in V$, $L(G, \alpha)$ denotes the language defined by $L(G, \alpha) = \{w \in \Sigma^* \mid \alpha \Rightarrow^* w\}$. Thus for the language $L(G)$ generated by $G$, we have $L(G) = L(G, S)$. Notice that, if $G$ is in CNF or GNF, then $G$ has no useless symbols, $L(G, \alpha)$ is a nonempty language for each symbol $\alpha$ in $V$, and $L(G, \alpha) = \{\alpha\}$ for each $\alpha$ in $\Sigma$.

In studying $C_n$ we need, as in [3], subwords of $a_1 a_2 \cdots a_n a_1 \cdots a_{n-1}$; so we consider the set $F^n_k$ of all subwords of length $k$ ($1 \le k \le n$) that obey the circular succession relation $\prec$ on $\Sigma_n$ defined by: $a_i \prec a_j$ if and only if either (i) $i < n$ and $j = i + 1$ or (ii) $i = n$ and $j = 1$; cf. Section 6.1 in [10]. Clearly, $\prec$ is not a transitive relation: it is a kind of successor relation. Then the formal definition of $F^n_k$ reads

$F^n_k = \{x \in \Sigma_n^* \mid \exists u, v \in \Sigma_n^* : uxv = a_1 a_2 \cdots a_n a_1 a_2 \cdots a_{k-1};\ |x| = k\}$

with $1 \le k \le n$; their partial unions $Q^n_m$ are defined by $Q^n_m = \bigcup_{k=1}^{m} F^n_k$ ($1 \le m \le n$).

For a finite set $X$, we denote the cardinality of $X$ by $\#X$. Then we obviously have $C_n = F^n_n = Q^n_n - Q^n_{n-1}$, $\#F^n_k = n$ ($1 \le k \le n$), $\#Q^n_m = mn$ and $\#C_n = \#F^n_n = n$.

3 Elementary Properties

We first recall some simple properties of grammars in GNF that generate $L_n$ (the language of all permutations over $\Sigma_n$). From [4] we quote the following results.

Proposition 3.1. For $n \ge 1$, let $G_n = (V_n, \Sigma_n, P_n, S_n)$ be a context-free grammar in GNF that generates $L_n$, and let $A, B \in N_n = V_n - \Sigma_n$.
(1) The language $L(G_n, A)$ is a nonempty subset of an isomorphic copy $M_k$ of the language $L_k$ for some $k$ ($1 \le k \le n$). Consequently, each string $z$ in $L(G_n, A)$ has length $k$, $z$ consists of $k$ different symbols, and $A(z) = A(L(G_n, A))$.
(2) If $L(G_n, A) \cap L(G_n, B) \ne \emptyset$, then $A(L(G_n, A)) = A(L(G_n, B))$.
(3) If $A \to a A_1 A_2 \cdots A_m$ is a rule in $G_n$, then for each $(i, j)$ with $1 \le i < j \le m$, $A(L(G_n, A_i)) \cap A(L(G_n, A_j)) = \emptyset$, $a \notin A(L(G_n, A_k))$ with $1 \le k \le m$, and $A(L(G_n, A)) = \{a\} \cup A(L(G_n, A_1)) \cup A(L(G_n, A_2)) \cup \cdots \cup A(L(G_n, A_m))$.

This result gives rise to the following equivalence relation on $N_n$: $A$ and $B$ in $N_n$ are called equivalent if $|x| = |y|$ for some $x \in L(G_n, A)$ and some $y \in L(G_n, B)$. The equivalence classes are denoted by $\{E_{n,k}\}_{k=1}^{n}$. The number of elements $\#E_{n,k}$ of the equivalence class $E_{n,k}$ will be denoted by $D(n, k)$ ($1 \le k \le n$).

Example 3.2. Consider $G^P_3 = (V_3, \Sigma_3, P_3, A_{123})$ with $N_3 = \{A_{123}, A_{12}, A_{13}, A_{23}, A_1, A_2, A_3\}$ and $P_3$ consists of $A_{123} \to a_1 A_{23} \mid a_2 A_{13} \mid a_3 A_{12}$, $A_{12} \to a_1 A_2 \mid a_2 A_1$, $A_{13} \to a_1 A_3 \mid a_3 A_1$, $A_{23} \to a_2 A_3 \mid a_3 A_2$, $A_1 \to a_1$, $A_2 \to a_2$ and $A_3 \to a_3$. Clearly, $G^P_3$ is in GNF. Then $L(G^P_3) = L_3$, $E_{3,3} = \{A_{123}\}$, $E_{3,2} = \{A_{12}, A_{13}, A_{23}\}$, $E_{3,1} = \{A_1, A_2, A_3\}$, and hence $D(3, 3) = 1$ and $D(3, 2) = D(3, 1) = 3$.

Proposition 3.1 relies on the fact that each word in $L(G_n)$ is a permutation. As circular shifts are special permutations, Proposition 3.1 still applies; but what is particular about generating $C_n$ rather than $L_n$ is expressed in Proposition 3.4, the proof of which depends on the following result from [3].

Lemma 3.3. If $X$ is a nonempty proper subalphabet of $\Sigma_n$, then there exists at most one word $x$ with $A(x) = X$ such that $x$ satisfies the circular succession relation. And if $X = \{b_1, b_2, \ldots, b_l\}$, then $x = b_{p(1)} b_{p(2)} \cdots b_{p(l)}$ provided there exists a permutation $p$ of $\{1, 2, \ldots, l\}$ such that $b_{p(1)} \prec b_{p(2)} \prec \cdots \prec b_{p(l)}$.

For each nonempty word $w$ over $\Sigma_n$, let $\alpha(w)$ be the first and $\omega(w)$ be the last symbol of $w$. Thus if $w = \sigma_1 \sigma_2 \cdots \sigma_m$ with $\sigma_i \in \Sigma$ ($1 \le i \le m$), then $\alpha(w) = \sigma_1$ and $\omega(w) = \sigma_m$.

Proposition 3.4. Let $G_n = (V_n, \Sigma_n, P_n, S_n)$ be a context-free grammar in GNF that generates $C_n$, and let $\alpha, \beta \in V_n - \{S_n\}$.
(1) For each $\alpha$, the language $L(G_n, \alpha)$ is a singleton.
(2) If $L(G_n, \alpha) \cap L(G_n, \beta) \ne \emptyset$, then $L(G_n, \alpha) = L(G_n, \beta)$.
(3) If $A \to a A_1 A_2 \cdots A_m$ is in $P_n$ with $L(G_n, A_i) = \{x_i\}$ ($1 \le i \le m$), then $a \prec \alpha(x_1)$ and for each $i$ ($1 \le i < m$), $\omega(x_i) \prec \alpha(x_{i+1})$. Consequently, if $L(G_n, A_i) = \{x_i\} \subseteq F^n_{\Lambda(i)}$, then $L(G_n, A) = \{a x_1 x_2 \cdots x_m\} \subseteq F^n_k$ with $k = 1 + \sum_{i=1}^{m} \Lambda(i)$.

Proof. (1) $G_n$ is in GNF; so each symbol $\alpha$ in $V_n - \{S_n\}$ is useful: there is a derivation $S_n \Rightarrow^{+} \varphi \alpha \psi \Rightarrow^{+} \varphi x_\alpha \psi \Rightarrow^{+} x$ where $x$ is a circular shift, $\varphi\psi \ne \lambda$, $A(x) = \Sigma_n$, and $1 \le \#A(x_\alpha) < n$. Now, by Lemma 3.3, $L(G_n, \alpha)$ contains at most one word over $\Sigma_n$, and since $L(G_n, \alpha)$ is nonempty, $L(G_n, \alpha)$ is a singleton.
(2) As $L(G_n, \alpha)$ and $L(G_n, \beta)$ are singletons by (1), $L(G_n, \alpha) \cap L(G_n, \beta) \ne \emptyset$ implies that they are equal.
Finally, (3) is a direct consequence of the fact that $L(G_n, A) \subseteq Q^n_n$.

Henceforth, in examples we will always assume tacitly that $E_{n,1} = \{A_1, \ldots, A_n\}$ and we will use $R_n = \{A_i \to a_i \mid A_i \in E_{n,1}\}$.

Example 3.5. Consider $G^U_4 = (V_4, \Sigma_4, P_4, S_4)$ in GNF with $P_4 = \{S_4 \to a_1 A_{23} A_4 \mid a_2 A_{34} A_1 \mid a_3 A_{41} A_2 \mid a_4 A_{12} A_3,\ A_{12} \to a_1 A_2,\ A_{23} \to a_2 A_3,\ A_{34} \to a_3 A_4,\ A_{41} \to a_4 A_1\} \cup R_4$. Then $L(G^U_4) = C_4$, $E_{4,4} = \{S_4\}$, $E_{4,3} = \emptyset$, $E_{4,2} = \{A_{12}, A_{23}, A_{34}, A_{41}\}$, $E_{4,1} = \{A_1, A_2, A_3, A_4\}$, $D(4, 4) = 1$, $D(4, 3) = 0$ and $D(4, 2) = D(4, 1) = 4$. Since $S_4 \to a_3 A_{41} A_2$ is in $P_4$, we have $L(G_4, A_{41}) = \{a_4 a_1\}$, $L(G_4, A_2) = \{a_2\}$, $a_3 \prec a_4 = \alpha(a_4 a_1)$ and $\omega(a_4 a_1) = a_1 \prec a_2 = \alpha(a_2)$.

As measures for the descriptional complexity of $G_n$ from $\{G_n\}_{n \ge 1}$, we use $\nu(n) = \#N_n$ and $\pi(n) = \#P_n$; cf. [11, 13, 14, 7, 5, 1, 6].
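The sets $F^n_k$, $Q^n_m$ and $C_n$ from Section 2 are small enough to enumerate directly, which makes the cardinality claims easy to check. The following is a minimal sketch, assuming that a symbol $a_i$ is represented by the integer $i$ and a word by a tuple of integers; the function names `base_word`, `F`, `Q` and `succeeds` are illustrative and do not come from the paper.

```python
def base_word(n):
    """The word a_1 a_2 ... a_n, written as a tuple of indices 1..n."""
    return tuple(range(1, n + 1))


def F(n, k):
    """F^n_k: the length-k subwords of a_1 ... a_n a_1 ... a_{k-1}."""
    doubled = base_word(n) + base_word(n)[:k - 1]
    return {doubled[i:i + k] for i in range(n)}


def Q(n, m):
    """Q^n_m: the union of F^n_1, ..., F^n_m."""
    return set().union(*(F(n, k) for k in range(1, m + 1)))


def succeeds(i, j, n):
    """Circular succession on indices: a_i is immediately followed by a_j."""
    return j == i % n + 1


n = 5
C_n = F(n, n)  # the circular shifts of a_1 ... a_n
assert C_n == Q(n, n) - Q(n, n - 1)
assert all(len(F(n, k)) == n for k in range(1, n + 1))
assert all(len(Q(n, m)) == m * n for m in range(1, n + 1))
```

The three assertions at the end mirror the identities $C_n = Q^n_n - Q^n_{n-1}$, $\#F^n_k = n$ and $\#Q^n_m = mn$ for $n = 5$.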
A less-known measure has been introduced in [2, 3]; viz. the number of leftmost derivations $\delta(n)$ of $G_n$. Remember that in a leftmost derivation the leftmost nonterminal is always expanded. Thus $\delta(n) = \#\{S_n \Rightarrow_L^{*} x \mid x \in L(G_n)\}$, where $\Rightarrow_L$ denotes the leftmost derivation relation. Clearly, this measure makes sense when we generate a finite language by a $\lambda$-free grammar with bounded ambiguity.

Notice that these descriptional complexity measures depend on $n$ as well as on the family under consideration; so we use $\nu_\alpha(n)$, $\pi_\alpha(n)$ and $\delta_\alpha(n)$ in the context of a family $\{G^\alpha_n\}_{n \ge 1}$ of which the individual members are labeled by $\alpha$.

Example 3.6. For $G^P_3$ of Example 3.2, we have $\nu_P(3) = 7$, $\pi_P(3) = 12$ and $\delta_P(3) = 3! = 6$, since $G^P_3$ is unambiguous [2]. Similarly, for the unambiguous $G^U_4$ of Example 3.5, we have $\nu_U(4) = 9$, $\pi_U(4) = 12$ and $\delta_U(4) = 4$.
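The three measures can be computed mechanically for a concrete grammar. The sketch below encodes $G^U_4$ of Example 3.5 as a Python dictionary and counts leftmost derivations per generated word; the encoding (terminals as integers, nonterminals as strings) and the function name `derivation_counts` are assumptions made for this illustration only.

```python
from collections import Counter


def derivation_counts(rules, start):
    """Map every word derivable from `start` to its number of leftmost derivations.

    `rules` maps a nonterminal to a list of right-hand sides; each right-hand
    side is a tuple (terminal, nonterminal_1, ..., nonterminal_m) as in GNF.
    The grammar must be non-recursive, i.e. generate a finite language.
    """
    memo = {}

    def words(nt):
        if nt not in memo:
            total = Counter()
            for terminal, *children in rules[nt]:
                partial = Counter({(terminal,): 1})
                for child in children:
                    extended = Counter()
                    for left, c_left in partial.items():
                        for right, c_right in words(child).items():
                            extended[left + right] += c_left * c_right
                    partial = extended
                total.update(partial)  # adds counts rule by rule
            memo[nt] = total
        return memo[nt]

    return words(start)


# G^U_4 of Example 3.5; the terminal a_i is written as the integer i.
GU4 = {
    "S4": [(1, "A23", "A4"), (2, "A34", "A1"), (3, "A41", "A2"), (4, "A12", "A3")],
    "A12": [(1, "A2")], "A23": [(2, "A3")], "A34": [(3, "A4")], "A41": [(4, "A1")],
    "A1": [(1,)], "A2": [(2,)], "A3": [(3,)], "A4": [(4,)],
}

counts = derivation_counts(GU4, "S4")
nu = len(GU4)                                    # number of nonterminals
pi = sum(len(rhs) for rhs in GU4.values())       # number of rules
delta = sum(counts.values())                     # number of leftmost derivations
assert (nu, pi, delta) == (9, 12, 4)
```

Because every word of $C_4$ receives exactly one derivation here, the grammar is unambiguous, in line with Example 3.6.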

For each family $\{G^\alpha_n\}_{n \ge 1} = \{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ for $\{C_n\}_{n \ge 1}$ to be considered in the sequel, we assume that the first two (unspecified) elements $G^\alpha_1$ and $G^\alpha_2$ satisfy $N_1 = \{S_1\}$ and $P_1 = \{S_1 \to a_1\}$ for $G^\alpha_1$, and $N_2 = \{S_2, A_1, A_2\}$, $P_2 = \{S_2 \to a_1 A_2 \mid a_2 A_1,\ A_1 \to a_1,\ A_2 \to a_2\}$ for $G^\alpha_2$. Then $\nu_\alpha(1) = \pi_\alpha(1) = \delta_\alpha(1) = 1$, $\nu_\alpha(2) = 3$, $\pi_\alpha(2) = 4$, $\delta_\alpha(2) = 2$, whereas for $n \ge 3$, $\nu_\alpha(n) = \sum_{k=1}^{n} D(n, k) \ge n + 1$, $\pi_\alpha(n) \ge n + 2$ and $\delta_\alpha(n) \ge n$. This implies that specifying a family $\{G^\alpha_n\}_{n \ge 1}$ reduces to defining the family $\{G^\alpha_n\}_{n \ge 3}$.

As an illustration we consider a simple family of grammars in GNF for $\{C_n\}_{n \ge 1}$. It is based on a single nonterminal $S_n$ and the trivial set of rules $\{S_n \to w \mid w \in C_n\}$. To obtain grammars in GNF we need isomorphisms $\varphi_n : \Sigma_n \to \{A_1, A_2, \ldots, A_n\}$ defined by $\varphi_n(a_i) = A_i$ ($1 \le i \le n$), that are extended to words by $\varphi_n(\sigma_1 \sigma_2 \cdots \sigma_k) = \varphi_n(\sigma_1) \varphi_n(\sigma_2) \cdots \varphi_n(\sigma_k)$ ($\sigma_i \in \Sigma_n$, $1 \le i \le k$).

Definition 3.7. $\{G^T_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_i \mid 1 \le i \le n\}$,
$P_n = \{S_n \to \sigma_1 \varphi_n(\sigma_2 \cdots \sigma_n) \mid \sigma_1 \sigma_2 \cdots \sigma_n \in C_n\} \cup \{A_i \to a_i \mid 1 \le i \le n\}$.

In the sequel we will slightly change our notation in order to reduce the number of subscript levels: if $x = a_j \cdots a_k$, we will write $A_{j \cdots k}$ for $A_x$ instead of $A_{a_j \cdots a_k}$; cf. Examples 3.2 and 3.5 above. In this way the set of indices $\{1, 2, \ldots, n\}$ inherits the linear order of $\Sigma_n$; a similar remark applies with respect to the $\prec$-relation.

Example 3.8. For $n = 3$, we have $G^T_3 = (V_3, \Sigma_3, P_3, S_3)$ with $P^T_3 = \{S_3 \to a_1 A_2 A_3 \mid a_2 A_3 A_1 \mid a_3 A_1 A_2\} \cup R_3$. Now $E_{3,3} = \{S_3\}$, $E_{3,2} = \emptyset$, $E_{3,1} = \{A_1, A_2, A_3\}$, $D(3, 3) = 1$, $D(3, 2) = 0$, $D(3, 1) = 3$, $\nu_T(3) = 4$, $\pi_T(3) = 6$ and $\delta_T(3) = 3$.

The following result easily follows from Definition 3.7.

Proposition 3.9. For the family $\{G^T_n\}_{n \ge 1}$ of Definition 3.7 we have for $n \ge 3$,
(1) $D(n, n) = 1$, $D(n, k) = 0$ ($1 < k < n$), and $D(n, 1) = n$,
(2) $\nu_T(n) = n + 1$,
(3) $\pi_T(n) = 2n$,
(4) $\delta_T(n) = n$.
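Definition 3.7 translates almost literally into code. The sketch below builds the rule set of $G^T_n$ and checks parts (2) and (3) of Proposition 3.9 for a few values of $n$; the dictionary encoding and the name `build_GT` are assumptions of this illustration, not notation from the paper.

```python
def build_GT(n):
    """Rules of G^T_n: S -> a_i A_{i+1} ... A_{i-1} for each shift, plus A_i -> a_i.

    A right-hand side is a tuple (terminal, nonterminal, ..., nonterminal);
    the terminal a_i is the integer i and the nonterminal A_i is the string "A<i>".
    """
    rules = {"S": []}
    for i in range(1, n + 1):
        shift = [(i - 1 + j) % n + 1 for j in range(n)]   # a_i a_{i+1} ... cyclically
        rules["S"].append((shift[0],) + tuple(f"A{s}" for s in shift[1:]))
        rules[f"A{i}"] = [(i,)]
    return rules


for n in range(3, 9):
    rules = build_GT(n)
    nu = len(rules)
    pi = sum(len(rhs) for rhs in rules.values())
    assert nu == n + 1 and pi == 2 * n
```

Each start rule carries $n - 1$ nonterminals on its right-hand side, which is why this family is in unrestricted GNF rather than in Greibach $k$-form for a fixed $k$.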
4 Greibach 2-form: A Straightforward Approach

The trivial family $\{G^T_n\}_{n \ge 1}$ of grammars in (unrestricted) GNF of Definition 3.7 gives rise to simple results that are not very interesting. Therefore we restrict ourselves in the sequel to grammars in Greibach $k$-form with $k = 1, 2$. It turns out that in those cases the corresponding descriptional complexity measures are less trivial.

The idea on which our next family of grammars is based stems from Proposition 3.4: we have nonterminals $A_x$ for all strings $x$ in $Q^n_{n-1}$ with $|x| = k < n$ such that $L(G_n, A_x) = \{x\}$ and $A_x \in E_{n,k}$. For the words in $C_n = Q^n_n - Q^n_{n-1}$ we have rules $S_n \to a A_x A_y$ for all nonempty words $x$ and $y$ with $axy \in C_n$ and $E_{n,n} = \{S_n\}$.

Definition 4.1. $\{G^0_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_x \mid x \in Q^n_{n-2}\}$,
$P_n = \{S_n \to a A_x A_y \mid axy \in C_n;\ a \in \Sigma_n;\ x, y \in \Sigma_n^+\} \cup \{A_a \to a \mid a \in \Sigma_n\} \cup \{A_{axy} \to a A_x A_y \mid a \in \Sigma_n;\ axy \in Q^n_{n-1};\ x, y \in \Sigma_n^+\} \cup \{A_{ab} \to a A_b \mid a, b \in \Sigma_n;\ a \prec b\}$.

Example 4.2. Consider $G^0_5 = (V_5, \Sigma_5, P_5, S_5)$ with
$P_5 = \{S_5 \to a_1 A_2 A_{345} \mid a_1 A_{23} A_{45} \mid a_1 A_{234} A_5 \mid a_2 A_3 A_{451} \mid a_2 A_{34} A_{51} \mid a_2 A_{345} A_1 \mid a_3 A_4 A_{512} \mid a_3 A_{45} A_{12} \mid a_3 A_{451} A_2 \mid a_4 A_5 A_{123} \mid a_4 A_{51} A_{23} \mid a_4 A_{512} A_3 \mid a_5 A_1 A_{234} \mid a_5 A_{12} A_{34} \mid a_5 A_{123} A_4,$
$A_{123} \to a_1 A_2 A_3,\ A_{234} \to a_2 A_3 A_4,\ A_{345} \to a_3 A_4 A_5,\ A_{451} \to a_4 A_5 A_1,\ A_{512} \to a_5 A_1 A_2,$
$A_{12} \to a_1 A_2,\ A_{23} \to a_2 A_3,\ A_{34} \to a_3 A_4,\ A_{45} \to a_4 A_5,\ A_{51} \to a_5 A_1\} \cup R_5$.
We have $E_{5,5} = \{S_5\}$, $E_{5,4} = \emptyset$, $E_{5,3} = \{A_{123}, A_{234}, A_{345}, A_{451}, A_{512}\}$, $E_{5,2} = \{A_{12}, A_{23}, A_{34}, A_{45}, A_{51}\}$, $E_{5,1} = \{A_1, A_2, A_3, A_4, A_5\}$, $D(5, 5) = 1$, $D(5, 4) = 0$, $D(5, 3) = D(5, 2) = D(5, 1) = 5$. Consequently, $\nu_0(5) = 16$, $\pi_0(5) = 30$ and $\delta_0(5) = 15$; hence $G^0_5$ is ambiguous.

In general, if $S$ is a statement that can be true or false, then $[S]$ is equal to 1 if $S$ is true, and to 0 otherwise; cf. [10].

Proposition 4.3. For the family $\{G^0_n\}_{n \ge 1}$ of Definition 4.1 we have for $n \ge 3$,
(1) $D(n, n) = 1$, $D(n, n-1) = 0$, and $D(n, k) = n$ with $1 \le k < n - 1$,
(2) $\nu_0(n) = n^2 - 2n + 1$,
(3) $\pi_0(n) = [n \ge 4] \cdot (\tfrac{1}{2} n^3 - \tfrac{7}{2} n^2 + 6n) + n^2$,
(4) $\delta_0(n) = n^2 - 2n$.

Proof. From Definition 4.1 it follows that for $n \ge 3$, $\nu_0(n) = \#N_n = 1 + \#Q^n_{n-2} = 1 + (n - 2)n = n^2 - 2n + 1$, while $\pi_0(n) = h_0(n) + h_1(n) + h_2(n) + h_3(n)$ with
$h_0(n) = \#\{S_n \to a A_x A_y \mid axy \in C_n;\ a \in \Sigma_n;\ x, y \in \Sigma_n^+\}$,
$h_1(n) = \#\{A_{axy} \to a A_x A_y \mid a \in \Sigma_n;\ axy \in Q^n_{n-1};\ x, y \in \Sigma_n^+\}$,
$h_2(n) = \#\{A_{ab} \to a A_b \mid a, b \in \Sigma_n;\ a \prec b\}$,
$h_3(n) = \#\{A_a \to a \mid a \in \Sigma_n\}$.
Clearly, $h_0(n) = n(n - 2)$ and $h_2(n) = h_3(n) = n$. For $h_1$ we observe that $h_1(3) = 0$, and for $n \ge 4$, we have
$h_1(n) = \sum_{k=3}^{n-2} (k - 2) \cdot n = \tfrac{1}{2} n(n - 3)(n - 4) = \tfrac{1}{2} n^3 - \tfrac{7}{2} n^2 + 6n$.
So $\pi_0(n) = n(n - 2) + [n \ge 4] \cdot (\tfrac{1}{2} n^3 - \tfrac{7}{2} n^2 + 6n) + n + n = [n \ge 4] \cdot (\tfrac{1}{2} n^3 - \tfrac{7}{2} n^2 + 6n) + n^2$.
The grammar $G^0_n$ generates $n$ strings, each of which can be obtained by a leftmost derivation in $n - 2$ ways ($n \ge 3$); consequently, we have $\delta_0(n) = n(n - 2)$.
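The counts $\nu_0(n)$ and $\pi_0(n)$ can be cross-checked by constructing the grammar explicitly. The sketch below is one possible reading of Definition 4.1: it creates a nonterminal $A_x$ for every $x \in Q^n_{n-2}$ and gives $A_x$ with $|x| \ge 3$ one rule per split, which is the reading under which the sum $h_1(n)$ in the proof comes out. Words are tuples of indices, and the helper names `shifts` and `build_G0` are hypothetical.

```python
def shifts(n, k):
    """The n words of F^n_k as index tuples, one per starting symbol (k <= n)."""
    return [tuple((s + j) % n + 1 for j in range(k)) for s in range(n)]


def build_G0(n):
    """Rules of G^0_n; a nonterminal A_x is represented by the tuple x itself."""
    rules = {"S": []}
    # Start rules S -> a A_x A_y for every split of a circular shift a x y.
    for w in shifts(n, n):
        for cut in range(2, n):
            rules["S"].append((w[0], w[1:cut], w[cut:]))
    # Nonterminals A_x for all x in Q^n_{n-2}.
    for k in range(1, n - 1):
        for w in shifts(n, k):
            if k == 1:
                rules[w] = [(w[0],)]                       # A_a -> a
            elif k == 2:
                rules[w] = [(w[0], w[1:])]                 # A_ab -> a A_b
            else:
                rules[w] = [(w[0], w[1:cut], w[cut:])      # A_axy -> a A_x A_y
                            for cut in range(2, k)]
    return rules


for n in range(4, 10):
    rules = build_G0(n)
    nu = len(rules)
    pi = sum(len(rhs) for rhs in rules.values())
    assert nu == n * n - 2 * n + 1
    assert pi == (n ** 3 - 7 * n ** 2 + 12 * n) // 2 + n * n
```

For $n = 5$ this reproduces $\nu_0(5) = 16$ and $\pi_0(5) = 30$ from Example 4.2. The loop starts at $n = 4$ because for $n = 3$ the formula for $\pi_0$ also counts the rules $A_{ab} \to a A_b$, whose left-hand sides are not among the nonterminals constructed here.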

5 Greibach 1-form: An Improvement

The next grammar family $\{G^1_n\}_{n \ge 1}$ to generate $\{C_n\}_{n \ge 1}$ consists of context-free grammars in Greibach 1-form; this family is closely related to Definition 5.1 in [3], which in turn has been inspired by generating $\{C_n\}_{n \ge 1}$ with regular grammars.

Definition 5.1. $\{G^1_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_x \mid x \in Q^n_{n-1}\}$,
$P_n = \{S_n \to a A_x \mid ax \in C_n;\ a \in \Sigma_n\} \cup \{A_a \to a \mid a \in \Sigma_n\} \cup \{A_{ax} \to a A_x \mid ax \in Q^n_{n-1};\ a \in \Sigma_n,\ x \in \Sigma_n^+\}$.

Example 5.2. For $n = 3$, we obtain $G^1_3 = (V_3, \Sigma_3, P_3, S_3)$ with $P_3 = \{S_3 \to a_1 A_{23} \mid a_2 A_{31} \mid a_3 A_{12},\ A_{12} \to a_1 A_2,\ A_{23} \to a_2 A_3,\ A_{31} \to a_3 A_1\} \cup R_3$. Then we have $E_{3,3} = \{S_3\}$, $E_{3,2} = \{A_{12}, A_{23}, A_{31}\}$, $E_{3,1} = \{A_1, A_2, A_3\}$, $D(3, 3) = 1$, $D(3, 2) = D(3, 1) = 3$, $\nu_1(3) = 7$, $\pi_1(3) = 9$ and $\delta_1(3) = 3$.

The proof of the following result is similar to the one of Proposition 5.3 in [3].

Proposition 5.3. For the family $\{G^1_n\}_{n \ge 1}$ of Definition 5.1 we have for $n \ge 3$,
(1) $D(n, n) = 1$ and $D(n, k) = n$ with $1 \le k < n$,
(2) $\nu_1(n) = n^2 - n + 1$,
(3) $\pi_1(n) = n^2$,
(4) $\delta_1(n) = n$.

Comparing Propositions 4.3 and 5.3 yields, for $n \ge 4$, $\nu_0(n) < \nu_1(n)$ and $\delta_0(n) > \delta_1(n)$, and, for $n \ge 5$, $\pi_0(n) > \pi_1(n)$; for $n = 4$ both $\pi_0(n)$ and $\pi_1(n)$ equal 16. The latter two inequalities may be considered as an improvement; the price we have to pay is $n$ additional nonterminal symbols.

6 Families of Unambiguous Grammars

In [3] we argued that a first step towards minimal grammars in CNF is to avoid ambiguity. The situation for the GNF is very similar: the following crucial result and its proof are almost identical to the one for CNF in [3].

Proposition 6.1. Let $\{G_n\}_{n \ge 1}$ be a family of grammars in GNF that generates $\{C_n\}_{n \ge 1}$. Then $\delta(n) = n$ if and only if $\pi(n) = \nu(n) + n - 1$.

The proof tells us that in an unambiguous grammar for $C_n$, there are $n$ rules for $S_n$ and a single rule for each $A \in N_n - \{S_n\}$. So we try to minimize $\nu(n)$, and as a consequence $\pi(n)$ will reach its minimum value as well.
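Definition 5.1 and Proposition 6.1 can be checked together on small instances. The sketch below builds $G^1_n$, counts its derivation trees, and confirms $\pi_1(n) = \nu_1(n) + n - 1$; the encoding (words as index tuples, a nonterminal $A_x$ represented by $x$) and the names `shifts`, `build_G1` and `count_trees` are assumptions of this illustration.

```python
def shifts(n, k):
    """The n words of F^n_k as index tuples, one per starting symbol (k <= n)."""
    return [tuple((s + j) % n + 1 for j in range(k)) for s in range(n)]


def build_G1(n):
    """Rules of G^1_n: S -> a A_x, A_ax -> a A_x and A_a -> a (Greibach 1-form)."""
    rules = {"S": [(w[0], w[1:]) for w in shifts(n, n)]}
    for k in range(1, n):                      # nonterminals A_x, x in Q^n_{n-1}
        for w in shifts(n, k):
            rules[w] = [(w[0],)] if k == 1 else [(w[0], w[1:])]
    return rules


def count_trees(rules, nt):
    """Number of derivation trees rooted at `nt` (the grammar is non-recursive)."""
    total = 0
    for _, *children in rules[nt]:
        product = 1
        for child in children:
            product *= count_trees(rules, child)
        total += product
    return total


for n in range(3, 9):
    rules = build_G1(n)
    nu = len(rules)
    pi = sum(len(rhs) for rhs in rules.values())
    delta = count_trees(rules, "S")            # one tree per leftmost derivation
    assert (nu, pi, delta) == (n * n - n + 1, n * n, n)
    assert pi == nu + n - 1                    # the relation of Proposition 6.1
```

Since every nonterminal other than the start symbol has exactly one rule, the tree count from `S` equals the number of start rules, which is $n$.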
Clearly, Proposition 6.1 applies to $\{G^1_n\}_{n \ge 1}$ but not to $\{G^0_n\}_{n \ge 1}$. However, $\{G^1_n\}_{n \ge 1}$ is not the only family satisfying Proposition 6.1; another one will be introduced now.

Definition 6.2. $\{G^2_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_a \mid a \in \Sigma_n\} \cup M_n$ with for $m \ge 2$,
$M_{2m} = \{A_x \mid x \in F^{2m}_{2m-2} \cup F^{2m}_{2m-4} \cup \cdots \cup F^{2m}_{2}\}$, and
$M_{2m-1} = \{A_x \mid x \in F^{2m-1}_{2m-3} \cup F^{2m-1}_{2m-5} \cup \cdots \cup F^{2m-1}_{3}\}$,
$P_n = \{S_n \to a A_b A_x \mid a, b \in \Sigma_n;\ x \in \Sigma_n^+;\ abx \in C_n;\ x \in F^n_{n-2}\} \cup Q_n \cup \{A_{abx} \to a A_b A_x \mid a, b \in \Sigma_n;\ A_{abx} \in M_n;\ x \in \Sigma_n^+\} \cup \{A_a \to a \mid a \in \Sigma_n\}$ with for $m \ge 2$,
$Q_{2m} = \{A_{ab} \to a A_b \mid a, b \in \Sigma_n;\ a \prec b\}$, and $Q_{2m-1} = \emptyset$.

Example 6.3. Let $G^2_6 = (V_6, \Sigma_6, P_6, S_6)$ with
$P_6 = \{S_6 \to a_1 A_2 A_{3456} \mid a_2 A_3 A_{4561} \mid a_3 A_4 A_{5612} \mid a_4 A_5 A_{6123} \mid a_5 A_6 A_{1234} \mid a_6 A_1 A_{2345},$
$A_{1234} \to a_1 A_2 A_{34},\ A_{2345} \to a_2 A_3 A_{45},\ A_{3456} \to a_3 A_4 A_{56},\ A_{4561} \to a_4 A_5 A_{61},\ A_{5612} \to a_5 A_6 A_{12},\ A_{6123} \to a_6 A_1 A_{23},$
$A_{12} \to a_1 A_2,\ A_{23} \to a_2 A_3,\ A_{34} \to a_3 A_4,\ A_{45} \to a_4 A_5,\ A_{56} \to a_5 A_6,\ A_{61} \to a_6 A_1\} \cup R_6$.
Now $E_{6,6} = \{S_6\}$, $E_{6,5} = \emptyset$, $E_{6,4} = \{A_{1234}, A_{2345}, A_{3456}, A_{4561}, A_{5612}, A_{6123}\}$, $E_{6,3} = \emptyset$, $E_{6,2} = \{A_{12}, A_{23}, A_{34}, A_{45}, A_{56}, A_{61}\}$ and $E_{6,1} = \{A_1, A_2, A_3, A_4, A_5, A_6\}$. Then we obtain $\nu_2(6) = 19 < 31 = \nu_1(6)$ and $\pi_2(6) = 24 < 36 = \pi_1(6)$.

Proposition 6.4. For the family $\{G^2_n\}_{n \ge 1}$ of Definition 6.2, we have
(1) $D(n, n) = 1$, $D(n, n-1) = 0$, $D(n, 1) = n$, and for $2 \le k < n - 1$, $D(n, k) = n$ if $k \equiv n \pmod 2$ and $D(n, k) = 0$ otherwise,
(2) for $n \ge 3$, $\nu_2(n) = \tfrac{1}{2} n^2 + \tfrac{1}{2} n \cdot [n \text{ is odd}] + 1$,
(3) for $n \ge 3$, $\pi_2(n) = \tfrac{1}{2} n^2 + \tfrac{1}{2} n \cdot ([n \text{ is odd}] + 2)$,
(4) $\delta_2(n) = n$.

Proof. From Definition 6.2, Proposition 6.4(1) and (4) easily follow. Then for even $n$ with $n \ge 4$, we have $\nu_2(n) = 1 + 2n + \sum_{k=4}^{n-2} D(n, k) = 1 + 2n + \tfrac{1}{2}(n - 4)n = \tfrac{1}{2} n^2 + 1$. For odd $n$ with $n \ge 3$, we obtain $\nu_2(n) = 1 + 2n + \sum_{k=3}^{n-2} D(n, k) = 1 + 2n + \tfrac{1}{2}(n - 3)n = \tfrac{1}{2} n^2 + \tfrac{1}{2} n + 1$. Combining these two cases results in $\nu_2(n) = \tfrac{1}{2} n^2 + \tfrac{1}{2} n \cdot [n \text{ is odd}] + 1$ for $n \ge 3$. Finally, Proposition 6.1 implies Proposition 6.4(3).

From Propositions 5.3 and 6.4 it follows that for $n \ge 4$, we have $\nu_2(n) < \nu_1(n)$ and, consequently, $\pi_2(n) < \pi_1(n)$.

It is possible to continue in this way by introducing families $\{G^k_n\}_{n \ge 1}$ ($k \ge 3$) such that for each rule $A \to aBC$, $L(G^k_n, B)$ consists of a single word of length $k - 1$. As in [3] the objections are twofold: the definitions become more complicated as $k$ increases, and we are probably left with $\nu_k(n)$ and $\pi_k(n)$ being functions in $\Theta(n^2)$.
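For even $n$ the construction of Definition 6.2 is easy to replay, and Example 6.3 gives a concrete instance to compare against. The sketch below covers only the even case; the encoding (words as index tuples, a nonterminal $A_x$ represented by $x$) and the names `shifts` and `build_G2_even` are assumptions of this illustration.

```python
def shifts(n, k):
    """The n words of F^n_k as index tuples, one per starting symbol (k <= n)."""
    return [tuple((s + j) % n + 1 for j in range(k)) for s in range(n)]


def build_G2_even(n):
    """Rules of G^2_n for even n >= 4, following Definition 6.2."""
    if n < 4 or n % 2:
        raise ValueError("this sketch handles even n >= 4 only")
    # Start rules S -> a A_b A_x with |x| = n - 2.
    rules = {"S": [(w[0], w[1:2], w[2:]) for w in shifts(n, n)]}
    for w in shifts(n, 1):
        rules[w] = [(w[0],)]                               # A_a -> a
    for k in range(2, n - 1, 2):                           # k = 2, 4, ..., n - 2
        for w in shifts(n, k):
            if k == 2:
                rules[w] = [(w[0], w[1:])]                 # A_ab -> a A_b
            else:
                rules[w] = [(w[0], w[1:2], w[2:])]         # A_abx -> a A_b A_x
    return rules


for n in (4, 6, 8, 10, 12):
    rules = build_G2_even(n)
    nu = len(rules)
    pi = sum(len(rhs) for rhs in rules.values())
    assert nu == n * n // 2 + 1
    assert pi == n * n // 2 + n
    assert pi == nu + n - 1                                # Proposition 6.1
```

For $n = 6$ this yields 19 nonterminals and 24 rules, the values reported in Example 6.3.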
7 Towards a Family of Minimal Grammars

For the CNF we defined in [3] a family of grammars $\{G^k_n\}_{n \ge 1}$ to be minimal if each $G^k_n$ is unambiguous and $\nu(n) \in \Theta(n)$; cf. Proposition 6.1. As we will see, this latter condition is likely to be too ambitious for the GNF. In [3] we also established the existence of a minimal family for the CNF; it turns out that the corresponding problem for the GNF remains open. But let us first have a look at a GNF-family as simple as the minimal CNF-family of [3].

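Definition 7.1. $\{G_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for even $n \ge 4$,
$N_n = V_n - \Sigma_n = \{S_n, A_1, A_2, \ldots, A_n\} \cup \{A_{i \cdots k} \mid i = 1, 3, 5, \ldots, n - 1;\ a_i \cdots a_k \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_2\}$,
$P_n = \{S_n \to a_i A_j A_{k \cdots m} \mid a_j A_{k \cdots m} A_i \ :\ i = 1, 3, \ldots, n - 1;\ a_i a_j a_k \cdots a_m \in C_n\} \cup \{A_k \to a_k \mid 1 \le k \le n\} \cup \{A_{i \cdots m} \to a_i A_j A_{k \cdots m} \mid i = 1, 3, \ldots, n - 1;\ a_i a_j a_k \cdots a_m \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_2\}$,
where $A_{k \cdots m}$ is taken equal to $\lambda$ whenever $a_k \cdots a_m$ equals $\lambda$;
and for odd $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n, A_1, A_2, \ldots, A_n\} \cup \{A_{i \cdots k} \mid i = 1, 3, 5, \ldots, n;\ a_i \cdots a_k \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_3\}$,
$P_n = \{S_n \to a_i A_j A_{k \cdots m} \mid a_j A_{k \cdots m} A_i \mid a_n A_{1 \cdots (n-2)} A_{n-1} \ :\ i = 1, 3, \ldots, n - 2;\ a_i a_j a_k \cdots a_m \in C_n\} \cup \{A_k \to a_k \mid 1 \le k \le n\} \cup \{A_{i \cdots m} \to a_i A_j A_{k \cdots m} \mid i = 1, 3, \ldots, n - 2;\ a_i a_j a_k \cdots a_m \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_3\} \cup \{A_{n 1 \cdots k} \to a_n A_{1 \cdots (k-1)} A_k \mid k = 2, 4, \ldots, n - 3;\ a_n a_1 \cdots a_k \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_3\}$.

Example 7.2. Let $G_7 = (V_7, \Sigma_7, P_7, S_7)$ be the member for $n = 7$ of the family of Definition 7.1, with
$P_7 = \{S_7 \to a_1 A_2 A_{34567} \mid a_2 A_{34567} A_1 \mid a_3 A_4 A_{56712} \mid a_4 A_{56712} A_3 \mid a_5 A_6 A_{71234} \mid a_6 A_{71234} A_5 \mid a_7 A_{12345} A_6,$
$A_{12345} \to a_1 A_2 A_{345},\ A_{34567} \to a_3 A_4 A_{567},\ A_{56712} \to a_5 A_6 A_{712},\ A_{71234} \to a_7 A_{123} A_4,$
$A_{123} \to a_1 A_2 A_3,\ A_{345} \to a_3 A_4 A_5,\ A_{567} \to a_5 A_6 A_7,\ A_{712} \to a_7 A_1 A_2\} \cup R_7$.
Then for this family $\nu(7) = 16 < 29 = \nu_2(7)$, $\pi(7) = 22 < 35 = \pi_2(7)$ and $\delta(7) = 7$.

Proposition 7.3. For the family $\{G_n\}_{n \ge 1}$ of Definition 7.1 we have
(1) $D(n, n) = 1$, $D(n, 1) = n$, and for even $n$ and $k = 2, 4, \ldots, n - 2$, $D(n, k) = \tfrac{1}{2} n$, for odd $n$ and $k = 3, 5, \ldots, n - 2$, $D(n, k) = \lceil \tfrac{1}{2} n \rceil$,
(2) $\nu(n) = \tfrac{1}{4} n^2 + \tfrac{1}{2} n + \tfrac{1}{4} + \tfrac{3}{4} \cdot [n \text{ is even}]$,
(3) $\pi(n) = \tfrac{1}{4} n^2 + \tfrac{3}{2} n - \tfrac{3}{4} \cdot [n \text{ is odd}]$,
(4) $\delta(n) = n$.

Proof. It is easy to establish (1) and (4); then for even $n$ we have $\nu(n) = 1 + n + (\tfrac{1}{2} n - 1) \cdot \tfrac{1}{2} n = \tfrac{1}{4} n^2 + \tfrac{1}{2} n + 1$, and for odd $n$, $\nu(n) = 1 + n + (\lfloor \tfrac{1}{2} n \rfloor - 1) \lceil \tfrac{1}{2} n \rceil = \tfrac{1}{4} n^2 + \tfrac{1}{2} n + \tfrac{1}{4}$. Finally, (3) follows from (2), (4) and Proposition 6.1.

Although this is an improvement with respect to Propositions 4.3, 5.3 and 6.4, the family of Definition 7.1 is by no means a minimal family as we will see from the following divide-and-conquer family; cf. Section 8 in [2].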
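As a quick plausibility check of Proposition 7.3, the counts can be evaluated for $n = 7$ (the grammar of Example 7.2) and for $n = 8$. The worked arithmetic below uses only the level counts $D(n, k)$ and the relation $\pi(n) = \nu(n) + n - 1$ of Proposition 6.1; the floor and ceiling brackets in the odd case are an assumed reading of the proposition.

```latex
\begin{align*}
\nu(7) &= 1 + 7 + \bigl(\lfloor 7/2 \rfloor - 1\bigr)\lceil 7/2 \rceil = 8 + 2 \cdot 4 = 16,
  & \pi(7) &= 16 + 7 - 1 = 22,\\
\nu(8) &= 1 + 8 + \bigl(\tfrac{8}{2} - 1\bigr)\cdot\tfrac{8}{2} = 9 + 3 \cdot 4 = 21,
  & \pi(8) &= 21 + 8 - 1 = 28.
\end{align*}
```

Both pairs agree with the closed forms: $\tfrac{1}{4} \cdot 49 + \tfrac{7}{2} + \tfrac{1}{4} = 16$ and $\tfrac{1}{4} \cdot 49 + \tfrac{21}{2} - \tfrac{3}{4} = 22$ for $n = 7$, and $\tfrac{1}{4} \cdot 64 + 4 + 1 = 21$ and $\tfrac{1}{4} \cdot 64 + 12 = 28$ for $n = 8$.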

    P_n := {A_i → a_i | a_i ∈ Σ_n} ∪ {S_n → a_i A_x A_y | a_i x y ∈ C_n; |x| = ⌈(n − 1)/2⌉};
    N_n := {S_n} ∪ {A_i | a_i ∈ Σ_n};
    M := {x, y | S_n → a_i A_x A_y ∈ P_n};
    while M ⊄ Σ_n [i.e. ∃x ∈ M : |x| ≥ 2] do
    begin
        N_n := N_n ∪ {A_x}; M := M − {x};
        case |x| of
            = 2: P_n := P_n ∪ {A_x → a_i A_j | a_i a_j = x};
            = 3: P_n := P_n ∪ {A_x → a_i A_j A_k | a_i a_j a_k = x};
            ≥ 4: begin
                     P_n := P_n ∪ {A_x → a_i A_y A_z | a_i y z = x; |y| = ⌈(|x| − 1)/2⌉};
                     M := M ∪ {y, z | a_i y z = x; |y| = ⌈(|x| − 1)/2⌉};
                 end
        endcase
    end

Figure 1: Algorithm to determine $N_n$ and $P_n$ of $G_n$.

Definition 7.4. $\{G_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ where the sets $N_n$ and $P_n$ are determined by the algorithm in Figure 1.

Example 7.5. For the family of Definition 7.4, $G_7 = (V_7, \Sigma_7, P_7, S_7)$ with
$P_7 = \{S_7 \to a_1 A_{234} A_{567} \mid a_2 A_{345} A_{671} \mid a_3 A_{456} A_{712} \mid a_4 A_{567} A_{123} \mid a_5 A_{671} A_{234} \mid a_6 A_{712} A_{345} \mid a_7 A_{123} A_{456},$
$A_{123} \to a_1 A_2 A_3,\ A_{234} \to a_2 A_3 A_4,\ A_{345} \to a_3 A_4 A_5,\ A_{456} \to a_4 A_5 A_6,\ A_{567} \to a_5 A_6 A_7,\ A_{671} \to a_6 A_7 A_1,\ A_{712} \to a_7 A_1 A_2\} \cup R_7$.
Now $\nu(7) = 15$, which is less than the value 16 for the family of Definition 7.1, $\pi(7) = 21$, which is less than the corresponding value 22, and $\delta(7) = 7$.

    n       3   4   5   6   7   8   9  10  11  12  13  14  15   16
    ν(n)    4   9  11  19  15  33  28  41  34  61  53  71  46   96
    π(n)    6  12  15  24  21  40  36  50  44  72  65  84  60  111

Table 1: $\nu(n)$ and $\pi(n)$ for $3 \le n \le 16$ (family of Definition 7.4).

As usual in analyzing such a divide-and-conquer approach, a closed form for $\nu(n)$ and $\pi(n)$ is very hard or even impossible to obtain; for small values we refer to Table 1. Only for special values of $n$ can we infer some manageable expressions.

Proposition 7.6. For the family $\{G_n\}_{n \ge 1}$ of Definition 7.4 we have in case $n = 2^k - 1$ ($k \ge 2$),
(1) $D(n, n) = 1$, $D(n, 2^i - 1) = n$ ($i = 1, 2, \ldots, k - 1$), and $D(n, i) = 0$ otherwise,
(2) $\nu(n) = n \log_2(n + 1) - n + 1$,
(3) $\pi(n) = n \log_2(n + 1)$,
(4) $\delta(n) = n$.
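The worklist algorithm of Figure 1 can be replayed in a few lines. The sketch below is one possible implementation, assuming that the split length is rounded up (the totals are the same if it is rounded down, since only the set of word lengths matters); words are index tuples, a nonterminal $A_x$ is represented by $x$, and the names `shifts` and `build_divide_and_conquer` are hypothetical.

```python
from math import ceil


def shifts(n, k):
    """The n words of F^n_k as index tuples, one per starting symbol (k <= n)."""
    return [tuple((s + j) % n + 1 for j in range(k)) for s in range(n)]


def build_divide_and_conquer(n):
    """Rules produced by the worklist algorithm of Figure 1, for n >= 3."""
    rules = {"S": []}
    for i in range(1, n + 1):
        rules[(i,)] = [(i,)]                       # A_i -> a_i
    pending = set()                                # the set M of Figure 1
    half = ceil((n - 1) / 2)                       # assumed rounding
    for w in shifts(n, n):
        x, y = w[1:1 + half], w[1 + half:]
        rules["S"].append((w[0], x, y))
        pending.update((x, y))
    while pending:
        x = pending.pop()
        if x in rules:                             # single symbols or already handled
            continue
        if len(x) <= 3:                            # cases |x| = 2 and |x| = 3
            rules[x] = [(x[0],) + tuple((s,) for s in x[1:])]
        else:                                      # case |x| >= 4: split again
            cut = 1 + ceil((len(x) - 1) / 2)
            y, z = x[1:cut], x[cut:]
            rules[x] = [(x[0], y, z)]
            pending.update((y, z))
    return rules


def measures(n):
    rules = build_divide_and_conquer(n)
    return len(rules), sum(len(rhs) for rhs in rules.values())


# Spot checks against Table 1 and against Proposition 7.6 for n = 2^k - 1.
assert measures(7) == (15, 21)
assert measures(10) == (41, 50)
for n in (7, 15, 31):
    k = (n + 1).bit_length() - 1                   # log_2(n + 1)
    assert measures(n) == (n * k - n + 1, n * k)
```

Because every nonterminal other than the start symbol receives exactly one rule, the grammars built this way satisfy $\pi(n) = \nu(n) + n - 1$, as Proposition 6.1 requires for unambiguous grammars.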

So using this divide-and-conquer approach we end up with $\nu(n)$ and $\pi(n)$ in $\Theta(n \log n)$, rather than in $\Theta(n^2)$ as for the previous families.

8 Concluding Remarks

We discussed a few ways of generating the languages $\{C_n\}_{n \ge 1}$ of circular shifts by context-free grammars $\{G_n\}_{n \ge 1}$ in GNF, and we compared these families with respect to the measures $\nu$, $\pi$ and $\delta$. Our results give rise to the following observation.

Conjecture 8.1. Any family of context-free grammars in GNF $\{G_n\}_{n \ge 1}$ that generates $\{C_n\}_{n \ge 1}$ must have measures $\nu(n)$ and $\pi(n)$ that are not bounded by any linear function in $n$.

The situation in the GNF-case differs considerably from the CNF-case: in [3] we established the existence of a minimal family in CNF for which $\nu$ and $\pi$ are linear functions in $n$ (even with small coefficients). For the GNF the definition of minimality remains a problem; viz. setting "$\{G_n\}_{n \ge 1}$ in GNF is minimal for $\{C_n\}_{n \ge 1}$ if (i) each $G_n$ is unambiguous, and (ii) $\nu(n) \in \Theta(f(n))$" leaves us with the question of an adequate choice for $f(n)$. Conjecture 8.1 implies $f(n) \in \omega(n)$. Taking $f(n)$ equal to $n \log n$ results in the minimality of the family of Definition 7.4 (Proposition 7.6), but the question whether this family is also minimal in an absolute sense (i.e., does there exist no family with $\nu(n) \in \Theta(n \log n)$ and $\nu(n) < n \log_2(n + 1) - n + 1$ for $n$ large enough with $n = 2^k - 1$ and $k \ge 2$?) remains an open problem as well.

References

1. B. Alspach, P. Eades & G. Rose, A lower-bound for the number of productions for a certain class of languages, Discrete Appl. Math. 6 (1983) 109-115.
2. P.R.J. Asveld, Generating all permutations by context-free grammars in Chomsky normal form, Theoret. Comput. Sci. 354 (2006) 118-130.
3. P.R.J. Asveld, Generating all circular shifts by context-free grammars in Chomsky normal form, CTIT TR 05-3 (2005), ISSN 1381-3625, University of Twente, Enschede, the Netherlands; to appear in J. Autom. Lang. Combin.
4. P.R.J. Asveld, Generating all permutations by context-free grammars in Greibach normal form (in preparation).
5. W. Bucher, A note on a problem in the theory of grammatical complexity, Theoret. Comput. Sci. 14 (1981) 337-344.
6. W. Bucher, H.A. Maurer & K. Culik II, Context-free complexity of finite languages, Theoret. Comput. Sci. 28 (1984) 277-285.
7. W. Bucher, H.A. Maurer, K. Culik II & D. Wotschke, Concise description of finite languages, Theoret. Comput. Sci. 14 (1981) 227-246.
8. J. Dassow, On the circular closure of languages, EIK 15 (1979) 87-94.
9. K. Ellul, B. Krawetz, J. Shallit & M.-w. Wang, Regular expressions: new results and open problems, J. Autom. Lang. Comb. 9 (2004) 233-256.
10. R.L. Graham, D.E. Knuth & O. Patashnik, Concrete Mathematics (1989), Addison-Wesley, Reading, MA.
11. J. Gruska, Some classifications of context-free languages, Inform. Contr. 14 (1969) 152-179.
12. M.A. Harrison, Introduction to Formal Language Theory (1978), Addison-Wesley, Reading, MA.
13. V.A. Iljuškin, The complexity of the grammatical description of context-free languages, Dokl. Akad. Nauk SSSR 203 (1972) 1244-1245 / Soviet Math. Dokl. 13 (1972) 533-535.
14. A. Kelemenová, Complexity of normal form grammars, Theoret. Comput. Sci. 28 (1984) 299-314.
15. G. Satta, personal communication (2002).