Generating All Circular Shifts by Context-Free Grammars in Greibach Normal Form

Peter R.J. Asveld
Department of Computer Science, Twente University of Technology
P.O. Box 217, 7500 AE Enschede, the Netherlands
e-mail: infprja@cs.utwente.nl

Abstract

For each alphabet $\Sigma_n = \{a_1, a_2, \ldots, a_n\}$, linearly ordered by $a_1 < a_2 < \cdots < a_n$, let $C_n$ be the language of circular or cyclic shifts over $\Sigma_n$, i.e., $C_n = \{a_1 a_2 \cdots a_{n-1} a_n,\ a_2 a_3 \cdots a_n a_1,\ \ldots,\ a_n a_1 \cdots a_{n-2} a_{n-1}\}$. We study a few families of context-free grammars $G_n$ ($n \ge 1$) in Greibach normal form such that $G_n$ generates $C_n$. The members of these grammar families are investigated with respect to the following descriptional complexity measures: the number of nonterminals $\nu(n)$, the number of rules $\pi(n)$ and the number of leftmost derivations $\delta(n)$ of $G_n$. As in the case of Chomsky normal form, these $\nu$, $\pi$ and $\delta$ are functions bounded by low-degree polynomials. However, the question whether there exists a family of grammars that is minimal with respect to all these measures remains open.

Keywords: context-free grammar, Greibach normal form, permutation, circular shift, cyclic shift, descriptional complexity, unambiguous grammar.

1 Introduction

Let $\Sigma_n = \{a_1, a_2, \ldots, a_n\}$ be an alphabet, linearly ordered by $a_1 < a_2 < \cdots < a_n$, and let $L_n$ be the language over $\Sigma_n$ of the $n!$ permutations of $a_1 a_2 \cdots a_n$. In 2002 G. Satta [15] conjectured that any context-free grammar $G_n$ in Chomsky normal form (CNF) that generates $L_n$ must have a number of nonterminal symbols that is not bounded by any polynomial function in $n$. This statement has been proved in [9], but without showing how to generate $\{L_n\}_{n \ge 1}$ by context-free grammars $\{G_n\}_{n \ge 1}$ in CNF. In [2] we provided several grammar families for $\{L_n\}_{n \ge 1}$ together with the usual descriptional complexity measures, such as the number of nonterminals $\nu(n)$ and the number of rules $\pi(n)$; cf. [11, 13, 14, 7, 5, 1, 6] for these measures. The relative descriptional complexity of these grammar families is anything but straightforward and the quest for a family of minimal grammars (with respect to these complexity measures) remains a challenging problem.

Then in [3] we investigated some specific permutations over $\Sigma_n$, viz. the circular or cyclic shifts, defined by

$C_n = \{a_1 a_2 \cdots a_{n-1} a_n,\ a_2 a_3 \cdots a_n a_1,\ a_3 a_4 \cdots a_n a_1 a_2,\ \ldots,\ a_n a_1 \cdots a_{n-2} a_{n-1}\}$.

An alternative definition of $C_n$ in terms of the so-called circular closure operator $c$ on languages $L$, which is defined by $c(L) = \{vu \mid uv \in L\}$ [8], is: $C_n = c(\{a_1 a_2 \cdots a_n\})$.

The fact that $C_n$ has only $n$ elements, far fewer than the $n!$ elements of $L_n$, is also reflected by the complexity measures of the corresponding grammar families: for $\{L_n\}_{n \ge 1}$, the functions $\nu(n)$ and $\pi(n)$ are exponential functions [15, 9, 2], whereas for $\{C_n\}_{n \ge 1}$, they are bounded by low-degree polynomial functions; cf. [3].

In this paper we investigate a few ways of generating the family $\{C_n\}_{n \ge 1}$ by context-free grammars in Greibach normal form (GNF) and for these families of grammars we determine the complexity measures $\nu$, $\pi$ and $\delta$ as functions of $n$. The results we obtain are rather similar to those in [3], as is the organization of this paper. Section 2 is devoted to preliminaries and in Section 3 we consider elementary properties of grammars $G_n$ in GNF for $C_n$. As using the arbitrary GNF for $C_n$ is trivial (Section 3), we focus in Sections 4 to 7 on the Greibach $k$-form ($k = 1, 2$). An approach based on the set of circularly ordered strings results in a grammar family in Greibach 2-form (Section 4). Modifying this family into Greibach 1-form in Section 5 results in a family with fewer rules. Unambiguous grammars for $C_n$ are studied in Section 6: then $\nu(n)$ and $\pi(n)$ are related in a simple way. In Section 7 we discuss minimality for unambiguous grammars in GNF, but the existence of a family of grammars for which these complexity measures are minimal remains open. Finally, some concluding remarks are in Section 8.
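The circular closure characterisation $C_n = c(\{a_1 a_2 \cdots a_n\})$ can be illustrated with a minimal Python sketch. The encoding is an assumption made here for illustration only: a word is a tuple of indices, so `(1, 2, 3)` stands for $a_1 a_2 a_3$, and the function names are hypothetical.

```python
from math import factorial


def circular_closure(language):
    """c(L) = {vu | uv in L}: collect every rotation of every word of L."""
    return {w[i:] + w[:i] for w in language for i in range(len(w) + 1)}


def circular_shifts(n):
    """C_n as a set of index tuples; (1, 2, ..., n) encodes a_1 a_2 ... a_n."""
    base = tuple(range(1, n + 1))
    return circular_closure({base})


if __name__ == "__main__":
    # Compare #C_n = n with #L_n = n! for a few small n.
    for n in range(1, 7):
        print(n, len(circular_shifts(n)), factorial(n))
```

The loop at the end only prints the two cardinalities side by side, which makes the gap between $n$ and $n!$ visible for small $n$.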
2 Preliminaries

For rudiments of discrete mathematics, particularly of combinatorics, and of formal language theory, we refer to standard texts like [10] and [12], respectively.

We denote the empty word by $\lambda$ and the length of the word $x$ by $|x|$. For each word $w$ over $\Sigma$, $\mathcal{A}(w)$ is defined as the set of all symbols from $\Sigma$ that occur in $w$. Formally, $\mathcal{A}(\lambda) = \emptyset$, and $\mathcal{A}(ax) = \{a\} \cup \mathcal{A}(x)$ for each $a \in \Sigma$ and $x \in \Sigma^*$. This mapping is extended to languages $L$ over $\Sigma$ by $\mathcal{A}(L) = \bigcup \{\mathcal{A}(w) \mid w \in L\}$.

Remember that a $\lambda$-free context-free grammar $G = (V, \Sigma, P, S)$ is in Chomsky normal form (CNF) if $P \subseteq N \times (N - \{S\})^2 \cup N \times \Sigma$ where $N = V - \Sigma$. And $G$ is in Greibach normal form (GNF) if $P \subseteq N \times \Sigma (N - \{S\})^*$. In particular, $G$ is in Greibach $k$-form if $P \subseteq N \times \Sigma \bigl(\bigcup_{i=0}^{k} (N - \{S\})^i\bigr)$.

For a context-free grammar $G = (V, \Sigma, P, S)$ with $\alpha \in V$, $L(G, \alpha)$ denotes the language defined by $L(G, \alpha) = \{w \in \Sigma^* \mid \alpha \Rightarrow^* w\}$. Thus for the language $L(G)$ generated by $G$, we have $L(G) = L(G, S)$. Notice that, if $G$ is in CNF or GNF, then $G$ has no useless symbols, $L(G, \alpha)$ is a nonempty language for each symbol $\alpha$ in $V$, and $L(G, \alpha) = \{\alpha\}$ for each $\alpha$ in $\Sigma$.
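The definition of Greibach $k$-form can be read as a simple test on the rule set. The following Python sketch is only an illustration under assumptions made here: a grammar is a dictionary from nonterminals to lists of right-hand sides, symbols are strings, and `greibach_k` is a hypothetical helper name.

```python
def greibach_k(rules, terminals, start):
    """Return the least k such that the grammar is in Greibach k-form,
    or None if some rule is not of the form A -> a B_1 ... B_m with
    every B_i a nonterminal different from the start symbol."""
    nonterminals = set(rules)
    k = 0
    for alternatives in rules.values():
        for rhs in alternatives:
            # The right-hand side must start with a terminal symbol.
            if not rhs or rhs[0] not in terminals:
                return None
            tail = rhs[1:]
            # Everything after the terminal is a nonterminal other than S.
            if any(s not in nonterminals or s == start for s in tail):
                return None
            k = max(k, len(tail))
    return k


# A two-letter grammar: S -> a1 A2 | a2 A1, A1 -> a1, A2 -> a2.
G2 = {
    "S": [("a1", "A2"), ("a2", "A1")],
    "A1": [("a1",)],
    "A2": [("a2",)],
}

if __name__ == "__main__":
    print(greibach_k(G2, {"a1", "a2"}, "S"))
```

For the two-letter grammar every rule has at most one nonterminal after the leading terminal, so it is in Greibach 1-form.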

In studying $C_n$ we need, as in [3], subwords of $a_1 a_2 \cdots a_n a_1 \cdots a_{n-1}$; so we consider the set $F^n_k$ of all subwords of length $k$ ($1 \le k \le n$) that obey the circular succession relation $\prec$ on $\Sigma_n$ defined by: $a_i \prec a_j$ if and only if either (i) $i < n$ and $j = i + 1$ or (ii) $i = n$ and $j = 1$; cf. Section 6.1 in [10]. Clearly, $\prec$ is not a transitive relation: it is a kind of successor relation. Then the formal definition of $F^n_k$ reads

$F^n_k = \{x \in \Sigma_n^+ \mid \exists u, v \in \Sigma_n^* : uxv = a_1 a_2 \cdots a_n a_1 a_2 \cdots a_{k-1};\ |x| = k\}$

with $1 \le k \le n$; their partial unions $Q^n_m$ are defined by $Q^n_m = \bigcup_{k=1}^{m} F^n_k$ ($1 \le m \le n$).

For a finite set $X$, we denote the cardinality of $X$ by $\#X$. Then we obviously have $C_n = F^n_n = Q^n_n - Q^n_{n-1}$, $\#F^n_k = n$ ($1 \le k \le n$), $\#Q^n_m = mn$ and $\#C_n = \#F^n_n = n$.

3 Elementary Properties

We first recall some simple properties of grammars in GNF that generate $L_n$ (the language of all permutations over $\Sigma_n$). From [4] we quote the following results.

Proposition 3.1. For $n \ge 1$, let $G_n = (V_n, \Sigma_n, P_n, S_n)$ be a context-free grammar in GNF that generates $L_n$, and let $A, B \in N_n = V_n - \Sigma_n$.
(1) The language $L(G_n, A)$ is a nonempty subset of an isomorphic copy $M_k$ of the language $L_k$ for some $k$ ($1 \le k \le n$). Consequently, each string $z$ in $L(G_n, A)$ has length $k$, $z$ consists of $k$ different symbols, and $\mathcal{A}(z) = \mathcal{A}(L(G_n, A))$.
(2) If $L(G_n, A) \cap L(G_n, B) \ne \emptyset$, then $\mathcal{A}(L(G_n, A)) = \mathcal{A}(L(G_n, B))$.
(3) If $A \to a A_1 A_2 \cdots A_m$ is a rule in $G_n$, then for each $(i, j)$ with $1 \le i < j \le m$, $\mathcal{A}(L(G_n, A_i)) \cap \mathcal{A}(L(G_n, A_j)) = \emptyset$, $a \notin \mathcal{A}(L(G_n, A_k))$ with $1 \le k \le m$, and $\mathcal{A}(L(G_n, A)) = \{a\} \cup \mathcal{A}(L(G_n, A_1)) \cup \mathcal{A}(L(G_n, A_2)) \cup \cdots \cup \mathcal{A}(L(G_n, A_m))$.

This result gives rise to the following equivalence relation on $N_n$: $A$ and $B$ in $N_n$ are called equivalent if $|x| = |y|$ for some $x \in L(G_n, A)$ and some $y \in L(G_n, B)$. The equivalence classes are denoted by $\{E_{n,k}\}_{k=1}^{n}$. The number of elements $\#E_{n,k}$ of the equivalence class $E_{n,k}$ will be denoted by $D(n, k)$ ($1 \le k \le n$).

Example 3.2. Consider $G^P_3 = (V_3, \Sigma_3, P_3, A_{123})$ with $N_3 = \{A_{123}, A_{12}, A_{13}, A_{23}, A_1, A_2, A_3\}$ and $P_3$ consists of $A_{123} \to a_1 A_{23} \mid a_2 A_{13} \mid a_3 A_{12}$, $A_{12} \to a_1 A_2 \mid a_2 A_1$, $A_{13} \to a_1 A_3 \mid a_3 A_1$, $A_{23} \to a_2 A_3 \mid a_3 A_2$, $A_1 \to a_1$, $A_2 \to a_2$ and $A_3 \to a_3$. Clearly, $G^P_3$ is in GNF. Then $L(G^P_3) = L_3$, $E_{3,3} = \{A_{123}\}$, $E_{3,2} = \{A_{12}, A_{13}, A_{23}\}$, $E_{3,1} = \{A_1, A_2, A_3\}$, and hence $D(3, 3) = 1$ and $D(3, 2) = D(3, 1) = 3$.

Proposition 3.1 relies on the fact that each word in $L(G_n)$ is a permutation. As circular shifts are special permutations, Proposition 3.1 still applies; but what is particular about generating $C_n$ rather than $L_n$ is expressed in Proposition 3.4, the proof of which depends on the following result from [3].

Lemma 3.3. If $X$ is a nonempty proper subalphabet of $\Sigma_n$, then there exists at most one word $x$ with $\mathcal{A}(x) = X$ such that $x$ satisfies the circular succession relation. And if $X = \{b_1, b_2, \ldots, b_l\}$, then $x = b_{p(1)} b_{p(2)} \cdots b_{p(l)}$ provided there exists a permutation $p$ of $\{1, 2, \ldots, l\}$ such that $b_{p(1)} \prec b_{p(2)} \prec \cdots \prec b_{p(l)}$.

For each $w \in \Sigma_n^+$, let $\alpha(w)$ be the first and $\omega(w)$ be the last symbol of $w$. Thus if $w = \sigma_1 \sigma_2 \cdots \sigma_m$ with $\sigma_i \in \Sigma_n$ ($1 \le i \le m$), then $\alpha(w) = \sigma_1$ and $\omega(w) = \sigma_m$.

Proposition 3.4. Let $G_n = (V_n, \Sigma_n, P_n, S_n)$ be a context-free grammar in GNF that generates $C_n$, and let $\alpha, \beta \in V_n - \{S_n\}$.
(1) For each $\alpha$, the language $L(G_n, \alpha)$ is a singleton.
(2) If $L(G_n, \alpha) \cap L(G_n, \beta) \ne \emptyset$, then $L(G_n, \alpha) = L(G_n, \beta)$.
(3) If $A \to a A_1 A_2 \cdots A_m$ is in $P_n$ with $L(G_n, A_i) = \{x_i\}$ ($1 \le i \le m$), then $a \prec \alpha(x_1)$ and for each $i$ ($1 \le i < m$), $\omega(x_i) \prec \alpha(x_{i+1})$. Consequently, if $L(G_n, A_i) = \{x_i\} \subseteq F^n_{\Lambda(i)}$, then $L(G_n, A) = \{a x_1 x_2 \cdots x_m\} \subseteq F^n_k$ with $k = 1 + \sum_{i=1}^{m} \Lambda(i)$.

Proof. (1) $G_n$ is in GNF; so each symbol $\alpha$ in $V_n - \{S_n\}$ is useful: there is a derivation $S_n \Rightarrow^+ \varphi \alpha \psi \Rightarrow^+ \varphi x_\alpha \psi \Rightarrow^+ x$ where $x$ is a circular shift, $\varphi\psi \ne \lambda$, $\mathcal{A}(x) = \Sigma_n$, and $1 \le \#\mathcal{A}(x_\alpha) < n$. Now, by Lemma 3.3, $L(G_n, \alpha)$ contains at most one word over $\Sigma_n$, and since $L(G_n, \alpha)$ is nonempty, $L(G_n, \alpha)$ is a singleton.
(2) As $L(G_n, \alpha)$ and $L(G_n, \beta)$ are singletons by (1), $L(G_n, \alpha) \cap L(G_n, \beta) \ne \emptyset$ implies that they are equal.
Finally, (3) is a direct consequence of the fact that $L(G_n, A) \subseteq Q^n_n$.

Henceforth, in examples we will always assume tacitly that $E_{n,1} = \{A_1, \ldots, A_n\}$ and we will use $R_n = \{A_i \to a_i \mid A_i \in E_{n,1}\}$.

Example 3.5. Consider $G^U_4 = (V_4, \Sigma_4, P_4, S_4)$ in GNF with $P_4 = \{S_4 \to a_1 A_{23} A_4 \mid a_2 A_{34} A_1 \mid a_3 A_{41} A_2 \mid a_4 A_{12} A_3,\ A_{12} \to a_1 A_2,\ A_{23} \to a_2 A_3,\ A_{34} \to a_3 A_4,\ A_{41} \to a_4 A_1\} \cup R_4$. Then $L(G^U_4) = C_4$, $E_{4,4} = \{S_4\}$, $E_{4,3} = \emptyset$, $E_{4,2} = \{A_{12}, A_{23}, A_{34}, A_{41}\}$, $E_{4,1} = \{A_1, A_2, A_3, A_4\}$, $D(4, 4) = 1$, $D(4, 3) = 0$ and $D(4, 2) = D(4, 1) = 4$. Since $S_4 \to a_3 A_{41} A_2$ is in $P_4$, we have $L(G^U_4, A_{41}) = \{a_4 a_1\}$, $L(G^U_4, A_2) = \{a_2\}$, $a_3 \prec a_4 = \alpha(a_4 a_1)$ and $\omega(a_4 a_1) = a_1 \prec a_2 = \alpha(a_2)$.

As measures for the descriptional complexity of $G_n$ from $\{G_n\}_{n \ge 1}$, we use $\nu(n) = \#N_n$ and $\pi(n) = \#P_n$; cf. [11, 13, 14, 7, 5, 1, 6].
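The sets $F^n_k$ and $Q^n_m$ and the circular succession relation are easy to enumerate. The sketch below is an illustration only; it assumes the encoding of $a_i$ by the index $i$ and uses hypothetical function names.

```python
def succ(i, j, n):
    """Circular succession on indices: a_i precedes a_j iff j follows i cyclically."""
    return j == i + 1 if i < n else j == 1


def F(n, k):
    """F^n_k: the n cyclically consecutive words of length k, as index tuples."""
    return {tuple((s + t) % n + 1 for t in range(k)) for s in range(n)}


def Q(n, m):
    """Q^n_m: the union of F^n_1, ..., F^n_m."""
    words = set()
    for k in range(1, m + 1):
        words |= F(n, k)
    return words


if __name__ == "__main__":
    n = 5
    print(sorted(F(n, 3)))
    print(len(Q(n, 3)), len(Q(n, n) - Q(n, n - 1)))
```

For $n = 5$ this lists the five words of $F^5_3$ and prints the cardinalities of $Q^5_3$ and of $Q^5_5 - Q^5_4 = C_5$, in line with $\#Q^n_m = mn$ and $\#C_n = n$.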
A less-known measure has been introduced in [2, 3], viz. the number of leftmost derivations $\delta(n)$ of $G_n$. Remember that in a leftmost derivation the leftmost nonterminal is always expanded. Thus $\delta(n) = \#\{S_n \Rightarrow_L^* x \mid x \in L(G_n)\}$, where $\Rightarrow_L$ denotes the leftmost derivation relation. Clearly, this measure makes sense when we generate a finite language by a $\lambda$-free grammar with bounded ambiguity.

Notice that these descriptional complexity measures depend on $n$ as well as on the family under consideration; so we use $\nu_\alpha(n)$, $\pi_\alpha(n)$ and $\delta_\alpha(n)$ in the context of a family $\{G^\alpha_n\}_{n \ge 1}$ of which the individual members are labeled by $\alpha$.

Example 3.6. For $G^P_3$ of Example 3.2, we have $\nu_P(3) = 7$, $\pi_P(3) = 12$ and $\delta_P(3) = 3! = 6$, since $G^P_3$ is unambiguous [2]. Similarly, for the unambiguous $G^U_4$ of Example 3.5, we have $\nu_U(4) = 9$, $\pi_U(4) = 12$ and $\delta_U(4) = 4$.
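The three measures can be computed mechanically for a concrete grammar. The Python sketch below does so for $G^U_4$ of Example 3.5. It assumes a dictionary encoding of the rules (chosen here for illustration) and counts leftmost derivations as derivation trees, which is valid for a $\lambda$-free grammar that generates a finite language; `derivation_counts` is a hypothetical helper name.

```python
from collections import Counter
from functools import lru_cache


def derivation_counts(rules, start):
    """Map each word of L(G) to its number of leftmost derivations.

    Assumes a lambda-free, non-recursive grammar; symbols that are not
    keys of `rules` are treated as terminals."""

    @lru_cache(maxsize=None)
    def words(symbol):
        if symbol not in rules:
            return Counter({(symbol,): 1})
        total = Counter()
        for rhs in rules[symbol]:
            partial = Counter({(): 1})
            for s in rhs:
                extended = Counter()
                for u, cu in partial.items():
                    for v, cv in words(s).items():
                        extended[u + v] += cu * cv
                partial = extended
            total.update(partial)  # Counter.update adds the counts
        return total

    return dict(words(start))


# G^U_4 of Example 3.5.
G_U4 = {
    "S": [("a1", "A23", "A4"), ("a2", "A34", "A1"),
          ("a3", "A41", "A2"), ("a4", "A12", "A3")],
    "A12": [("a1", "A2")], "A23": [("a2", "A3")],
    "A34": [("a3", "A4")], "A41": [("a4", "A1")],
    "A1": [("a1",)], "A2": [("a2",)], "A3": [("a3",)], "A4": [("a4",)],
}

if __name__ == "__main__":
    counts = derivation_counts(G_U4, "S")
    nu = len(G_U4)
    pi = sum(len(alts) for alts in G_U4.values())
    delta = sum(counts.values())
    print(nu, pi, delta)
```

A grammar is unambiguous exactly when every count in the returned dictionary equals 1, so the same helper can be used to detect ambiguity in small examples.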

For each family $\{G^\alpha_n\}_{n \ge 1} = \{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ for $\{C_n\}_{n \ge 1}$ to be considered in the sequel, we assume that the first two (unspecified) elements $G^\alpha_1$ and $G^\alpha_2$ satisfy $N_1 = \{S_1\}$ and $P_1 = \{S_1 \to a_1\}$ for $G^\alpha_1$, and $N_2 = \{S_2, A_1, A_2\}$, $P_2 = \{S_2 \to a_1 A_2 \mid a_2 A_1,\ A_1 \to a_1,\ A_2 \to a_2\}$ for $G^\alpha_2$. Then $\nu_\alpha(1) = \pi_\alpha(1) = \delta_\alpha(1) = 1$, $\nu_\alpha(2) = 3$, $\pi_\alpha(2) = 4$, $\delta_\alpha(2) = 2$, whereas for $n \ge 3$, $\nu_\alpha(n) = \sum_{k=1}^{n} D(n, k) \ge n + 1$, $\pi_\alpha(n) \ge n + 2$ and $\delta_\alpha(n) \ge n$. This implies that specifying a family $\{G^\alpha_n\}_{n \ge 1}$ reduces to defining the family $\{G^\alpha_n\}_{n \ge 3}$.

As an illustration we consider a simple family of grammars in GNF for $\{C_n\}_{n \ge 1}$. It is based on a single nonterminal $S_n$ and the trivial set of rules $\{S_n \to w \mid w \in C_n\}$. To obtain grammars in GNF we need isomorphisms $\varphi_n : \Sigma_n \to \{A_1, A_2, \ldots, A_n\}$ defined by $\varphi_n(a_i) = A_i$ ($1 \le i \le n$), that are extended to words by $\varphi_n(\sigma_1 \sigma_2 \cdots \sigma_k) = \varphi_n(\sigma_1) \varphi_n(\sigma_2) \cdots \varphi_n(\sigma_k)$ ($\sigma_i \in \Sigma_n$, $1 \le i \le k$).

Definition 3.7. $\{G^T_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_i \mid 1 \le i \le n\}$,
$P_n = \{S_n \to \sigma_1 \varphi_n(\sigma_2 \cdots \sigma_n) \mid \sigma_1 \sigma_2 \cdots \sigma_n \in C_n\} \cup \{A_i \to a_i \mid 1 \le i \le n\}$.

In the sequel we will slightly change our notation in order to reduce the number of subscript levels: if $x = a_j \cdots a_k$, we will write $A_{j \cdots k}$ for $A_x$ instead of $A_{a_j \cdots a_k}$; cf. Examples 3.2 and 3.5 above. In this way the set of indices $\{1, 2, \ldots, n\}$ inherits the linear order of $\Sigma_n$; a similar remark applies with respect to the $\prec$-relation.

Example 3.8. For $n = 3$, we have $G^T_3 = (V_3, \Sigma_3, P_3, S_3)$ with $P^T_3 = \{S_3 \to a_1 A_2 A_3 \mid a_2 A_3 A_1 \mid a_3 A_1 A_2\} \cup R_3$. Now $E_{3,3} = \{S_3\}$, $E_{3,2} = \emptyset$, $E_{3,1} = \{A_1, A_2, A_3\}$, $D(3, 3) = 1$, $D(3, 2) = 0$, $D(3, 1) = 3$, $\nu_T(3) = 4$, $\pi_T(3) = 6$ and $\delta_T(3) = 3$.

The following result easily follows from Definition 3.7.

Proposition 3.9. For the family $\{G^T_n\}_{n \ge 1}$ of Definition 3.7 we have for $n \ge 3$,
(1) $D(n, n) = 1$, $D(n, k) = 0$ ($1 < k < n$), and $D(n, 1) = n$,
(2) $\nu_T(n) = n + 1$,
(3) $\pi_T(n) = 2n$,
(4) $\delta_T(n) = n$.
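Definition 3.7 translates directly into a small generator. The following Python sketch builds the rules of $G^T_n$ and reads off $\nu_T(n)$ and $\pi_T(n)$; the string encoding of symbols (`"a1"`, `"A1"`, `"S"`) and the name `trivial_grammar` are assumptions made for illustration.

```python
def trivial_grammar(n):
    """Rules of G^T_n as a dict: nonterminal -> list of right-hand sides."""
    # A_i -> a_i for every i.
    rules = {f"A{i}": [(f"a{i}",)] for i in range(1, n + 1)}
    rules["S"] = []
    for s in range(n):
        # The circular shift starting at a_{s+1}.
        shift = [(s + t) % n + 1 for t in range(n)]
        # S_n -> sigma_1 phi_n(sigma_2 ... sigma_n).
        rhs = (f"a{shift[0]}",) + tuple(f"A{j}" for j in shift[1:])
        rules["S"].append(rhs)
    return rules


if __name__ == "__main__":
    for n in range(3, 7):
        rules = trivial_grammar(n)
        nu = len(rules)
        pi = sum(len(alts) for alts in rules.values())
        print(n, nu, pi)
```

Each right-hand side of $S_n$ has $n - 1$ nonterminals, which is why this family lies in unrestricted GNF rather than in Greibach $k$-form for a fixed $k$.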
4 Greibach 2-form: A Straightforward Approach

The trivial family $\{G^T_n\}_{n \ge 1}$ of grammars in (unrestricted) GNF of Definition 3.7 gives rise to simple results that are not very interesting. Therefore we restrict ourselves in the sequel to grammars in Greibach $k$-form with $k = 1, 2$. It turns out that in those cases the corresponding descriptional complexity measures are less trivial.

The idea on which our next family of grammars is based stems from Proposition 3.4: we have nonterminals $A_x$ for all strings $x$ in $Q^n_{n-1}$ with $|x| = k < n$ such that $L(G_n, A_x) = \{x\}$ and $A_x \in E_{n,k}$. For the words in $C_n = Q^n_n - Q^n_{n-1}$ we have rules $S_n \to a A_x A_y$ for all nonempty words $x$ and $y$ with $axy \in C_n$, and $E_{n,n} = \{S_n\}$.

Definition 4.1. $\{G^0_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_x \mid x \in Q^n_{n-2}\}$,
$P_n = \{S_n \to a A_x A_y \mid axy \in C_n;\ a \in \Sigma_n;\ x, y \in \Sigma_n^+\} \cup \{A_a \to a \mid a \in \Sigma_n\} \cup \{A_{axy} \to a A_x A_y \mid a \in \Sigma_n;\ axy \in Q^n_{n-2};\ x, y \in \Sigma_n^+\} \cup \{A_{ab} \to a A_b \mid a, b \in \Sigma_n;\ a \prec b\}$.

Example 4.2. Consider $G^0_5 = (V_5, \Sigma_5, P_5, S_5)$ with
$P_5 = \{S_5 \to a_1 A_2 A_{345} \mid a_1 A_{23} A_{45} \mid a_1 A_{234} A_5 \mid a_2 A_3 A_{451} \mid a_2 A_{34} A_{51} \mid a_2 A_{345} A_1 \mid a_3 A_4 A_{512} \mid a_3 A_{45} A_{12} \mid a_3 A_{451} A_2 \mid a_4 A_5 A_{123} \mid a_4 A_{51} A_{23} \mid a_4 A_{512} A_3 \mid a_5 A_1 A_{234} \mid a_5 A_{12} A_{34} \mid a_5 A_{123} A_4$,
$A_{123} \to a_1 A_2 A_3$, $A_{234} \to a_2 A_3 A_4$, $A_{345} \to a_3 A_4 A_5$, $A_{451} \to a_4 A_5 A_1$, $A_{512} \to a_5 A_1 A_2$,
$A_{12} \to a_1 A_2$, $A_{23} \to a_2 A_3$, $A_{34} \to a_3 A_4$, $A_{45} \to a_4 A_5$, $A_{51} \to a_5 A_1\} \cup R_5$.
We have $E_{5,5} = \{S_5\}$, $E_{5,4} = \emptyset$, $E_{5,3} = \{A_{123}, A_{234}, A_{345}, A_{451}, A_{512}\}$, $E_{5,2} = \{A_{12}, A_{23}, A_{34}, A_{45}, A_{51}\}$, $E_{5,1} = \{A_1, A_2, A_3, A_4, A_5\}$, $D(5, 5) = 1$, $D(5, 4) = 0$, $D(5, 3) = D(5, 2) = D(5, 1) = 5$. Consequently, $\nu_0(5) = 16$, $\pi_0(5) = 30$ and $\delta_0(5) = 15$; hence $G^0_5$ is ambiguous.

In general, if $S$ is a statement that can be true or false, then $[S]$ is equal to 1 if $S$ is true, and to 0 otherwise; cf. [10].

Proposition 4.3. For the family $\{G^0_n\}_{n \ge 1}$ of Definition 4.1 we have for $n \ge 3$,
(1) $D(n, n) = 1$, $D(n, n-1) = 0$, and $D(n, k) = n$ with $1 \le k < n - 1$,
(2) $\nu_0(n) = n^2 - 2n + 1$,
(3) $\pi_0(n) = [n \ge 4] \cdot (\tfrac12 n^3 - \tfrac72 n^2 + 6n) + n^2$,
(4) $\delta_0(n) = n^2 - 2n$.

Proof. From Definition 4.1 it follows that for $n \ge 3$, $\nu_0(n) = \#N_n = 1 + \#Q^n_{n-2} = 1 + (n-2)n = n^2 - 2n + 1$, while $\pi_0(n) = h_0(n) + h_1(n) + h_2(n) + h_3(n)$ with
$h_0(n) = \#\{S_n \to a A_x A_y \mid axy \in C_n;\ a \in \Sigma_n;\ x, y \in \Sigma_n^+\}$,
$h_1(n) = \#\{A_{axy} \to a A_x A_y \mid a \in \Sigma_n;\ axy \in Q^n_{n-2};\ x, y \in \Sigma_n^+\}$,
$h_2(n) = \#\{A_{ab} \to a A_b \mid a, b \in \Sigma_n;\ a \prec b\}$,
$h_3(n) = \#\{A_a \to a \mid a \in \Sigma_n\}$.
Clearly, $h_0(n) = n(n-2)$ and $h_2(n) = h_3(n) = n$. For $h_1$ we observe that $h_1(3) = 0$, and for $n \ge 4$, we have $h_1(n) = \sum_{k=3}^{n-2} (k-2) \cdot n = \tfrac12 n(n-3)(n-4) = \tfrac12 n^3 - \tfrac72 n^2 + 6n$. So $\pi_0(n) = n(n-2) + [n \ge 4] \cdot (\tfrac12 n^3 - \tfrac72 n^2 + 6n) + n + n = [n \ge 4] \cdot (\tfrac12 n^3 - \tfrac72 n^2 + 6n) + n^2$.
The grammar $G^0_n$ generates $n$ strings, each of which can be obtained by a leftmost derivation in $n - 2$ ways ($n \ge 3$); consequently, we have $\delta_0(n) = n(n-2)$.
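The counts $\nu_0(n)$ and $\pi_0(n)$ can be cross-checked by generating the rules of $G^0_n$. The sketch below is an illustration under assumptions made here: words are index tuples, a nonterminal $A_x$ is encoded as the pair `("A", x)`, the nonterminals range over words of length at most $n - 2$, and only $n \ge 4$ is considered (so that the rules $A_{ab} \to a A_b$ belong to nonterminals of $N_n$).

```python
def F(n, k):
    """F^n_k as a set of index tuples."""
    return {tuple((s + t) % n + 1 for t in range(k)) for s in range(n)}


def grammar_g0(n):
    """Rules of G^0_n for n >= 4 as a list of (lhs, rhs) pairs."""
    rules = []
    for w in sorted(F(n, n)):
        # S_n -> a A_x A_y with x = w[1:cut], y = w[cut:], both nonempty.
        for cut in range(2, n):
            rules.append(("S", (w[0], ("A", w[1:cut]), ("A", w[cut:]))))
    for k in range(1, n - 1):
        for w in sorted(F(n, k)):
            if k == 1:
                rules.append((("A", w), (w[0],)))
            elif k == 2:
                rules.append((("A", w), (w[0], ("A", w[1:]))))
            else:
                # A_{axy} -> a A_x A_y for every split into nonempty x, y.
                for cut in range(2, k):
                    rules.append((("A", w), (w[0], ("A", w[1:cut]), ("A", w[cut:]))))
    return rules


def measures(rules):
    """Return (number of nonterminals, number of rules)."""
    return len({lhs for lhs, _ in rules}), len(rules)


if __name__ == "__main__":
    for n in range(4, 9):
        nu, pi = measures(grammar_g0(n))
        print(n, nu, pi, n * n - 2 * n + 1, n * (n - 3) * (n - 4) // 2 + n * n)
```

The last two printed columns are the closed forms of Proposition 4.3(2) and (3), written with integer arithmetic via $\tfrac12 n(n-3)(n-4) + n^2$.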

5 Greibach 1-form: An Improvement

The next grammar family $\{G^1_n\}_{n \ge 1}$ to generate $\{C_n\}_{n \ge 1}$ consists of context-free grammars in Greibach 1-form; this family is closely related to Definition 5.1 in [3], which in turn has been inspired by generating $\{C_n\}_{n \ge 1}$ with regular grammars.

Definition 5.1. $\{G^1_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_x \mid x \in Q^n_{n-1}\}$,
$P_n = \{S_n \to a A_x \mid ax \in C_n;\ a \in \Sigma_n\} \cup \{A_a \to a \mid a \in \Sigma_n\} \cup \{A_{ax} \to a A_x \mid ax \in Q^n_{n-1};\ a \in \Sigma_n,\ x \in \Sigma_n^+\}$.

Example 5.2. For $n = 3$, we obtain $G^1_3 = (V_3, \Sigma_3, P_3, S_3)$ with $P_3 = \{S_3 \to a_1 A_{23} \mid a_2 A_{31} \mid a_3 A_{12},\ A_{12} \to a_1 A_2,\ A_{23} \to a_2 A_3,\ A_{31} \to a_3 A_1\} \cup R_3$. Then we have $E_{3,3} = \{S_3\}$, $E_{3,2} = \{A_{12}, A_{23}, A_{31}\}$, $E_{3,1} = \{A_1, A_2, A_3\}$, $D(3, 3) = 1$, $D(3, 2) = D(3, 1) = 3$, $\nu_1(3) = 7$, $\pi_1(3) = 9$ and $\delta_1(3) = 3$.

The proof of the following result is similar to the one of Proposition 5.3 in [3].

Proposition 5.3. For the family $\{G^1_n\}_{n \ge 1}$ of Definition 5.1 we have for $n \ge 3$,
(1) $D(n, n) = 1$ and $D(n, k) = n$ with $1 \le k < n$,
(2) $\nu_1(n) = n^2 - n + 1$,
(3) $\pi_1(n) = n^2$,
(4) $\delta_1(n) = n$.

Comparing Propositions 4.3 and 5.3 yields, for $n \ge 4$, $\nu_0(n) < \nu_1(n)$ and $\delta_0(n) > \delta_1(n)$, and, for $n \ge 5$, $\pi_0(n) > \pi_1(n)$ (for $n = 4$ both formulas give $\pi_0(4) = \pi_1(4) = 16$). The latter two inequalities may be considered as an improvement; the price we have to pay is $n$ additional nonterminal symbols.

6 Families of Unambiguous Grammars

In [3] we argued that a first step towards minimal grammars in CNF is to avoid ambiguity. The situation for the GNF is very similar: the following crucial result and its proof are almost identical to the one for CNF in [3].

Proposition 6.1. Let $\{G_n\}_{n \ge 1}$ be a family of grammars in GNF that generates $\{C_n\}_{n \ge 1}$. Then $\delta(n) = n$ if and only if $\pi(n) = \nu(n) + n - 1$.

The proof tells us that in an unambiguous grammar for $C_n$, there are $n$ rules for $S_n$ and a single rule for each $A \in N_n - \{S_n\}$. So we try to minimize $\nu(n)$, and as a consequence $\pi(n)$ will reach its minimum value as well.
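The counting behind the remark on Proposition 6.1 can be written out as a short worked computation. It is only a restatement of the rule count described above (one rule per nonterminal other than $S_n$, and $n$ rules for $S_n$), checked against values already given in Examples 3.6 and 3.8 and Proposition 5.3.

```latex
\begin{align*}
\pi(n) &= \underbrace{n}_{\text{rules for } S_n}
        + \underbrace{\nu(n) - 1}_{\text{one rule per } A \in N_n - \{S_n\}}
        = \nu(n) + n - 1, \\
G^U_4: \quad & \nu_U(4) + 4 - 1 = 9 + 3 = 12 = \pi_U(4), \\
G^T_3: \quad & \nu_T(3) + 3 - 1 = 4 + 2 = 6 = \pi_T(3), \\
G^1_n: \quad & \nu_1(n) + n - 1 = (n^2 - n + 1) + n - 1 = n^2 = \pi_1(n).
\end{align*}
```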
Clearly, Proposition 6.1 applies to $\{G^1_n\}_{n \ge 1}$ but not to $\{G^0_n\}_{n \ge 1}$. However, $\{G^1_n\}_{n \ge 1}$ is not the only family satisfying Proposition 6.1; another one will be introduced now.

Definition 6.2. $\{G^2_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n\} \cup \{A_a \mid a \in \Sigma_n\} \cup M_n$ with for $m \ge 2$,
$M_{2m} = \{A_x \mid x \in F^{2m}_{2m-2} \cup F^{2m}_{2m-4} \cup \cdots \cup F^{2m}_2\}$, and
$M_{2m-1} = \{A_x \mid x \in F^{2m-1}_{2m-3} \cup F^{2m-1}_{2m-5} \cup \cdots \cup F^{2m-1}_3\}$,
$P_n = \{S_n \to a A_b A_x \mid a, b \in \Sigma_n;\ x \in \Sigma_n^+;\ abx \in C_n;\ x \in F^n_{n-2}\} \cup Q_n \cup \{A_{abx} \to a A_b A_x \mid a, b \in \Sigma_n;\ A_{abx} \in M_n;\ x \in \Sigma_n^+\} \cup \{A_a \to a \mid a \in \Sigma_n\}$ with for $m \ge 2$,
$Q_{2m} = \{A_{ab} \to a A_b \mid a, b \in \Sigma_n;\ a \prec b\}$, and $Q_{2m-1} = \emptyset$.

Example 6.3. Let $G^2_6 = (V_6, \Sigma_6, P_6, S_6)$ with
$P_6 = \{S_6 \to a_1 A_2 A_{3456} \mid a_2 A_3 A_{4561} \mid a_3 A_4 A_{5612} \mid a_4 A_5 A_{6123} \mid a_5 A_6 A_{1234} \mid a_6 A_1 A_{2345}$,
$A_{1234} \to a_1 A_2 A_{34}$, $A_{2345} \to a_2 A_3 A_{45}$, $A_{3456} \to a_3 A_4 A_{56}$, $A_{4561} \to a_4 A_5 A_{61}$, $A_{5612} \to a_5 A_6 A_{12}$, $A_{6123} \to a_6 A_1 A_{23}$,
$A_{12} \to a_1 A_2$, $A_{23} \to a_2 A_3$, $A_{34} \to a_3 A_4$, $A_{45} \to a_4 A_5$, $A_{56} \to a_5 A_6$, $A_{61} \to a_6 A_1\} \cup R_6$.
Now $E_{6,6} = \{S_6\}$, $E_{6,5} = \emptyset$, $E_{6,4} = \{A_{1234}, A_{2345}, A_{3456}, A_{4561}, A_{5612}, A_{6123}\}$, $E_{6,3} = \emptyset$, $E_{6,2} = \{A_{12}, A_{23}, A_{34}, A_{45}, A_{56}, A_{61}\}$ and $E_{6,1} = \{A_1, A_2, A_3, A_4, A_5, A_6\}$. Then we obtain $\nu_2(6) = 19 < 31 = \nu_1(6)$ and $\pi_2(6) = 24 < 36 = \pi_1(6)$.

Proposition 6.4. For the family $\{G^2_n\}_{n \ge 1}$ of Definition 6.2, we have
(1) $D(n, n) = 1$, $D(n, n-1) = 0$, $D(n, 1) = n$, and for $2 \le k < n - 1$, $D(n, k) = n$ if $k \equiv n \pmod 2$ and $D(n, k) = 0$ otherwise,
(2) for $n \ge 3$, $\nu_2(n) = \tfrac12 n^2 + \tfrac12 n \cdot [n \text{ is odd}] + 1$,
(3) for $n \ge 3$, $\pi_2(n) = \tfrac12 n^2 + \tfrac12 n \cdot ([n \text{ is odd}] + 2)$,
(4) $\delta_2(n) = n$.

Proof. From Definition 6.2, Proposition 6.4(1) and (4) easily follow. Then for even $n$ with $n \ge 4$, we have $\nu_2(n) = 1 + 2n + \sum_{k=4}^{n-2} D(n, k) = 1 + 2n + \tfrac12 (n-4) n = \tfrac12 n^2 + 1$. For odd $n$ with $n \ge 3$, we obtain $\nu_2(n) = 1 + 2n + \sum_{k=3}^{n-2} D(n, k) = 1 + 2n + \tfrac12 (n-3) n = \tfrac12 n^2 + \tfrac12 n + 1$. Combining these two cases results in $\nu_2(n) = \tfrac12 n^2 + \tfrac12 n \cdot [n \text{ is odd}] + 1$ for $n \ge 3$. Finally, Proposition 6.1 implies Proposition 6.4(3).

From Propositions 5.3 and 6.4 it follows that for $n \ge 4$, we have $\nu_2(n) < \nu_1(n)$ and, consequently, $\pi_2(n) < \pi_1(n)$.

It is possible to continue in this way by introducing families $\{G^k_n\}_{n \ge 1}$ ($k \ge 3$) such that for each rule $A \to aBC$, $L(G^k_n, B)$ consists of a single word of length $k - 1$. As in [3] the objections are twofold: the definitions become more complicated as $k$ increases, and we are probably left with $\nu_k(n)$ and $\pi_k(n)$ being functions in $\Theta(n^2)$.
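For even $n$, as in Example 6.3, the construction of Definition 6.2 can be sketched in Python and the generated language compared with $C_n$. The sketch deliberately covers even $n \ge 4$ only; the encoding (index tuples, `("A", x)` for $A_x$) and the function names are assumptions made for illustration.

```python
def F(n, k):
    """F^n_k as a set of index tuples."""
    return {tuple((s + t) % n + 1 for t in range(k)) for s in range(n)}


def grammar_g2_even(n):
    """Rules of G^2_n for even n >= 4: dict lhs -> list of right-hand sides."""
    if n < 4 or n % 2:
        raise ValueError("this sketch only covers even n >= 4")
    # S_n -> a A_b A_x with |x| = n - 2.
    rules = {"S": [(w[0], ("A", w[1:2]), ("A", w[2:])) for w in sorted(F(n, n))]}
    for k in [1] + list(range(2, n - 1, 2)):   # lengths 1, 2, 4, ..., n - 2
        for w in F(n, k):
            if k == 1:
                rhs = (w[0],)                                  # A_a -> a
            elif k == 2:
                rhs = (w[0], ("A", w[1:]))                     # A_ab -> a A_b
            else:
                rhs = (w[0], ("A", w[1:2]), ("A", w[2:]))      # A_abx -> a A_b A_x
            rules[("A", w)] = [rhs]
    return rules


def expand(symbol, rules):
    """The unique word derived from a terminal or from some A_x."""
    if isinstance(symbol, int):
        return (symbol,)
    (rhs,) = rules[symbol]          # every A_x has exactly one rule
    return tuple(i for s in rhs for i in expand(s, rules))


def language(rules):
    """All words derivable from S."""
    return {tuple(i for s in rhs for i in expand(s, rules)) for rhs in rules["S"]}


if __name__ == "__main__":
    rules = grammar_g2_even(6)
    print(len(rules), sum(len(alts) for alts in rules.values()))
    print(language(rules) == F(6, 6))
```

Because every nonterminal other than the start symbol has a single rule, the expansion is deterministic, which mirrors the unambiguity of the family.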
7 Towards a Family of Minimal Grammars

For the CNF we defined in [3] a family of grammars $\{G^k_n\}_{n \ge 1}$ to be minimal if each $G^k_n$ is unambiguous and $\nu(n) \in \Theta(n)$; cf. Proposition 6.1. As we will see, this latter condition is likely to be too ambitious for the GNF. In [3] we also established the existence of a minimal family for the CNF; it turns out that the corresponding problem for the GNF remains open. But let us first have a look at a GNF-family as simple as the minimal CNF-family of [3].

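Definition 7.1. $\{G_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ with for even $n \ge 4$,
$N_n = V_n - \Sigma_n = \{S_n, A_1, A_2, \ldots, A_n\} \cup \{A_{i \cdots k} \mid i = 1, 3, 5, \ldots, n-1;\ a_i \cdots a_k \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_2\}$,
$P_n = \{S_n \to a_i A_j A_{k \cdots m} \mid a_j A_{k \cdots m} A_i \mid i = 1, 3, \ldots, n-1;\ a_i a_j a_k \cdots a_m \in C_n\} \cup \{A_k \to a_k \mid 1 \le k \le n\} \cup \{A_{i \cdots m} \to a_i A_j A_{k \cdots m} \mid i = 1, 3, \ldots, n-1;\ a_i a_j a_k \cdots a_m \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_2\}$,
where $A_{k \cdots m}$ is taken equal to $\lambda$ whenever $a_k \cdots a_m$ equals $\lambda$;
and for odd $n \ge 3$,
$N_n = V_n - \Sigma_n = \{S_n, A_1, A_2, \ldots, A_n\} \cup \{A_{i \cdots k} \mid i = 1, 3, 5, \ldots, n;\ a_i \cdots a_k \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_3\}$,
$P_n = \{S_n \to a_i A_j A_{k \cdots m} \mid a_j A_{k \cdots m} A_i \mid a_n A_{1 \cdots (n-2)} A_{n-1} \mid i = 1, 3, \ldots, n-2;\ a_i a_j a_k \cdots a_m \in C_n\} \cup \{A_k \to a_k \mid 1 \le k \le n\} \cup \{A_{i \cdots m} \to a_i A_j A_{k \cdots m} \mid i = 1, 3, \ldots, n-2;\ a_i a_j a_k \cdots a_m \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_3\} \cup \{A_{n1 \cdots k} \to a_n A_{1 \cdots (k-1)} A_k \mid k = 2, 4, \ldots, n-3;\ a_n a_1 \cdots a_k \in F^n_{n-2} \cup F^n_{n-4} \cup F^n_{n-6} \cup \cdots \cup F^n_3\}$.

Example 7.2. Let $G_7 = (V_7, \Sigma_7, P_7, S_7)$ with
$P_7 = \{S_7 \to a_1 A_2 A_{34567} \mid a_2 A_{34567} A_1 \mid a_3 A_4 A_{56712} \mid a_4 A_{56712} A_3 \mid a_5 A_6 A_{71234} \mid a_6 A_{71234} A_5 \mid a_7 A_{12345} A_6$,
$A_{12345} \to a_1 A_2 A_{345}$, $A_{34567} \to a_3 A_4 A_{567}$, $A_{56712} \to a_5 A_6 A_{712}$, $A_{71234} \to a_7 A_{123} A_4$,
$A_{123} \to a_1 A_2 A_3$, $A_{345} \to a_3 A_4 A_5$, $A_{567} \to a_5 A_6 A_7$, $A_{712} \to a_7 A_1 A_2\} \cup R_7$.
Then, writing $\nu$, $\pi$ and $\delta$ for the measures of the family of Definition 7.1, we have $\nu(7) = 16 < 29 = \nu_2(7)$, $\pi(7) = 22 < 35 = \pi_2(7)$ and $\delta(7) = 7$.

Proposition 7.3. For the family $\{G_n\}_{n \ge 1}$ of Definition 7.1 we have
(1) $D(n, n) = 1$, $D(n, 1) = n$, and for even $n$ and $k = 2, 4, \ldots, n-2$, $D(n, k) = \tfrac12 n$, for odd $n$ and $k = 3, 5, \ldots, n-2$, $D(n, k) = \lceil \tfrac12 n \rceil$,
(2) $\nu(n) = \tfrac14 n^2 + \tfrac12 n + \tfrac14 + \tfrac34 \cdot [n \text{ is even}]$,
(3) $\pi(n) = \tfrac14 n^2 + \tfrac32 n - \tfrac34 \cdot [n \text{ is odd}]$,
(4) $\delta(n) = n$.

Proof. It is easy to establish (1) and (4); then for even $n$ we have $\nu(n) = 1 + n + (\tfrac12 n - 1) \cdot \tfrac12 n = \tfrac14 n^2 + \tfrac12 n + 1$, and for odd $n$, $\nu(n) = 1 + n + \lfloor \tfrac12 n - 1 \rfloor \cdot \lceil \tfrac12 n \rceil = \tfrac14 n^2 + \tfrac12 n + \tfrac14$. Finally, (3) follows from (2), (4) and Proposition 6.1.

Although this is an improvement with respect to Propositions 4.3, 5.3 and 6.4, the family of Definition 7.1 is by no means a minimal family, as we will see from the following divide-and-conquer family; cf. Section 8 in [2].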
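As a sanity check, the closed forms of Proposition 7.3 can be evaluated for $n = 7$ (Example 7.2) and for one even value, $n = 6$. The computation below only substitutes numbers into the class sizes $D(n, k)$ of Proposition 7.3(1) and into Proposition 6.1; the value for $n = 6$ is not taken from the text but follows from these formulas.

```latex
\begin{align*}
n = 7: \quad & D(7,3) = D(7,5) = \lceil \tfrac72 \rceil = 4, \\
 & \nu(7) = 1 + 7 + 2 \cdot 4 = 16 = \tfrac14 \cdot 49 + \tfrac12 \cdot 7 + \tfrac14, \\
 & \pi(7) = \nu(7) + 7 - 1 = 22 = \tfrac14 \cdot 49 + \tfrac32 \cdot 7 - \tfrac34; \\
n = 6: \quad & D(6,2) = D(6,4) = \tfrac12 \cdot 6 = 3, \\
 & \nu(6) = 1 + 6 + 2 \cdot 3 = 13 = \tfrac14 \cdot 36 + \tfrac12 \cdot 6 + \tfrac14 + \tfrac34, \\
 & \pi(6) = \nu(6) + 6 - 1 = 18 = \tfrac14 \cdot 36 + \tfrac32 \cdot 6.
\end{align*}
```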

    $P_n := \{A_i \to a_i \mid a_i \in \Sigma_n\} \cup \{S_n \to a_i A_x A_y \mid a_i x y \in C_n;\ |x| = \lceil \tfrac12 (n-1) \rceil\}$;
    $N_n := \{S_n\} \cup \{A_i \mid a_i \in \Sigma_n\}$;
    $M := \{x, y \mid S_n \to a_i A_x A_y \in P_n\}$;
    while $M \not\subseteq \Sigma_n$ [i.e. $\exists x \in M : |x| \ge 2$] do
    begin
        $N_n := N_n \cup \{A_x\}$; $M := M - \{x\}$;
        case $|x|$ of
            $= 2$: $P_n := P_n \cup \{A_x \to a_i A_j \mid a_i a_j = x\}$;
            $= 3$: $P_n := P_n \cup \{A_x \to a_i A_j A_k \mid a_i a_j a_k = x\}$;
            $\ge 4$: begin
                $P_n := P_n \cup \{A_x \to a_i A_y A_z \mid a_i y z = x;\ |y| = \lceil \tfrac12 (|x| - 1) \rceil\}$;
                $M := M \cup \{y, z \mid a_i y z = x;\ |y| = \lceil \tfrac12 (|x| - 1) \rceil\}$;
            end
        endcase
    end

Figure 1: Algorithm to determine $N_n$ and $P_n$ of $G_n$.

Definition 7.4. $\{G_n\}_{n \ge 1}$ is given by $\{(V_n, \Sigma_n, P_n, S_n)\}_{n \ge 1}$ where the sets $N_n$ and $P_n$ are determined by the algorithm in Figure 1.

Example 7.5. $G_7 = (V_7, \Sigma_7, P_7, S_7)$ with
$P_7 = \{S_7 \to a_1 A_{234} A_{567} \mid a_2 A_{345} A_{671} \mid a_3 A_{456} A_{712} \mid a_4 A_{567} A_{123} \mid a_5 A_{671} A_{234} \mid a_6 A_{712} A_{345} \mid a_7 A_{123} A_{456}$,
$A_{123} \to a_1 A_2 A_3$, $A_{234} \to a_2 A_3 A_4$, $A_{345} \to a_3 A_4 A_5$, $A_{456} \to a_4 A_5 A_6$, $A_{567} \to a_5 A_6 A_7$, $A_{671} \to a_6 A_7 A_1$, $A_{712} \to a_7 A_1 A_2\} \cup R_7$.
Now, for the family of Definition 7.4, $\nu(7) = 15 < 16$ and $\pi(7) = 21 < 22$, where 16 and 22 are the corresponding values for the family of Definition 7.1 (Example 7.2), and $\delta(7) = 7$.

    n       3   4   5   6   7   8   9  10  11  12  13  14  15  16
    ν(n)    4   9  11  19  15  33  28  41  34  61  53  71  46  96
    π(n)    6  12  15  24  21  40  36  50  44  72  65  84  60 111

Table 1: $\nu(n)$ and $\pi(n)$ for $3 \le n \le 16$.

As usual in analyzing such a divide-and-conquer approach, a closed form for $\nu(n)$ and $\pi(n)$ is very hard or even impossible to obtain; for small values we refer to Table 1. Only for special values of $n$ can we infer some manageable expressions.

Proposition 7.6. For the family $\{G_n\}_{n \ge 1}$ of Definition 7.4 we have in case $n = 2^k - 1$ ($k \ge 2$),
(1) $D(n, n) = 1$, $D(n, 2^i - 1) = n$ ($i = 1, 2, \ldots, k-1$), and $D(n, i) = 0$ otherwise,
(2) $\nu(n) = n \log_2 (n+1) - n + 1$,
(3) $\pi(n) = n \log_2 (n+1)$,
(4) $\delta(n) = n$.
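The algorithm of Figure 1 can be sketched in Python as a worklist procedure. The sketch rests on assumptions made here for illustration: words are index tuples, $A_x$ is encoded as `("A", x)`, and the split lengths are rounded up, i.e. $|x| = \lceil \tfrac12 (n-1) \rceil$; the source does not fix the rounding beyond what Example 7.5 shows, and rounding down would give the same set of word lengths. The function name is hypothetical.

```python
from math import ceil


def divide_and_conquer_grammar(n):
    """Rules produced by the algorithm of Figure 1 for n >= 3,
    as a dict: nonterminal -> list of right-hand sides."""
    rules = {("A", (i,)): [(i,)] for i in range(1, n + 1)}   # A_i -> a_i
    rules["S"] = []
    half = ceil((n - 1) / 2)
    todo = set()                                             # the set M
    for s in range(n):
        w = tuple((s + t) % n + 1 for t in range(n))
        x, y = w[1:1 + half], w[1 + half:]
        rules["S"].append((w[0], ("A", x), ("A", y)))
        todo |= {x, y}
    while todo:
        x = todo.pop()
        if ("A", x) in rules:
            continue                      # single letters and words already handled
        if len(x) == 2:
            rules[("A", x)] = [(x[0], ("A", x[1:]))]
        elif len(x) == 3:
            rules[("A", x)] = [(x[0], ("A", x[1:2]), ("A", x[2:]))]
        else:
            cut = 1 + ceil((len(x) - 1) / 2)
            y, z = x[1:cut], x[cut:]
            rules[("A", x)] = [(x[0], ("A", y), ("A", z))]
            todo |= {y, z}
    return rules


if __name__ == "__main__":
    for n in (3, 7, 15):
        rules = divide_and_conquer_grammar(n)
        print(n, len(rules), sum(len(alts) for alts in rules.values()))
```

For $n = 2^k - 1$ every split is exact, so only the lengths $2^i - 1$ occur, which is the situation described in Proposition 7.6.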

So using this divide-and-conquer approach we end up with $\nu(n)$ and $\pi(n)$ in $\Theta(n \log n)$, rather than in $\Theta(n^2)$ as for the previous families.

8 Concluding Remarks

We discussed a few ways of generating the languages $\{C_n\}_{n \ge 1}$ of circular shifts by context-free grammars $\{G_n\}_{n \ge 1}$ in GNF, and we compared these families with respect to the measures $\nu$, $\pi$ and $\delta$. Our results give rise to the following observation.

Conjecture 8.1. Any family of context-free grammars in GNF $\{G_n\}_{n \ge 1}$ that generates $\{C_n\}_{n \ge 1}$ must have measures $\nu(n)$ and $\pi(n)$ that are not bounded by any linear function in $n$.

The situation in the GNF-case differs considerably from the CNF-case: in [3] we established the existence of a minimal family in CNF for which $\nu$ and $\pi$ are linear functions in $n$ (even with small coefficients). For the GNF the definition of minimality remains a problem; viz. setting "$\{G_n\}_{n \ge 1}$ in GNF is minimal for $\{C_n\}_{n \ge 1}$ if (i) each $G_n$ is unambiguous, and (ii) $\nu(n) \in \Theta(f(n))$" leaves us with the question of an adequate choice for $f(n)$. Conjecture 8.1 implies $f(n) \in \omega(n)$. Taking $f(n)$ equal to $n \log n$ results in the minimality of the family of Definition 7.4 (Proposition 7.6), but the question whether this family is also minimal in an absolute sense (i.e., does there exist no family with $\nu(n) \in \Theta(n \log n)$ and $\nu(n) < n \log_2 (n+1) - n + 1$ for $n$ large enough with $n = 2^k - 1$ and $k \ge 2$?) remains an open problem as well.

References

1. B. Alspach, P. Eades & G. Rose, A lower-bound for the number of productions for a certain class of languages, Discrete Appl. Math. 6 (1983) 109-115.
2. P.R.J. Asveld, Generating all permutations by context-free grammars in Chomsky normal form, Theoret. Comput. Sci. 354 (2006) 118-130.
3. P.R.J. Asveld, Generating all circular shifts by context-free grammars in Chomsky normal form, CTIT TR 05-3 (2005), ISSN 1381-3625, University of Twente, Enschede, the Netherlands; to appear in J. Autom., Lang. Combin.
4. P.R.J. Asveld, Generating all permutations by context-free grammars in Greibach normal form, (in preparation).
5. W. Bucher, A note on a problem in the theory of grammatical complexity, Theoret. Comput. Sci. 14 (1981) 337-344.
6. W. Bucher, H.A. Maurer & K. Culik II, Context-free complexity of finite languages, Theoret. Comput. Sci. 28 (1984) 277-285.
7. W. Bucher, H.A. Maurer, K. Culik II & D. Wotschke, Concise description of finite languages, Theoret. Comput. Sci. 14 (1981) 227-246.
8. J. Dassow, On the circular closure of languages, EIK 15 (1979) 87-94.
9. K. Ellul, B. Krawetz, J. Shallit & M.-w. Wang, Regular expressions: new results and open problems, J. Autom. Lang. Comb. 9 (2004) 233-256.
10. R.L. Graham, D.E. Knuth & O. Patashnik, Concrete Mathematics (1989), Addison-Wesley, Reading, MA.
11. J. Gruska, Some classifications of context-free languages, Inform. Contr. 14 (1969) 152-179.
12. M.A. Harrison, Introduction to Formal Language Theory (1978), Addison-Wesley, Reading, MA.
13. V.A. Iljuškin, The complexity of the grammatical description of context-free languages, Dokl. Akad. Nauk SSSR 203 (1972) 1244-1245 / Soviet Math. Dokl. 13 (1972) 533-535.
14. A. Kelemenová, Complexity of normal form grammars, Theoret. Comput. Sci. 28 (1984) 299-314.
15. G. Satta, personal communication (2002).